Last week, the nonprofit Linux Foundation and Harvard’s Lab for Innovation Science published Census II of Free and Open Source Software—Application Libraries. This report identifies more than 1,000 of the most widely deployed open source application libraries. Black Duck Cybersecurity Research Center (CyRC) was among the contributors of anonymized usage data based on scans of codebases at thousands of companies, providing data that allowed for a more complete picture of the free and open source software (FOSS) landscape.
The report authors noted, “It is difficult to fully understand the health, economic value, and security of FOSS because it is produced in a decentralized and distributed manner.” Because there is such variety in the ways software components are packaged, as well as how versions are catalogued and identified, the report organized them into eight Top 500 lists. Mike McGuire, security solutions manager with Black Duck, describes packages and versions as being a bit like the model, year, and trim of a car. “If I told you I drive a Toyota Camry, you still don’t know exactly what I drive. Is it the 1999 version or the 2022 version? It’s important to know this when ordering parts, getting service, tracking recalls, etc.”
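To make the car analogy concrete, here is a minimal Python sketch of why counting usage by package gives a different picture than counting by exact version. The `Component` class, the helper functions, and the sample observations are all hypothetical illustrations. They are not the data model or methodology used in Census II.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical identifier for one open source component."""
    ecosystem: str  # packaging ecosystem, e.g. "npm" or "maven"
    name: str       # package name (the "model" in the car analogy)
    version: str    # exact version (the "year and trim")


# Made-up scan results: each entry is one component seen in one codebase.
observations = [
    Component("npm", "lodash", "4.17.21"),
    Component("npm", "lodash", "4.17.15"),
    Component("npm", "lodash", "4.17.21"),
    Component("maven", "jackson-core", "2.13.0"),
]


def count_by_package(items):
    """Count occurrences per package, ignoring the version."""
    return Counter((c.ecosystem, c.name) for c in items)


def count_by_version(items):
    """Count occurrences per exact package version."""
    return Counter(items)


if __name__ == "__main__":
    print(count_by_package(observations).most_common())
    print(count_by_version(observations).most_common())
```

In this toy data, the package-level count treats all three lodash observations as one entry, while the version-level count separates them. The same distinction matters when deciding which exact releases need a patch.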
The goal of Census II, according to the authors, is to “inform actions to sustain the long-term security and health of FOSS.” The census represents “our best estimate of which FOSS packages are the most widely used by different applications. […] It does not try to measure the risk profiles of that software. There are many indicators that could be used to suggest risk, and different organizations may weight factors differently,” the authors wrote.
McGuire agrees with the caveat. Widely used is not the same as critical. “Tons of apps can be using a specific Java GUI framework, making it very popular, but it may not serve as a critical part of the software should something happen to it.” He added that what is considered critical “is going to be unique to each organization based on how their apps are built.” Still, measuring risk profiles is “easier to do once the most widely used software is identified,” the Census II authors wrote.
The authors outlined several hurdles to improving the way software is identified, catalogued, and maintained. These are important as the industry moves toward widespread standardization and adoption of a software bill of materials (SBOM). These challenges include the variety in how software components are packaged and in how versions are catalogued and identified.
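The report also identified several problems that affect the long-term security of FOSS. These include the reliance on a small group of developers for much FOSS development and the persistence of legacy software in open source.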
McGuire said Census II “provides a view of popular FOSS and some observations about relative complexities.” While it is not prescriptive, Census II does point to the need for organizations and users to be more actively involved in FOSS development, and not leave it solely to the small group of developers who have led the way thus far. The report also shows how important software composition analysis (SCA) tools are for detecting legacy software in open source, and it underscores the ongoing need for standardization in the SBOM space.