Just as open source software turns 20 years old this year, these numbers are a testament to the incredible magic that happens when communities of developers openly share innovations.
Software Supply Chains at Work
What we’re witnessing here is a software supply chain in full effect. Open source projects package their code and place it in public warehouses, and development teams around the globe then consume it to create new front-end, back-end or mobile applications for all of us. Every development team on the planet now uses a software supply chain that operates at insane speeds with massive throughput.
In a second tweet, Laurie wrote, “Since we have ~12 million users, that means the average user installed the neatly round number of 1000 packages in 7 days. In reality, most individuals installed very few packages, and a bunch of CI build boxes installed 10s or 100s of thousands of packages each.”
Essentially, there is an army of robots (automated CI tools) downloading billions of components from the web. While this type of automation is efficient, by itself it does not serve teams well.
Repositories to the Rescue
At Sonatype, we saw similar behavior by developers in the early days of Maven. Individual developers would download Java components directly from Maven Central. Maven users sitting right next to one another in the same room and on the same team would download the same versions of the same components needed for their builds, repeatedly. It was for this reason, among others, that our founders invented the Nexus repository manager.
While repository managers have been used for quite some time in the Java community, Laurie’s post tells us that adoption of the technology still lags in younger, less mature packaged-code ecosystems.
Not All Components Are Created Equal
I have often said that if you have 100 developers, you have 100 front doors open into your organization. If you have 1,000 developers? Then there are 1,000 front doors.
If you dissect Laurie’s comment further, you will also realize that any developer can bring any package into your organization at any time. While this freedom makes developers more efficient, it also raises security and governance concerns. Every component was developed by someone else, published to the internet and consumed millions of times. Yet you often have no real idea of the source or origin of many of those components.
Research I have done in previous years at Sonatype showed that 5.5 percent (1 in 18) of components downloaded by repository managers had known security vulnerabilities. While the percentage is not very large, keep in mind that a repository manager downloads a given component only once. Once cached, the component never needs to be downloaded again, and the team can reuse it indefinitely.
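To make the download-once behavior concrete, here is a minimal sketch of the caching idea in Python. It is an illustration only, not how Nexus is implemented: the `CachingRepository` class, the `fake_upstream` function and the `example-lib` component are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, Tuple

Coordinate = Tuple[str, str]  # (component name, version)


class CachingRepository:
    """Toy stand-in for a repository manager: fetch upstream once, then serve from cache."""

    def __init__(self, fetch_upstream: Callable[[str, str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[Coordinate, bytes] = {}
        self.upstream_downloads = 0  # how many times we actually went to the public registry

    def get(self, name: str, version: str) -> bytes:
        key = (name, version)
        if key not in self._cache:
            # First request for this exact version: download it and remember it.
            self._cache[key] = self._fetch_upstream(name, version)
            self.upstream_downloads += 1
        return self._cache[key]


def fake_upstream(name: str, version: str) -> bytes:
    # Hypothetical public registry; returns placeholder bytes instead of a real artifact.
    return f"{name}@{version}".encode()


repo = CachingRepository(fake_upstream)
for _ in range(100):  # 100 builds on the same team request the same component
    repo.get("example-lib", "1.0.0")

print(repo.upstream_downloads)  # 1
```

The flip side is the point of this section: whatever was true of that single download, including any known vulnerability, is served to every later build.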
Sonatype analysis of more than 40,000 Nexus repositories reveals that the average repository holds more than 1,600 components. Deeper analysis of the components housed in the average repository manager found 192 security vulnerabilities among them (some components have more than one).
Within software supply chains, repository managers and private container registries represent procurement gates into the development organization. The gates can be left wide open where component flows are not governed or they can represent opportunities for quality and security checkpoints that ensure defects are not passed downstream.
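As a rough sketch of what such a checkpoint could look like, the Python below admits or blocks requested components based on a count of known vulnerabilities. Everything here is assumed for illustration: the `Component` record, the `apply_gate` function, the `max_vulnerabilities` threshold and the sample data are hypothetical, and a real gate would draw its vulnerability counts from an actual data source and apply a richer policy.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    known_vulnerabilities: int  # assumed to come from some vulnerability data source


def apply_gate(
    components: Iterable[Component], max_vulnerabilities: int = 0
) -> Tuple[List[Component], List[Component]]:
    """Split requested components into (admitted, blocked) under a simple policy."""
    admitted: List[Component] = []
    blocked: List[Component] = []
    for component in components:
        if component.known_vulnerabilities <= max_vulnerabilities:
            admitted.append(component)
        else:
            blocked.append(component)
    return admitted, blocked


# Made-up sample requests, not real packages or real findings.
requested = [
    Component("lib-a", "1.2.0", known_vulnerabilities=0),
    Component("lib-b", "0.9.1", known_vulnerabilities=3),
]

admitted, blocked = apply_gate(requested)
print([c.name for c in admitted])
print([c.name for c in blocked])
```

With the default threshold of zero the gate is closed to anything with a known vulnerability; raising the threshold far enough is the equivalent of leaving the gate wide open.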
What’s in Your Repository Manager is in Production
“Alarmingly, many sites continue to rely on npm packages like YUI and SWFObject that are no longer maintained,” the researchers continued. “In fact, the median website in [NU’s] dataset is using a library version 1,177 days older than the newest release, which explains why so many vulnerable libraries tend to linger on the Web.”
What is Wrong With 200 Billion Downloads?
Nothing from an innovation standpoint. It’s awesome and we can all celebrate this achievement with Laurie.
At the same time, it gives us pause to reflect upon other concerns of our day, including cybersecurity threats, technical debt and wasteful context switching.