Dave Gershgorn, an AI reporter, published an interesting article on Quartz late last week with the ungainly but clickable title “This Open-Source Tech Company’s IPO Filing Reads Like an Argument Against Building a Business on Open Source.”
The open source company in question is data management and machine learning company Cloudera, which submitted its S-1 filing on March 31. If you’re not familiar with the term, an S-1 is a filing with the U.S. Securities and Exchange Commission used by companies planning on going public. If it can hold on to its valuation, the Cloudera IPO would be the second open source unicorn IPO this year (the first was MuleSoft).
S-1s are written by lawyers using boilerplate language as a preventative against suits. If there’s a slight possibility that an executive team could be eaten by alligators because corporate HQ is located near an alligator-infested swamp, there will probably be a notation of the possibility in the “Risk Factors” section of their S-1 concluding that the business would be “adversely affected as a result.”
There are 195 mentions of “open source” in Cloudera’s S-1 filing, many of them in the “Risk Factors” portion, and because he’s a good writer, Gershgorn cherry-picks the scariest of the caveats to make his argument on “why investing in an open source-based company is risky.”
Gershgorn begins with: “[Cloudera] could be sued for inadvertently using stolen open source code.”
As the company notes in its S-1:
We may be exposed to increased risk of being the subject of intellectual property infringement claims as a result of acquisitions and our incorporation of open source software into our platform, as, among other things, we have a lower level of visibility into the development process with respect to such technology or the care taken to safeguard against infringement risks.
Indeed, all companies using open source face IP infringement risk. Given that open source is the foundation of modern software—composing as much as 90% of some proprietary code—you could make the case that every company with mission-critical applications faces that risk. It’s why savvy companies—I assume Cloudera among them, since they acknowledge the risk—have processes in place to identify and manage open source to mitigate risk as new code enters the SDLC through acquisitions.
“That lawsuit would likely expose [Cloudera’s] proprietary code,” Gershgorn writes, pointing to Cloudera’s warning that
by the terms of certain open source licenses, we could be required to release the source code of our proprietary software, and to make our proprietary software available under open source licenses, if we combine our proprietary software with open source software in a certain manner.
Again, while it is indeed true that some open source license terms could require proprietary code be released as open source itself, the fact that Cloudera acknowledges the issue indicates that it has processes in place to ensure that doomsday scenario doesn’t happen.
A much scarier scenario would be the company that doesn’t realize the requirement to comply with the licenses of the open source they use – or worse, doesn’t even realize that they have the open source in their proprietary code. Most open source components are governed by one of about 2,500 known open source licenses, and the license obligations can be tracked and managed only if the open source components themselves are identified.
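To make the point about identification concrete, here is a minimal sketch of what a first-pass license review over a component inventory might look like. Everything in it is an assumption for illustration: the `Component` type, the `review` function, the sample inventory, and the short `COPYLEFT` set (a few SPDX identifiers, nowhere near the full range of licenses in use) are hypothetical and do not describe Cloudera's process or any particular tool.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative subset of SPDX identifiers for strong-copyleft licenses.
# Assumption: a real review would use a far more complete list.
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only"}


@dataclass
class Component:
    name: str
    license_id: Optional[str]  # None means the license has not been identified


def review(components):
    """Sort component names into buckets that need different follow-up."""
    report = {"copyleft": [], "unknown": [], "other": []}
    for c in components:
        if c.license_id is None:
            # Unidentified code: obligations cannot be tracked at all.
            report["unknown"].append(c.name)
        elif c.license_id in COPYLEFT:
            # May carry source-release obligations depending on how it is combined.
            report["copyleft"].append(c.name)
        else:
            report["other"].append(c.name)
    return report


# Hypothetical inventory, made up for this example.
inventory = [
    Component("libfoo", "Apache-2.0"),
    Component("libbar", "GPL-3.0-only"),
    Component("vendored-snippet", None),
]
report = review(inventory)
print(report)
```

The "unknown" bucket is the scarier scenario described above: a component nobody has identified never reaches the check that would flag its license terms.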
“And the code can be vulnerable to cyberattacks.”
Gershgorn heavily edits the Cloudera IPO S-1 section containing this line to hammer home his point. The full caveat reads:
Further, some open source projects have known vulnerabilities and architectural instabilities and are provided on an “as‑is” basis. Many of these risks associated with usage of open source software, such as the lack of warranties or assurances of title, cannot be eliminated, and could, if not properly addressed, negatively affect the performance of our platform and our business. In addition, we are often required to absorb these risks in our customer and partner relationships by agreeing to provide warranties, support and indemnification with respect to such third party open source software. While we have established processes intended to alleviate these risks, we cannot assure that these measures will reduce these risks.
As well as license compliance, security vulnerabilities and code quality are of concern in open source—as they are in proprietary software. Over 3,600 new open source component vulnerabilities were reported in 2016—almost 10 per day on average.
To me, the key part of this section is the phrase “we have established processes intended to alleviate these risks,” which indicates that Cloudera recognizes the importance of open source code management. Indeed, Cloudera apparently feels secure enough with their insight into the open source they use to provide warranties to customers and partners protecting against those risks.
Is investing in an open source company risky? Not when a company like Cloudera obviously understands the risks associated with open source, and seems prepared to handle those risks with open source identification and management. I think more potential investors will be concerned that Cloudera’s historical losses have (to date) overwhelmed the company’s revenue, and that Cloudera “expects to continue to incur net losses for the foreseeable future,” or that Cloudera faces a bevy of heavily armed competitors such as HP, IBM, Oracle, Amazon Web Services and Hortonworks. Those risks are much more difficult to manage than the potential risks of open source code.