How Open Source Can Benefit AI Development

Enterprises are increasingly reliant upon open source software. A full 95% of IT leaders say open source tools are key to their enterprise infrastructure. Simultaneously, we’ve witnessed a sharp increase in the adoption of artificial intelligence (AI) and machine learning (ML). Most IT divisions want to take advantage of AI in some form, and open source technologies will likely be vital to these efforts.

I recently met with Moses Guttmann, co-founder and CEO of ClearML, to understand the benefits of adopting open source software within the sphere of artificial intelligence. Guttmann sees several benefits: the visibility open source grants can yield helpful feedback from the developer community, and transparency can inform new feature development and expose potential security gaps. There are also countless open source tools to help organizations develop and deploy ML models.

Below, we’ll consider the benefits and potential drawbacks of incorporating OSS in AI development and explore ways to optimize the use of AI within an organization.

Benefits of Using Open Source for AI

AI has evolved by leaps and bounds. What was pure sci-fi a few years ago has become a reality through ChatGPT and other AI-powered platforms. The capabilities of AI are awe-inspiring, but Guttmann believes it still requires fine-tuning to move from entertainment to a viable business driver.

For example, custom ML models are difficult to construct from scratch, and generic APIs aren’t always tailored closely enough to the use case at hand to be valuable. Instead, Guttmann foresees more and more organizations downloading open pre-trained models and retraining or customizing them to meet their specific needs.

For software providers, there are many benefits to adopting an open source strategy. The visibility it creates helps evolve a project with feedback from the community, said Guttmann. This, in turn, helps organizations build better, more stable products. For example, ClearML adopts plenty of OSS to power its user interface, backend and other components, in addition to providing a popular open source CI/CD framework for ML workflows. “We’re true believers in the power of community and like to work with open source and contribute back to these projects,” said Guttmann.

Use Cases for Open Source in AI Development

Open source software is powering innovation in many ways. And when it comes to AI, there are numerous open source projects to help accelerate AI development. Such projects encompass pre-trained models, data-structuring tools, frameworks to train models and AI deployment pipelines. For example, many open source machine learning frameworks can help engineers build and train their models from scratch. These free-to-use tools include TensorFlow, Keras, PyTorch, H2O, XGBoost and others.

However, training models from scratch is time-intensive and complicated, said Guttmann. Another option is to use pre-trained models and toolkits available on GitHub or other repositories. Some popular examples are PaddleNLP and Natural Language Toolkit for natural language processing, OpenCV for machine vision and DeepFaceLab for deepfakes. There are also many open source Python packages that can structure and clean data for ML model training purposes.

Open source software can also streamline the ML DevOps pipeline itself. AI requires intense processing and is typically deployed to its own computing environments. According to Guttmann, infrastructure management around running and training AI is a promising area where open source could assist. “Automating this process with an abstraction layer to allow data science and ML engineers to access product-grade hardware is a must,” he said.

Potential Downsides of Open Source Usage

There are, of course, potential downsides to using open source software in AI. Supply chain attacks remain a persistent threat, and open source software carries several risks: bad actors compromising a legitimate package, name confusion attacks, unmaintained software and untracked dependencies. License restrictions could also block the inclusion of OSS in certain projects.
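To make the name confusion risk concrete, here is a minimal sketch of a check that flags dependency names that closely resemble, but do not exactly match, a vetted package name. It is an illustration only, not a tool mentioned by Guttmann: the allowlist `TRUSTED_PACKAGES`, the function name `find_suspicious_names` and the 0.8 similarity threshold are all assumptions chosen for the example, and a real dependency scanner would do considerably more.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist of packages a team has already vetted (assumption,
# not taken from the article).
TRUSTED_PACKAGES = {"tensorflow", "keras", "torch", "xgboost", "h2o"}


def find_suspicious_names(dependencies, trusted=TRUSTED_PACKAGES, threshold=0.8):
    """Return (dependency, lookalike) pairs for names that closely resemble,
    but do not exactly match, a trusted package name."""
    suspicious = []
    for dep in dependencies:
        name = dep.lower()
        if name in trusted:
            continue  # exact match with a vetted package: nothing to flag
        for known in sorted(trusted):
            # ratio() is 1.0 for identical strings and falls toward 0.0
            # as the two names diverge.
            if SequenceMatcher(None, name, known).ratio() >= threshold:
                suspicious.append((dep, known))
                break  # one lookalike is enough to flag this dependency
    return suspicious


if __name__ == "__main__":
    requested = ["tensorflow", "tensorfIow", "xgbost", "requests"]
    for dep, lookalike in find_suspicious_names(requested):
        print(f"'{dep}' looks like '{lookalike}': verify before installing")
```

A check like this only catches near-miss spellings. It does not detect a compromised legitimate package or an unmaintained one, so it would complement vulnerability scanning rather than replace it.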

Thankfully, the benefits of OSS appear to outweigh these concerns. Automated vulnerability scanning techniques can help detect known issues. It also helps that 89% of technology leaders see enterprise open source as being as secure as or more secure than proprietary software, according to a Red Hat report.

Tips for Getting the Most From AI

Open source is powering digital innovation on many fronts, including AI. But simply looping in OSS isn’t the only part of fully leveraging AI. Another is operationalizing ML management: with the right automation infrastructure in place, teams could gain the agility necessary to continuously retrain models. And since there are potential gaps in the CI/CD process, Guttmann recommended separating AI/ML from your normal CI/CD for security reasons.

Guttmann also believes that, in general, companies should be more transparent about ML models, including sharing details of how they were created and which datasets were fed into them. Overlooking this could lead to ethical concerns. Furthermore, limiting the “hallucinations” generative AI produces is another huge gap that must be bridged, said Guttmann. One remedy could be to let the system respond with “I don’t know” instead of confidently spouting nonsense.
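As a rough illustration of the “I don’t know” idea, the sketch below abstains whenever the top candidate answer falls under a confidence threshold. It is a toy example under stated assumptions, not a description of how ClearML or any production generative model handles hallucinations: the function names, the use of softmax over raw scores as a confidence estimate and the 0.7 threshold are all hypothetical choices, and softmax scores from real models are often poorly calibrated.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer_or_abstain(candidates, scores, min_confidence=0.7):
    """Return the top-scoring candidate, or "I don't know" when its
    probability falls below min_confidence (a hypothetical threshold)."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return "I don't know"
    return candidates[best]


if __name__ == "__main__":
    # A clear winner: the model answers.
    print(answer_or_abstain(["Paris", "Lyon"], [5.0, 0.0]))
    # Two nearly tied candidates: the model abstains.
    print(answer_or_abstain(["Paris", "Lyon"], [1.0, 0.9]))
```

The design choice here is to trade coverage for reliability: a higher threshold produces more abstentions but fewer confident wrong answers.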

In summary, open source could bring many benefits to bespoke AI/ML adoption. Many mature open source tools exist to help engineers create advanced AI and manage the deployment of custom ML models. But the real test will be retaining this open creed by upholding visibility into, and transparency about, the end creations.

Bill Doerrfeld

Bill Doerrfeld is a tech journalist and analyst. His beat is cloud technologies, specifically the web API economy. He began researching APIs as an Associate Editor at ProgrammableWeb, and since 2015 has been the Editor at Nordic APIs, a high impact blog on API strategy for providers. He loves discovering new trends, researching new technology, and writing on topics like DevOps, REST design, GraphQL, SaaS marketing, IoT, AI, and more. He also gets out into the world to speak occasionally.
