Iterative today added an experiment versioning capability to its open source platform for managing machine learning operations (MLOps) using GitOps workflows.
Dmitry Petrov, Iterative CEO, said the latest version of the Data Version Control (DVC) platform makes it simpler to save, compare and reproduce machine learning (ML) experiments at scale without requiring organizations to set up a separate repository to track them. Instead, he said, organizations can store those experiments alongside other software artifacts in a Git repository.
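In practice, that means experiments are launched, compared and diffed with ordinary DVC commands against the Git repository itself. The sketch below is a minimal illustration, assuming DVC 2.0 or later is installed and the project is already an initialized Git/DVC repository; the train.lr parameter is a hypothetical placeholder, not something DVC defines.

```python
import subprocess

def run(cmd):
    """Run a CLI command and echo its output."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    print(result.stdout)

# Run an experiment, overriding a (hypothetical) training parameter.
# The result is recorded against the Git repo, not a separate tracking store.
run(["dvc", "exp", "run", "--set-param", "train.lr=0.01"])

# Tabulate all experiments against the current Git baseline.
run(["dvc", "exp", "show", "--no-pager"])

# Show metric and parameter deltas for the most recent experiment.
run(["dvc", "exp", "diff"])
```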
As a result, IT teams will no longer need to track experiments using spreadsheets or notebook tools such as Jupyter, he noted. Organizations building AI models for applications constructed using DevOps workflows will also find it easier to meet a wide range of existing and forthcoming compliance mandates, added Petrov.
The DVC platform provides a Git-like interface that enables organizations to track multiple versions of data sets, models and pipelines across an MLOps workflow. That capability is now being extended to make it easier to manage the experiments data science teams create while constructing an AI model.
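Because every version of a data set or model is pinned to a Git revision, a specific version can be retrieved programmatically through DVC's Python API. Here is a brief sketch using the real dvc.api module; the repository URL, file paths and tags are hypothetical placeholders.

```python
import dvc.api

# Open a DVC-tracked dataset as it existed at a given Git tag.
with dvc.api.open(
    "data/train.csv",
    repo="https://github.com/example/ml-project",
    rev="v1.0",
) as f:
    header = f.readline()

# The same call with a different rev returns the artifact as it existed
# at that point in Git history: data and models version alongside code.
model_bytes = dvc.api.read(
    "models/model.pkl",
    repo="https://github.com/example/ml-project",
    rev="v2.0",
    mode="rb",
)
```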
Iterative is at the forefront of an emerging debate over how best to infuse AI models into applications. Providers of platforms used by data science teams are making a case for separate repositories for tracking the artifacts used to construct an AI model. Iterative, on the other hand, argues for an AI model build framework that relies on the same Git repositories where every other type of software artifact is stored and shared. In effect, AI models are just another type of software artifact, noted Petrov.
A Git-based approach reduces the total cost of building AI models by cutting the number of platforms required. It also makes it simpler for DevOps and data science teams to collaborate, because DevOps teams gain more visibility into which AI models will eventually need to be incorporated into an application.
It may be a while before the development of AI models and applications fully converges. Today, data science teams have typically defined their own workflow processes using a wide range of graphical tools. However, as it becomes obvious that almost every application is going to be infused with machine learning and deep learning algorithms to some degree, the need to bridge the divide between DevOps and data science teams will become more acute.
In the meantime, DevOps teams should assume not only that many more AI models are on the way, but that those models will need to be continuously updated. Each AI model is constructed based on a set of assumptions; as more data becomes available, models are subject to drift that reduces their accuracy over time. Organizations may even determine that an entire AI model needs to be replaced because the business conditions on which its assumptions were based are no longer valid. One way or another, the updating and tuning of AI models is likely to soon become just another continuous process managed via a DevOps workflow.
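A continuous process of that kind typically hinges on a simple check: compare the model's accuracy on recent data against the accuracy recorded at release, and trigger retraining when the gap grows too large. The sketch below is a minimal, hypothetical illustration of that logic; the tolerance value and function names are assumptions, not part of any particular platform.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def needs_retraining(live_predictions, live_labels, baseline_accuracy,
                     tolerance=0.05):
    """Flag the model for retraining when live accuracy drifts more than
    `tolerance` below the accuracy recorded when the model shipped."""
    return accuracy(live_predictions, live_labels) < baseline_accuracy - tolerance

# Example: a model that shipped at 92% accuracy but scores lower on
# recent data would be flagged, kicking off a retraining pipeline.
if needs_retraining([1, 0, 1, 1], [1, 1, 0, 1], baseline_accuracy=0.92):
    print("Accuracy drift detected: trigger the retraining workflow")
```

In a DevOps workflow, a check like this would run on a schedule or inside a CI pipeline, with the retraining step itself versioned and reproduced the same way as any other build.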