Rapid DevOps continuous delivery depends on a plethora of applications and services, and supporting them requires a wide variety of database platforms, deployment environments and data variants. That variety, coupled with the difficulty of integrating database changes into application pipelines, is a growing challenge for many organizations as they strive to accelerate their DevOps transformations.
When you stand back and look at the bigger picture, an enterprise with many databases and various data requirements for DevOps continuous delivery presents a problem. At best, it makes things chaotic; at worst, it is a gross impediment to overall organizational agility, which undermines the very purpose that DevOps is meant to serve.
In my book, Engineering DevOps, I explained that databases pose several challenges for CI/CD pipelines.
In the simplest case, when a database is used solely by one application or microservice, the database pipeline can be integrated with the application pipeline so that database changes use the same version management system and change controls and stay in sync with the application. However, many database platforms are not as responsive to changes as the applications themselves and become a bottleneck for the pipeline.
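To make the simplest case concrete, here is a minimal sketch of database changes kept as versioned migrations alongside the application code and applied by the same pipeline. It uses Python's built-in sqlite3 module purely for illustration; the MIGRATIONS list, the schema_version table and the function names are my own assumptions, not part of any particular tool.

```python
import sqlite3

# Hypothetical migrations, kept in the same repository as the application code.
MIGRATIONS = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customers ADD COLUMN email TEXT"),
]


def current_version(conn):
    """Return the highest migration number recorded in the database (0 if none)."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn, migrations=MIGRATIONS):
    """Apply, in order, every migration newer than the recorded version."""
    applied = []
    version = current_version(conn)
    for number, statement in sorted(migrations):
        if number <= version:
            continue  # already applied by an earlier pipeline run
        # 'with conn' commits on success and rolls back the open transaction on error.
        with conn:
            conn.execute(statement)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (number,))
        applied.append(number)
    return applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies migrations 1 and 2
second_run = migrate(conn)  # nothing left to apply
```

Because the migration list is versioned with the application, a pipeline run that builds a given application commit also knows exactly which database changes belong to it, and re-running the step is harmless.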
If a database is used by several applications, each with its own DevOps pipeline, then changes to the database must be coordinated with all applications that depend on it. Maintaining a separate database pipeline that is linked to the application pipelines through CI/CD toolchains is overly complicated as a strategy and can often introduce additional bottlenecks and synchronization issues.
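The coordination problem for a shared database can be sketched as a simple gate: each application declares the range of schema versions it can work with, and a proposed database change is blocked if any dependent application falls outside that range. The AppContract type, the application names and the version numbers below are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppContract:
    """Hypothetical declaration of which schema versions an application supports."""
    name: str
    min_schema: int
    max_schema: int


def blocking_apps(target_schema, contracts):
    """Return the names of applications that cannot run against target_schema."""
    return [
        c.name
        for c in contracts
        if not (c.min_schema <= target_schema <= c.max_schema)
    ]


# Illustrative contracts for three applications sharing one database.
CONTRACTS = [
    AppContract("billing", min_schema=3, max_schema=5),
    AppContract("reporting", min_schema=4, max_schema=6),
    AppContract("mobile-api", min_schema=2, max_schema=5),
]

ok_for_v5 = blocking_apps(5, CONTRACTS)       # every application supports version 5
blocked_for_v6 = blocking_apps(6, CONTRACTS)  # billing and mobile-api stop at version 5
```

Even in this toy form, the gate shows why a shared database slows everyone down: one application that has not caught up is enough to hold back a change that the others are ready for.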
Another challenge with database pipelines is the management of data in the database during the CI and CD stages. Data in production needs to be persistent and is ever-changing, even while the underlying database code is changing. During the CI process, sample data from production needs to be used to ensure testing is realistic, and new test data must be introduced to verify new database capabilities are working.
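As a rough sketch of how a CI stage might assemble its test data, the example below samples rows from a production extract, masks the sensitive columns, and appends new rows that exercise a new capability. The function names, the salt value and the column names are assumptions made for illustration; a real pipeline would use whatever masking facility its database platform provides.

```python
import hashlib
import random


def mask_value(value, salt="ci-salt"):
    """Replace a sensitive value with a deterministic, non-reversible token."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "masked-" + digest[:12]


def build_test_rows(production_rows, sensitive_columns, sample_size, new_rows=(), seed=0):
    """Sample production rows, mask sensitive columns, then append new test rows."""
    rng = random.Random(seed)  # fixed seed keeps CI runs repeatable
    sample = rng.sample(production_rows, min(sample_size, len(production_rows)))
    masked = [
        {k: mask_value(str(v)) if k in sensitive_columns else v for k, v in row.items()}
        for row in sample
    ]
    return masked + list(new_rows)


# Illustrative production extract.
production = [
    {"id": i, "email": f"user{i}@example.com", "plan": "basic"} for i in range(1, 11)
]
# New row that exercises a hypothetical new "trial" plan.
new_cases = [{"id": 1001, "email": "masked-newcase", "plan": "trial"}]

test_rows = build_test_rows(production, {"email"}, sample_size=3, new_rows=new_cases)
```

The sampled rows keep their realistic shape and non-sensitive values, while the masked column no longer exposes production data to the test environment.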
During the CD stages, rollbacks to prior versions of database code must avoid rolling back to versions that are no longer compatible with production data. This is a key reason why database changes must follow a disciplined approach that fully tests database changes with applications prior to full deployment. Rolling deployments, in which applications and database changes are deployed gradually in separate clusters, can minimize the blast radius in case of problems. However, the required coordination of separate systems adds complexity and bottlenecks.
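One way to picture the rollback constraint is as a guard that refuses any rollback crossing a migration that cannot be safely undone against current production data. In this sketch the Migration type and its backward_compatible flag are hypothetical, and versions are assumed to be contiguous integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Migration:
    version: int
    description: str
    # Assumed flag: True if undoing this change leaves production data usable.
    backward_compatible: bool


def safe_rollback_floor(current_version, migrations):
    """Return the lowest version reachable without undoing an incompatible migration."""
    floor = current_version
    for m in sorted(migrations, key=lambda m: m.version, reverse=True):
        if m.version > current_version:
            continue  # not deployed yet, irrelevant for rollback
        if not m.backward_compatible:
            break  # undoing this change would strand production data
        floor = m.version - 1
    return floor


def can_roll_back(target_version, current_version, migrations):
    """True if rolling back to target_version stays at or above the safe floor."""
    floor = safe_rollback_floor(current_version, migrations)
    return floor <= target_version <= current_version


# Illustrative history: version 3 rewrote existing data and cannot be undone safely.
HISTORY = [
    Migration(1, "create customers table", True),
    Migration(2, "add nullable email column", True),
    Migration(3, "split name into first and last name", False),
    Migration(4, "add index on email", True),
    Migration(5, "add nullable phone column", True),
]

floor = safe_rollback_floor(5, HISTORY)  # 3: versions 4 and 5 can be undone, 3 cannot
```

A CD stage could call can_roll_back before attempting a rollback, so that an incompatible target is rejected by the pipeline rather than discovered in production.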
So, why not simply integrate database changes as artifacts within the same DevOps pipeline as the application?
Gilad David Maayan’s September 2020 article, “DevOps Database Strategies and Challenges: Automation, Scalability, and More,” explained some of the specific challenges that arise when integrating traditional databases with DevOps.
Database tools were not designed for integration with DevOps tools. They tend to be database- and environment-specific, which can make them difficult to incorporate into pipelines designed for flexibility.
Database performance and ease of management often suffer in DevOps environments where multiple applications need to use the database within a short time, as happens with the accelerated lead times of DevOps continuous delivery. This is especially true with larger-scale databases. If an application change must sync with a database schema, changing the pipeline can become complicated, especially if there is a need to sync across multiple application pipelines to meet continuous delivery goals.
While DevOps processes are proven for continuous delivery of applications, incorporating database changes into the DevOps pipeline is hindered by large data sets, limited database automation and security concerns. As a result, database development and testing continue to rely on outdated processes, with fixed and often shared instances refreshed from backups. A new DevOps database platform is needed to provide highly dynamic and efficient cloud-native database solutions for modern enterprise digital transformations, and that platform must include DataOps use cases for DevOps, DevSecOps, continuous testing and governance. This new kind of data repository must be designed to integrate seamlessly with DevOps CI/CD pipelines through flexible APIs and cloud-native architectures.
Given the points above, traditional database solutions and parallel data pipeline strategies are not sufficient to meet the challenges of enterprise DevOps transformations. What is needed is a new kind of DevOps database platform that allows the database, and database changes, to be as agile as any other artifact in the DevOps pipeline.
Paul Stanton, VP of product management at Windocks, in his June 2021 YouTube video, “Windock DevOps Data Repo,” said, “Imagine terabyte class database environments managed as DevOps artifacts with entire versioned databases available in seconds.” That’s what we need.
The following are examples of the capabilities required for such a new DevOps database platform.
• Automated database provisioning
• Distributed repositories support
• Simple declarative builds
• Dynamically scalable
• Virtual hard drive images for a wide variety of DBs – Linux, SQL, Oracle and others
• Distributed versioned database environments that work on standard file systems and storage systems and deliver writable data at the speed of DevOps – in other words, in seconds.
• Deploys equally to all operating systems and cloud environments, Docker containers and Kubernetes.
• Many clones per virtual machine (VM)
• Security features including access management, data masking and encryption
• Compatible with and easily integrates with CI/CD platforms
• Compatible with multi-cloud (AWS, Azure, GCP)
• Data security and governance built into DevOps data.
I will explain these more completely in a subsequent article.
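As a rough illustration of how "many clones" and "writable data in seconds" from the list above could coexist, the sketch below models a clone as a thin writable layer over a shared read-only image, so creating a clone copies no data. This copy-on-write layering is my assumption about one plausible approach, not a description of how any specific product works; the dictionary stands in for a database image and the key names are made up.

```python
from collections import ChainMap
from types import MappingProxyType

# Stand-in for a large, versioned database image shared by every clone.
BASE_IMAGE = MappingProxyType({
    "customers:1": "Ada",
    "customers:2": "Grace",
})


def create_clone(base):
    """Return a writable view: reads fall through to base, writes stay in the clone."""
    return ChainMap({}, base)


clone_a = create_clone(BASE_IMAGE)
clone_b = create_clone(BASE_IMAGE)

clone_a["customers:1"] = "Ada Lovelace"  # change is visible only in clone_a
clone_b["customers:3"] = "Edsger"        # new row is visible only in clone_b

# Each clone stores only its own differences from the shared image.
clone_a_delta = dict(clone_a.maps[0])
clone_b_delta = dict(clone_b.maps[0])
```

Because each clone holds only its own changes, provisioning cost is independent of the size of the base image, which is the property that would let a pipeline hand every test run its own writable database.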
What This Means
Digital transformations are imperative for modern enterprises and organizations if they are to remain competitive. However, the transformations of many enterprises and organizations are being seriously impeded by having too many database platforms and tools that do not integrate well and that bottleneck continuous delivery goals. A new kind of DevOps database platform, in which the database is treated like any other artifact, is needed to resolve the technical problems of integrating databases into DevOps application pipelines. Such a platform is needed to accelerate time-to-value and to realize efficient, safe operation for digitally transforming enterprises and organizations.