The pace of change can be managed successfully by defining service level objectives and more in dev environments
Mobile applications, data lakes, microservices, data visualizations, SaaS integrations, automations, IoT data streams and machine learning models are all developed, deployed and enhanced faster today than ever before. That holds in proofs of concept, pilots and scaling production environments, and for customer-facing capabilities and employee workflows alike.
I spoke to Jason Walker, field CTO at AIOps platform BigPanda, about how the speed of deployment, the breadth of technology services that transforming businesses are developing, greater security threats, and rising reliability and performance requirements affect IT Ops.
Walker believes that of all the things we’re trying to do in IT—more, faster, smarter, safer, innovative, secure, reliable—it’s the speed that’s the driving force. “The most significant impact is velocity; the dev-test-deploy cycle time is drastically reduced,” he said. “Without the right guardrails, that breeds unnecessary complexity and a gradual loss of operational awareness.”
He explained that many of the barriers that once slowed down development teams are addressable today when developing service-based architectures on the cloud. “Developers realize that the traditional constraints, either technical dependencies or organizational capability, are much reduced,” Walker noted. “Developing in the cloud for a cloud-based service, leveraging an ecosystem of microservices for inputs, an agile team can move very quickly.”
IT Ops can’t easily say “no” or “slow down” to business stakeholders investing in digital transformation to improve customer experiences and gain competitive advantages with data, analytics and machine learning. Some IT leaders attempted that command-and-control approach during the early days of public clouds, but today, DevOps practices, SRE responsibilities and AIOps capabilities are integral to mainstream IT Ops teams in keeping up with transformational velocities.
So instead of saying “no,” progressive IT Ops teams say “yes, but” by defining service level objectives (SLOs), capturing service level indicators (SLIs) and managing to error budgets.
Walker agreed. “SLIs, SLOs and error budgets are a very useful way to manage the critical inputs and outputs at the interfaces between microservices and at a high level across the business service, allowing developers to keep changing the ‘interior’ pieces.”
These tools change the operating model and mindset across the entire IT organization by exposing trade-offs to business stakeholders. For example, if a web application has a 99.9% SLO, the whole IT team has a 0.1% error budget. If the SLO is missed, a service level policy identifies areas of investment to improve performance, reliability, security and automation or to address technical debt.
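To make the arithmetic behind the 99.9% example concrete, here is a minimal Python sketch of how an SLI, an error budget and budget consumption relate. The function names, the request counts and the 30-day window are illustrative assumptions, not something Walker or any particular tool prescribes.

```python
def sli(good_events: int, total_events: int) -> float:
    """Service level indicator: fraction of events that met the objective."""
    if total_events == 0:
        return 1.0  # assumption: no traffic counts as fully compliant
    return good_events / total_events


def error_budget_events(slo_target: float, total_events: int) -> float:
    """Number of events that may fail before the SLO is missed."""
    return (1.0 - slo_target) * total_events


def budget_consumed(slo_target: float, good_events: int, total_events: int) -> float:
    """Share of the error budget already spent (1.0 means fully spent)."""
    allowed = error_budget_events(slo_target, total_events)
    failed = total_events - good_events
    if allowed == 0:
        return 0.0 if failed == 0 else float("inf")
    return failed / allowed


def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Time-based view of the same budget over a rolling window."""
    return (1.0 - slo_target) * window_days * 24 * 60


if __name__ == "__main__":
    target = 0.999                    # the 99.9% SLO from the example above
    total, good = 1_000_000, 999_400  # hypothetical request counts
    print(f"SLI: {sli(good, total):.4%}")
    print(f"Budget consumed: {budget_consumed(target, good, total):.0%}")
    print(f"Allowed downtime: {allowed_downtime_minutes(target):.1f} min per 30 days")
```

With these made-up numbers, 600 failed requests against a budget of roughly 1,000 means about 60% of the budget is spent, and a 99.9% target over 30 days leaves roughly 43 minutes of allowable downtime. That is the kind of trade-off a service level policy can put in front of business stakeholders.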
Defining service level objectives helps bring business stakeholders, development teams, SREs and IT Ops together to align on reliability objectives and trade-offs. It’s an important step, but not sufficient for teams that want to exceed service levels during digital transformation.
Walker offered several technical recommendations for IT Ops groups transitioning to service level objectives:
These are balanced recommendations, with the first two focused on how development teams engineer microservices and the last two on how IT Ops teams use monitoring and AIOps to implement actionable SLIs. The middle two recommendations on knowledge and change management processes help the entire organization stay in sync through a fast-paced operating environment.
“The velocity, flexibility and variability of this type of development means leaders at all levels need to understand and align the strategic and tactical goals, and to prevent their teams from drifting away from business goals,” Walker said.
So, the business leaders align on service level objectives, development teams engineer observable containerized microservices, IT Ops executes a monitoring strategy and the CIO ensures communication, collaboration and knowledge sharing. Is that all that’s required?
The issue is that most IT Ops teams are understaffed and get overwhelmed supporting the new cloud-native microservices, legacy systems and everything in between. To address the gap, IT leaders are investing in AIOps, and even hundred-year-old enterprises are successful at adopting machine learning and automation to accelerate IT Ops.
Automation and machine learning event correlation applied to monitors, alerts and observable artifacts are the force multipliers. Open-box machine learning enables IT Ops to triage incidents and improve their mean time to resolution, while automation reduces manual efforts and keeps teams in sync. Organizations that are modernizing applications and supporting hybrid clouds require these capabilities to manage the complexities, run at business speed and manage databases, microservices and applications to higher service level objectives.
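As a rough illustration of what event correlation does, the sketch below groups alerts from the same service into one incident when they arrive close together in time. It is a deliberately simple rule-based heuristic, not machine learning and not how BigPanda or any specific AIOps product works; the `Alert` and `Incident` types, the `correlate` function and the five-minute window are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    service: str
    timestamp: float  # seconds since epoch
    message: str


@dataclass
class Incident:
    service: str
    alerts: list = field(default_factory=list)


def correlate(alerts, window_seconds: float = 300.0):
    """Group alerts per service; an alert joins the service's latest
    incident if it arrives within window_seconds of that incident's
    most recent alert, otherwise it opens a new incident."""
    incidents = []
    latest_by_service = {}
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = latest_by_service.get(alert.service)
        if (current is not None
                and alert.timestamp - current.alerts[-1].timestamp <= window_seconds):
            current.alerts.append(alert)
        else:
            current = Incident(service=alert.service, alerts=[alert])
            incidents.append(current)
            latest_by_service[alert.service] = current
    return incidents


if __name__ == "__main__":
    raw = [
        Alert("checkout", 0, "latency high"),
        Alert("search", 30, "error rate high"),
        Alert("checkout", 60, "5xx spike"),
        Alert("checkout", 1000, "latency high"),
    ]
    for incident in correlate(raw):
        print(incident.service, len(incident.alerts))
```

Even this naive grouping collapses four raw alerts into three incidents for a responder to triage. Production systems would typically correlate on richer signals such as topology, change events and learned patterns.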
As Walker noted, increasing velocity is important for responding to customer opportunities and changing conditions. Driving faster digital transformations is more achievable today when IT Ops leverages automation and AIOps to stabilize speed with reliability and performance.