DevOps.com

5 Ways to Reduce DevOps Toil

By Kenneth Rose on July 29, 2021

Over the last several years, DevOps has become a bit of a buzzword. It has become simultaneously a practice, a culture, a team, a job title and a vendor product. You can hire some DevOps, buy some DevOps, adopt DevOps and sprinkle a little bit of DevOps on top for good measure.

But, at its core, DevOps is about service ownership: having a single team with end-to-end ownership of software running in production. You should not have one team building software and another team deploying and operating it. That’s slow. DevOps is literally combining development and operations into a single consolidated team. DevOps means embracing the mantra of “you build it, you own it.”

The primary benefit of DevOps is increased speed and agility of engineering teams. With DevOps, exactly one team is responsible for a given piece of production software. There are no handoffs or knowledge silos. Practices like continuous delivery can emerge. Features can be put in front of your customers’ eyes with a shorter turnaround time. A secondary benefit is increased reliability and security. With “you build it, you own it,” it is no longer someone else’s job to ensure that software is correct or performant or reliable or secure. These responsibilities are “shifted left” onto exactly one team.

So, where do things go off the rails?

Service ownership and DevOps mean giving teams the agency to make whatever changes are necessary to their software. That can and will come at the expense of product functionality at times. So, the first place things go off the rails is when there is not a healthy balance between product needs and engineering needs. If teams are constantly working on new features but do not have the capacity to fix operational issues that are affecting them, product quality will eventually suffer. Ultimately, product-engineering imbalance generally comes from misaligned incentives or poorly defined goals from management.

The second place where things go off the rails is insufficient training. Organizations early in their DevOps adoption, which may still have separate development and operations teams, need a lot of training on new technologies and processes. Developers need to learn about observability, responding to incidents and debugging production applications. Operators and sysadmins need to learn about coding and the principles of software design. These are new skills for both camps, and investment in training and mentorship is required.

What are some best practices for keeping DevOps on track?

There are many, and which ones matter most depends on your starting point as an engineering organization. Here are five important ones:

1. Track Service Ownership

If DevOps is about “you build it, you own it,” you need an authoritative list of a) what’s running in production and b) which team owns it. If a piece of software goes down, you do not want to find out that the last person to work on it was the intern who left three years ago. Many folks start tracking this kind of information in spreadsheets or wikis, but keeping the data up to date becomes challenging. SaaS service catalogs have emerged as a way to automatically keep this data accurate, complete and consistent.
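To make the idea of an authoritative ownership list concrete, here is a minimal sketch in Python. It assumes a catalog that maps each service name to an owning team and compares it against what is running in production. The `Service` type, the `find_unowned` helper and the sample data are all hypothetical; a real service catalog would pull both sides from your own tooling.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Service:
    name: str
    owner_team: Optional[str]  # None means nobody has claimed the service


def find_unowned(running: List[str], catalog: Dict[str, Service]) -> List[str]:
    """Return services running in production with no owning team on record."""
    unowned = []
    for name in running:
        entry = catalog.get(name)
        # Flag services missing from the catalog as well as those without an owner.
        if entry is None or not entry.owner_team:
            unowned.append(name)
    return sorted(unowned)


# Hypothetical data for illustration only.
catalog = {
    "checkout": Service("checkout", "payments"),
    "search": Service("search", None),
}
running_in_production = ["checkout", "search", "legacy-reports"]

print(find_unowned(running_in_production, catalog))
# prints ['legacy-reports', 'search']
```

A check like this could run on a schedule, so gaps in ownership surface before an outage does.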

2. Adopt Continuous Integration, Continuous Delivery and Feature Flags

Committing small changes multiple times per day is superior to release trains where one large change is committed on an infrequent basis. CI/CD allows for new features and improvements to be tested faster and put in front of customers more quickly. Feature flag tools let you easily test new functionality and product hypotheses for different segments of customers.
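One common way feature flags target a segment of customers is a percentage rollout. The sketch below is an illustration under stated assumptions, not how any particular feature flag product works: it hashes a flag name and user ID into a stable bucket from 0 to 99. The `is_enabled` function, the flag name and the user ID are hypothetical.

```python
import hashlib


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Deterministically place a user in or out of a percentage rollout."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < rollout_percent


# Hypothetical flag and user, for illustration only.
if is_enabled("new-checkout", "user-42", rollout_percent=10):
    print("show new checkout")
else:
    print("show old checkout")
```

Because the bucket depends only on the flag name and the user ID, a given user sees the same behavior on every request, and raising the percentage only adds users to the rollout without removing anyone already in it.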

3. Embrace Incident Management

Software breaks sometimes. Having a robust process for how to triage and respond to both alerts and incidents is important. Within the context of DevOps, the team that owns a piece of software should be responsible for being on-call to respond to associated incidents. If the software breaks or behaves in some unexpected way, the team is expected to fix it. The team is also responsible for identifying what happened and investing in ways to ensure the same error does not occur again.

4. Treat Everything as Code, Then Automate all the Things

Applications and microservices? That’s code. Infrastructure? That’s code, too. Security policies? Also code. Deployments? You guessed it–code.

Treating everything as code allows for processes like “code review” to emerge beyond just your applications. You can have multiple eyes on every change, which catches bugs sooner and spreads knowledge to others. In addition, version control systems (like Git) are a natural audit log, allowing anyone to see what was changed, when and by whom.

With everything defined as code, the next step is to invest in automation. How many steps does it take for a developer to deploy new code to production? Or to spin up new infrastructure? Or to hook up monitoring or error tracking tools?

Your goal should be “one step”: run a command. It can be a shell script, a Slack bot or a button in a web UI. But it should always be one step. If there are more, you’re now in “manual steps” territory, and that’s brittle. You have no guarantee that successive individuals will run the same steps in the same order, which can lead to divergence in how various parts of production are set up.
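As a rough sketch of what “one step” can look like, the Python script below wraps a fixed sequence of steps behind a single entry point, so everyone runs the same steps in the same order. The step functions are hypothetical placeholders; real ones would call your own test, build and deploy tooling.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_pipeline(steps: List[Step]) -> List[str]:
    """Run every step in order and return the names of the steps that completed."""
    completed = []
    for name, action in steps:
        print(f"==> {name}")
        action()  # any exception propagates and aborts the remaining steps
        completed.append(name)
    return completed


# Hypothetical placeholders that do nothing; swap in real tooling calls.
def run_tests() -> None:
    pass


def build_artifact() -> None:
    pass


def deploy_to_production() -> None:
    pass


DEPLOY_STEPS: List[Step] = [
    ("run tests", run_tests),
    ("build artifact", build_artifact),
    ("deploy to production", deploy_to_production),
]

if __name__ == "__main__":
    run_pipeline(DEPLOY_STEPS)
```

The order lives in one reviewed, version-controlled list rather than in anyone’s memory, and a failing step stops the run before later steps can execute.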

Another candidate for automation is maintaining the list of a) what’s running in production and b) who owns it. Manually keeping that list up to date becomes impossible as your production infrastructure or engineering team grows. You’re better off investing in service catalog tooling that can automatically integrate and capture this information.

5. Invest in Reducing Toil

Service ownership and DevOps are two-way streets between management and their engineering teams. On one hand, consolidated ownership will lead to faster delivery and more reliable software. That’s good for management. But, on the other hand, teams require actual empowerment and agency to make decisions about their software. Part of that is having capacity on their roadmaps to make improvements to issues that are causing toil.

For example, if a particular service is paging a lot or experiencing load issues, service ownership means giving the owning team the time to make any necessary changes. That will come at the short-term expense of new product functionality.
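Deciding which services deserve that time is easier with data. As a minimal sketch, assuming you can export a log of pages with one entry per page, the Python below counts pages per service and keeps those at or above a threshold. The `noisy_services` helper, the threshold and the sample log are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def noisy_services(pages: Iterable[str], threshold: int) -> Dict[str, int]:
    """Count pages per service and keep those at or above the threshold."""
    counts = Counter(pages)
    # most_common() yields services from most paged to least paged.
    return {svc: n for svc, n in counts.most_common() if n >= threshold}


# Hypothetical page log for one week: one entry per page, named by service.
pages_this_week = ["search", "checkout", "search", "search", "reports", "checkout"]

print(noisy_services(pages_this_week, threshold=2))
# prints {'search': 3, 'checkout': 2}
```

A report like this gives a team and its management a shared starting point for deciding where roadmap capacity should go.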

There’s no free lunch on this last point. Engineering management can’t just combine dev and ops responsibilities and magically expect results. Service ownership is a two-way street. Otherwise, toil will kill your teams in the long term.

Filed Under: Blogs, DevOps Culture, DevOps Practice, Leadership Suite Tagged With: devops, DevOps processes, feature flags, incident management, toil

Powered by Techstrong Group, Inc.

© 2023 · Techstrong Group, Inc. All rights reserved.