DevOps.com

How To Build a Culture of Resilience Through Good Habits

By: Matthew Fornaciari on June 30, 2020

Good habits are hard to form. I’ve been listening to the audiobook “Atomic Habits” by James Clear on my morning runs, and something struck me. At Gremlin, along with our software, what we’re trying to promote are positive new habits for our customers. According to the author, one of the primary reasons new habits don’t stick is that there’s often sacrifice without immediate gratification.

Psychologically, we’re wired to want instant gratification. But not all habits give immediate rewards; in fact, many delay our gratification for some time. So how do we help ourselves pursue good habits? Put simply: We need to make them obvious, attractive and easy.

One of the ways to make them easy is to focus on what the author calls “gateway habits”—the smallest piece of the habit that can reasonably be achieved in two minutes. So if your goal is to eventually run a marathon, the gateway habit is putting on your running shoes every day.

To build a culture of resilience at your company, start small and create gateway habits. If a team runs a GameDay (time dedicated to experimenting on your systems) once a month, or even simply runs its first chaos engineering experiment, then reward that person or team with immediate recognition.

Here are some other ways to build a culture of resilience at your organization:

  • Recognize the change to new habits.
  • Create DNR (do not repeat) items.
  • Adopt “You build it, you own it.”
  • Track the four golden signals.

Recognize the Change to New Habits

Incentives are a great way of kick-starting a new habit, but they don’t necessarily sustain the good behavior. It’s identifying the improvements that result from the new habit that really makes it stick. We’ll get to some specific metrics you can track later in the article, but notice for now that identifying the improvements is when gratification starts to drive enthusiasm. Ideally, that enthusiasm grows until the new habit becomes part of your identity.

In our example of running a marathon, the moment of most significant change is when the person starts to self-identify as a runner. Then, the habit is no longer a chore, but rather part of who they are. Ideally, we want all computer engineers to adopt a specific set of habits until they consider themselves site reliability engineers (SREs) as part of their identity. That’s when the habit is solidified. 

Create DNR (Do Not Repeat) Items

Engineers and product managers want to ship new products and features. There’s nothing quite as satisfying in software development as deploying new code and seeing what you built running out in the world. But, if what you built is consistently breaking or providing a bad user experience, then you are hurting your customers and ultimately your business.

To make sure we are always learning and getting better, at Gremlin we have what we call “DNR,” or Do Not Repeat work. This work consists of action items from outages and incidents that must not ever be repeated, lest we fail to learn our lesson from these failures. Practically, what this means is all feature work is halted until the issues highlighted as DNR work are remedied and the fixes are verified. In other words, you don’t get to write new code until your old code is fixed. We all know that many teams struggle with the trade-offs of moving fast, but ultimately, if you don’t have strict guidelines in place, then more often than not engineers will prioritize shipping something over making sure it’s reliable.

Creating a DNR item is an easy way to incentivize the behavior you want to see internally by appealing to the engineer’s desire to produce new features. We convince them to write better code because better code means they get to spend less time fixing things.
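
As a rough illustration of the rule above, here is a minimal sketch of a DNR gate. The names `DnrItem`, `open_dnr_items` and `feature_work_allowed` are hypothetical and are not taken from any Gremlin tooling; the sketch assumes each item records whether it has been fixed and whether that fix has been verified.

```python
from dataclasses import dataclass


@dataclass
class DnrItem:
    """One Do Not Repeat action item from an incident (hypothetical shape)."""
    title: str
    fixed: bool = False
    verified: bool = False  # assumption: set once the fix has been confirmed


def open_dnr_items(items):
    """Return the DNR items that are not yet both fixed and verified."""
    return [item for item in items if not (item.fixed and item.verified)]


def feature_work_allowed(items):
    """Feature work may resume only when no DNR item is still open."""
    return not open_dnr_items(items)


backlog = [
    DnrItem("Add retry with backoff to payment client", fixed=True, verified=True),
    DnrItem("Alert on queue depth before it fills", fixed=True, verified=False),
]
for item in open_dnr_items(backlog):
    print("Blocking feature work:", item.title)
```

A check like this could sit in a planning script or a CI step; the point is only that "fixed" alone does not clear an item, "fixed and verified" does.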

Adopt “You Build It, You Own It”

This is the driving principle of DevOps. It is the reason behind shifting left. When the team that develops the software is different from the team that operates it, then there’s a misalignment of incentives. If I am a developer being tracked (and promoted) solely on the amount of code I ship, my focus will be on getting more bits out the door and not on ensuring the features I release will withstand the burdens of operation. That’s another team’s concern.

That’s the motivation behind the proliferation of the “you build it, you own it” mindset at top-performing organizations. Hell, my first day at Amazon, they tossed me a pager and said “Good luck.” And while that may sound daunting (and it was), I can tell you that it not only motivated me to make sure I built systems to last, but it also fundamentally changed the way I thought about architecting systems. Spoiler: I’m not a big fan of my pager going off at 3 in the morning.

In other words, the person or team building the system needs to be the same person or team that feels the pain if that system is failing. But it’s not just about punishment and pain; that team also needs to be recognized and rewarded when their system is running reliably. This creates an alignment of incentives that promotes the kind of habits seen across top-performing teams.

Track the Four Golden Signals

In its chapter on monitoring distributed systems, Google’s SRE book outlines the four golden signals of monitoring: latency, traffic, errors and saturation. If I could wave a magic wand and immediately improve the culture of an organization, then I would have service-level objectives (SLOs) tied to these four metrics. But going back to habit formation: If you make acquiring a habit too difficult upfront, then an engineering team will ultimately reject it. So if your organization isn’t mature enough to create SLOs, simply beginning to track these metrics will up-level your game tremendously. They will give you an understanding of what is not working well and help guide your priorities.
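
To make "simply beginning to track these metrics" concrete, here is a minimal sketch that summarizes the four signals for one time window and compares them against optional thresholds. Everything in it is an assumption made for illustration: the `Request` record, the choice of 95th-percentile latency, the use of a single utilization number for saturation, and the threshold values. A real service would usually pull these numbers from its monitoring system instead.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One served request (hypothetical record shape)."""
    latency_ms: float
    ok: bool  # False if the request failed


def golden_signals(requests, window_seconds, saturation):
    """Summarize latency, traffic, errors and saturation for one window.

    saturation is passed in as a 0.0-1.0 utilization figure; which
    resource it measures (CPU, memory, queue depth) is up to the caller.
    """
    if not requests:
        raise ValueError("need at least one request in the window")
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank 95th percentile, using integer ceiling division.
    rank = (95 * len(latencies) + 99) // 100
    failed = sum(1 for r in requests if not r.ok)
    return {
        "p95_latency_ms": latencies[rank - 1],
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": failed / len(requests),
        "saturation": saturation,
    }


def slo_breaches(signals, slo):
    """Return the names of signals that exceed their SLO threshold."""
    return [name for name, limit in slo.items() if signals[name] > limit]


window = [Request(latency_ms=10.0 * i, ok=(i != 7)) for i in range(1, 21)]
signals = golden_signals(window, window_seconds=10, saturation=0.6)
print(signals)
print(slo_breaches(signals, {"p95_latency_ms": 150, "error_rate": 0.1}))
```

Teams that are not ready for SLOs can call only `golden_signals` and watch the trend; `slo_breaches` becomes useful once thresholds have been agreed on.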

To Summarize

Imagine a world where, after a major incident happens, the points of failure involved become DNR work. The same failure is not allowed to happen again and no new feature work will be completed until the fixes are implemented. And more importantly, no new feature work will be completed until those fixes are verified via a chaos engineering experiment, which is then cataloged and run continuously against your system. Then you take that knowledge and share it with other teams so they can run the same experiments and make sure they are immune to those failures as well.

This is how you build a culture of resilience.

Filed Under: Blogs, DevOps Culture, Leadership Suite Tagged With: devops culture, resiliency, SLO, SRE

Powered by Techstrong Group, Inc.

© 2022 · Techstrong Group, Inc. All rights reserved.