DevOps.com


SRE (Part 2): A Practical Approach

By: Anthony Caiafa on April 10, 2020

In my last piece on cultural SRE, I covered the basics of defining what SRE means for your team. However, some teams may not need dedicated SRE support. This is where a central organization comes in handy: its dedicated resources focus on building guidelines, standards, services and platforms that teams can consume so they can operate autonomously in your environment. A broadly communicated central team also lets other teams know where to go when they need assistance or would like recommendations on their technology stack.


Application/Service SRE Versus Infrastructure SRE

Application SRE includes:

  • Embedded for application/service teams.
  • Architecture guidance for new services and infrastructure. 
  • Design and implementation for modern technologies.
  • Support within the application/service team.
  • Automation for apps and services for the team.
  • Automation for operational work.
  • Monitoring/metrics for application/service team.
  • Benchmarking and performance assistance for code.
  • Infrastructure deployment and architecture for applications and services.
  • Documentation and runbooks for alerts issued by the application/service stack.

Infrastructure SRE includes: 

  • Build and management of “plumbing” technical infrastructure (provisioning, OS, DNS, DHCP, networking, central auth, etc.).
  • Automation of infrastructure services (telemetry, monitoring, log aggregation, configuration management, anomaly detection, orchestration, etc.).
  • Build and management of consumable services and tools (message queues, databases, distributed compute farms, API services/integrations, containers, etc.).
  • Infrastructure as Code (IaC).
  • Implementation of a global incident response and incident management (IR&IM) process.
  • Support for Application SRE teams.
  • Architectural guidelines and best practice documentation.

Organizational Layout

Organizational structure, whether we like to believe it or not, really helps the flow of communication throughout the team. It allows team leaders to form a single mission and have that mission carried out by their team members. I mentioned in Part 1 of my SRE Narnia guide that I believe this organization must be centralized to be successful. Below is a high-level overview that shows the separation between application SRE and infrastructure SRE.

The leads can handle multiple teams and disciplines, and the layout can shrink or grow as you see fit within your organization. Application SRE is straightforward: it is embedded in the application/service teams. Infrastructure SRE can either be embedded across existing infrastructure teams or form a new team focused on a specific project.

Example SRE Interaction

Below is an example interaction between an application team (this one runs a Ruby app) and an SRE hierarchy. You will see a team called “Frontline Support” as the initial point of contact. This team is not required, but it is nice to have. It helps in any environment, big or small, and can offload a significant amount of the operational workload while keeping a global view of the issues coming in.

The duties of frontline support are to follow runbooks for alarms flowing in from the monitoring systems. The team is similar to a traditional NOC, but with more expertise across the stack it is watching. It also makes trend analysis possible, which brings data to conversations about recurring issues or bugs in the system.

I do recommend frontline support as a starting point for junior-level team members. It allows them to learn the system quickly and buddy up with more seasoned members who are either on this team or have the pager for the week. It is a great way to train and onboard your technical employees. Nothing works better than showing them what is broken.
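
The frontline support duties described above (following runbooks for incoming alarms, handing off what has no runbook, and spotting recurring issues) could be sketched roughly as follows. This is only an illustration, not a description of any particular tool: the team names, alert names, runbook paths, and the `route` and `recurring` helpers are all hypothetical, and a real setup would read alerts from your monitoring system rather than a hard-coded list.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical runbook index: alert name -> runbook location.
RUNBOOKS: Dict[str, str] = {
    "ruby-app-high-latency": "runbooks/ruby-app/high-latency.md",
    "ruby-app-5xx-rate": "runbooks/ruby-app/5xx-rate.md",
}


@dataclass(frozen=True)
class Alert:
    name: str
    layer: str  # assumed to be "application" or "infrastructure"


def route(alert: Alert) -> Tuple[str, Optional[str]]:
    """Return (owning team, runbook path or None) for an incoming alert."""
    runbook = RUNBOOKS.get(alert.name)
    if runbook is not None:
        # A runbook exists, so frontline support can work the alarm.
        return "frontline-support", runbook
    # No runbook: hand off to the SRE group that owns that layer.
    if alert.layer == "application":
        return "application-sre", None
    return "infrastructure-sre", None


def recurring(alerts: Iterable[Alert], threshold: int = 2) -> Dict[str, int]:
    """Count alerts by name and keep those seen at least `threshold` times."""
    counts = Counter(alert.name for alert in alerts)
    return {name: n for name, n in counts.items() if n >= threshold}


# Made-up sample data for illustration only.
alerts = [
    Alert("ruby-app-high-latency", "application"),
    Alert("dns-resolution-failure", "infrastructure"),
    Alert("ruby-app-high-latency", "application"),
]

for alert in alerts:
    print(alert.name, "->", route(alert))
print("recurring:", recurring(alerts))
```

The recurring-alert count is the kind of data frontline support can bring to conversations about repeat issues. An alert that keeps escalating because it has no runbook is also a hint that application or infrastructure SRE should write one.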

Summary

Overall, everyone’s journey will be different. There are quite a few books and examples out there, but please use them only as ideas to form your own opinion on how these organizations should work. There is no prescription and nothing is ever perfect. I have left quite a bit of information out of this post, but I am always interested in having conversations about this topic and others like it. There is one consistent theme you’ll see across the board, and that’s shaping the culture. Focus on your culture and hire great talent along with great leaders. Good luck and happy hacking.

— Anthony Caiafa

Filed Under: Blogs, DevOps Practice, Enterprise DevOps Tagged With: application SRE, Frontline Support, infrastructure SRE, SRE
