
Defining the Role of an SRE

Soft Skills Are Just as Important as Technical Skills in the SRE Role

Site reliability engineering (SRE) is a relatively new discipline, having only been in existence for about 15 years. It originated at Google and has gained popularity recently with more companies advertising SRE positions or trying to implement SRE practices. In the technology field, 15 years may seem like an eternity, but the SRE role is very much still in its infancy. There are still challenges defining the role and understanding exactly what it is. Just look through one of the 1,000-plus job listings on LinkedIn for an SRE in the United States—you will see many different job expectations and requirements.

Some may see the SRE role as being similar to DevOps, while others may see them as competing concepts. Liz Fong-Jones and Seth Vargo from Google created a video describing the differences and similarities between DevOps and Site Reliability Engineering. They point out that DevOps is a philosophy and SRE is a prescriptive way of accomplishing that philosophy.

Earlier this year, Catchpoint conducted a survey of 416 professionals with the title or responsibility of an SRE. The goal was to create a real-world profile of an SRE and their organizations. What we found was that some of the challenges in clearly defining the role may be due to differing needs and priorities based on industry and company size.

Being a good SRE requires a mix of technical skills and “soft skills.” It was a relief to see there were no variations in the skills required based on whether a person was an individual contributor, manager or executive. Consistency in articulating goals and expectations is crucial to being a successful SRE, and this is where some of the soft skills come into play. Someone may be good at automation, know a variety of scripting languages and be well-versed in application and network protocols, but that alone doesn’t make them a strong SRE. SREs need to know which questions to ask, not just how to apply technology to solve a problem. This requires being adept at problem-solving, working as part of a team and having exceptional communication skills.

The growing importance of soft skills cannot be overstated. West Monroe Partners recently conducted a survey of 600 HR and recruiting professionals who are responsible for hiring IT workers and 650 business professionals who work with IT staff on a regular basis. In the study, 42.9 percent of respondents said a lack of soft skills made it tough to find qualified candidates for technical jobs. When asked which were the most important soft skills for IT, 77.2 percent named teamwork and collaboration, and 63.4 percent named verbal communication.

The SRE is fast becoming a mandatory position at many organizations as continuous delivery of software applications becomes the norm. Development teams must work within dramatically condensed time frames, and unexpected performance incidents will arise. The SRE, who is responsible for increasing the reliability of supported services, needs to be able to communicate effectively why and how a proposed code or infrastructure change will impact reliability. This is why strong communication skills are so necessary.

The Catchpoint survey also highlighted the need for alerting and notification tools. A full 90 percent of SREs responding noted they cannot live without such tools. These enable them to identify conclusively where the source of a problem lies, thus avoiding time wasted blaming or war-rooming, all while supporting a clear communications process.

If your organization does not yet have an SRE, chances are it soon will. What are some things both employers and employees need to be thinking about?

Advice for Employers

Think about your organization’s specific needs, not whether your organization is doing exactly what is in Google’s SRE book. The SRE role at a large organization may look very different than at a small startup. Larger organizations generally have more headcount, which means people are able to specialize in certain areas. If one team member isn’t familiar with a certain technology or toolset, others on the team with that knowledge can pick up the slack. Smaller organizations, however, often cannot hire experts in every area, so they need people who can do more or who know a little about many topics. A strong SRE candidate for a large organization may excel at only a single technical skill and still be successful. A strong SRE candidate at a smaller organization may need a wider arsenal of technical skills.

Advice for Potential SRE Candidates

Workers considering a move to an SRE role should understand that technical skills and soft skills are equally vital. Your job will consist of approximately 50 percent traditional IT ops functions—collecting and analyzing logging and diagnostic information, participating in on-call rotations and proactively monitoring and reviewing application performance, for example. You will also need to have good problem-solving skills and communicate clearly to identify and address incidents with your team. Being able to ask the right questions is critical to determining why systems and applications are not operating correctly. Collaboration is at the heart of the SRE role and that requires flexibility, teamwork and solid verbal skills.

Also understand that while you may still have one foot in the developer world, your role will be quite different from other roles on the development team. The majority of SREs do not develop new features or contribute to product road maps, a stark contrast to other roles on the development team. SREs write software to automate tasks such as configuration management; they also write software or scripts to monitor systems and applications and to help scale and build resiliency in these applications.

Conclusion

The real-world SRE job requisition is still evolving, and may look different from one organization to another. One thing we know for sure is that today’s high-intensity continuous delivery environments demand a cool head—someone who can toggle seamlessly across IT ops and development and successfully bring together both sides around the common goal of delivering high-performing, innovative software and features. An SRE must possess a blend of technical as well as soft skills and, perhaps most of all, must be adept at communicating effectively and inspiring a teamwide approach to excellence.

Dawn Parzych


Dawn Parzych is Director of Product and Solution Marketing at Catchpoint. She enjoys researching, writing and speaking about trends related to application performance, user perception, and how they impact the digital experience. In her 15+ year career, Dawn has held a wide variety of roles in the application performance space at Instart Logic, F5 Networks and Gomez.
