JJ Tang

JJ is the co-founder of Rootly (YC S21), a Slack-native incident management solution. He is based in Toronto, Canada, and previously led product at Instacart and IBM. He is obsessed with developer productivity, F1, and his adopted dog.

What Does AIOps Mean for SREs?

March 15, 2022 | AIOps, devops, it automation, SRE

It seems SREs are of two minds when it comes to AIOps. On one hand, AIOps' potential is pretty exciting. By automating complex workflows and troubleshooting processes, AIOps could make your life ...

The Evolution of Incident Management

February 8, 2022 | incident management, resilience, site reliability engineering, SRE

Have you ever thought about the history of incident management? If you’re an SRE, you might be so caught up in the day-to-day work of managing reliability and responding to incidents that ...

Observability Vs. Monitoring: What’s the Difference?

October 27, 2021 | application performance monitoring, DevOps metrics, logs, observability, traces

As more and more IT organizations pivot from strategies rooted in monitoring to ones that aim to achieve observability, one of the first questions that teams need to answer is, “What is ...

Defining Availability, Maintainability and Reliability in SRE

October 25, 2021 | application availability, DevOps metrics, reliability, site reliability engineering, SRE

In the world of reliability engineering, you’ll frequently encounter the three “-ability” words: availability, maintainability and reliability. They sound similar and have similar meanings. In fact, these words may seem so similar ...

Why Observability and Monitoring Matter to SREs

October 18, 2021 | application performance monitoring, DevOps metrics, observability, SRE

Site reliability engineers, or SREs, do many things. They help developers build reliability into applications. They manage SLAs and SLOs. They play a leading role in incident management and incident response. For ...

5 Steps to Succeed With Blue-Green Deployment

October 12, 2021 | application performance management, blue-green deployment, chaos engineering, continuous testing

By allowing teams to maintain two production-ready environments at the same time, the blue-green deployment technique can significantly boost reliability. But blue-green deployment can also be difficult to execute and manage. Let's ...

Does Shift Left Matter to SREs?

October 11, 2021 | application observability, application performance management, shift left, SRE

If you’re a software engineer, you’ve likely heard all about shift left, a practice that can streamline certain aspects of software development. But shift left isn’t just for developers. It can be ...