PagerDuty Allies with JFrog to Modernize IT Incident Management

PagerDuty has integrated its IT incident management platform with the continuous integration/continuous delivery (CI/CD) platform from JFrog as part of an effort to streamline troubleshooting of application environments.

Steve Gross, senior director of strategic business development at PagerDuty, said the overall goal is to make it easier for IT operations teams and developers to collaboratively discover the root cause of issues by, for example, being able to identify builds within a JFrog pipeline that might be having a negative impact on a production environment.

At the same time, IT teams employing PagerDuty can now also receive notifications from JFrog Xray, a tool that discovers both vulnerabilities in open source software and license compliance issues.

Organizations of all sizes are attempting to minimize the impact of an IT incident by providing deeper insights into how applications were constructed. The goal is to reduce the amount of time IT teams currently spend looking for the root cause of a problem.

Ideally, IT teams would discover those issues before they impact end users. The challenge they face is that it can take weeks to discover the root cause of an issue that takes a developer a few hours, at most, to fix. Armed with insights from the CI/CD platform, Gross said it becomes easier for IT operations teams to proactively pinpoint the source of an issue in a way that provides the developer with enough insight to act.

That is critical because the last build of an application deployed in a production environment is usually the source of an issue, especially when that IT environment was operating normally before the build was promoted.
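To illustrate that reasoning, here is a minimal Python sketch that picks the most recent build promoted to production before an incident began. It is not how PagerDuty or JFrog implement the correlation; the Deployment type, the last_build_before function and the sample build names are all hypothetical, used only to show the idea.

```python
from datetime import datetime
from typing import NamedTuple, Optional, Sequence


class Deployment(NamedTuple):
    """Hypothetical record of a build promoted to production."""
    build: str
    deployed_at: datetime


def last_build_before(
    deployments: Sequence[Deployment], incident_start: datetime
) -> Optional[Deployment]:
    """Return the latest deployment at or before the incident, if any."""
    candidates = [d for d in deployments if d.deployed_at <= incident_start]
    # max() with default=None handles the case where nothing was deployed yet.
    return max(candidates, key=lambda d: d.deployed_at, default=None)


# Illustrative data only; these build names are made up.
history = [
    Deployment("web-app #41", datetime(2021, 3, 1, 9, 0)),
    Deployment("web-app #42", datetime(2021, 3, 2, 14, 30)),
    Deployment("web-app #43", datetime(2021, 3, 3, 11, 15)),
]
suspect = last_build_before(history, datetime(2021, 3, 2, 16, 0))
```

With this sample data, the first suspect is the build promoted at 14:30 on March 2, since it is the latest one deployed before the incident started.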

The data collected by PagerDuty from various DevOps platforms is fed into the PagerDuty tools via a Change Events application programming interface (API). Over time, PagerDuty will also be able to apply machine learning algorithms and other forms of artificial intelligence (AI) to identify issues long before they reach a meaningful level of disruption, Gross noted.
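As a rough illustration of what feeding build data into a Change Events API can look like, the Python sketch below assembles a change event and defines a helper that would post it. The endpoint URL and the routing_key, payload, summary, source, timestamp and custom_details fields reflect PagerDuty's public Events API v2 as best understood here and should be checked against the current documentation. The build_name and build_number details, the function names and the sample values are assumptions for illustration, not the fields the JFrog integration actually sends.

```python
import json
import urllib.request
from datetime import datetime, timezone
from typing import Optional

# Assumed endpoint for PagerDuty change events; verify before use.
CHANGE_EVENTS_URL = "https://events.pagerduty.com/v2/change/enqueue"


def build_change_event(
    routing_key: str,
    summary: str,
    source: str,
    build_name: str,
    build_number: str,
    timestamp: Optional[datetime] = None,
) -> dict:
    """Assemble a change event body describing one promoted build."""
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "routing_key": routing_key,
        "payload": {
            "summary": summary,
            "source": source,
            "timestamp": ts.isoformat(),
            # Hypothetical detail fields chosen for this sketch.
            "custom_details": {
                "build_name": build_name,
                "build_number": build_number,
            },
        },
    }


def send_change_event(event: dict) -> int:
    """POST the event as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        CHANGE_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Placeholder values; a real routing key comes from the PagerDuty service.
event = build_change_event(
    routing_key="YOUR_ROUTING_KEY",
    summary="Build web-app #42 promoted to production",
    source="ci-pipeline",
    build_name="web-app",
    build_number="42",
    timestamp=datetime(2021, 3, 2, 14, 30, tzinfo=timezone.utc),
)
```

The sketch keeps building the event separate from sending it, so the payload can be inspected or logged without making a network call; send_change_event(event) would only be invoked once a real routing key is in place.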

In general, there’s a movement to modernize IT incident management in a way that aligns with DevOps best practices as part of an effort to identify issues before an application is actually deployed. The hope is that those efforts will reduce the number of “war room” meetings that need to be convened to determine the root cause of a performance or security issue after an application has been deployed.

It may be a while before IT incident management processes are modernized to that level. However, as IT environments become more complex, in part because of the number of microservices-based applications with multiple dependencies now being deployed, it’s apparent that existing approaches to IT incident management don’t allow IT teams to respond fast enough to an issue. In fact, responding quickly is more challenging than ever now that many members of an IT team are working from home more often.

The pressure to modernize IT incident management increases in almost direct proportion to IT teams’ stress levels. The challenge now is to modernize those processes before the members of an IT team are too burned out to appreciate the effort.