Verica, a provider of an incident management platform, announced this week that it is launching a database that makes incident reports publicly available through a single repository.
Courtney Nash, senior research analyst at Verica, said the Verica Open Incident Database (VOID) initiative is part of an effort to enable organizations to learn from outages and failures caused by software. Today, reports about such incidents tend to be difficult to find because they are scattered all across the internet, she added.
That lack of transparency makes it difficult to compare and contrast incidents in a way that surfaces meaningful insights, noted Nash.
Launch partners for the VOID include Indeed, NS1, Adaptive Capacity Labs, Security Scorecard and Auxon. Each will provide support to help fund the development, research and dissemination of knowledge generated by the project.
The VOID contains more than just standard company postmortems or status updates, added Nash. Verica is committed to applying metadata to better understand how individuals, companies, media and others reacted during an incident.
One early finding from the initiative is that mean time to resolve (MTTR) is proving to be an inaccurate metric because of the varied distribution of incident data. Other findings include the fact that only one-quarter of incident reports identified the root cause of an issue, while fewer than 1% took the time to study near misses.
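To illustrate why an average can mislead when incident durations are unevenly distributed, here is a minimal sketch in Python. The durations are made-up values chosen for illustration, not data from the VOID, and the `summarize` helper is a hypothetical name. It compares the mean with the median and shows what share of incidents were resolved faster than the mean.

```python
import statistics

# Hypothetical incident durations in minutes (illustrative values, not VOID data).
# Nine short incidents and one day-long outage.
durations_min = [12, 15, 18, 20, 25, 30, 35, 45, 60, 1440]


def summarize(durations):
    """Return the mean, the median, and the share of incidents shorter than the mean."""
    mean = statistics.mean(durations)
    median = statistics.median(durations)
    share_below_mean = sum(1 for d in durations if d < mean) / len(durations)
    return {"mean": mean, "median": median, "share_below_mean": share_below_mean}


summary = summarize(durations_min)
print(summary)
```

In this invented sample the mean is 170 minutes while the median is 27.5 minutes, and nine of the ten incidents were resolved faster than the "average." A single long outage is enough to pull the mean far from what a typical incident looks like.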
In general, there’s now more riding on the performance and availability of IT environments than ever. Digital business transformation became all the rage in the wake of the economic downturn brought on by the COVID-19 pandemic. As the economy continues to recover, businesses now expect IT to respond and pivot to changing conditions no matter how sudden.
DevOps teams have, of course, learned to expect the unexpected by embracing modern incident management practices that enable IT teams to respond adroitly to any outage or sudden degradation in application performance. The most important thing to remember about incident management is that practice makes perfect. While practice runs may never precisely mirror an actual crisis, organizations whose IT incident teams know their processes well and rehearse them in ‘fire drills’ tend to respond both faster and better. After all, the more muscle memory there is to call on, the more resilient the organization will be.
It’s not clear to what degree IT organizations will be willing to share incident reports. However, there is an opportunity for IT professionals to come together in much the same way that engineers from different airlines collaborated to make flying safer for the general public, said Nash. In fact, with so many processes now dependent on software, it’s amazing there have not been more catastrophic events due to software issues. It’s even arguable that it’s only a matter of time before such levels of collaboration are mandated.
In the meantime, the first step toward achieving that goal may simply be sharing a report with colleagues so they can learn from previous mistakes.