Tag: SRE

Datadog Adds Raft of Additional AI Agents and Tools to Portfolio
Datadog at its DASH 2025 conference today previewed a raft of artificial intelligence (AI) agents that automate tasks ranging from assessing infrastructure alerts that normally require the expertise of a site reliability ...

Ciroos.AI Preps AI SRE Agents Trained to Automate Incident Management
Ciroos.AI this week emerged from stealth to provide early access to a set of artificial intelligence (AI) agents that have been trained to augment site reliability engineers (SREs). Fresh off raising $21 ...

Site Reliability Engineering State of the Union for 2024: Embracing Innovation and Efficiency in the Age of Generative AI
SRE practices are set to undergo significant transformations, driven by technological advancements and changing organizational needs ...

SRE in the Age of AI
Site reliability engineering (SRE) is a concept introduced by Google in 2004, and it has since been adopted by various leading software organizations. In its purest form, SRE is what you ...

Our Infrastructure is Still Expanding
Infrastructure is expanding in almost every possible way, and this creates more of a burden on every aspect of IT, specifically DevOps ...

Forget Shift Left: Why ‘No Shift’ is the Future of Software Innovation
A no shift strategy argues for developing and testing directly in production, bypassing the traditional dev-to-production delivery pipeline ...

SREs Say There’s Plenty of Room to Improve Incident Management
A global survey of site reliability engineers (SREs) found that diagnosing issues is the most difficult aspect of incident management ...

Unlocking Accountability: How Real-Time App Monitoring Empowers Engineering Teams
Real-time app monitoring is about fundamentally shifting your mindset toward a culture of accountability and continuous improvement ...

Broadcom Survey Surfaces Raft of IT Automation Challenges
A Broadcom survey found that islands of automation operating independently can cause organizations to miss SLAs ...

5 Reasons to Move Beyond SRE to Observability
SRE and observability are both often erroneously equated with monitoring. Organizations should shift focus from the former to the latter ...

Google Taps Nobl9 to Improve Services Reliability
Google added a capability, built on a platform developed by Nobl9, for tracking reliability according to how well SLOs are being met ...

Nobl9 Allies With Microsoft to Simplify SLO Creation
Nobl9 integrated its reliability center platform for managing SLOs as code with the Microsoft Azure Monitor service to simplify creation and tracking of service levels ...