Datadog at its DASH 2025 conference today previewed a raft of artificial intelligence (AI) agents that automate tasks ranging from assessing infrastructure alerts that would normally require the expertise of a site reliability engineer (SRE) to fixing code and triaging cybersecurity issues using the company’s security information and event management (SIEM) platform.
Datadog is also adding Proactive App Recommendations, which analyze telemetry data to suggest the next best action, and APM Investigator, which identifies and troubleshoots bottlenecks impacting application performance.
At the same time, Datadog is adding data observability tools and extending its service for monitoring large language models (LLMs) to support AI agents, along with previews of tools for testing and validating how prompt changes, model swaps or application changes affect the performance of LLM applications. The company has also added an AI Agents Console to make it simpler to discover and invoke its AI agents. Datadog LLM Observability, now generally available, monitors the integrity of AI models and performs toxicity checks that look for harmful behavior across prompts and responses, while Datadog Workload Protection monitors interactions between LLMs and their host platforms. Datadog Cloud Security and Sensitive Data Scanner (SDS) also help organizations meet compliance requirements.
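In practice, feeding that LLM Observability service means instrumenting the application that calls the model. Below is a minimal sketch of what that might look like using the ddtrace Python SDK’s LLM Observability interface; the application name, model and annotations are illustrative assumptions rather than a definitive setup:

```python
# Minimal sketch of instrumenting an LLM call for Datadog LLM Observability.
# Assumes the ddtrace Python SDK (pip install ddtrace) and a DD_API_KEY in the
# environment; all names here are illustrative placeholders.
from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.decorators import llm

# Enable LLM Observability for this application. Agentless mode sends data
# directly to Datadog; ml_app is an arbitrary application label.
LLMObs.enable(ml_app="support-chatbot", agentless_enabled=True)

@llm(model_name="gpt-4o", model_provider="openai")
def answer_question(prompt: str) -> str:
    # Stand-in for a real model call; replace with your provider's client.
    completion = "It looks like your deployment is healthy."
    # Annotate the span so the prompt and response are captured for
    # quality and toxicity evaluation in Datadog.
    LLMObs.annotate(
        input_data=[{"role": "user", "content": prompt}],
        output_data=[{"role": "assistant", "content": completion}],
    )
    return completion

if __name__ == "__main__":
    print(answer_question("Why is checkout latency spiking?"))
```

The annotated prompts and responses are the raw material for the service’s toxicity and integrity checks described above.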
Datadog is also making Datadog Code Security generally available, a service that leverages AI to detect, prioritize and remediate vulnerabilities in custom code and open-source libraries based on runtime threat activity and business impact. The company has also extended its log management suite to run in on-premises IT environments, meeting data sovereignty requirements in regulated industries, and to reduce costs with a lower-cost archive tier of storage.
Finally, Datadog has added a managed internal developer portal (IDP) that is integrated with the company’s core application performance monitoring platform. It provides a live system of record for telemetry data that surfaces what software is running, who is responsible for it and how it is performing. There are also templates, powered by Datadog’s App Builder and Workflow Automation, that automate tasks such as creating the scaffolding for a new service, along with engineering reports and scorecards that track compliance, reliability, security, observability, cost and other metrics.
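That system of record is built on service metadata: each entry records a service, its owning team and where to find its runbooks. As a rough illustration, a team could register a service programmatically through Datadog’s Service Definition API; the endpoint follows Datadog’s public v2 API, but every field value below is a placeholder assumption:

```python
# Hypothetical sketch of registering a service definition so it surfaces in
# a Datadog service catalog / developer portal. Field values are illustrative.
import os
import requests

definition = {
    "schema-version": "v2.2",
    "dd-service": "checkout-api",   # the service's canonical name
    "team": "payments",             # who is responsible for it
    "contacts": [
        {"type": "slack", "contact": "https://example.slack.com/archives/C123"}
    ],
    "links": [
        {"name": "runbook", "type": "runbook", "url": "https://example.com/runbook"}
    ],
    "tags": ["tier:1", "language:python"],
}

resp = requests.post(
    "https://api.datadoghq.com/api/v2/services/definitions",
    headers={
        "DD-API-KEY": os.environ["DD_API_KEY"],
        "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
    },
    json=definition,
    timeout=10,
)
resp.raise_for_status()
```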
In general, Datadog is making use of a range of proprietary and open-source LLMs, including an internally developed model optimized for time-series data. Over time, Datadog plans to mix and match LLMs to address specific use cases as additional AI advances are made.
Michael Whetten, vice president of product for Datadog, said, for example, that the latest Bits AI SRE, Bits AI Dev Agent and Bits AI Security Analyst agents not only extend the range of tasks being automated, but can also query data, analyze anomalies or scale infrastructure using shared memory in a way that makes it possible for AI agents to reuse functions. Datadog has also developed a Model Context Protocol (MCP) server to make it simpler for AI agents to consume its data and interoperate with each other. The overall goal is not to replace SREs, for example, but rather to augment their expertise in a way that makes it easier to manage complex IT environments, he added.
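Datadog has not published the tool schema for that MCP server, but the protocol itself is open, so the interaction pattern is straightforward to sketch. The snippet below uses the open-source mcp Python SDK; the server command and the search_logs tool name are hypothetical placeholders:

```python
# Illustrative sketch of an AI agent calling tools on an MCP server.
# Uses the open-source `mcp` Python SDK (pip install mcp); the server
# command and tool name below are assumptions, not Datadog's published API.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch the MCP server as a subprocess speaking MCP over stdio.
    server = StdioServerParameters(command="datadog-mcp-server", args=[])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover what the server exposes, then invoke one tool.
            tools = await session.list_tools()
            print([t.name for t in tools.tools])
            result = await session.call_tool(
                "search_logs",
                arguments={"query": "status:error service:checkout"},
            )
            print(result.content)

if __name__ == "__main__":
    asyncio.run(main())
```

Because tool discovery is part of the protocol, any MCP-capable agent can enumerate and call such tools without Datadog-specific client code, which is what makes the interoperability Whetten describes possible.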
It’s still early days as far as the adoption of AI technologies across DevOps workflows is concerned, but a recent Futurum Research survey finds that 41% of respondents expect generative AI tools and platforms to be used to generate, review and test code. In fact, arguably the challenge now is not so much deciding whether to use AI as determining where AI will have the biggest impact on improving both software quality and the speed at which applications are deployed.