A survey of 200 DevOps professionals at organizations generating $50 million to $500 million in revenue found that observability costs have become more difficult to control.
Conducted by Wakefield Research on behalf of Edge Delta, a provider of an observability platform, the survey found that nearly all respondents (98%) experience overages or unexpected cost spikes at least a few times a year, and 51% see them at least monthly. Only 1% of companies claimed their observability costs are not rising.
The most common causes of those cost spikes are product launches and updates (46%), followed closely by log data mistakenly included for ingestion (42%).
A full 93% said their leadership team is very or somewhat aware of rising observability costs, with 91% anticipating increased scrutiny to reduce costs in the next 12 months.
As a result, 84% agreed they’re paying more than they should for observability, even when they limit how much log data gets ingested.
Edge Delta CEO Ozan Unlu said it’s clear that while organizations are investing in observability to increase overall resiliency, the cost of that investment is higher than anticipated because the underlying platforms were not designed to consume, store and analyze data at the scale now required.
To rein in those costs, the survey found, DevOps teams are limiting log ingestion often or all the time (82%), and 98% also limit the amount of log data they collect in the first place.
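In practice, that limiting often takes the form of a filter or sampling stage upstream of the observability platform. The Python sketch below is a minimal illustration of the idea; the level names, the 10% sample rate and the record shape are assumptions made for the example, not any particular vendor’s API.

```python
import random

# Hypothetical pre-ingestion filter: drop DEBUG records outright,
# forward only a 10% sample of INFO records, and keep everything
# at WARN and above. Levels, rate and record shape are illustrative.

LEVELS = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40}
INFO_SAMPLE_RATE = 0.10

def should_ingest(record: dict) -> bool:
    level = LEVELS.get(record.get("level"), LEVELS["INFO"])
    if level < LEVELS["INFO"]:        # drop DEBUG noise entirely
        return False
    if level == LEVELS["INFO"]:       # keep a 10% sample of INFO
        return random.random() < INFO_SAMPLE_RATE
    return True                       # always keep WARN and ERROR

logs = [
    {"level": "DEBUG", "msg": "cache hit"},
    {"level": "INFO", "msg": "request served in 42ms"},
    {"level": "ERROR", "msg": "upstream timeout"},
]
ingested = [r for r in logs if should_ingest(r)]
print(ingested)  # ERROR always survives; INFO survives ~10% of the time
```

The trade-off the survey highlights is baked into that kind of filter: whatever gets dropped is unavailable later when an incident needs to be reconstructed.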
Despite those efforts, 83% noted that decisions about which data to ingest have led to disputes within their companies.
Other issues that arise include increased risk or compliance challenges (47%), additional staff time spent preparing data for ingestion (47%), internal tension from not ingesting data for some teams or functions (42%), disruptions to processes that depend on data pipelines (42%), loss of valuable insights and analytics (38%) and failure to detect a production issue or outage (31%).
On average, log data has grown 5x over the past three years, and nearly a quarter of respondents (22%) experienced a growth rate of 10x or higher. More than a third (38%) generate between 500GB and 1TB of data daily, while 36% generate over 1TB. Only 15% generate more than 10TB of data daily.
Observability platforms promise to unify logs, metrics and traces in a way that makes it simpler to launch queries to identify the root cause of an issue. By comparison, legacy monitoring tools are designed to let IT teams track a predetermined set of metrics that are typically less granular.
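As a rough illustration of what that unification buys, the Python sketch below walks from a latency spike in a metric series to the log lines recorded around the same time. The toy data, the 500ms spike threshold and the one-minute correlation window are all assumptions for the example.

```python
from datetime import datetime, timedelta

# Toy unified store: metrics and logs share a timeline, so a single
# query can walk from a latency spike to the log lines around it.

metrics = [  # (timestamp, p99 latency in ms)
    (datetime(2023, 6, 1, 12, 0), 120),
    (datetime(2023, 6, 1, 12, 1), 850),
    (datetime(2023, 6, 1, 12, 2), 130),
]
logs = [  # (timestamp, message)
    (datetime(2023, 6, 1, 12, 1, 5), "ERROR db connection pool exhausted"),
    (datetime(2023, 6, 1, 12, 3), "INFO deploy finished"),
]

SPIKE_MS = 500                 # illustrative spike threshold
WINDOW = timedelta(minutes=1)  # illustrative correlation window

for ts, latency in metrics:
    if latency > SPIKE_MS:
        nearby = [msg for t, msg in logs if abs(t - ts) <= WINDOW]
        print(f"{ts}: p99={latency}ms, logs in window: {nearby}")
```

With separate, pre-aggregated monitoring tools, that join across signal types is exactly the step that is hard to do after the fact.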
It’s still early days as far as the adoption of observability platforms is concerned, but it’s apparent that as application environments grow and become more complex, the need for observability is becoming more acute.
The rate at which DevOps teams will embrace observability will naturally vary, but the biggest obstacle might not be the platforms themselves. Instead, the issue is knowing which queries help DevOps teams get to the root cause of an IT issue before there is a major disruption. In the long term, it’s expected that machine learning algorithms will leverage the data collected by observability platforms to automatically identify issues that might lead to a disruption long before it occurs.
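A simple form of that idea is already familiar as statistical anomaly detection. The sketch below flags a metric sample whose z-score against a trailing window exceeds a threshold; the window size and threshold are illustrative assumptions, not a production-grade detector.

```python
from collections import deque
from statistics import mean, stdev

WINDOW, THRESHOLD = 30, 3.0  # illustrative window size and z-score cutoff

def detect(stream):
    """Yield samples that deviate sharply from the trailing window."""
    window = deque(maxlen=WINDOW)
    for value in stream:
        if len(window) == WINDOW:
            mu, sigma = mean(window), stdev(window)
            if sigma > 0 and abs(value - mu) / sigma > THRESHOLD:
                yield value  # anomalous sample worth surfacing
        window.append(value)

# Steady error rate around 2/min, then a sudden jump to 40.
samples = [2, 3, 2, 1, 2] * 8 + [40]
print(list(detect(samples)))  # -> [40]
```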
In the meantime, the rate at which IT environments are becoming more complex is clearly outpacing the budget dollars being allocated to manage them.