Imagine a factory building 100,000 units a day, and you can see how a small error can have huge consequences. Now imagine that factory is a database.
The world’s top smartphone maker this month pulled the plug on its $882 Galaxy Note 7 after phones overheated and caught fire, culminating in a disastrous recall for Samsung that could cost the company as much as $17 billion and shutter a global revenue stream. In an unprecedented move, Samsung has told mobile carriers to halt sales or exchanges of the device, requested that users shut off their phones and issued a mass recall while the South Korean manufacturer investigates reports of fires in original and replacement Note 7s.
This comes a scant two months after the launch of the Galaxy Note 7, a device that one TechCrunch consumer electronics expert raved was “undoubtedly one of the most iconic handsets around.” While it is not clear what caused these fires, there is speculation that the culprit was a faulty component that interacts with the battery, rather than the battery itself.
It goes without saying that there are ample opportunities for error in the assembly of a smartphone. The complicated, highly coordinated process is fraught with thousands of input touchpoints, from hard- and software components to tightly controlled manufacturing processes to manual labor. And though there may be countless checks and balances in place, mistakes do happen. What may seem to be an innocuous error, such as not sufficiently tightening a screw or the shorting of a circuit, in this case has led to epic consequences.
The truth is, these instances happen more often than we think, and to technologies even more vital to the population than a personal smartphone.
Consider today’s era of big data and cloud applications, where enterprise applications ingest and process vast amounts of information from various sources, often in real time, to deliver critical analytics, contextual information or business insights.
This data has weight. It can influence whether fleets of airplanes are grounded at gates all across the country. It can restrict customers from accessing funds from online banking. It can dictate whether millions of online retail dollars go unspent. The reliance on accurate data, and its grip on lives and business, is growing at an exponential rate—like ever-widening ripples in a lake.
To support all of this data, new applications (such as real-time analytics, internet of things, ecommerce and others) are developed and deployed on scale-out, non-relational databases or ported from traditional relational databases.
But, like the manufacturing gaffes experienced by Samsung, corruptions and operational errors by developers occur in enterprise IT environments. Enterprises know this, and billions have been invested in traditional backup and recovery and data protection products to protect against such “err” moments. The irony, however, is that data protection products like those that exist for traditional relational databases do not exist for today’s modern, next-generation scale-out databases supplying the new types of data that increasingly control our lives. We’re essentially trusting that the Ma Bell rotary phone repairman will be able to recover personal data, such as contacts, photos and messages, after your Galaxy Note 7 starts smoldering.
So how did we get from the world’s most explosive phablet to distributed databases and big data? Because, as with the Galaxy Note 7, the tiniest data error can rapidly expand into catastrophic consequences.
Enterprises are adopting cloud applications that gather large amounts of data at a high-ingestion rate and process that data in real-time to deliver actionable insights. These applications are deployed on non-relational databases. As migration to cloud and multi-cloud application environments increases, enterprises also must be able to manage and recover at scale without data loss or corruption and with minimal application downtime.
However, enterprises today face a significant gap in addressing their data protection requirements. Their most valuable data, often needed 24/7 and without fail, runs absent any point-in-time backup options for that “Houston, we have a problem” moment. Most studies suggest that more than half of data downtime and loss is triggered by human error, corruption or virus attacks. And yes, while native replication helps enterprises meet business continuity and disaster recovery requirements by providing protection from hardware failures or even natural disasters, it is insufficient (meaning it cannot scale) to meet next-gen enterprise data protection demands. Case in point: If a minor schema corruption impacts the primary data copy, replication faithfully copies that corruption to every replica down the line. Yet, even without protection, industries continue to adopt and deploy scale-out database systems at an accelerated pace.
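To make the replication point concrete, here is a minimal Python sketch. It is a toy model, not any real database's API: the `ReplicatedStore` class, its methods and the `balance:alice` key are all hypothetical names invented for illustration. It assumes a store that applies every write to all replicas, and it keeps point-in-time snapshots outside the replication path.

```python
import copy


class ReplicatedStore:
    """Toy key-value store (hypothetical): every write goes to all replicas."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        # Point-in-time copies, deliberately kept outside the replication path.
        self.snapshots = []

    def write(self, key, value):
        # Replication does not judge the write; good or bad, it is copied everywhere.
        for replica in self.replicas:
            replica[key] = value

    def snapshot(self):
        # Freeze the current state and return an id for later recovery.
        self.snapshots.append(copy.deepcopy(self.replicas[0]))
        return len(self.snapshots) - 1

    def restore(self, snapshot_id):
        # Roll every replica back to the chosen point in time.
        saved = self.snapshots[snapshot_id]
        self.replicas = [copy.deepcopy(saved) for _ in self.replicas]


store = ReplicatedStore()
store.write("balance:alice", 100)
good = store.snapshot()

# An operator error or buggy deploy writes a corrupt value.
store.write("balance:alice", None)
corrupted_everywhere = all(r["balance:alice"] is None for r in store.replicas)

# Only the point-in-time copy can bring the good value back.
store.restore(good)
recovered = all(r["balance:alice"] == 100 for r in store.replicas)

print(corrupted_everywhere, recovered)
```

In this sketch the bad write reaches every replica, so no replica holds a clean copy to fail over to. Recovery is possible only because the snapshot was taken before the error and was never touched by replication.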
So just as Samsung witnessed an error replicate itself into a meltdown of such global proportions that it threatens to kill an entire product line, organizations also must understand how replication-based protection mechanisms can actually act like a virus, spreading a corruption uncontrollably across non-relational, cloud databases.
Samsung’s troubles won’t end with the simple recall of a single product, and their predicament should resonate. There will be punitive measures, and perhaps even a total enterprise shift from the smartphone market. It is not an enviable position, but it does offer a clear example of how a peripheral manufacturing error can swell to engulf an entire brand.
For enterprises rushing to deploy scale-out databases, let Samsung serve as a warning: deploying any infrastructure that could lead to exponential, irreversible corruption of all data copies is a recipe for harm (if not disaster). As enterprises make the switch to next-gen applications running on scale-out, non-relational databases, an assembly line e-stop needs to be built in, in the form of a point-in-time, version-friendly data protection solution.
Data corruption is not a matter of “if,” but “when.” So have a kill switch ready.