This post is the first in a two-part series about MongoDB backup and recovery. In this first part, I will discuss the motivations for protecting data that resides on MongoDB and the existing backup and recovery solutions. In the second part, I will discuss how to address the gaps those solutions leave.
In the era of big data, enterprise applications create a large volume of data that may be structured, semi-structured or unstructured in nature. In addition, application development cycles are much shorter and application availability is a critical requirement. Given these application requirements, enterprises are forced to look beyond traditional relational databases to onboard the next-generation, cloud-native Platform 3 applications—infrastructure-as-a-service (IaaS) or platform-as-a-service (PaaS) applications. NoSQL databases such as MongoDB are now being adopted and evaluated by enterprises for these next-generation applications, which include ecommerce and content management. The benefits of MongoDB are many, including dynamic schema, easy scaling through auto-sharding, tunable consistency for reads and built-in replication.
MongoDB also has native replication capabilities that satisfy availability requirements. However, data protection requirements for scalable point-in-time backup and recovery must still be addressed. For robust data protection, enterprises need both backup and replication. Without point-in-time backups, organizations are at substantial risk of losing data due to human error, logical corruption and other operational failures. Traditional backup solutions were built to address the requirements of structured Platform 2 applications on relational databases that used shared storage and had the ACID (atomicity, consistency, isolation and durability) transaction model. Unfortunately, they fall short of addressing the point-in-time backup requirements of Platform 3 applications and distributed databases (local storage, eventual consistency and the elastic nature of infrastructure). There are a few alternate script-based solutions such as Strata that enterprises are using to fill the data protection gap, but these solutions are sub-optimal at best.
Manual Scripted Solutions
Manual scripted solutions (MSS) combine a native MongoDB utility (mongodump) with custom scripts that transfer the dumped data to secondary storage. The scripts are customized for each MongoDB cluster and require significant operational effort to scale or to adapt to any topology change (such as the addition or removal of nodes in your MongoDB database). Further, these scripts are not resilient to failure scenarios such as the failure of a node (primary or secondary) or intermittent network issues. Finally, recovery (the paramount value of “backup”) is a manual process. It is therefore time-consuming, resulting in very high application downtime, and it carries a risk of data loss from any bugs in the scripts. Overall, these solutions work when the MongoDB environment is small and the application can tolerate some data loss. Some of the key issues that these solutions face are:
- Lack of enterprise backup solution for sharded configurations
- Database needs to be offline when the snapshots are taken
- Both backup and recovery fail under node failure and other infrastructure failures
- Recovery process is manual and requires verifications, which increases the recovery time
- Collection-level recovery is a manual, time-consuming process
- Recovery to unlike topologies (sharded → unsharded) for test/dev refresh is not available
Most enterprises use these scripted methods as a temporary quick fix. It is like driving your car with flat tires: You can keep going, but you can’t go at the speed you want, and you are not safe from disaster.
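To make the operational burden concrete, here is a minimal sketch of what such a script often looks like. It is an illustration under assumptions, not any particular team's script: the host names, the backup path and the `build_dump_commands` helper are hypothetical, and the sketch only builds the mongodump command lines rather than running them. The mongodump options used (`--host`, `--oplog`, `--gzip`, `--out`) are standard ones.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical, hard-coded topology: one replica set member per shard.
# Every node added, removed or renamed means editing this table by hand.
SHARD_HOSTS = {
    "shard0": "mongo-s0.example.internal:27018",
    "shard1": "mongo-s1.example.internal:27018",
}


def build_dump_commands(shard_hosts, backup_root, stamp):
    """Return one mongodump argument list per shard (nothing is executed)."""
    commands = []
    for shard, host in sorted(shard_hosts.items()):
        out_dir = Path(backup_root) / stamp / shard
        commands.append([
            "mongodump",
            "--host", host,   # a single node; if it is down, this dump fails
            "--oplog",        # capture oplog entries written during the dump
            "--gzip",         # compress the dump files
            "--out", str(out_dir),
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical backup root; a real script would also copy the output
    # to secondary storage and handle errors, which is omitted here.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    for argv in build_dump_commands(SHARD_HOSTS, "/backups/mongodb", stamp):
        print(" ".join(argv))
```

Even in this small form, the weaknesses listed above are visible: the topology lives in a hand-maintained table, each shard is dumped independently with no coordination across shards, and nothing retries or reroutes when a node or the network fails.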
MongoDB Paid Backup and Recovery
MongoDB (the company) provides a couple of ways to back up MongoDB databases. Enterprises may choose a managed backup offering (MMS) that runs in the public cloud or, if they are paid MongoDB customers, they may deploy the backup service on-premise. In addition to being exorbitantly costly, the managed backup service stores customers’ data in the public cloud. Transferring backup data over a WAN may not work for customers who deploy MongoDB on-premise or for customers who need to keep their sensitive data in-house. Further, the service imposes significant limits on the amount of data per shard.
Using the MongoDB on-premise backup service is possible but is overly complex to deploy and operationalize (this deployment diagram speaks for itself!). To enable on-premise backups, enterprises need to deploy eight servers, additional databases (with additional licensing) and about six to nine times the storage capacity of the database being backed up.
Overall, the on-premise backup service is a largely theoretical solution that brings with it significant CAPEX and OPEX costs:
- Complexity of deploying multiple databases
- Cost of additional infrastructure (servers and storage)
- Cost of licensing additional MongoDB nodes
- Risk of failed backups when the secondary node from which the backup is taken fails
- Siloed backup infrastructure that serves only MongoDB
Look for Part two of this discussion to come shortly.