It can be tough to find useful, concrete advice for an enterprise-level data migration project. There’s an awful lot that can go wrong. Common suggestions about finding the right people, minimizing risk, or employing an automated system that does everything for you offer limited help.
What you really need is a step-by-step checklist that you can work through.
The success or failure of a data migration depends largely on how thoroughly and carefully you prepare for it. If we analyze large migration projects, we invariably find that the successes share many common characteristics. If you’re planning a data migration, these are the questions that you really need to answer:
What type of data is being migrated?
Not all data is created equal. There are several types of data, so you need to be clear about exactly what you want to move.
- NAS-based – Where network shares are migrated from one system to another.
- LUN-based – Where the disks (LUNs) are migrated from one place to another, at block level.
- VM-based – Where entire virtual hosts are migrated from one hypervisor to another.
- Application-specific – Where a set of databases or objects is migrated from one host to another.
What type of migration is it?
Where your data is going can make a big difference to your overall plan. Is this a local migration within the same data center, a remote migration that shifts data from one data center to another, or a cloud migration, where you might be moving data from one cloud provider to another?
What is the current storage environment?
You need a complete overview of your current storage environment before you move a single file. If you’re using a SAN, then think about which hosts, in terms of operating systems and applications, are involved. Which FC switches and RAID arrays are in use? Do you have a clear picture of all the inter-dependencies within the SAN? A deep understanding of the environment and its complexity gives you a better chance of avoiding disaster.
How much data is being migrated?
An accurate estimate of the volume of data that needs to be moved is absolutely vital. You need to account for all the hosts, LUNs, and storage systems. It’s not enough to say how many terabytes of data need to be migrated; you’ll need a picture of how that total breaks down before you can determine the time and effort required.
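As a rough illustration of what such a breakdown might look like, here is a minimal Python sketch. The inventory rows, host names, LUN IDs, and array names are all hypothetical; in practice the figures would come from your own storage reports.

```python
from collections import defaultdict

# Hypothetical inventory rows: (host, LUN id, storage system, used capacity in TB).
inventory = [
    ("db01", "lun-001", "array-a", 12.0),
    ("db01", "lun-002", "array-a", 8.5),
    ("web01", "lun-003", "array-b", 1.5),
    ("file01", "lun-004", "array-b", 20.0),
]

def breakdown(rows, key_index):
    """Sum used capacity (TB) grouped by the column at key_index."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)

total_tb = sum(row[3] for row in inventory)  # the headline number
by_host = breakdown(inventory, 0)            # how it splits across hosts
by_array = breakdown(inventory, 2)           # how it splits across storage systems

print(f"Total: {total_tb} TB")
print("By host:", by_host)
print("By storage system:", by_array)
```

The headline total is the same either way, but the per-host and per-array views are what tell you which pieces are large, which can move independently, and where the effort will concentrate.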
How much time do you have?
There’s always a deadline for any project. Your timeframe may ultimately be dictated by the volume of data and the migration rate, but you can only determine that rate with a complete understanding of the data, application, and environment. Can you migrate in a divide-and-conquer manner? The more you know, the easier it is to control operational complexity and reduce costs.
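The relationship between volume, migration rate, and timeframe can be sketched with some simple arithmetic. The function below is an illustrative estimate only: the function name, the sustained rate, and the nightly migration window are assumptions, and real throughput will vary with the data, the applications, and the environment.

```python
def migration_estimate(volume_tb, rate_mb_per_s, window_hours_per_day=24.0):
    """Return (transfer_hours, calendar_days) for a volume and sustained rate.

    window_hours_per_day is the part of each day in which migration may run,
    for example an 8-hour overnight window (an assumed figure).
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate_mb_per_s must be positive")
    if not 0 < window_hours_per_day <= 24:
        raise ValueError("window_hours_per_day must be in (0, 24]")
    volume_mb = volume_tb * 1_000_000           # decimal units: 1 TB = 1,000,000 MB
    transfer_hours = volume_mb / rate_mb_per_s / 3600
    calendar_days = transfer_hours / window_hours_per_day
    return transfer_hours, calendar_days

# Example: 36 TB at a sustained 500 MB/s, migrating 8 hours per night.
hours, days = migration_estimate(36, 500, window_hours_per_day=8)
print(f"{hours:.1f} transfer hours, about {days:.1f} nights")
```

If the data can be split into independent groups that migrate in parallel, the same calculation can be run per group, and the longest group then sets the overall duration.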
How active is the data and how much downtime is possible?
To determine whether the migration can be completed on time, or even at all, you need to know how active the data is. How much of an impact on production will be tolerated? Some downtime is inevitable, but it has to be worked out and agreed upon ahead of time. Creating the right strategy in advance, with all the key stakeholders, will dramatically boost your chances of success and reduce stress levels on the day.
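One way to see why data activity matters is a deliberately simplified model: if the source keeps changing while it is being copied, only the difference between the copy rate and the change rate makes progress. The sketch below assumes that every changed gigabyte has to be copied again, which is a simplification rather than a description of how any particular tool behaves; the function name and figures are hypothetical.

```python
def hours_to_catch_up(volume_gb, copy_rate_gb_per_h, change_rate_gb_per_h):
    """Hours until the copy catches up with a live source.

    Returns None when data changes at least as fast as it is copied,
    meaning the migration cannot finish without quiescing the source.
    """
    net_rate = copy_rate_gb_per_h - change_rate_gb_per_h
    if net_rate <= 0:
        return None
    return volume_gb / net_rate

# Quiet data set: 9,000 GB copied at 1,800 GB/h while 300 GB/h changes.
print(hours_to_catch_up(9000, 1800, 300))

# Very active data set: changes outpace the copy, so it never converges.
print(hours_to_catch_up(9000, 200, 300))
```

A result of None is the "or even at all" case: the only ways forward are a faster copy, a quieter window, or agreed downtime.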
What’s the right tool for the job?
This is really going to depend on all of your answers so far. The right way to go about selecting a tool is to draw up a list of evaluation criteria. Think about everything you need and create a list that you can cross-reference against the candidates.
As an example, for a SAN migration, you might want something that offers the following:
- A seamless installation with no disruption to the production environment.
- A clear display of the automatically discovered SAN configuration, complete with identities, status, and I/O statistics for all hosts and LUNs to be migrated (as well as those not to be touched), so the migration can be organized accurately and with high confidence.
- The ability to migrate during non-production hours, or automatically yield to production traffic, with adjustable intensity/throttling.
- A centralized reporting system that delivers the statistics you need.
- Data integrity checking of intermediate images before cutover, so issues are flagged early before they can cause problems.
- Secure wiping of the data disks after the data is successfully migrated out.
When you’ve drawn up a list, you can apply it, and see how different options compare.
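Cross-referencing a criteria list against candidates can be as simple as a weighted scoring table. The following is a minimal sketch: the criteria keys mirror the example list above, but the weights, the candidate names (tool_a, tool_b), and the 0 to 5 scores are all made up for illustration.

```python
# Hypothetical weights: how much each criterion matters to you (1 = low, 5 = high).
criteria = {
    "non_disruptive_install": 5,
    "san_discovery_display": 3,
    "throttling": 4,
    "central_reporting": 2,
    "integrity_checking": 5,
    "secure_wipe": 3,
}

# Hypothetical candidates, each scored 0-5 per criterion after evaluation.
candidates = {
    "tool_a": {"non_disruptive_install": 5, "san_discovery_display": 4,
               "throttling": 3, "central_reporting": 4,
               "integrity_checking": 2, "secure_wipe": 0},
    "tool_b": {"non_disruptive_install": 3, "san_discovery_display": 3,
               "throttling": 5, "central_reporting": 3,
               "integrity_checking": 5, "secure_wipe": 4},
}

def weighted_score(scores, weights):
    """Sum of weight * score; a criterion the candidate lacks counts as 0."""
    return sum(weight * scores.get(name, 0) for name, weight in weights.items())

totals = {name: weighted_score(scores, criteria) for name, scores in candidates.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

for name in ranking:
    print(name, totals[name])
```

The totals are only as good as the weights, so it is worth agreeing on those with stakeholders before any candidate is scored.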
Who are the right people for the job?
A successful data migration is going to require some level of expertise. Ideally, you want experienced personnel to handle a process like this, and you may need to look externally to find them. If they can give you the answers you need to the questions we’ve posed in this article, then you know they’re qualified for the job.
About the Author/Wai T. Lam
Wai T. Lam is co-founder and CTO of Cirrus Data Solutions, a developer of Data Migration Server and Data Caching Server for storage area networks (SANs). He was previously CTO and VP of Engineering at FalconStor, a company he co-founded in 2000. There, he was the chief architect, holding 18 of 21 company patents. Wai received the prestigious China national “Top 1000 Technological Leaders” award in 2013.