Cirrus Data Solutions, Inc. Whitepaper
Zero downtime (SAN) Data Migration

Executive Summary

IT managers are often challenged by the need to move production data from one storage system to another. With Cloud Computing platforms increasingly being adopted, there is new demand for data migration tools that are better suited to the new platforms. Although the need for data migration is as old as the first storage subsystem, there remain few tools dedicated to data migration. According to a report by Bloor Research (2007), the majority of data migration projects are fulfilled by using simple backup and restore software, or other solutions NOT originally intended for data migration (such as mirroring or replication tools). This is still true today.

This paper explores the legacy methods above that IT managers use to migrate data, and the pain points associated with them. It then introduces the new Cirrus Data Solutions (CDS) Data Migration Server (DMS) appliance, with an in-depth analysis of how each built-in feature of the DMS appliance eliminates the pain points of these legacy methods. The DMS appliance is designed from the ground up for data migration, with all of its features and conveniences aimed at migrating disks with zero downtime (other than the time needed to cut over to the new storage) by providing 100% transparency to the existing production environment throughout the migration project.

The Need for Data Migration

There are many drivers for data migration. The following is a short list:
- Re-organization of storage schema: Moving one or many files, folders (directories), or volumes in order to better organize information for easier access or reference.
- Re-organization of database schema: Similar to the above, but for database tables instead of files.
- Consolidation of storage: Moving data from several smaller disks or volumes to a single newer, bigger disk or volume.
- Storage refresh: Moving one or many sets of disks from one or many old storage units to one or many new storage units. The new storage units might not be from the same vendor, and each of the new disks might be much larger than the old ones.
- Moving to a new data center or Private Cloud: Same as a storage refresh, but the new storage units are located at a new data center that is far away (beyond the reach of Fibre Channel connections).
- Moving archival data to the Cloud: Taking advantage of Cloud storage offerings to archive volumes or disks in the Cloud.
- Moving data from Cloud to Cloud: With increased competition among Cloud storage providers, users may want to move data from one Cloud to another in order to optimize cost.
- Moving data from Cloud to data center: Those who regret moving their data to the Cloud need a way to move it back to the legacy data center.

With worldwide Cloud initiatives gaining momentum, and with the amount of data growing at a rapid rate, the need to migrate data from one place to another is also growing. According to Gartner, business data continues to grow at a rate of 40%-60%. The Computing Technology Industry Association (CompTIA) issued a report detailing the results of its cloud computing survey; in describing the cloud market, the report also cited Gartner’s prediction that “cloud storage will grow at 89.5% CAGR to $2.88 billion” through 2015.

Unfortunately, while the data sets being migrated keep getting larger, the downtime available for moving them keeps getting smaller. In many cases, the time required to move the data far exceeds the available downtime. In these cases, the IT manager has no way to migrate the data and faces serious consequences.
For example, when the lease on the storage is up and the IT manager has no window in which to migrate data to new storage, it is impossible to take advantage of the new storage, which is typically more cost effective and faster, generates less heat, and consumes less power. When the customer cannot switch, the original storage vendor is in a position to take advantage of the situation and offer little, if any, discount for renewal of the storage lease.

Data Migration Tools and Downtime

Migrating data is not difficult provided that downtime is allowed. In fact, there are many existing legacy tools available, such as backup/restore, or built-in functions within the operating system such as copy/paste, cut/paste, move, etc. Typically, IT managers calculate the total amount of data being moved, estimate the amount of time needed for the move, and inform the users of the planned outage. At the pre-determined time, the file, folder, volume, or entire application is taken offline. A copy process then copies the data from the source to the destination disk/volume/folder. Throughout this time, users cannot be allowed to make changes to the data being copied; otherwise the process has to be repeated.

In some situations, it is possible to minimize the downtime by performing a non-consistent full copy without shutting anything down, knowing that some files will be changed or skipped due to continued access. Then, a smaller amount of downtime can be scheduled to perform an incremental copy (or backup) of the data set. While this may work for a general file share (for example, home directories), it is most likely not practical for a database or email system, where the bulk of the data is in a few large database/index files that are constantly changing.
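The outage estimate described above can be sketched as a short calculation. This is only an illustration: the data size, sustained throughput, and changed-data fraction are hypothetical figures, not measurements from this paper, and the function name `copy_hours` is invented for the example.

```python
def copy_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy data_tb terabytes at a sustained rate in MB/s."""
    data_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return data_mb / throughput_mb_s / 3600


# Hypothetical figures, chosen only for illustration.
DATA_TB = 50.0            # size of the data set to migrate
THROUGHPUT_MB_S = 400.0   # copy rate the environment can sustain
CHANGED_FRACTION = 0.05   # share of data modified during the online bulk copy

# Legacy approach: everything is copied while the application is offline.
full_outage = copy_hours(DATA_TB, THROUGHPUT_MB_S)

# Bulk copy while online, then an offline incremental copy of changed data.
incremental_outage = copy_hours(DATA_TB * CHANGED_FRACTION, THROUGHPUT_MB_S)

print(f"Offline full copy:     {full_outage:.1f} h of downtime")
print(f"Incremental copy only: {incremental_outage:.1f} h of downtime")
```

With these assumed numbers the offline full copy needs roughly 35 hours of downtime, while the incremental pass needs under 2 hours. For a database or email system, where most of the data sits in a few files that change constantly, the changed fraction approaches 1.0 and the saving disappears.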
Therefore, the drawback of the legacy tools is simply that they require a large amount of downtime and are quite inefficient. In short, the legacy methods, including the use of backup/restore tools, are only useful for small amounts of file data where downtime is available and acceptable. If the data is live, as in a production database or email system, these tools are of little use, for the simple reason that most operating system tools prohibit copying files that are still open. And while backup software can sometimes back up files that are still open, this is a futile exercise because the backed-up files are already out of date by the time the backup completes. The only way these tools can be effective in migrating data from old to new storage is to shut down the application during the process, thereby ensuring that the copied data has not changed. For a large data set, this requires downtime measured in days, something that cloud and legacy enterprise data centers cannot afford. In the US, businesses can lose between $84,000 and $108,000 for every hour of system downtime, according to estimates from studies and surveys performed by IT industry analyst firms, and the figure can reach millions of dollars at financial services firms.

Data Migration Using Storage Virtualization Appliances

For the user with a more sophisticated storage farm, such as one with a Storage Virtualization Appliance layer that has intelligent storage service functions like mirroring or replication, it is possible to use these advanced functions to create a synchronous or asynchronous copy of the source disk on the destination disk. Once the copy is in sync, the migration is complete and the new destination disk can be put into production, replacing the source disk.

A Storage Virtualization Appliance is an ideal platform to deliver data migration functions, allowing transparent data migration without any downtime.
This is because the Storage Virtualization layer hides all storage identities from the host and presents a unified storage view to all hosts within the virtualized farm. Even if the physical storage comes from multiple vendors, the Storage Virtualization layer presents all storage as a single-vendor entity. As long as the Virtual Storage vendor’s multi-path driver is properly installed on every host, this solution makes it possible to migrate data from one disk vendor to another without the host being aware of it (except for the performance degradation during copying). Downtime is not required, because there is no need to install a new multi-path driver (typically required if the destination disk is from a new vendor).

Unfortunately, if you do not already have a Virtualized Storage Farm, it is impractical to deploy a Storage Virtualization layer temporarily just for the sake of being able to migrate data. This is like buying a fast-food franchise for the sake of enjoying a few burgers. Deploying a Storage Virtualization solution requires a lot of careful planning, and requires downtime to insert the Storage Virtualization Appliances into the SAN. This typically involves the following tasks:
- Ensure compatibility of the Storage Virtualization Appliances with the existing storage units, FC switches, and hosts.
- Ensure availability of the Virtualization Appliance’s multi-path driver for every host OS and version.
- Secure consent from all application host teams to install new multi-path drivers and modify HBA settings, host disk timeout settings, etc., in order to comply with the Storage Virtualization Appliance vendor’s requirements and best practices.
- Secure consent from the SAN management team to allocate additional ports and to create additional FC zones on the FC switches.
- Secure consent from the storage management team to change the LUN masking (LUN presentation, LUN assignment) so that the Storage Virtualization Appliances can access the existing storage units.
- Secure approval for the downtime needed to perform all of the above at the same time in order to insert the Storage Virtualization Appliances.
- Execute the back-out plan if anything goes wrong.

After the migration is done, the Storage Virtualization Appliances must be removed from the storage farm, and the tasks outlined above must be executed in reverse, requiring additional downtime.

In essence, while a Storage Virtualization Appliance can be used to migrate data, the amount of work and the risks associated with that work make it impractical for the purpose. Only the minority of users who are fortunate enough to already have a Storage Virtualization layer can enjoy the benefit.

Data Migration Using Storage Services of Storage Controllers

Some higher-end storage units have intelligent controllers that provide storage services beyond just RAID. These include services to mirror LUNs from one storage unit to another local unit, or to one at a remote data center. If such a capability exists, migrating data from one storage unit to another is simple: establish a mirror (or clone, or R2/R3 copy) on the new storage, wait for synchronization to complete, and then, with just a bit of downtime, switch the production volume to the new storage by unassigning the source disk from the host and assigning the new storage to the host. The application should be able to start up, provided that the volume and drive letters remain the same (manual adjustments may be necessary in some cases).
For Virtual Machine environments such as VMware, it is also necessary to adjust the disk signature (re-signaturing the disk) in order for the VMFS volumes to be usable.

Unfortunately, the mirroring (or replication) capability is typically incompatible across different storage vendors. Even within a single storage vendor’s product line, products may be incompatible with each other for legacy reasons (such as an acquisition). Even if the source and destination disks are compatible, the mirroring process typically causes a significant amount of production performance degradation. The mirroring capability was originally intended for data protection and recovery, not data migration.

The Best Data Migration Tool: CDS DMS

The Cirrus Data Solutions (CDS) Data Migration Server (DMS) Appliance is the industry’s first disk-block level data migration solution designed from the ground up for the purpose of efficiently, transparently and safely migrating data from any legacy or Cloud storage to any other legacy or Cloud storage. Other tools such as Backup/Restore products, Data Mirroring products, Data Replication (Disaster Recovery) products, or FC/iSCSI Storage Routers can be re-positioned as data migration tools. However, most of them either lack features that are specific to data migration (e.g., Secure Erasure to ensure the data on the old storage is cleanly wiped out) or lack total transparency. They may require changing host-side drivers, rezoning the FC switch, or changing the LUN masking at the production storage, all of which require careful planning and execution.

The DMS appliance is a 100% transparent device that requires no changes to the existing environment. It is built on the patented Transparent Data Interception technology (US Patent No. 8,255,538), and once it is inserted into the existing production system, the entire migration project requires no reconfiguration, period.
Simply insert the FC ports of the DMS appliance into the FC fabric, one path at a time, and data migration can begin. This means IT managers or Data Migration Service providers can avoid all the pain points associated with certification of host drivers, changing of multi-path software, modification of FC zones, and changing of LUN masking/presentations at the storage controllers. The enormous amount of work and potential downtime avoided translates to significant cost savings and greatly improves ROI.

DMS provides data mobility, which allows the migration of live data at the disk/LUN level within a heterogeneous storage farm, or even across two remote data centers, without downtime (until cut-over to the new storage). For a massive storage farm (private cloud or legacy enterprise data center), this ability is important because without it, the lengthy downtime (potentially measured in days) would certainly violate the uptime guarantee mandated by the user. It would also undermine the ability of the data center operator to consider switching storage when the lease is up, or when newer and faster disks become available with a lower cost of ownership. Lack of an efficient data migration solution makes it impossible for the data center operator to negotiate for cost-effective storage.

The CDS Data Migration Server appliance is the industry’s first purpose-built appliance for real-time data migration without downtime. In a typical deployment, a pair of DMS appliances is transparently inserted in front of the FC SAN storage units being migrated. Because the appliance is built on the patented Transparent Data Intercept (TDI) technology, the entire process of inserting the appliances requires no downtime and no change to the hosts, SAN switch zoning, or storage unit LUN masking. Once inserted, the DMS appliance is aware of all writes. The user can then configure the new storage at the local or remote data center to receive data.
Migration jobs can be defined and scheduled, with various migration priorities and levels of aggressiveness, to ensure minimum impact on production traffic.

Competitive Features and Advantages:
- Transparent Data Intercept eliminates downtime and configuration changes: DMS can be deployed in a live production environment without any downtime. There is no need to add host drivers, make FC switch zoning changes, or modify storage system LUN masking. Other host-based migration solutions require an application host outage to install software drivers. Other server-appliance-based solutions require changes to host multi-path drivers, FC switch zones, and production storage LUN masking.
- Low Impact to Production Access: Migration jobs can be scheduled and throttled, and can be configured to guarantee minimum impact on production access. In this mode, DMS monitors the production I/O while aggressively migrating data, but automatically suspends the migration when the production I/O is intensive. The level of aggressiveness is user defined. Other solutions typically have only a fixed throttling capability. Host-based migration products can severely impact production performance because they use local CPU, memory, and I/O resources for migration.
- Optimized TCP/IP Remote Migration: DMS can migrate to storage located at a remote data center connected via TCP/IP. This allows zero-downtime migration of an entire data center. Other solutions typically support only local migration.
- High Availability: Two DMS appliances can work together to ensure there is no single point of failure.
- Data Integrity Guarantee: Using an advanced data hashing algorithm, migrated data can be verified without impacting the source production disks. The “Consistency Group” concept ensures that multiple disks are migrated and cut off at exactly the same point in time in order to achieve relative referential integrity.
  Most other solutions require the source disk to be quiescent (or use a snapshot of the source) for comparison against the migrated disk.
- Secure Erasure: After a successful migration, DMS provides a DoD 5220.22-M compliant Secure Erase function to completely and irreversibly wipe the source disk, so that it is safe to dispose of the disk or return it to the vendor when the lease is up.

In summary:

There are a number of reasons organizations migrate their data: to upgrade to new storage, to move to a new location, to consolidate, and/or to reorganize data. Many tools are available to help achieve this goal, provided that there is enough downtime to work with. But when the migration needs to be done in a production environment, there is usually no acceptable amount of downtime, and the cost of that downtime can range from $90K to millions of dollars per hour depending on the organization. To address the downtime issue, Cirrus Data developed the Data Migration Server (DMS), the industry’s first block-level data migration solution designed from the ground up for the purpose of efficiently, transparently and safely migrating data. With no need to make switch zoning changes, modify LUN masking, or change host drivers, the DMS appliance provides a solution for companies needing to migrate their production data without any downtime, saving both time and money.