White Paper NFS FILE MIGRATION TO EMC ISILON Guidance for optimal data migration of NFS workflows Abstract This paper provides technical information and recommendations to help you migrate data from a single NFS protocol workflow on another NAS vendor to an EMC Isilon storage cluster. It includes the best practices for planning, setting up, and executing the migration. November 2014 Copyright © 2014 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. Part Number H12517 EMC Isilon Data Migration for NFS 2 Table of Contents Introduction ............................................................................................................ 5 Assumptions ...................................................................................................................... 5 Audience ............................................................................................................................ 5 Prerequisites ...................................................................................................................... 6 The Challenge of Data Migration ......................................................................................... 6 Risk Management............................................................................................................... 6 Data Integrity...................................................................................................................... 7 Data Availability ................................................................................................................. 7 Project Phases and Methodology Overview .............................................................. 7 Discovery and Planning Phase ............................................................................................ 7 Migration Approach and Requirements .......................................................................... 8 Migration Methodology .................................................................................................. 9 Toolset Selection ............................................................................................................ 9 Testing the Migration Methodology .................................................................................. 10 Data Migration Testing ................................................................................................. 10 User Acceptance Testing .............................................................................................. 10 Cutover Methodology Testing ....................................................................................... 10 Rollback Strategy Testing ............................................................................................. 10 Executing the Migration .................................................................................................... 10 Data Transfer ................................................................................................................ 11 Cutover ........................................................................................................................ 11 Acceptance .................................................................................................................. 11 Rollback ....................................................................................................................... 11 Repetition .................................................................................................................... 12 Post Migration .................................................................................................................. 12 Single Protocol NFS Data Migration ........................................................................ 12 Challenges of Single Protocol NFS Data Migration ............................................................ 12 Data-Specific Considerations ........................................................................................... 13 Migration Requirements and Customer Data Collection .................................................... 14 Requirements Gathering ............................................................................................... 14 Current Infrastructure and Data Analysis ...................................................................... 15 Determine Migration Methodology ................................................................................... 15 Migration Sequencing .................................................................................................. 16 Type of Migration.......................................................................................................... 17 Host-Based Migrations ................................................................................................. 18 Isilon Based Migrations ................................................................................................ 19 Migration Tool Selection and Use ................................................................................. 20 Migration Tools ............................................................................................................ 21 EMC Isilon Data Migration for NFS 3 rsync ............................................................................................................................ 21 Isilon-based migrations – isi_vol_copy......................................................................... 26 Isilon-based migrations from VNX – isi_vol_copy_vnx ................................................. 27 Migration Preparation ....................................................................................................... 28 Infrastructure and Environment Setup .......................................................................... 28 Source Host Preparation ............................................................................................... 29 Migration Host Preparation - Source and Target Access ................................................ 29 Isilon Cluster Preparation Configuration ........................................................................... 30 Additional Isilon Cluster Considerations ....................................................................... 32 NFS 16 group limitation ................................................................................................ 32 Production or Pre-Production Cluster ............................................................................ 33 Access Zones and Role Based Access Control .............................................................. 33 NFS RPC threads ........................................................................................................... 33 SmartConnect or Direct Node Connections ................................................................... 33 SyncIQ Considerations ................................................................................................. 33 SmartPools................................................................................................................... 33 SnapshotIQ .................................................................................................................. 33 Antivirus Integration ..................................................................................................... 33 Isilon Guidelines for Large Workloads........................................................................... 34 Migration Approach - Testing and Proof of Concept........................................................... 35 Data Validation ................................................................................................................ 35 Performance ..................................................................................................................... 36 User Acceptance Testing – Data and Workflow Testing ..................................................... 36 Begin Migration Execution ................................................................................................ 37 Pre-Cutover Preparation.................................................................................................... 37 Cutover Event ................................................................................................................... 38 The Go or No Go Decision ................................................................................................. 39 Rollback ........................................................................................................................... 40 Migration Event Complete ................................................................................................ 40 Steady State ..................................................................................................................... 41 Conclusion ............................................................................................................ 41 Appendix: Sample Migration Use Case ................................................................... 42 EMC Isilon Data Migration for NFS 4 Introduction This white paper outlines the recommended approach for migrating single protocol NFS data from other NAS filers to an EMC Isilon storage cluster. Single Protocol NFS data is defined as data read, written, or modified using NFSv2 or NFSv3 protocols. The paper includes best practices on Isilon cluster configuration, tool selection, and host setup to optimize an NFS based data migration. The paper also includes best practices to optimize performance, management and support. Although this guide addresses a single NFS protocol data migration, the approach and many of the best practices can be used as a foundation for other types of data migration. Much of the relevant information for planning, provisioning, and supporting end-user directories on an Isilon storage cluster is available through white papers and guides from EMC Isilon at http://support.emc.com. As such, this guide avoids duplicating content by including only the information that pertains to setting up and operating an Isilon cluster as a destination for a single protocol NFS data migration. Assumptions This document focuses on the data migration of NFS; it does not specifically address the migration to an Isilon cluster of NFS exports, local users and groups, or any other NFS configurations from another network-attached storage (NAS) system. This document should not be used in a multiprotocol migration. Multiprotocol migration requires many additional considerations, and specific actions may be different. This guide assumes that the source data is only accessed through a single protocol NFS workflow. Further, authentication and authorization for POSIX users (UID/GID) needs to be consistent on the source and destination migration clusters. LDAP, NIS, distributed local files, and AD with RFC2307 enabled (SFU) are all centralized means of authentication that should be successfully implemented before the migration begins. An authoritative and consistent source for user authentication is assumed (e.g. AD with RFC2307, LDAP, NIS, etc). If there are multiple sources of authentication from disparate clusters targeted for migration (i.e. several local different /etc/password files) they must be inspected for collisions and manually combined if necessary to prevent UID and GID collisions. Identity management must be optimized with preference for a single external directory service is recommended. Work with the customer to establish a consistent and authoritative set of users and groups if necessary. Workflows will be for NFS v3 with the future possibility of adding SMB workflows in the future; NFSv4 is not addressed in this document. The EMC Isilon filesystem will have a “Balanced” on-disk identity setting for global ACL policy. Files with POSIX mode bits only are in-scope for this document; additional permissions such as SMB ACL’s are not covered. Multiple source clusters may have identical export directories. A directory consolidation plan must be developed to deal with directory name collisions in the newly created single namespace on the Isilon cluster. Files and directories have unique permissions that restrict access to the intended users and groups; a typical migration preserves and transfers them. Post migration permission transformation is not covered in this text. Audience This guide is intended for experienced system and storage administrators who are familiar with file services and network storage administration. This guide assumes you have a working knowledge of the following: EMC Isilon Data Migration for NFS 5 • • • • • • • NAS systems The NFS storage protocol, as appropriate for the specific migration requirements Isilon scale-out storage architecture and the OneFS operating system Additional Isilon features, including SmartConnect, SmartPools policy management, SnapshotIQ, and SmartQuotas File-system management concepts and practices, including provisioning, permissions, and performance optimization Integration practices for connecting and establishing authentication relationships centralized sources (LDAP, NIS, local files, etc. ) Basic shell commands and command line operation. Basic shell scripting. While this guide is intended to provide a consolidated reference point to migrate data to an Isilon storage cluster, it is not intended to be the authoritative source of information on the technologies and features used to provide and support a file-services platform. In the event that additional services are required, EMC IT Services are available to assist in streamlining data migrations, reducing risk and minimizing impact. Prerequisites Some of the features that are described or recommended in this document may require separate pernode licensing from EMC Isilon. For more information, please contact your EMC Isilon representative. The Challenge of Data Migration The migration of a storage system’s data and all the existing user access permissions is a complex process. Moving the data while limiting downtime and protecting the data can be challenging. While you execute a migration, the key requirements are being able to access the data at all times and ensuring data integrity against loss or corruption. It’s critical to understand that a data migration must be considered a unique project. Few environments are the same, and as a result each migration should be considered a unique event. No pre-existing approach will necessarily be appropriate for all migrations. But that’s not to say that common approaches cannot be used after you evaluate and understand the requirements of a specific migration. The goal of this whitepaper is to introduce the recommended approach to designing and executing an NFS data migration to an Isilon cluster. Every migration is different; this guide provides some examples and guidance. EMC IT Services is expert in working with customers to build an individual plan that meets their needs. Whether EMC Services are used for the migration or you plan on managing the migration in-house, the following are key areas that must considered before any migration project: • Investigate the composition of the source data. Is it a deep directory structure or a wide structure with many files per directory? The type and number of files/directories will directly influence how the migration is planned and executed. • Understanding the sequence of a migration project is critical to its success. The ability to predict and manage the time required to execute the data movement is paramount—it may be the single biggest factor that will affect the project’s success. • Maintaining data availability throughout the lifecycle of the data migration project is also critical. Little of today’s data can go for days without being available. In order to maintain data availability, you will need a strategy to maintain access to data throughout the migration. Risk Management It is not uncommon to have a number of challenges or perceived problems that are seen as blocking issues or barriers to executing the migration. With sufficient planning and testing, the perceived risks can be addressed and managed successfully. Common risks and problems associated with data migrations include the following: EMC Isilon Data Migration for NFS 6 • • • • • • • Underestimating the amount of data (file counts, sizes) and the time required to move it Maintaining performance of the existing data solution and the network during the migration Maintaining access to the data throughout the migration Changes to the data permission models Challenges with moving client connections Maintaining a consistent security model after the migration Execution of the actual cutover event This paper helps you understand these challenges and risks and develop a data migration methodology to manage the risks while implementing Isilon best practices to facilitate and optimize the migration. Data Integrity Data integrity is critical. The data must be moved exactly as it is, and any modification to the data during migration may impact the availability of data and the success of the migration. The goal of the project is to ensure the data is successfully moved and its integrity is not compromised during movement. In most cases this includes the migration of all relevant file metadata as well as the underlying data blocks. A complete backup should be made of the source data and the validity of the backup verified before the migration begins. Extreme cases where the data integrity must be maintained can be addressed with the use of checksums. An MD5 checksum can be calculated on each file on both the source and destination post migration, verifying bit-for-bit integrity. Data Availability Any migration activity will require a transition or cutover from the existing source systems to the new destination systems. This cutover will require a window of time when the data is unavailable. Minimizing this time is the goal of all migrations and is often determined by the type and amount of data. A number of migration strategies can be employed to reduce the period of unavailability of data during the cutover. In any cutover, data clients will also need to be redirected toward the new storage target to ensure that they can continue to access their data once the data has been moved. Project Phases and Methodology Overview A data migration project should be broken into distinct phases. The goal of the project phases is to develop a robust and, repeatable migration strategy that aids the execution and leads to a successful migration cutover. Discovery and Planning Phase The goal of the discovery and planning phase is to design a migration methodology and plan which allows you to execute the project with minimal risk and downtime. The following items are components of the discovery and planning phase: • • • • • • Qualify the project Identify the migration scope Understand expectations Identify risks Define the timeline Identify all migration requirements (i.e. Rollback plan) During this phase, a detailed review of the existing source environment and data is undertaken, and then developed into a plan to migrate the data to the new target Isilon environment. The planning phase should be completed and validated before you start the other project phases. The key aspects of the planning phase include discovey of the existing infrastructure, the data, and the Isilon cluster. EMC Isilon Data Migration for NFS 7 Infrastructure Discovery This is where the infrastructure of the existing storage system, network architecture and the network path between the source data and the Isilon cluster is evaluated. If, for example, multiple source filers are to be combined into a single Isilon cluster, extensive analysis of existing exports should occur. Export directory naming collisions (i.e. server1:/exports/home and server2:/exports/home are to be combined from different source filers) should be investigated to verify that the data and directories contained within will be able to coexist in a unified export on the target Isilon cluster (e.g. isilon:/ifs/data/home): For example: server1:/exports/home /user1 /user2 /user3 server2:/exports/home /user4 /user5 /user1 If the /exports/home directories are to be unified into a single /home on the Isilon cluster, the duplicate /user1 directory must be addressed. Possible solutions include combining (if the same user) or renaming (if different users, along with a corresponding UID change if necessary). Work with the customer to address these issues and craft a solution that will be minimally disruptive to their existing environment. Data Discovery This is where you analyze the data and workflows that you plan to migrate and how they map to the target end state on the Isilon cluster. Quotas If quotas are utilized on the source volume, they will need to be recreated on the target Isilon cluster. Implementation of the quotas however, should wait until after the migration is complete to avoid any potential issues while transferring data. Isilon Cluster Configuration Design and Discovery This is where activities such as the design of the Isilon network, disk pools, shares, and authentication can affect the migration design. The configuration of the cluster is critical to the success of the migration. The output of the discovery phase feeds into the migration design and drives the execution of the project. Migration Approach and Requirements The analysis of the data that you collected during the discovery phase drives the migration requirements and the migration plan. The migration requirements break down into subcategories: What – What are you migrating? EMC Isilon Data Migration for NFS 8 • • • • All the data or a subset of the data Replicate the existing data as is or transform it during migration Copy the data but implement a new security model Take a hybrid approach How – How are you going to migrate the data, security, workflows? • • • • • • • Tools used to copy the data and security Cutover strategy; how will client connections be moved? If the data has rapid rate of change, how will you accommodate it? Data is static, can be moved without impact Full data copies and follow-up incremental copies to gather recently updated data Clients access this data currently by method x/y/z. Limit access to the old data and redirect during the cutover When – When are you Implementing the Cutover? • • • • Single mass event Several large cutover events Series of smaller cutovers sustained over a larger time frame Rolling migration Having clearly defined these requirements, a migration methodology can be developed to address them. Migration Methodology Analysis of the migration requirements leads to the development of a migration methodology. The migration methodology follows a waterfall methodology with phases in general occurring on completion of the prior phase. Although the preparation for upcoming phases can occur before prior phases are complete, the execution is defined by the completion of its dependent phase. Figure 1 – Sample Migration Plan The plan addresses how all aspects of the migration are achieved: sequencing, tools, timing, communication and implementation. After you develop a plan, a proof of concept can help you evaluate the approach and test the phases of the plan. Toolset Selection After you finish the discovery phase and develop a methodology, you can select an appropriate selection of migration tools . EMC Isilon Data Migration for NFS 9 Testing the Migration Methodology After you develop a migration methodology, you must review, validate, and test the migration plan. A test migration is usually run on a subset of the data. Running a test migration is also invaluable in estimating the performance and timing of a migration. Data Migration Testing Testing of the actual data movement process and execution is the first phase in the testing of the overall methodology. The data migration testing determines whether the proposed methodology meets the requirements and accomplishes the goals of the project. Role of data migration testing: • • • • • • • Validates the tool selection; does the tool do what you want it to? Does it copy the data and attributes? Does it preserve hard and/or soft links? Validates the data transfer; is the data moved as expected? Validates that the permissions are copied over; are they correct, functional and operational? Benchmarks the data-transfer performance; how long does it take to run full and incremental transfers? Tests the new data: Is it available? Are the read-write settings correct? Does the new workflow work? Gives you the option to experiment with different methods, tools and flags Allows you to tune the process to achieve the best results The testing should give you confidence that the data will be accessible and available after all the data and users are cutover to the new system. User Acceptance Testing Before you execute the full migration and cutover, user acceptance testing (UAT) should be undertaken against the new storage system and a sample of the migrated data. UAT validates that the data is ready for cutover: • • • Data is accessible; users and application can access the data correctly Permission models are correct; the required security is applied to the migrated data Workflows are operational; no issues with the data’s use Cutover Methodology Testing Cutover methodology testing helps determine how you will move client connections—and how the clients will respond to the cutover. Through testing, you can gauge how long it takes to move the connections, what kind of issues may occur, and how to troubleshoot the issues. Testing the cutover strategy thoroughly provides feedback on how to execute the final cutover. Rollback Strategy Testing You should also test your rollback methodology. The rollback testing should validate that your plan to failback or abort a migration works so that you are prepared in case there are issues during the cutover. Make sure to validate that access to the data on the old system can be restored quickly and efficiently without affecting users. Executing the Migration Having completed and validated all your methodology and processes you can move on to the main migration. Core Migration Phases: • • • • Data transfer – all the data is migrated from the old system to the new system Cutover – connections and client are moved to the data on the new storage Acceptance – the new data source is ratified Rollback – only if required EMC Isilon Data Migration for NFS 10 • • Steady State and repeat – migration phase considered complete but additional separation migrations may occur Post migration monitoring – the new system and data is monitored following the cutover Data Transfer A standard approach is to execute an initial Full data copy to move all the initially identified data and to follow it up with a series of Incremental copies, which only move any data that had changed after the initial Full copies had run. This gives you the most flexibility in executing the cutover as the additional incremental copies will be substantially shorter to execute than the initial large data copies. Cutover After data migration, the act of actively moving clients from the old storage to the new storage will occur during a cutover event. High-level cutover plan: • • • • • • • • • Old source – Remove write access to ensure clients are unable to write any new data Execute a final incremental copy to migrate any remaining data from the old system to the new system Test new target data and connectivity; selective UAT GO or NO GO –Decision to move forward with cutover event Update the client-to-storage connection mechanisms; DNS, DFS, VIPs, etc. Monitor the new storage system – monitor load and connections as the clients cutover Validate clients – review and validate that clients can successfully connect and operate Validate workflow – verify business operations work as expected Cutover complete Acceptance Having migrated the data and client connections over to the new storage solution, the storage availability and workflow acceptance of the new data and storage solution must be validated. Once you begin writing new data to the new storage system, the ease with which you can roll back to the old storage system diminishes. If you were to roll back to the old system, any changed data would need to be copied back to the old environment. Unless this newly written data can be discarded, rewritten or manually reconciled, it is highly suggested that any rollback be executed before any significant changes to data have occurred to the new storage. Rollback Make sure you have a fully tested rollback plan in place. A rollback may be needed for a variety of reasons: • • • • Client connectivity or storage name resolutions issues develop following cutover Final incremental not completed in outage window, so not all data is migrated Unplanned IT outage or issue occurring at the same time Data access on the new storage is invalid and workflows are impaired The goal of a rollback plan is to quickly restore the access to the old data storage solution. Assuming the cutover was executed correctly; restoring the prior access should be straightforward and should be able to be implemented with minimal additional disruption. The primary goal would be to restore access within the cutover window so no additional downtime and interruption to data is introduced. It is critical to have a tested rollback plan that can be used if an issue with the cutover occurs. EMC Isilon Data Migration for NFS 11 Repetition After you validate the data transfer through cutover and client acceptance, most migration projects consist of multiple migration cycles. The methodology can be executed again on different sets of data in migration waves that encompass the entire project. Post Migration Following the migration cutover, it is important to monitor both the new storage system and the old storage system. You should observe that client connections are moving to the new system and that active data connections are no longer getting initiated on the old storage. Can clients connect and work with the new storage systems without issue? As connection counts increase on the new storage system, you should monitor the load and performance. Make performance adjustments as needed. You should be monitoring the following items during and after the cutover: • • New system: system load and performance, number of connections, movement of users, security, and performance Old system: are users still connecting to it? Are there legacy connections to it from old applications? If quotas were utilized on the source volume, after the migration is complete is the time to implement them. The Isilon OneFS feature SmartLock should likewise be implemented post migration. Committing files should only be done once data has been completely migrated. You should have a transition plan of what to do with the old storage. Some common approaches are as follows: • • • • • Keep it around for a while but with administrator access only Provide read-only access for users Mothball the system while the new systems transitions Decommission it Purge the data after a defined retention period has been reached Single Protocol NFS Data Migration Although this paper addresses a single NFS protocol data migration, the approach and many of the best practices can be used as a foundation for other types of data migrations. Challenges of Single Protocol NFS Data Migration Moving large amounts of data presents a number of challenges: • • • • • • • Difficult to perform without downtime; most source clusters are overloaded requiring that the data be available at all times and operate at near capacity. Hence the need for migration. However, the migration itself can present a significant additional load to the source cluster. A large number of exports may need to be migrated. You must move not only the data but also the exports and export permissions. This introduces a second type of migration (configuration) that must be undertaken during the project. Consolidation of multiple source filers into a single unified namespace and directory structure May have a large number of differently connected clients that require separate cutover and validation events NFS exports may be mounted deeper in the exported tree Restricted exports to specific hosts and unique permissions; verify export options High rate of change; often large environments contain a large number of concurrently connected clients. You must account for the rapid rate of change of data during and after cutover. EMC Isilon Data Migration for NFS 12 Data-Specific Considerations When you design a migration strategy, you should plan how your data will appear after it has been migrated. Consider the scenario of multiple smaller filers consolidating into a single Isilon cluster. There may be duplicate UID/GID collisions if the source filers are not using a consistent source of authentication. In that case, manual remediation may be necessary to combine and fix user/group accounts that have duplicate ID’s. For example: filer1: has users user1 (UID:305), user2 (UID:423), and user3 (UID:424). filer2: has users user1 (UID:305), user4 (UID:423), and user5 (UID:424). The UID’s for user2 (UID:423) and user4 (UID:423) are the same so one user would have to change to a different UID and the file ownership corrected during the migration. Same issue with user3 (UID:424) and user5 (UID:424); one must change and file ownership be corrected before the users can both use the new cluster. One possible resolution is to make user4 UID:1423 and change ownership of all their files prior to cutover. Likewise, make user5 UID:1424 and modify ownership on all of their files. Keep in mind that OneFS will store all UID/GID information regardless of source. OneFS does not require NFS authentication. Duplicate export paths are a similar issue: filer1: has exports /vol/share/acct, /vol/share/work, and /vol/share/eng filer2: has exports /vol/share/acct2, /vol/share/work and /vol/share/engineering /vol/share/work is the same from both source filers! This is a discussion that must be conducted with the customer prior to migration. A plan for directory consolidation must be developed to deal with export path collisions. One typical solution is an additional directory layer that identifies the original source filer: isilon: would have exports: /ifs/data/filer1/acct, /ifs/data/filer2/acct2, /ifs/data/filer1/work, /ifs/data/filer2/work, /ifs/data/eng, and /ifs/data/engineering With this methodology, duplicate export paths can safely be consolidated from multiple source filers into a single cohesive namespace. Clients however must be updated to reflect the updated export paths. The metadata of files, in particular, can add complexity to a migration. Identify the metadata that you want to migrate with the data. The following metadata can affect your migration strategy: • • • • • File properties such as access time, created time, modified times, owners File attributes such as read only or archive (an Isilon cluster does not support compressed and encrypted) Extended proprietary file attributes that are in use; not supported on Isilon Local users and groups; are these defined on the files? Deduplication in use, or Archive stub files, Mac OS X resource forks present Other data specific considerations: • Date/Atime/Creation time retention requirements – These may not be preserved across migrations depending on which tool is used, e.g. creation time is not preserved with isi_vol_copy. • Symlinks will break – Depending on where the symlinks connect to, the underlying paths will probably change after a migration and require rebuilding • Automount maps will need to be repointed (NIS,NIS+) – Similar to symlinks, the export paths may change and hostname may change as well • Character encoding – Verify that it is the same on source and target; international characters in filenames may be problematic EMC Isilon Data Migration for NFS 13 How does the data need to appear post-migration: • Direct replication of all data and attributes • Move the data, then make updates, fix problems, change the security, etc. Migrate just the data and implement an entirely new security model Migration Requirements and Customer Data Collection Before you can plan your migration, you must collect requirements. Requirements Gathering The data migration planning begins with identifying the data that you want to move from the old storage system to the new storage system. Here is what you need to document: • o o o o • o o o o Current State – What is the current state of: Source Infrastructure Existing storage platforms Network design and implementation Name resolution infrastructure: DNS, DFS, global namespace Servers/clients/OS/applications Source Infrastructure Configurations Volumes Shares/Exports Access Authentication Source Data Logical Structure – data layout, directory depth Is the structure Wide or Deep? Physical Structure – total size, min/max/average file size Number of files Source Data Security Current security model and how file access is enforced Local Users and groups POSIX permissions LDAP users and groups Target State – What will be the target state: Target Infrastructure: Isilon Cluster Network configuration Target Configurations: Directory layout and structure Shares/Exports Access and Authentication model Target Data Logical Structure – same as source or new Physical Structure – same as source or new Target Data Security Same as current security model Migrate and change the security model Move the data and implement a new security model How to gather the data • • • • • • Interview stakeholders Gather documents: network diagrams, run books, infrastructure and application details List of exports Storage reports, etc. Review share permissions Examine directory structure (shallow vs. deep), file composition (small vs. large), and numbers of files EMC Isilon Data Migration for NFS 14 Current Infrastructure and Data Analysis Start the migration design phase by collecting the data needed to develop the migration requirements. Best practice Create and utilize a standardized data collection and migration-planning document along with a standard target configuration guide. By using a structured document to gather and collect all your source data and information, you can identify your migration requirements, which will lead to clear migration design decisions. Why: Will simplify and consolidate migration planning and implementation. You need to collect the following information: • o • • • • • • • • • • • How much data; actual file data, not compressed or deduplicated If there is deduplicated data, how much; this will have to be added to the total How many directories and files; identify the directory trees and quantity What is the directory structure: shallow and flat, wide and deep, or? Directories with more than 10,000 files in them How many exports; do you see share name collisions or reuse on multiple source hosts What is the use of these exports; home directories, application or group use How permissions are applied to source; individual or at the group level How many source locations; single source system or multiple How clients access data; protocols and how they resolve storage names Rate of change of files; how often and where are files changing Networking architecture; source systems and network between it and the Isilon cluster Source system load; understand what load is the source storage under and how much additional overhead from the migration would be tolerable Determine Migration Methodology After you collect information on the source system, the data, and the infrastructure, you are ready to develop a migration methodology. Logical Migration Design By analyzing the structure and layout of the source data, you can make logical migration design decisions—structuring the migration into distinct executable units. A goal of the migration methodology is to identify logical boundaries to facilitate the cutover of your clients and workflows. Some logical migration boundaries are as follows: • • • • Hosts/Filers/Servers/Arrays Volumes Exports Directories – users or groups Best practice – Define migration boundaries Identify logical Migration Boundaries: Identify clearly well-defined data structures to migrate and cutover—for example, entire exports or directories. Be aware of the size of data inside a migration boundary, as the size of the data affects the outage window required to complete a cutover. EMC Isilon Data Migration for NFS 15 Why: This best practice organizes the migration into segments waves, making the migration easier to manage. After you segment the logical boundaries into distinct migration phases, you can address other elements of your workflow, such as metadata, that you need to migrate. File Attributes and Security Most data migration also includes the migration of the file’s metadata: ownership, access times, creation time and security descriptors. Before you can execute your migration, determine how you plan on handling metadata and file security. Common migration approaches: • • • • Migrate data file as-is (no change to permissions/ownership) Migrate data and permissions but also fix on destination (Recalibrate the permissions) Migrate data-only. Create new permissions on the destination, or create a new security design. Migrate away from an existing security model and implement a new model. It is recommended to use a central authentication scheme on Isilon. For example, if the NAS system that you are migrating from uses several directory services, you should consider consolidating the directory services into a single directory service for the new NAS system. Best practice – Understand the attributes of the source data before the migration Be data aware: Identify any DOS attributes, non-standard extended file attributes, and nonstandard permissions that are not supported by an Isilon cluster. Also, identify your local users or groups and have a plan to deal with them. Why: Before you execute the migration, you may need to take additional steps to prepare the data for migration so it will be available on the new storage system. Migration Sequencing The execution of a migration will likely require multiple iterations of the data transfer. If the source data is constantly changing, try to find a window when the source data can be locked in a read-only state or deny access to clients. Once access to the source data is removed, the final data transfer can take place. Otherwise, differences between the data on the source and the target might result. The recommended approach for a data migration is to use a multi-step migration. A multi-step migration consists of an Initial “Full or Level 0” data copy. The Initial data copy is followed by a series of “Incrementals” that update only new or changed data. The Initial data copy moves an entire copy of the source data. It can often take a long time to execute because all the data must be assessed and transferred over the network to the migration target. After the initial copy completes, additional differential transfers copy only the data that has changed since the initial Full was executed. Additionally, any data that is deleted on the source will also be deleted on the target through the incremental process. The size of an incremental copy is affected by the rate of change of the source data. You should run multiple, over-the-top incremental copies to guarantee the integrity and consistency of data that encounters issue during the initial Full copy. Incremental copies will also keep the two data sources in sync with each other and require less catch-up on final copy. A final incremental should always be executed as part of the migration cutover plan to ensure all the latest data is on the new target storage. EMC Isilon Data Migration for NFS 16 Best practice – Run initial Full copies followed by Incremental copies Run Initial Full copies followed by multiple Incremental copies. Always execute a final Incremental during cutover to ensure the latest data from the source is migrated. Why: Executing multiple migration passes will ensure that all the data is transferred and that the latest version of the files will be stored on the target storage system. Type of Migration You must determine how the migration will be executed. There are two possibilities: an indirect execution from a host and a direct execution from an Isilon cluster. With host-based migration, an intermediary host executes a copy process between the source system to the target system through the host, as shown in the following figure: Figure 2 - Host based migration With a host based migration all data is transferred through an intermediary host on route to the Isilon cluster. If the source system is supported, the Isilon cluster can execute a direct source to Isilon data copy by using the Isilon OneFS isi_vol_copy command which will copy data using NDMP protocol: EMC Isilon Data Migration for NFS 17 Figure 3 - Isilon based migration With an Isilon based migration, data is pulled directly the source system to the Isilon cluster utilizing isi_vol_copy. Another method of migration can be achieved by directly running the Linux rsync replication utility on the Isilon cluster itself: Figure 4 - Isilon based migration with rsync on individual nodes Similar to the OneFS isi_vol_copy tool rsync can be run natively on the individual Isilon nodes against locally mounted NFS source exports mounted directly on each Isilon node. Data is transferred directly from the source cluster to the Isilon cluster reducing latency and network congestion eliminating external host computers to move the data. The external NFS mounts on the Isilon nodes MUST BE UNMOUNTED prior to normal cluster operation Host-Based Migrations A host-based approach might be selected for a number of reasons: • • • • Source system does not support Isilon based migration – isi_vol_copy not supported Connectivity is restricted – storage on different networks, a host may bridge the networks Flexibility in execution – separate the execution from administration of the storage systems Security restrictions – can be used to limit access to systems EMC Isilon Data Migration for NFS 18 In a host-based migration, the toolset executing the migration makes a connection to the source and to the target system and then copies the data through the host. For the purpose of this paper, the primary host-based tool is rsync. Best practice – Select a suitable host Select a suitable host to run the migration with adequate network bandwidth and processing power. Why: Because all the data will move through the host, incorrect sizing may lead to a bottleneck or an interruption in the migration. Using multiple hosts may facilitate multi-streamed migrations in which you can maximize network usage and the Isilon nodes by executing multiple migrations concurrently. A host with 10Gb network connectivity is highly recommended. Some common considerations are as follows: • • • • • Adequate resources to execute the migrations; CPU, RAM, Network 10 gigabit network infrastructure where possible Connectivity between host and the source and target storage systems Availability; the host is stable and reliable, no reboots or downtime Dedicated host; not running a lot of other parallel workloads and restricted user access The migration host needs to be as optimized for the migration workload and high network throughput as much as possible, as it will send and receive all the data to be migrated. Isilon Based Migrations If the source system is capable of supporting an Isilon based migration by use of isi_vol_copy or by direct access with rsync, the connectivity exists, and the migration methodology supports utilizing this approach, a direct migration may be a more applicable technique. The main advantage of the direct approach is there is no need for an intermediary host to execute the process or for the data to traverse the external host. Type of Isilon Based Migration Two primary types of Isilon based migration exist. • • NDMP based with isi_vol_copy rsync based – use the unix rsync tool to connect and either push or pull data directly to the Isilon target, running natively on the Isilon cluster Known source systems that support Isilon based NDMP migrations: NetApp® Isilon Requirements: Isilon OneFS 6.5.5.6 or later newer NetApp Requirements: Data ONTAP 7.x or Data ONTAP 8.x operating in 7-mode It is anticipated that additional source systems will be supported in future releases of Isilon OneFS. As in all migration strategies, it is critical to evaluate the migration methodology against the selected approach to determine if the method selected will facilitate the migration goals. Best practice – Evaluate migration approach Evaluate and select the most appropriate migration approach; select the method that meets your specific migration requirements, provides cutover flexibility, and optimizes data throughput. EMC Isilon Data Migration for NFS 19 Why: The selected approach will impact the migration schedule and planning. Having identified the migration approach, the selection of the appropriate migration tool can occur. Migration Tool Selection and Use The data migration requirements will help define the tool selected to execute the data migration. Tool Selection A number of tools are available and will work; any file copy method that can connect over NFS to the source and target storage can be used to move data between the systems. It is recommended to use a tool that can be automated and provides robust functionality—a tool that can copy attributes, security, logging, etc. The common NFS data copy tools are as follows: Tool Advantage • Included with Isilon OneFS • Pulls across all users and group permissions • Supports both SMB and NFS protocols. • Utilizes NDMP Direct migration sourcetarget isi_vol_copy_vnx • Included with Isilon OneFS • Pulls across all user and group permissions • Supports both SMB and NFS protocols. • Utilizes NDMP • Direct migration sourcetarget rsync • EMC preferred tool • Designed for synchronizing directories • Only sends differences when files change • Lots of available switches • Can be scripted Open source-widely available tar, cpio • Can be scripted • Open source-widely available • Good for one full push of data Table 1: Summary of NFS copy tools isi_vol_copy Disadvantage • Only supported against specific source storage systems - NetApp systems running OnTap 6.5 and above with NDMP v4 • Limited error reporting • Only supported against specific source storage systems VNX 7.x OE and Celerra DART 5.6.x or higher • Limited error reporting • Limited error reporting • Limited error reporting • Designed for backup and restore, not active copying • Does not do incrementals EMC Isilon Data Migration for NFS 20 Tool Versions It is important to understand that different tools may behave differently on different hosts; it is strongly suggested to test tool versions and observe their behavior. Best practice – Use the correct tool for the job For NetApp(TM) Filer: use isi_vol_copy. For VNX: use isi_vol_copy_vnx. For general NFS filers: use rsync. Why: Using the correct tool for the job will give you the best chance for a successful migration. Best practice – Use the correct version of the tool Rsync is available on nearly every UNIX and Linux distribution as well as natively on the Isilon cluster. You must use the correct version for the OS for the host that is running the tool. Why: Using the correct version of the tool will optimize throughput and performance, i.e. use a 64bit version if your host OS is 64-bit. Best practice – Use the latest version of migration tools It is recommend to always utilize the latest versions of the chosen file copy tool. Why: Performance is optimized and they often have newer features and bug fixes. Migration Tools The following is an overview of the use of the primary NFS migration tools that can be used in Isilon data migrations: rsync Overview: The rsync tool provides a method to copy files, directories and sub-directories from NFS exports to other NFS exports with the ownership and attributes intact. It was designed to efficiently synchronize files and directories from one location to another minimizing data transfer while using delta encoding where appropriate. If the source and destination have many files (and parts of files) in common, the utility need only transfer the differences. Incremental change copies are thus extremely efficient. Rsync can operate in both a local and remote mode (as a service) and behaves similarly to rcp. It can “pull” or “push” files from filers. Rsync should be run as root to preserve file permissions and ownership. It can also use ssh if necessary for secure environments. Source code is available and rsync is implemented on nearly every modern operating system. Usage: Rsync [options] <source> <destination> EMC Isilon Data Migration for NFS 21 Features: • • Ability to copy file data, ownership, time stamp information Is extremely efficient for incrementals For a full list of rsync features and switches, run the following from a command shell: man rsync Sample rsync command: rsync $OPTS [$SOURCE:]$SOURCEDIR [$TARGET:]$TARGETDIR where variables are typically defined: $OPTS=--force –ignore-errors –delete-excluded –exclude –from=$EXCLUDES – delete –backup –a” $EXCLUDES=/path/filestoexclude $SOURCE=name of source filer $SOURCEDIR=/path/sourcefiles $TARGET=name of isilon node destination system $TARGETDIR=/path/targetdir Rsync by default runs in a local mode, but with the addition of [$HOST:] in front of either $SOURCEDIR or $TARGETDIR it can transfer files remotely between systems. For example: rsync –avh /tmp/foo root@host2:/tmp/bar If run from the source system would transfer the local directory, /tmp/foo, to the remote host, host2 and place them in the target directory, /tmp/bar. Note that a shell script is usually created to automate and distribute the rsync jobs. Entire migrations can be automated and incremental passes scripted to run automatically. Review the scripts with the customer to verify that the sequence of commands matches the expected migration plan. Best practice – rsync and compression If the data is mostly binaries or large uncompressible files it is not recommended to use compression as this will slow the migration considerably. Text files however, will readily compress and if the source data is such, the use of this option will greatly speed the migration. Know the source file composition. Why: Trying to compress non-compressible data will greatly slow the migration. Best practice – exclude snapshots from replication It generally does not make sense to migrate snapshots as they will not automatically work on the target system as intended. Therefore, exclude them from the migration to speed the process. Why: Snapshots will not migrate. For example: EMC Isilon Data Migration for NFS 22 rsync –avh –exclude=’.snapshot*’ /tmp/foo root@host2:/tmp/bar if run from the source system would transfer the local directory, /tmp/foo, to the remote host, host2 and place data in the target directory, /tmp/bar, while excluding snapshots. Best practice – spaces in name Be aware that spaces in file and directory names can cause problems. For example, a directory named, “/spaces in my name” and a file named, “some file.avi” would require special handling on both a command line and in a script: rsync –av foo@foomachine:’/spaces\ in\ my\ name/some\ file.avi’ /local_directory/ The “\” character is used before spaces to prevent the shell from parsing the next word as a separate argument. Why: Spaces in filenames and directories can cause scripts to fail. Be alert for them. Best practice – starting rsync switches Suggested initial rsync switches: --a=archive mode; equals –rlptgoD (no –H,-A,-X) --delete=delete extraneous files on the target (useful on incremental copies if source data has been deleted) --force=force deletion of directories even if not empty (during incremental passes if directories are deleted on the source) --compression=compress the data being transferred (if the data is compressible) Why: It is suggested to start with a baseline of switches and test the copy, validate the results and behavior of the copy, and make the appropriate adjustments to the rsync switches. No single default set of switches will work for all migrations. Remember, rsync can be run multiple times incrementally and different directories/exports may require different options. You should become familiar with many of the rsync switches and their use. The following highlights a few possible options that you should be familiar with. It is important to recognize that each migration will require different switches because of the unique requirements of each dataset. A few useful switches to be aware of include the following: -r,--recursive Recurse into directories -l,--links Copy symlinks as symlinks-p,--perms Preserve permissions -h Output numbers in a human readable format EMC Isilon Data Migration for NFS 23 --progress Show progress during transfer -z,--compress Compress the file data during the transfer -g,--group Preserve group -o,--owner Preserve owner -D Preserve special files and device files --protect-args If you need to transfer files that contain whitespace, either specify --protect-args or escape the whitespace with a “\” -stats Provides detailed list of the total number of files, files transferred, benchmarks, and an average transfer speed -t,--times Preserve modification times -v,--verbose Increase verbosity -n,--dry-run Perform a trial run with no changes being made --exclude=PATTERN Exclude files matching PATTERN --exclude-from=FILE Read exclude patterns from FILE Symbolic Links and Hard Links Be mindful of symbolic links within the source filesystem; they may not point to the same target after migration if paths change. Rsync has multiple methods to deal with symbolic links. Choose the most appropriate after consulting with the customer. They can also be dealt with in a separate migration pass. EMC Isilon Data Migration for NFS 24 By default, links are not transferred at all; a message such as, “skipping non-regular file” is generated for any symlinks that rsync encounters. Switches to deal with links include: --links Symlinks are recreated with the same target on the destination. Not that --archive implies --links. -L,--copy-links Symlinks are “collapsed” by copying their referent, rather than the symlink. -H,--hard-links Preserve hard links --safe-links Ignore symlinks that point outside the tree being replicated. Useful to prevent sensitive system files such as /etc/passwd from being inadvertently copied. Best practice – Know the rsync switches Understand all the rsync switches and when how to use them Why: Different migrations will require the use of different switch to meet the requirement of the data copy and the final state of the migrated data. Discuss with the customer before the migration begins to determine the optimal selection of switches. Best practice – Parallelizing the RSYNC Processes Examine the source directory structure and look for obvious means to divide the source directory tree into smaller, more manageable chunks. For example, if you have a filesystem of 4,000,000 files it might take 6 hours to complete (completely fictional made up number). Consider if the filesystem tree was divided into something like: drwxr-xr-x 2 root root 179 Jul 19 15:00 directory_a drwxr-xr-x 2 root root 179 May 1 00:00 directory_b It would cut the migration time in half if you could run two simultaneous rsync jobs at the same time (assuming the contents of the directories were balanced nearly equal): rsync –av –include=”/directory_a*” –exclude=”/*” –progress remote::/ /localdir/ rsync –av –include=”/directory_b*” –exclude=”/*” –progress remote::/ /localdir/ Best performance would come from spreading requests across multiple Isilon nodes and multiple source network interfaces. Multiple rsync jobs can be run on individual nodes as well, but these processes tend to be network limited. You want to spread the load across as much of the Isilon cluster as possible, maximizing the available bandwidth to each node. Be mindful if the NFS option “map root to nobody” is implemented as this may affect access to files. Why: You will see increased performance, but you may be limited by network bandwidth and source cluster throughput. EMC Isilon Data Migration for NFS 25 Isilon-based migrations – isi_vol_copy Overview: Isi_vol_copy is a native Isilon OneFS tool that supports data migration through the use of the NDMP protocol. The tool allows the cluster to mimic the behavior of a backup target and allows the data to be copied directly from the source system to the Isilon preserving permissions and ownership. Usage: isi_vol_copy <src_filer>:<src_dir> [options] <dest_dir> [ -full | -incr ] [-sa user: | user:password] [-sport ndmp_src_port] [-dhost dest_ip_addr] [-maxino maxino] [-h] Features: • • • • • • • • Utilize native NDMP functionality and connectivity Supports Full and Incremental backup levels Migrates data and all security, attribute information Will restore the set of permissions/ACLs that existed on the source data Will migrate NFS and SMB source data Does not impact or interact with client data access Dedicated data transfer pipe between source and cluster Starting with OneFS 7.0.2, it supports the Backup Restartable Extension, so full backups can be interrupted and restarted from a checkpoint Limitations: • • Source filers have a limits on the number of NDMP threads and simultaneous backup jobs; avoid overrunning the source filer Can be limited by source filer network bandwidth Sample isi_vol_copy command isi_vol_copy <source_filer_IP>:/<source> -sa<ndmpuser>:<ndmppassword> /ifs/data/<source_filer> -full Best practice – isi_vol_copy target data use Do not touch the data on the target Isilon until after the isi_vol_copy has completed. Why: This will create problems and you may have to re-do a full copy. Best practice – simultaneous isi_vol_copy use Do not execute multiple isi_vol_copy going to the same target. i.e. don’t have all your isi_vol_copy migrations going to the same target directory. For example: filer1:/vol/sourcedir -> isilon:/ifs/data filer2:/vol/sourcedir2 -> isilon:/ifs/data EMC Isilon Data Migration for NFS 26 Why: Creates problems for the copy process and may require remediation after migration. Instead: Use an additional directory level: filer1:/vol/sourcedir -> isilon:/ifs/data/filer1/sourcedir filer2:/vol/sourcedir2-> isilon:/ifs/data/filer2/sourcedir2 If consolidation is required, this can happen after the data is migrated and any potential merging of identically named subdirectories can be addressed. Best practice – isi_vol_copy use isi_vol_copy is optimized to streams as much data as possible across a network, always monitor load on the source and target systems for potential impact. Why: Since isi_vol_copy is optimized to streams as much data possible don’t overwhelm older source systems and create potential link saturation or disk problems especially if there are users connected and attempting to access files. Best practice – isi_vol_copy limits Recommend less than 40 million files per volume transfer when using isi_vol_copy. Why: All programs have limits and this is the recommended maximum when using isi_vol_copy for each individual transfer. Larger source volumes should be broken up into smaller chunks (i.e. use a separate isi_vol_copy stream for multiple subdirectories instead of one large transfer of an entire volume). Once the initial copy is complete then incrementals can be run: isi_vol_copy filer1:/vol/sourcedir –sa root:<password> /ifs/data/filer1/source –incr Important: Do not start an incremental copy job until after a full copy is complete. A successful FULL copy must finish before an incremental copy is started. Unlike rsync which automatically does incrementals, isi_vol_copy must be explicitly called with –incr to perform an incremental copy. Subsets of source directories/volumes can be migrated An entire volume does not need to be migrated; subdirectories can be migrated individually as well. For example: A volume, /export/vol1, is exported containing subdirectories /work, /scratch, /tmp, and /home. One could migrate the entire vol1 or any/all of the individual subdirectories under vol1, e.g. /export/vol1/work and /export/vol1/home might be the only necessary directories to move. Isilon-based migrations from VNX – isi_vol_copy_vnx Overview: Similarly, isi_vol_copy_vnx is a native Isilon OneFS tool that supports data migration through the use of the NDMP protocol for VNX. The tool allows the cluster to mimic the behavior of a backup target and EMC Isilon Data Migration for NFS 27 allows the data to be copied directly from the source system to the Isilon preserving permissions and ownership. Usage: isi_vol_copy_vnx <src_filer>:<src_dir> [options] <dest_dir> [ -full | incr ] [-sa user: | user:password] [-sport ndmp_src_port] [-dport ndmp_data_port] [-dhost dest_ip_addr] [-h] Features: • • • • • • • Utilize native NDMP functionality and connectivity Supports Full and Incremental backup levels Migrates data and all security, attribute information Will restore the set of permissions/ACLs that existed on the source data Will migrate NFS and SMB source data Does not impact or interact with client data access Dedicated data transfer pipe between source and cluster Limitations: • • Source filers have a limits on the number of NDMP threads and simultaneous backup jobs; avoid overrunning the source filer Can be limited by source filer network bandwidth Check with EMC support for the latest compatibility with tools, DART codes, and OneFS. Migration Preparation After you finish planning the migration and selecting the tools, you can prepare the source and target systems for the migration. Infrastructure and Environment Setup Network connectivity Since all the data in the migration will traverse the network, you should optimize the network infrastructure and connectivity between the source system(s) and the target Isilon cluster. Common recommendations include the following: • • • • Maximize network bandwidth; 10Gbps preferred to 1Gbps, optimized end to end, MTU of 9000 Limit hops and latency between source and target Isolate migration traffic so as not to compete with client access Limit potential network bottlenecks; routers, firewalls , IDS and shared network infrastructure Best practice – Optimize the network for the migration traffic Optimize the migration network path; look to limit other production traffic from this network, limit network devices the traffic traverse (firewalls, IDS etc.). Ideally look to create a dedicated private migration network that can be optimized for only the migration traffic. EMC Isilon Data Migration for NFS 28 Why: Separating the migration traffic from other network traffic will allow for maximum throughput and reduce potential impact to existing production traffic by limiting network saturation. Migration Account In order for the migration data to be copied from source to target the tool accessing the data must be able to access all of the source and target data. Commonly used migration accounts: • • root Users accounts created explicitly for the execution of isi_vol_copy, e.g. ndmp The account used to connect to the source and target storage systems depends on the security model implemented in the environment. Best practice – Use a specific migration account to execute migration tasks Use a specific migration account or account with group membership that has the required access to all source and target data, i.e. root. Why: Using a dedicated account will allow for oversight and management of the migration data access. It will also allow for the separation of migration tasks and users from other production accounts. Source Host Preparation The source data storage system should be prepared and optimized for the migration. Best practice Access can be restricted to the source cluster exports, preventing users from changing data on the source cluster instead of the migration target clusters. Change exports to read-only once migration and incremental copies are complete to prevent clients from writing to them. Why: Allows you to separate migration events from normal production access. It can be used during post cutover to deny read-writes by a normal user to the source cluster. This prevents updates to the data during data cutover and post-migration while continuing to allow administrative access. Migration Host Preparation - Source and Target Access The migration host should be prepared and optimized for running the migration copies. • • • • Limit workload and access to optimize throughput Restrict access and reduce service issue with the host Prepare all migration jobs as scripts Test and validate network throughput Best practice – watch out for root_squash On the source cluster exports sometimes restrict access by using root_squash to prevent root users connecting remotely and having root privileges. But this is something we need for migrating data. Use the option, “no_root_squash” to turn off root squashing. EMC Isilon Data Migration for NFS 29 Why: Must have root access (or equivalent) to migrate all files and directories. We can also set root squash on the Isilon NFS exports: Here we see under “User/Group Mappings: Use default: Map root users to nobody” has been set for the export /ifs/data/work. This should be disabled to allow root full access to the filesystem. Note that this can be set on an export by export basis. In addition we have restricted root access to a specific client, 192.168.43.200. Typically this would be the host doing the migration. Isilon Cluster Preparation Configuration All primary setup and configuration of the Isilon cluster should be completed before you begin to migrate data. The configuration includes but is not limited to the following: • • • • • • • • • • Authentication provider integration – all authentication providers are online and fully operational If local users, all UID’s and GID’s created and tested Access Zone and Role Based Access Control (RBAC) setup – complete any zone and RBAC setup Exports for clients created and tested Networking design and setup – complete the setup and implementation of the network configuration SmartPools – complete the implementation of any SmartPools policies to limit post-migration work SyncIQ – prepare any existing SyncIQ policies to operate alongside any data migration events SnapShotIQ – prepare any Snapshot policies to operate alongside any data migration events SmartLock – execute all preliminary SmartLock work prior to migration SmartQuotas disabled until the migration is complete EMC Isilon Data Migration for NFS 30 It is suggested to use a dedicated Isilon migration directory export to execute all migrations against. Using a dedicated administrative migration export with the appropriate access and configuration can facilitate the migration without impacting workflows or data permissions. Normal user clients will not mount this export, only migration hosts. Data can then be moved into place after the migration is complete with minimal disruption. Best practice – Implement a logical NFS export path structure methodology Be aware of export rules and how they interact with each other: 1. Path – should be unique though nesting is possible with caution 2. Permission options – restriction by User ID mapping and IP addresses 3. Security – netgroups, authsys, Kerberos The order of evaluation is: path, client ACLs, then security types (unix,krb5). For example, if you have exports: /ifs/data --client=8.8.8.0/24 /ifs/data/something --client=10.10.10.0/24 and your client IP is 10.10.10.10 you would not have access to /ifs/data/something because the export, /ifs/data, has a different IP restriction. The client must be able to traverse the path if the exports are nested. Access for 10.10.10.0/24 would need to be added to the /ifs/data export: /ifs/data --client=8.8.8.0/24 --client=10.10.10.0/24 /ifs/data/something --client=10.10.10.0/24 Why: Complex and restrictive rules may prevent clients from connecting to exports that are nested. Clients may encounter problems mounting a directory that is nested from a different export. If the parent directory has more restrictive permissions, a client may not be able to mount a child export of that directory. Best practice – Create the NFS exports Create the new Isilon NFS exports prior to data migration. Why: This will allow the creation and set up of the exports and export permissions prior to data migration and cutover for initial testing and access validation/UAT. Best practice – Do not use default /ifs export Allows mounting of subdirectories and open access to the whole file system. Why: This is intended for easy setup use only and can be a potential security issue. Best practice – Create the correct NFS export permissions Setup the correct export permissions on the newly created user exports. Why: Setting the correct export permissions will allow you to test and validate workflows when test migrations are undertaken and maintain security. EMC Isilon Data Migration for NFS 31 Note: The migration methodology may include adding an explicit “Deny” permission on users or directories so they cannot write data to these exports until the cutover has been executed as well as specific IP addresses to prevent clients from accessing the exports. Additional Isilon Cluster Considerations The following are some additional Isilon cluster considerations that may need to be addressed prior to and during a data migration. NFS 16 group limitation The NFS standard by default does not support membership in more than 16 groups per individual user. This limitation can be addressed on an EMC Isilon cluster by enabling “Map Lookup UID” under NFS Settings ->NFS Export Settings ->Export Behavior Settings -> Map Lookup UID: Additional information can be found in the EMC document: “NFS supplemental groups limited to 15 in OneFS 6.5.4 and earlier” Article Number:000089550 EMC Isilon Data Migration for NFS 32 Production or Pre-Production Cluster An important consideration around planning and execution of a data migration is the current status of the Isilon cluster. Is the cluster in production or will the migration mark the initial cutover to active production traffic? Our primary goal is to lessen any impact on a production cluster during migration activities, so suitable steps should be taken to address these concerns. Common factors to be aware of while migrating to a cluster: • • • Administratively destructive actions Saturation of network links Cluster load and ingest, impact on production workflows Access Zones and Role Based Access Control If the cluster uses an Isilon access zone or RBAC, the migration methodology may need to be adjusted to accommodate this configuration. Currently, OneFS 7.0 only allows for NFS exports in the default System Zone and no other zones. NFS RPC threads By default the number of NFS RPC thread is set to 16 per node. This can be increased for specific workflows. Consult support for assistance. Isilon OneFS SmartConnect or Direct Node Connections The current status of the cluster may dictate that you look to optimize and segregate migration traffic within the clusters network configuration. • • Use SmartConnect to auto balance traffic Separate migration traffic from existing production traffic, use a direct node or separate SmartConnect zone connection for migration traffic If you use SmartConnect, you should validate and optimize the configuration before transfer data across the network. Isilon OneFS SyncIQ Considerations If the data to be migrated will be replicated to a secondary cluster through a SyncIQ policy additional, planning should be undertaken to address the impact of the data migration and its interaction with active SyncIQ policies. • • • Pause active SyncIQ policies if they include migration paths Schedule SyncIQ jobs to run outside of data copy windows Utilize SmartConnect zones for copying and SyncIQ replication Isilon OneFS SmartPools Any SmartPools data policies should be in place prior to data migration, or additional cluster overhead maybe required to move data within the cluster post-migration. Isilon OneFS SnapshotIQ Any active SnapShotIQ policies should be analyzed for impact during the data migration. Antivirus Integration Review and disable any active antivirus scanning policies that may be running against the target data. EMC Isilon Data Migration for NFS 33 Best practice – Disable Antivirus Scanning Disable active antivirus scanning on migrated data during initial full and incremental copies. Why: The large influx of data associated with the migration can place an excessive load on the antivirus scanning architecture and create a slowdown and potential bottleneck for the inbound data. Isilon Guidelines for Large Workloads Be mindful not to exceed maximum configuration values listed in the table below Guideline Tested or Default Value Theoretical or Max Practical Value NFS max read size NFS max write size NFS3 connections (Per node) 128 KB 1 MB 512 KB 1 MB 1000 n/a NFS3 exports (Per cluster) nfsd threads (Per node) 750 2000 16 16 Filename length 255 B 255 B Path length 1024 B 1024 B Directory depth 8470 Unlimited* Comments This applies to both NFS3 and NFS4. Prior to OneFS 7.0, the maximum read size was 128 KB. This applies to both NFS3 and NFS4. Prior to OneFS 7.0, the maximum write size was 512 KB The number of TCP sockets available on the node is typically what limits NFS connections. Unlike Isilon’s SMB server, nfsd uses file handles instead of file descriptors to represent files, and internally handles work items much differently. 1000 connections is a very conservative, tested limit, and represents 1000 mounts over 20 exports. NFS connection testing is an ongoing test effort. A maximum connection limits for NFSv3 has not been established at this time. Beyond 2000, manageability becomes a problem. Cluster size does not matter. This is a kernel limit, exposed via sysctl. The value should not be changed without consulting Isilon Support. This limit represents the maximum number of simultaneous work items the server can service. Simultaneous work items above this number are queued and serviced when resources become available. Note: Most Unicode character encodings (like UTF-8, which is the OneFS default) allow for multiple bytes per character (UTF-8 allows for up to 4 B/character). So, of the 255 B in the filename, that could represent 255 characters, 63 characters, or some number of characters in between. This is the maximum absolute path (e.g. /ifs/data/foo/bar/baz/) length that can be passed into a syscall, not the maximum depth of a directory in the file system (see Directory Depth). * No specific hard limit is in place, but several other limits could come into play here (inode limits, metadata storage limits, etc.). In tests, command line (shell) utilities starting showing problems at a depth of 8470 (EBADF from many commands). At a depth of about 30000, internal utilities (e.g. Job Engine TreeDelete) also started failing. For utilities that make calls with absolute paths (e.g. cd /1/2/3/…), depth will be limited by “path length” as described above (this applies to path-based OneFS commands like ‘isi snapshot’ and ‘isi quota’). For utilities that access relative paths (e.g. cd 1, cd 2, cd 3…), these higher limits may apply, although the value of extraordinarily deep directories is questionable. EMC Isilon Data Migration for NFS 34 File size 4 TB 4 TB Hardcoded OneFS limit. Note that Job Engine performance can be impacted on files larger than a TB, due to inefficient per-file threading. Table 3: Isilon Guidelines for Large Workloads Migration Approach - Testing and Proof of Concept Having developed the migration approach, selected the toolset and prepared the infrastructure for the data migration, you can proceed with your initial testing of the methodology. The goal of the testing here is to validate the outcome—is data migrated, are the permissions moved, and are the timestamps moved? The testing phase also allows us to tune and modify the migration approach to optimize all parts of the migration. Recommended testing approach: • • • • • Run Full copy – benchmark and monitor Review and validate – potentially look at tuning or tweaking the methodology and re-run Run Incremental copy – benchmark and monitor Review and validate – potentially look at tuning or tweaking the methodology and re-run Continue to run incrementals – and continue to monitor EMC Isilon recommends that you test different copy methodologies to tune and optimize the throughput while meeting your migration requirements. Best practice – Execute multiple test migrations to validate the methodology It is recommended to execute multiple migration tests on smaller subsets of different data. Why: Since different data will tend to have different properties and access profiles, it is important to test all data types and how the migration methodology may need to be modified for different data sets. Critical areas to evaluate and monitor during data migration testing are the following: • • • • Network performance – throughput, saturation, impact Time to execute a Full data copy – will allow for refinement of project plans Time needed to execute an Incremental after X days of data change – will help define cutover Windows Cluster load, Source load, host load – will help tune and refine the migration methodology Best practice – Test all phases of the migration methodology Execute all steps in the migration methodology to identify the time involved and to verify that the proposed methodology fulfills all the migration requirements. Why: To identify issues with the methodology before executing production migrations and cutovers. Data Validation After you migrate the data, you must validate the data and the file attributes. Verify the following aspects of the data: • • • File data copied correctly; data is intact and integrity is maintained File security, ownership and attributes migrated correctly File timestamps are correct EMC Isilon Data Migration for NFS 35 Next, review the access control entries on a file by running the ls –led followed by a file name: You should also validate the data. Common methods include the following: • • • • File size compares Checksum/File Hash Compares – MD5 checksums Tools – md5, sum, cksum Audit and review directory structures Having reviewed the data attributes directly, it is critical to validate that the data works in client workflows. Performance A migration often moves a large amount of data. You must ensure that migration methodology, toolset and environment are optimized for performance and throughput to work within the migration timeline. The common areas to focus on when evaluating performance are as follows: • Identify bottlenecks – attempt to identify the worst performing component i.e. Disable antivirus scanning processes on target and/or source file systems during initial migration copies to minimize CPU impact on client access and its potential impact on elongating copy times. o WAN bandwidth physical (circuit limitations) and concurrency impact of other systems that are replicating data (SAN, backup etc.) over a shared link. This could have an effect on replication performance for SyncIQ jobs that need to be run to completion before certain cutovers can be conducted. • Timing of execution – time of day, day of week tests were executed versus performance • Collect metrics on the data copies, network throughput, source, host and target systems – evaluate the copy as a whole o Best practice – Time the incremental copies Benchmark the Incremental copies by timing how long they take to execute so you can plan and orchestrate the cutover phases appropriately. Why: Knowing how long an incremental copy will take will likely determine the length of time required to execute a cutover and will help determine the data outage window. User Acceptance Testing – Data and Workflow Testing The final step of migration data testing is the User Acceptance Test (UAT) in which the data is tested for integrity with existing workflows. It is suggested that test workflows are used as this data should only be considered test data at this time and may be removed by later migration steps. Best practice – Check workflows with test migrated data Review all workflows on test migrated data. Why: It is critical to validate that newly migrated data can be integrated into workflows (i.e. User Home Director or Group share access, etc.) at cutover time without issues. By testing the workflows, you ensure that cutovers occur without incident. EMC Isilon Data Migration for NFS 36 Begin Migration Execution After you complete all the testing and validation, you can begin to move into the production migration phases. All the information obtained from testing and tuning should be used to modify and optimize the overall methodology so that the production migrations are as clean and quick as possible. The migration execution phases are as follows: • • • Execute the initial full copy Execute Incremental copies to keep the new storage up-to-date and as close to production as possible Based on performance, multiple migrations may be executed simultaneously if supported and if the network infrastructure can support the additional load Multiple hosts (if using a host based migration) can replicate data in parallel; be mindful of load placed on the source cluster. Coordinate hosts so that they are not all writing to the same subdirectory to reduce contention. Spread hosts across the Isilon cluster to maximize network bandwidth and processing power. Alternative migration methodology: Mount the source NFS exports directly on the individual Isilon nodes and run the rsync commands directly on the cluster. The benefit here is if you have multiple nodes, your transfer bandwidth scales and you remove the “middle man” from the process. Data moves directly from the source filer to the Isilon cluster, without having to go to a host first then back out to the Isilon cluster. Latency is greatly reduced and migration times will drop. External NFS exports mounted on the Isilon cluster MUST BE UNMOUNTED prior to normal cluster operation. Best practice – Continue to run Incremental copies Continue to run incrementals even if the cutovers are not scheduled. Why: This will keep the source and target data closer in sync and require less data transfer during the final pre-cutover copy. Depending on the size of the data migrations, the initial full copies may take a while to execute. During this time you can prepare for the final cutover events. Pre-Cutover Preparation After you start to migrate data, you can begin to prepare your cutover events. Best practice – Create a detailed migration plan Create a detailed migration plan with all the specific steps and timing of the migration execution. Why: This document will dictate the commands and work that are being executed. The plan controls the entire migration from start to finish. All roles, tasks and responsibilities are defined. The detailed migration plan dictates how the migration is executed. Best practice – Create a cutover document Create a cutover document that defines the high level cutover tasks, responsibilities and timings. The document should outline the phases and sequence in which tasks are executed. EMC Isilon Data Migration for NFS 37 Why: This document will outline the sequence of events that need to occur during a cutover. It can be used to track and monitor the progress of the cutover. Best practice – Create a schedule and define outage windows Have a well-defined cutover schedule and outage window. Why: The schedule helps execute the migration cutover. The outage window is when you have clearly defined that access to storage will be unavailable and that you can make storage system changes without impact to clients. Best practice – Create a communication plan Have a communication plan. Why: This communication plan will clearly outline the protocols to keep all users up to date on the status of a migration and allow storage administrators to stay focused on the execution of the cutover and not be distracted by requests for information from end users. Best practice – Prepare the DNS name resolution infrastructure for cutover Lower the DNS TTL. Why: This will facilitate the cutover of clients using DNS name resolution by reducing the time between authoritative updates to domain name services. Additional Pre-Cutover preparation steps often include the following: • • • • Prepare the DFS namespace if applicable Create CNAMES in DNS Update scripts used by clients for storage connections Prepare clients and applications Cutover Event Having migrated the data and prepared the environment for cutover, the actual final migration event can occur. In general the high level cutover sequence resembles the following steps: • • • • • • • • • • • Initiate migration cutover window – communicate the event Restrict access or make source data Read-Only – prevent new writes to the old data source Execute a final incremental – copy all final data to new storage Validate final incremental – validate the source data is ready for cutover Final testing executed – the final cutover testing is completed Go or no go call on full cutover – decide if the migration should continue or rollback Update connection and name resolution protocols; DNS, CNAMES, DFS, scripts Enable new storage to Read/Write – enable writes to the new storage Continue testing and user acceptance – continue to test and monitor as production traffic moves over Execute the redirection of client to the new storage – initiate client redirection process Monitor – assess the cutover, new storage and clients On executing a cutover event, the following best practices are recommended: EMC Isilon Data Migration for NFS 38 Best practice – Follow the cutover schedule Follow a cutover schedule. Why: By following a well-defined schedule the migration can be monitored and controlled. It is suggested that you execute cutovers during off hours or when the number of active connections is low. Best practice – Test the migrated cutover data Prepare a number of data and workflow tests to execute against the migrated data. Have a number of well-defined production use cases, data tests and test users available to conduct post cutover testing and review. Why: Having a well-defined use case and users to validate the migration cutover will help in the decision to continue with the cutover. Best practice – Monitor clients and application during migrations Monitor client and application connections to the new storage during the cutover. Why: This will verify that your cutover methodology is working as defined and clients are moving and connecting to the new storage successfully. Best practice – Develop a client connection remediation plan Have a client remediation methodology plan in place and ready to execute against clients that exhibit any issues connecting to the new data targets. Why: Have a well-defined strategy to handle client connection issues, including a dedicated support line, email address, or an IT desk. The Go or No Go Decision During the migration cutover window, a critical point is reached. This threshold determines whether you continue with the cutover or abort the cutover and rollback to the existing storage. Common abort cutover situations: • • • • • • Final incremental does not complete in the outage window Cutover methodology fails; clients not connecting correctly Security issues with new storage system Workflow issues post cutover Load and availability problems Other unknown issues Best practice – Clearly define your cutover criteria It is critical to have a series of cutover criteria that clearly define when a migration will continue or be aborted and rolled back EMC Isilon Data Migration for NFS 39 Why: The criteria remove uncertainty, help with decision-making, and dictate the action to take. Once you begin to write data to the new storage system, reverting to the old system becomes much more complicated because you now need to reconcile data with the original storage system. Rollback If a decision to abort the cutover is taken, a well-defined rollback plan should have been developed and tested ahead of time so that you can restore data access as quickly as possible. Rollback Plan: • • • Prevent any new writes to the new storage Move client connections back to the old storage Enable writes to the old storage Best practice – Develop a rollback plan Have a clearly defined rollback strategy that is easy to implement and can restore user access to data quickly and cleanly. Make sure the plan is tested. Why: A rollback plan will quickly help you restore client data access in the event of a failure in a migration cutover event. If any data has already been written to the new storage and a rollback is executed, then steps to remediate this data must be taken to restore this new data back to the original storage. Common strategies for reconciling data during a rollback are as follows: • • • • Manually reconciled – identify and manually move any data from the new storage to the old storage A reverse incremental – have migration type jobs to run in a reverse direction to update the old storage Data just discarded – consider the data as non-critical and be conformable with not reconciling it Data is re-written from the client or application to the old storage – allow applications and clients to re-write the data The goal of any rollback strategy is to limit the impact to end users and restore data access as seamlessly as possible. It is for this reason that your migration cutover criteria should be well defined and that the rollback strategy should have been well tested in the event that you have to use it. Migration Event Complete After you successfully complete a cutover, you should continue to monitor the new storage system. Best practice – Monitor the new storage post cutover Continue to monitor the cutover storage after the cutover event for any new issue resulting from the cutover. Why: Production load and workflow may be unpredictable; close monitoring should be undertaken to rectify any post-migration issues. EMC Isilon Data Migration for NFS 40 Steady State Repetition Most data migrations will constitute multiple cutover events. Having developed a well-structured migration methodology, these additional cutover events should be run with the same plan and strategy. Lessons learned After you complete a migration, you should assess the success and failures of the methodology. If an additional migration needs to be performed, the lessons learned will allow you to refine the process. Ask yourself the following questions: • • • What worked during the migration and cutover What did not work during the migration and cutover Can the migration methodology be modified or optimized Conclusion The goal of this document is to supply you with solid guidance for conducting an NFS single protocol file migration from a NAS system to an Isilon cluster. The guidance is based on a comprehensive set of industry lessons learned and best practices on the technical aspects and process of data migration. As stated in the beginning of this document, this paper does not aim to be an exhaustive authoritative source on the subject of NFS single protocol migrations, but rather a comprehensive reference document that covers the key areas to help ensure your success. EMC can provide comprehensive services including migration services, Isilon training and education, and residency services to reduced risk and maximize system uptime and service levels during and after a data and system migration. EMC Isilon Data Migration for NFS 41 Appendix: Sample Migration Use Case This appendix provides a sample high-level overview of how to collect information for and plan a migration of NFS source directories from a single NFS server. Keep in mind that the information in this section provides only a skeleton of some of the information that you would want to collect and an overview of the strategy that you want to define for your migration. The use case that follows answers the following question at a high level: What are the recommendations and best practices as well as supported Isilon configurations to migrate all directories and data? From a sample NFS Filer storage system to an Isilon cluster: Source Configuration and Data Sample Directory Structure Single Source System – NFS Filer, 4x1Gb Ethernet ports Total Data - 25TB Max File Size – 4GB Min File Size – 0B Avg File Size – 256KB File Count: 8000000 8 Top level Exports: acct engineering home production RandD scratch temp work User home directories: each user has a single home directory under a higher level share, /exports/home. /exports /acct /engineering /home /production /RandD /scratch /temp /work Table 2 - Source Data Exports and permissions : drwxr-xr-x drwxr-xr-x drwxrwxr-x drwxrwxr-x drwxrwxr-x drwxrwxr-x drwxrwxr-x drwxrwxrwx drwxrwxrwx drwxrwxr-x 10 21 2 2 2 2 2 2 2 2 root root root root bob root prod root root root wheel wheel rd acct eng users prod wheel wheel eng 512 512 512 512 512 512 512 512 512 512 Aug Aug Aug Aug Aug Aug Aug Aug Aug Aug 27 27 27 27 27 27 27 27 27 27 14:50 14:50 14:50 14:50 14:50 14:50 14:50 14:50 14:50 14:50 . .. RandD acct engineering home production scratch temp work Table 3 - Exports and Permissions Additional Source Information Additional Isilon Information EMC Isilon Data Migration for NFS 42 All LDAP – single Domain Source system network connectivity – 4 x 1Gb No firewalls, IDS or QoS Monthly Full backups Anti-virus scanning in place Same datacenter as Isilon No deduplication or offline files DNS – 2 CNAMEs No routing or VLAN restrictions Isilon 3 x X200 – ~61TB OneFS 7.0.x LDAP authentication 3 x LACP (2 x 1Gbps each) SmartQuotas SnapshotIQ Table 4 - Additional Source Data Information Requirements -All data and permissions moved as is, no changes. Retain all existing POSIX permissions and ownership -Eight cutover events; 12 hour window – Saturday 8:00PM through Sunday 8:00AM -One migration per weekend -Each user has a defined quota, quota limits to be replicated on Isilon Migration Project Assumptions (including but not limited to) -Customer will have approved change controls submitted for any migration activity -Migration plan and design will have been reviewed and approved by the customer prior to the start of cutovers -Any recommended array OS upgrades (and firmware updates) necessary for the migration will be applied before migration cutover activity -Source NAS and target Isilon systems are in a known good state prior to conducting migrations -Customer will have successfully completed a full system backup and verified it’s reliability Strategy -Isilon based migration; source NFS exports mounted directly on individual Isilon nodes -Conduct pilot migration; validate methodology, document performance metrics, refine and tune rsync switches and scripts. Test migrated data with clients and users. -Investigate sizes and file counts in each export; this will help determine order of migration (i.e. the largest directories will take the longest time and should be started first) -Validate change rates and time to execute incrementals -Have completed a full backup of all file systems that are to be migrated before cutover; verify backup is good -Develop detailed project timeline and cutover schedule with customer -Develop back out plan, review with customer -Execute migration phases: execute initial full copies, followed by nightly incrementals -Use a DNS update methodology to redirect clients EMC Isilon Data Migration for NFS 43 -During cutover; make sure each source File System is changed to read-only after source directory is successfully replicated preventing clients from making any changes -Reduce DNS TTL’s in advance of cutover Windows -Umount externally mounted NFS exports from each Isilon node. -Develop client communication; customer should provide dedicated cutover IT support desk/personnel Source System Configuration -If possible, restrict access so clients cannot modify data during cutover (change to read-only) Isilon Configuration -Pre-create NFS shares with identical share permissions -Disable Snapshots on data until cutover complete -Disable AV scanning Toolset Selection -use rsync Example: rsync --a --delete sourcefiler:/exports/acct /ifs/data/exports/acct -exclude ‘.snapshot*’ Migration Testing -Map source and target directories from the migration host -Replicate a small test set of data -Use a small set of test users to validate full data access (data to be discarded or overwritten following test) Migration This example directory structure can be broken down into at least 8 separate rsync jobs. Several of the jobs can be run in parallel assuming the source cluster can tolerate the additional load while the migration is occurring. Monitor the source cluster and scale up or down accordingly. If the customer has a preference for the order of directory migration then plan the transfers accordingly. Otherwise, start with the largest directories since they will take the most time to replicate. Example: Initial migration testing indicated that 3 simultaneous rsync jobs would be an acceptable additional load on the source cluster. We will run 3 rsync jobs concurrently so as not to overload the source cluster. Note that in other scenarios, additional nodes could run additional rsync jobs in parallel, if the source cluster has enough performance and bandwidth to accommodate. Mount an individual export to a different node and start an rsync: Isilon node 1: Mount filer1:/export/acct with a command similar to: mount filer1:/export/acct /import/acct (mounts the remotely exported directory to a created directory locally) Rsync to /ifs/data/filer1/acct with a command similar to: EMC Isilon Data Migration for NFS 44 rsync –av /import/acct /ifs/data/filer1 --exclude ‘.snapshot*’ (this would copy files in “archive” mode which ensures symbolic links, devices, attributes, permissions, ownerships, etc. are preserved in the transfer. No compression. Exclude any snapshot directories from the transfer) Repeat with Isilon node 2: Mount filer1:/export/engineering and rsync to /ifs/data/filer1/engineering. And again with Isilon node 3: Mount filer1:/export/home and rsync to /ifs/data/filer1/home. Monitor the source cluster to verify that it is not overloaded with the additional strain of migrating data. The idea is not to impact clients as the migration progresses. Repeat with the remaining shares until all of the source data is migrated. Once an initial full copy of the source data has been completed, incremental copies should be run to propagate any changes that were made once the migration began. On the day of the cutover, final incremental copies should be run and access to the source cluster should be restricted if possible to prevent clients from writing data that may not be migrated. Be sure to unmount the external NFS mounts on the individual Isilon nodes prior to normal cluster operation. User Acceptance Testing -Verify data and permissions on Isilon are replicated correctly following final copies -Review user access, verify users have connectivity and correct permissions -Use a small set of test users to validate full data access during migration event to spot any problems early on -Verify that a user can read/write to a file, create a new file and directory, and traverse the directory structure -Monitor Isilon performance and client connections as load increases Cutover Plan -Make source file systems read-only -Execute final incremental copies -Update DNS -Verify clients can connect to Isilon -Test automated workflows if possible -Initiate user logoff and re-logon Rollback Plan -Reverse DNS update -Make old source file systems RW, remove any restrictions -Remove any connections to Isilon; stop exports -Any data that was written to Isilon is considered lost, no reverse or reconciliation will be done EMC Isilon Data Migration for NFS 45 Exit Criteria -DNS resolves to new target storage and shares -Client successfully read/write data to directories -Workflows successfully complete with no user/permission problems -No connectivity issues Post Cutover -Customer meeting to review and triage the migration -Documentation of migration protocol -Lessons learned discussion with both the internal and customer team EMC Isilon Data Migration for NFS 46