White Paper
NFS FILE MIGRATION TO EMC ISILON
Guidance for optimal data migration of NFS workflows
Abstract
This paper provides technical information and
recommendations to help you migrate data from a single NFS
protocol workflow on another NAS vendor to an EMC Isilon
storage cluster. It includes the best practices for planning,
setting up, and executing the migration.
November 2014
Copyright © 2014 EMC Corporation. All Rights Reserved.
EMC believes the information in this publication is accurate as
of its publication date. The information is subject to change
without notice.
The information in this publication is provided “as is.” EMC
Corporation makes no representations or warranties of any kind
with respect to the information in this publication, and
specifically disclaims implied warranties of merchantability or
fitness for a particular purpose.
Use, copying, and distribution of any EMC software described in
this publication requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC
Corporation Trademarks on EMC.com.
Part Number H12517
EMC Isilon Data Migration for NFS
2
Table of Contents
Introduction ............................................................................................................ 5
Assumptions ...................................................................................................................... 5
Audience ............................................................................................................................ 5
Prerequisites ...................................................................................................................... 6
The Challenge of Data Migration ......................................................................................... 6
Risk Management............................................................................................................... 6
Data Integrity...................................................................................................................... 7
Data Availability ................................................................................................................. 7
Project Phases and Methodology Overview .............................................................. 7
Discovery and Planning Phase ............................................................................................ 7
Migration Approach and Requirements .......................................................................... 8
Migration Methodology .................................................................................................. 9
Toolset Selection ............................................................................................................ 9
Testing the Migration Methodology .................................................................................. 10
Data Migration Testing ................................................................................................. 10
User Acceptance Testing .............................................................................................. 10
Cutover Methodology Testing ....................................................................................... 10
Rollback Strategy Testing ............................................................................................. 10
Executing the Migration .................................................................................................... 10
Data Transfer ................................................................................................................ 11
Cutover ........................................................................................................................ 11
Acceptance .................................................................................................................. 11
Rollback ....................................................................................................................... 11
Repetition .................................................................................................................... 12
Post Migration .................................................................................................................. 12
Single Protocol NFS Data Migration ........................................................................ 12
Challenges of Single Protocol NFS Data Migration ............................................................ 12
Data-Specific Considerations ........................................................................................... 13
Migration Requirements and Customer Data Collection .................................................... 14
Requirements Gathering ............................................................................................... 14
Current Infrastructure and Data Analysis ...................................................................... 15
Determine Migration Methodology ................................................................................... 15
Migration Sequencing .................................................................................................. 16
Type of Migration.......................................................................................................... 17
Host-Based Migrations ................................................................................................. 18
Isilon Based Migrations ................................................................................................ 19
Migration Tool Selection and Use ................................................................................. 20
Migration Tools ............................................................................................................ 21
EMC Isilon Data Migration for NFS
3
rsync ............................................................................................................................ 21
Isilon-based migrations – isi_vol_copy......................................................................... 26
Isilon-based migrations from VNX – isi_vol_copy_vnx ................................................. 27
Migration Preparation ....................................................................................................... 28
Infrastructure and Environment Setup .......................................................................... 28
Source Host Preparation ............................................................................................... 29
Migration Host Preparation - Source and Target Access ................................................ 29
Isilon Cluster Preparation Configuration ........................................................................... 30
Additional Isilon Cluster Considerations ....................................................................... 32
NFS 16 group limitation ................................................................................................ 32
Production or Pre-Production Cluster ............................................................................ 33
Access Zones and Role Based Access Control .............................................................. 33
NFS RPC threads ........................................................................................................... 33
SmartConnect or Direct Node Connections ................................................................... 33
SyncIQ Considerations ................................................................................................. 33
SmartPools................................................................................................................... 33
SnapshotIQ .................................................................................................................. 33
Antivirus Integration ..................................................................................................... 33
Isilon Guidelines for Large Workloads........................................................................... 34
Migration Approach - Testing and Proof of Concept........................................................... 35
Data Validation ................................................................................................................ 35
Performance ..................................................................................................................... 36
User Acceptance Testing – Data and Workflow Testing ..................................................... 36
Begin Migration Execution ................................................................................................ 37
Pre-Cutover Preparation.................................................................................................... 37
Cutover Event ................................................................................................................... 38
The Go or No Go Decision ................................................................................................. 39
Rollback ........................................................................................................................... 40
Migration Event Complete ................................................................................................ 40
Steady State ..................................................................................................................... 41
Conclusion ............................................................................................................ 41
Appendix: Sample Migration Use Case ................................................................... 42
EMC Isilon Data Migration for NFS
4
Introduction
This white paper outlines the recommended approach for migrating single protocol NFS data from
other NAS filers to an EMC Isilon storage cluster. Single Protocol NFS data is defined as data read,
written, or modified using NFSv2 or NFSv3 protocols. The paper includes best practices on Isilon
cluster configuration, tool selection, and host setup to optimize an NFS based data migration. The
paper also includes best practices to optimize performance, management and support. Although this
guide addresses a single NFS protocol data migration, the approach and many of the best practices
can be used as a foundation for other types of data migration.
Much of the relevant information for planning, provisioning, and supporting end-user directories on an
Isilon storage cluster is available through white papers and guides from EMC Isilon at
http://support.emc.com. As such, this guide avoids duplicating content by including only the
information that pertains to setting up and operating an Isilon cluster as a destination for a single
protocol NFS data migration.
Assumptions
This document focuses on the data migration of NFS; it does not specifically address the migration to
an Isilon cluster of NFS exports, local users and groups, or any other NFS configurations from another
network-attached storage (NAS) system.
This document should not be used in a multiprotocol migration. Multiprotocol migration requires
many additional considerations, and specific actions may be different.
This guide assumes that the source data is only accessed through a single protocol NFS workflow.
Further, authentication and authorization for POSIX users (UID/GID) needs to be consistent on the
source and destination migration clusters. LDAP, NIS, distributed local files, and AD with RFC2307
enabled (SFU) are all centralized means of authentication that should be successfully implemented
before the migration begins.
An authoritative and consistent source for user authentication is assumed (e.g. AD with RFC2307,
LDAP, NIS, etc). If there are multiple sources of authentication from disparate clusters targeted for
migration (i.e. several local different /etc/password files) they must be inspected for collisions and
manually combined if necessary to prevent UID and GID collisions. Identity management must be
optimized with preference for a single external directory service is recommended. Work with the
customer to establish a consistent and authoritative set of users and groups if necessary.
Workflows will be for NFS v3 with the future possibility of adding SMB workflows in the future; NFSv4 is
not addressed in this document.
The EMC Isilon filesystem will have a “Balanced” on-disk identity setting for global ACL policy.
Files with POSIX mode bits only are in-scope for this document; additional permissions such as SMB
ACL’s are not covered.
Multiple source clusters may have identical export directories. A directory consolidation plan must be
developed to deal with directory name collisions in the newly created single namespace on the Isilon
cluster.
Files and directories have unique permissions that restrict access to the intended users and groups; a
typical migration preserves and transfers them. Post migration permission transformation is not
covered in this text.
Audience
This guide is intended for experienced system and storage administrators who are familiar with file
services and network storage administration.
This guide assumes you have a working knowledge of the following:
EMC Isilon Data Migration for NFS
5
•
•
•
•
•
•
•
NAS systems
The NFS storage protocol, as appropriate for the specific migration requirements
Isilon scale-out storage architecture and the OneFS operating system
Additional Isilon features, including SmartConnect, SmartPools policy management, SnapshotIQ, and
SmartQuotas
File-system management concepts and practices, including provisioning, permissions, and performance
optimization
Integration practices for connecting and establishing authentication relationships centralized sources (LDAP,
NIS, local files, etc. )
Basic shell commands and command line operation. Basic shell scripting.
While this guide is intended to provide a consolidated reference point to migrate data to an Isilon
storage cluster, it is not intended to be the authoritative source of information on the technologies and
features used to provide and support a file-services platform. In the event that additional services are
required, EMC IT Services are available to assist in streamlining data migrations, reducing risk and
minimizing impact.
Prerequisites
Some of the features that are described or recommended in this document may require separate pernode licensing from EMC Isilon. For more information, please contact your EMC Isilon representative.
The Challenge of Data Migration
The migration of a storage system’s data and all the existing user access permissions is a complex
process. Moving the data while limiting downtime and protecting the data can be challenging. While
you execute a migration, the key requirements are being able to access the data at all times and
ensuring data integrity against loss or corruption.
It’s critical to understand that a data migration must be considered a unique project. Few environments
are the same, and as a result each migration should be considered a unique event. No pre-existing
approach will necessarily be appropriate for all migrations. But that’s not to say that common
approaches cannot be used after you evaluate and understand the requirements of a specific
migration. The goal of this whitepaper is to introduce the recommended approach to designing and
executing an NFS data migration to an Isilon cluster. Every migration is different; this guide provides
some examples and guidance. EMC IT Services is expert in working with customers to build an
individual plan that meets their needs. Whether EMC Services are used for the migration or you plan on
managing the migration in-house, the following are key areas that must considered before any
migration project:
•
Investigate the composition of the source data. Is it a deep directory structure or a wide
structure with many files per directory? The type and number of files/directories will directly
influence how the migration is planned and executed.
•
Understanding the sequence of a migration project is critical to its success. The ability to
predict and manage the time required to execute the data movement is paramount—it may be
the single biggest factor that will affect the project’s success.
•
Maintaining data availability throughout the lifecycle of the data migration project is also
critical. Little of today’s data can go for days without being available. In order to maintain data
availability, you will need a strategy to maintain access to data throughout the migration.
Risk Management
It is not uncommon to have a number of challenges or perceived problems that are seen as blocking
issues or barriers to executing the migration. With sufficient planning and testing, the perceived risks
can be addressed and managed successfully.
Common risks and problems associated with data migrations include the following:
EMC Isilon Data Migration for NFS
6
•
•
•
•
•
•
•
Underestimating the amount of data (file counts, sizes) and the time required to move it
Maintaining performance of the existing data solution and the network during the migration
Maintaining access to the data throughout the migration
Changes to the data permission models
Challenges with moving client connections
Maintaining a consistent security model after the migration
Execution of the actual cutover event
This paper helps you understand these challenges and risks and develop a data migration
methodology to manage the risks while implementing Isilon best practices to facilitate and optimize
the migration.
Data Integrity
Data integrity is critical. The data must be moved exactly as it is, and any modification to the data
during migration may impact the availability of data and the success of the migration. The goal of the
project is to ensure the data is successfully moved and its integrity is not compromised during
movement. In most cases this includes the migration of all relevant file metadata as well as the
underlying data blocks. A complete backup should be made of the source data and the validity of the
backup verified before the migration begins.
Extreme cases where the data integrity must be maintained can be addressed with the use of
checksums. An MD5 checksum can be calculated on each file on both the source and destination post
migration, verifying bit-for-bit integrity.
Data Availability
Any migration activity will require a transition or cutover from the existing source systems to the new
destination systems. This cutover will require a window of time when the data is unavailable.
Minimizing this time is the goal of all migrations and is often determined by the type and amount of
data. A number of migration strategies can be employed to reduce the period of unavailability of data
during the cutover. In any cutover, data clients will also need to be redirected toward the new storage
target to ensure that they can continue to access their data once the data has been moved.
Project Phases and Methodology Overview
A data migration project should be broken into distinct phases. The goal of the project phases is to
develop a robust and, repeatable migration strategy that aids the execution and leads to a successful
migration cutover.
Discovery and Planning Phase
The goal of the discovery and planning phase is to design a migration methodology and plan which
allows you to execute the project with minimal risk and downtime. The following items are components
of the discovery and planning phase:
•
•
•
•
•
•
Qualify the project
Identify the migration scope
Understand expectations
Identify risks
Define the timeline
Identify all migration requirements (i.e. Rollback plan)
During this phase, a detailed review of the existing source environment and data is undertaken, and
then developed into a plan to migrate the data to the new target Isilon environment. The planning
phase should be completed and validated before you start the other project phases. The key aspects
of the planning phase include discovey of the existing infrastructure, the data, and the Isilon cluster.
EMC Isilon Data Migration for NFS
7
Infrastructure Discovery
This is where the infrastructure of the existing storage system, network architecture and the network
path between the source data and the Isilon cluster is evaluated.
If, for example, multiple source filers are to be combined into a single Isilon cluster, extensive analysis
of existing exports should occur. Export directory naming collisions (i.e. server1:/exports/home
and server2:/exports/home are to be combined from different source filers) should be
investigated to verify that the data and directories contained within will be able to coexist in a unified
export on the target Isilon cluster (e.g. isilon:/ifs/data/home):
For example:
server1:/exports/home
/user1
/user2
/user3
server2:/exports/home
/user4
/user5
/user1
If the /exports/home directories are to be unified into a single /home on the Isilon cluster, the
duplicate /user1 directory must be addressed. Possible solutions include combining (if the same
user) or renaming (if different users, along with a corresponding UID change if necessary). Work with
the customer to address these issues and craft a solution that will be minimally disruptive to their
existing environment.
Data Discovery
This is where you analyze the data and workflows that you plan to migrate and how they map to the
target end state on the Isilon cluster.
Quotas
If quotas are utilized on the source volume, they will need to be recreated on the target Isilon cluster.
Implementation of the quotas however, should wait until after the migration is complete to avoid any
potential issues while transferring data.
Isilon Cluster Configuration Design and Discovery
This is where activities such as the design of the Isilon network, disk pools, shares, and authentication
can affect the migration design. The configuration of the cluster is critical to the success of the
migration.
The output of the discovery phase feeds into the migration design and drives the execution of the
project.
Migration Approach and Requirements
The analysis of the data that you collected during the discovery phase drives the migration
requirements and the migration plan. The migration requirements break down into subcategories:
What – What are you migrating?
EMC Isilon Data Migration for NFS
8
•
•
•
•
All the data or a subset of the data
Replicate the existing data as is or transform it during migration
Copy the data but implement a new security model
Take a hybrid approach
How – How are you going to migrate the data, security, workflows?
•
•
•
•
•
•
•
Tools used to copy the data and security
Cutover strategy; how will client connections be moved?
If the data has rapid rate of change, how will you accommodate it?
Data is static, can be moved without impact
Full data copies and follow-up incremental copies to gather recently updated data
Clients access this data currently by method x/y/z.
Limit access to the old data and redirect during the cutover
When – When are you Implementing the Cutover?
•
•
•
•
Single mass event
Several large cutover events
Series of smaller cutovers sustained over a larger time frame
Rolling migration
Having clearly defined these requirements, a migration methodology can be developed to address
them.
Migration Methodology
Analysis of the migration requirements leads to the development of a migration methodology. The
migration methodology follows a waterfall methodology with phases in general occurring on
completion of the prior phase. Although the preparation for upcoming phases can occur before prior
phases are complete, the execution is defined by the completion of its dependent phase.
Figure 1 – Sample Migration Plan
The plan addresses how all aspects of the migration are achieved: sequencing, tools, timing,
communication and implementation. After you develop a plan, a proof of concept can help you
evaluate the approach and test the phases of the plan.
Toolset Selection
After you finish the discovery phase and develop a methodology, you can select an appropriate
selection of migration tools .
EMC Isilon Data Migration for NFS
9
Testing the Migration Methodology
After you develop a migration methodology, you must review, validate, and test the migration plan. A
test migration is usually run on a subset of the data. Running a test migration is also invaluable in
estimating the performance and timing of a migration.
Data Migration Testing
Testing of the actual data movement process and execution is the first phase in the testing of the
overall methodology. The data migration testing determines whether the proposed methodology meets
the requirements and accomplishes the goals of the project.
Role of data migration testing:
•
•
•
•
•
•
•
Validates the tool selection; does the tool do what you want it to? Does it copy the data and attributes? Does
it preserve hard and/or soft links?
Validates the data transfer; is the data moved as expected?
Validates that the permissions are copied over; are they correct, functional and operational?
Benchmarks the data-transfer performance; how long does it take to run full and incremental transfers?
Tests the new data: Is it available? Are the read-write settings correct? Does the new workflow work?
Gives you the option to experiment with different methods, tools and flags
Allows you to tune the process to achieve the best results
The testing should give you confidence that the data will be accessible and available after all the data
and users are cutover to the new system.
User Acceptance Testing
Before you execute the full migration and cutover, user acceptance testing (UAT) should be undertaken
against the new storage system and a sample of the migrated data. UAT validates that the data is ready
for cutover:
•
•
•
Data is accessible; users and application can access the data correctly
Permission models are correct; the required security is applied to the migrated data
Workflows are operational; no issues with the data’s use
Cutover Methodology Testing
Cutover methodology testing helps determine how you will move client connections—and how the
clients will respond to the cutover. Through testing, you can gauge how long it takes to move the
connections, what kind of issues may occur, and how to troubleshoot the issues. Testing the cutover
strategy thoroughly provides feedback on how to execute the final cutover.
Rollback Strategy Testing
You should also test your rollback methodology. The rollback testing should validate that your plan to
failback or abort a migration works so that you are prepared in case there are issues during the
cutover. Make sure to validate that access to the data on the old system can be restored quickly and
efficiently without affecting users.
Executing the Migration
Having completed and validated all your methodology and processes you can move on to the main
migration.
Core Migration Phases:
•
•
•
•
Data transfer – all the data is migrated from the old system to the new system
Cutover – connections and client are moved to the data on the new storage
Acceptance – the new data source is ratified
Rollback – only if required
EMC Isilon Data Migration for NFS
10
•
•
Steady State and repeat – migration phase considered complete but additional separation migrations may
occur
Post migration monitoring – the new system and data is monitored following the cutover
Data Transfer
A standard approach is to execute an initial Full data copy to move all the initially identified data and
to follow it up with a series of Incremental copies, which only move any data that had changed after
the initial Full copies had run. This gives you the most flexibility in executing the cutover as the
additional incremental copies will be substantially shorter to execute than the initial large data copies.
Cutover
After data migration, the act of actively moving clients from the old storage to the new storage will
occur during a cutover event.
High-level cutover plan:
•
•
•
•
•
•
•
•
•
Old source – Remove write access to ensure clients are unable to write any new data
Execute a final incremental copy to migrate any remaining data from the old system to the new system
Test new target data and connectivity; selective UAT
GO or NO GO –Decision to move forward with cutover event
Update the client-to-storage connection mechanisms; DNS, DFS, VIPs, etc.
Monitor the new storage system – monitor load and connections as the clients cutover
Validate clients – review and validate that clients can successfully connect and operate
Validate workflow – verify business operations work as expected
Cutover complete
Acceptance
Having migrated the data and client connections over to the new storage solution, the storage
availability and workflow acceptance of the new data and storage solution must be validated.
Once you begin writing new data to the new storage system, the ease with which you can roll back to
the old storage system diminishes. If you were to roll back to the old system, any changed data would
need to be copied back to the old environment. Unless this newly written data can be discarded, rewritten or manually reconciled, it is highly suggested that any rollback be executed before any
significant changes to data have occurred to the new storage.
Rollback
Make sure you have a fully tested rollback plan in place. A rollback may be needed for a variety of
reasons:
•
•
•
•
Client connectivity or storage name resolutions issues develop following cutover
Final incremental not completed in outage window, so not all data is migrated
Unplanned IT outage or issue occurring at the same time
Data access on the new storage is invalid and workflows are impaired
The goal of a rollback plan is to quickly restore the access to the old data storage solution. Assuming
the cutover was executed correctly; restoring the prior access should be straightforward and should be
able to be implemented with minimal additional disruption. The primary goal would be to restore
access within the cutover window so no additional downtime and interruption to data is introduced. It
is critical to have a tested rollback plan that can be used if an issue with the cutover occurs.
EMC Isilon Data Migration for NFS
11
Repetition
After you validate the data transfer through cutover and client acceptance, most migration projects
consist of multiple migration cycles. The methodology can be executed again on different sets of data
in migration waves that encompass the entire project.
Post Migration
Following the migration cutover, it is important to monitor both the new storage system and the old
storage system. You should observe that client connections are moving to the new system and that
active data connections are no longer getting initiated on the old storage. Can clients connect and
work with the new storage systems without issue? As connection counts increase on the new storage
system, you should monitor the load and performance. Make performance adjustments as needed.
You should be monitoring the following items during and after the cutover:
•
•
New system: system load and performance, number of connections, movement of users, security, and
performance
Old system: are users still connecting to it? Are there legacy connections to it from old applications?
If quotas were utilized on the source volume, after the migration is complete is the time to implement
them.
The Isilon OneFS feature SmartLock should likewise be implemented post migration. Committing files
should only be done once data has been completely migrated.
You should have a transition plan of what to do with the old storage. Some common approaches are
as follows:
•
•
•
•
•
Keep it around for a while but with administrator access only
Provide read-only access for users
Mothball the system while the new systems transitions
Decommission it
Purge the data after a defined retention period has been reached
Single Protocol NFS Data Migration
Although this paper addresses a single NFS protocol data migration, the approach and many of the
best practices can be used as a foundation for other types of data migrations.
Challenges of Single Protocol NFS Data Migration
Moving large amounts of data presents a number of challenges:
•
•
•
•
•
•
•
Difficult to perform without downtime; most source clusters are overloaded requiring that the data be
available at all times and operate at near capacity. Hence the need for migration. However, the migration
itself can present a significant additional load to the source cluster.
A large number of exports may need to be migrated. You must move not only the data but also the exports
and export permissions. This introduces a second type of migration (configuration) that must be undertaken
during the project.
Consolidation of multiple source filers into a single unified namespace and directory structure
May have a large number of differently connected clients that require separate cutover and validation events
NFS exports may be mounted deeper in the exported tree
Restricted exports to specific hosts and unique permissions; verify export options
High rate of change; often large environments contain a large number of concurrently connected clients. You
must account for the rapid rate of change of data during and after cutover.
EMC Isilon Data Migration for NFS
12
Data-Specific Considerations
When you design a migration strategy, you should plan how your data will appear after it has been
migrated.
Consider the scenario of multiple smaller filers consolidating into a single Isilon cluster. There may be
duplicate UID/GID collisions if the source filers are not using a consistent source of authentication. In
that case, manual remediation may be necessary to combine and fix user/group accounts that have
duplicate ID’s. For example:
filer1: has users user1 (UID:305), user2 (UID:423), and user3 (UID:424).
filer2: has users user1 (UID:305), user4 (UID:423), and user5 (UID:424).
The UID’s for user2 (UID:423) and user4 (UID:423) are the same so one user would have to
change to a different UID and the file ownership corrected during the migration. Same issue with
user3 (UID:424) and user5 (UID:424); one must change and file ownership be corrected
before the users can both use the new cluster. One possible resolution is to make user4 UID:1423
and change ownership of all their files prior to cutover. Likewise, make user5 UID:1424 and
modify ownership on all of their files.
Keep in mind that OneFS will store all UID/GID information regardless of source. OneFS does not
require NFS authentication.
Duplicate export paths are a similar issue:
filer1: has exports /vol/share/acct, /vol/share/work, and /vol/share/eng
filer2: has exports /vol/share/acct2, /vol/share/work and
/vol/share/engineering
/vol/share/work is the same from both source filers! This is a discussion that must be conducted
with the customer prior to migration. A plan for directory consolidation must be developed to deal with
export path collisions. One typical solution is an additional directory layer that identifies the original
source filer:
isilon: would have exports: /ifs/data/filer1/acct, /ifs/data/filer2/acct2,
/ifs/data/filer1/work, /ifs/data/filer2/work, /ifs/data/eng, and
/ifs/data/engineering
With this methodology, duplicate export paths can safely be consolidated from multiple source filers
into a single cohesive namespace. Clients however must be updated to reflect the updated export
paths.
The metadata of files, in particular, can add complexity to a migration. Identify the metadata that you
want to migrate with the data. The following metadata can affect your migration strategy:
•
•
•
•
•
File properties such as access time, created time, modified times, owners
File attributes such as read only or archive (an Isilon cluster does not support compressed and encrypted)
Extended proprietary file attributes that are in use; not supported on Isilon
Local users and groups; are these defined on the files?
Deduplication in use, or Archive stub files, Mac OS X resource forks present
Other data specific considerations:
•
Date/Atime/Creation time retention requirements – These may not be preserved across migrations depending
on which tool is used, e.g. creation time is not preserved with isi_vol_copy.
•
Symlinks will break – Depending on where the symlinks connect to, the underlying paths will probably
change after a migration and require rebuilding
•
Automount maps will need to be repointed (NIS,NIS+) – Similar to symlinks, the export paths may change and
hostname may change as well
•
Character encoding – Verify that it is the same on source and target; international characters in filenames
may be problematic
EMC Isilon Data Migration for NFS
13
How does the data need to appear post-migration:
•
Direct replication of all data and attributes
•
Move the data, then make updates, fix problems, change the security, etc. Migrate just the data and
implement an entirely new security model
Migration Requirements and Customer Data Collection
Before you can plan your migration, you must collect requirements.
Requirements Gathering
The data migration planning begins with identifying the data that you want to move from the old
storage system to the new storage system. Here is what you need to document:
•
o
o
o
o
•
o
o
o
o
Current State – What is the current state of:
Source Infrastructure

Existing storage platforms

Network design and implementation

Name resolution infrastructure: DNS, DFS, global namespace

Servers/clients/OS/applications
Source Infrastructure Configurations

Volumes

Shares/Exports

Access

Authentication
Source Data

Logical Structure – data layout, directory depth

Is the structure Wide or Deep?

Physical Structure – total size, min/max/average file size

Number of files
Source Data Security

Current security model and how file access is enforced

Local Users and groups

POSIX permissions

LDAP users and groups
Target State – What will be the target state:
Target Infrastructure: Isilon Cluster

Network configuration
Target Configurations:

Directory layout and structure

Shares/Exports

Access and Authentication model
Target Data

Logical Structure – same as source or new

Physical Structure – same as source or new
Target Data Security

Same as current security model

Migrate and change the security model

Move the data and implement a new security model
How to gather the data
•
•
•
•
•
•
Interview stakeholders
Gather documents: network diagrams, run books, infrastructure and application details
List of exports
Storage reports, etc.
Review share permissions
Examine directory structure (shallow vs. deep), file composition (small vs. large), and numbers of files
EMC Isilon Data Migration for NFS
14
Current Infrastructure and Data Analysis
Start the migration design phase by collecting the data needed to develop the migration requirements.
Best practice
Create and utilize a standardized data collection and migration-planning document along with a
standard target configuration guide. By using a structured document to gather and collect all your
source data and information, you can identify your migration requirements, which will lead to clear
migration design decisions.
Why: Will simplify and consolidate migration planning and implementation.
You need to collect the following information:
•
o
•
•
•
•
•
•
•
•
•
•
•
How much data; actual file data, not compressed or deduplicated
If there is deduplicated data, how much; this will have to be added to the total
How many directories and files; identify the directory trees and quantity
What is the directory structure: shallow and flat, wide and deep, or?
Directories with more than 10,000 files in them
How many exports; do you see share name collisions or reuse on multiple source hosts
What is the use of these exports; home directories, application or group use
How permissions are applied to source; individual or at the group level
How many source locations; single source system or multiple
How clients access data; protocols and how they resolve storage names
Rate of change of files; how often and where are files changing
Networking architecture; source systems and network between it and the Isilon cluster
Source system load; understand what load is the source storage under and how much additional overhead
from the migration would be tolerable
Determine Migration Methodology
After you collect information on the source system, the data, and the infrastructure, you are ready to
develop a migration methodology.
Logical Migration Design
By analyzing the structure and layout of the source data, you can make logical migration design
decisions—structuring the migration into distinct executable units. A goal of the migration
methodology is to identify logical boundaries to facilitate the cutover of your clients and workflows.
Some logical migration boundaries are as follows:
•
•
•
•
Hosts/Filers/Servers/Arrays
Volumes
Exports
Directories – users or groups
Best practice – Define migration boundaries
Identify logical Migration Boundaries: Identify clearly well-defined data structures to migrate and
cutover—for example, entire exports or directories. Be aware of the size of data inside a migration
boundary, as the size of the data affects the outage window required to complete a cutover.
EMC Isilon Data Migration for NFS
15
Why: This best practice organizes the migration into segments waves, making the migration easier
to manage.
After you segment the logical boundaries into distinct migration phases, you can address other
elements of your workflow, such as metadata, that you need to migrate.
File Attributes and Security
Most data migration also includes the migration of the file’s metadata: ownership, access times,
creation time and security descriptors. Before you can execute your migration, determine how you plan
on handling metadata and file security.
Common migration approaches:
•
•
•
•
Migrate data file as-is (no change to permissions/ownership)
Migrate data and permissions but also fix on destination (Recalibrate the permissions)
Migrate data-only. Create new permissions on the destination, or create a new security design.
Migrate away from an existing security model and implement a new model. It is recommended to use a
central authentication scheme on Isilon. For example, if the NAS system that you are migrating from uses
several directory services, you should consider consolidating the directory services into a single directory
service for the new NAS system.
Best practice – Understand the attributes of the source data before the migration
Be data aware: Identify any DOS attributes, non-standard extended file attributes, and nonstandard permissions that are not supported by an Isilon cluster. Also, identify your local users or
groups and have a plan to deal with them.
Why: Before you execute the migration, you may need to take additional steps to prepare the data
for migration so it will be available on the new storage system.
Migration Sequencing
The execution of a migration will likely require multiple iterations of the data transfer. If the source data
is constantly changing, try to find a window when the source data can be locked in a read-only state or
deny access to clients. Once access to the source data is removed, the final data transfer can take
place. Otherwise, differences between the data on the source and the target might result.
The recommended approach for a data migration is to use a multi-step migration. A multi-step
migration consists of an Initial “Full or Level 0” data copy. The Initial data copy is followed by a series
of “Incrementals” that update only new or changed data. The Initial data copy moves an entire copy of
the source data. It can often take a long time to execute because all the data must be assessed and
transferred over the network to the migration target.
After the initial copy completes, additional differential transfers copy only the data that has changed
since the initial Full was executed. Additionally, any data that is deleted on the source will also be
deleted on the target through the incremental process. The size of an incremental copy is affected by
the rate of change of the source data.
You should run multiple, over-the-top incremental copies to guarantee the integrity and consistency of
data that encounters issue during the initial Full copy. Incremental copies will also keep the two data
sources in sync with each other and require less catch-up on final copy.
A final incremental should always be executed as part of the migration cutover plan to ensure all the
latest data is on the new target storage.
EMC Isilon Data Migration for NFS
16
Best practice – Run initial Full copies followed by Incremental copies
Run Initial Full copies followed by multiple Incremental copies. Always execute a final Incremental
during cutover to ensure the latest data from the source is migrated.
Why: Executing multiple migration passes will ensure that all the data is transferred and that the
latest version of the files will be stored on the target storage system.
Type of Migration
You must determine how the migration will be executed. There are two possibilities: an indirect
execution from a host and a direct execution from an Isilon cluster. With host-based migration, an
intermediary host executes a copy process between the source system to the target system through the
host, as shown in the following figure:
Figure 2 - Host based migration
With a host based migration all data is transferred through an intermediary host on route to the Isilon
cluster.
If the source system is supported, the Isilon cluster can execute a direct source to Isilon data copy by
using the Isilon OneFS isi_vol_copy command which will copy data using NDMP protocol:
EMC Isilon Data Migration for NFS
17
Figure 3 - Isilon based migration
With an Isilon based migration, data is pulled directly the source system to the Isilon cluster utilizing
isi_vol_copy.
Another method of migration can be achieved by directly running the Linux rsync replication utility on
the Isilon cluster itself:
Figure 4 - Isilon based migration with rsync on individual nodes
Similar to the OneFS isi_vol_copy tool rsync can be run natively on the individual Isilon nodes against
locally mounted NFS source exports mounted directly on each Isilon node. Data is transferred directly
from the source cluster to the Isilon cluster reducing latency and network congestion eliminating
external host computers to move the data. The external NFS mounts on the Isilon nodes MUST BE
UNMOUNTED prior to normal cluster operation
Host-Based Migrations
A host-based approach might be selected for a number of reasons:
•
•
•
•
Source system does not support Isilon based migration – isi_vol_copy not supported
Connectivity is restricted – storage on different networks, a host may bridge the networks
Flexibility in execution – separate the execution from administration of the storage systems
Security restrictions – can be used to limit access to systems
EMC Isilon Data Migration for NFS
18
In a host-based migration, the toolset executing the migration makes a connection to the source and to
the target system and then copies the data through the host. For the purpose of this paper, the primary
host-based tool is rsync.
Best practice – Select a suitable host
Select a suitable host to run the migration with adequate network bandwidth and processing
power.
Why: Because all the data will move through the host, incorrect sizing may lead to a bottleneck or
an interruption in the migration. Using multiple hosts may facilitate multi-streamed migrations in
which you can maximize network usage and the Isilon nodes by executing multiple migrations
concurrently. A host with 10Gb network connectivity is highly recommended.
Some common considerations are as follows:
•
•
•
•
•
Adequate resources to execute the migrations; CPU, RAM, Network
10 gigabit network infrastructure where possible
Connectivity between host and the source and target storage systems
Availability; the host is stable and reliable, no reboots or downtime
Dedicated host; not running a lot of other parallel workloads and restricted user access
The migration host needs to be as optimized for the migration workload and high network throughput
as much as possible, as it will send and receive all the data to be migrated.
Isilon Based Migrations
If the source system is capable of supporting an Isilon based migration by use of isi_vol_copy or by
direct access with rsync, the connectivity exists, and the migration methodology supports utilizing this
approach, a direct migration may be a more applicable technique. The main advantage of the direct
approach is there is no need for an intermediary host to execute the process or for the data to traverse
the external host.
Type of Isilon Based Migration
Two primary types of Isilon based migration exist.
•
•
NDMP based with isi_vol_copy
rsync based – use the unix rsync tool to connect and either push or pull data directly to the Isilon target,
running natively on the Isilon cluster
Known source systems that support Isilon based NDMP migrations:
NetApp®
Isilon Requirements: Isilon OneFS 6.5.5.6 or later newer
NetApp Requirements: Data ONTAP 7.x or Data ONTAP 8.x operating in 7-mode
It is anticipated that additional source systems will be supported in future releases of Isilon OneFS.
As in all migration strategies, it is critical to evaluate the migration methodology against the selected
approach to determine if the method selected will facilitate the migration goals.
Best practice – Evaluate migration approach
Evaluate and select the most appropriate migration approach; select the method that meets your
specific migration requirements, provides cutover flexibility, and optimizes data throughput.
EMC Isilon Data Migration for NFS
19
Why: The selected approach will impact the migration schedule and planning.
Having identified the migration approach, the selection of the appropriate migration tool can occur.
Migration Tool Selection and Use
The data migration requirements will help define the tool selected to execute the data migration.
Tool Selection
A number of tools are available and will work; any file copy method that can connect over NFS to the
source and target storage can be used to move data between the systems. It is recommended to use a
tool that can be automated and provides robust functionality—a tool that can copy attributes, security,
logging, etc.
The common NFS data copy tools are as follows:
Tool
Advantage
• Included with Isilon
OneFS
• Pulls across all users and
group permissions
• Supports both SMB and
NFS protocols.
• Utilizes NDMP
Direct migration sourcetarget
isi_vol_copy_vnx
• Included with Isilon
OneFS
• Pulls across all user and
group permissions
• Supports both SMB and
NFS protocols.
• Utilizes NDMP
• Direct migration sourcetarget
rsync
• EMC preferred tool
• Designed for
synchronizing directories
• Only sends differences
when files change
• Lots of available
switches
• Can be scripted
Open source-widely
available
tar, cpio
• Can be scripted
• Open source-widely
available
• Good for one full push of
data
Table 1: Summary of NFS copy tools
isi_vol_copy
Disadvantage
• Only supported against
specific source storage
systems - NetApp systems
running OnTap 6.5 and above
with NDMP v4
• Limited error reporting
• Only supported against
specific source storage
systems VNX 7.x OE and
Celerra DART 5.6.x or higher
• Limited error reporting
• Limited error reporting
• Limited error reporting
• Designed for backup and restore,
not active copying
• Does not do incrementals
EMC Isilon Data Migration for NFS
20
Tool Versions
It is important to understand that different tools may behave differently on different hosts; it is strongly
suggested to test tool versions and observe their behavior.
Best practice – Use the correct tool for the job
For NetApp(TM) Filer: use isi_vol_copy.
For VNX: use isi_vol_copy_vnx.
For general NFS filers: use rsync.
Why: Using the correct tool for the job will give you the best chance for a successful migration.
Best practice – Use the correct version of the tool
Rsync is available on nearly every UNIX and Linux distribution as well as natively on the Isilon
cluster. You must use the correct version for the OS for the host that is running the tool.
Why: Using the correct version of the tool will optimize throughput and performance, i.e. use a 64bit version if your host OS is 64-bit.
Best practice – Use the latest version of migration tools
It is recommend to always utilize the latest versions of the chosen file copy tool.
Why: Performance is optimized and they often have newer features and bug fixes.
Migration Tools
The following is an overview of the use of the primary NFS migration tools that can be used in Isilon
data migrations:
rsync
Overview:
The rsync tool provides a method to copy files, directories and sub-directories from NFS exports to
other NFS exports with the ownership and attributes intact. It was designed to efficiently synchronize
files and directories from one location to another minimizing data transfer while using delta encoding
where appropriate. If the source and destination have many files (and parts of files) in common, the
utility need only transfer the differences. Incremental change copies are thus extremely efficient.
Rsync can operate in both a local and remote mode (as a service) and behaves similarly to rcp. It can
“pull” or “push” files from filers.
Rsync should be run as root to preserve file permissions and ownership.
It can also use ssh if necessary for secure environments.
Source code is available and rsync is implemented on nearly every modern operating system.
Usage:
Rsync [options] <source> <destination>
EMC Isilon Data Migration for NFS
21
Features:
•
•
Ability to copy file data, ownership, time stamp information
Is extremely efficient for incrementals
For a full list of rsync features and switches, run the following from a command shell:
man rsync
Sample rsync command:
rsync $OPTS [$SOURCE:]$SOURCEDIR [$TARGET:]$TARGETDIR
where variables are typically defined:
$OPTS=--force –ignore-errors –delete-excluded –exclude –from=$EXCLUDES –
delete –backup –a”
$EXCLUDES=/path/filestoexclude
$SOURCE=name of source filer
$SOURCEDIR=/path/sourcefiles
$TARGET=name of isilon node destination system
$TARGETDIR=/path/targetdir
Rsync by default runs in a local mode, but with the addition of [$HOST:] in front of either $SOURCEDIR or
$TARGETDIR it can transfer files remotely between systems.
For example:
rsync –avh /tmp/foo root@host2:/tmp/bar
If run from the source system would transfer the local directory, /tmp/foo, to the remote host, host2
and place them in the target directory, /tmp/bar.
Note that a shell script is usually created to automate and distribute the rsync jobs. Entire migrations can
be automated and incremental passes scripted to run automatically. Review the scripts with the customer
to verify that the sequence of commands matches the expected migration plan.
Best practice – rsync and compression
If the data is mostly binaries or large uncompressible files it is not recommended to use
compression as this will slow the migration considerably. Text files however, will readily compress
and if the source data is such, the use of this option will greatly speed the migration. Know the
source file composition.
Why: Trying to compress non-compressible data will greatly slow the migration.
Best practice – exclude snapshots from replication
It generally does not make sense to migrate snapshots as they will not automatically work on the
target system as intended. Therefore, exclude them from the migration to speed the process.
Why: Snapshots will not migrate.
For example:
EMC Isilon Data Migration for NFS
22
rsync –avh –exclude=’.snapshot*’ /tmp/foo root@host2:/tmp/bar
if run from the source system would transfer the local directory, /tmp/foo, to the remote host, host2
and place data in the target directory, /tmp/bar, while excluding snapshots.
Best practice – spaces in name
Be aware that spaces in file and directory names can cause problems.
For example, a directory named, “/spaces in my name” and a file named, “some file.avi”
would require special handling on both a command line and in a script:
rsync –av foo@foomachine:’/spaces\ in\ my\ name/some\ file.avi’
/local_directory/
The “\” character is used before spaces to prevent the shell from parsing the next word as a
separate argument.
Why: Spaces in filenames and directories can cause scripts to fail. Be alert for them.
Best practice – starting rsync switches
Suggested initial rsync switches:
--a=archive mode; equals –rlptgoD (no –H,-A,-X)
--delete=delete extraneous files on the target (useful on incremental copies if source data has been
deleted)
--force=force deletion of directories even if not empty (during incremental passes if directories are
deleted on the source)
--compression=compress the data being transferred (if the data is compressible)
Why: It is suggested to start with a baseline of switches and test the copy, validate the results and
behavior of the copy, and make the appropriate adjustments to the rsync switches. No single
default set of switches will work for all migrations. Remember, rsync can be run multiple times
incrementally and different directories/exports may require different options.
You should become familiar with many of the rsync switches and their use. The following highlights a
few possible options that you should be familiar with. It is important to recognize that each migration
will require different switches because of the unique requirements of each dataset.
A few useful switches to be aware of include the following:
-r,--recursive
Recurse into directories
-l,--links
Copy symlinks as symlinks-p,--perms
Preserve permissions
-h
Output numbers in a human readable format
EMC Isilon Data Migration for NFS
23
--progress
Show progress during transfer
-z,--compress
Compress the file data during the transfer
-g,--group
Preserve group
-o,--owner
Preserve owner
-D
Preserve special files and device files
--protect-args
If you need to transfer files that contain whitespace, either specify --protect-args or escape the
whitespace with a “\”
-stats
Provides detailed list of the total number of files, files transferred, benchmarks, and an average
transfer speed
-t,--times
Preserve modification times
-v,--verbose
Increase verbosity
-n,--dry-run
Perform a trial run with no changes being made
--exclude=PATTERN
Exclude files matching PATTERN
--exclude-from=FILE
Read exclude patterns from FILE
Symbolic Links and Hard Links
Be mindful of symbolic links within the source filesystem; they may not point to the same target after
migration if paths change.
Rsync has multiple methods to deal with symbolic links. Choose the most appropriate after consulting
with the customer. They can also be dealt with in a separate migration pass.
EMC Isilon Data Migration for NFS
24
By default, links are not transferred at all; a message such as, “skipping non-regular file” is generated
for any symlinks that rsync encounters. Switches to deal with links include:
--links
Symlinks are recreated with the same target on the destination. Not that --archive implies --links.
-L,--copy-links
Symlinks are “collapsed” by copying their referent, rather than the symlink.
-H,--hard-links
Preserve hard links
--safe-links
Ignore symlinks that point outside the tree being replicated. Useful to prevent sensitive system files
such as /etc/passwd from being inadvertently copied.
Best practice – Know the rsync switches
Understand all the rsync switches and when how to use them
Why: Different migrations will require the use of different switch to meet the requirement of the
data copy and the final state of the migrated data. Discuss with the customer before the migration
begins to determine the optimal selection of switches.
Best practice – Parallelizing the RSYNC Processes
Examine the source directory structure and look for obvious means to divide the source directory
tree into smaller, more manageable chunks.
For example, if you have a filesystem of 4,000,000 files it might take 6 hours to complete
(completely fictional made up number). Consider if the filesystem tree was divided into something
like:
drwxr-xr-x
2 root
root
179 Jul 19 15:00 directory_a
drwxr-xr-x
2 root
root
179 May 1 00:00 directory_b
It would cut the migration time in half if you could run two simultaneous rsync jobs at the same
time (assuming the contents of the directories were balanced nearly equal):
rsync –av –include=”/directory_a*” –exclude=”/*” –progress remote::/
/localdir/
rsync –av –include=”/directory_b*” –exclude=”/*” –progress remote::/
/localdir/
Best performance would come from spreading requests across multiple Isilon nodes and multiple
source network interfaces. Multiple rsync jobs can be run on individual nodes as well, but these
processes tend to be network limited. You want to spread the load across as much of the Isilon
cluster as possible, maximizing the available bandwidth to each node. Be mindful if the NFS
option “map root to nobody” is implemented as this may affect access to files.
Why: You will see increased performance, but you may be limited by network bandwidth and
source cluster throughput.
EMC Isilon Data Migration for NFS
25
Isilon-based migrations – isi_vol_copy
Overview:
Isi_vol_copy is a native Isilon OneFS tool that supports data migration through the use of the NDMP
protocol. The tool allows the cluster to mimic the behavior of a backup target and allows the data to be
copied directly from the source system to the Isilon preserving permissions and ownership.
Usage:
isi_vol_copy <src_filer>:<src_dir> [options] <dest_dir> [ -full | -incr
]
[-sa user: | user:password]
[-sport ndmp_src_port]
[-dhost dest_ip_addr]
[-maxino maxino]
[-h]
Features:
•
•
•
•
•
•
•
•
Utilize native NDMP functionality and connectivity
Supports Full and Incremental backup levels
Migrates data and all security, attribute information
Will restore the set of permissions/ACLs that existed on the source data
Will migrate NFS and SMB source data
Does not impact or interact with client data access
Dedicated data transfer pipe between source and cluster
Starting with OneFS 7.0.2, it supports the Backup Restartable Extension, so full backups can be interrupted
and restarted from a checkpoint
Limitations:
•
•
Source filers have a limits on the number of NDMP threads and simultaneous backup jobs; avoid overrunning
the source filer
Can be limited by source filer network bandwidth
Sample isi_vol_copy command
isi_vol_copy <source_filer_IP>:/<source> -sa<ndmpuser>:<ndmppassword>
/ifs/data/<source_filer> -full
Best practice – isi_vol_copy target data use
Do not touch the data on the target Isilon until after the isi_vol_copy has completed.
Why: This will create problems and you may have to re-do a full copy.
Best practice – simultaneous isi_vol_copy use
Do not execute multiple isi_vol_copy going to the same target. i.e. don’t have all your isi_vol_copy
migrations going to the same target directory. For example:
filer1:/vol/sourcedir -> isilon:/ifs/data
filer2:/vol/sourcedir2 -> isilon:/ifs/data
EMC Isilon Data Migration for NFS
26
Why: Creates problems for the copy process and may require remediation after migration.
Instead: Use an additional directory level:
filer1:/vol/sourcedir -> isilon:/ifs/data/filer1/sourcedir
filer2:/vol/sourcedir2-> isilon:/ifs/data/filer2/sourcedir2
If consolidation is required, this can happen after the data is migrated and any potential merging of
identically named subdirectories can be addressed.
Best practice – isi_vol_copy use
isi_vol_copy is optimized to streams as much data as possible across a network, always monitor
load on the source and target systems for potential impact.
Why: Since isi_vol_copy is optimized to streams as much data possible don’t overwhelm older
source systems and create potential link saturation or disk problems especially if there are users
connected and attempting to access files.
Best practice – isi_vol_copy limits
Recommend less than 40 million files per volume transfer when using isi_vol_copy.
Why: All programs have limits and this is the recommended maximum when using isi_vol_copy for
each individual transfer. Larger source volumes should be broken up into smaller chunks (i.e. use
a separate isi_vol_copy stream for multiple subdirectories instead of one large transfer of an entire
volume).
Once the initial copy is complete then incrementals can be run:
isi_vol_copy filer1:/vol/sourcedir –sa root:<password>
/ifs/data/filer1/source –incr
Important: Do not start an incremental copy job until after a full copy is complete. A successful FULL
copy must finish before an incremental copy is started. Unlike rsync which automatically does
incrementals, isi_vol_copy must be explicitly called with –incr to perform an incremental copy.
Subsets of source directories/volumes can be migrated
An entire volume does not need to be migrated; subdirectories can be migrated individually as well.
For example:
A volume, /export/vol1, is exported containing subdirectories /work, /scratch, /tmp, and
/home. One could migrate the entire vol1 or any/all of the individual subdirectories under vol1, e.g.
/export/vol1/work and /export/vol1/home might be the only necessary directories to move.
Isilon-based migrations from VNX – isi_vol_copy_vnx
Overview:
Similarly, isi_vol_copy_vnx is a native Isilon OneFS tool that supports data migration through the use
of the NDMP protocol for VNX. The tool allows the cluster to mimic the behavior of a backup target and
EMC Isilon Data Migration for NFS
27
allows the data to be copied directly from the source system to the Isilon preserving permissions and
ownership.
Usage:
isi_vol_copy_vnx <src_filer>:<src_dir> [options] <dest_dir> [ -full | incr ]
[-sa user: | user:password]
[-sport ndmp_src_port]
[-dport ndmp_data_port]
[-dhost dest_ip_addr]
[-h]
Features:
•
•
•
•
•
•
•
Utilize native NDMP functionality and connectivity
Supports Full and Incremental backup levels
Migrates data and all security, attribute information
Will restore the set of permissions/ACLs that existed on the source data
Will migrate NFS and SMB source data
Does not impact or interact with client data access
Dedicated data transfer pipe between source and cluster
Limitations:
•
•
Source filers have a limits on the number of NDMP threads and simultaneous backup jobs; avoid overrunning
the source filer
Can be limited by source filer network bandwidth
Check with EMC support for the latest compatibility with tools, DART codes, and OneFS.
Migration Preparation
After you finish planning the migration and selecting the tools, you can prepare the source and target
systems for the migration.
Infrastructure and Environment Setup
Network connectivity
Since all the data in the migration will traverse the network, you should optimize the network
infrastructure and connectivity between the source system(s) and the target Isilon cluster.
Common recommendations include the following:
•
•
•
•
Maximize network bandwidth; 10Gbps preferred to 1Gbps, optimized end to end, MTU of 9000
Limit hops and latency between source and target
Isolate migration traffic so as not to compete with client access
Limit potential network bottlenecks; routers, firewalls , IDS and shared network infrastructure
Best practice – Optimize the network for the migration traffic
Optimize the migration network path; look to limit other production traffic from this network, limit
network devices the traffic traverse (firewalls, IDS etc.). Ideally look to create a dedicated private
migration network that can be optimized for only the migration traffic.
EMC Isilon Data Migration for NFS
28
Why: Separating the migration traffic from other network traffic will allow for maximum throughput
and reduce potential impact to existing production traffic by limiting network saturation.
Migration Account
In order for the migration data to be copied from source to target the tool accessing the data must be
able to access all of the source and target data.
Commonly used migration accounts:
•
•
root
Users accounts created explicitly for the execution of isi_vol_copy, e.g. ndmp
The account used to connect to the source and target storage systems depends on the security model
implemented in the environment.
Best practice – Use a specific migration account to execute migration tasks
Use a specific migration account or account with group membership that has the required access
to all source and target data, i.e. root.
Why: Using a dedicated account will allow for oversight and management of the migration data
access. It will also allow for the separation of migration tasks and users from other production
accounts.
Source Host Preparation
The source data storage system should be prepared and optimized for the migration.
Best practice
Access can be restricted to the source cluster exports, preventing users from changing data on the
source cluster instead of the migration target clusters. Change exports to read-only once migration
and incremental copies are complete to prevent clients from writing to them.
Why: Allows you to separate migration events from normal production access. It can be used during
post cutover to deny read-writes by a normal user to the source cluster. This prevents updates to
the data during data cutover and post-migration while continuing to allow administrative access.
Migration Host Preparation - Source and Target Access
The migration host should be prepared and optimized for running the migration copies.
•
•
•
•
Limit workload and access to optimize throughput
Restrict access and reduce service issue with the host
Prepare all migration jobs as scripts
Test and validate network throughput
Best practice – watch out for root_squash
On the source cluster exports sometimes restrict access by using root_squash to prevent root users
connecting remotely and having root privileges. But this is something we need for migrating data.
Use the option, “no_root_squash” to turn off root squashing.
EMC Isilon Data Migration for NFS
29
Why: Must have root access (or equivalent) to migrate all files and directories.
We can also set root squash on the Isilon NFS exports:
Here we see under “User/Group Mappings: Use default: Map root users to nobody” has been set for
the export /ifs/data/work. This should be disabled to allow root full access to the filesystem. Note
that this can be set on an export by export basis. In addition we have restricted root access to a
specific client, 192.168.43.200. Typically this would be the host doing the migration.
Isilon Cluster Preparation Configuration
All primary setup and configuration of the Isilon cluster should be completed before you begin to
migrate data. The configuration includes but is not limited to the following:
•
•
•
•
•
•
•
•
•
•
Authentication provider integration – all authentication providers are online and fully operational
If local users, all UID’s and GID’s created and tested
Access Zone and Role Based Access Control (RBAC) setup – complete any zone and RBAC setup
Exports for clients created and tested
Networking design and setup – complete the setup and implementation of the network configuration
SmartPools – complete the implementation of any SmartPools policies to limit post-migration work
SyncIQ – prepare any existing SyncIQ policies to operate alongside any data migration events
SnapShotIQ – prepare any Snapshot policies to operate alongside any data migration events
SmartLock – execute all preliminary SmartLock work prior to migration
SmartQuotas disabled until the migration is complete
EMC Isilon Data Migration for NFS
30
It is suggested to use a dedicated Isilon migration directory export to execute all migrations against.
Using a dedicated administrative migration export with the appropriate access and configuration can
facilitate the migration without impacting workflows or data permissions. Normal user clients will not
mount this export, only migration hosts. Data can then be moved into place after the migration is
complete with minimal disruption.
Best practice – Implement a logical NFS export path structure methodology
Be aware of export rules and how they interact with each other:
1. Path – should be unique though nesting is possible with caution
2. Permission options – restriction by User ID mapping and IP addresses
3. Security – netgroups, authsys, Kerberos
The order of evaluation is: path, client ACLs, then security types (unix,krb5).
For example, if you have exports:
/ifs/data
--client=8.8.8.0/24
/ifs/data/something
--client=10.10.10.0/24
and your client IP is 10.10.10.10 you would not have access to /ifs/data/something because the
export, /ifs/data, has a different IP restriction. The client must be able to traverse the path if the
exports are nested. Access for 10.10.10.0/24 would need to be added to the /ifs/data export:
/ifs/data
--client=8.8.8.0/24 --client=10.10.10.0/24
/ifs/data/something
--client=10.10.10.0/24
Why: Complex and restrictive rules may prevent clients from connecting to exports that are nested.
Clients may encounter problems mounting a directory that is nested from a different export. If the
parent directory has more restrictive permissions, a client may not be able to mount a child export
of that directory.
Best practice – Create the NFS exports
Create the new Isilon NFS exports prior to data migration.
Why: This will allow the creation and set up of the exports and export permissions prior to data
migration and cutover for initial testing and access validation/UAT.
Best practice – Do not use default /ifs export
Allows mounting of subdirectories and open access to the whole file system.
Why: This is intended for easy setup use only and can be a potential security issue.
Best practice – Create the correct NFS export permissions
Setup the correct export permissions on the newly created user exports.
Why: Setting the correct export permissions will allow you to test and validate workflows when test
migrations are undertaken and maintain security.
EMC Isilon Data Migration for NFS
31
Note: The migration methodology may include adding an explicit “Deny” permission on users or
directories so they cannot write data to these exports until the cutover has been executed as well
as specific IP addresses to prevent clients from accessing the exports.
Additional Isilon Cluster Considerations
The following are some additional Isilon cluster considerations that may need to be addressed prior to
and during a data migration.
NFS 16 group limitation
The NFS standard by default does not support membership in more than 16 groups per individual user.
This limitation can be addressed on an EMC Isilon cluster by enabling “Map Lookup UID” under NFS
Settings ->NFS Export Settings ->Export Behavior Settings -> Map Lookup UID:
Additional information can be found in the EMC document:
“NFS supplemental groups limited to 15 in OneFS 6.5.4 and earlier”
Article Number:000089550
EMC Isilon Data Migration for NFS
32
Production or Pre-Production Cluster
An important consideration around planning and execution of a data migration is the current status of
the Isilon cluster. Is the cluster in production or will the migration mark the initial cutover to active
production traffic? Our primary goal is to lessen any impact on a production cluster during migration
activities, so suitable steps should be taken to address these concerns.
Common factors to be aware of while migrating to a cluster:
•
•
•
Administratively destructive actions
Saturation of network links
Cluster load and ingest, impact on production workflows
Access Zones and Role Based Access Control
If the cluster uses an Isilon access zone or RBAC, the migration methodology may need to be adjusted
to accommodate this configuration. Currently, OneFS 7.0 only allows for NFS exports in the default
System Zone and no other zones.
NFS RPC threads
By default the number of NFS RPC thread is set to 16 per node. This can be increased for specific
workflows. Consult support for assistance.
Isilon OneFS SmartConnect or Direct Node Connections
The current status of the cluster may dictate that you look to optimize and segregate migration traffic
within the clusters network configuration.
•
•
Use SmartConnect to auto balance traffic
Separate migration traffic from existing production traffic, use a direct node or separate SmartConnect zone
connection for migration traffic
If you use SmartConnect, you should validate and optimize the configuration before transfer data
across the network.
Isilon OneFS SyncIQ Considerations
If the data to be migrated will be replicated to a secondary cluster through a SyncIQ policy additional,
planning should be undertaken to address the impact of the data migration and its interaction with
active SyncIQ policies.
•
•
•
Pause active SyncIQ policies if they include migration paths
Schedule SyncIQ jobs to run outside of data copy windows
Utilize SmartConnect zones for copying and SyncIQ replication
Isilon OneFS SmartPools
Any SmartPools data policies should be in place prior to data migration, or additional cluster overhead
maybe required to move data within the cluster post-migration.
Isilon OneFS SnapshotIQ
Any active SnapShotIQ policies should be analyzed for impact during the data migration.
Antivirus Integration
Review and disable any active antivirus scanning policies that may be running against the target data.
EMC Isilon Data Migration for NFS
33
Best practice – Disable Antivirus Scanning
Disable active antivirus scanning on migrated data during initial full and incremental copies.
Why: The large influx of data associated with the migration can place an excessive load on the
antivirus scanning architecture and create a slowdown and potential bottleneck for the inbound
data.
Isilon Guidelines for Large Workloads
Be mindful not to exceed maximum configuration values listed in the table below
Guideline
Tested
or
Default
Value
Theoretical or
Max Practical
Value
NFS max
read size
NFS max
write size
NFS3
connections
(Per node)
128 KB
1 MB
512 KB
1 MB
1000
n/a
NFS3 exports
(Per
cluster)
nfsd threads
(Per node)
750
2000
16
16
Filename
length
255 B
255 B
Path length
1024 B
1024 B
Directory
depth
8470
Unlimited*
Comments
This applies to both NFS3 and NFS4.
Prior to OneFS 7.0, the maximum read size was 128 KB.
This applies to both NFS3 and NFS4.
Prior to OneFS 7.0, the maximum write size was 512 KB
The number of TCP sockets available on the node is typically what
limits NFS connections. Unlike Isilon’s SMB server, nfsd uses file
handles instead of file descriptors to represent files, and internally
handles work items much differently.
1000 connections is a very conservative, tested limit, and
represents 1000 mounts over 20 exports. NFS connection testing is
an ongoing test effort.
A maximum connection limits for NFSv3 has not been established at
this time.
Beyond 2000, manageability becomes a problem. Cluster size does
not matter.
This is a kernel limit, exposed via sysctl.
The value should not be changed without consulting Isilon Support.
This limit represents the maximum number of simultaneous work
items the server can service. Simultaneous work items above this
number are queued and serviced when resources become available.
Note: Most Unicode character encodings (like UTF-8, which is the
OneFS default) allow for multiple bytes per character (UTF-8 allows
for up to 4 B/character).
So, of the 255 B in the filename, that could represent 255
characters, 63 characters, or some number of characters in
between.
This is the maximum absolute path (e.g. /ifs/data/foo/bar/baz/)
length that can be passed into a syscall, not the maximum depth of
a directory in the file system (see Directory Depth).
* No specific hard limit is in place, but several other limits could
come into play here (inode limits, metadata storage limits, etc.). In
tests, command line (shell) utilities starting showing problems at a
depth of 8470 (EBADF from many commands). At a depth of about
30000, internal utilities (e.g. Job Engine TreeDelete) also started
failing.
For utilities that make calls with absolute paths (e.g. cd /1/2/3/…),
depth will be limited by “path length” as described above (this
applies to path-based OneFS commands like ‘isi snapshot’ and ‘isi
quota’). For utilities that access relative paths (e.g. cd 1, cd 2, cd
3…), these higher limits may apply, although the value of
extraordinarily deep directories is questionable.
EMC Isilon Data Migration for NFS
34
File size
4 TB
4 TB
Hardcoded OneFS limit. Note that Job Engine performance can be
impacted on files larger than a TB, due to inefficient per-file
threading.
Table 3: Isilon Guidelines for Large Workloads
Migration Approach - Testing and Proof of Concept
Having developed the migration approach, selected the toolset and prepared the infrastructure for the
data migration, you can proceed with your initial testing of the methodology. The goal of the testing
here is to validate the outcome—is data migrated, are the permissions moved, and are the timestamps
moved? The testing phase also allows us to tune and modify the migration approach to optimize all
parts of the migration.
Recommended testing approach:
•
•
•
•
•
Run Full copy – benchmark and monitor
Review and validate – potentially look at tuning or tweaking the methodology and re-run
Run Incremental copy – benchmark and monitor
Review and validate – potentially look at tuning or tweaking the methodology and re-run
Continue to run incrementals – and continue to monitor
EMC Isilon recommends that you test different copy methodologies to tune and optimize the
throughput while meeting your migration requirements.
Best practice – Execute multiple test migrations to validate the methodology
It is recommended to execute multiple migration tests on smaller subsets of different data.
Why: Since different data will tend to have different properties and access profiles, it is important
to test all data types and how the migration methodology may need to be modified for different
data sets.
Critical areas to evaluate and monitor during data migration testing are the following:
•
•
•
•
Network performance – throughput, saturation, impact
Time to execute a Full data copy – will allow for refinement of project plans
Time needed to execute an Incremental after X days of data change – will help define cutover Windows
Cluster load, Source load, host load – will help tune and refine the migration methodology
Best practice – Test all phases of the migration methodology
Execute all steps in the migration methodology to identify the time involved and to verify that the
proposed methodology fulfills all the migration requirements.
Why: To identify issues with the methodology before executing production migrations and
cutovers.
Data Validation
After you migrate the data, you must validate the data and the file attributes. Verify the following
aspects of the data:
•
•
•
File data copied correctly; data is intact and integrity is maintained
File security, ownership and attributes migrated correctly
File timestamps are correct
EMC Isilon Data Migration for NFS
35
Next, review the access control entries on a file by running the ls –led followed by a file name:
You should also validate the data. Common methods include the following:
•
•
•
•
File size compares
Checksum/File Hash Compares – MD5 checksums
Tools – md5, sum, cksum
Audit and review directory structures
Having reviewed the data attributes directly, it is critical to validate that the data works in client
workflows.
Performance
A migration often moves a large amount of data. You must ensure that migration methodology, toolset
and environment are optimized for performance and throughput to work within the migration timeline.
The common areas to focus on when evaluating performance are as follows:
•
Identify bottlenecks – attempt to identify the worst performing component
i.e. Disable antivirus scanning processes on target and/or source file systems during initial migration copies
to minimize CPU impact on client access and its potential impact on elongating copy times.
o WAN bandwidth physical (circuit limitations) and concurrency impact of other systems that are replicating
data (SAN, backup etc.) over a shared link. This could have an effect on replication performance for SyncIQ
jobs that need to be run to completion before certain cutovers can be conducted.
•
Timing of execution – time of day, day of week tests were executed versus performance
•
Collect metrics on the data copies, network throughput, source, host and target systems – evaluate the copy
as a whole
o
Best practice – Time the incremental copies
Benchmark the Incremental copies by timing how long they take to execute so you can plan and
orchestrate the cutover phases appropriately.
Why: Knowing how long an incremental copy will take will likely determine the length of time
required to execute a cutover and will help determine the data outage window.
User Acceptance Testing – Data and Workflow Testing
The final step of migration data testing is the User Acceptance Test (UAT) in which the data is tested for
integrity with existing workflows. It is suggested that test workflows are used as this data should only
be considered test data at this time and may be removed by later migration steps.
Best practice – Check workflows with test migrated data
Review all workflows on test migrated data.
Why: It is critical to validate that newly migrated data can be integrated into workflows (i.e. User
Home Director or Group share access, etc.) at cutover time without issues. By testing the
workflows, you ensure that cutovers occur without incident.
EMC Isilon Data Migration for NFS
36
Begin Migration Execution
After you complete all the testing and validation, you can begin to move into the production migration
phases. All the information obtained from testing and tuning should be used to modify and optimize
the overall methodology so that the production migrations are as clean and quick as possible.
The migration execution phases are as follows:
•
•
•
Execute the initial full copy
Execute Incremental copies to keep the new storage up-to-date and as close to production as possible
Based on performance, multiple migrations may be executed simultaneously if supported and if the network
infrastructure can support the additional load
Multiple hosts (if using a host based migration) can replicate data in parallel; be mindful of load
placed on the source cluster. Coordinate hosts so that they are not all writing to the same subdirectory
to reduce contention. Spread hosts across the Isilon cluster to maximize network bandwidth and
processing power.
Alternative migration methodology: Mount the source NFS exports directly on the individual Isilon
nodes and run the rsync commands directly on the cluster. The benefit here is if you have multiple
nodes, your transfer bandwidth scales and you remove the “middle man” from the process. Data
moves directly from the source filer to the Isilon cluster, without having to go to a host first then back
out to the Isilon cluster. Latency is greatly reduced and migration times will drop. External NFS exports
mounted on the Isilon cluster MUST BE UNMOUNTED prior to normal cluster operation.
Best practice – Continue to run Incremental copies
Continue to run incrementals even if the cutovers are not scheduled.
Why: This will keep the source and target data closer in sync and require less data transfer during
the final pre-cutover copy.
Depending on the size of the data migrations, the initial full copies may take a while to execute. During
this time you can prepare for the final cutover events.
Pre-Cutover Preparation
After you start to migrate data, you can begin to prepare your cutover events.
Best practice – Create a detailed migration plan
Create a detailed migration plan with all the specific steps and timing of the migration execution.
Why: This document will dictate the commands and work that are being executed. The plan
controls the entire migration from start to finish. All roles, tasks and responsibilities are defined.
The detailed migration plan dictates how the migration is executed.
Best practice – Create a cutover document
Create a cutover document that defines the high level cutover tasks, responsibilities and timings.
The document should outline the phases and sequence in which tasks are executed.
EMC Isilon Data Migration for NFS
37
Why: This document will outline the sequence of events that need to occur during a cutover. It can
be used to track and monitor the progress of the cutover.
Best practice – Create a schedule and define outage windows
Have a well-defined cutover schedule and outage window.
Why: The schedule helps execute the migration cutover. The outage window is when you have
clearly defined that access to storage will be unavailable and that you can make storage system
changes without impact to clients.
Best practice – Create a communication plan
Have a communication plan.
Why: This communication plan will clearly outline the protocols to keep all users up to date on the
status of a migration and allow storage administrators to stay focused on the execution of the
cutover and not be distracted by requests for information from end users.
Best practice – Prepare the DNS name resolution infrastructure for cutover
Lower the DNS TTL.
Why: This will facilitate the cutover of clients using DNS name resolution by reducing the time
between authoritative updates to domain name services.
Additional Pre-Cutover preparation steps often include the following:
•
•
•
•
Prepare the DFS namespace if applicable
Create CNAMES in DNS
Update scripts used by clients for storage connections
Prepare clients and applications
Cutover Event
Having migrated the data and prepared the environment for cutover, the actual final migration event
can occur.
In general the high level cutover sequence resembles the following steps:
•
•
•
•
•
•
•
•
•
•
•
Initiate migration cutover window – communicate the event
Restrict access or make source data Read-Only – prevent new writes to the old data source
Execute a final incremental – copy all final data to new storage
Validate final incremental – validate the source data is ready for cutover
Final testing executed – the final cutover testing is completed
Go or no go call on full cutover – decide if the migration should continue or rollback
Update connection and name resolution protocols; DNS, CNAMES, DFS, scripts
Enable new storage to Read/Write – enable writes to the new storage
Continue testing and user acceptance – continue to test and monitor as production traffic moves over
Execute the redirection of client to the new storage – initiate client redirection process
Monitor – assess the cutover, new storage and clients
On executing a cutover event, the following best practices are recommended:
EMC Isilon Data Migration for NFS
38
Best practice – Follow the cutover schedule
Follow a cutover schedule.
Why: By following a well-defined schedule the migration can be monitored and controlled. It is
suggested that you execute cutovers during off hours or when the number of active connections is
low.
Best practice – Test the migrated cutover data
Prepare a number of data and workflow tests to execute against the migrated data. Have a number
of well-defined production use cases, data tests and test users available to conduct post cutover
testing and review.
Why: Having a well-defined use case and users to validate the migration cutover will help in the
decision to continue with the cutover.
Best practice – Monitor clients and application during migrations
Monitor client and application connections to the new storage during the cutover.
Why: This will verify that your cutover methodology is working as defined and clients are moving
and connecting to the new storage successfully.
Best practice – Develop a client connection remediation plan
Have a client remediation methodology plan in place and ready to execute against clients that
exhibit any issues connecting to the new data targets.
Why: Have a well-defined strategy to handle client connection issues, including a dedicated
support line, email address, or an IT desk.
The Go or No Go Decision
During the migration cutover window, a critical point is reached. This threshold determines whether
you continue with the cutover or abort the cutover and rollback to the existing storage.
Common abort cutover situations:
•
•
•
•
•
•
Final incremental does not complete in the outage window
Cutover methodology fails; clients not connecting correctly
Security issues with new storage system
Workflow issues post cutover
Load and availability problems
Other unknown issues
Best practice – Clearly define your cutover criteria
It is critical to have a series of cutover criteria that clearly define when a migration will continue or
be aborted and rolled back
EMC Isilon Data Migration for NFS
39
Why: The criteria remove uncertainty, help with decision-making, and dictate the action to take.
Once you begin to write data to the new storage system, reverting to the old system becomes much
more complicated because you now need to reconcile data with the original storage system.
Rollback
If a decision to abort the cutover is taken, a well-defined rollback plan should have been developed
and tested ahead of time so that you can restore data access as quickly as possible.
Rollback Plan:
•
•
•
Prevent any new writes to the new storage
Move client connections back to the old storage
Enable writes to the old storage
Best practice – Develop a rollback plan
Have a clearly defined rollback strategy that is easy to implement and can restore user access to
data quickly and cleanly. Make sure the plan is tested.
Why: A rollback plan will quickly help you restore client data access in the event of a failure in a
migration cutover event.
If any data has already been written to the new storage and a rollback is executed, then steps to
remediate this data must be taken to restore this new data back to the original storage.
Common strategies for reconciling data during a rollback are as follows:
•
•
•
•
Manually reconciled – identify and manually move any data from the new storage to the old storage
A reverse incremental – have migration type jobs to run in a reverse direction to update the old storage
Data just discarded – consider the data as non-critical and be conformable with not reconciling it
Data is re-written from the client or application to the old storage – allow applications and clients to re-write
the data
The goal of any rollback strategy is to limit the impact to end users and restore data access as
seamlessly as possible. It is for this reason that your migration cutover criteria should be well defined
and that the rollback strategy should have been well tested in the event that you have to use it.
Migration Event Complete
After you successfully complete a cutover, you should continue to monitor the new storage system.
Best practice – Monitor the new storage post cutover
Continue to monitor the cutover storage after the cutover event for any new issue resulting from the
cutover.
Why: Production load and workflow may be unpredictable; close monitoring should be undertaken
to rectify any post-migration issues.
EMC Isilon Data Migration for NFS
40
Steady State
Repetition
Most data migrations will constitute multiple cutover events. Having developed a well-structured
migration methodology, these additional cutover events should be run with the same plan and
strategy.
Lessons learned
After you complete a migration, you should assess the success and failures of the methodology. If an
additional migration needs to be performed, the lessons learned will allow you to refine the process.
Ask yourself the following questions:
•
•
•
What worked during the migration and cutover
What did not work during the migration and cutover
Can the migration methodology be modified or optimized
Conclusion
The goal of this document is to supply you with solid guidance for conducting an NFS single protocol file
migration from a NAS system to an Isilon cluster. The guidance is based on a comprehensive set of
industry lessons learned and best practices on the technical aspects and process of data migration. As
stated in the beginning of this document, this paper does not aim to be an exhaustive authoritative source
on the subject of NFS single protocol migrations, but rather a comprehensive reference document that
covers the key areas to help ensure your success. EMC can provide comprehensive services including
migration services, Isilon training and education, and residency services to reduced risk and maximize
system uptime and service levels during and after a data and system migration.
EMC Isilon Data Migration for NFS
41
Appendix: Sample Migration Use Case
This appendix provides a sample high-level overview of how to collect information for and plan a
migration of NFS source directories from a single NFS server. Keep in mind that the information in this
section provides only a skeleton of some of the information that you would want to collect and an
overview of the strategy that you want to define for your migration. The use case that follows answers
the following question at a high level: What are the recommendations and best practices as well as
supported Isilon configurations to migrate all directories and data?
From a sample NFS Filer storage system to an Isilon cluster:
Source Configuration and Data
Sample Directory Structure
Single Source System – NFS Filer,
4x1Gb Ethernet ports
Total Data - 25TB
Max File Size – 4GB
Min File Size – 0B
Avg File Size – 256KB
File Count: 8000000
8 Top level Exports:
acct
engineering
home
production
RandD
scratch
temp
work
User home directories: each user has
a single home directory under a higher
level share, /exports/home.
/exports
/acct
/engineering
/home
/production
/RandD
/scratch
/temp
/work
Table 2 - Source Data
Exports and permissions
:
drwxr-xr-x
drwxr-xr-x
drwxrwxr-x
drwxrwxr-x
drwxrwxr-x
drwxrwxr-x
drwxrwxr-x
drwxrwxrwx
drwxrwxrwx
drwxrwxr-x
10
21
2
2
2
2
2
2
2
2
root
root
root
root
bob
root
prod
root
root
root
wheel
wheel
rd
acct
eng
users
prod
wheel
wheel
eng
512
512
512
512
512
512
512
512
512
512
Aug
Aug
Aug
Aug
Aug
Aug
Aug
Aug
Aug
Aug
27
27
27
27
27
27
27
27
27
27
14:50
14:50
14:50
14:50
14:50
14:50
14:50
14:50
14:50
14:50
.
..
RandD
acct
engineering
home
production
scratch
temp
work
Table 3 - Exports and Permissions
Additional Source Information
Additional Isilon Information
EMC Isilon Data Migration for NFS
42
All LDAP – single Domain
Source system network connectivity –
4 x 1Gb
No firewalls, IDS or QoS
Monthly Full backups
Anti-virus scanning in place
Same datacenter as Isilon
No deduplication or offline files
DNS – 2 CNAMEs
No routing or VLAN restrictions
Isilon 3 x X200 – ~61TB
OneFS 7.0.x
LDAP authentication
3 x LACP (2 x 1Gbps each)
SmartQuotas
SnapshotIQ
Table 4 - Additional Source Data Information
Requirements
-All data and permissions moved as is, no changes. Retain all existing POSIX permissions and
ownership
-Eight cutover events; 12 hour window – Saturday 8:00PM through Sunday 8:00AM
-One migration per weekend
-Each user has a defined quota, quota limits to be replicated on Isilon
Migration Project Assumptions (including but not limited to)
-Customer will have approved change controls submitted for any migration activity
-Migration plan and design will have been reviewed and approved by the customer prior to the start of
cutovers
-Any recommended array OS upgrades (and firmware updates) necessary for the migration will be
applied before migration cutover activity
-Source NAS and target Isilon systems are in a known good state prior to conducting migrations
-Customer will have successfully completed a full system backup and verified it’s reliability
Strategy
-Isilon based migration; source NFS exports mounted directly on individual Isilon nodes
-Conduct pilot migration; validate methodology, document performance metrics, refine and tune rsync
switches and scripts. Test migrated data with clients and users.
-Investigate sizes and file counts in each export; this will help determine order of migration (i.e. the
largest directories will take the longest time and should be started first)
-Validate change rates and time to execute incrementals
-Have completed a full backup of all file systems that are to be migrated before cutover; verify backup
is good
-Develop detailed project timeline and cutover schedule with customer
-Develop back out plan, review with customer
-Execute migration phases: execute initial full copies, followed by nightly incrementals
-Use a DNS update methodology to redirect clients
EMC Isilon Data Migration for NFS
43
-During cutover; make sure each source File System is changed to read-only after source directory is
successfully replicated preventing clients from making any changes
-Reduce DNS TTL’s in advance of cutover Windows
-Umount externally mounted NFS exports from each Isilon node.
-Develop client communication; customer should provide dedicated cutover IT support desk/personnel
Source System Configuration
-If possible, restrict access so clients cannot modify data during cutover (change to read-only)
Isilon Configuration
-Pre-create NFS shares with identical share permissions
-Disable Snapshots on data until cutover complete
-Disable AV scanning
Toolset Selection
-use rsync
Example:
rsync --a --delete sourcefiler:/exports/acct /ifs/data/exports/acct -exclude ‘.snapshot*’
Migration Testing
-Map source and target directories from the migration host
-Replicate a small test set of data
-Use a small set of test users to validate full data access (data to be discarded or overwritten following
test)
Migration
This example directory structure can be broken down into at least 8 separate rsync jobs. Several of the
jobs can be run in parallel assuming the source cluster can tolerate the additional load while the
migration is occurring. Monitor the source cluster and scale up or down accordingly.
If the customer has a preference for the order of directory migration then plan the transfers accordingly.
Otherwise, start with the largest directories since they will take the most time to replicate.
Example:
Initial migration testing indicated that 3 simultaneous rsync jobs would be an acceptable additional
load on the source cluster. We will run 3 rsync jobs concurrently so as not to overload the source
cluster. Note that in other scenarios, additional nodes could run additional rsync jobs in parallel, if the
source cluster has enough performance and bandwidth to accommodate.
Mount an individual export to a different node and start an rsync:
Isilon node 1:
Mount filer1:/export/acct with a command similar to:
mount filer1:/export/acct /import/acct
(mounts the remotely exported directory to a created directory locally)
Rsync to /ifs/data/filer1/acct with a command similar to:
EMC Isilon Data Migration for NFS
44
rsync –av /import/acct /ifs/data/filer1 --exclude ‘.snapshot*’
(this would copy files in “archive” mode which ensures symbolic links, devices, attributes,
permissions, ownerships, etc. are preserved in the transfer. No compression. Exclude any snapshot
directories from the transfer)
Repeat with Isilon node 2:
Mount filer1:/export/engineering and rsync to /ifs/data/filer1/engineering.
And again with Isilon node 3:
Mount filer1:/export/home and rsync to /ifs/data/filer1/home.
Monitor the source cluster to verify that it is not overloaded with the additional strain of migrating data.
The idea is not to impact clients as the migration progresses.
Repeat with the remaining shares until all of the source data is migrated.
Once an initial full copy of the source data has been completed, incremental copies should be run to
propagate any changes that were made once the migration began.
On the day of the cutover, final incremental copies should be run and access to the source cluster
should be restricted if possible to prevent clients from writing data that may not be migrated.
Be sure to unmount the external NFS mounts on the individual Isilon nodes prior to normal cluster
operation.
User Acceptance Testing
-Verify data and permissions on Isilon are replicated correctly following final copies
-Review user access, verify users have connectivity and correct permissions
-Use a small set of test users to validate full data access during migration event to spot any problems
early on
-Verify that a user can read/write to a file, create a new file and directory, and traverse the directory
structure
-Monitor Isilon performance and client connections as load increases
Cutover Plan
-Make source file systems read-only
-Execute final incremental copies
-Update DNS
-Verify clients can connect to Isilon
-Test automated workflows if possible
-Initiate user logoff and re-logon
Rollback Plan
-Reverse DNS update
-Make old source file systems RW, remove any restrictions
-Remove any connections to Isilon; stop exports
-Any data that was written to Isilon is considered lost, no reverse or reconciliation will be done
EMC Isilon Data Migration for NFS
45
Exit Criteria
-DNS resolves to new target storage and shares
-Client successfully read/write data to directories
-Workflows successfully complete with no user/permission problems
-No connectivity issues
Post Cutover
-Customer meeting to review and triage the migration
-Documentation of migration protocol
-Lessons learned discussion with both the internal and customer team
EMC Isilon Data Migration for NFS
46