Business Continuity: Local (Module 4.3)
Local Replication
© 2006 EMC Corporation. All rights reserved.

After completing this module you will be able to:
Discuss replicas and the possible uses of replicas
Explain consistency considerations when replicating file systems and databases
Discuss host based and array based replication technologies
– Functionality
– Differences
– Considerations
– Selecting the appropriate technology

What is Replication?
Replica – an exact copy (in all details) of the original
Replication – the process of reproducing data
[Diagram: Original → REPLICATION → Replica]

Possible Uses of Replicas
Alternate source for backup
Source for fast recovery
Decision support
Testing platform
Migration

Considerations
What makes a replica good?
– Recoverability: considerations for resuming operations with the primary
– Consistency/re-startability: how this is achieved by the various technologies
Kinds of replicas
– Point-in-Time (PIT) = finite RPO
– Continuous = zero RPO
How does the choice of replication technology tie back into RPO/RTO?

Replication of File Systems
[Diagram: host I/O stack – applications, operating system, management utilities, DBMS, file system buffer, volume management, multi-pathing software, device drivers, HBAs, physical volumes]

Replication of Database Applications
A database application may be spread out over numerous files, file systems, and devices (data and logs) – all of which must be replicated.
Database replication can be offline or online.

Database: Understanding Consistency
Databases/applications maintain integrity by following the "Dependent Write I/O Principle"
– Dependent write: a write I/O that will not be issued by an application until a prior, related write I/O has completed
  A logical dependency, not a time dependency
– Inherent in all Database Management Systems (DBMS)
  e.g., a page (data) write is a dependent write I/O based on a successful log write
– Applications can also use this technique
– Necessary for protection against local outages
  A power failure creates a dependent-write-consistent image
  A restart transforms the dependent-write-consistent image into a transactionally consistent one, i.e., committed transactions are recovered and in-flight transactions are discarded
(See the code sketch at the end of this section for an illustration of dependent write ordering.)

Database Replication: Transactions
[Diagram: transactions 1–4 flow from the application buffer to the data and log devices]

Database Replication: Consistency
[Diagram: the replica holds the same writes (1–4) on its data and log devices as the source – the replica is consistent. In this example, the database is online.]

Database Replication: Consistency
[Diagram: the replica holds only some of the writes applied to the source (e.g., 3 and 4 but not 1 and 2) – the replica is inconsistent. In this example, the database is online.]
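The dependent write ordering described above can be shown in a short sketch. This is illustrative only: the Device class, commit_transaction function, and transaction names are hypothetical and do not belong to any specific DBMS; the point is simply that the data page write is issued only after the related log write has completed.

```python
# Minimal sketch of the Dependent Write I/O Principle, assuming a toy
# storage interface with a synchronous write() call. Illustrative only.

class Device:
    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, block, payload):
        # Returns only after the write is durable on this device.
        self.blocks[block] = payload
        return True                      # acknowledgement of completion


def commit_transaction(log_dev, data_dev, txn_id, page_no, page_image):
    """Write-ahead logging: the data page write depends on the log write."""
    ok = log_dev.write(txn_id, f"commit record for txn {txn_id}")
    if not ok:
        raise IOError("log write failed; dependent data write must not be issued")
    # Only issued after the prior, related log write has completed
    data_dev.write(page_no, page_image)


log, data = Device("log"), Device("data")
commit_transaction(log, data, txn_id=42, page_no=7, page_image="updated page")
```

If a replica captures the data write, it must also capture the log write it depended on; otherwise the copy is inconsistent, which is exactly the situation shown in the inconsistent example above.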
Database Replication: Ensuring Consistency
Offline replication
– If the database is offline or shut down when the replica is created, the replica will be consistent.
– In many cases, creating an offline replica may not be viable due to the 24x7 nature of the business.

Database Replication: Ensuring Consistency
Online replication
– Some database applications allow replication while the application is up and running
– The production database has to be put in a state that allows it to be replicated while it is active
– Some level of recovery must be performed on the replica to make it consistent
[Diagram: during online replication the replica may momentarily hold only later writes (e.g., 3 and 4 but not 1 and 2) and is inconsistent; once all dependent writes (1 through 4) reach the replica it is consistent, even while newer writes (e.g., 5) continue on the source]

Tracking Changes After PIT Creation
At PIT: Source = Target
Later: Source ≠ Target
Resynch: Source = Target

Local Replication Technologies
Host based
– Logical Volume Manager (LVM) based mirroring
– File system snapshots
Storage array based
– Full volume mirroring
– Full volume: Copy on First Access
– Pointer based: Copy on First Write

Logical Volume Manager: Review
Host resident software responsible for creating and controlling host level logical storage
– The physical view of storage is converted to a logical view by mapping: logical data blocks are mapped to physical data blocks.
– The logical layer resides between the physical layer (physical devices and device drivers) and the application layer (the OS and applications see the logical view of storage).
Usually offered as part of the operating system or as third party host software
LVM components:
– Physical Volumes
– Volume Groups
– Logical Volumes

Volume Groups
One or more Physical Volumes form a Volume Group
LVM manages a Volume Group as a single entity
Physical Volumes can be added to and removed from a Volume Group as necessary
Physical Volumes are typically divided into contiguous, equal-sized disk blocks
A host will always have at least one Volume Group for the operating system
– Application data and operating system data are maintained in separate Volume Groups

Logical Volumes
[Diagram: logical volumes carved out of a Volume Group – logical disk blocks map to physical disk blocks on Physical Volumes 1–3]

Host Based Replication: Mirrored Logical Volumes
[Diagram: a logical volume mirrored across two Physical Volumes (PVID1 and PVID2), each carrying a copy of the Volume Group Descriptor Area (VGDA)]
(A minimal sketch of this logical-to-physical mapping with mirroring follows.)
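The mirroring just shown can be modeled with a toy sketch, assuming each logical block of a mirrored logical volume maps to one physical block on each of two physical volumes. The class and method names are illustrative and are not any vendor's LVM API.

```python
# Toy model of LVM mirroring: each logical block is mapped to a physical
# block on every mirror copy, so a write lands on both physical volumes.

class PhysicalVolume:
    def __init__(self, name, num_blocks):
        self.name = name
        self.blocks = [None] * num_blocks


class MirroredLogicalVolume:
    def __init__(self, copies):
        # copies: list of PhysicalVolumes, one per mirror (e.g., PVID1, PVID2)
        self.copies = copies

    def write(self, logical_block, payload):
        # The LVM duplicates every write to all mirror copies.
        for pv in self.copies:
            pv.blocks[logical_block] = payload

    def read(self, logical_block):
        # A read can be serviced from any healthy copy.
        return self.copies[0].blocks[logical_block]


pv1 = PhysicalVolume("PVID1", 1024)
pv2 = PhysicalVolume("PVID2", 1024)
lv = MirroredLogicalVolume([pv1, pv2])
lv.write(10, "application data")
assert pv1.blocks[10] == pv2.blocks[10]    # both mirrors hold the data
```

Splitting the mirror (detaching one copy) yields a host based point-in-time replica, at the cost of the host CPU overhead discussed in the next section.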
Host Based Replication: File System Snapshots
Many LVM vendors allow the creation of file system snapshots while the file system is mounted
File system snapshots are typically easier to manage than creating mirrored logical volumes and then splitting them

Host (LVM) Based Replicas: Disadvantages
LVM based replicas add overhead on the host CPU
If the host devices are already storage array devices, the added redundancy provided by LVM mirroring is unnecessary
– The devices already have some level of RAID protection
Host based replicas can usually be presented back only to the same host
Keeping track of changes after the replica has been created adds further host overhead

Storage Array Based Local Replication
Replication is performed by the array operating environment
Replicas reside on the same array
[Diagram: a production server attached to the Source device and a business continuity server attached to the Replica, both within one array]

Storage Array Based Local Replication: Example
Typically, array based replication is done at the array device level
– Map the storage components used by an application/file system back to the specific array devices used, then replicate those devices on the array
[Diagram: File System 1 on Logical Volume 1 in Volume Group 1 maps to host devices c12t1d1 and c12t1d2, which correspond to Source Vol 1 and Source Vol 2 on Array 1 and are replicated to Replica Vol 1 and Replica Vol 2]

Array Based Local Replication: Full Volume Mirror – Attached
[Diagram: the Source is Read/Write; the attached Target is Not Ready]

Array Based Local Replication: Full Volume Mirror – Detached (PIT)
[Diagram: after detaching, both Source and Target are Read/Write; the Target holds the point-in-time copy]

Array Based Local Replication: Full Volume Mirror
For future re-synchronization to be incremental, most vendors can track changes at some level of granularity (e.g., 512 byte block, 32 KB, etc.)
– Tracking is typically done with some kind of bitmap
The Target device must be at least as large as the Source device
– For full volume copies, the minimum amount of storage required is the same as the size of the source

Copy on First Access (COFA)
The Target device is made accessible for BC tasks as soon as the replication session is started
The point-in-time is determined by the time of activation
Can be used in Copy on First Access (deferred) mode or in Full Copy mode
The Target device must be at least as large as the Source device

Copy on First Access: Deferred Mode
[Diagram: three cases – a write to the Source, a write to the Target, and a read from the Target – each triggering a copy of the affected data from Source to Target on first access, while both devices remain Read/Write]
(See the sketch following the full copy mode description below.)

Copy on First Access: Full Copy Mode
On session start, the entire contents of the Source device are copied to the Target device in the background
Most vendor implementations provide the ability to track changes
– Made to the Source or Target
– Enables incremental re-synchronization
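A minimal sketch of the deferred (Copy on First Access) behaviour, under the assumption that data is copied from Source to Target only the first time a block is touched after activation. The block granularity, class, and method names are illustrative, not any vendor's implementation.

```python
# Toy Copy on First Access (deferred mode): data moves from source to
# target only when a block is first written on the source, or first
# read/written on the target, after the session is activated.

class CofaSession:
    def __init__(self, source, target_size):
        self.source = source                  # dict: block -> data
        self.target = [None] * target_size    # physically empty at activation
        self.copied = set()                   # blocks already moved to target

    def _copy_if_needed(self, block):
        if block not in self.copied:
            self.target[block] = self.source.get(block)
            self.copied.add(block)

    def write_source(self, block, data):
        self._copy_if_needed(block)           # preserve the original on target first
        self.source[block] = data

    def write_target(self, block, data):
        self._copy_if_needed(block)           # pull the original from source first
        self.target[block] = data

    def read_target(self, block):
        self._copy_if_needed(block)           # first access copies from source
        return self.target[block]


src = {0: "A", 1: "B", 2: "C"}
s = CofaSession(src, target_size=3)
s.write_source(0, "A2")                       # target now holds the original "A"
print(s.read_target(0), s.read_target(1))     # -> A B
```

Unlike the full copy mode, only blocks that were actually accessed ever reach the target, which is why a COFA-only replica is less useful once the source device is unavailable.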
Array Based Local Replication: Pointer Based Copy on First Write (COFW)
Targets do not hold actual data, but hold pointers to where the data is located
– The actual storage requirement for the replicas is usually a small fraction of the size of the source volumes
A replication session is set up between the Source and Target devices and started
– When the session is set up, based on the specific vendor's implementation, a protection map is created for all the data on the Source device at some level of granularity (e.g., 512 byte block, 32 KB, etc.)
– Target devices are accessible immediately when the session is started
– At the start of the session the Target device holds pointers to the data on the Source device

Pointer Based Copy on First Write: Example
[Diagram: the Target is a virtual device whose pointers reference either the Source device or a save location holding the original data of blocks that have since been written]

Array Replicas: Tracking Changes
Changes will/can occur to the Source and Target devices after the PIT has been created
How, and at what level of granularity, should this be tracked?
– Tracking changes bit by bit is too expensive: it would require an amount of storage equivalent to the source and the target just to record which bits changed for each
– Instead, some level of granularity is chosen (vendor specific) and a bitmap is created, one for the Source and one for the Target
  For example, with 32 KB granularity, a 1 GB device is tracked as 32768 chunks of 32 KB
  If any bit in a 32 KB chunk changes, the whole chunk is flagged as changed in the bitmap
  The bitmap for a 1 GB device then takes up only 32768 / 8 / 1024 = 4 KB
(See the bitmap sketch at the end of this section.)

Array Replicas: How Changes Are Determined
[Diagram: at the PIT both bitmaps are all zeros; afterwards, changed chunks are flagged in the Source bitmap (e.g., 1 0 0 1 0 1 0 0) and in the Target bitmap (e.g., 0 0 1 1 0 0 0 1); for re-synchronization the two bitmaps are ORed (1 0 1 1 0 1 0 1) to identify every chunk that must be copied. 0 = unchanged, 1 = changed]

Array Replication: Multiple PITs
[Diagram: one Source device with Target devices created at multiple points in time, e.g., 06:00 A.M., 12:00 P.M., 06:00 P.M., and 12:00 A.M.]

Array Replicas: Ensuring Consistency
[Diagram: a replica that has received all dependent writes (1–4) is consistent; a replica that has received later writes while earlier dependent writes are missing is inconsistent]

Mechanisms to Hold I/O
Host based
Array based
What if the application straddles multiple hosts and multiple arrays?

Array Replicas: Restore/Restart Considerations
Production has a failure
– Logical corruption
– Physical failure of the production devices
– Failure of the production server
Solution
– Restore data from the replica to production
  The restore would typically be done incrementally, and the applications can be restarted even before the synchronization is complete, leading to a very small RTO
  ----- OR -----
– Start production on the replica
  Resolve the issues with production while continuing operations on the replica
  After issue resolution, restore the latest data from the replica to production
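A minimal sketch of the bitmap tracking scheme described above, using the 32 KB granularity from the example. The class and method names are illustrative, not a vendor API.

```python
# Toy change-tracking bitmaps: one per device, one bit per 32 KB chunk.
# Re-synchronization copies every chunk flagged in either bitmap.

CHUNK = 32 * 1024                              # 32 KB granularity

class ChangeBitmap:
    def __init__(self, device_bytes):
        self.nchunks = device_bytes // CHUNK   # 1 GB -> 32768 chunks
        self.bits = bytearray(self.nchunks // 8)   # 32768 bits -> 4 KB map

    def mark_write(self, offset):
        chunk = offset // CHUNK                # any write flags the whole chunk
        self.bits[chunk // 8] |= 1 << (chunk % 8)

    def changed_chunks(self):
        return {i for i in range(self.nchunks)
                if self.bits[i // 8] & (1 << (i % 8))}


GIB = 1024 ** 3
source_map, target_map = ChangeBitmap(GIB), ChangeBitmap(GIB)
source_map.mark_write(0)                       # source chunk 0 modified
target_map.mark_write(3 * CHUNK + 100)         # target chunk 3 modified

# Chunks to copy on resynch = union (logical OR) of the two bitmaps
to_copy = source_map.changed_chunks() | target_map.changed_chunks()
print(sorted(to_copy), len(source_map.bits))   # -> [0, 3] 4096  (a 4 KB map)
```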
Array Replicas: Restore/Restart Considerations
Before a restore
– Stop all access to the Production devices and the Replica devices
– Identify the Replica to be used for the restore, based on RPO and data consistency
– Perform the restore
Before starting production on a Replica
– Stop all access to the Production devices and the Replica devices
– Identify the Replica to be used for the restart, based on RPO and data consistency
– Create a "Gold" copy of the Replica, as a precaution against further failures
– Start production on the Replica
RTO drives the choice of replication technology

Array Replicas: Restore Considerations
Full volume replicas
– Restores can be performed either to the original source device or to any other device of like size
  Restores to the original source can be incremental in nature
  A restore to a new device involves a full synchronization
Pointer based replicas
– Restores can be performed to the original source or to any other device of like size, as long as the original source device is healthy
  The target only has pointers:
  – pointers to the source for data that has not been written to after the PIT
  – pointers to the save location for data that was written after the PIT
  Thus, even to restore to an alternate volume, the source must be healthy so that data not yet copied over to the target can be accessed

Array Replicas: Which Technology?
Full volume replica
– The replica is a full physical copy of the source device
– The storage requirement is identical to that of the source device
– A restore does not require a healthy source device
– Activity on the replica has no performance impact on the source device
– Good for full backup, decision support, development, testing, and restore to the last PIT
– RPO depends on when the last PIT was created
– RTO is extremely small

Array Replicas: Which Technology? (continued)
Pointer based – COFW
– The replica contains pointers to data, so the storage requirement is a fraction of the source device (lower cost)
– A restore requires a healthy source device
– Activity on the replica has some performance impact on the source
  Any first write to the source or target requires the data to be copied to the save location and the pointer moved to the save location
  Any read I/O to data not in the save location has to be serviced by the source device
– Typically recommended if the changes to the source are less than 30%
– RPO depends on when the last PIT was created
– RTO is extremely small

Array Replicas: Which Technology? (continued)
Full volume – COFA replicas
– The replica only has data that was accessed
– A restore requires a healthy source device
– Activity on the replica has some performance impact
  Any first access on the target requires the data to be copied to the target before the I/O to/from the target can be satisfied
– Replicas created with COFA only are typically not as useful as replicas created with the full copy mode
– The recommendation is to use the full copy mode if the technology allows such an option
(A sketch that encodes these selection criteria follows, ahead of the summary table.)
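The criteria above can be collected into a small helper. This is only an illustrative decision sketch based on the guidelines stated in this module (the function name, inputs, and threshold handling are hypothetical), not a vendor tool.

```python
# Illustrative helper encoding this module's selection guidelines for
# array based local replication: pointer based restores need a healthy
# source, and COFW is recommended when changes stay below roughly 30%.

def choose_local_replication(expected_change_pct,
                             restore_must_survive_source_failure,
                             storage_budget_is_tight):
    if restore_must_survive_source_failure:
        # Pointer based and COFA restores both require a healthy source device.
        return "full volume mirror / full copy"
    if expected_change_pct < 30 and storage_budget_is_tight:
        return "pointer based copy on first write (COFW)"
    return "full volume mirror / full copy"


print(choose_local_replication(10, False, True))    # -> pointer based COFW
print(choose_local_replication(10, True, True))     # -> full volume
```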
Array Replicas: Full Volume vs. Pointer Based

                      Full Volume                     Pointer Based
Required Storage      100% of Source                  Fraction of Source
Performance Impact    None                            Some
RTO                   Very small                      Very small
Restore               Source need not be healthy      Requires a healthy source device
Data change           No limits                       < 30%

Module Summary
Key points covered in this module:
Replicas and the possible uses of replicas
Consistency considerations when replicating file systems and databases
Host and array based replication technologies
– Advantages/disadvantages
– Differences
– Considerations
– Selecting the appropriate technology

Check Your Knowledge
What is a replica?
What are the possible uses of a replica?
What is consistency in the context of a database?
How can consistency be ensured when replicating a database?
Discuss one host based replication technology.
What is the difference between full volume mirrors and pointer based replicas?
What are the considerations when performing restore operations for each replication technology?

Apply Your Knowledge
Upon completion of this topic, you will be able to:
List EMC's local replication solutions for the Symmetrix and CLARiiON arrays
Describe EMC's TimeFinder/Mirror replication solution
Describe EMC's SnapView Snapshot replication solution

EMC Local Replication Solutions
EMC Symmetrix arrays
– EMC TimeFinder/Mirror: full volume mirroring
– EMC TimeFinder/Clone: full volume replication
– EMC TimeFinder/SNAP: pointer based replication
EMC CLARiiON arrays
– EMC SnapView Clone: full volume replication
– EMC SnapView Snapshot: pointer based replication

EMC TimeFinder/Mirror: Introduction
Array based local replication technology for full volume mirroring on EMC Symmetrix storage arrays
– Creates full volume mirrors of an EMC Symmetrix device within an array
TimeFinder/Mirror uses special Symmetrix devices called Business Continuance Volumes (BCVs). BCVs:
– Are devices dedicated to local replication
– Can be dynamically and non-disruptively established with a Standard device, and subsequently split instantly to create a PIT copy of the data
The PIT copy of the data can be used in a number of ways:
– Instant restore – use BCVs as standby data for recovery
– Decision support operations
– Backup – reduce application downtime to a minimum (offline backup)
– Testing
TimeFinder/Mirror is available in both open systems and mainframe environments

EMC TimeFinder/Mirror: Operations – Establish
Synchronize the Standard volume to the BCV volume
– The BCV is set to a Not Ready state when established and cannot be independently addressed
– Re-synchronization is incremental
– BCVs cannot be established to other BCVs
– The establish operation is non-disruptive to the Standard device: operations to the Standard can proceed as normal during the establish
(A minimal sketch of the establish/split cycle follows this section.)
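A minimal, hypothetical sketch of the Establish operation described above and the Split operation covered next. The class and method names are illustrative and are not the TimeFinder command syntax.

```python
# Toy model of a Standard/BCV pair: full establish, split to create a PIT,
# change tracking while split, and an incremental re-establish.

class StdBcvPair:
    def __init__(self, std):
        self.std = dict(std)       # standard device: chunk -> data
        self.bcv = {}              # BCV contents
        self.state = "never_established"
        self.changed = set()       # chunks written while split

    def establish(self):
        if self.state == "split":
            chunks = set(self.changed)       # incremental re-establish
        else:
            chunks = set(self.std)           # first establish is a full copy
        for c in chunks:
            self.bcv[c] = self.std[c]
        self.changed.clear()
        self.state = "established"           # BCV is Not Ready while established

    def split(self):
        self.state = "split"                 # BCV now holds the PIT, Read/Write

    def write_std(self, chunk, data):
        self.std[chunk] = data
        if self.state == "split":
            self.changed.add(chunk)          # tracked for the next establish


pair = StdBcvPair({0: "a", 1: "b"})
pair.establish(); pair.split()
pair.write_std(0, "a2")                      # only chunk 0 needs recopying
pair.establish()
print(pair.bcv)                              # -> {0: 'a2', 1: 'b'}
```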
EMC TimeFinder/Mirror: Operations – Split
The time of the Split is the point-in-time
The BCV is made accessible for BC operations
Consistency: Consistent Split
Changes are tracked after the split

EMC TimeFinder/Mirror: Consistent Split
EMC PowerPath
– PowerPath is EMC host based multi-pathing software
– PowerPath holds I/O (read and write) during the TimeFinder/Mirror split
Enginuity Consistency Assist
– The Symmetrix microcode holds I/O during the TimeFinder/Mirror split: write I/O (and subsequent reads after a first write)

EMC TimeFinder/Mirror: Operations – Restore
Synchronize the contents of the BCV volume to the Standard volume
– A restore can be full or incremental
– The BCV is set to a Not Ready state
– I/O to the Standard and BCV should be stopped before the restore is initiated

EMC TimeFinder/Mirror: Operations – Query
Provides the current status of BCV/Standard volume pairs

EMC TimeFinder/Mirror: Multi-BCVs
A Standard device keeps track of changes to multiple BCVs, one after the other
Allows an incremental establish or an incremental restore with each of them
[Diagram: the Standard is established with and split from BCVs at 2:00 a.m., 4:00 a.m., and 6:00 a.m.; any of the split BCVs can later be incrementally re-established or used for an incremental restore]

EMC TimeFinder/Mirror: Concurrent BCVs
Two BCVs can be established concurrently with the same Standard device
The BCVs can be established simultaneously or one after the other
The BCVs can be split individually or simultaneously
Simultaneous ("concurrent") restores are not allowed

EMC CLARiiON SnapView: Snapshots
SnapView allows full copies and pointer based copies
– Full copies: Clones (sometimes called BCVs)
– Pointer based copies: Snapshots
Because they are pointer based, Snapshots:
– Use less space than a full copy
– Require a "save area" to be provisioned
– May impact the performance of the LUN they are associated with
The save area is called the Reserved LUN Pool
The Reserved LUN Pool:
– Consists of private LUNs (LUNs not visible to a host)
– Must be provisioned before Snapshots can be made

The Reserved LUN Pool
[Diagram: FLARE LUNs 5–8 assigned to the Reserved LUN Pool as Private LUNs 5–8]

Reserved LUN Allocation
[Diagram: Source LUN 1 has Snapshots 1a and 1b, whose Sessions 1a and 1b use private LUNs from the Reserved LUN Pool (e.g., Private LUNs 5 and 6); Source LUN 2 has Snapshot 2a, whose Session 2a uses another private LUN; the remaining private LUNs stay unallocated until needed]
(A minimal sketch of this allocation follows.)
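A minimal sketch of reserved LUN allocation as shown above: each session started on a source LUN draws a private LUN from the pool and returns it when the session stops. The names are illustrative, not the SnapView management API.

```python
# Toy Reserved LUN Pool: private LUNs are handed out to sessions as they
# start and reclaimed when sessions stop.

class ReservedLunPool:
    def __init__(self, private_luns):
        self.free = list(private_luns)        # e.g., Private LUNs 5 through 8
        self.in_use = {}                      # session name -> (source LUN, private LUN)

    def start_session(self, session, source_lun):
        if not self.free:
            raise RuntimeError("Reserved LUN Pool exhausted; session cannot start")
        lun = self.free.pop(0)
        self.in_use[session] = (source_lun, lun)
        return lun

    def stop_session(self, session):
        _, lun = self.in_use.pop(session)
        self.free.append(lun)                 # the space is reclaimed


pool = ReservedLunPool(["Private LUN 5", "Private LUN 6",
                        "Private LUN 7", "Private LUN 8"])
print(pool.start_session("Session 1a", "LUN 1"))   # -> Private LUN 5
print(pool.start_session("Session 1b", "LUN 1"))   # -> Private LUN 6
print(pool.start_session("Session 2a", "LUN 2"))   # -> Private LUN 7
```

This is why the pool must be provisioned before any Snapshot can be made: a session with no available private LUN has nowhere to store its copy-on-first-write data.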
SnapView Terms
Snapshot
– The "virtual LUN" seen by a secondary host
– Made up of data on the Source LUN and data in the Reserved LUN Pool (RLP)
– Visible to the host (online) only if associated with a Session
Session
– The mechanism that tracks the changes: maintains the pointers and the map
– Represents the point in time
Activate and deactivate a Snapshot
– Associate and disassociate a Session with a Snapshot
Roll back
– Copy data from a (typically earlier) Session to the Source LUN

COFW and Reads from a Snapshot
[Diagram: the first write to a chunk on the Source LUN (e.g., chunks 0 and 3) copies the original chunk to a Reserved LUN and updates the SnapView map held in SP memory; a secondary host reading the Snapshot is served from the Reserved LUN for copied chunks and from the Source LUN for unchanged chunks]

Writes to a Snapshot
[Diagram: a secondary host write to the Snapshot (e.g., to chunks 0 and 2) is stored in the Reserved LUN and tracked in the map; the Source LUN is not modified]

Rollback – Snapshot Active (preserve changes)
[Diagram: rolling back while the Snapshot is active copies the Session's point-in-time data back to the Source LUN and preserves the secondary host's writes (chunks 0* and 2*), which are applied to the Source as well]

Rollback – Snapshot Deactivated (discard changes)
[Diagram: rolling back after the Snapshot is deactivated copies the Session's point-in-time data back to the Source LUN and discards the secondary host's writes]
(A minimal sketch of these copy-on-first-write mechanics follows.)
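A minimal sketch of the copy-on-first-write mechanics just illustrated, assuming chunk granularity and a single reserved LUN acting as the save area. The class and method names are illustrative, not the SnapView API, and the rollback behaviour follows the preserve/discard cases shown above.

```python
# Toy SnapView-style session: copy on first write to the source, snapshot
# reads resolved via the map, snapshot writes stored in the save area.

class SnapSession:
    def __init__(self, source):
        self.source = source          # dict: chunk -> data (production LUN)
        self.saved = {}               # reserved LUN: original PIT chunks
        self.snap_writes = {}         # secondary host writes to the snapshot

    def write_source(self, chunk, data):
        if chunk not in self.saved:               # copy on first write
            self.saved[chunk] = self.source.get(chunk)
        self.source[chunk] = data

    def read_snapshot(self, chunk):
        if chunk in self.snap_writes:             # secondary host's own write
            return self.snap_writes[chunk]
        if chunk in self.saved:                   # chunk changed on the source
            return self.saved[chunk]
        return self.source.get(chunk)             # unchanged, read from source

    def write_snapshot(self, chunk, data):
        self.snap_writes[chunk] = data            # the Source LUN is not modified

    def rollback(self, preserve_snapshot_writes):
        for chunk, original in self.saved.items():
            self.source[chunk] = original         # back to the point in time
        if preserve_snapshot_writes:              # snapshot still active
            self.source.update(self.snap_writes)


lun = {0: "A", 1: "B", 2: "C", 3: "D"}
s = SnapSession(lun)
s.write_source(0, "A'")                           # original "A" saved first
s.write_snapshot(2, "C*")
print(s.read_snapshot(0), s.read_snapshot(2))     # -> A C*
s.rollback(preserve_snapshot_writes=False)
print(lun[0])                                     # -> A  (discard changes case)
```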