Chapter 3
Presented by: Anupam Mittal
Data Protection: Concept of RAID and its Components

After completing this chapter, you will be able to:
◦ Describe what RAID is and the needs it addresses
◦ Describe the concepts upon which RAID is built
◦ Define and compare RAID levels
◦ Recommend the use of the common RAID levels based on performance and availability considerations
◦ Explain factors impacting disk drive performance

Performance limitations of a single disk drive
◦ Limited capacity
◦ Limited access speed
An individual drive has a certain life expectancy
◦ Measured in MTBF (mean time between failures)
◦ Example: if the MTBF of a drive is 750,000 hours and there are 100 drives in the array, then the MTBF of the array becomes 750,000 / 100, or 7,500 hours
RAID was introduced to mitigate these problems
RAID provides:
◦ Increased capacity
◦ Higher availability
◦ Increased performance

[Figure: host connected through a RAID controller to a RAID array]

[Figure: RAID array components: host, RAID controller, and hard disks grouped into physical and logical arrays]

Hardware (usually a specialized disk controller card)
◦ Controls all drives attached to it
◦ Array(s) appear to the host operating system as a regular disk drive
◦ Provided with administrative software
Software
◦ Runs as part of the operating system
◦ Performance is dependent on CPU workload
◦ Does not support all RAID levels

RAID 0: Striped array with no fault tolerance
RAID 1: Disk mirroring
RAID 3: Parallel access array with dedicated parity disk
RAID 4: Striped array with independent disks and a dedicated parity disk
RAID 5: Striped array with independent disks and distributed parity
RAID 6: Striped array with independent disks and dual distributed parity
Nested RAID (e.g., 1+0, 0+1)

RAID Redundancy: Parity
[Figure: host and RAID controller with four data disks holding blocks 0/4/8, 1/5/9, 2/6/10, and 3/7/11, plus a parity disk holding the parity of blocks 0-3, 4-7, and 8-11]

© 2008 EMC Corporation. All rights reserved.
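The parity disk on the RAID Redundancy: Parity slide holds one parity strip per stripe. The following is a minimal sketch of how such a strip might be generated and then used to rebuild a strip from a failed disk. It assumes byte-wise XOR parity (the write-penalty slide later in this chapter describes parity as XOR operations). The function names and sample bytes are illustrative only and do not come from the slides.

```python
from functools import reduce
from operator import xor


def parity_strip(strips):
    """Return the byte-wise XOR of a list of equal-length strips."""
    return bytes(reduce(xor, column) for column in zip(*strips))


def rebuild_strip(surviving_strips, parity):
    """Recover the single missing strip from the survivors plus parity.

    XOR-ing the parity with every surviving strip cancels them out and
    leaves only the strip that was lost.
    """
    return parity_strip(list(surviving_strips) + [parity])


# Hypothetical stripe: one two-byte strip per data disk (four data disks).
stripe = [b"\x05\x10", b"\x03\x21", b"\x04\x32", b"\x02\x43"]
parity = parity_strip(stripe)

# Simulate the failure of the third disk and rebuild its strip.
survivors = stripe[:2] + stripe[3:]
recovered = rebuild_strip(survivors, parity)
print(recovered == stripe[2])
```

Because XOR is its own inverse, the same routine serves for both generating parity and rebuilding, which is why a single parity disk can cover the loss of any one disk in the set but not two.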
Parity Calculation
[Figure: RAID array with four data disks holding 5, 3, 4, and 2, and a parity disk holding 14]
5 + 3 + 4 + 2 = 14
The middle drive fails:
5 + 3 + ? + 2 = 14
? = 14 - 5 - 3 - 2
? = 4

Lecture 8, 9, 10
Different RAID levels and their suitability for different application environments: RAID 0, RAID 1

[Figure: strips (Strip 1, Strip 2, Strip 3) on individual disks forming stripes (Stripe 1, Stripe 2) across the disks]

[Figure: host and RAID controller with blocks striped across the disks]

[Figure: host and RAID controller writing the same blocks (Block 0, Block 1) to two disks]

[Figure: nested RAID combining RAID 1 (mirroring) and RAID 0 (striping), shown writing blocks 0 to 3]

[Figure: nested RAID combining RAID 0 (striping) and RAID 1 (mirroring), shown writing blocks 0 to 3]

RAID 1+0 and RAID 0+1
Benefits are identical under normal operations
Rebuild operations are very different
◦ RAID 1+0 uses mirrored pairs: only one disk is rebuilt if a disk fails
◦ RAID 0+1: if a single drive fails, the entire stripe is faulted
RAID 0+1 is a poorer solution and is less common

[Figure: host and RAID controller with four data disks and a dedicated parity disk, same layout as the RAID Redundancy: Parity figure]

Parity calculation
4 + 6 + 1 + 7 = 18
The middle drive fails:
4 + 6 + ? + 7 = 18
? = 18 - 4 - 6 - 7
? = 1

[Figure: RAID controller splitting data from the host into blocks 0 to 3 and generating parity P0123]

RAID 4: Striping with Dedicated Parity Disk
[Figure: blocks 0-3 and 4-7 striped across four data disks, with generated parity P0123 and P4567 stored on a dedicated parity disk]

[Figure: blocks 0-3 and 4-7 striped across the disks, with generated parity P0123 and P4567 stored on different disks (distributed parity)]

Two disk failures in a RAID set lead to data unavailability and data loss in single-parity schemes, such as RAID-3, 4, and 5
Increasing the number of drives in an array and increasing drive capacity lead to a higher probability of two disks failing in a RAID set
RAID-6 protects against two disk failures by maintaining two parities
◦ Horizontal parity, which is the same as RAID-5 parity
◦ Diagonal parity, which is calculated by taking diagonal sets of data blocks from the RAID set members
Even-Odd and Reed-Solomon are two commonly used algorithms for calculating RAID-6 parity

Hardware (usually a specialized disk controller card)
◦ Controls all drives attached to it
◦ Performs all RAID-related functions, including volume management
◦ Array(s) appear to the host operating system as a regular disk drive
◦ Dedicated cache to improve performance
◦ Generally provides some type of administrative software
Software
◦ Generally runs as part of the operating system
◦ Volume management performed by the server

Comparison of RAID Levels

RAID Comparison (n = number of disks)
RAID 0
◦ Min disks: 2
◦ Storage efficiency: 100%
◦ Cost: Low
◦ Read performance: Very good for both random and sequential reads
◦ Write performance: Very good
RAID 1
◦ Min disks: 2
◦ Storage efficiency: 50%
◦ Cost: High
◦ Read performance: Good. Better than a single disk
◦ Write performance: Good. Slower than a single disk, as every write must be committed to two disks
RAID 3
◦ Min disks: 3
◦ Storage efficiency: (n-1)*100/n %
◦ Cost: Moderate
◦ Read performance: Good for random reads and very good for sequential reads
◦ Write performance: Poor to fair for small random writes. Good for large, sequential writes
RAID 5
◦ Min disks: 3
◦ Storage efficiency: (n-1)*100/n %
◦ Cost: Moderate
◦ Read performance: Very good for random reads. Good for sequential reads
◦ Write performance: Fair for random writes. Slower due to parity overhead. Fair to good for sequential writes
RAID 6
◦ Min disks: 4
◦ Storage efficiency: (n-2)*100/n %
◦ Cost: Moderate, but more than RAID 5
◦ Read performance: Very good for random reads. Good for sequential reads
◦ Write performance: Good for small, random writes (has write penalty)
RAID 1+0 and 0+1
◦ Min disks: 4
◦ Storage efficiency: 50%
◦ Cost: High
◦ Read performance: Very good
◦ Write performance: Good

[Figure: RAID controller reading Ep old and E4 old, XOR-ing them with E4 new, and writing E4 new and Ep new to a set of one parity disk (P0) and four data disks (D1 to D4)]
Small (less than element size) write on RAID 3 & 5
Ep = E1 + E2 + E3 + E4 (XOR operations)
If parity is valid, then: Ep new = Ep old - E4 old + E4 new (XOR operations)
◦ 2 disk reads and 2 disk writes
Reading, calculating, and writing the parity segment introduces a penalty to every write operation
Parity RAID penalty manifests due to slower cache flushes
Increased write load can cause contention and slower read response times

Parity vs. Mirroring

What is a RAID array?
What benefits do RAID arrays provide?
What methods can be used to provide higher data availability in a RAID array?
What is the primary difference between RAID 3 and RAID 5?
What is the advantage of using RAID 6?
What is a hot spare?
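The small-write rule from the write-penalty slide (Ep new = Ep old - E4 old + E4 new, with all operations being XOR) can be sketched as below. This is an illustrative model only: the in-memory lists stand in for disks, the I/O counts are tallied by hand, and every name in the code is hypothetical rather than taken from any real controller.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(elements):
    """Ep = E1 XOR E2 XOR ... XOR En, computed from every data element."""
    parity = bytes(len(elements[0]))  # all-zero starting value
    for element in elements:
        parity = xor_bytes(parity, element)
    return parity


def small_write(elements, parity, index, new_data):
    """Update one data element using the read-modify-write shortcut.

    Returns the new parity and a tally of the disk I/Os the shortcut
    implies: read old data, read old parity, write new data, write
    new parity.
    """
    old_data = elements[index]   # disk read 1
    old_parity = parity          # disk read 2
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    elements[index] = new_data   # disk write 1
    io_count = {"reads": 2, "writes": 2}  # disk write 2 stores new_parity
    return new_parity, io_count


# Hypothetical stripe with four data elements E1..E4.
elements = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"]
parity = full_parity(elements)

# Overwrite E4 and update parity without touching E1..E3.
new_parity, io_count = small_write(elements, parity, 3, b"\xaa\xbb")
print(new_parity == full_parity(elements), io_count)
```

The shortcut avoids reading the untouched elements, but a single logical write still costs four disk I/Os, which is the penalty the slide describes. A mirrored write, by comparison, needs only the two writes.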