13.4.4 Error-Handling Capabilities of Stable Storage
By: Bryant Panyarachun ID: 210

Types of Failures
• Media Failures
  – If, after storing X in sectors XL and XR, one of them undergoes a media failure, we can always read X from the other, unless both have failed.
  – The probability of both failing is extremely small.
• Write Failure
  – A system failure occurs while we are writing X.
  – The value of X may be lost from main memory, and the copy of X being written may be garbled.
  – Ex. Half the sector has the new value of X while the other half keeps the old value.

Write failure: possible cases
• Failure while writing XL
  – XL will be “bad” but XR will be “good.”
  – Thus, we can obtain the old value of X from XR. We may also copy XR into XL to repair the damage to XL.
  – This works unless there is a coincident media failure at XR, which is extremely unlikely.
• Failure after writing XL
  – We expect that XL will be “good,” so we can read the new value of X from XL.
  – Since XR may not have the correct value of X, we should copy XL into XR.

13.4.5 Recovery from Disk Crashes
• Most serious mode of failure:
  – Disk crash or head crash.
  – Data is permanently destroyed.
  – No way to recover the data unless it was previously backed up.
• Several schemes have been developed to reduce the risk of data loss
  – Generally involve: redundancy, parity checks, duplicated sectors.
  – Commonly called RAID (redundant arrays of independent disks)
• Rate of disk crashes
  – Measured by the mean time to failure
    • The time after which 50% of a population of disks can be expected to have failed unrecoverably.
    • For modern disks, the mean time to failure is about 10 years.
    • Simplified model: if the mean time to failure is n years, then in any given year, 1/nth of the surviving disks fail.
    • Reality: disks tend to fail early or late.
  – Mean time to disk crash is not the same as mean time to data loss.
• There are a number of schemes to recover the data if a disk crashes.
  – They involve multiple disks: data disks and redundant disks.
  – Either kind of disk can be used to restore the other in case of a disk crash.
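The simplified failure-rate model in 13.4.5 (each year, 1/n of the surviving disks fail) can be made concrete with a small calculation. This is only an illustrative sketch under that model; the function name `surviving_fraction` is hypothetical and does not come from the slides.

```python
def surviving_fraction(mttf_years: float, years: int) -> float:
    """Fraction of disks still working after `years` years.

    Assumes the simplified model from the text: in any given year,
    1/mttf_years of the disks that are still alive fail.
    """
    if mttf_years < 1:
        raise ValueError("mttf_years must be at least 1 for this model")
    if years < 0:
        raise ValueError("years must be non-negative")
    yearly_failure_rate = 1.0 / mttf_years
    # Each year the surviving population shrinks by the same factor.
    return (1.0 - yearly_failure_rate) ** years


if __name__ == "__main__":
    # With a 10-year mean time to failure, 10% of survivors fail each year.
    for y in (1, 2, 5):
        print(y, "years:", round(surviving_fraction(10, y), 4))
```

Under this model the population shrinks geometrically, which is one reason the slides stress that real disks, which tend to fail early or late, do not follow it exactly.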
13.4.6 Mirroring as a Redundancy Technique
Mirroring each disk
• Simplest scheme
• Referred to as RAID Level 1.
• Gives a mean time to data loss that is much greater than the mean time to disk failure.
• The only way data can be lost is if the second disk crashes while the first crash is being repaired.

Mirroring example
• Suppose each disk has a 10-year mean time to failure.
  – The probability of failure in any given year is 10%.
• When a disk fails, we only need to replace it with a good disk and copy the mirror disk to the new one.
• How often will both disks fail?
  – Suppose the process of replacing a failed disk takes 3 hours = 1/8 of a day, or 1/2920 of a year.
  – Since the average disk lasts 10 years, the probability that the mirror disk fails during the replacement is (1/10) × (1/2920), or one in 29,200.
  – If one disk fails every 10 years, then one of the 2 disks will fail once in 5 years on average. One in 29,200 of these failures results in data loss.
  – Result: the mean time to a failure involving data loss is 5 × 29,200 = 146,000 years.

13.4.7 Parity Blocks
• Mirroring uses the same number of redundant disks as data disks.
• Alternative approach: RAID 4
  – Uses only 1 redundant disk regardless of the number of data disks.
  – In the redundant disk, the ith block consists of parity checks for the ith blocks of all the data disks.
  – The jth bits of all the ith blocks, including both the data disks and the redundant disk, must have an even number of 1’s among them. We choose the bit of the redundant disk to make this condition true.
  – That is, bit j of the redundant disk is 1 if an odd number of the data disks have 1 in that bit, and 0 if an even number do. This calculation is called the modulo-2 sum.

Ex. Data disks: 1, 2, 3. Redundant disk: 4
Disk 1: 11110000
Disk 2: 10101010
Disk 3: 00111000
Disk 4: 01100010

Writing with RAID 4
• When writing a new block, we need to update the corresponding block of the redundant disk.
  – Naïve approach: read the corresponding blocks of the other data disks, calculate the modulo-2 sum, and rewrite the block of the redundant disk. With n data disks this requires n + 1 disk I/O’s: n − 1 reads of the unchanged data blocks, plus the writes of the new data block and the redundant block.
  – Better: take the modulo-2 sum of the old and new data. It tells us in which positions the number of 1’s among the blocks changes.
  – Changes are always by one, so any even number of 1’s changes to an odd number. If we flip the same positions of the redundant block, the number of 1’s becomes even again.
  – This only requires 4 disk I/O’s:
    1. Read the old value of the data block being changed.
    2. Read the corresponding block of the redundant disk.
    3. Write the new data block.
    4. Recalculate and write the block of the redundant disk.

Writing with RAID 4 Example
Disk 1: 11110000
Disk 2: 10101010
Disk 3: 00111000
Redundant disk: 01100010
• We want to change the block on disk 2 to 11001100.
• Take the modulo-2 sum of the old and new values of the block on disk 2, to get 01100110.
  – 01100110 shows that we must change positions 2, 3, 6, and 7 of the block on the redundant disk.
  – Replace the redundant block with 00000100 (the modulo-2 sum of itself and 01100110).

Failure Recovery
• If the redundant disk crashes, replace it with a new disk and recompute the redundant blocks.
• If a data disk crashes, replace the disk and recompute its data from the other disks.
• Since the number of 1’s among corresponding bits of all disks is even, the bit in any position is the modulo-2 sum of all the bits in the corresponding positions of all the other disks.
  – If the bit in question is 1, the number of corresponding bits in the other disks that are 1 must be odd; their modulo-2 sum is 1.
  – If the bit in question is 0, then there are an even number of 1’s among the corresponding bits of the other disks; their modulo-2 sum is 0.

Failure Recovery Example
Disk 1: 11110000
Disk 2: ????????
Disk 3: 00111000
Disk 4: 01100010
• Taking the modulo-2 sum of each column of the surviving disks gives us 10101010, the lost block of disk 2.
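The RAID 4 examples above can be checked with a short program. This is a minimal sketch, assuming each block is represented as a Python integer and that the modulo-2 sum is implemented as bitwise XOR; the function names (`parity`, `update_parity`, `recover`, `bits`) are hypothetical and are not part of any RAID implementation.

```python
from functools import reduce


def parity(blocks):
    """Modulo-2 sum (bitwise XOR) of a list of blocks given as integers."""
    return reduce(lambda a, b: a ^ b, blocks, 0)


def update_parity(old_data, new_data, old_parity):
    """Redundant block after one data block changes.

    old_data ^ new_data marks the positions that changed; flipping those
    same positions in the redundant block restores even parity.
    """
    return old_parity ^ (old_data ^ new_data)


def recover(surviving_blocks):
    """Rebuild a lost block from all surviving blocks (data and redundant)."""
    return parity(surviving_blocks)


def bits(block, width=8):
    """Render a block as a fixed-width bit string."""
    return format(block, f"0{width}b")


# Blocks from the examples in the text.
disk1, disk2, disk3 = 0b11110000, 0b10101010, 0b00111000

redundant = parity([disk1, disk2, disk3])
print("redundant block:", bits(redundant))        # 01100010

# Change disk 2 to 11001100 using only the old data and old redundant block.
new_disk2 = 0b11001100
new_redundant = update_parity(disk2, new_disk2, redundant)
print("changed positions:", bits(disk2 ^ new_disk2))  # 01100110
print("new redundant block:", bits(new_redundant))    # 00000100

# Disk 2 crashes (before the change): rebuild it from disks 1, 3, and 4.
print("recovered disk 2:", bits(recover([disk1, disk3, redundant])))  # 10101010
```

Because XOR is its own inverse, the same `parity` routine serves both for computing the redundant block and for rebuilding any single lost block.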