Lecture 13

236601 - Coding and Algorithms for Memories
Large Scale Storage Systems
• Big Data Players: Facebook, Amazon, Google, Yahoo,…
Cluster of machines running Hadoop at Yahoo! (Source: Yahoo!)
• Failures are the norm
Node failures at Facebook
[Figure: node failures at Facebook, plotted by date]
Source: M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur, "XORing Elephants: Novel Erasure Codes for Big Data," VLDB 2013
Problem Setup
• Disks are stored together in a group (rack)
• The system must tolerate disk failures
• Requirements:
– Tolerate as many disk failures as possible
– And yet provide:
  • Optimal and fast recovery
  • Low complexity
Reed Solomon Codes
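• A code with parity check matrix of the form
\[
H = \begin{pmatrix}
1 & 1 & 1 & 1 & \cdots & 1 \\
\alpha^{0} & \alpha & \alpha^{2} & \alpha^{3} & \cdots & \alpha^{n-1} \\
\alpha^{0} & \alpha^{2} & \alpha^{4} & \alpha^{6} & \cdots & \alpha^{2(n-1)} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
\alpha^{0} & \alpha^{d-1} & \alpha^{2(d-1)} & \alpha^{3(d-1)} & \cdots & \alpha^{(d-1)(n-1)}
\end{pmatrix}
\]
where 𝛼 is a primitive element of some extension field and its order satisfies O(𝛼) > n-1
Claim: Every sub-matrix of size d×d has full rank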
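The claim can be checked numerically on a small instance. The sketch below is only an illustration under assumed parameters that do not come from the slides: it works over the prime field GF(7) rather than an extension field, with 𝛼 = 3 (a primitive element mod 7, of order 6), n = 6 and d = 3. The helper name det_mod_p is hypothetical.

```python
from itertools import combinations

def det_mod_p(rows, p):
    """Determinant of a square matrix over GF(p) by Gaussian elimination."""
    m = [list(r) for r in rows]
    size = len(m)
    det = 1
    for col in range(size):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, size) if m[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det = det * m[col][col] % p
        inv = pow(m[col][col], -1, p)  # modular inverse (Python 3.8+)
        for r in range(col + 1, size):
            factor = m[r][col] * inv % p
            for c in range(col, size):
                m[r][c] = (m[r][c] - factor * m[col][c]) % p
    return det % p

# Assumed toy parameters: GF(7), alpha = 3 has order 6 > n - 1 = 5.
p, alpha, n, d = 7, 3, 6, 3

# H[i][j] = alpha^(i*j), rows i = 0..d-1, columns j = 0..n-1.
H = [[pow(alpha, i * j, p) for j in range(n)] for i in range(d)]

# Every choice of d columns should give an invertible d x d matrix.
all_full_rank = all(
    det_mod_p([[H[i][j] for j in cols] for i in range(d)], p) != 0
    for cols in combinations(range(n), d)
)
print(all_full_rank)
```

Each d×d sub-matrix here is a Vandermonde matrix on distinct powers of 𝛼, which is why the condition O(𝛼) > n-1 matters: it keeps the n column generators distinct.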
Reed Solomon Codes
• Advantages:
– Support the maximum number of disk failures
– Are very common in practice and have relatively efficient encoding/decoding schemes
• Disadvantages:
– Require working over large fields
  Solution: EvenOdd Codes
– Need to read k entire disks of an (n,k) code in order to recover even a single disk failure, so rebuild is not efficient
  Solution: ZigZag Codes
The Repair Problem
• Facebook’s storage scheme: RS code
– 10 data blocks
– 4 parity blocks
– Can tolerate any four disk failures
[Figure: data disks 1 to 10 and parity disks P1 to P4]
• A disk is lost – Repair job starts
• Access, read, and transmit the data of 10 disks!
• Overuse of system resources during single repair
• Goal: Reduce repair cost in a single disk repair
ZigZag Codes
• Designed by Itzhak Tamo, Zhiying Wang, and
Jehoshua Bruck
• The goal: construct codes that correct the maximum number of erasures and yet allow efficient reconstruction when only a single drive fails
ZigZag Codes
• Lower bound: the minimum amount of data that must be read to recover from a single drive failure
– (n,k) code: n drives, k information drives, and n-k redundancy drives
– M: size of a single drive in bits
• For an (n,n-2) code it is required to read at least 1/2 of the data on the remaining drives, that is at least (1/2)(n-1)M bits
– The last example is optimal
• In general, for an (n,n-r) code it is required to read at least 1/r of the data on the remaining drives, that is at least (1/r)(n-1)M bits
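To get a feel for the numbers, the bound (1/r)(n-1)M can be compared with the cost of rebuilding one drive by reading k = n-r whole drives, which is always possible for an MDS code. The following is a minimal sketch; the function names are hypothetical and M is set to 1 so that results are in units of one drive.

```python
from fractions import Fraction

def min_repair_read(n, r, M):
    """Lower bound (1/r)(n-1)M on the bits read to rebuild one drive
    of an (n, n-r) code, as stated on the slide."""
    return Fraction(n - 1, r) * M

def whole_drive_read(n, r, M):
    """Bits read when k = n-r complete drives are read to rebuild one drive."""
    return (n - r) * M

# The (14,10) RS setting from the Facebook slide: n = 14, r = 4, M = 1 drive.
bound = min_repair_read(14, 4, 1)
naive = whole_drive_read(14, 4, 1)
print(bound, naive)
```

For n = 14 and r = 4 the bound is 13/4 of a drive, against 10 full drives when k whole drives are read.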
ZigZag Codes
• Example
info 1: 0 1 2 3
info 2: 2 3 0 1
info 3: 1 0 3 2
Row parity, ZigZag parity: 0 1 2 3
Network Coding for Distributed Storage
• Goal: show the following:
In general, for an (n,n-r) code it is required to read at least 1/r of the data on the remaining drives, that is at least (1/r)(n-1)M bits
• Network Coding for Distributed Storage
Dimakis, Godfrey, Wu, Wainwright, Ramchandran
• File of size M is partitioned into k pieces of size M/k
• The k pieces are encoded into n encoded pieces using an
(n,k) MDS code
Network Coding for Distributed Storage
[Figure: storage nodes x1, x2, x3, x4, with labels y1, y2]
Network Coding for Distributed Storage
[Figure: storage nodes x1, x2, x3, x4 (labels y1, y2) and node x5 with three incoming links, each carrying β; β = ?]
Network Coding for Distributed Storage
[Figure: information flow graph. Source S connects with edges of capacity ∞ to x1_in, x2_in, x3_in, x4_in; each xi_in connects to xi_out with capacity α = 1; x5_in has three incoming edges of capacity β (β = ?) and connects to x5_out; the data collector DC is reached through two edges of capacity ∞]
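The question β = ? in the flow graph can be explored with a small max-flow computation. The sketch below rests on assumptions that are not stated on the slide: n = 4, k = 2, file size M = 2 (so α = M/k = 1), node x4 has failed, x5 downloads β from each of x1, x2, x3, and the data collector DC connects to any two of the active nodes x1, x2, x3, x5. The function names are hypothetical, and the infinite capacities are replaced by a large constant.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations

INF = Fraction(10**9)  # stands in for the infinite-capacity edges

def max_flow(cap, s, t):
    """Edmonds-Karp max flow on a dict-of-dicts capacity map."""
    res = {}
    for u, nbrs in cap.items():
        for v, c in nbrs.items():
            res.setdefault(u, {})
            res.setdefault(v, {})
            res[u][v] = res[u].get(v, Fraction(0)) + c
            res[v].setdefault(u, Fraction(0))  # reverse residual edge
    flow = Fraction(0)
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        b, v = INF, t
        while parent[v] is not None:  # bottleneck along the path
            b = min(b, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:  # push b units along the path
            u = parent[v]
            res[u][v] -= b
            res[v][u] += b
            v = u
        flow += b

def dc_flow(beta, connect, alpha=Fraction(1)):
    """Max flow from S to a DC attached to the *_out nodes named in connect."""
    cap = {"S": {}}
    for i in range(1, 5):
        cap["S"][f"x{i}_in"] = INF
        cap[f"x{i}_in"] = {f"x{i}_out": alpha}
    cap["x5_in"] = {"x5_out": alpha}
    for i in (1, 2, 3):  # assumed helpers of x5
        cap.setdefault(f"x{i}_out", {})["x5_in"] = beta
    for node in connect:
        cap.setdefault(f"{node}_out", {})["DC"] = INF
    return max_flow(cap, "S", "DC")

def worst_flow(beta):
    """Smallest flow over all choices of k = 2 active nodes for the DC."""
    active = ("x1", "x2", "x3", "x5")
    return min(dc_flow(beta, pair) for pair in combinations(active, 2))

# Smallest beta on a grid of eighths for which every DC can collect M = 2.
min_beta = min(b for b in (Fraction(i, 8) for i in range(1, 9))
               if worst_flow(b) >= 2)
print(min_beta)
```

Under these assumptions the smallest feasible value on the grid is β = 1/2, so x5 downloads 3β = 3/2 in total. With drive size M = 1 this equals (1/r)(n-1)M for n = 4 and r = 2.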
ZigZag Codes
• Example
info 1   info 2   Row parity   ZigZag parity
  a        b        a+b          a+2d
  c        d        c+d          c+b
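The rebuilding property of this example can be sketched in code. The slide does not name the field, so the sketch assumes arithmetic mod 3 (the coefficient 2 must be different from 0 and 1), and it assumes the column layout (a, c), (b, d), (a+b, c+d), (a+2d, c+b). All function names are hypothetical. Each information column is rebuilt from 3 of the 6 surviving symbols, which is half of the remaining data.

```python
from itertools import product

P = 3  # assumed field size: all arithmetic is mod 3

def encode(a, b, c, d):
    """Return the four columns of the array code as (top, bottom) pairs."""
    return [
        (a, c),                          # information column 1
        (b, d),                          # information column 2
        ((a + b) % P, (c + d) % P),      # row parity
        ((a + 2 * d) % P, (c + b) % P),  # zigzag parity
    ]

def repair_col1(cols):
    """Rebuild (a, c) by reading only b, a+b and c+b."""
    b = cols[1][0]
    a = (cols[2][0] - b) % P
    c = (cols[3][1] - b) % P
    return (a, c)

def repair_col2(cols):
    """Rebuild (b, d) by reading only a, a+b and a+2d."""
    a = cols[0][0]
    b = (cols[2][0] - a) % P
    d = (cols[3][0] - a) * pow(2, -1, P) % P  # divide by 2 in GF(3)
    return (b, d)

# Check both repairs on every possible message (a, b, c, d).
repairs_ok = all(
    repair_col1(encode(a, b, c, d)) == (a, c)
    and repair_col2(encode(a, b, c, d)) == (b, d)
    for a, b, c, d in product(range(P), repeat=4)
)
print(repairs_ok)
```

Reading 3 symbols of size M/2 each matches the (1/2)(n-1)M bound for n = 4.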