distributed storage and NC - Institute of Network Coding

BASIC Regenerating Codes for Distributed Storage Systems
Kenneth Shum (joint work with Minghua Chen, Hanxu Hou and Hui Li)
Aug 2013

Windows Azure data centers
http://technoblimp.com

Inside a data center

Data distribution
• Encode and distribute a data file to n storage nodes.
[Figure: the data file "INC" is encoded and distributed to the storage nodes.]

Data collector
• A data collector can retrieve the whole file by downloading from any k storage nodes.
[Figure: the data collector rebuilds the file "INC".]

Three kinds of disk failures
• Transient error due to noise corruption
  – repeat the disk access request
• Disk sector error
  – partial failure
  – detected and masked by the operating system
• Catastrophic error
  – total failure, due to the disk controller for instance
  – the whole disk is regarded as erased

Frequency of node failures
[Figure: number of failed nodes over a single month in a 3000-node production cluster of Facebook. From "XORing elephants: novel erasure codes for Big Data" by Sathiamoorthy et al.]

Outline of this talk
• Repetition scheme
• Traditional erasure-correcting codes
  – Reed-Solomon codes
• Network-coding-based scheme
  – BASIC regenerating codes

Distributed storage system
• Encode a data file and distribute it to n disks.
• (n,k) recovery property
  – The data file can be rebuilt from any k disks.
• Repair
  – If a node fails, we regenerate a new node by connecting to any d surviving disks and downloading data from them.
  – The aim is to minimize the repair bandwidth (Dimakis et al. 2007).
• A coding scheme with the above properties is called a regenerating code.
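The (n,k) recovery property can be checked on a toy example. The sketch below is an illustration only: it assumes n = 4 and k = 2, a file made of two symbols (a, b), and arithmetic modulo the prime 257. The four nodes store a, b, a+b and a+2b, mirroring the Reed-Solomon example later in the talk. The modulus and all function names are assumptions made for this example.

```python
from itertools import combinations

P = 257  # assumed prime modulus; chosen only for this toy example

# Each node stores one symbol c0*a + c1*b (mod P); each row is (c0, c1).
NODES = [(1, 0), (0, 1), (1, 1), (1, 2)]  # a, b, a+b, a+2b


def encode(a, b):
    """Return the symbol stored on each of the n = 4 nodes."""
    return [(c0 * a + c1 * b) % P for c0, c1 in NODES]


def decode(i, j, yi, yj):
    """Solve the 2x2 linear system given by nodes i and j for (a, b)."""
    (p, q), (r, s) = NODES[i], NODES[j]
    det = (p * s - q * r) % P
    inv = pow(det, -1, P)  # modular inverse (Python 3.8+); fails if det == 0
    a = ((s * yi - q * yj) * inv) % P
    b = ((p * yj - r * yi) * inv) % P
    return a, b


def recoverable_from_any_k(a, b):
    """Check that every pair of nodes (k = 2) rebuilds the file (a, b)."""
    stored = encode(a, b)
    return all(
        decode(i, j, stored[i], stored[j]) == (a, b)
        for i, j in combinations(range(len(NODES)), 2)
    )
```

Calling `recoverable_from_any_k(73, 78)` tries all six pairs of nodes, which is exactly what "any k" requires for n = 4 and k = 2.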

Repetition scheme
• GFS: replicate data 3 times
• Gmail: replicate data 21 times

2x repetition scheme
• Divide the data file into 2 parts, A and B.
• Cannot tolerate double disk failures.
[Figure: four nodes store A, B, A and B, 1G each; a data collector downloads from them.]

Repair is easy for a repetition-based system
• Repair bandwidth = 1G
[Figure: the new node downloads A (1G) from the surviving copy.]

Reed-Solomon code
• Divide the file into 2 parts, A and B.
• The four nodes store A, B, A+B and A+2B.
• It can tolerate double disk failures.

Repair requires essentially decoding the whole file
• Repair bandwidth = 2G
[Figure: the node storing A fails; the new node downloads 1G from each of two surviving nodes.]

BASIC regenerating code
• BASIC: Binary Addition Shift Implementable Convolutional
• Divide the data file into 4 parts of 0.5G each.
• Utilization of bit-wise shift in storage was proposed by Piret and Krol (1983), and Qureshi, Foh and Cai (2012).

Download from nodes 1 and 2
Download from nodes 1 and 3
Download from nodes 1 and 4
Download from nodes 2 and 3
Download from nodes 2 and 4
Download from nodes 3 and 4
[Figures: in each case the data collector downloads 1G from each of the two nodes.]

Zigzag decoding à la Gollakota and Katabi (2008)
• Want to solve for P1 and P2.
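A minimal sketch of the zigzag idea, assuming two equal-length bit packets P1 and P2 and two combinations: P1 XOR P2, and P1 XOR (P2 delayed by one bit). The packet layout, the one-bit shift and the function names are assumptions for illustration, not the exact packets shown on the slide. Because the shifted combination exposes the first bit of P1 uncoded, the decoder can alternate between the two combinations and peel off one bit at a time.

```python
def shift_xor(p1, p2, shift):
    """Return p1 XOR (p2 delayed by `shift` bits), zero-padded at the ends."""
    n = len(p1)
    out = []
    for i in range(n + shift):
        a = p1[i] if i < n else 0
        b = p2[i - shift] if 0 <= i - shift < n else 0
        out.append(a ^ b)
    return out


def zigzag_decode(c0, c1):
    """Recover (p1, p2) from c0 = p1 ^ p2 and c1 = p1 ^ (p2 delayed by 1)."""
    n = len(c0)
    p1, p2 = [], []
    prev_p2 = 0  # the bit "before" p2[0] is the zero padding
    for i in range(n):
        b1 = c1[i] ^ prev_p2  # c1[i] = p1[i] ^ p2[i-1]
        b2 = c0[i] ^ b1       # c0[i] = p1[i] ^ p2[i]
        p1.append(b1)
        p2.append(b2)
        prev_p2 = b2
    return p1, p2


# Hypothetical 4-bit packets, used only to exercise the functions.
P1 = [1, 0, 1, 1]
P2 = [0, 1, 1, 0]
C0 = shift_xor(P1, P2, 0)  # unshifted combination, 4 bits
C1 = shift_xor(P1, P2, 1)  # shifted combination, 5 bits
```

Here `zigzag_decode(C0, C1)` returns the original pair of packets using XOR only. Note that the shifted combination is one bit longer than the packets, so shifting costs a small amount of extra storage.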
[Figure: the two combinations to be decoded, one of P1 with P2 and one of P1 with P2'.]

Repair of BASIC regenerating code
• Repair bandwidth = 1.5G
[Figure labels: New node; XOR; Bitwise shift and XOR (twice).]

Repair of BASIC regenerating code (continued)
• Decode the blue and red packets by zigzag decoding.
• Interference alignment

Comparison of the three examples
                          Repetition scheme          Reed-Solomon codes          BASIC regenerating codes
Storage efficiency        1/2                        1/2                         1/2
Reliability               Tolerate one disk failure  Tolerate two disk failures  Tolerate two disk failures
Repair bandwidth          1G                         2G                          1.5G
Computational complexity  Very small                 Finite field arithmetic     Binary addition and bit-wise shift

Summary
• We can reduce repair bandwidth by network coding.
• BASIC regenerating codes
  – A failed storage node can be repaired by simple bit-wise shift and XOR operations.
  – Small storage overhead due to shifting.

References
• Piret and Krol, MDS convolutional codes, IEEE Trans. on Information Theory, 1983.
• Dimakis, Godfrey, Wainwright and Ramchandran, Network coding for distributed storage systems, INFOCOM, 2007.
• Gollakota and Katabi, Zigzag decoding: combating hidden terminals in wireless networks, Proc. ACM SIGCOMM, 2008.
• Qureshi, Foh and Cai, Optimal solution for the index coding problem using network coding over GF(2), Proc. IEEE Conf. on Sensor, Mesh and Ad Hoc Communications and Networks, 2012.
• Sung and Gong, A zigzag decodable code with MDS property for distributed storage systems, Proc. IEEE Int. Symp. on Information Theory, 2013.
• Hou, Shum, Chen and Li, BASIC regenerating code: binary addition and shift for exact repair, Proc. IEEE Int. Symp. on Information Theory, 2013.

Two modes of repair
• Exact repair
  – The content of the new node is exactly the same as the content of the failed node.
• Functional repair
  – Only requires that the (n,k) recovery property is preserved.
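Exact repair, and the 1G versus 2G repair-bandwidth figures for the first two schemes, can be illustrated with toy arithmetic. The sketch below assumes each stored part is a single symbol modulo the prime 257, so one downloaded symbol stands in for one 1G part. The modulus, the node ordering and the function names are assumptions for illustration. The BASIC repair (1.5G) is not reproduced, because its packet layout is not given in the text.

```python
P = 257  # assumed prime modulus for the toy arithmetic

# Coefficient rows (c0, c1) of the Reed-Solomon example: a, b, a+b, a+2b.
RS_ROWS = [(1, 0), (0, 1), (1, 1), (1, 2)]


def repair_repetition(nodes, failed):
    """nodes = [a, b, a, b]. Copy the twin replica.

    Returns (repaired symbol, number of symbols downloaded).
    """
    twin = (failed + 2) % 4
    return nodes[twin], 1


def repair_reed_solomon(nodes, failed):
    """nodes = [a, b, a+b, a+2b] mod P.

    Download two surviving symbols, decode (a, b), then re-encode the
    failed node's symbol. Returns (repaired symbol, symbols downloaded).
    """
    i, j = [x for x in range(4) if x != failed][:2]
    (p, q), (r, s) = RS_ROWS[i], RS_ROWS[j]
    inv = pow((p * s - q * r) % P, -1, P)  # modular inverse (Python 3.8+)
    a = ((s * nodes[i] - q * nodes[j]) * inv) % P
    b = ((p * nodes[j] - r * nodes[i]) * inv) % P
    c0, c1 = RS_ROWS[failed]
    return (c0 * a + c1 * b) % P, 2


# Hypothetical file symbols, used only to exercise the functions.
A, B = 73, 78
REPETITION = [A, B, A, B]
REED_SOLOMON = [(c0 * A + c1 * B) % P for c0, c1 in RS_ROWS]
```

In both functions the repaired symbol equals the lost one, which is exact repair. The download counts of 1 and 2 symbols correspond to the 1G and 2G entries in the comparison.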
