Codes for Big Data: Erasure Coding for Distributed Storage P. Vijay Kumar Professor, Department of Electrical Communication Engineering Indian Institute of Science, Bangalore The 3rd Annual Storage Developer Conference Bengaluru May 25-26, 2017

May 21, 2018

Page 1

Codes for Big Data: Erasure Coding for Distributed Storage

P. Vijay Kumar

Professor, Department of Electrical Communication Engineering

Indian Institute of Science, Bangalore

The 3rd Annual Storage Developer Conference, Bengaluru

May 25-26, 2017

Page 2

Thanks go out to

Paul Talbut and Udayan Singh for the invite

and

K. Gopinath and Siddhartha Nandi

for being kind enough to suggest my name.

Page 3

Acknowledgements

Research Collaborators

Joint work with:

Birenjith Sasidharan, Myna Vajha, S. B. Balaji and Nikhil Krishnan (PhD students, IISc)

Bhagyashree Puranik, Ganesh Kini and Vinayak Ramkumar (MTech students, IISc)

Srinivasan Narayanamurthy, Syed Hussain and Siddhartha Nandi (NetApp ATG, Bengaluru, India)

Page 4

Organization

Erasure Coding

Node Failures and the Evolution of Coding Theory

Regenerating Codes

Locally Recoverable Codes (briefly)

Codes with Local Regeneration (briefly)

Codes for Multiple Erasures (briefly)
- Codes for Data Availability
- Codes with Sequential Recovery

The Coupled-Layer MSR Code in Action

Page 5

Erasure Coding

Page 6

Fault Tolerance

Fault tolerance is key to making data loss a very remote possibility

A time-honored means of achieving fault tolerance is replication.

Page 7

Triple Replication

[Figure: a file or data object is divided into data blocks A, B, C, D, E. Under triple replication, each data block (here A) is kept as three copies, stored in different nodes of the storage network.]

Page 8

Drawback of Triple Replication

But triple replication is poor in terms of storage efficiency: just 33%. Are there better ways?

A well-known alternative is to use Erasure Coding (EC)

Page 10

Erasure Coding of Data

[Figure: the file or data object is split into k parts A1, A2, ..., Ak, held in k storage units; m parity storage units P1, P2, ..., Pm are added, giving a (k, m) erasure code.]
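To make the split-and-add-parity picture concrete, here is a minimal sketch of the simplest case, a (k, 1) code whose single parity unit is the XOR of the k data parts. The helper names and the sample data are illustrative assumptions, not taken from the talk; codes with m > 1 need finite-field arithmetic rather than plain XOR.

```python
from functools import reduce

def split_into_parts(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-length parts, zero-padding the tail."""
    part_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(part_len * k, b"\0")
    return [padded[i * part_len:(i + 1) * part_len] for i in range(k)]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def xor_parity(parts: list[bytes]) -> bytes:
    """XOR all parts together into a single parity unit."""
    return reduce(xor_bytes, parts)

data = b"erasure coding for distributed storage"  # hypothetical data object
parts = split_into_parts(data, k=4)
parity = xor_parity(parts)

# Lose part 2, then rebuild it from the 3 surviving parts plus the parity.
survivors = [p for i, p in enumerate(parts) if i != 2]
rebuilt = xor_parity(survivors + [parity])
assert rebuilt == parts[2]
```

With one parity unit the code tolerates the loss of any single storage unit, at a storage efficiency of 4/5.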

Page 11

Two Key Performance Measures

1. Storage efficiency: $\frac{k}{k+m}$
2. Fault tolerance
   - at most m storage units
3. Codes with maximum possible fault tolerance ⇒ MDS codes
4. Reed-Solomon codes - a prime example
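As a quick check of the two measures, the sketch below evaluates k/(k+m) for a few schemes. Treating triple replication as a (1, 2) code is a modelling choice made here for comparison; the function and dictionary names are illustrative.

```python
from fractions import Fraction

def storage_efficiency(k: int, m: int) -> Fraction:
    """Fraction of the stored bytes that is user data: k / (k + m)."""
    return Fraction(k, k + m)

# (k, m) pairs; an MDS code tolerates the loss of any m storage units.
schemes = {
    "triple replication, viewed as (1, 2)": (1, 2),
    "(4, 2) code": (4, 2),
    "(10, 4) code": (10, 4),
}

for name, (k, m) in schemes.items():
    eff = storage_efficiency(k, m)
    print(f"{name}: efficiency {float(eff):.0%}, tolerates up to {m} failed units")
```

The first line reproduces the 33% figure quoted for triple replication.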

Page 12

An Example MDS Code - The RAID 6 Code

Source: https://upload.wikimedia.org/wikipedia/commons/thumb/7/70/RAID_6.svg/1280px-RAID_6.svg.png

Page 13

Other RS Codes in Practice

Most Popular in Practice: Reed-Solomon Codes

Intel & Cloudera (2016) “Progress Report: Bringing Erasure Coding to Apache Hadoop”

Storage System                       Reed-Solomon code
Linux RAID-6                         RS(10,8)
Google File System II (Colossus)     RS(9,6)
Quantcast File System                RS(9,6)
Intel & Cloudera’s HDFS-EC           RS(9,6)
Yahoo Cloud Object Store             RS(11,8)
Backblaze’s online backup            RS(20,17)
Facebook’s f4 BLOB storage system    RS(14,10)
Baidu’s Atlas Cloud Storage          RS(12,8)

H. Dau et al, “Repairing Reed-Solomon Codes with Single and Multiple Erasures,” ITA, 2017, San Diego.

Page 14

Evolution of HDFS to Incorporate EC ⇒ HDFS-EC

1. Typically, EC reduces the storage cost by 50% compared with 3x replication
2. Motivated by this, Cloudera and Intel initiated the HDFS-EC project
3. Targeted for release in Hadoop 3.0
4. Employs a striped layout
5. Possibility of incorporating more sophisticated EC schemes!

Zhe Zhang, Andrew Wang, Kai Zheng, Uma Maheswara G., and Vinayakumar, “Introduction to HDFS Erasure Coding in Apache Hadoop,” September 23, 2015.
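The slide does not spell out the striped layout, so the following is only a generic sketch of striping: the data is cut into small cells that are dealt round-robin over k data units. The function name, the cell size and k = 3 are assumptions chosen for illustration and are not HDFS-EC's actual parameters.

```python
def stripe_cells(data: bytes, k: int, cell_size: int) -> list[list[bytes]]:
    """Distribute fixed-size cells of `data` round-robin over k data units."""
    units: list[list[bytes]] = [[] for _ in range(k)]
    cells = [data[i:i + cell_size] for i in range(0, len(data), cell_size)]
    for index, cell in enumerate(cells):
        units[index % k].append(cell)
    return units

# 24 bytes, 4-byte cells, 3 data units: unit 0 receives cells 0 and 3, and so on.
units = stripe_cells(bytes(range(24)), k=3, cell_size=4)
for unit_index, cells in enumerate(units):
    print(unit_index, [cell.hex() for cell in cells])
```

In a striped scheme, parity is then computed across the cells that sit at the same position in each unit.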

Page 15

Node Failures and the Evolution of Coding Theory

Page 16

Node Failures

An important consideration is how efficiently the EC can handle node failures, as such failures are commonplace.

M. Asteris, D. Papailiopoulos, A. Dimakis, R. Vadali, S. Chen, and D. Borthakur, “XORing elephants: Novel erasure codes for big data,” PVLDB, 2013.

Page 17

RS Codes and Node Failures

Under the conventional approach, RS codes are inefficient in two respects at node repair. In the example Facebook [10, 4] RS code:

1. the amount of data downloaded (repair BW) equals 10 times the amount stored within the failed node
2. also, 10 storage units need to be contacted for repair

There is room for improvement...
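The two costs above follow directly from the fact that conventional repair of one node reads k full units. A small sketch, with a hypothetical node size of 256 MB chosen only to give concrete numbers:

```python
def conventional_rs_repair(k: int, node_size_mb: float) -> tuple[float, int]:
    """Return (download in MB, helpers contacted) for conventional repair of one node."""
    helpers = k                      # any k surviving units are contacted
    download_mb = helpers * node_size_mb  # each helper sends its full contents
    return download_mb, helpers

download_mb, helpers = conventional_rs_repair(k=10, node_size_mb=256)
print(f"download {download_mb:.0f} MB from {helpers} helpers to rebuild 256 MB")
```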

Page 18

Coding Theory Responds

1. Regenerating codes
   - minimize the amount of data download (repair bandwidth) needed for node repair
2. Locally recoverable codes
   - minimize the number of helper nodes contacted for node repair, but also reduce repair bandwidth
3. Novel and efficient approaches to RS repair - a more recent development

[Figure: Regenerating Codes and Codes with Locality]
• Regenerating codes reduce repair bandwidth
• Codes with locality reduce repair degree

A. G. Dimakis, P. B. Godfrey, Y. Wu, M. Wainwright, and K. Ramchandran, “Network Coding for Distributed Storage Systems,” IEEE Trans. Inf. Theory, Sep. 2010.

P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, “On the Locality of Codeword Symbols,” IEEE Trans. Inf. Theory, Nov. 2012.

V. Guruswami, M. Wootters, “Repairing Reed-Solomon Codes,” arXiv:1509.04764 [cs.IT].

Page 19

Some Comments

Regenerating Codes

1. Minimum Storage Regenerating (MSR) Codes are MDS codes
2. Regenerating codes are vector codes; each code symbol is a vector of ℓ symbols
   - ℓ is called the sub-packetization level

Locally Recoverable Codes

1. Locally recoverable codes give up some storage efficiency in exchange for ease of node repair

Fresh approach to RS repair

1. regard RS codes as vector codes
2. minimize repair bandwidth under a constraint on sub-packetization level ℓ

Page 20

Regenerating Codes

Focus here on the subclass of Minimum Storage Regenerating (MSR) Codes

Page 21

RAID Code - Not Very Good at Handling Node Failure

The conventional approach:

Connect to any 2 nodes,

Reconstruct A and B,

Extract A

[Figure: a (4, 2) MDS code, as used in RAID 6. Disks 1 to 4 store A, B, A+B and A+θB. When Disk 1 fails, the new Disk 1 downloads B and A+B and rebuilds A.]

But downloading 2 units of data to revive a node that stores 1 unit of data is clearly wasteful of network bandwidth.
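The conventional repair can be sketched over a small field. The slide does not state the field or the value of θ, so the prime field GF(5) and θ = 2 below are assumptions; any field element other than 0 and 1 would serve as θ here.

```python
from itertools import combinations, product

P = 5      # assumed prime field size
THETA = 2  # assumed value of θ

# Each row gives the coefficients (of A, of B) of the symbol stored on that disk.
ROWS = [(1, 0), (0, 1), (1, 1), (1, THETA)]

def encode(a: int, b: int) -> list[int]:
    """Symbols stored on disks 1..4: A, B, A+B, A+θB."""
    return [(ca * a + cb * b) % P for ca, cb in ROWS]

def decode(i: int, yi: int, j: int, yj: int) -> tuple[int, int]:
    """Recover (A, B) from the symbols on disks i and j by Cramer's rule mod P."""
    (a1, b1), (a2, b2) = ROWS[i], ROWS[j]
    det_inv = pow((a1 * b2 - a2 * b1) % P, -1, P)
    a = ((yi * b2 - yj * b1) * det_inv) % P
    b = ((a1 * yj - a2 * yi) * det_inv) % P
    return a, b

# MDS property: any 2 of the 4 disks determine A and B.
for a, b in product(range(P), repeat=2):
    word = encode(a, b)
    for i, j in combinations(range(4), 2):
        assert decode(i, word[i], j, word[j]) == (a, b)

# Conventional repair of disk 1: download 2 symbols (B and A+B) to rebuild 1 symbol.
word = encode(3, 4)
recovered_a, _ = decode(1, word[1], 2, word[2])
print("rebuilt A =", recovered_a, "after downloading 2 symbols")
```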

Page 22

Replacing the RAID 6 Code with a Regenerating Code

Here, each node now stores two “half-symbols”

We download 3 half-symbols as opposed to 2 full-symbols
- Can recover any of {A1, A2, B1}

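[Figure: Disks 1 and 2 store the half-symbols (A1, A2) and (B1, B2). Disks 3 and 4 hold the parity half-symbols 2A1+2A2+B1, 2A1+4A2+2B1, A2+2B1+2B2 and A2+2B1+4B2. To rebuild Disk 1, the three half-symbols B1, 2A1+2A2+B1 and 2A1+4A2+2B1 are downloaded, and A1 and A2 are recovered from them.]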
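The half-symbol repair can be checked numerically. Two things in this sketch are assumptions rather than facts from the slide: the field (GF(5) is used) and the assignment of the four parity expressions to Disks 3 and 4, which is inferred from the requirement that any two disks recover the data.

```python
from itertools import combinations, product

P = 5  # assumed prime field; the slide does not state the field

def encode(a1: int, a2: int, b1: int, b2: int) -> list[tuple[int, int]]:
    """Return the two half-symbols held on each of the four disks."""
    return [
        (a1, a2),
        (b1, b2),
        ((2 * a1 + 2 * a2 + b1) % P, (a2 + 2 * b1 + 2 * b2) % P),      # Disk 3 (assumed)
        ((2 * a1 + 4 * a2 + 2 * b1) % P, (a2 + 2 * b1 + 4 * b2) % P),  # Disk 4 (assumed)
    ]

def repair_disk1(b1: int, s3: int, s4: int) -> tuple[int, int]:
    """Rebuild (A1, A2) from B1 and the first half-symbols of Disks 3 and 4."""
    u = (s3 - b1) % P       # equals 2*A1 + 2*A2
    v = (s4 - 2 * b1) % P   # equals 2*A1 + 4*A2
    inv2 = pow(2, -1, P)
    a2 = ((v - u) * inv2) % P
    a1 = ((u - 2 * a2) * inv2) % P
    return a1, a2

messages = list(product(range(P), repeat=4))

# Repair of Disk 1 uses 3 half-symbols, i.e. 1.5 units instead of 2.
for msg in messages:
    disks = encode(*msg)
    assert repair_disk1(disks[1][0], disks[2][0], disks[3][0]) == msg[:2]

# MDS check: the contents of any two disks determine the whole message.
for i, j in combinations(range(4), 2):
    views = {(encode(*m)[i], encode(*m)[j]) for m in messages}
    assert len(views) == len(messages)
```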

Page 23

Evolution of MSR Codes

Code                    Explicit  SE    SPL   OA   HN
Product-Matrix          Yes       Low   Low   No   d
Hadamard & Butterfly*   Yes       High  High  No   all
Zig-Zag Code            No        High  High  Yes  all
Sasidharan et al (1)    No        High  Low   Yes  all
Ye-Barg (1)             Yes       High  High  Yes  all
Ye-Barg (2)             Yes       High  Low   Yes  all
Sasidharan et al (2)    Yes       High  Low   No   d

* ⇒ limited to 2 parity nodes

SE ⇒ storage efficiency

SPL ⇒ sub-packetization level

OA ⇒ optimal access (number of symbols accessed for repair)

HN ⇒ number of helper nodes needed

Page 24

References (MSR Codes with High Storage Efficiency)

1. K. V. Rashmi, N. B. Shah, and P. V. Kumar, “Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction,” IEEE Trans. Inf. Theory, Aug. 2011.
2. D. S. Papailiopoulos, A. G. Dimakis, and V. Cadambe, “Repair optimal erasure codes through Hadamard designs,” IEEE Trans. Inf. Theory, May 2013.
3. E. En Gad, R. Mateescu, F. Blagojevic, C. Guyot, and Z. Bandic, “Repair-Optimal MDS Array Codes Over GF(2),” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), 2013.
4. Z. Wang, I. Tamo, and J. Bruck, “Optimal Rebuilding of Multiple Erasures in MDS Codes,” IEEE Trans. Inf. Theory, Feb. 2017.
5. B. Sasidharan, G. K. Agarwal, and P. V. Kumar, “A high-rate MSR code with polynomial sub-packetization level,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), 2015.
6. S. Goparaju, A. Fazeli, and A. Vardy, “Minimum storage regenerating codes for all parameters,” IEEE Trans. Inf. Theory, April 2017.
7. M. Ye and A. Barg, “Explicit constructions of high-rate MDS array codes with optimal repair bandwidth,” IEEE Trans. Inf. Theory, April 2017.
8. M. Ye and A. Barg, “Explicit constructions of optimal-access MDS codes with nearly optimal sub-packetization,” CoRR, vol. abs/1605.08630, 2016.
9. B. Sasidharan, M. Vajha, and P. V. Kumar, “An Explicit, Coupled-Layer Construction of a High-Rate MSR Code with Low Sub-Packetization Level, Small Field Size and d < (n − 1),” CoRR, vol. abs/1701.07447, 2017, to be presented at ISIT 2017.

Page 25

Example Coupled-Layer MSR Code

[Figure: data cube representation of the CL-MSR Code, with axes x, y, z and planes running from z = (0,0,0) to z = (1,1,1); four 16MB chunks are shown above the cube and one point is marked 2MB.]

The cube has:
● 6 columns, each associated to a distinct node
● 8 horizontal planes
● A column has 8 points
● Each point corresponds to 2MB of storage

Our coupled-layer perspective on the Ye-Barg construction (2):

a (4, 2) MSR code

6 nodes, sub-packetization level is ℓ = 8

6 × 8 = 48 points

in the example to follow, each point stores 2MB

1. M. Ye and A. Barg, “Explicit constructions of optimal-access MDS codes with nearly optimal sub-packetization,” May 2016.
2. B. Sasidharan, M. Vajha, and PVK, “An Explicit, Coupled-Layer Construction of a High-Rate MSR Code with Low Sub-Packetization Level, Small Field Size and d < (n − 1),” to be presented at ISIT 2017.

Page 26

Performance of the Coupled-Layer MSR Code

1. A comparison of actual repair time is shown. In the figure,
   - the (6, 4) code is in our present notation a (4, 2) code
   - the (12, 9) code is in our present notation a (9, 3) code
   - the (20, 16) code is in our present notation a (16, 4) code

Page 27

Performance of the Coupled-Layer MSR Code

Similar gains in network bandwidth and disk read

Thus a larger sub-packetization level is not necessarily a problem for implementation

Page 28

Locally Recoverable Codes

Page 29

Windows Azure Storage Coding Solution

[Figure: the Microsoft Azure code beside a Reed-Solomon code. In the Azure code, X1 to X7 with parity PX form the X-code, Y1 to Y7 with parity PY form the Y-code, and there are two further parities P1 and P2. Note inside the figure: “In terms of reliability of data and number of helper nodes contacted for node repair, the two codes are comparable. The overheads are quite different, 29% for the Azure code versus 43% for the RS code. This difference has reportedly saved Microsoft millions of dollars!”]

Comparison: In terms of reliability and number of helper nodes contacted for node repair, the two codes are comparable. The overheads, however, are quite different: 1.29 for the Azure code versus 1.5 for the RS code. This difference has reportedly saved Microsoft millions of dollars.

[Figure: Reed-Solomon codeword X1, X2, X3, X4, X5, X6, P1, P2, P3 (any 6 of 9 can be used to recover the codeword)]

Huang, Simitci, Xu, Ogus, Calder, Gopalan, Li, Yekhanin, “Erasure Coding in Windows Azure Storage,” USENIX, Boston, MA, 2012.
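The quoted overheads can be reproduced from the symbol counts in the figures: 14 data symbols plus 4 parities for the Azure code, and 6 data symbols plus 3 parities for the RS code. The local-repair part of the sketch assumes that PX is the plain XOR of X1 to X7, which is an illustrative choice and not a statement about the production code; the sample values are hypothetical.

```python
from functools import reduce
from operator import xor

def overhead(data_units: int, parity_units: int) -> float:
    """Stored units per unit of user data."""
    return (data_units + parity_units) / data_units

azure_overhead = overhead(14, 4)  # X1..X7, Y1..Y7 plus PX, PY, P1, P2
rs_overhead = overhead(6, 3)      # X1..X6 plus P1, P2, P3
print(f"Azure code: {azure_overhead:.2f}, RS code: {rs_overhead:.2f}")

# Local repair inside the X group, assuming PX = X1 xor ... xor X7.
x_group = [17, 42, 8, 99, 23, 5, 61]  # hypothetical symbol values
px = reduce(xor, x_group)
lost = 2                              # index of the failed symbol
helpers = [v for i, v in enumerate(x_group) if i != lost] + [px]
assert reduce(xor, helpers) == x_group[lost]
print("helpers contacted for local repair:", len(helpers))
```

Seven helpers for the local repair against six for the RS code is what makes the two repair degrees comparable.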

Page 30

Codes with Hierarchical Locality

[4, 3, 2] code ⇒ (3, 1) code

[12, 8, 3] code ⇒ (8, 4) code

[24, 14, 6] code ⇒ (14, 10) code

Codes with hierarchical locality do exactly that by calling for helpfrom an intermediate layer of codes when the local code fails.

These codes may be regarded as the “middle codes”.

B. Sasidharan, G. K. Agarwal, PVK, “Codes With Hierarchical Locality,” arXiv:1501.06683 [cs.IT].

Page 31

Codes with Local Regeneration

Page 32

Codes with Local Regeneration

Regenerating Codes: Minimize repair BW

Codes with Locality: Minimize repair degree

Codes with Local Regeneration: Small repair BW and small repair degree

A single code that has both locality and regeneration properties

and inherent double replication of data

1. G. M. Kamath, N. Prakash, V. Lalitha, PVK, “Codes With Local Regeneration and Erasure Correction,” T-IT, Aug. 2014.

Page 33

An Example Code with Local Regeneration

The construction can make use of an all-symbol local scalar code and is also optimal:

[Figure: three local codes, Local Code 1, 2 and 3. In Local Code i, the symbols 1 to 9 and a parity Pi are stored on five nodes holding {1, 2, 3, 4}, {3, 6, 8, Pi}, {2, 5, 8, 9}, {4, 7, 9, Pi} and {1, 5, 6, 7}, so that each symbol sits on two nodes. The rows (1 2 ... 9 P1), (1 2 ... 9 P2) and (1 2 ... 9 P3) come from a scalar all-symbol locality code.]
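Because every symbol of a local code sits on exactly two nodes, a failed node can be refilled by plain copying. The sketch below checks this for the node contents read off the figure; the dictionary layout and function name are illustrative, and "P" stands for the parity of whichever local code is being considered.

```python
# Node contents of one local code (P is that local code's parity symbol).
NODES = {
    1: {"1", "2", "3", "4"},
    2: {"3", "6", "8", "P"},
    3: {"2", "5", "8", "9"},
    4: {"4", "7", "9", "P"},
    5: {"1", "5", "6", "7"},
}

def repair_by_copy(failed: int) -> dict[int, set[str]]:
    """Map each helper node to the symbols it copies to the replacement node."""
    lost = NODES[failed]
    return {h: NODES[h] & lost for h in NODES if h != failed}

for failed in NODES:
    plan = repair_by_copy(failed)
    # Each of the 4 helpers contributes exactly one symbol...
    assert all(len(symbols) == 1 for symbols in plan.values())
    # ...and together they restore everything the failed node held.
    assert set().union(*plan.values()) == NODES[failed]

print(repair_by_copy(1))
```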

Page 34

Codes with Availability (Recovery from Simultaneous Multiple Erasures)

Page 35

Recovery in Parallel

c11 c12 c13 c14 c15
c21 c22 c23 c24 c25
c31 c32 c33 c34 c35
c41 c42 c43 c44 c45
c51 c52 c53 c54 c55

[Figure: two entries of the array are marked X as erased]

Last column is a parity check on entries to the left in the same row

Last row is a parity check on entries above in the same column

Can recover locally from 2 erasures in parallel
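A minimal sketch of the 5 x 5 array with XOR parities, showing two erasures repaired independently of each other. The data values, the erased positions and the helper names are hypothetical; the slide does not say which two entries are erased.

```python
from functools import reduce
from operator import xor

N = 5  # 4 x 4 data block plus one parity column and one parity row

def encode(data: list[list[int]]) -> list[list[int]]:
    """Append a parity column, then a parity row, using XOR parities."""
    rows = [row + [reduce(xor, row)] for row in data]
    rows.append([reduce(xor, col) for col in zip(*rows)])
    return rows

def recover(code, r, c, erased):
    """Rebuild cell (r, c) from its row if no other erasure lies there, else from its column."""
    others = set(erased) - {(r, c)}
    if all((r, j) not in others for j in range(N)):
        cells = [code[r][j] for j in range(N) if j != c]
    else:
        cells = [code[i][c] for i in range(N) if i != r]
    return reduce(xor, cells)

data = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
original = encode(data)

erased = [(0, 1), (0, 3)]  # two erasures in the same row
damaged = [row[:] for row in original]
for r, c in erased:
    damaged[r][c] = None

# Both repairs read only surviving cells, so they can run in parallel.
repaired = {cell: recover(damaged, *cell, erased) for cell in erased}
assert all(repaired[(r, c)] == original[r][c] for r, c in erased)
```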

Page 36

Codes with Sequential Recovery (Recovery from Simultaneous Multiple Erasures)

Page 37

Sequential Recovery

c11 c12 c13 c14 c15
c21 c22 c23 c24 c25
c31 c32 c33 c34 c35
c41 c42 c43 c44 c45
c51 c52 c53 c54 c55

[Figure: three entries of the array are marked X as erased]

Same code as before

Can recover locally from 3 erasures in a sequential manner

Sequential recovery enables codes with larger storage efficiency
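Sequential recovery can be sketched on the same 5 x 5 XOR-parity array: a cell is repaired only once its row or column has no other erasure, and each repair may unlock the next. The three erased positions below are a hypothetical pattern chosen so that one cell is blocked at first; the slide does not give the actual positions.

```python
from functools import reduce
from operator import xor

N = 5  # 4 x 4 data block plus one parity column and one parity row

def encode(data: list[list[int]]) -> list[list[int]]:
    """Append a parity column, then a parity row, using XOR parities."""
    rows = [row + [reduce(xor, row)] for row in data]
    rows.append([reduce(xor, col) for col in zip(*rows)])
    return rows

def sequential_recover(code, erased):
    """Repair erased cells one at a time; return the order in which they were repaired."""
    erased = set(erased)
    order = []
    while erased:
        for r, c in sorted(erased):
            row_ok = all((r, j) not in erased for j in range(N) if j != c)
            col_ok = all((i, c) not in erased for i in range(N) if i != r)
            if row_ok:
                code[r][c] = reduce(xor, (code[r][j] for j in range(N) if j != c))
            elif col_ok:
                code[r][c] = reduce(xor, (code[i][c] for i in range(N) if i != r))
            else:
                continue  # blocked for now: row and column both hold another erasure
            erased.remove((r, c))
            order.append((r, c))
            break
        else:
            raise ValueError("no erased cell is locally recoverable")
    return order

data = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
original = encode(data)

pattern = [(0, 1), (0, 3), (2, 3)]  # (0, 3) shares its row and its column with an erasure
damaged = [row[:] for row in original]
for r, c in pattern:
    damaged[r][c] = None

order = sequential_recover(damaged, pattern)
assert damaged == original
print("repair order:", order)
```

Cell (0, 3) cannot be repaired in the first step, but becomes repairable as soon as (0, 1) has been restored.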

Page 38

References - Codes for Multiple Erasures

1. A. Wang and Z. Zhang, “Repair locality with multiple erasure tolerance,” IEEE Trans. Inf. Theory, Nov. 2014.
2. N. Prakash, V. Lalitha, and P. V. Kumar, “Codes with locality for two erasures,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), 2014.
3. W. Song and C. Yuen, “Binary locally repairable codes - sequential repair for multiple erasures,” in Proc. IEEE GLOBECOM, 2016.

Page 39

Functioning of an Example, Coupled-Layer MSR Code

Goal: To show that a larger sub-packetization level is not necessarily a problem for implementation

Page 41

[Figure: a 64MB file]

Consider a file of size 64MB

• Will encode via a [k=4, m=2] MSR Code
• Called the Coupled-Layer MSR Code

Page 42

[Figure: the file split into four 16MB chunks]

Step 1: Break file into k = 4 data chunks, each of 16MB.

Page 43

[Figure: the four 16MB chunks above a data cube with axes x, y, z, whose planes run from z = (0,0,0) to z = (1,1,1); one point is marked 2MB]

Data cube representation of CL-MSR Code

The cube has:
● 6 columns, each associated to a distinct node
● 8 horizontal planes
● A column has 8 points
● Each point corresponds to 2MB of storage
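The sizes in the cube follow from simple arithmetic, sketched here. The plane ordering (first coordinate of z varying fastest) and the mapping of a point to a byte offset in the file are assumptions made for illustration; the names are hypothetical.

```python
from itertools import product

MB = 2 ** 20
FILE_SIZE, K, M = 64 * MB, 4, 2

# Planes z = (0,0,0), (1,0,0), (0,1,0), ..., (1,1,1); this ordering is an assumption.
PLANES = [z[::-1] for z in product((0, 1), repeat=3)]

chunk_size = FILE_SIZE // K             # one chunk per systematic node
point_size = chunk_size // len(PLANES)  # one point per (node, plane)
total_points = (K + M) * len(PLANES)

def point_offset(node: int, z: tuple[int, int, int]) -> int:
    """Byte offset in the file of the point at systematic node `node` in plane `z`."""
    return node * chunk_size + PLANES.index(z) * point_size

print(chunk_size // MB, "MB per chunk,", point_size // MB, "MB per point,",
      total_points, "points in the cube")
```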

Pages 44-47

[Figure sequence: the four 16MB chunks are moved one at a time into the data cube (axes x, y, z; planes from z = (0,0,0) to z = (1,1,1))]

Place four 16MB chunks in four systematic nodes

Page 48

We now have the systematic nodes

Page 49

Actual data cube A

We will now compute the parity nodes

Page 50

Virtual data cube B

Actual data cube A

Will get there through an intermediate “Virtual data cube”

Page 51

Start filling the virtual data cube on the right as follows

Page 52

[Figure: two points A1 and A2 of the actual data cube are highlighted]

Certain pairs of points in the cube are “coupled”

Page 53

[Figure: the Coupling Transform takes the pair (A1, A2) to the pair (B1, B2)]

The Coupling Transform is a 2x2 matrix transform

Pages 54-63

[Figure sequence: successive pairs (A1, A2) are passed through the Coupling Transform and the resulting pairs (B1, B2) are placed in the virtual data cube]

Place the points obtained in the Virtual data cube
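The slides say only that the Coupling Transform is an invertible 2x2 matrix transform. The sketch below uses the matrix [[1, γ], [γ, 1]] over the prime field GF(257) with γ = 2; the matrix, the field and the function names are all assumptions for illustration. Any 2x2 matrix with nonzero determinant would give an invertible pairwise transform.

```python
P = 257    # assumed prime field
GAMMA = 2  # assumed coupling coefficient; needs GAMMA**2 != 1 (mod P)

def coupling_transform(a1: int, a2: int) -> tuple[int, int]:
    """Map a coupled pair (A1, A2) of the actual cube to (B1, B2) of the virtual cube."""
    return (a1 + GAMMA * a2) % P, (GAMMA * a1 + a2) % P

def inverse_coupling_transform(b1: int, b2: int) -> tuple[int, int]:
    """Undo coupling_transform: recover (A1, A2) from (B1, B2)."""
    det_inv = pow((1 - GAMMA * GAMMA) % P, -1, P)
    a1 = ((b1 - GAMMA * b2) * det_inv) % P
    a2 = ((b2 - GAMMA * b1) * det_inv) % P
    return a1, a2

b1, b2 = coupling_transform(10, 200)
assert inverse_coupling_transform(b1, b2) == (10, 200)
```

Points that are not paired skip this step and are copied to the virtual cube unchanged.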

Pages 64-65

[Figure: unpaired points are copied from the actual data cube to the virtual data cube]

Red dotted points are not paired; they are simply carried over

Page 66

[Figure: virtual data cube with axes x, y, z and planes from z = (0,0,0) to z = (1,1,1)]

We now have the data part of the Virtual data cube

Page 67: Codes for Big Data: Erasure Coding for Distributed … for Big Data: Erasure Coding for Distributed Storage P. Vijay Kumar Professor, Department of Electrical Communication Engineering

Z"="(0,0,0)"

Each plane is Reed-Solomon coded to obtain parity points

Page 68: Codes for Big Data: Erasure Coding for Distributed … for Big Data: Erasure Coding for Distributed Storage P. Vijay Kumar Professor, Department of Electrical Communication Engineering

Z"="(0,0,0)"

RS Encode

Each plane is Reed-Solomon coded to obtain parity points

Page 69: Codes for Big Data: Erasure Coding for Distributed … for Big Data: Erasure Coding for Distributed Storage P. Vijay Kumar Professor, Department of Electrical Communication Engineering

Z"="(0,0,0)"

RS Encode

Each plane is Reed-Solomon coded to obtain parity points

Page 70: Codes for Big Data: Erasure Coding for Distributed … for Big Data: Erasure Coding for Distributed Storage P. Vijay Kumar Professor, Department of Electrical Communication Engineering

Z"="(0,0,0)"

Each plane is Reed-Solomon coded to obtain parity points

Page 71: Codes for Big Data: Erasure Coding for Distributed … for Big Data: Erasure Coding for Distributed Storage P. Vijay Kumar Professor, Department of Electrical Communication Engineering

Z"="(1,0,0)"

RS Encode

Each plane is Reed-Solomon coded to obtain parity points

Page 72: Codes for Big Data: Erasure Coding for Distributed … for Big Data: Erasure Coding for Distributed Storage P. Vijay Kumar Professor, Department of Electrical Communication Engineering

Z"="(0,1,0)"

RS Encode

Each plane is Reed-Solomon coded to obtain parity points

Page 73: Codes for Big Data: Erasure Coding for Distributed … for Big Data: Erasure Coding for Distributed Storage P. Vijay Kumar Professor, Department of Electrical Communication Engineering

Z"="(1,1,0)"

RS Encode

Each plane is Reed-Solomon coded to obtain parity points

Z = (0,0,1)

RS Encode

Each plane is Reed-Solomon coded to obtain parity points

Z = (1,0,1)

RS Encode

Each plane is Reed-Solomon coded to obtain parity points

Z = (0,1,1)

RS Encode

Each plane is Reed-Solomon coded to obtain parity points

Z = (1,1,1)

RS Encode

Each plane is Reed-Solomon coded to obtain parity points
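
As a rough illustration of the per-plane step, the sketch below lists the eight planes in the order the slides step through them and RS-encodes the points of one plane. The field GF(257), the evaluation points, the names lagrange_eval and rs_encode, and the choice of 4 data plus 2 parity points per plane are all assumptions made for this sketch; practical systems usually work over GF(2^8).

```python
from itertools import product

P = 257  # small prime field, assumed purely for illustration


def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique polynomial through the points (xs[i], ys[i]), mod P."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the inverse of den modulo the prime P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total


def rs_encode(data, n):
    """Systematic RS encoding: data are evaluations at 0..k-1, parity at k..n-1."""
    k = len(data)
    xs = list(range(k))
    parity = [lagrange_eval(xs, data, x) for x in range(k, n)]
    return list(data) + parity


# Plane indices Z in the order used on the slides (first coordinate changes fastest).
planes = [tuple(reversed(z)) for z in product((0, 1), repeat=3)]

# Encode one plane holding 4 hypothetical data symbols into 6 points.
codeword = rs_encode([1, 2, 3, 4], n=6)
```

In a full encoder the same rs_encode call would be repeated once for each entry of planes.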

Virtual data cube B

Now we have the complete Virtual data cube

Virtual data cube B

Parity points of Actual data cube can now be computed

Virtual data cube B

B1

B2

Perform decoupling

Virtual data cube B

B1

B2

B1 B2

A1 A2

Inverse Coupling Transform

Perform decoupling

A1 A2

Virtual data cube B

B1

B2

Perform decoupling

A1 A2

Virtual data cube B

B1

B2

A1

A2

Perform decoupling

Virtual data cube B

B1

B2

A1 A2

Perform decoupling

Virtual data cube B

B1

B2

A1 A2

Perform decoupling

A1

A2

Virtual data cube B

B1

B2

A1 A2

Perform decoupling

A1

A2

Virtual data cube B

B1

B2

Perform decoupling

A1

A2
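
The decoupling step acts on one pair of points at a time, so it can be sketched as a 2x2 linear map and its inverse. The slides do not give the matrix here; the symmetric matrix [[1, GAMMA], [GAMMA, 1]], the field GF(257), the value GAMMA = 2 and the names couple and decouple are assumptions for illustration only. Any GAMMA with GAMMA^2 != 1 keeps the map invertible.

```python
P = 257     # prime field, assumed for illustration
GAMMA = 2   # hypothetical coupling coefficient; needs GAMMA**2 != 1 (mod P)


def couple(a1, a2):
    """Coupling: map an actual pair (A1, A2) to the virtual pair (B1, B2)."""
    b1 = (a1 + GAMMA * a2) % P
    b2 = (GAMMA * a1 + a2) % P
    return b1, b2


def decouple(b1, b2):
    """Inverse coupling transform: recover (A1, A2) from (B1, B2)."""
    det_inv = pow((1 - GAMMA * GAMMA) % P, P - 2, P)  # 1 / (1 - GAMMA^2) mod P
    a1 = (b1 - GAMMA * b2) * det_inv % P
    a2 = (b2 - GAMMA * b1) * det_inv % P
    return a1, a2


b1, b2 = couple(10, 20)
a1, a2 = decouple(b1, b2)  # round trip returns the original pair
```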

Virtual data cube B

B1

B2

Copy

Red dotted points are simply carried over
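
Which points are paired and which are carried over can be sketched as follows. This assumes the pairing rule usually given for Clay codes, which is not stated in this part of the slides: the point (x, y) in plane Z is unpaired when x equals the y-th coordinate of Z, and otherwise its partner sits in column y of the plane obtained by overwriting that coordinate with x. The function name companion and the tuple layout are hypothetical.

```python
def companion(x, y, z):
    """Return the point paired with (x, y) in plane z, or None if it is unpaired.

    x is the row, y the column, z the plane index (a tuple with one entry per column).
    """
    if z[y] == x:
        return None  # unpaired ("red dotted") point: copied between cubes unchanged
    z_partner = list(z)
    z_partner[y] = x  # partner plane differs from z only in coordinate y
    return (z[y], y, tuple(z_partner))


# In plane Z = (0,0,0), the point (x=1, y=0) is paired with a point in plane (1,0,0).
partner = companion(1, 0, (0, 0, 0))
```

Applying companion twice returns the starting point, which is what allows the transform to be applied pair by pair.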

Decoupling

Coupling

Actual and Virtual data cubes

Virtual data cube B

Actual data cube A

The encoding is now complete!

Problem of Node Repair: One node fails

For this example, only half of the planes participate in repair

● Total helper data = 2 MB × 4 × 5 = 40 MB

● As opposed to an RS code = 16 MB × 4 = 64 MB

● Much larger savings seen for m > 2
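
The comparison above can be checked with a few lines of arithmetic. The reading of the factors (2 MB per point, 4 participating planes, 5 helper nodes, and 4 full 16 MB node reads for RS) is taken from the figures on the slide; the function names are hypothetical.

```python
def coupled_repair_mb(point_mb, planes_used, helpers):
    """Helper data when each helper sends one point from each participating plane."""
    return point_mb * planes_used * helpers


def rs_repair_mb(node_mb, nodes_read):
    """Helper data when RS repair reads the full contents of nodes_read nodes."""
    return node_mb * nodes_read


coupled = coupled_repair_mb(point_mb=2, planes_used=4, helpers=5)
rs = rs_repair_mb(node_mb=16, nodes_read=4)
saving = 1 - coupled / rs  # fraction of repair traffic saved relative to RS
```

With the slide's numbers this gives 40 MB against 64 MB, a reduction of 37.5% in data downloaded for repair.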

Coupling

Couple points

Coupling

RS Dec

Run RS decoding on each of the selected planes

Coupling

RS Dec

RS Dec

Run RS decoding on each of the selected planes

Coupling

RS Dec

RS Dec

RS Dec

Run RS decoding on each of the selected planes

Coupling

RS Dec

RS Dec

RS Dec

RS Dec

Run RS decoding on each of the selected planes
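
A possible way to pick the planes that take part in repair is sketched below. It assumes the selection rule usually given for Clay codes, which is not spelled out on these slides: if the failed node sits at row x0 and column y0, the participating planes are those whose y0-th coordinate equals x0. The function name repair_planes and the defaults q = 2, t = 3 (matching the eight planes Z in {0,1}^3) are assumptions.

```python
from itertools import product


def repair_planes(x0, y0, q=2, t=3):
    """Planes used to repair the node at (x0, y0): those with z[y0] == x0."""
    return [z for z in product(range(q), repeat=t) if z[y0] == x0]


# Failed node assumed at row 0, column 0 of the node grid.
selected = repair_planes(0, 0)
```

For q = 2 this selects 4 of the 8 planes, consistent with the four RS decoding runs shown and with only half of the planes participating.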

Half of the required points have now been computed

Remaining points are computed by coupling transform

Replacement Node

Contents of the failed node are now completely recovered

Replacement Node

Node Repair done: system back to original state!

Thanks!
