Replica Placement Strategy for Wide-Area Storage Systems Byung-Gon Chun and Hakim Weatherspoon RADS Final Presentation December 9, 2004

Dec 22, 2015


Replica Placement Strategy for Wide-Area Storage Systems

Byung-Gon Chun and Hakim Weatherspoon

RADS Final Presentation

December 9, 2004


Environment
• Store large quantities of data persistently and availably
• Storage Strategy
  – Redundancy: duplicate data to protect against data loss
  – Place data throughout the wide area for availability and durability
    • Avoid correlated failures
  – Continuously repair lost redundancy as needed
    • Detect permanent node failures and trigger data recovery


Assumptions
• Data is maintained on nodes, in the wide area, and in well-maintained sites.
• Sites contribute resources
  – Nodes (storage, CPU)
  – Network (bandwidth)
• Nodes collectively maintain data
  – Adaptive: constant change, self-organizing, self-maintaining
• Costs
  – Data Recovery
    • Process of maintaining data availability
  – Limit wide-area bandwidth used to maintain data


Challenge

• Avoiding correlated failures/downtime with careful data placement
  – Minimize cost of resources used to maintain data
    • Storage
    • Bandwidth
  – Maximize
    • Data availability


Outline
• Analysis of correlated failures
  – Show that correlated failures exist and are significant
• Effects of common subnet (admin area, geographic location, etc.)
  – Pick a threshold and extra redundancy
• Effects of extra redundancy
  – Vary extra redundancy
  – Compare random, random w/ constraint, and oracle placement
  – Show that margin between oracle and random is small


Analysis of PlanetLab Trace characteristics

• Trace-driven simulation
• Model maintaining data on PlanetLab
• Create trace using all-pairs ping*
  – Collected from February 16, 2003 to October 6, 2004
• Measure
  – Correlated failures v. time
  – Probability of k nodes down simultaneously
  – {5th Percentile, Median} number of available replicas v. time
  – Cumulative number of triggered data recovery v. time

*Jeremy Stribling http://infospect.planet-lab.org/pings
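As an illustration of the "probability of k nodes down simultaneously" measurement, here is a minimal sketch. It assumes the availability trace has been reduced to one row per time slot with an up/down flag per node; that layout, the function name `down_count_distribution`, and the toy trace are hypothetical and do not come from the slides or the actual ping data.

```python
from collections import Counter

def down_count_distribution(trace):
    """Return {k: fraction of time slots in which exactly k nodes were down}.

    trace: list of time slots; each slot is a list of booleans, one per node
    (True = node up, False = node down). This layout is an assumption.
    """
    counts = Counter(sum(1 for up in slot if not up) for slot in trace)
    total = len(trace)
    return {k: c / total for k, c in sorted(counts.items())}

# Hypothetical 4-slot trace over 3 nodes.
trace = [
    [True, True, True],
    [True, False, True],
    [False, False, True],
    [True, True, True],
]
dist = down_count_distribution(trace)
```

Comparing such an empirical distribution against what independent failures would predict is one way to see whether simultaneous downtime is correlated.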


Analysis of PlanetLab II Correlated failures


Analysis I - Node characteristics


Analysis II - Correlated Failures


Correlated Failures


Correlated Failures (machines with downtime <= 1000 slots)


Availability Trace


Replica Placement Strategies

• Random
• RandomSite
  – Avoid placing multiple replicas in the same site
  – A site in PlanetLab is identified by a 2-byte IP address prefix.
• RandomBlacklist
  – Avoid using machines on the blacklist, i.e. the top k machines with the longest downtime
• RandomSiteBlacklist
  – Combine RandomSite and RandomBlacklist
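The four strategies above can be sketched as a single random picker with two optional constraints. This is a minimal illustration, not the authors' simulator: the helper names (`site_of`, `top_k_downtime`, `place`), the dotted-quad node identifiers, and the sample downtime figures are all hypothetical.

```python
import random

def site_of(ip):
    """Site key: the first two octets (2-byte prefix) of an IPv4 address."""
    return ".".join(ip.split(".")[:2])

def top_k_downtime(downtime, k):
    """Blacklist: the k nodes with the longest total downtime.

    downtime: {ip: number of slots the node was down} (hypothetical input).
    """
    return frozenset(sorted(downtime, key=downtime.get, reverse=True)[:k])

def place(nodes, n, rng, avoid_same_site=False, blacklist=frozenset()):
    """Pick n replica holders at random under optional constraints.

    Random:              defaults
    RandomSite:          avoid_same_site=True
    RandomBlacklist:     blacklist=<set of nodes>
    RandomSiteBlacklist: both
    """
    candidates = [ip for ip in nodes if ip not in blacklist]
    rng.shuffle(candidates)
    chosen, used_sites = [], set()
    for ip in candidates:
        if avoid_same_site and site_of(ip) in used_sites:
            continue  # another replica already lives in this site
        chosen.append(ip)
        used_sites.add(site_of(ip))
        if len(chosen) == n:
            return chosen
    raise ValueError("not enough eligible nodes for the constraints")

# Hypothetical node set and downtime counts.
nodes = ["10.1.0.1", "10.1.0.2", "10.2.0.1", "10.3.0.1", "10.4.0.1"]
downtime = {"10.1.0.1": 5, "10.1.0.2": 900, "10.2.0.1": 10,
            "10.3.0.1": 20, "10.4.0.1": 1200}
bl = top_k_downtime(downtime, 2)
replicas = place(nodes, 3, random.Random(0), avoid_same_site=True, blacklist=bl)
```

In this toy input the blacklist removes the two worst nodes and the site constraint forces the three replicas onto three distinct prefixes.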


Comparison of simple strategies (m=1, th=9, n=14, |blacklist|=35)

Strategy              # of repairs   Improvement (%)
Random                9075           -
RandomSite            8581           5.44
RandomBlacklist       8691           4.23
RandomSiteBlacklist   8160           10.08
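The improvement figures are consistent with the relative reduction in repairs compared with Random. The slide does not state the formula, so the calculation below is an inference from the numbers, shown as a short check.

```python
# Repair counts taken from the comparison slide.
repairs = {
    "Random": 9075,
    "RandomSite": 8581,
    "RandomBlacklist": 8691,
    "RandomSiteBlacklist": 8160,
}

baseline = repairs["Random"]

# Assumed formula: percentage reduction in repairs relative to Random.
improvement = {
    name: round(100 * (baseline - count) / baseline, 2)
    for name, count in repairs.items()
    if name != "Random"
}
```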


Simulation setup
• Placement Algorithm
  – Random vs. Oracle
  – Oracle strategies
    • Max-Lifetime-Availability
    • Min-Max-TTR, Min-Sum-TTR, Min-Mean-TTR
• Simulation Parameters
  – Replication m = 1, threshold th = 9, total replicas n = 15
  – Initial repository size 2TB
  – Write rate 1Kbps per node and 10Kbps per node
    • 300 storage nodes
    • System increases in size at rate of 3TB and 30TB per year, respectively.
• Metrics
  – Number of available nodes
  – Number of data repairs
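The slides do not spell out the repair rule. One reading consistent with the parameters is that a repair is triggered whenever fewer than th replicas are available, and new replicas are then created until n are available again. The sketch below counts repairs under that assumption; the function name `simulate_repairs`, the set-based trace format, and the choice to drop unavailable holders at repair time are all hypothetical simplifications.

```python
import random

def simulate_repairs(trace, placement, th, n, rng):
    """Count threshold-triggered repairs for one object.

    trace:     list of time slots; each slot is the set of node ids that are up
    placement: initial set of node ids holding a replica
    th, n:     repair threshold and target number of replicas
    """
    holders = set(placement)
    repairs = 0
    for up in trace:
        available = holders & up
        if len(available) < th:
            # Assumption: holders that are down at repair time are treated as
            # permanently failed and replaced by randomly chosen up nodes.
            spare = sorted(up - holders)
            need = n - len(available)
            new = rng.sample(spare, min(need, len(spare)))
            holders = available | set(new)
            repairs += 1
    return repairs

# Hypothetical toy run: 6 nodes, th=2, n=3.
trace = [
    {0, 1, 2, 3, 4, 5},  # all replicas available
    {0, 3, 4, 5},        # nodes 1 and 2 down: 1 < th, repair triggered
    {0, 3, 4, 5},        # replicas restored on up nodes, no repair
]
repair_count = simulate_repairs(trace, {0, 1, 2}, th=2, n=3, rng=random.Random(0))
```

With th well below n, short outages are absorbed by the extra replicas and only deeper losses trigger recovery traffic.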


Comparison of simple strategies (m=1, th=9)


Results - Random Placement (1Kbps)


Results - Oracle Max-Lifetime-Avail (1Kbps)


Results - Breakdown of Random (1Kbps)


Results - Random (10Kbps)


Results - Breakdown of Random (10Kbps)


Conclusion

• Correlated downtimes do exist.
• Random is sufficient
  – A minimum data availability threshold and extra redundancy are sufficient to absorb most correlation.