Disk Arrays COEN 180

Disk Arrays COEN 180. Large Storage Systems Collection of disks to store large amount of data. Performance advantage: Each drive can satisfy only so many.

Mar 29, 2015


Mateo Ferrer
Page 1

Disk Arrays

COEN 180

Page 2

Large Storage Systems

A collection of disks that stores a large amount of data.

Performance advantage:
    Each drive can satisfy only so many I/Os per second.
    Data spread across more drives is more accessible.

JBOD: Just a Bunch Of Disks

Page 3

Large Storage Systems

Principal difficulty: reliability. Data needs to be stored redundantly:

Mirroring, replication
    Simple
    Expensive (double, triple, … storage costs)
    Good performance

Erasure correcting codes
    Complex
    Save storage
    Moderate performance

Page 4

Large Storage Systems

Mirrored Disks

Used by Tandem
    1970 – 1997, bought by Compaq
    NonStop architecture
    Used redundancy (CPU, storage) for fail-over capacity

Data is replicated on both drives.

Performance:
    Writes: as fast as in the single-disk model.
    Reads: slightly faster, since we can serve the read from the drive with the best expected service time.

Page 5

Disk Performance Modeling Basics

Service Time: Time to satisfy a request if system is otherwise idle.

Response Time: Time to satisfy a request at a given system load. Response time = service time + waiting time

Utilization: Fraction of time the system is busy.

Page 6

Disk Performance Modeling Basics

M/M/1 queue
    Single server
    Assume Poisson arrivals and exponential service times
    Arrival rate λ
    Service time S
    Utilization U = λS (Little's law)
    Response time R

Determine R by: R = S + UR
    (an arriving request finds on average λR requests in the system, each needing service time S)

R = S/(1 − U) = S/(1 − λS)

[Figure: response time R as a function of utilization U, plotted for S = 1 (hence U = λ).]
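The formula R = S/(1 − U) can be tabulated directly. The following is a minimal sketch; the function name and the sample arrival rates are illustrative choices, not part of the slides.

```python
def mm1_response_time(arrival_rate: float, service_time: float) -> float:
    """Mean response time R = S / (1 - U) of an M/M/1 queue, with U = arrival_rate * S."""
    utilization = arrival_rate * service_time
    if not 0 <= utilization < 1:
        raise ValueError("unstable queue: utilization must lie in [0, 1)")
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    S = 1.0  # service time of 1, so the utilization equals the arrival rate
    for lam in (0.2, 0.4, 0.6, 0.8):
        print(f"U = {lam:.1f}  R = {mm1_response_time(lam, S):.2f}")
```

The response time grows without bound as U approaches 1, which is why heavily utilized disks respond slowly.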

Page 7

Disk Performance Modeling Basics

Need to determine service time of disk request.

Service time = seek time + latency + transfer time

Industrial (but wrong) determination:
    Seek time = time to travel one third of a disk.
    Why?

Page 8

Disk Performance Modeling Basics

Assume that the head is positioned on a random track.

Assume that the target track is another random track.

Given x ∈ [0,1], calculate D(x) = average distance of a random point in [0,1] from x.

Page 9

Disk Performance Modeling Basics

Given x ∈ [0,1], calculate D(x) = average distance of a random point in [0,1] from x.

$$D(x) = \int_0^1 |x - y|\,dy = \int_0^x (x - y)\,dy + \int_x^1 (y - x)\,dy = \frac{x^2}{2} + \frac{(1 - x)^2}{2} = \frac{2x^2 - 2x + 1}{2}$$

[Figure: D(x) for x in [0,1]; D(x) equals 0.5 at both ends and reaches its minimum of 0.25 at x = 0.5.]

Page 10

Disk Performance Modeling Basics

Now calculate the average distance from a random point to a random point in [0,1]

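$$D = \int_0^1 D(x)\,dx = \int_0^1 \left(x^2 - x + \frac{1}{2}\right) dx = \left[\frac{x^3}{3} - \frac{x^2}{2} + \frac{x}{2}\right]_0^1 = \frac{1}{3}$$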
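The one-third result can be checked numerically. This sketch integrates D(x) = x² − x + 1/2 with the midpoint rule and also simulates random head and target positions; the function names, step count, sample count, and seed are arbitrary choices made for illustration.

```python
import random


def mean_distance_from(x: float) -> float:
    """D(x) = x^2 - x + 1/2: mean distance from x to a uniform point in [0, 1]."""
    return x * x - x + 0.5


def average_seek_distance(steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of D(x) over [0, 1]."""
    h = 1.0 / steps
    return sum(mean_distance_from((i + 0.5) * h) for i in range(steps)) * h


def simulated_seek_distance(samples: int = 200_000, seed: int = 1) -> float:
    """Average |head - target| for independent uniform head and target positions."""
    rng = random.Random(seed)
    return sum(abs(rng.random() - rng.random()) for _ in range(samples)) / samples


if __name__ == "__main__":
    print(average_seek_distance())    # close to 1/3
    print(simulated_seek_distance())  # close to 1/3, up to sampling noise
```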

Page 11

Disk Performance Modeling Basics

Is Average Seek Time = Seek Time for Average Distance?

NO: Seek time does not depend linearly on seek distance.

A seek consists of:
    acceleration
    cruising (if the seek distance is long)
    braking
    exact positioning

Page 12

Disk Performance Modeling Basics

Is Average Seek Time = Seek Time for Average Distance?

Practical measurements suggest that seek time grows roughly as the square root of the seek distance.

[Figure: seek time as a function of seek distance (horizontal axis 2 to 10, vertical axis 1 to 4).]
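To see why the average seek time differs from the seek time for the average distance, take a purely hypothetical model in which seek time equals the square root of the distance (units and the constant of proportionality are ignored). For independent uniform head and target positions, the distance d has density 2(1 − d) on [0, 1], so the average seek time can be integrated numerically and compared with the seek time at the average distance 1/3.

```python
import math


def seek_time(distance: float) -> float:
    """Hypothetical model: seek time proportional to the square root of distance."""
    return math.sqrt(distance)


def average_seek_time(steps: int = 100_000) -> float:
    """E[seek_time(d)] for d = |x - y| with x, y uniform on [0, 1].

    The distance d has density 2 * (1 - d); integrate with the midpoint rule.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        d = (i + 0.5) * h
        total += seek_time(d) * 2.0 * (1.0 - d)
    return total * h


if __name__ == "__main__":
    print("average seek time:             ", average_seek_time())
    print("seek time for average distance:", seek_time(1.0 / 3.0))
```

Under this model the average seek time (8/15 in closed form) is smaller than the seek time for the average distance, because the square root is concave.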

Page 13

Disk Performance Modeling Basics

Rules of Thumb

Keep utilization of disks between 50% and 80%.

Page 14

Disk Arrays

Dealing with reliability: RAID
    Redundant array of inexpensive (independent) disks

RAID Levels
    RAID Level 0: JBOD (striping)
    RAID Level 1: Mirroring
    RAID Level 2:
        Encodes symbols (bytes) with a Hamming code.
        Stores each bit of a symbol on a different disk.
        Not used in practice.

Page 15

Disk Arrays

Dealing with reliability: RAID Levels

RAID Level 3:
    Encodes symbols (bytes) with the simple parity code.
    Breaks a file up into n stripes.
    Calculates the parity stripe.
    Stores all n + 1 stripes on n + 1 disks.

Page 16

Disk Arrays

Dealing with Reliability: RAID Levels

RAID Level 4
    Maintains n data drives.
    Files are stored completely on one drive.
        Or perhaps in stripes if files become very large.
    An additional drive stores the byte-wise parity of the data drives.

[Figure: one parity drive next to three data drives.]

Page 17

Disk Arrays

Level 4 RAID

Uneven load of parity drive and data drives

Page 18

Disk Arrays

Dealing with Reliability: RAID Level 5
    No dedicated parity disk.
    Data is stored in blocks.
    Blocks in parallel positions on the disks form a reliability stripe.
    One block in each reliability stripe is the parity of the others.

No performance bottleneck

Page 19

Disk Arrays

Dealing with Reliability: RAID Level 6
    Like RAID Level 5, but every stripe has two parity blocks.
    Lower write performance
    2-failure resilience

RAID Level 7
    Proprietary name for a RAID Level 3 with lots of caching. (Marketing bogus)

Page 20

Disk Arrays

Disk Array Operations

Reads:
    Directly from data in RAID Levels 3-6.

Writes:
    Large writes: write to all blocks in a single reliability stripe.
        Calculate the parity from the data and write it.
    Small writes: need to maintain parity.
        Option 1: Write the data, then read all other data blocks in the stripe and recalculate the parity.
        Option 2: Read the old data, then overwrite it. Calculate the difference (XOR) between old and new data. Then read the old parity, XOR it with that difference, and overwrite the parity block with the result.
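The two small-write options can be compared on an in-memory stripe. This is a toy sketch: blocks are Python `bytes` of equal length, and all function names are made up for the example rather than taken from any RAID implementation.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_of(blocks) -> bytes:
    """Parity block of a stripe: XOR of all its data blocks."""
    return reduce(xor_blocks, blocks)


def small_write_recompute(stripe, index, new_data):
    """Option 1: write the data, then recompute parity from all data blocks."""
    updated = list(stripe)
    updated[index] = new_data
    return updated, parity_of(updated)


def small_write_delta(stripe, parity, index, new_data):
    """Option 2: new parity = old parity XOR (old data XOR new data)."""
    delta = xor_blocks(stripe[index], new_data)
    updated = list(stripe)
    updated[index] = new_data
    return updated, xor_blocks(parity, delta)


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = parity_of(stripe)
    _, p1 = small_write_recompute(stripe, 1, b"\xaa\x55")
    _, p2 = small_write_delta(stripe, parity, 1, b"\xaa\x55")
    print(p1 == p2)  # both options yield the same parity block
```

Option 2 touches only the data block and the parity block, regardless of how many disks the stripe spans.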

Page 21

Disk Arrays

Disk Array Operations

Reconstruction (RAID Levels 4-5):
    Systematically:
        Reconstruct only lost data.
        Read all surviving blocks in the reliability stripe.
        Calculate their parity. This is the lost data block.
        Write data block in place of parity.
    Out-of-order reconstruction for data that is being read.
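Reconstruction of a lost block is a single XOR over the survivors of its reliability stripe. The sketch below assumes in-memory `bytes` blocks of equal length; the names are illustrative only.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def reconstruct(surviving_blocks) -> bytes:
    """XOR of all surviving blocks of a stripe (data and parity) gives the lost block."""
    return reduce(xor_blocks, surviving_blocks)


if __name__ == "__main__":
    d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
    parity = reconstruct([d0, d1, d2])  # the parity block is the XOR of the data blocks
    # Suppose the disk holding d1 fails: rebuild it from the other blocks.
    print(reconstruct([d0, d2, parity]) == d1)
```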

Page 22

Disk Arrays Performance Analysis

Assume that read and write service times are the same:
    seek + latency (+ transfer)

A small write involves a read-modify-write operation.
    It takes about twice as long as the read / write service time:
    seek + latency + transfer + two latencies + transfer

Page 23

Disk Arrays

Performance Analysis: Level 4 RAID
    Offered read load r
    Offered write load w
    n disks

Utilization at data disk: r S/(n − 1) + w 2S/(n − 1)

Utilization at parity disk: w 2S

Equal utilization only if r = 2(n − 2) w
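The two utilization formulas and the balance condition r = 2(n − 2)w can be evaluated directly. In this sketch the loads are in IO/sec and the service time in seconds; the function name and the sample values (5 disks, 10 msec service time, 60 reads and 10 writes per second) are illustrative assumptions.

```python
def raid4_utilization(read_load: float, write_load: float,
                      service_time: float, n_disks: int):
    """Return (data disk utilization, parity disk utilization) for a Level 4 RAID.

    n_disks counts all disks, so there are n_disks - 1 data disks.
    Each small write costs 2 * service_time on a data disk and on the parity disk.
    """
    data_disks = n_disks - 1
    data = (read_load * service_time + write_load * 2 * service_time) / data_disks
    parity = write_load * 2 * service_time
    return data, parity


if __name__ == "__main__":
    n, S = 5, 0.010
    w = 10.0
    r = 2 * (n - 2) * w  # the balance condition, here 60 reads per second
    print(raid4_utilization(r, w, S, n))        # both utilizations coincide
    print(raid4_utilization(70.0, 30.0, S, n))  # 70% reads: the parity disk is busier
```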

Page 24

Disk Arrays

Performance Analysis: Level 4 RAID

Offered load λ. Assume only small writes. Assume a read ratio of ρ (the fraction of requests that are reads).

Utilization at data disk S/n

Utilization at parity (write) disk: (1 − ρ) 2λS

[Figure: utilization of the parity disk and of a data disk as a function of offered load (IO/sec), from 0 to 500.]

Parameters: 4+1 layout, 70% reads, service time 10 msec

Page 25

Disk Arrays

Performance Analysis: RAID Level 5
    Offered load λ
    Read ratio ρ
    n disks

Read load: ρλS/n

Write load: (1 − ρ) λ 4S/n
    Every write leads to two read-modify-write operations.
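The per-disk utilization of a Level 5 RAID is the sum of the read load and the write load, with each small write costing 4S in total (two read-modify-write operations of about 2S each). The sketch below evaluates it for a 4+1 layout with 70% reads and a 10 msec service time; the function names are made up for illustration.

```python
def raid5_utilization(offered_load: float, read_ratio: float,
                      service_time: float, n_disks: int) -> float:
    """Per-disk utilization: reads cost S, small writes cost 4S, spread over n disks."""
    read_part = read_ratio * offered_load * service_time / n_disks
    write_part = (1.0 - read_ratio) * offered_load * 4.0 * service_time / n_disks
    return read_part + write_part


def raid5_saturation_load(read_ratio: float, service_time: float, n_disks: int) -> float:
    """Offered load (IO/sec) at which the per-disk utilization reaches 1."""
    cost_per_request = (read_ratio + 4.0 * (1.0 - read_ratio)) * service_time
    return n_disks / cost_per_request


if __name__ == "__main__":
    print(raid5_utilization(100.0, 0.7, 0.010, 5))  # utilization at 100 IO/sec
    print(raid5_saturation_load(0.7, 0.010, 5))     # load at which disks saturate
```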

Page 26

Disk Arrays

Level 4 RAID vs Level 5 RAID

[Figure: two plots (horizontal axis 100 to 500, vertical axis 0.2 to 1) with curves labeled "parity drive", "data drive", "RAID Level 5", and "Without parity disk (JBOD)".]

Parameters: 4+1 layout, 70% reads, service time 10 msec

Page 27

Disk Arrays

Performance
    Small writes are expensive.

Parity logging (Daniel Stodolsky, Garth Gibson, Mark Holland)
    Write operation: read old data, write new data, send XOR to a parity log file.
    Whenever the parity log file becomes too big, process it by updating the parity information.
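The idea of deferring parity updates can be illustrated with a toy, in-memory model of a single reliability stripe. This is only a sketch of the steps listed above, not the published parity logging design: the class name, the `log_limit` threshold, and keeping the log as a Python list are all assumptions made for the example.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_of(blocks) -> bytes:
    """XOR of all blocks."""
    return reduce(xor_blocks, blocks)


class ParityLoggedStripe:
    """Toy model: one stripe whose parity updates are logged and applied in batches."""

    def __init__(self, data_blocks, log_limit: int = 4):
        self.data = list(data_blocks)
        self.parity = parity_of(self.data)
        self.log = []               # pending XOR differences (hypothetical log file)
        self.log_limit = log_limit  # hypothetical "too big" threshold

    def write(self, index: int, new_data: bytes) -> None:
        old = self.data[index]                        # read old data
        self.data[index] = new_data                   # write new data
        self.log.append(xor_blocks(old, new_data))    # send XOR to the parity log
        if len(self.log) >= self.log_limit:
            self.apply_log()

    def apply_log(self) -> None:
        """Process the log: fold every logged difference into the parity block."""
        for delta in self.log:
            self.parity = xor_blocks(self.parity, delta)
        self.log.clear()
```

Between log runs the stored parity is stale, so a real system would have to consult the log (or process it first) before reconstructing a lost block.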

Page 28

Disk Arrays

Reliability

Accurately given by the probability of failure at every moment in time.

[Figure: reliability plotted over time (horizontal axis 5 to 30, vertical axis 0.4 to 1).]

Page 29

Disk Arrays

Reliability

Often given by the Mean Time To Data Loss (MTTDL).

Warning: MTTDL numbers can be deceiving.

The red line is more reliable during the design life, but has a lower MTTDL.

Page 30

Disk Arrays

Use a Markov model to model the system in various states.
    States describe the system.
    Assumes constant transition rates.
    Transitions correspond to:
        component failure
        component repair

Page 31

Disk Arrays

One component system

[Diagram: Initial State → Failure State (absorbing), with failure rate λ.]

MTTDL = MTTF = 1/λ

Page 32

Disk Arrays

Two component system without repair

[Diagram: Initial State (2 components working) → (rate 2λ) → 1 component working, one failed → (rate λ) → Failure State (absorbing).]

Page 33

Disk Arrays

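Two component system with repair

[Diagram: Initial State (2 components working) → (rate 2λ) → 1 component working, one failed → (rate λ) → Failure State (absorbing). A repair transition (rate μ) leads from "1 component working, one failed" back to the initial state.]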

Page 34

Disk Arrays

How to calculate MTTF
    Start with the original Markov model.
    Remove the failure state.
    Replace the transition(s) to the failure state with failure transitions to the initial state.
        This models a meta-system where we replace a failed system immediately with a new one.
    Now calculate the steady-state solution of the Markov model.
        It typically has become ergodic.
    Use this to calculate the average rate at which a failure transition is taken. The inverse of this rate is the MTTF.

Page 35

Disk Arrays

One component system

Initial State

System in initial state all the time.

Failure transition taken at rate λ.

"Loss rate" L = λ.

MTTDL = 1/L = 1/λ

Page 36

Disk Arrays

Two component system without repair

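[Diagram: state 2 (initial state, 2 components working) → (rate 2λ) → state 1 (1 component working, one failed); the failure transition (rate λ) leads from state 1 back to state 2.]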

Steady-state solution

Let x be the probability to be in state 2, y the probability to be in state 1.

Then:

Inflow into state 2 = Outflow from state 2:

λy = 2λx, that is, 2x = y

Total sum of probabilities is 1:

x+y = 1.

Page 37

Disk Arrays

Two component system without repair

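[Diagram: state 2 (initial state, 2 components working) → (rate 2λ) → state 1 (1 component working, one failed); the failure transition (rate λ) leads from state 1 back to state 2.]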

Steady-state solution

2x = y

x+y = 1.

Solution is:

x = 1/3, y = 2/3.

Loss rate is L = λy = (2/3)λ.

MTTF = 1/L = 1.5 (1/λ).

(1.5 times better than before).

Page 38

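Disk Arrays

Two component system with repair

[Diagram: state 2 (initial state, 2 components working) → (rate 2λ) → state 1 (1 component working, one failed); the failure transition (rate λ) and the repair transition (rate μ) both lead from state 1 back to state 2.]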

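Steady-state solution, with x the probability of state 2, y the probability of state 1, and μ the repair rate:

$$2\lambda x = (\lambda + \mu)\,y, \qquad x + y = 1$$

$$x = \frac{\lambda + \mu}{3\lambda + \mu}, \qquad y = \frac{2\lambda}{3\lambda + \mu}$$

$$L = \lambda y = \frac{2\lambda^2}{3\lambda + \mu}$$

$$\mathrm{MTTF} = \frac{1}{L} = \frac{3\lambda + \mu}{2\lambda^2}$$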
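The steady-state method for the two component system can be carried out with exact fractions. In this sketch the balance equation 2λx = (λ + μ)y and x + y = 1 are solved in closed form; the function name and the sample rates are illustrative, and a repair rate of 0 gives the system without repair.

```python
from fractions import Fraction


def two_component_mttf(failure_rate, repair_rate=0) -> Fraction:
    """MTTF of a two component system via the steady state of the meta-system."""
    lam = Fraction(failure_rate)
    mu = Fraction(repair_rate)
    # Balance at state 2: 2*lam*x = (lam + mu)*y, together with x + y = 1.
    y = 2 * lam / (3 * lam + mu)  # probability of "1 component working, one failed"
    x = 1 - y                     # probability of "2 components working"
    assert 2 * lam * x == (lam + mu) * y
    loss_rate = lam * y           # rate at which the failure transition is taken
    return 1 / loss_rate


if __name__ == "__main__":
    print(two_component_mttf(1))      # without repair, in units of 1/lambda
    print(two_component_mttf(1, 97))  # with a repair rate 97 times the failure rate
```

Fast repair dominates the result: once μ is much larger than λ, the MTTF is approximately μ/(2λ²).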

Page 39

Disk Arrays

RAID Level 4/5 Reliability

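[Diagram: Initial State (n disks) → (rate nλ) → n − 1 disks → (rate (n − 1)λ) → Failure State (absorbing). A repair transition (rate μ) leads from "n − 1 disks" back to the initial state.]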

Page 40

Disk Arrays

RAID Level 6 Reliability

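[Diagram: Initial State (n disks) → (rate nλ) → n − 1 disks → (rate (n − 1)λ) → n − 2 disks → (rate (n − 2)λ) → Failure State (absorbing). Repair transitions lead from "n − 1 disks" back to "n disks" (rate μ) and from "n − 2 disks" back to "n − 1 disks" (rate 2μ).]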
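For chains like these, the MTTDL can also be computed as the mean time to reach the absorbing failure state. The sketch below uses a different technique from the steady-state method: it adds up the expected first-passage times from each state to the next. The function names are made up, the sample rates are arbitrary, and the repair rates (μ, and 2μ with two disks down) are read off the diagrams as an assumption.

```python
from fractions import Fraction


def mttdl(failure_rates, repair_rates) -> Fraction:
    """Mean time from state 0 to the failure state in a chain 0 -> 1 -> ... -> failure.

    failure_rates[i]: rate from state i to the next state (the last one leads to failure).
    repair_rates[i]:  rate from state i back to state i - 1 (entry 0 is ignored).
    """
    total = Fraction(0)
    prev_step = Fraction(0)  # expected time to get from state i - 1 to state i
    for i, fail in enumerate(failure_rates):
        repair = Fraction(repair_rates[i]) if i > 0 else Fraction(0)
        # A repair sends the system one state back, so that step must be redone.
        step = (1 + repair * prev_step) / Fraction(fail)
        total += step
        prev_step = step
    return total


def raid5_mttdl(n, lam, mu) -> Fraction:
    """RAID Level 4/5: data is lost when a second disk fails before repair."""
    return mttdl([n * lam, (n - 1) * lam], [0, mu])


def raid6_mttdl(n, lam, mu) -> Fraction:
    """RAID Level 6: data is lost when a third disk fails before repair."""
    return mttdl([n * lam, (n - 1) * lam, (n - 2) * lam], [0, mu, 2 * mu])


if __name__ == "__main__":
    lam, mu = Fraction(1, 100), Fraction(1)  # arbitrary sample rates
    print(raid5_mttdl(5, lam, mu))
    print(raid6_mttdl(5, lam, mu))
```

For the two component system this reproduces (3λ + μ)/(2λ²), and for RAID Level 4/5 it gives ((2n − 1)λ + μ)/(n(n − 1)λ²).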

Page 41

Disk Arrays

Sparing
    Create more resilience by adding a hot spare.
    Failover to the hot spare reconstructs the contents of the lost disk on the spare disk.

Distributed sparing (Menon et al.):
    Distribute the spare space throughout the disk array.