Disk Scrubbing in Large Archival Storage Systems Thomas Schwarz, S.J. 1,2 Qin Xin 1,3 , Ethan Miller 1 , Darrell Long 1 , Andy Hospodor 1,2 , Spencer Ng 3 1 Storage Systems Resource Center, U. of California, Santa Cruz 2 Santa Clara University, Santa Clara, CA 3 Hitachi Global Storage Technologies, San Jose Research Center,

Dec 18, 2015

Disk Scrubbing in Large Archival Storage Systems

Thomas Schwarz, S.J. (1,2), Qin Xin (1,3), Ethan Miller (1), Darrell Long (1), Andy Hospodor (1,2), Spencer Ng (3)

1 Storage Systems Resource Center, U. of California, Santa Cruz
2 Santa Clara University, Santa Clara, CA
3 Hitachi Global Storage Technologies, San Jose Research Center

Introduction

Large archival storage systems:
  Protect data more proactively.
  Keep disks powered off for long periods of time.
  Have a low rate of data access.
  Protect data by storing it redundantly.

Introduction

Failures can happen
  At the block level.
  At the device level.

Failures may remain undetected for long periods of time.

A failure may unmask one or more additional failures.
  The reconstruction procedure accesses data on other devices.
  Those devices may have suffered previous failures.

Introduction

We investigate the efficacy of disk scrubbing.

Disk scrubbing accesses a disk to see whether the data can still be read.
  Reading a single block shows that the device still works.
  Reading all blocks shows that we can read all the data on the disk.

Contents

1. Disk Failure Taxonomy
2. System Overview
3. Disk Scrubbing Modeling
4. Power Cycles and Reliability
5. Optimal Scrubbing Interval
6. Simulation Results

Disk Failure Taxonomy

Disk Blocks
  A 512B sector uses error control coding.
  A read to a block either
    successfully corrects all errors, or
    retries and then
      flags the block as unreadable, or
      misreads the block.

Disk Failure Rates
  Depend highly on
    Environment: temperature, vibrations, air quality.
    Age.
    Vintage.

Disk Failure Taxonomy

Block failure rate estimate:

Since:
  1/3 of all field returns for server drives are due to hard errors.
  RAID users (90%) do not return drives with hard errors.
  10% of all disks sold account for 1/3 of all errors.

Hence:
  Mean time between block failures is 3/10 of the MTBF of all disk failures.
  Mean time to disk failure is 3/2 of the MTBF.
  A drive rated at 1 million hours has
    3 × 10^5 hours mean time between block failures.
    1.5 × 10^6 hours mean time between disk failures.

This is a back-of-the-envelope calculation based on numbers from one anonymous disk manufacturer. The results seem to be accepted by many.
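
The arithmetic behind these figures can be checked with a short sketch. It assumes one reading of the slide: the hard-error returns (1/3 of all returns) come only from the 10% of drives that are not in RAID systems, so the underlying block failure rate is ten times the observed hard-error return rate. The function and variable names are illustrative and not taken from the slides.

```python
from fractions import Fraction


def split_rated_mtbf(rated_mtbf_hours):
    """Split a rated MTBF into block-level and device-level mean times.

    Assumption (one reading of the slide): the rated MTBF reflects field
    returns, 1/3 of which are hard (block) errors reported by only the
    10% of drives that are not in RAID systems.
    """
    hard_error_share = Fraction(1, 3)  # share of returns due to hard errors
    reporting_share = Fraction(1, 10)  # share of drives returned on hard errors

    return_rate = Fraction(1, rated_mtbf_hours)  # returns per hour
    # Scale the observed hard-error rate up to the whole drive population.
    block_rate = return_rate * hard_error_share / reporting_share
    # The remaining returns are whole-device failures.
    device_rate = return_rate * (1 - hard_error_share)
    return 1 / block_rate, 1 / device_rate


block_mtbf, device_mtbf = split_rated_mtbf(1_000_000)
print(float(block_mtbf), float(device_mtbf))  # 300000.0 1500000.0
```

Exact fractions are used so that the 3/10 and 3/2 factors come out without rounding error.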

System Overview

Disks are powered down when not in use.

Use an m+k redundancy scheme:
  Store data in large blocks.
  m blocks are grouped into an r-group.
  Add k parity blocks to the r-group.

Small blocks lead to fast reconstruction and good reconstruction load distribution.

Large blocks have slightly better reliability.

System Overview

Disk Scrubbing

Scrub an S-block:
  Can read one block: the device has not failed.
  Can read all blocks: all data can be accessed.
  Can read and verify all blocks: the data can be read correctly.
    Use "algebraic signatures" for that.
    Can even verify that parity data accurately reflects client data.
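
The three levels of scrubbing can be sketched as follows. This is a minimal illustration, not the authors' implementation: the S-block is modeled as an in-memory byte stream, the helper names are hypothetical, and CRC32 stands in for the algebraic signatures named on the slide.

```python
import io
import zlib

BLOCK_SIZE = 512  # bytes; matches the 512B sector size mentioned earlier


def read_block(dev, index, block_size=BLOCK_SIZE):
    """Read one block; return None if the read fails or comes up short."""
    try:
        dev.seek(index * block_size)
        data = dev.read(block_size)
    except OSError:
        return None
    return data if len(data) == block_size else None


def scrub(dev, n_blocks, expected_crcs=None):
    """Return indices of blocks that are unreadable or fail verification.

    expected_crcs is an optional list of stored CRC32 values. CRC32 is only
    a stand-in here; the slides use algebraic signatures instead.
    """
    bad = []
    for i in range(n_blocks):
        data = read_block(dev, i)
        if data is None:
            bad.append(i)  # level 2: the block cannot be read
        elif expected_crcs is not None and zlib.crc32(data) != expected_crcs[i]:
            bad.append(i)  # level 3: the block reads but its content is wrong
    return bad


# Toy S-block of four blocks held in memory (hypothetical test data).
blocks = [bytes([i]) * BLOCK_SIZE for i in range(4)]
crcs = [zlib.crc32(b) for b in blocks]
blocks[2] = b"\xff" * BLOCK_SIZE  # silent corruption of block 2
dev = io.BytesIO(b"".join(blocks))

print(read_block(dev, 0) is not None)  # level 1: does the device respond?
print(scrub(dev, 4))                   # read-only scrub of all blocks
print(scrub(dev, 4, crcs))             # scrub with verification
```

In this toy run the read-only scrub reports nothing, because the corrupted block is still readable; only the verifying scrub flags it. That is the case the third level is meant to catch.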

System Overview

If a bad block is detected, we usually can reconstruct its contents with parity / mirrored data.

Scrubbing finds the error before it can hurt you.
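
Reconstruction from parity can be illustrated with the simplest case, a single XOR parity block (k = 1). This is a sketch under that assumption only; a general m+k scheme needs an erasure code, which is not shown, and the block contents and function names are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(r_group, bad_index):
    """Rebuild one lost block of an m+1 r-group from the surviving blocks."""
    survivors = [b for i, b in enumerate(r_group) if i != bad_index]
    return xor_blocks(survivors)


data = [b"arch", b"ival", b"data"]   # m = 3 hypothetical data blocks
r_group = data + [xor_blocks(data)]  # append the k = 1 parity block

rebuilt = reconstruct(r_group, bad_index=1)
print(rebuilt == data[1])
```

Reconstruction reads every surviving block of the r-group, which is why a latent failure on one of those blocks turns a single failure into data loss.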

Modeling Scrubbing

Random Scrubbing: Scrub an S-block at random times (exponentially distributed intervals between scrubs).

Deterministic Scrubbing: Scrub an S-block at regular intervals.

Modeling Scrubbing: Opportunistic Scrubbing

[Figure: probability of a failed block (y-axis, 0 to 0.1) versus MTBA (x-axis, 0 to 9) for opportunistic, deterministic, and random scrubbing.]

Opportunistic Scrubbing:
  Try to scrub when you access the disk anyway.
  "Piggyback scrubs on disk accesses."
  Efficiency depends on the frequency of accesses.

MTBA: Mean Time Between Accesses (10^3 hours).

Average scrub interval: 10^4 hours.

Block MTBF: 10^5 hours.
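
For the random and deterministic policies, the probability of finding a block in a failed state can be estimated with a simple textbook model. The sketch assumes that blocks fail at a constant rate and that a scrub detects and repairs a failure instantly; it may not match the authors' exact formulas, and it does not model opportunistic scrubbing. The parameters are the scrub interval and block MTBF listed above.

```python
import math


def p_failed_random(block_mtbf, scrub_interval):
    """Random scrubbing: exponentially distributed times between scrubs.

    Steady state of a two-state chain: good -> failed at rate lam,
    failed -> good (scrubbed and repaired) at rate mu.
    """
    lam = 1.0 / block_mtbf
    mu = 1.0 / scrub_interval
    return lam / (lam + mu)


def p_failed_deterministic(block_mtbf, scrub_interval):
    """Deterministic scrubbing: average of 1 - exp(-lam * t) over one interval."""
    x = scrub_interval / block_mtbf
    return 1.0 - (1.0 - math.exp(-x)) / x


BLOCK_MTBF = 1e5      # hours
SCRUB_INTERVAL = 1e4  # hours

p_rand = p_failed_random(BLOCK_MTBF, SCRUB_INTERVAL)
p_det = p_failed_deterministic(BLOCK_MTBF, SCRUB_INTERVAL)
print(round(p_rand, 4), round(p_det, 4))  # 0.0909 0.0484
```

Under these assumptions, deterministic scrubbing leaves a block in the failed state roughly half as often as random scrubbing with the same mean interval, because a fixed schedule bounds how long any failure can stay hidden.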

Power Cycling and Reliability

Turning a disk on or off has a significant impact.
  Even if disks move actuators away from the surface (laptop disks).

No direct data to measure impact of Power On Hours (POH).

Extrapolate from Seagate data:
  One on/off cycle is roughly equivalent to running a disk for eight hours.
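
The eight-hour rule of thumb can be turned into a small helper that charges each power cycle as extra running time. The function name and the monthly schedule in the example are hypothetical.

```python
HOURS_PER_POWER_CYCLE = 8.0  # from the slide: one on/off cycle ~ eight running hours


def equivalent_power_on_hours(running_hours, power_cycles):
    """Running time plus the wear attributed to power cycles, in hours."""
    return running_hours + HOURS_PER_POWER_CYCLE * power_cycles


# Hypothetical schedule: a disk is woken once a month for a 4-hour scrub.
per_year = equivalent_power_on_hours(running_hours=12 * 4, power_cycles=12)
print(per_year)  # 144.0
```

In this example the power cycles account for twice as much equivalent wear as the scrubbing itself.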

Determining Scrubbing Intervals

Interval too short:
  Too much traffic.
  Disks busy.
  Increased error rate.
  Lower system MTBF.

Interval too long:
  A failure is more likely to unmask other failures.
  More failures are catastrophic.
  Lower system MTBF.

Determining Scrubbing Intervals

Mirrored reliability block

N = 250 disks.

Device failure rate: one failure per 5 × 10^5 hours.

Block failure rate: 10^-5 per hour.

Time to read disk: 4 hours.

Deterministic: without considering power-up effects.

Deterministic with cycling: considering power-up effects.

Opportunistic does not pay power-on penalty, but runs disk longer.

Random does not pay power-on penalty. Random with cycling would be below the deterministic with cycling graph.

Determining Scrubbing Intervals

Scrub frequently: You never know what you might find.

Mirrored disks using opportunistic scrubbing (no power-on penalty).

Assumes a high disk access rate.

Simulation Results

1PB archival data store.
  Disks have an MTBF of 10^5 hours.
  10,000 disk drives.
  10GB reliability blocks.
  ~1TB/day traffic.

Simulation Results

Two-way Mirroring

Simulation Results

RAID 5 redundancy scheme

Simulation Results

Mirroring.

Opportunistic scrubbing with ~ three disk accesses per year.

Observe that additional scrubbing leads to more power-on cycles that slightly increase occurrence of data losses.

Conclusions

We have shown that disk scrubbing is a necessity for very large scale storage systems.

Our simulations show the impact of power-on / power-off on reliability.

We also note that lack of numbers on disk drive reliability prevents public research.