15-447 Computer Architecture Fall 2008 © November 12, 2007 Nael Abu-Ghazaleh naelag@cmu.edu http://www.qatar.cmu.edu/~msakr/15447-f08 Lecture 24 Disk IO and RAID CS 15-447: Computer Architecture

Dec 21, 2015

November 12, 2007
Nael Abu-Ghazaleh
naelag@cmu.edu
http://www.qatar.cmu.edu/~msakr/15447-f08

Lecture 24
Disk IO and RAID

CS 15-447: Computer Architecture


Interfacing Processor with peripherals

[Figure: block diagram. Labels: Processor (L1 instruction cache, L1 data cache, L2 cache, bus interface); front side bus, aka system bus; I/O bridge; memory bus; main memory; to I/O.]


Another view


Disk Access

• Seek: position head over the proper track (5 to 15 ms avg.)

• Rotate: wait for desired sector (on average half a rotation, i.e. 0.5 / RPM minutes). RPM currently 5,400 to 15,000

• Transfer: get the data (30-100 MB/s)

[Figure: disk geometry. Labels: platters, tracks, sectors.]
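The three components above add up to the time for one disk access. The following is a minimal sketch of that arithmetic; the function names and the sample request (4 KB, 10 ms seek, 7,200 RPM, 50 MB/s) are illustrative assumptions, chosen from within the ranges on the slide.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return 0.5 * ms_per_revolution


def access_time_ms(seek_ms, rpm, request_bytes, transfer_mb_per_s):
    """Seek + average rotational latency + transfer time, in milliseconds."""
    transfer_ms = request_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotational_latency_ms(rpm) + transfer_ms


# Hypothetical request: 4 KB read, 10 ms seek, 7,200 RPM, 50 MB/s.
total = access_time_ms(seek_ms=10, rpm=7200, request_bytes=4096, transfer_mb_per_s=50)
print(f"total access time: {total:.2f} ms")
```

For a small request like this one, seek and rotation dominate and the transfer itself takes well under a millisecond.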


Manufacturing Advantages of Disk Arrays

Disk Product Families (low end to high end)

• Conventional: 4 disk designs (14”, 10”, 5.25”, 3.5”)

• Disk Array: 1 disk design (3.5”)


RAID: Redundant Array of Inexpensive Disks

• RAID 0: Striping (misnomer: non-redundant)
• RAID 1: Mirroring
• RAID 2: Striping + Error Correction
• RAID 3: Bit striping + Parity Disk
• RAID 4: Block striping + Parity Disk
• RAID 5: Block striping + Distributed Parity
• RAID 6: Multiple parity checks


Non-Redundant Array

• Striped: write sequential blocks across disk array

• High performance
• Poor reliability:

MTTF_Array = MTTF_Disk / N
MTTF_Disk = 50,000 hours (about 6 years)
N = 70 disks
MTTF_Array = 50,000 / 70 ≈ 700 hours (about 1 month)

[Figure: two striped disks, one holding the odd blocks and the other the even blocks.]
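The reliability estimate above can be reproduced in a few lines. This is a sketch under the same simplifying assumption the formula makes: disks fail independently at a constant rate, and the loss of any one disk loses the array. The function name is illustrative.

```python
def array_mttf_hours(disk_mttf_hours, num_disks):
    """MTTF of a non-redundant array: any single disk failure is fatal."""
    return disk_mttf_hours / num_disks


hours = array_mttf_hours(disk_mttf_hours=50_000, num_disks=70)
print(f"array MTTF: {hours:.0f} hours, about {hours / 24:.0f} days")
```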


Redundant Arrays of Disks

• Files are "striped" across multiple spindles

• Redundancy yields high data availability

• When disks fail, contents are reconstructed from data redundantly stored in the array

• High reliability comes at a cost:
– Reduced storage capacity
– Lower performance


RAID 1: Mirroring

• Each disk is fully duplicated onto its “shadow”: very high availability

• Bandwidth sacrifice on writes: logical write = two physical writes

• Reads may be optimized

• Most expensive solution: 100% capacity overhead

Used in high I/O rate, high availability environments


RAID 3: bit striping + parity

• A parity bit for every bit in the striped data

• Parity is relatively easy to compute

• How does it perform for small reads/writes?


Redundant Arrays of Disks RAID 3: Parity Disk

[Figure: a logical record (10010011 11001101 10010011 . . .) is striped into physical records across the disks, with a parity disk P. Values shown on the disks: 10010011, 11001101, 10010011, 00110000.]

• Parity computed across recovery group to protect against hard disk failures
– 33% capacity cost for parity in this configuration
– Wider arrays reduce capacity costs, decrease expected availability, increase reconstruction time

• Arms logically synchronized, spindles rotationally synchronized
– Logically a single high capacity, high transfer rate disk

Targeted for high bandwidth applications: Scientific, Image Processing
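Parity across a recovery group is the bitwise XOR of the data on each disk, which is why a single failed disk can be rebuilt from the survivors. The sketch below shows this on one-byte blocks; the helper names `parity` and `reconstruct` are illustrative, and the three sample bytes are the logical-record values from the slide.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity_block):
    """Rebuild the single missing block from the survivors plus parity."""
    return parity(list(surviving_blocks) + [parity_block])


data = [bytes([0b10010011]), bytes([0b11001101]), bytes([0b10010011])]
p = parity(data)

# Pretend the second disk failed and rebuild its contents.
rebuilt = reconstruct([data[0], data[2]], p)
print(rebuilt == data[1])
```

This only recovers from one lost disk per recovery group; losing two leaves the XOR equation with two unknowns.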


RAID 4 (Block interleaved parity)


Redundant Arrays of Disks RAID 5+: High I/O Rate Parity

A logical write becomes four physical I/Os

Independent writes possible because of interleaved parity

Reed-Solomon Codes ("Q") for protection during reconstruction

D0 D1 D2 D3 P

D4 D5 D6 P D7

D8 D9 P D10 D11

D12 P D13 D14 D15

P D16 D17 D18 D19

D20 D21 D22 D23 P

[Figure labels: Disk Columns; Increasing Logical Disk Addresses; Stripe; Stripe Unit]

Targeted for mixed applications
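The layout in the table above can be expressed as a small mapping from logical block number to (stripe, disk column), and the "four physical I/Os" of a small write follow from updating parity incrementally. This is a sketch that assumes exactly the five-disk rotation pictured (parity starts in the last column and moves one column left per stripe); the function names are illustrative.

```python
def parity_column(stripe, num_disks=5):
    """Column holding parity for a stripe, rotating right to left."""
    return (num_disks - 1) - (stripe % num_disks)


def raid5_location(block, num_disks=5):
    """Map a logical data block number to (stripe, disk column)."""
    stripe, offset = divmod(block, num_disks - 1)
    p = parity_column(stripe, num_disks)
    column = offset if offset < p else offset + 1  # skip over the parity column
    return stripe, column


def xor_blocks(*blocks):
    """Bytewise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Small write: read old data, read old parity, write new data, write new parity.
stripe = [b"\x0f", b"\x33", b"\x55", b"\xf0"]
old_parity = xor_blocks(*stripe)
new_d1 = b"\xaa"
new_parity = xor_blocks(old_parity, stripe[1], new_d1)

print(raid5_location(7), raid5_location(10))
```

Because the new parity depends only on the old data, the old parity, and the new data, writes to different stripes touch different disks and can proceed independently.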


Nested RAID levels

• RAID 01 and 10 combine mirroring and striping
– Combine high performance (striping) and reliability (mirroring)
– Get reliability without having to compute parities: higher performance and less complex controller

• RAID 05 and 50 (also called 53)


Operating System can help (1) Reducing access time

• Disk defragmentation: why does that work?

• Disk scheduling: operating system can reorder requests
– How does it work? Reduce seek time

• Example: Mean seek distance first, Elevator algorithm, Typewriter algorithm
– Let's do an example

• Log structured file systems
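To make the scheduling bullet concrete, here is a sketch that reorders a queue of track requests and compares total head travel. It makes several assumptions: the first policy is implemented as the standard shortest-seek-first, the "typewriter" algorithm is interpreted as a one-directional sweep that jumps back to the lowest pending track, the elevator reverses at the last request rather than at the disk edge, and the request queue and starting track are hypothetical.

```python
def shortest_seek_first(head, requests):
    """Always serve the pending request closest to the current head position."""
    pending, order = list(requests), []
    while pending:
        nxt = min(pending, key=lambda track: abs(track - head))
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order


def elevator(head, requests):
    """Sweep upward from the head, then reverse and sweep downward."""
    up = sorted(t for t in requests if t >= head)
    down = sorted((t for t in requests if t < head), reverse=True)
    return up + down


def typewriter(head, requests):
    """Sweep upward, jump back to the lowest track, sweep upward again."""
    up = sorted(t for t in requests if t >= head)
    wrapped = sorted(t for t in requests if t < head)
    return up + wrapped


def seek_distance(head, order):
    """Total number of tracks the head crosses serving requests in this order."""
    total = 0
    for track in order:
        total += abs(track - head)
        head = track
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical pending tracks
start = 53
for name, policy in [("arrival order", lambda h, r: list(r)),
                     ("shortest seek first", shortest_seek_first),
                     ("elevator", elevator),
                     ("typewriter", typewriter)]:
    print(name, seek_distance(start, policy(start, queue)))
```

Shortest-seek-first travels least on this queue, but it can starve requests far from the head, which is one reason the sweep-based policies exist.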


Log structured file systems

• Idea: most reads to disk are serviced from cache – locality!

• But what about writes? They have to go to disk: if the system crashes before they do, the file system is compromised

• How can we make updates perform better:
– Save them in a log (sequentially) instead of their original location; why does that help?
– Tricky to manage
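One way to see why sequential logging helps is to compare head travel for the same updates written in place versus appended to a log. The sketch below is a toy model, not a file system: block addresses, the log start address, and the function names are hypothetical, and head travel stands in for seek cost.

```python
def head_travel(addresses, start):
    """Total distance the head moves visiting addresses in order."""
    travel, head = 0, start
    for address in addresses:
        travel += abs(address - head)
        head = address
    return travel


def append_to_log(updates, log_head):
    """Place each update at the next free log slot.

    Returns the addresses written and an index from block number to the
    log address of its newest copy.
    """
    index, addresses = {}, []
    for block in updates:
        index[block] = log_head  # a later write supersedes an earlier one
        addresses.append(log_head)
        log_head += 1
    return addresses, index


updates = [900, 15, 480, 15, 730]  # hypothetical block numbers being updated
in_place = head_travel(updates, start=1000)
log_addresses, index = append_to_log(updates, log_head=1000)
logged = head_travel(log_addresses, start=1000)
print(in_place, logged)
```

The index and the stale copy of block 15 left behind in the log hint at why this is tricky to manage: reads must go through the index, and superseded entries have to be cleaned up eventually.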


Operating System can help (2) Reliability

• RAID protects against disk failures, not CPU failures or software bugs
– If the CPU writes corrupt data to all redundant disks, what can we do?

• Backups
• Reliability in the operating system


How are files allocated on disk?

• Index block, has pointers to the other blocks in the file

• Alternatives: linked allocation

• Data and metadata are both stored on disk

• What do we do for bigger files?


Unix Inodes
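The usual answer to the bigger-files question is to let some index entries point at further index blocks (indirect blocks). The sketch below computes the largest file such a scheme can address. The defaults (4 KB blocks, 4-byte pointers, 12 direct pointers, up to three levels of indirection) are assumptions modeled on classic Unix layouts, not values taken from the slide.

```python
def max_file_bytes(block_size=4096, pointer_size=4, direct=12, indirect_levels=3):
    """Largest file addressable by direct pointers plus indirect blocks.

    Level 1 is a single indirect block, level 2 a double indirect block,
    and so on; each level multiplies reach by the pointers per block.
    """
    pointers_per_block = block_size // pointer_size
    blocks = direct + sum(pointers_per_block ** level
                          for level in range(1, indirect_levels + 1))
    return blocks * block_size


for levels in range(4):
    print(levels, max_file_bytes(indirect_levels=levels))
```

Small files pay nothing for the extra levels, since only the direct pointers are used until the file outgrows them.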


Disk reliability

• Any update to disk changes both data and metadata
– Requires several writes

• Operating system may reorder them, as we saw

• What happens if there is a crash?
– Let's look at examples

• Solution: journaling file system
– Update journal before updating file system
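The journaling idea can be sketched as a write-ahead log: record every intended write plus a commit marker first, and on recovery replay only transactions whose commit marker made it to the journal. This is a toy in-memory model under that assumption; the class, its methods, and the block names are hypothetical and do not correspond to any real file system's interface.

```python
class JournaledStore:
    def __init__(self):
        self.journal = []  # stands in for the on-disk journal
        self.disk = {}     # stands in for data and metadata blocks

    def log_update(self, txn_id, writes, crash_before_commit=False):
        """Record the intended writes, then a commit marker."""
        for block, value in writes.items():
            self.journal.append(("write", txn_id, block, value))
        if not crash_before_commit:
            self.journal.append(("commit", txn_id))

    def recover(self):
        """After a crash: replay committed transactions, ignore the rest."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "write" and rec[1] in committed:
                _, _, block, value = rec
                self.disk[block] = value


store = JournaledStore()
store.log_update(1, {"inode 7": "size=2", "data 40": "hello"})
store.log_update(2, {"inode 7": "size=3"}, crash_before_commit=True)
store.recover()
print(store.disk)
```

Transaction 2 never committed, so recovery leaves the inode at its last consistent state instead of applying a metadata change whose data write was lost.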


Flash Memory

• Emerging technology for non-volatile storage – competitor to hard disks, especially for embedded market
– Can be used as cache for the disk (much larger than RAM disks for the same price, and persistent)

• Floating gate transistors: semiconductor technology (like microprocessors and memory) – we know how to build them big (or small!) and cheap
– Faster, lower power than disk drives
– ...but still more expensive, and has some limitations

• Two types of flash memory: NAND and NOR


NOR Flash

• NOR is accessed like regular memory and has faster read time
– Used for executables/firmware that don't need to change often (code for PDAs, cellphones, etc.)
– Can be executed in place

• Bad write/erase performance (2 seconds to erase a block!)

• Bad wear properties (100,000 writes average lifetime)


NAND Flash

• Accessed like a block device (like a disk drive)
– Higher density, lower cost

• Faster write/erase time; longer write life expectancy

• Well suited for cameras, mp3 players, USB drives...

• Less reliable than NOR (requires error correction codes)


Different properties from Disks

• Flash memory has quite different properties from disks
– Emphasis on seek time gone

• Needs to erase a segment before writing (small writes are expensive!)
– Slow... (especially NOR erase/write and NAND random access reads)
– Must be done in large segments (10s of KBytes)
– Can only be rewritten a limited number of times
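The cost of small writes follows from the erase-before-write rule: changing a few bytes in place means erasing and rewriting every segment the update touches. The sketch below quantifies this under simplifying assumptions (the update starts on a segment boundary, and nothing remaps writes elsewhere); the function names, the 64 KB segment size, and the erase rate are illustrative, while the 100,000-write limit is the figure quoted for NOR earlier in the lecture.

```python
def write_amplification(update_bytes, erase_block_bytes):
    """Bytes physically rewritten per byte logically updated, in place."""
    blocks_touched = -(-update_bytes // erase_block_bytes)  # ceiling division
    return blocks_touched * erase_block_bytes / update_bytes


def block_lifetime_days(erase_limit, erases_per_day):
    """Days until one segment reaches its rewrite limit at a steady rate."""
    return erase_limit / erases_per_day


print(write_amplification(512, 64 * 1024))        # small update, 64 KB segment
print(write_amplification(64 * 1024, 64 * 1024))  # full-segment update
print(block_lifetime_days(100_000, 1_000))        # hypothetical 1,000 erases/day
```

The gap between the first two numbers is the case for batching small writes into segment-sized ones, and the last one shows why repeatedly rewriting the same segment wears it out quickly.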


Summary of Flash circa 2006