Page 1: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell

CS352H: Computer Systems Architecture

Topic 13: I/O Systems
November 3, 2009

Page 2: CS352H: Computer Systems Architecture

Introduction

I/O devices can be characterized by
Behavior: input, output, storage
Partner: human or machine
Data rate: bytes/sec, transfers/sec

I/O bus connections

Page 3: CS352H: Computer Systems Architecture

I/O System Characteristics

Dependability is important
Particularly for storage devices

Performance measures
Latency (response time)
Throughput (bandwidth)

Desktops & embedded systems
Mainly interested in response time & diversity of devices

Servers
Mainly interested in throughput & expandability of devices

Page 4: CS352H: Computer Systems Architecture

Dependability

Fault: failure of a component
May or may not lead to system failure

State diagram on the slide, with two service states:
Service accomplishment: service delivered as specified
Service interruption: deviation from the specified service
Failure and restoration are the transitions between the two states

Page 5: CS352H: Computer Systems Architecture

Dependability Measures

Reliability: mean time to failure (MTTF)
Service interruption: mean time to repair (MTTR)
Mean time between failures: MTBF = MTTF + MTTR
Availability = MTTF / (MTTF + MTTR)

Improving availability
Increase MTTF: fault avoidance, fault tolerance, fault forecasting
Reduce MTTR: improved tools and processes for diagnosis and repair
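As a quick check of these formulas, a minimal C sketch; the MTTF and MTTR values below are illustrative assumptions, not figures from the slides:

    #include <stdio.h>

    int main(void) {
        /* Illustrative values, not from the slides */
        double mttf = 1000000.0;   /* mean time to failure, hours */
        double mttr = 24.0;        /* mean time to repair, hours  */

        double mtbf = mttf + mttr;                    /* mean time between failures */
        double availability = mttf / (mttf + mttr);

        printf("MTBF = %.0f hours\n", mtbf);
        printf("Availability = %.6f (%.4f%%)\n", availability, availability * 100.0);
        return 0;
    }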

Page 6: CS352H: Computer Systems Architecture

Disk Storage

Nonvolatile, rotating magnetic storage

Page 7: CS352H: Computer Systems Architecture

Disk Sectors and Access

Each sector records
Sector ID
Data (512 bytes; 4096 bytes proposed)
Error correcting code (ECC), used to hide defects and recording errors
Synchronization fields and gaps

Access to a sector involves
Queuing delay if other accesses are pending
Seek: move the heads
Rotational latency
Data transfer
Controller overhead

Page 8: CS352H: Computer Systems Architecture

Disk Access Example

Given
512B sector, 15,000 rpm, 4ms average seek time, 100MB/s transfer rate, 0.2ms controller overhead, idle disk

Average read time
= 4ms seek time
+ ½ / (15,000/60) = 2ms rotational latency
+ 512B / (100MB/s) = 0.005ms transfer time
+ 0.2ms controller delay
= 6.2ms

If the actual average seek time is 1ms
Average read time = 3.2ms
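The same arithmetic as a small C program, using the parameters from the slide:

    #include <stdio.h>

    int main(void) {
        /* Parameters from the slide */
        double avg_seek_ms   = 4.0;
        double rpm           = 15000.0;
        double sector_bytes  = 512.0;
        double transfer_MBps = 100.0;
        double controller_ms = 0.2;

        double rotational_ms = 0.5 / (rpm / 60.0) * 1000.0;            /* half a revolution */
        double transfer_ms   = sector_bytes / (transfer_MBps * 1e6) * 1000.0;
        double total_ms      = avg_seek_ms + rotational_ms + transfer_ms + controller_ms;

        printf("Average read time = %.3f ms\n", total_ms);             /* about 6.2 ms */
        return 0;
    }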

Page 9: CS352H: Computer Systems Architecture

Disk Performance Issues

Manufacturers quote average seek time
Based on all possible seeks
Locality and OS scheduling lead to smaller actual average seek times

Smart disk controllers allocate physical sectors on the disk
Present a logical sector interface to the host
SCSI, ATA, SATA

Disk drives include caches
Prefetch sectors in anticipation of access
Avoid seek and rotational delay

Page 10: CS352H: Computer Systems Architecture

Disk Specs

Page 11: CS352H: Computer Systems Architecture

Flash Storage

Nonvolatile semiconductor storage
100×–1000× faster than disk
Smaller, lower power, more robust
But more $/GB (between disk and DRAM)

Page 12: CS352H: Computer Systems Architecture

Flash Types

NOR flash: bit cell like a NOR gate
Random read/write access
Used for instruction memory in embedded systems

NAND flash: bit cell like a NAND gate
Denser (bits/area), but block-at-a-time access
Cheaper per GB
Used for USB keys, media storage, …

Flash bits wear out after 1000s of accesses
Not suitable for direct RAM or disk replacement
Wear leveling: remap data to less-used blocks
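A toy C sketch of the wear-leveling idea: each write of a logical block is steered to the least-worn physical block. The block count and bookkeeping are invented for illustration and are far simpler than a real flash translation layer:

    #include <stdio.h>

    #define NUM_BLOCKS 8

    static int erase_count[NUM_BLOCKS];   /* wear per physical block         */
    static int remap[NUM_BLOCKS];         /* logical block -> physical block */

    /* Pick the physical block with the least wear */
    static int least_worn(void) {
        int best = 0;
        for (int i = 1; i < NUM_BLOCKS; i++)
            if (erase_count[i] < erase_count[best])
                best = i;
        return best;
    }

    /* Each write of a logical block is steered to a lightly used physical block */
    static void write_block(int logical) {
        int physical = least_worn();
        remap[logical] = physical;
        erase_count[physical]++;
        printf("logical %d -> physical %d (erases now %d)\n",
               logical, physical, erase_count[physical]);
    }

    int main(void) {
        for (int i = 0; i < 16; i++)
            write_block(i % 3);            /* repeatedly write the same few logical blocks */
        return 0;
    }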

Page 13: CS352H: Computer Systems Architecture

Flash Types

Page 14: CS352H: Computer Systems Architecture

Flash Specs

Page 15: CS352H: Computer Systems Architecture

Interconnecting Components

Need interconnections between
CPU, memory, I/O controllers

Bus: shared communication channel
Parallel set of wires for data and synchronization of data transfer
Can become a bottleneck

Performance limited by physical factors
Wire length, number of connections

More recent alternative: high-speed serial connections with switches
Like networks

Page 16: CS352H: Computer Systems Architecture

Bus Types

Processor-memory buses
Short, high speed
Design is matched to the memory organization

I/O buses
Longer, allowing multiple connections
Specified by standards for interoperability
Connect to the processor-memory bus through a bridge

Page 17: CS352H: Computer Systems Architecture

Bus Signals and Synchronization

Data lines
Carry address and data
Multiplexed or separate

Control lines
Indicate data type, synchronize transactions

Synchronous
Uses a bus clock

Asynchronous
Uses request/acknowledge control lines for handshaking

Page 18: CS352H: Computer Systems Architecture

I/O Bus Examples

Firewire: external; 63 devices per channel; data width 4; peak bandwidth 50MB/s or 100MB/s; hot pluggable: yes; max length 4.5m; standard: IEEE 1394
USB 2.0: external; 127 devices per channel; data width 2; peak bandwidth 0.2MB/s, 1.5MB/s, or 60MB/s; hot pluggable: yes; max length 5m; standard: USB Implementers Forum
PCI Express: internal; 1 device per channel; data width 2/lane; peak bandwidth 250MB/s per lane (1×, 2×, 4×, 8×, 16×, or 32× lanes); hot pluggable: depends; max length 0.5m; standard: PCI-SIG
Serial ATA: internal; 1 device per channel; data width 4; peak bandwidth 300MB/s; hot pluggable: yes; max length 1m; standard: SATA-IO
Serial Attached SCSI: external; 4 devices per channel; data width 4; peak bandwidth 300MB/s; hot pluggable: yes; max length 8m; standard: INCITS TC T10

Page 19: CS352H: Computer Systems Architecture

Typical x86 PC I/O System

Page 20: CS352H: Computer Systems Architecture

I/O Management

I/O is mediated by the OS
Multiple programs share I/O resources: need protection and scheduling
I/O causes asynchronous interrupts: same mechanism as exceptions
I/O programming is fiddly: the OS provides abstractions to programs

Page 21: CS352H: Computer Systems Architecture

I/O Commands

I/O devices are managed by I/O controller hardware
Transfers data to/from the device
Synchronizes operations with software

Command registers
Cause the device to do something

Status registers
Indicate what the device is doing and the occurrence of errors

Data registers
Write: transfer data to a device
Read: transfer data from a device

Page 22: CS352H: Computer Systems Architecture

I/O Register Mapping

Memory-mapped I/O
Registers are addressed in the same space as memory
An address decoder distinguishes between them
The OS uses the address translation mechanism to make them accessible only to the kernel

I/O instructions
Separate instructions to access I/O registers
Can only be executed in kernel mode
Example: x86
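A minimal C sketch of the memory-mapped style of access. The register layout is hypothetical, and a plain in-memory struct stands in for the device so the example runs anywhere; on real hardware the pointer would hold the device's bus address and only the kernel could use it:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical register layout; a plain struct stands in for the device */
    typedef struct {
        volatile uint32_t status;    /* what the device is doing, error bits */
        volatile uint32_t command;   /* makes the device do something        */
        volatile uint32_t data;      /* data moved to/from the device        */
    } device_regs;

    static device_regs fake_device = { .status = 0x1u };   /* pretend "ready" */

    int main(void) {
        device_regs *dev = &fake_device;   /* real HW: (device_regs *)DEVICE_BASE_ADDR */

        uint32_t st = dev->status;         /* an ordinary load reads a device register   */
        dev->data = 'A';                   /* an ordinary store writes a device register */
        dev->command = 0x1;                /* hypothetical "start transfer" command      */

        printf("status=0x%x data=0x%x\n", (unsigned)st, (unsigned)dev->data);
        return 0;
    }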

Page 23: CS352H: Computer Systems Architecture

Polling

Periodically check the I/O status register
If the device is ready, do the operation
If error, take action

Common in small or low-performance real-time embedded systems
Predictable timing
Low hardware cost

In other systems, polling wastes CPU time
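A minimal polling loop in C; the status bits and the canned status sequence are invented stand-ins for reads of a real device register:

    #include <stdint.h>
    #include <stdio.h>

    #define READY_BIT 0x1u
    #define ERROR_BIT 0x2u

    /* Canned values stand in for reads of a real device status register */
    static uint32_t fake_status[] = { 0, 0, READY_BIT, 0, ERROR_BIT };

    static uint32_t read_status(int i) {
        return fake_status[i];
    }

    int main(void) {
        for (int poll = 0; poll < 5; poll++) {
            uint32_t status = read_status(poll);
            if (status & ERROR_BIT)
                printf("poll %d: error, take recovery action\n", poll);
            else if (status & READY_BIT)
                printf("poll %d: device ready, do the operation\n", poll);
            else
                printf("poll %d: not ready, CPU cycles spent waiting\n", poll);
        }
        return 0;
    }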

Page 24: CS352H: Computer Systems Architecture

Interrupts

When a device is ready or an error occurs
The controller interrupts the CPU

An interrupt is like an exception
But not synchronized to instruction execution
Can invoke the handler between instructions
Cause information often identifies the interrupting device

Priority interrupts
Devices needing more urgent attention get higher priority
Can interrupt the handler for a lower-priority interrupt

Page 25: CS352H: Computer Systems Architecture

I/O Data Transfer

Polling and interrupt-driven I/O
The CPU transfers data between memory and the I/O data registers
Time consuming for high-speed devices

Direct memory access (DMA)
The OS provides the starting address in memory
The I/O controller transfers to/from memory autonomously
The controller interrupts on completion or error

Page 26: CS352H: Computer Systems Architecture

DMA/Cache Interaction

If DMA writes to a memory block that is cached
The cached copy becomes stale

If a write-back cache has a dirty block and DMA reads that memory block
DMA reads stale data

Need to ensure cache coherence
Flush blocks from the cache if they will be used for DMA
Or use non-cacheable memory locations for I/O

Page 27: CS352H: Computer Systems Architecture

DMA/VM Interaction

The OS uses virtual addresses for memory
DMA blocks may not be contiguous in physical memory

Should DMA use virtual addresses?
Would require the controller to do translation

If DMA uses physical addresses
May need to break transfers into page-sized chunks
Or chain multiple transfers
Or allocate contiguous physical pages for DMA

Page 28: CS352H: Computer Systems Architecture

Measuring I/O Performance

I/O performance depends on
Hardware: CPU, memory, controllers, buses
Software: operating system, database management system, application
Workload: request rates and patterns

I/O system design can trade off response time against throughput
Measurements of throughput are often done with a constrained response time

Page 29: CS352H: Computer Systems Architecture

Transaction Processing Benchmarks

Transactions
Small data accesses to a DBMS
Interested in I/O rate, not data rate

Measure throughput
Subject to response time limits and failure handling
ACID (Atomicity, Consistency, Isolation, Durability)
Overall cost per transaction

Transaction Processing Performance Council (TPC) benchmarks (www.tpc.org)
TPC-App: B2B application server and web services
TPC-C: on-line order entry environment
TPC-E: on-line transaction processing for a brokerage firm
TPC-H: decision support; business-oriented ad-hoc queries

Page 30: CS352H: Computer Systems Architecture

File System & Web Benchmarks

SPEC System File System (SFS)
Synthetic workload for an NFS server, based on monitoring real systems
Results: throughput (operations/sec) and response time (average ms/operation)

SPEC Web server benchmark
Measures simultaneous user sessions, subject to a required throughput/session
Three workloads: Banking, Ecommerce, and Support

Page 31: CS352H: Computer Systems Architecture

I/O vs. CPU Performance

Amdahl's Law
Don't neglect I/O performance as parallelism increases compute performance

Example
Benchmark takes 90s CPU time, 10s I/O time
Double the number of CPUs every 2 years
I/O time unchanged

Year   CPU time   I/O time   Elapsed time   % I/O time
now    90s        10s        100s           10%
+2     45s        10s        55s            18%
+4     23s        10s        33s            31%
+6     11s        10s        21s            47%
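The table can be reproduced with a few lines of C (CPU time halves every two years, I/O time assumed fixed):

    #include <stdio.h>

    int main(void) {
        double cpu_time = 90.0;    /* seconds of CPU time today      */
        double io_time  = 10.0;    /* seconds of I/O time, unchanged */

        for (int year = 0; year <= 6; year += 2) {
            double elapsed = cpu_time + io_time;
            printf("+%d yr: CPU %5.1fs  I/O %.0fs  elapsed %5.1fs  I/O share %2.0f%%\n",
                   year, cpu_time, io_time, elapsed, 100.0 * io_time / elapsed);
            cpu_time /= 2.0;       /* number of CPUs doubles every 2 years */
        }
        return 0;
    }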

Page 32: CS352H: Computer Systems Architecture

RAID

Redundant Array of Inexpensive (Independent) Disks
Use multiple smaller disks (cf. one large disk)
Parallelism improves performance
Plus extra disk(s) for redundant data storage

Provides a fault-tolerant storage system
Especially if failed disks can be "hot swapped"

RAID 0: no redundancy ("AID"?)
Just stripe data over multiple disks
But it does improve performance

Page 33: CS352H: Computer Systems Architecture

RAID 1 & 2

RAID 1: mirroring
N + N disks, replicate data
Write data to both the data disk and the mirror disk
On disk failure, read from the mirror

RAID 2: error correcting code (ECC)
N + E disks (e.g., 10 + 4)
Split data at the bit level across N disks
Generate E-bit ECC
Too complex, not used in practice

Page 34: CS352H: Computer Systems Architecture

RAID 3: Bit-Interleaved Parity

N + 1 disks
Data striped across N disks at the byte level
Redundant disk stores parity

Read access: read all disks
Write access: generate new parity and update all disks
On failure: use parity to reconstruct missing data

Not widely used

Page 35: CS352H: Computer Systems Architecture

RAID 4: Block-Interleaved Parity

N + 1 disks
Data striped across N disks at the block level
Redundant disk stores parity for a group of blocks

Read access: read only the disk holding the required block

Write access (see the sketch below)
Just read the disk containing the modified block, and the parity disk
Calculate the new parity, update the data disk and the parity disk

On failure: use parity to reconstruct missing data

Not widely used
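To make the parity update concrete, a minimal C sketch of the RAID 4/5 small-write rule (new parity = old parity XOR old data XOR new data) on toy 4-byte blocks; the block size and data values are purely illustrative:

    #include <stdint.h>
    #include <stdio.h>

    #define BLOCK_BYTES 4   /* toy block size */

    /* RAID 4/5 small write: new parity = old parity ^ old data ^ new data,
     * so only the modified data disk and the parity disk are touched. */
    static void update_parity(const uint8_t *old_d, const uint8_t *new_d, uint8_t *parity) {
        for (int i = 0; i < BLOCK_BYTES; i++)
            parity[i] ^= old_d[i] ^ new_d[i];
    }

    int main(void) {
        uint8_t d0[] = {1, 2, 3, 4}, d1[] = {5, 6, 7, 8}, d2[] = {9, 10, 11, 12};
        uint8_t parity[BLOCK_BYTES];

        for (int i = 0; i < BLOCK_BYTES; i++)
            parity[i] = d0[i] ^ d1[i] ^ d2[i];        /* parity over the whole stripe */

        uint8_t new_d1[] = {50, 60, 70, 80};
        update_parity(d1, new_d1, parity);            /* small write to disk 1 */
        for (int i = 0; i < BLOCK_BYTES; i++)
            d1[i] = new_d1[i];

        /* After a failure of disk 1, its data is rebuilt from the others + parity */
        for (int i = 0; i < BLOCK_BYTES; i++)
            printf("%d ", parity[i] ^ d0[i] ^ d2[i]); /* prints 50 60 70 80 */
        printf("\n");
        return 0;
    }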

Page 36: CS352H: Computer Systems Architecture

RAID 3 vs RAID 4

Page 37: CS352H: Computer Systems Architecture

RAID 5: Distributed Parity

N + 1 disks
Like RAID 4, but the parity blocks are distributed across the disks
Avoids the parity disk being a bottleneck

Widely used

Page 38: CS352H: Computer Systems Architecture

RAID 6: P + Q Redundancy

N + 2 disks
Like RAID 5, but with two lots of parity
Greater fault tolerance through more redundancy

Multiple RAID
More advanced systems give similar fault tolerance with better performance

Page 39: CS352H: Computer Systems Architecture

RAID Summary

RAID can improve performance and availability
High availability requires hot swapping

Assumes independent disk failures
Too bad if the building burns down!

See "Hard Disk Performance, Quality and Reliability"
http://www.pcguide.com/ref/hdd/perf/index.htm

Page 40: CS352H: Computer Systems Architecture

I/O System Design

Satisfying latency requirements
For time-critical operations
If the system is unloaded, add up the latency of the components

Maximizing throughput
Find the "weakest link" (the lowest-bandwidth component)
Configure it to operate at its maximum bandwidth
Balance the remaining components in the system

If the system is loaded, simple analysis is insufficient
Need to use queuing models or simulation

Page 41: CS352H: Computer Systems Architecture

Server Computers

Applications are increasingly run on servers
Web search, office apps, virtual worlds, …

Requires large data-center servers
Multiple processors, network connections, massive storage
Space and power constraints

Server equipment is built for 19" racks
Multiples of 1.75" (1U) high

Page 42: CS352H: Computer Systems Architecture

Rack-Mounted Servers

Sun Fire x4150 1U server

Page 43: CS352H: Computer Systems Architecture

Sun Fire x4150 1U server (photo callouts on the slide):
Two processor sockets, 4 cores each
16 x 4GB = 64GB DRAM

Page 44: CS352H: Computer Systems Architecture

I/O System Design Example

Given a Sun Fire x4150 system with
Workload: 64KB disk reads
Each I/O op requires 200,000 user-code instructions and 100,000 OS instructions
Each CPU: 10^9 instructions/sec
FSB: 10.6 GB/sec peak
DRAM DDR2 667MHz: 5.336 GB/sec
PCI-E 8× bus: 8 × 250MB/sec = 2GB/sec
Disks: 15,000 rpm, 2.9ms avg. seek time, 112MB/sec transfer rate

What I/O rate can be sustained?
For random reads, and for sequential reads

Page 45: CS352H: Computer Systems Architecture

Design Example (cont)

I/O rate for CPUs
Per core: 10^9 / (100,000 + 200,000) = 3,333 ops/sec
8 cores: 26,667 ops/sec

Random reads, I/O rate for disks
Assume the actual seek time is average/4
Time/op = seek + rotational latency + transfer
= 2.9ms/4 + 4ms/2 (half of the 4ms rotation at 15,000 rpm) + 64KB/(112MB/s) = 3.3ms
303 ops/sec per disk, 2,424 ops/sec for 8 disks

Sequential reads
112MB/s / 64KB = 1,750 ops/sec per disk
14,000 ops/sec for 8 disks

Page 46: CS352H: Computer Systems Architecture

Design Example (cont)

PCI-E I/O rate
2GB/sec / 64KB = 31,250 ops/sec

DRAM I/O rate
5.336 GB/sec / 64KB = 83,375 ops/sec

FSB I/O rate
Assume we can sustain half the peak rate
5.3 GB/sec / 64KB = 81,540 ops/sec per FSB
163,080 ops/sec for 2 FSBs

Weakest link: the disks (see the sketch below)
2,424 ops/sec random, 14,000 ops/sec sequential
The other components have ample headroom to accommodate these rates
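The whole back-of-the-envelope analysis as a small C program, using the slide's figures with decimal units (1 KB = 1000 B) and the same assumptions (actual seek = average/4, each FSB sustains half its peak); rounding makes a couple of the printed numbers differ slightly from the slide's:

    #include <stdio.h>

    int main(void) {
        /* Slide figures; decimal units (1 KB = 1000 B) as the slides use */
        double io_bytes     = 64e3;              /* 64 KB per I/O       */
        double instr_per_io = 300e3;             /* 200K user + 100K OS */
        double core_ips     = 1e9;
        int    cores = 8, disks = 8;

        double cpu_rate  = cores * core_ips / instr_per_io;

        double seek_s    = 2.9e-3 / 4.0;               /* assume actual seek = avg/4 */
        double rot_s     = 0.5 * 60.0 / 15000.0;       /* half a turn at 15,000 rpm  */
        double xfer_s    = io_bytes / 112e6;
        double disk_rand = disks / (seek_s + rot_s + xfer_s);
        double disk_seq  = disks * 112e6 / io_bytes;

        double pcie_rate = 2e9 / io_bytes;
        double dram_rate = 5.336e9 / io_bytes;
        double fsb_rate  = 2.0 * (10.6e9 / 2.0) / io_bytes;   /* two FSBs, half peak each */

        printf("CPU %.0f  disks(rand) %.0f  disks(seq) %.0f  PCI-E %.0f  DRAM %.0f  FSB %.0f ops/sec\n",
               cpu_rate, disk_rand, disk_seq, pcie_rate, dram_rate, fsb_rate);
        printf("weakest link (random reads): %.0f ops/sec, set by the disks\n", disk_rand);
        return 0;
    }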

Page 47: CS352H: Computer Systems Architecture

Fallacy: Disk Dependability

If a disk manufacturer quotes an MTTF of 1,200,000 hr (about 140 years),
the fallacy is concluding that a disk will work that long.

Wrong: this is the mean time to failure
What is the distribution of failures?
What if you have 1000 disks: how many will fail per year?

Annual Failure Rate (AFR) = (8760 hrs/disk/year) / (1,200,000 hrs/failure)
= 0.0073 failures/disk/year × 100% = 0.73%

So 0.73% × 1000 disks = 7.3 failures expected per year
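The AFR arithmetic in C:

    #include <stdio.h>

    int main(void) {
        double mttf_hours     = 1200000.0;
        double hours_per_year = 8760.0;
        int    disks          = 1000;

        double afr = hours_per_year / mttf_hours;            /* failures per disk per year */
        printf("AFR = %.4f = %.2f%%\n", afr, afr * 100.0);   /* 0.0073 = 0.73%             */
        printf("Expected failures among %d disks = %.1f per year\n", disks, afr * disks);
        return 0;
    }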

Page 48: CS352H: Computer Systems Architecture

Fallacies

Fallacy: disk failure rates are as specified
Studies of failure rates in the field:
Schroeder and Gibson: 2% to 4%, vs. the specified 0.6% to 0.8%
Pinheiro et al.: 1.7% (first year) to 8.6% (third year), vs. the specified 1.5%
Why?

Fallacy: a 1GB/s interconnect transfers 1GB in one second
But what's a GB?
For bandwidth, use 1GB = 10^9 B
For storage, use 1GB = 2^30 B = 1.075 × 10^9 B
So 1GB/sec moves 0.93GB (of storage) in one second
About 7% error
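And the GB-vs-GiB discrepancy in C:

    #include <stdio.h>

    int main(void) {
        double gb_bandwidth = 1e9;                       /* GB as used for bandwidth */
        double gb_storage   = 1024.0 * 1024.0 * 1024.0;  /* GB as used for storage   */

        double moved = gb_bandwidth / gb_storage;        /* storage-GB moved in 1 s  */
        printf("A 1 GB/s link moves %.2f storage-GB per second\n", moved);        /* ~0.93 */
        printf("Error from mixing the two definitions: %.1f%%\n", (1.0 - moved) * 100.0);
        return 0;
    }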

Page 49: CS352H: Computer Systems Architecture

Pitfall: Offloading to I/O Processors

Overhead of managing an I/O processor request may dominate
Quicker to do a small operation on the CPU
But the I/O architecture may prevent that

The I/O processor may be slower
Since it's supposed to be simpler

Making it faster turns it into a major system component
Might need its own coprocessors!

Page 50: CS352H: Computer Systems Architecture

Pitfall: Backing Up to Tape

Magnetic tape used to have advantages
Removable, high capacity

Advantages eroded by disk technology developments
It now makes better sense to replicate data
E.g., RAID, remote mirroring

Page 51: CS352H: Computer Systems Architecture

Fallacy: Disk Scheduling

The fallacy: it is best to let the OS schedule disk accesses
But modern drives deal in logical block addresses
Drives map these to physical track, cylinder, and sector locations
Blocks are also cached by the drive

The OS is unaware of physical locations
Its reordering can reduce performance
Depending on placement and caching

Page 52: CS352H: Computer Systems Architecture

Pitfall: Peak Performance

Peak I/O rates are nearly impossible to achieve
Usually some other system component limits performance
E.g., transfers to memory over a bus
Collision with DRAM refresh
Arbitration contention with other bus masters

E.g., PCI bus: peak bandwidth ~133 MB/sec
In practice, at most about 80 MB/sec is sustainable

Page 53: CS352H: Computer Systems Architecture

Concluding Remarks

I/O performance measures
Throughput, response time
Dependability and cost are also important

Buses are used to connect the CPU, memory, and I/O controllers
Polling, interrupts, DMA

I/O benchmarks
TPC, SPECSFS, SPECWeb

RAID
Improves performance and dependability