
THE UNIVERSITY OF TEXAS AT DALLAS Erik Jonsson School of Engineering and Computer Science

© C. D. Cantrell (05/1999)

INPUT/OUTPUT (I/O) SUBSYSTEMS

• Overview of I/O performance measurement and analysis

• Processor interface issues

• Buses

• Types and characteristics of I/O devices

. Hard disk storage

. Network interfaces

• I/O system design

Page 2: io

MOTIVATION FOR STUDYING I/O

• CPU performance improves by 50% to 100% per year

• I/O systems’ performance improvements are limited by physics (in some cases)

. Mechanical delays (disk drives): Latency improvement is of order 5% per year

. Electrical and optical phenomena (dispersion, attenuation, crosstalk): Improvement is 5% to 25% per year

• Amdahl’s law implies that, sooner or later, most of the latency will be due to the part that is hardest to improve

. Given: 10% of instructions perform I/O, CPU is 10× faster

. Improvement is only about 5× ⇒ lose about 50% of improvement

• I/O bottleneck lowers the value of CPU improvements

. As technology evolves, a diminishing fraction of total latency is due to the CPU
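
The Amdahl's-law figure above can be checked with a few lines of Python. This is a minimal sketch that assumes "10% of instructions perform I/O" means 10% of the original execution time is I/O and is not sped up; the function name `amdahl_speedup` is illustrative only.

```python
def amdahl_speedup(io_fraction: float, cpu_speedup: float) -> float:
    """Overall speedup when only the non-I/O part of the time is accelerated."""
    cpu_fraction = 1.0 - io_fraction
    # New time = unchanged I/O part + CPU part shrunk by cpu_speedup
    return 1.0 / (io_fraction + cpu_fraction / cpu_speedup)


if __name__ == "__main__":
    s = amdahl_speedup(io_fraction=0.10, cpu_speedup=10.0)
    print(f"overall speedup: {s:.2f}x")  # about half of the 10x CPU gain
```

With these inputs the overall speedup is 1/0.19, a little over 5, which is where the "lose about 50%" on the slide comes from.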

Page 3: io

I/O PERFORMANCE METRICS

• Bandwidth (bits or bytes per second):

. Peak

. Sustained

. Useful for buses and networks

• Throughput (I/O processes per second)

. Useful for file serving and transaction processing

• Latency = total time for an I/O process from start to finish

. Most important to users
◦ Latency too great ⇒ user loses train of thought
◦ $\text{Latency} = \text{controller time} + \text{wait time} + \dfrac{\text{no. bytes}}{\text{bandwidth}} + \text{CPU time} - \text{overlap}$
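
The latency formula can be evaluated directly. The sketch below is only an illustration: the function name, the units (seconds and bytes per second) and the sample numbers are assumptions, not values from the slides.

```python
def io_latency(controller_s: float, wait_s: float, n_bytes: int,
               bandwidth_bytes_per_s: float, cpu_s: float,
               overlap_s: float = 0.0) -> float:
    """Latency = controller + wait + bytes/bandwidth + CPU - overlap (seconds)."""
    transfer_s = n_bytes / bandwidth_bytes_per_s
    return controller_s + wait_s + transfer_s + cpu_s - overlap_s


if __name__ == "__main__":
    # Hypothetical request: 4096 bytes over a 4.096 MB/s link
    t = io_latency(controller_s=0.001, wait_s=0.010, n_bytes=4096,
                   bandwidth_bytes_per_s=4_096_000, cpu_s=0.0005,
                   overlap_s=0.0005)
    print(f"latency: {t * 1000:.1f} ms")
```

In this made-up case the wait time dominates, which is typical when a mechanical device is involved.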

Page 4: io

PROCESSOR INTERFACE ISSUES

• Interconnections

. Buses

• Processor interface

. Interrupts

. Memory-mapped I/O

• I/O control structures

. Polling

. Interrupts

. DMA

. I/O controllers

. I/O processors

• Capacity, access time, bandwidth, cost

Page 5: io

BUSES

• Bus: A communication link shared by multiple subsystems

. Physically: Parallel conductors (traces on die or PC board; cable)

. Advantages:
◦ Low cost (compared to point-to-point wiring)
◦ Versatility of interconnections

. Disadvantages:
◦ Electrical problems ⇒ short length
¦ Bus skew
¦ Dispersion
¦ Crosstalk
◦ Shared resource ⇒ contention

. Organization:
◦ Control lines to signal & acknowledge requests
◦ Data lines to carry addresses, data or commands

Page 6: io

PROCESSOR–I/O INTERFACE BUS TYPES

• Backplane bus

. Processor, memory and I/O devices coexist on the same bus

. In olden times, often built into the backplane of a computer
◦ An interconnection structure that was part of the chassis

. Processor architecture includes explicit I/O instructions (IN, OUT)

. Standard backplane buses: VMEbus, Multibus, NuBus, PCI, ISA (Industry Standard Architecture) bus

• I/O bus

. Examples: IDE, SCSI

Page 7: io

[Figure: three ways to organize the processor, memory and I/O buses]

a. Processor, memory and I/O devices on the same backplane bus

b. Processor and memory are on a backplane bus; bus adapters provide interfaces for various I/O buses

c. Processor and memory are on a fast synchronous processor-memory bus; a bus adapter interfaces the processor-memory bus to the backplane bus, and further bus adapters connect the backplane bus to I/O buses

Page 8: io

I/O SYSTEM USING ONLY A BACKPLANE BUS

[Figure: a processor with cache, main memory, and I/O controllers for disks, graphics output and the network all share a single memory–I/O bus; the I/O controllers signal the processor with interrupts]

Page 9: io

I/O SYSTEM USING AN I/O BUS

[Figure: the CPU with cache and main memory sit on a CPU-memory bus; a bus adapter connects that bus to an I/O bus, which carries I/O controllers for two disks, graphics output and the network]

Page 10: io

MACINTOSH 72xx I/O SYSTEM

[Figure: Macintosh 72xx I/O system. The processor and main memory connect through a PCI interface/memory controller to PCI, which serves as the backplane bus. I/O controllers and bus adapters on PCI serve graphics output, Ethernet, stereo input and output, serial ports, the Apple desktop bus, and a SCSI bus with CD-ROM, disk and tape]

Page 11: io

PENTIUM II I/O SYSTEM

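[Figure: Pentium II I/O system. The CPU connects to the level 2 cache over a cache bus and to a PCI bridge over a local bus; the PCI bridge connects to main memory over a memory bus and to the PCI bus, which serves as the backplane bus. I/O controllers and bus adapters on the PCI bus: SCSI, USB (mouse, keyboard), IDE disk, graphics adaptor (monitor), an available PCI slot, and an ISA bridge. On the ISA bus: modem, sound card, printer, and an available ISA slot]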

Tanenbaum, Structured Computer Organization

Page 12: io

Enterprise 10000 hardware architecture

[Figure: the Gigaplane-XB, which includes the XB-Interconnect, 4 address buses and bulk power distribution; local power converters are also shown]

• Data is packet-switched using a crossbar

• Addresses are broadcast

Page 13: io

3-STATE BUFFER

• A 3-state buffer has 2 inputs and 1 output

. Enable asserted: Output = input (state is either 0 or 1)

. Enable deasserted: High-impedance state (denoted × or Z)
◦ Output can be driven by another device

. Equivalent to a mechanical switch

Page 14: io

EXCITATION TABLE FOR 3-STATE BUFFER

• A tristate buffer has 3 possible output values:

. Asserted

. Deasserted

. High impedance (floating)

enable  in  out
0       0   Z
0       1   Z
1       0   0
1       1   1
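
The excitation table, and the way tristates share a bus line, can be modeled in a few lines. This is a sketch under stated assumptions: `Z` is represented by `None`, and the names `tristate` and `resolve_bus` are illustrative, not from the slides.

```python
from typing import Optional, Sequence, Tuple

Z = None  # high impedance (floating) output


def tristate(enable: int, data: int) -> Optional[int]:
    """One 3-state buffer: passes data when enabled, floats otherwise."""
    return data if enable else Z


def resolve_bus(drivers: Sequence[Tuple[int, int]]) -> Optional[int]:
    """Value on a shared line given (enable, data) for every attached buffer."""
    driven = [tristate(en, d) for en, d in drivers if en]
    if len(driven) > 1:
        # Two enabled buffers on one line would fight each other
        raise ValueError("bus contention: more than one buffer enabled")
    return driven[0] if driven else Z


if __name__ == "__main__":
    for en in (0, 1):
        for d in (0, 1):
            print(en, d, "Z" if tristate(en, d) is Z else tristate(en, d))
```

Enabling exactly one buffer at a time is what lets several devices take turns driving the same conductor.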

Page 15: io

USE OF TRISTATES TO ENABLE/DISABLE BUS ACCESS

Page 16: io

BUS DESIGN CONSTRAINTS

• Laws of physics limit bus speeds

. Transmission speed ≤ speed of light

. Crosstalk
◦ Occurs because:
¦ A time-varying voltage on a conductor induces a charge $q_2 = C_{12} v_1$ on another, parallel conductor
¦ A time-varying current in a conductor induces a voltage $v_2 = L_{12}\, di_1/dt$ in another, parallel conductor
◦ Limits bus clock frequency
◦ Can be reduced by:
¦ Grounding alternate conductors
¦ Abandoning the bus concept and using twisted-pair, point-to-point connections (Seymour Cray)
◦ EMI & reflections limit number of devices connected to bus

• Real estate on die or PC board limits number of lines

Page 17: io

COMPLEX ULTRA-SCSI CHAIN

[Figure: a complex Ultra-SCSI chain with a terminator at each end, 3 meters (10 feet) overall length, and eight device positions numbered 7 down to 0, with the driver marked. Cable segment lengths in centimeters: 7.62, 30.48, 30.48, 10.16, 10.16, 10.16, 10.16, 10.16, 210.82. Five stubs are 7.37 cm and three are 12.45 cm, 25 pF each]

Page 18: io

ACK SIGNALS ON COMPLEX ULTRA-SCSI CHAIN

[Figure: oscilloscope traces of ACK signals on the complex chain, at 2 volts per division and 10 nanoseconds per division. Traces: driver input (logic signal driving the SCSI driver); driver output at device position 7; driver input at device positions 6, 5, 4 and 0]

Page 19: io

ACK SIGNALS ON POINT-TO-POINT ULTRA-SCSI BUS

[Figure: a point-to-point Ultra-SCSI bus, 25 meters (82 feet) overall length, with a driver at one end, a receiver at the other and a terminator at each end; shielded 34-pair external cable, only end loads for this test. Oscilloscope traces of the ACK signal at 2 volts per division and 10 nanoseconds per division: driver input, driver output, and receiver input after 25 m]

Page 20: io

ASYNCHRONOUS vs. SYNCHRONOUS BUSES

• Bus communication protocol: Specification of sequence of events and timing requirements for transferring information on a bus

• Asynchronous bus transfers:

. Certain conductors on the bus are control lines

. Signals on the control lines control the sequence of events

• Synchronous bus transfers:

. Events are sequenced relative to a master clock signal

. Once a certain kind of transfer has been initiated, no further command signaling is necessary to control the transfer

Page 21: io

SYNCHRONOUS BUSES

• Bus clock is phase-locked to processor clock

. Bus clock frequency = $\frac{1}{n}$ × processor clock frequency (n = 1 to 6)

. Clock signal is carried on a control line

. Communications protocol defined with reference to bus clock signal

. Local bus (e.g., VESA Local Bus):
◦ Extends the processor’s bus control signals
◦ May connect processor to L2 cache
◦ May connect processor and memory to high-speed I/O devices

• Advantages:

. Fast & wide

. Simple logic (finite state machine)

• Disadvantages:

. Must be short (bus skew; attenuation; crosstalk)

. All devices must run at same frequency

Page 22: io

80286 – PENTIUM I/O

• Separate I/O and memory address spaces

. Since the 8086, I/O or memory access is signaled by M/IO# (memory access if high, I/O if low)
◦ For MOVE (memory–CPU copy), M/IO# is high
◦ For IN or OUT (I/O), M/IO# is low
◦ M/IO# is a processor signal that does not appear on the ISA bus
◦ Instead, M/IO# is an input to the bus controller

. I/O address space is 0x0000 to 0xffff

Page 23: io

80286 SIGNALS

[Figure: 80286 pin diagram with pin numbers. Signals shown: CLK; address lines A0–A23; data lines D0–D15; COD/INTA, M/IO, BHE, S1, S0, HLDA, HOLD, PEREQ, PEACK, INTR, NMI, RESET, READY, BUSY, ERROR, LOCK, CAP. The address lines feed an address latch, and the data lines feed upper and lower data bus transceivers]

Page 24: io

ISA BUS

• ISA ≡ Industry Standard Architecture

. Synchronous

. Industry response to IBM’s MicroChannel architecture

. Uses both the PC/AT and the IBM PC bus standards
◦ Interface cards have 2 sets of connectors
◦ PC bus: 8 data lines, 20 address lines
◦ ISA bus: 16 data lines, 24 address lines; bus frequency 8.33 MHz
¦ Maximum possible throughput: 2 bytes × 8.33 MHz = 16.67 MB/s

. Separate I/O and memory address spaces
◦ Since the 8085, I/O or memory access is signaled by IO/M# (I/O if high, memory access if low)
¦ For MOVE (memory–CPU copy), IO/M# is low
¦ For IN or OUT (I/O), IO/M# is high
◦ I/O address space is 0x0000 to 0xffff

Page 25: io

ISA BUS CONNECTORS

[Figure: ISA bus connectors. The motherboard carries the CPU and other chips, the PC bus connectors, and a new connector for the PC/AT; a plug-in board with chips makes contact through its edge connector]

Tanenbaum, Structured Computer Organization

Page 26: io

PCI BUS

• PCI ≡ Peripheral Component Interconnect

. Synchronous

. PCI 1.0: Clock frequency 33 MHz, 32-bit-wide data path

. PCI 2.1: Clock frequency 66 MHz, 64-bit-wide data path
◦ Maximum theoretical bandwidth: 8 bytes × 66 MHz = 528 MB/s

. Transactions are negative-edge-triggered

. Address and data lines are multiplexed

. Bus arbiter usually built into the chipset

. Every PCI device has a 256-byte configuration address space that is readable by other devices ⇒ Plug ’n Play

• PCI cards

. Options include voltage (5 V vs. 3.3 V), width (32 bits/120 pins vs. 64 bits/184 pins) and frequency (33 vs. 66 MHz)
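
The peak-bandwidth arithmetic used for ISA and PCI is the same width-times-clock product. A minimal sketch, assuming one transfer per clock (the function name is illustrative, and real buses sustain less than this peak):

```python
def peak_bandwidth_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth in MB/s, assuming one full-width transfer per clock."""
    return (width_bits / 8) * clock_mhz


if __name__ == "__main__":
    for name, width, clock in [("ISA", 16, 8.33),
                               ("PCI 1.0", 32, 33.0),
                               ("PCI 2.1", 64, 66.0)]:
        print(f"{name}: {peak_bandwidth_mb_per_s(width, clock):.2f} MB/s")
```

The 64-bit, 66 MHz case reproduces the 528 MB/s quoted above.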

Page 27: io

PCI BUS ARBITER

[Figure: a central PCI arbiter connected to four PCI devices, each with its own REQ# and GNT# lines]

Tanenbaum, Structured Computer Organization

Page 28: io

PCI BUS TIMING FOR READ AND WRITE CYCLES

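[Figure: PCI bus timing over clock periods T1 to T7 of the clock Φ, showing a read bus cycle, an idle cycle and a write bus cycle. Signals shown: AD (address, turnaround, then data for the read; address, then data for the write), C/BE# (read command then byte enables; write command then byte enables), FRAME#, IRDY#, DEVSEL#, TRDY#]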

Tanenbaum, Structured Computer Organization

Page 29: io

PCI BUS SIGNALS

MANDATORY PCI BUS SIGNALS

Signal    Lines  Master  Slave  Description
CLK       1                     Clock (33 MHz or 66 MHz)
AD        32     ×       ×      Multiplexed address and data lines
PAR       1      ×              Address or data parity bit
C/BE      4      ×              Bus command/bit map for bytes enabled
FRAME#    1      ×              Indicates that AD and C/BE are asserted
IRDY#     1      ×              Read: master will accept; write: data present
IDSEL     1      ×              Select configuration space instead of memory
DEVSEL#   1              ×      Slave has decoded its address and is listening
TRDY#     1              ×      Read: data present; write: slave will accept
STOP#     1              ×      Slave wants to stop transaction immediately
PERR#     1                     Data parity error detected by receiver
SERR#     1                     Address parity error or system error detected
REQ#      1                     Bus arbitration: request for bus ownership
GNT#      1                     Bus arbitration: grant of bus ownership
RST#      1                     Reset the system and all devices

OPTIONAL PCI BUS SIGNALS

Signal    Lines  Master  Slave  Description
REQ64#    1      ×              Request to run a 64-bit transaction
ACK64#    1              ×      Permission is granted for a 64-bit transaction
AD        32     ×              Additional 32 bits of address or data
PAR64     1      ×              Parity for the extra 32 address/data bits
C/BE#     4      ×              Additional 4 bits for byte enables
LOCK      1      ×              Lock the bus to allow multiple transactions
SBO#      1                     Hit on a remote cache (for a multiprocessor)
SDONE     1                     Snooping done (for a multiprocessor)
INTx      4                     Request an interrupt
JTAG      5                     IEEE 1149.1 JTAG test signals
M66EN     1                     Wired to power or ground (66 MHz or 33 MHz)

Tanenbaum, Structured Computer Organization

Page 30: io

OBTAINING BUS ACCESS

• Goal: Give every device fair access

• Method: Use bus masters

. A master enables bus access for one or more devices (by enabling/disabling tristate buffers)

. Single bus master can be a bottleneck

. Multiple masters require arbitration
◦ Every device has a priority (IRQ number, SCSI ID, . . . )
◦ Extra control lines needed for bus request/access

. Arbitration methods:
◦ Centralized & parallel (SCSI)
◦ Daisy chain (VMEbus)
◦ Distributed arbitration using self-selection (NuBus)
◦ Distributed arbitration using collision detection (Ethernet)

Page 31: io

A BUS TRANSACTION WITH A SINGLE MASTER

[Figure: three snapshots of a bus connecting memory, processor and disks, with bus request lines]

a. Device generates bus request

b. Master (processor) responds by generating control signals (for read, etc.)

c. Processor notifies I/O device that its request is being processed; device then puts address for the request on the bus

Page 32: io

DAISY CHAIN

[Figure: a bus arbiter and devices 1 (highest priority), 2, ..., n (lowest priority), with Request, Release and Grant lines; the grant line passes from the arbiter through each device in turn]

A daisy chain bus uses a bus grant line that chains through each device from highest to lowest priority. The protocol is:

1. Signal on the request line
2. Wait for a low-to-high transition on the grant line (indicates reassignment)
3. Intercept the grant signal and stop asserting the request line
4. Use the bus
5. Signal that the bus is no longer required by asserting the release line
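
The effect of chaining the grant line can be sketched as a scan from the arbiter outward. This is a simplified model, not the VMEbus protocol itself: it ignores timing and the release line, and the function name is an assumption.

```python
from typing import Optional, Sequence


def daisy_chain_grant(requesting: Sequence[bool]) -> Optional[int]:
    """Return the position of the device that intercepts the grant.

    requesting[0] is the device closest to the arbiter (highest priority).
    """
    for position, wants_bus in enumerate(requesting):
        if wants_bus:
            # This device intercepts the grant; devices further down never see it
            return position
    return None  # nobody requested; the grant passes through unused


if __name__ == "__main__":
    print(daisy_chain_grant([False, True, True]))  # device at position 1 wins
```

Because the scan always starts at the arbiter, a busy high-priority device can starve the devices behind it, which is the usual fairness objection to daisy chaining.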

Page 33: io

ASYNCHRONOUS BUSES

• Not clocked

. Can accommodate many kinds of devices (disk, tape, scanner, . . . )

• Data transfer controlled with handshaking protocol on dedicated control lines; represent with a finite state machine for each device

• Example (SCSI-1 bus):

. Bus controller asserts Sel (select device) and transmits device ID

. Selected device responds with Ack

. Controller asserts Cmd (command), Msg (message), and Req (request a data transfer) signals, then transmits command bytes

. Device responds to each byte with Ack

. Controller deasserts Cmd, asserts I/O, then transmits data bytes

. Device responds to each byte with Ack

Page 34: io

STEPS OF AN ASYNCHRONOUS OUTPUT OPERATION

[Figure: three snapshots of a bus with control lines and data lines connecting memory, processor and disks]

a. Initiation of a read operation from memory. Control lines: Read command; Data lines: Address

b. Memory access

c. Memory puts the data on the data lines of the bus and uses the control lines to signal the I/O device that the data is available

Page 35: io

STEPS OF AN ASYNCHRONOUS INPUT OPERATION

[Figure: two snapshots of a bus with control lines and data lines connecting memory, processor and disks]

a. Control lines: Write request to memory; Data lines: Address

b. Memory signals the device that it is ready; Data is transferred

Page 36: io

ASYNCHRONOUS BUS HANDSHAKING PROTOCOL

[Figure: timing diagram of the ReadReq, Data, Ack and DataRdy lines, with transitions numbered 1 to 7 to match the steps below and a legend distinguishing the I/O device from memory]

1. When memory sees ReadReq asserted, it reads the address from the data bus and asserts Ack
2. I/O device sees Ack asserted, releases ReadReq and data lines
3. Memory sees ReadReq deasserted, drops Ack to acknowledge ReadReq
4. Memory puts requested data on the data lines, asserts DataRdy
5. I/O device sees DataRdy, reads data, signals that it has seen the data by asserting Ack
6. Memory sees Ack, drops DataRdy, releases data lines
7. I/O device sees DataRdy deasserted, drops Ack to signal end of transmission
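
The seven handshake steps can be traced in software. The sketch below is a sequential model only: it assumes each side reacts instantly, uses a Python dict for memory, and records the control-line levels after every step. All names are illustrative.

```python
def handshake_read(memory: dict, address: int):
    """Trace the seven-step read handshake; returns (data, log).

    Each log entry is (step, ReadReq, Ack, DataRdy) after that step.
    """
    lines = {"ReadReq": 0, "Ack": 0, "DataRdy": 0, "Data": None}
    log = []

    def snap(step):
        log.append((step, lines["ReadReq"], lines["Ack"], lines["DataRdy"]))

    # Before step 1: the I/O device puts the address on the data lines
    lines["Data"], lines["ReadReq"] = address, 1
    latched = lines["Data"]; lines["Ack"] = 1; snap(1)         # memory latches address
    lines["ReadReq"] = 0; lines["Data"] = None; snap(2)        # device releases lines
    lines["Ack"] = 0; snap(3)                                  # memory drops Ack
    lines["Data"] = memory[latched]; lines["DataRdy"] = 1; snap(4)
    received = lines["Data"]; lines["Ack"] = 1; snap(5)        # device reads data
    lines["DataRdy"] = 0; lines["Data"] = None; snap(6)        # memory releases lines
    lines["Ack"] = 0; snap(7)                                  # end of transmission
    return received, log


if __name__ == "__main__":
    data, log = handshake_read({0x10: 0xAB}, 0x10)
    print(hex(data))
    for entry in log:
        print(entry)
```

Every change on one side is triggered by a change seen on the other, which is why no shared clock is needed.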

Page 37: io

[Figure: finite state machines for the I/O device and for memory that implement the handshake; state numbers match the protocol steps.
I/O device: put address on data lines and assert ReadReq; on Ack, (2) release data lines and deassert ReadReq; on DataRdy, (5) read memory data from data lines and assert Ack; when DataRdy is deasserted, (7) deassert Ack; then wait for a new I/O request.
Memory: on ReadReq, (1) record from data lines and assert Ack; when ReadReq is deasserted, (3, 4) drop Ack, put memory data on data lines and assert DataRdy; on Ack, (6) release data lines and DataRdy; then wait for a new I/O request]

Page 38: io

SCSI-1: AN ASYNCHRONOUS BUS (1)

• SCSI := Small Computer System Interface

. Many “standard” implementations

. Can connect many different kinds of devices:
◦ Logic board
◦ Hard drive
◦ CD-ROM drive
◦ Tape drive
◦ Scanner

. Controller chip on logic board or plug-in

. Controller is connected by cable to internal or peripheral devices

. Devices are daisy-chained

. Device ID is set by hardware switches

Page 39: io

SCSI-1: AN ASYNCHRONOUS BUS (2)

• SCSI-1 bus configuration

. Peripheral SCSI-1 devices are connected by cable

. Each bit of a data byte is transferred on a separate wire (line) of the cable

. Each device must have a unique ID number between 0 and 7
◦ The ID is signaled by asserting one of the lines DB(0) – DB(7)
◦ In case of contention, the device with the highest ID wins
◦ The logic board has ID 7, so it always wins
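
The "highest ID wins" rule amounts to finding the highest asserted data line. A minimal sketch, with the function name assumed for illustration:

```python
def scsi_arbitrate(contending_ids):
    """Return the winning SCSI ID among contenders, or None if there are none."""
    data_bus = 0
    for dev_id in contending_ids:
        if not 0 <= dev_id <= 7:
            raise ValueError("SCSI-1 IDs must be between 0 and 7")
        data_bus |= 1 << dev_id  # each contender asserts its own line DB(id)
    if data_bus == 0:
        return None
    return data_bus.bit_length() - 1  # highest asserted line wins


if __name__ == "__main__":
    print(scsi_arbitrate([2, 5, 7]))  # the logic board (ID 7) wins
```

Since every device can see all eight lines, each one can decide for itself whether it has lost, without a central arbiter.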

Page 40: io

http://scitexdv.com/SCSI2/

SCSI ID BITS

Page 41: io

SCSI-1: AN ASYNCHRONOUS BUS (3)

• SCSI signaling sequence for data transfer

. Controller broadcasts SEL (select) signal on pin 44 and the ID number on one of the data lines

. Device selected responds with ACK (acknowledge) signal on pin 38 (handshake)

. Controller sends REQ (request) signal on pin 48 to order device to perform a task (such as transferring a data byte)

. Command bytes are transferred on the data bus

. A handshake must take place for each data byte transferred

Page 42: io

SCSI Bus Signals

Signal    Driven By         Signal Explanation
DB0–DB7   Initiator/Target  8-Bit Bidirectional Data Bus.
DBP       Initiator/Target  Data-Bus Parity Line. Optional.
ATN       Initiator         Attention. Used to send a message to the target when it controls the bus.
BSY       Initiator/Target  Busy. Indicates that the bus is unavailable for use.
ACK       Initiator         Acknowledge. Used by the initiator for handshaking.
RST       Any Device        Reset. Used to initiate a bus-free phase.
MSG       Target            Driven by the target to indicate that the current transfer is a message.
SEL       Initiator         Select. Used by the initiator to select a target before command execution. Also used by the target to reconnect when the reselection phase is implemented.
C/D       Target            Control/Data. Used during the information transfer phases to transfer commands, status, data or messages over the bus.
REQ       Target            Request. Used by the target during information transfer phases.
I/O       Target            Input/Output. Determines the direction of the transfer.

Page 43: io

Phase Sequences of the SCSI Bus

[Figure: phase sequence diagram with the phases ARBITRATION (optional), SELECTION, RESELECTION (optional), COMMAND, DATA, STATUS, MESSAGE (optional) and BUS FREE]

Page 44: io

SCSI Information Transfer Phases

Signal
SEL  BSY  MSG  C/D  I/O  Direction    Phase
0    1    0    0    0    To Target    Data Out
0    1    0    0    1    From Target  Data In
0    1    0    1    0    To Target    Command
0    1    0    1    1    From Target  Status
0    1    1    0    0    —            Reserved
0    1    1    0    1    —            Reserved
0    1    1    1    0    To Target    Message Out
0    1    1    1    1    From Target  Message In
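
The phase table is a three-bit decode of MSG, C/D and I/O, valid while SEL = 0 and BSY = 1. A small lookup sketch (the names `PHASES` and `transfer_phase` are assumptions for illustration):

```python
# (MSG, C/D, I/O) -> (direction, phase), copied from the table above
PHASES = {
    (0, 0, 0): ("To Target", "Data Out"),
    (0, 0, 1): ("From Target", "Data In"),
    (0, 1, 0): ("To Target", "Command"),
    (0, 1, 1): ("From Target", "Status"),
    (1, 0, 0): (None, "Reserved"),
    (1, 0, 1): (None, "Reserved"),
    (1, 1, 0): ("To Target", "Message Out"),
    (1, 1, 1): ("From Target", "Message In"),
}


def transfer_phase(sel: int, bsy: int, msg: int, cd: int, io: int):
    """Decode an information transfer phase, or None outside such a phase."""
    if (sel, bsy) != (0, 1):
        return None  # the table only covers SEL = 0, BSY = 1
    return PHASES[(msg, cd, io)]


if __name__ == "__main__":
    print(transfer_phase(0, 1, 0, 1, 0))
```

Note that in every non-reserved row the I/O bit alone gives the direction: 0 is toward the target, 1 is from the target.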

Page 45: io

http://homebrew.cs.ubc.ca/415/project-submissions/group9/notes/scsi-2.html

SCSI BUS TOPOLOGY

Page 46: io

I/O AS SYNCHRONIZATION OF DATA TRANSFERS

• Fundamental problems of communication between devices, or between the CPU and peripheral devices:

. Detection that a data transfer is necessary
◦ Dedicated polling
◦ Interrupts
◦ Periodic polling

. Synchronization of two devices, or a device and a CPU, with different speeds
◦ Wait state insertion
◦ DMA
◦ Dual-ported memory
◦ FIFO buffers
◦ Caches

Page 47: io

POLLED I/O

[Figure: flowchart of a polling loop. Is data ready? If no, wait and check again; if yes, read data. Done? If no, return to the data-ready check; if yes, exit]

A polling loop is not an efficient way to use a CPU unless the device is very fast. If the device is fast, then “data ready” checks can be interspersed among useful instructions.

In most cases it is more efficient for the I/O device to tell the CPU when data is ready, or when a transfer is complete, than for the CPU to check the device frequently. An I/O device can use interrupts to tell the CPU that a data transfer should be started, or is finished.

Page 48: io

DEDICATED vs. PERIODIC POLLING

• Periodic polling means that the CPU periodically interrogates the I/O device (e.g., via an oscillator–counter–decoder combination) to see whether data is ready

• Dedicated polling (spin waiting) means that the I/O device controller sets or clears bits in a status register that is read in a tight loop by the CPU

. When a system call for keyboard input is issued, and dedicated polling is in use, the CPU executes code somewhat like this:

    get_loop: lw   $a0, Device_Status   # read the device status word
              bgez $a0, get_loop        # sign bit clear (not ready): keep polling
              lb   $2, Device_Data      # ready: load the data byte into $2 ($v0)
              rfe                       # restore the pre-exception Status bits

. This operation transfers only a single byte; data may be missed

. A different approach is necessary for block transfers

Page 49: io

INTERRUPT-DRIVEN I/O (1)

• An interrupt is an event that occurs outside the execution cycle and that causes processing of the current thread to stop

. Interrupts can be used to give I/O devices a means to signal the CPU that an event has occurred that requires action by the CPU (data is ready, etc.)

. An interrupt causes an exception, which results in a jump to the appropriate exception handling code (MIPS: address 0x80000080)

. There are (at least) two principal methods for detecting interrupts in hardware:
◦ Connect the interrupt request output of an I/O device to one of the inputs of an interrupt controller
¦ Interrupts may be level-triggered or edge-triggered
◦ Connect one interrupt line to an OR of inputs from several devices that are periodically strobed for data ready
¦ Device that caused the interrupt can be detected by reading a status word formed from inputs from the devices

Page 50: io

INTERRUPT-DRIVEN I/O (2)

• On a RISC machine, an interrupt causes a jump to the general exception handling code (with a few special cases such as Reset and UTLB Miss)

. Method of P&H Chapter 5: Execution is suspended immediately
◦ This method is required for some exceptions (TLB miss, page fault) unless execution can be undone
◦ Restarting is hard in ISAs where memory is accessed at multiple times during execution of an instruction

. Method of choice: The instruction that caused the exception is allowed to finish; subsequent instructions are suspended

. Pending interrupts must be handled before next instruction is fetched

. The exception handler determines the code to execute, based on the Cause register contents

. The operating system determines what state needs to be saved (if any) besides the EPC and Cause registers

Page 51: io

INTERRUPT-DRIVEN I/O (3)

• MIPS R2000 interrupt handler:

. Saves $a0 and $v0 in special locations
◦ save0 is at address 0x90000250; save1 is at address 0x90000254
◦ $a0 and $v0 can’t be pushed onto the stack, because the cause of the exception may be a bad stack pointer!

. Copies coprocessor 0 Cause and EPC registers into $k0 and $k1

. Pushes current Kernel/User mode and Interrupt Enable Mode bits onto the stack in the Status register (see next slide)

. The kernel’s exception handler uses a jump table (or a sequence of beq’s) to determine the right code to execute (see SPIM kernel text)

. The operating system clears the interrupts, if any

. After executing an rfe instruction, the processor may restart execution at the address in the EPC

Page 52: io

MIPS R2000 STATUS REGISTER

[Figure: Status register layout. Bits 31–28: CU; bit 22: BEV, followed by TS, PE, CM, PZ, SwC and IsC down to bit 16; bits 15–8: interrupt mask; bits 5–0: three pairs of kernel/user and interrupt enable bits (old in bits 5–4, previous in bits 3–2, current in bits 1–0)]

Stack for kernel/user and interrupt enable bits lets processor respond to two levels of exceptions before software must save the Status register

MIPS R2000 CAUSE REGISTER

[Figure: Cause register layout. Bits 15–10: pending interrupts; bits 5–2: exception code (ExcCode)]

Page 53: io

EXCEPTION CODES IN THE MIPS R2000 ISA

ExcCode  Name  Description
0        Int   External interrupt
1        MOD   TLB modification exception
2        TLBL  TLB miss exception (Load or instruction fetch)
3        TLBS  TLB miss exception (Store)
4        AdEL  Address error exception (Load or instruction fetch)
5        AdES  Address error exception (Store)
6        IBE   Instruction fetch bus error exception
7        DBE   Data load or store bus error exception
8        Sys   System call exception
9        Bp    Breakpoint exception
10       RI    Reserved or undefined instruction exception
11       CpU   Coprocessor unusable exception
12       Ovf   Arithmetic overflow exception
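
An exception handler's first job is to pull ExcCode out of the Cause register and look it up. The sketch below assumes the field positions drawn on the Cause register slide (ExcCode in bits 5–2, pending interrupts in bits 15–10); the function and table names are illustrative.

```python
EXC_CODES = {
    0: "Int", 1: "MOD", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU", 12: "Ovf",
}


def decode_cause(cause: int):
    """Return (exception name, pending-interrupt bits) from a Cause value."""
    exc_code = (cause >> 2) & 0xF   # assumed field: bits 5-2
    pending = (cause >> 10) & 0x3F  # assumed field: bits 15-10
    return EXC_CODES.get(exc_code, "unknown"), pending


if __name__ == "__main__":
    print(decode_cause(8 << 2))  # a system call with no interrupts pending
```

A kernel would use the decoded code as an index into a jump table, as described for the R2000 handler.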

Page 54: io

SINGLE- AND MULTIPLE-LINE INTERRUPT SYSTEMS

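[Figure: single-line interrupt system, in which one interrupt request line sets an interrupt flip-flop in the CPU; multiple-line interrupt system, in which interrupt request lines numbered 0 to 3 set bits of an interrupt register in the CPU]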

Page 55: io

VECTORED INTERRUPT SYSTEM

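[Figure: vectored interrupt system. Interrupt request lines feed an interrupt register, which is combined with an interrupt mask register and passed to a priority encoder; the encoder sends the interrupt number to the CPU, with signals labeled "Input active" and "Interrupt pending"]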
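
The mask-then-encode path in the figure can be modeled with bit operations. Two assumptions are made here that the slide does not state: a mask bit of 1 enables a line, and the highest-numbered line has the highest priority. The function name is illustrative.

```python
def highest_pending(interrupt_register: int, mask_register: int,
                    n_lines: int = 8):
    """Return the interrupt number sent to the CPU, or None if none pending."""
    # Keep only requests whose mask bit is set (assumed: 1 = enabled)
    active = interrupt_register & mask_register & ((1 << n_lines) - 1)
    if active == 0:
        return None  # "interrupt pending" would stay deasserted
    # Priority encoder: assumed highest-numbered active line wins
    return active.bit_length() - 1


if __name__ == "__main__":
    print(highest_pending(0b0010_0110, 0b0000_1111))  # line 5 is masked off
```

Masking line 5 in the example lets line 2 through even though a higher-numbered request is latched.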

Page 56: io

INTERRUPT-DRIVEN I/O (4)

• In the Motorola 68000 series, the CPU checks for pending interrupts after execution of each instruction

. CPU saves status register (SR) and enters supervisor mode

. After determining the interrupt number N, the CPU saves state information and executes M[4N] → PC, causing a branch to the text at the location pointed to by M[4N]

Page 57: io

VECTORED INTERRUPTS IN THE IBM PC

[Figure: two 8259A interrupt controllers. One takes IRQ0–IRQ7 and the other IRQ8–IRQ15; each has INT, INTA, RD, WR, A0 and CS signals and data lines D0–D7, with the connection to the CPU and a +5 V input marked]

Page 58: io

MEMORY-MAPPED I/O

• Instead of having multiple address spaces for memory, I/O, etc., have a single address space

. Loading from a memory location that is mapped to an I/O device reads a data byte or word from the device

. Storing to a memory location that is mapped to an I/O device writes a data byte or word to the device

. Used in Motorola 68000 series

• In order to synchronize I/O properly, additional memory locations may be mapped to status words for the I/O devices

Page 59: io

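SPIM’s MEMORY-MAPPED I/O REGISTERS

[Figure: four memory-mapped registers. Receiver control (0xffff0000): 1-bit Interrupt enable and 1-bit Ready fields, remaining bits unused. Receiver data (0xffff0004): 8-bit received byte, remaining bits unused. Transmitter control (0xffff0008): 1-bit Interrupt enable and 1-bit Ready fields, remaining bits unused. Transmitter data (0xffff000c): 8-bit transmitted byte, remaining bits unused]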
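
Reading a device through memory-mapped registers is just a load from the right address, preceded by a check of the status word. The sketch below simulates the receiver side in Python. The two addresses come from the figure; the `SimulatedConsole` class, the helper `read_bytes`, and the choice of bit 0 for Ready are assumptions made for illustration.

```python
RECEIVER_CONTROL = 0xFFFF0000
RECEIVER_DATA = 0xFFFF0004
READY_BIT = 0x1  # assumed position of the Ready field


class SimulatedConsole:
    """Hypothetical device model: loads from mapped addresses read the device."""

    def __init__(self, incoming: bytes):
        self._incoming = list(incoming)

    def load(self, address: int) -> int:
        if address == RECEIVER_CONTROL:
            return READY_BIT if self._incoming else 0
        if address == RECEIVER_DATA:
            # Reading the data register consumes the byte
            return self._incoming.pop(0) if self._incoming else 0
        raise ValueError(f"unmapped address {address:#x}")


def read_bytes(device, count: int, max_polls: int = 1000) -> bytes:
    """Poll the Ready bit, then load each byte from the data register."""
    out = bytearray()
    for _ in range(max_polls):
        if len(out) == count:
            return bytes(out)
        if device.load(RECEIVER_CONTROL) & READY_BIT:
            out.append(device.load(RECEIVER_DATA))
    if len(out) == count:
        return bytes(out)
    raise TimeoutError("device never became ready")


if __name__ == "__main__":
    print(read_bytes(SimulatedConsole(b"hi"), 2))
```

The status word at the control address is the "additional memory location" that keeps the CPU from reading the data register before a byte has arrived.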

Page 60: io

NETWORK INTERFACE CARD

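[Figure: network interface card. A communication controller (framing, bus interface) sits between the bus interface and an Ethernet interface adapter (signaling), which connects to the jack. Signals shown between the controller and the adapter: TCLK, TE, TXD, CD, RXD, COL]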

Page 61: io

I/O PROCESSORS

• An I/O processor (IOP) is a processor with (usually) a more restricted instruction set than the CPU

. Purpose: Offload I/O processing from the CPU
◦ Used in CDC 6600, IBM S/360–370, ...

. I/O instructions executed by an IOP are called channel command words in the IBM world

. A CPU and its IOPs are really a shared-memory multiprocessor

Page 62: io

RELATION OF I/O TO PROCESSOR ARCHITECTURE

• I/O instructions and buses have disappeared

• Interrupt vectors have been replaced by jump tables

• Interrupt stack replaced by shadow registers

. Handler saves registers and re-enables higher-priority interrupts

• Interrupt types reduced in number

. Handler must query interrupt controller

• Caches cause problems for I/O

. Flushing degrades performance heavily

. Solution: “snooping” (borrowed from shared-memory multiprocessors)

• Virtual memory frustrates DMA

• Load-store architecture inconsistent with atomic I/O operations

• Stateful processors hard to context switch