Systems Architecture, Fifth Edition
Chapter Goals
• Describe the system bus and bus protocol
• Describe how the CPU and bus interact with peripheral devices
• Describe the purpose and function of device controllers
• Describe how interrupt processing coordinates the CPU with secondary storage and I/O devices
Chapter Goals (continued)
• Describe how buffers, caches, and data compression improve computer system performance
System Bus
• Connects CPU with main memory and peripheral devices
• Set of data lines, control lines, and status lines
• Bus protocol
– Number and use of lines
– Procedures for controlling access to the bus
• Subsets of bus lines: data bus, address bus, control bus
Bus Clock and Data Transfer Rate
• Bus clock pulse
– Common timing reference for all attached devices
– Frequency measured in MHz
• Bus cycle
– Time interval from one clock pulse to the next
• Data transfer rate
– Measure of communication capacity
– Bus capacity = data transfer unit × clock rate
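As a rough illustration of the bus capacity formula, the sketch below multiplies the data transfer unit by the clock rate. The function name and the two sample buses are assumptions chosen for illustration, and the result is a theoretical peak that assumes one transfer on every bus cycle.

```python
def bus_capacity(width_bits: int, clock_hz: int) -> float:
    """Peak bus capacity in bytes per second.

    Assumes one data transfer unit moves on every bus cycle, which is
    an idealization; real buses lose cycles to protocol overhead.
    """
    bytes_per_transfer = width_bits / 8   # data transfer unit, in bytes
    return bytes_per_transfer * clock_hz  # unit x clock rate


if __name__ == "__main__":
    # Hypothetical example buses, chosen only for illustration.
    for width, clock in [(32, 33_000_000), (64, 66_000_000)]:
        mb_per_s = bus_capacity(width, clock) / 1_000_000
        print(f"{width}-bit bus at {clock // 1_000_000} MHz: {mb_per_s:.0f} MB/s")
```

Doubling either the width or the clock rate doubles the peak capacity, which is why wider and faster variants of the same bus appear side by side in later slides.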
Bus Protocol
• Governs format, content, timing of data, memory addresses, and control messages sent across bus
• We can’t let two devices put data on the bus at the same time. So we need access control.
• Approaches for access control
– Master-slave approach (traditional): CPU is bus master and all other devices are slaves
Bus Protocol
• Approaches for access control (transferring data without CPU):
– Direct memory access (DMA): DMA controller gets data from device and stores it in RAM
– Peer-to-peer buses: any device can become master via bus arbitration protocol
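One simple way to picture bus arbitration is a fixed-priority arbiter: when several devices request the bus in the same cycle, exactly one is granted mastership. The sketch below is a toy model under that assumption. The device names and the priority order are hypothetical, and real arbitration protocols differ in detail.

```python
from typing import Optional, Sequence, Set


def grant_bus(requests: Set[str], priority: Sequence[str]) -> Optional[str]:
    """Return the single device granted the bus this cycle, or None.

    `priority` lists device names from highest to lowest priority.
    Only one device may drive the bus at a time, so at most one
    requester is granted mastership per cycle.
    """
    for device in priority:
        if device in requests:
            return device
    return None


if __name__ == "__main__":
    order = ["dma_controller", "disk", "network"]  # hypothetical devices
    print(grant_bus({"network", "disk"}, order))   # higher-priority requester wins
    print(grant_bus(set(), order))                 # nobody asked for the bus
```

A fixed order is easy to reason about but can starve low-priority devices; rotating the priority list after each grant is one common alternative.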
Local Bus vs External Bus
• Traditionally, the local bus connects the CPU to cache, RAM, and other internal devices
• External bus connects the main processing unit to I/O devices
• The difference between local and external buses is getting fuzzy: new bus protocols can support both
Parallel vs Serial Bus
• Parallel bus is older technology in which the bus is a collection of wires that devices “plug into”
• Serial bus interconnects one device after another and creates a daisy-chain of devices
• Timing skew has become a problem with parallel bus design
Serial Bus vs Parallel Bus
[Figure: a parallel bus layout compared with a serial (daisy-chain) layout]
Example System Buses
• IBM PC Bus - 8-bit data, 20-bit address; used in all early IBM PCs and clones.
• PC-AT bus (ISA) - Compatible with PC bus, but has a second strip of connectors with an additional 36 lines. These lines give a 16-bit data bus for the 80286 chip.
• VESA Local bus (VL-bus or VLB) - Found alongside the ISA bus in PCs; acted as a high-speed bus for DMA and memory-mapped I/O; aka Very Long Bus!
Example System Buses
• IBM Microchannel - Bus for IBM PS/2 computer; closed architecture with high licensing costs
• EISA (Extended Industry Standard Architecture) - Several non-IBM companies reacted to Microchannel by designing EISA. Provides a 32-bit data bus.
VME Bus
• Used in SGI systems
• Begun by Motorola, became an IEEE standard (IEEE P1014)
• 32-bit bus, asynchronous design (see next slide)
• No circuitry on motherboard
• Hundreds of companies design boards for VME; 300-page set of VME definitions; very stable
• Bus lines provide automatic self-testing and status reporting
• Now VME64 with a 64-bit bus
PCI Bus
• Peripheral Component Interconnect (PCI) - used in PC and Mac systems
• Well defined and fast
• Is the local bus in a machine with other buses
• Intel based; CPU bus and peripherals plug directly into PCI bus
• Allows devices to talk to each other without CPU intervention
[Figure: Older architecture]
[Figure: Newer architecture. Labels: CPU; cache; backside bus; front side bus (system bus); northbridge chip; RAM; video card; southbridge chip; PCI bus, real-time clock, USB, power management, other devices]
PCI Bus
• Plug-in boards have software settings, not DIP switches
• 532 MB/s peak transfer speed (PCI v.3.0)
• Synchronous bus (see figure on next slide)
• Initiator and target design (master/slave)
• Address and data lines multiplexed
PCI Bus
• OS queries all PCI buses at boot time to find out what devices are present and what system resources (interrupt lines, memory, etc.) each needs. It then allocates the resources and tells each device what its allocation is.
• Each device can request up to six areas of memory space or I/O port space
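The boot-time allocation described above can be pictured with a toy allocator. This is only a sketch under stated assumptions, not how any particular operating system or firmware does it: the device names, the starting base address, and the rule that each region is a power-of-two size placed on a boundary that is a multiple of its size are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

MAX_REGIONS = 6  # each device may request up to six memory or I/O regions


def allocate_regions(
    requests: Dict[str, List[int]], base: int = 0x1000_0000
) -> Dict[str, List[Tuple[int, int]]]:
    """Assign a (start, size) address range to every requested region.

    `requests` maps a device name to the region sizes it asks for.
    Sizes are assumed to be powers of two, and each region is placed
    on a boundary that is a multiple of its size.
    """
    allocation: Dict[str, List[Tuple[int, int]]] = {}
    next_free = base
    for device, sizes in requests.items():
        if len(sizes) > MAX_REGIONS:
            raise ValueError(f"{device} requests more than {MAX_REGIONS} regions")
        allocation[device] = []
        for size in sizes:
            start = (next_free + size - 1) // size * size  # round up to alignment
            allocation[device].append((start, size))
            next_free = start + size
    return allocation


if __name__ == "__main__":
    # Hypothetical devices discovered at boot and the regions they request.
    found = {"nic": [0x1000, 0x100], "gpu": [0x10000]}
    for name, regions in allocate_regions(found).items():
        print(name, [(hex(start), hex(size)) for start, size in regions])
```

The returned mapping plays the role of "telling each device what its allocation is"; no two regions overlap because each one starts at or after the end of the previous one.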
PCI Versions
• 32-bit, 33 MHz (5 V, added in Rev. 2.0)
• 64-bit, 33 MHz (5 V, added in Rev. 2.0)
• 32-bit, 66 MHz (3.3 V only, added in Rev. 2.1)
• 64-bit, 66 MHz (3.3 V only, added in Rev. 2.1)
PCI-X
• PCI-extended
• Twice as fast as PCI: 1.06 GB/s
• Designed for servers to support Gigabit Ethernet cards, Fibre Channel, and Ultra320 SCSI controllers
• PCI-X is backwards compatible with older PCI standards (except the 5 V ones)
• PCI-X only runs as fast as the slowest device
• In 2003 PCI SIG ratified PCI-X 2.0, which added 266 MHz and 533 MHz options, or roughly 2.15 GB/s and 4.3 GB/s throughput (but losing ground to PCIe)
• PCI-X 3.0 in development, but how far will it get given the popularity of PCIe?
PCI-Express
• PCIe or PCI-E
• Not the same as PCI. PCI is a parallel bus, whereas PCIe is a serial bus (like USB)
• Hub on motherboard acts as crossbar switch allowing multiple simultaneous full-duplex connections
• Serial format starting to win out over parallel format due in part to timing skew
• PCIe is a layered protocol, consisting of a Transaction Layer, a Data Link Layer, and a Physical Layer (fairly complex, like USB)
[Figure, from Wikipedia: from top to bottom, PCIe x4, x16, x1, x16, and an older PCI connector]
A PCIe card will fit in any slot that is at least wide enough
SCSI (Small Computer System Interface)
• Family of standard buses designed primarily for secondary storage devices
• Most often used for disk drives but can interface pretty much any device
• Implements both a low-level physical I/O protocol and a high-level logical device control protocol
SCSI Interfaces – Parallel
• Still common is the older parallel SCSI (aka SPI)
• Popular forms include
– SCSI-1
– Fast SCSI
– Fast-Wide SCSI
– Ultra Wide SCSI
• See handout on parallel SCSI specs
SCSI Interfaces – Serial
• Serial SCSI: modern addition to SCSI system
• Faster data rates, hot swapping, and improved fault isolation are among the advantages of serial SCSI
• Once again, the clock skew issue of high-speed parallel interfaces is driving the change from parallel to serial
SCSI Interfaces – iSCSI
• SCSI command set stays the same; it's just that the physical specifications essentially no longer exist
• Physical specs are TCP/IP
• SCSI-3 implemented over a network
• iSCSI competing with Fibre Channel
• Many felt iSCSI would not be as fast as Fibre Channel due to TCP/IP overhead, but now systems are using TCP Offload Engine and 10G Ethernet
Desirable Characteristics of a SCSI Bus
• Non-proprietary standard
• High data transfer rate
• Peer-to-peer capability
• High-level (logical) data access commands
• Multiple command execution
• Interleaved command execution
• But typically quite a bit more expensive.
I/O Ports
• I/O ports are the pathways between the CPU and a peripheral device
• Logical and Physical Access
– Usually a memory address that can be read/written by the CPU and a single peripheral device
– Also a logical abstraction that enables CPU and bus to interact with each peripheral device as if the device were a storage device with linear address space
Physical access: System bus is usually physically implemented on a large printed circuit board with attachment points for devices.
Logical access: The device, or its controller, translates linear sector address into corresponding physical sector location on a specific track and platter.
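The translation from a linear sector address to a physical location can be sketched with integer division. The geometry below (number of surfaces and sectors per track) and all names are hypothetical, and the sketch assumes every track holds the same number of sectors, which real drives generally do not.

```python
from typing import NamedTuple


class PhysicalAddress(NamedTuple):
    surface: int  # which platter surface (read/write head) to use
    track: int    # track number on that surface
    sector: int   # sector position within the track (0-based)


def to_physical(linear_sector: int, surfaces: int, sectors_per_track: int) -> PhysicalAddress:
    """Map a linear sector address onto surface, track, and sector.

    Sectors are numbered so that all surfaces of one track position
    are filled before the heads move to the next track.
    """
    track, remainder = divmod(linear_sector, surfaces * sectors_per_track)
    surface, sector = divmod(remainder, sectors_per_track)
    return PhysicalAddress(surface, track, sector)


if __name__ == "__main__":
    # Hypothetical drive: 4 surfaces, 10 sectors per track.
    for linear in (0, 10, 40, 57):
        print(linear, to_physical(linear, surfaces=4, sectors_per_track=10))
```

Because the device or its controller does this arithmetic, the CPU and bus only ever deal with the linear address.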
Device Controllers
• Implement the bus interface and access protocols
• Translate logical addresses into physical addresses
• Enable several devices to share access to a bus connection
Mainframe Channels
• Advanced type of device controller used in mainframe computers
• Compared with device controllers:
– Greater data transfer capacity
– Larger maximum number of attached peripheral devices
– Greater variability in types of devices that can be controlled
Interrupt Processing
• Used by application programs to coordinate data transfers to/from peripherals, notify CPU of errors, and call operating system service programs
• When an interrupt is detected, the executing program is suspended; the CPU pushes current register values onto the stack and transfers control to an interrupt handler
• When the interrupt handler finishes executing, the stack is popped and the suspended process resumes from the point of interruption
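The push, handle, pop sequence above can be mimicked in a few lines. This is a toy model, not real CPU behavior: the `ToyCPU` class, its two registers, and the `keyboard_handler` function are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List


class ToyCPU:
    """A toy model of a CPU with a register file and a stack."""

    def __init__(self) -> None:
        self.registers: Dict[str, int] = {"pc": 0, "acc": 0}
        self.stack: List[Dict[str, int]] = []

    def interrupt(self, handler: Callable[["ToyCPU"], None]) -> None:
        # Suspend the running program: push current register values.
        self.stack.append(dict(self.registers))
        # Transfer control to the interrupt handler, which may freely
        # overwrite the registers.
        handler(self)
        # Handler finished: pop the stack and resume where we left off.
        self.registers = self.stack.pop()


def keyboard_handler(cpu: ToyCPU) -> None:
    """Hypothetical handler that clobbers registers while it works."""
    cpu.registers["pc"] = 9000
    cpu.registers["acc"] = 42


if __name__ == "__main__":
    cpu = ToyCPU()
    cpu.registers.update(pc=120, acc=7)  # state of the interrupted program
    cpu.interrupt(keyboard_handler)
    print(cpu.registers)                 # same values as before the interrupt
```

Saving the registers is what lets the interrupted program resume as if nothing had happened, even though the handler used the same registers in between.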
Interrupt Processing
• Secondary storage and I/O devices are much slower than RAM, ROM, cache memory, and the CPU (see table on next slide)
• When the CPU asks for data from an I/O device, what should the CPU do?
– Sit in a wait cycle?
– Go do something else?
Interrupt Processing
Multiple Types of Interrupts
• Categories of interrupts
– I/O event
– Error condition
– Service request
– Processor to processor communication
• Can one interrupt be interrupted by another type of interrupt?
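One common answer to the question above is a priority scheme: an incoming interrupt suspends the handler that is currently running only if it has a higher priority. The sketch below illustrates that policy. The numeric priority levels and their ordering are assumptions made purely for illustration, not values taken from the slides.

```python
from typing import Optional

# Hypothetical priority levels (higher number = more urgent); the ordering
# is an assumption made for illustration only.
PRIORITY = {
    "service_request": 1,
    "io_event": 2,
    "processor_to_processor": 3,
    "error_condition": 4,
}


def should_preempt(running: Optional[str], incoming: str) -> bool:
    """Decide whether `incoming` may interrupt the handler for `running`.

    `running` is None when no interrupt handler is currently executing.
    An interrupt of equal or lower priority has to wait its turn.
    """
    if running is None:
        return True
    return PRIORITY[incoming] > PRIORITY[running]


if __name__ == "__main__":
    print(should_preempt("io_event", "error_condition"))  # more urgent, so it preempts
    print(should_preempt("error_condition", "io_event"))  # less urgent, so it waits
```

When preemption is allowed, the register-saving step simply repeats, so handlers nest on the stack in the order they were interrupted.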
Buffers and Caches
• Improve overall computer system performance by employing RAM to overcome mismatches in data transfer rate and data transfer unit size
Buffers
• Small storage areas (usually DRAM or SRAM) that hold data in transit from one device to another
• Use interrupts to enable devices with different data transfer rates and unit sizes to efficiently coordinate data transfer
• Buffer overflow
Classic example of a buffer: a print buffer
Computer system performance improves dramatically with larger buffer.
Assumes a 32-bit bus.
2 interrupts each time we fill up the buffer.
Buffer will be filled 64 KB / buffer size times.
Sum of bus transfers and bus interrupts.
Assumes 100 CPU cycles to handle an interrupt.
Diminishing Returns
• When multiple resources are required to produce something useful, adding more and more of a single resource produces fewer and fewer benefits
• Applicable to buffer size
Law of diminishing returns affects both bus and CPU performance
Similar chart to the last one, but now the amount to transfer is 64 B instead of 64 KB.
Note how improvement stops once the buffer size equals the transfer amount.
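The buffer-size charts are not reproduced in this text, but their arithmetic can be approximated from the slide annotations: a 32-bit bus, 2 interrupts per buffer fill, and 100 CPU cycles per interrupt. The sketch below is a reconstruction under those assumptions, plus the further assumption that each interrupt costs one bus cycle; the function and key names are hypothetical.

```python
import math

BUS_WIDTH_BYTES = 4             # 32-bit bus
INTERRUPTS_PER_FILL = 2         # two interrupts each time the buffer fills
CPU_CYCLES_PER_INTERRUPT = 100  # cost of handling one interrupt


def transfer_cost(total_bytes: int, buffer_bytes: int) -> dict:
    """Estimate bus and CPU cost of moving `total_bytes` through a buffer."""
    bus_transfers = math.ceil(total_bytes / BUS_WIDTH_BYTES)
    fills = math.ceil(total_bytes / buffer_bytes)  # times the buffer fills
    interrupts = INTERRUPTS_PER_FILL * fills
    return {
        "bus_transfers": bus_transfers,
        "interrupts": interrupts,
        # Assumes each interrupt occupies the bus for one cycle.
        "total_bus_cycles": bus_transfers + interrupts,
        "cpu_cycles": interrupts * CPU_CYCLES_PER_INTERRUPT,
    }


if __name__ == "__main__":
    for size in (4, 64, 1024, 64 * 1024):
        print(size, transfer_cost(64 * 1024, size))
```

Calling `transfer_cost(64, 64)` and `transfer_cost(64, 1024)` gives identical results, which is the diminishing-returns effect: once the buffer holds the whole transfer, a bigger buffer saves no further interrupts.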
Cache
• Differs from buffer:
– Data content not automatically removed as used
– Used for bidirectional data
– Used only for storage device accesses
– Usually much larger
– Content must be managed intelligently
• Achieves performance improvements differently for read and write accesses
Write access: Sending confirmation (2) before data is written to secondary storage device (3) can improve program performance; program can immediately proceed with other processing tasks.
Read accesses are routed to cache (1). If data is already in cache, it is accessed from there (2). If data is not in cache, it must be read from the storage device (3). Performance improvement realized only if requested data is already waiting in cache.
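The write and read paths described in the two captions above can be sketched as a toy cache in front of a slow device. Everything here is a simplified assumption: the `ToyDiskCache` class and its methods are hypothetical, the storage device is modelled as a plain dictionary, and the cache never evicts anything.

```python
class ToyDiskCache:
    """Toy cache in front of a slow storage device (modelled as a dict)."""

    def __init__(self, device: dict) -> None:
        self.device = device  # stands in for secondary storage
        self.cache = {}       # block number -> data held in RAM
        self.dirty = set()    # blocks confirmed but not yet on the device
        self.hits = 0
        self.misses = 0

    def write(self, block: int, data: bytes) -> str:
        # Store in cache and confirm immediately; the device write is deferred.
        self.cache[block] = data
        self.dirty.add(block)
        return "confirmed"

    def flush(self) -> None:
        # Deferred step: copy confirmed writes out to the storage device.
        for block in sorted(self.dirty):
            self.device[block] = self.cache[block]
        self.dirty.clear()

    def read(self, block: int) -> bytes:
        if block in self.cache:
            self.hits += 1    # fast path: data already waiting in cache
            return self.cache[block]
        self.misses += 1      # slow path: go to the storage device
        data = self.device[block]
        self.cache[block] = data
        return data
```

The window between `write` returning and `flush` running is where the speedup comes from, and also where data can be lost if power fails, which is one reason cache content "must be managed intelligently".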
Cache Controller
• Processor that manages cache content
• Guesses what data will be requested; loads it from storage device into cache before it is requested
• Can be implemented in
– A storage device controller or communication channel
– Operating system
Cache
Primary storage cache
• Can limit wait states by using SRAM cache between CPU and SDRAM primary storage
• Level one (L1): within CPU
• Level two (L2): on-chip
• Level three (L3): off-chip
Secondary storage cache
• Gives frequently accessed files higher priority for cache retention
• Uses read-ahead caching for files that are read sequentially
• Gives files opened for random access lower priority for cache retention
Intel Itanium® 2 microprocessor uses three levels of primary storage caching.
Processing Parallelism
• Increases computer system computational capacity; breaks problems into pieces and solves each piece in parallel with separate CPUs
• Techniques
– Multicore processors
– Multi-CPU architecture
– Clustering
Multicore Processors
• Include multiple CPUs and shared memory cache in a single microchip
• Typically share memory cache, memory interface, and off-chip I/O circuitry among the cores
• Reduce total transistor count and cost and provide synergistic benefits
Multi-CPU Architecture
• Employs multiple single or multicore processors sharing main memory and the system bus within a single motherboard or computer system
• Common in midrange computers, mainframe computers, and supercomputers
• Cost-effective for
– Single system that executes many different application programs and services
– Workstations
Scaling Up
• Increasing processing by using larger and more powerful computers
• Used to be most cost-effective
• Still cost-effective when maximal computer power is required and flexibility is not as important
Scaling Out
• Partitioning processing among multiple systems
• Speed of communication networks; diminished relative performance penalty
• Economies of scale have lowered costs
• Distributed organizational structures emphasize flexibility
• Improved software for managing multiprocessor configurations
High-Performance Clustering
• Connects separate computer systems with high-speed interconnections
• Used for the largest computational problems (e.g., modeling three-dimensional physical phenomena)
Partitioning the problem to match the cluster architecture ensures that most data exchange traverses high-speed paths.
Compression
• Reduces number of bits required to encode a data set or stream
• Effectively increases capacity of a communication channel or storage device
• Requires increased processing resources to implement compression/decompression algorithms while reducing resources needed for data storage and/or communication
Trading data size against CPU time
Compression Algorithms
• Vary in:
– Type(s) of data for which they are best suited
– Whether information is lost during compression
– Amount by which data is compressed
– Computational complexity
• Lossless versus lossy compression
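To make the lossless case concrete, the sketch below uses Python's standard `zlib` module, chosen here only as a convenient example of a lossless compressor. The helper name `compression_ratio` and the sample data are assumptions made for illustration; a lossy algorithm is not shown.

```python
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Compress `data` losslessly and return original size / compressed size."""
    compressed = zlib.compress(data, level)
    # Lossless: decompressing must give back exactly the original bytes.
    if zlib.decompress(compressed) != data:
        raise RuntimeError("round trip failed")
    return len(data) / len(compressed)


if __name__ == "__main__":
    repetitive = b"ABCD" * 1000  # highly repetitive data compresses well
    for level in (1, 6, 9):      # higher levels spend more CPU time
        print(level, round(compression_ratio(repetitive, level), 1))
```

The ratio depends heavily on the data: repetitive input shrinks dramatically, while input that already looks random may not shrink at all, which is why algorithms are matched to the type of data.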
Compression can be used to reduce disk storage requirements (a) or to increase communication channel capacity (b).
MPEG standards address recording and encoding formats for both images and sound.
Exploits varying sensitivity of the ear to sounds to perform lossy compression
Chip Interfacing
• You are working for Nokia on a new cellphone
• This phone will have a processor, one EPROM, one RAM, and an I/O chip to control display and keyboard
• The processor has a 16-bit address bus
• With 16 bits, you can have 65,536 bytes of storage
Chip Interfacing
• For the I/O chip, we could attach it as an I/O device, then tie the CS line on the PIO to the IORQ line on the CPU
• Or we could choose a particular address and have that address go into the CS line of the I/O chip
• The latter form is called memory-mapped I/O
Chip Interfacing
• The I/O chip needs 4 bytes of address space (3 I/O ports and 1 status register)
• The EPROM is an 8K chip so it needs 8K of address space (13 bits needed to select 8K)
• Likewise, the RAM needs 8K of address space
Chip Interfacing
• You don’t want addresses of chips to overlap, so place the devices in memory as follows:
– EPROM starts at address 0 (0000h) and is 8K (8192, or 2000h) long so ends at 1FFFh
– RAM starts at address 32K (32,768, or 8000h) and is 8K long so ends at 9FFFh
– I/O starts at address 65532 (FFFCh) and is 4 bytes long so ends at 65535 (FFFFh)
Chip Interfacing
• So, hexadecimal address ranges for each chip are:
– EPROM: 0000 – 1FFF
– RAM: 8000 – 9FFF
– I/O: FFFC – FFFF
• That would place the devices at the following binary addresses:
– EPROM: 000xxxxxxxxxxxxx
– RAM: 100xxxxxxxxxxxxx
– I/O: 11111111111111xx
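The binary patterns above amount to an address decoder: the fixed high-order bits pick the chip and the `x` bits select a location inside it. The sketch below mirrors that logic in software. The function name `select_chip` is hypothetical, and in the actual phone this decision would be made by logic gates driving each chip's ~CS line rather than by code.

```python
from typing import Optional


def select_chip(address: int) -> Optional[str]:
    """Return which chip a 16-bit address selects, or None if unmapped."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    top3 = address >> 13  # bits A15..A13; the low 13 bits address 8K
    if top3 == 0b000:
        return "EPROM"    # 0000h to 1FFFh
    if top3 == 0b100:
        return "RAM"      # 8000h to 9FFFh
    if address >> 2 == 0b11111111111111:  # A15..A2 all ones
        return "I/O"      # FFFCh to FFFFh
    return None


if __name__ == "__main__":
    for addr in (0x0000, 0x1FFF, 0x2000, 0x8000, 0x9FFF, 0xFFFC, 0xFFFF):
        print(f"{addr:04X}h -> {select_chip(addr)}")
```

Because the three patterns differ in their high-order bits, no address can select two chips at once, which is the "don't overlap" requirement stated earlier.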
Memory Allocation
EPROM: 0K to 8K-1
RAM: 32K to 40K-1
I/O: 65532 to 65535
Interface
[Figure: interface circuit connecting address lines A0 through A15 to the ~CS (chip select) inputs of the EPROM, RAM, and I/O chips]
Summary
• How the CPU uses the system bus and device controllers to communicate with secondary storage and input/output devices
• Hardware and software techniques for improving data efficiency, and thus, overall computer system performance: bus protocols, interrupt processing, buffering, caching, and compression