Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance

Dec 13, 2015

Jane Haxby
Page 1: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Chapter 1 Microcomputers and Microprocessors

Microprocessor Evolution and Performance

Page 2: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Contents

Introduction to microcomputer system

Microprocessor evolution

the INTEL processor family

Microprocessor performance

Page 3: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Introduction to Microcomputer

A microcomputer can be viewed as a machine with:
  I/O devices for input and output
  a microprocessor for processing
  memory units for storage
  buses connecting the above components

In 1970, a microcomputer was normally interpreted as a computer considerably smaller than a mini-computer, possibly using ROM for program storage

Page 4: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Basic hardware units

Input e.g. keyboard, mouse

Microprocessor e.g. 8085, 8086, mc68000 microprocessors

Memory e.g. RAM, hard disk

Output e.g. monitor, printer

Page 5: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Buses

Buses: external connections to the input/output units

Major buses:
  Address bus: address of memory locations containing instructions or data
  Data bus: contents of memory locations
  Control bus: synchronization and handshaking between components

Page 6: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

General Architecture

[Block diagram: input unit, microprocessing unit, output unit, and memory unit (primary memory and secondary memory)]

Page 7: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Processor History

Vacuum Tubes to IC’s

Page 8: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

First Generation Computers

Vacuum tube technology
  Large room, air-conditioned
  Tube lifetime: 3,000 hours

Useless machine?
  1951: first UNIVAC I (UNIVersal Automatic Computer) delivered
  1952: prediction of the presidential election by CBS
  1952: IBM Model 701 Data Processing System

Page 9: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Second Generation Computers

The Transistor Is Born (Solid-State Era)
  1948: invention of bipolar transistors
  1956: Nobel Prize in Physics: Drs. William Shockley, John Bardeen and Walter H. Brattain (Bell Labs)
  1954: Bell Labs: all-transistorized computer (TRADIC)
    800 transistors
    Much less heat
    More reliable and less costly

Page 10: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Second Generation Computers

Mainframe Computers
  1958: IBM's first transistorized computers, 7070/7090
  1959: 1401 (business-oriented model)
  Built on circuit boards mounted into rack panels, or frames
  Main frame (mainframe): the CPU portion of the computer
  Popular with business and industry

Page 11: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Third Generation Computers

Invention of IC: 1959
  Dr. Robert Noyce (Fairchild) and Jack Kilby (TI)
  Kilby: fabricating resistors, capacitors and transistors on a germanium wafer, and connecting these parts with fine gold wires
  Noyce: isolating individual components with reverse-biased diodes, and depositing an adherent metal film over the circuit, thus connecting the components
  First IC: 2-transistor multivibrator
  By the mid 1960s: memory chips with 1,000 components are common

Page 12: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Third Generation Computers

1964: IBM 360 Series (32-bit)
  The first to use IC technology
  A family of 6 compatible computers
  40 different I/O and auxiliary storage devices
  Memory capacity: 16K words to over 1MB
  16 32-bit registers
  24-bit address bus
  128-bit data bus

Page 13: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Third Generation Computers

1964: IBM 360 Series (32-bit) 375,000 computations per second

(<< 150 mips Pentium 100)

$5 billion development cost

IBM became the leading mainframe company

Page 14: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Minicomputer

1960s: Space Race between US & USSR
  IC industry boom
  A tremendous demand by scientists and engineers for an inexpensive computer that they could operate by themselves

1965: DEC PDP-8 (by Edson de Castro's group)
  Low-cost ($25,000) minicomputer
  12-bit → 16-bit PDP-11 → Supermini …

Page 15: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Microprocessors: CPU on a Chip

1968: INTEL (Integrated Electronics)
  Founded by Robert Noyce and Gordon Moore (Fairchild)
  Original goal: semiconductor memory market
  1969: customized ICs for Busicom for a calculator
  Ted Hoff and Stan Mazor: proposed a 4-bit CPU on a single chip, plus ROM and RAM chips

Page 16: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Microprocessors: CPU on a Chip

1971: 4000 Family
  By Federico Faggin
  4001: 2K ROM with 4-bit I/O port
  4002: 320-bit RAM, 4-bit output port
  4003: 10-bit serial-in parallel-out shift register
  4004: 4-bit processor

Processor-on-a-chip: Micro-processor era

Page 17: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Microprocessors: CPU on a Chip

1972: 8008, 8-bit
1974: 8080, an improved version

Page 18: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Microprocessors: CPU on a Chip

8-bit CPUs
  16-bit address (64K)
  MC6800: Motorola
  6502: MOS Technology (spin-off from Motorola)
    Apple-II, Apple DOS
  Z-80: Zilog (spin-off from Intel)
    Z-80 cards on Apple-II, CP/M

Page 19: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Microprocessors: CPU on a Chip

16-bit CPUs (late 1970s)
  8086, 80186, 80286: Intel
    PC, PC-DOS, MS-DOS, SCO-Unix
  MC68000: Motorola
  16-bit instructions
  Hardware multiply and divide
  20-bit address buses (1MB)
  Workstations: Sun3

Page 20: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Microprocessors: CPU on a Chip

32-bit CPUs
  80386, 80486: Intel
  MC68020, 68030: Motorola

64-bit CPUs
  Pentium, Pentium Pro (64-bit external data bus, 32-bit internal registers; not recognized as 64-bit CPUs in terms of internal register word length)

Page 21: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Microcomputers: Computers Based on Microprocessors

1975: MITS Altair 8800 (kit)
  $399, i8080, programmed by depositing 1s/0s via front panel switches

Other computers boom
  8080: MITS, …
  6800: SWTPC 6800, …
  Z-80: TRS-80, …
  6502: Apple I, 8K, programmed with BASIC

Steve Jobs & Steve Wozniak, millionaires from PC COM’s …

Page 22: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Personal Computers: the Open Architecture Era

1981: IBM PC
  A system board (motherboard)
  Intel 8088 processor
  16K memory
  5 expansion slots

Third-party vendors supply various I/O adapter cards
  Open architecture
  Computer with interchangeable components

Page 23: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Micro-controllers: Microcomputers on a Chip

Microcontroller: a computer on a chip
  Microprocessor, plus
  On-chip memory, plus
  Input/output ports

1995: microcontrollers outsold microprocessors 10:1
  Embedded in various equipment: thermostats, machine tools, communication, automotive, …

Evolution: getting greater I/O capabilities
  Intel: MCS-51, MCS-96, …

Page 24: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

High-Performance Processors

Supercomputers
  Aircraft design, global climate modeling, oil-bearing formation, molecular design of new drugs, financial behavior
  CDC6600, 7600: Seymour Cray
  Cray-1: 1976, the first true supercomputer
    ECL, 128 kW power consumption
    130 MFLOPS (Pentium 100: 150 MFLOPS)
    $5.1 million

Page 25: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

High-Performance Processors

Parallel Processors
  Tens of gigaflops
  Multi-processors wired by a common bus
  Each is given a portion of the problem to solve
  Hypercube: early 1980s
    Cosmic Cube, iPSC (with i860/RISC chips)
  2D rectangular mesh architecture: multiple processors at each node
    Intel: teraflops computer with 4500 nodes, each powered by 2 Pentium Pro 200

Page 26: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

RISC vs. CISC

RISC: Reduced Instruction Set Computer (1980s)
  A small number of fixed-length instructions
  Simple addressing modes
  A large number of registers
  Instructions executed in one clock cycle

Intel i860 (“Cray on a Chip”)
  82 instructions, each 32 bits long
  Four addressing modes
  32 general-purpose registers

Page 27: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

RISC vs. CISC

CISC: Complex Instruction Set Computer
  A large number of variable-length instructions
  Multiple addressing modes
  A small number of registers
  Multiple clock cycles to execute

Intel 8086
  Over 3000 instruction forms, 1-6 bytes
  9 addressing modes
  8 general-purpose registers
  Execution from 2 to 80+ cycles

Page 28: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

RISC vs. CISC

RISC
  Control unit is much simpler (simpler instructions, execution in 1 CLK)
  Faster execution with less total on-chip logic
    Chip area: 10% (vs 50% for CISC)
    More area for register file, data and instruction caches, FPU, and co-processor
  PowerPC: 32-bit, by IBM, Apple, Motorola
  Sparc: for SunMicro workstations

Page 29: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Application-Specific Processors

DSP Chips
  Mostly for analog signal processing
  ADC-DSP-DAC architecture
  Avoid processing analog signals using discrete circuits involving capacitors and inductors
  DSP: performs complex mathematical functions
    Digital filter, spectrum analysis

Page 30: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Application-Specific Processors

DSP Chip Architecture
  Separate data and program areas: Harvard Architecture
  Hardware multipliers and adders, optimized to execute in a single cycle
  Arithmetic pipelining: several instructions operated on at once
  Hardware loop control
  Multiple I/O ports for communication with other processors

Page 31: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Summary of Processor History

1940s: Vacuum tube, large and consuming large power

1950s: Transistor (1948-)

1959: First IC (second industrial revolution)

1960s: IC was popular to build CPU’s.

1971: Intel 4004 microprocessor (2300 transistors)

Start of the microprocessor age

Late 1970’s: 8080/85

Page 32: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Summary of Processor History

1980: RISC (reduced instruction set computer)

CISC (complex instruction set computer) vs. RISC

CISC family: Intel 80x86, Pentium; Motorola 68000 series

All others are RISC series.

Page 33: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Evolution of INTEL Processors

4004 (’71) to Pentium Pro (’95-)

Page 34: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

INTEL

Integrated Electronics
  1968: founded by Robert Noyce and Gordon Moore
  IA: Intel Architecture (e.g., IA-16, IA-32, IA-64)
    has become the de facto standard since the 8008 (’72)
  Evolution:
    Internal register sizes
    External bus widths
    Real, Protected, and Virtual 8086 modes

Page 35: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

4-bit Processors

4004: the first microprocessor, became available in 1971
4-bit microprocessor:
  4-bit registers & 4-bit data bus
  #transistors: 2250
  Min. feature size: 10 microns
  Address bus: 10 bits/1K
  0.06 MIPS (@ 0.108 MHz)
  No internal cache

Page 36: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8-bit Processors

8-bit microprocessors: 8008 (1972), 8080 (1974), 8085 (1976)

8080

Page 37: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086: IA standard

Became available in 1978
16-bit data bus
20-bit address bus (was 16-bit for 8080)
  memory organization: 16 segments of 64KB (1 MB limit)
Re-organized CPU into BIU (bus interface unit) and EU (execution unit)
  Allows fetch and execution simultaneously
Internal registers expanded to 16-bit
  Allows access to the low/high byte separately

Page 38: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086

Hardware multiply and divide instructions
External math co-processor
Instruction set compatible with 8080/8085
8086: defined the 80x86 architecture

Page 39: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086

Not quite successful
  16-bit data bus: requires two separate 8-bit memory banks
  Memory chips were expensive

Page 40: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8088: PC standard

Became available in 1979, almost identical to the 8086
8-bit data bus: for hardware compatibility with the 8080
16-bit internal registers and internal data bus (same as 8086)
20-bit address bus (was 16-bit for 8080)
  BIU re-designed
  memory organization: 16 segments of 64KB (1 MB limit)
Two memory accesses for 16-bit data (less efficient)
  But less cost
8088: used by the IBM PC (1981), 16K-64K, 4.77MHz

Page 41: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80186, 80188: High Integration CPU

PC system: 8088 CPU + various supporting chips
  Clock generator
  8251: serial I/O (RS232)
  8253: timer/counter
  8255: PPI (programmable peripheral interface)
  8257: DMA controller
  8259: interrupt controller

80186/80188: 8086/8088 + supporting functions
  Compatible instruction set (+ 9 new instructions)

Page 42: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286

Became available in 1982
Used in the IBM AT computer (1984)
16-bit data bus
Clock speed 25% faster than 8088, throughput 5 times greater than 8088
24-bit address bus (16 MB) (vs. 20-bit/1M for the 8086)
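The address-bus widths quoted on these slides (16, 20, 24, 32 bits) map to memory limits through a power of two. A minimal sketch of that arithmetic follows; the function names `addressable_bytes` and `human` are illustrative and not from the slides.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an address bus of this width can select."""
    return 1 << address_bits


def human(n: int) -> str:
    """Format a byte count using binary units (1 KB = 1024 bytes)."""
    for unit in ("bytes", "KB", "MB", "GB", "TB"):
        if n < 1024 or unit == "TB":
            return f"{n} {unit}"
        n //= 1024


# Bus widths taken from the slides: 8080, 8086/8088, 80286, 80386DX
for name, bits in [("8080", 16), ("8086", 20), ("80286", 24), ("80386DX", 32)]:
    print(f"{name}: {bits}-bit address bus -> {human(addressable_bytes(bits))}")
```

Each extra address line doubles the addressable space, which is why the step from 20 to 24 lines takes the limit from 1 MB to 16 MB.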

Page 43: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286: Real vs. Protected Modes

Larger address space: 24-bit address bus
Real Mode vs. Protected Mode

Real Mode:
  Power-on default mode
  Functions like an 8086: uses the 20 least significant address lines (1M)
  Software compatible with the 8086
  16 new instructions (for Protected Mode management)
  Faster 286: redesigned processor, plus higher clock rate (6-8MHz)

Page 44: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286: Real vs. Protected Modes

Protected Mode:
  Multi-program environment
  Each program has a predetermined amount of memory
  Addressed via segment selector (physical addresses invisible): 16M addressable
  Multiple programs loaded at once (within their respective segments), protected from read/write by each other

Page 45: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286: Real vs. Protected Modes

Protected Mode:
  Cannot be switched back to real mode, to avoid illegal access by switching back and forth between modes

A faster 8086 only?
  MS-DOS requires that all programs be run in Real Mode

Page 46: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Clock Speed

Electrical signals cannot change instantaneously (transition period required)

System clock provides timing signal for synchronization

Cannot be used to compare the performance of microprocessors with different instruction sets
  e.g., a 66 MHz Pentium is twice as fast as a 66 MHz 80486

Page 47: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386DX (aka. 80386)

Available in 1985, a major redesign of the 86/286
  Compatibility commitment through 2000
32-bit data and address buses (4 GB memory)
Real Address Mode: 1M visible, 286 real mode
Protected Virtual Address Mode:
  On-board MMU
  Segmented tasks of 1 byte to 4G bytes
    Segment base, limit, attributes defined by a descriptor register
  Page swapping: 4K pages, up to 64TB virtual memory space
  Windows, OS/2, Unix/Linux

Page 48: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386DX (aka. 80386)

Virtual 8086 mode (a special Protected mode feature):
  Permitted multiple 8086 virtual machines: multitasking (similar to real mode)
  Windows (multiple MSDOS’s)

Clock rate: max. 40MHz, 2 pulses per R/W bus cycle
External memory cache to avoid wait
  Fast SRAM
  93% hit rate with 64K cache

Compatible instructions (14 new)

Page 49: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386SX

80386SX: (for transition to 32-bit) 16-bit data bus/32-bit register 24-bit address bus

Page 50: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486DX

1989: a polished 386, 6 new OS-level instructions
Virtually identical to the 386 in terms of compatibility
RISC design concepts
  Fewer clock cycles per operation, a single clock cycle for the most frequently used instructions
  Max 50MHz
5-stage execution pipeline
  Portions of 5 instructions execute at once

Page 51: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486DX

Highly integrated:
  On-board 8K memory cache
  FPP (equivalent to external 80387 co-processor)

Twice as fast as the 386 at any given clock rate
  20MHz 486 ~= 40MHz 386

Page 52: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486SX

80486SX
  NOT a 16-bit version for transition purposes
  No coprocessor
  No internal cache
  For low-end applications
  Max. 33MHz only

Page 53: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486DX2/DX4: Overdrive Chips

Processor speed increased too fast
  Redesign of the microcomputer for compatibility becomes harder

Solution:
  Separate internal speed from external speed, and improve performance independently

80486DX2/DX4: internal clock twice/three times (NOT four times) the external clock: runs faster internally

Page 54: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486DX2/DX4: Overdrive Chips

System board design is independent of processor upgrade (less expensive components are allowed)

Processor operates at maximum data rate internally
  Only the slow access to external data operates at the system board rate
  Internal cache offsets the speed gap

486DX2 66: 66 internal, 33 external
486DX4 100: 100 internal, 33 external (3x)
Overdrive sockets: for upgrading 486DX/SX to 486DX2/DX4 (with overdrive socket pin-outs)

Page 55: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium: Superscalar Processor

Available in 1993
32-bit architecture
Superscalar architecture
  Scaling: scaling down the etchable feature size to increase the complexity of an IC (e.g., DRAM)
    10 microns/4004 to 0.13 microns (2001)
  Superscalar: go beyond simply scaling down
    Two instruction pipelines: each with its own ALU, address generation circuitry, data cache interface
    Execute two different instructions simultaneously

Page 56: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium: Superscalar Processor

Onboard cache
  Separate 8K data and code caches to avoid access conflicts
FPP
  Instruction pipeline: 8-stage
  Optimized floating-point functions
5x-10x FLOPS of the 486
2x performance of the 486 at any clock rate

Page 57: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium: Superscalar Processor

Compatibility with 386/486:
  Internal 32-bit registers and address bus
  Data bus expanded to 64 bits for a higher data transfer rate
  Compare the 8088 to 386SX transition

Page 58: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium: Superscalar Processor

Non-clone competition from AMD, Cyrix
Development of brand identity by Intel

Page 59: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium Pro: Two Chips in One

Became available in 1995
Superscalar of degree 3
  Can execute 3 instructions simultaneously
Optimized for 32-bit operating systems (e.g., Windows NT, OS/2 Warp)
Two separate silicon dies in the same package
  Processor: 0.35 micron, 5.5 million transistors
  256KB (/512K) Level 2 cache included on chip, 15.5 million transistors in a smaller area

Page 60: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium Pro: Two Chips in One

On-board Level 2 cache
  Simplifies system board design
  Requires less space
  Gains faster communication with the processor

Internal (level 1) cache: 8K
Pentium Pro 133 ~= 2x Pentium 66 ~= 4x 486DX2 66

Page 61: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium Pro: Dynamic Execution

Dynamic execution: reduce idle processor time by predicting instruction behavior
  Multiple Branch Prediction: looks as far as 30 instructions ahead to anticipate program branches
  Data Flow Analysis: looks at upcoming instructions and determines whether they are available for processing, depending on other instructions; determines optimal execution sequences
  Speculative Execution: executes instructions in a different order than entered; speculative results are stored until final states can be determined

Page 62: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Processor Future

What’s More from Moore’s Law?

Page 63: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Moore's Law

In 1965, Gordon Moore predicted that:

The number of components per integrated circuit would double roughly every year (the often-quoted 18-month doubling period is a later popularization)

He forecast that this trend would continue through 1975
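To get a feel for what a fixed doubling period implies, here is a small sketch that projects a transistor count forward. The function name `projected_transistors` is hypothetical. The doubling period is left as a parameter because 12, 18 and 24 months are all quoted in different places. The starting point (2300 transistors in 1971) is the 4004 figure given in the summary slide.

```python
def projected_transistors(start_count: float, start_year: int, year: int,
                          doubling_months: float = 18.0) -> float:
    """Project a transistor count assuming one doubling every `doubling_months`."""
    doublings = (year - start_year) * 12 / doubling_months
    return start_count * 2 ** doublings


# Starting from the 4004 (2300 transistors, 1971), compare doubling periods.
for months in (12, 18, 24):
    count = projected_transistors(2300, 1971, 1995, doubling_months=months)
    print(f"doubling every {months} months -> about {count:,.0f} transistors in 1995")
```

The projections diverge by orders of magnitude over 24 years, so the choice of doubling period matters far more than the starting count.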

Page 64: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Moore’s Law

Page 65: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Other Microprocessors

Motorola family from 6809 (Apple II) through 68040

PowerPC joint venture between Apple, IBM, and Motorola

RISC Processors DEC Alpha, MIPS, Sun SPARC, etc.

Page 66: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

CISC vs. RISC

CISC (Complex Instruction Set Computer)
  CISC processors have a large, versatile instruction set that supports many complex addressing modes
  Moves complexity from software to hardware

RISC (Reduced Instruction Set Computer)
  RISC processors have a small instruction set
  Moves complexity from hardware to software

Page 67: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Microprocessor Performance

Two main factors:

Response time
  the time between the start and completion of a task, also referred to as execution time

Throughput
  the total amount of work done in a given time

Page 68: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

MIPS

Million Instructions Per Second
  $\text{MIPS} = \dfrac{\text{Instruction count}}{\text{Execution time (in seconds)} \times 10^{6}}$

It specifies performance inversely to execution time

Faster machines have a higher MIPS rating
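As a rough illustration of the definition, the sketch below computes a MIPS rating from an instruction count and an execution time in seconds. The function name `mips` and the second workload are made up for the example. The first figure reuses the 375,000 computations per second quoted for the IBM 360.

```python
def mips(instruction_count: int, execution_time_s: float) -> float:
    """Millions of instructions per second for one program run."""
    if execution_time_s <= 0:
        raise ValueError("execution time must be positive")
    return instruction_count / (execution_time_s * 1e6)


# 375,000 instructions in one second (the IBM 360 figure from the slides)
print(mips(375_000, 1.0))  # 0.375

# Hypothetical run: 12 million instructions in 80 ms
print(round(mips(12_000_000, 0.08), 3))
```

Because the instruction count depends on both the program and the instruction set, the same machine gives different ratings for different programs.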

Page 69: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Some Problems of MIPS

Cannot compare computers with different instruction sets, since the instruction count will certainly differ

MIPS varies between programs on the same computer

Page 70: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

iCOMP

An index provided by Intel for comparison of performance of their 32-bit microprocessors

Based on a variety of performance components that represent integer mathematics, graphics, etc.

Combine results of a set of software application benchmarks

Page 71: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.
Page 72: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Chapter 2: Computer Codes, Programming, and Operating Systems

Number Systems
Computer Codes
Programming
Operating Systems

Page 73: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Number Systems

Decimal: Base 10
Binary: Base 2
Octal: Base 8
Hexadecimal: Base 16

Page 74: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Base Conversion: 2 ↔ 10

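Binary to decimal:
  $D = \sum_{i=0}^{n-1} b_i \times 2^i$

Decimal to binary:
  Repeated subtraction:
    $D' = \sum_{i=0}^{m-1} b_i \times 2^i = D - 2^m$ (with $b_m = 1$)
    $D \leftarrow D'$ and $m \leftarrow m'$ ($m'$: the maximum exponent such that $b_{m'} = 1$)
  Long division:
    $D' = D / 2$ with remainder $b_i$, then $D \leftarrow D'$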
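The three conversions on this slide can be sketched in Python as follows. The function names are illustrative only, and inputs are assumed to be non-negative integers or strings of '0'/'1' characters.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum b_i * 2**i, where the rightmost character is b_0."""
    n = len(bits)
    return sum(int(b) << (n - 1 - k) for k, b in enumerate(bits))


def decimal_to_binary_subtraction(d: int) -> str:
    """Repeated subtraction: subtract the largest power of two that still fits."""
    if d == 0:
        return "0"
    m = d.bit_length() - 1  # highest exponent with 2**m <= d
    bits = []
    for exp in range(m, -1, -1):
        if d >= (1 << exp):
            bits.append("1")
            d -= 1 << exp
        else:
            bits.append("0")
    return "".join(bits)


def decimal_to_binary_division(d: int) -> str:
    """Long division by 2: the remainders are the bits, least significant first."""
    if d == 0:
        return "0"
    bits = []
    while d > 0:
        d, remainder = divmod(d, 2)
        bits.append(str(remainder))
    return "".join(reversed(bits))


print(binary_to_decimal("1011"))          # 11
print(decimal_to_binary_subtraction(11))  # 1011
print(decimal_to_binary_division(11))     # 1011
```

Repeated subtraction produces the bits from the most significant end, while long division produces them from the least significant end, hence the final reversal.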

Page 75: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.
Page 76: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

MCS-51 Program Development

[Flow diagram. Tools: Editor, Assembler (X8051), Linker (Link), Symbol Converter (CVTSYM), ICE, Target. Files: Program, .ASM, .OBJ, .HEX, .SYM, .SDT]

Page 77: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Chapter 3: 80x86 Processor Architecture

8086/88
Segmented Memory
80386
80486
Pentium
Pentium Pro

Page 78: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

The 8086 and 8088

Processor Model
Programming Model

Page 79: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086: IA standard

Became available in 1978
16-bit data bus
20-bit address bus (was 16-bit for 8080)
  memory organization: 16 segments of 64KB (1 MB limit)
Re-organized CPU into BIU (bus interface unit) and EU (execution unit)
  Allows fetch and execution simultaneously
Internal registers expanded to 16-bit
  Allows access to the low/high byte separately

Page 80: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8088: PC standard

Became available in 1979, almost identical to the 8086
8-bit data bus: for hardware compatibility with the 8080
16-bit internal registers and internal data bus (same as 8086)
20-bit address bus (was 16-bit for 8080)
  BIU re-designed
  memory organization: 16 segments of 64KB (1 MB limit)
Two memory accesses for 16-bit data (less efficient)
  But less cost
8088: used by the IBM PC (1981), 16K-64K, 4.77MHz

Page 81: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80186, 80188: High Integration CPU

PC system: 8088 CPU + various supporting chips
  Clock generator
  8251: serial I/O (RS232)
  8253: timer/counter
  8255: PPI (programmable peripheral interface)
  8257: DMA controller
  8259: interrupt controller

80186/80188: 8086/8088 + supporting functions
  Compatible instruction set (+ 9 new instructions)

Page 82: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Processor Model: BIU+EU

BIU
  Memory & I/O address generation

EU
  Receives code and data from the BIU
    Not connected to the system buses
  Executes instructions
  Saves results in registers, or passes them to the BIU for memory and I/O

Page 83: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Processor Model

[Block diagram. EU: registers AH/AL, BH/BL, CH/CL, DH/DL, SP, BP, SI, DI, the ALU, and Flags. BIU: CS, DS, SS, ES, IP, Address Generation and Bus Control, and the Instruction Queue]

Page 84: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Fetch and Execution Cycle

BIU+EU allows the fetch and execution cycles to overlap
  0. System boot: the Instruction Queue is empty
  1. IP => BIU => address bus, and IP++
  2. Mem[IP-1] => InstrQ[tail++]
  3a. InstrQ[head] => EU => execution
  3b. Mem[IP++] => InstrQ[tail++] (maybe multiple instructions)
  Repeat 3a+3b (overlapped)
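The overlap in steps 3a and 3b can be illustrated with a toy model. This is a deliberately simplified sketch and not a cycle-accurate 8086 model: the queue holds whole instructions rather than bytes, every fetch and every execution takes one step, and the names `run_overlapped` and `queue_size` are made up for the example.

```python
from collections import deque


def run_overlapped(program, queue_size=6):
    """Toy BIU+EU model: in each step the EU executes the queue head
    while the BIU fetches the next instruction into the queue tail."""
    queue = deque()   # instruction queue filled by the BIU
    ip = 0            # instruction pointer (index into program)
    executed = []
    steps = 0
    while ip < len(program) or queue:
        steps += 1
        # 3a: the EU takes the instruction at the head of the queue, if any
        if queue:
            executed.append(queue.popleft())
        # 3b: in the same step, the BIU fetches the next instruction
        if ip < len(program) and len(queue) < queue_size:
            queue.append(program[ip])
            ip += 1
    return executed, steps


program = ["MOV", "ADD", "INC", "MOV"]
executed, steps = run_overlapped(program)
print(executed)                      # same order as the program
print(steps, "steps with overlap")   # one initial fetch, then one step per instruction
print(2 * len(program), "steps if each fetch and execution ran one after another")
```

In this model only the very first fetch is not hidden behind an execution, which is the point of splitting the CPU into a BIU and an EU.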

Page 85: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Waiting Conditions: Memory Access

BIU+EU: execute (almost) continuously without waiting

Waiting condition: accessing memory locations not in the queue
  BIU suspends instruction fetch
  Issues the external memory address
  Resumes instruction fetch and execution

Page 86: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Waiting Conditions: Jump

Next instruction is a jump
  Instructions in the queue are discarded
  EU waits for the next instruction at the jump location to be fetched by the BIU
  Resumes execution

Page 87: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Waiting Conditions: Long Instructions

A long instruction is being executed
  Instruction queue is full
  BIU waits
  Resumes instruction fetch after the EU pulls one or two bytes from the queue

Page 88: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

BIU: 8088 vs. 8086

The BIU is the major difference
8088:
  Data bus: 8-bit (vs. 16-bit/8086)
  Instruction queue: 4 bytes (vs. 6 bytes/8086)
Only 30% slower than the 8086
  If the queue is kept full

Page 89: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Programming Model

[Register diagram: AH/AL, BH/BL, CH/CL, DH/DL; SP, BP, SI, DI; CS, DS, SS, ES; IP; Flags H / Flags L]

Page 90: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Programming Model

Data Group:
  AX (AH+AL): Accumulator
  BX (BH+BL): Base
  CX (CH+CL): Counter
  DX (DH+DL): Data

Page 91: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Programming Model

Segment Group:
  CS: Code Segment
  DS: Data Segment
  ES: Extra Segment
  SS: Stack Segment

Segment registers: base addresses of particular segments

Page 92: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Programming Model

Pointer/Index Group:
  IP: Instruction Pointer → CS
  SI: Source Index → DS
  DI: Destination Index → ES
  SP: Stack Pointer → SS

Index registers: index (offset) or pointer relative to a base address
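The slides describe segment registers as base addresses and index/pointer registers as offsets, but do not spell out how the two combine. On the 8086 the 20-bit physical address is the 16-bit segment value shifted left by 4 bits (multiplied by 16) plus the 16-bit offset. A minimal sketch, with the function name `physical_address` chosen only for illustration:

```python
def physical_address(segment: int, offset: int) -> int:
    """8086 real-mode address: segment * 16 + offset, truncated to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# Example values are arbitrary: CS = 0x1234, IP = 0x0010
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350

# Different segment:offset pairs can name the same physical byte
print(physical_address(0x1234, 0x0010) == physical_address(0x1235, 0x0000))  # True
```

Because the offset is 16 bits wide, one segment register value gives access to a 64KB window, which matches the 64KB segment size quoted earlier.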

Page 93: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Flag Word

Flag L (bits 7 to 0):

SF ZF X AF X PF X CF

CF: Carry Flag
  CF = 0: no carry (ADD) or borrow (SUB)
  CF = 1: carry/borrow out of the high-order bit

AF: Aux. Carry: Carry/Borrow on bit 3 (Low nibble of AL)

SF: Sign Flag: (0: positive, 1: negative)

ZF: Zero Flag: (1: result is zero)

PF: (Even) Parity Flag (even number of 1’s in low-order 8 bits of result)

Page 94: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Flag Word

Flag H :

X X X X OF DF IF TF

TF: Trap flag (single-step after next instruction; clear by single-step interrupt)

IF: Interrupt-Enable: enable maskable interrupts

DF: Direction flag: auto-decrement (1) or increment(0) index on string operations

OF: Overflow: signed result cannot be expressed within #bits in destination operand
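The flag definitions on these two slides can be checked with a short sketch. The bit positions follow the Flag L and Flag H layouts shown (CF is bit 0, OF is bit 11). The helper names `decode_flags` and `add8_flags` are hypothetical, and `add8_flags` only models an 8-bit ADD.

```python
# Bit positions in the 16-bit flag word, from the Flag L / Flag H layouts
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
             "TF": 8, "IF": 9, "DF": 10, "OF": 11}


def decode_flags(word: int) -> dict:
    """Split a 16-bit flag word into named one-bit flags."""
    return {name: (word >> bit) & 1 for name, bit in FLAG_BITS.items()}


def add8_flags(a: int, b: int) -> dict:
    """Arithmetic flags produced by an 8-bit ADD of a and b."""
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        "CF": int(total > 0xFF),                              # carry out of bit 7
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),               # carry out of bit 3
        "ZF": int(result == 0),
        "SF": result >> 7,                                    # copy of the sign bit
        "PF": int(bin(result).count("1") % 2 == 0),           # even number of 1s
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0), # signed overflow
    }


print(add8_flags(0x7F, 0x01))  # signed overflow: 127 + 1 does not fit in 8 bits
print(add8_flags(0xFF, 0x01))  # unsigned carry: result wraps to zero
```

The two calls show that CF and OF are independent: the first sets OF without CF, the second sets CF without OF.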

Page 95: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Segmented Memory

Linear vs. Segmented

Linear addressing:
  The entire memory is regarded as a whole
  The entire memory space is available all the time

Segmented:
  Memory is divided into segments
  A process is limited to accessing designated segments at a given time

Page 96: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Memory Organization

Even and Odd Memory Banks
  16-bit data bus: one two-byte access or two one-byte accesses
  Allows the processor to work on bytes or on words (16-bit)
  I/O operations are normally conducted in bytes
  Can handle odd-length instructions
    Single-byte instructions
    Multiple-byte (and very long) instructions

Page 97: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Memory Organization

Memory Space:
  20-bit address bus
  Linearly, 1M bytes directly addressable

Memory Banks
  Can read 16-bit data (512K words) from even- and odd-addressed bytes simultaneously
  Needs two memory banks in parallel
  BHE control line: allows addressing the even bank, the odd bank, or both
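How the two banks are selected during a single bus cycle can be sketched as follows. The slide only names the BHE line; the use of address bit A0 for the even bank and the active-low sense of BHE are details added here from the standard 8086 bus description, and the function name `bank_select` is hypothetical.

```python
def bank_select(address, is_word):
    """Return (a0, bhe_n, banks) for ONE 8086 bus cycle.

    a0    : address bit 0 (0 selects the even bank)
    bhe_n : active-low Bus High Enable (0 selects the odd bank)
    banks : which bank(s) take part in this cycle
    """
    a0 = address & 1
    if is_word and a0 == 0:
        return 0, 0, "both"     # aligned word: both banks in one cycle
    if a0 == 0:
        return 0, 1, "even"     # byte at an even address
    # Byte at an odd address, or the first cycle of a mis-aligned word.
    return 1, 0, "odd"
```

A word that starts at an odd address is not covered by a single call: it takes a second cycle on the even bank for the remaining byte.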

Page 98: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Memory Organization: Alignment

Endianness:
  One way to model a multi-byte CPU register: AX = AH+AL
  Two ways to store operands in memory

Big-endian CPU: (IBM370, M68*, Sparc)
  High-order-byte-first (HOBF)
  Maps the highest-order byte of the internal register to the lowest (1st) memory byte address
  Operand address = address of MSB
  MOV R1, N    N: 1st byte in memory & MSB of register

Page 99: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Memory Organization: Alignment

Little-endian CPU: (DEC, Intel)
  Low-order-byte-first (LOBF)
  Maps the lowest-order byte of the register to the 1st memory byte
  Operand address = address of LSB (1st memory byte)
  MOV AX, N    N: 1st byte in memory & LSB of register; AL ← N, AH ← N+1

Configurable:
  Can switch between big- and little-endian, or
  Provides instructions which convert 16-/32-bit data between the two byte orderings (80486)
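The two byte orderings can be compared directly with Python's built-in integer conversions. This is a minimal sketch; `store_word` and `bswap32` are names chosen for this example, with `bswap32` imitating the kind of 32-bit byte-swap instruction the slide attributes to the 80486.

```python
def store_word(value, byteorder):
    """Return the two memory bytes of a 16-bit value, lowest address first."""
    return (value & 0xFFFF).to_bytes(2, byteorder)


def bswap32(value):
    """Reverse the byte order of a 32-bit value (little-endian <-> big-endian)."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")


little = store_word(0x1234, "little")   # Intel order: 34h at N, 12h at N+1
big = store_word(0x1234, "big")         # big-endian order: 12h at N, 34h at N+1
```

Applying `bswap32` twice returns the original value, which is why a single swap instruction is enough to convert in either direction.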

Page 100: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Memory Organization

Aligned operand
  Operand aligned at even-byte (word/dword) boundaries
  Allows a single access to read/write one operand
  Through an internal shift/swap mechanism, if necessary

Mis-aligned words:
  Word operand does not start at an even address
  Needs 2 bus cycles to read/write the word (8086)
  Issues two addresses to access the two even-aligned words that contain the operand
  Slower, but transparent to the programmer

Page 101: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Memory Organization

8088 always needs 2 cycles for word operations
  Aligned or not
  Because of its 8-bit external data bus
  A single memory bank is sufficient

Page 102: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

8086 Memory Map

Memory Map: how the memory space is allocated
  ROM Area: boot, BIOS
  RAM: OS/User Apps & data
  Unused
  Reserved: for future hardware/software uses
  Dedicated: for specific system interrupt and reset functions, etc.

Page 103: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Segment Registers

64K memory segments x 16
16-bit offset each
CS, DS, ES, SS

Page 104: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Logical and Physical Addresses

Physical: 20-bit
Logical: 16-bit

16-byte segment boundaries

Address Translation E.g., CS:IP
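The translation from a logical segment:offset pair to a 20-bit physical address can be written out as a short calculation. The sketch relies on the 16-byte segment boundaries noted above (the segment value is multiplied by 16); the function name `physical_address` is an assumption for this example.

```python
def physical_address(segment, offset):
    """8086 real-mode translation: physical = segment * 16 + offset, kept to 20 bits."""
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    return linear & 0xFFFFF     # only 20 address lines exist, so the sum wraps at 1 MB


# CS = 1234h, IP = 0010h gives 12340h + 0010h = 12350h
next_instruction = physical_address(0x1234, 0x0010)
```

Different logical addresses can name the same byte: 1234h:0010h and 1235h:0000h both translate to 12350h.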

Page 105: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286

First with Protected Mode
Review of 286 Protected Mode … Next

Page 106: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286

Became available in 1982
Used in the IBM AT computer (1984)
16-bit data bus
Clock speed 25% faster than 8088, throughput 5 times greater than 8088
24-bit address bus (16 MB) (vs. 20-bit/1M 8086)

Page 107: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286: Real vs. Protected Modes

Larger address space: 24-bit address bus
Real Mode vs. Protected Mode

Real Mode:
  Power-on default mode
  Functions like an 8086: uses the 20 least significant address lines (1M)
  8086 software is compatible with the 286
  16 new instructions (for Protected Mode management)
  Faster 286: redesigned processor, plus higher clock rate (6-8MHz)

Page 108: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286: Real vs. Protected Modes

Protected Mode:
  Multi-program environment
  Each program has a predetermined amount of memory
  Addressed via segment selector (physical addresses invisible): 16M addressable
  Multiple programs loaded at once (within their respective segments), protected from reads/writes by each other

Page 109: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80286: Real vs. Protected Modes

Protected Mode:
  Cannot be switched back to real mode, to avoid illegal access by switching back and forth between modes

A faster 8086 only?
  MS-DOS requires that all programs be run in Real Mode

Page 110: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Model

Refine 286 Protected Mode
Expand to 32-bit registers
New Virtual 8086 Mode

Page 111: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Review

Page 112: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386DX (aka. 80386)

Available in 1985, a major redesign of 86/286
Compatibility commitment through 2000
32-bit data and address buses (4 GB memory)
Real Address Mode: 1M visible, 286 real mode
Protected Virtual Address Mode:
  On-board MMU
  Segmented tasks of 1 byte to 4G bytes
    Segment base, limit, attributes defined by a descriptor register
  Page swapping: 4K pages, up to 64TB virtual memory space
  Windows, OS/2, Unix/Linux

Page 113: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386DX (aka. 80386)

Virtual 8086 mode (a special Protected mode feature):
  Permits multiple 8086 virtual machines: multitasking (similar to real mode)
  Windows (multiple MS-DOS’s)
Clock rate: max. 40MHz, 2 clock pulses per R/W bus cycle
External memory cache to avoid wait states
  Fast SRAM
  93% hit rate with 64K cache
Compatible instructions (14 new)

Page 114: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386SX

80386SX: (for transition to 32-bit)
  16-bit data bus/32-bit registers
  24-bit address bus

Page 115: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386: Real vs. Protected Modes

Larger address space: 32-bit address bus (4G)
Real Mode vs. Protected Mode (refined from 286)

Real Mode:
  Power-on default mode
  Functions like an 8086: (1) uses only the 20 least significant address lines (1M); (2) segmented memory retained (64K)
  Software compatible with 286

New Real Mode Features:
  Access to the 32-bit register set
  Two new segment registers: FS, GS

Page 116: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386: Real vs. Protected Modes

Protected Mode:
  New addressing mechanism vs. real mode
  Supports protection levels
  Segment size: 1 byte to 4G (not fixed at 64K)
  Segment register: pointer into a descriptor table, not a base address

Page 117: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386: Real vs. Protected Modes

Protected Mode:
  Descriptor table: (8 bytes per entry)
    32-bit base address of segment
    Segment size
    Access rights
  Memory address = base address (in table) + offset (in instruction)
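The descriptor-based address calculation can be sketched with a small table lookup. This is a deliberately simplified model: a real descriptor packs its fields into 8 bytes, whereas the `Descriptor` class, its field names, and the `linear_address` function below are hypothetical and exist only for illustration. The sketch does rely on one real detail, that the table index sits in the upper 13 bits of the 16-bit selector.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int        # 32-bit base address of the segment
    limit: int       # highest valid offset within the segment
    writable: bool   # stand-in for the access-rights field


def linear_address(table, selector, offset, write=False):
    """Translate selector:offset using a descriptor table (simplified model)."""
    index = selector >> 3            # low 3 bits hold the table indicator and RPL
    desc = table[index]
    if offset > desc.limit:
        raise ValueError("offset beyond segment limit")
    if write and not desc.writable:
        raise PermissionError("segment is read-only")
    return (desc.base + offset) & 0xFFFFFFFF


# Entry 0 is left unused here; entry 1 describes a 64K read-only segment at 1 MB.
table = [None, Descriptor(base=0x00100000, limit=0xFFFF, writable=False)]
address = linear_address(table, 0x08, 0x1234)
```

Unlike real mode, the program never sees the base address: it only holds the selector, and the limit and rights checks happen on every access.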

Page 118: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386: Real vs. Protected Modes

Protected Mode:
  Paging mechanism:
    Maps a 32-bit linear address (base+offset) => physical address & page frame address (4K page frames in system memory)
    64TB of virtual memory
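The paging step can be illustrated by splitting a linear address into its fields and looking up a page frame. The 10/10/12-bit split and the 16K-segments-times-4GB arithmetic behind the 64TB figure come from the standard 80386 description rather than from the slide; the flat dictionary used as a page table and the function names are simplifications made up for this sketch.

```python
PAGE_SIZE = 4096                       # 4K pages


def split_linear(addr):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    addr &= 0xFFFFFFFF
    return addr >> 22, (addr >> 12) & 0x3FF, addr & 0xFFF


def translate(page_table, addr):
    """Map a linear address to a physical one.

    page_table is a hypothetical flat dict: linear page number -> frame number.
    A missing entry raises KeyError, standing in for a page fault.
    """
    page, offset = divmod(addr & 0xFFFFFFFF, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


# 16K segments (two tables of 8K descriptors) of up to 4 GB each = 64 TB.
VIRTUAL_SPACE = (2 ** 14) * (2 ** 32)

physical = translate({0x403: 0x80}, 0x00403025)
```

The offset within the page is never translated; only the upper 20 bits of the linear address are replaced by the frame number.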

Page 119: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386: Real vs. Protected Modes

Protected Mode:
  Protection mechanism:
    Tasks/data/instructions are assigned a privilege level (PL)
    Tasks running at a lower (less privileged) PL cannot access tasks or data segments at a higher (more privileged) PL
    Running programs are protected from one another

Page 120: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386: Real vs. Protected Modes

Two Ways to Run 8086 Programs:
  Real Mode
  Virtual 8086 Mode

Virtual 8086 Mode:
  Runs multiple 8086 programs and other 386 (protected mode) programs independently
  Each sees 1 MB (mapped via paging to anywhere in the 4GB space)
  Runs V8086 and Protected mode programs simultaneously

Page 121: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Processor Model

Page 122: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Processor Model: BIU+CPU+MMU

BIU
  Controls the 32-bit address and data buses
  Keeps the instruction queue full (16 bytes)

Address pipelining
  Address of the next memory location is output halfway through the current bus cycle
  More address decode time: slower memory chips are OK
  Easier to keep up with the faster (2 CLK) bus cycle of the 386

Page 123: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Processor Model: BIU

Dynamic data bus sizing
  Switches between a 16-bit and a 32-bit data bus on the fly
  Accommodates external 16-bit memory cards or I/O devices
  Adjusts bus timing to use only the least significant 16 bits

Page 124: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Processor Model: BIU

External memory
  4 memory banks (4x8 = 32 bits)
  BE0-BE3 for bank selection
  Access a byte, word, or double word
  Aligned operands: 1 bus cycle
  Mis-aligned operands (address % 4 != 0): 2 bus cycles
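Which of the four banks take part in an access, and how many bus cycles it needs, can be worked out from the address and operand size. The sketch below is an illustrative model only: it reports a set bit for each selected bank (the real BE0-BE3 pins are active-low), it does not model the order in which the processor issues the two cycles, and the function name `byte_enables` is an assumption.

```python
def byte_enables(address, size):
    """Return one 4-bit mask per bus cycle; bit i set means bank i (BEi) is selected."""
    cycles = []
    remaining = size
    addr = address
    while remaining > 0:
        start = addr % 4                        # first bank used in this cycle
        count = min(remaining, 4 - start)       # bytes that fit before the dword boundary
        cycles.append(((1 << count) - 1) << start)
        addr += count
        remaining -= count
    return cycles


aligned = byte_enables(0x1000, 4)       # one cycle, all four banks
misaligned = byte_enables(0x1001, 4)    # crosses a dword boundary: two cycles
```

In this model an operand costs a second cycle only when it crosses a 4-byte boundary, so a word at offset 2 within a double word still completes in one cycle.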

Page 125: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Processor Model: CPU

CPU = IU (instruction) + EU (execution)
  Fetching & execution overlap
IU:
  Retrieves instructions from the queue
  Decodes them
  Stores them in the decoded queue
EU: ALU + registers (32-bit)
  Executes decoded instructions

Page 126: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Processor Model: MMU

Segmentation unit
  Real mode: generates the 20-bit physical address
  Protected mode: stores base/size/rights in descriptor registers
    Caches descriptors from the tables in RAM
    Faster operations
Paging Unit
  Determines the physical addresses associated with active segments (divided into 4K pages)
  Virtual memory support to allow larger programs

Page 127: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Programming Model

General Purpose Registers
  Data & Addresses Groups
Status & Control Flags
  VM, RF, NT, IOPL
Segment Group

Page 128: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Programming Model

Special purpose Registers

Page 129: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Programming Model

Memory Management
  Segment descriptors
    Keep base, size, access rights
    3 types of tables: global (GDT), local (LDT), interrupt (IDT)
    Addressing:
      • index (to a table) + RPL
      • base + offset (from instruction)
  Paging
    TLB

Page 130: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80386 Programming Model

Protection (PL)
  Task: CPL
  Instruction: RPL
  Data segment: DPL
Gates
  Special descriptors that allow access to higher PL tasks from lower PL tasks
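The interaction of CPL, RPL, and DPL for a data-segment access can be sketched as a single comparison. The rule used here, that the access is allowed when the numerically larger of CPL and RPL does not exceed DPL, with 0 as the most privileged level, is the standard 80386 data-access check and is not spelled out on the slide. The function name `can_access_data` is hypothetical.

```python
def can_access_data(cpl, rpl, dpl):
    """Return True if a data-segment access is permitted.

    Privilege levels run from 0 (most privileged) to 3 (least privileged).
    The weaker of CPL and RPL must be at least as privileged as the segment's DPL.
    """
    return max(cpl, rpl) <= dpl


user_reads_kernel = can_access_data(cpl=3, rpl=3, dpl=0)   # refused
kernel_reads_user = can_access_data(cpl=0, rpl=0, dpl=3)   # allowed
```

The RPL term lets privileged code deliberately weaken a request made on behalf of a less privileged caller. Gates are the separate mechanism for controlled transfers in the other direction.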

Page 131: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486 Review …

Page 132: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486DX

1989: a polished 386, 6 new OS-level instructions
Virtually identical to the 386 in terms of compatibility
RISC design concepts
  Fewer clock cycles per operation; a single clock cycle for the most frequently used instructions
Max 50MHz
5-stage execution pipeline
  Portions of 5 instructions execute at once

Page 133: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486DX

Highly Integrated:
  On-board 8K memory cache
  FPP (equivalent to external 80387 co-processor)
Twice as fast as 386 at any given clock rate
  20MHz 486 ~= 40MHz 386

Page 134: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486SX

80486SX
  NOT a 16-bit version for transition purposes
  No coprocessor
  No internal cache
  For low-end applications
  Max. 33MHz only

Page 135: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486DX2/DX4: Overdrive Chips

Processor speed increased too fast
  Redesign of the microcomputer for compatibility becomes harder
Solution: separate internal speed from external speed, improve performance independently
80486DX2/DX4: internal clock twice/three times (NOT four times) the external clock: runs faster internally

Page 136: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

80486DX2/DX4: Overdrive Chips

System board design is independent of processor upgrade (less expensive components are allowed)
Processor operates at the maximum data rate internally
  Only the slower access to external data operates at the system board rate
  Internal cache offsets the speed gap
486DX2 66: 66 internal, 33 external
486DX4 100: 100 internal, 33 external (3x)
Overdrive sockets: for upgrading 486DX/SX to 486DX2/DX4 (with overdrive socket pin-outs)

Page 137: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Features

386 features:
  Real/Protected Modes
  Memory Management
  PL’s
  Registers & bus sizes
New features
  6 OS instructions
  8K/16K onboard cache (was external on the 386)

Page 138: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Features

A better 386
5-stage instruction pipeline
  IF/ID/EX => PF/D1/D2/EX/WB
  PF: instructions => Q (2*16 bytes)
  D1: determine opcode
  D2: determine memory address of operands
  EX: execute indicated OP
  WB: update register

Page 139: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Features

Reduced Instruction Cycle Times
  5-stage instruction pipeline (e.g., Fig. 3.18)
  Instruction cycle times:
    8086: 4 CLK
    80386: 2 CLK
    80486: 1 CLK (close to RISC)
  About 2X faster than 386
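Why a 5-stage pipeline approaches one clock per instruction can be shown with an idealized timing model. The sketch assumes every stage takes exactly one clock and that there are no stalls, which real code does not guarantee; the function names are made up for this example.

```python
STAGES = ["PF", "D1", "D2", "EX", "WB"]    # the five 486 pipeline stages


def pipeline_cycles(n_instructions, n_stages=len(STAGES)):
    """Clocks to finish n instructions in an ideal pipeline with no stalls."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1   # fill the pipe once, then one per clock


def pipeline_table(n_instructions):
    """One row per instruction, one column per clock, showing the stage occupied."""
    total = pipeline_cycles(n_instructions)
    table = []
    for i in range(n_instructions):
        row = ["--"] * total
        for s, name in enumerate(STAGES):
            row[i + s] = name              # instruction i enters the pipe at clock i
        table.append(row)
    return table


for row in pipeline_table(3):
    print(" ".join(row))
```

For 100 instructions the model gives 104 clocks, about 1.04 clocks per instruction, compared with 500 clocks if each instruction had to pass through all five stages before the next one started.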

Page 140: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Model: 386+FPU+Cache

386 units retained: BIU, CPU, MMU
New: FPU (80387) + Cache (8K/16K)
FPU:
  387 on board
  0.8 µ process => number of transistors increased (275K => 1+ million)
  Simplified system board design
  Speeds up FP operations

Page 141: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.
Page 142: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Model: Cache

Cache (8K/16K (DX4))
Function: bridge the processor-memory bandwidth gap
  8088: 4.77MHz
  80486: 50MHz
  Pentium: 100MHz
  Pentium Pro: 133 MHz
  Main Memory (DRAM): relatively slow
Fast Static RAMs (SRAM) as cache

Page 143: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Model: Cache

Organization:
  8K 4-way set associative
    4 direct-mapped caches wired in parallel
    Each block maps to a set of 4 lines
  Unified: data & code in the same cache
  Write-through: update cache and memory page on write operations

Page 144: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Model: Cache

Locality (why do caches help?)
  Spatial locality: e.g., array of data
  Temporal locality: e.g., loops in code
Operations on hit/miss
128-bit cache lines
  32-bit x N to catch locality (N=4)
  128-bit = 16-byte
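The cache parameters quoted on these slides (8K total, 4-way set associative, 16-byte lines) fix how a 32-bit address is divided into tag, set index, and byte offset. The arithmetic below is derived from those three numbers; the constant and function names are assumptions for this sketch.

```python
CACHE_BYTES = 8 * 1024     # 8K cache
LINE_BYTES = 16            # 128-bit lines
WAYS = 4                   # 4-way set associative
SETS = CACHE_BYTES // (LINE_BYTES * WAYS)   # 128 sets


def split_address(addr):
    """Split a 32-bit address into (tag, set index, byte offset within the line)."""
    addr &= 0xFFFFFFFF
    offset = addr % LINE_BYTES                  # low 4 bits
    set_index = (addr // LINE_BYTES) % SETS     # next 7 bits
    tag = addr // (LINE_BYTES * SETS)           # remaining 21 bits
    return tag, set_index, offset


tag, set_index, offset = split_address(0x12345678)
```

Neighbouring bytes share a line (spatial locality), while addresses exactly 2K apart fall into the same set and compete for its four lines.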

Page 145: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Model: Cache

Mapping: memory => many-to-many => cache
  Data RAM: saves memory data
  Tag RAM: saves memory address information
3 methods of mapping
  Fully associative: memory block to any cache line
  Direct mapped: memory block to a specific line
    Thrashing
  Set associative: memory block to a set of cache lines

Page 146: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

486 Processor Model: Cache

Replacement policy (LRU)
  Valid bits: all 4 lines in use?
    NO => use any unused line
    YES => find one to replace
  LRU bits: which is least recently used
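The replacement decision for one 4-line set can be simulated in a few lines. This sketch tracks exact least-recently-used order with an ordered dictionary, which is an idealization: hardware normally approximates LRU with a few bits per set. The class name `CacheSet` is hypothetical.

```python
from collections import OrderedDict


class CacheSet:
    """One set of a set-associative cache with true LRU replacement."""

    def __init__(self, ways=4):
        self.ways = ways
        self.lines = OrderedDict()      # tag -> None, least recently used first

    def access(self, tag):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)         # mark as most recently used
            return True
        if len(self.lines) >= self.ways:        # all lines valid: evict the LRU one
            self.lines.popitem(last=False)
        self.lines[tag] = None                  # otherwise an unused line is taken
        return False


cache_set = CacheSet()
hits = [cache_set.access(tag) for tag in [1, 2, 3, 4, 1, 5, 2]]
```

In this trace the fifth access hits, the access to tag 5 evicts tag 2 (the least recently used line at that point), and the final access to tag 2 therefore misses again.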

Page 147: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.
Page 148: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.
Page 149: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium Review …

Page 150: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium: Superscalar Processor

Available in 1993
32-bit architecture
Superscalar architecture
  Scaling: scaling down etchable feature size to increase complexity of IC (e.g., DRAM)
    10 microns/4004 to 0.13 microns (2001)
  Superscalar: go beyond simply scaling down
  Two instruction pipelines: each with its own ALU, address generation circuitry, data cache interface
  Executes two different instructions simultaneously

Page 151: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium: Superscalar Processor

Onboard cache
  Separate 8K data and code caches to avoid access conflicts
FPP
  Instruction pipeline: 8 stages
  Optimized floating point functions
  5x-10x the FLOPS of the 486
2x performance of 486 at any clock rate

Page 152: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium: Superscalar Processor

Compatibility with 386/486:
  Internal 32-bit registers and address bus
  Data bus expanded to 64 bits for higher data transfer rate
  Compare 8088 to 386SX transition

Page 153: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium: Superscalar Processor

Non-clone competition from AMD, Cyrix
Development of brand identity by Intel

Page 154: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium Pro Review …

Page 155: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium Pro: Two Chips in One

Became available in 1995
Superscalar of degree 3
  Can execute 3 instructions simultaneously
Optimized for 32-bit operating systems (e.g., Windows NT, OS/2 Warp)
Two separate silicon dies in the same package
  Processor: 0.35 µ, 5.5 million transistors
  256KB (/512K) Level 2 cache included on chip, 15.5 million transistors in a smaller area

Page 156: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium Pro: Two Chips in One

On-Board Level 2 cache
  Simplifies system board design
  Requires less space
  Gains faster communication with the processor
Internal (level 1) cache: 8K
Pentium Pro 133 ~= 2x Pentium 66 ~= 4x 486DX2 66

Page 157: Chapter 1 Microcomputers and Microprocessors Microprocessor Evolution and Performance.

Pentium Pro: Dynamic Execution

Dynamic execution: reduce idle processor time by predicting instruction behaviors
  Multiple Branch Prediction: looks as far as 30 instructions ahead to anticipate program branches
  Data Flow Analysis: looks at upcoming instructions and determines whether they are available for processing, depending on other instructions. Determines optimal execution sequences.
  Speculative Execution: executes instructions in a different order than entered. Speculative results are stored until final states can be determined.
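The idea behind data flow analysis, issuing whichever instructions have their inputs ready rather than strictly following program order, can be sketched with a toy scheduler. This is a heavily idealized illustration and not the Pentium Pro's actual mechanism: every instruction is assumed to take one cycle, the issue width of 3 follows the "superscalar of degree 3" description, and the function name `schedule` and the instruction labels are made up.

```python
def schedule(deps, width=3):
    """Group instructions into cycles, issuing up to `width` ready ones per cycle.

    deps maps each instruction name to the set of instructions whose results
    it needs; dictionary order stands for program order.
    """
    done = set()
    remaining = list(deps)
    cycles = []
    while remaining:
        # An instruction is ready once everything it depends on has completed.
        ready = [name for name in remaining if deps[name] <= done][:width]
        if not ready:
            raise ValueError("circular dependency")
        cycles.append(ready)
        done.update(ready)
        remaining = [name for name in remaining if name not in done]
    return cycles


program = {"i1": set(), "i2": {"i1"}, "i3": set(), "i4": {"i3"}, "i5": set()}
plan = schedule(program)
```

Here i3 and i5 are issued ahead of i2 because i2 must wait for i1, so the five instructions finish in two cycles instead of being held up in program order.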