Processor Organization and Performance Chapter 6 S. Dandamudi
Jan 03, 2016

Edgar Tyler
Page 1: Processor Organization and Performance Chapter 6 S. Dandamudi.

Processor Organization and Performance

Chapter 6

S. Dandamudi

Page 2: Processor Organization and Performance Chapter 6 S. Dandamudi.

2003

To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.

S. Dandamudi Chapter 6: Page 2

Outline

• Introduction

• Number of addresses
  3-address machines
  2-address machines
  1-address machines
  0-address machines
  Load/store architecture

• Flow control
  Branching
  Procedure calls
  Delayed versions
  Parameter passing

• Instruction set design issues
  Operand types
  Addressing modes
  Instruction types
  Instruction formats

• Microprogrammed control
  Implementation issues

• Performance
  Performance metrics
  Execution time calculation
  Means of performance
  The SPEC benchmarks

Page 3: Processor Organization and Performance Chapter 6 S. Dandamudi.

Introduction

• We discuss three processor-related issues
  » Instruction set design issues
    – Number of addresses
    – Addressing modes
    – Instruction types
    – Instruction formats
  » Microprogrammed control
    – Hardware implementation
    – Software implementation
  » Performance issues
    – Performance metrics
    – Standards

Page 4: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses

• Four categories
  3-address machines
    » 2 for the source operands and one for the result
  2-address machines
    » One address doubles as source and result
  1-address machines
    » Accumulator machines
    » Accumulator is used for one source and result
  0-address machines
    » Stack machines
    » Operands are taken from the stack
    » Result goes onto the stack

Page 5: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• Three-address machines
  Two for the source operands, one for the result
  RISC processors use three addresses
  Sample instructions

    add  dest,src1,src2   ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2   ; M(dest)=[src1]-[src2]
    mult dest,src1,src2   ; M(dest)=[src1]*[src2]

Page 6: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• Example
  C statement

    A = B + C * D - E + F + A

  Equivalent code:

mult T,C,D ;T = C*D

add T,T,B ;T = B+C*D

sub T,T,E ;T = B+C*D-E

add T,T,F ;T = B+C*D-E+F

add A,T,A ;A = B+C*D-E+F+A
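The translation above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides, of a hypothetical three-address machine in which every operand names a memory location. The function name `run_three_address` and the sample values in `memory` are assumptions chosen for illustration.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples against a dict used as memory."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        # M(dest) = [src1] op [src2]
        memory[dest] = ops[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B+C*D
    ("sub", "T", "T", "E"),   # T = B+C*D-E
    ("add", "T", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T", "A"),   # A = B+C*D-E+F+A
]

# Illustrative values only; T is the temporary location used by the slide.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]
result = run_three_address(program, memory)["A"]
```

With these sample values, `result` should equal `expected`, which is computed directly from the C statement.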

Page 7: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• Two-address machines
  One address doubles (for source operand & result)
  Last example makes a case for it
    » Address T is used twice
  Sample instructions

    load dest,src   ; M(dest)=[src]
    add  dest,src   ; M(dest)=[dest]+[src]
    sub  dest,src   ; M(dest)=[dest]-[src]
    mult dest,src   ; M(dest)=[dest]*[src]

Page 8: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• Example
  C statement

    A = B + C * D - E + F + A

  Equivalent code:

    load T,C   ;T = C
    mult T,D   ;T = C*D
    add  T,B   ;T = B+C*D
    sub  T,E   ;T = B+C*D-E
    add  T,F   ;T = B+C*D-E+F
    add  A,T   ;A = B+C*D-E+F+A

Page 9: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• One-address machines
  Use a special set of registers called accumulators
    » Specify one source operand & receive the result
  Called accumulator machines
  Sample instructions

load addr ; accum = [addr]

store addr ; M[addr] = accum

add addr ; accum = accum + [addr]

sub addr ; accum = accum - [addr]

mult addr ; accum = accum * [addr]

Page 10: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• Example
  C statement

    A = B + C * D - E + F + A

  Equivalent code:

    load  C   ;load C into accum
    mult  D   ;accum = C*D
    add   B   ;accum = C*D+B
    sub   E   ;accum = B+C*D-E
    add   F   ;accum = B+C*D-E+F
    add   A   ;accum = B+C*D-E+F+A
    store A   ;store accum contents in A

Page 11: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• Zero-address machines
  Stack supplies operands and receives the result
    » Special instructions to load and store use an address
  Called stack machines (Ex: HP3000, Burroughs B5500)
  Sample instructions

push addr ; push([addr])

pop addr ; pop([addr])

add ; push(pop + pop)

sub ; push(pop - pop)

mult ; push(pop * pop)

Page 12: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• Example
  C statement

    A = B + C * D - E + F + A

  Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
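Operand order matters on a stack machine, which is why E is pushed first in the code above. The following Python sketch, an illustration rather than anything from the slides, runs that sequence on a hypothetical zero-address machine. It assumes that in `push(pop - pop)` the first value popped is the left operand, which is the reading consistent with the example. The name `run_stack_machine` and the sample memory values are assumptions.

```python
def run_stack_machine(program, memory):
    """Run a zero-address program; only push and pop carry a memory address."""
    stack = []
    for instr in program:
        op, *addr = instr.split()
        if op == "push":
            stack.append(memory[addr[0]])
        elif op == "pop":
            memory[addr[0]] = stack.pop()
        else:
            first = stack.pop()   # operand on top of the stack
            second = stack.pop()  # operand beneath it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)  # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {instr}")
    return memory


program = [
    "push E", "push C", "push D", "mult",
    "push B", "add", "sub",
    "push F", "add",
    "push A", "add",
    "pop A",
]

# Illustrative values only.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]
result = run_stack_machine(program, memory)["A"]
```

Under the stated assumption about `sub`, `result` matches the value of the C statement for these inputs.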

Page 13: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

Page 14: Processor Organization and Performance Chapter 6 S. Dandamudi.

Load/Store Architecture

• Instructions expect operands in internal processor registers
  Special LOAD and STORE instructions move data between registers and memory
  RISC and vector processors use this architecture
  Reduces instruction length

Page 15: Processor Organization and Performance Chapter 6 S. Dandamudi.

Load/Store Architecture (cont’d)

• Sample instructions

load Rd,addr ;Rd = [addr]

store addr,Rs ;(addr) = Rs

add Rd,Rs1,Rs2 ;Rd = Rs1 + Rs2

sub Rd,Rs1,Rs2 ;Rd = Rs1 - Rs2

mult Rd,Rs1,Rs2 ;Rd = Rs1 * Rs2

Page 16: Processor Organization and Performance Chapter 6 S. Dandamudi.

Number of Addresses (cont’d)

• Example
  C statement

    A = B + C * D - E + F + A

  Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Page 17: Processor Organization and Performance Chapter 6 S. Dandamudi.

Flow of Control

• Default is sequential flow
• Several instructions alter this default execution
  Branches
    » Unconditional
    » Conditional
    » Delayed branches
  Procedure calls
    » Delayed procedure calls

Page 18: Processor Organization and Performance Chapter 6 S. Dandamudi.

Flow of Control (cont’d)

• Branches
  Unconditional
    » Absolute address
    » PC-relative
      – Target address is specified relative to PC contents
  Example: MIPS
    » Absolute address

        j target

    » PC-relative

        b target

Page 19: Processor Organization and Performance Chapter 6 S. Dandamudi.

Flow of Control (cont’d)

Page 20: Processor Organization and Performance Chapter 6 S. Dandamudi.

Flow of Control (cont’d)

• Branches
  Conditional
    » Jump is taken only if the condition is met
  Two types
    » Set-Then-Jump
      – Condition testing is separated from branching
      – Condition code registers are used to convey the condition test result
    » Example: Pentium code

        cmp AX,BX
        je target

Page 21: Processor Organization and Performance Chapter 6 S. Dandamudi.

Flow of Control (cont’d)

    » Test-and-Jump
      – Single instruction performs condition testing and branching
    » Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

      Jumps to target if Rsrc1 = Rsrc2

• Delayed branching
  Control is transferred after executing the instruction that follows the branch instruction
    » This instruction slot is called the delay slot
  Improves efficiency
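The effect of a delay slot on execution order can be sketched in a few lines of Python. This is a toy model under stated assumptions: instructions are just labels, a branch is always taken, and the delay-slot instruction is never itself a branch. The function `run` and the four-instruction `program` are hypothetical.

```python
def run(program, delayed):
    """Return the order in which instruction labels execute.

    program: list of (label, branch_target_index or None).
    With delayed=True, the instruction after a branch (the delay slot)
    executes before control transfers to the target.
    """
    trace = []
    pc = 0
    while pc < len(program):
        label, target = program[pc]
        trace.append(label)
        if target is None:
            pc += 1  # sequential flow
        else:
            if delayed and pc + 1 < len(program):
                # Execute the delay-slot instruction before transferring control.
                trace.append(program[pc + 1][0])
            pc = target
    return trace


program = [
    ("add", None),   # index 0
    ("branch", 3),   # index 1: always-taken branch to index 3
    ("sub", None),   # index 2: sits in the delay slot
    ("mult", None),  # index 3: branch target
]

normal_trace = run(program, delayed=False)
delayed_trace = run(program, delayed=True)
```

In the non-delayed run `sub` is skipped; in the delayed run it executes between the branch and its target, so the slot does useful work instead of being wasted.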

Page 22: Processor Organization and Performance Chapter 6 S. Dandamudi.

Flow of Control (cont’d)

• Procedure calls
  Require two pieces of information to return
    » End of procedure
      – Pentium uses the ret instruction
      – MIPS uses the jr instruction
    » Return address
      – In a (special) register: MIPS allows any general-purpose register
      – On the stack: Pentium

Page 23: Processor Organization and Performance Chapter 6 S. Dandamudi.

Flow of Control (cont’d)

Page 24: Processor Organization and Performance Chapter 6 S. Dandamudi.

Flow of Control (cont’d)

Delay slot

Page 25: Processor Organization and Performance Chapter 6 S. Dandamudi.

Parameter Passing

• Two basic techniques
  Register-based
    » Internal registers are used
      – Faster
      – Limits the number of parameters
  Stack-based
    » Stack is used
      – More general
• Recent processors use
  Register window mechanism
    » Examples: SPARC and Itanium (discussed in later chapters)

Page 26: Processor Organization and Performance Chapter 6 S. Dandamudi.

Operand Types

• Instructions support basic data types
  Characters
  Integers
  Floating-point
• Instruction overload
  Same instruction for different data types
  Example: Pentium

    mov AL,address    ;loads an 8-bit value
    mov AX,address    ;loads a 16-bit value
    mov EAX,address   ;loads a 32-bit value

Page 27: Processor Organization and Performance Chapter 6 S. Dandamudi.

Operand Types

• Separate instructions
  Instructions specify the operand size
  Example: MIPS

    lb Rdest,address   ;loads a byte
    lh Rdest,address   ;loads a halfword (16 bits)
    lw Rdest,address   ;loads a word (32 bits)
    ld Rdest,address   ;loads a doubleword (64 bits)

Page 28: Processor Organization and Performance Chapter 6 S. Dandamudi.

Addressing Modes

• Refers to how the operands are specified
  Operands can be in three places
    » Registers
      – Register addressing mode
    » Part of instruction
      – Constant
      – Immediate addressing mode
      – All processors support these two addressing modes
    » Memory
      – Difference between RISC and CISC
      – CISC supports a large variety of addressing modes
      – RISC follows load/store architecture

Page 29: Processor Organization and Performance Chapter 6 S. Dandamudi.

Addressing Modes (cont’d)

  Most RISC processors support two memory addressing modes
      – address = Register + constant
      – address = Register + Register
  CISC processors like Pentium support a variety of addressing modes
    » Motivation: To efficiently support high-level language data structures
      – Example: Accessing a 2-D array
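To see why 2-D arrays motivate richer addressing modes, consider the address arithmetic behind a single element access. The sketch below assumes row-major storage, as C uses. The function name and the sample base address are hypothetical and not taken from the slides.

```python
def element_address(base, row, col, num_cols, elem_size):
    """Byte address of array[row][col] in a row-major 2-D array."""
    return base + (row * num_cols + col) * elem_size


# Hypothetical array of 4-byte integers with 5 columns, stored at address 1000.
addr = element_address(base=1000, row=2, col=3, num_cols=5, elem_size=4)
```

Here `addr` is 1000 + (2*5 + 3)*4 = 1052. A processor with only register-plus-constant addressing needs separate multiply and add instructions to form this address, whereas a mode that combines a base, an index, and a scale factor can fold part of the computation into the memory operand itself.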

Page 30: Processor Organization and Performance Chapter 6 S. Dandamudi.

Instruction Types

• Several types of instructions
  Data movement
    » Pentium: mov dest,src
    » Some do not provide direct data movement instructions
    » Indirect data movement

        add Rdest,Rsrc,0   ;Rdest = Rsrc+0

  Arithmetic and Logical
    » Arithmetic
      – Integer and floating-point, signed and unsigned
      – add, subtract, multiply, divide
    » Logical
      – and, or, not, xor

Page 31: Processor Organization and Performance Chapter 6 S. Dandamudi.

Instruction Types (cont’d)

• Condition code bits
  S: Sign bit (0 = +, 1 = -)
  Z: Zero bit (0 = nonzero, 1 = zero)
  O: Overflow bit (0 = no overflow, 1 = overflow)
  C: Carry bit (0 = no carry, 1 = carry)

• Example: Pentium

cmp count,25

je target
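A compare instruction sets these bits by performing a subtraction and discarding the result. The following Python sketch models that for fixed-width operands. It assumes the x86 convention that the carry bit records a borrow on subtraction, and the function name `compare_flags` is hypothetical.

```python
def compare_flags(a, b, bits=8):
    """Condition code bits after computing a - b on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # sign of the result
        "Z": int(result == 0),                           # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # unsigned borrow
    }


equal = compare_flags(25, 25)     # operands equal, so Z is set
smaller = compare_flags(10, 25)   # negative result with a borrow
overflow = compare_flags(0x80, 1) # -128 - 1 does not fit in 8 signed bits
```

In the Pentium example, `je target` is taken exactly when the preceding `cmp` leaves Z = 1, which corresponds to the `equal` case here.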

Page 32: Processor Organization and Performance Chapter 6 S. Dandamudi.

Instruction Types (cont’d)

  Flow control and I/O instructions
    » Branch
    » Procedure call
    » Interrupts
  I/O instructions
    » Memory-mapped I/O
      – Most processors support memory-mapped I/O
      – No separate instructions for I/O
    » Isolated I/O
      – Pentium supports isolated I/O
      – Separate I/O instructions

        in  AX,io_port   ;read from an I/O port
        out io_port,AX   ;write to an I/O port

Page 33: Processor Organization and Performance Chapter 6 S. Dandamudi.

Instruction Formats

• Two types
  Fixed-length
    » Used by RISC processors
    » 32-bit RISC processors use 32-bit wide instructions
      – Examples: SPARC, MIPS, PowerPC
    » 64-bit Itanium uses 41-bit wide instructions
  Variable-length
    » Used by CISC processors
    » Memory operands need more bits to specify
• Opcode
  Major and exact operation

Page 34: Processor Organization and Performance Chapter 6 S. Dandamudi.

Instruction Formats (cont’d)

Page 35: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control

• Introduction in Chapter 1
• 1-bus datapath
  Assume all entities are 32 bits wide
  PC register
    » Program counter
  IR register
    » Holds the instruction to be executed
  MAR register
    » Address of the operand to be stored in memory
  MDR register
    » Holds the operand for memory operations

Page 36: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

1-bus datapath

Page 37: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

ALU circuit details

Page 38: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Has 32 32-bit general-purpose registers
  Interface only with the A-bus
  Each register has two control signals
    » Gxin and Gxout
• Control signals used by the other registers
  PC register:
    » PCin, PCout, and PCbout
  IR register:
    » IRout and IRbin
  MAR register:
    » MARin, MARout, and MARbout
  MDR register:
    » MDRin, MDRout, MDRbin and MDRbout

Page 39: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

Memory interface implementation details

Page 40: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

add %G9,%G5,%G7

  Implemented as
    » Transfer G5 contents to A register
      – Assert G5out and Ain
    » Place G7 contents on the A bus
      – Assert G7out
    » Instruct ALU to perform addition
      – Appropriate ALU function control signals
    » Latch the result in the C register
      – Assert Cin
    » Transfer contents of the C register to G9
      – Assert Cout and G9in

Page 41: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Example instruction groups
  Load/store
    » Moves data between registers and memory
  Register
    » Arithmetic and logic instructions
  Branch
    » Jump direct/indirect
  Call
    » Procedure invocation mechanisms
  More…

Page 42: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

High-level FSM for instruction execution

Page 43: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Implementation
  Hardware
    » Typical approach in RISC processors
  Software
    » Typical approach in CISC processors
• Hardware implementation
  PLA based implementation shown
    » Three control signals
      – Opcode via the IR register
      – Status and condition codes
      – Counter to keep track of the steps in instruction execution

Page 44: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

Controller implementation

Page 45: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Software implementation
  Typically used in CISC
    » Hardware implementation is complex and expensive
• Example

    add %G9,%G5,%G7

  Three steps

    S1 G5out: Ain;
    S2 G7out: ALU=add: Cin;
    S3 Cout: G9in: end;
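The three steps can be traced with a small Python model of the 1-bus datapath. This is a simplified sketch under stated assumptions: each step is a set of asserted signal names, at most one `...out` signal drives the bus per step, the ALU adds the A register to the bus, and C latches the ALU output. The function `execute_step` and the register values are hypothetical.

```python
def execute_step(signals, state):
    """Apply one microinstruction (a set of asserted control signals)."""
    bus = None
    for sig in signals:
        if sig.endswith("out"):
            bus = state[sig[:-3]]      # the named register drives the bus
    alu = state["A"] + bus if "ALU=add" in signals else None
    for sig in signals:
        if sig == "Cin":
            state["C"] = alu           # C latches the ALU output
        elif sig.endswith("in"):
            state[sig[:-2]] = bus      # other registers latch the bus


# Microroutine for add %G9,%G5,%G7, one set of signals per step.
microroutine = [
    {"G5out", "Ain"},              # S1
    {"G7out", "ALU=add", "Cin"},   # S2
    {"Cout", "G9in"},              # S3
]

# Illustrative register contents.
state = {"G5": 7, "G7": 35, "G9": 0, "A": 0, "C": 0}
for step in microroutine:
    execute_step(step, state)
```

After the loop, `state["G9"]` holds the sum of G5 and G7. Three steps are needed because the single bus can carry only one value at a time.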

Page 46: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Uses a microprogram to generate the control signals
  Encode the signals of each step as a codeword
    » Called microinstruction
  An instruction is expressed by a sequence of codewords
    » Called microroutine

• Microprogram essentially implements the FSM discussed before

• A simple microprogram structure is on the next slide

Page 47: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

Simple microcode organization

Page 48: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• A simple microcontroller can execute a microprogram to generate the control signals
  Control store
    » Stores microprogram
  Uses µPC
    » Similar to PC
  Address generator
    » Generates appropriate address depending on the
      – Opcode, and
      – Condition code inputs

Page 49: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

Microcontroller

Page 50: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Problems with previous design:
  Makes microprograms long by replicating the common parts of microcode
• Efficient way:
  Keep only one copy of common code
  Use branching to jump to the appropriate microroutine

Page 51: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Microinstruction format
  Two basic ways
    » Horizontal organization
    » Vertical organization
  Horizontal organization
      – One bit for each signal
      – Very flexible
      – Long microinstructions
      – Example: 1-bus datapath needs 90 bits for each microinstruction

Page 52: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

Horizontal microinstruction format

Page 53: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Vertical organization
  Encodes to reduce microinstruction length
    » Reduced flexibility
  Example:
    » Horizontal organization
      – 64 control signals for the 32 general purpose registers
    » Vertical organization
      – 5 bits to identify the register and 1 for in/out
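The saving is easy to see by encoding one register signal both ways. The Python sketch below is illustrative only: the bit ordering (32 Gxin bits followed by 32 Gxout bits, and a trailing in/out bit in the vertical form) is an assumption, as are the function names.

```python
NUM_REGS = 32


def horizontal_bits(reg, direction):
    """One bit per signal: 32 Gxin bits followed by 32 Gxout bits."""
    index = reg if direction == "in" else NUM_REGS + reg
    width = 2 * NUM_REGS
    return format(1 << (width - 1 - index), f"0{width}b")


def vertical_bits(reg, direction):
    """5-bit register number followed by 1 bit: 0 for in, 1 for out."""
    return format(reg, "05b") + ("0" if direction == "in" else "1")


wide = horizontal_bits(9, "in")    # 64 bits, exactly one of them set
narrow = vertical_bits(9, "in")    # 6 bits
```

The vertical field shrinks 64 bits to 6, at the cost of a decoder and of flexibility: the horizontal form could assert several register signals in one microinstruction, while this vertical field can name only one.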

Page 54: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

General register control circuit

Page 55: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

Microcontroller for vertical microcode

Page 56: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

• Adding more buses reduces time needed to execute instructions
  No need to multiplex the bus
• Example

    add %G9,%G5,%G7

  Needed three steps in 1-bus datapath
  Need only two steps with a 2-bus datapath

    S1 G5out: Ain;
    S2 G7out: ALU=add: G9in;

Page 57: Processor Organization and Performance Chapter 6 S. Dandamudi.

Microprogrammed Control (cont’d)

2-bus datapath

Page 58: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance

• Two popular metrics
  Response time
    » User-oriented
  Throughput
    » System-oriented
• Performance of components
  Processors, networks, disks,…
  Some simple metrics
    » MIPS
      – Simple instruction execution rate
    » MFLOPS

Page 59: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance (cont’d)

• Calculating execution time
  Three factors
    » Instruction count (IC)
      – CISC processors have simple to complex instructions
    » Clocks per instruction (CPI)
      – RISC vs. CISC differences
    » Clock period (T)
  Execution time = IC * CPI * T
  This is not response time
    » Not considering queuing delays
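As a worked instance of the formula, the following Python sketch plugs in made-up numbers. The workload figures are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def execution_time(instruction_count, cpi, clock_period):
    """Execution time = IC * CPI * T, in the units of clock_period."""
    return instruction_count * cpi * clock_period


# Hypothetical workload: 1 million instructions, 2 clocks per instruction,
# and a 1 GHz clock (clock period of 1 nanosecond).
time_seconds = execution_time(1_000_000, 2.0, 1e-9)

# Halving CPI, for example through a better pipeline, halves the time.
faster_seconds = execution_time(1_000_000, 1.0, 1e-9)
```

The first case comes to about 2 ms. The three factors interact: a design that lowers IC with more complex instructions may raise CPI or T, so only the product is meaningful.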

Page 60: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance (cont’d)

• Means of performance
  Arithmetic mean
    » Equal weight
  Weighted arithmetic mean
    » Different weights can be assigned
  Geometric mean
    » Geometric mean of $a_1, a_2, \ldots, a_n$ is

      $(a_1 \times a_2 \times \cdots \times a_n)^{1/n}$

  Weighted geometric mean

      $a_1^{w_1} \times a_2^{w_2} \times \cdots \times a_n^{w_n}$
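The weighted geometric mean raises each value to its weight and multiplies the results. A minimal Python sketch follows. It assumes the weights sum to 1, so that equal weights of 1/n reduce it to the ordinary geometric mean; the function names are hypothetical.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n values."""
    return math.prod(values) ** (1 / len(values))


def weighted_geometric_mean(values, weights):
    """Product of each value raised to its weight (weights assumed to sum to 1)."""
    return math.prod(v ** w for v, w in zip(values, weights))


plain = geometric_mean([2, 8])
equal_weights = weighted_geometric_mean([2, 8], [0.5, 0.5])
skewed = weighted_geometric_mean([2, 8], [0.75, 0.25])
```

With equal weights both functions give 4 (up to floating-point rounding). Shifting weight toward the smaller value pulls the result below 4.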

Page 61: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance (cont’d)

              Resp. time on machine          Normalized values
              REF     A       B      Ratio   A       B       Ratio
Program 1     10      11      12             1.1     1.2
Program 2     40      49.5    60             1.24    1.5
Arith. mean           30.25   36     1.19    1.17    1.35    1.16
Geo. mean             23.33   26.83  1.15    1.167   1.342   1.15
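The table's key property is that the B-to-A ratio of geometric means is the same for raw and normalized times, while the ratio of arithmetic means changes. The Python sketch below recomputes the entries from the response times in the table; the helper names are assumptions.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return math.prod(values) ** (1 / len(values))


# Response times from the table: reference machine, machine A, machine B.
ref = [10, 40]
a = [11, 49.5]
b = [12, 60]

# Normalize each program's time to the reference machine.
norm_a = [x / r for x, r in zip(a, ref)]
norm_b = [x / r for x, r in zip(b, ref)]

arith_ratio_raw = arithmetic_mean(b) / arithmetic_mean(a)
arith_ratio_norm = arithmetic_mean(norm_b) / arithmetic_mean(norm_a)
geo_ratio_raw = geometric_mean(b) / geometric_mean(a)
geo_ratio_norm = geometric_mean(norm_b) / geometric_mean(norm_a)
```

The two geometric-mean ratios agree because normalization divides both machines' products by the same reference values, which cancel in the ratio. The arithmetic-mean ratios differ (about 1.19 raw versus 1.16 normalized), so a comparison based on them depends on the choice of reference machine.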

Page 62: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance (cont’d)

              Resp. time on machine
              A       B
Program 1     20      200
Program 2     50      5
Arith. mean   35      102.5
Geo. mean     31.62   31.62

Problem with arithmetic mean

Page 63: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance (cont’d)

• SPEC Benchmarks
  SPEC CPU2000
    » Measures performance of processors, memory, and compiler
    » Consists of 26 applications
      – Spans four languages: C, C++, FORTRAN 77, and FORTRAN 90
    » Consists of
      – Integer: CINT2000
      – Floating-point: CFP2000

Page 64: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance (cont’d)

[Figure: SPECint2000 (axis 0 to 700) versus clock rate (600 to 2000 MHz) for the PIII and P4]

Page 65: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance (cont’d)

[Figure: SPECfp2000 (axis 0 to 700) versus clock rate (600 to 2000 MHz) for the P4 and PIII]

Page 66: Processor Organization and Performance Chapter 6 S. Dandamudi.

Performance (cont’d)

• SPEC Benchmarks
  SPECmail2001
    » Standardized mail server benchmark
      – For systems supporting POP3 and SMTP
    » Uses both throughput and response times
  SPECweb99
    » Benchmark for HTTP servers
  SPECjvm98
    » Benchmark for JVM client platform

Last slide