Chapter 11 CPU Structure and Function

Chapter 11 CPU Structure and Function. CPU Structure CPU must: —Fetch instructions —Interpret instructions —Fetch data —Process data —Write data.

Jan 02, 2016


Marlene Fleming
Transcript
Page 1:

Chapter 11 CPU Structure and Function

Page 2:

CPU Structure

• CPU must:
—Fetch instructions
—Interpret instructions
—Fetch data
—Process data
—Write data

Page 3:

CPU With Systems Bus

Page 4:

CPU Internal Structure

Page 5:

Registers

• CPU must have some working space (temporary storage)

• Called registers
• Number and function vary between processor designs
• One of the major design decisions
• Top level of memory hierarchy

Page 6:

User Visible Registers

• General Purpose
• Data
• Address
• Condition Codes

Page 7:

General Purpose Registers (1)

• May be true general purpose
• May be restricted
• May be used for data or addressing
• Data
—Accumulator
• Addressing
—Segment

Page 8:

General Purpose Registers (2)

• Make them general purpose
—Increase flexibility and programmer options
—Increase instruction size & complexity
• Make them specialized
—Smaller (faster) instructions
—Less flexibility

Page 9:

How Many GP Registers?

• Typically between 8 and 32
• Fewer = more memory references
• More does not noticeably reduce memory references and takes up processor real estate
• See also RISC

Page 10:

How big?

• Large enough to hold full address
• Large enough to hold full word
• Often possible to combine two data registers
—C programming
—long long int a;
—long int a;
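
Combining two data registers to hold a double-length value can be sketched as follows. This is only an illustration: the 32-bit register width, the (high, low) pair layout, and the name add64 are assumptions, not something the slides specify.

```python
MASK32 = 0xFFFFFFFF  # assumed register width: 32 bits

def add64(a_hi, a_lo, b_hi, b_lo):
    """Add two 64-bit values, each held as a (high, low) pair of 32-bit registers."""
    lo = a_lo + b_lo
    carry = lo >> 32                      # carry out of the low word
    hi = (a_hi + b_hi + carry) & MASK32   # add-with-carry on the high word
    return hi, lo & MASK32

# 0x00000001_FFFFFFFF + 0x00000000_00000001
hi, lo = add64(0x00000001, 0xFFFFFFFF, 0x00000000, 0x00000001)
print(hex(hi), hex(lo))
```

The carry out of the low register is the only information passed between the two halves, which is why many instruction sets provide an add-with-carry instruction.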

Page 11:

Condition Code Registers

• Sets of individual bits
—e.g. result of last operation was zero
• Can be read (implicitly) by programs
—e.g. Jump if zero
• Cannot (usually) be set by programs

Page 12:

Control & Status Registers

• Program Counter (PC)
• Instruction Register (IR)
• Memory Address Register (MAR)
• Memory Buffer Register (MBR)

• Revision: what do these all do?

Page 13:

Program Status Word

• A set of bits
• Includes Condition Codes
• Sign of last result
• Zero
• Carry
• Equal
• Overflow
• Interrupt enable/disable
• Supervisor
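
How the condition-code bits of a program status word might be derived from an addition can be sketched as below. The 8-bit width, the flag names, and the function add_with_flags are assumptions chosen for illustration; real processors differ in which flags they keep and when they update them.

```python
def add_with_flags(a, b, bits=8):
    """Add two unsigned `bits`-wide values and derive condition codes."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    full = (a & mask) + (b & mask)
    result = full & mask
    flags = {
        "sign": bool(result & sign_bit),   # top bit of the result
        "zero": result == 0,
        "carry": full > mask,              # unsigned carry out
        # signed overflow: operands share a sign that the result does not
        "overflow": bool(~(a ^ b) & (a ^ result) & sign_bit),
    }
    return result, flags

print(add_with_flags(0x7F, 0x01))
print(add_with_flags(0xFF, 0x01))
```

A conditional jump such as "jump if zero" then only needs to test one of these bits rather than recompute the comparison.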

Page 14:

Supervisor Mode

• Intel ring zero
• Kernel mode
• Allows privileged instructions to execute
• Used by operating system
• Not available to user programs

Page 15:

Other Registers

• May have registers pointing to:
—Process control blocks (see O/S)
—Interrupt Vectors (see O/S)

• N.B. CPU design and operating system design are closely linked

Page 16:

Example Register Organizations

Page 17:

Instruction Cycle

• Revision
• Stallings Chapter 3

Page 18:

Indirect Cycle

• May require memory access to fetch operands

• Indirect addressing requires more memory accesses

• Can be thought of as additional instruction subcycle

Page 19:

Instruction Cycle with Indirect

Page 20:

Instruction Cycle State Diagram

Page 21:

Data Flow (Instruction Fetch)

• Depends on CPU design
• In general:
• Fetch
—PC contains address of next instruction
—Address moved to MAR
—Address placed on address bus
—Control unit requests memory read
—Result placed on data bus, copied to MBR, then to IR
—Meanwhile PC incremented by 1
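
The fetch steps above can be sketched as a toy simulation. The class name CPU, the list used as memory, and word addressing (so that the PC advances by 1) are assumptions made for illustration; the register names follow the slide.

```python
class CPU:
    """Toy word-addressed machine; register names follow the slide."""

    def __init__(self, memory):
        self.memory = memory
        self.pc = 0    # program counter
        self.mar = 0   # memory address register
        self.mbr = 0   # memory buffer register
        self.ir = 0    # instruction register

    def fetch(self):
        self.mar = self.pc                 # address moved to MAR
        self.mbr = self.memory[self.mar]   # memory read: data bus -> MBR
        self.pc += 1                       # PC incremented (word addressing assumed)
        self.ir = self.mbr                 # MBR copied to IR

cpu = CPU(memory=[0x1005, 0x2006, 0x0000])
cpu.fetch()
print(hex(cpu.ir), cpu.pc)
```

In hardware the PC increment overlaps the memory read; the sequential statements here only show the order of the data movements.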

Page 22:

Data Flow (Data Fetch)

• IR is examined
• If indirect addressing, indirect cycle is performed
—Right most N bits of MBR transferred to MAR
—Control unit requests memory read
—Result (address of operand) moved to MBR
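
A minimal sketch of the indirect cycle, assuming a 12-bit address field (N = 12) and a dictionary standing in for memory. Both choices, and the function name indirect_cycle, are hypothetical.

```python
ADDR_BITS = 12  # assumed: the rightmost N = 12 bits of a word form the address field

def indirect_cycle(memory, mbr):
    """Follow one level of indirection; return the new (MAR, MBR) contents."""
    mar = mbr & ((1 << ADDR_BITS) - 1)   # rightmost N bits of MBR -> MAR
    mbr = memory[mar]                    # memory read: address of operand -> MBR
    return mar, mbr

memory = {0x005: 0x0A0}                  # location 0x005 holds the operand address
mar, mbr = indirect_cycle(memory, 0x9005)
print(hex(mar), hex(mbr))
```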

Page 23:

Data Flow (Fetch Diagram)

Page 24:

Data Flow (Indirect Diagram)

Page 25:

Data Flow (Execute)

• May take many forms
• Depends on instruction being executed
• May include
—Memory read/write
—Input/Output
—Register transfers
—ALU operations

Page 26:

Data Flow (Interrupt)

• Simple
• Predictable
• Current PC saved to allow resumption after interrupt
• Contents of PC copied to MBR
• Special memory location (e.g. stack pointer) loaded to MAR
• MBR written to memory
• PC loaded with address of interrupt handling routine
• Next instruction (first of interrupt handler) can be fetched
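
The interrupt data flow can be sketched in the same toy style. The dictionary-based register state, the downward-growing stack, and the name interrupt_cycle are assumptions; the slide only says that a special location such as the stack pointer supplies the save address.

```python
def interrupt_cycle(state, memory, handler_address):
    """Save the PC at the location given by the stack pointer, then enter the handler."""
    state["mbr"] = state["pc"]             # contents of PC copied to MBR
    state["sp"] -= 1                       # assumed: stack grows downward
    state["mar"] = state["sp"]             # save location loaded to MAR
    memory[state["mar"]] = state["mbr"]    # MBR written to memory
    state["pc"] = handler_address          # PC loaded with handler address
    return state

state = {"pc": 42, "sp": 100, "mar": 0, "mbr": 0}
memory = {}
interrupt_cycle(state, memory, handler_address=500)
print(state["pc"], memory)
```

After this cycle the next fetch reads the first instruction of the handler, and the saved PC in memory allows the interrupted program to resume later.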

Page 27:

Data Flow (Interrupt Diagram)

Page 28:

Prefetch

• Fetch accesses main memory
• Execution usually does not access main memory
• Can fetch next instruction during execution of current instruction
• Called instruction prefetch

Page 29:

Improved Performance

• But not doubled:—Fetch usually shorter than execution

– Prefetch more than one instruction?

—Any jump or branch means that prefetched instructions are not the required instructions

• Add more stages to improve performance

Page 30:

Pipelining

• Fetch instruction
• Decode instruction
• Calculate operands (i.e. EAs)
• Fetch operands
• Execute instructions
• Write result

• Overlap these operations
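
The benefit of overlapping the six operations can be sketched under idealized assumptions: every stage takes one time unit, and there are no branches or resource conflicts. Under those assumptions n instructions through a k-stage pipeline need k + (n - 1) time units instead of n * k. The stage abbreviations and function names below are illustrative choices.

```python
STAGES = ("FI", "DI", "CO", "FO", "EI", "WO")  # assumed abbreviations for the six stages

def pipeline_cycles(n_instructions, k_stages):
    """Time units for n instructions in an ideal k-stage pipeline."""
    return k_stages + (n_instructions - 1)

def speedup(n_instructions, k_stages):
    """Ratio of unpipelined time (n * k) to ideal pipelined time."""
    return (n_instructions * k_stages) / pipeline_cycles(n_instructions, k_stages)

def timing_diagram(n_instructions, stages=STAGES):
    """One text row per instruction; each column is one time unit."""
    width = pipeline_cycles(n_instructions, len(stages))
    rows = []
    for i in range(n_instructions):
        row = ["  "] * width
        for s, name in enumerate(stages):
            row[i + s] = name          # instruction i enters stage s at time i + s
        rows.append(" ".join(row))
    return rows

print("\n".join(timing_diagram(4)))
print(pipeline_cycles(9, 6), speedup(9, 6))
```

As n grows, the speedup approaches the number of stages k, which is the best case; branches and stalls reduce it in practice.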

Page 31:

Two Stage Instruction Pipeline

Page 32:

Timing Diagram for Instruction Pipeline Operation

Page 33:

The Effect of a Conditional Branch on Instruction Pipeline Operation

Page 34:

Six Stage Instruction Pipeline

Page 35:

Alternative Pipeline Depiction

Page 36:

Speedup Factors with Instruction Pipelining

Page 37:

Dealing with Branches

• Multiple Streams
• Prefetch Branch Target
• Loop buffer
• Branch prediction
• Delayed branching

Page 38:

Multiple Streams

• Have two pipelines
• Prefetch each branch into a separate pipeline
• Use appropriate pipeline
• Leads to bus & register contention
• Multiple branches lead to further pipelines being needed

Page 39:

Prefetch Branch Target

• Target of branch is prefetched in addition to instructions following branch

• Keep target until branch is executed
• Used by IBM 360/91

Page 40:

Loop Buffer

• Very fast memory
• Maintained by fetch stage of pipeline
• Check buffer before fetching from memory
• Very good for small loops or jumps
• cf. cache
• Used by CRAY-1

Page 41:

Loop Buffer Diagram

Page 42:

Branch Prediction (1)

• Predict never taken
—Assume that jump will not happen
—Always fetch next instruction
—68020 & VAX 11/780
—VAX will not prefetch after branch if a page fault would result (O/S v CPU design)
• Predict always taken
—Assume that jump will happen
—Always fetch target instruction

Page 43:

Branch Prediction (2)

• Predict by Opcode
—Some instructions are more likely to result in a jump than others
—Can get up to 75% success
• Taken/Not taken switch
—Based on previous history
—Good for loops
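
One common way to build a history-based taken/not-taken switch is a two-bit saturating counter. The sketch below uses that scheme as an assumption; it is not necessarily the exact state machine used in the course material, and the class and function names are hypothetical.

```python
class TwoBitPredictor:
    """Saturating 2-bit counter: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counter = 0

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)

def accuracy(outcomes):
    """Fraction of branch outcomes predicted correctly, in program order."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        correct += predictor.predict() == taken
        predictor.update(taken)
    return correct / len(outcomes)

# A loop branch: taken nine times, then falls through; the loop is run ten times.
loop_branch = ([True] * 9 + [False]) * 10
print(accuracy(loop_branch))
```

Because two consecutive mispredictions are needed to flip the prediction, the single not-taken outcome at each loop exit does not disturb the prediction for the next run of the loop, which is why this style of history works well for loops.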

Page 44:

Branch Prediction (3)

• Delayed Branch
—Do not take jump until you have to
—Rearrange instructions

Page 45:

Branch Prediction Flowchart

Page 46:

Branch Prediction State Diagram

Page 47:

Dealing With Branches

Page 48:
Page 49:

Intel 80486 Pipelining

• Fetch
—From cache or external memory
—Put in one of two 16-byte prefetch buffers
—Fill buffer with new data as soon as old data consumed
—Average 5 instructions fetched per load
—Independent of other stages to keep buffers full
• Decode stage 1
—Opcode & address-mode info
—At most first 3 bytes of instruction
—Can direct D2 stage to get rest of instruction
• Decode stage 2
—Expand opcode into control signals
—Computation of complex address modes
• Execute
—ALU operations, cache access, register update
• Writeback
—Update registers & flags
—Results sent to cache & bus interface write buffers

Page 50:

80486 Instruction Pipeline Examples

Page 51:

Pentium 4 Registers

Page 52:

Cont..

• General: There are eight 32-bit general-purpose registers. These may be used for all types of Pentium instructions; they can also hold operands for address calculations. Some of these registers also serve special purposes. For example, string instructions use the contents of the ECX, ESI, and EDI registers as operands without having to reference these registers explicitly in the instruction. As a result, a number of instructions can be encoded more compactly.

• Segment: The six 16-bit segment registers contain segment selectors, which index into segment tables. The code segment (CS) register references the segment containing the instruction being executed. The stack segment (SS) register references the segment containing a user-visible stack. The remaining segment registers (DS, ES, FS, GS) enable the user to reference up to four separate data segments at a time.

Page 53:

Cont..

• Flags: The EFLAGS register contains condition codes and various mode bits.

• Instruction pointer: Contains the address of the current instruction.

There are also registers specifically devoted to the floating-point unit:

• Numeric: Each register holds an extended-precision 80-bit floating-point number. There are eight registers that function as a stack, with push and pop operations available in the instruction set.

• Control: The 16-bit control register contains bits that control the operation of the floating-point unit, including the type of rounding control; single, double, or extended precision; and bits to enable or disable various exception conditions.

Page 54:

Cont..

• Status: The 16-bit status register contains bits that reflect the current state of the floating-point unit, including a 3-bit pointer to the top of the stack; condition codes reporting the outcome of the last operation; and exception flags.

• Tag word: This 16-bit register contains a 2-bit tag for each floating-point numeric register, which indicates the nature of the contents of the corresponding register. The four possible values are valid, zero, special, and empty. These tags enable programs to check the contents of a numeric register without performing complex decoding of the actual data in the register. For example, when a context switch is made, the processor need not save any floating-point registers that are empty.
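
Decoding the tag word can be sketched as follows. The slides do not give the bit encoding; the one used here (00 valid, 01 zero, 10 special, 11 empty, with register i in bits 2i+1..2i) is the usual x87 layout and should be read as an assumption, as should the function names.

```python
TAG_NAMES = {0b00: "valid", 0b01: "zero", 0b10: "special", 0b11: "empty"}

def decode_tag_word(tag_word):
    """Return the tag name for each of the eight numeric registers."""
    return [TAG_NAMES[(tag_word >> (2 * i)) & 0b11] for i in range(8)]

def registers_to_save(tag_word):
    """Registers a context switch would need to save: every non-empty one."""
    return [i for i, tag in enumerate(decode_tag_word(tag_word)) if tag != "empty"]

print(decode_tag_word(0xFFF9))
print(registers_to_save(0xFFF9))
```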

Page 55:

EFLAGS Register

Page 56:

Cont..

• Trap flag (TF): when set, causes an interrupt after the execution of each instruction. This is used for debugging.

• Interrupt enable flag (IF): when set, the processor will recognize external interrupts.

• Direction flag (DF): determines whether string processing instructions increment or decrement the 16-bit half-registers SI and DI (for 16-bit operation) or the 32-bit registers ESI and EDI (for 32-bit operation).

• I/O privilege level (IOPL): a 2-bit field giving the privilege level required to perform I/O. During protected-mode operation, the processor generates an exception when code running at a less privileged level accesses an I/O device (unless the task's I/O permission bitmap allows the access).

Page 57:

Cont..

• Resume flag (RF): allows the programmer to disable debug exceptions so that the instruction can be restarted after a debug exception without immediately causing another debug exception.

• Alignment check (AC): activates if a word or doubleword is addressed on a nonword or nondoubleword boundary.

• Identification flag (ID): if this bit can be set and cleared, then this processor supports the CPUID instruction. This instruction provides information about the vendor, family, and model.
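
Extracting these control bits from an EFLAGS value can be sketched as below. The bit positions are the standard x86 ones rather than something stated on the slides, and the function name decode_eflags is hypothetical.

```python
# Standard x86 bit positions for the single-bit flags discussed above.
EFLAGS_BITS = {"TF": 8, "IF": 9, "DF": 10, "RF": 16, "AC": 18, "ID": 21}

def decode_eflags(value):
    """Return the control flags of a 32-bit EFLAGS value as a dictionary."""
    flags = {name: bool((value >> bit) & 1) for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> 12) & 0b11   # IOPL is a 2-bit field (bits 13:12)
    return flags

print(decode_eflags(0x00000202))   # illustrative value with IF set
```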

Page 58:

Control Registers

Page 59:

Control register

• Protection enable (PE): Enable/disable protected mode of operation

• Monitor coprocessor (MP): Only of interest when running programs from earlier machines on the Pentium; it relates to the presence of an arithmetic coprocessor.

• Emulation (EM): set when the processor does not have a floating-point unit; causes an interrupt when an attempt is made to execute a floating-point instruction.

• Task switched (TS): Indicates that the processor has switched tasks.

• Extension type (ET): used to indicate support of math coprocessor instructions on earlier machines.

Page 60:

Cont..

• Numeric error (NE): Enables the standard mechanism for reporting floating point errors on external bus lines

• Write protected (WP): when this bit is clear, read only user level pages can be written by a supervisor process. This feature is useful for supporting process creation in some operating systems.

• Alignment mask (AM): Enables/disables alignment checking

• Not write through (NW): selects mode of operation of the data cache. When this bit is set, the data cache is inhibited from cache write-through operations.

• Cache disable (CD): Enables/disables the internal cache fill mechanism.

• Paging (PG): Enables/disables paging.

Page 61:

MMX Register Mapping

• MMX uses several 64-bit data types
• Use 3-bit register address fields so that eight MMX registers are supported
• No MMX-specific registers
—Aliasing to lower 64 bits of existing floating-point registers

Page 62:

Mapping of MMX Registers to Floating-Point Registers

Page 63:

Key characteristics of MMX

• Recall that the floating point registers are treated as a stack for floating point operations. For MMX operations, these registers are accessed directly.

• The first time that an MMX instruction is executed after any floating-point operations, the FP tag word is marked valid. This reflects the change from stack operation to direct register addressing.

• The EMMS instruction sets bits of the FP tag word to indicate that all registers are empty. It is important that the programmer insert this instruction at the end of an MMX code block so that subsequent floating point operations function properly.

• When a value is written to an MMX register, bits [79:64] of the corresponding FP register are set to all ones. This sets the value in the FP register to NaN (not a number) or infinity when viewed as a floating point value, which ensures that an MMX data value will not look like a valid floating point value.
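
The aliasing of an MMX register onto an 80-bit floating-point register can be sketched with plain integers. The function names are hypothetical, and the 80-bit register is modeled simply as a Python integer.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def write_mmx(value64):
    """Return the 80-bit FP register image after an MMX write."""
    # Low 64 bits hold the MMX data; bits [79:64] are forced to all ones.
    return (0xFFFF << 64) | (value64 & MASK64)

def read_mmx(fp80):
    """An MMX read sees only the low 64 bits of the FP register."""
    return fp80 & MASK64

image = write_mmx(0x0123456789ABCDEF)
print(hex(image))
print(hex(read_mmx(image)))
```

With the sign bit and all 15 exponent bits set, the register cannot be mistaken for an ordinary finite floating point number.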

Page 64:

Pentium Interrupt Processing

• Interrupts
—Maskable: received on the processor INTR pin. The processor does not recognize a maskable interrupt unless the interrupt enable flag (IF) is set.
—Nonmaskable: received on the processor NMI pin. Recognition of such interrupts cannot be prevented.
• Exceptions
—Processor detected: result when the processor encounters an error while attempting to execute an instruction.
—Programmed: these are instructions that generate an exception.
• Interrupt vector table
—Each interrupt type assigned a number
—Index to vector table
—The table contains 256 32-bit interrupt vectors
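
Indexing a table of 256 32-bit vectors can be sketched as follows. The byte-array model of the table, the little-endian byte order, and the function names are assumptions for illustration.

```python
VECTOR_COUNT = 256
VECTOR_SIZE = 4   # bytes: each vector is 32 bits

def vector_offset(number):
    """Byte offset of interrupt vector `number` from the start of the table."""
    if not 0 <= number < VECTOR_COUNT:
        raise ValueError("interrupt number must be in 0..255")
    return number * VECTOR_SIZE

def read_vector(table, number):
    """Read one 32-bit vector (little-endian byte order assumed)."""
    offset = vector_offset(number)
    return int.from_bytes(table[offset:offset + VECTOR_SIZE], "little")

table = bytearray(VECTOR_COUNT * VECTOR_SIZE)          # 1024-byte table
table[8:12] = (0x12345678).to_bytes(4, "little")       # hypothetical entry for vector 2
print(vector_offset(2), hex(read_vector(table, 2)))
```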

Page 65:

Cont..

• 5 priority classes
• Class 1: Traps on the previous instruction (vector 1)
• Class 2: External interrupts (2, 32-255)
• Class 3: Faults from fetching next instruction (3, 4)
• Class 4: Faults from decoding the next instruction (6, 7)
• Class 5: Faults on executing an instruction

Page 66:

Interrupt handling

1) If the transfer involves a change of privilege level, then the current stack segment register and the current extended stack pointer (ESP) register are pushed onto the stack

2) The current value of the EFLAGS register is pushed onto stack

3) Both the interrupt (IF) and trap (TF) flags are cleared. This disables INTR interrupts and the trap or single-step feature.

4) The current code segment (CS) pointer and the current instruction pointer are pushed onto the stack

5) If the interrupt is accompanied by an error code, then the error code is pushed onto the stack

6) The interrupt vector contents are fetched and loaded into the CS and IP or EIP registers. Execution continues from the interrupt service routine.
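
The six steps can be sketched as a simplified simulation. The dictionary of registers, the Python list used as the stack, and the function name enter_interrupt are assumptions; the sketch also ignores details such as stack switching on a privilege change.

```python
IF = 1 << 9   # interrupt enable flag (standard EFLAGS bit position)
TF = 1 << 8   # trap flag

def enter_interrupt(cpu, stack, vector, privilege_change=False, error_code=None):
    """Apply steps 1-6 to `cpu` (register dict) and `stack` (list, top at the end)."""
    if privilege_change:                       # step 1
        stack.append(cpu["SS"])
        stack.append(cpu["ESP"])
    stack.append(cpu["EFLAGS"])                # step 2
    cpu["EFLAGS"] &= ~(IF | TF)                # step 3
    stack.append(cpu["CS"])                    # step 4
    stack.append(cpu["EIP"])
    if error_code is not None:                 # step 5
        stack.append(error_code)
    cpu["CS"], cpu["EIP"] = vector             # step 6

cpu = {"SS": 0x10, "ESP": 0x1000, "EFLAGS": 0x302, "CS": 0x08, "EIP": 0x400}
stack = []
enter_interrupt(cpu, stack, vector=(0x08, 0x9000))
print([hex(v) for v in stack], hex(cpu["EFLAGS"]), hex(cpu["EIP"]))
```

EFLAGS is pushed before IF and TF are cleared, so the handler's return sequence restores the interrupted program's original flag settings.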

Page 67:

PowerPC User Visible Registers

Page 68:

Fixed-point unit includes the following:

• General: There are 32 64-bit general-purpose registers. These may be used to load, store, and manipulate data operands and may also be used for register indirect addressing. Register 0 is treated somewhat differently: for load and store operations and several of the add instructions, register 0 is treated as having a constant value of zero regardless of its actual contents.

• Exception register (XER): Includes 3 bits that report exceptions in integer arithmetic operations. This register also includes a byte count field that is used as an operand for some string instructions
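
The special treatment of register 0 in load and store address calculation can be sketched as below. The list standing in for the register file and the function name effective_address are assumptions for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def effective_address(gpr, ra, displacement):
    """Base-plus-displacement address; a base register field of 0 means the constant 0."""
    base = 0 if ra == 0 else gpr[ra]       # register 0 reads as zero here
    return (base + displacement) & MASK64

gpr = [0] * 32
gpr[0] = 0xDEAD        # actual contents of register 0 are ignored as a base
gpr[3] = 0x1000
print(hex(effective_address(gpr, 0, 8)), hex(effective_address(gpr, 3, 8)))
```

This gives load and store instructions a way to use an absolute address without dedicating a register to hold zero.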

Page 69:

Floating point unit

• General: There are 32 64-bit general-purpose registers, used for all floating-point operations.

• Floating-point status and control register (FPSCR): This 32-bit register contains bits that control the operation of the floating-point unit and bits that record the status resulting from floating-point operations.

Page 70:

PowerPC Register Formats

Page 71:

Interrupt Processing

Page 72:

Interrupt Handling

1) The processor places the address of the instruction to be executed next in the save/restore register 0 (SRR0). This is the address of the currently executing instruction if the interrupt was caused by a failed attempt to execute that instruction; otherwise, it is the address of the next instruction to be executed after the current instruction.

2) The processor copies machine state information from the MSR to the save/restore register 1 (SRR1). The bits that are depicted as unshaded in Table 12.7 (page 440) are copied. The remaining bits of SRR1 are loaded with information specific to the interrupt type.

3) The MSR is set to a hardware-defined value specific to the interrupt type. For all interrupt types, address translation is turned off and external interrupts are disabled.

4) The processor then transfers control to the appropriate interrupt handler. The addresses of the interrupt handlers are stored in the interrupt table (Table 12.6). The base address of that table is determined by bit 57 of the MSR.