Module1 - Overview and Computer System

Apr 10, 2018

Suhaila Najib
  • 8/8/2019 Module1 - Overview and Computer System

    1/102

    Module 1

    Overview: Introduction to Computer Architecture and Organization

  • 8/8/2019 Module1 - Overview and Computer System

    2/102

    Architecture & Organization 1

    Architecture is those attributes visible to the programmer

    Instruction set, number of bits used for data representation, I/O mechanisms, addressing techniques.

    e.g. Is there a multiply instruction?

    Organization is how architecture features are implemented

    Control signals, interfaces, memory technology.

    e.g. Is there a hardware multiply unit or is it done by repeated addition?
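
    The repeated-addition alternative mentioned above can be sketched in software. This is only an illustration of the idea, not how any particular machine implements it; the function name is hypothetical.

```python
def multiply_by_repeated_addition(a: int, b: int) -> int:
    """Multiply two integers using only addition, negation and a loop."""
    # The result is negative when exactly one operand is negative.
    negative = (a < 0) != (b < 0)
    a, b = abs(a), abs(b)
    # Loop over the smaller operand to keep the number of additions low.
    if a < b:
        a, b = b, a
    total = 0
    for _ in range(b):
        total += a
    return -total if negative else total
```

    The same architecture (a multiply instruction) could be backed by a loop like this or by a dedicated multiply unit; only the organization differs.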

  • 8/8/2019 Module1 - Overview and Computer System

    3/102

    Architecture & Organization 2

    All members of the Intel x86 family share the same basic architecture

    The IBM System/370 family shares the same basic architecture

    This gives code compatibility

    At least backwards

    Organization differs between different versions

  • 8/8/2019 Module1 - Overview and Computer System

    4/102

    Structure & Function

    Structure is the way in which components relate to each other

    Function is the operation of individual components as part of the structure

  • 8/8/2019 Module1 - Overview and Computer System

    5/102

    Function

    The basic computer functions are:

    Data processing: process data in a variety of forms and requirements

    Data storage: short-term and long-term data storage for retrieval and update

    Data movement: move data between the computer and the outside world. Devices serve as sources and destinations of data; the process is known as I/O

    Control: control of processing, moving and storing data using instructions

  • 8/8/2019 Module1 - Overview and Computer System

    6/102

    Functional View

  • 8/8/2019 Module1 - Overview and Computer System

    7/102

    Operations (a): data movement (device operation)

    Operations (b): data storage (device operation, read and write)

  • 8/8/2019 Module1 - Overview and Computer System

    8/102

    Operation (c): processing from/to storage

    Operation (d): processing from storage to I/O

  • 8/8/2019 Module1 - Overview and Computer System

    9/102

    Structure - Top Level - 1

    [Figure: top-level structure. Inside the computer: Central Processing Unit, Main Memory, Input/Output, Systems Interconnection. Outside the computer: Peripherals, Communication lines.]

  • 8/8/2019 Module1 - Overview and Computer System

    10/102

    Structure - Top Level - 2

    Central Processing Unit (CPU) controls the operation of the computer and performs data processing functions

    Main memory stores data

    I/O moves data between the computer and the external environment

    System interconnection provides communication between CPU, main memory and I/O

  • 8/8/2019 Module1 - Overview and Computer System

    11/102

    Structure - The CPU - 1

    [Figure: structure of the CPU. Inside the computer, the CPU connects to I/O and Memory over the System Bus. Inside the CPU: Arithmetic and Logic Unit, Control Unit, Registers, Internal CPU Interconnection.]

  • 8/8/2019 Module1 - Overview and Computer System

    12/102

    Structure - The CPU - 2

    Control unit (CU): controls the operation of the CPU

    Arithmetic and logic unit (ALU): performs data processing functions

    Registers: provide internal storage to the CPU

    CPU interconnection: provides communication between the control unit (CU), ALU and registers

  • 8/8/2019 Module1 - Overview and Computer System

    13/102

    Structure - The Control Unit

    [Figure: structure of the control unit. Inside the CPU, the Control Unit connects to the ALU and Registers over the Internal Bus. Inside the Control Unit: Sequencing Logic, Control Unit Registers and Decoders, Control Memory.]

    Implementation of control unit: microprogrammed implementation

  • 8/8/2019 Module1 - Overview and Computer System

    14/102

    Module 1

    Overview: Von Neumann Machine and Computer Evolution

  • 8/8/2019 Module1 - Overview and Computer System

    15/102

    First Generation: Vacuum Tubes

    ENIAC - background

    Electronic Numerical Integrator And Computer

    Eckert and Mauchly

    University of Pennsylvania

    Trajectory tables for weapons (range and trajectory)

    Started 1943

    Finished 1946

    Too late for war effort, but used to determine the feasibility of the hydrogen bomb

    Used until 1955

  • 8/8/2019 Module1 - Overview and Computer System

    16/102

    ENIAC - diagram

  • 8/8/2019 Module1 - Overview and Computer System

    17/102

    ENIAC - details

    Decimal (not binary)

    20 accumulators of 10 digits

    Programmed manually by switches

    18,000 vacuum tubes

    30 tons

    15,000 square feet

    140 kW power consumption

    5,000 additions per second

  • 8/8/2019 Module1 - Overview and Computer System

    18/102

    Von Neumann Machine/Turing

    Stored Program concept

    Main memory storing programs and data, which could be changed, making programming easier

    ALU operating on binary data

    Control unit interpreting instructions from memory and executing them

    Input and output (I/O) equipment operated by control unit

    Princeton Institute for Advanced Study

    IAS computer

    Completed 1952

  • 8/8/2019 Module1 - Overview and Computer System

    19/102

    Structure of von Neumann

    machine

  • 8/8/2019 Module1 - Overview and Computer System

    20/102

    Von Neumann Architecture

    Data and instructions are stored in a single read-write memory

    The contents of this memory are addressable by location

    Execution occurs in a sequential fashion from one instruction to the next

  • 8/8/2019 Module1 - Overview and Computer System

    21/102

    IAS details-1

    1000 x 40 bit words

    Binary number (number word)

    2 x 20 bit instructions (instruction word)
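
    The instruction word layout can be illustrated with a short sketch. The split of each 20-bit instruction into an 8-bit opcode and a 12-bit address is taken from the IAS design and is not stated on the slide; the function names and the sample word are hypothetical.

```python
WORD_BITS = 40
OPCODE_BITS = 8      # assumption from the IAS design, not from the slide
ADDRESS_BITS = 12    # assumption from the IAS design, not from the slide
INSTRUCTION_BITS = OPCODE_BITS + ADDRESS_BITS  # 20 bits per instruction


def decode_instruction(instruction: int) -> tuple[int, int]:
    """Split one 20-bit instruction into (opcode, address)."""
    opcode = instruction >> ADDRESS_BITS
    address = instruction & ((1 << ADDRESS_BITS) - 1)
    return opcode, address


def split_instruction_word(word: int) -> tuple[tuple[int, int], tuple[int, int]]:
    """Split a 40-bit word into its left and right instructions."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word does not fit in 40 bits")
    left = word >> INSTRUCTION_BITS
    right = word & ((1 << INSTRUCTION_BITS) - 1)
    return decode_instruction(left), decode_instruction(right)


# Hypothetical word: left = opcode 1, address 250; right = opcode 5, address 251.
sample_word = (1 << 32) | (250 << 20) | (5 << 12) | 251
left_instruction, right_instruction = split_instruction_word(sample_word)
```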

  • 8/8/2019 Module1 - Overview and Computer System

    22/102

    IAS details-2

    Set of registers (storage in CPU)

    Memory Buffer Register (MBR): contains a word to be stored in memory or sent to the I/O unit, or receives a word from memory or from the I/O unit

    Memory Address Register (MAR): specifies the address in memory of the word to be written from or read into the MBR

    Instruction Register (IR): contains the opcode of the instruction being executed

    Instruction Buffer Register (IBR): temporary storage for an instruction

    Program Counter (PC): contains the address of the next instruction to be fetched from memory

    Accumulator (AC) and Multiplier Quotient (MQ): temporary storage for operands and results of ALU operations

  • 8/8/2019 Module1 - Overview and Computer System

    23/102

    Structure of IAS - detail-3

    Control unit fetches instructions from memory and executes them one by one

  • 8/8/2019 Module1 - Overview and Computer System

    24/102

    Commercial Computers

    1947 - Eckert-Mauchly Computer Corporation: manufactured computers commercially

    UNIVAC I (Universal Automatic Computer): first commercial computer

    US Bureau of Census 1950 calculations

    Became part of Sperry-Rand Corporation

    Late 1950s - UNIVAC II

    Faster

    More memory

  • 8/8/2019 Module1 - Overview and Computer System

    25/102

    IBM

    Punched-card processing equipment

    1953 - the 701

    IBM's first stored program computer

    Scientific calculations

    1955 - the 702

    Business applications

    Led to 700/7000 series

  • 8/8/2019 Module1 - Overview and Computer System

    26/102

    Second Generation: Transistors

    Replaced vacuum tubes

    Smaller

    Cheaper

    Less heat dissipation

    Solid state device

    Made from silicon (sand)

    Invented 1947 at Bell Labs

    William Shockley et al.

  • 8/8/2019 Module1 - Overview and Computer System

    27/102

    Transistor Based Computers

    Second generation machines

    More complex arithmetic and logic unit (ALU) and control unit (CU)

    Use of high level programming languages

    NCR & RCA produced small transistor machines

    IBM 7000

    DEC - 1957

    Produced PDP-1

  • 8/8/2019 Module1 - Overview and Computer System

    28/102

    Transistor Based Computers

  • 8/8/2019 Module1 - Overview and Computer System

    29/102

    The Third Generation: Integrated Circuits

    Microelectronics

    Literally - small electronics

    A computer is made up of gates, memory cells and interconnections

    Data storage: provided by memory cells

    Data processing: provided by gates

    Data movement: the paths between components that are used to move data

    Control: the paths between components that carry control signals

    These can be manufactured on a semiconductor

    e.g. silicon wafer

  • 8/8/2019 Module1 - Overview and Computer System

    30/102

    Integrated Circuits

    Early integrated circuits: known as small scale integration (SSI)

  • 8/8/2019 Module1 - Overview and Computer System

    31/102

    Moore's Law

    Increased density of components on chip (refer chart)

    Gordon Moore, co-founder of Intel

    Number of transistors on a chip will double every year (refer chart)

    Since the 1970s development has slowed a little

    Number of transistors doubles every 18 months

    Cost of a chip has remained almost unchanged

    Higher packing density means shorter electrical paths, giving higher performance

    Smaller size gives increased flexibility

    Reduced power and cooling requirements

    Fewer interconnections increase reliability
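
    The two doubling rates quoted above can be compared with a small calculation. This is a sketch only; the function name and the starting count of 1,000 transistors are hypothetical, chosen purely for illustration.

```python
def projected_transistor_count(initial_count: float, years: float,
                               doubling_period_years: float) -> float:
    """Project a transistor count that doubles once per doubling period."""
    return initial_count * 2 ** (years / doubling_period_years)


# Hypothetical starting point: 1,000 transistors, projected 3 years ahead.
yearly_doubling = projected_transistor_count(1_000, 3, 1.0)
eighteen_month_doubling = projected_transistor_count(1_000, 3, 1.5)
```

    Over the same three years, doubling every year gives three doublings while doubling every 18 months gives only two, which is the slowdown the slide refers to.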

  • 8/8/2019 Module1 - Overview and Computer System

    32/102

    Growth in CPU Transistor Count

  • 8/8/2019 Module1 - Overview and Computer System

    33/102

    IBM 360 series

    1964

    Replaced (& not compatible with) 7000 series

    First planned family of computers

    Similar or identical instruction sets

    Similar or identical O/S

    Increasing speed

    Increasing number of I/O ports (i.e. more terminals)

    Increased memory size

    Increased cost

    Multiplexed switch structure: multiplexor

  • 8/8/2019 Module1 - Overview and Computer System

    34/102

    IBM 360 - images

  • 8/8/2019 Module1 - Overview and Computer System

    35/102

    DEC PDP-8

    1964

    First minicomputer (after miniskirt!)

    Did not need air conditioned room

    Small enough to sit on a lab bench

    $16,000

    $100k++ for IBM 360

    Embedded applications & OEM: integrated into a total system with other manufacturers' equipment

  • 8/8/2019 Module1 - Overview and Computer System

    36/102

    DEC - PDP-8 Bus Structure

    Universal bus structure (Omnibus): separate signal paths to carry control, data and address signals

    Common bus structure: controlled by CPU

  • 8/8/2019 Module1 - Overview and Computer System

    37/102

    Later Generations: Semiconductor Memory

    1950s to 1960s: memory made of rings of ferromagnetic material called cores; fast, but expensive, bulky and with destructive read

    Semiconductor memory: by Fairchild, 1970

    Size of a single core

    i.e. 1 bit of magnetic core storage

    Holds 256 bits

    Non-destructive read

    Much faster than core

    Capacity approximately doubles each year

  • 8/8/2019 Module1 - Overview and Computer System

    38/102

    Generations of Computer

    Vacuum tube - 1946-1957

    Transistor - 1958-1964

    Small scale integration (SSI) - 1965 on

    Up to 100 devices on a chip

    Medium scale integration (MSI) - to 1971

    100-3,000 devices on a chip

    Large scale integration (LSI) - 1971-1977

    3,000 - 100,000 devices on a chip

    Very large scale integration (VLSI) - 1978-1991

    100,000 - 100,000,000 devices on a chip

    Ultra large scale integration (ULSI) - 1991 on

    Over 100,000,000 devices on a chip

  • 8/8/2019 Module1 - Overview and Computer System

    39/102

    Microprocessors

    Intel

    1971 - 4004

    First microprocessor

    All CPU components on a single chip

    4 bit design

    Followed in 1972 by 8008

    8 bit

    Both designed for specific applications

    1974 - 8080

    Intel's first general purpose microprocessor

    Designed to be the CPU of a general purpose microcomputer

  • 8/8/2019 Module1 - Overview and Computer System

    40/102

    Pentium Evolution - 1

    8080

    first general purpose microprocessor

    8 bit data path

    Used in first personal computer, the Altair

    8086

    much more powerful

    16 bit

    instruction cache, prefetches a few instructions

    8088 (8 bit external bus) used in first IBM PC

    80286

    16 MB memory addressable

    80386

    First 32 bit design

    Support for multitasking: run multiple programs at the same time

  • 8/8/2019 Module1 - Overview and Computer System

    41/102

    Pentium Evolution - 2

    80486

    sophisticated powerful cache and instruction pipelining

    built in maths co-processor

    Pentium

    Superscalar technique: multiple instructions executed in parallel

    Pentium Pro

    Increased superscalar organization

    Aggressive register renaming

    branch prediction

    data flow analysis

    speculative execution

  • 8/8/2019 Module1 - Overview and Computer System

    42/102

    Pentium Evolution - 3

    Pentium II

    MMX technology

    graphics, video & audio processing

    Pentium III

    Additional floating point instructions for 3D graphics

    Pentium 4

    Note Arabic rather than Roman numerals

    Further floating point and multimedia enhancements

    Itanium

    64 bit

    see chapter 15

    Itanium 2

    Hardware enhancements to increase speed

    See Intel web pages for detailed information on processors

  • 8/8/2019 Module1 - Overview and Computer System

    43/102

    Module 1

    Computer System: Designing and Understanding Performance

  • 8/8/2019 Module1 - Overview and Computer System

    44/102

    Designing for Performance

    Microprocessor Speed

    A microprocessor achieves its full potential speed only if it is fed a constant stream of data and instructions. Techniques used:

    Branch prediction: the processor looks ahead in the instruction code and predicts which branches (groups of instructions) are likely to be processed next

    Data flow analysis: the processor analyzes which instructions depend on each other's results or data to create an optimized schedule of instructions

    Speculative execution: using branch prediction and data flow analysis, the processor speculatively executes instructions ahead of their actual appearance in program execution, holding the results in temporary locations

  • 8/8/2019 Module1 - Overview and Computer System

    45/102

    Performance Balance: Processor and Memory

    Performance balance: adjusting the organization and architecture to compensate for the mismatch among the capabilities of various components (e.g. processor vs. memory)

    Processor speed increased

    Memory capacity increased

    Memory speed lags behind processor speed

  • 8/8/2019 Module1 - Overview and Computer System

    46/102

    Performance Balance: Processor and Memory (Performance Gap)

  • 8/8/2019 Module1 - Overview and Computer System

    47/102

    Performance Balance: Processor and Memory - Solutions

    Increase number of bits retrieved at one time

    Make DRAM wider rather than deeper

    Change DRAM interface

    Include cache in DRAM chip

    Reduce frequency of memory access

    More complex and efficient cache between processor and memory

    Cache on chip/processor

    Increase interconnection bandwidth between processor and memory

    High speed buses

    Hierarchy of buses

  • 8/8/2019 Module1 - Overview and Computer System

    48/102

    Performance Balance: I/O Devices

    Peripherals with intensive I/O demands (refer chart)

    Large data throughput demands (refer chart)

    Processors can handle this I/O processing, but the problem is moving data between processor and devices

    Solutions:

    Caching

    Buffering

    Higher-speed interconnection buses

    More elaborate bus structures

    Multiple-processor configurations

  • 8/8/2019 Module1 - Overview and Computer System

    49/102

    Performance Balance: I/O Devices

  • 8/8/2019 Module1 - Overview and Computer System

    50/102

    Key is Balance

    Designers must cope with 2 factors:

    The rate at which performance is changing in the various technology areas (processor, buses, memory, peripherals) differs greatly from one type of element to another

    New applications and new peripheral devices constantly change the nature of demand on the system in terms of typical instruction profile and data access patterns

  • 8/8/2019 Module1 - Overview and Computer System

    51/102

    Improvements in Chip Organization and Architecture

    Increase hardware speed of processor

    Fundamentally due to shrinking logic gate size

    More gates, packed more tightly, increasing clock rate

    Propagation time for signals reduced

    Increase size and speed of caches

    Dedicating part of processor chip

    Cache access times drop significantly

    Change processor organization and architecture

    Increase effective speed of execution

    Parallelism

  • 8/8/2019 Module1 - Overview and Computer System

    52/102

    Problems with Clock Speed and Logic Density

    Power

    Power density increases with density of logic and clock speed

    Dissipating heat

    RC delay

    Speed at which electrons flow is limited by resistance and capacitance of the metal wires connecting them

    Delay increases as RC product increases

    Wire interconnects thinner, increasing resistance

    Wires closer together, increasing capacitance

    Memory latency

    Memory speeds lag processor speeds

    Solution:

    More emphasis on organizational and architectural approaches

  • 8/8/2019 Module1 - Overview and Computer System

    53/102

    Intel Microprocessor Performance

  • 8/8/2019 Module1 - Overview and Computer System

    54/102

    Approach 1: Increased Cache Capacity

    Typically two or three levels of cache between processor and main memory (L1, L2, L3)

    Chip density increased

    More cache memory on chip

    Faster cache access

    Pentium chip devoted about 10% of chip area to cache

    Pentium 4 devotes about 50%

  • 8/8/2019 Module1 - Overview and Computer System

    55/102

    Approach 2: More Complex Execution Logic

    Enable parallel execution of instructions

    Two approaches introduced:

    Pipelining

    Superscalar

    (covered later on)

  • 8/8/2019 Module1 - Overview and Computer System

    56/102

    Diminishing Returns from Approach 1 and Approach 2

    Internal organization of processors very complex

    Can get a great deal of parallelism

    Further significant increases likely to be relatively modest

    Benefits from cache are reaching limit

    Increasing clock rate runs into power dissipation problem

    Some fundamental physical limits are being reached

  • 8/8/2019 Module1 - Overview and Computer System

    57/102

    New Approach: Multiple Cores

    Multiple processors on a single chip

    With large shared cache

    Within a processor, increase in performance is proportional to the square root of the increase in complexity

    If software can use multiple processors, doubling the number of processors almost doubles performance

    So, use two simpler processors on the chip rather than one more complex processor

    With two processors, larger caches are justified

    Power consumption of memory logic is less than that of processing logic

    Example: IBM POWER4

    Two cores based on PowerPC
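
    The trade-off above can be put into numbers with a rough sketch. The square-root relation is taken from the slide; the 0.95 parallel efficiency used to model "almost doubles" is an assumed figure, and both function names are hypothetical.

```python
import math


def single_core_speedup(complexity_factor: float) -> float:
    """Speedup from making one core `complexity_factor` times more complex."""
    return math.sqrt(complexity_factor)


def multi_core_speedup(core_factor: float, parallel_efficiency: float = 0.95) -> float:
    """Speedup from multiplying the number of cores, for software that can use them.

    parallel_efficiency is an assumed value standing in for "almost doubles".
    """
    return core_factor * parallel_efficiency


one_complex_core = single_core_speedup(2)   # roughly 1.41x
two_simple_cores = multi_core_speedup(2)    # 1.9x under the assumed efficiency
```

    Under these assumptions, spending twice the logic on two simpler cores gains more than spending it on one core of twice the complexity, provided the software can use both cores.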

  • 8/8/2019 Module1 - Overview and Computer System

    58/102

    POWER4 Chip Organization

  • 8/8/2019 Module1 - Overview and Computer System

    59/102

    Module 1

    Computer System: Designing and Understanding Performance

    (Book: Computer Organization and Design, 3rd ed., David A. Patterson and John L. Hennessy, Morgan Kaufmann Publishers)

  • 8/8/2019 Module1 - Overview and Computer System

    60/102

    Introduction

    Hardware performance is often key to the effectiveness of an entire system of hardware and software

    For different types of applications, different performance metrics may be appropriate, and different aspects of a computer system may be the most significant in determining overall performance

    Understanding how best to measure performance and the limitations of performance is important when selecting a computer system

    To understand the issues of assessing performance:

    Why does a piece of software perform as it does?

    Why can one instruction set be implemented to perform better than another?

    How does some hardware feature affect performance?

  • 8/8/2019 Module1 - Overview and Computer System

    61/102

    Defining Performance - 1

    How do we say one computer has better performance than another?

    Performance based on speed

    To take a single passenger from one point to another in the least time: Concorde

    Performance based on passenger throughput

    To transport 450 passengers from one point to another: 747

  • 8/8/2019 Module1 - Overview and Computer System

    62/102

    Response Time and Throughput - 2

    As an individual computer user, you are interested in reducing response time (or execution time)

    The time between the start and completion of a task

    Data center managers are often interested in increasing throughput

    The total amount of work done in a given time

    Example: What is improved with the following changes?

    Replacing the processor in a computer with a faster version will improve response time and throughput

    Adding additional processors to a system that uses multiple processors for separate tasks (e.g., searching the web) will improve throughput

  • 8/8/2019 Module1 - Overview and Computer System

    63/102

    Performance and Execution Time - 3

  • 8/8/2019 Module1 - Overview and Computer System

    64/102

    Relative Performance - 4

    If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B?

    We know A is n times faster than B if

    Performance(A) / Performance(B) = Execution time(B) / Execution time(A) = n

    Performance ratio

    15 / 10 = 1.5

    We can say A is 1.5 times faster than B

    B is 1.5 times slower than A

    To avoid the potential confusion between the terms increasing and decreasing, we usually say improve performance or improve execution time
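
    The ratio can be checked with a one-line calculation. A minimal sketch; the function name is hypothetical.

```python
def relative_performance(execution_time_a: float, execution_time_b: float) -> float:
    """Return n such that A is n times faster than B.

    Performance is the reciprocal of execution time, so
    Performance(A) / Performance(B) = Execution time(B) / Execution time(A).
    """
    return execution_time_b / execution_time_a


n = relative_performance(10, 15)  # A takes 10 s, B takes 15 s
```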

  • 8/8/2019 Module1 - Overview and Computer System

    65/102

    Measuring Performance

    Time is the measure of computer performance

    Definitions of time: wall-clock time, response time, or elapsed time

    CPU execution time (or CPU time)

    The time the CPU spends computing for this task; it does not include time spent waiting for I/O or running other programs

    User CPU time vs. system CPU time

    Clock cycle time (e.g., 0.25 ns) vs. clock rate (e.g., 4 GHz)

    Different applications are sensitive to different aspects of the performance of a computer system

    Many applications, especially those running on servers, depend as much on I/O performance as on CPU performance, so total elapsed time measured by a wall clock is of interest

    In some application environments, the user may care about throughput, response time, or a complex combination of the two (e.g., maximum throughput with a worst-case response time)

  • 8/8/2019 Module1 - Overview and Computer System

    66/102

    CPU Execution Time

    A simple formula relates the most basic metrics (i.e., clock cycles and clock cycle time) to CPU time

    CPU Time = CPU Clock Cycles X Clock Cycle Time = CPU Clock Cycles / Clock Rate

  • 8/8/2019 Module1 - Overview and Computer System

    67/102

    Improving Performance

    Our favorite program runs in 10 seconds on computer A, which has a 4 GHz clock. Computer B is to run this program in 6 seconds, but requires 1.2 times as many clock cycles as computer A for this program. What is computer B's clock rate?

    Clock cycles for the program on A

    CPU Time(A) = CPU Clock Cycles(A) / Clock Rate(A)

    10 s = CPU Clock Cycles(A) / 4 GHz

    10 s = CPU Clock Cycles(A) / (4 X 10^9 Hz)

    CPU Clock Cycles(A) = 40 X 10^9 cycles

    CPU time for B

    CPU Time(B) = 1.2 X CPU Clock Cycles(A) / Clock Rate(B)

    6 s = 1.2 X CPU Clock Cycles(A) / Clock Rate(B)

    Clock Rate(B) = 1.2 X 40 X 10^9 cycles / 6 seconds

    Clock Rate(B) = 48 X 10^9 cycles / 6 seconds

    Clock Rate(B) = 8 X 10^9 cycles / second

    Clock Rate(B) = 8 GHz
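
    The same calculation can be written as a short script. This is a sketch that follows the steps of the worked example; the function and variable names are hypothetical.

```python
def clock_cycles(cpu_time_s: float, clock_rate_hz: float) -> float:
    """CPU Clock Cycles = CPU Time X Clock Rate."""
    return cpu_time_s * clock_rate_hz


def required_clock_rate(cycles: float, cpu_time_s: float) -> float:
    """Clock Rate = CPU Clock Cycles / CPU Time."""
    return cycles / cpu_time_s


cycles_a = clock_cycles(10, 4e9)       # computer A: 10 s at 4 GHz
cycles_b = 1.2 * cycles_a              # B needs 1.2 times as many cycles
clock_rate_b = required_clock_rate(cycles_b, 6)  # B must finish in 6 s

print(f"Clock rate of B: {clock_rate_b / 1e9:.1f} GHz")
```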

  • 8/8/2019 Module1 - Overview and Computer System

    68/102

    Clock Cycles Per Instruction (CPI)

    The execution time must depend on the number of instructions in a program and the average time per instruction

    CPU clock cycles = Instructions for a program X Average clock cycles per instruction

    Clock cycles per instruction (CPI)

    Average number of clock cycles each instruction takes to execute

    CPI can provide one way of comparing two different implementations of the same instruction set architecture

  • 8/8/2019 Module1 - Overview and Computer System

    69/102

    Using Performance Equation-1

    Suppose we have two implementations of the same instruction set architecture, running the same program. Which computer is faster and by how much?

    Computer A: clock cycle time = 250 ps and CPI = 2.0

    Computer B: clock cycle time = 500 ps and CPI = 1.2

    Say I = number of instructions for the program; find the number of clock cycles for A and B

    CPU Clock Cycles(A) = I X CPI(A)

    CPU Clock Cycles(A) = I X 2.0

    CPU Clock Cycles(B) = I X CPI(B)

    CPU Clock Cycles(B) = I X 1.2

    Compute CPU Time for A and B

    CPU Time(A) = CPU Clock Cycles(A) X Clock Cycle Time(A)

    CPU Time(A) = I X 2.0 X 250 ps = I X 500 ps

    CPU Time(B) = CPU Clock Cycles(B) X Clock Cycle Time(B)

    CPU Time(B) = I X 1.2 X 500 ps = I X 600 ps

  • 8/8/2019 Module1 - Overview and Computer System

    70/102

    Using Performance Equation-2

    Clearly A is faster. The amount faster is the ratio of execution times.

    Performance(A) / Performance(B) = Execution time(B) / Execution time(A) = (I X 600 ps) / (I X 500 ps) = 1.2

    We can conclude that A is 1.2 times faster than B for this program
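
    The comparison of A and B can be reproduced numerically. A sketch, with hypothetical names; the instruction count I cancels in the ratio, so the time per instruction is enough.

```python
def time_per_instruction_ps(cpi: float, clock_cycle_time_ps: float) -> float:
    """Average time per instruction = CPI X Clock Cycle Time."""
    return cpi * clock_cycle_time_ps


time_a = time_per_instruction_ps(cpi=2.0, clock_cycle_time_ps=250)  # I X 500 ps
time_b = time_per_instruction_ps(cpi=1.2, clock_cycle_time_ps=500)  # I X 600 ps

# Performance(A) / Performance(B) = Execution time(B) / Execution time(A)
speedup_of_a_over_b = time_b / time_a
```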

  • 8/8/2019 Module1 - Overview and Computer System

    71/102

    Basic Performance Equation

    Basic performance equation

    CPU Time = Instruction Count X CPI X Clock Cycle Time

    Instruction count can be measured by using software tools that profile the execution or by using a simulator of the architecture

    Hardware counters, which are included on many processors, can be used alternatively to record a variety of measurements

    Number of instructions executed

    Average CPI

    Sources of performance loss

  • 8/8/2019 Module1 - Overview and Computer System

    72/102

    Basic Components of Performance

  • 8/8/2019 Module1 - Overview and Computer System

    73/102

    Measuring the CPI

    Sometimes it is possible to compute the CPU clock cycles by looking at the different types of instructions and using their individual clock cycle counts

    CPU Clock Cycles = sum over i = 1..n of (CPIi X Ci)

    Ci = count of the number of instructions of class i executed

    CPIi = average number of cycles per instruction for that instruction class

    n = number of instruction classes

    Remember that overall CPI for a program will depend on both the number of cycles for each instruction type and the frequency of each instruction type in the program execution

  • 8/8/2019 Module1 - Overview and Computer System

    74/102

    Factors Affecting CPU Performance

  • 8/8/2019 Module1 - Overview and Computer System

    75/102

    Comparing Code Segments-1

    A compiler designer is trying to decide between 2 code sequences for a particular computer.

    Which code sequence executes the most instructions?

    Which will be faster?

    What is the CPI for each sequence?

  • 8/8/2019 Module1 - Overview and Computer System

    76/102

    Comparing Code Segments-2

    Sequence 1 (Instruction Count(1)): 2+1+2 = 5 instructions

    Sequence 2 (Instruction Count(2)): 4+1+1 = 6 instructions

    CPU Clock Cycles(1) = (2X1)+(1X2)+(2X3) = 10 cycles

    CPU Clock Cycles(2) = (4X1)+(1X2)+(1X3) = 9 cycles

    So code Sequence 2 is faster, even though it executes 1 extra instruction

    Code Sequence 2 uses fewer clock cycles, so it must have the lower CPI

    CPI = CPU Clock Cycles / Instruction Count

    CPI(1) = CPU Clock Cycles(1) / Instruction Count(1) = 10/5 = 2

    CPI(2) = CPU Clock Cycles(2) / Instruction Count(2) = 9/6 = 1.5
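
    The sequence comparison can be reproduced in a few lines. The per-class CPI values (1, 2, 3) and instruction counts are read off the arithmetic above; the class labels "A", "B", "C" and the function names are assumptions, since the original table is not in the transcript.

```python
# Cycles per instruction for each instruction class (labels are assumed).
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}

# Number of instructions of each class in the two code sequences.
SEQUENCE_1 = {"A": 2, "B": 1, "C": 2}
SEQUENCE_2 = {"A": 4, "B": 1, "C": 1}


def total_cycles(counts: dict[str, int]) -> int:
    """CPU Clock Cycles = sum of (CPI of class X instruction count of class)."""
    return sum(CPI_BY_CLASS[cls] * count for cls, count in counts.items())


def average_cpi(counts: dict[str, int]) -> float:
    """CPI = CPU Clock Cycles / Instruction Count."""
    return total_cycles(counts) / sum(counts.values())


for name, sequence in (("Sequence 1", SEQUENCE_1), ("Sequence 2", SEQUENCE_2)):
    print(name, total_cycles(sequence), "cycles, CPI =", average_cpi(sequence))
```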

  • 8/8/2019 Module1 - Overview and Computer System

    77/102

    Module 1

    Computer System: Computer Components and Bus Interconnection Structure

  • 8/8/2019 Module1 - Overview and Computer System

    78/102

    Von Neumann Architecture

    Data and instructions are stored in a single read-write memory

    The contents of this memory are addressable by location

    Execution occurs in a sequential fashion from one instruction to the next

  • 8/8/2019 Module1 - Overview and Computer System

    79/102

    Program Concept

    Hardwired program: connecting/combining various logic components to store data and perform arithmetic and logic operations

    Hardwired systems are inflexible

    General purpose hardware can do different tasks, given correct control signals

    Instead of re-wiring, supply a new set of control signals

  • 8/8/2019 Module1 - Overview and Computer System

    80/102

    What is a program?

    A sequence of steps

    For each step, an arithmetic or logical operation is done

    For each operation, a different/new set of control signals is needed

    For each operation a unique code is provided

    e.g. ADD, MOVE

    A hardware segment accepts the code and issues the control signals
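
    The idea of a code being turned into control signals can be sketched as a lookup table. This is a toy model only: the ADD and MOVE codes come from the slide, while the signal names and function names are hypothetical and do not describe any real control unit.

```python
# Hypothetical control signals issued for each operation code.
CONTROL_SIGNALS = {
    "ADD": ("select_alu_add", "write_accumulator"),
    "MOVE": ("read_source", "write_destination"),
}


def issue_control_signals(opcode: str) -> tuple[str, ...]:
    """Accept an operation code and return the control signals it triggers."""
    if opcode not in CONTROL_SIGNALS:
        raise ValueError(f"unknown operation code: {opcode}")
    return CONTROL_SIGNALS[opcode]


def run_program(program: list[str]) -> list[tuple[str, ...]]:
    """A program is a sequence of steps; each step yields one set of signals."""
    return [issue_control_signals(opcode) for opcode in program]


signals = run_program(["MOVE", "ADD", "MOVE"])
```

    Changing the program changes the sequence of signals without any re-wiring, which is the point of the software approach.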

  • 8/8/2019 Module1 - Overview and Computer System

    81/102

    Hardware and Software Approaches

  • 8/8/2019 Module1 - Overview and Computer System

    82/102

    Components

    The Control Unit and the Arithmetic and Logic Unit constitute the Central Processing Unit

    Data and instructions need to get into the system and results out

    Input/output

    Temporary storage of code and results is needed

    Main memory

  • 8/8/2019 Module1 - Overview and Computer System

    83/102

    Computer Components: Top Level View

  • 8/8/2019 Module1 - Overview and Computer System

    84/102

    Interconnection Structures

    All the units (processor, memory and I/O components) must be connected: the interconnection structure

    Different type of connection for different type of unit

    Memory

    Input/Output

    CPU

  • 8/8/2019 Module1 - Overview and Computer System

    85/102

    Computer Modules

  • 8/8/2019 Module1 - Overview and Computer System

    86/102

    Memory Connection

    Receives and sends data

    Receives addresses (of locations)

    Receives control signals

    Read

    Write

    Timing

  • 8/8/2019 Module1 - Overview and Computer System

    87/102

    Input/Output Connection (1)

    Similar to memory from computer's viewpoint

    Output

    Receive data from computer

    Send data to peripheral

    Input

    Receive data from peripheral

    Send data to computer

  • 8/8/2019 Module1 - Overview and Computer System

    88/102

    Input/Output Connection (2)

    Receive control signals from computer

    Send control signals to peripherals

    e.g. spin disk

    Receive addresses from computer

    e.g. port number to identify peripheral

    Send interrupt signals (control)

  • 8/8/2019 Module1 - Overview and Computer System

    89/102

    CPU Connection

    Reads instructions and data

    Writes out data (after processing)

    Sends control signals to other units

    Receives (& acts on) interrupts

  • 8/8/2019 Module1 - Overview and Computer System

    90/102

    Buses Interconnection

    There are a number of possible interconnection systems

    Single and multiple BUS structures are most common

    e.g. Control/Address/Data bus (PC)

    e.g. Unibus (DEC-PDP)

  • 8/8/2019 Module1 - Overview and Computer System

    91/102

    What is a Bus?

    A communication pathway connecting two or more devices

    Usually broadcast/shared medium

    Often grouped

    A number of channels in one bus

    e.g. 32 bit data bus is 32 separate single bit channels

    Power lines may not be shown

  • 8/8/2019 Module1 - Overview and Computer System

    92/102

    Bus Interconnection Scheme

  • 8/8/2019 Module1 - Overview and Computer System

    93/102

  • 8/8/2019 Module1 - Overview and Computer System

    94/102

    Address bus

    Identify the source or destination of data

    e.g. CPU needs to read an instruction (data) from a given location in memory

    Bus width determines maximum memory capacity of system

    e.g. 8080 has 16 bit address bus giving 64k address space
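
    The relation between address bus width and address space is a power of two, which a short sketch makes explicit. The function name is hypothetical.

```python
def addressable_locations(address_bus_width_bits: int) -> int:
    """Each address line doubles the number of distinct addresses."""
    return 2 ** address_bus_width_bits


locations_8080 = addressable_locations(16)  # 16 bit address bus
kilo_locations = locations_8080 // 1024     # expressed in units of 1024 (k)
```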

  • 8/8/2019 Module1 - Overview and Computer System

    95/102

    Control Bus

    Control and timing information for data and address lines

    Typical control lines:

    Memory read/write signal

    Interrupt request

    Clock signals

    Bus grant/request

  • 8/8/2019 Module1 - Overview and Computer System

    96/102

    Big and Yellow?

    What do buses look like?

    Parallel lines on circuit boards

    Ribbon cables

    Strip connectors on mother boards

    e.g. PCI

    Sets of wires

  • 8/8/2019 Module1 - Overview and Computer System

    97/102

    Physical Realization of Bus Architecture

  • 8/8/2019 Module1 - Overview and Computer System

    98/102

    Single Bus Problems

    Lots of devices on one bus leads to:

    Propagation delays

    Long data paths mean that co-ordination of bus use can adversely affect performance

    A bottleneck if aggregate data transfer approaches bus capacity

    Most systems use multiple buses to overcome these problems

  • 8/8/2019 Module1 - Overview and Computer System

    99/102

    Traditional (ISA)

    (with cache)

  • 8/8/2019 Module1 - Overview and Computer System

    100/102

    High Performance Bus

  • 8/8/2019 Module1 - Overview and Computer System

    101/102

    PCI Bus

    Peripheral Component Interconnect

    Intel released to public domain

    32 or 64 bit

    64 data lines

    High bandwidth, processor independent bus

  • 8/8/2019 Module1 - Overview and Computer System

    102/102

    PCI Bus