Bach Khoa University (Trường Đại học Bách Khoa), Faculty of Computer Science & Engineering. Review for the course: Computer Architecture - 504002. Ho Chi Minh City, 4/2015

Computer Architecture Review

Sep 20, 2015


longpro2000

Chapters 4, 5

  • Contents

    I. Instruction formats
    II. Single clock processor
    II.1 Single clock processor
    II.2 Single cycle architecture
    II.3 Exercises
    II.4 Answers/Hints
    III. Pipeline processor
    III.1 Pipeline processor
    III.1.1 Hazard
    III.2 Exercises
    III.3 Answers/Hints
    IV. Memory
    IV.1 Memory
    IV.2 Cache
    IV.2.1 Block placement
    IV.2.2 Block identification
    IV.2.3 Block replacement
    IV.2.4 Write strategy
    IV.2.5 Miss/hit
    IV.3 Exercises
    IV.4 Answers/Hints

  • Computer Architecture Lab: Computer Architecture Review, CE 2015

    Factors that affect system performance:

    - Program length (instruction count)

    - Number of cycles per instruction (CPI)

    - Duration of one cycle (clock cycle time).
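
    The three factors above combine in the usual CPU-time relation: execution time = instruction count × CPI × clock cycle time. The sketch below only illustrates that product; the function name `cpu_time_ps` and the sample numbers are assumptions made for this example and do not come from the review.

```python
def cpu_time_ps(instruction_count: int, cpi: float, cycle_time_ps: float) -> float:
    """Execution time in picoseconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ps


# Hypothetical program: 1000 instructions, CPI of 1.5, 200 ps clock cycle.
baseline = cpu_time_ps(1000, 1.5, 200.0)

# Halving any single factor (here the cycle time) halves the execution time.
faster_clock = cpu_time_ps(1000, 1.5, 100.0)

print(baseline, faster_clock)
```

    Because the three factors multiply, improving one of them helps only if the other two do not get worse by the same ratio.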

    I. Instruction formats

    The instruction formats of the MIPS instruction set:

    R-type

    Op (6 bits) | Rs (5 bits) | Rt (5 bits) | Rd (5 bits) | Sa (5 bits) | Funct (6 bits)

    I-type

    Op (6 bits) | Rs (5 bits) | Rt (5 bits) | Immediate (16 bits)

    J-type

    Op (6 bits) | Immediate (26 bits)

    Where:

    - Op: the opcode of the instruction.

    - Rs, Rt, Rd: the source, target, and destination registers.

    - Sa (shift amount): the number of bit positions to shift in shift instructions.

    - Immediate: a constant number encoded in the instruction.

    - Funct: the 6-bit function code.
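
    The field layout above can be checked by slicing a 32-bit instruction word with shifts and masks. This is a minimal sketch; the helper names `decode_r_type` and `decode_i_type` are hypothetical, and the two sample words are the standard MIPS encodings of `add $s1, $s2, $s3` and `addi $t0, $zero, 100`.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31..26
        "rs": (word >> 21) & 0x1F,     # bits 25..21
        "rt": (word >> 16) & 0x1F,     # bits 20..16
        "rd": (word >> 11) & 0x1F,     # bits 15..11
        "sa": (word >> 6) & 0x1F,      # bits 10..6
        "funct": word & 0x3F,          # bits 5..0
    }


def decode_i_type(word: int) -> dict:
    """Split a 32-bit I-type instruction word into its four fields."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31..26
        "rs": (word >> 21) & 0x1F,     # bits 25..21
        "rt": (word >> 16) & 0x1F,     # bits 20..16
        "imm": word & 0xFFFF,          # bits 15..0 (raw, not sign-extended)
    }


# add $s1, $s2, $s3  ->  rs = 18 ($s2), rt = 19 ($s3), rd = 17 ($s1), funct = 0x20
print(decode_r_type(0x02538820))
# addi $t0, $zero, 100  ->  op = 8, rs = 0, rt = 8 ($t0), imm = 100
print(decode_i_type(0x20080064))
```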

    II. Single clock processor

    II.1 Single clock processor

    - Advantage: every instruction executes in exactly one clock cycle.

    - Disadvantage: the cycle has to be long, because every instruction, fast or slow, executes in one cycle.

    II.2 Single cycle architecture

    See Figure II-1.

    - PC register: points to the instruction being executed.

    - Instruction memory: holds the code being executed; this block is read-only.

    - Register file: holds 32 registers, so 5 bits are needed to select a register (2^5 = 32). For the details of each register, see the register table below.

    - Sign extender: extends a 16-bit number to 32 bits while preserving its sign.

    - Selector (MUX): selects which input is passed on when there are several inputs and one output, or which output is driven when there is one input and several outputs. The select signal determines the choice.

    - ALU: performs the computation.

    - Data memory: the memory region that holds the program's data. Only LOAD and STORE instructions can access this block.
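
    The sign extender in the list above copies bit 15 of the 16-bit immediate into the upper 16 bits. A small sketch of that behavior follows; the function name `sign_extend_16` is an assumption for illustration, and the result is returned as an unsigned 32-bit pattern.

```python
def sign_extend_16(value: int) -> int:
    """Sign-extend a 16-bit field to a 32-bit pattern (as an unsigned integer)."""
    value &= 0xFFFF                 # keep only the 16-bit immediate
    if value & 0x8000:              # sign bit set: fill the upper 16 bits with 1s
        return value | 0xFFFF0000
    return value                    # sign bit clear: upper 16 bits stay 0


print(hex(sign_extend_16(0x0064)))  # positive immediate, unchanged
print(hex(sign_extend_16(0xFFFC)))  # 16-bit pattern of -4
```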

    Figure II-1: Single cycle processor architecture

    Table II-1: List of the 32 registers

    Table II-2: ALUop values (X = don't care)

    Input                          Output
    Op [6 bits]   Funct [6 bits]   ALUCtrl   4-bit encoding
    R-type        Add              ADD       0000
    R-type        Sub              SUB       0010
    R-type        And              AND       0100
    R-type        Or               OR        0101
    R-type        Xor              XOR       0110
    R-type        Slt              SLT       1010
    Addi          X                ADD       0000
    Slti          X                SLT       1010
    Andi          X                AND       0100
    Ori           X                OR        0101
    Xori          X                XOR       0110
    Lw            X                ADD       0000
    Sw            X                ADD       0000
    Beq           X                SUB       0010
    Bne           X                SUB       0010
    J             X                X         X

    Table II-3: Meaning of the control signals.

    (We assume an asserted signal has the value 1 and a deasserted signal has the value 0.)

    Signal | Meaning | Value = asserted | Value = deasserted
    RegDest | Selects the register that receives the result | Rd receives the result | Rt receives the result
    RegWrite | Enables writing the result back to a register | Enabled | Disabled
    ExtOp | Controls sign extension when an immediate is used | Sign-extend | Don't care
    ALUSrc | Selects a register or an immediate as ALU input | Register with immediate | Register with register
    MemWrite | Enables writing to data memory | Write enabled | Write disabled
    MemRead | Enables reading from data memory | Read enabled | Read disabled
    MemtoReg | Selects the path from data memory to the register file | The data memory output goes to the register (load instructions) | The ALU result goes to the register
    Beq, Bne | Used for conditional branch instructions | If the branch condition holds, the PC moves by an offset to the label | If the condition does not hold, PC = PC + 4 (the next instruction executes)
    J | Used for the unconditional jump instruction | Jumps to the label | The PC points to the next instruction
    PCSrc | Selects the source of the PC | A branch or jump is taken | No branch or jump is taken

    Notes:

    - RegDest, MemRead, and MemtoReg are don't-cares when RegWrite = 0.

    - When ALUSrc = 0, ExtOp is a don't-care.

    For the signal values of each specific instruction, see the slide "Main Control Signal Values" (slide 45).

    - Longest-delay path of each instruction class (ignoring the delay of the sign extender, MUXes, adders, wires, and the PC):

    ALU instruction:     Fetch -> Decode -> ALU -> Reg
    Load instruction:    Fetch -> Decode -> ALU -> Memory Read -> Reg
    Store instruction:   Fetch -> Decode -> ALU -> Memory Write
    Branch instruction:  Fetch -> Decode -> ALU
    Jump instruction:    Fetch -> Decode

    II.3 Exercises

    Use the architecture described in Figure II-1 to answer the questions below.

    The delays of the blocks are:

    I-MEM    ADDER    MUX     ALU      REG      D-MEM    Control
    200 ps   100 ps   30 ps   180 ps   150 ps   200 ps   100 ps

    a) Determine the longest-delay path of the AND and LOAD instructions, and compute that delay.

    b) Determine the signals of the control unit (main unit) when executing BEQ $1, $2, ABC, with $1 = 0x00FF and $2 = 0x00FE.

    c) Which hardware components are not used when executing the SLTI instruction and the J instruction?

    d) Ignore the delays of the adder, MUX, and control blocks. Determine the data path and the time of each instruction class:

    o ALU

    o LOAD

    o STORE

    o BRANCH

    o JUMP

    e) Determine the cycle time of the single cycle and the multi-cycle designs.

    f) Suppose a program consists of 40% ALU instructions, 20% loads, 10% stores, 20% branches, and 10% jumps. Compute the CPI for single cycle and for multi-cycle.

    g) Compute the speedup.

    II.4 Answers/Hints

    a)

    - AND instruction (R-type, same path as ADD): I-MEM (200) -> REGs (150) -> MUX (30) -> ALU (180) -> MUX (30) -> REGs (150) = 740 ps

    - LOAD instruction: I-MEM (200) -> REGs (150) -> MUX (30) -> ALU (180) -> D-MEM (200) -> MUX (30) -> REGs (150) = 940 ps

    b)

    - Consider BEQ $1, $2, ABC with $1 = 0x00FF and $2 = 0x00FE. The instruction means: if register $1 equals register $2, jump to label ABC.

    - The contents of $1 and $2 differ, so BEQ does not jump to label ABC. The signals for the instruction are therefore:

    Signal | Value | Explanation
    RegDest | X | Don't care
    RegWrite | 0 | No result is written to a register
    ExtOp | X | Don't care
    ALUSrc | 0 | Register with register
    MemWrite | 0 | Data memory is not accessed
    MemRead | X | Don't care
    MemtoReg | X | Don't care
    Beq | 1 | This is a BEQ instruction
    Bne | 0 | Not a BNE instruction
    J | 0 | A branch, not a jump
    PCSrc | 0 | The branch is not taken
    ALUop | 10 | SUB; see Table II-2 above

    c)

    - SLTI sets the destination register to 1 if the compared register is less than a given number; otherwise (the compared register is greater than or equal to that number) it resets the destination register to 0.

    - Example: for SLTI $1, $2, 100, $1 = 1 when $2 < 100, and $1 = 0 when $2 >= 100. From this, the path of the instruction is:

    - Through I-MEM (every instruction goes through I-MEM) -> control unit, register file, MUX, sign extender (the adder that computes a new target for the PC is not used) -> ALU -> D-MEM is not used -> MUX, register file

    d)

    Instruction class | Instruction memory | Register read | ALU operation | Data memory | Register write | Total
    ALU | 200 | 150 | 180 | | 150 | 680 ps
    Load | 200 | 150 | 180 | 200 | 150 | 880 ps
    Store | 200 | 150 | 180 | 200 | | 730 ps
    Branch | 200 | 150 | 180 | | | 530 ps
    Jump | 200 | 150 | | | | 350 ps

    e)

    - Single cycle time = the maximum over all instructions. Here LOAD has the largest value (the longest execution time), so single cycle = 880 ps.

    - Multi cycle time = the maximum over the 5 steps of an instruction (IF: instruction memory, ID: register file, EXE: ALU, MEM: data memory, WB: register file). Here the instruction memory step takes the longest (200 ps, the same as data memory), so multi cycle = 200 ps.

    f)

    - CPI is the number of cycles per instruction.

    - CPI of single cycle = 1.

    - CPI of multi cycle = 0.4 × 4 + 0.2 × 5 + 0.1 × 4 + 0.2 × 3 + 0.1 × 2 = 3.8 (ALU takes 4 steps, load 5, store 4, branch 3, jump 2, as in the table of part d).

    g)

    - Run time of a program = CPI × time of one cycle (for a fixed instruction count).

    - Speedup = single cycle run time / multi cycle run time = (1 × 880) / (3.8 × 200) = 880/760 ≈ 1.16
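
    Parts d) to g) can be re-derived mechanically from the delay table. The following sketch is one way to do that; the dictionary names `DELAY`, `PATHS`, and `MIX` are assumptions for this example, while the numbers are the ones given in the exercise.

```python
# Block delays in ps from the exercise (adder, MUX, and control are ignored, as in part d).
DELAY = {"imem": 200, "reg": 150, "alu": 180, "dmem": 200}

# Blocks used, in order, by each instruction class (part d).
PATHS = {
    "alu": ["imem", "reg", "alu", "reg"],
    "load": ["imem", "reg", "alu", "dmem", "reg"],
    "store": ["imem", "reg", "alu", "dmem"],
    "branch": ["imem", "reg", "alu"],
    "jump": ["imem", "reg"],
}

# Instruction mix from part f.
MIX = {"alu": 0.40, "load": 0.20, "store": 0.10, "branch": 0.20, "jump": 0.10}

# Part d: total delay of each instruction class.
latency = {cls: sum(DELAY[block] for block in path) for cls, path in PATHS.items()}

# Part e: the slowest instruction sets the single cycle clock,
# the slowest step sets the multi cycle clock.
single_cycle_ps = max(latency.values())
multi_cycle_ps = max(DELAY.values())

# Part f: in the multi cycle design each step of the path costs one cycle.
cpi_multi = sum(MIX[cls] * len(PATHS[cls]) for cls in MIX)

# Part g: run time is proportional to CPI x cycle time.
speedup = (1 * single_cycle_ps) / (cpi_multi * multi_cycle_ps)

print(latency)
print(single_cycle_ps, multi_cycle_ps, round(cpi_multi, 2), round(speedup, 2))
```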

    III. Pipeline processor

    III.1 Pipeline processor

    a) A pipeline divides instruction execution into 5 steps, and each step executes in one cycle:

    - IF: fetch the instruction from I-MEM.

    - ID: decode the instruction: identify which instruction it is and its type, and determine the registers to use, the address to jump to, and the number to compute with.

    - EX: execute the instruction, or compute the address for a load/store instruction.

    - MEM: access data for load/store instructions.

    - WB: write the result back to a register.

    b) Performance of the pipeline compared with single cycle.

    - Divide an instruction into k steps.

    - Time to execute n instructions in single cycle = n × single cycle time.

    - Time to execute n instructions in the pipeline = (k + n - 1) × pipeline clock cycle.

    o The first instruction takes k cycles; each of the remaining n - 1 instructions takes 1 cycle.

    - Assume single cycle time = k × pipeline clock cycle.

    - Then speedup = (n × k × pipeline clock cycle) / ((k + n - 1) × pipeline clock cycle) = n × k / (k + n - 1)

    - As n -> ∞, speedup -> k (that is, the pipeline is at most k times faster than single cycle).
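
    The speedup formula in part b) can be evaluated directly to see how it approaches k. This is a small sketch under the same assumptions as the text (single cycle time = k × pipeline clock cycle, no stalls); the function name `pipeline_speedup` is hypothetical.

```python
def pipeline_speedup(n: int, k: int) -> float:
    """Speedup of a k-stage pipeline over single cycle for n instructions,
    assuming single cycle time = k * pipeline clock cycle and no stalls."""
    return (n * k) / (k + n - 1)


# With one instruction there is no overlap, so there is no gain.
print(pipeline_speedup(1, 5))

# The speedup grows with n and approaches k = 5 without reaching it.
for n in (10, 150, 10**6):
    print(n, pipeline_speedup(n, 5))
```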

    Notes:

    Pipelining does not shorten the execution time of an individual instruction; it improves performance by increasing the throughput of the machine. When the steps of an instruction take different amounts of time, the speedup is reduced.

    - The time to fill and drain the pipeline also reduces the speedup.

    - To implement a pipeline, registers are used to hold the results of each step:

    o The signals come from the control unit (main control)

    o All control signals are generated in the ID step

    o Each step uses some of the control signals

    o RegDst is used in the ID step

    o ExtOp, ALUSrc, ALUCtrl, J, Beq, Bne, and zero are used in the EXE step

    o MemRead, MemWrite, and MemtoReg are used in the MEM step

    o RegWrite is used in the WB step

    III.1.1 Hazard

    Implementing a pipeline gives rise to three kinds of hazard: structural hazards, data hazards, and control hazards.

    III.1.1.1 Structural hazard

    Occurs when there is contention for a hardware resource: 2 instructions use the same hardware in the same cycle.

    III.1.1.2 Data hazards

    Occur when there is a data dependence.

    III.1.1.2.1 Data dependences:

    Read After Write: RAW hazard

    I: add $s1, $s2, $s3 # register $s1 is written

    J: sub $s4, $s1, $s3 # register $s1 is read

    Here a data hazard appears when instruction J reads $s1 before instruction I has finished computing the result for $s1.

    Write After Read: name dependence

    I: sub $t4, $t1, $t3 # $t1 is read first

    J: add $t1, $t2, $t3 # $t1 is written afterwards

    Clearly there is no data dependence here, only a dependence on the register name. To remove the name dependence, rename the register.

    I: sub $t4, $t1, $t3

    J: add $t5, $t2, $t3

    Write After Write: name dependence

    I: sub $t1, $t4, $t3 # $t1 is written

    J: add $t1, $t2, $t3 # $t1 is written again

    Clearly there is no data dependence here, only a dependence on the register name. The final result depends only on the later instruction J. To remove the name dependence, rename the register.

    I: sub $t1, $t4, $t3

    J: add $t5, $t2, $t3

    Read After Read: causes no dependence.
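
    The three dependence kinds can be told apart by comparing the destination and source registers of the two instructions. The sketch below is illustrative only: the function name `classify_dependences` and the `(dest, sources)` tuple representation are assumptions, not part of the review.

```python
def classify_dependences(first, second):
    """Classify register dependences from an earlier instruction `first`
    to a later instruction `second`. Each instruction is (dest, sources)."""
    first_dest, first_srcs = first
    second_dest, second_srcs = second
    found = []
    if first_dest is not None and first_dest in second_srcs:
        found.append("RAW")   # second reads what first writes: true data dependence
    if second_dest is not None and second_dest in first_srcs:
        found.append("WAR")   # second overwrites what first reads: name dependence
    if first_dest is not None and first_dest == second_dest:
        found.append("WAW")   # both write the same register: name dependence
    return found


# add $s1, $s2, $s3 ; sub $s4, $s1, $s3
print(classify_dependences(("$s1", ["$s2", "$s3"]), ("$s4", ["$s1", "$s3"])))
# sub $t4, $t1, $t3 ; add $t1, $t2, $t3
print(classify_dependences(("$t4", ["$t1", "$t3"]), ("$t1", ["$t2", "$t3"])))
# sub $t1, $t4, $t3 ; add $t1, $t2, $t3
print(classify_dependences(("$t1", ["$t4", "$t3"]), ("$t1", ["$t2", "$t3"])))
# After renaming the second destination to $t5 the name dependence is gone.
print(classify_dependences(("$t1", ["$t4", "$t3"]), ("$t5", ["$t2", "$t3"])))
```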

    III.1.1.2.2 Methods for resolving data hazards

    a) Insert stalls so that the earlier instruction has produced its result before the later instruction reads it in a following cycle. This method needs no extra hardware (lower cost) but delays the program (lower performance).

    b) Use forwarding. This method needs extra hardware (higher cost), and stalls are still inserted when necessary.

    - When the hazard involves a load instruction, 1 stall is still needed even with forwarding.

    - To implement forwarding, multiplexers are added to select the inputs of the ALU.

    - For the other instructions (other than load), the result is available at the ALU (EXE) step, so with forwarding no stall is needed.

    Example: suppose four consecutive instructions 1, 2, 3, 4 have the following dependences:

    - Instruction 2 depends on 1,

    - Instruction 3 depends on 1,

    - Instruction 4 depends on 1.

    With forwarding:

    - for instruction 2, forward from EXE(1) -> EXE(2)

    - for instruction 3, forward from MEM(1) -> EXE(3)

    - for instruction 4, forward from WB(1) -> EXE(4)

    Note: every forwarding path delivers its value to the EXE stage.

    c) Reorder the code

    The reordering must preserve the order of instructions that depend on each other and must keep the program correct.

    III.1.1.2.3 Control hazard

    Branch prediction is used to improve program performance.

    - 1-bit prediction

    - 2-bit prediction
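
    The review only names the two prediction schemes. As an illustration, the sketch below models both with a saturating counter, which is one common way to build them (an assumption here, since the review gives no implementation). The class name `SaturatingPredictor`, the helper `count_mispredictions`, and the loop-branch trace are all hypothetical.

```python
class SaturatingPredictor:
    """n-bit saturating counter; the upper half of the states predicts taken.
    bits=1 simply remembers the last outcome, bits=2 needs two misses to flip."""

    def __init__(self, bits: int):
        self.max_state = 2 ** bits - 1
        self.threshold = 2 ** (bits - 1)
        self.state = self.max_state          # start in the strongest "taken" state

    def predict(self) -> bool:
        return self.state >= self.threshold

    def update(self, taken: bool) -> None:
        if taken:
            self.state = min(self.max_state, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, predictor) -> int:
    """Replay a list of branch outcomes and count wrong predictions."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            wrong += 1
        predictor.update(taken)
    return wrong


# Hypothetical loop branch: taken 9 times, then falls through; the loop runs twice.
trace = ([True] * 9 + [False]) * 2
print(count_mispredictions(trace, SaturatingPredictor(bits=1)))
print(count_mispredictions(trace, SaturatingPredictor(bits=2)))
```

    On this trace the 2-bit scheme mispredicts only at each loop exit, while the 1-bit scheme also mispredicts the first iteration of the second run.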

    III.2 Exercises:

    1) A single clock processor is given with the diagram and parameters shown in the figure below.

    The delay of each block is given in the table below.

    I-MEM    ALU      REG      D-MEM
    200 ps   150 ps   200 ps   200 ps

    a) Ignore the delay of the ADD, MUX, and Control blocks. Compute the single cycle time and the pipeline clock.

    b) Compute the execution time of a program of 150 lines of code for single cycle and for the pipeline. From that, compute the speedup of the pipeline over single cycle (no stalls).

    c) Suppose the program has no stalls and its measured mix is ALU 50%, beq 25%, lw 15%, sw 10%. Compute the speedup of the pipeline over multi cycle.

    2) Given the following code:

    addi $1, $zero, 100

    addi $2, $zero, 100

    add $3, $1, $2

    lw $4, L_4

    lw $5, L_5

    and $6, $4, $5

    sw $6, L_KQ

    a) Identify the dependences between the instructions and the registers that cause them.

    b) Insert stalls to resolve the hazards above. How many stalls are needed?

    c) Reorder the instructions so that the code runs with the fewest stalls while the logic of the program stays unchanged.

    d) If forwarding is used to resolve the hazards, how many stalls occur at run time?

    III.3 Answers/Hints

    1a)

    - Single cycle = execution time of the longest instruction (load) = I-Mem -> Regs -> ALU -> D-Mem -> Regs = 200 + 200 + 150 + 200 + 200 = 950 ps

    - Pipeline clock = max (I-Mem, Regs, ALU, D-Mem, Regs) = 200 ps

    1b)

    - Execution time of 150 instructions in single cycle = 150 × 950 = 142500 ps

    - Execution time of 150 instructions in the pipeline = (5 + 150 - 1) × 200 = 30800 ps

    - Speedup = 142500/30800 ≈ 4.63

    1c)

    - CPI of multi cycle = 50% × 4 + 25% × 3 + 15% × 5 + 10% × 4 = 3.9

    - CPI of the pipeline without stalls = 1

    - Execution time = CPI × number of instructions × time of 1 cycle

    - Speedup = multi cycle time / pipeline time = (3.9 × number of instructions × 200) / (1 × number of instructions × 200) = 3.9

    2a)

    (1) addi $1, $zero, 100

    (2) addi $2, $zero, 100

    (3) add $3, $1, $2

    (4) lw $4, L_4

    (5) lw $5, L_5

    (6) and $6, $4, $5

    (7) sw $6, L_KQ

    The dependences, and the registers that carry them: (3) depends on (1) through $1 and on (2) through $2; (6) depends on (4) through $4 and on (5) through $5; (7) depends on (6) through $6.

    2b)

    - 9 stalls: 3 stalls between (2) and (3), 3 stalls between (5) and (6), 3 stalls between (6) and (7)

    - The stalls are inserted so that every step that reads a register value (ID) comes after the cycle in which that register's result is written.

    2c)

    One of the orderings that reduces the stalls:

    lw $4, L_4

    lw $5, L_5

    addi $1, $zero, 100

    addi $2, $zero, 100

    and $6, $4, $5

    add $3, $1, $2

    sw $6, L_KQ

    3 stalls remain.

    2d)

    - 1 stall (the load-use hazard between (5) and (6)). Students should draw the pipeline diagram to understand this better.
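
    The stall counts in 2b), 2c), and 2d) can be checked with a very small timing model. The sketch below is an illustration under explicit assumptions: without forwarding a consumer needs 3 bubbles after an adjacent producer (its ID must follow the producer's WB, as stated in 2b), and with forwarding only a load followed immediately by a consumer costs 1 bubble. The function name `count_stalls` and the `(dest, sources, is_load)` representation are hypothetical.

```python
NO_FORWARD_GAP = 3   # bubbles between adjacent dependent instructions without forwarding


def count_stalls(program, forwarding: bool) -> int:
    """program: list of (dest, sources, is_load) in program order.
    Returns the number of stall cycles in a 5-stage pipeline."""
    issue = []          # cycle in which each instruction enters IF
    last_writer = {}    # register -> index of the latest instruction that writes it
    for i, (dest, sources, _is_load) in enumerate(program):
        earliest = issue[-1] + 1 if issue else 0
        for reg in sources:
            if reg in last_writer:
                producer = last_writer[reg]
                if forwarding:
                    gap = 1 if program[producer][2] else 0   # only a load needs a bubble
                else:
                    gap = NO_FORWARD_GAP                     # wait for the producer's WB
                earliest = max(earliest, issue[producer] + 1 + gap)
        issue.append(earliest)
        if dest is not None:
            last_writer[dest] = i
    return issue[-1] - (len(program) - 1)


ADDI_1 = ("$1", ["$zero"], False)
ADDI_2 = ("$2", ["$zero"], False)
ADD_3 = ("$3", ["$1", "$2"], False)
LW_4 = ("$4", [], True)
LW_5 = ("$5", [], True)
AND_6 = ("$6", ["$4", "$5"], False)
SW_6 = (None, ["$6"], False)

ORIGINAL = [ADDI_1, ADDI_2, ADD_3, LW_4, LW_5, AND_6, SW_6]
REORDERED = [LW_4, LW_5, ADDI_1, ADDI_2, AND_6, ADD_3, SW_6]

print(count_stalls(ORIGINAL, forwarding=False))
print(count_stalls(REORDERED, forwarding=False))
print(count_stalls(ORIGINAL, forwarding=True))
```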

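    Figure: comparison of single cycle, multi cycle, and pipeline

    Single cycle (every instruction occupies one full cycle, 5 units each):

    Load   Add   Jump   Store   Branch
    5      5     5      5       5

    Multi cycle (each instruction takes only the steps it needs):

    Load:    IF  ID  EXE MEM WB
    Add:     IF  ID  EXE WB
    Jump:    IF  ID
    Store:   IF  ID  EXE MEM
    Branch:  IF  ID  EXE

    Pipeline (a new instruction starts every cycle):

    IF  ID  EXE MEM WB
        IF  ID  EXE MEM WB
            IF  ID  EXE MEM WB
                IF  ID  EXE MEM WB
                    IF  ID  EXE MEM WB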

    IV. Memory

    IV.1 Memory

    A memory has the following basic buses and signals:

    - Address: n bits that select a location in the RAM; the address space is 2^n

    - Data: m bits for data input/output; the width of the RAM is m bits

    - OE: output enable; when this signal is asserted, data is read from the RAM

    - WE: write enable; when this signal is asserted, data is written to the RAM

    The basic differences between SRAM and DRAM:

    Property | SRAM | DRAM
    Cell structure | 6 transistors | 1 transistor + 1 capacitor
    Cost and speed | Expensive, fast | Cheap, slower
    Refresh | Static; no refresh needed | Dynamic; the capacitor leaks charge over time, so values must be refreshed periodically
    Addressing | By row | By matrix

    Figure IV-1: SRAM

    Figure IV-2: DRAM

    IV.2 Cache

    ALU speed has improved much faster than memory speed, which has opened a wide gap between the ALU and memory. A cache is therefore needed as a buffer between the ALU and memory. Caches are usually built from SRAM. Their purpose is to reduce memory access time.

    Sizes and access times of the memory levels, in order (for reference only):

    - Registers (size < 1 KB), access time < 0.5 ns

    - Level 1 cache (size 8 to 64 KB), access time: 1 ns

    - L2 cache (512 KB to 8 MB), access time: 3 to 10 ns

    - Main memory (4 to 16 GB), access time: 50 to 100 ns

    - Disk storage (> 200 GB), access time: 5 to 10 ms

    Temporal locality: once a variable or item has been accessed, it is likely to be accessed again. This typically appears in loops and in functions/procedures that are called many times. To exploit temporal locality, a block is kept in the cache so that it can be accessed again later.

    Spatial locality: once an instruction or data item in a memory region has been accessed, the instructions/data near it are likely to be accessed too. This typically appears with arrays and sequential execution. To exploit spatial locality, the next block is prepared in advance.

    IV.2.1 Block placement

    How a block is placed in the cache.

    IV.2.1.1 Direct mapped

    Each block has exactly one possible location. Let n be the number of blocks in the cache. Block number m of main memory (RAM) is placed at location m % n in the cache.

    IV.2.1.2 Full associative

    A block may be placed in any free location in the cache.

    IV.2.1.3 Set associative

    Each block maps to exactly one set. A set offers k choices, and the block is placed in one of those k locations. Let n be the number of sets in the cache. Block number m of main memory (RAM) is placed in set m % n of the cache.

    Illustration:

    In a K-way set associative cache, k blocks are grouped into 1 set; in the illustrated example, k = 2.

    IV.2.2 Block identification

    To locate data, the address is divided into 3 parts (Tag, Index, Block offset):

    bit 31 ............................ bit 0
    Tag | Index | Block offset

    IV.2.2.1 Block offset

    Identifies which element inside the block is accessed. To find how many bits the block offset has, determine how many elements a block contains.

    Number of elements = (size of block) / (size of the access unit)

    IV.2.2.2 Index

    Identifies the set in the cache.

    - Direct mapped: 1 block is 1 set.

    - K-way (k = 2^m) set associative: k blocks are grouped into 1 set.

    - Full associative: all blocks form 1 set. The index is then 0 bits.

    Number of blocks = (size of cache) / (size of block)

    IV.2.2.3 Tag

    Identifies which block is currently in the cache.

    Tag bits = 32 - index bits - block offset bits (in a 32-bit architecture)
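
    The three field widths follow from the cache size, block size, access unit, and associativity. The sketch below is one possible calculation, assuming all sizes are powers of two; the function name `cache_fields` is hypothetical. When the access unit is larger than a byte, the tag is what remains after removing the index bits and all byte-level bits of a block.

```python
def log2_exact(x: int) -> int:
    """log2 of a power of two, computed with integer arithmetic."""
    assert x > 0 and x & (x - 1) == 0, "sizes are assumed to be powers of two"
    return x.bit_length() - 1


def cache_fields(cache_bytes, block_bytes, access_bytes, ways, addr_bits=32):
    """Return (tag_bits, index_bits, block_offset_bits).
    block_offset_bits counts access units inside a block."""
    num_blocks = cache_bytes // block_bytes
    num_sets = num_blocks // ways
    offset_bits = log2_exact(block_bytes // access_bytes)
    index_bits = log2_exact(num_sets)
    tag_bits = addr_bits - index_bits - log2_exact(block_bytes)
    return tag_bits, index_bits, offset_bits


# Hypothetical cache: 256 KB, 16-byte blocks, byte access, 32-bit addresses.
KB = 1024
print(cache_fields(256 * KB, 16, 1, ways=1))        # direct mapped
print(cache_fields(256 * KB, 16, 1, ways=2))        # 2-way set associative
print(cache_fields(256 * KB, 16, 1, ways=2 ** 14))  # fully associative (one set)
```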

    IV.2.3 Block replacement

    When a block comes in and there is no free location left for it, an old block must be replaced by the new one.

    - For direct mapped caches each block has only 1 possible location, so replacement policies are not discussed for them.

    - FIFO: the block that was placed first is removed first.

    - Random

    - LRU: the least recently used block is replaced first.
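
    FIFO and LRU differ only in whether a hit refreshes a block's position. The sketch below models a single fully associative set to show that difference; the function name `count_misses` and the access sequence are assumptions made for illustration.

```python
from collections import OrderedDict


def count_misses(accesses, capacity: int, policy: str) -> int:
    """Count misses for one fully associative set under 'LRU' or 'FIFO'."""
    blocks = OrderedDict()              # the oldest entry is kept first
    misses = 0
    for block in accesses:
        if block in blocks:
            if policy == "LRU":
                blocks.move_to_end(block)   # a hit refreshes recency under LRU only
            continue
        misses += 1
        if len(blocks) == capacity:
            blocks.popitem(last=False)      # evict the first entry in the order
        blocks[block] = True
    return misses


# Hypothetical sequence of block names with room for 2 blocks.
sequence = ["A", "B", "A", "C", "A"]
print(count_misses(sequence, 2, "LRU"))
print(count_misses(sequence, 2, "FIFO"))
```

    Here LRU keeps A because it was just reused, while FIFO evicts A because it entered first, which costs one extra miss.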

    IV.2.4 Write strategy

    The strategy for writing data back to the cache and to memory.

    IV.2.4.1 Write Back

    Only the cache is updated; the latest value is written down to memory only on request or when the block has to be replaced. A valid bit is needed to tell whether the block is valid, and a modified bit to tell whether the block has been updated.

    Harder to implement, but uses little system bandwidth.

    IV.2.4.2 Write Through

    Both the cache and memory are updated. A valid bit is needed to tell whether the block is valid.

    Simple and easy to implement, but uses a lot of system bandwidth because memory is updated often.

    IV.2.5 Miss/hit

    - Miss: the requested item is not found in the cache. The block containing it must first be brought into the cache and is then accessed.

    - Hit: the requested item is found in the cache.

    - Miss penalty: the number of cycles needed to handle a cache miss.

    - Hit rate = hits / (hits + misses)

    - Miss rate = misses / (hits + misses) = 1 - hit rate

    - I-Cache miss rate = miss rate when accessing I-MEM

    - D-Cache miss rate = miss rate when accessing D-MEM

    Example:

    A program has 1000 instructions, 25% of which are loads/stores. Reading I-MEM misses 150 times and accessing D-MEM misses 50 times. Find the I-Cache miss rate and the D-Cache miss rate.

    - I-Cache miss rate = number of misses / number of I-MEM accesses = 150/1000 = 15%

    - D-Cache miss rate = number of misses / number of D-MEM accesses = 50/(1000 × 25%) = 50/250 = 20%

    A cache miss causes stalls. To find how many, compute the following quantities:

    - Memory stall cycles = Combined misses × Miss penalty

    - Miss penalty: the number of cycles needed to resolve a miss.

    - Combined misses = I-Cache misses + D-Cache misses

    - I-Cache misses = I-Count × I-Cache miss rate

    - D-Cache misses = LS-Count × D-Cache miss rate

    - LS-Count (loads & stores) = I-Count × LS frequency

    - Memory stall cycles per instruction = Combined misses per instruction × Miss penalty

    - Combined misses per instruction = I-Cache miss rate + LS frequency × D-Cache miss rate

    Memory stall cycles per instruction = I-Cache miss rate × Miss penalty + LS frequency × D-Cache miss rate × Miss penalty

    Example: instruction count (I-Count) = 10^6 instructions, 30% of the instructions are loads/stores, the D-cache miss rate is 5%, the I-cache miss rate is 1%, and the miss penalty is 100 cycles. Compute the combined misses per instruction and the memory stall cycles.

    1% + 30% × 5% = 0.025 combined misses per instruction, which corresponds to 25 misses per 1000 instructions.

    Memory stall cycles = 0.025 × 100 (miss penalty) = 2.5 stall cycles per instruction

    Total memory stall cycles = 10^6 × 2.5 = 2,500,000

    CPI_MemoryStalls = CPI_PerfectCache + Memory stalls per instruction

    Example: the CPI is 1.5 without stalls, and the cache miss rate is 2% for instructions and 5% for data. Loads and stores make up 20% of the instructions. The miss penalty is 100 cycles for both the I-cache and the D-cache. Compute the CPI of the system.

    - Memory stalls per instruction = 0.02 × 100 + 20% × 0.05 × 100 = 3

    - CPI_MemoryStalls = 1.5 + 3 = 4.5 cycles
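
    Both worked examples use the same two formulas, so they can be reproduced with two short functions. This is a sketch; the function names `mem_stalls_per_instruction` and `cpi_with_memory_stalls` are hypothetical, and the inputs are the values from the examples above.

```python
def mem_stalls_per_instruction(i_miss_rate, d_miss_rate, ls_frequency, miss_penalty):
    """Memory stall cycles per instruction =
    (I-cache miss rate + LS frequency x D-cache miss rate) x miss penalty."""
    combined_misses = i_miss_rate + ls_frequency * d_miss_rate
    return combined_misses * miss_penalty


def cpi_with_memory_stalls(cpi_perfect, i_miss_rate, d_miss_rate, ls_frequency, miss_penalty):
    """CPI including memory stalls = CPI with a perfect cache + stalls per instruction."""
    return cpi_perfect + mem_stalls_per_instruction(
        i_miss_rate, d_miss_rate, ls_frequency, miss_penalty
    )


# First example: 1% I-cache misses, 5% D-cache misses, 30% loads/stores, penalty 100.
stalls = mem_stalls_per_instruction(0.01, 0.05, 0.30, 100)
print(stalls, stalls * 10**6)

# Second example: perfect-cache CPI 1.5, 2% and 5% miss rates, 20% loads/stores.
print(cpi_with_memory_stalls(1.5, 0.02, 0.05, 0.20, 100))
```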

    Average Memory Access Time (AMAT): the average time of a memory access

    - AMAT = Hit time + Miss rate × Miss penalty

    To reduce the access time, therefore:

    - Reduce the hit time by using a small, simple cache.

    - Reduce the miss rate by using a large cache, a large block size, and k-way set associativity with a large k.

    - Reduce the miss penalty by using a multi-level cache.

    Example: find the AMAT given a cache access time (hit time) of 1 cycle = 2 ns, a miss penalty of 20 clock cycles, and a miss rate of 0.05 per access.

    - AMAT = 1 + 0.05 × 20 = 2 cycles = 4 ns
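
    The AMAT example can be reproduced in a few lines, which also makes it easy to try other miss rates. The function name `amat_cycles` is an assumption for this sketch; the inputs are those of the example above.

```python
def amat_cycles(hit_time_cycles: float, miss_rate: float, miss_penalty_cycles: float) -> float:
    """AMAT = hit time + miss rate x miss penalty, all in clock cycles."""
    return hit_time_cycles + miss_rate * miss_penalty_cycles


CYCLE_NS = 2.0                            # 1 cycle = 2 ns in the example

amat = amat_cycles(1, 0.05, 20)           # hit time 1 cycle, 5% misses, 20-cycle penalty
print(amat, amat * CYCLE_NS)

# A lower miss rate brings the AMAT closer to the hit time.
print(amat_cycles(1, 0.01, 20))
```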

    IV.3 Exercises:

    1) A cache has a capacity of 256 KB and a block size of 4 words; each access reads 1 byte. Determine the number of bits of the tag, index, and block offset fields in each of these cases:

    - Direct mapped

    - Full associative

    - 2-way set associative

    2) Main memory has a capacity of 1G, the cache has a capacity of 1 MB, and the block size is 256 B; each access reads 1 word. Determine the number of bits of the tag, index, and block offset fields in each of these cases:

    - Direct mapped

    - Full associative

    - 4-way set associative

    3) A cache has 8 blocks, and each block is 4 words. Determine the number of misses/hits when the system accesses the following addresses in this order:

    0x0001002A

    0x00010020

    0x0002006A

    0x00020066

    0x00020022

    0x0001002B

    In each of these cases:

    - Direct mapped

    - Full associative

    - 2-way set associative

    IV.4 Answers/Hints

    1)

    - Number of elements in 1 block = (size of block) / (size of the accessed element) = 4 words / 1 byte = (4 × 4 bytes) / 1 byte = 16

    - Number of blocks in the cache = size of cache / size of block = 256 KB / 4 words = (2^8 × 2^10) / (4 × 4) = 2^14 blocks

    - Direct mapped: block offset 4 bits, index = 14 bits, tag = 32 - 4 - 14 = 14 bits

    - Full associative: block offset 4 bits, index = 0 bits, tag = 32 - 4 = 28 bits

    - 2-way set associative: 2 blocks form 1 set and there are 2^14 blocks, so there are 2^13 sets; block offset 4 bits, index = 13 bits, tag = 32 - 4 - 13 = 15 bits

    2)

    - Number of elements in 1 block = (size of block) / (size of the accessed element) = 256 B / 4 bytes = 2^8 bytes / 2^2 bytes = 2^6

    - Number of blocks in the cache = size of cache / size of block = 1 MB / 256 B = (2^10 × 2^10) / 2^8 = 2^12 blocks

    - The address space is 1G, so a 30-bit address is enough to access 1G of RAM.

    - Direct mapped: block offset 6 bits (word), index = 12 bits, tag = 30 - 6 - 12 - 2 (byte-in-word bits) = 10 bits

    - Full associative: block offset 6 bits (word), index = 0 bits, tag = 30 - 6 - 2 = 22 bits

    - 4-way set associative: 4 blocks form 1 set and there are 2^12 blocks, so there are 2^10 sets; block offset 6 bits (word), index = 10 bits, tag = 30 - 6 - 10 - 2 = 12 bits

    3)

    - From the problem statement there are 4 block offset bits and, for direct mapped, 3 index bits. The addresses are therefore decomposed as shown below.

    Direct mapped

    Address | Tag | Index | Block offset | Miss/hit | Explanation
    0x0001002A | 0000 0000 0000 0001 0000 0000 0 | 010 | 1010 | M | First access
    0x00010020 | 0000 0000 0000 0001 0000 0000 0 | 010 | 0000 | H |
    0x0002006A | 0000 0000 0000 0010 0000 0000 0 | 110 | 1010 | M | First access
    0x00020066 | 0000 0000 0000 0010 0000 0000 0 | 110 | 0110 | H |
    0x00020022 | 0000 0000 0000 0010 0000 0000 0 | 010 | 0010 | M | Different tag
    0x0001002B | 0000 0000 0000 0001 0000 0000 0 | 010 | 1011 | M | Different tag

    Total: 4 misses, 2 hits.

    Full associative

    Address | Tag | Block offset | Miss/hit | Explanation
    0x0001002A | 0000 0000 0000 0001 0000 0000 0010 | 1010 | M | First access
    0x00010020 | 0000 0000 0000 0001 0000 0000 0010 | 0000 | H |
    0x0002006A | 0000 0000 0000 0010 0000 0000 0110 | 1010 | M | First access
    0x00020066 | 0000 0000 0000 0010 0000 0000 0110 | 0110 | H |
    0x00020022 | 0000 0000 0000 0010 0000 0000 0010 | 0010 | M | First access
    0x0001002B | 0000 0000 0000 0001 0000 0000 0010 | 1011 | H |

    Total: 3 misses, 3 hits.

    2-way set associative, which needs 2 index bits (the last two rows assume LRU or FIFO replacement within the set)

    Address | Tag | Index | Block offset | Miss/hit | Explanation
    0x0001002A | 0000 0000 0000 0001 0000 0000 00 | 10 | 1010 | M | First access
    0x00010020 | 0000 0000 0000 0001 0000 0000 00 | 10 | 0000 | H |
    0x0002006A | 0000 0000 0000 0010 0000 0000 01 | 10 | 1010 | M | First access
    0x00020066 | 0000 0000 0000 0010 0000 0000 01 | 10 | 0110 | H |
    0x00020022 | 0000 0000 0000 0010 0000 0000 00 | 10 | 0010 | M | Different tag
    0x0001002B | 0000 0000 0000 0001 0000 0000 00 | 10 | 1011 | M | Different tag

    Total: 4 misses, 2 hits.
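
    The three hit/miss traces of exercise 3 can be replayed with a small cache model. The sketch below is an illustration, not a reference implementation: the function name `run_cache` is hypothetical, blocks are 16 bytes (4 words, byte addressed), and LRU replacement is assumed because the exercise does not name a policy.

```python
from collections import OrderedDict


def run_cache(addresses, num_blocks=8, block_bytes=16, ways=1):
    """Return 'M' or 'H' for each access, using LRU replacement within a set."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]   # per set: tag -> True, LRU first
    trace = []
    for addr in addresses:
        block = addr // block_bytes       # drop the block offset bits
        index = block % num_sets          # set that this block maps to
        tag = block // num_sets           # remaining upper bits
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)        # mark as most recently used
            trace.append("H")
        else:
            if len(lines) == ways:
                lines.popitem(last=False) # evict the least recently used block
            lines[tag] = True
            trace.append("M")
    return trace


ADDRESSES = [0x0001002A, 0x00010020, 0x0002006A, 0x00020066, 0x00020022, 0x0001002B]

print("direct mapped:   ", "".join(run_cache(ADDRESSES, ways=1)))
print("full associative:", "".join(run_cache(ADDRESSES, ways=8)))
print("2-way:           ", "".join(run_cache(ADDRESSES, ways=2)))
```

    Setting `ways` to 1 gives a direct mapped cache and setting it to the number of blocks gives a fully associative one, so the same model covers all three cases.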
