Accelerating Dependent Cache Misses with an Enhanced Memory Controller Milad Hashemi, Khubaib, Eiman Ebrahimi, Onur Mutlu, Yale N. Patt Tuesday June 21: Session 7A, 3:30pm
  • Accelerating Dependent Cache Misses with an Enhanced Memory Controller

    Milad Hashemi, Khubaib, Eiman Ebrahimi, Onur Mutlu, Yale N. Patt
    Tuesday June 21: Session 7A, 3:30pm

  • Memory Access Latency

    The latency of accessing main memory is made up of two parts:
    • DRAM access latency
    • On-chip latency

    [Figure: Multiprocessor and DRAM]

  • On-Chip Delay

    [Figure: stacked bar chart. Y-axis: Total Miss Cycles, 0% to 100%. Series: On-Chip Delay, DRAM Access. X-axis (four copies of each benchmark): calculix, povray, namd, gamess, perlbench, tonto, gromacs, gobmk, dealII, sjeng, gcc, hmmer, h264ref, bzip2, astar, Xalancbmk, zeusmp, cactus, wrf, GemsFDTD, leslie, omnetpp, milc, soplex, sphinx, bwaves, libquantum, lbm, mcf]

  • Dependent Cache Misses

    LD  [R3]   -> R5    (cache miss)
    ADD R4, R5 -> R9
    ADD R9, R1 -> R6
    LD  [R6]   -> R8    (cache miss)
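The dependence between the two misses on this slide can be traced mechanically. The following is a minimal sketch, assuming a simple forward dataflow walk over the four micro-ops shown; the `Uop` type and the `dependence_chain` and `live_ins` helpers are hypothetical names introduced for illustration and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uop:
    op: str       # "LD" or "ADD"
    srcs: tuple   # source register names
    dst: str      # destination register name


# The four micro-ops from the slide, in program order.
PROGRAM = [
    Uop("LD", ("R3",), "R5"),        # source cache miss
    Uop("ADD", ("R4", "R5"), "R9"),
    Uop("ADD", ("R9", "R1"), "R6"),
    Uop("LD", ("R6",), "R8"),        # dependent cache miss
]


def dependence_chain(program, miss_index):
    """Return the uops after the miss that transitively depend on its result."""
    tainted = {program[miss_index].dst}
    chain = []
    for uop in program[miss_index + 1:]:
        if any(src in tainted for src in uop.srcs):
            chain.append(uop)
            tainted.add(uop.dst)
        else:
            # An independent write overwrites any dependent value in dst.
            tainted.discard(uop.dst)
    return chain


def live_ins(chain, miss_dst):
    """Registers the chain reads that neither the miss nor the chain produces."""
    produced = {miss_dst}
    needed = []
    for uop in chain:
        for src in uop.srcs:
            if src not in produced and src not in needed:
                needed.append(src)
        produced.add(uop.dst)
    return needed


chain = dependence_chain(PROGRAM, 0)
chain_live_ins = live_ins(chain, PROGRAM[0].dst)
print([u.dst for u in chain])
print(chain_live_ins)
```

In this sketch the chain consists of the two ADDs and the final load, and the only values it needs besides the data returned by the first miss are R4 and R1.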

  • Compute Capable Memory Controller

    [Figure: block diagram of the compute-capable memory controller (EMC). Components: Physical Register File, Live In Vector, Uop Buffer, Reservation Station, ALU 0, ALU 1, EMC Data Cache, Load Store Queue, with Result Data and Tag Broadcast paths. Inputs: decoded micro-ops from core, live-in registers from core. Outputs: live-out registers to core, dirty cache lines to core.]
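The inputs and outputs named in the diagram (decoded micro-ops and live-in registers in, live-out registers out) can be illustrated with a toy interpreter. This is a rough sketch under stated assumptions, not a model of the actual hardware: `run_chain` is a hypothetical helper, the register values and memory contents are made up, and structures such as the reservation station, load store queue, and data cache are not modelled.

```python
def run_chain(chain, live_in, memory):
    """Execute a dependence chain and return the registers it produced.

    chain:   list of (op, srcs, dst) tuples in program order
    live_in: register values supplied with the chain, plus the miss data
    memory:  address -> value mapping standing in for DRAM
    """
    regs = dict(live_in)  # stands in for the small physical register file
    for op, srcs, dst in chain:
        vals = [regs[s] for s in srcs]
        if op == "ADD":
            regs[dst] = vals[0] + vals[1]
        elif op == "LD":
            regs[dst] = memory[vals[0]]
        else:
            raise ValueError(f"unsupported micro-op: {op}")
    # Every register written by the chain is treated as a live-out here.
    return {dst: regs[dst] for _, _, dst in chain}


# Chain from the Dependent Cache Misses slide; all values are hypothetical.
chain = [
    ("ADD", ("R4", "R5"), "R9"),
    ("ADD", ("R9", "R1"), "R6"),
    ("LD", ("R6",), "R8"),
]
live_in = {"R4": 8, "R1": 16, "R5": 0x1000}  # R5 holds the first miss's data
memory = {0x1018: 42}

live_out = run_chain(chain, live_in, memory)
print(live_out)
```

The point of the sketch is only that, once R5 arrives from DRAM, everything needed to compute the address of the second load is already available next to memory.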

  • Effective Memory Access Latency Reduction

    [Figure: bar chart. Y-axis: Effective Memory Access Latency, 0 to 450. X-axis: H1 to H10 and Mean. Series: Core Access, EMC Access.]
