Enabling Big Memory with Emerging Technologies
Manjunath Shevgoor
Jan 21, 2016
Big Memory
DRAM needs are increasing rapidly
Increased Data Gathering
Data Analytics
In Memory Databases
Need more capacity
Source: Kevin Lim et al., Disaggregated Memory for Expansion and Sharing in Blade Servers, ISCA’09
Core count doubling ~ every 2 years
DIMM capacity doubling ~ every 3 years
Memory capacity per core expected to drop every year
[Source: Memory Scaling: Systems Architecture Perspective, O Mutlu]
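The two doubling rates quoted on this slide imply a shrinking memory capacity per core. A minimal sketch of that arithmetic, assuming smooth exponential growth and a ratio normalized to 1.0 at year 0 (both are simplifications for illustration, not figures from the talk):

```python
def relative_capacity_per_core(years: float,
                               core_doubling_years: float = 2.0,
                               dimm_doubling_years: float = 3.0) -> float:
    """Capacity per core relative to year 0, assuming smooth exponential growth."""
    cores = 2.0 ** (years / core_doubling_years)
    capacity = 2.0 ** (years / dimm_doubling_years)
    return capacity / cores


for year in (0, 2, 4, 6):
    print(year, round(relative_capacity_per_core(year), 3))
```

Under these assumptions the capacity per core halves roughly every 6 years.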
Possible Solutions
3D Stacking: Increased Current Draw [MICRO’13]
Many Rank: High Refresh Power [Under Submission]
Non-Volatile Memory (Memristor): Sneak Currents [ICCD’15, HPCA’12, NVMW’11,15]
Thesis Statement
Memory capacity requirements are increasing at a very fast rate.
Management of high currents is crucial for effective deployment of new technologies.
This thesis hypothesizes that architecture/OS policies for data placement can help manage some
of the problems posed by high currents.
Talk Outline
• Current Constraints in 3D DRAM
• Addressing Refresh Overheads in DRAM
• Improving Memristor Memory by Re-using Sneak Currents
• Conclusion and Future Work
IR-Drop in 3D DRAM [MICRO’13]
What is a power delivery network (PDN)?
Source: Sani R. Nassif, Power Grid Analysis Benchmarks
A grid of wires that connects the power supply to the circuits
Voltage drops across every PDN
The voltage lost on the PDN is the IR Drop
Explore architectural policies to manage IR Drop
IR Drop in 3D DRAM
• 3D stacking increases current density – Increased ‘I’
• TSVs add resistance to the PDN – Increased ‘R’
• Navigate 8 TSV layers to reach the top die
• Insufficient voltage leads to incorrect operation
[Figure: stack regions labeled High IR Drop and Low IR Drop]
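The effect of a larger ‘I’ and a larger ‘R’ can be seen in a toy model of the stack. This is only a sketch: it treats the PDN as a single chain of identical TSV hops fed from the bottom, and the supply voltage, per-die current, and per-hop resistance are hypothetical values, not measurements from the talk.

```python
def layer_voltages(vdd, die_currents, tsv_resistance):
    """Supply voltage seen at each die in a stack fed from the bottom.

    die_currents[k] is the current drawn by die k (0 = closest to the supply).
    Every TSV hop carries the current of all dies at or above it.
    """
    voltages = []
    v = vdd
    for k in range(len(die_currents)):
        current_through_hop = sum(die_currents[k:])
        v -= current_through_hop * tsv_resistance  # V = I * R lost on this hop
        voltages.append(v)
    return voltages


# Hypothetical numbers: 8 dies, 50 mA each, 0.05 ohm per TSV hop, 1.5 V supply.
volts = layer_voltages(vdd=1.5, die_currents=[0.05] * 8, tsv_resistance=0.05)
for layer, v in enumerate(volts):
    print(layer, round(v, 4))
```

In this model the top die always sees the lowest voltage, because its current crosses every hop below it.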
Floor Plan and Quality of Power Delivery
Banks that are farther away from the TSVs suffer higher IR Drop
[Figure: V on M1 on Layer 9, plotted over X Coordinate and Y Coordinate]
IR Drop Varies along a Die and across the stack
[Figure: IR Drop maps for Layer 2 through Layer 9]
[Figure: Logic Layer, Bot 4 Dies (regions A Bot, B Bot, C Bot, D Bot), and Top 4 Dies (regions A Top, B Top, C Top, D Top)]
Create constraints for Iso-IR Drop regions
Place critical pages in IR Drop resistant regions
IR Drop oblivious page placement leads to 47% performance degradation
Spatio-Temporal Constraints
Region Based Constraints
Top Region: 1-2 Reads allowed/region
Bottom Region: 4 Reads allowed/region
At least 1 Top-Read: 8 Reads allowed/stack
No Top-Reads: 16 Reads allowed/stack
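The region and stack limits listed on this slide can be read as an admission check that a memory controller runs before issuing a read. The sketch below is one possible encoding, not the controller from the thesis: the `(name, is_top)` region key and the choice of 2 as the top-region limit (the upper end of the "1-2" range) are assumptions.

```python
TOP_LIMIT = 2              # assumed: upper end of the "1-2 reads" range
BOTTOM_LIMIT = 4
STACK_LIMIT_WITH_TOP = 8   # at least one read in a top region
STACK_LIMIT_NO_TOP = 16    # no reads in any top region


def can_issue_read(region, in_flight):
    """region is a (name, is_top) pair; in_flight maps such pairs to read counts."""
    _, is_top = region
    after = dict(in_flight)
    after[region] = after.get(region, 0) + 1
    # Per-region limit.
    if after[region] > (TOP_LIMIT if is_top else BOTTOM_LIMIT):
        return False
    # Stack-wide limit depends on whether any top-region read would be active.
    any_top = any(count > 0 and key[1] for key, count in after.items())
    stack_limit = STACK_LIMIT_WITH_TOP if any_top else STACK_LIMIT_NO_TOP
    return sum(after.values()) <= stack_limit


busy = {("A", False): 4, ("B", False): 4}
print(can_issue_read(("C", False), busy))  # another bottom read
print(can_issue_read(("A", True), busy))   # a top read on a busy stack
```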
Dynamic Page Placement
Pages with highest total queuing delay are moved to bottom regions
Using page access count to promote pages can starve threads
Scheduler ensures fairness
Page migration is limited by Migration Penalty (10k/15M cycles)
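Selecting pages by total queuing delay can be sketched as a top-k selection over per-page counters. This is an illustrative sketch only: the page names, the delay values, and the `bottom_slots` parameter are hypothetical, and the migration penalty and scheduler fairness mentioned on the slide are not modeled.

```python
import heapq


def pages_to_promote(queuing_delay, bottom_slots):
    """Return the pages with the highest total queuing delay, most delayed first."""
    return heapq.nlargest(bottom_slots, queuing_delay, key=queuing_delay.get)


# Hypothetical per-page queuing delay (cycles) accumulated over one epoch.
delays = {"p0": 120, "p1": 9000, "p2": 40, "p3": 5000}
print(pages_to_promote(delays, bottom_slots=2))
```

Ranking by queuing delay rather than access count favors pages whose requests actually waited, which is the property the slide relies on to avoid starving threads.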
Results
Within 20% of ideal
Overview
3D Stacking: Increased Current Draw [MICRO’13]
Many Rank: Refresh Overhead [Under Submission]
Non-Volatile Memory (Memristor): Sneak Currents [ICCD’15, HPCA’12, NVMW’11,15]
Re-Thinking Data Placement in Highly Ranked DRAM Systems
Refresh Power in DRAM
Command    Current (mA)
Act        67
Read       125
Write      125
Refresh    245
Refresh consumes 96% more power than read
Source: Micron 8GB DDR3L data sheet
There can be up to 4 ranks in a DIMM
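The 96% figure follows directly from the currents on this slide. A short check, assuming the same supply voltage for both commands so that the current ratio equals the power ratio:

```python
CURRENT_MA = {"Act": 67, "Read": 125, "Write": 125, "Refresh": 245}

# Relative increase of Refresh over Read.
extra = CURRENT_MA["Refresh"] / CURRENT_MA["Read"] - 1.0
print(f"Refresh draws {extra:.0%} more current than Read")
```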
[Figure: 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, and Rank 1 to Rank 4]
Stagger refresh to reduce peak power
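One way to picture staggering is to offset each rank's refresh inside the refresh interval so that no two ranks refresh at the same time. The even spacing below is an assumption made for illustration; the 7.8 µs interval and 350 ns refresh time are the values quoted elsewhere in this talk.

```python
def staggered_refresh_starts(num_ranks, t_refi_ns):
    """Start offset of each rank's refresh inside one refresh interval."""
    return [rank * t_refi_ns / num_ranks for rank in range(num_ranks)]


def refreshes_overlap(starts, t_rfc_ns, t_refi_ns):
    """True if any two refresh windows overlap, including wrap-around."""
    ordered = sorted(starts)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    gaps.append(ordered[0] + t_refi_ns - ordered[-1])
    return any(gap < t_rfc_ns for gap in gaps)


starts = staggered_refresh_starts(num_ranks=4, t_refi_ns=7800)
print(starts, refreshes_overlap(starts, t_rfc_ns=350, t_refi_ns=7800))
```

With non-overlapping windows at most one rank draws refresh current at a time, which lowers the peak but spreads refresh stalls over a longer period.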
Increase in Refresh Time
Chip Capacity (GB)    tRFC (ns)    tRFC_2X (ns)    tRFC_4X (ns)
8                     350          240             160
16                    480          350             240
32                    640          480             350
Refresh Interval      7.8 µs       3.9 µs          1.95 µs
Fine grained refresh
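The table can be turned into the fraction of time a rank is unavailable because of refresh, which is tRFC divided by the refresh interval. This is a sketch of that arithmetic using only the values in the table; the dictionary layout and function name are illustrative.

```python
T_RFC_NS = {             # chip capacity -> (tRFC, tRFC_2X, tRFC_4X) in ns
    8: (350, 240, 160),
    16: (480, 350, 240),
    32: (640, 480, 350),
}
T_REFI_NS = (7800, 3900, 1950)   # refresh interval for the 1X, 2X, 4X modes


def refresh_fraction(capacity, mode):
    """Fraction of time a rank is busy refreshing; mode 0/1/2 = 1X/2X/4X."""
    return T_RFC_NS[capacity][mode] / T_REFI_NS[mode]


for capacity in T_RFC_NS:
    print(capacity, [f"{refresh_fraction(capacity, m):.1%}" for m in range(3)])
```

With these numbers the finer-grained modes shorten each individual refresh but increase the total share of time spent refreshing, and that share grows with chip capacity.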
Effect of Staggered Refresh
Configuration        Completion Time
0ns                  1.000
Simul Ref            1.076
Stagger Ref          1.140
Simul Ref ExtT       1.161
Stagger Ref ExtT     1.369
[Figure: 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, and Rank 1 to Rank 4; queued requests are labeled by thread and rank (T1R2, T2R3, T1R1, T2R2, T2R1, T3R1, T1R1, T2R3, T1R3, T1R3, T3R3, T3R3), and two queues are marked Stalled]
Each Staggered Refresh stalls many cores
Limit the spread - Address Mapping
[Figure: Normalized Comp. Time with No Refresh and with Refresh for No Interleave, XOR Bank Interleave, and Chan Interleave; bar labels in extraction order: 1.000, 1.259, 0.991, 1.247, 0.808, 1.346]
[Figure: 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, and Rank 1 to Rank 4. In the current case each thread's requests are spread over several ranks (T1R2, T2R3, T1R1, T2R2, T2R1, T3R1, ...) and two queues are marked Stalled. Ideally, each thread's requests go to a single rank (T1R1, T2R2, T3R3)]
Rank Assigned Page Mapping
[Figure (a): Strict mapping of threads to ranks. 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, Rank 1 to Rank 4, and Thread 1 to Thread 8]
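A strict thread-to-rank mapping amounts to a page allocator that draws frames from the rank assigned to the requesting thread. The sketch below assumes a round-robin thread-to-rank assignment and per-rank free lists, neither of which is specified in the talk; the fallback to other ranks illustrates a relaxed assignment for when the preferred rank is full.

```python
def preferred_rank(thread_id, num_ranks):
    """Assumed policy: threads are spread round-robin over the ranks."""
    return thread_id % num_ranks


def allocate_page(thread_id, free_frames):
    """Take a frame from the thread's preferred rank, else from another rank.

    free_frames is a list indexed by rank; each entry is a list of free frames.
    Returns (rank, frame), or None if no frame is free anywhere.
    """
    num_ranks = len(free_frames)
    first = preferred_rank(thread_id, num_ranks)
    for offset in range(num_ranks):
        rank = (first + offset) % num_ranks
        if free_frames[rank]:
            return rank, free_frames[rank].pop()
    return None


free = [[10, 11], [], [30], [40]]
print(allocate_page(0, free))  # served from the preferred rank
print(allocate_page(1, free))  # preferred rank is empty, falls back
```

With 8 threads and 4 ranks this assignment places two threads on each rank, so a refresh to one rank stalls at most those two threads.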
18.6% better than Staggered Refresh
[Figure: Norm. Comp. Time for 0ns, Simultaneous Refresh, Staggered Refresh, and Staggered RA across CG, EP, MG, UA, canneal, GemsFDTD, astar, bzip2, lbm, libquantum, mcf, milc, omnetpp, soplex, xalancbmk, classification, cloud9, and AM]
Limit the spread - Page Mapping
[Figure: 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, Rank 1 to Rank 4, and Thread 1 to Thread 8]
Relaxing Rank Assignment
% Pages not mapped to Preferred Rank    % Exec. Time reduction
0%                                      18.6%
10%                                     16.5%
15%                                     14.2%
20%                                     13.5%
33%                                     12.1%
50%                                     9.4%
Data Mapping
Address Mapping
Page Mapping: 18.6% better than Staggered Refresh
Overview
3D Stacking: Increased Current Draw [MICRO’13]
Many Rank: Refresh Overhead [Under Submission]
Non-Volatile Memory (Memristor): Sneak Currents [ICCD’15, HPCA’12, NVMW’11,15]
Designing a Fast and Reliable Memory with Memristor Technology
[ICCD’15, NVMW’15]
Background
Store data in the form of resistance
Metal oxide sandwiched between two electrodes
Inherently non-conducting
Creation of conductive filaments of oxygen vacancies reduces resistance
Source: Cong Xu et al., Modeling and Design Analysis of 3D Vertical Resistive Memory - A Low Cost Cross-Point Architecture, ASPDAC 2014
Voltage Dependent Resistance
Resistance decreases with increasing voltage
The resistance of a ReRAM cell is not constant but varies with the applied voltage
Combination of a selector in series with memristor device
Cross Point Structure
[Figure: DRAM Cell, PCM Cell, and Memristor Cell, each shown between a Word Line and a Bit Line]
Cell Size of 4F^2
Memristor Cell
Because of non-linearity, it is possible to select a cell without an access transistor.
Arrays can be layered vertically without resorting to 3D stacking.
[Figure: a Memristor in series with a Selector; crossbar with Driver Transistors and a Selected Cell]
Reading and Writing
[Figure: crossbar with Driver Transistors; the Selected Cell sits between a line at V and a line at 0V, all other lines are held at V/2, and Sneak Current flows through the Half Selected Cells]
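The V/2 biasing in this figure can be checked with a few lines of arithmetic: the voltage across a cell is the difference between its two lines. The sketch assumes the selected word line is driven to V and the selected bit line to 0; the array size and selected position are arbitrary.

```python
def cell_voltages(rows, cols, sel_row, sel_col, v):
    """Voltage across every cell when one cell is selected with V/2 biasing."""
    word = [v if r == sel_row else v / 2 for r in range(rows)]
    bit = [0.0 if c == sel_col else v / 2 for c in range(cols)]
    return [[word[r] - bit[c] for c in range(cols)] for r in range(rows)]


grid = cell_voltages(rows=4, cols=4, sel_row=1, sel_col=2, v=1.0)
half_selected = sum(cell == 0.5 for row in grid for cell in row)
print(grid[1][2], half_selected)
```

Only the selected cell sees the full V. The cells sharing its row or column see V/2 and carry the sneak current, and their number grows with the array dimensions.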
Effects of Ileak
Decreases Voltage at selected cell
  Increases Write Latency
  Can cause Write Failure
Distorts bit line current
  Increases read complexity
  Decreases read margin
Limits Array Size
Reading from the crossbar array
Step 1: Read background current (Ileak)
[Figure: crossbar with one line at 0 and every other line at Vread/2; Ileak flows through each cell on the line held at 0]
Reading from the crossbar array
Step 2: Read total Vread current (Iread)
[Figure: crossbar with the selected line at Vread, one line at 0, and every other line at Vread/2; Iread flows through the selected cell and Ileak through the other cells on the line held at 0]
State of selected cell determines Iread - Ileak
[Timeline: tBG_READ followed by tREAD; together they make up the Read Latency]
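The two-step read can be summarized as: measure the background current, measure the total current, and decide the cell state from the difference. This is a sketch under stated assumptions: the threshold, the polarity (a larger difference reads as 1), and the numeric values are illustrative, not taken from the talk.

```python
def read_cell(i_leak, i_read, threshold):
    """Two-step read: subtract the background current, then compare.

    Returns 1 for a low-resistance (high-current) cell; the polarity and the
    threshold are assumptions for illustration.
    """
    cell_current = i_read - i_leak
    return 1 if cell_current > threshold else 0


def read_latency(t_bg_read, t_read):
    """Both steps happen back to back on every read."""
    return t_bg_read + t_read


print(read_cell(i_leak=10.0, i_read=15.0, threshold=2.0))
print(read_latency(t_bg_read=3, t_read=2))
```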
Proposal 1: Re-use value in sample and hold circuit
[Figure: crossbar biased with Vread and Vread/2, connected to a Sample and Hold stage and a Sensing Circuit; labels: Sneak Current, S1, S2, Pacc, Pprech, Vr]
Reusing Sneak Current Read
[Figure: Sneak Current (uA) plotted over Rows and Columns]
Re-Use Sneak Current Reading for the same Column
[Timeline: Read Latency 1 is tBG_READ followed by tREAD; Read Latency 2 is tREAD only]
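The latency benefit of reuse can be sketched as a small state machine that remembers which column's background current is currently held. This is a simplified model for illustration: the class name and time units are hypothetical, a single array is assumed, and any invalidation of the held value (for example after writes) is not modeled.

```python
class SneakCurrentCache:
    """Remembers which column's background current is held by sample-and-hold."""

    def __init__(self, t_bg_read, t_read):
        self.t_bg_read = t_bg_read
        self.t_read = t_read
        self.held_column = None

    def read(self, column):
        """Return the latency of a read to `column`."""
        if column == self.held_column:
            return self.t_read                 # background current reused
        self.held_column = column              # new background read needed
        return self.t_bg_read + self.t_read


cache = SneakCurrentCache(t_bg_read=3, t_read=2)
print([cache.read(c) for c in (7, 7, 7, 8)])
```

Back-to-back reads to the same column pay for the background read only once, so address mappings that keep consecutive accesses in one column benefit most.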
Impact of Cell Location
[Figure: array with Bit Line Mux and Word Line Drivers; one region is marked "Increased error rates"]
[Figure: a 64 Byte Cache line spread over Array 1, Array 2, Array 3, ..., Array 512, with Bit 1, Bit 2, Bit 3, ..., Bit 512 each stored in a different array]
Default mapping leads to some lines with high error rate
Proposal 2: Stagger the array mapping
Default Mapping (entries give the Nth bit in cacheline)
           Cacheline 1    Cacheline 2    Cacheline 3    Cacheline 4
Array 0    0              0              0              0
Array 1    1              1              1              1
Array 2    2              2              2              2
Array 3    3              3              3              3
[Figure: Proposed Mapping over the same arrays and cache lines, with the bit assignment staggered]
30X reduction in probability of a single bit error
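The motivation for staggering can be illustrated with a toy model in which one cell position in every array is error prone. The rotation used below is an assumption chosen for illustration; the exact mapping proposed in the talk is shown only in the figure and may differ.

```python
def default_position(line, array, positions):
    """Every array stores the line's bit at the same position."""
    return line % positions


def staggered_position(line, array, positions):
    """Assumed rotation: each array shifts the position by its own index."""
    return (line + array) % positions


def worst_bits_per_line(mapping, lines, arrays, positions, bad_position):
    """Largest number of bits that any single line places at the bad position."""
    return max(
        sum(mapping(line, a, positions) == bad_position for a in range(arrays))
        for line in range(lines)
    )


args = dict(lines=4, arrays=4, positions=4, bad_position=3)
print(worst_bits_per_line(default_position, **args))
print(worst_bits_per_line(staggered_position, **args))
```

In this model the default mapping puts every bit of one unlucky line at the error-prone position, while the rotation spreads those positions evenly so that each line holds only one such bit.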
Performance Vs Baseline
[Figure: Increase in Performance (0% to 140%) for DRAM and 32ReUse across Gems, astar, bwaves, bzip2, gobmk, hmmer, lbm, libq, mcf, milc, soplex, xalanc, zeus, and GM]
Exploring Address Mapping
[Figure: Normalized IPC (0.7 to 1.2) for 32Reuse, 4interleave, XOR, 32interleave, and 4Reuse across Gems, astar, bwaves, bzip2, gobmk, hmmer, lbm, libq, mcf, milc, soplex, xalanc, zeus, and GM]
Summary of Dissertation
3D Stacking: Increased Current Draw [MICRO’13]; Spatio-Temporal Constraints
Many Rank: Refresh Overhead [Under Submission]; Re-Thinking Data Placement
Non-Volatile Memory (Memristor): Memory Latencies [ICCD’15, HPCA’12, NVMW’11,15]; Re-use Sneak Currents
Conclusions
3D Stacking: IR Drop Constraints
Many Rank: Rank Assignment
Memristor: Re-Use Sneak Currents
Future Work
• Mitigating the Rising Cost of Process Variation in 3D DRAM
• PDN Aware Refresh Cycle Time for 3D DRAM
• Addressing Long Write Latencies in Memristor based Memory
Other Projects and Publications
• Efficiently Prefetching Complex Address Patterns
  • MICRO’15
• USIMM: The Utah Simulated Memory Module
  • Used for the Memory Scheduling Championship
• Efficient Scrub Mechanisms for Error-Prone Emerging Memories
  • HPCA’12
• Accelerating Critical Word Access using Heterogeneous Memory
  • MICRO’12
• Avoiding Information Leakage in the Memory Controller
  • MICRO’15
Acknowledgements
• Rajeev
• Ashwini, Parents
• Al, Erik, Naveen, Ken
• Chris Wilkerson, Zeshan Chishti
• Utah Arch team-mates
• Karen, Ann
Thank You
Thesis Overview
3D Stacking: Increased Current Density [MICRO’13]
Many Rank: High Refresh Current [Under Submission]
Non-Volatile Memory (Memristor): Sneak Currents [ICCD’15, NVMW’11,15]
[Diagram labels: Analyze Impact of Currents, +, Performance Loss, Data Placement]
Comparisons to Prior Work
[Figure: Normalized Exec. Time for 0ns, RA, ExtT640ns, RP_Opt, RP_Real, and Elastic Ref; bar labels in extraction order: 1.00, 1.10, 1.36, 1.18, 1.47, 1.30]
[Figure: crossbar of cells (RW) between Word Lines and Bit Lines, with a Bit Line Mux, bias voltages 0, V/2, and V, and node voltages VW1, VW2, ..., VWN and VWN1, ..., VWNM]
Bit line and word line resistances eat into the cell voltage
Percentage of refreshes stalling a thread
[Figure: % Refreshes (0 to 100) across CG, EP, MG, UA, canneal, GemsFDTD, astar, bzip2, lbm, libquantum, mcf, milc, omnetpp, soplex, xalancbmk, classification, cloud9, and AM]
Memory Latency
[Figure: Memory Latency (Cycles, 0 to 500) for NoReUse, 32Reuse, and DRAM across Gems, astar, bwaves, bzip2, gobmk, hmmer, lbm, libq, mcf, milc, soplex, xalanc, zeus, and GM]
Memristor Read Power
[Figure: Normalized Read Power (0.5 to 1.0) for 32ReUse, 4ReUse, 4Interleave, 32Interleave, and NoReUse across Gems, astar, bwaves, bzip2, gobmk, hmmer, lbm, libq, mcf, milc, soplex, xalanc, zeus, and GM]
See a Delta? Predict a Delta!
[Figure: Core 1 through Core 8 and the Last Level $; each $ Miss feeds Delta History Tables and Delta Prediction Tables, which are connected by Prediction and Feedback paths]
Sneak path currents can distort Iread
[Figure: crossbar with the selected line at Vread, one line at 0, and every other line at Vread/2; Iread flows through the selected cell and Ileak Sneak Currents flow through the other cells on the line held at 0]
Compress to reduce write latency
[Figure: a 64 Byte Cache line spread over Array 1, Array 2, Array 3, ..., Array 512 (Bit 1, Bit 2, Bit 3, ..., Bit 512), shown with the Proposed Mapping With 50% Compression]
[Figure: Normalized Completion Time (0.0 to 2.5) for 0ns, Simultaneous Refresh, Staggered Refresh, and Staggered RA across CG, EP, MG, UA, canneal, GemsFDTD, astar, bzip2, lbm, libquantum, mcf, milc, omnetpp, soplex, xalancbmk, classification, cloud9, and AM]
Summary
With great density come a few challenges
Sneak Currents limit array size, complicate reads, and delay writes
Affect reliability
Background current can be reused
Reliability can be improved at the cost of write latency
Compression can reduce write latency
8.3% performance improvement
30X reduction in multi bit error probability
Column Hit Rate
[Figure: Memory Latency (Cycles, 0 to 500) for NoReUse, 32Reuse, and DRAM, and Column Hit Rate (%, 0 to 100) labeled CHR, across Gems, astar, bwaves, bzip2, gobmk, hmmer, lbm, libq, mcf, milc, soplex, xalanc, zeus, and GM]