MICRO-2009
Start-Gap: Low-Overhead Near-Perfect Wear Leveling for Main Memories
Moinuddin Qureshi, John Karidis, Michele Franceschini, Viji Srinivasan, Luis Lastras, Bulent Abali
IBM T. J. Watson Research Center, Yorktown Heights, NY
© 2007 IBM Corporation
Introduction: Lifetime Limited Memories
Emerging memory technologies such as PCM are candidates for main memory. Reasons: scalability, leakage power savings, density, etc.
Challenge: Each cell can endure only 10-100 million writes, so lifetime is limited.
With uniform write traffic, system lifetime ranges from 4 to 20 years.
[Figure: system lifetime across workloads; annotations: 16 yrs, 4 yrs]
Problem: Non-Uniformity in Writes
Heavy non-uniformity in writes: fewer than 10% of lines incur more than 90% of the write traffic.
[Figure: per-line writes for a database workload (writes occur on eviction from a 256MB DRAM cache), with the average marked]
Expected Lifetime with Non-Uniform Writes
Even with 64K spare lines, the baseline achieves only 5% of the ideal lifetime (20x lower).
Norm. Endurance = (Num. writes before system failure / Num. writes before failure with uniform writes) x 100%
[Figure: normalized endurance (%) for oltp, db1, db2, fft, stride, stress, and Gmean; bars: Baseline w/o spares, Baseline (64K spare lines)]
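The normalized endurance metric on this slide can be illustrated with a small sketch. It assumes no spare lines (the system fails when the first line wears out) and a write distribution that stays fixed over time. The function name and the example skew are hypothetical and are not taken from the slides.

```python
def normalized_endurance(write_fractions, endurance):
    """Percent of the uniform-write lifetime reached before the first line fails.

    write_fractions[i] is the share of all writes that goes to line i
    (assumed to sum to 1); endurance is the per-line write limit.
    """
    num_lines = len(write_fractions)
    hottest = max(write_fractions)
    # The hottest line wears out first, after endurance / hottest total writes.
    writes_before_failure = endurance / hottest
    # With perfectly uniform writes, every line reaches its limit together.
    uniform_writes = endurance * num_lines
    return 100.0 * writes_before_failure / uniform_writes


# Hypothetical skew: 10 of 100 lines receive 90% of the writes.
hot = [0.9 / 10] * 10
cold = [0.1 / 90] * 90
skewed = normalized_endurance(hot + cold, endurance=2**25)
uniform = normalized_endurance([1 / 100] * 100, endurance=2**25)
```

Under these assumptions the metric reduces to the ratio of the mean per-line write share to the maximum share, which is why a few hot lines are enough to pull it far below 100%.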
Outline
Problem
Background on Wear Leveling
Start Gap Wear Leveling
Randomized Start-Gap
Security Considerations
Summary
Existing Proposals: Table-Based Wear Leveling
Wear Leveling: Make writes uniform by remapping frequently written lines. Studied extensively for Flash memories. Almost all proposals are table based.
Line Addr.   Lifetime Count   Period Count
A            99K (Low)        1K (Low)
B            100K (Med)       3K (High)
C            101K (High)      2K (Med)

Indirection Table
Line   Remap Addr
A      C
B      A
C      B

Physical Address -> Indirection Table -> PCM Address
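The table-based remapping on this slide can be sketched as follows. The pairing policy (send the line with the highest period count to the location with the lowest lifetime count) is an assumption chosen because it reproduces the slide's example; actual table-based schemes differ in how and when they remap. The function name `build_remap` is hypothetical.

```python
def build_remap(lifetime_count, period_count):
    """Pair the most-written lines of the current period with the least-worn locations."""
    hot_first = sorted(period_count, key=period_count.get, reverse=True)
    least_worn_first = sorted(lifetime_count, key=lifetime_count.get)
    # Indirection table: logical line -> location it is remapped to.
    return dict(zip(hot_first, least_worn_first))


# Counts from the slide's example table.
lifetime = {"A": 99_000, "B": 100_000, "C": 101_000}
period = {"A": 1_000, "B": 3_000, "C": 2_000}
remap = build_remap(lifetime, period)
```

Every memory access has to consult a table like `remap`, which is where the storage and indirection latency of table-based methods come from.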
Disadvantages of Table Based Methods
Overheads:
1. Area of several (tens of) megabytes
2. Indirection latency (table in EDRAM/DRAM)
Area overhead can be reduced with more lines per region, but this brings:
1. Reduced effectiveness (e.g. Line0 always written)
2. Support for swapping large memory regions (complex)
Our Goal: A wear leveling algorithm that avoids the storage, latency, and complexity of table based methods and still achieves lifetime close to ideal.
Start-Gap Wear Leveling
Two registers (Start & Gap) + 1 line (GapLine) to support movement.
Move GapLine every 100 writes to memory.
[Figure: lines A, B, C, D in PCM locations 0-4, with the START and GAP pointers marked]
PCMAddr = (Start + Addr); if (PCMAddr >= Gap) PCMAddr++
Storage overhead: less than 8 bytes (GapLine taken from spares)
Latency: two additions (no table lookup)
Write overhead: one extra write every 100 writes (1%)
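The Start-Gap mechanism on this slide can be sketched as a small simulation. The slide shows only the two additions; the modulo on the line count and the wrap-around step (when the gap reaches location 0, the last line moves to location 0 and Start advances) are assumptions added to make the sketch complete. The class and method names are hypothetical.

```python
class StartGap:
    """Minimal Start-Gap sketch: N logical lines stored in N + 1 physical locations."""

    def __init__(self, num_lines, gap_interval=100):
        self.n = num_lines
        self.start = 0                 # Start register
        self.gap = num_lines           # Gap register: index of the empty location
        self.gap_interval = gap_interval
        self.writes_since_move = 0
        self.pcm = [None] * (num_lines + 1)
        self.wear = [0] * (num_lines + 1)   # writes seen by each physical location

    def translate(self, addr):
        """Map a logical line address to a physical location (two additions)."""
        pa = (addr + self.start) % self.n
        if pa >= self.gap:
            pa += 1
        return pa

    def _move_gap(self):
        if self.gap == 0:
            # Wrap-around: the last location moves into location 0, Start advances.
            src, dst, new_gap = self.n, 0, self.n
            self.start = (self.start + 1) % self.n
        else:
            # Normal step: the line just below the gap moves into the gap.
            src, dst, new_gap = self.gap - 1, self.gap, self.gap - 1
        self.pcm[dst] = self.pcm[src]
        self.pcm[src] = None
        self.wear[dst] += 1            # the one extra write per gap movement
        self.gap = new_gap

    def write(self, addr, value):
        pa = self.translate(addr)
        self.pcm[pa] = value
        self.wear[pa] += 1
        self.writes_since_move += 1
        if self.writes_since_move == self.gap_interval:
            self.writes_since_move = 0
            self._move_gap()

    def read(self, addr):
        return self.pcm[self.translate(addr)]


# Usage: four lines A-D as in the slide, with a short gap interval for illustration.
sg = StartGap(num_lines=4, gap_interval=2)
for i, name in enumerate("ABCD"):
    sg.write(i, name)
```

Because the gap sweeps through every location, a line that is written repeatedly does not stay in one physical location indefinitely, and no per-line table is needed to find it.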
Results for Start-Gap
[Figure: normalized endurance (%) for oltp, db1, db2, fft, stride, stress, and Gmean; bars: Baseline, Start-Gap, Perfect]
On average, Start-Gap gets 53% normalized endurance
10X better than baseline, but still 2x lower than Ideal. Why?
Spatial Correlation in Heavily Written Regions
Start-Gap moves a line to its neighbor. If heavily written regions are spatially close, Start-Gap may move hot lines to other hot lines.
[Figure: per-line write distributions for FFT and db1; annotations: "Peaks", "Writes Localized"]
If address space is randomized, hot regions will be spread uniformly.
Randomized Start Gap
Line Addr (Physical Address) -> Static Randomizer -> Randomized Address -> Start-Gap Mapping -> PCM Address
[Figure: hot lines in PCM]
Static Randomizer: a one-to-one mapping (invertible function), configured at design/boot time.
A minor change can support page-mode memory. The randomizer is OS unaware.
Efficient Address Space Randomization
Two proposals (very little hardware):
1. Random Invertible Binary (RIB) Matrix: 85 byte storage (1 cycle latency)

   | c0 |   | b00 b01 b02 b03 |   | a0 |
   | c1 | = | b10 b11 b12 b13 | x | a1 |
   | c2 |   | b20 b21 b22 b23 |   | a2 |
   | c3 |   | b30 b31 b32 b33 |   | a3 |
              (RIB Matrix)

2. Feistel Network (crypto): 5 byte storage (3 cycle latency)
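Both randomizers on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the hardware design: the matrix product is taken over GF(2) (AND for multiply, XOR for add), the Feistel network uses three rounds, and the round function (XOR with a key, square, truncate) and the key values are assumptions chosen for the example. All function names are hypothetical.

```python
import random


def gf2_rank(rows, nbits):
    """Rank of a binary matrix over GF(2); each row is an int bit-vector."""
    rows = list(rows)
    rank = 0
    for bit in range(nbits):
        pivot = next((i for i in range(rank, len(rows)) if (rows[i] >> bit) & 1), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and (rows[i] >> bit) & 1:
                rows[i] ^= rows[rank]
        rank += 1
    return rank


def random_invertible_matrix(nbits, rng):
    """Draw random binary matrices until one has full rank (is invertible)."""
    while True:
        rows = [rng.getrandbits(nbits) for _ in range(nbits)]
        if gf2_rank(rows, nbits) == nbits:
            return rows


def rib_map(addr, rows):
    """c = B x a over GF(2): output bit i is the parity of (row i AND addr)."""
    out = 0
    for i, row in enumerate(rows):
        out |= (bin(row & addr).count("1") & 1) << i
    return out


def _round_fn(half, key, mask):
    # Assumed round function: XOR with the key, square, keep the low bits.
    return ((half ^ key) ** 2) & mask


def feistel_encrypt(addr, keys, half_bits):
    mask = (1 << half_bits) - 1
    left, right = addr >> half_bits, addr & mask
    for key in keys:
        left, right = right, left ^ _round_fn(right, key, mask)
    return (left << half_bits) | right


def feistel_decrypt(addr, keys, half_bits):
    mask = (1 << half_bits) - 1
    left, right = addr >> half_bits, addr & mask
    for key in reversed(keys):
        left, right = right ^ _round_fn(left, key, mask), left
    return (left << half_bits) | right


# Usage on a toy 8-bit address space with hypothetical keys.
matrix = random_invertible_matrix(8, random.Random(1))
keys = (0x3, 0xA, 0x5)
rib_addr = rib_map(0x2C, matrix)
feistel_addr = feistel_encrypt(0x2C, keys, half_bits=4)
```

Both mappings are one-to-one by construction: an invertible matrix gives a bijection on bit-vectors, and a Feistel network is invertible for any round function, so the output can feed the Start-Gap mapping without two lines colliding.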
Results for Randomized Start-Gap
[Figure: normalized endurance (%) for oltp, db1, db2, fft, stride, stress, and Gmean; bars: Baseline, StartGap, StartGap+RIB, StartGap+Feistel Nw]
Randomized Start-Gap achieves 97% of ideal lifetime while incurring a total storage overhead of 13 bytes.
Analytical Model for Randomized Start Gap
Lifetime from analytical model matches very well (97% vs. 96.8%)
We developed a simple analytical model that uses variance in write traffic across lines to compute norm. endurance (details in paper)
Comparison with Table Based Methods
Randomized Start-Gap achieves lifetime similar to the hardware-intensive version of table-based wear leveling and avoids several tens of cycles of latency overhead.
[Figure: normalized endurance (%) for four configurations: (1) Baseline, (2) TBWL-640MB (1 line per region), (3) TBWL-1.25MB (region=128KB), (4) RandSGap (13 bytes)]
Security Challenge in Lifetime Limited Memories
What if an adversary knows about the write endurance limit?
Repeat Address Attack (RAA): repeat writes to same line.
RAA can cause line failure in less than 1 minute
Time to 1 line failure (seconds)
= Endurance x (CyclesPerWrite / CyclesPerSec)
= 2^25 x (2^12 / 4 GHz)
= 32 seconds (taking 4 GHz as about 2^32 cycles per second)
Both baseline and randomized Start-Gap suffer from this attack. Even table based wear leveling (practical version) suffers.
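The time-to-failure arithmetic on this slide can be checked with a short sketch. The values 2^25 (endurance) and 2^12 (cycles per write) are read from the slide's calculation; treating 4 GHz as 2^32 cycles per second is an assumption inferred from the 32-second result. The function name is hypothetical.

```python
ENDURANCE = 2**25            # writes a line can endure
CYCLES_PER_WRITE = 2**12     # cycles between back-to-back writes to the same line
CYCLES_PER_SEC_POW2 = 2**32  # 4 GHz rounded to a power of two (assumption)
CYCLES_PER_SEC_EXACT = 4_000_000_000


def seconds_to_first_failure(endurance, cycles_per_write, cycles_per_sec):
    """Time for a Repeat Address Attack to wear out a single line."""
    return endurance * cycles_per_write / cycles_per_sec


rounded = seconds_to_first_failure(ENDURANCE, CYCLES_PER_WRITE, CYCLES_PER_SEC_POW2)
exact = seconds_to_first_failure(ENDURANCE, CYCLES_PER_WRITE, CYCLES_PER_SEC_EXACT)
```

With either clock value the result is well under one minute, which is the point of the slide.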
Security Aware Wear Leveling
Solution: Divide memory into regions. One Start-Gap per region.
Region size is chosen such that each line in a region is guaranteed to move once every “endurance” number of writes to the region:
NumLinesInRegion < Endurance / WritesPerGapMovement
We use 256K lines per region (256 regions). Area Overhead < 1.5KB
RAA now takes about 3-4 months to cause failure. With delayed writes (in paper), time to failure is on the order of years.
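The region-size condition can be checked numerically with the values used elsewhere in the talk. This sketch assumes an endurance of 2^25 writes (from the RAA calculation) and one gap movement every 100 writes (from the Start-Gap slide); whether exactly these values were used to pick 256K lines per region is an assumption. The function name is hypothetical.

```python
ENDURANCE = 2**25
WRITES_PER_GAP_MOVEMENT = 100
LINES_PER_REGION = 256 * 1024
NUM_REGIONS = 256


def region_size_is_safe(lines_per_region, endurance, writes_per_gap_movement):
    """True if NumLinesInRegion < Endurance / WritesPerGapMovement."""
    return lines_per_region < endurance / writes_per_gap_movement


safe = region_size_is_safe(LINES_PER_REGION, ENDURANCE, WRITES_PER_GAP_MOVEMENT)

# Writes a region absorbs while the gap sweeps past every location once;
# a line cannot stay in place for longer than this many region writes.
writes_per_gap_rotation = (LINES_PER_REGION + 1) * WRITES_PER_GAP_MOVEMENT

total_lines = LINES_PER_REGION * NUM_REGIONS
```

Under these assumptions the bound is roughly 335K lines, so 256K lines per region satisfies it with some margin.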
Summary
Limited endurance poses lifetime and security challenges
Table based wear leveling: area and latency overhead
Start-Gap: Cost-effective wear leveling with two registers
Randomized Start-Gap: 97% of ideal endurance with 13 bytes
We took a first step towards making PCM systems secure against malicious attacks (RAA). Motivation for more research
Advertisement
HPCA 2010 Tutorial
Phase Change Memory: A Systems Perspective
Organizers:
Dr. Moinuddin K Qureshi (IBM Research)
Prof. Sudhanva Gurumurthi (University Of Virginia)
Dr. Bipin Rajendran (IBM Research)
Date: Jan 9, 2010 (Half Day)
http://www.cs.virginia.edu/~gurumurthi/PCM_tutorial/
Backup Slides
Supporting DRAM PageMode with Start-Gap
Randomization must be done at a DRAM-Page granularity instead of line
Lifetime Under RAA attack
RAA will now take about 3-4 months to cause failure. With delayed writes (in paper), the time required would be on the order of years.
[Figure: time to failure (in seconds, log scale from 10 to 100,000,000) vs. number of lines in region (1024 to 2097152); annotated levels: 1 minute, 1 week, 4 months]
Spare Lines