
Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery

Yu Cai, Yixin Luo, Saugata Ghose, Erich F. Haratsch*, Ken Mai, Onur Mutlu

Carnegie Mellon University, *Seagate Technology

Executive Summary

• Read disturb errors limit flash memory lifetime today
– Apply a high pass-through voltage (Vpass) to multiple pages on a read
• We characterize read disturb on real NAND flash chips
– Slightly lowering Vpass greatly reduces read disturb errors
– Some flash cells are more prone to read disturb
• Technique 1: Mitigate read disturb errors online
– Vpass Tuning dynamically finds and applies a lowered Vpass
– Flash memory lifetime improves by 21%
• Technique 2: Recover after failure to prevent data loss
– Read Disturb Oriented Error Recovery (RDR) selectively corrects cells more susceptible to read disturb errors
– Reduces raw bit error rate (RBER) by up to 36%

Outline

•Background (Problem and Goal)

•Key Experimental Observations

•Mitigation: Vpass Tuning

•Recovery: Read Disturb Oriented Error Recovery

•Conclusion


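NAND Flash Memory Background

[Figure: a flash controller attached to flash memory organized into Block 0, Block 1, ..., Block N. Each block holds 256 pages: Pages 0 to 255 in Block 0, Pages 256 to 511 in Block 1, and Pages M to M+255 in Block N.]

[Figure: flash cell array for Block X. Each row is a page (Page Y) and each column connects to the sense amplifiers. On a read, the selected row is driven with "Read" and every other row with "Pass".]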

Flash Cell

[Figure: a floating gate transistor (flash cell) with gate, floating gate, drain, and source; example threshold voltage Vth = 2.5 V.]

Flash Read

[Figure: Vread = 2.5 V applied to the gate. A cell with Vth = 2 V reads as 1; a cell with Vth = 3 V reads as 0.]

Flash Pass-Through

[Figure: Vpass = 5 V applied to the gate. Cells with Vth = 2 V and with Vth = 3 V both conduct and are sensed as 1.]

Read from Flash Cell Array

Page    Operation      Cell Vth
Page 1  Pass (5.0 V)   3.0 V  3.8 V  3.9 V  4.8 V
Page 2  Read (2.5 V)   3.5 V  2.9 V  2.4 V  2.1 V
Page 3  Pass (5.0 V)   2.2 V  4.3 V  4.6 V  1.8 V
Page 4  Pass (5.0 V)   3.5 V  2.3 V  1.9 V  4.3 V

Correct values for page 2: 0 0 1 1
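The read and pass-through behaviour on the preceding slides can be sketched in a few lines of Python. This is a simplified model, not the authors' code: it assumes each column of the array is a series string that conducts only when every cell on it is turned on, and that a conducting string is sensed as 1. The name `read_page` and the list-of-rows layout are illustrative.

```python
# Threshold voltages (V) from the slide; rows are pages 1-4, columns are bitlines.
VTH = [
    [3.0, 3.8, 3.9, 4.8],  # page 1
    [3.5, 2.9, 2.4, 2.1],  # page 2
    [2.2, 4.3, 4.6, 1.8],  # page 3
    [3.5, 2.3, 1.9, 4.3],  # page 4
]

def read_page(vth, page, vread=2.5, vpass=5.0):
    """Return the bits sensed when reading `page` (0-based row index)."""
    bits = []
    for col in range(len(vth[0])):
        conducts = True
        for row in range(len(vth)):
            # The selected row gets Vread; every other row gets Vpass.
            gate = vread if row == page else vpass
            if vth[row][col] >= gate:
                conducts = False  # this cell stays off and blocks the string
        bits.append(1 if conducts else 0)
    return bits

print(read_page(VTH, page=1))  # reads page 2 of the slide
```

Under these assumptions, reading page 2 returns the slide's correct values, because every unread cell has a threshold voltage below 5.0 V.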

Read Disturb Problem: “Weak Programming” Effect

Repeatedly read page 3 (or any page other than page 2)

Page    Cell Vth
Page 1  3.0 V  3.8 V  3.9 V  4.8 V
Page 2  3.5 V  2.9 V  2.4 V  2.1 V
Page 3  2.2 V  4.3 V  4.6 V  1.8 V
Page 4  3.5 V  2.3 V  1.9 V  4.3 V

Vread = 2.5 V is applied to the page being read and Vpass = 5.0 V to the other three pages.

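Read Disturb Problem: “Weak Programming” Effect

High pass-through voltage induces “weak-programming” effect

Page    Cell Vth
Page 1  3.0 V  3.8 V  3.9 V          4.8 V
Page 2  3.5 V  2.9 V  2.4 V → 2.6 V  2.1 V
Page 3  2.2 V  4.3 V  4.6 V          1.8 V
Page 4  3.5 V  2.3 V  1.9 V          4.3 V

Incorrect values from page 2: the cell whose Vth rose from 2.4 V to 2.6 V is now above Vread = 2.5 V, so it reads as 0 instead of 1.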
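A minimal sketch of how this shift turns into a bit error, using only the page 2 values from the slide. It assumes a cell reads as 1 when its Vth is below Vread and ignores the unread cells, all of which stay below Vpass = 5.0 V in this example. The names `sense`, `page2_before`, and `page2_after` are illustrative.

```python
VREAD = 2.5  # read reference voltage (V) from the slide

def sense(vths, vread=VREAD):
    """A cell reads as 1 when its threshold voltage is below Vread, else 0."""
    return [1 if v < vread else 0 for v in vths]

page2_before = [3.5, 2.9, 2.4, 2.1]
page2_after = [3.5, 2.9, 2.6, 2.1]  # third cell weakly programmed by repeated Vpass

before = sense(page2_before)
after = sense(page2_after)
# Indices of cells whose sensed value changed because of the Vth shift.
flipped = [i for i, (b, a) in enumerate(zip(before, after)) if b != a]
print(before, after, flipped)
```

Only the cell that crossed Vread changes value; cells far from the read reference voltage tolerate the same shift without error.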

Goal: Mitigate and Recover Read Disturb Errors

Read disturb errors: Reading from one page can alter the values stored in other unread pages


Methodology

• FPGA-based flash memory testing platform [Cai+, FCCM ‘11]

• Real 20- to 24-nm MLC NAND flash chips

• 0 to 1M read disturbs

• 0 to 15K Program/Erase Cycles (PEC)

Read Disturb Effect on Vth Distribution

[Figure: PDF (×10^-3, 0 to 6) versus normalized threshold voltage (0 to 500) for the ER, P1, P2, and P3 states, after 0 (no read disturbs), 0.25M, 0.5M, and 1M read disturbs.]

Vth gradually increases with read disturb counts

Other Experimental Observations

•Lower threshold voltage states are affected more by read disturb

•Wear-out increases read disturb effect

Reducing The Pass-Through Voltage

Percentage of Vpass Reduction            0%   1%    2%    3%   4%    5%    6%
Normalized Tolerable Read Disturb Count  1    1.7   6.8   22   100   470   1300

Key Observation 1: Slightly lowering Vpass greatly reduces read disturb errors
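To see how quickly the tolerable read disturb count grows, the short sketch below computes the gain from each additional 1% of Vpass reduction. The numbers are the ones reported on the slide; the names `TOLERABLE` and `step_gains` are illustrative.

```python
# Normalized tolerable read disturb count per percent of Vpass reduction (from the slide).
TOLERABLE = {0: 1, 1: 1.7, 2: 6.8, 3: 22, 4: 100, 5: 470, 6: 1300}

def step_gains(table):
    """Multiplicative gain in tolerable read disturbs for each extra 1% of reduction."""
    pcts = sorted(table)
    return {p: table[p] / table[p - 1] for p in pcts[1:]}

gains = step_gains(TOLERABLE)
for pct, gain in gains.items():
    print(f"{pct - 1}% -> {pct}%: x{gain:.1f}")
```

Every step multiplies the tolerable count rather than adding to it, which is why a reduction of only a few percent already matters.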


Read Disturb Mitigation: Vpass Tuning

•Key Idea: Dynamically find and apply a lowered Vpass

•Trade-off for lowering Vpass

+Allows more read disturbs

– Induces more read errors

Read Errors Induced by Vpass Reduction

Reducing Vpass to 4.9V

Page    Operation      Cell Vth
Page 1  Pass (4.9 V)   3.0 V  3.8 V  3.9 V  4.8 V
Page 2  Read (2.5 V)   3.5 V  2.9 V  2.4 V  2.1 V
Page 3  Pass (4.9 V)   2.2 V  4.3 V  4.6 V  1.8 V
Page 4  Pass (4.9 V)   3.5 V  2.3 V  1.9 V  4.3 V

Values read from page 2: 0 0 1 1 (still correct)

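Read Errors Induced by Vpass Reduction

Reducing Vpass to 4.7V

Page    Operation      Cell Vth
Page 1  Pass (4.7 V)   3.0 V  3.8 V  3.9 V  4.8 V
Page 2  Read (2.5 V)   3.5 V  2.9 V  2.4 V  2.1 V
Page 3  Pass (4.7 V)   2.2 V  4.3 V  4.6 V  1.8 V
Page 4  Pass (4.7 V)   3.5 V  2.3 V  1.9 V  4.3 V

Incorrect values from page 2: 0 0 1 0 (the 4.8 V cell in page 1 no longer conducts at Vpass = 4.7 V, so the last cell of page 2 reads as 0 instead of 1)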

Utilizing the Unused ECC Capability

[Figure: RBER (×10^-3, 0 to 1.0) versus N-day retention (days 1 to 21), with the ECC correction capability marked. The gap between the RBER curve and the ECC correction capability is the unused ECC capability.]

1. Huge unused ECC correction capability can be used to tolerate read errors

2. Unused ECC capability decreases over time

Dynamically adjust Vpass so that read errors fully utilize the unused ECC capability

Vpass Reduction Trade-Off Summary

• Conservatively set Vpass to a high voltage
– Accumulates more read disturb errors at the end of each refresh interval
+ No read errors
• Dynamically adjust Vpass to unused ECC capability
+ Minimize read disturb errors
o Control read errors to be tolerable by ECC
o If read errors exceed ECC capability, read again with a higher Vpass to correct read errors
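The fallback in the last bullet can be sketched as a simple retry. This is an illustration under stated assumptions, not the authors' controller logic: `read_fn` is a hypothetical device hook that returns the data together with the number of bit errors, and `toy_read` is a toy stand-in for the 4-cell slide example, in which the last cell of page 2 is misread once Vpass drops below the 4.8 V cell in page 1.

```python
def read_with_fallback(read_fn, ecc_limit, v_tuned, v_default):
    """Read at the tuned Vpass; if ECC cannot correct the result, retry at the default Vpass.

    read_fn(vpass) -> (data, error_count) is a hypothetical device hook.
    """
    data, errors = read_fn(v_tuned)
    if errors > ecc_limit:
        data, errors = read_fn(v_default)
    return data, errors

CORRECT = [0, 0, 1, 1]  # correct values for page 2 in the slide example

def toy_read(vpass):
    # Toy model (assumption): the last bit is blocked unless Vpass exceeds 4.8 V.
    data = [0, 0, 1, 1 if vpass > 4.8 else 0]
    errors = sum(d != c for d, c in zip(data, CORRECT))
    return data, errors

print(read_with_fallback(toy_read, ecc_limit=0, v_tuned=4.7, v_default=5.0))
print(read_with_fallback(toy_read, ecc_limit=1, v_tuned=4.7, v_default=5.0))
```

With no spare correction capability the read is retried at the default Vpass and returns correct data; with one correctable bit available, the single induced error is left for ECC to fix.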


Vpass Tuning Steps

• Perform once for each block every day:
1. Estimate unused ECC capability
2. Aggressively reduce Vpass until read errors exceed ECC capability
3. Gradually increase Vpass until read errors just fall below ECC capability
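The three steps above can be sketched as a coarse-then-fine search. This is a minimal illustration, not the authors' implementation: the step sizes, the millivolt units, the `count_errors` measurement hook, and the toy error model are all assumptions, and an error count equal to the budget is treated as tolerable.

```python
def unused_ecc_capability(ecc_limit, observed_errors):
    """Step 1: errors ECC can still absorb beyond those already present."""
    return max(0, ecc_limit - observed_errors)

def tune_vpass(count_errors, budget, v_default, coarse=200, fine=50, v_min=0):
    """Return a lowered Vpass (mV) whose induced read errors fit in `budget`."""
    vpass = v_default
    # Step 2: lower Vpass in coarse steps until the induced errors exceed the budget.
    while vpass - coarse >= v_min and count_errors(vpass) <= budget:
        vpass -= coarse
    # Step 3: raise Vpass in fine steps until the errors are tolerable again.
    while vpass < v_default and count_errors(vpass) > budget:
        vpass = min(v_default, vpass + fine)
    return vpass

# Toy error model (assumption): one error per unread cell that Vpass fails to turn on.
PASS_CELL_VTH_MV = [3000, 3800, 3900, 4800, 2200, 4300, 4600, 1800, 3500, 2300, 1900, 4300]

def toy_errors(vpass_mv):
    return sum(1 for vth in PASS_CELL_VTH_MV if vth >= vpass_mv)

budget = unused_ecc_capability(ecc_limit=3, observed_errors=2)
tuned = tune_vpass(toy_errors, budget, v_default=5000)
print(budget, tuned, toy_errors(tuned))
```

Integer millivolts keep the step arithmetic exact. A real controller would measure errors on the block's worst-case page rather than use a closed-form model.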


Evaluation of Vpass Tuning

•19 real workload I/O traces

•Assume 7-day refresh period

•Similar methodology as before to determine acceptable Vpass reduction

•Overhead for a 512 GB flash drive:

–128 KB storage overhead for per-block Vpass setting and worst-case page

–24.34 sec/day average Vpass Tuning overhead

Vpass Tuning Lifetime Improvements

[Figure: P/E cycle lifetime (0 to 12,000) of Baseline versus Vpass Tuning for each workload trace: homes, web-vm, mail, mds, rsrch, prn, web, stg, ts, proj, src, wdev, usr, postmark, hm, cello99, websearch, financial, prxy.]

Average lifetime improvement: 21.0%


Read Disturb Resistance

[Figure: PDF versus normalized Vth for a disturb-resistant cell (R) and a disturb-prone cell (P), before and after N read disturbs.]

Observation 2: Some Flash Cells Are More Prone to Read Disturb

[Figure: PDF versus normalized Vth for the ER and P1 states, with disturb-prone (P) and disturb-resistant (R) cells marked.]

After 250K read disturbs:
• Disturb-prone cells have higher threshold voltages
• Disturb-resistant cells have lower threshold voltages

Disturb-prone → ER state
Disturb-resistant → P1 state

Read Disturb Oriented Error Recovery (RDR)

• Triggered by an uncorrectable flash error
– Back up all valid data in the faulty block
– Disturb the faulty page an additional 100K times
– Compare the Vth values before and after read disturb
– Select cells susceptible to flash errors (Vref − σ < Vth < Vref + σ)
– Predict among these susceptible cells
  • Cells with more Vth shift are disturb-prone → Higher Vth state
  • Cells with less Vth shift are disturb-resistant → Lower Vth state
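The selection and comparison steps can be sketched as follows. This is an illustration under assumptions rather than the authors' algorithm: the slides do not say how much shift separates "more" from "less", so `shift_threshold` is a hypothetical parameter, the susceptibility window is applied to the Vth measured before the extra read disturbs, and the example values are made up. The sketch only labels cells; assigning a state to each label is left to the rule on the slide.

```python
def classify_susceptible(vth_before, vth_after, vref, sigma, shift_threshold):
    """Label cells near Vref as disturb-prone or disturb-resistant.

    Only cells with vref - sigma < Vth < vref + sigma (measured before the
    extra read disturbs) are considered susceptible and get a label.
    """
    labels = {}
    for i, (before, after) in enumerate(zip(vth_before, vth_after)):
        if vref - sigma < before < vref + sigma:
            shift = after - before
            if shift > shift_threshold:
                labels[i] = "disturb-prone"
            else:
                labels[i] = "disturb-resistant"
    return labels

# Made-up normalized Vth values for five cells, before and after the extra disturbs.
before = [80, 97, 102, 104, 120]
after = [83, 103, 103, 109, 121]
labels = classify_susceptible(before, after, vref=100, sigma=5, shift_threshold=3)
print(labels)
```

Cells 0 and 4 fall outside the window around Vref and are left to normal ECC; only the three cells near the boundary are classified.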

RDR Evaluation

[Figure: RBER (×10^-3, 0 to 12) versus read disturb count (0 to 1M), comparing No Recovery with RDR.]

Reduce total error counts up to 36% @ 1M read disturbs
ECC can be used to correct the remaining errors


Read Disturb Induced RBER Increases Faster with Higher PEC

[Figure: raw bit error rate (RBER, ×10^-3, 0 to 4.0) versus read disturb count (0 to 100K), one line per P/E cycle count. Lines for higher PEC rise faster; lines for lower PEC rise more slowly.]

PEC   Slope
15K   1.90×10^-8
10K   9.10×10^-9
8K    7.50×10^-9
5K    3.74×10^-9
4K    2.37×10^-9
3K    1.63×10^-9
2K    1.00×10^-9
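As a worked example of what these slopes imply, the sketch below multiplies a slope by a read disturb count. It assumes the slope is the RBER added per read disturb and that the growth is linear over the plotted range (0 to 100K reads); the names `SLOPE_BY_PEC` and `rber_increase` are illustrative.

```python
# Slope of RBER versus read disturb count at each P/E cycle count (from the slide).
SLOPE_BY_PEC = {
    15_000: 1.90e-8,
    10_000: 9.10e-9,
    8_000: 7.50e-9,
    5_000: 3.74e-9,
    4_000: 2.37e-9,
    3_000: 1.63e-9,
    2_000: 1.00e-9,
}

def rber_increase(pec, read_disturbs):
    """RBER added by `read_disturbs` reads under a linear model (tabulated PEC values only)."""
    return SLOPE_BY_PEC[pec] * read_disturbs

added = rber_increase(15_000, 100_000)               # roughly 1.9e-3
ratio = SLOPE_BY_PEC[15_000] / SLOPE_BY_PEC[2_000]   # roughly 19x
print(added, ratio)
```

Under this linear reading, 100K read disturbs on a block worn to 15K P/E cycles add about 19 times as much RBER as on a block at 2K P/E cycles.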

Threshold Voltage Increases with Read Disturb Count

[Figure, two panels: normalized Vth mean (axis 183 to 190) and normalized Vth standard deviation (axis 15 to 27), each versus read disturb count (0 to 1 million).]

Showing results for P1 state @ 8K PEC, other states have similar trends

Lower Voltage States Are More Prone to Read Disturb

[Figure, two panels (ER State, P1 State): normalized Vth mean versus read disturb count (0 to 1 million); the two vertical axes span 170 to 200 and 25 to 55.]

Reducing Vpass Increases Tolerable Read Disturb Count

[Figure: RBER (×10^-3, 0.4 to 1.6) versus read disturb count (10^4 to 10^9), one curve each for 94% Vpass through 100% Vpass.]

Pct. Vpass Value    100%  99%   98%   97%  96%   95%   94%
Rd. Disturb. Cnt.   1x    1.7x  6.8x  22x  100x  470x  1300x

Pass-Through Voltage Reduction Induced Read Error

[Figure: additional RBER due to reduced Vpass (×10^-3, 0 to 1.0) versus relaxed Vpass (480 to 510), for retention ages of 0, 1, 2, 6, 9, 17, and 21 days.]

Read Errors Induced by Vpass Reduction

•Will generate a read error only if:

–Max(Vth) > Vpass

–Correct read value is 1

•These errors do not affect lifetime

–can usually be tolerated by the unused ECC capability

•These errors are temporary

–can be corrected (if necessary) by reading with the default Vpass
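The two conditions above can be written as a predicate on a single bitline. This is a sketch under one assumption that the slide does not spell out: Max(Vth) is taken over the unread cells on the same bitline as the cell being read. The function name and the argument layout are illustrative, and the example columns come from the 4×4 array used earlier in the talk.

```python
def vpass_read_error(column_vths, selected, vread, vpass):
    """True if a lowered Vpass makes this bitline read 0 although the correct value is 1.

    column_vths: threshold voltages (V) of every cell on one bitline
    selected:    index of the cell being read
    """
    correct_is_one = column_vths[selected] < vread
    others = [v for i, v in enumerate(column_vths) if i != selected]
    blocked = max(others) > vpass  # some unread cell fails to turn on
    return correct_is_one and blocked

last_column = [4.8, 2.1, 1.8, 4.3]   # pages 1-4, last bitline
first_column = [3.0, 3.5, 2.2, 3.5]  # pages 1-4, first bitline

print(vpass_read_error(last_column, 1, vread=2.5, vpass=4.9))
print(vpass_read_error(last_column, 1, vread=2.5, vpass=4.7))
print(vpass_read_error(first_column, 1, vread=2.5, vpass=3.2))
```

The last call shows why the second condition matters: a cell whose correct value is already 0 reads as 0 whether or not the string is blocked, so a blocked string adds no error there.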

Illustration of Vpass Tuning Results

[Figure omitted.]

Some Flash Cells Are More Prone to Read Disturb

Predict to be ER state
- Area III is correct
- Area IV is 50/50

Predict to be P1 state
- Area I is correct
- Area II is 50/50

Showing ∆Vth with 8K PEC from 250K to 350K read disturbs