Copyright © 2010, Hitachi Global Storage Technologies, All rights reserved.
Indirection Systems for Shingled-Recording Disk Drives
Yuval Cassuto
with co-authors:
M. Sanvido, C. Guyot, D. Hall and Z. Bandic
MSST 2010, May 7
Shingled Recording
[Figure: shingled track layout, Track 1 to Track 5]
Shingled Recording – No Random Write
[Figure: Track 1 to Track 5, with labels "Write" and "Erase"]
Shingled Recording Magnetics
[Figure: head field contours and track layout for shingled recording; labels: head motion, corner head, progressive scans 1 to 4, down-track and cross-track directions]
Shingled Recording Tradeoff
• More: Capacity
• Less: Functionality → Performance
The Optimization Envelope
[Figure: observed performance versus excess capacity; labels: non-shingled, heads/media, shingled drive]
From Magnetics to Systems
[Figure: track interference translated to block interference; labels: write, Ib, invalid, PBA 1 to PBA M]
Drive Based Shingled Recording Indirection
[Figure: the host issues LBA reads/writes (unrestricted); the indirection layer in the HDD translates them to PBA reads/writes (with shingled-recording constraints)]
Drive vs. Host Indirection
Why on the drive?
• Transparent to Host
• Complete knowledge of physical layout
Why on the host?
• “Shingle aware” access and allocation
• System specific performance optimization
Core Concepts
• Append/Read-Modify-Write with Shingled Regions
[Figure: Region 1 to Region K separated by guard bands; regions are independent of each other, and writing within a region is pure sequential]
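A minimal sketch of the append and read-modify-write idea for one shingled region, assuming a region can be modelled as a list of sectors that is only ever written in order. The class `ShingledRegion`, its methods, and the choice to rewrite the whole region on an update are illustrative assumptions, not taken from the slides.

```python
class ShingledRegion:
    """Toy model of one shingled region: sectors are only written in order."""

    def __init__(self, size):
        self.size = size          # number of sectors in the region
        self.sectors = []         # data written so far, in physical order
        self.sector_writes = 0    # counts physical sector writes

    def append(self, data):
        """Sequential write at the end of the region."""
        if len(self.sectors) >= self.size:
            raise ValueError("region full")
        self.sectors.append(data)
        self.sector_writes += 1

    def update(self, index, data):
        """In-place update done as a read-modify-write of the whole region."""
        image = list(self.sectors)   # read
        image[index] = data          # modify
        self.sectors = []            # rewrite sequentially from the start
        for sector in image:
            self.append(sector)


region = ShingledRegion(size=8)
for i in range(8):
    region.append(f"d{i}")
region.update(2, "new")
print(region.sector_writes)  # 16: 8 appends plus 8 sectors rewritten
```

Because regions are independent, the cost of one update in this sketch is bounded by the region size rather than by the size of the disk.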
General Method for LBA↔PBA Mapping
[Figure: LBA to PBA mapping; permanent storage uses a fixed mapping, temporary storage (recent writes) uses a variable mapping]
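The two-level lookup can be sketched as follows, assuming the fixed mapping is a simple offset and the variable mapping is a dictionary of recent writes. The class `TwoTierMap`, its parameters, and the PBA values are hypothetical and only illustrate the lookup order.

```python
class TwoTierMap:
    """Toy LBA-to-PBA lookup: a variable map for recent writes over a fixed map."""

    def __init__(self, permanent_base, temporary_base):
        self.permanent_base = permanent_base  # assumed start PBA of permanent storage
        self.next_temp_pba = temporary_base   # next free PBA in temporary storage
        self.recent = {}                      # variable mapping: LBA -> PBA

    def write(self, lba):
        """Append the write to temporary storage and record where it landed."""
        pba = self.next_temp_pba
        self.next_temp_pba += 1
        self.recent[lba] = pba
        return pba

    def lookup(self, lba):
        """Recent writes win; otherwise fall back to the fixed mapping."""
        if lba in self.recent:
            return self.recent[lba]
        return self.permanent_base + lba


mapping = TwoTierMap(permanent_base=0, temporary_base=1_000_000)
before = mapping.lookup(42)   # resolved by the fixed mapping
mapping.write(42)
after = mapping.lookup(42)    # resolved by the variable mapping
print(before, after)          # 42 1000000
```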
Evaluation Platform
[Figure: evaluation platform. Inputs: R/W traces and synthetic R/W loads (random/seq.). Indirection module: LBA↔PBA algorithms and data structures, coupled to a drive model. Outputs: simulated performance (IOPS, MB/s) and internal statistics (write overload)]
Set-Associative Disk Cache Architecture
• Disk is partitioned into LBA shingled regions and cache shingled regions
• Each LBA region (shingle) is associated with one cache region (shinglet)
• Multiple LBA regions are associated with each cache region (set-associative cache)
[Figure: HDD PBA space from PBA 0 to PBA M, divided into a permanent area (Shingle 0 to Shingle 7) and a cache area (Shinglet 0 and Shinglet 1); groups of shingles are labeled with their associated cache]
Set-Associative Disk Cache: Write Path
[Figure: a cache region and the LBA regions associated to that cache]
Write to Disk Cache
[Figure: a cache region and its LBA regions; the cache region is marked FULL]
Garbage Collection
• Read cache
• For each LBA region present in cache: RMW* full region
• Full cache available again
* RMW = Read-Modify-Write
(Read + Write) × #Present, where #Present is the number of LBA regions present in cache
[Figure: a cache region and the LBA regions present in cache]
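The write path and garbage collection of the set-associative disk cache can be sketched as below. The modulo rule that associates a shingle with a shinglet, the class `SetAssociativeDiskCache`, and the entry-count capacity are assumptions made for illustration; the slides do not specify them.

```python
class SetAssociativeDiskCache:
    """Toy model: each shingle (LBA region) maps to one shinglet (cache region)."""

    def __init__(self, num_shinglets, shinglet_capacity):
        self.num_shinglets = num_shinglets
        self.capacity = shinglet_capacity            # entries per shinglet
        self.shinglets = [[] for _ in range(num_shinglets)]
        self.region_rmw_count = 0                    # full-region RMWs performed

    def shinglet_of(self, shingle):
        # Assumed association rule; any fixed many-to-one rule would do.
        return shingle % self.num_shinglets

    def write(self, shingle, lba):
        """Append the write to the associated shinglet, collecting first if full."""
        index = self.shinglet_of(shingle)
        if len(self.shinglets[index]) >= self.capacity:
            self.garbage_collect(index)
        self.shinglets[index].append((shingle, lba))

    def garbage_collect(self, index):
        """Read the cache, RMW every shingle present in it, then empty it."""
        present = {shingle for shingle, _ in self.shinglets[index]}
        self.region_rmw_count += len(present)        # (Read + Write) x #Present
        self.shinglets[index] = []
        return len(present)


cache = SetAssociativeDiskCache(num_shinglets=2, shinglet_capacity=3)
for shingle, lba in [(0, 10), (2, 11), (0, 12), (4, 13)]:
    cache.write(shingle, lba)
print(cache.region_rmw_count)  # 2: shingles 0 and 2 were present when the cache filled
```

In this sketch the fourth write stalls behind two full-region rewrites, which is the kind of pause that a long garbage collection introduces.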
Results from PC Traces
[Figure: write overload and slowdown versus cache size [%] (1 to 10); both vertical axes run from 0 to 15]
4K Random IOPS
[Figure: IOPS (0 to 1000) versus time [sec] (0 to 9000)]
Effect of Region Size
[Figure: two panels of IOPS versus time [sec] (0 to 4000), one for region size 50000 and one for region size 10000]
Set-Associative Disk Cache: Score Sheet
• Significant improvement over plain Read-Modify-Write
• Simple to implement
• Large dips in performance due to long garbage collections
• Region updated in place → consistency issues
The S-Blocks Architecture
• Temporary (red) and permanent (blue) storage managed as ring buffers
• S-Block: intermediate unit between sector and region
[Figure: S-Blocks and E-cache]
Ring Buffer
[Figure: ring buffer with head and tail pointers and a head-tail guard band; legend: V = used, valid; X = used, invalid; blank = unused]
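A minimal ring-buffer sketch with the three slot states from the legend. It assumes that new entries are added at the head, that defragmentation reclaims slots from the tail and re-appends any valid entry it finds there, and that the guard band is a fixed number of slots. The class `RingBuffer` and its method names are hypothetical.

```python
class RingBuffer:
    """Toy ring buffer: slots are unused (None), used-valid, or used-invalid."""

    def __init__(self, size, guard=1):
        self.slots = [None] * size   # None = unused, else [item, valid_flag]
        self.head = 0                # next slot to write
        self.tail = 0                # oldest used slot
        self.used = 0
        self.guard = guard           # slots kept free between head and tail
        self.copies = 0              # valid entries copied during defrag

    def free_slots(self):
        return len(self.slots) - self.used - self.guard

    def add(self, item):
        """Write a new valid entry at the head."""
        if self.free_slots() <= 0:
            raise BufferError("ring buffer full")
        self.slots[self.head] = [item, True]
        self.head = (self.head + 1) % len(self.slots)
        self.used += 1

    def invalidate(self, item):
        """Mark the entry holding `item` as invalid; the slot stays used."""
        for slot in self.slots:
            if slot is not None and slot[1] and slot[0] == item:
                slot[1] = False

    def defrag(self, count):
        """Advance the tail over `count` used slots, copying valid items to the head."""
        for _ in range(min(count, self.used)):
            item, valid = self.slots[self.tail]
            self.slots[self.tail] = None
            self.tail = (self.tail + 1) % len(self.slots)
            self.used -= 1
            if valid:
                self.add(item)
                self.copies += 1


rb = RingBuffer(size=8, guard=1)
for item in range(1, 7):
    rb.add(item)
rb.invalidate(1)
rb.invalidate(3)
rb.defrag(3)
print(rb.free_slots(), rb.copies)  # 3 1
```

Only the valid entry (item 2) had to be copied; the two invalid slots were reclaimed for free, which is why the number of valid entries near the tail determines the cost of a defrag.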
Initial Conditions
[Figure: initial conditions; labels: PI, LI, S-Blocks, E-cache, over-provisioning, unit = sector, unit = S-block; items numbered 1 to 3]
Write Operation
• If full S-Block: write directly in S-Block buffer
• If smaller than full S-Block: write in E-cache
[Figure: PI, LI, S-Block buffer and E-cache, each buffer shown with head and tail pointers]
E-cache Full
• If no room for incoming write: reclaim invalid exceptions in E-cache (defrag)
[Figure: PI, LI, S-Block buffer and E-cache with head and tail pointers; invalid entries marked x]
Defrag(E-cache)
• If not enough invalid exceptions:
– Choose S-Block
– Destage(S-Block)
[Figure: PI, LI, S-Block buffer and E-cache with head and tail pointers]
Destage(S-Block)
S-Block 1 → sectors 1-10
[Figure: S-Block buffer and E-cache shown as ring buffers with head and tail pointers, holding numbered entries]
Destage(S-Block)
S-Block writes in E-cache are not needed after destage
[Figure: S-Block buffer and E-cache shown as ring buffers with head and tail pointers, holding numbered entries]
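The write, defrag, and destage steps can be tied together in a small sketch. It makes several assumptions that the slides do not state: S-Blocks hold 10 sectors and are numbered from 0, the E-cache is a bounded list of sector exceptions, "not enough invalid exceptions" is simplified to "none at all", and the S-Block with the most valid exceptions is the one destaged. The class `SBlockDrive` and all of its names are hypothetical.

```python
from collections import Counter

SBLOCK_SECTORS = 10  # assumed S-Block size in sectors


class SBlockDrive:
    """Toy model of the S-Block write path with an E-cache of exceptions."""

    def __init__(self, ecache_capacity):
        if ecache_capacity < 1:
            raise ValueError("E-cache needs at least one slot")
        self.ecache_capacity = ecache_capacity
        self.ecache = []          # [lba, valid] exceptions, oldest first
        self.sblock_writes = 0    # S-Blocks appended to the S-Block buffer

    @staticmethod
    def sblock_of(lba):
        return lba // SBLOCK_SECTORS

    def write(self, start_lba, length):
        # Full, aligned S-Block: write directly in the S-Block buffer.
        if length == SBLOCK_SECTORS and start_lba % SBLOCK_SECTORS == 0:
            self._append_sblock(self.sblock_of(start_lba))
            return
        # Anything else is stored sector by sector as exceptions in the E-cache.
        for lba in range(start_lba, start_lba + length):
            for entry in self.ecache:
                if entry[0] == lba:
                    entry[1] = False      # an older copy of this sector is stale
            if len(self.ecache) >= self.ecache_capacity:
                self.defrag()
            self.ecache.append([lba, True])

    def defrag(self):
        """Reclaim invalid exceptions; destage one S-Block if there are none."""
        if all(valid for _, valid in self.ecache):
            self.destage(self.choose_sblock())
        self.ecache = [entry for entry in self.ecache if entry[1]]

    def choose_sblock(self):
        """Assumed policy: the S-Block with the most valid exceptions."""
        counts = Counter(self.sblock_of(lba) for lba, valid in self.ecache if valid)
        return counts.most_common(1)[0][0]

    def destage(self, sblock):
        self._append_sblock(sblock)

    def _append_sblock(self, sblock):
        """Append a fresh copy of the S-Block; its exceptions are no longer needed."""
        self.sblock_writes += 1
        for entry in self.ecache:
            if self.sblock_of(entry[0]) == sblock:
                entry[1] = False


drive = SBlockDrive(ecache_capacity=4)
drive.write(0, 10)                 # full S-Block, written directly
for lba in (3, 5, 12, 7):
    drive.write(lba, 1)            # small writes become exceptions
drive.write(25, 1)                 # E-cache full: S-Block 0 is destaged
print(drive.sblock_writes, [lba for lba, _ in drive.ecache])  # 2 [12, 25]
```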
4K Random IOPS
[Figure: IOPS (0 to 450) versus time [sec] (0 to 1200)]
Choosing S-Block to Destage
Optimal Destage
• Choose S-Block with highest exception count
• Best amortization of S-Block rewrite
• Good for biased workloads (e.g. hotspots)
Optimal Defrag
• Choose S-Block closest to the tail
• Least amount of copying in S-Block defrag
• Good for random workloads
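The two selection rules can be written as two small functions. This sketch assumes the drive tracks an exception count per S-Block and the order of S-Blocks in the ring buffer starting from the tail; the function names and the sample data are hypothetical.

```python
def choose_optimal_destage(exception_counts):
    """Pick the S-Block with the highest exception count."""
    return max(exception_counts, key=exception_counts.get)


def choose_optimal_defrag(buffer_order):
    """Pick the S-Block closest to the tail (first in tail-to-head order)."""
    return buffer_order[0]


exception_counts = {"A": 3, "B": 12, "C": 7}   # exceptions held per S-Block
buffer_order = ["C", "A", "B"]                 # S-Blocks listed from tail to head

print(choose_optimal_destage(exception_counts))  # B
print(choose_optimal_defrag(buffer_order))       # C
```

The first rule frees the most E-cache space per S-Block rewrite; the second keeps the S-Block buffer cheap to defragment because nothing valid sits between the tail and the chosen S-Block.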
Comparison for Random Workload
Optimal Destage
• S-Block destage average # invalidated exceptions: 55.26
• S-Block defrag average copy count: 1744.64
• Average IOPS: 73.44
Optimal Defrag
• S-Block destage average # invalidated exceptions: 12.52
• S-Block defrag average copy count: 0.00
• Average IOPS: 121.54
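As a quick worked check of the reported averages, the ratio of the two IOPS figures can be computed directly. The variable names are illustrative; the numbers are the ones listed for the random workload.

```python
# Average IOPS reported for the random workload.
iops_optimal_destage = 73.44
iops_optimal_defrag = 121.54

speedup = iops_optimal_defrag / iops_optimal_destage
print(f"Optimal Defrag / Optimal Destage = {speedup:.2f}x")  # 1.65x
```

For this random workload, choosing the S-Block closest to the tail gives roughly 1.65 times the average IOPS of choosing by exception count.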
Effect of S-Block Choice on Performance
[Figure: two panels, one for Optimal Destage and one for Optimal Defrag, showing # commands (0 to 1200) versus time [sec]; time bins run from 0.000 to 9.000 and are grouped as milli, hundredths, tenths, and seconds]
Lazy Garbage Collection
[Figure: IOPS (0 to 450) versus time [sec] (0 to 1200)]
Constant Garbage Collection
[Figure: IOPS (0 to 400) versus time [sec] (0 to 600)]
S-Blocks: Score Sheet
• Good sustained random write performance
• Append update → no consistency issues
• Flexible to handle different workload types
• Good sequential performance with direct S-block writes
• Non-trivial implementation
Summary
• Shingled magnetic recording opens a rich area of systems research
• Good understanding of main issues and tradeoffs
• Proposed architectures likely basis for real implementations
• Significant research needed in performance optimization
Add to Ring Buffer
[Figure: ring buffer with head and tail pointers; a new valid entry (V) is added; legend: V = used, valid; X = used, invalid; blank = unused]
Invalidate in Ring Buffer
[Figure: ring buffer with head and tail pointers; one valid entry (V) becomes invalid (X); legend: V = used, valid; X = used, invalid; blank = unused]
Defrag Ring Buffer
[Figure, shown in three steps: defragmenting a ring buffer with head and tail pointers; legend: # = used, valid (numbered entries 1 to 20); X = used, invalid; blank = unused]