Bridging the Information Gap in Storage Protocol Stacks
Timothy E. Denehy, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau
University of Wisconsin, Madison

Transcript
Page 1: Bridging the Information Gap in Storage Protocol Stacks

Bridging the Information Gap in Storage Protocol Stacks

Timothy E. Denehy, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau

University of Wisconsin, Madison

Page 2: Bridging the Information Gap in Storage Protocol Stacks

State of Affairs

[Figure: the file system (namespace, files, metadata, layout, free space) sits above the storage system (parallelism, redundancy), connected only by a narrow block-based read/write interface.]

Page 3: Bridging the Information Gap in Storage Protocol Stacks

Problem

• Information gap may cause problems
  – Poor performance
    • Partial stripe write operations
  – Duplicated functionality
    • Logging in file system and storage system
  – Reduced functionality
    • Storage system lacks knowledge of files
• Time to re-examine the division of labor

Page 4: Bridging the Information Gap in Storage Protocol Stacks

Our Approach

• Enhance the storage interface
  – Expose performance and failure information
• Use information to provide new functionality
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy

[Figure: the Informed LFS (I·LFS) file system running above the Exposed RAID (ERAID) storage layer.]

Page 5: Bridging the Information Gap in Storage Protocol Stacks

Outline

• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion

Page 6: Bridging the Information Gap in Storage Protocol Stacks

ERAID Goals

• Backwards compatibility
  – Block-based interface
  – Linear, concatenated address space
• Expose information to the file system above
  – Regions
  – Performance
  – Failure
• Allow the file system to utilize semantic knowledge

Page 7: Bridging the Information Gap in Storage Protocol Stacks

ERAID Regions

• Region
  – Contiguous portion of the address space
• Regions can be added to expand the address space
• Region composition
  – RAID: one region for all disks
  – Exposed: separate regions for each disk
  – Hybrid
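The concatenated address space described above can be pictured with a small sketch (hypothetical names; the real ERAID is an in-kernel block layer, not a Python class). Each region covers a contiguous range of the linear address space, and adding a region grows the space at the end.

```python
class Region:
    """A contiguous portion of the address space: one disk (exposed)
    or a set of disks behind a RAID layout."""
    def __init__(self, name, num_blocks):
        self.name = name
        self.num_blocks = num_blocks

class ERAID:
    def __init__(self):
        self.regions = []  # ordered: concatenation defines the address space

    def add_region(self, region):
        """Adding a region expands the linear address space at the end."""
        self.regions.append(region)

    def total_blocks(self):
        return sum(r.num_blocks for r in self.regions)

    def lookup(self, block):
        """Map a linear block number to (region, offset within region)."""
        for r in self.regions:
            if block < r.num_blocks:
                return r, block
            block -= r.num_blocks
        raise ValueError("block beyond end of address space")
```

Because the address space stays linear and block-based, a legacy file system can use it unchanged, while an informed file system can additionally ask which region a block falls in.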

Page 8: Bridging the Information Gap in Storage Protocol Stacks

ERAID Performance Information

• Exposed on a per-region basis
• Queue length and throughput
• Reveals
  – Static disk heterogeneity
  – Dynamic performance and load fluctuations

Page 9: Bridging the Information Gap in Storage Protocol Stacks

ERAID Failure Information

• Exposed on a per-region basis
• Number of tolerable failures
• Reveals
  – Static differences in failure characteristics
  – Dynamic failures to the file system above

Page 10: Bridging the Information Gap in Storage Protocol Stacks

Outline

• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion

Page 11: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Overview

• Log-structured file system
  – Transforms all writes into large sequential writes
  – All data and metadata are written to a log
  – The log is a collection of segments
  – A segment table describes each segment
  – A cleaner process produces empty segments
• Why use LFS for an informed file system?
  – Write-anywhere design provides flexibility
  – Ideas applicable to other file systems
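The log-structured idea above can be reduced to a toy sketch (hypothetical names and a tiny segment size, purely for illustration): every write, data or metadata alike, is appended to the current segment, and when a segment fills, writing moves on to the next empty segment.

```python
SEG_SIZE = 8  # blocks per segment (unrealistically small, for illustration)

class LFS:
    def __init__(self, num_segments):
        self.segments = [[] for _ in range(num_segments)]  # the log
        self.current = 0  # segment currently being filled

    def append(self, block):
        """Append one block (data or metadata) to the log; return its
        address as (segment number, offset within segment)."""
        if len(self.segments[self.current]) == SEG_SIZE:
            # in a real LFS the cleaner keeps a supply of empty segments
            self.current += 1
        self.segments[self.current].append(block)
        return (self.current, len(self.segments[self.current]) - 1)
```

The write-anywhere property matters here: since any empty segment is an acceptable target for the next write, an informed file system is free to pick segments by region, which the later slides exploit.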

Page 12: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Overview

• Goals
  – Improve performance, functionality, and manageability
  – Minimize system complexity
• Exploits ERAID information to provide
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy

Page 13: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Experimental Platform

• NetBSD 1.5
• 1 GHz Intel Pentium III Xeon
• 128 MB RAM
• Four fast disks
  – Seagate Cheetah 36XL, 21.6 MB/s
• Four slow disks
  – Seagate Barracuda 4XL, 7.5 MB/s

Page 14: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Baseline Performance

• Four slow disks: 30 MB/s
• Four fast disks: 80 MB/s

Page 15: Bridging the Information Gap in Storage Protocol Stacks

Outline

• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion

Page 16: Bridging the Information Gap in Storage Protocol Stacks

I·LFS On-line Expansion

• Goal: expand storage incrementally
  – Capacity
  – Performance
• Ideal: instant disk addition
  – Minimize downtime
  – Simplify administration
• I·LFS supports on-line addition of new disks

Page 17: Bridging the Information Gap in Storage Protocol Stacks

I·LFS On-line Expansion Details

• ERAID: expandable address space
• Expansion is equivalent to adding empty segments
• Start with an oversized segment table
• Activate the new portion of the segment table
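A sketch of the oversized-segment-table trick (hypothetical names and states, not the NetBSD on-disk format): the table is allocated larger than needed at file-system creation, with the trailing entries marked unavailable; adding a disk simply flips entries to empty, so expansion is nearly instant.

```python
UNAVAILABLE, EMPTY, IN_USE = "unavailable", "empty", "in_use"

class SegmentTable:
    def __init__(self, max_segments, initial_segments):
        # oversized at creation: only the initial portion is active
        self.state = ([EMPTY] * initial_segments +
                      [UNAVAILABLE] * (max_segments - initial_segments))

    def expand(self, new_segments):
        """On-line disk addition: activate table entries covering the
        new region. No data movement is required."""
        activated = 0
        for i, s in enumerate(self.state):
            if activated == new_segments:
                break
            if s == UNAVAILABLE:
                self.state[i] = EMPTY
                activated += 1

    def free_segments(self):
        return sum(1 for s in self.state if s == EMPTY)
```

Since LFS writes only to empty segments, the newly activated entries become usable write targets immediately, which is why the next slide's experiment shows each added disk contributing at once.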

Page 18: Bridging the Information Gap in Storage Protocol Stacks

I·LFS On-line Expansion Experiment

• I·LFS immediately takes advantage of each extra disk

Page 19: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Dynamic Parallelism

• Goal: perform well on heterogeneous storage
  – Static performance differences
  – Dynamic performance fluctuations
• Ideal: maximize throughput of the storage system
• I·LFS writes data in proportion to observed performance

Page 20: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Dynamic Parallelism Details

• ERAID: dynamic performance information
• Most file system routines are unchanged
  – Aware only of the ERAID linear address space
  – Reduces file system complexity
• Segment selection routine
  – Aware of ERAID regions and performance
  – Chooses the next segment based on current performance
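One simple way to realize performance-proportional segment selection is a lottery-style choice (a sketch; the actual I·LFS policy is driven by ERAID's measured queue lengths and throughput, so this is only illustrative): each region is picked with probability proportional to its current throughput estimate, so faster or less loaded regions receive proportionally more segments.

```python
import random

def choose_region(throughputs):
    """Pick the region for the next segment.

    throughputs: dict mapping region name -> current throughput estimate
    (e.g. MB/s, as exposed by ERAID). Faster regions win proportionally
    more often, so writes track both static heterogeneity and dynamic
    load fluctuations."""
    total = sum(throughputs.values())
    r = random.uniform(0, total)
    for region, tput in throughputs.items():
        r -= tput
        if r <= 0:
            return region
    return region  # floating-point edge case: fall back to the last region
```

With the paper's disks (21.6 MB/s Cheetahs vs. 7.5 MB/s Barracudas), a fast region should receive roughly three times as many segments as a slow one, which is what lets I·LFS exceed the slowest-disk bound that simple striping hits.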

Page 21: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Static Parallelism Experiment

• Simple striping is limited by the rate of the slowest disk
• I·LFS delivers the full throughput of the system

Page 22: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Dynamic Parallelism Experiment

• I·LFS adjusts to the performance fluctuation

Page 23: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Flexible Redundancy

• Goal: offer new redundancy options to users
• Ideal: a range of mechanisms and granularities
• I·LFS provides mirrored per-file redundancy

Page 24: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Flexible Redundancy Details

• ERAID: region failure characteristics
• Use separate files for redundancy
  – Even inode N for original files
  – Odd inode N+1 for redundant files
  – Original and redundant data in different sets of regions
• Flexible data placement within the regions
• Use recursive vnode operations for redundant files
  – Leverage existing routines to reduce complexity
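The even/odd inode pairing is simple enough to state as code (helper names are hypothetical): an original file always has an even inode number N, and its mirror lives in a shadow file with inode N + 1, placed in a disjoint set of regions so one region failure cannot take out both copies.

```python
def redundant_inode(inode):
    """Inode number of the shadow (mirror) file for an original file.
    Original files use even inode numbers; their mirrors use the next
    odd number."""
    assert inode % 2 == 0, "original files use even inode numbers"
    return inode + 1

def is_redundant(inode):
    """True if this inode belongs to a shadow (mirror) file."""
    return inode % 2 == 1
```

The pairing makes the mirror trivially locatable from the original (and vice versa) without any new mapping structure, which is part of how the feature stays low-complexity: writes to a mirrored file simply recurse, via the existing vnode operations, into the shadow file.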

Page 25: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Flexible Redundancy Experiment

• I·LFS provides a throughput and reliability tradeoff

Page 26: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Lazy Redundancy

• Goal: avoid the replication performance penalty
• Ideal: replicate data immediately before failure
• I·LFS offers redundancy with delayed replication
• Avoids the replication penalty for short-lived files

Page 27: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Lazy Redundancy

• ERAID: region failure characteristics
• Segments needing replication are flagged
• The cleaner acts as a replicator
  – Locates flagged segments
  – Checks data liveness and lifetime
  – Generates redundant copies of files
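The cleaner-as-replicator pass can be sketched as follows (data structures, helper callbacks, and the age threshold are all hypothetical; the slides do not give the real policy parameters): a flagged segment's blocks are replicated only if they are still live and have survived long enough, so data from short-lived files is typically deleted before it is ever copied.

```python
MIN_AGE = 30  # hypothetical: seconds a block must survive before copying

def replicate_flagged(segments, now, is_live, make_copy):
    """One cleaner pass doubling as a replicator.

    segments: list of dicts with 'needs_replication' flag and 'blocks'
    is_live:  callback, False for deleted/overwritten (dead) data
    make_copy: callback that writes the redundant copy of a block
    """
    for seg in segments:
        if not seg["needs_replication"]:
            continue
        pending = []
        for blk in seg["blocks"]:
            if not is_live(blk):
                continue  # dead data is never replicated: the lazy win
            if now - blk["written_at"] < MIN_AGE:
                pending.append(blk)  # too young: revisit on a later pass
                continue
            make_copy(blk)
        # keep the flag only if young live blocks still await replication
        seg["needs_replication"] = bool(pending)
```

Piggybacking on the cleaner is the key design choice: the cleaner already scans segments and determines block liveness, so replication adds little new mechanism.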

Page 28: Bridging the Information Gap in Storage Protocol Stacks

I·LFS Lazy Redundancy Experiment

• I·LFS avoids the performance penalty for short-lived files

Page 29: Bridging the Information Gap in Storage Protocol Stacks

Outline

• ERAID Overview
• I·LFS Overview
• Functionality and Evaluation
  – On-line expansion
  – Dynamic parallelism
  – Flexible redundancy
  – Lazy redundancy
• Conclusion

Page 30: Bridging the Information Gap in Storage Protocol Stacks

Comparison with Traditional Systems

• On-line expansion
  – Yes
• Dynamic parallelism (heterogeneous storage)
  – Yes, but with duplicated functionality
• Flexible redundancy
  – No: the storage system is not aware of file composition
• Lazy redundancy
  – No: the storage system is not aware of file deletions

Page 31: Bridging the Information Gap in Storage Protocol Stacks

Conclusion

• Introduced ERAID and I·LFS
• Extra information enables new functionality
  – Difficult or impossible in traditional systems
• Minimal complexity
  – 19% increase in code size
• Time to re-examine the division of labor

Page 32: Bridging the Information Gap in Storage Protocol Stacks


Questions?

http://www.cs.wisc.edu/wind/