Fragmentation in Large Object Repositories Russell Sears Catharine van Ingen CIDR 2007 This work was performed at Microsoft Research San Francisco with input from the NTFS and SQL Server teams

Jan 03, 2016

Page 1

Fragmentation in Large Object Repositories

Russell Sears

Catharine van Ingen

CIDR 2007

This work was performed at Microsoft Research San Francisco with input from the NTFS and SQL Server teams

Page 2

Background

• Web services store large objects for users
– e.g., Wikipedia, Flickr, YouTube, GFS, Hotmail
• Replicate BLOBs or files
– No update-in-place
• Benchmark before deployment
– Then, encounter storage performance problems
• We set out to make some sense of this

[Figure: system architecture showing clients, application servers, DB (metadata), object stores, and replication / data scrubbing]

Page 3

Problems with partial updates

• Multiple changes per application request
– Atomicity (distributed transactions)
• Most updates change object size
– Must fragment, or relocate data
• Reading / writing the entire object addresses these issues

Page 4

Experimental Setup

• Single storage node
• Compared filesystem, database
– NTFS on Windows Server 2003 R2
– SQL Server 2005 beta
• Repeatedly update (free, reallocate) objects
– Randomly chose sizes, objects to update
– Unrealistic, easy to understand
• Measured throughput, fragmentation
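The update loop on this slide can be sketched as a toy simulation. This is not the authors' benchmark harness: the `Volume` class, the lowest-free-block allocation policy, and every parameter value below are assumptions made for illustration, and the NTFS and SQL Server allocators behave differently.

```python
import random


class Volume:
    """Toy volume: a list of fixed-size blocks, each free or owned by one object."""

    def __init__(self, num_blocks):
        self.owner = [None] * num_blocks  # owner[i] is an object id or None

    def free(self, obj_id):
        self.owner = [None if o == obj_id else o for o in self.owner]

    def allocate(self, obj_id, num_blocks):
        # Assumed policy: take the lowest-numbered free blocks, wherever they are.
        free_idx = [i for i, o in enumerate(self.owner) if o is None]
        if len(free_idx) < num_blocks:
            raise RuntimeError("volume full")
        for i in free_idx[:num_blocks]:
            self.owner[i] = obj_id

    def fragments(self, obj_id):
        # Count maximal runs of consecutive blocks owned by obj_id.
        count, prev = 0, False
        for o in self.owner:
            cur = (o == obj_id)
            if cur and not prev:
                count += 1
            prev = cur
        return count


def run(num_objects=20, updates=100, min_blocks=5, max_blocks=15,
        volume_blocks=400, seed=0):
    rng = random.Random(seed)
    vol = Volume(volume_blocks)
    for obj in range(num_objects):       # bulk load a clean volume
        vol.allocate(obj, rng.randint(min_blocks, max_blocks))
    for _ in range(updates):             # free, then reallocate at a new random size
        obj = rng.randrange(num_objects)
        vol.free(obj)
        vol.allocate(obj, rng.randint(min_blocks, max_blocks))
    storage_age = updates / num_objects  # average updates per object
    frags = sum(vol.fragments(o) for o in range(num_objects)) / num_objects
    return storage_age, frags


if __name__ == "__main__":
    for n in (0, 40, 100, 200):
        age, frags = run(updates=n)
        print(f"storage age {age:.1f}: {frags:.2f} fragments/object")
```

A clean bulk load yields one fragment per object in this model; the interesting part is how the average grows as the update count rises.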

Page 5

Reasoning about time

• Existing metrics
– Wall clock time: Requires trace to be meaningful, cannot compare different workloads
– Updates per volume: Coupled to volume size

Storage Age: Average number of updates per object
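A minimal sketch of the metric, using the slide's definition (average number of updates per object). The two example volumes and their counts are made up for illustration; they show why a raw update count is coupled to volume size while storage age is not.

```python
def storage_age(total_updates, num_objects):
    """Average number of updates per object (the slide's definition)."""
    if num_objects <= 0:
        raise ValueError("num_objects must be positive")
    return total_updates / num_objects


# Hypothetical volumes: same per-object churn, very different update counts.
volumes = {
    "small": {"objects": 100, "updates": 200},
    "large": {"objects": 1000, "updates": 2000},
}

for name, v in volumes.items():
    age = storage_age(v["updates"], v["objects"])
    print(f"{name}: updates per volume = {v['updates']}, storage age = {age}")
```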

Page 6

Read performance

• Clean system
– SQL good small object performance (inexpensive opens)
– NTFS significantly faster with objects >>1MB
• SQL degraded quickly
• NTFS small object performance was low, but constant

[Figure: read throughput (MB/s, axis 0 to 12) for NTFS and SQL at 0, 2, and 4 updates per object; panels: 256 KB Objects and 1 MB Objects]

Page 7

10MB object fragmentation

[Figure: fragments per object (axis 0 to 40) versus storage age (0 to 10) for SQL Server and NTFS]

• NTFS approaching asymptote
• SQL Server degrades linearly
– No BLOB defragmenter

Page 8

Rules of Thumb

• Classic pitfalls
– Low free space (< 10%)
– Repeated allocation and deallocation (high storage age)
• One new problem
– Small volumes (< 100-1000x object size)
• Implicit tuning knobs
– Size of write requests
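The pitfalls above could be turned into a quick sanity check along these lines. The function name and its parameters are hypothetical. The 10% free-space limit comes from the slide; `min_volume_ratio=100` is the lower end of the slide's 100-1000x range, and `high_age=4.0` is an arbitrary placeholder, since the slide gives no number for "high" storage age.

```python
def layout_warnings(volume_bytes, free_bytes, object_bytes, storage_age,
                    high_age=4.0, min_volume_ratio=100):
    """Return the rules of thumb that a volume appears to violate."""
    warnings = []
    if free_bytes / volume_bytes < 0.10:             # slide: free space < 10%
        warnings.append("low free space")
    if storage_age >= high_age:                      # threshold is an assumption
        warnings.append("high storage age")
    if volume_bytes / object_bytes < min_volume_ratio:
        warnings.append("small volume relative to object size")
    return warnings


# A 1 GB volume holding 100 MB objects, 5% free, after 6 updates per object.
print(layout_warnings(10**9, 5 * 10**7, 10**8, 6.0))
```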

Page 9

Append is expensive!

• Neither system can take advantage of final object size during allocation
• Both APIs provide “append”
– Leave gaps for future appends
– Place objects without knowing length
• Observe same behavior with single and random object sizes
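A toy model of why placing data without knowing the final length hurts layout. It assumes several objects are appended concurrently and that each append simply takes the next free block; neither NTFS nor SQL Server is this naive (the slide notes they leave gaps for future appends), so this only illustrates the direction of the effect.

```python
def interleaved_append(num_objects, chunks_per_object):
    """Each append takes the next free block; writers take turns (assumed)."""
    layout = []
    for _ in range(chunks_per_object):
        for obj in range(num_objects):
            layout.append(obj)
    return layout


def known_size_put(num_objects, chunks_per_object):
    """Final size known up front: reserve one contiguous extent per object."""
    layout = []
    for obj in range(num_objects):
        layout.extend([obj] * chunks_per_object)
    return layout


def fragments_per_object(layout):
    """Average number of contiguous runs per object in a block layout."""
    runs = {}
    prev = None
    for owner in layout:
        if owner != prev:
            runs[owner] = runs.get(owner, 0) + 1
        prev = owner
    return sum(runs.values()) / len(runs)


print(fragments_per_object(interleaved_append(2, 4)))  # 4.0
print(fragments_per_object(known_size_put(2, 4)))      # 1.0
```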

Page 10

Conclusions

• Get/put storage is important in practice

• Storage age
– Metric for comparing implementations and workloads
– Fragmentation behaviors vary significantly
• Append leads to poor layout

Page 11

----BACKUP SLIDES----

Page 12

Theory vs. Practice

• Theory focuses on contiguous layout of objects of known size
• Objects that are allocated in groups are freed in groups
– Good allocation algorithms exploit this
– Generally ignored for average case results
– Leads to pathological behavior in some cases

Page 13

Small volumes

• Small objects / Large volumes
– Percent free space
• Large objects / Small volumes
– Number of free objects

Page 14

Efficient Get/Put

• No update-in-place
– Partial updates complicate apps
– Objects change size
• Pipeline requests
– Small write buffers, I/O parallelism

[Figure: application server diagram with items numbered 1 to 4]
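A minimal sketch of get/put without update-in-place on an ordinary filesystem, assuming each object is one file. The `put` and `get` helpers and the 64 KB `CHUNK` size are illustrative choices, not anything the slides specify. The sketch streams data through a small buffer but does not attempt the I/O parallelism the slide mentions.

```python
import os
import tempfile

CHUNK = 64 * 1024  # assumed buffer size; the slide names no value


def put(path, chunks):
    """Write a whole new version to a temp file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for chunk in chunks:       # stream small buffers; never hold the object
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())       # make the new version durable first
        os.replace(tmp, path)          # old version is replaced in one step
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)             # do not leave partial objects behind
        raise


def get(path):
    """Yield the object's bytes in CHUNK-sized pieces."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                return
            yield chunk
```

Readers see either the old version or the new one, never a partially updated object, which is the property that makes whole-object replacement simpler for applications than partial updates.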

Page 15

Page 16

Lessons learned

• Target systems avoid update-in-place
– No use for database data models
• Quantified fragmentation behavior
– Across implementations, workloads
• Common APIs complicate allocation
– Filesystem / BLOB API is too expressive

Page 17

[Figure: application server diagram with items numbered 1 to 4]

Page 18

Example systems

• SharePoint
– Everything in the database, one copy per version
• Wikipedia
– One blob per document version; images are files
• Flickr / YouTube
• GFS
– Scalable append; chunk data into 64MB files
• Hotmail
– Each mailbox is stored as a single opaque BLOB

Page 19

The folklore is accurate, so why do application designers…

…benchmark, then deploy the “wrong” technology?

…switch to the “right one” a year later?

…then switch back?!?

Performance problems crop up over time

Page 20

Conclusions

• Existing systems vary widely
– Measuring clean systems is inadequate, but standard practice
• Support for append is expensive
• Unpredictable storage is difficult to reliably scale and manage
– See paper for more information about predicting and managing fragmentation in existing systems

Page 21

Comparing data layout strategies

• Study the impact of
– Volume size
– Object size
– Workload
– Update strategies
– Maintenance tasks
– System implementation
• Need a metric that is independent of these factors

Page 22

Related work

• Theoretical results
– Worst case performance is unacceptable
– Average case good for certain workloads
– Structure in deallocation requests leads to poor real-world performance
• Buddy system
– Place structural limitations on file layout
– Bounds fragmentation, fails on large files
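For readers unfamiliar with the buddy system, the structural limitation it imposes is that every allocation is rounded up to a power-of-two block. The sketch below shows only that rounding step and the space it leaves unused; the function names and the 4 KB minimum block are assumptions, and the sketch does not try to reproduce the results the slide refers to.

```python
def buddy_block_size(request_bytes, min_block=4096):
    """Smallest block of the form min_block * 2**k that holds the request."""
    size = min_block
    while size < request_bytes:
        size *= 2
    return size


def internal_waste(request_bytes, min_block=4096):
    """Bytes allocated but unused because of power-of-two rounding."""
    return buddy_block_size(request_bytes, min_block) - request_bytes


ten_mib = 10 * 2**20
print(buddy_block_size(ten_mib) // 2**20, "MiB block for a 10 MiB object")
print(internal_waste(ten_mib) // 2**20, "MiB unused")
```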

Page 23

Introduction

• Content-rich web services require large, predictable and reliable storage

• Characterizing fragmentation behavior

• Opportunities for improvement

Page 24

Data intensive web applications

• Simple data model (BLOBs)
– Hotmail: user mailbox
– Flickr: photograph(s)
• Replication
– Instead of backup
– Load balancing
– Scalability

[Figure: system architecture showing clients, application servers, DB (metadata), object stores, and replication / data scrubbing]

Page 25

Databases vs. Filesystems

• Manageability should be primary concern
– No need for advanced storage features
– Disk bound
• Folklore
– File opens are slow
– Database interfaces stream data poorly

Page 26

Clean system performance

[Figure: read throughput (MB/sec, axis 0 to 12) versus object size (256K, 512K, 1M) for SQL Server and NTFS]

• Single node
– Used network APIs
• Random workload
– Get/put one object at a time
• Large objects lead to sequential I/O

Page 27

Revisiting Fragmentation

• Data intensive web services
– Long term predictability
– Simple data model: get/put opaque objects
• Performance of existing systems
• Opportunities for improvement

Page 28

Introduction

• Large object updates and web services
– Replication for scalability, reliability
– Get / put vs. partial updates
• Storage age
– Characterizing fragmentation behavior
– Comparing multiple approaches
• State-of-the-art approach:
– Lay out data without knowing final object size
– Change the interface?