Aravindan Raghuveer David.H.C.Du DISC - DTC · 2012. 8. 16. · Aravindan Raghuveer David.H.C.Du DISC Object-based Storage for Exhaustive Search. University of Minnesota Digital Technology

University of Minnesota Digital Technology Center Intelligent Storage Consortium

Aravindan Raghuveer, David H.C. Du

DISC

Object-based Storage for Exhaustive Search


Introduction

Exhaustive Search
– Examine all objects in a storage system.
– An expensive operation.

Why Exhaustive Search?
– Fuzzy queries:
    Semantic gap in image and video data; hard to annotate.
    Content-based (query-by-example) search.
    Demonstrated in the Diamond project at Intel/CMU.
– Index creation:
    Not effective: curse of dimensionality.
    Too expensive.
    Not always possible: fuzzy queries.

A “necessary evil” feature on all filesystems.


The factors . . .

Four recent developments spur us to rethink the way exhaustive search is implemented today:

Data Characteristics
Disk Technology Trends
Filesystem and Database Design
Concurrent Applications


Factor-1: Data Characteristics

Petabyte-scale data sets are becoming common in data mining and HPC apps.

Video surveillance:
– Terabytes of data (depending on the number of feeds) that need to be searched by content.

For instance, experiments at the Mayo Clinic generate:
– 3 million images per week
– 1 TB per 9 days
– Presentation in ISW last year.

Impact on exhaustive search:
– Reduce the search space of exhaustive search as much as possible (for instance, based on metadata).
– Exhaustive search had better be darn efficient!


Factor-2: Disk Technology Trends

Bits per unit area increasing rapidly.
I/O bandwidth lagging behind.
Effect on exhaustive search:
– 1 day to sequentially read 10 TB*
– 5 months with 8 KB chunk random access!

Impact on exhaustive search:
– The exhaustive search algorithm should consciously avoid random disk seeks.
– Try to get as close to sequential performance as we possibly can.

* Dr. Jim Gray’s keynote from FAST’05:
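The two figures on this slide can be sanity-checked with a little arithmetic. The sketch below assumes a sequential bandwidth of 120 MB/s and 10 ms per random 8 KB access; neither number appears on the slide, they are simply plausible values for a single disk of that era.

```python
# Back-of-the-envelope check of "1 day sequential vs. 5 months random".
# Assumed values (not from the slides): 120 MB/s sequential bandwidth,
# 10 ms per random 8 KB access (seek plus rotational delay).
TOTAL_BYTES = 10 * 10**12        # 10 TB
SEQ_BANDWIDTH = 120 * 10**6      # bytes per second (assumed)
CHUNK_BYTES = 8 * 1024           # 8 KB chunk
RANDOM_ACCESS_TIME = 0.010       # seconds per chunk (assumed)

SECONDS_PER_DAY = 24 * 3600

# Sequential: total size divided by bandwidth.
seq_days = TOTAL_BYTES / SEQ_BANDWIDTH / SECONDS_PER_DAY

# Random: one full access penalty for every 8 KB chunk.
random_days = (TOTAL_BYTES / CHUNK_BYTES) * RANDOM_ACCESS_TIME / SECONDS_PER_DAY
random_months = random_days / 30

print(f"sequential: {seq_days:.2f} days")
print(f"random:     {random_months:.1f} months")
```

Under these assumptions the sequential read comes out at roughly one day and the random read at a little under five months, which is consistent with the figures quoted above.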


Factor-3: Filesystem and Database Design

Exhaustive search (e.g. grep) runs on top of a filesystem.

The filesystem or database is not even aware that the application is exhaustively searching.

Exhaustive search is not the primary design criterion for today's filesystems.
– Filesystem-level exhaustive search: recursive exploration of directories.

With aged, fragmented filesystems:
– At the disk, an exhaustive search looks more like random access than sequential access.

Databases: not as efficient as filesystems in handling blobs in the presence of fragmentation*.

Impact on exhaustive search: F/S and D/B are not the right place to embed exhaustive search.

* R. Sears, C. van Ingen, “Fragmentation in Large Object Repositories”, CIDR 2007
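For reference, a filesystem-level exhaustive search of the kind described on this slide might look like the following minimal sketch. The function name `filesystem_search` is hypothetical. The point is that files are visited in directory-tree order, which has no relation to where their blocks sit on disk.

```python
import os

def filesystem_search(root, needle):
    """Grep-like exhaustive search: visit every file in directory-tree order.

    `needle` is a bytes pattern. Files are read whole, which is fine for a
    sketch but not for very large objects.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    if needle in f.read():
                        matches.append(path)
            except OSError:
                continue  # unreadable file: skip it and keep searching
    return matches
```

Nothing in this loop knows about extents or fragmentation, so on an aged filesystem consecutive reads can land far apart on the platter.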


Factor 4: Concurrent Applications

Exhaustive search: long-running, I/O-intensive task.
Other filesystem applications run concurrently.
Concurrent execution of both:
– Performance isolation:
    Impact on response time of other applications should be minimal.
    Impact on efficiency of exhaustive search should be as low as possible.

Not possible with block storage:
– Cannot distinguish one block request from another.
– No priorities or QoS levels assigned to requests.

Impact on exhaustive search:
– The storage device should be able to provide differentiated services.
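One way to picture differentiated service is a request queue in which each request carries a service class, so real-time requests are served before exhaustive-search requests. The sketch below is only an illustration: the class names `REAL_TIME` and `SEARCH` and the `RequestQueue` type are hypothetical and are not part of any OSD interface.

```python
import heapq
import itertools

# Hypothetical service classes; the lower value is served first.
REAL_TIME, SEARCH = 0, 1

class RequestQueue:
    """Priority queue that serves real-time requests ahead of search requests."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO order within a class

    def submit(self, service_class, request):
        heapq.heappush(self._heap, (service_class, next(self._seq), request))

    def next_request(self):
        """Return the next request to serve, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

queue = RequestQueue()
queue.submit(SEARCH, "scan fragment 17")
queue.submit(SEARCH, "scan fragment 18")
queue.submit(REAL_TIME, "read object 42")
first = queue.next_request()  # the real-time read jumps ahead of both scans
```

A plain block device sees only block addresses, so it has no service class to sort on; the request has to carry that information down to the device.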


Summary of what we’ve seen:

Data Characteristics:
– Exhaustive search had better be darn efficient!

Disk Technology Trends:
– The exhaustive search algorithm should consciously avoid random disk seeks.

Filesystem and Database Design:
– F/S and D/B are not the right place to embed exhaustive search.

Concurrent Applications:
– The storage device should be able to provide differentiated services.


What is this work about?

A fresh look at exhaustive search.
Ensure that the storage system is never the bottleneck in performance.
Conscious of random disk seeks.
Close-to-sequential performance always.
Concurrent execution with other filesystem apps.
– Without compromising extensively on response time and efficiency.


An Overview of the Proposed Approach

Layout-aware:
– Search order is based not on the logical filesystem view but on the physical on-disk organization.
– As close to sequential performance as possible.

Suspend-and-resume:
– On a real-time request to disk:
    Suspend exhaustive search.
    Service real-time request.
    Resume exhaustive search.
– Modify search order based on current disk head position.
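The layout-aware idea can be sketched in a few lines: order the fragments by their physical position rather than by the logical order in which the filesystem lists them. The `Fragment` type, the block numbers, and the `head_travel` cost measure below are illustrative assumptions, not details taken from the prototype.

```python
from typing import NamedTuple

class Fragment(NamedTuple):
    object_id: int
    start_block: int  # physical start of the extent on disk
    length: int       # extent length in blocks

def layout_aware_plan(fragments):
    """Scan fragments in ascending physical order, ignoring object boundaries."""
    return sorted(fragments, key=lambda frag: frag.start_block)

def head_travel(plan, head=0):
    """Total distance (in blocks) the head moves between fragments."""
    travel = 0
    for frag in plan:
        travel += abs(frag.start_block - head)
        head = frag.start_block + frag.length
    return travel

# Logical order: object 1's fragments, then object 2's (hypothetical layout).
logical_plan = [
    Fragment(1, 900, 10),
    Fragment(1, 100, 10),
    Fragment(2, 500, 10),
    Fragment(2, 0, 10),
]

print(head_travel(logical_plan))                     # 2610
print(head_travel(layout_aware_plan(logical_plan)))  # 870
```

In this toy layout the physically sorted plan moves the head a third as far as the logical plan. Head travel is only a rough proxy for seek cost, but it shows why scan order matters on a fragmented disk.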


Questions to be answered …

Architecture:
– Where to embed the functionality: filesystem or smart object-based disk?

Layout-Aware Search:
– Planning the search?
– Metadata handling and placement?
    Where are object extents located?
    List of objects already scanned.

Suspend-Resume:
– Maintaining search progress metadata to avoid re-scanning [suspend]
– Computing a new search plan [resume]


Proposed Solution

Architecture: an intelligent storage node (ISN) capable of exhaustive search.
– Exposes an object interface.
– A case for application-awareness at the storage level to improve performance.

Why OSD?
– Filesystems and databases have no knowledge of storage internals and parameters.
– A filesystem can be built on multiple layers of virtualization, disconnected from the actual reality (disk boundaries).
– Filesystem-level search performance degrades with fragmentation.
– Block storage does not differentiate between real-time and exhaustive search requests.


The Intelligent Storage Node

Storage node with:
– T10-compatible object interface.
– Capable of executing a limited set of exhaustive search queries.
– Called an Intelligent Storage Node (ISN).

Extension to the OSD interface:
– Command OSD_QUERY to trigger an exhaustive search.
– resultSetCollectionID OSD_QUERY(queryType, exampleObjectID)

Search order: object-fragment-level search as opposed to object-level search.

Suspend and resume:
– Static: search order not modified on resume.
– Dynamic: search order adjusted on resume.
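The difference between static and dynamic resume can be illustrated as follows. The slides do not say how the dynamic mode reorders the remaining fragments; the policy used here (sweep upward from the current head position, then wrap around to the lowest remaining block) is an assumption chosen for illustration, and both function names are hypothetical.

```python
def resume_static(remaining, head):
    """Static: keep the pre-computed order, wherever the head now is."""
    return list(remaining)

def resume_dynamic(remaining, head):
    """Dynamic: re-plan the remaining fragments from the current head position.

    Assumed policy (not from the slides): sweep upward from the head, then
    wrap around to the lowest remaining block, like a circular scan.
    """
    ordered = sorted(remaining)
    ahead = [frag for frag in ordered if frag[0] >= head]
    behind = [frag for frag in ordered if frag[0] < head]
    return ahead + behind

# Remaining fragments as (start_block, length); a real-time request has
# left the head at block 600 (hypothetical numbers).
remaining = [(100, 10), (400, 10), (800, 10)]
static_plan = resume_static(remaining, head=600)
dynamic_plan = resume_dynamic(remaining, head=600)
# static_plan  keeps (100, 10) first, so the head must travel back.
# dynamic_plan starts with (800, 10), the nearest fragment ahead of the head.
```

Either way, the node must remember which fragments were already scanned so that a resume never re-reads them; here that is implied by passing in only the remaining fragments.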


ISN for Exhaustive Search

A case for application-aware intelligent storage.

Application characteristic:
– The order in which object fragments are scanned is not important.

Storage-device characteristic:
– Sequential performance is 10X better than random access performance.

Application-aware storage optimization:
– Determining the search order of fragments to obtain close-to-sequential performance.
– Suspend-and-resume support for real-time requests.


Architecture of ISN:

Prototype under development, based on the DISC OSD reference implementation:
– Object filesystem (ext3)
– Fragment Indexer
– Search Planner

Real-time request support implementation in progress.

Initial results look very promising.

[Figure: ISN architecture. Components shown: OSD Command Interpreter, Object Filesystem, Fragment Indexer, Search Planner, Block Device.]


Experimental Setup

The aging tool synthetically fragments a filesystem through file append, delete, and create operations.

[Figure: experimental setup. Components shown: Aging tool, Ext3 Filesystem, F/S Search Planner, Layout-Aware Search Planner, Search executor. Labels: Filesystem search plan, Layout-Aware search plan.]


Results

Storage Age = 5
Filesystem usage: 10G (Partition Size = 63G)
Time taken for exhaustive search:
– Filesystem: 41 mins
– Layout-aware search: 7 mins


Acknowledgments

DISC Team
– Faculty and Cory Devor
– Students

Member Companies


Thank You!

Questions?


More spindles?

More spindles come at a cost:
– Hardware cost: not too bad.
– Maintenance (backup, scrubbing): expensive.
– Concurrent failure issues.

Our technique can make “more spindles” even better.

More spindles are not a solution for all scenarios:
– Home user
– Video and image search imaginable.


Lazy defragmentation of the filesystem?

Alleviates the issue, but is an expensive operation in itself.

With multiple levels of virtualization, it still may not work as well as a storage-level search.


Investigations To Do:

Layout-Awareness:
– 2 modes of layout-aware search.
– Pre-planned and ad hoc.
    Pre-planned mode is used when the disk stores a small number of objects.
    Ad hoc mode is used when the disk is almost full.
    Pre-planned and ad hoc can be used at finer granularities (example: different modes on different areas of the disk).

Suspend-Resume:
– Suspend: search metadata is distributed over the disk, close to the data.
– Resume: based on the remaining number of objects, we shift to either the pre-planned or the ad hoc mode.
