5/10/2005 DISC - Digital Technology Center Intelligent Storage Consortium
DTC Intelligent Storage Consortium Research Activities
Supported by
StorageTek, Veritas, Engenio, Sun Micro, ETRI/Korea
DOE, ONR
David H.C. Du
Department of Computer Science and Engineering
University of Minnesota
Disk = Node (Intelligent Storage)
– has magnetic storage (1TB?)
– has processor & DRAM
– has SAN/IP attachment
– has an execution environment
Semantic-aware
Application-aware
Object storage device
[Figure: software stack on the storage node: OS kernel (SAN driver, disk driver); file system, RPC, ..., services, DBMS; applications]
Example: Adding storage
Expand a file system to allocate/extend a database table
[Flowchart; roles: DBA or App Installer, System Admin, Storage Admin]
DBA: Notice tablespace full; contact Sys Admin
-or- Installer: Determine space needed; contact Storage Admin
System Admin: Check for space on filesystem; see more needed; contact Storage Admin
Storage Admin: Find LUN(s) with right characteristics; make sure LUN(s) are bound to right port; ensure HBA, port are in same zone; mask LUN to allow HBA access; contact Sys Admin
Repeat previous 4 steps for volume at remote DR site
System Admin: Log into host; run commands to see new volumes; add primary volume to volume group, expand group; mount primary volume to filesystem; extend
Set up mirroring with remote volume using volume mgr
System Admin: Contact DBA or Installer
DBA: Log into DB; create datafile; add datafile to tablespace and expand
-or- Installer: Log into system; install application
Source: IBM
Our Approach
• Explore the intelligence to be included in storage devices
• Develop and extend the OSD (Object Storage Device) Standards
• Investigate essential technologies and design new storage architectures
• Investigate applications and environment that can benefit directly
Background: Intelligent storage
Blocks
Files
Object
Information
Knowledge
Captured in the attributes of an object
Exploited to store data more efficiently
[ INTELLIGENCE ]
View augmented with extended attributes: high-level semantics associated.
Traditional storage device view: raw bits, no associated semantics.
Focused Research
• Essential Technical Components
– Extended Attributes
– Security
• Applications
– Micro-Array Data (Functional Genomics)
– Supercomputing (Tape Backup and Archive)
• Exploiting the Attributes for Intelligence
– QoS Support
– Search and Indexing
– Data Provenance
• Tape Backup and Archive
– Parallel Archive
– Parallel Data Placement and File System for Tape Library
Extended Attributes (EAs) in OSD
• Motivations
– Attributes are an important extension that objects add to traditional files
– EAs can be used to support semantic-aware and application-aware storage
– Only preliminary solutions in several existing file systems: Ext2, Ext3, XFS, JFS, ReiserFS, etc.
• Objectives
How to use EAs and how to efficiently store and access EAs?
– Access control of EAs
– Fast retrieval of any EA by name
– Fast bulk copy of EAs
– Efficient space utilization for variable-sized EAs
– Help on search and indexing
Access Control of EAs
• Permission bits and file ACL are NOT enough to protect EAs
– Example: an “audit trace” EA should not be accessible to the file owner
• Categorization of EAs
– Search and Indexing Group
– Storage Management Group
– Monitoring Group
– Enhanced Application-aware Group
• Default access control rules are defined for members of each group when created
• New access control entry can be inserted into ACL of each EA
[Figure: i-node holding extended attribute headers (xattr 1 to xattr 7, on pages labeled <page 0> and <page 1>) with reserved ACEs for default access rules; external pages for long ACLs hold the ACL of xattr 4 and the ACL of xattr 7]
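The group-based defaults described on this slide can be sketched roughly as follows. This is a minimal illustration, not the consortium's implementation: the group names follow the slide, but the principals ("owner", "system", "auditor"), the rule table `DEFAULT_ACES`, and the class `ExtendedAttribute` are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical default access rules per EA group (principal -> allowed operations).
# The group names follow the slide; the rules themselves are illustrative assumptions.
DEFAULT_ACES = {
    "search_indexing":    {"owner": {"read", "write"}, "system": {"read", "write"}},
    "storage_management": {"owner": {"read"},          "system": {"read", "write"}},
    "monitoring":         {"owner": set(),             "system": {"read", "write"}},
    "application_aware":  {"owner": {"read", "write"}, "system": {"read"}},
}


@dataclass
class ExtendedAttribute:
    name: str
    value: bytes
    group: str
    acl: dict = field(init=False)

    def __post_init__(self):
        # Copy the group's default ACEs so later changes stay local to this EA.
        self.acl = {p: set(ops) for p, ops in DEFAULT_ACES[self.group].items()}

    def add_ace(self, principal, operations):
        """Insert a new access control entry into this EA's own ACL."""
        self.acl.setdefault(principal, set()).update(operations)

    def allows(self, principal, operation):
        return operation in self.acl.get(principal, set())


# An "audit trace" EA in the monitoring group: hidden from the file owner by default.
audit = ExtendedAttribute("audit_trace", b"opened by uid 42", group="monitoring")
audit.add_ace("auditor", {"read"})
```

With these assumed rules, the file owner cannot read the audit trace, while an entry added for an auditor grants access to that EA only.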
Storing EA on Storage Media
• Inline (within i-node) space for small EAs
• External pages for more EAs
• Differentiate per-instance EAs (IEAs) and per-object EAs
– Example: When copying, “audit trace” EA should not be copied to the new object
• Separated EA headers and EA values
– Headers are organized into an index for fast lookup
– Headers have inline space for storing small values
– EA value pages support efficient variable-sized EA values
[Figure: i-node layout with Mode, Owner info, Size, Timestamps, EA Header Block, Direct Blocks, Indirect Blocks, Double Indirect Blocks, Triple Indirect Blocks (leading to Data blocks), and reserved space for inline EA. The EA Header Block holds EA headers and IEA headers and leads to ACL Blocks (ACLs), EA Value Blocks (EA values), and IEA Value Blocks (IEA values).]
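A rough sketch of the inline/external split and the copy rule for per-instance EAs is given below. It is an illustration under stated assumptions: `INLINE_LIMIT`, the `EA` record, and the `StoredObject` class are hypothetical names, and Python dictionaries stand in for the on-disk i-node space and external pages.

```python
from dataclasses import dataclass

INLINE_LIMIT = 16  # assumed size threshold (bytes) for storing a value inline


@dataclass(frozen=True)
class EA:
    name: str
    value: bytes
    per_instance: bool = False  # IEAs describe one instance and are not copied


class StoredObject:
    def __init__(self):
        self.inline = {}    # small EAs kept in the i-node's reserved space
        self.external = {}  # larger EAs kept in external pages

    def set_ea(self, ea):
        # Remove any older copy first so an EA lives in exactly one place.
        self.inline.pop(ea.name, None)
        self.external.pop(ea.name, None)
        target = self.inline if len(ea.value) <= INLINE_LIMIT else self.external
        target[ea.name] = ea

    def copy(self):
        """Bulk-copy per-object EAs; per-instance EAs stay behind."""
        clone = StoredObject()
        for ea in list(self.inline.values()) + list(self.external.values()):
            if not ea.per_instance:
                clone.set_ea(ea)
        return clone


original = StoredObject()
original.set_ea(EA("content_type", b"MAGE-ML"))
original.set_ea(EA("audit_trace", b"read by alice; read by bob", per_instance=True))
duplicate = original.copy()
```

Here the short `content_type` value lands inline and is carried to the copy, while the longer per-instance `audit_trace` goes to external storage and is left out of the copy.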
EAs in External Pages
• EA headers are built into a B-tree with the EA name as the search key
• EA headers have reserved space for small values
• External EA values are referenced by (logical EA page number, slot number)
– References to external EA values don’t need to change when copying to a new object
[Figure: a row of EA headers; one header references (page i, slot 1) in external EA value page PAGE i, which contains the value space, free space, and a slot directory recording the # of slots]
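The (logical page number, slot number) referencing can be sketched as a small slotted-page store. This is only an illustration: `SlottedPage` and `EAStore` are hypothetical names, a Python dict stands in for the B-tree of EA headers, and the slot directory is kept as a list rather than laid out at the end of the page.

```python
import copy


class SlottedPage:
    """External EA value page: values fill the value space, a slot directory indexes them."""

    def __init__(self, size=4096):
        self.space = bytearray(size)
        self.free_offset = 0  # where the free space begins
        self.slots = []       # slot directory: (offset, length) per slot

    def insert(self, value):
        if self.free_offset + len(value) > len(self.space):
            raise ValueError("page full")
        offset = self.free_offset
        self.space[offset:offset + len(value)] = value
        self.free_offset += len(value)
        self.slots.append((offset, len(value)))
        return len(self.slots) - 1  # slot number

    def read(self, slot):
        offset, length = self.slots[slot]
        return bytes(self.space[offset:offset + length])


class EAStore:
    def __init__(self):
        self.pages = {}    # logical page number -> SlottedPage
        self.headers = {}  # EA name -> (logical page number, slot); dict stands in for the B-tree

    def put(self, name, value, page_no=0):
        page = self.pages.setdefault(page_no, SlottedPage())
        self.headers[name] = (page_no, page.insert(value))

    def get(self, name):
        page_no, slot = self.headers[name]
        return self.pages[page_no].read(slot)

    def copy(self):
        # Pages are copied wholesale under the same logical numbers,
        # so every (page, slot) reference stays valid without rewriting.
        clone = EAStore()
        clone.pages = copy.deepcopy(self.pages)
        clone.headers = dict(self.headers)
        return clone


store = EAStore()
store.put("mesh_terms", b"illustrative value one")
store.put("experiment", b"illustrative value two", page_no=1)
clone = store.copy()
```

Because references are logical rather than physical, the bulk copy only duplicates pages; the header entries in the clone are identical to those in the original.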
Focused Research
• Essential Technical Components
– Extended Attributes
– Security
• Applications
– Micro-Array Data (Functional Genomics)
– Supercomputing (Tape Backup and Archive)
• Exploiting the Attributes for Intelligence
– QoS
– Search and Indexing
– Data Provenance
• Tape Backup and Archive
– Parallel Archive
– Parallel Data Placement and File System for Tape Library
Selected Medical Application
Data Explosion: Can intelligent storage help?
Access to Diverse Heterogeneous Distributed Data
Expression Arrays (various tissues)
Personal genomics
X-rays, MRI, mammograms, etc.
Clinical Record
Analysis lab notes
Hospital events ....admission, surgery, recovery, discharge
1. Patient Information Challenges
Volume and complexity of data
Integrating massive volumes of disparate data
Need for sophisticated analytics
Growing collaboration across ecosystem
Slide from Dr. Khaja Zafarullah’s presentation
IBM Healthcare and Life Sciences Clinical Genomics Solution Conceptual Architecture
Medical Research
Clinical Care
HIS, RIS, CIS, Pathology, Rx, Patient Charts
Expression, SNPs, Clinical Studies & Trials, Proteomic
Medical Information Gateway
Deidentification of Patient Data & Anonymous Global Patient Identifier Assigned
Adherence to Standards: HL7, BSML/HapMap, CDISC/ODM, MAGE-ML, CDA, etc.
Medical Information Broker
Medical Information Repository
DB2 Information Integrator
WebSphere
Source scientific data & unstructured text files, e.g. MS Access, MS Excel, EST/GenBank, XML, Medline, dbSNP
Data Mining/Statistical Analysis/Visualization
Source: IBM
Focus of Work
• Explore the scope and advantages of intelligent storage in a practical setting.
• Collaborator: Mayo Clinic
High-level goal
• Data explosion: Volume and complexity of data is increasing by the day in the bioinformatics and health care sector
• Store and organize large volumes of disparate data in an efficient manner.
• How can intelligent storage make things better?
Data pieces generated at various phases of the experiment
[Figure: experiment phases (Experiment Setup, Quantization, Consolidation, Analysis) with the data pieces they produce and a link to external knowledge bases]
1. MIAME data
2. Chip data
3. Scanned image file(s)
4. Gene expression matrix (labels: geneID, sampleID, gene expression level)
5. Annotation
Pieces of data generated and their characteristics:

Data piece                                          | Characteristics
Experiment description, gene exp. matrix: MAGE-ML   | structured
Chip data: vendor provided                          | structured
Annotation and findings (XML)                       | semi-structured, frequently accessed
Image files                                         | unstructured, less frequently accessed
Research Road Map
• MicroArray Experiment
• MicroArray Data
– MIAME
– MAGE-ML
• Field Data
• Starting the MicroArray Intelligent storage mapping
Prototype Implementation: A rough plan
• Hardware:
– storage brick
– dual processor RAID controller
• Embedding intelligence
• Operating System:
– Linux clone
– ext3 filesystem modified to infuse intelligence
Focused Research
• Essential Technical Components
– Extended Attributes
– Security
• Applications
– Micro-Array Data (Functional Genomics)
– Supercomputing
• Exploiting the Attributes for Intelligence
– QoS Support
– Search and Indexing
– Data Provenance
• Tape Backup and Archive
– Parallel Archive
– Parallel Data Placement and File System for Tape Library
Integrated QoS Provisioning in a Remotely Accessible Environment
• Problem
– How to provide QoS support to clients that access remote storage?
• Solution
– A full integration of network QoS, SAN QoS and storage QoS
• Approach
– Take both network (TCP/IP and SAN) and storage conditions into consideration
– Combine the feedback mechanism with storage and network scheduler
QoS Support for Remote Data Accesses
[Figure: clients (computer, laptop, servers) reach a data center through a router and a multilayer switch; inside the data center a SAN switch connects to FC-AL storage]
Approaches and Results
• Working on an OSD-standard-based QoS Specification (iSCSI QoS Reference Implementation)
• Propose a QoS framework for OSD-based data access
– Support of multiple QoS classes
– Extension of OSD specification to incorporate QoS specification
• Intelligent storage scheduling and resource allocation to support QoS
• Investigate SAN QoS
• Integrate network QoS, SAN QoS with Storage QoS
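To make "multiple QoS classes" and storage-side scheduling concrete, here is a minimal sketch of a request queue that serves higher classes first. The slides do not specify the scheduling policy; strict priority with FIFO order inside a class is an assumption, and the class names in `QOS_PRIORITY` are hypothetical.

```python
import heapq
import itertools

# Assumed class names and ordering; a lower number is served first.
QOS_PRIORITY = {"guaranteed": 0, "statistical": 1, "best_effort": 2}


class QoSScheduler:
    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker keeps FIFO order within a class

    def submit(self, qos_class, request):
        entry = (QOS_PRIORITY[qos_class], next(self._arrival), request)
        heapq.heappush(self._heap, entry)

    def next_request(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


scheduler = QoSScheduler()
scheduler.submit("best_effort", "read object 7")
scheduler.submit("guaranteed", "read object 3")
scheduler.submit("guaranteed", "write object 9")
order = [scheduler.next_request() for _ in range(3)]
```

A real integration would also feed network and SAN conditions back into the scheduler, as the approach slide suggests; that feedback loop is omitted here.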
What is data provenance?
• Provenance is a relationship between data objects that explains how a particular object has been derived.
• A workflow of data processes usually explains this relationship.
• Using provenance, a user can trace the “workflow” that led to the aggregation of processes producing a particular object.
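Tracing such a workflow can be sketched as a walk over recorded derivations. The structure below is an assumption for illustration: `derivations` maps each derived object to the process and inputs that produced it, and the object and process names are made up for the example.

```python
def trace_provenance(obj, derivations, seen=None):
    """Return the processing steps that led to obj, earliest first.

    derivations maps a derived object to (process, [input objects]).
    Objects without an entry are treated as original source data.
    """
    seen = set() if seen is None else seen
    if obj in seen or obj not in derivations:
        return []
    seen.add(obj)
    process, inputs = derivations[obj]
    steps = []
    for source in inputs:
        steps.extend(trace_provenance(source, derivations, seen))
    steps.append((process, tuple(inputs), obj))
    return steps


# Hypothetical microarray workflow.
derivations = {
    "expression_matrix": ("quantization", ["scanned_image", "chip_data"]),
    "annotation": ("analysis", ["expression_matrix"]),
}
steps = trace_provenance("annotation", derivations)
```

The result lists the quantization step before the analysis step, which is the order in which the annotation was derived.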
How to solve data provenance in bioinformatics?
• Workflow of Functional Genomics
• Data-dependent relationships between data objects
• Analysis tools: take several input data with a set of parameter values to produce a version of an output data object
• Results and generated knowledge are presented as annotations and fed back to the system
A Novel Update Propagation Module for the Data Provenance Problem
• Objective: Prevent the following scenarios:
– Propagation of erroneous outcomes
– Unnecessary rerunning of time-consuming and heavy computations
• Approach: Integrate three major decision factors as a sequential hypothesis testing problem to form a unique decision module
– Sensitivity analysis (Variance-based Approach), for independent inputs and for correlated inputs
– Uncertainty analysis (Root-Sum-of-the-Squares Method)
– Complexity & Dependency
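The root-sum-of-squares step can be written out directly: for independent inputs, the combined uncertainty is the square root of the sum of squared, sensitivity-weighted input uncertainties. The threshold rule in `should_rerun` is a hypothetical stand-in; the slides do not describe how the sequential hypothesis test uses this value.

```python
import math


def rss_uncertainty(sensitivities, input_uncertainties):
    """Combine independent input uncertainties by root-sum-of-squares.

    sensitivities[i] is the change in output per unit change of input i.
    """
    if len(sensitivities) != len(input_uncertainties):
        raise ValueError("one sensitivity is needed per input")
    return math.sqrt(
        sum((s * u) ** 2 for s, u in zip(sensitivities, input_uncertainties))
    )


def should_rerun(sensitivities, input_uncertainties, tolerance):
    # Assumed rule: rerun downstream analysis only when the propagated
    # uncertainty exceeds what the consumer of the result tolerates.
    return rss_uncertainty(sensitivities, input_uncertainties) > tolerance


combined = rss_uncertainty([1.0, 1.0], [3.0, 4.0])
```

With unit sensitivities and input uncertainties of 3 and 4, the combined uncertainty is 5, so a tolerance of 4 would trigger a rerun and a tolerance of 6 would not.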
Search and Indexing: Many Types of Data
• Homeland Security
– Fingerprints
– Facial photographs
– Satellite imagery
– Intelligence reports
– Criminal records
• Medical
– X-rays
– Micro arrays
– Lab notes
– Publications
– Patient records
• Business/Personal
– Office files (.doc, .ppt, .pdf)
– Multimedia (images, video, and audio)
– Emails
• Special Cases
– Multiple types of data in one item: a patient or criminal record has both text and images
– Some data has additional semantic information and relationships
Data Formats
• Structured
– A strict format (schema) is known in advance
– Databases use structured data
– Good query performance on indexed fields
• Semi-structured
– No fixed or predefined schema
– Data and schema information are mixed together as “self-describing data” (e.g., XML)
• Unstructured
– Various text, HTML, images, video, audio, and other files arranged in no particular way
– Slow query performance due to exhaustive searches
Challenges
• The system needs to scale to massive levels
• There are some limitations to existing database technology
– If data structure changes often, databases need to deal with schema evolution and migration of all old data
– Dealing with multi-format data sometimes resorts to pointers to a separate file system
• Location of unstructured data often requires an expensive exhaustive search
– Even with pre-indexed data, an unexpected query can trigger an exhaustive search
Automatic Indexing by Query History
[Figure: clients, a metadata server, and intelligent storage devices connected by a high speed network; stored objects carry extended attributes attr1, attr2, attr3]
Clients
1. Find attr3 = X
2. Find attr3 = Y
3. Find attr3 = Z
Intelligent Storage Devices
1. Full scan
2. Full scan and build index on attr3
3. Index scan
Metadata Server
4. attr3 index here
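The three-query sequence in the figure can be sketched as a device that counts queries per attribute and builds an index once an attribute has been queried repeatedly. The class name, the `build_after` policy, and the sample objects are assumptions for illustration; notifying the metadata server (step 4) is left out.

```python
from collections import defaultdict


class IntelligentStorageDevice:
    def __init__(self, objects, build_after=2):
        self.objects = objects          # object id -> dict of extended attributes
        self.build_after = build_after  # assumed policy: index on the 2nd query
        self.query_counts = defaultdict(int)
        self.indexes = {}               # attribute name -> {value: [object ids]}

    def find(self, attr, value):
        """Return (matching object ids, access path used)."""
        self.query_counts[attr] += 1
        if attr in self.indexes:
            return sorted(self.indexes[attr].get(value, [])), "index scan"
        hits = sorted(
            oid for oid, eas in self.objects.items() if eas.get(attr) == value
        )
        if self.query_counts[attr] >= self.build_after:
            index = defaultdict(list)
            for oid, eas in self.objects.items():
                if attr in eas:
                    index[eas[attr]].append(oid)
            self.indexes[attr] = index
            return hits, "full scan and build index"
        return hits, "full scan"


device = IntelligentStorageDevice({
    "obj1": {"attr1": 1, "attr3": "X"},
    "obj2": {"attr1": 2, "attr3": "Y"},
    "obj3": {"attr1": 3, "attr3": "Z"},
})
first = device.find("attr3", "X")
second = device.find("attr3", "Y")
third = device.find("attr3", "Z")
```

The first query pays for a full scan, the second scans and builds the index as a side effect, and the third is answered from the index.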
Example Semantics: What is MeSH?
• MeSH stands for Medical Subject Headings
• A controlled thesaurus of medical terms from the National Library of Medicine
• Contains over 22,000 descriptors arranged hierarchically into 15 main categories
• http://www.nlm.nih.gov/mesh/2005/MeSHtree.html
MeSH-based Grouped Allocation
[Figure: clients, a metadata server, and intelligent storage devices connected by a high speed network]
Clients: Generate data; assign MeSH terms
Metadata Server: MeSH-aware
Intelligent Storage Devices: Allocate somewhat intelligently… by grouping related objects
Objects: Related objects inherit attributes
Proposed Approaches
• Take advantage of the hierarchical structure and semantics of MeSH (Medical Subject Headings) terms to assist search for similar or required data
• Design an adaptive data allocation on intelligent storage devices for fast retrieval based on past query results
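One way to exploit the MeSH hierarchy for allocation is to group objects whose tree numbers share a prefix and place each group on one device. This is a sketch under assumptions: the tree-number strings below are placeholders in the dotted MeSH style rather than real descriptors, and `group_key` and `allocate` are hypothetical helpers, not the consortium's allocation scheme.

```python
from collections import defaultdict


def group_key(tree_number, depth=2):
    """Keep the first `depth` levels of a dotted tree number, e.g. 'C04.100'."""
    return ".".join(tree_number.split(".")[:depth])


def allocate(objects, num_devices, depth=2):
    """Map object id -> device index, keeping objects with a shared prefix together.

    objects maps object id -> tree number of its assigned term.
    """
    groups = defaultdict(list)
    for object_id, tree_number in objects.items():
        groups[group_key(tree_number, depth)].append(object_id)
    placement = {}
    for device, key in enumerate(sorted(groups)):
        for object_id in groups[key]:
            placement[object_id] = device % num_devices
    return placement


# Placeholder tree numbers, for illustration only.
placement = allocate(
    {"scan_1": "C04.100.200", "scan_2": "C04.100.350", "note_1": "E01.370.350"},
    num_devices=2,
)
```

Objects under the same subtree end up on the same device, so a query for one of them is likely to find its relatives nearby. An adaptive version would also revise the grouping based on past query results.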
Focused Research
• Essential Technical Components
– Extended Attributes
– Security
• Applications
– Micro-Array Data (Functional Genomics)
– Supercomputing
• Exploiting the Attributes for Intelligence
– QoS
– Search and Indexing
– Data Provenance
• Tape Backup and Archive
– Parallel Archive
– Parallel Data Placement and File System for Tape Library
Creating OSD-Enabled Tape Library
[Figure: an I/O controller handles data transfer to multiple tape controllers, each serving a series of tapes]
Research Goals on Parallel Tape Libraries
• For high performance cluster environment, how to place objects across multiple tape drives to increase aggregate object retrieval rate from tape archives
• For thousands of distributed clients with various connection rates and no knowledge about tapes, how to improve backup/archive performance and provide an easy-to-use interface
• Determine the policies required for this HSM software to allow efficient storage of objects on tapes
High Performance Tape File System for Data Backup/Archive
Motivation
• Users’ laptops and desktops usually hold data that is invaluable to the business, but users are not experts in data backup
• To guard against natural disasters and terror attacks, critical data have to be archived to tapes and sent off-site
• Reducing the backup window to tape is critical for the safety of the valuable data
• A single data repository is easy to manage and avoids human errors.
System Desired Features
• Simplicity, convenience, low cost and high performance for data backup/archive
• Tape should be treated as a normal storage device and provide various data I/O interfaces to users
– cp, scp, sftp, http put, osd, etc.
• Direct backup/archive to the final destination (tapes) without involving disk caching
• Provides “infinite” storage and concurrent writing to users
The Big Picture
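[Figure: clients 1 to n connect over the internet/network using scp, sftp, and http. In user space, programs link against glibc and libfuse; in kernel space, VFS dispatches to ext3, nfs, and fuse. The tape file system is mounted at /mnt/tape.]
Note: Data streams are interleaved if necessary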
Block Level Data Interleaving
• In general, data interleaving is good for write and bad for read
• Restore from interleaved data is discussed in the next slide
[Figure: objects a1, b1, c1, a2, b2, c2 arrive from clients that are either sending or waiting; rate labels µ, µ/2, µ/4, µ/4, µ/3; a control point; blocks written to tape in the order a11 a12 b11 c11 a13 a13 b12 c12 a21 a22 b21 c21]
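The block order in the figure (two blocks of a, then one each of b and c) is consistent with a weighted round-robin over the client streams. The sketch below reproduces that pattern; the function name, the weights, and the block labels are assumptions for illustration, not the prototype's actual scheduler.

```python
def interleave(streams, weights):
    """Interleave blocks from several client streams onto one tape.

    streams maps client name -> list of blocks, in arrival order.
    weights maps client name -> blocks written per round (at least 1),
    e.g. proportional to each client's share of the drive bandwidth.
    """
    if any(weights[name] < 1 for name in streams):
        raise ValueError("every stream needs a weight of at least 1")
    queues = {name: list(blocks) for name, blocks in streams.items()}
    tape = []
    while any(queues.values()):
        for name in streams:
            take = weights[name]
            tape.extend(queues[name][:take])
            del queues[name][:take]
    return tape


tape = interleave(
    {"a": ["a11", "a12", "a13", "a14"], "b": ["b11", "b12"], "c": ["c11", "c12"]},
    {"a": 2, "b": 1, "c": 1},
)
```

A client sending at half the drive rate gets two blocks per round while the two quarter-rate clients get one each, which keeps the drive streaming during writes.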
Interleaved Data Restore
• Effective restore bandwidth for interleaved data objects depends on how objects are interleaved
• Requests for archived data usually come in batches, which leaves room for scheduling optimization
• Scheduling issues for restore performance
– Data backup scheduling issue: optimize data interleaving for future restore based on object access probabilities and relationships
– Data restore scheduling issue: for a given batch of data requests, find a near-optimal solution to minimize the object restore window
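A simple way to see why interleaving hurts reads: if the drive must stream through everything between the first and last wanted block, only a fraction of what it reads is useful. The model below assumes a sequential read with no skipping and a drive rate `mu`; both the model and the function name are illustrative assumptions.

```python
def effective_restore_bandwidth(tape, wanted, mu):
    """Useful bandwidth when restoring the `wanted` objects from an interleaved tape.

    tape is the sequence of object names, one entry per block in tape order.
    Assumes the drive reads sequentially from the first to the last wanted block.
    """
    positions = [i for i, obj in enumerate(tape) if obj in wanted]
    if not positions:
        return 0.0
    span = positions[-1] - positions[0] + 1
    return mu * len(positions) / span


layout = ["a", "a", "b", "c", "a", "a", "b", "c"]
only_a = effective_restore_bandwidth(layout, {"a"}, mu=120)
only_b = effective_restore_bandwidth(layout, {"b"}, mu=120)
whole_batch = effective_restore_bandwidth(layout, {"a", "b", "c"}, mu=120)
```

Restoring the three objects together as one batch uses the full drive rate, while restoring a single object alone wastes the blocks in between; this is the gap that batch scheduling and access-aware interleaving aim to close.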
Prototyping
• Put a PC in front of a physical tape library to provide the necessary processing power and memory buffer
– Write data to disk and tape simultaneously, interleaving data objects for tape writing
– Read data object from disk or tape
[Figure: Clients A, B, and C connect over the network to a PC, which attaches to the tape library via SCSI; together they form a Virtual Tape Library]
Data Transfer Model for d2d2t
[Figure: a disk array cache with central control feeds several tape drives, each labeled with rate µ; data paths and control paths are drawn separately; a tape library robot moves tape cartridges]
Object Allocation Scheme for d2d2t by considering the object relationship
[Figure: objects allocated in batches (1st batch: n(d-1), 2nd batch, 3rd batch, 4th batch) to tapes across n tape libraries, with near-line tapes]
Parallel Archiving in OSD Environment
• Supported by DOE contract
• OSDs have internal hierarchies of storage managed by independent commercial HSM
• Files can consist of multiple objects stored in parallel OSDs
– Synchronize the storage level of related objects, e.g., migrations of object 3 and object 4 should be coordinated
• Objective
– Leverage object-based parallel file system technology and commercial (non-parallel) HSM products
– Parallel archive integrated tightly into a globally shared scalable parallel file system
[Figure: clients, a metadata cluster, and object storage devices (OBSD1, OBSD2, OBSD3) connected by a network. Metadata is hashed across multiple machines, e.g. File /foo = OBSD1 object 4, OBSD3 object 3; owner = me; date = today; etc. OBSD1 holds object 4 and OBSD3 holds object 3.]
Classical Hierarchical Storage Management Approach
• Application calls for IO; DMAPI layer intercepts IO call; if data has been migrated to tape, Migration Daemon is notified to recall data
• If file system is too full, Migration Daemon is notified to migrate data out
[Figure: file system client machine split into user and kernel parts, showing the Application, the Migration Daemon, DMAPI, VFS, and File System X]
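The recall-on-access and migrate-when-full behavior can be sketched with two in-memory tiers. This is a toy model, not DMAPI: `HsmFileSystem`, the `high_watermark` threshold, and the least-recently-used victim choice are assumptions made for the example.

```python
from collections import OrderedDict


class HsmFileSystem:
    def __init__(self, capacity, high_watermark=0.8):
        self.capacity = capacity              # disk capacity in bytes
        self.high_watermark = high_watermark  # assumed "too full" threshold
        self.disk = OrderedDict()             # name -> data, least recently used first
        self.tape = {}                        # name -> data migrated out

    def _used(self):
        return sum(len(data) for data in self.disk.values())

    def _migrate_if_full(self, keep):
        # Migration daemon: push least recently used files to tape until below threshold.
        while self._used() > self.high_watermark * self.capacity:
            victim = next((name for name in self.disk if name != keep), None)
            if victim is None:
                break
            self.tape[victim] = self.disk.pop(victim)

    def write(self, name, data):
        self.tape.pop(name, None)
        self.disk[name] = data
        self.disk.move_to_end(name)
        self._migrate_if_full(keep=name)

    def read(self, name):
        # The intercept point: a read of migrated data triggers a recall first.
        if name not in self.disk:
            self.disk[name] = self.tape.pop(name)
            self._migrate_if_full(keep=name)
        self.disk.move_to_end(name)
        return self.disk[name]


fs = HsmFileSystem(capacity=10)
fs.write("a", b"1234")
fs.write("b", b"1234")
fs.write("c", b"12")        # crosses the threshold, so "a" is migrated out
recalled = fs.read("a")     # recalled from tape, which in turn pushes "b" out
```

The application only calls `read` and `write`; migration and recall happen underneath, which is the transparency the DMAPI intercept provides.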
Coordinating Parallel HSMs
• Multiple instances of commercial HSM solutions used in parallel means parallel migration and recall
• Migration coordinator: query Metadata cluster, instruct HSMs on migration, update Metadata cluster about new location
[Figure: clients, a metadata cluster, a migration coordinator cluster, and object storage devices 1 to 3 (holding objects 4 and 3) connected by a network. Metadata is hashed across multiple machines, e.g. File /foo = OBSD1 object 4, OBSD3 object 3; owner = me; date = today; archive attributes; etc. Each OSD runs a Migration Agent with partial DMAPI functions.]
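The coordinator's three steps (query metadata, instruct each HSM, update metadata) can be sketched as below. The classes and the dictionary layout of the metadata are hypothetical; real HSM products and the metadata cluster protocol are not modeled.

```python
class ObjectStorageDevice:
    """Stands in for an OSD whose local HSM moves objects between disk and tape."""

    def __init__(self, name):
        self.name = name
        self.level = {}  # object id -> "disk" or "tape"

    def store(self, object_id):
        self.level[object_id] = "disk"

    def migrate(self, object_id, level):
        self.level[object_id] = level


class MigrationCoordinator:
    def __init__(self, metadata, devices):
        self.metadata = metadata  # path -> {"objects": [(device, object id)], "level": str}
        self.devices = devices    # device name -> ObjectStorageDevice

    def migrate_file(self, path, level):
        entry = self.metadata[path]                      # 1. query the metadata cluster
        for device_name, object_id in entry["objects"]:  # 2. instruct each HSM
            self.devices[device_name].migrate(object_id, level)
        entry["level"] = level                           # 3. record the new location


devices = {name: ObjectStorageDevice(name) for name in ("OBSD1", "OBSD2", "OBSD3")}
devices["OBSD1"].store(4)
devices["OBSD3"].store(3)
metadata = {"/foo": {"objects": [("OBSD1", 4), ("OBSD3", 3)], "level": "disk"}}

coordinator = MigrationCoordinator(metadata, devices)
coordinator.migrate_file("/foo", "tape")
```

Both objects of /foo move to tape together, so a later recall does not find one half of the file on disk and the other half archived.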
OSD Internal Architecture
[Figure: layers inside an OSD: OSD Interface; Migration Agent; HSM; Disk File System and Tape File System/DB; Disk Driver and Tape Library Driver; Networking driver and NIC]
Potential POSIX standard supported by all HSM vendors
Other Projects
• Active Data Object
• ONR Storage and Networking Planning Tool
Active Objects
• Problem: How can the OSD environment be extended to be even more flexible?
• Approach: Allow objects to include executable methods in addition to the data, attributes and metadata. These methods can be invoked when a pre-set condition is met.
• Uniqueness: Data objects are truly autonomic. Intelligent storage devices have to be designed to provide such a capability.
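The idea of methods that fire when a pre-set condition is met can be sketched as follows. `ActiveObject`, the `attach` call, and the grep-like method are hypothetical names for illustration; in the prototype such methods would run inside the storage device rather than in a client process.

```python
class ActiveObject:
    def __init__(self, data, attributes=None):
        self.data = data
        self.attributes = dict(attributes or {})
        self._methods = []  # (condition, method) pairs carried by the object

    def attach(self, condition, method):
        """Attach a method that runs whenever `condition(obj)` becomes true."""
        self._methods.append((condition, method))

    def set_attribute(self, name, value):
        self.attributes[name] = value
        # After every attribute change, invoke the methods whose condition now holds.
        return [method(self) for condition, method in self._methods if condition(self)]


def grep_error_lines(obj):
    """A grep-like method that runs next to the data instead of shipping it out."""
    return [line for line in obj.data.splitlines() if "ERROR" in line]


log = ActiveObject("ok\nERROR disk 3\nok\nERROR fan", {"sealed": False})
log.attach(lambda obj: obj.attributes.get("sealed") is True, grep_error_lines)
results = log.set_attribute("sealed", True)
```

Sealing the object triggers the attached method, and only the matching lines leave the object; unsealing it again triggers nothing.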
Our Implementation
[Figure: Lustre-based architecture with four boxes.
File Client: Application (distributed grep), Method API, Lustre client filesystem, OBD Client, MDS Client, Lock Client, OS bypass file I/O, Meta-data WB cache, Run-time, Networking, Recovery.
Meta-Data Server: MDS Server, Lock Server, Lock Client, MDS Backend, Load Balancing, Networking, Recovery.
Object Storage Target (two shown): Object-Based Disk (OBD) server, Lock server, Object-Based Disk (OBD), Methods Run-time, Networking, Recovery.]
System Configuration
- 1 file client: Blade server node (Linux)
- 2 or more OST: Blade node (Linux)
- 1 MDS: Blade node (Linux)
- Integrates method API into Lustre file system
- Develops new distributed grep program
Conclusions
• Intelligent Storage has a long way to go
• Many interesting and promising issues
• Industrial collaboration is extremely important
• Definitely need to demonstrate the advantages of OSD with real applications
Thanks! Questions?