dCache -  delegated storage solutions Tigran Mkrtchyan for dCache Team ISGC 2016, Taiwan

Feb 21, 2020


dCache - delegated storage solutions

Tigran Mkrtchyan for dCache Team

ISGC 2016, Taiwan


  Delegated Storage | Tigran Mkrtchyan  |  3/15/16  |  Page 2

dCache on one slide

[Architecture diagram. Labels: Door(s) (clients' entry point) for dcap, ftp, http, nfs; Pool Manager (request scheduler); Name Space (metadata server); DBMS; Pools (data servers); separate JVMs connected by a message passing layer.]


Usage around the World

● ~80 installations
● > 50% of WLCG storage
● Biggest: 22 PB
● Typical: ~100x nodes
● Typical: ~10^7 files


dCache as Storage System

● Provides a single-rooted namespace.
● Metadata (namespace) and data locations are independent.
● Aggregates multiple storage nodes into a single storage system.
● Manages data movement, replication, integrity.
● Provides data migration between multiple tiers of storage (DISK, SSD, TAPE).
● Uniquely handles different authentication mechanisms, like X.509, Kerberos, login+password, auth tokens.
● Provides access to the data via a variety of access protocols (WebDAV, NFSv4.1/pNFS, xxxFTP, DCAP, Xrootd).
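
The independence of metadata and data locations can be pictured with a minimal sketch. Everything in it (the `Namespace` class, the pool names, the ID format) is hypothetical and only illustrates the idea; it is not dCache's actual data model or API.

```python
import itertools


class Namespace:
    """Toy single-rooted namespace: paths map to file IDs, IDs map to pools."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.path_to_id = {}    # metadata: logical path -> file ID
        self.locations = {}     # data locations: file ID -> set of pool names

    def create(self, path, pool):
        file_id = f"{next(self._ids):08X}"
        self.path_to_id[path] = file_id
        self.locations[file_id] = {pool}
        return file_id

    def rename(self, old, new):
        # Pure metadata operation: no data moves between pools.
        self.path_to_id[new] = self.path_to_id.pop(old)

    def replicate(self, path, pool):
        # Pure location operation: the logical path is unchanged.
        self.locations[self.path_to_id[path]].add(pool)

    def pools_for(self, path):
        return sorted(self.locations[self.path_to_id[path]])


ns = Namespace()
ns.create("/data/run1/file.root", "pool-a")
ns.replicate("/data/run1/file.root", "pool-b")
ns.rename("/data/run1/file.root", "/archive/file.root")
print(ns.pools_for("/archive/file.root"))
```

Renaming touches only the path table and replication touches only the location table, which is the property the slide describes.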



dCache's data management

● Automatic migration
  ● Tape/disk/disk
  ● HotSpot detection
  ● Permanent migration jobs
  ● Checksumming on transfer
● Manual migration
● Data replication
  ● multiple copies
  ● same host/rack/site policy
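
Checksumming on transfer can be sketched as computing the checksum incrementally while the bytes are copied, so no second read pass is needed. The sketch uses Adler-32 from Python's standard `zlib` module purely as an example algorithm; the function name and chunk size are assumptions for illustration, not dCache code.

```python
import io
import zlib


def copy_with_checksum(src, dst, chunk_size=64 * 1024):
    """Copy src to dst, returning (bytes_copied, adler32) computed on the fly."""
    checksum = zlib.adler32(b"")  # initial value of the running checksum
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        checksum = zlib.adler32(chunk, checksum)
        copied += len(chunk)
    return copied, checksum


payload = b"dCache" * 50_000
source, target = io.BytesIO(payload), io.BytesIO()
size, transfer_sum = copy_with_checksum(source, target)

# The receiver can verify the stored copy against the checksum from the transfer.
assert zlib.adler32(target.getvalue()) == transfer_sum
print(size, f"{transfer_sum:08x}")
```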


Software-defined storage (or did you listen to Patrick carefully?)

● Abstraction of logical storage services and capabilities from the underlying physical storage systems

● Automation with policy-driven storage provisioning, with service-level agreements replacing technology details.

● Commodity hardware with storage logic abstracted into a software layer.


Storage in dCache (what we have)

● dCache provides a high-level service
● Data replication and management are core dCache services
● Each pool is attached to its own disks

[Diagram labels: dCache services (Namespace, PoolSelection, Doors, Authn/Authz); Replication/Migration; five Pool service boxes, each paired with a Block device.]


Storage in dCache (outsourcing, phase 1)

● dCache provides a high-level service
● Data replication and management are core dCache services
● Each pool has its own 'partition' on shared storage
● Each 'partition' is attached to its own block device

[Diagram labels: dCache services (Namespace, PoolSelection, Doors, Authn/Authz); Replication/Migration; five Pool service boxes, each paired with a Block device.]


Phase 1 (changing IO layer)

● Single data server owns the data
● Single data server manages data
  ● flush to tape
  ● restore from tape
  ● removal
  ● garbage collection
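
One way to read "changing the IO layer" is that the pool keeps its management duties while the code that reads and writes bytes is hidden behind a small interface. The sketch below is hypothetical: `PoolBackend`, `LocalDiskBackend`, and `ObjectStoreBackend` are illustrative names, and the object store is mimicked by an in-memory dictionary rather than a real Ceph client.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class PoolBackend(ABC):
    """Hypothetical I/O interface a pool would use for its replicas."""

    @abstractmethod
    def write(self, file_id: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, file_id: str) -> bytes: ...

    @abstractmethod
    def remove(self, file_id: str) -> None: ...


class LocalDiskBackend(PoolBackend):
    """Replicas are plain files on the pool's own disk."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, file_id):
        return os.path.join(self.directory, file_id)

    def write(self, file_id, data):
        with open(self._path(file_id), "wb") as f:
            f.write(data)

    def read(self, file_id):
        with open(self._path(file_id), "rb") as f:
            return f.read()

    def remove(self, file_id):
        os.remove(self._path(file_id))


class ObjectStoreBackend(PoolBackend):
    """Replicas are objects in the pool's 'partition' (a dict stands in here)."""

    def __init__(self):
        self.objects = {}

    def write(self, file_id, data):
        self.objects[file_id] = data

    def read(self, file_id):
        return self.objects[file_id]

    def remove(self, file_id):
        del self.objects[file_id]


def round_trip(backend):
    # The pool logic is identical regardless of where the bytes live.
    backend.write("0000A1", b"payload")
    data = backend.read("0000A1")
    backend.remove("0000A1")
    return data


with tempfile.TemporaryDirectory() as tmp:
    results = [round_trip(LocalDiskBackend(tmp)), round_trip(ObjectStoreBackend())]
print(results)
```

In this reading, flush, restore, removal, and garbage collection stay with the single owning pool; only the storage calls underneath change.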


Storage in dCache (outsourcing, phase 2)

● dCache provides a high-level service
● All pools see all 'partitions' on shared storage
● Any pool can deliver data from any partition
● The object store takes care of replication

[Diagram labels: dCache services (Namespace, PoolSelection, Doors, Authn/Authz); five Pool service boxes; five Block device boxes; Replication/Migration.]


Phase 2 (Changing core philosophy)

● All data is managed by 'quorum'
  ● group decision on who interacts with tape
  ● group decision on who removes a file and when
● File location is always 'known'
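
The slide does not say how the 'quorum' would reach a decision. As one possible reading, the sketch below uses a strict-majority vote among pools; the function name, the vote format, and the pool names are all assumptions made for illustration, not a description of dCache's design.

```python
from collections import Counter


def quorum_decision(votes, members):
    """Return the winning choice if a strict majority of members agrees, else None."""
    if not votes:
        return None
    choice, count = Counter(votes.values()).most_common(1)[0]
    return choice if count > len(members) // 2 else None


pools = ["pool-a", "pool-b", "pool-c"]

# Each pool proposes which pool should flush the file to tape.
flush_votes = {"pool-a": "pool-b", "pool-b": "pool-b", "pool-c": "pool-a"}
print(quorum_decision(flush_votes, pools))

# No majority: nobody touches the tape until the group agrees.
split_votes = {"pool-a": "pool-a", "pool-b": "pool-b", "pool-c": "pool-c"}
print(quorum_decision(split_votes, pools))
```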


Storage in dCache (outsourcing, phase 3)

● dCache provides a high-level service
● dCache can move data between regular and object-store (OS) pools

[Diagram labels: dCache services (Namespace, PoolSelection, Doors, Authn/Authz); Replication/Migration (shown three times); five Pool service boxes; five Block device boxes.]


Phase 3 (mixed environment)

● Mixed setup
● Islands of storage servers
● Replication and data movement between islands


Why CEPH

● No specific hardware support
● Runs on commodity hardware
● Scalable to exabytes of data
● Deployed at sites as storage system for OpenStack
● Provides Object, Block and File interfaces


And not only CEPH

● Other object stores can be adopted
  ● DDN WOS
  ● Swift/S3/CDMI
● Cluster file systems (as a side effect)
  ● Lustre
  ● GPFS
  ● GlusterFS


CEPH (extremely simplified)

● OSD ~ a physical disk
● CRUSH - determines how to store and retrieve data by computing data storage locations.
● RADOS - distributes objects across the storage cluster and replicates objects
● librados - provides low-level access to the RADOS service.

[Diagram labels: OSD (three boxes), CRUSH, RADOS, LIBRADOS, APP, RBD, CEPH FS.]
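
The key idea attributed to CRUSH above, computing a location instead of looking it up, can be illustrated with a much simpler technique. The sketch below uses rendezvous (highest-random-weight) hashing, which is not the CRUSH algorithm itself; the function name, the OSD names, and the replica count are assumptions for illustration.

```python
import hashlib


def place(object_name, osds, replicas=2):
    """Pick `replicas` OSDs for an object by ranking hash(object, osd) scores."""
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osds, key=score, reverse=True)[:replicas]


osds = ["osd.0", "osd.1", "osd.2", "osd.3"]
first = place("file-0000A1", osds)
second = place("file-0000A1", osds)   # any client computes the same answer
print(first, first == second)
```

Because every client derives the same placement from the object name and the list of OSDs, no central location table has to be consulted on the data path.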


Current work

● Functional prototype only
● Focus on stability first
● RBD based
  ● striping
  ● alterable content
● Object interface will be evaluated as well
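
Striping in an RBD-based setup means that a file stored as a block image is split across fixed-size objects. The sketch below shows the offset arithmetic only, assuming a 4 MiB object size and the simplest layout (one object after another); the function name is hypothetical and Ceph's real object naming is not reproduced.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size: 4 MiB


def extents(offset, length, object_size=OBJECT_SIZE):
    """Split a byte range into (object_index, offset_in_object, length) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        index, within = divmod(offset, object_size)
        piece = min(object_size - within, end - offset)
        pieces.append((index, within, piece))
        offset += piece
    return pieces


# A 6 MiB read starting 3 MiB into the image touches three objects.
MiB = 1024 * 1024
print(extents(3 * MiB, 6 * MiB))
```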


Roadmap

● Phase 1
  ● running prototype is available today
  ● some sites volunteer to help with testing
  ● cleaning up to make generally available
● Phase 2/3
  ● depends on user demand
  ● operational overhead, if any
  ● support overhead, if any


Summary

● dCache is an in-demand storage system.
● New technology provides the required building blocks.
● Combining both lets us concentrate on the missing parts.
● A working prototype is available for testing.


Links

● https://www.dcache.org/
● https://en.wikipedia.org/wiki/Software-defined_storage
● http://ceph.com/