What is it? Hierarchical storage software developed in collaboration with five US department of Energy Labs since 1992 Allows storage management of 100s.

Dec 21, 2015

Irene Stevens

What is it?

• Hierarchical storage software developed in collaboration with five US Department of Energy labs since 1992
• Allows storage management of 100s of billions of files spanning 100s of petabytes for the HPC community
• Licensed and supported by IBM

Why?

• Reduced cost
• Scalability
• Power usage
• Reliability
• Speed
• Long-term storage

How?

• Distributed cluster architecture
• Metadata engine: IBM DB2
• Multiple storage classes
• Striped disks and tapes

Who Uses it?

• NCSA BlueWaters
• Argonne National Lab
• Indiana University

Disk and Tape

• Hierarchical storage management (HSM)
  • Frequently used data cached on disk
  • Archival data on tape
• Automatic migration (mirror offsite)
• Scalable: any instance of HPSS can access many tapes at the same time to provide parallel transfer rates

Tape
• Pros:
  • Lower cost
  • No power usage
  • Reliable
• Cons:
  • High latency

Disk
• Pros:
  • Low latency
• Cons:
  • Power usage
  • Reliability
  • Higher cost

Standard POSIX interface

• Users can access files using several methods:
  • FTP – standard FTP from a mover
  • PFTP – parallel transfer of data from multiple movers
  • Client API
  • HSI – transfer files to and from HPSS (put/get)
  • HTAR – archive multiple files together and transfer the archive to HPSS
  • VFS Client
  • XFS

Components

• Core Server
  • Translates human-readable names into HPSS object identifiers
  • Translates virtual volumes into physical volumes
  • Allows parallel I/O to the resources
  • Schedules mounting/dismounting of media
• Migration/Purge Server
  • Manages migration and purge policies
• Disk Migration/Purge
  • Once files have been moved down the hierarchy, they are purged from disk
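
The migrate-then-purge behavior described above can be illustrated with a minimal sketch. This is not HPSS code: the `FileEntry` and `DiskCache` classes, the 90%/70% thresholds, and the least-recently-used ordering are all hypothetical choices made for illustration. The only idea taken from the slide is that a file may be purged from disk only after a copy exists lower in the hierarchy.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    name: str
    size: int                # bytes
    last_access: float       # timestamp of the most recent access
    on_tape: bool = False    # True once a copy exists lower in the hierarchy


@dataclass
class DiskCache:
    """Toy model of a disk tier above a tape tier (hypothetical, not HPSS)."""
    capacity: int
    purge_start: float = 0.9   # assumed: begin purging above 90% full
    purge_stop: float = 0.7    # assumed: stop purging at or below 70% full
    files: dict = field(default_factory=dict)

    def used(self) -> int:
        return sum(f.size for f in self.files.values())

    def migrate(self) -> list:
        """Copy every not-yet-migrated file down to tape; the disk copy stays."""
        migrated = []
        for f in self.files.values():
            if not f.on_tape:
                f.on_tape = True
                migrated.append(f.name)
        return migrated

    def purge(self) -> list:
        """Remove least recently used files that already have a tape copy."""
        purged = []
        if self.used() <= self.purge_start * self.capacity:
            return purged
        candidates = sorted(
            (f for f in self.files.values() if f.on_tape),
            key=lambda f: f.last_access,
        )
        for f in candidates:
            if self.used() <= self.purge_stop * self.capacity:
                break
            del self.files[f.name]
            purged.append(f.name)
        return purged
```

Calling `purge()` before `migrate()` removes nothing, because no file has a tape copy yet; that ordering constraint is the point of the sketch.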

Components

• Tape File Migration
  • Makes additional copies for multi-site setups
• Tape Volume Migration
  • Moves data between tapes to fill tapes optimally
• Gatekeeper (GK)
  • Account validation service
  • Site authorization, etc.
• Location Server (LS)
  • Allows clients to determine which location they should contact
  • Improves speed in multi-site setups
• Physical Volume Library (PVL)
  • Manages all HPSS physical volumes
  • Mounting and dismounting (=> PVR)
  • Atomic mounts for sets of cartridges for parallel access to data
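
The idea behind atomic mounts is that a striped file needs all of its cartridges mounted together; mounting only some of them would tie up drives without allowing the transfer to start. A minimal all-or-nothing allocator sketches this. The `DrivePool` class and its methods are hypothetical and do not reflect the PVL's actual interface.

```python
import threading


class DrivePool:
    """Toy all-or-nothing tape drive allocator (hypothetical, not the PVL API)."""

    def __init__(self, drive_count: int):
        self._free = drive_count
        self._lock = threading.Lock()
        self.mounted = {}   # cartridge label -> True while mounted

    @property
    def free(self) -> int:
        return self._free

    def mount_set(self, cartridges: list) -> bool:
        """Mount every cartridge in the set, or none of them."""
        with self._lock:
            if len(cartridges) > self._free:
                return False          # not enough drives: state is left untouched
            self._free -= len(cartridges)
            for label in cartridges:
                self.mounted[label] = True
            return True

    def dismount_set(self, cartridges: list) -> None:
        """Release the drives held by the given cartridges."""
        with self._lock:
            for label in cartridges:
                if self.mounted.pop(label, None):
                    self._free += 1
```

With four drives, a request for three cartridges succeeds and a following request for two is refused outright rather than being half-satisfied.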

Components

• Physical Volume Repository (PVR)
  • Interface to request cartridge mounts and dismounts
  • One-to-one with tape libraries
• Mover Servers
  • Handle actual data transfers
  • Communicate with the Core Server to determine source and destination
  • Retry moves on failure
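
The retry behavior of a mover can be sketched generically. The slide only says that moves are retried on failure; the attempt limit and the exponential backoff below are assumptions added for illustration, and `move_with_retry` is a hypothetical helper, not an HPSS function.

```python
import time


def move_with_retry(transfer, attempts: int = 3, base_delay: float = 0.0) -> int:
    """Call transfer() until it succeeds or the attempts run out.

    `transfer` is any zero-argument callable that raises OSError on failure.
    Returns the number of attempts that were needed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            transfer()
            return attempt
        except OSError:
            if attempt == attempts:
                raise                                     # give up: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))   # assumed exponential backoff
```

A transfer that fails twice and then succeeds returns 3; a transfer that never succeeds re-raises the final `OSError` so the caller can report it.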

Scalability

• Horizontally scales:
  • Add more movers
  • Add more tape drives
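
Why adding movers helps can be shown with a small striping sketch: data is cut into fixed-size blocks and dealt round-robin across lanes, so each additional lane carries a smaller share of the file. The `stripe` and `unstripe` functions are hypothetical illustrations of round-robin striping in general, not HPSS's actual layout.

```python
def stripe(data: bytes, movers: int, block_size: int) -> list:
    """Distribute fixed-size blocks of `data` round-robin over `movers` lanes."""
    lanes = [[] for _ in range(movers)]
    for offset in range(0, len(data), block_size):
        lane_index = (offset // block_size) % movers
        lanes[lane_index].append(data[offset:offset + block_size])
    return lanes


def unstripe(lanes: list) -> bytes:
    """Reassemble the original byte stream from round-robin lanes."""
    out = []
    depth = max((len(lane) for lane in lanes), default=0)
    for row in range(depth):
        for lane in lanes:
            if row < len(lane):
                out.append(lane[row])
    return b"".join(out)
```

Striping 1024 bytes in 64-byte blocks gives four blocks per lane with four movers and two blocks per lane with eight, which is the horizontal scaling the slide refers to.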

BlueWaters

• Software “RAIT” is being developed jointly by IBM and NCSA
  • Adds 8+2 reliability to HPSS striping
• 40 GbE network
• 100,000 tape cartridges
• 38.5 TB per hour
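
The figures above can be turned into a quick back-of-the-envelope calculation. This sketch reads "8+2" as eight data stripes plus two parity stripes and treats 38.5 TB per hour as an aggregate transfer rate in decimal units; both readings are assumptions, since the slide does not spell them out.

```python
DATA_STRIPES = 8       # from the slide: "8+2"
PARITY_STRIPES = 2
TB_PER_HOUR = 38.5     # from the slide

total_stripes = DATA_STRIPES + PARITY_STRIPES
usable_fraction = DATA_STRIPES / total_stripes   # share of tape capacity holding user data
overhead = PARITY_STRIPES / DATA_STRIPES         # extra tape written per unit of user data

# Assumes decimal units: 1 TB = 10**12 bytes, 1 GB = 10**9 bytes.
gb_per_second = TB_PER_HOUR * 1e12 / 3600 / 1e9

print(f"usable fraction: {usable_fraction:.0%}")
print(f"parity overhead: {overhead:.0%}")
print(f"aggregate rate:  {gb_per_second:.2f} GB/s")
```

Under these assumptions, 80% of the tape capacity holds user data, parity adds 25% on top of the data written, and the quoted rate works out to roughly 10.7 GB/s.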

Indiana University

• Multi-site setup
• Centralized archival storage for all campus clusters