CSCS STORAGE INFRASTRUCTURE CSCS HPS Storage System Engineer Stefano Claudio Gorini

CSCS STORAGE INFRASTRUCTURE - HPC Advisory …hpcadvisorycouncil.com/events/2012/Switzerland-Workshop/... · CSCS STORAGE INFRASTRUCTURE CSCS HPS Storage System Engineer Stefano Claudio

May 30, 2018


CSCS STORAGE INFRASTRUCTURE

CSCS HPS

Storage System Engineer

Stefano Claudio Gorini


CSCS GPFS FS

/users & /apps
• Small size
• Quota by user
• Kept as long as the user exists at CSCS + 6 months
• Normal bandwidth
• Backed up
• 100 GB per user

/project
• Very large size
• Quota by group
• Kept for the duration of the project + 6 months
• High bandwidth
• Backed up
• Capacity requested and justified in a project proposal

/store
• Extreme size
• Quota by consortium
• Kept as contractually agreed
• High bandwidth (if file on disk)
• HSM
• Capacity by contract; either matching funds or fully paid by customer


PROJECT FS – HW

[Diagram: PROJECT FS hardware]
• Data disks: ~2.4 PB (~1.4 PB + ~1 PB) on 2 x 480 SATA 2 TB disks and 2 x 420 SATA 2 TB disks
• Metadata disks: ~1 TB on SSD card
• TSM Storage Agent
• Servers: BERNINA01, BERNINA02, BERNINA03, BERNINA04, BERNINA05 (shown twice), BERNINA13, BERNINA14, BERNINA15, BERNINA16, BERNINA22, BERNINA23, BERNINA25
• GLOBAL 112-113, GLOBAL 116-117, GLOBAL 118-119, GLOBAL 123-124


HOME & APPS FS – HW

[Diagram: HOME & APPS FS hardware]
• Data disks: 60 of 120 SATA 2 TB disks
• Metadata disks: 4 of 64 FC 500 GB disks
• TSM Storage Agent
• Servers: BERNINA10, BERNINA11, BERNINA24
• GLOBAL 114-115


STORE FS – HW

[Diagram: STORE FS hardware]
• Data disks: ~2.1 PB on 3 x 300 SATA 3 TB disks
• Metadata disks: ~500 GB on SSD card
• TSM Storage Agent
• Servers: ADULA05, ADULA06, MEDEL01 to MEDEL14
• RAMSAN1


GPFS - CNFS

~# mmremotefs show all
Local Name  Remote Name  Cluster name        Mount Point  Mount Options  Automount  Drive  Priority
global      global       global.cscs.ch      /global      rw             yes        -      0
apps        apps         globalhome.cscs.ch  /apps        rw             yes        -      0
users       users        globalhome.cscs.ch  /users       rw             yes        -      0
store       archive      store.cscs.ch       /store       rw             yes        -      0

/global *.cscs.ch(rw,async,no_root_squash)
/users *.cscs.ch(rw,async,no_root_squash)
/apps *.cscs.ch(rw,async,no_root_squash)
/store *.cscs.ch(rw,async,no_root_squash)

Alias used by CNFS:
• nfs01.cscs.ch
• nfs02.cscs.ch
• nfs03.cscs.ch
• nfs04.cscs.ch

Servers: BERNINA20, BERNINA07, BERNINA21, BERNINA08
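The four export entries above appear to follow the usual exports layout: a path, then a client pattern with comma-separated options in parentheses. A minimal sketch, assuming that layout, of how such an entry could be split into its parts. The helper name `parse_export` is hypothetical and not part of GPFS, CNFS, or any NFS tool, and it only handles single-client entries like the ones shown.

```python
import re

# Matches "<path> <client>(<opt>,<opt>,...)" as in the export lines on the slide.
EXPORT_RE = re.compile(r"^(?P<path>\S+)\s+(?P<client>[^\s(]+)\((?P<opts>[^)]*)\)$")


def parse_export(line):
    """Split one export entry into (path, client pattern, list of options).

    Hypothetical helper for illustration; handles only single-client entries.
    """
    match = EXPORT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a single-client export entry: {line!r}")
    options = [opt for opt in match.group("opts").split(",") if opt]
    return match.group("path"), match.group("client"), options


if __name__ == "__main__":
    for entry in (
        "/global *.cscs.ch(rw,async,no_root_squash)",
        "/store *.cscs.ch(rw,async,no_root_squash)",
    ):
        print(parse_export(entry))
```

Parsing the entries this way makes it easy to check that all four file systems are exported with the same client pattern and options.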


QFS/SAM-FS to GPFS TSM Migration

[Diagram: QFS/SAM-FS migrated to GPFS + TSM/HSM]


QFS/SAM-FS to GPFS TSM Migration

1. Snapshot and migration of the metadata
2. Production stays on the old system
3. Bulk migration of data from the snapshot:
• Read tar file from SAM-FS/QFS
• Transfer data over the network using a parallel copy tool
• Verify data integrity after the network transfer using checksums taken on the old system before the start of the migration
• Untar to the new GPFS location
4. Transition to production after final synchronization of data
5. After that, clean GPFS/TSM in production, without access to the old tapes

[Diagram labels: SAM-FS, GPFS, HMK tool]
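The integrity check in step 3 compares checksums recorded on the old system with the files that arrive on the new one. A minimal sketch of that idea follows. The slides do not say which checksum algorithm or tool was used, so SHA-256 and the helper names `file_checksum` and `verify_transfer` are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(expected, copied_root):
    """Compare checksums recorded on the old system with the transferred files.

    expected: dict mapping relative path -> checksum taken before migration.
    Returns the sorted list of relative paths that are missing or differ.
    """
    failures = []
    for rel_path, old_sum in expected.items():
        target = Path(copied_root) / rel_path
        if not target.is_file() or file_checksum(target) != old_sum:
            failures.append(rel_path)
    return sorted(failures)
```

Recording the checksums before the migration starts, as the slide describes, means the check does not need to read the source tapes a second time.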


QFS/SAM-FS to GPFS TSM Migration

MIGRATED FROM 02/09/2011 TO 10/12/2011:
• ~26M files
• ~650 TB
• Average speed ~7 TB/day

PERFORMANCE WAS DRIVEN BY DEVICE SPEED:
• Tape drive speed (T10000 max. 100 MB/s)
• Network speed (3 Gb/s due to a PCI-X card)
• GPFS performance (the file system used to migrate data delivered 4 GB/s)
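The average rate can be cross-checked against the dates and volume on the slide, and the three device limits can be put on the same TB/day scale. The sketch below assumes the dates are written DD/MM/YYYY (2 September to 10 December 2011) and uses decimal units (1 TB = 1,000,000 MB); both are assumptions, not statements from the slides.

```python
from datetime import date

# Assumption: the slide's dates are DD/MM/YYYY.
start = date(2011, 9, 2)
end = date(2011, 12, 10)
migrated_tb = 650  # ~650 TB from the slide

days = (end - start).days
average_tb_per_day = migrated_tb / days

SECONDS_PER_DAY = 86_400


def tb_per_day(megabytes_per_second):
    """Convert a sustained rate in MB/s to TB/day (decimal units)."""
    return megabytes_per_second * SECONDS_PER_DAY / 1_000_000


limits = {
    "one T10000 tape drive (100 MB/s)": tb_per_day(100),
    "network (3 Gb/s = 375 MB/s)": tb_per_day(3000 / 8),
    "GPFS (4 GB/s)": tb_per_day(4000),
}

if __name__ == "__main__":
    print(f"{days} days, average {average_tb_per_day:.2f} TB/day")
    for name, rate in limits.items():
        print(f"{name}: {rate:.1f} TB/day")
```

Under these assumptions the period is 99 days and the average is about 6.6 TB/day, consistent with the ~7 TB/day quoted. A single tape drive at its maximum rate corresponds to about 8.6 TB/day, the network to about 32 TB/day, and GPFS to about 346 TB/day.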


DATA TOPOLOGY

[Bar chart: file size range summarized in GB (Y) as a function of last access time (X)]
• File size ranges (Y): < 1 MB, 1 MB to 100 MB, 100 MB to 1 GB, 1 GB to 10 GB, > 10 GB
• Last access ranges (X): < 1 month, 1 to 3 months, 3 months to 1 year, > 1 year
• Axis: 0 to 140,000 GB
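A summary like the one in this chart can be produced by bucketing every file by size and by time since last access, then summing the sizes per bucket pair. The sketch below is one way to do that, not the method used at CSCS. The bucket edges follow the chart legend, while treating a month as 30 days and a year as 365 days, the decimal units, and the names `bucket` and `summarize` are assumptions.

```python
import time
from collections import defaultdict

MB, GB = 10**6, 10**9
DAY = 86_400

# Bucket edges follow the chart legend; 30-day months and 365-day years
# are assumptions made for this sketch.
SIZE_BUCKETS = [(MB, "Y<1MB"), (100 * MB, "1MB<Y<100MB"), (GB, "100MB<Y<1GB"),
                (10 * GB, "1GB<Y<10GB"), (float("inf"), "Y>10GB")]
AGE_BUCKETS = [(30 * DAY, "X<1 Month"), (90 * DAY, "1 Month<X<3 Months"),
               (365 * DAY, "3 Months<X<1 Year"), (float("inf"), "X>1 Year")]


def bucket(value, buckets):
    """Return the label of the first bucket whose upper edge exceeds value."""
    for upper, label in buckets:
        if value < upper:
            return label
    return buckets[-1][1]


def summarize(files, now=None):
    """Sum file sizes in GB per (age bucket, size bucket).

    files: iterable of (size_in_bytes, last_access_epoch_seconds).
    """
    now = time.time() if now is None else now
    totals = defaultdict(float)
    for size, atime in files:
        key = (bucket(now - atime, AGE_BUCKETS), bucket(size, SIZE_BUCKETS))
        totals[key] += size / GB
    return dict(totals)
```

The `files` input could come from any file listing that provides size and access time, for example the output of a file system scan.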


TSM/HSM

[Diagram: TSM/HSM setup, TSM 6.3]
• File systems: /project, /store, /users & /apps, each with HSM & Backup Clients
• GPFS Storage Agents: 6 TSM Storage Agents
• TSM Servers: 3 TSM Servers + 1 Spare (diagram labels: TSM DB, Active Libr. Manager, Spare)
• 24 LTO Tape Drives
• 5,719 LTO5 Slots & 5,719 Cartridges: 8.58 PB uncompressed
• Backup / Restore Capacity: 20 x 100 MB/s = 2000 MB/s = 7.2 TB/h
• + 4 drives for Data Management: Reclaim, Copy, Move, DB Backup
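The capacity and throughput figures on this slide can be reproduced with a few lines of arithmetic. The sketch assumes the native LTO-5 cartridge capacity of 1.5 TB and decimal units; the 100 MB/s per-drive rate is the one used on the slide.

```python
LTO5_NATIVE_TB = 1.5      # native (uncompressed) capacity of one LTO-5 cartridge
CARTRIDGES = 5_719
DRIVES_FOR_BACKUP = 20    # 24 drives minus 4 reserved for data management
DRIVE_MB_PER_S = 100      # per-drive rate used on the slide

library_pb = CARTRIDGES * LTO5_NATIVE_TB / 1000
aggregate_mb_per_s = DRIVES_FOR_BACKUP * DRIVE_MB_PER_S
aggregate_tb_per_hour = aggregate_mb_per_s * 3600 / 1_000_000

if __name__ == "__main__":
    print(f"library capacity: {library_pb:.2f} PB uncompressed")
    print(f"backup/restore: {aggregate_mb_per_s} MB/s = {aggregate_tb_per_hour:.1f} TB/h")
```

Both results match the slide: about 8.58 PB uncompressed, and 2000 MB/s or 7.2 TB/h for backup and restore.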


Open Issue on GPFS/TSM

MMBACKUP, the GPFS utility that drives backup using the file system policy, does not yet fully cover the TSM warning/error catalog:

“Cannot reconcile shadow database.
Unable to compensate for all TSM errors in new shadow database.
Preserving previous shadow database.
Run next mmbackup with -q to synchronize shadow database. exit 12”

~10 hours to rebuild the shadow database
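Because the rebuild is expensive, a wrapper around the backup job could add `-q` only when the previous run's log contains the advice line quoted above. The sketch below illustrates that decision only. The helper names, the idea of scanning a saved log, and the `/store` argument are assumptions, and all other mmbackup options are omitted.

```python
# Advice line quoted from the mmbackup message on the slide.
RESYNC_HINT = "Run next mmbackup with -q to synchronize shadow database"


def needs_shadow_resync(previous_log):
    """Return True if the previous run's log asks for a -q run."""
    return RESYNC_HINT in previous_log


def next_mmbackup_command(filesystem, previous_log):
    """Build the argument list for the next mmbackup run.

    Hypothetical helper: adds -q only when the previous log requested it,
    so the slow shadow database rebuild is not triggered on every run.
    """
    command = ["mmbackup", filesystem]
    if needs_shadow_resync(previous_log):
        command.append("-q")
    return command


if __name__ == "__main__":
    log = "Preserving previous shadow database.\n" + RESYNC_HINT + ". exit 12"
    print(next_mmbackup_command("/store", log))
    print(next_mmbackup_command("/store", "backup completed"))
```

The returned list could be passed to a process launcher on a node where mmbackup is installed; the sketch itself never executes the command.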


Future Plans

Add a NEW TIERED GPFS “STAGE”:

– SSD & FC DISK

– Data moved across disk groups by GPFS policy

Deploy a complete TSM Replica (TSM 6.3 feature)



Thanks!
