
Tape write efficiency improvements in CASTOR


CERN IT Department, DSS – Data Storage Services
CH-1211 Genève 23, Switzerland
http://cern.ch/it-dss

Primary author: MURRAY Steven (CERN)
Co-authors: BAHYL Vlado (CERN), CANCIO German (CERN), CANO Eric (CERN), KOTLYAR Victor (Institute for High Energy Physics (RU)), LO PRESTI Giuseppe (CERN), LO RE Giuseppe (CERN) and PONCE Sebastien (CERN)

The CERN Advanced STORage manager (CASTOR) is used to archive to tape the physics data of past and present physics experiments. The current size of the tape archive is approximately 61 PB. For reasons of physical storage space, all of the tape-resident data in CASTOR are repacked onto higher-density tapes approximately every two years. The performance of writing files smaller than 2 GB to tape is therefore critical in order to repack all of the tape-resident data within a period of no more than one year.
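As a back-of-the-envelope illustration of this constraint (a figure derived here, not taken from the poster): repacking the whole archive within one year requires a sustained aggregate rate of roughly 61 PB / 1 year ≈ 61×10^15 bytes / 3.15×10^7 s ≈ 1.9 GB/s across all tape drives, so every second a drive spends flushing its buffer instead of writing must be made up with more drives or more time.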

Implementing the delayed flushing of the tape-drive data-buffer

• Implemented using immediate tape-marks from version 2 of the SCSI standard
• CERN worked with the developer of the SCSI tape-driver for Linux to implement support for immediate tape-marks
• Support for immediate tape-marks is now available as an official SLC5 kernel-patch and is in the vanilla Linux kernel as of version 2.6.37 (see the usage sketch after this list)
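A minimal sketch of how user-space code can request an immediate tape-mark through the Linux SCSI tape-driver (st). The MTWEOFI operation and the MTIOCTOP ioctl are part of the real driver interface; the device path and the fallback macro definition are illustrative assumptions.

#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <unistd.h>
#include <cstdio>

// MTWEOFI entered the kernel headers in 2.6.37; define it here in case the
// installed <sys/mtio.h> predates it (value taken from <linux/mtio.h>).
#ifndef MTWEOFI
#define MTWEOFI 35
#endif

int main() {
  // Non-rewinding tape device; the path is an example.
  const int fd = open("/dev/nst0", O_WRONLY);
  if (fd < 0) { perror("open"); return 1; }

  struct mtop op;
  op.mt_op = MTWEOFI;  // immediate tape-mark: the drive queues the mark
  op.mt_count = 1;     // without flushing its data buffer first
  if (ioctl(fd, MTIOCTOP, &op) < 0) {
    perror("ioctl(MTWEOFI)");
    close(fd);
    return 1;
  }

  // For comparison, MTWEOF here would write a synchronous tape-mark,
  // forcing the drive to flush its buffer to tape before returning.
  close(fd);
  return 0;
}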

Methodology used when modifying legacy code

• The legacy tape reader/writer is critical for the safe storage and retrieval of data
• Modifications to the legacy tape reader/writer were kept to a bare minimum
• It is difficult to test new code within the legacy tape reader/writer, therefore unit-tests were used to test the new code separately
• Used the unit-testing framework for C++ named CppUnit
• Developed 189 unit-tests so far
• Tests range from simple object instantiation through to testing TCP/IP application-protocols (a minimal example follows this list)
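A minimal sketch of a CppUnit test in the separate-testing style described above. The class under test (TapeFileHeader) and its interface are hypothetical illustrations; only the CppUnit scaffolding reflects the real framework.

#include <cppunit/extensions/HelperMacros.h>
#include <cppunit/extensions/TestFactoryRegistry.h>
#include <cppunit/ui/text/TestRunner.h>
#include <string>

// Hypothetical class under test: formats an ANSI-style tape header label.
class TapeFileHeader {
public:
  explicit TapeFileHeader(const std::string &fileId): m_fileId(fileId) {}
  std::string label() const { return "HDR1" + m_fileId; }
private:
  std::string m_fileId;
};

// Simple object-instantiation test, registered with the CppUnit registry.
class TapeFileHeaderTest: public CppUnit::TestFixture {
  CPPUNIT_TEST_SUITE(TapeFileHeaderTest);
  CPPUNIT_TEST(testLabel);
  CPPUNIT_TEST_SUITE_END();
public:
  void testLabel() {
    TapeFileHeader header("CASTOR01");
    CPPUNIT_ASSERT_EQUAL(std::string("HDR1CASTOR01"), header.label());
  }
};
CPPUNIT_TEST_SUITE_REGISTRATION(TapeFileHeaderTest);

int main() {
  CppUnit::TextUi::TestRunner runner;
  runner.addTest(CppUnit::TestFactoryRegistry::getRegistry().makeTest());
  return runner.run() ? 0 : 1; // non-zero exit on test failure
}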

Results

[Plot: drive write performance – speed (MB/s) versus file size (MB) – comparing three flushing strategies: 1 flush per N GB, 1 flush per file and 3 flushes per file]

• Average file-size ≈ 290 MB
• Overall increase ≈ ×10
• Efficiency increase ≈ ×3 for the average file size

Original architecture before improved write efficiency – 3 flushes per file

[Sequence diagram: legacy tape transfer-manager → drive scheduler → legacy tape reader/writer]

1. Mount tape
2. File info
3. Write header file
4. Flush buffer
5. Write user file
6. Flush buffer
7. Write trailer file
8. Flush buffer
9. Wrote file

3 flushes ≈ 5 seconds ≈ 1.2 GB that could have been written (at the drive's streaming speed of ≈ 240 MB/s, 5 seconds of flushing costs ≈ 1.2 GB of lost writing)

Architecture after first deployment – 1 flush per file

[Sequence diagram: legacy tape transfer-manager → drive scheduler → legacy tape reader/writer]

1. Mount tape
2. File info
3. Write header file
4. Write user file
5. Write trailer file
6. Flush buffer
7. Wrote file

1 flush ≈ 1.667 seconds ≈ 400 MB that could have been written

• Only a minor modification to the legacy tape reader/writer was required

Architecture of second deployment – 1 flush per N GB

[Sequence diagram: tape gateway → drive scheduler → protocol bridge → legacy tape reader/writer]

1. Mount tape
2. File info for N files
Loop, for N files or N GB of data:
  3. Write header file
  4. Write user file
  5. Write trailer file
End loop
6. Flush buffer
7. Wrote N files

(A sketch of this batched-flush loop follows the notes below.)

• The legacy tape transfer-manager has been replaced by the tape gateway
• Unlike the legacy tape transfer-manager, multiple tape gateways can be run in parallel for redundancy
• After N files, the ≈ 1.667 seconds spent flushing is negligible
• The protocol bridge allows old and new installations to co-exist
• A more efficient bulk protocol is used (file info is exchanged for N files at a time)
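A minimal sketch of the batched-flush write loop of the second deployment, assuming per-file tape-marks are written in immediate mode (MTWEOFI, as sketched earlier) and the periodic flush is a synchronous operation such as MTWEOF. All function and type names here are hypothetical illustrations, not CASTOR's actual interfaces.

#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical per-file operations; in the real system each would write to
// tape and end with an immediate tape-mark, which does not flush the drive.
static void writeHeaderFile(const std::string &id)  { std::printf("HDR  %s\n", id.c_str()); }
static void writeUserFile(const std::string &id)    { std::printf("DATA %s\n", id.c_str()); }
static void writeTrailerFile(const std::string &id) { std::printf("EOF  %s\n", id.c_str()); }

// Hypothetical synchronous flush, e.g. a tape-mark written with MTWEOF,
// which makes the drive empty its data buffer to tape before returning.
static void flushDriveBuffer() { std::printf("FLUSH\n"); }

struct FileToWrite { std::string id; std::uint64_t sizeBytes; };

// One flush per N bytes of data (or at end of batch): the ≈ 1.667 s flush
// cost is amortised over many files instead of being paid per file.
static void writeBatch(const std::vector<FileToWrite> &files,
                       const std::uint64_t flushIntervalBytes) {
  std::uint64_t bytesSinceFlush = 0;
  bool unflushed = false;
  for (const FileToWrite &f : files) {
    writeHeaderFile(f.id);   // "3. Write header file"
    writeUserFile(f.id);     // "4. Write user file"
    writeTrailerFile(f.id);  // "5. Write trailer file"
    bytesSinceFlush += f.sizeBytes;
    unflushed = true;
    if (bytesSinceFlush >= flushIntervalBytes) {
      flushDriveBuffer();    // "6. Flush buffer"
      bytesSinceFlush = 0;   // only after the flush are these files safely
      unflushed = false;     // on tape, so only now report "7. Wrote N files"
    }
  }
  if (unflushed) flushDriveBuffer(); // cover a final partial batch
}

int main() {
  // Three files of the poster's average size (≈ 290 MB), flushed every
  // 512 MB; both values are example parameters.
  const std::vector<FileToWrite> files = {
    {"file1", 290u << 20}, {"file2", 290u << 20}, {"file3", 290u << 20}};
  writeBatch(files, 512u << 20);
  return 0;
}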