Tom Byrne, 12 th November 2014 Ceph – status update and xrootd testing Alastair Dewhurst, Tom Byrne 1

Dec 14, 2015

Ceph – status update and xrootd testing

Alastair Dewhurst, Tom Byrne

Tom Byrne, 12th November 2014

Introduction

• On 15th October gave overview talk on plans for Ceph at RAL Tier 1.

• Will aim to provide updates on progress made focusing on the xrootd deployment and testing.

• Current Ceph cluster with 7 nodes using 2013 generation hardware.

S3 gateway

• At the last meeting we had the S3 gateway running on a virtual machine.

• Hope to have firewall holes + x.509 authentication working by next week.

• S3 gateway ‘does its own thing’ with files, which means it is difficult to use with other plugins.

• Will investigate writing our own WebDAV gateway.

CERN plugins

• CERN have four plugins based on XRootD for Ceph:

• radosfs (impl. file & directories in rados)

• xrootd-rados-oss (interfacing radosfs as OSS plug-in)

• xrootd-diamond-ofs (adding checksumming & TPC)

• xrootd-auth-change-id (adding NFS server style authentication to xrootd)

• Our work has been on the xrootd-diamond-ofs

• Setup instructions can be found at: https://github.com/cern-eos/eos-diamond/wiki

Xrootd deployment

• Used RPMs provided on wiki to set up XrootD gateway

• Had to set up a Cache Tier because the XrootD gateway currently doesn’t work directly with erasure coded pools

• This is because the file is opened and then appended to; CERN are working on patching it to work with EC.

• There are two pools:

• Data and Meta-Data

Cache Tier

• Cache Tier is using mostly default settings

• 3 replicas of the data

• Will create a ‘cold’ erasure coded copy instantly

• LRU algorithm to clean up data.

• We would prefer not to use a Cache Tier and have direct access to Erasure coded pool

• It would be possible to have a ~10% Cache Tier in front of the storage.

• We believe Erasure coded pool should work well as we are not appending to files.
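
As a rough illustration of the setup described on this slide, the sketch below builds the list of `ceph` commands that would put a 3-replica writeback cache tier in front of an erasure coded pool and cap it at roughly 10% of the storage. It only prints the commands and does not run them. The cache pool name `diamond-cache`, the placement-group count and the 100 TB capacity are hypothetical values chosen for the example, not RAL's configuration.

```python
# Sketch only: the cache pool name, PG count and capacity are hypothetical.

def cache_tier_commands(storage_pool, cache_pool, usable_bytes,
                        cache_percent=10, pg_num=128, replicas=3):
    """Return ceph CLI commands that place a replicated writeback
    cache tier in front of an erasure coded pool."""
    # Integer arithmetic keeps the byte count exact.
    target_max_bytes = usable_bytes * cache_percent // 100
    return [
        # Erasure coded base pool and replicated cache pool.
        f"ceph osd pool create {storage_pool} {pg_num} {pg_num} erasure",
        f"ceph osd pool create {cache_pool} {pg_num} {pg_num}",
        f"ceph osd pool set {cache_pool} size {replicas}",
        # Attach the cache pool and route client I/O through it.
        f"ceph osd tier add {storage_pool} {cache_pool}",
        f"ceph osd tier cache-mode {cache_pool} writeback",
        f"ceph osd tier set-overlay {storage_pool} {cache_pool}",
        # Hit tracking and the size limit that triggers flushing and eviction.
        f"ceph osd pool set {cache_pool} hit_set_type bloom",
        f"ceph osd pool set {cache_pool} target_max_bytes {target_max_bytes}",
    ]

commands = cache_tier_commands("diamond-data", "diamond-cache",
                               usable_bytes=100 * 10**12)
for line in commands:
    print(line)
```

Keeping the commands as data makes it easy to review them before anything is applied to a live cluster.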

Diamond data

• Plugin splits file into chunks which are stored with a GUID in Ceph:

[root@gdss540 ~]# rados -p diamond-data ls | grep 774b1a83-14d0-4fb9-a6c0-10e36c32febf | sort
774b1a83-14d0-4fb9-a6c0-10e36c32febf
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000001
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000002
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000003
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000004
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000005
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000006
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000007
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000008
774b1a83-14d0-4fb9-a6c0-10e36c32febf//00000009
774b1a83-14d0-4fb9-a6c0-10e36c32febf//0000000a
774b1a83-14d0-4fb9-a6c0-10e36c32febf//0000000b
774b1a83-14d0-4fb9-a6c0-10e36c32febf//0000000c
774b1a83-14d0-4fb9-a6c0-10e36c32febf//0000000d
774b1a83-14d0-4fb9-a6c0-10e36c32febf//0000000e
774b1a83-14d0-4fb9-a6c0-10e36c32febf//0000000f

• Makes it hard to manage files and write other plugins.
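
To show what another plugin would have to do to find a file's pieces, here is a minimal sketch that groups object names from a listing like the one above by GUID and orders the chunks. It assumes only the naming pattern visible in that listing (`<guid>` and `<guid>//<8 hex digits>`). Treating the bare GUID object as chunk 0 is an assumption made for the example, not something the plugin documentation is known to state.

```python
from collections import defaultdict

def group_chunks(object_names):
    """Map each GUID to its chunk indices in ascending order."""
    chunks = defaultdict(list)
    for name in object_names:
        guid, separator, suffix = name.partition("//")
        # Assumption: the object named by the bare GUID is chunk 0.
        index = int(suffix, 16) if separator else 0
        chunks[guid].append(index)
    return {guid: sorted(indices) for guid, indices in chunks.items()}

# Rebuild the object names shown in the listing: the bare GUID plus
# chunks 00000001 to 0000000f.
guid = "774b1a83-14d0-4fb9-a6c0-10e36c32febf"
listing = [guid] + [f"{guid}//{i:08x}" for i in range(1, 16)]

grouped = group_chunks(listing)
print(guid, "has", len(grouped[guid]), "objects")
```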

Diamond meta-data

https://indico.cern.ch/event/305441/session/5/contribution/37/material/slides/0.pdf

Testing

• Have tried commands from:

• UI (using xrootd v3.3.6)

• Node (using xrootd v4.0.4)

• Can copy files in and out:

[root@gdss540 ~]# xrdcp ./ivukotic\:group.test.hc.NTUP_SMWZ.root root://gdss541//root/ivukotic:group.test.hc.NTUP_SMWZ.root.1
[760.2MB/760.2MB][100%][==================================================][95.03MB/s]

[root@gdss540 ~]# xrdcp root://gdss541//root/ivukotic:group.test.hc.NTUP_SMWZ.root /ivukotic\:group.test.hc.NTUP_SMWZ.root
[760.2MB/760.2MB][100%][==================================================][58.48MB/s]
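
For comparing the two copies, the following sketch pulls the size and rate out of an `xrdcp` progress line and converts them to an approximate wall-clock time. The regular expression is written against the two progress lines shown above and is an assumption about their format. It is not a general parser for every `xrdcp` version.

```python
import re

# Matches lines such as "[760.2MB/760.2MB][100%][=====][95.03MB/s]".
PROGRESS = re.compile(
    r"\[([\d.]+)MB/([\d.]+)MB\]\[(\d+)%\]\[[= ]*\]\[([\d.]+)MB/s\]"
)

def transfer_seconds(progress_line):
    """Estimate transfer time from the copied size and the reported rate."""
    match = PROGRESS.search(progress_line)
    if match is None:
        raise ValueError("not a recognised xrdcp progress line")
    done_mb, _total_mb, _percent, rate_mb_s = match.groups()
    return float(done_mb) / float(rate_mb_s)

bar = "=" * 50
write_s = transfer_seconds(f"[760.2MB/760.2MB][100%][{bar}][95.03MB/s]")
read_s = transfer_seconds(f"[760.2MB/760.2MB][100%][{bar}][58.48MB/s]")
print(f"copy in: {write_s:.1f} s, copy out: {read_s:.1f} s")
```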

“Filesystem”

• Can create directories with UNIX style permissions.

• Setup is “Fragile” – frequently need to restart xrootd.

• Dies when doing “ls -l”

xrdfs gdss541 mkdir "/atlas/?owner=10763&group=1307"

[root@gdss540 ~]# xrdfs gdss541 ls /atlas/
/atlas/ivukotic:group.test.hc.NTUP_SMWZ.root
/atlas/test
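
The `mkdir` call above passes the owner and group as a query string on the path. A small helper like the one below can build that argument list for scripted tests. The function name `mkdir_command` is hypothetical, and the `owner`/`group` query keys are taken from the command shown on this slide rather than from any documented general `xrdfs` option.

```python
from urllib.parse import urlencode

def mkdir_command(host, path, owner, group):
    """Build an xrdfs mkdir argument list with owner/group in the query string."""
    query = urlencode({"owner": owner, "group": group})
    # A list (not a shell string) avoids quoting problems with "?" and "&".
    return ["xrdfs", host, "mkdir", f"{path}?{query}"]

command = mkdir_command("gdss541", "/atlas/", owner=10763, group=1307)
print(" ".join(command))
```

The list can be handed to `subprocess.run(command, check=True)` on a host where `xrdfs` is installed.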

Direct Read

• Code from Wahid:

• git clone https://[email protected]/reps/FAX

• Wanted to try 4 tests:

• Read 10% of the file and use 30MB cache

• Read 100% of the file and use 30MB cache

• Read 10% of the file and use 100MB cache – CRASHED!

• Read 100% of the file and use 100MB cache – CRASHED!

30MB Cache             1st        2nd        3rd        Average

100%   CPU Time /s     31.13      31.13      30.5       30.92
       Disk IO MB/s    112.654    112.951    113.094    112.8997

10%    CPU Time /s     15.9       16.35      16.04      16.09667
       Disk IO MB/s    110.737    112.13     112.056    111.641
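
The Average column can be reproduced from the three runs. The sketch below recomputes it from the figures in the table, with dictionary keys that are simply labels chosen for this example.

```python
from statistics import mean

# Three runs per measurement, copied from the 30MB cache table.
runs = {
    ("100%", "CPU time /s"): [31.13, 31.13, 30.5],
    ("100%", "Disk IO MB/s"): [112.654, 112.951, 113.094],
    ("10%", "CPU time /s"): [15.9, 16.35, 16.04],
    ("10%", "Disk IO MB/s"): [110.737, 112.13, 112.056],
}

averages = {key: mean(values) for key, values in runs.items()}
for (fraction, metric), value in averages.items():
    print(f"{fraction:>4}  {metric:<13} {value:.4f}")

# Ratio of CPU time between reading 10% and 100% of the file.
cpu_ratio = averages[("10%", "CPU time /s")] / averages[("100%", "CPU time /s")]
print(f"CPU time ratio (10% / 100%): {cpu_ratio:.2f}")
```

On these numbers, reading 10% of the file takes about half the CPU time of reading all of it, while the disk IO rate is nearly the same in both cases.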

Future plans

• 3 threads of development:

• Get simplified xrootd to work.

• Look into GridFTP gateway – Have spoken to Brian Bockelman, who has made the equivalent for HDFS.

• Look into WebDAV gateway – Instructions to get started are on the Ceph wiki, and we will speak to the DPM developers.

• Need to start looking at xattr

• We have procured a Mac mini for future Calamari builds.

Summary

• We got S3 gateway to work, but it wasn’t quite what we wanted.

• Testing Diamond plugin with help from CERN. Do not need all the features.

• Question: Why do all the plugins create their own data formats?

• If we go with an object store we will have to write our own plugins, but this does not appear to be an impossible task.
