Scalla/XRootd Advancements
xrootd /cmsd (f.k.a. olbd)
Fabrizio Furano, CERN – IT/PSS
Andrew Hanushevsky, Stanford Linear Accelerator Center
http://xrootd.slac.stanford.edu
Scalla is a data access system
Some users/applications want file system semantics
More transparent but many times less scalable
For years users have asked: can Scalla create a file system experience?
The answer is: it can, to a degree that may be good enough
We relied on FUSE to show how
28-November-07 6: http://xrootd.slac.stanford.edu
What is FUSE
Filesystem in Userspace
Used to implement a file system in a user space program
Linux 2.4 and 2.6 only
Refer to http://fuse.sourceforge.net/
Can use FUSE to provide xrootd access
Looks like a mounted file system
SLAC and FZK have xrootd-based versions of this
Wei Yang at SLAC: tested and practically fully functional
Andreas Petzold at FZK: in alpha test, not fully functional yet
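Because a FUSE mount looks like any other mounted file system, an application needs no xrootd-specific code: plain POSIX-style calls are enough. The sketch below illustrates that idea only. The function name `exercise_posix_ops` is hypothetical, and a temporary directory stands in for the mount point so the example runs anywhere; on a real system the path of the FUSE mount would be passed instead.

```python
import os
import tempfile

def exercise_posix_ops(mount_point):
    """Run ordinary file system calls under mount_point; return the directory listing."""
    workdir = os.path.join(mount_point, "demo")
    os.mkdir(workdir)                          # mkdir
    path = os.path.join(workdir, "data.txt")
    with open(path, "w") as f:                 # create
        f.write("hello")
    renamed = os.path.join(workdir, "renamed.txt")
    os.rename(path, renamed)                   # mv
    listing = sorted(os.listdir(workdir))      # opendir/readdir
    os.remove(renamed)                         # rm
    os.rmdir(workdir)                          # rmdir
    return listing

# Assumption: a temporary directory stands in for the FUSE mount point.
with tempfile.TemporaryDirectory() as stand_in_mount:
    LISTING = exercise_posix_ops(stand_in_mount)
print(LISTING)
```

The same function would work unchanged against any mounted file system, which is the transparency the slide describes.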
XrootdFS (Linux/FUSE/Xrootd)
[Diagram. Client host (kernel and user space): Appl, POSIX File System Interface, FUSE, FUSE/Xroot Interface, xrootd POSIX Client. Redirector host: Redirector (xrootd:1094), Name Space (xrootd:2094). Operations labeled: opendir, create, mkdir, mv, rm, rmdir.]
Should run cnsd on servers to capture non-FUSE events
XrootdFS Performance
Sun V20z: RHEL4, 2x 2.2GHz AMD Opteron, 4GB RAM, 1Gbit/sec Ethernet
VA Linux 1220: RHEL3, 2x 866MHz Pentium 3, 1GB RAM, 100Mbit/sec Ethernet
Client
Unix dd, globus-url-copy & uberftp: 5-7MB/sec with 128KB I/O block size
Unix cp: 0.9MB/sec with 4KB I/O block size
Conclusion: Better for some things than others.
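One plausible reading of the gap between the 128KB and 4KB results is per-request overhead: a smaller block size means many more I/O requests for the same amount of data. The sketch below only counts requests; it does not model throughput. The 1 GiB file size and the name `requests_needed` are assumptions chosen for illustration, not figures from the test.

```python
def requests_needed(file_size_bytes, block_size_bytes):
    """Number of fixed-size I/O requests needed to move a file (ceiling division)."""
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes

FILE_SIZE = 1024 ** 3        # hypothetical 1 GiB file, not from the slides
SMALL_BLOCK = 4 * 1024       # 4KB block size, as used by cp in the test
LARGE_BLOCK = 128 * 1024     # 128KB block size, as used by dd and the copy tools

small_requests = requests_needed(FILE_SIZE, SMALL_BLOCK)
large_requests = requests_needed(FILE_SIZE, LARGE_BLOCK)
ratio = small_requests // large_requests

print(small_requests, large_requests, ratio)
```

For this file size the 4KB case issues 32 times as many requests as the 128KB case, so any fixed cost per request is paid 32 times more often.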
Why XrootdFS?
Makes some things much simpler
Most SRM implementations run transparently
Avoid pre-load library worries
But impacts other things
Performance is limited
Kernel-FUSE interactions are not cheap
Rapid file creation (e.g., tar) is limited
FUSE must be administratively installed to be used
Difficult if involves many machines (e.g., batch workers)
Easier if it involves an SE node (i.e., SRM gateway)
Much lower latency
New very extensible protocol
Better fault detection and recovery
Added functionality
Global clusters
Authentication
Server selection can include space utilization metric
Uniform handling of opaque information
Cross protocol messages to better scale xproof clusters
Better implementation for reduced maintenance cost
Test 2: Draw a histogram from that tree data (6k interactions)
Measured time ~15-20min
Using xrootd with WAN optimizations disabled
*Federico Carminati, The ALICE Computing Status and Readiness, LHCC, November 2007
**Smart WAN Access**
Exploit xrootd WAN Optimizations
TCP multi-streaming: for up to 15x improvement in data WAN throughput
The ROOT TTreeCache provides the hints on "future" data accesses
TXNetFile/XrdClient "slides through", keeping the network pipeline full
Data transfer goes in parallel with computation
Throughput improvement comparable to "batch" file-copy tools
70-80%, we are doing a live analysis, not a file copy!
Test 1 actual time: 60-70 seconds
Compared to 30 seconds using a Gb LAN
Very favorable for sparsely used files
Test 2 actual time: 7-8 seconds
Comparable to LAN performance
100x improvement over dumb WAN access (i.e., 15-20 minutes)
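The Test 2 figures can be checked with simple arithmetic. The sketch below uses only numbers quoted on the slides (6k interactions, 15-20 minutes unoptimized, 7-8 seconds optimized); the variable names are illustrative. It derives the average time per interaction in the unoptimized case and the range of the speedup.

```python
INTERACTIONS = 6000                      # "6k interactions" for Test 2
UNOPTIMIZED_S = (15 * 60, 20 * 60)       # 15-20 minutes, in seconds
OPTIMIZED_S = (7, 8)                     # 7-8 seconds

# Average time spent per interaction without WAN optimizations, in milliseconds.
per_interaction_ms = tuple(1000 * t / INTERACTIONS for t in UNOPTIMIZED_S)

# Most conservative and most favorable speedup implied by the two ranges.
speedup_low = UNOPTIMIZED_S[0] / OPTIMIZED_S[1]
speedup_high = UNOPTIMIZED_S[1] / OPTIMIZED_S[0]

print(per_interaction_ms, round(speedup_low, 1), round(speedup_high, 1))
```

The unoptimized run works out to roughly 150-200 ms per interaction, and the speedup falls between about 110x and 170x, which agrees with the slide's "100x" as an order-of-magnitude figure.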
Conclusion
Scalla is a robust framework
Elaborative
Composite Name Space
XrootdFS
Extensible
Cluster globalization
Many opportunities to enhance data analysis
Speed and efficiency