Any Data, Anytime, Anywhere
Matevž Tadel (UCSD), for the AAA team

Open Science Grid

Talk outline:
1. AAA on the map of things
2. Project status
3. How others can profit from what we are doing


M. Tadel: Any Data, Anytime, Anywhere 2

OSG: Open Science Grid

Building Blocks for Next Gen Networks, 3/12/13

CMS: Compact Muon Solenoid
LHC: Large Hadron Collider
ATLAS: the competing experiment
US CMS: computing & software
AAA
FAX
NSF: $$$

3-year, 3-FTE project, started in Sept. 2011
UCSD, UNL & UW Madison, about 10 people

Cover all aspects of remote data access in an organization with large-scale, distributed storage and computing infrastructure.

?

Probing deeper than ever into the smallness of space …

… to explain and understand the largest and how it came to be.

LHC: The most powerful accelerator ever built:
– Physics reach: create particles with masses up to 1 TeV
– Circumference 16.6 mi, 60–160 m underground
– 4 experiments + accelerator, ~12k people …

… some of whom are French

CMS
12,500 tons, 21 m long, 16 m diameter
80 million electronic channels

LHC collision frequency: 40 MHz
~ 10 PBytes/sec of information
• 1/1000 zero-suppression
• 1/100,000 online event filtering
~ 100-1000 MBytes/sec raw data stored
up to 10 PBytes of raw data per year

2000 scientists spread over 40 countries
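
The reduction chain quoted on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's own figures, plus one assumption that is not on the slide: roughly 10^7 seconds of data taking per year, used here to relate the stored rate to the yearly volume. All names are illustrative.

```python
# Back-of-the-envelope check of the CMS data-reduction chain.
PB = 10**15  # bytes in a petabyte (decimal)
MB = 10**6   # bytes in a megabyte (decimal)

RAW_RATE = 10 * PB              # ~10 PBytes/sec of information (from the slide)
ZERO_SUPPRESSION = 1 / 1000     # from the slide
ONLINE_FILTER = 1 / 100_000     # from the slide

# ASSUMPTION (not from the slide): about 1e7 seconds of data taking per year.
LIVE_SECONDS_PER_YEAR = 1e7


def stored_rate(rate_bytes_per_s: float) -> float:
    """Rate written to storage after zero-suppression and online filtering."""
    return rate_bytes_per_s * ZERO_SUPPRESSION * ONLINE_FILTER


def yearly_volume(rate_bytes_per_s: float) -> float:
    """Bytes accumulated per year at a given stored rate, under the assumed live time."""
    return rate_bytes_per_s * LIVE_SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = stored_rate(RAW_RATE)
    print(f"stored rate: {rate / MB:.0f} MBytes/sec")
    print(f"yearly volume at 1000 MBytes/sec: {yearly_volume(1000 * MB) / PB:.0f} PBytes")
```

The two reduction factors alone bring 10 PBytes/sec down to about 100 MBytes/sec, the low end of the quoted range. Under the assumed live time, the high end of 1000 MBytes/sec corresponds to the quoted 10 PBytes per year.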

The LHC Computing Challenge
How do we organize the processing of 10’s to 1000’s of Petabytes of data by a globally distributed community of scientists, and do so with manageable “change costs” for the next 20 years?

Solution to the Challenge
Choose technical solutions that allow computing resources as distributed as human resources.
Support distributed ownership and control, within a global single sign-on security context.
Design for heterogeneity and adaptability.

[Figure: site labels for one T1 and seven T2 centers]

CMS collaboration:
3000+ collaborators
2000 scientists
1200 Ph.D.s in physics
~ 180 institutions
~ 40 countries

USCMS computing:
Tier-1 center @ FNAL
7 Tier-2 centers
2 of them in California:
• CalTech
• UCSD

“Owned resources” in US: 20k cores, 40 PBytes of disk

The Role of Networks

• Bring Big Data for analysis to Petabyte Science Caches at Universities
– and do it fast to maximize human productivity

AAA adds a new role:
• Access any data, any time, from anywhere
– interactive debugging is crucial at every step
– squeeze the last drop out of available CPU and disk
– usage of opportunistic resources

AAA Modus Operandi
AAA is a meta-project: take proven technologies, integrate them, and contribute back.
– XRootD: serve and access data
– Condor: flexible job scheduling
– GlideinWMS: dynamic job placement
– CernVM FS: software distribution

CMS is our main target, esp. the US sites. But:
– Most products can be used by any VO with little extra work
– Operational experience is crucial
  → we’re making most of the mistakes so others don't have to.
– CERN and European sites are already picking it up …

PROJECT STATUS

1. What we have done and how it looks
2. What’s cooking
3. Plans

-I. I/O Optimizations
• HEP is firmly C++-based
• ROOT framework used for object serialization
– ROOT is also the “analysis studio” for physicists: looping over data, histogramming, plotting, 3D-graphics, fitting & multivariate analysis
– Format on disk is rather complicated to:
  • maximize compression
  • allow low-overhead access to partial data
  [it’s column-wise ntuples with chunked compression buffers]

Significant effort between CMS, ROOT & AAA was needed to make I/O work well for remote access:
– coalesce successive reads into “vector requests”
– develop client-side data-cache with optional read-ahead
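
As an illustration of the first item, here is a minimal sketch of read coalescing. It is not the ROOT or XRootd implementation: the function name, the `(offset, length)` representation and the `max_gap` parameter are all assumptions made for this example. The idea shown is that many small reads are sorted and merged, and the resulting short list is what would travel to the server as one request instead of one round trip per read.

```python
from typing import List, Tuple

Read = Tuple[int, int]  # (offset, length) in bytes


def coalesce_reads(reads: List[Read], max_gap: int = 0) -> List[Read]:
    """Merge reads that overlap or lie within max_gap bytes of each other.

    The returned list is sorted by offset and is what a client could send
    as a single vector request (hypothetical usage, not a real API).
    """
    merged: List[Read] = []
    for offset, length in sorted(reads):
        if merged:
            last_offset, last_length = merged[-1]
            last_end = last_offset + last_length
            if offset <= last_end + max_gap:
                # Extend the previous chunk to cover this read as well.
                new_end = max(last_end, offset + length)
                merged[-1] = (last_offset, new_end - last_offset)
                continue
        merged.append((offset, length))
    return merged


if __name__ == "__main__":
    requests = [(0, 100), (100, 50), (400, 10)]
    print(coalesce_reads(requests))               # adjacent reads merged
    print(coalesce_reads(requests, max_gap=256))  # small holes bridged too
```

Allowing a non-zero `max_gap` trades a few wasted bytes for fewer chunks, which tends to pay off when latency rather than bandwidth dominates, as it often does for remote access.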

I. USCMS XRootd data federation
• Creating the data-federation was only a matter of configuring XRootd, then there were the details …
• XRootd is an amazing product (SLAC):
– absolute winner in performance, scalability & resilience;
– flexible – plugins for all major components; authentication & authorization, file-system interface
– supports multi-tiered storage;
– supports multi-site data-federations & federation peering;
– has excellent monitoring.
• A lot of AAA effort was used to make sure this is all true and works well for our standard T2 FS (Hadoop).
– UCSD & UNL even joined XRootd collaboration
– Europeans are starting to pick up the ball …

How this works

Using coherent directory structures on all sites really helps.
Ask ATLAS what happens otherwise …

Three level server hierarchy:
1. Redirector / meta-manager
   This is a single point of entry
2. Site redirector / manager
3. Actual servers, ~10 per site

• Managers cache results from below.
• Clients report errors to managers, if they happen.

Notes:
• our files are 100 MB to 8 GB
• processing speed 100 kB/s to 4 MB/s
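
The lookup path through the three levels can be modelled in a few lines. This is a toy model, not XRootd code: the class names, the host and file names, and the choice to drop a cached location when a client reports an error are all assumptions made for illustration. A real redirector sends the client on to the data server rather than returning a name.

```python
from typing import Dict, List, Optional, Set, Union


class Server:
    """Level 3: an actual data server that knows which files it holds."""

    def __init__(self, name: str, files: Set[str]):
        self.name = name
        self.files = files

    def locate(self, path: str) -> Optional[str]:
        return self.name if path in self.files else None


class Manager:
    """Level 1 or 2: a redirector that asks its children and caches the answer."""

    def __init__(self, name: str, children: List[Union["Manager", Server]]):
        self.name = name
        self.children = children
        self.cache: Dict[str, str] = {}
        self.queries = 0  # how often the children actually had to be asked

    def locate(self, path: str) -> Optional[str]:
        if path in self.cache:
            return self.cache[path]
        self.queries += 1
        for child in self.children:
            found = child.locate(path)
            if found is not None:
                self.cache[path] = found  # managers cache results from below
                return found
        return None

    def report_error(self, path: str) -> None:
        """A client reports a failed access; forget the cached location (assumed policy)."""
        self.cache.pop(path, None)
        for child in self.children:
            if isinstance(child, Manager):
                child.report_error(path)


if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    site_a = Manager("site-a", [Server("a-01", {"/store/run1.root"})])
    site_b = Manager("site-b", [Server("b-01", {"/store/run2.root"})])
    top = Manager("meta-manager", [site_a, site_b])
    print(top.locate("/store/run2.root"))
```

Because every site exposes the same directory structure, the same path string can be handed unchanged from the meta-manager down to the servers, which is why no name translation appears in the sketch.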

How this works on intergalactic scale

This is being put in production now …

II. Established usages

1. Interactive access
– used for detailed inspection, event-displays and debugging of analysis code

2. Fallback to XRootd when local access fails
– CMS had about 3% job failures
– we only do it at file-open time

3. Job overflow to nearby sites
– job queues on sites can get very, very long
– send jobs to other sites and let them read via XRootd
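
The fallback in item 2 can be sketched as a thin wrapper around the open call. This is an illustration only: `open_with_fallback` and the two opener callables are hypothetical names, and the real CMS software does this inside its own I/O layer. What the sketch preserves is the rule from the slide that the fallback applies at file-open time only; an error during a later read is not retried here.

```python
import io
from typing import BinaryIO, Callable

Opener = Callable[[str], BinaryIO]  # takes a path, returns an open binary file


def open_with_fallback(path: str, open_local: Opener, open_remote: Opener) -> BinaryIO:
    """Try the site's local storage first; if the open fails, go through the federation."""
    try:
        return open_local(path)
    except OSError:
        # Only the open is retried; failures after this point are not handled here.
        return open_remote(path)


if __name__ == "__main__":
    def broken_local(path: str) -> BinaryIO:
        raise FileNotFoundError(path)  # simulate a file missing on local storage

    def fake_remote(path: str) -> BinaryIO:
        return io.BytesIO(b"data served by the federation")  # stand-in for a remote read

    with open_with_fallback("/store/example.root", broken_local, fake_remote) as f:
        print(f.read())
```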

Monitoring

• We monitor everything
– server is alive & files accessible (authentication!)
– details of individual servers / managers
  • transfers, connections, redirections
  • internal server state (buffers, threads, system usage)
– details about individual user sessions
  • almost live tracking of current users / file transfer
  • analyze details of file-access patterns
    – detect abuse, ignorance and stupidity
    – optimize I/O stack, data-placement and replication
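
One simple form that file-access pattern analysis could take is sketched below. The record format (a list of `(offset, length)` reads per file) and both function names are assumptions for this example, not the format of the XRootd monitoring stream. The sketch computes what fraction of a file a session actually touched and how often it re-read the same bytes, two numbers that help separate sparse access from wasteful access.

```python
from typing import Dict, Iterable, List, Tuple

Read = Tuple[int, int]  # (offset, length) in bytes


def bytes_covered(reads: Iterable[Read]) -> int:
    """Number of distinct bytes touched by a set of reads."""
    covered = 0
    current_end = 0
    for offset, length in sorted(reads):
        end = offset + length
        if end > current_end:
            # Count only the part not already covered by earlier reads.
            covered += end - max(offset, current_end)
            current_end = end
    return covered


def access_summary(file_size: int, reads: List[Read]) -> Dict[str, float]:
    """Fraction of the file read, and total bytes requested per distinct byte read."""
    total = sum(length for _, length in reads)
    unique = bytes_covered(reads)
    return {
        "fraction_read": unique / file_size,
        "reread_factor": total / unique if unique else 0.0,
    }


if __name__ == "__main__":
    session = [(0, 100), (50, 100), (300, 100)]
    print(access_summary(1000, session))
```

A low `fraction_read` suggests that partial-file access or caching is worthwhile, while a high `reread_factor` points at a client that would benefit from a local cache.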

Monitoring write-up

III. Work in progress

1. XRootd caching-proxy
– full and partial file support
– storage healing
– reduced & on-demand replication of data
– Tier-3 sites at universities, campus resources

2. Opportunistic usage & running on clouds
– GlideinWMS pilot jobs
– CernVM FS + Parrot
– We did some trials on Amazon (owned T2s are still cheaper)

3. Source selection algorithms & Multi-source reading

IV. Plans

• From proposal: no new development in 3rd year
• Still many things going on for 2013
• Wrap-up, deploy and put into production
• Final round of stress & scaling tests
• Hand operations over to CMS

HOW OTHERS CAN PROFIT FROM WHAT WE ARE DOING

Banking on AAA work I.

• Join OSG, use OSG software!
• Use XRootd to serve / access your data
– Geography of your federation
– Namespace construction
– Authentication
• Know your data-access requirements and patterns

Banking on AAA work II.

• Know how your software accesses individual files
– Full-file caching / pre-staging: when you read whole files or cannot predict what parts get read;
– Partial file-caching:
  • works for storage healing / as a remote extra block “RAID”
  • or your files have a more compact structure than ours
– Direct remote access: when you know the data will not be read again any time soon
  • Optimize data access for partial file read
  • must know how to anticipate read ahead
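
To make the partial file-caching option concrete, here is a minimal block-granular cache. It is a sketch under stated assumptions, not the XRootd caching-proxy: the class name, the `fetch(offset, length)` callback standing in for a remote read, and the fixed block size are all invented for this example. Only the blocks that a read actually touches are fetched, and later reads of the same region are served locally.

```python
from typing import Callable, Dict


class BlockCache:
    """Caches fixed-size blocks of one remote file, fetching only what is read."""

    def __init__(self, fetch: Callable[[int, int], bytes], block_size: int = 1024):
        self.fetch = fetch              # fetch(offset, length) -> bytes from the remote source
        self.block_size = block_size
        self.blocks: Dict[int, bytes] = {}
        self.remote_fetches = 0         # number of blocks pulled from the remote source

    def _block(self, index: int) -> bytes:
        if index not in self.blocks:
            self.blocks[index] = self.fetch(index * self.block_size, self.block_size)
            self.remote_fetches += 1
        return self.blocks[index]

    def read(self, offset: int, length: int) -> bytes:
        if length <= 0:
            return b""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        data = b"".join(self._block(i) for i in range(first, last + 1))
        start = offset - first * self.block_size
        return data[start:start + length]


if __name__ == "__main__":
    remote_file = bytes(range(256)) * 16  # 4096 bytes standing in for a remote file
    cache = BlockCache(lambda off, n: remote_file[off:off + n], block_size=1024)
    cache.read(1000, 100)   # spans blocks 0 and 1
    cache.read(1000, 100)   # served from the cache
    print(cache.remote_fetches)
```

The block size is the main tuning knob: large blocks approach full-file caching, while small blocks keep transfers close to what the application asked for at the cost of more requests.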

Banking on AAA work III.

• Monitoring! It actually does help to:
– Understand how you actually access your files
– Optimize IO patterns
– Detect file access errors & user abuse
  • Most often users are just being creative, not malicious
– Accounting
– …
– Prevent making a fool of yourself more often than is absolutely necessary

Conclusion

• AAA is serving a large collaboration at the forefront of big-data distributed processing
– Other fields and sciences are catching up
• Solving real problems that arise in such contexts
– For us and, we hope, for others
• We are committed to broad collaboration, esp. through OSG
