16-20 October 2000, ACAT2000 - FNAL
Management of Large Scale Data Productions for the CMS Experiment
Presented by L.M.Barone, Università di Roma & INFN
The Framework
• The CMS experiment is producing a large amount of MC data for the development of High Level Trigger (HLT) algorithms for fast data reduction at LHC
• Current production is half traditional (Pythia + CMSIM/Geant3), half OO (ORCA using Objectivity/DB)
The Problem
• Data size: ~10^6 - 10^7 events at 1 MB/ev, i.e. ~10^4 files (typically 500 evts/file)
• Resource dispersion: many production sites (CERN, FNAL, Caltech, INFN etc.)
• Dealing with actual MC productions and not with 2005 data taking
The Problem (cont’d)
• Data Relocation: data produced in site A are stored centrally (CERN); site B may need a fraction of them; combinatorics increasing
• Objectivity/DB does not make life easier (but the problem would exist anyway)
ORCA Production 2000
[Diagram: production data flow. Signal and minimum-bias (MB) events start from HEPEVT ntuples and pass through CMSIM, producing Zebra files with HITS; the ORCA ooHit Formatter writes these into an Objectivity Database. ORCA Digitization (merging signal and MB) fills an Objectivity Database read by the HLT Algorithms, which produce New Reconstructed Objects. Catalog import connects the MC Prod., ORCA Prod. and HLT Grp Databases; the databases are mirrored to other sites (US, Russia, Italy, ...).]
The Old Days
• Question: how was it done before?
A mix of ad hoc scripts/programs with a lot of manual intervention... but the problem was smaller and less dispersed
Requirements for a Solution
• Solution must be as automatic as possible, to decrease manpower
• Tools should be independent of data type and of site
• Network traffic should be optimized (or minimized?)
• Users need complete information on data location
Present Status
• Job creation is managed by a variety of scripts in different sites
• Job submission again goes through diverse methods, from UNIX commands to LSF or Condor
• File transfer has been managed up to now by Perl scripts, which are not generic and not site independent
Present Status (cont’d)
• The autumn 2000 production round is a trial towards standardization:
– same layout (OS, installation)
– same scripts (T.Wildish) for non-Objy data transfer
– first use of GRID tools (see talk by A.Samar)
– validation procedure for production sites
Collateral Activities
• Linux + CMS software automatic installation kit (INFN)
• Globus installation kit (INFN)
• Production monitoring tools with Web interface
What is missing?
• Scripts and tools are still too specific and not robust enough; we need practice on this scale
• Information service needs a clear definition in our context and then an effective implementation (see later)
• File replication management is just appearing and needs careful evaluation
Ideas for Replica Management
• A case study with Objectivity/DB (thanks to C.Grandi, INFN Bologna)
– can be extended to any kind of file
Cloning federations
• Cloned federations have a local catalog (boot file)
– It is possible to manage each of them in an independent way. Some databases may be attached (or exist) only in one site
– "Manual work" is needed to keep the schemas synchronized (this is not the key point today...)
Cloning federations
[Diagram: the CERN FD (federation) holds databases DB1 ... DBn with its CERN Boot file; cloned federations RC1 FD and RC2 FD each have their own boot file (RC1 Boot, RC2 Boot); a clone FD may hold additional databases DB_a and DB_b that exist only at that site.]
Productions
• Using a DB-id pre-allocation system it is possible to produce databases at RCs which can then be exported to other sites
– A notification system is needed to inform other sites when a database is completed
– This is today accomplished by GDMP using a publish-subscribe mechanism
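The pre-allocation idea above can be sketched as follows: each Regional Centre is handed a disjoint block of DB-ids up front, so databases produced independently never clash when merged into one federation. The class name, method names and block size here are illustrative assumptions, not the actual CMS tooling.

```python
# Sketch of DB-id pre-allocation: each production site receives a disjoint,
# contiguous block of database ids, so databases created at different RCs
# can later be attached to one federation without id collisions.
# All names and the block size are illustrative assumptions.

class DBIdAllocator:
    def __init__(self, block_size=1000):
        self.block_size = block_size
        self.next_block_start = 1
        self.site_ranges = {}   # site name -> (first id, last id)

    def allocate_block(self, site):
        """Reserve the next contiguous id block for a production site."""
        start = self.next_block_start
        end = start + self.block_size - 1
        self.next_block_start = end + 1
        self.site_ranges[site] = (start, end)
        return start, end

allocator = DBIdAllocator(block_size=1000)
cern_range = allocator.allocate_block("CERN")   # (1, 1000)
rc1_range = allocator.allocate_block("RC1")     # (1001, 2000)
```

Because the blocks never overlap, a database completed at RC1 can be exported and attached anywhere without renumbering.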
Productions
• When a site receives notification, it can:
– ooattachdb to the remote site DB
– copy the DB and ooattachdb it locally
– ignore it
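The three choices above can be sketched as a small policy handler. The function and helper names are hypothetical; the real mechanism is GDMP notification plus Objectivity's ooattachdb tool.

```python
# Sketch of the three options a site has when notified that a database is
# complete: reference the remote copy, copy it locally and attach the copy,
# or ignore the notification. All helper names are illustrative assumptions.

def handle_notification(db_name, remote_path, policy, catalog):
    """Apply the site's replica policy to a 'database completed' notification."""
    if policy == "attach-remote":
        # Point the local catalog at the remote copy (ooattachdb to remote).
        catalog[db_name] = remote_path
    elif policy == "copy-local":
        # Transfer the file, then attach the local copy.
        catalog[db_name] = copy_file(remote_path)
    # policy == "ignore": do nothing

def copy_file(remote_path):
    # Placeholder for the actual file transfer (e.g. done by GDMP).
    return "/local/data/" + remote_path.split("/")[-1]

catalog = {}
handle_notification("DB5.DB", "pc.rc1.net::/pc/data/DB5.DB", "copy-local", catalog)
# catalog now maps "DB5.DB" to the local copy's path
```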
Productions
[Diagram: the CERN FD still holds DB1 ... DBn with the CERN Boot file; the RC1 FD (RC1 Boot) has added locally produced databases DBn+1 ... DBn+m, and the RC2 FD (RC2 Boot) has added DBn+m+1 ... DBn+m+k, using their pre-allocated id ranges.]
Analysis
• In each site a complete catalog with the location of all the datasets is needed; some DBs are local and some are remote
• If more copies of a DB are available, it would be nice to have the closest one in the local catalog (e.g. as estimated via NWS, the Network Weather Service)
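Picking the "closest" copy can be sketched as choosing the replica whose host has the lowest estimated access cost, of the kind an NWS forecast could supply. The cost figures and the helper below are invented for illustration; only the replica paths echo the examples in this talk.

```python
# Sketch: select the "closest" replica of a database using per-host cost
# estimates (e.g. predicted transfer time from a Network Weather Service).
# The cost numbers and function name are illustrative assumptions.

def closest_replica(replicas, cost):
    """Return the replica whose host has the lowest estimated access cost."""
    return min(replicas, key=lambda r: cost.get(r.split("::")[0], float("inf")))

replicas = [
    "shift23.cern.ch::/db45/Hmm2.hits.DB",
    "pccms1.bo.infn.it::/data1/Hmm2.hits.DB",
]
# Hypothetical cost estimates, e.g. seconds per MB as seen from Bologna.
cost = {"shift23.cern.ch": 8.5, "pccms1.bo.infn.it": 0.3}
best = closest_replica(replicas, cost)   # the pccms1.bo.infn.it copy
```

A host missing from the estimates gets infinite cost, so it is chosen only when no measured replica exists.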
Information service
• Create an Information Service with information about all the replicas of the databases (GIS?)
• In each RC there is a reference catalog which is updated taking into account the available replicas
• It is even possible to have a catalog created on-the-fly only for the datasets needed by a job
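The on-the-fly catalog in the last bullet can be sketched as filtering the Information Service's full replica list down to the datasets one job needs. The data layout and function name are illustrative assumptions.

```python
# Sketch of an on-the-fly mini-catalog: restrict the full replica catalog
# (as the Information Service would provide it) to the datasets actually
# requested by a job. Structures and names are illustrative assumptions.

def mini_catalog(full_catalog, needed_datasets):
    """Keep only the catalog entries for the requested datasets."""
    return {ds: full_catalog[ds] for ds in needed_datasets if ds in full_catalog}

full_catalog = {
    "Hmm.1.hits.DB": ["pccms1.bo.infn.it::/data1/Hmm1.hits.DB",
                      "shift23.cern.ch::/db45/Hmm1.hits.DB"],
    "Hee.1.hits.DB": ["shift49.cern.ch::/db123/Hee1.hits.DB"],
}
job_view = mini_catalog(full_catalog, ["Hee.1.hits.DB"])
# job_view contains only the Hee.1.hits.DB entry
```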
Analysis
[Diagram: at analysis time each site's catalog lists every database: the CERN FD's DB1 ... DBn plus the RC-produced DBn+1 ... DBn+m (RC1) and DBn+m+1 ... DBn+m+k (RC2), with several databases replicated at more than one site.]
Logical vs Physical Datasets
• Each dataset is composed of one or more databases
– datasets are managed by application-sw
• Each DB is uniquely identified by a DBid
– DBid assignment is a logical-db creation
• The physical-db is the file
– zero, one or more instances
• The IS manages the link between a dataset, its logical-dbs and its physical-dbs
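The three-level link the IS maintains can be sketched as a pair of mappings: dataset to logical DBs, and DBid to physical replicas. The ids and paths below echo the example on the next slide; the dictionary layout and function are illustrative assumptions.

```python
# Sketch of the Information Service's three-level mapping:
# dataset -> logical DBs (names) -> DBid -> physical replicas (host::path).
# Ids and paths echo the talk's H->2mu example; the layout is an assumption.

datasets = {"H->2mu": ["Hmm.1.hits.DB", "Hmm.2.hits.DB", "Hmm.3.hits.DB"]}

logical_dbs = {"Hmm.1.hits.DB": 12345,
               "Hmm.2.hits.DB": 12346,
               "Hmm.3.hits.DB": 12347}

physical_dbs = {   # a logical db may have zero, one or more instances
    12345: ["pccms1.bo.infn.it::/data1/Hmm1.hits.DB",
            "shift23.cern.ch::/db45/Hmm1.hits.DB"],
    12346: ["pccms1.bo.infn.it::/data1/Hmm2.hits.DB",
            "shift23.cern.ch::/db45/Hmm2.hits.DB",
            "pccms3.pd.infn.it::/data3/Hmm2.hits.DB"],
    12347: ["shift23.cern.ch::/db45/Hmm3.hits.DB"],
}

def replicas_of(dataset):
    """All physical instances backing a dataset, via its logical dbs."""
    return [path
            for name in datasets[dataset]
            for path in physical_dbs[logical_dbs[name]]]
```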
Logical vs Physical Datasets
Dataset: H→2μ
– Hmm.1.hits.DB (id=12345): pccms1.bo.infn.it::/data1/Hmm1.hits.DB, shift23.cern.ch::/db45/Hmm1.hits.DB
– Hmm.2.hits.DB (id=12346): pccms1.bo.infn.it::/data1/Hmm2.hits.DB, shift23.cern.ch::/db45/Hmm2.hits.DB, pccms3.pd.infn.it::/data3/Hmm2.hits.DB
– Hmm.3.hits.DB (id=12347): shift23.cern.ch::/db45/Hmm3.hits.DB
Dataset: H→2e
– Hee.1.hits.DB (id=5678): pccms5.roma1.infn.it::/data/Hee1.hits.DB, shift49.cern.ch::/db123/Hee1.hits.DB
– Hee.2.hits.DB (id=5679): pccms5.roma1.infn.it::/data/Hee2.hits.DB, shift49.cern.ch::/db123/Hee2.hits.DB
– Hee.3.hits.DB (id=5680): shift49.cern.ch::/db123/Hee3.hits.DB, pccms5.roma1.infn.it::/data/Hee3.hits.DB
Database creation
• In each production site we have:
– a production federation including incomplete databases
– a reference federation with only complete databases (both local and remote ones)
• When a DB is completed it is attached to the site reference federation
• The IS monitors the reference federations of all the sites and updates the database list
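The last bullet can be sketched as the IS merging each site's reference-federation catalog into one global database list. The catalog layout mirrors the slide examples (id, name, locations); the merge function itself is an illustrative assumption.

```python
# Sketch: the Information Service polls each site's reference federation
# catalog and merges them into a global database list. Catalog entries are
# (name, [locations]) keyed by DBid; the merge logic is an assumption.

def merge_catalogs(site_catalogs):
    """site_catalogs: {site: {db_id: (name, [locations])}} -> merged view."""
    merged = {}
    for catalog in site_catalogs.values():
        for db_id, (name, locations) in catalog.items():
            entry = merged.setdefault(db_id, (name, []))
            for loc in locations:
                if loc not in entry[1]:       # avoid duplicate locations
                    entry[1].append(loc)
    return merged

site_catalogs = {
    "CERN": {1: ("DB1.DB", ["shift.cern.ch::/shift/data"]),
             4: ("DB4.DB", ["shift.cern.ch::/shift/data"])},
    "RC1":  {4: ("DB4.DB", ["pc.rc1.net::/pc/data"])},
}
global_view = merge_catalogs(site_catalogs)
# DB4 now lists both its CERN and RC1 locations
```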
Database creation
[Diagram: the RC1 production federation (pc.rc1.net) completes DB4 and DB5; each is attached to the RC1 reference federation and exported to the CERN FD (shift.cern.ch). The catalog evolves in three steps:
0001 DB1.DB shift.cern.ch::/shift/data, 0002 DB2.DB shift.cern.ch::/shift/data, 0003 DB3.DB shift.cern.ch::/shift/data, 0004 DB4.DB pc.rc1.net::/pc/data shift.cern.ch::/shift/data, 0005 (pending);
then 0005 DB5.DB pc.rc1.net::/pc/data once DB5 is complete;
then 0005 DB5.DB pc.rc1.net::/pc/data shift.cern.ch::/shift/data after replication to CERN.]
Replica Management
• In case of multiple copies of the same DB each site may choose which copy to use:
– it should be possible to update the reference federation at given times
– it should be possible to create on-the-fly a mini-catalog only with information about the datasets requested by a job
• this kind of operation is managed by application-sw (e.g. ORCA)
Replica Management
[Diagram: the CERN FD holds DB1, DB2 and DB3 at shift.cern.ch, with replicas of DB1 and DB2 at pc1.bo.infn.it; the BO (pc1.bo.infn.it) and PD (pc1.pd.infn.it) reference federations each choose which copies to list. Example catalog snapshots:
0001 DB1.DB shift.cern.ch::/shift/data pc1.bo.infn.it::/data, 0002 DB2.DB shift.cern.ch::/shift/data, 0003 DB3.DB shift.cern.ch::/shift/data;
0001 DB1.DB shift.cern.ch::/shift/data pc1.bo.infn.it::/data, 0002 DB2.DB shift.cern.ch::/shift/data pc1.bo.infn.it::/data, 0003 DB3.DB shift.cern.ch::/shift/data.]
Summary of the Case Study
• Basic functionalities of a Replica Manager for production are already implemented in GDMP
• The use of an Information Server would allow easy synchronization of federations and optimized data access during analysis
• The same functionalities offered by the Objectivity/DB catalog may be implemented for other kinds of files