16-20 October 2000, ACAT2000 - FNAL
Management of Large Scale Data Productions for the CMS Experiment
Presented by L.M.Barone, Università di Roma & INFN
The Framework
• The CMS experiment is producing a large amount of MC data for the development of High Level Trigger (HLT) algorithms for fast data reduction at LHC
• Current production is half traditional (Pythia + CMSIM/Geant3), half OO (ORCA using Objectivity/DB)
The Problem
• Data size: ~10^6 - 10^7 events at 1 MB/ev, i.e. ~10^4 files (typically 500 evts/file)
• Resource dispersion: many production sites (CERN, FNAL, Caltech, INFN etc.)
• Dealing with actual MC productions and not with 2005 data taking
The Problem (cont’d)
• Data Relocation: data produced in site A are stored centrally (CERN); site B may need a fraction of them; combinatorics increasing
• Objectivity/DB does not make life easier (but the problem would exist anyway)
ORCA Production 2000
[Diagram: production data flow. Signal and minimum-bias (MB) events start from HEPEVT ntuples and pass through CMSIM, producing Zebra files with HITS; the ORCA ooHit Formatter writes these into an Objectivity Database. ORCA Digitization (merging signal and MB) fills an Objectivity Database read by the HLT Algorithms, which produce New Reconstructed Objects. Catalog import connects the MC Prod., ORCA Prod. and HLT Grp Databases; the databases are mirrored to other sites (US, Russia, Italy, ...).]
The Old Days
• Question: how was it done before?
A mix of ad hoc scripts/programs with a lot of manual intervention... but the problem was smaller and less dispersed
Requirements for a Solution
• Solution must be as automatic as possible, to decrease manpower
• Tools should be independent of data type and of site
• Network traffic should be optimized (or minimized?)
• Users need complete information on data location
Present Status
• Job creation is managed by a variety of scripts in different sites
• Job submission again goes through diverse methods, from UNIX commands to LSF or Condor
• File transfer has been managed up to now by Perl scripts, which are not generic and not site independent
Present Status (cont’d)
• The autumn 2000 production round is a trial towards standardization:
– same layout (OS, installation)
– same scripts (T.Wildish) for non-Objy data transfer
– first use of GRID tools (see talk by A.Samar)
– validation procedure for production sites
Collateral Activities
• Linux + CMS software automatic installation kit (INFN)
• Globus installation kit (INFN)
• Production monitoring tools with Web interface
What is missing?
• Scripts and tools are still too specific and not robust enough; we need practice on this scale
• Information service needs a clear definition in our context and then an effective implementation (see later)
• File replication management is just appearing and needs careful evaluation
Ideas for Replica Management
• A case study with Objectivity/DB (thanks to C.Grandi, INFN Bologna)
– can be extended to any kind of file
Cloning federations
• Cloned federations have a local catalog (boot file)
– It is possible to manage each of them in an independent way. Some databases may be attached (or exist) only in one site
– "Manual work" is needed to keep the schemas synchronized (this is not the key point today...)
Cloning federations
[Diagram: the CERN FD (federation) holds databases DB1 ... DBn with its CERN Boot file; cloned federations RC1 FD and RC2 FD each have their own boot file (RC1 Boot, RC2 Boot); a clone FD may hold additional databases DB_a and DB_b that exist only at that site.]
Productions
• Using a DB-id pre-allocation system it is possible to produce databases at RCs which can then be exported to other sites
– A notification system is needed to inform other sites when a database is completed
– This is today accomplished by GDMP using a publish-subscribe mechanism
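The pre-allocation idea above can be sketched as follows: each Regional Centre is handed a disjoint block of DB-ids up front, so databases produced independently never clash when merged into one federation. The class name, method names and block size here are illustrative assumptions, not the actual CMS tooling.

```python
# Sketch of DB-id pre-allocation: each production site receives a disjoint,
# contiguous block of database ids, so databases created at different RCs
# can later be attached to one federation without id collisions.
# All names and the block size are illustrative assumptions.

class DBIdAllocator:
    def __init__(self, block_size=1000):
        self.block_size = block_size
        self.next_block_start = 1
        self.site_ranges = {}   # site name -> (first id, last id)

    def allocate_block(self, site):
        """Reserve the next contiguous id block for a production site."""
        start = self.next_block_start
        end = start + self.block_size - 1
        self.next_block_start = end + 1
        self.site_ranges[site] = (start, end)
        return start, end

allocator = DBIdAllocator(block_size=1000)
cern_range = allocator.allocate_block("CERN")   # (1, 1000)
rc1_range = allocator.allocate_block("RC1")     # (1001, 2000)
```

Because the blocks never overlap, a database completed at RC1 can be exported and attached anywhere without renumbering.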
Productions
• When a site receives notification, it can:
– ooattachdb to the remote site DB
– copy the DB and ooattachdb it locally
– ignore it
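The three choices above can be sketched as a small policy handler. The function and helper names are hypothetical; the real mechanism is GDMP notification plus Objectivity's ooattachdb tool.

```python
# Sketch of the three options a site has when notified that a database is
# complete: reference the remote copy, copy it locally and attach the copy,
# or ignore the notification. All helper names are illustrative assumptions.

def handle_notification(db_name, remote_path, policy, catalog):
    """Apply the site's replica policy to a 'database completed' notification."""
    if policy == "attach-remote":
        # Point the local catalog at the remote copy (ooattachdb to remote).
        catalog[db_name] = remote_path
    elif policy == "copy-local":
        # Transfer the file, then attach the local copy.
        catalog[db_name] = copy_file(remote_path)
    # policy == "ignore": do nothing

def copy_file(remote_path):
    # Placeholder for the actual file transfer (e.g. done by GDMP).
    return "/local/data/" + remote_path.split("/")[-1]

catalog = {}
handle_notification("DB5.DB", "pc.rc1.net::/pc/data/DB5.DB", "copy-local", catalog)
# catalog now maps "DB5.DB" to the local copy's path
```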
Productions
[Diagram: the CERN FD still holds DB1 ... DBn with the CERN Boot file; the RC1 FD (RC1 Boot) has added locally produced databases DBn+1 ... DBn+m, and the RC2 FD (RC2 Boot) has added DBn+m+1 ... DBn+m+k, using their pre-allocated id ranges.]
Analysis
• In each site a complete catalog with the location of all the datasets is needed; some DBs are local and some are remote
• If more copies of a DB are available, it would be nice to have the closest one in the local catalog (e.g. as estimated via NWS, the Network Weather Service)
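Picking the "closest" copy can be sketched as choosing the replica whose host has the lowest estimated access cost, of the kind an NWS forecast could supply. The cost figures and the helper below are invented for illustration; only the replica paths echo the examples in this talk.

```python
# Sketch: select the "closest" replica of a database using per-host cost
# estimates (e.g. predicted transfer time from a Network Weather Service).
# The cost numbers and function name are illustrative assumptions.

def closest_replica(replicas, cost):
    """Return the replica whose host has the lowest estimated access cost."""
    return min(replicas, key=lambda r: cost.get(r.split("::")[0], float("inf")))

replicas = [
    "shift23.cern.ch::/db45/Hmm2.hits.DB",
    "pccms1.bo.infn.it::/data1/Hmm2.hits.DB",
]
# Hypothetical cost estimates, e.g. seconds per MB as seen from Bologna.
cost = {"shift23.cern.ch": 8.5, "pccms1.bo.infn.it": 0.3}
best = closest_replica(replicas, cost)   # the pccms1.bo.infn.it copy
```

A host missing from the estimates gets infinite cost, so it is chosen only when no measured replica exists.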
Information service
• Create an Information Service with information about all the replicas of the databases (GIS?)
• In each RC there is a reference catalog which is updated taking into account the available replicas
• It is even possible to have a catalog created on-the-fly only for the datasets needed by a job
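The on-the-fly catalog in the last bullet can be sketched as filtering the Information Service's full replica list down to the datasets one job needs. The data layout and function name are illustrative assumptions.

```python
# Sketch of an on-the-fly mini-catalog: restrict the full replica catalog
# (as the Information Service would provide it) to the datasets actually
# requested by a job. Structures and names are illustrative assumptions.

def mini_catalog(full_catalog, needed_datasets):
    """Keep only the catalog entries for the requested datasets."""
    return {ds: full_catalog[ds] for ds in needed_datasets if ds in full_catalog}

full_catalog = {
    "Hmm.1.hits.DB": ["pccms1.bo.infn.it::/data1/Hmm1.hits.DB",
                      "shift23.cern.ch::/db45/Hmm1.hits.DB"],
    "Hee.1.hits.DB": ["shift49.cern.ch::/db123/Hee1.hits.DB"],
}
job_view = mini_catalog(full_catalog, ["Hee.1.hits.DB"])
# job_view contains only the Hee.1.hits.DB entry
```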
Analysis
[Diagram: at analysis time each site's catalog lists every database: the CERN FD's DB1 ... DBn plus the RC-produced DBn+1 ... DBn+m (RC1) and DBn+m+1 ... DBn+m+k (RC2), with several databases replicated at more than one site.]
Logical vs Physical Datasets
• Each dataset is composed of one or more databases
– datasets are managed by application-sw
• Each DB is uniquely identified by a DBid
– DBid assignment is a logical-db creation
• The physical-db is the file
– zero, one or more instances
• The IS manages the link between a dataset, its logical-dbs and its physical-dbs
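The three-level link the IS maintains can be sketched as a pair of mappings: dataset to logical DBs, and DBid to physical replicas. The ids and paths below echo the example on the next slide; the dictionary layout and function are illustrative assumptions.

```python
# Sketch of the Information Service's three-level mapping:
# dataset -> logical DBs (names) -> DBid -> physical replicas (host::path).
# Ids and paths echo the talk's H->2mu example; the layout is an assumption.

datasets = {"H->2mu": ["Hmm.1.hits.DB", "Hmm.2.hits.DB", "Hmm.3.hits.DB"]}

logical_dbs = {"Hmm.1.hits.DB": 12345,
               "Hmm.2.hits.DB": 12346,
               "Hmm.3.hits.DB": 12347}

physical_dbs = {   # a logical db may have zero, one or more instances
    12345: ["pccms1.bo.infn.it::/data1/Hmm1.hits.DB",
            "shift23.cern.ch::/db45/Hmm1.hits.DB"],
    12346: ["pccms1.bo.infn.it::/data1/Hmm2.hits.DB",
            "shift23.cern.ch::/db45/Hmm2.hits.DB",
            "pccms3.pd.infn.it::/data3/Hmm2.hits.DB"],
    12347: ["shift23.cern.ch::/db45/Hmm3.hits.DB"],
}

def replicas_of(dataset):
    """All physical instances backing a dataset, via its logical dbs."""
    return [path
            for name in datasets[dataset]
            for path in physical_dbs[logical_dbs[name]]]
```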
Logical vs Physical Datasets
Dataset: H→2μ
– Hmm.1.hits.DB (id=12345): pccms1.bo.infn.it::/data1/Hmm1.hits.DB, shift23.cern.ch::/db45/Hmm1.hits.DB
– Hmm.2.hits.DB (id=12346): pccms1.bo.infn.it::/data1/Hmm2.hits.DB, shift23.cern.ch::/db45/Hmm2.hits.DB, pccms3.pd.infn.it::/data3/Hmm2.hits.DB
– Hmm.3.hits.DB (id=12347): shift23.cern.ch::/db45/Hmm3.hits.DB
Dataset: H→2e
– Hee.1.hits.DB (id=5678): pccms5.roma1.infn.it::/data/Hee1.hits.DB, shift49.cern.ch::/db123/Hee1.hits.DB
– Hee.2.hits.DB (id=5679): pccms5.roma1.infn.it::/data/Hee2.hits.DB, shift49.cern.ch::/db123/Hee2.hits.DB
– Hee.3.hits.DB (id=5680): shift49.cern.ch::/db123/Hee3.hits.DB, pccms5.roma1.infn.it::/data/Hee3.hits.DB
Database creation
• In each production site we have:
– a production federation including incomplete databases
– a reference federation with only complete databases (both local and remote ones)
• When a DB is completed it is attached to the site reference federation
• The IS monitors the reference federations of all the sites and updates the database list
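The last bullet can be sketched as the IS merging each site's reference-federation catalog into one global database list. The catalog layout mirrors the slide examples (id, name, locations); the merge function itself is an illustrative assumption.

```python
# Sketch: the Information Service polls each site's reference federation
# catalog and merges them into a global database list. Catalog entries are
# (name, [locations]) keyed by DBid; the merge logic is an assumption.

def merge_catalogs(site_catalogs):
    """site_catalogs: {site: {db_id: (name, [locations])}} -> merged view."""
    merged = {}
    for catalog in site_catalogs.values():
        for db_id, (name, locations) in catalog.items():
            entry = merged.setdefault(db_id, (name, []))
            for loc in locations:
                if loc not in entry[1]:       # avoid duplicate locations
                    entry[1].append(loc)
    return merged

site_catalogs = {
    "CERN": {1: ("DB1.DB", ["shift.cern.ch::/shift/data"]),
             4: ("DB4.DB", ["shift.cern.ch::/shift/data"])},
    "RC1":  {4: ("DB4.DB", ["pc.rc1.net::/pc/data"])},
}
global_view = merge_catalogs(site_catalogs)
# DB4 now lists both its CERN and RC1 locations
```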
Database creation
[Diagram: the RC1 production federation (pc.rc1.net) completes DB4 and DB5; each is attached to the RC1 reference federation and exported to the CERN FD (shift.cern.ch). The catalog evolves in three steps:
0001 DB1.DB shift.cern.ch::/shift/data, 0002 DB2.DB shift.cern.ch::/shift/data, 0003 DB3.DB shift.cern.ch::/shift/data, 0004 DB4.DB pc.rc1.net::/pc/data shift.cern.ch::/shift/data, 0005 (pending);
then 0005 DB5.DB pc.rc1.net::/pc/data once DB5 is complete;
then 0005 DB5.DB pc.rc1.net::/pc/data shift.cern.ch::/shift/data after replication to CERN.]
Replica Management
• In case of multiple copies of the same DB each site may choose which copy to use:
– it should be possible to update the reference federation at given times
– it should be possible to create on-the-fly a mini-catalog only with information about the datasets requested by a job
• this kind of operation is managed by application-sw (e.g. ORCA)
Replica Management
[Diagram: the CERN FD holds DB1, DB2 and DB3 at shift.cern.ch, with replicas of DB1 and DB2 at pc1.bo.infn.it; the BO (pc1.bo.infn.it) and PD (pc1.pd.infn.it) reference federations each choose which copies to list. Example catalog snapshots:
0001 DB1.DB shift.cern.ch::/shift/data pc1.bo.infn.it::/data, 0002 DB2.DB shift.cern.ch::/shift/data, 0003 DB3.DB shift.cern.ch::/shift/data;
0001 DB1.DB shift.cern.ch::/shift/data pc1.bo.infn.it::/data, 0002 DB2.DB shift.cern.ch::/shift/data pc1.bo.infn.it::/data, 0003 DB3.DB shift.cern.ch::/shift/data.]
Summary of the Case Study
• Basic functionalities of a Replica Manager for production are already implemented in GDMP
• The use of an Information Server would allow easy synchronization of federations and optimized data access during analysis
• The same functionalities offered by the Objectivity/DB catalog may be implemented for other kinds of files