Bookkeeping Tutorial

Feb 25, 2016

wilbur

Page 1: Bookkeeping Tutorial

Bookkeeping Tutorial

Page 2: Bookkeeping Tutorial

Bookkeeping content

- Contains records of all “jobs” and all “files” that are produced by production jobs
- Job:
  - In fact technically a “step” in a workflow
    - E.g. “Gauss step”, “Brunel step”…
  - For real RAW data: the “job” is in fact a DAQ run
  - Has input files (except runs and Gauss)
  - Has output files
    - Note that files may not be kept (i.e. may have no replica)
    - All files are registered in order to keep the full history
  - Has metadata
    - Location, production number, application, CPUTime, etc…
- Files:
  - Always output of a “job”
  - Files are defined by an LFN (Logical File Name)
  - Contain metadata
    - Number of events, size, event type, etc…
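
The job and file records described on this slide can be pictured with a small Python sketch. The class names, field names and example LFNs (`FileRecord`, `JobRecord`, `has_replica`, `/lhcb/example/...`) are hypothetical and chosen only for illustration; they are not the actual Bookkeeping schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileRecord:
    """A file known to the bookkeeping (hypothetical layout)."""
    lfn: str                           # Logical File Name identifying the file
    n_events: int = 0                  # metadata: number of events
    size_bytes: int = 0                # metadata: file size
    event_type: Optional[int] = None   # metadata: event type code
    has_replica: bool = True           # False if the file was registered but not kept
    # A file is always the output of a job. Excluded from repr/compare to
    # avoid recursing through the job <-> file back-reference.
    produced_by: Optional["JobRecord"] = field(default=None, repr=False, compare=False)


@dataclass
class JobRecord:
    """A 'job': one step of a workflow, or a DAQ run (hypothetical layout)."""
    application: str                   # e.g. "Gauss", "Brunel"
    production: int                    # production number
    location: str = ""                 # where the job ran
    cpu_time: float = 0.0
    inputs: List[FileRecord] = field(default_factory=list)   # empty for runs and Gauss
    outputs: List[FileRecord] = field(default_factory=list)

    def add_output(self, f: FileRecord) -> FileRecord:
        """Register an output file and remember which job produced it."""
        f.produced_by = self
        self.outputs.append(f)
        return f


# A Gauss step has no input files; the Boole step reads the Gauss output.
gauss = JobRecord(application="Gauss", production=1234)
sim = gauss.add_output(FileRecord(lfn="/lhcb/example/0001.sim", has_replica=False))
boole = JobRecord(application="Boole", production=1234, inputs=[sim])
digi = boole.add_output(FileRecord(lfn="/lhcb/example/0001.digi"))
```

Here the SIM file stays registered even though it has no replica, which is what keeps the history of the DIGI file complete.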

Page 3: Bookkeeping Tutorial

Bookkeeping purpose

- Provenance database
  - Contains the full history of productions
    - Traceability of datasets
- User dataset search
  - Select a list of files from selection criteria
    - Only files with a replica!
    - Generate Gaudi configuration file
  - Also gives access to the job/file tree
    - E.g. investigate history of a file
- Production datasets search
  - Select the dataset to be processed by production jobs
    - Ensures consistency of input files for a production
  - Uses the BK API directly to get the list of files
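
As a rough illustration of the two uses above (tracing the history of a file, and returning only files that have a replica), here is a minimal Python sketch over an in-memory table. The dictionaries, job labels and LFNs are hypothetical; the real Bookkeeping is queried through its own API, which is not shown in this tutorial.

```python
# Hypothetical in-memory bookkeeping. Each LFN maps to the job that produced
# it, and each job lists its input LFNs. All names and values are illustrative.
produced_by = {
    "/lhcb/example/0001.dst": "Brunel-1",
    "/lhcb/example/0001.digi": "Boole-1",
    "/lhcb/example/0001.sim": "Gauss-1",
}
job_inputs = {
    "Brunel-1": ["/lhcb/example/0001.digi"],
    "Boole-1": ["/lhcb/example/0001.sim"],
    "Gauss-1": [],  # Gauss has no input files
}
has_replica = {
    "/lhcb/example/0001.dst": True,
    "/lhcb/example/0001.digi": False,  # registered for history, but not kept
    "/lhcb/example/0001.sim": False,
}


def history(lfn):
    """Return (job, lfn) pairs, starting at `lfn` and walking back through its ancestors."""
    chain = []
    pending = [lfn]
    seen = set()
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        job = produced_by[current]
        chain.append((job, current))
        pending.extend(job_inputs[job])
    return chain


def selectable(lfns):
    """Keep only the files a user search would return: those with a replica."""
    return [lfn for lfn in lfns if has_replica.get(lfn, False)]


trace = history("/lhcb/example/0001.dst")
dataset = selectable(produced_by)
```

The history walk still reaches the DIGI and SIM files even though neither has a replica, while the dataset search returns only the DST.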

Page 4: Bookkeeping Tutorial

Bookkeeping partitioning

- Configuration Name / version
  - Real data
    - <DAQ partition> / <activity>
  - Simulated data
    - “MC” / <activity>
      - <activity> : “2008” / “DC06” / …
- Conditions
  - Parameters of initial data
    - All subsequent processed data inherit the “conditions”
  - Real data
    - DAQ conditions
      - Beam conditions, energy, magnetic field, detector conditions…
  - Simulated data
    - Simulation conditions
      - Beam energy, magnetic field, luminosity, generator settings…

Page 5: Bookkeeping Tutorial

Processing pass

- Associated to a level of processing
  - Within a given partition (config name / version + conditions)
  - Corresponds to the whole processing workflow
    - Single workflow for a given processing pass
    - Compatible versions of applications
  - Specifies the processing pass of input data when applicable
    - Sequence of processing
  - Re-processing creates branches

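[Diagram: processing chain. Application steps with their output file types: Gauss (SIM), Boole (DIGI), Brunel (DST), DaVinci (ETC), plus a second Brunel (DST) branch. Processing pass labels: SimReco, Stripping.]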
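
The statement that each processing pass records the pass of its input data, and that re-processing creates branches, can be sketched as a small tree. The pass names with numeric suffixes below are hypothetical; only "SimReco" and "Stripping" appear in the diagram.

```python
# Hypothetical processing passes within one partition. Each pass maps to the
# processing pass of its input data (None when it starts from the initial data).
parents = {
    "SimReco": None,
    "Stripping-1": "SimReco",
    "Stripping-2": "SimReco",  # a re-processing of the same input: a new branch
}


def processing_sequence(pass_name, parents):
    """Return the sequence of passes from the initial data up to `pass_name`."""
    chain = []
    current = pass_name
    while current is not None:
        chain.append(current)
        current = parents[current]
    return list(reversed(chain))


def branches(parents):
    """Return the passes whose output was processed more than once."""
    children = {}
    for name, parent in parents.items():
        children.setdefault(parent, []).append(name)
    return {p: kids for p, kids in children.items() if p is not None and len(kids) > 1}


sequence = processing_sequence("Stripping-2", parents)
branch_points = branches(parents)
```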

Page 6: Bookkeeping Tutorial

Other query parameters

- Event type
  - File property
  - Real data
    - 90000000 : real data full stream
    - 90000001 : real data express stream
    - Types to be defined for stripping streams
  - Simulated data
    - LHCb convention for decay tree
- File type
  - Data content / format
    - Format not yet used
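
Taken together, the partition, the processing pass and the two parameters above identify a dataset. The following Python sketch shows one way to hold them. The `BKQuery` class, its `as_path` layout and the example values are assumptions made for illustration; only the two real-data event type codes come from the slide.

```python
from dataclasses import dataclass

# Event type codes quoted on the slide for real data.
REAL_DATA_EVENT_TYPES = {
    90000000: "real data full stream",
    90000001: "real data express stream",
}


@dataclass(frozen=True)
class BKQuery:
    """Hypothetical container for the selection criteria of a dataset."""
    config_name: str      # DAQ partition for real data, "MC" for simulation
    config_version: str   # the <activity>, e.g. "2008" or "DC06"
    conditions: str
    processing_pass: str
    event_type: int
    file_type: str

    def as_path(self) -> str:
        """Join the criteria into a slash-separated string (illustrative layout only)."""
        parts = [
            self.config_name,
            self.config_version,
            self.conditions,
            self.processing_pass,
            str(self.event_type),
            self.file_type,
        ]
        return "/" + "/".join(parts)

    def describe_event_type(self) -> str:
        """Name the known real-data streams; other codes are left unclassified."""
        return REAL_DATA_EVENT_TYPES.get(
            self.event_type, "other (simulated decay code or stream not yet defined)"
        )


query = BKQuery(
    config_name="LHCb",            # hypothetical DAQ partition name
    config_version="2008",
    conditions="example-conditions",
    processing_pass="example-pass",
    event_type=90000000,
    file_type="DST",
)
```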

Page 7: Bookkeeping Tutorial

Running the bookkeeping GUI

- Needs a valid Grid certificate
- Needs an X server
- lhcb-bkk
  - SetupProject Dirac
    - Sets up the environment
  - If needed: lhcb-proxy-init
    - Creates a proxy
  - dirac-bookkeeping-gui
- Individual commands can be issued from the prompt!

Page 8: Bookkeeping Tutorial

The query tree

Page 9: Bookkeeping Tutorial

More info

- Right click on
  - Conditions
  - Processing pass

Page 10: Bookkeeping Tutorial

Event type and file type

Page 11: Bookkeeping Tutorial

Dataset selection

Logical File name

Page 12: Bookkeeping Tutorial

Saving configuration (a.k.a. options) file

- Python configuration (default)
  - Still possible to create .opts (discouraged!)
  - .txt file for just a list of LFNs
- All files or selected files (if any)
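
To give an idea of what the two main output formats might contain, the sketch below renders a list of LFNs either as a plain text list or as a Python options fragment. The exact text the GUI writes is not shown in this tutorial; the `EventSelector().Input` line format is an assumption modelled on Gaudi Python options of that period, and the function names and LFN are hypothetical.

```python
def lfn_list_text(lfns):
    """Render the .txt variant: one LFN per line."""
    return "\n".join(lfns) + "\n"


def python_options_text(lfns):
    """Render a Python options fragment listing the LFNs as input files.

    The line format is an assumption, not the GUI's verified output.
    """
    lines = [
        "from Gaudi.Configuration import *",
        "",
        "EventSelector().Input = [",
    ]
    for lfn in lfns:
        lines.append(f"    \"DATAFILE='LFN:{lfn}' TYP='POOL_ROOTTREE' OPT='READ'\",")
    lines.append("]")
    return "\n".join(lines) + "\n"


selected = ["/lhcb/example/0001.dst"]
txt = lfn_list_text(selected)
options = python_options_text(selected)
```

Writing either string to disk with `open(path, "w")` then gives a file of the corresponding kind.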

Page 13: Bookkeeping Tutorial

Dealing with PFNs or XML catalogs

- Using ganga + DIRAC
  - Bookkeeping integrated in ganga:
    - dataset = browseBK()
  - LFN handling is then automatic…
- If you really need XML catalog or PFNs, use genXMLCatalog
  - Ensures files are available on the specified site
  - Gets the PFN from the Storage Element
    - Not constructed “by hand”

Page 14: Bookkeeping Tutorial

Dealing with XML catalog and PFNs

Page 15: Bookkeeping Tutorial

DIRAC Monitoring web portal

Page 16: Bookkeeping Tutorial

DIRAC Monitoring Tutorial

General information

- Entry point to the DIRAC web portal
  - http://dirac.cern.ch
- Web implementation of (almost) a full desktop application
  - Monitoring of productions / jobs
  - Accounting (jobs, data management)
  - Allows actions to be taken on jobs
- Authentication / authorisation is mandatory
  - Anonymous access gives minimal access
  - Get a certificate and load it in your browser
    - https://twiki.cern.ch/twiki/bin/view/LHCb/FAQ/Certificate
  - DIRAC authorisation through “DIRAC groups”
    - Default: lhcb_user
    - Other groups: lhcb_prod, dirac_admin…
    - Future: specific groups per physics group, PPG (for production authorisation)…
    - Capabilities depend on the group

Page 17: Bookkeeping Tutorial

The DIRAC portal home page

[Screenshot of the portal home page, with callouts: Identity, DIRAC group, DIRAC instance, Menus]

Page 18: Bookkeeping Tutorial

Job Monitoring

[Screenshot of the Job Monitoring page, with callouts: Selection, Monitoring info, Actions]

Page 19: Bookkeeping Tutorial

Job Monitoring (cont’d)

- Selection
  - For group lhcb_user, only see your own jobs
  - Can select with
    - Status
    - Site
    - Date
    - …
- Columns
  - Can tailor the columns to be displayed
  - Clicking toggles the sorting in the column
- Rows
  - Jobs displayed in pages (default 25 rows, don’t exceed 100)
  - Can scroll pages
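
The selection, sorting and paging behaviour described above can be mimicked in a few lines of Python. This is only a sketch of the logic, not the portal's code: the `Job` fields, the user names, the site name and the assumption that groups other than lhcb_user see all jobs are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    job_id: int
    owner: str
    status: str
    site: str


def select_jobs(jobs: List[Job], user: str, group: str,
                status: Optional[str] = None,
                site: Optional[str] = None) -> List[Job]:
    """Apply the selection criteria; lhcb_user only sees the user's own jobs."""
    selected = []
    for job in jobs:
        if group == "lhcb_user" and job.owner != user:
            continue  # assumption: other groups are not restricted to their own jobs
        if status is not None and job.status != status:
            continue
        if site is not None and job.site != site:
            continue
        selected.append(job)
    return selected


def sort_rows(rows: List[Job], column: str, descending: bool = False) -> List[Job]:
    """Sort by one column; flipping `descending` mimics a second click on the header."""
    return sorted(rows, key=lambda job: getattr(job, column), reverse=descending)


def page(rows: List[Job], page_number: int, page_size: int = 25) -> List[Job]:
    """Return one page of rows (default 25 per page, at most 100)."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    start = (page_number - 1) * page_size
    return rows[start:start + page_size]


# 60 illustrative jobs split between two users.
jobs = [
    Job(i, "alice" if i % 2 == 0 else "bob", "Done" if i % 3 else "Failed", "LCG.CERN.ch")
    for i in range(60)
]
mine = select_jobs(jobs, user="alice", group="lhcb_user")
second_page = page(sort_rows(mine, "job_id"), page_number=2)
```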

Page 20: Bookkeeping Tutorial

Logging info

Page 21: Bookkeeping Tutorial

Output peeking

Page 22: Bookkeeping Tutorial

Attributes

Page 23: Bookkeeping Tutorial

Parameters