Oct 1, 1999 G. Wormser LAL Orsay, 3rd LHC Computing Workshop
Data Analysis Tools
G. Wormser, LAL Orsay
The topics
  End-user data (statistical) Analysis Tools
  Event Displays (Data Quality Control)
The inputs
  Feedback from LHC/HEP experiments
  The various analysis packages
  HEPVis99
  Personal experience from BABAR
The key issues
Conclusions
Historical perspective: PAW
Very large ‘productivity boost’ in the physicists' community with the introduction of a universal analysis tool, PAW
  very easy to use, available everywhere
  Ntuples, MINUIT, presentation package
  Fortran interpreter
  macros/scripts (KUIP, .kumac)
No integration within the experiments' frameworks
  No overhead!
  But not possible to benefit from the infrastructure (no access to code, constants, data not in ntuples, event display)
The new environment
OO data structures (ROOT, Objectivity, etc.)
Analysis codes and tools in OO language
We want ‘PAW_OO’!
Very large datasets
Want better integration within the framework
Very powerful CPUs
Better interactivity
User Basic Requirements
Histos and ‘tuples’
Knowledge of the experiment data
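As an illustration of these two basic requirements, here is a minimal sketch of a ‘tuple’ projected into a 1-D histogram under a selection cut. It is written in Python only for brevity and is not PAW, ROOT or JAS code; the class `Histogram1D`, the function `project` and the column names `mass` and `pt` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Histogram1D:
    """Fixed-width 1-D histogram with under/overflow counters (illustrative only)."""
    nbins: int
    lo: float
    hi: float
    counts: list = field(init=False)
    underflow: int = 0
    overflow: int = 0

    def __post_init__(self):
        self.counts = [0] * self.nbins

    def fill(self, x):
        if x < self.lo:
            self.underflow += 1
        elif x >= self.hi:
            self.overflow += 1
        else:
            width = (self.hi - self.lo) / self.nbins
            # min() guards against rounding pushing x past the last bin
            index = min(int((x - self.lo) / width), self.nbins - 1)
            self.counts[index] += 1


def project(rows, column, cut, hist):
    """Fill `hist` with `column` for every row that passes `cut`."""
    for row in rows:
        if cut(row):
            hist.fill(row[column])
    return hist


# A tiny 'tuple': one dict per event, with hypothetical columns.
ntuple = [
    {"mass": 1.0, "pt": 2.0},
    {"mass": 3.0, "pt": 0.5},   # fails the cut below
    {"mass": 9.5, "pt": 1.5},
    {"mass": 12.0, "pt": 3.0},  # above the histogram range
]

hist = project(ntuple, "mass", lambda row: row["pt"] > 1.0,
               Histogram1D(nbins=5, lo=0.0, hi=10.0))
print(hist.counts, hist.overflow)
```

The point of the sketch is the split of roles: the tuple holds per-event quantities, the cut encodes knowledge of the data, and the histogram is the summary the user looks at.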
Example: Detailed requirements from ATLAS
AnT design should be modular and reusable, and allow module addition and deletion without major changes to the program.
AnT should save and restart analysis procedures in the same state as at the exit time.
AnT should provide a standard mechanism to store information and operations executed in each analysis procedure (i.e. information about a dataset, selection cuts, calibration data used - if attributes were re-calculated in an analysis job) to allow their recalculations with identical results.
AnT should provide a standard mechanism to store information on any errors encountered in any data manipulation (i.e. fitting, mathematical manipulations, display). The information should be stored in an object generated by the data operations.
AnT should provide a standard mechanism to append information on the data related to an analysis (for example - criteria used to select data and conditions used to collect data) to the analysis results.
AnT should provide a standard mechanism to store and view results of the preliminary, the intermediate, and the final stage of analysis.
AnT should allow results to be viewed interactively and, if needed, saved in a standard format for possible inclusion in informal and formal publications.
AnT should display one or more events simultaneously.
AnT should make it possible to plot, graph and represent graphically in other ways results from simple and multiple data sets.
AnT should be easy enough that its basic functionality can be learned in a short time (~ a few hours).
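The requirements on saving and restarting an analysis, and on recording the dataset, cuts, calibration and errors with it, could be met in many ways. The following is a minimal sketch, assuming a JSON file is an acceptable storage format; `AnalysisRecord`, its fields and the sample values are hypothetical and do not come from any ATLAS design.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AnalysisRecord:
    """What was done in one analysis procedure, kept so it can be redone identically."""
    dataset: str
    calibration: str
    cuts: list = field(default_factory=list)
    errors: list = field(default_factory=list)

    def add_cut(self, expression):
        self.cuts.append(expression)

    def log_error(self, operation, message):
        # Errors are stored with the record instead of being lost in a log file.
        self.errors.append({"operation": operation, "message": message})

    def save(self):
        """Serialise the full state to a JSON string."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def restore(cls, text):
        """Rebuild a record in the same state as when it was saved."""
        return cls(**json.loads(text))


record = AnalysisRecord(dataset="testbeam-run-0001", calibration="calib-v1")
record.add_cut("pt > 1.0")
record.log_error("fit", "did not converge")

restored = AnalysisRecord.restore(record.save())
print(restored == record)
```

Because the record travels with the results, a later session can see which cuts and calibration produced a given plot.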
Technical Requirements
Lifetime of the experiments > lifetime of the packages
What is ROOT
Ambitious replacement for PAW by its main author, R. Brun, and his group, written in C++
Covers all aspects of data analysis:
  Data storage (ROOT I/O)
  Statistical analysis
  C++ interpreter CINT
  Event Display
Initially built as an all-in-one package, evolving towards more modularity
‘Open source’ approach
Large and growing user base
ROOT user base
ALICE
LHCb test beam (Outer tracking)
CDF, D0
BABAR (see later)
JLC
STAR and many other nuclear physics projects
ROOT class structure
Some ROOT examples from various experiments
An Online ROOT application from ALICE
Fermilab Review Committee evaluation of ROOT (’98)
1) ROOT is a complete, full-featured package that meets the functional requirements
2) There are some trivial unacceptable features (use of CMZ, lack of build scripts) which should not be a stumbling block, but will require a formal collaboration with the ROOT team
3) There is a large, world-wide user base, but so far limited use for serious HEP analysis
4) ROOT can cope with the CDF and D0 data models
5) ROOT has an effective internal data format well matched to HEP needs
6) The present version of CINT is a potential serious drawback (buggy, undocumented, limited C++ features, hard to support, poorly engineered). This will require a decision to enhance/upgrade/replace, which would require significant work.
7) The user interface is not very friendly
8) The interconnectedness of the various modules is substantial. External modules must conform to (ROOT specific non-standard) ROOT protocols to be functional.
9) The package is not highly engineered (i.e., it has grown organically rather than been designed). The current implementation reflects this evolution; for example, it has not kept up with the C++ language standard (has its own container classes, etc.). Even beyond CINT, the product has many bugs.
10) It will require some relatively straightforward customization to support casual users
11) There is an active and responsive support team with good archives and an active mailing list
Fermilab Review Committee recommendations
RECOMMENDATIONS FOR RUN II:
We recommend that ROOT be adopted as the standard physics analysis package for Run II, contingent on a collaborative agreement with the ROOT team. It should be recognized that this recommendation depends critically on timing and on sharing development with outside collaborators, and the steering committee should assess the validity of these assumptions in evaluating the recommendation. In particular, if the requirement for an immediate choice is being driven by on-line needs (which may not require the full functionality of an off-line analysis package immediately), it needs to be determined if the components of NIRVANA that already exist are adequate for the immediate needs.
LONG-TERM RECOMMENDATIONS:
It is highly likely that by the end of RUN II (or by the time of the LHC) commercial components will be heavily used for analysis tasks. Commercial offerings should continue to be investigated and made available (perhaps on limited platforms). The Computing Division should also initiate formal collaboration with the LHC++ project so as to have some influence on the choices made and direction taken. These two initiatives, while lower priority than the immediate ROOT support and development needs, should position us to take full advantage of the expected evolution of these products.
What is JAS
Analysis framework based on Java
Developed at SLAC by T. Johnson
See the presentation by M. Ronan after this talk
Aims at similar complete functionality as ROOT
Smaller user community (NLC, BABAR online)
Java Libraries and APIs
Standard libraries and APIs
  2D + 3D graphics + GUI (Swing) + Imaging + Printing
  Database connectivity (JDBC) + ODMG
  Collections, IO (Serialization), Data Compression
  Networking, Sockets, SSL, CORBA, RMI
  Java Beans (components), Help
  Multimedia, Sound, Speech
  Security, Code Signing, Cryptography
  Math, Arbitrary Precision Math
  Shared Data (Collaborative Applications)
Huge “Community-Ware” software archive
  IBM alone has hundreds of Java resources on its alphaWorks site
Remote Data Analysis
[Diagram labels: GUI; Data Analysis Engine; User's Java Code; Experiment Interface; Java Compiler + Debugger; Experiment Extensions (Event Display); TCP/IP Network; Padded Cell; C++ Code; Data (Zebra, Jazelle, PAW, ROOT, Objectivity)]
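One plausible reading of this diagram is that only a small request and a summarized result cross the network, while the events stay next to the analysis engine. The sketch below simulates that exchange in-process, with JSON strings standing in for the TCP/IP link. It is not the JAS API and the diagram does not specify the protocol; `engine_handle`, `client_request`, the request fields and the sample events are all hypothetical.

```python
import json

# Events that live next to the engine; they are never shipped to the client.
SERVER_SIDE_EVENTS = [{"energy": e} for e in (0.5, 1.5, 2.5, 2.7, 3.9, 7.0)]


def engine_handle(request_text, events=SERVER_SIDE_EVENTS):
    """Engine side: read a request, loop over local events, return bin counts only."""
    request = json.loads(request_text)
    lo, hi, nbins = request["lo"], request["hi"], request["nbins"]
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for event in events:
        x = event[request["column"]]
        if lo <= x < hi:
            counts[min(int((x - lo) / width), nbins - 1)] += 1
    return json.dumps({"counts": counts, "events_read": len(events)})


def client_request(column, lo, hi, nbins, transport=engine_handle):
    """Client side: send a small request, receive a small summary."""
    message = json.dumps({"column": column, "lo": lo, "hi": hi, "nbins": nbins})
    return json.loads(transport(message))


result = client_request("energy", 0.0, 4.0, 4)
print(result)
```

Swapping `transport` for a function that writes to a real socket would leave the client and engine logic unchanged, which is the attraction of the design.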
Plot Display Package
1-d/2-d Histogram/ScatterPlot display
Multiple axes, direct user interaction, overlays, fitting
JAS Availability
1.0 (Beta) currently available
  Windows (NT, 95, 98) + Unix (Solaris + Linux)
  Installed on Solaris at SLAC (/usr/local/bin/jas)
Limitations
  Detailed documentation still under development
  May still be some changes to user API
BABAR: No official tool, i.e. PAW (JAS online, + ROOT)
CDF/D0: ROOT for Run II
Some words about AliROOT
The ROOT framework will provide to ALICE:
  Data Storage
  On-line monitoring
  Statistical analysis
  Event Display
ALICE Framework
CMS/ATLAS/LHCb approach
Define their data model and framework independently (e.g. GAUDI/LHCb, CARF/CMS)
Objectivity for persistency
Close collaboration with LHC++ effort
Evaluate as many products as reasonable using test beam stands (Produce documents!)
Invest in Event Displays (ATLAS, CMS)
LHCb strategy
LHCb strategy (2)
Common problems
  HEP-Analysis
  Foundation Libraries (e.g. NAG, CLHEP)
  Toolkits (e.g. HTL)
LHCb specific
  Analysis Tools, some will make use of HEP-wide toolkits
  Mathematical Libraries
  Histogramming
  Fitting and Minimization
  Visualization
  Data Access
Components exist in different stages but what about their interfaces?
LHC++ is planning to create interfaces on existing packages
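The question of interfaces on existing packages can be made concrete with a small sketch: analysis code is written against one abstract histogram interface, and each existing package is wrapped by an adapter. This is only an illustration, not the LHC++ design; `Histogram`, `DenseBackend`, `SparseBackend` and `analyse` are hypothetical names, and the two backends stand in for two independent packages.

```python
from abc import ABC, abstractmethod


def bin_index(x, lo, hi, nbins):
    """Return the bin of x in [lo, hi), or None if x is out of range."""
    if not lo <= x < hi:
        return None
    return min(int((x - lo) / ((hi - lo) / nbins)), nbins - 1)


class Histogram(ABC):
    """The common interface that analysis code depends on."""

    @abstractmethod
    def fill(self, x): ...

    @abstractmethod
    def bin_contents(self): ...


class DenseBackend(Histogram):
    """Stand-in for a package that stores every bin in a list."""

    def __init__(self, nbins, lo, hi):
        self._nbins, self._lo, self._hi = nbins, lo, hi
        self._counts = [0] * nbins

    def fill(self, x):
        i = bin_index(x, self._lo, self._hi, self._nbins)
        if i is not None:
            self._counts[i] += 1

    def bin_contents(self):
        return list(self._counts)


class SparseBackend(Histogram):
    """Stand-in for a package that stores only non-empty bins."""

    def __init__(self, nbins, lo, hi):
        self._nbins, self._lo, self._hi = nbins, lo, hi
        self._filled = {}

    def fill(self, x):
        i = bin_index(x, self._lo, self._hi, self._nbins)
        if i is not None:
            self._filled[i] = self._filled.get(i, 0) + 1

    def bin_contents(self):
        return [self._filled.get(i, 0) for i in range(self._nbins)]


def analyse(values, hist: Histogram):
    """Analysis code sees only the interface, never the backend."""
    for value in values:
        hist.fill(value)
    return hist.bin_contents()


values = [0.1, 0.4, 0.6, 0.9, 1.5]
dense = analyse(values, DenseBackend(2, 0.0, 1.0))
sparse = analyse(values, SparseBackend(2, 0.0, 1.0))
print(dense, sparse)
```

With this arrangement a package can be replaced during the lifetime of the experiment without touching `analyse`.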
Atlas Web Page
CMS Software Task Breakdown
Tracker TestBeam Online Monitoring
The trends at HEPVis99
Collaborative environment
  Try to define common interfaces
  The Open source approach
  How to get out of ‘One man-one tool’?
Distributed environment
  IDL/CORBA/JAVA
No ROOT participation
The near future
LHC++ basic histos in a few weeks
HEPVis collaboration
How to ease convergence?
This organisation was founded in 1999 by an international group of computing scientists, engineers and physicists to help the development of software tools for academic scientific research in an international and collaborative way.
A first target of this group is to define a web-based working organisation model aiming, in a first step, at the production of interactive data analysis tools for high energy and nuclear physics experiments.
We hope that this model will be sufficiently general and efficient to apply to other domains.
The event displays
Goals
  Code debugging
  Event debugging
  Quality control
Event Displays: Approaches and Contributions
ALICE: ROOT
ATLAS: WIRED (see J. Hrivnac talk)
CMS: Qt, IGUANA, HEPVis
BABAR: WIRED with CORBA
Event Display requirements
A/ Code debugging
  Needed very early in the development
  Integration with simulated objects
  Compatibility with GEANT4!
B/ Online display
  Batch mode
  Access to RAW objects
  Speed
C/ Offline analysis
  Integration with reco framework (interactivity)
  Flexibility
  Public relations
ALICE Geant3 geometry display with ROOT
CMS Interactive Graphical User Analysis (IGUANA)
Interactive Detector and Event Visualisation (CMSCAN)
Physics Analysis Tools
(Graphical) User Interfaces
Tasks include
  Assessment of HEP-wide and commercial tools
  Development of missing and CMS-specific components (e.g. Detector and Event Visualisation systems)
  Design and implementation of (Graphical) User Interfaces for CMS software systems (ORCA, OSCAR, test beam, PRS,...)
  Working closely with and contributing to HEP-wide projects (e.g. LHC++, HEPVis, GEANT4, etc.)
  Deployment, distribution, and support in the CMS environment
A General Idea of a User Application
[Diagram labels: User Application; Display; 2D/3D; Off Screen; PostScript; Metafile]
CMS Detector and Event Visualization in IGUANA
Generic software developed (collaborate with HEPVis: CDF, D0, L3,...)
Interactive Graphical User Interface and graphics manager
Deployed with ORCA (detector elements and reconstructed objects)
Extend to test-beams and OSCAR (GEANT4 for CMS) by end of 1999
[Figure labels: CMS Tracker H; Pixels; MSGC’s; BeamPipe; Silicon]
Event Displays: The new trends
HEPVis library
OpenInventor, SoFree
WIRED
CORBA
WIRED Client-Server/File Architecture
[Diagram labels: Geometry and Events; WIRED Application; WIRED Server; WIRED Gateway; WWW Server; WIRED Code; WWW Browser; WIRED Applet]
GUI (inside Netscape Browser)
WIRED connected to services via bus
  to access Data
  to access other Services
  to enable Collaboration
[Diagram labels: External Bus; Event Viewer (four instances); Event Data Server; Geometry Data Server; Event Viewer State Manager; Reconstruction Server]
The BABAR experience
Main characteristics
  Data and constants stored in Objectivity
  Very large statistics for a start-up (do not plan against you!)
    (Best achieved 45 pb^-1/day, 1.05 × 10^33)
    1 fb^-1 in 4 months (as many B-B pairs as LEP in 6 years)
  Mostly uncalibrated detector at run start
Too early to draw definitive conclusions from observed performances
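To put the quoted start-up numbers side by side, here is a short worked calculation. It assumes that 1.05 × 10^33 is the best peak luminosity in cm^-2 s^-1, that 45 pb^-1 is the best single-day integrated luminosity, and that 4 months is about 120 days; none of these readings is stated explicitly on the slide. The unit conversion 1 pb^-1 = 10^36 cm^-2 is standard.

```python
SECONDS_PER_DAY = 86_400
CM2_PER_INVERSE_PB = 1e36          # 1 pb^-1 = 1e36 cm^-2

peak_luminosity = 1.05e33          # cm^-2 s^-1 (assumed reading of "1.05 10**33")
best_day = 45.0                    # pb^-1 collected on the best day
total = 1000.0                     # 1 fb^-1 expressed in pb^-1
days_elapsed = 120                 # assumption: 4 months is about 120 days

# What a full day at constant peak luminosity would deliver.
per_day_at_peak = peak_luminosity * SECONDS_PER_DAY / CM2_PER_INVERSE_PB

# How the best day and the 4-month average compare with that ceiling.
best_day_fraction = best_day / per_day_at_peak
average_per_day = total / days_elapsed
days_at_best_rate = total / best_day

print(f"ceiling at peak:   {per_day_at_peak:.1f} pb^-1/day")
print(f"best day fraction: {best_day_fraction:.2f}")
print(f"4-month average:   {average_per_day:.1f} pb^-1/day")
print(f"days at best rate: {days_at_best_rate:.1f}")
```

Under these assumptions the best day reached roughly half of the ceiling set by the peak luminosity, and the 4-month average was well below the best day, as expected for a start-up period.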
The BABAR Tools
Prompt reconstruction immediately following data taking
REC, AOD, TAG data stored in database (AOD = microDST)
~AOD also available as PAW ntuples as a temporary initial measure
Event display incorporated in the framework
Some Confirmations
Possible to do some zero-order physics at AOD level (also meaning: not possible to do first-order physics at this level!)
Calibrations need REC
Detector performance/detector understanding needs AOD + partial REC
Event display essential
BABAR initial constraints
No export: just ~working now, 1 M AOD events at Lyon and RAL
CPU limitations and slow turnaround
Restricted access to data
Rolling calibration scheme not yet implemented
The initial problems
Slow access to data => insufficient calibration up to now and not yet optimal detector performance
Providing very easy standalone access to REC data would speed up the process
15-30% of total stats available to the average user: not too good, not too bad!
Review Committee in August 1999
[Figure label: MC width: 5-7 MeV/c]
BABAR software committee recommendations
Provide users with another fast access to data: ROOT/IO files based
  batch access to ROOT/IO files in a first step
  BABAR code interactive in ROOT (/…) at the end of the year
Send data to regional centers to reduce the burden at SLAC
Put in place a ‘Risk management plan’ to assess Objectivity progress towards design performances
Resources management at SLAC
Duplicate the Opr farm to allow development in parallel with production
Conclusions
A lot of technology exists
Statistical tools
  PAW very successful. Need collaboration towards ‘PAW_OO’. Still some more work on requirements
  ROOT/LHC++/OpenScientist/JAS present front runners
  JAVA interface with C++ (CORBA)
Event displays much more connected to the experiments. Trend is towards distributed computing (JAS/WIRED) and/or more integration (ROOT)
Do not forget human factors!
The key issues
No main underlying technical issues
Integration
  Data model/Statistical tool
  Statistical tool/Event Display
Interoperability
  1 experiment and several outside packages
  ‘Build your own’ package
Collaborative effort
Time scale
CMS Software Milestones
The OO “Proof of Concept” phase has been completed
The “Functional Prototype” phase is well underway
CMS must provide functional software by end 1999 / beginning 2000
[Milestone chart, status Sept 1999, time axis 1998 to 2005]
Phases: 1 Proof of concept; 2 Functional prototype; 3 Fully functional; 4 Production system
CORE SOFTWARE
  End of Fortran development
  GEANT4 simulation of CMS (phases 1 2 3 4)
  Reconstruction/analysis framework (phases 1 2 3 4)
  Detector reconstruction (phases 1 2 3 4)
  Physics object reconstruction (phases 1 2 3 4)
  User analysis environment (phases 1 2 3 4)
DATABASE
  Use of ODBMS for test-beam
  Event storage/retrieval from ODBMS (phases 1 2 3 4)
  Data organisation/access strategy
  Filling ODBMS at 100 MB/s
  Simulation of data access patterns
  Integration of ODBMS and MSS
  Choice of vendor for ODBMS
  Installation of ODBMS and MSS
Dates as extracted from the chart (their assignment to milestones is not recoverable here):
  Dec-00, Dec-01, Dec-02, Dec-03, Dec-98, Jun-99, Dec-00, Dec-97
  Jun-98 Dec-99 Dec-01
  Dec-98 Jun-00 Dec-02 Dec-04
  Mar-99 Jun-00 Dec-02 Dec-04
  Dec-98 Dec-99 Jun-02 Jun-04
  Jun-98 Dec-99 Dec-01 Dec-03
  Jun-98
  Jun-98 Dec-99 Jun-01 Dec-03