1
ALICE Grid Status
David Evans
The University of Birmingham
GridPP 14th Collaboration Meeting, Birmingham, 6-7 Sept 2005
2
The ALICE Experiment
ALICE is one of the four main LHC experiments at CERN.
The only one dedicated to heavy-ion physics.
– Study of QCD under extreme conditions
~ 1000 collaborators
~ 100 institutions
Birmingham is the only UK institute involved
3
UK ALICE
Birmingham is the only UK institute in ALICE
Small group but plays a vital role
– Responsible for the design, construction, building and maintenance of the Central Trigger Processor and Local Trigger Units.
– Getting involved in physics exploitation.
– No surplus UK manpower available for Grid work (ALICE Grid experts at CERN).
» i.e. no UK ALICE Grid experts.
4
ALICE Requirements
Data taking (each year)
– 1 month of Pb-Pb data ~ 1 PByte
– Also p-p for rest of the year ~ 1 PByte
Large scale simulation effort
– 1 Pb-Pb event: ~ 24 hrs (1 GHz)
Data Reconstruction
Data analysis
Smaller Collaboration than ATLAS or CMS but similar computing requirements.
5
Computing Requirements
Data processing (pp)
– Calibration & alignment – (quasi) online
– First reconstruction pass during data taking
» Establish overall properties quickly
– Followed by tuning pass
– Followed by second reconstruction pass
Data processing (Pb-Pb)
– Calibration & alignment during data taking
– First reconstruction pass ~ 4 months
– Second reconstruction pass ~ 6 months
6
Computing Requirements
Monte Carlo Simulations
– pp data: generate similar amount of MC data ~ 10^9 events
– Pb-Pb data: generate ~ 10^7 events (factor 10 less than real data)
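As a rough cross-check (an illustration, not a figure from the talk), the simulation cost quoted on the requirements slide, about 24 hours per Pb-Pb event on a 1 GHz CPU, can be combined with the ~ 10^7 Pb-Pb events above. The sketch assumes every event costs the full 24 hours and ignores pp simulation entirely; the function name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, leap years ignored


def pbpb_mc_cpu_years(n_events: float, hours_per_event: float = 24.0) -> float:
    """CPU-years on one 1 GHz processor needed to simulate n_events Pb-Pb events."""
    return n_events * hours_per_event / HOURS_PER_YEAR


# ~10^7 simulated Pb-Pb events at ~24 hours each (both numbers from the slides)
cpu_years = pbpb_mc_cpu_years(1e7)
print(f"{cpu_years:,.0f} CPU-years at 1 GHz")
```

Under these assumptions the Pb-Pb simulation alone comes to roughly 27,000 CPU-years on 1 GHz processors per year of data, which shows why the work has to be spread over many sites.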
7
Profile of CPU requirements
[Figure: CPU requirement profile over time (axis marks Jan 07, Sept 08, Nov 09) for Total, CERN T0, CERN T1, Ext Tier 1 and Ext Tier 2; scale mark at 35 MSI2K.]
8
Tier Hierarchy
MONARC Model
‘Cloud Model’ (Tier free) used in ALICE data challenges (native AliEn sites – for LCG sites we comply with Tier model)
Tier 0
– RAW data master copy
– Data reconstruction (1st pass)
– Prompt analysis
Tier 1
– Copy of RAW
– Reconstruction
– Scheduled analysis
Tier 2
– MC production
– Partial copy of ESD
– Data analysis
9
ALICE Grid - AliEn
AliEn (ALICE Environment) – Grid framework developed by ALICE – used in production for > 4 years.
Based on Web services and standard protocols.
Built around open source code
– Less than 5% is native AliEn code (mainly Perl).
To date, > 500,000 ALICE jobs have been run under AliEn control worldwide.
10
First implementation of ALICE World Computing Model
AliEn@GRID
11
Old AliEn Framework
[Diagram: AliEn services, each built on a common Base Client or Server class: User Interface, Cluster Monitor, Process Monitor, Computing Element (CE), Storage Element (SE), FTD, IS, DB Proxy, Logger, Authen, CPU Server, Web Portal and RB, together with an RDBMS behind a DB Driver, LDAP, and a User Application (C/C++). Components are grouped into User, Local Site elements and Central services. Other labels: Alice, Atlas, SOAP, 100% perl5.]
12
AliEn ‘Pull’ Protocol
One of the major differences between AliEn and LCG grids is that AliEn uses the ‘pull’ rather than the ‘push’ protocol.
[Diagrams: "EDG/Globus model" and "AliEn model", each showing a user, a Resource Broker and a server, with the labels "job" and "list".]
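The pull idea can be illustrated with a minimal sketch: users submit jobs to a central queue, and each site asks the queue for work it is able to run, rather than a broker choosing a site and pushing the job to it. This is not AliEn code; the class names, the `needs` requirement tags and the matching rule are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    job_id: int
    needs: set  # hypothetical requirement tags, e.g. {"AliRoot"}


class TaskQueue:
    """Central queue: users submit jobs, sites ask for work (pull)."""

    def __init__(self) -> None:
        self._jobs = deque()

    def submit(self, job: Job) -> None:
        self._jobs.append(job)

    def pull(self, site_offers: set) -> Optional[Job]:
        """Called by a site: return the first queued job this site can run."""
        for job in self._jobs:
            if job.needs <= site_offers:  # every requirement is met by the site
                self._jobs.remove(job)
                return job
        return None  # nothing suitable; the site simply asks again later


queue = TaskQueue()
queue.submit(Job(1, {"AliRoot"}))
queue.submit(Job(2, {"AliRoot", "big-disk"}))

first = queue.pull({"AliRoot"})               # job 1
second = queue.pull({"AliRoot"})              # None: job 2 also needs "big-disk"
third = queue.pull({"AliRoot", "big-disk"})   # job 2
```

In this arrangement the central service never needs up-to-date knowledge of every site's load: a site only asks when it has a free slot.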
13
Resource Broker
“Pull” instead of traditional “Push” architecture
[Diagram labels. Tier 0: Task Queue, CPU Server, ACCT. Remote site: Remote Queue, Cluster Monitor, Jobs 1 to n, each with its own Process Monitor. Remote site or another Grid: ACCT, Remote Queue, Cluster Monitor, AliEn Server, EDG/Globus. Central services: Broker, Authen, Logger, Transfer Broker, IS, Transfer Optimiser.]
14
EGEE / gLite
ALICE is committed to using as many common grid applications as possible.
In the framework of the EGEE project (EU funded grid project) middleware (gLite) is being developed.
– ALICE was playing a full role in this project – not so much now!
Changes have been made to make AliEn work with LCG
– E.g. changes to File Catalogue (FC) → LFC (Local File Catalogue or LCG File Catalogue)
– VO Box at Tier 1
– Globus/GSI compatible authentication
15
Analysis
Core of ALICE computing model is AliRoot
– Uses ROOT framework
Couple AliEn with ROOT for Grid-based analysis.
– Use PROOF – Parallel ROOT Facility
– To the user it’s like using ROOT
4-tier architecture:
– ROOT client session, API server (AliEn + PROOF), Site PROOF master servers, PROOF slave servers.
16
PROOF
Each node has a PROOF slave
Each site has a PROOF master server
Uses ‘pull’ protocol, i.e. the slaves ask the master for work packets. Slower slaves get smaller work packets etc.
[Diagram: Client API, API Server, AliEn FC, …, list of sites with data.]
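The packet-sizing idea can be sketched as follows: each slave reports how fast it has been processing, and the master sizes the next packet so that it should take roughly the same wall-clock time on any slave. This is a toy illustration, not PROOF's actual packetizer; the class, the `target_seconds` parameter and the sizing rule are assumptions.

```python
class ToyMaster:
    """Toy master that hands out event ranges sized to each slave's speed."""

    def __init__(self, n_events: int, target_seconds: float = 10.0,
                 min_packet: int = 1) -> None:
        self.n_events = n_events
        self.target_seconds = target_seconds  # desired processing time per packet
        self.min_packet = min_packet
        self.next_event = 0

    def request_packet(self, events_per_second: float):
        """Called by a slave; returns (first, last_exclusive), or None when done."""
        if self.next_event >= self.n_events:
            return None
        size = max(self.min_packet, int(events_per_second * self.target_seconds))
        first = self.next_event
        last = min(first + size, self.n_events)
        self.next_event = last
        return first, last


master = ToyMaster(n_events=1000)
fast = master.request_packet(events_per_second=50.0)  # 500 events
slow = master.request_packet(events_per_second=5.0)   # 50 events
```

Because slaves only ask when they are idle, a slow node never holds up the others for long: it simply receives fewer events per request.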
17
Authentication - SASL
SASL is the Simple Authentication and Security Layer, a method for adding authentication support to connection-based protocols.
AliEn now has a Perl module implementing GSSAPI
This allows us to
– Use all SASL authentication schemes
– Use old AliEn authentication (token, AFS password, SSH)
– Use X509 certificates and Globus/GSI (AliEn distribution now includes necessary Globus/GSI software)
– Develop secure Peer-To-Peer File Transfers based on machine/protocol/user certificates and LDAP based configuration management
18
Authentication
[Diagram: message flow between Client, Proxy Server, Server, Database and LDAP: request methods; list of methods; SASL authentication; checking if user exists; data. Methods shown: X509 (AliEn/Globus), PKI/RSA (ssh), Token (AliEn), AFS password.]
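The exchange in the diagram (request methods, receive a list, authenticate, check that the user exists) can be sketched roughly as below. This is an illustration only: it performs no real SASL or cryptographic exchange, the user set stands in for the LDAP/database lookup, and all function names are hypothetical.

```python
SERVER_METHODS = ["X509", "PKI/RSA", "Token", "AFS password"]  # labels from the slide
KNOWN_USERS = {"alice_user"}  # hypothetical stand-in for the LDAP/database lookup


def list_methods():
    """Server side: answer a 'request methods' message."""
    return list(SERVER_METHODS)


def choose_method(client_methods, offered):
    """Client side: first method in the client's preference order that is offered."""
    for method in client_methods:
        if method in offered:
            return method
    return None


def authenticate(user, client_methods):
    """Return (accepted, method). A real system would run the chosen mechanism here."""
    method = choose_method(client_methods, list_methods())
    if method is None:
        return False, None
    # Placeholder for the actual SASL exchange; only the user lookup is modelled.
    return user in KNOWN_USERS, method


ok, used = authenticate("alice_user", ["Kerberos", "Token"])
rejected, _ = authenticate("nobody", ["X509"])
```

Here the client prefers a method the server does not offer, so the negotiation falls back to the next mutually supported one.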
19
Summary
AliEn is a Grid framework developed by ALICE using 95% open source code (e.g. SOAP) and 5% AliEn specific (Perl) code.
– Successfully used over past 3 years
ALICE wishes to use as many common grid solutions as possible
AliEn evolving to take into account EGEE/gLite framework and to work with LCG.
– New user interfaces being developed
– PROOF for analysis being developed
– Better authentication/authorisation being developed
– Etc.