PIC port d’informació científica
The Worldwide LHC Computing Grid: Riding the computing technology wave to enable the Higgs boson discovery
68th ICREA Colloquium, November 17th, 2015
Prof. Manuel Delfino, UAB Physics Dept. and IFAE, Director of PIC
Outline
● Data Processing in Experimental Particle Physics
● The 1980s: From mainframes to clusters
● The 1990s: From clusters to farms
● The Large Hadron Collider – timeline
● The 2000s: The Worldwide LHC Computing Grid
● Outlook and Conclusions
Data Processing in Experimental Particle Physics
● Post-World War II “High Energy Nuclear Physics”: linking mainframe computers, devices and scientists
– Bubble chamber scanning device automation
– Storage of electronic readout data → DAQ and Online computers
– Raw data converted to analysis data on a central computer at the laboratories → Offline computers at labs
– Scientists transport tapes and punched cards in their suitcases → Data distribution
– Scientists attempt to analyse data on their University's central computer and face many barriers and frustration
● Curiously, many scientific fields today still use this methodology. The last step is less troublesome thanks to PCs.
Data Processing in Experimental Particle Physics
● 1960s-1970s: Microelectronics + detector innovation → more data generated, transported, computed on
– Standard “nuclear” electronics: NIM, CAMAC, VME
– Microprocessors, particularly the Motorola 68K, in DAQ
– Minicomputers, particularly the PDP-11, as online computers
– Available mainframe capacity becomes totally insufficient
● Pressure on experiments to tighten trigger requirements, recording less data at the risk of “missing” new physics
● Multiple mainframes at labs
● Very few universities make available “research mainframes”
– 1972: IBM releases VM/370: Virtual machines
– 1977: Digital VAX-11/780 minicomputer: Virtual memory
The 1980s: From mainframes to clusters
– Advances in SCSI hard disks allow the delivery of huge amounts of disk space to physicists
– CERN CASTOR Hierarchical Storage Manager makes tape look like an agile extension of disk
● Similar tendency at the Fermilab Tevatron Collider
– Development of the SAM Distributed Data Framework
– Data distribution over the network in “push” and “pull” modes
1990s: From clusters to farms
● 1993: Microsoft releases Windows NT
● 1994: NASA uses standard Unix plus MPI/PVM to build the first Beowulf cluster → “cluster supercomputer”
● 1994: Linus Torvalds → Linux kernel v1.0
● 1995: Bill Gates' “Internet Tidal Wave” memo → TCP/IP everywhere
● mid-1990s: 32-bit mass-market microprocessors
– 1995: Intel Pentium Pro
– 1996: AMD K5
● 1994: CERN buys its last mainframe
– IBM SP-2: mainframe built from RISC workstations
– CERN decides to manage it in an integrated manner within SHIFT
1990s: From clusters to farms
● 1996: CERN RD-47 project: High energy physics processing using commodity components (Barcelona, CERN, JINR Dubna, Florida State, Fac. des Sciences Luminy, Santa Cruz, Washington)
– Implement the whole particle physics data processing environment on commodity hardware and software
– Concepts and prototypes of extra tools to automatically manage a large number of nodes → processor farm
● Two approaches:
– Use Windows NT PCs to replicate the VAXcluster environment using purely commercial components
– Use Linux PCs to implement SHIFT using as much “open” software as possible
● Some controversy as RISC had moved to 64 bits
1990s: From clusters to farms
● The Windows NT approach worked, but was >10 years ahead of its time → Windows Azure cloud service
● The Linux approach became dominant
– Rapid scale-up of power and number of nodes (100s)
– University groups start deploying local clusters
– Development of automation tools: Quattor, Lemon → precursors of Puppet, Chef
● 1999: The CERN DG triggers the first study of solutions for the data processing needs of the LHC. The first look generates considerable shock:
– >100k computers needed
The Large Hadron Collider – timeline
● 1984: European Committee for Future Accelerators workshop: Large Hadron Collider in the LEP tunnel
● 1987: U.S. President Ronald Reagan announces support for the Superconducting Super Collider
● 1988: Digging of the LEP tunnel completed at CERN
● 1992: Letters of Intent for the ATLAS and CMS detectors
● 1993: U.S. Congress kills the SSC after 2 G$ spent
● 1994: CERN Council approves construction of the LHC
● 2000: CERN stops LEP, dismantles it to house the LHC
● 2008: First LHC beams, magnet interconnect accident
● 2009: Beams back in the LHC, Run 1 starts
● 2012: ATLAS and CMS discover the Higgs boson
● 2015: LHC Run 2 starts
● 1992-1994: SSC, CERN: resources needed will be “much larger than those of current facilities”
● 1996-1998: R&D for Online: 1 PB/s → 1-10 PB/year
– LHCb prototype of a Myrinet-based farm (later used to build the first MareNostrum at BSC)
– CMS bets on a farm based on giant Ethernet switches; the needed capacity is equivalent to ¼ of US phone traffic
● 1999: First estimate: 0.1 M cores + 100 PB = 200 M€
● 1999: Ian Foster and Carl Kesselman publish“The Grid: Blueprint for a new computing infrastructure”
● 2000-2001: NSF, DOE, EU fund Grid development
● 2001: Worldwide LHC Computing Grid project approved by CERN Council,becomes part of the CERN Research Program
● 2002: LCG1 service becomes operational
● 2012: WLCG acknowledged as key enabler for Higgs boson discovery
● 2015: Estimates for High Luminosity LHC: Resources needed are “much higher”
The 2000s: The Worldwide LHC Computing Grid
● Fastest-evolving component in computing: wide-area fiber-optic communication
● Five very differentiated needs:
– Archiving
– Reconstruction
– Filtering
– Analysis
– Simulation
● MONARC study
– Wide area network to integrate worldwide resources
– Centers with different capabilities and reliabilities
2000s: Worldwide LHC Computing Grid
● The WLCG Tiers as defined in MONARC
– Tier-0 at CERN
● Receive raw data from the DAQ, archive to tape and cache on disk
● Run prompt reconstruction and quality checks
● Distribute raw data to a limited number (13) of Tier-1 centers
– Tier-1
● Receive raw data from CERN, archive to tape and reconstruct
● Receive simulations, archive to tape and reconstruct
● Run filters → physics objects with pre-determined patterns
● Distribute filtered data to many (200) Tier-2 centers
– Tier-2
● Make filtered physics objects available for analysis
● Run simulations
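Purely as an illustration of the fan-out just described (this is not WLCG software), a toy Python sketch of the tiered data flow might look as follows; the site counts are the round numbers from the slide and the dataset names are invented.

```python
# Toy model of the MONARC-style tier fan-out described above.
# Not WLCG software; site counts are the slide's round numbers and
# the dataset names are invented for the example.
from dataclasses import dataclass, field

@dataclass
class Site:
    name: str
    tape: list = field(default_factory=list)   # archived copies
    disk: list = field(default_factory=list)   # copies cached for processing/analysis

tier0  = Site("CERN Tier-0")
tier1s = [Site(f"Tier-1 #{i}") for i in range(13)]
tier2s = [Site(f"Tier-2 #{i}") for i in range(200)]

def tier0_flow(raw: str) -> None:
    """Archive raw data, run prompt reconstruction, distribute to all Tier-1s."""
    tier0.tape.append(raw)
    tier0.disk.append(f"prompt_reco({raw})")
    for t1 in tier1s:
        t1.tape.append(raw)                     # custodial raw copy at each Tier-1

def tier1_flow(t1: Site, raw: str, my_tier2s: list) -> None:
    """Reconstruct and filter at a Tier-1, push physics objects to its Tier-2s."""
    filtered = f"physics_objects({raw})"
    t1.disk.append(filtered)
    for t2 in my_tier2s:
        t2.disk.append(filtered)                # analysis copy at Tier-2

tier0_flow("raw_run_0001")
for i, t1 in enumerate(tier1s):
    tier1_flow(t1, "raw_run_0001", tier2s[i::13])  # each Tier-1 serves ~1/13 of the Tier-2s

print(len(tier1s[0].tape), len(tier2s[0].disk))    # 1 raw copy, 1 analysis copy
```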
Economies of scale were an important part of the “Grid vision”
[Figure: the development of electrical power, plotting quality and economies of scale against time]
Decouple production & consumption, enabling:
● On-demand access
● Economies of scale
● Consumer flexibility
● New devices
An example from the development of electrical power from a “cottage industry” to a dependable infrastructure.
Adapted by permission from Ian Foster, University of Chicago and US Argonne National Lab.
2000s: Worldwide LHC Computing Grid
● Grid “middleware” (example based on the European gLite stack)
– Authentication: International Grid Trust Federation of entities issuing X.509 certificates to users (ES: RedIRIS)
– Authorization: Virtual Organization Management Service (VOMS) → Internet servers state your rights within a project
– Computing Element: abstraction of a batch queue
– Storage Element: abstraction of a disk
– Resource Broker (or Workload Management System): brokers among Computing Elements
– Replica Manager: handles multiple copies of the same data in different Storage Elements
– Information Service: where is what?
– Logging and Book-keeping: what is happening?
– User Interface: machine which bridges the normal world to the Grid (see the job-lifecycle sketch below)
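To make the list concrete, here is a minimal sketch of how a user would drive a job through these components from a User Interface machine. It assumes the standard gLite command-line tools are installed and a valid grid certificate is in place; the VO name "atlas" and all file names are placeholders chosen for the example.

```python
# Minimal sketch of a gLite job lifecycle driven from a User Interface machine.
# Assumes the standard gLite UI tools and a valid X.509 certificate are present;
# the VO name and file names below are placeholders for illustration.
import subprocess
import textwrap

# 1. Authentication/authorization: obtain a short-lived VOMS proxy that
#    encodes your rights within the Virtual Organization.
subprocess.run(["voms-proxy-init", "--voms", "atlas"], check=True)

# 2. Describe the job in JDL; the Workload Management System (Resource Broker)
#    will match it to a suitable Computing Element.
with open("hello.jdl", "w") as f:
    f.write(textwrap.dedent("""\
        Executable    = "/bin/hostname";
        StdOutput     = "std.out";
        StdError      = "std.err";
        OutputSandbox = {"std.out", "std.err"};
    """))

# 3. Submit via the WMS (-a: delegate the proxy automatically) and keep the
#    job identifier for later queries.
subprocess.run(["glite-wms-job-submit", "-a", "-o", "jobid.txt", "hello.jdl"],
               check=True)

# 4. Ask the Logging and Book-keeping service what is happening, then fetch
#    the output sandbox once the job has finished.
subprocess.run(["glite-wms-job-status", "-i", "jobid.txt"], check=True)
subprocess.run(["glite-wms-job-output", "-i", "jobid.txt"], check=True)
```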
Foster & Kesselman vision: Grid or Cloud?
[Diagram: Servers (execution) and Application Services (distribution), linked by Application Virtualization]
● Automatically connect applications to services
● Dynamic & intelligent provisioning
● Luminosity: ×6 in Run 2, ×14 in Run 3, ×120 in Run 4
● Complexity: more collisions per beam crossing
● Current model will work for Runs 2 and 3, but worries about the large number of sites and associated people
● Need to re-think everything for Run 4 → start now
At PIC: Benefit other projects
● Past and current
– MAGIC Cherenkov Telescope main data center
– PAU Cosmological Survey main data center
– EUCLID Science Data Center supporting Simulation OU
– Analysis support for LHC-ATLAS, neutrinos-T2K, cosmology-DES, etc.
– Storage and analysis support for simulations
● Turbulent flow (Hoyas and Jiménez-Sendín, UPM, published)
● Universe expansion (Fosalba et al., IEEC-ICE, published and ongoing)
● Star evolution (Padoan et al., ICREA, published and ongoing)
– PIC Neuroimage processing platform
● Future
– Cherenkov Telescope Array (CTA) landing data center
– Next generation neutrino experiments
– Simulation storage and analysis
– Other fields? → Contact me if you want to explore a collaboration