PIC port d’informació científica
The Worldwide LHC Computing Grid: Riding the computing technology wave to enable the Higgs boson discovery
68th ICREA Colloquium, November 17th, 2015
Prof. Manuel Delfino, UAB Physics Dept. and IFAE, Director of PIC
Outline
● Data Processing in Experimental Particle Physics
● The 1980s: From mainframes to clusters
● The 1990s: From clusters to farms
● The Large Hadron Collider – timeline
● The 2000s: The Worldwide LHC Computing Grid
● Outlook and Conclusions
Data Processing in Experimental Particle Physics
● Post-World War II “High Energy Nuclear Physics”: linking mainframe computers, devices and scientists
– Bubble chamber scanning device automation
– Storage of electronic readout data → DAQ and Online computers
– Raw data converted to analysis data on a central computer at the laboratories → Offline computers at labs
– Scientists transport tapes and punched cards in their suitcases → Data distribution
– Scientists attempt to analyse data on their University's central computer and face many barriers and frustration
● Curiously, many scientific fields today still use this methodology. The last step is less troublesome thanks to PCs.
Data Processing in Experimental Particle Physics
● 1960s-1970s: Microelectronics + detector innovation → more data generated, transported, computed on
– Standard “nuclear” electronics: NIM, CAMAC, VME
– Microprocessors, particularly the Motorola 68K, in DAQ
– Minicomputers, particularly the PDP-11, as online computers
– Available mainframe capacity becomes totally insufficient
● Pressure on experiments to tighten trigger requirements, recording less data at the risk of “missing” new physics
● Multiple mainframes at labs
● Very few universities make available “research mainframes”
– 1972: IBM releases VM/370: Virtual machines
– 1977: Digital VAX-11/780 minicomputer: Virtual memory
The 1980s: From mainframes to clusters
– Advances in SCSI hard disks allow the delivery of huge amounts of disk space to physicists
– CERN CASTOR Hierarchical Storage Manager makes tape look like an agile extension of disk
● Similar tendency at the Fermilab Tevatron Collider
– Development of the SAM Distributed Data Framework
– Data distribution over the network in “push” and “pull” modes
1990s: From clusters to farms
● 1993: Microsoft releases Windows NT
● 1994: NASA uses standard Unix plus MPI/PVM to build the first Beowulf cluster → “cluster supercomputer”
● 1994: Linus Torvalds → Linux kernel v1.0
● 1995: Bill Gates' “Internet Tidal Wave” memo → TCP/IP everywhere
● mid-1990s: 32-bit mass-market microprocessors
– 1995: Intel Pentium Pro
– 1996: AMD K5
● 1994: CERN buys its last mainframe
– IBM SP-2: mainframe built from RISC workstations
– CERN decides to manage it in an integrated manner within SHIFT
1990s: From clusters to farms
● 1996: CERN RD-47 project: High energy physics processing using commodity components (Barcelona, CERN, JINR Dubna, Florida State, Fac. des Sciences Luminy, Santa Cruz, Washington)
– Implement the whole particle physics data processing environment on commodity hardware and software
– Concepts and prototypes of extra tools to automatically manage a large number of nodes → processor farm
● Two approaches:
– Use Windows NT PCs to replicate the VAXcluster environment using purely commercial components
– Use Linux PCs to implement SHIFT using as much “open” software as possible
● Some controversy as RISC had moved to 64 bits
1990s: From clusters to farms
● The Windows NT approach worked, but was >10 years ahead of its time → Windows Azure cloud service
● The Linux approach became dominant
– Rapid scale-up of power and number of nodes (100s)
– University groups start deploying local clusters
– Development of automation tools: Quattor, Lemon → precursors of Puppet, Chef
● 1999: The CERN DG triggers the first study of solutions for the data processing needs of the LHC. The first look generates considerable shock:
– >100k computers needed
The Large Hadron Collider – timeline
● 1984: European Committee for Future Accelerators workshop: Large Hadron Collider in the LEP tunnel
● 1987: U.S. President Ronald Reagan announces support for the Superconducting Super Collider
● 1988: Digging of the LEP tunnel completed at CERN
● 1992: Letters of Intent for the ATLAS and CMS detectors
● 1993: U.S. Congress kills the SSC after 2 G$ spent
● 1994: CERN Council approves construction of the LHC
● 2000: CERN stops LEP, dismantles it to house the LHC
● 2008: First LHC beams, magnet interconnect accident
● 2009: Beams back in the LHC, Run 1 starts
● 2012: ATLAS and CMS discover the Higgs boson
● 2015: LHC Run 2 starts
● 1992-1994: SSC, CERN: resources needed will be “much larger than those of current facilities”
● 1996-1998: R&D for Online: 1 PB/s → 1-10 PB/year
– LHCb prototype of a Myrinet-based farm (later used to build the first MareNostrum at BSC)
– CMS bets on a farm based on giant Ethernet switches; the needed capacity is equivalent to ¼ of US phone traffic
● 1999: First estimate: 0.1 M cores + 100 PB = 200 M€
● 1999: Ian Foster and Carl Kesselman publish“The Grid: Blueprint for a new computing infrastructure”
● 2000-2001: NSF, DOE, EU fund Grid development
● 2001: Worldwide LHC Computing Grid project approved by CERN Council,becomes part of the CERN Research Program
● 2002: LCG1 service becomes operational
● 2012: WLCG acknowledged as key enabler for Higgs boson discovery
● 2015: Estimates for High Luminosity LHC: Resources needed are “much higher”
The 2000s: The Worldwide LHC Computing Grid
● Fastest-evolving component in computing: wide-area fiber-optic communication
● Five very differentiated needs:
– Archiving
– Reconstruction
– Filtering
– Analysis
– Simulation
● MONARC study
– Wide area network to integrate worldwide resources
– Centers with different capabilities and reliabilities
2000s: Worldwide LHC Computing Grid
● The WLCG Tiers as defined in MONARC
– Tier-0 at CERN
● Receive raw data from the DAQ, archive to tape and cache on disk
● Run prompt reconstruction and quality checks
● Distribute raw data to a limited number (13) of Tier-1 centers
– Tier-1
● Receive raw data from CERN, archive to tape and reconstruct
● Receive simulations, archive to tape and reconstruct
● Run filters → physics objects with pre-determined patterns
● Distribute filtered data to many (200) Tier-2 centers
– Tier-2
● Make filtered physics objects available for analysis
● Run simulations
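Purely as an illustration of the fan-out just described (this is not WLCG software), a toy Python sketch of the tiered data flow might look as follows; the site counts are the round numbers from the slide and the dataset names are invented.

```python
# Toy model of the MONARC-style tier fan-out described above.
# Not WLCG software; site counts are the slide's round numbers and
# the dataset names are invented for the example.
from dataclasses import dataclass, field

@dataclass
class Site:
    name: str
    tape: list = field(default_factory=list)   # archived copies
    disk: list = field(default_factory=list)   # copies cached for processing/analysis

tier0  = Site("CERN Tier-0")
tier1s = [Site(f"Tier-1 #{i}") for i in range(13)]
tier2s = [Site(f"Tier-2 #{i}") for i in range(200)]

def tier0_flow(raw: str) -> None:
    """Archive raw data, run prompt reconstruction, distribute to all Tier-1s."""
    tier0.tape.append(raw)
    tier0.disk.append(f"prompt_reco({raw})")
    for t1 in tier1s:
        t1.tape.append(raw)                     # custodial raw copy at each Tier-1

def tier1_flow(t1: Site, raw: str, my_tier2s: list) -> None:
    """Reconstruct and filter at a Tier-1, push physics objects to its Tier-2s."""
    filtered = f"physics_objects({raw})"
    t1.disk.append(filtered)
    for t2 in my_tier2s:
        t2.disk.append(filtered)                # analysis copy at Tier-2

tier0_flow("raw_run_0001")
for i, t1 in enumerate(tier1s):
    tier1_flow(t1, "raw_run_0001", tier2s[i::13])  # each Tier-1 serves ~1/13 of the Tier-2s

print(len(tier1s[0].tape), len(tier2s[0].disk))    # 1 raw copy, 1 analysis copy
```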
Economies of scale were an important part of the “Grid vision”
[Figure: the development of electrical power, plotting quality and economies of scale against time]
Decouple production & consumption, enabling:
● On-demand access
● Economies of scale
● Consumer flexibility
● New devices
An example from the development of electrical power from a “cottage industry” to a dependable infrastructure.
Adapted by permission from Ian Foster, University of Chicago and US Argonne National Lab.
2000s: Worldwide LHC Computing Grid
● Grid “middleware” (example based on the European gLite stack)
– Authentication: International Grid Trust Federation of entities issuing X.509 certificates to users (ES: RedIRIS)
– Authorization: Virtual Organization Management Service (VOMS) → Internet servers state your rights within a project
– Computing Element: abstraction of a batch queue
– Storage Element: abstraction of a disk
– Resource Broker (or Workload Management System): brokers among Computing Elements
– Replica Manager: handles multiple copies of the same data in different Storage Elements
– Information Service: where is what?
– Logging and Book-keeping: what is happening?
– User Interface: machine which bridges the normal world to the Grid (see the job-lifecycle sketch below)
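To make the list concrete, here is a minimal sketch of how a user would drive a job through these components from a User Interface machine. It assumes the standard gLite command-line tools are installed and a valid grid certificate is in place; the VO name "atlas" and all file names are placeholders chosen for the example.

```python
# Minimal sketch of a gLite job lifecycle driven from a User Interface machine.
# Assumes the standard gLite UI tools and a valid X.509 certificate are present;
# the VO name and file names below are placeholders for illustration.
import subprocess
import textwrap

# 1. Authentication/authorization: obtain a short-lived VOMS proxy that
#    encodes your rights within the Virtual Organization.
subprocess.run(["voms-proxy-init", "--voms", "atlas"], check=True)

# 2. Describe the job in JDL; the Workload Management System (Resource Broker)
#    will match it to a suitable Computing Element.
with open("hello.jdl", "w") as f:
    f.write(textwrap.dedent("""\
        Executable    = "/bin/hostname";
        StdOutput     = "std.out";
        StdError      = "std.err";
        OutputSandbox = {"std.out", "std.err"};
    """))

# 3. Submit via the WMS (-a: delegate the proxy automatically) and keep the
#    job identifier for later queries.
subprocess.run(["glite-wms-job-submit", "-a", "-o", "jobid.txt", "hello.jdl"],
               check=True)

# 4. Ask the Logging and Book-keeping service what is happening, then fetch
#    the output sandbox once the job has finished.
subprocess.run(["glite-wms-job-status", "-i", "jobid.txt"], check=True)
subprocess.run(["glite-wms-job-output", "-i", "jobid.txt"], check=True)
```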
Foster & Kesselman vision: Grid or Cloud?
[Diagram: Servers (execution) and Application Services (distribution), linked by Application Virtualization]
● Automatically connect applications to services
● Dynamic & intelligent provisioning
● Luminosity: ×6 in Run 2, ×14 in Run 3, ×120 in Run 4
● Complexity: more collisions per beam crossing
● Current model will work for Runs 2 and 3, but worries about the large number of sites and associated people
● Need to re-think everything for Run 4 → start now
At PIC: Benefit other projects
● Past and current
– MAGIC Cherenkov Telescope main data center
– PAU Cosmological Survey main data center
– EUCLID Science Data Center supporting Simulation OU
– Analysis support for LHC-ATLAS, neutrinos-T2K, cosmology-DES, etc.
– Storage and analysis support for simulations
● Turbulent flow (Hoyas and Jiménez-Sendín, UPM, published)
● Universe expansion (Fosalba et al., IEEC-ICE, published and ongoing)
● Star evolution (Padoan et al., ICREA, published and ongoing)
– PIC Neuroimage processing platform
● Future
– Cherenkov Telescope Array (CTA) landing data center
– Next generation neutrino experiments
– Simulation storage and analysis
– Other fields? → Contact me if you want to explore a collaboration