Page 1: An Overview over Online Systems at the LHC

An Overview over Online Systems at the LHC

Invited Talk at NSS-MIC 2012, Anaheim CA, 31 October 2012

Beat Jost, Cern

Page 2: An Overview over Online Systems at the LHC


Acknowledgments and Disclaimer

I would like to thank David Francis, Frans Meijers and Pierre vande Vyvre for lots of material on their experiments

I would also like to thank Clara Gaspar and Niko Neufeld for many discussions

There are surely errors and misunderstandings in this presentation which are entirely due to my shortcomings


Page 3: An Overview over Online Systems at the LHC


Outline

❏ Data Acquisition Systems
  ➢ Front-end Readout
  ➢ Event Building

❏ Run Control
  ➢ Tools and Architecture

❏ Something New – Deferred Triggering

❏ Upgrade Plans


Page 4: An Overview over Online Systems at the LHC


Role of the Online System

❏ In today’s HEP experiments millions of sensors are distributed over hundreds of square metres and read out tens of millions of times per second

❏ The data of all these sensors have to be collected and assembled at one point (computer, disk, tape), after rate reduction through event selection
  ➢ This is the Data Acquisition (DAQ) system

❏ This process has to be controlled and monitored (by the operator)
  ➢ This is the Run Control system

❏ Together they form the Online system. And, by the way, it is a prerequisite for any physics analysis

Page 5: An Overview over Online Systems at the LHC


Setting the Scene – DAQ Parameters


Page 6: An Overview over Online Systems at the LHC


A generic LHC DAQ system


[Diagram: generic LHC DAQ chain. On/near the detector: sensors feed the front-end electronics, which feed a first aggregation stage (with optional zero suppression). Off the detector: zero suppression, data formatting and data buffering, then the event-building network, the HLT farm and permanent storage.]

Today’s data rates are too big to let all the data flow through a single component.

Page 7: An Overview over Online Systems at the LHC


Implementations – Front-End Readout

❏ The DAQ system can be viewed as a gigantic funnel collecting the data from the sensors into a single point (CPU, storage) after selecting interesting events.

❏ In general the responses of the sensors on the detector are transferred (digitized or analogue) over point-to-point links to some form of first-level concentrator
  ➢ Often there is already a concentrator on the detector electronics, e.g. readout chips for silicon detectors.
  ➢ The further upstream in the system, the more the technologies at this level differ, also within the experiments
    ➥ In LHCb the data of the vertex detector are transmitted in analogue form to the aggregation layer and digitized there

❏ The subsequent level of aggregation is usually also used to buffer the data and format them for the event builder and the high-level trigger

❏ Somewhere along the way, zero suppression is performed
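To make the zero-suppression step concrete, here is a minimal sketch in Python (the channel data, pedestals and threshold are hypothetical, not any experiment's actual firmware): only channels above pedestal plus threshold are kept, together with their channel number, which is what shrinks the event before event building.

# Minimal zero-suppression sketch: keep only channels whose ADC value
# exceeds a (hypothetical) pedestal + threshold, storing (channel, value)
# pairs instead of the full channel-ordered array.
def zero_suppress(adc_values, pedestals, threshold):
    """Return a sparse list of (channel, pedestal-subtracted value)."""
    suppressed = []
    for channel, (adc, ped) in enumerate(zip(adc_values, pedestals)):
        signal = adc - ped
        if signal > threshold:
            suppressed.append((channel, signal))
    return suppressed

# Example: 8 channels, mostly pedestal-level noise, two real hits.
adc = [102, 99, 150, 101, 98, 230, 100, 103]
ped = [100] * 8
print(zero_suppress(adc, ped, threshold=10))   # -> [(2, 50), (5, 130)]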

Page 8: An Overview over Online Systems at the LHC


Readout Links of the LHC Experiments

ALICE: DDL
  Optical, 200 MB/s, ~500 links. Full duplex: also controls the FE (commands, pedestals, calibration data). Receiver card interfaces to a PC.
  Flow control: yes

ATLAS: SLINK
  Optical, 160 MB/s, ~1600 links. Receiver card interfaces to a PC.
  Flow control: yes

CMS: SLINK 64
  LVDS, 400 MB/s (max. 15 m), ~500 links. Peak throughput 400 MB/s to absorb fluctuations; typical usage 2 kB @ 100 kHz = 200 MB/s. Receiver card interfaces to a commercial NIC (Myrinet).
  Flow control: yes

LHCb: Glink (GOL)
  Optical, 200 MB/s, ~4800 links, plus ~5300 analogue links before zero suppression. Receiver card interfaces to a custom-built Ethernet NIC (4 x 1 Gb/s over copper).
  Flow control: no (trigger throttle)
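The quoted CMS figure is easy to cross-check; the short Python sketch below (illustrative arithmetic only) reproduces how 2 kB fragments at a 100 kHz Level-1 rate give 200 MB/s per link, half of the 400 MB/s peak capacity.

# Cross-check of the quoted per-link throughput (illustrative arithmetic only).
fragment_size_bytes = 2_000        # ~2 kB fragment per trigger
trigger_rate_hz = 100_000          # 100 kHz Level-1 accept rate
link_peak_mb_s = 400               # quoted peak link capacity

throughput_mb_s = fragment_size_bytes * trigger_rate_hz / 1e6
print(f"typical load: {throughput_mb_s:.0f} MB/s "
      f"({throughput_mb_s / link_peak_mb_s:.0%} of the 400 MB/s peak)")
# -> typical load: 200 MB/s (50% of the 400 MB/s peak)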

Page 9: An Overview over Online Systems at the LHC


Implementations – Event Building

❏ Event building is the process of collecting all the data fragments belonging to one trigger at one point, usually the memory of a processor in a farm.

❏ Implementation is typically based on a switched network
  ➢ ATLAS, ALICE and LHCb: Ethernet
  ➢ CMS: two steps, first Myrinet, then Ethernet

❏ Of course the implementations in the different experiments differ in details from the ‘generic’ one, sometimes quite drastically.
  ➢ ATLAS implements an additional level of trigger, thus reducing the overall requirements on the network capacity
  ➢ CMS does event building in two steps, with Myrinet (fibre) and 1 GbE (copper) links
  ➢ ALICE implements the HLT in parallel to the event builder, thus allowing it to be bypassed completely
  ➢ LHCb and ALICE use only one level of aggregation downstream of the front-end electronics.
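To illustrate the fragment-collection step, here is a minimal event-builder sketch in Python (the source names and fragment format are hypothetical, not any experiment's actual protocol): fragments arriving from different readout sources are grouped by event number, and an event is complete once every expected source has contributed.

# Minimal event-builder sketch: group fragments by event number and
# declare an event complete when all expected sources have reported.
from collections import defaultdict

EXPECTED_SOURCES = {"readout_board_0", "readout_board_1", "readout_board_2"}  # hypothetical

class EventBuilder:
    def __init__(self):
        self.pending = defaultdict(dict)   # event_number -> {source: payload}

    def add_fragment(self, event_number, source, payload):
        """Store one fragment; return the full event once it is complete."""
        self.pending[event_number][source] = payload
        if set(self.pending[event_number]) == EXPECTED_SOURCES:
            return self.pending.pop(event_number)   # complete event
        return None                                 # still waiting for fragments

builder = EventBuilder()
builder.add_fragment(42, "readout_board_0", b"...")
builder.add_fragment(42, "readout_board_1", b"...")
event = builder.add_fragment(42, "readout_board_2", b"...")
print(sorted(event))   # -> ['readout_board_0', 'readout_board_1', 'readout_board_2']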


Page 10: An Overview over Online Systems at the LHC


Event Building in the LHC Experiments

ALICE (Ethernet): single-stage event building; TCP/IP-based push protocol; orchestrated by an Event Destination Manager.

ATLAS (Ethernet): staged event building via a two-level trigger system; partial readout driven by regions of interest (Level-2 trigger), full readout at the reduced rate of accepted events; TCP/IP-based pull protocol.

CMS (Myrinet/Ethernet): two-stage full readout of all triggered events; first stage Myrinet (flow control in hardware), second stage Ethernet TCP/IP, driven by an Event Manager.

LHCb (Ethernet): single-stage event building directly from the front-end readout units to the HLT farm nodes; driven by the Timing & Fast Control (TFC) system; pure push protocol (raw IP) with credit-based congestion control, relying on deep buffers in the switches.

[Diagrams: the LHCb readout (FE electronics into readout boards, a readout network of switches to the HLT and monitoring farms of CPUs, steered by the TFC system with the L0 trigger, LHC clock and MEP requests, and by the Experiment Control System (ECS); sub-detectors VELO, ST, OT, RICH, ECal, HCal, Muon) and the ALICE DAQ (CTP/LTU/TTC trigger distribution with L0, L1a, L2 and BUSY back-pressure; 360 + 120 DDLs into 430 D-RORCs and 175 detector LDCs, plus 10 DDLs/D-RORCs/LDCs for the HLT farm with FEP nodes; event-building network at 20 GB/s to 75 GDCs and 30 TDSMs, with 18 DSS and 60 DA/DQM nodes; storage network at 8 GB/s to 75 TDS, archiving on tape in the Computing Centre in Meyrin).]
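As a rough illustration of the credit-based congestion control mentioned for the LHCb push protocol, here is a sketch under assumed semantics (not the actual TFC/MEP implementation): the destination farm node grants a number of credits, and the sources may only push that many packets before waiting for credits to be returned, which bounds the data in flight and protects the switch buffers.

# Sketch of credit-based flow control for a push-style event builder
# (assumed semantics for illustration; not the real LHCb TFC/MEP protocol).
class Destination:
    """A farm node that grants credits as it drains its input queue."""
    def __init__(self, buffer_slots):
        self.credits = buffer_slots       # packets the sources may still send
        self.queue = []

    def receive(self, packet):
        assert self.credits > 0, "sender violated flow control"
        self.credits -= 1
        self.queue.append(packet)

    def process_one(self):
        if self.queue:
            self.queue.pop(0)
            self.credits += 1             # completed work returns a credit

class Source:
    """A readout unit that only pushes while the destination has credits."""
    def __init__(self, name):
        self.name = name
        self.backlog = []

    def try_send(self, dest):
        while self.backlog and dest.credits > 0:
            dest.receive((self.name, self.backlog.pop(0)))

dest = Destination(buffer_slots=4)
sources = [Source(f"readout_unit_{i}") for i in range(3)]
for s in sources:
    s.backlog = [f"packet_{n}" for n in range(3)]   # 9 packets for 4 credits
for s in sources:
    s.try_send(dest)                  # only 4 packets get through initially
print(len(dest.queue), [len(s.backlog) for s in sources])   # -> 4 [0, 2, 3]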

Page 11: An Overview over Online Systems at the LHC


Controls Software – Run Control

❏ The main task of the run control is to guarantee that all components of the readout system are configured in a coherent manner according to the desired DAQ activity
  ➢ 10,000s of electronics components and software processes
  ➢ 100,000s of readout sensors

❏ Topologically implemented in a deep hierarchical tree-like architecture with the operator at the top

❏ In general the configuration process has to be sequenced so that the different components can collaborate properly → Finite State Machines (FSMs)

❏ Inter-Process(or) communication (IPC) is an important ingredient to trigger transitions in the FSMs
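A toy sketch of such a hierarchical FSM tree in Python (illustrative only; the real systems use the tools listed on the next slide, e.g. SMI++, CLIPS or RCMS): a command issued at the top node propagates down to the children, and a parent only reaches the target state once all of its children report it.

# Toy hierarchical run-control tree (illustrative; not SMI++/CLIPS/RCMS).
# Commands propagate down; a parent's state is derived from its children.
class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.state = "NOT_READY"

    def command(self, cmd):
        # Sequence the transition: drive the children first, then ourselves.
        for child in self.children:
            child.command(cmd)
        children_ready = all(c.state == "READY" for c in self.children)
        children_running = all(c.state == "RUNNING" for c in self.children)
        if cmd == "CONFIGURE" and self.state == "NOT_READY" and children_ready:
            self.state = "READY"
        elif cmd == "START" and self.state == "READY" and children_running:
            self.state = "RUNNING"

# Hypothetical tree: a DAQ top node over two sub-detector control units.
daq = Node("DAQ", [
    Node("SubDet1_DAQ", [Node("FEE_dev1"), Node("FEE_dev2")]),
    Node("SubDet2_DAQ", [Node("ReadoutBoard_1")]),
])
daq.command("CONFIGURE")
daq.command("START")
print(daq.state)   # -> RUNNING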


Page 12: An Overview over Online Systems at the LHC


Control Tools and Architecture

                        ALICE      ATLAS    CMS                             LHCb
IPC Tool                DIM        CORBA    XDAQ (HTTP/SOAP)                DIM/PVSS
FSM Tool                SMI++      CLIPS    RCMS/XDAQ                       SMI++
Job/Process Control     DATE       CORBA    XDAQ                            PVSS/FMC
GUI Tools               Tcl/Tk     Java     JavaScript/Swing/web browser    PVSS


[Diagram: example LHCb controls architecture. A hierarchical tree of Control Units and Device Units with the ECS at the top, above the DCS branch (SubDet1..N DCS, each with e.g. LV, TEMP and GAS control units down to LV devices), the DAQ branch (SubDet1..N DAQ, each with FEE and RO control units down to FEE devices), plus infrastructure, TFC, LHC, HLT, Run Control and Detector Control nodes. Commands flow down the tree; status and alarms flow up.]

Page 13: An Overview over Online Systems at the LHC


GUI Example – LHCb Run Control


Main operation panel for the shift crew

Each sub-system can (in principle) also be driven independently

Page 14: An Overview over Online Systems at the LHC


Error Recovery and Automation

❏ No system is perfect. There are always things that go wrong
  ➢ E.g. de-synchronisation of some components

❏ Two approaches to recovery
  ➢ Forward chaining
    ➥ We’re in the mess. How do we get out of it?
      – ALICE and LHCb: SMI++ automatically acts to recover
      – ATLAS: DAQ Assistant (CLIPS) gives operator assistance
      – CMS: DAQ Doctor (Perl) gives operator assistance
  ➢ Backward chaining
    ➥ We’re in the mess. How did we get there?
      – ATLAS: Diagnostic and Verification System (DVS)

❏ Whatever one does: One needs lots of diagnostics to know what’s going on.


Snippet of forward chaining (Big Brother in LHCb):

object: BigBrother
   state: READY
      when ( LHCb_LHC_Mode in_state PHYSICS )  do PREPARE_PHYSICS
      when ( LHCb_LHC_Mode in_state BEAMLOST ) do PREPARE_BEAMLOST
      ...
   action: PREPARE_PHYSICS
      do Goto_PHYSICS LHCb_HV
      wait ( LHCb_HV )
      move_to READY
   action: PREPARE_BEAMLOST
      do STOP_TRIGGER LHCb_Autopilot
      wait ( LHCb_Autopilot )
      if ( VELOMotion in_state {CLOSED,CLOSING} ) then
         do Open VELOMotion
      endif
      do Goto_DUMP LHCb_HV
      wait ( LHCb_HV, VELOMotion )
      move_to READY
   ...

Page 15: An Overview over Online Systems at the LHC


Summary

❏ All LHC experiments are taking data with great success
  ➢ All implementations work nicely
  ➢ The systems are coping with the extreme running conditions, sometimes way beyond the original requirements
    ➥ ATLAS and CMS have to cope with up to 40 interactions per bunch crossing (the requirement was ~20-25); LHCb with ~1.8 interactions instead of the 0.4 foreseen
    ➥ Significantly bigger event sizes
    ➥ Significantly longer HLT processing

❏ Availability of the DAQ systems is above 99%
  ➢ Usually it’s not the DAQ hardware that doesn’t work

❏ The automatic recovery procedures implemented keep the overall efficiency typically above 95%, mainly by faster reaction and avoidance of operator mistakes.


Page 16: An Overview over Online Systems at the LHC


Something New – Deferred Trigger

❏ The inter-fill gaps (dump to stable-beams) of the LHC can be significant (many hours, sometimes days)

❏ During this time the HLT farm is basically idle

❏ The idea is to use this idle CPU time to execute the HLT algorithms on data that were written to a local disk during the operation of the LHC
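A minimal sketch of this deferral logic in Python (the file layout and names are hypothetical, not LHCb's actual Moore/MEP machinery): while the beam is on and the farm is saturated, events that cannot be processed immediately are written to the node's local disk; between fills a reader replays those files through the same trigger function.

# Toy model of a deferred trigger on one farm node (hypothetical layout;
# not the actual LHCb implementation).
import os, pickle, tempfile

OVERFLOW_DIR = tempfile.mkdtemp(prefix="deferred_hlt_")   # stands in for the 1 TB local disk

def run_hlt(event):
    """Placeholder for the real HLT selection."""
    return sum(event) % 2 == 0

def on_event_during_fill(event, farm_busy):
    if not farm_busy:
        return run_hlt(event)
    # Farm saturated: defer the event to local disk for the inter-fill gap.
    path = os.path.join(OVERFLOW_DIR, f"event_{id(event)}.pkl")
    with open(path, "wb") as f:
        pickle.dump(event, f)
    return None

def process_deferred_events():
    """Run during the inter-fill gap: drain the overflow directory."""
    for name in os.listdir(OVERFLOW_DIR):
        path = os.path.join(OVERFLOW_DIR, name)
        with open(path, "rb") as f:
            event = pickle.load(f)
        os.remove(path)            # avoid processing the same event twice
        run_hlt(event)

on_event_during_fill([1, 2, 3], farm_busy=True)    # deferred to disk
process_deferred_events()                          # replayed after the beam dump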


[Diagram: data flow on one farm node for the deferred trigger, showing the MEPrx, Moore (HLT), DiskWr, OvrWr and Reader processes with the MEP, Result and Overflow buffers; the decision “MEP buffer full?” routes events either to the HLT directly (No) or to the local overflow disk for later processing (Yes).]

Page 17: An Overview over Online Systems at the LHC


Deferred Trigger – Experience

❏ Currently deferring ~25% of the L0 trigger rate
  ➢ ~250 kHz of triggers

❏ Data stored on 1024 nodes equipped with 1 TB local disks

❏ Great care has to be taken
  ➢ to keep an overview of which nodes hold files of which runs
  ➢ to make sure events are not duplicated
    ➥ During deferred HLT processing, files are deleted from disk as soon as they are opened by the reader (see the sketch after this list)

❏ Starting and stopping is automated according to the state of the LHC
  ➢ No stress for the shift crew
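On POSIX file systems the “delete as soon as it is opened” trick works because an unlinked file stays readable through any already-open file descriptor; a minimal sketch (hypothetical file handling, not the actual LHCb reader):

# Delete-on-open pattern: once the reader has the file open, remove the
# directory entry so no other reader (or a restart) can pick it up again.
import os

def process(record):
    pass                       # placeholder for feeding events back to the HLT

def consume_deferred_file(path):
    f = open(path, "rb")       # keep the file contents alive via the descriptor
    os.unlink(path)            # file disappears from the directory immediately
    try:
        for record in f:       # data remains readable until the descriptor is closed
            process(record)
    finally:
        f.close()              # disk space is reclaimed only now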

[Plot: number of files on local disk versus time, annotated with: start of data taking, beam dump, start of deferred HLT, end of deferred HLT; then a new fill / start of data taking, online troubles, beam dump and start of deferred HLT.]

Page 18: An Overview over Online Systems at the LHC


Upgrade Plans

❏ All four LHC experiments have upgrade plans for the nearer or farther future
  ➢ Timescale 2015
    ➥ CMS
      – integration of new point-to-point links (~10 Gbps) to the new back-end electronics (in µTCA) of new trigger/detector systems
      – replacement of Myrinet with 10 GbE (TCP/IP) for data aggregation into PCs, and Infiniband (56 Gbps) or 40 GbE for event building
    ➥ ATLAS: merging of the L2 and HLT networks and CPUs
      – each CPU in the farm will run both triggers
  ➢ Timescale 2019
    ➥ ALICE: increase the acceptable trigger rate from 1 to 50 kHz for Heavy Ion operation
      – new front-end readout link
      – TPC continuous readout
    ➥ LHCb: elimination of the hardware trigger (readout rate 40 MHz)
      – readout of the front-end electronics for every bunch crossing
        • new front-end electronics
        • zero suppression on/near the detector
      – network/farm capacity increase by a factor of 40 (3.2 TB/s, ~4000 CPUs)
      – network technology: Infiniband or 10/40/100 Gb Ethernet
      – no architectural changes
  ➢ Timescale 2022 and beyond
    ➥ CMS & ATLAS: implementation of a hardware track trigger running at 40 MHz, and surely many other changes…