CHEP 2000, Padova
Stefano Veneziano
The Read-Out Crate in the ATLAS DAQ/EF prototype -1
Mar 26, 2015
• The Read-Out Crate model
• The Read-Out Buffer
• The ROBin
• ROB performance
• ROC performance
• I/O module configurations
• Conclusions
Y.Hasegawa, M.Joos, G.Lehmann, J.Lopez, A.Mailov, L.Mapelli,
G.Mornacchi, Y.Nagasaka, M.Niculescu, K.Nurdan, J.Petersen,
D.Prigent, J.Rochez, L.Tremblet, G.Unel, S.Veneziano, Y.Yasu
ATLAS DAQ/EF
[Figure: ATLAS DAQ/EF architecture. Detector Electronics feed the Front End DAQ, consisting of read-out crates (ROC) with Readout Buffers, TRG, EBIF and LDAQ. The Event Builder (Switching Network, DFM + LDAQ, Switch Supervision) connects the read-out crates to the Farm DAQ, consisting of sub-farm crates with SFI, EF SubFarm, SFO and LDAQ, followed by Mass Storage. Further labels: From trigger; To level 2; (L2A, L2R, ROI). Rate labels: Input Rate 75 - 100 kHz; Bandwidth ~100 MB/s; 1 - 2 kHz; 4 - 5 GB/s.]
Read-Out Crate logical model
The ROC performs:
• data collection from many readout links
• event buffering during LVL2 latency
• event fragment distribution to Event Builder and LVL2
[Figure: ROC logical model, showing external I/O channels and internal I/O channels.]
The I/O Module
– Each I/O channel, external or internal, has an associated Task
– A Task is activated on the occurrence of stimuli (messages or event data)
– An I/O Module (IOM) is a collection of Tasks associated with an external I/O channel
– All Tasks belonging to an IOM are scheduled within a single process and activated by polling conditions (scheduler)
– Event manager API
– All components within a ROC communicate via a well-defined message passing protocol
• The baseline implementation of the ROC, VMEbus based with one Single Board Computer (SBC) per IOM, has evolved into a collapsed solution, where one SBC can handle many external I/O channels.
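The Task/IOM model above can be sketched as a polling scheduler. This is a minimal illustration only: the class names `Task` and `IOModule`, the queue-based channels and the two example tasks are assumptions made for the sketch, not the DAQ/EF -1 code.

```python
from collections import deque
from typing import Callable, Deque, List


class Task:
    """One task bound to one I/O channel (hypothetical structure)."""

    def __init__(self, name: str, handler: Callable[[object], None]) -> None:
        self.name = name
        self.handler = handler
        self.channel: Deque[object] = deque()  # pending stimuli for this task

    def has_stimulus(self) -> bool:
        # Polling condition checked by the scheduler.
        return bool(self.channel)

    def run_once(self) -> None:
        # Consume one stimulus (message or event data) and handle it.
        self.handler(self.channel.popleft())


class IOModule:
    """A collection of tasks scheduled within a single process."""

    def __init__(self, tasks: List[Task]) -> None:
        self.tasks = tasks

    def poll_once(self) -> int:
        """One pass of the scheduler loop; returns the number of tasks activated."""
        activated = 0
        for task in self.tasks:
            if task.has_stimulus():
                task.run_once()
                activated += 1
        return activated


# Usage: two illustrative tasks, one per I/O channel.
log = []
rol_input = Task("rol_input", lambda frag: log.append(("buffered", frag)))
lvl2_request = Task("lvl2_request", lambda ev_id: log.append(("sent_to_lvl2", ev_id)))
iom = IOModule([rol_input, lvl2_request])

rol_input.channel.append("fragment-1")
lvl2_request.channel.append(1)
first_pass = iom.poll_once()   # both tasks have a stimulus
second_pass = iom.poll_once()  # nothing pending, no task is activated
```

No task blocks: a task only runs when its polling condition is true, so one process can serve several channels.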
Message passing
• Based on circular buffers
• Supported buses: VMEbus, PVIC, PCI, CPU bus, EBIO (TCP/IP)
• Duplication of R/W pointers ==> no polling on the bus
• DMA and broadcast used where possible
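One possible reading of the pointer duplication is sketched below: each side keeps a local copy of both pointers and pushes its own pointer to the other side after every update, so full/empty checks only touch local memory. The class names, the `bus_writes` counter and the placement of the slots in receiver memory are assumptions for illustration, not the actual protocol.

```python
class Side:
    """Memory local to one CPU; it holds a copy of both ring pointers."""

    def __init__(self) -> None:
        self.write_ptr = 0
        self.read_ptr = 0


class MessageRing:
    """Single-sender, single-receiver circular buffer (illustrative)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.slots = [None] * size  # assumed to live in the receiver's memory
        self.sender = Side()
        self.receiver = Side()
        self.bus_writes = 0         # remote writes across the bus; there are no remote reads

    def send(self, msg) -> bool:
        s = self.sender
        nxt = (s.write_ptr + 1) % self.size
        if nxt == s.read_ptr:            # full: decided from local copies only
            return False
        self.slots[s.write_ptr] = msg    # remote write: message payload
        s.write_ptr = nxt
        self.receiver.write_ptr = nxt    # remote write: duplicate of the write pointer
        self.bus_writes += 2
        return True

    def receive(self):
        r = self.receiver
        if r.read_ptr == r.write_ptr:    # empty: polled in local memory, no bus access
            return None
        msg = self.slots[r.read_ptr]     # local read
        r.read_ptr = (r.read_ptr + 1) % self.size
        self.sender.read_ptr = r.read_ptr  # remote write: duplicate of the read pointer
        self.bus_writes += 1
        return msg


ring = MessageRing(4)                          # holds at most size - 1 = 3 messages
sent = [ring.send(i) for i in range(4)]        # the fourth send finds the ring full
received = [ring.receive() for _ in range(4)]  # the fourth receive finds it empty
```

The failed send and the empty receive cost no bus access in this sketch, which is the point of keeping both pointers on both sides.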
[Figure: PCI DMA. Time per message vs. message size (0 to 2500 bytes), for one receiver and two receivers.]
[Figure: PVIC. Time per message vs. message size (0 to 1200 bytes), for one receiver and five receivers.]
The MFCC based ROBin
The CES MFCC 8441 is a commercially available intelligent PMC
– I/O: via user-programmable 10k50ev front-end FPGA
– Same S/W environment as on SBC (LynxOS 3.0.1)
Today's ROB implementation minimizes movement of event fragments over the system bus (it must receive and buffer 1 kB events at 100 kHz) by using an add-on PMC card, one per Read-Out Link.
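The input requirement quoted above translates into a per-link bandwidth as follows. This is a small worked calculation; the function name is illustrative, and whether "1 kB" means 1000 or 1024 bytes is an assumption, so both are shown.

```python
def link_bandwidth_mb_per_s(fragment_bytes: int, rate_khz: float) -> float:
    """Bandwidth in MB/s (1 MB = 10**6 bytes) for fixed-size fragments at a given rate."""
    return fragment_bytes * rate_khz * 1e3 / 1e6


bw_decimal = link_bandwidth_mb_per_s(1000, 100)  # 1 kB taken as 1000 bytes
bw_binary = link_bandwidth_mb_per_s(1024, 100)   # 1 kB taken as 1024 bytes
```

Either way each Read-Out Link delivers on the order of 100 MB/s, which motivates buffering on the PMC card rather than moving every fragment over the system bus.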
The ROBin software
• No device drivers, minimal operating system calls.
• Single process, scheduler, three tasks to manage one ROL, one internal I/O to ROB-host + firmware.
ROBin firmware
• Firmware on FPGA from VHDL synthesis (40/66 MHz clock):
  • PPC master and slave
  • interface to ROL protocol (S-link)
  • Buffer manager and DMA manager state machines
  • 2 kB data fragment buffering
  • communication to ROBin task via two FIFOs (EM pages stats)
[Figure: block diagram with labels PPC 603, ROL (S-link) and SDRAM.]
ROBin interaction
The ROBin application interacts with the ROB-host and the event data source (FE-FPGA).
The following ROB performance results were obtained with test programs running on the ROB-host, covering:
• Event Location
• Event Deletion
• Event Retrieval
[Figure: scheduler loop of the ROBin and its connection to the ROB-host over PCI.]
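The three operations measured in the following slides (event location, retrieval, deletion) can be pictured with a toy event manager. The class and its method names are hypothetical stand-ins for the calls named in the plots (EM_GetById, EM_GetByIdCopy, EM_DeleteById); the dictionary store is an assumption and does not model the pages and classes of the real event manager.

```python
from typing import Dict, List, Optional


class EventManager:
    """Toy in-memory store of event fragments keyed by event ID."""

    def __init__(self) -> None:
        self._fragments: Dict[int, bytearray] = {}

    def insert(self, event_id: int, data: bytes) -> None:
        self._fragments[event_id] = bytearray(data)

    def get_by_id(self, event_id: int) -> Optional[bytearray]:
        """Event location: return the buffered fragment itself, or None."""
        return self._fragments.get(event_id)

    def get_by_id_copy(self, event_id: int) -> Optional[bytes]:
        """Event retrieval: return an independent copy of the fragment, or None."""
        frag = self._fragments.get(event_id)
        return None if frag is None else bytes(frag)

    def delete_by_id(self, event_ids: List[int]) -> List[int]:
        """Event deletion: drop the listed events and return the IDs actually deleted."""
        deleted = []
        for event_id in event_ids:
            if self._fragments.pop(event_id, None) is not None:
                deleted.append(event_id)
        return deleted


em = EventManager()
em.insert(1, b"abc")
em.insert(2, b"de")
located = em.get_by_id(1) is not None   # event 1 is buffered
copied = em.get_by_id_copy(2)           # copy of event 2
acked = em.delete_by_id([1, 3])         # event 3 was never buffered
gone = em.get_by_id(1)                  # event 1 is no longer available
```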
Event location
• Messages over the PPC-PCI-PPC buses
• No S-Link I/O
• No broadcast mechanism
• Best arrangement of events into memory (one event per class)
[Figure: EM_GetById, sequential IDs. Rate (kHz) vs. number of MFCCs (0 to 4). Setup labels: RIO2 8062 (PPC, MEM), PCI bus, MFCC 8441 PMC (PPC, MEM).]
Event retrieval
• Messages: single cycles
• Event data transfer: DMA
• Transfer bandwidth: ~50 MB/s
[Figure: EM_GetByIdCopy. Rate (kHz) vs. event size (0 to 1024 bytes), for 1, 2, 3 and 4 MFCCs.]
Event deletion
• No input of new events, only messages over the system bus.
• Messages sent in DMA mode
• Several delete requests packed into one message
• Delete requests get acknowledged
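Packing several delete requests into one message can be sketched as below. The function names are illustrative, and treating the "L2R group size" on the plot axis as the number of delete requests per message is an assumption.

```python
from typing import List


def pack_delete_requests(event_ids: List[int], group_size: int) -> List[List[int]]:
    """Split a list of event IDs into messages of at most group_size requests each."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [event_ids[i:i + group_size] for i in range(0, len(event_ids), group_size)]


def message_rate_khz(delete_rate_khz: float, group_size: int) -> float:
    """Message rate on the bus when every message carries group_size delete requests."""
    return delete_rate_khz / group_size


messages = pack_delete_requests(list(range(10)), 4)  # three messages: 4 + 4 + 2 requests
rate = message_rate_khz(100.0, 50)                   # 100 kHz of deletes in groups of 50
```

Larger groups reduce the number of messages, and therefore the per-message overhead, for the same delete rate.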
[Figure: EM_DeleteById. Rate (kHz) vs. L2R group size (0 to 60), for 1, 2, 3 and 4 MFCCs. Conditions: no event regeneration, # pages/event = 1, # events/class = 1.]
S-Link measurements
• Rate dominated by event fragment input traffic
• Max. input rate in best conditions = 145 MB/s (with no messages from ROB-host)
(SLIDAS max bandwidth is 160 MB/s)
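As a rough cross-check, the 145 MB/s maximum input rate sets a ceiling on the fragment rate for the three event sizes used in this measurement. The calculation assumes 1 MB = 10^6 bytes and ignores any per-fragment protocol overhead, so it is an upper bound, not a prediction of the measured rates.

```python
def max_fragment_rate_khz(bandwidth_mb_per_s: float, fragment_bytes: int) -> float:
    """Upper bound on the fragment rate (kHz) for a given link bandwidth (MB/s)."""
    return bandwidth_mb_per_s * 1e6 / fragment_bytes / 1e3


# Event sizes taken from the measurement: 1016, 504 and 248 bytes/event.
ceilings = {size: max_fragment_rate_khz(145.0, size) for size in (1016, 504, 248)}
for size, khz in ceilings.items():
    print(f"{size} bytes/event: at most {khz:.1f} kHz")
```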
[Figure: EM_DeleteById (with SLIDAS, single cycle message passing, preliminary). Rate (kHz) vs. L2R group size (0 to 50), for 1016, 504 and 248 bytes/event; pages/event = 1; one ROB with one ROBin.]
ROC measurements
• No S-Link, input emulated
• All ROB<->ROBin messages sent in single cycle mode
• Rate dominated by PCI traffic and ultimately limited by the MFCC CPU to ~120 kHz
[Figure: Event rate (kHz) vs. number of MFCCs (0 to 4), for 1% L2A and 5% L2A.]
[Figure: Test setup. Component labels: RIO2 8062 (PPC, MEM, DMA), PCI bus, PVIC PMC, PVIC, VMEbus. Traffic labels: Event data (Messages); Messages + Event data; Input fragments + Messages + Event data. One element is marked "Not used".]
I/O module configurations
To further minimize data movement and message passing, more than one external input can be handled by an IOM.
[Figure: ROC Event Rates. Event rate (kHz, axis 0 to 200) for four configurations: TRG + EBIF + ROBs, TRG/EBIF + ROBs, TRG + EBIF/ROBs and TRG/EBIF/ROBs. Conditions: one ROB, no ROBins. Small diagrams show the arrangement of the modules (T, E, R) in each configuration, with TRG and ROL inputs.]
ROC Event Rates with ROBins
[Figure: ROC Event Rate vs. # ROBins. Event rate (kHz, axis 80 to 120) vs. number of ROBins per ROB (0 to 5), for TRG + EBIF + ROBs, TRG/EBIF + ROBs, TRG + EBIF/ROBs and TRG/EBIF/ROBs; 1 kB event fragments.]
Max event rate increases when data collection (ROBins to EBIF) is done on the same SBC.
Expected ATLAS max ROL bandwidth: 1 kB × 75 (100) kHz.
Conclusions
• A Read-Out Crate based on the DAQ/EF -1 design and deployed on COTS components has the functionality required by the ATLAS Trigger/DAQ community.
• Software layering and minimal dependence of the software packages on the operating system add flexibility without degrading performance. They also facilitate porting (move to Motorola/Linux or Intel/Linux).
• The requirement of many Read-Out Links per Read-Out Buffer has led to the design of the ROBin, capable of receiving event data fragments at the expected Level-1 rates.