# Applications of programmable logic in modern particle physics experiments

Geoff Hall
Imperial College London



# Particle physics experiments


- What is particle physics?
  - Far too big a challenge to explain in full, but...
- What is LHC?
- What are we doing there?
- How do we do it?
  - (Why?)
- What are the technical challenges of such experiments?
- How have we (we hope) overcome them?
  - What is the role of (programmable) logic?
- How will we do so in future?
  - Our subject has relied on electronics for decades, especially digital logic, and has closely followed technology evolution
  - There has been a rapid expansion of the use of FPGAs in just a few years

FTP 2006 2 Geoff Hall



**ALICE** 



# Large Hadron Collider

- Latest CERN accelerator, due to start late 2007
- Very high intensity
  - 10<sup>15</sup> collisions per year
- Very high rate
  - beams cross @ 40 MHz
- Few "interesting" events
  - ~100 Higgs decays per year
- Beams
  - 7 TeV protons
    - => 14 TeV energy
  - also ions (e.g. Pb)



**LEP tunnel** 







- The forces between matter (particles) are transmitted by fields, also represented by particles
  - The "Standard Model" has unified some of the (4) forces of nature
    - Electromagnetism and the Weak Nuclear Force
    - There are only a few families of elementary particles
    - Several free parameters are not yet explained theoretically
      - information needed from experiment
  - But the SM has been astonishingly successful
    - The most significant missing item is mass
    - It may be explained by a new field (and particle) Higgs (boson)
    - Experiment and theory suggest this should be found at LHC



## Forces hold matter together





#### Standard Model





#### Make up hadronic matter:

- Proton (uud)
- Neutron (udd)
- Exotic baryons (qqq)
- Mesons (qq̄)

#### Leptons

- Families, like quarks, but why?
- μ and τ are nearly like electrons
- each has a neutrino partner
- All quarks and leptons have mass

| Lepton   | Electric charge | Neutrino          | Electric charge |
|----------|-----------------|-------------------|-----------------|
| Electron | -1              | Electron neutrino | 0               |
| Muon     | -1              | Muon neutrino     | 0               |
| Tau      | -1              | Tau neutrino      | 0               |

 Colliding beams maximises the energy available to create new particles (compared to shooting at a target)



Hadron collisions are actually between their constituent parts...

- gluons
- quarks: both valence and sea (≈ real and virtual)
- and the particles they exchange (Z, W, ...)

The length scale probed shrinks with energy: $\lambda \sim 1/p \approx 1/E$
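As a rough worked example of the relation $\lambda \sim 1/p \approx 1/E$, the reduced wavelength $\hbar c / E$ can be evaluated for a 7 TeV proton. The constant $\hbar c \approx 197.327$ MeV fm is the standard value; treating the full beam energy as the probe energy is a simplifying assumption, since each constituent carries only a fraction of it.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm (standard value)


def reduced_wavelength_fm(energy_gev: float) -> float:
    """Reduced wavelength hbar*c/E in femtometres, ultra-relativistic limit (p ~ E)."""
    return HBAR_C_MEV_FM / (energy_gev * 1000.0)


beam_wavelength_fm = reduced_wavelength_fm(7000.0)
print(f"reduced wavelength at 7 TeV: {beam_wavelength_fm:.2e} fm")
```

The result is several orders of magnitude below the ~1 fm size of a proton, which is why the collisions resolve constituents rather than whole hadrons.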







#### Discoveries...

- Look at interactions for
  - unexpected behaviour
    - like large energy at large angle to beam
       (how Rutherford discovered the atomic nucleus)
  - evidence of short-lived particles
    - visible evidence
    - Indirect, by peaks in mass spectra





Old picture of a charmed particle production and decay in a bubble chamber



## What we hope to find



- Higgs discovery (simplified!)
  - Will be produced with many other particles
  - ~20 events per beam crossing
  - hundreds of secondary particles/25ns

- Much new physics
  - New forces
  - New particles
  - New symmetries
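The numbers quoted in the talk can be combined into a quick rate estimate. This is only back-of-envelope arithmetic using the 40 MHz crossing rate and ~20 events per crossing; the variable names are illustrative.

```python
BUNCH_CROSSING_HZ = 40e6      # beams cross at 40 MHz
EVENTS_PER_CROSSING = 20      # ~20 events per beam crossing

# Time between consecutive beam crossings
crossing_spacing_ns = 1e9 / BUNCH_CROSSING_HZ

# Total proton-proton interaction rate seen by the detector
interaction_rate_hz = BUNCH_CROSSING_HZ * EVENTS_PER_CROSSING

print(f"crossing spacing: {crossing_spacing_ns:.0f} ns")
print(f"interaction rate: {interaction_rate_hz:.1e} Hz")
```

The interaction rate comes out just under 10<sup>9</sup> Hz, so a Higgs candidate has to be picked out of roughly a billion collisions every second.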





# CMS Compact Muon Solenoid






# **Detector Luminosity Effects**

- $H \rightarrow ZZ \rightarrow \mu\mu ee$, $M_H = 300$ GeV, for different luminosities in CMS





## Trigger and DAQ systems

- Trigger/DAQ is the system that selects particle interactions that are potentially
  of interest for physics analysis (Trigger), and which takes care of collecting
  the corresponding data from the detector system, putting them into a suitable
  format and recording them to permanent storage (DAQ)
- First-level trigger: the first stage of a multi-level selection process that takes place before data are recorded to permanent storage
  - In <u>custom digital electronics</u>
- Followed by
  - Higher trigger levels implemented in commercial computer farm
- Trigger requirements
  - High efficiency for selecting processes of interest for physics analysis
    - Precisely known and unbiased
  - Large reduction of rate from unwanted high-rate processes
  - Decision must be fast
  - Operation should be deadtime free
  - Implementation should allow flexibility to adapt to experimental conditions
  - System must be affordable



## **Triggering**

- Many categories of interesting events have characteristic signatures in the detector. Combinations of:
  - Detection of energetic electron(s) (ECAL)
  - Detection of μ(s) (muon system)
  - Observation of energetic jets (ECAL/HCAL)
    - (groups of hadronic particles)
- Must not reject them without further analysis
- Need fast decision (few µs) in hardware
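A toy illustration of "combinations of signatures" is sketched below. It is written in Python purely for readability; the real decision runs in custom hardware within a few µs. The class, field names and threshold values are hypothetical and are not CMS settings.

```python
from dataclasses import dataclass


@dataclass
class TriggerPrimitives:
    """Highest-energy objects seen in one beam crossing (hypothetical summary)."""
    electron_et_gev: float
    muon_pt_gev: float
    jet_et_gev: float


# Illustrative thresholds only; these are assumptions, not CMS values.
ELECTRON_ET_MIN = 25.0
MUON_PT_MIN = 15.0
JET_ET_MIN = 100.0


def level1_accept(p: TriggerPrimitives) -> bool:
    """Keep the crossing if any characteristic signature is present."""
    return (
        p.electron_et_gev >= ELECTRON_ET_MIN
        or p.muon_pt_gev >= MUON_PT_MIN
        or p.jet_et_gev >= JET_ET_MIN
    )
```

An accepted crossing is not declared interesting; it is only kept for further analysis at the higher trigger levels.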







## Processing LHC Data







## **Latency Summary**





# CMS Trigger & DAQ

Two level architecture:





## LHC Trigger Levels



#### Collision rate 10<sup>9</sup> Hz

Channel data sampling at 40 MHz

#### Level-1 selected events 10<sup>5</sup> Hz

**Particle identification** (high $p_T$ e, $\mu$, jets, missing $E_T$)

- Local pattern recognition
- Energy evaluation on prompt macro-granular information

#### Level-2 selected events 10<sup>3</sup> Hz

Clean particle signature (Z, W, ..)

- Finer granularity precise measurement
- Kinematics, effective mass cuts and event topology
- Track reconstruction and detector matching

#### Level-3 events to tape 10..100 Hz

**Physics process identification** 

Event reconstruction and analysis
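The rejection needed at each stage follows directly from the rates listed above. The short sketch below just divides consecutive rates; the upper end of the quoted 10..100 Hz range is used for the final level, which is a choice made here for illustration.

```python
rates_hz = [
    ("collisions", 1e9),
    ("Level-1", 1e5),
    ("Level-2", 1e3),
    ("Level-3 (to tape)", 1e2),  # upper end of the quoted 10..100 Hz
]


def rejection_factors(levels):
    """Return (name, factor) pairs: input rate divided by output rate per stage."""
    return [
        (name, prev_rate / rate)
        for (_, prev_rate), (name, rate) in zip(levels, levels[1:])
    ]


factors = rejection_factors(rates_hz)
overall = rates_hz[0][1] / rates_hz[-1][1]

for name, factor in factors:
    print(f"{name}: rate reduced by a factor {factor:.0e}")
print(f"overall: {overall:.0e}")
```

Most of the reduction has to happen at Level-1, which is the stage implemented in custom digital electronics.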



# Challenges of LHC

- LHC machine itself is immensely challenging
- R&D for LHC experiments commenced ~1990
  - No turnkey commercial systems or components
  - Expected radiation levels were unprecedented
    - unclear how to build detectors for such a hostile environment
  - All systems, especially electronics, are highly customised
    - They must operate for 10 years, with almost no intervention on the detector
- Issues include...
  - Number of channels and data volumes
  - Unprecedented readout speed and trigger rate
  - Radiation tolerance of sensors and readout
  - Data transfer
  - Power dissipation
  - Available computing power
  - Long term reliability
  - Component obsolescence
  - Important constraints on detector design: mass, hermeticity, ...
  - Cost



~68 million pixels

#### Silicon Tracker






# Tracker during assembly



Outer barrel 3.1M channels

October 2006

Inner barrel system (to be inserted into outer barrel): 1.8M channels



#### Electromagnetic Calorimeter







### Muon System



#### Position measurement:

- Drift Tubes (DT) in barrel
- Cathode Strip Chambers (CSC) in endcaps

#### Trigger:

Resistive Plate Chambers (RPCs) in barrel and endcaps



**HCAL** 

Magnet coil, in cryostat

- 195k DT channels
- 210k CSC channels
- 162k RPC channels




# LHC electronic systems

- functions required by all systems
  - amplification and filtering
  - analogue to digital conversion
  - association to beam crossing
  - storage prior to trigger
  - deadtime free readout @ ~100kHz
  - storage pre-DAQ
  - calibration
  - control
  - monitoring

- Special functions for Calorimeters and Muon systems
  - first level trigger primitive generation
  - (so far not feasible for tracking)





# Tracker Electronic System



- NB Control, clock, trigger not shown
- On-detector
  - 73k APV chips / 9.3M channels
  - 43k optical links
- Off-detector electronics
  - 440 FEDs, plus spares
  - ~25 FECs

Main features

- Analogue readout
- No on-detector zero suppression
- Optical analogue data transfer
- Control signals sent optically
- Local electrical transfer
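The system totals can be cross-checked with simple arithmetic. The sketch uses 128 channels per APV25 chip and 96 optical inputs per FED (8 front-end modules of 12 inputs each), both taken from other slides of this talk; the variable names are illustrative.

```python
APV_CHIPS = 73_000        # "73k APV chips"
CHANNELS_PER_APV = 128    # APV25 readout channels
FEDS = 440                # off-detector Front End Drivers, excluding spares
INPUTS_PER_FED = 96       # 8 front-end modules x 12 optical inputs

total_channels = APV_CHIPS * CHANNELS_PER_APV
fed_inputs = FEDS * INPUTS_PER_FED

print(f"strip channels: {total_channels / 1e6:.1f}M")
print(f"FED optical inputs: {fed_inputs}")
```

The channel count reproduces the quoted 9.3M, and the total number of FED inputs is of the same order as the quoted 43k optical links.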



x96



#### APV25

- Main features
  - Radiation hard by design/technology
  - 128 readout channels
  - 50 ns CR-RC amplifier
  - 192 cell pipeline memory
  - alternate operating modes
    - peak, deconvolution, multi-mode
    - on-chip analogue signal processing
  - on-chip ancillary functions
    - eg calibration, I<sup>2</sup>C, programmable latency...
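Two of the APV25 figures lend themselves to a quick calculation: the time span covered by the 192-cell pipeline at one sample per 25 ns crossing, and the shape of an ideal 50 ns CR-RC pulse. The pulse formula is the textbook CR-RC response with equal time constants, normalised to unit peak; it is an idealisation, not a measured APV25 pulse shape.

```python
import math

PIPELINE_CELLS = 192
SAMPLE_PERIOD_NS = 25.0   # one sample per 40 MHz beam crossing
SHAPING_TIME_NS = 50.0    # CR-RC amplifier time constant


def pipeline_span_us(cells: int = PIPELINE_CELLS) -> float:
    """Longest history the pipeline can hold, in microseconds."""
    return cells * SAMPLE_PERIOD_NS / 1000.0


def cr_rc_pulse(t_ns: float, tau_ns: float = SHAPING_TIME_NS) -> float:
    """Ideal CR-RC response, normalised to 1 at its peak (t = tau)."""
    if t_ns <= 0:
        return 0.0
    x = t_ns / tau_ns
    return x * math.exp(1.0 - x)


print(f"pipeline span: {pipeline_span_us():.1f} us")
print([round(cr_rc_pulse(t), 2) for t in (25.0, 50.0, 75.0, 100.0)])
```

The pipeline span is an upper bound on the usable trigger latency, because some cells are occupied by triggered events waiting to be read out.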





7.1mm








#### FPGAs in CMS

- Rapid development in the last few years, e.g.
  - FPGAs were originally too expensive, with insufficient resources, for the FED project (1990s), and we expected ASICs would be required
  - 2002: committed to Xilinx Virtex-II 2M gate (3M gate exceeded budget target)
    - But first production parts only available in May 2002
- Board design
  - Close collaboration between professional engineers and physicists
  - Works best with rigorous plan requirements/specifications/project management/...
  - NB long development, prototyping, testing time for demanding applications
    - Long term reliability in operation mandatory
- Firmware not the only challenge
  - Shared between engineers, and (technically minded) physicists
    - NB PhD students and RAs often become the firmware experts
- Many (& growing number of) projects using FPGAs in CMS for flexible digital logic, especially <u>trigger</u> system
  - Three representative applications, to illustrate some of the challenges:
    - Tracker FED
    - APVe: front-end chip pipeline logic emulation
    - Global Calorimeter Trigger



#### **Tracker Front End Driver**

- opto-electric conversion
- digitisation
- data reordering
- baseline subtraction
- hit finding
- zero suppression
- data assembly
- data transfer via high speed link
- Raw input data rate = 3.4 GB/s
- Average output rate = 200 MB/s

15 metal layers, ~6000 components, double sided assembly, ~25,000 traces, impedance matched
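One way to arrive near the quoted raw input rate is sketched below. It assumes that every trigger produces a 7 µs frame of 10-bit samples at 40 MHz on each of the 96 FED inputs, at a trigger rate of ~100 kHz. The individual numbers appear elsewhere in this talk, but combining them in this way is an assumption made here.

```python
CHANNELS_PER_FED = 96     # optical inputs per FED
ADC_BITS = 10             # 10-bit ADCs
SAMPLING_HZ = 40e6        # 40 MHz digitisation
FRAME_READOUT_S = 7e-6    # APV frame readout time
TRIGGER_RATE_HZ = 100e3   # ~100 kHz readout rate

samples_per_frame = FRAME_READOUT_S * SAMPLING_HZ
bytes_per_frame = samples_per_frame * ADC_BITS / 8
raw_rate_bytes_per_s = bytes_per_frame * CHANNELS_PER_FED * TRIGGER_RATE_HZ

# Reduction achieved by hit finding and zero suppression (quoted rates)
reduction_factor = 3.4e9 / 200e6

print(f"raw input rate: {raw_rate_bytes_per_s / 1e9:.2f} GB/s")
print(f"reduction factor: {reduction_factor:.0f}")
```

Under these assumptions the estimate lands close to the quoted 3.4 GB/s, and the quoted output rate implies that the FED discards all but roughly one part in seventeen of the raw data.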





#### **FED Overview**



Modularity matches Opto Links

- 8 x Front-End "modules"
  - OptoRx/Digitisation/Cluster Finding
  - 160 MHz transmission to...
- Back-End module / Event Builder
- VME module / Configuration
- Power module

Other Interfaces:

- TTC: Clk / L1 / BX
- DAQ: SLink64, 400 MB/s peak [200 MB/s sustained]
- TCS: Busy & Throttle
- **VME**: Control & Monitoring
- JTAG: Test & Configuration



# Front End (FE) Unit on FED


Figure labels (FE unit):

- Opto-RX: opto-to-electrical conversion of the 12-way optical ribbon cable input
- 12 x buffers
- 6 x dual 40 MHz 10-bit ADCs
- 3 x Delay FPGA (ADC clock timing)
- Virtex-II 2M gate FPGA performs signal processing
- Functions: digitise and sync data, find hit clusters
- Analogue circuitry duplicated on secondary side

#### De-multiplexed fibre channel = APV Data Frame



To extract hits we need to perform:

- Common mode subtraction
- Pedestal subtraction
- Cluster finding
- Sync checking
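The first three steps can be sketched in a few lines of Python. This is a minimal illustration, not the FED firmware: the median is assumed as the common mode estimate, the threshold is a hypothetical multiple of the strip noise, and sync checking is omitted.

```python
from statistics import median


def find_clusters(raw, pedestals, noise, threshold_snr=3.0):
    """Return (first_strip, last_strip, charge) for each group of adjacent hits.

    raw, pedestals and noise are per-strip sequences of equal length.
    threshold_snr is an illustrative value, not a CMS setting.
    """
    # 1. Pedestal subtraction: remove each strip's fixed offset
    signal = [r - p for r, p in zip(raw, pedestals)]

    # 2. Common mode subtraction: remove the shift shared by all strips
    #    (median used here as an assumption)
    common_mode = median(signal)
    signal = [s - common_mode for s in signal]

    # 3. Cluster finding: group contiguous strips above threshold
    clusters = []
    start = None
    for i, (s, n) in enumerate(zip(signal, noise)):
        is_hit = s > threshold_snr * n
        if is_hit and start is None:
            start = i
        elif not is_hit and start is not None:
            clusters.append((start, i - 1, sum(signal[start:i])))
            start = None
    if start is not None:
        clusters.append((start, len(signal) - 1, sum(signal[start:])))
    return clusters


# Eight strips, pedestal 100, common mode shift 5, a hit on strips 3 and 4
raw = [105, 105, 105, 145, 125, 105, 105, 105]
clusters = find_clusters(raw, pedestals=[100] * 8, noise=[2.0] * 8)
print(clusters)
```

Only the clusters need to be sent onward, which is what makes zero suppression effective at reducing the data volume.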



# FED Challenges

- Firmware is one of many issues in such a project
  - 3 major FPGAs
    - FE (physics), BE (data handling), VME interface
    - Customised firmware, to be thoroughly debugged
    - Software, ditto
    - Many interfaces
  - Algorithms are not complex, but many operational system issues
  - Manufacturing is a major concern, given numbers and budget
- Some of the other issues encountered and overcome
  - Component availability
  - Component obsolescence
  - Dense layout of analogue sections
  - Precise clock distribution to 96 channels
  - Design tool evolution
  - Manufacture quality
  - ADC non-conformity (noise and power)
  - Cooling

"Final" yield > 98% with good operational experience on ~50 boards over 1-2 years

Final system now being installed



# FED Acceptance Testing Flow



1. Custom CMS tests at assembly plant: boundary scan, analogue



**Assembly Company** 



**FED** 



3. Tests at CERN followed by integrated system tests

4. Installation at CMS underground area





VME Transition card

"Pipeline" with each stage taking ~ 1 month and containing ~ 50 FEDs



#### The FED Tester

Provides 24 optical channels & Trigger Control System (TCS)



FED evaluation required almost as complex a module

Data for 3 channels loaded into FPGA.

Converted to analogue form by 3 DACs.

Cross-point switch controls distribution of the 3 unique channels to the 24 channels.

8 three channel Analogue OptoHybrids convert electrical to optical signals.

Temp of AOHs controlled to +/-1°C



 Emulates pipeline logic in APV25 FE readout chip to prevent buffer overflows

< 75 ns for decision, including transmission

APV25 silicon strip readout chip

Interface to TTCci for pipeline address transmission

 FE chips on the detector cannot report to system if their buffers will overflow

 Signal transmission (100m) needs too long to prevent further triggers arriving in the meantime

Buffer monitoring could use a real APV25

- But emulation provides much more information on status
- However, when idea was first developed FPGAs did not seem sufficiently powerful



Interface to Central & Local TCS to receive control signals and send status.

Additional input for FED status



## "Deadtime free" operation

- 73k identical chips, operating synchronously
  - System partitioned for testing
  - SEU & other effects can de-synch.
- Data stored in pipeline memory with "ring" topology
  - Pointers record current (write) location and location of data being read
  - Addresses of used locations stored in FIFO to be skipped during writing
  - pipeline length is dynamic
- Pipeline length, buffer depth, storage time chosen to ensure that rate of data lost is sufficiently small
  - queuing problem



- APV needs 7 µs for frame readout
- Max FIFO depth = 10
- Prob($N_{trig} > 10$) << 1

But Poisson fluctuations mean overflow is inevitable
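The queuing problem can be explored with a small simulation. The sketch below assumes purely random (Poisson) triggers at 100 kHz, a fixed 7 µs readout per event and a buffer of depth 10. Real triggers are synchronous with the 25 ns beam crossings and may be subject to trigger rules, so this is an illustration of the effect and not a model of the APV25.

```python
import math
import random
from collections import deque


def poisson_tail(mean: float, n: int, terms: int = 60) -> float:
    """P(N > n) for a Poisson distribution, summed term by term."""
    return sum(
        math.exp(-mean) * mean**k / math.factorial(k)
        for k in range(n + 1, n + 1 + terms)
    )


def simulate_overflow(rate_hz, readout_s, depth, n_triggers, seed=1):
    """Fraction of triggers that find the buffer full (single FIFO, fixed readout time)."""
    rng = random.Random(seed)
    t = 0.0
    finish_times = deque()  # readout completion time of each buffered event
    lost = 0
    for _ in range(n_triggers):
        t += rng.expovariate(rate_hz)  # random gap to the next trigger
        # Free the buffers whose readout has completed by now
        while finish_times and finish_times[0] <= t:
            finish_times.popleft()
        if len(finish_times) >= depth:
            lost += 1  # buffer full: this trigger would overflow
            continue
        # Readout starts when the previous event has finished
        start = max(t, finish_times[-1]) if finish_times else t
        finish_times.append(start + readout_s)
    return lost / n_triggers


mean_per_readout = 100e3 * 7e-6  # average triggers arriving during one readout
p_burst = poisson_tail(mean_per_readout, 10)
loss_fraction = simulate_overflow(100e3, 7e-6, depth=10, n_triggers=50_000)
print(f"P(N > 10 in 7 us) = {p_burst:.1e}, simulated loss fraction = {loss_fraction:.1e}")
```

The chance of more than ten triggers within a single readout time is tiny, yet the simulated buffer can still fill occasionally because a backlog builds up over several readout times. This is the sense in which overflow is rare but cannot be excluded.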



### How does APVE work?

- Two methods:
  - "Belt and braces"
- Real APV25 chip
  - .. is fed the same L1A and Reset signals as those in the Tracker
  - counts the number of filled APV25 buffers
    - L1A => INCREMENTS
    - Data frame => DECREMENTS
    - Reset => CLEARS
  - Cannot update status until a frame is fully read out
- APV emulation
  - FPGA replicates the internal APV logic.
  - This provides the best Tracker efficiency, but the logic is difficult to synthesise and must match the real APV25 logic precisely.
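The counting method used with the real APV25 chip can be written as a small state machine. The sketch below is a Python illustration of the increment, decrement and clear rules listed above; the class name, the warning margin and the mapping onto Ready/Warn/Busy are assumptions, and the full emulation of the APV logic is not attempted.

```python
class ApvBufferCounter:
    """Tracks how many APV25 buffers are filled (illustrative sketch)."""

    def __init__(self, max_depth: int = 10, warn_margin: int = 2):
        self.max_depth = max_depth      # maximum FIFO depth
        self.warn_margin = warn_margin  # hypothetical early-warning margin
        self.count = 0

    def l1a(self) -> None:
        """A Level-1 accept fills one more buffer."""
        self.count += 1

    def data_frame(self) -> None:
        """A fully read out data frame frees one buffer."""
        self.count = max(0, self.count - 1)

    def reset(self) -> None:
        """A reset clears all buffers."""
        self.count = 0

    def status(self) -> str:
        if self.count >= self.max_depth:
            return "Busy"
        if self.count >= self.max_depth - self.warn_margin:
            return "Warn"
        return "Ready"


counter = ApvBufferCounter()
for _ in range(8):
    counter.l1a()
print(counter.status())
```

Because the count only drops once a frame has been read out completely, this method reacts later than an emulation that follows the pipeline logic cycle by cycle.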





## I-ImaS Project

- APVe was realised to be a very flexible module
  - Upgraded it for final system and other applications
  - Used in EU Medical Imaging project (I-ImaS)



- New board has:
  - USB 2.0
  - 10/100 Ethernet
  - XC2VP20-6FF1152C
  - 8 Rocket IO (SATA configured)
  - CF bootloader
  - Legacy VME interface
  - Temperature monitoring & thermal shutdown
  - EEPROM memory
  - 15A regulators
  - 128-512MB DDR SDRAM
  - ~270 spare I/O





### Comments on APVe

- First stand-alone project for university group
  - Much learning involved
- Verilog model of pipeline logic was not available so had to reverse-engineer the code
  - First in C, then VHDL
  - Essential that code was identical to implementation
  - And synthesisable
- Resources
  - Virtex II 1000 (xc2v1000-fg256)
    - No. of slice flip-flops: 2993/10240 = 30%
  - Virtex II Pro VP20 (xc2vp20-ff1152)
    - No. of slice flip-flops: 2982/18560 = 16%
    - No. of 4-input LUTs: 7456/18560 = 40%



# Triggering and FPGAs

- The LHC experiments are large and complex. However, the logic used for triggering is relatively simple, at least at L1
  - It relies on identifying "simple" objects quickly: e, μ, jets, high p<sub>T</sub>, ...
- What makes it difficult is the large volume of information to be handled and the need for high levels of interconnectivity between different physical elements or regions
  - Eg, so far, no L1 tracking trigger
  - Parts of CMS were built using then state-of-the-art ASICs to achieve density and speed, and achieve practical connectivity
- FPGAs have been getting more interesting
  - Some early projects did not overcome the interconnection problems
    - Latest FPGAs have much more IO, pin-count, speed,... and have been adopted to overcome these challenges



# Calorimeter Algorithms




### Electron/photon

- Large deposition of energy in small region, well separated from neighbour
- E<sub>HCAL</sub>/E<sub>ECAL</sub> < 0.05
- 2D array to map 3D geometry

### Hadronic Jet

- Isolated narrow energy deposition
- Simulations identify likely patterns to accept or veto
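The electron/photon criteria can be illustrated on a small 2D grid of calorimeter towers. The function below is a sketch under stated assumptions: the minimum energy and the neighbour fraction used for "well separated" are hypothetical, only the four nearest neighbours are checked, and the wrap-around of a real detector in azimuth is ignored. Only the E<sub>HCAL</sub>/E<sub>ECAL</sub> < 0.05 cut comes from the slide.

```python
def egamma_candidates(ecal, hcal, min_energy=10.0,
                      max_h_over_e=0.05, max_neighbour_fraction=0.2):
    """Return (row, col, energy) for towers that look like an electron or photon.

    ecal and hcal are 2D lists of tower energies with the same shape.
    min_energy and max_neighbour_fraction are illustrative assumptions.
    """
    rows, cols = len(ecal), len(ecal[0])
    candidates = []
    for r in range(rows):
        for c in range(cols):
            energy = ecal[r][c]
            # Large energy deposit in a small region
            if energy < min_energy:
                continue
            # Little energy leaking into the hadron calorimeter behind it
            if hcal[r][c] / energy >= max_h_over_e:
                continue
            # Well separated: nearest neighbours must be quiet
            neighbours = [
                ecal[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            if all(n <= max_neighbour_fraction * energy for n in neighbours):
                candidates.append((r, c, energy))
    return candidates


ecal = [[1.0, 1.0, 1.0], [1.0, 50.0, 1.0], [1.0, 1.0, 1.0]]
hcal = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
found = egamma_candidates(ecal, hcal)
print(found)
```

Every tower can be tested independently of the others, which is what makes this kind of logic a natural fit for parallel hardware.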





## Role of GCT in the CMS Trigger


- Receives data from RCT (18 crates) and ranks results
- Combines data from full detector to identify jets



- **Jet Finder** is most challenging GCT task
  - Requires fast data exchange between different hardware units
  - Essential for CMS: almost all the discovery channels require that this is operational
  - This made the design complex, and demanded high clock speed and bandwidth
  - Latency critical

# Global Calorimeter Trigger










### **New GCT features**

- Main features of new design
  - Large, powerful FPGAs on small number of cards concentrate algorithm processing in few locations
  - Optical links increase input density, allowing a large number of input signals to be brought to processing boards
  - Low (40 MHz) clock (+DDR) used throughout
  - FPGAs exchange data in close proximity
  - Few boards, with many common features
- FPGAs: Virtex 4 and Virtex II Pro chosen for:
  - Built-in parallel-serial data handling
  - Large size and processing capability
  - Availability
- Leaf processing
  - Two firmware variants needed, U1 & U2: Virtex II Pro VP70 (xc2vp70-ff1517-7)
  - U2: slice flip-flops 7%, 4-input LUTs 28%
  - U1: slice flip-flops 11%, 4-input LUTs 53%
- Very late start but rapid progress with implementation in < 1 year
  - Source cards (6), Leafs (2), tested and integrated, final production launched
  - Wheel & Concentrator in manufacture



### Comments on GCT

- Old: ahead of its time for such a problem. Data sharing between modules led to significant challenges
  - 155 FPGA configurations (even if many common elements)
  - ~65,000 lines VHDL code in ~200 active firmware modules
  - Operational (noise) challenges on boards communicating using 1.44 GB/s and 3.2 GB/s electrical links
  - Hard to test system adequately in a modular way: the whole system is needed for validation
- New: currently uses the most advanced FPGAs in CMS
  - Exploits size, I/O capacity, embedded SERDES features to drastically reduce size of system
    - Input signals translated from differential ECL to 1.6 Gb/s optical, which provides higher density and reduced noise risk
  - Significant improvements in testability, whose importance in such projects cannot be overstated
  - NB rapid progress at late stage
    - testament to capable people, and drawing from previous projects
    - but also to ease of use of FPGA technology

## GCT boards to date...



### Source card

- USB 2.0 interface for readout
- 2x VHDCI SCSI RCT inputs (d-ECL)
- 4x SFP optical outputs
- Spartan-3 1M FPGA

### Leaf card





## Summary & conclusions

- FPGAs have a bright future in our field
  - Increasing use inevitable, especially in small, flexible data handling projects
  - However, they cannot (yet?) satisfy all our requirements
    - Location in radiation regions, or where there are material budget constraints
  - Analysing Cost vs Benefit is not easy
    - FPGAs look impressive although big, powerful, expensive, state of the art parts are needed in some projects
    - We have yet to see the long term operation
    - Maintaining a community of firmware experts is a challenge
  - Major particle physics projects span several lifetimes
    - Development: ambitions high, may hit cost/performance issues
    - Production: stay as close to state of art as possible
    - Operation: ageing to geriatric phase, 10-15 years for LHC
      - Not yet clear when upgrades will be needed/possible during this phase
- We continue to live in interesting times...



## Acknowledgements

- I have relied on many other people's work to compile this talk. In particular, I would like to thank
  - Sergio Cittolin
  - John Coughlan
  - Costas Foudas
  - Greg Iles
  - John Jones
  - Wesley Smith
  - and many other colleagues at Imperial College, Rutherford Appleton Laboratory and CERN



# Spare slides

### Control structure



- APVE receives LHC clock, control and timing signals (→→) from Central & Local TCS.
  - Local TCS allows the Tracker to operate without main Trigger system - for test/calibration
- APVE sends Tracker status ( → ) to the active TCS.
  - Busy, Warn, Out-Of-Sync, Ready or Error
  - Status is a combination of APV and FED status
- The "golden" pipeline address is transmitted to the FED via TTC channel
  - Allows APV25 synchronisation check
- APVs in the Tracker receive L1As and control information via FEC and CCU ring control system.
  - They send data frames ( → ), including the pipeline address of each event, to the FEDs