The Track - Finding Processor for the Level

advertisement
The Track-Finding Processor for the
Level-1 Trigger of the CMS Endcap
Muon System
D. Acosta, A. Madorsky, B. Scurlock, S.M. Wang
University of Florida
A. Atamanchuk, V. Golovtsov, B. Razmyslovich, L.
Uvarov
St. Petersburg Nuclear Physics Institute
Geometric Coverage
ME4 descoped
LHC Electronics Workshop 2001
2
Alex Madorsky
Trigger Regions in ϕ
ME1/3
Track-Finding
performed in
independent
60° sectors
MB2/2
MB2/1
Illustration of
overlap region
LHC Electronics Workshop 2001
3
Alex Madorsky
Muon Track-Finding
Perform 3D track-finding from trigger primitives
Measure PT, ϕ, and η
Transmit highest PT candidates to Global L1
θ
ϕ
LHC Electronics Workshop 2001
4
Alex Madorsky
Sector Partitioning
20 °
20°
20 °
Sector Sector Sector
ME1 Left
30° or 20°
subsectors in
ME1
ME1/3
ME1 Center
10 ° 10 °
10°
60° Sector
10°
10°
ME2/210 °
10°
10°
10 ° 10 ° 10 °
ME3/2
10 °
10 ° 10°
10 ° 10 ° 10 °
10° 10 ° 10° 10 ° 10° 10°
ME1/2
ME2/1
10° 10° 10° 10° 10° 10°
ME1/1
ME1/A
20°
20 °
20°
20 °
ME3/1
20°
20°
10 ° 10° 10° 10 ° 10°10 °
16µ
2 or 3 MPC
10°
60 ° Sector
ME1 Right
Muon
Port
Card
16µ
Muon
Port
Card
2µ
6 µ / SR
2µ
Sector
Receiver
Barrel
LHC Electronics Workshop 2001
16µ
18µ
Muon
Port
Card
18µ
Muon
Port
Card
Muon
Port
Card
2µ
3µ
3µ
6µ
Sector
Processor
Barrel
6µ
Sector
Receiver
Barrel
5
Alex Madorsky
60° sectors in
ME2 —ME4
Sector Processor Functionality
èIdentify
and measure muons from ~ 600 bits every 25ns (3
GB/s)
èPerform all possible station-to-station extrapolations in
parallel
p Simultaneously search roads in ϕ and η
èAssemble 3- and 4-station tracks from 2-station
extrapolations
èCancel redundant short tracks if track is 3 or 4 stations in
length
èSelect the three best candidates
èCalculate PT, ϕ, η and send to CSC muon sorter
LHC Electronics Workshop 2001
6
Alex Madorsky
Sector Processor Block Diagram
Extrapolation Units
2–4
(18 bits)
3-4
3-2
TAU 2
3-1
- Control Line
- Downloading/
Readout Line
Track 4
2-4
2-3
2-1
3 (3x26)
1(6x26)
TAU 1
EU2
1-3
LHC Electronics Workshop 2001
Track 3
Track 2
Track 1
1–3
(36 bits)
2 (3x26)
1(6x26)
EU1
1-2
1–2
(36 bits)
Assignment Unit
Data
Extraction LID
MUX
CLOCKED
FIFO
U – Extrapolation Unit
AU – Track Assembling Unit
SU – Final Selection Unit
TA –Selected Track Address
ID – Look-Up Input Data
2(3x26)
2–3
(18 bits)
FSU
Track 5
3 (3x26)
EU3
2-3
Track 6
VME BUS
EU4
2-4
Pt
LUT
Output
Data
3x22=66
7
Alex Madorsky
OUTPUT
CONNECTOR
- Data Line
2 (3x26)
Track Assemblers
Final Selection Unit
STA
Control
4 (3x26)
Input Data
15x34=510
Data
EU5
3 -4
DOWNLOADING/
READOUT
INTERFACE
3 (3x26)
INPUT DATA &
CONTROL
INTERFACE
9U CUSTOM
BACKPLANE
Input
8x3
4 (3x26)
3–4
(9 Extrapolations or
18 bits)
Sector Processor Logic
èPerform
Chamber
ME4
41
42
43
Chamber
ME3
31
32
all combinations of
extrapolations in parallel:
1i ↔ 2k , 1i ↔ 3k, 2i ↔3k 2i ↔
4k , 3i ↔ 4k
But not 1i ↔ 4k
33
èTrack
Chamber
ME2
21
22
23
Assembler takes best
2 or 3 extrapolations per
reference segment
Chamber
ME1
*
1
*
1
*
1
1
*
1
2
3
1
**
1
1
**
2
1
**
3
**
1
LHC Electronics Workshop 2001
8
Alex Madorsky
Extrapolation Unit
η Road Finder:
•Check if track segment is in allowed trigger
region in η.
•Check if ∆η and η bend angle are consistent
with a track originating at the collision vertex.
Quality Assignment Unit:
•Assigns final quality of extrapolation by
looking at output from η and φ road finders
and the track segment quality.
φ Road Finder:
•Check if ∆φ is consistent with φ bend angle φ b
measured at each station.
•Check if ∆φ in allowed range for each η
window.
LHC Electronics Workshop 2001
9
Alex Madorsky
Track Assembler Unit
Functions:
èDetermines
if any track segment pairs belong to the same
muon
èCombines those segments and assigns a code to denote
which muon stations are involved
èOutputs a quality word for the best track for each hit in the
key stations (2 and 3)
LHC Electronics Workshop 2001
10
Alex Madorsky
Final Selection Unit
Functions:
èSelect
three best out of nine tracks presented by Track
Assemblers
èCancel redundant tracks (parts of the longer track)
LHC Electronics Workshop 2001
11
Alex Madorsky
Pt Assignment Unit
PT, ϕ, η of the three best muons selected by Final
Selection Unit.
èCalculates
LHC Electronics Workshop 2001
12
Alex Madorsky
First Prototype
LHC Electronics Workshop 2001
13
Alex Madorsky
Custom LVDS Backplane
LHC Electronics Workshop 2001
14
Alex Madorsky
Test crate
LHC Electronics Workshop 2001
15
Alex Madorsky
First Prototype Results
èTrack-Finder
algorithm implemented in 15 Xilinx Virtex
FPGAs, ranging from XCV50 to XCV400
èOne XCV50 for VME interface
èOne XCV50 for output FIFO
èLatency 15 clocks
èOutput compared with C++ model, using simulated input data
as well as random numbers, transmitted via custom
backplane
è100% matching demonstrated
LHC Electronics Workshop 2001
16
Alex Madorsky
Test software
Written:
èStandalone
version of the C++ model for Windows
èModule for the comparison of the C++ model with the board’s
output
èJTAG configuration routine, controlling the fast VME-to-JTAG
module of the board
èLookup configuration routine, used to write and check the
on-board lookup memory
èBoard Configuration Database with a Graphical User Interface
(GUI), that keeps track of many configuration variants and
provides a one-click selection of any one of them. Each
variant contains the complete information for FPGA and
lookup memory configuration. Can be used for multiple
boards.
To simplify possible porting:
èAll
software is written in portable C or C++
èGUI is written in JAVA
LHC Electronics Workshop 2001
17
Alex Madorsky
Test Software Screenshot
LHC Electronics Workshop 2001
18
Alex Madorsky
Single Crate Solution
Muon Sorter
SR SR SR SR SR SR
/ / / / / /
SP SP SP SP SP SP
CCB
BIT3 Controller
Clock and Control
Board
MS
Track-Finder crate (1.6 Gbits/s optical links)
SR SR SR SR SR SR
/ / / / / /
SP SP SP SP SP SP
SR/SP Card
(3 Sector Receivers +
Sector Processor)
(60° sector)
From MPC
(chamber 4)
From MPC
(chamber 3)
From MPC
(chamber 2)
From
Trigger Timing
Control
From MPC
(chamber 1B)
From MPC
(chamber 1A)
To
Global Trigger
To DAQ
• Power consumption: ~ 500W per crate
• 15 optical connections per SR/SP card
• Custom backplane for SR/ SPs < > CCB/MS connection
LHC Electronics Workshop 2001
19
Alex Madorsky
Merged SR/SP
15 trigger links, 1 DAQ
Small Form
Factor
Deserializer Front
Transceivers
Chips
FPGAs
From
trigger
primitives
45 SRAM
Memory
Look-up
Tables
From MPC
(chamber 4)
~45 SRAM chips
80 MHz speed
VME
Interface
Sector
Processor
FPGA chip
From MPC
(chamber 3)
From MPC
(chamber 2)
To Muon
Sorter
From MPC
(chamber 1C)
Ptassignment
LUTs
From MPC
(chamber 1B)
To/From
Barrel
From MPC
(chamber 1A)
To CSC DAQ
1 April 2001
Input: 528 bits/sector/b.x.
LHC Electronics Workshop 2001
From Clock
and Control
Board
Output: 60 bits/sector/b.x.
20
Alex Madorsky
Salient Features
Bi-Directional Optical Links
èSince
we have both transmitters and receivers in the optical
connection, this allows the option to send data as well as
receive
è Makes testing much easier
p One board sends data to other, or even itself through links
SP chip on a mezzanine board
è
è
Decouples board development from FPGA technology
Makes upgrades easier
DT fan-out on transition board
Deliver all possibly needed signals to backplane connector
è Settle transmission technology, connector type, connector
count on transition board at a later date
è
LHC Electronics Workshop 2001
21
Alex Madorsky
SR/SP Inputs
è
Muon Port Cards deliver 15 track stubs each BX via optical
Signal
Sent on
first
frame
èDT
Bits / stub
½ Strip *
CLCT
pattern *
L/R bend *
Quality *
Wire group
Accelerator µ
CSC i.d.
8
4
Bits / 3 stubs
(1 MPC)
24
12
Bits / 15 stubs
(ME1–ME4)
120
75
1
3
7
1
4
3
9
21
3
12
15
45
105
15
60
BXN
Valid pattern
2
1
6
3
30
15
Spare
Total:
1
32
3
96
15
480
Description
½ strip label
Pattern number without
4/6, 5/6, 6/6
Sign bit for pattern
Computed by TMB
Wire group label
Straight wire pattern
Chamber label in
subsector
2 LSB of BXN
Must be set for above
to apply
(240 bits at 80 MHz)
Reduced
from 5
Needed
for frame
info
Track-Finder delivers 2 track stubs each BX via LVDS
Signal
Bits / stub
φ
φb
Quality
BXN
Synch/Calib
Muon Flag
Total:
12
5
3
2
1
1
24
LHC Electronics Workshop 2001
Bits / 2 stubs
(MB1: 60° )
24
10
6
4
2
2
48
22
Description
Azimuth coordinate
φ bend angle
Computed by TMB
2 LSB of BXN
DT Special Mode
2nd muon of previous BX
Alex Madorsky
SR/SP Outputs
6 track stubs are delivered to DT Track-Finder each BX
(delivered at 40 MHz to transition board)
è
è
Signal
Bits / stub
12
Bits / 2 stubs
(ME1: 20° )
24
Bits / 6 stubs
(ME1: 60° )
72
Bits / 6 stubs
@ 80 MHz
36
φ
η
1
2
6
3
Quality
3
6
18
9
BXN
–
2
6
3
16
34
102
51
Description
Azimuth
coordinate
DT/CSC
region flag
Computed
by TMB
2 LSB of
BXN
Total
3 muons per SP are delivered to Muon Sorter via GTLP backplane
Sent on
first
frame
Signal
Bits / µ
φ
η
Rank *
Sign *
BXN *
Error *
Spare *
Total:
LHC Electronics Workshop 2001
5
5
7
Bits / 3 µ
(1 SP)
15
15
21
Bits / 36 µ
(12 SP)
180
180
252
1
–
–
1
19
3
2
1
3
60
36
24
12
12
720
23
Description
Azimuth coordinate
Pseudorapidity
5 bits pT + 2 bits
quality
2 LSB of BXN
(360 bits at 80 MHz)
Alex Madorsky
SR/SP Internal Dataflow
è
ME1
only
Data delivered from SR to SP
Signal
Bits / stub
12
5
6
1
1
Bits / 6 stubs
(ME1)
72
30
36
6
6
Bits / 15 stubs
(ME1–ME4)
180
75
90
15
6
φ
φb
η
Accelerator µ
Front/Rear *
CSC ghost *
–
4
4
Quality
Total:
3
28
18
172
45
415
LHC Electronics Workshop 2001
24
Description
Azimuth coordinate
φ bend angle
Pseudorapidity
η bend angle
ME1 chamber
stagger
2 stubs in same
ME1 CSC
Computed by TMB
(sent at 80 MHz)
Alex Madorsky
Old SR Memory Scheme
Now in TMB
LHC Electronics Workshop 2001
25
Alex Madorsky
New SR Memory Scheme
0.5 BX
½ Strip
8
Pattern
4
Quality
3
L/R Bend 1
Muon #
Local
φ LUT
128K× 16
0.5 BX
φlocal 10
10 φlocal
ALCTWG 5
6 φb, local
CSCID
Muon #
1
4
1
φlocal 10
Frame 1
Frame 2
ALCTWG 5
CSCID
Muon #
4
1
ALCTWG 7
CSCID 4
φb, local 6
LHC Electronics Workshop 2001
26
φlocal
2
Muon #
1
Global
φ LUT
(CSC)
1M×12
12 φCSC
Global
φ LUT
(DT)
1M×12
12 φDT
Global
η LUT
(CSC)
1M×11
6 ηCSC
5 φb,CSC
Alex Madorsky
Discussion of SR Memory (1)
Data Frames:
Data arrives off links @ 80 MHz
è The proposed framing scheme implies no latency loss
p First LUT operates on first frame only
(but must reduce CLCT pattern label by 1 bit)
è This frame definition must be applied in the TMB where the
data is generated and first serialized (i.e. before MPC) to avoid
latency penalty
è
Memories:
è
è
è
LUT for DT only applies to ME1
Chip count is 3 per stub, down from 6 originally
Increased memory size to 0.5M (okay for synch. SRAM)
LHC Electronics Workshop 2001
27
Alex Madorsky
Discussion of SR Memory (2)
One memory set per stub:
è
è
è
15×3 = 45 chips
1 BX latency
415 signals to SP
First SR had 36 chips and 1 BX latency
Memory choices:
è
è
synch SRAM clocked at 160 MHz (tested to this frequency)
Flow Through SRAM clocked at 80 MHz
LHC Electronics Workshop 2001
28
Alex Madorsky
Pt Assignment Memory Scheme
SP
FPGA
8
4
Sign 1
η
4
F/R 1
Mode 4
Muon 1
Rank
LUT
4M×8
Rank1
7
OE
∆φ12
∆φ23
8
4
Sign
η
F/R
Mode
1
4
1
4
Muon 2
Rank
LUT
4M×8
Rank2
7
OE
∆φ12
∆φ23
Must serialize to fit data on
backplane
è No latency penalty for
serialization if Rank sent
first to Muon Sorter
è SP FPGA contains tri-state
buffers and generates
enable for memories
è
Frame 1
8
4
Sign 1
η
4
F/R 1
Mode 4
80 MHz
Muon 3
Rank
LUT
4M×8
Rank3 7
OE
80 MHz
Sign1 , Sign2, Sign 3, BXN, Error
φ1 , φ2 , φ3
η1 , η2 , η3
LHC Electronics Workshop 2001
9
30
15
Register/GTLP
converter
∆φ12
∆φ23
To GTLP
Backplane
GTLP
backplane
Frame 2
15
29
Alex Madorsky
CSC TF DAQ Plans
Send input of SR and output of SP to DAQ stream for
diagnostics
Track-Finder DAQ acts as additional DDU for EMU
system
è
Plan to use existing DDU design by OSU
p OSU has decided to use T.I. Chip for serialization
p 12 SP fiber connections fits well into 15 planned for DDU
LHC Electronics Workshop 2001
30
Alex Madorsky
Mirror CSC DAQ Path
OSU now
plans 20°
slices to
equalize
bandwidth
CSC DDU
designed
by Ohio
State Univ.
15 optical fibers
× 36
Sector
Processors
send L1
data
LHC Electronics Workshop 2001
12 optical fibers
DDU
31
SLINK
Alex Madorsky
+ 1 (2?)
First prototype data flow
Pre-production prototype data flow
From Muon Port Cards
From Muon Port Cards
1
Lookup tables
Sector Receiver st.4
Front FPGAs
Sector Receiver st.2,3
1
Optical receivers
Sector Receiver st.1
Optical receivers
1
Front FPGAs
1
Lookup tables
SR/SP board
Channel link
transmitters
0
Bunch crossing analyzer (not implemented)
4
1
2
Bunch crossing analyzer (not implemented)
3
Extrapolation units
Latency
Latency
Channel link receivers
1
3
Final selection unit 3 best out of 9
3
2
Total: 21
Pt precalculation for best 3 muons
Pt assignment (memory)
Sector Processor
9 Track Assembler units (memory)
1
1
Total: 7
To Muon Sorter
LHC Electronics Workshop 2001
32
9 Track Assembler units
Sector Processor
FPGA
1
2
Extrapolation units
Pt precalculation for
9 muons
Final selection
unit 3 best out
of 9
Output multiplexor
Pt assignment (memory)
To Muon Sorter
Alex Madorsky
Pre-Production Prototype
Done:
èSP algorithm is rewritten in Verilog HDL
èNew C++ model line-by-line corresponds to Verilog HDL code
èIf Verilog HDL code is written properly, C++ conversion is very
simple and straightforward
èNew C++ model matches the functionality of the original C++ model
èSP-Verilog has been extensively simulated and is working in exact
correspondence to functionality of the first prototype and both C++
models
èSimulated latency for Sector Processor Algorithm: 5 clocks (15 in
the first prototype)
èBoard development underway
Plans:
èReplacing the original C++ model with the new one in ORCA
èAdding new features into the SP-Verilog code and C++ model
simultaneously, keeping them in strict line-by-line correspondence
LHC Electronics Workshop 2001
33
Alex Madorsky
SP Algorithm Modifications
èExtrapolation
and Final Selection units are reworked, and
now each of them is completed in only one clock.
èThe
Track Assembler Units in the first prototype were
implemented as external lookup tables (static memory). For
the second prototype, they are implemented as FPGA logic.
This saved I/O pins on the FPGA and one clock of the latency.
èThe
preliminary calculations for the PT assignment are done
in parallel with final selection for all 9 muons, so when three
best out of nine muons are selected, the pre-calculated
values are immediately sent to the external PT assignment
lookup tables.
LHC Electronics Workshop 2001
34
Alex Madorsky
Acknowledgements
This work was supported by grants from the US Department
of Energy.
We also would like to acknowledge the efforts of R. Cousins,
J. Hauser, J. Mumford, V. Sedov, B. Tannenbaum, who
developed the first Sector Receiver prototype, and the
efforts of M. Matveev and P. Padley, who developed the
Muon Port Card and Clock and Control Board, which were
used in the tests.
LHC Electronics Workshop 2001
35
Alex Madorsky
Download