A Prototype Track-Finding Processor for the Level-1 Trigger of the CMS Cathode Strip Muon System
S.M. Wang, D. Acosta, A. Madorsky, B. Scurlock
University of Florida
A. Atamanchouk, V. Golovstov, B. Razmyslovich
PNPI
DPF 2000, August 9-12, 2000

Track-Finding Processor Prototype
• We have designed and built a prototype Track-Finding processor for the Level-1 Trigger of the CMS Endcap Muon System
• It is fast and efficient in linking track segments into complete 3-D tracks
• It measures the Pt, φ, and η of these tracks
• In this talk:
– Design of the prototype
– Test results of the prototype

S.M. Wang, University of Florida DPF2000, August 9-12, 2000

CMS
[Figure: the CMS detector, with labels: Drift Tube Chambers (DT), Resistive Plate Chambers (RPC), Cathode Strip Chambers (CSC), HCAL, ECAL, Tracker, Forward Calorimeter, Solenoid]

Purpose for a Fast and Efficient Muon Trigger
At LHC:
• Expect muons to provide a clear signature for a wide range of interesting physics processes
– Higgs production, B Physics, New Physics ... (H → ZZ∗ → 4µ)
• These interesting physics events have low production cross sections
• Need high luminosity (⇒ high bunch crossing rate (40 MHz)) to observe these rare processes
• Require a fast and efficient muon trigger to:
– efficiently tag muons from rare physical processes, and reject most muons from large background processes
– perform the trigger operation in a short period of time (3.2 µs for Level-1)

Requirements for CSC Track-Finder
• High efficiency with low Pt threshold (∼ 20 GeV/c)
• Single muon trigger rate < few kHz at L = 10³⁴ cm⁻² s⁻¹
• Pt resolution ≈ 20%
– Requires 3-D tracking and information from 3 CSC stations
• Multi-muon capability
• Pipelined at 40 MHz bunch crossing frequency
– deadtime-less
• Small latency: 16 bunch crossings (400 ns)
• Programmable ⇒ FPGA and RAM implementation
– Allows the experiment to adapt to different background conditions and collision rates

CSC Level-1 Trigger
Trigger Regions:
• Divided into 60° sectors in azimuth for both endcaps
• η region covered by each sector includes the Overlap region (1.0 ≲ |η| ≲ 1.2) and the Endcap region (1.2 ≲ |η| ≲ 2.4)
[Figure: detector cross-section, with labels: Overlap, η = 1.2, η = 0.9, RPC, Drift Tube, CSC, Non-uniform Magnetic Field]

CSC Level-1 Trigger Scheme
[Figure: block diagram of the CSC Level-1 trigger chain, with location labels "On chamber", "In peripheral crate", and "In counting house". Blocks: Strip FE cards, Wire FE cards, Wire LCT card, Strip LCT + Motherboard card (TMB), 2µ / chamber; Port Card (PC), 3µ / port card, optical link; CSC Track-Finder with Sector Receiver (SR) and Sector Processor (SP), 3µ / sector; RPC Interface Module (RIM); CSC Muon Sorter. RPC (4µ), DT (4µ), and the CSC Muon Sorter (4µ) feed the Global µ Trigger, which sends 4µ to the Global L1.]
• Track segments from chambers in a 60° sector are sent to the Sector Receiver for pre-processing before being sent to the Sector Processor
• Track-finding is performed in the Sector Processor.
• The Sector Processor also receives track segments from Drift Tube chambers for track-finding in the overlap region

CSC Track-Finder Architecture
• Track-Finder implemented as 12 Sector Processors (6 for each endcap)
• Each Sector Processor:
– Handles track segments in a 60° azimuthal sector
– Performs 3-D track-finding from track segments (matches segments in φ and η)
– Measures Pt, φ, and η
– Sends up to 3 best track candidates to the Muon Sorter

CSC Track-Finder Architecture
• Each Sector Processor is implemented on a 9U VME card
• CSC track segments are sent from 3 Sector Receivers (SR) via a custom point-to-point backplane. DT track segments arrive via a transition board at the back of the crate
• Custom point-to-point backplane
– Delivers ∼ 600 bits every 25 ns (3 GBytes/s)
– Operates at 280 MHz to reduce connections
∗ National Channel Link 28:4 serialization
[Figure: crate layout, with labels: Transition Board, Backplane, SR, SR, SP, SR, CCB]

Sector Processor Block Diagram
[Figure: Sector Processor block diagram. Inputs E1 to E4 and MB1-2 pass through a FIFO and the Bunch Crossing Analyzer (BXA) to the Extrapolation Units (EU1-2, EU1-3, EU2-3, EU2-4, EU3-4, EU MB1-2), then over a bus to the Track Assembler Units (TAU1, TAU2, TAU3), the Final Selection Unit (FSU), and, via a MUX, the Assignment Unit (AU).]
• Main components:
– Bunch Crossing Analyser (BXA): accumulates track segments for a couple of B.X., to accommodate errors in the B.X. assigned to track segments
– Extrapolation Units (EU)
– Track Assembler Unit (TAU)
– Final Selection Unit (FSU)
– Assignment Unit (AU)
• Expected overall latency is 16 B.X.

Extrapolation Unit
[Figure: Extrapolation Unit logic for a segment pair A1, B1. An η road finder, a φ road finder, and a quality assignment unit are built from look-up tables (LUT), subtractors (SUB), and comparators (CMP). Inputs: η, φ, bend angle φb, quality (Qual), and ambiguity (Amb) of each segment. The difference ∆φ = φA − φB is compared with LUT windows ∆φhigh, ∆φmed, ∆φlow and with the bend-angle limits φb+ and φb−. Outputs: Qη and Qextrap(A1B1), with a coarse Pt assignment.]
• Link track segments from two muon stations in 3-D:
– Test if segments are matched in η
– Check if ∆φ is consistent with the bend angles in each station
– Assign coarse Pt
– Output quality: no match, low, medium, or high Pt
• 87 individual extrapolations are performed in parallel, a total of ∼ 10¹¹ operations per second

Track Assembler Unit (TAU)
[Figure: Track Assembler Unit. Nine Link units, each an IDT 256K x 16 SRAM, take results from the Extrapolation Units and send a 9-bit hit i.d. and a 6-bit ranking to the Final Selection Unit. Endcap: LINK 31, 32, 33 (inputs ME3x-ME4: 3 bits, ME3x-ME2: 3 bits, ME3x-ME1: 12 bits) and LINK 21, 22, 23 (inputs ME2x-ME4: 3 bits, ME2x-ME3: 3 bits, ME2x-ME1: 12 bits). Overlap: LINK 21, 22, 23 (inputs ME2x-ME1: 6 bits, ME2x-MB2: 4 bits, ME2x-MB1: 8 bits).]
6 bit Ranking & 9 bit hit i.d.:
Endcap:  3 bits for ME1, 2 bits for ME2, 2 bits for ME3, 2 bits for ME4
Overlap: 2 bits for MB1, 2 bits for MB2, 3 bits for ME1, 2 bits for ME2
• TAU implemented as 9 static RAM memories for Endcap and Overlap
• Each Link unit handles all extrapolations to a single track segment in station 2 or 3. Successful extrapolations are used to form the best track pattern.
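As a rough illustration of the Link unit output described above (6-bit ranking plus 9-bit hit i.d.), the following Python sketch packs and unpacks such a word. The field widths come from the slide; the order of the fields inside the word, the use of 0 for "no segment", and the names `pack_link_word` and `unpack_link_word` are assumptions made only for this example, not the prototype's actual encoding.

```python
# Field widths are taken from the slide; the field ORDER is an assumption.
ENDCAP_HIT_ID_FIELDS = (("ME1", 3), ("ME2", 2), ("ME3", 2), ("ME4", 2))
OVERLAP_HIT_ID_FIELDS = (("MB1", 2), ("MB2", 2), ("ME1", 3), ("ME2", 2))
RANK_BITS = 6


def pack_link_word(rank, hit_ids, fields=ENDCAP_HIT_ID_FIELDS):
    """Pack a 6-bit rank and a 9-bit hit i.d. into one integer (hypothetical layout)."""
    if not 0 <= rank < (1 << RANK_BITS):
        raise ValueError("rank does not fit in 6 bits")
    word = rank                      # rank occupies the lowest 6 bits (assumed)
    shift = RANK_BITS
    for station, width in fields:
        value = hit_ids.get(station, 0)   # 0 is assumed to mean "no segment"
        if not 0 <= value < (1 << width):
            raise ValueError(f"{station} id does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack_link_word(word, fields=ENDCAP_HIT_ID_FIELDS):
    """Inverse of pack_link_word: return (rank, {station: segment id})."""
    rank = word & ((1 << RANK_BITS) - 1)
    shift = RANK_BITS
    hit_ids = {}
    for station, width in fields:
        hit_ids[station] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return rank, hit_ids


example_ids = {"ME1": 5, "ME2": 1, "ME3": 2, "ME4": 3}
example_word = pack_link_word(45, example_ids)
```

The packed word uses 15 bits, so it fits in the 16-bit data width of the 256K x 16 SRAM quoted for each Link unit.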
[Figure: example of LINK 21 combining the ME21-ME1, ME21-ME3, and ME21-ME4 extrapolations into a track pattern across stations ME1, ME2, ME3, and ME4]
• IDs of the track segments and the quality of the assembled track are sent to the Final Selection Unit (FSU)

Final Selection Unit (FSU)
[Figure: FSU block diagram, with labels: 9 Tracks from Track Assembler Units, I.D. Comparison Unit, Track Rank Sorter, Cancellation Logic and Encoder, MUX, 3 Best Tracks]
• Compare the qualities of the tracks and the IDs of the track segments that form the tracks to
– cancel redundant tracks
– select the 3 best distinct tracks (best ⇒ highest Pt)

Assignment Unit
• Determines φ, η, Pt of the 3 best muon candidates selected by the Final Selection Unit
• Pt determined from the sagitta measured between 2 or 3 muon stations in the endcap fringe field
– σPt/Pt ∼ 30% with only 2 stations
– σPt/Pt ∼ 20% with 3 stations ⇒ improved rate reduction at Level 1
• Implemented with FPGA preprocessing followed by a large SRAM look-up table
[Figure: Assignment Unit data flow. A FIFO and MUXes, steered by the Mode from the FSU, select the φ and η of the segments in ME1 to ME4. Subtractors form φ1 − φ2 and φ2 − φ3 (∆φ23). These, with η and the sign, address a ~2M x 8 SRAM LUT that outputs the Rank (Pt & Quality). Separate LUTs provide φ and η.]

CSC Sector Processor Prototype
[Figure: photograph of the prototype board with labeled components: FSU (XCV150BG352), EU (XCV400BG560), AU (XCV50BG256 + 2Mbit x 8 SRAM), TAU (256k x 16 SRAM), BXA (XCV50BG256)]
XCV => Xilinx Virtex FPGA

Test of the CSC Sector Processor Prototype
• The SP prototype is currently undergoing a series of tests at U. Florida
[Figure: test stand at U. Florida, with labels: Sector Processor Prototype, Clock-Control Board, Bit-3 Module, Test Software]

Tests:
• Download programs and LUTs into FPGAs and SRAMs (PASS)
• Logic tests:
– Each subprocessor is tested individually
– Known “patterns” are loaded into a buffer and sent through the subprocessor; the output from the subprocessor is compared with the results from simulation
[Figure: test flow. CMSIM (CMS Detector Simulator) produces track segments (Φ, η, ...). The segments go through both the Sector Processor Prototype hardware and the Sector Processor Prototype software simulator, and the Pt, Φ, η, ... of the assembled tracks from the two are compared.]
– Send track segments from ∼ 100k single muon and triple muon events through both hardware and software simulation. Get similar results in both.
– Test each subprocessor at 40 MHz rate

Summary of Tests
Table 1:
Subprocessor   Logic Test   Logic Test at 40 MHz
EU             PASS         PASS
TAU            PASS         PASS
FSU
AU

Simulation Results on the Prototype:
[Figure, left: efficiency of finding a single track versus η, from η = 1 to η = 2.4, with the Overlap region marked]
[Figure, right: single muon rate dN/dηdt (kHz) versus Pt threshold (GeV/c) for 1.2 < |η| < 2.4 at L = 10³⁴ cm⁻² s⁻¹, with curves for 3-station Pt and 2-station Pt and the target rate marked]
• High single track finding efficiency
• Can achieve the target single muon trigger rate with 3 station Pt measurement

Future Tests:
• Test with other CSC electronic trigger prototypes
– These tests will be conducted at Florida in August
[Figure: planned test chain, with labels: MPC, SR, SP, CCB, Clock/Control signals, Track Segments, Optical (Track Seg)]

Summary
• A Track-Finder Processor prototype for CMS has been built and tested
• Uses a 3-D algorithm to find tracks
• Processor is driven at 40 MHz, overall latency is 400 ns
• Each Sector Processor handles trigger region 1.04 < η < 2.4, ∆φ = 60°
• Measures momenta of the best track candidates from the sagitta measured between 2 or 3 muon chambers in the endcap fringe field
• Track finding algorithm is fully re-programmable, since the logic is mainly implemented in FPGAs and SRAMs
• Initial tests indicate that the prototype is performing according to design
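The Final Selection Unit step described in this talk (cancel redundant tracks, keep the 3 best distinct ones) can be sketched in software as follows. This is only an illustrative model: it assumes that two candidates are redundant when they share at least one track segment, that rank 0 means "no track", and that a larger rank is better. The function name `select_best_tracks` and the data layout are hypothetical and do not describe the FPGA logic itself.

```python
from typing import FrozenSet, List, Tuple

# A candidate is (rank, set of segment ids); a segment id is (station, index).
Track = Tuple[int, FrozenSet[Tuple[str, int]]]


def select_best_tracks(candidates: List[Track], n_best: int = 3) -> List[Track]:
    """Return up to n_best distinct candidates, highest rank first."""
    selected: List[Track] = []
    # Visit candidates from highest to lowest rank.
    for rank, segments in sorted(candidates, key=lambda t: t[0], reverse=True):
        if rank == 0:
            continue  # assumed encoding: rank 0 means no track was assembled
        # Assumed cancellation rule: drop a candidate that shares any segment
        # with a better candidate that was already kept.
        if any(segments & kept for _, kept in selected):
            continue
        selected.append((rank, segments))
        if len(selected) == n_best:
            break
    return selected


candidates = [
    (40, frozenset({("ME1", 1), ("ME2", 1), ("ME3", 1)})),
    (35, frozenset({("ME1", 1), ("ME2", 1)})),   # shares segments with rank 40
    (30, frozenset({("ME1", 2), ("ME2", 2), ("ME3", 2)})),
    (0, frozenset()),                            # empty slot
    (20, frozenset({("ME1", 3), ("ME3", 3)})),
    (10, frozenset({("ME2", 3), ("ME4", 1)})),
]
best = select_best_tracks(candidates)
```

In this example the rank-35 candidate is cancelled because it reuses the ME1 and ME2 segments of the rank-40 track, so the three survivors have ranks 40, 30, and 20.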