Fast Data Processing in ALICE TRD FEE & GTU

• Introduction
• FEE partitioning
• Design for Test
• Optical Readout
• Global Tracking Unit

KIP Uni-Heidelberg, V.Angelov, ALICE Workshop Sibiu 2008

Venelin Angelov
Kirchhoff Institute of Physics, Chair of Computer Science, Prof. Dr. Volker Lindenstruth
University Heidelberg, Germany
Phone: +49 6221 54 9812
Fax: +49 6221 54 9809
Email: angelov@kip.uni-heidelberg.de
WWW: www.ti.uni-hd.de

TRD Structure

[Figure: TRD structure in r and z, from TR-detector module and stack to supermodule and GTU]
- 18 supermodules in azimuth
- 1.2 million channels, 1.4 million ADCs
- ~65000 MCMs (PASA + TRAP), 18 channels each
- 8 MCMs: 144 channels
- peak data rate: 16 TB/s
- processing time: 6 µs
- 1080 optical links @ 2.5 Gbps to the GTU

Line Fit & Global Tracking

[Figure: module profile with tracklets, projected tracklets on a virtual plane, and the resulting track]
Tracklet processor:
- straight line fit via a linear regression method
- searching for dedicated patterns (for energy cut and electron-pion separation)
- raw data buffer
Global tracking:
- projection of 'tracklets' to a virtual plane
- searching for tracklets belonging together
- perform (accurate) energy cut and generate trigger

FEE development

How to design the FEE with so many channels?
- fast and with low latency
- low power
  - precise power control (1 mW/channel → 1.2 kW)
- low cost
  - minimize the use of connectors
  - use some simple chip package (MCM + ball grid array)
  - standard components? which process? IP cores?
- flexible, as much as possible
  - make everything configurable
  - use CPU core(s) for final processing
- reliable, no possibility to repair anything later
  - redundancy, error and failure protection
  - self-diagnostic features

FEE development (2)

Chip design flow
1. Detector simulations to understand the signals we get and what kind of processing we need
2. Select PASA shaping time, ADC sampling rate and resolution
3. Behavior model of the digital processing, including the bit precision of every arithmetic operation
4. Estimate the processing time and select the clock speed of the design (multiple of LHC and ADC sampling clock)
5. Code the digital design, simulate it, synthesize it, estimate the timing and area, optimize again…
6. Submit the chip, this is the point of no return!
7. Continue with the simulations, find some bugs and think about fixes
8. Prepare the test setup
And so on: TRAP1, 2, TRAPADC, miniQPM, TRAP3, TRAP3a (final)

Partitioning, Data Flow & Reduction

MCM - Multi Chip Module
Processing chain: TRD detector → PASA → ADC → Tracklet Preprocessor (TPP) → Tracklet Processor (TP) → Network Interface (NI) → GTU → L1 trigger to CTP, data to HLT & DAQ
- detector: 6 layers, 1.2 million analog channels
- PASA: charge sensitive preamplifier, shaper
- ADC: 10 bit ADC, 10 MSPS, 21 channels
- TPP: digital filter, preprocess data, event buffer
- event buffer: store raw data until L1A
- TP: fit tracklets for trigger, process raw data for monitoring
- NI: builds readout tree for trigger & raw data
- GTU: merge tracklets into tracks for trigger, process raw data for HLT
Values along the chain:
- time: during first 2 µs (drift time); after 3.5 µs; after 4.1 µs; after 6 µs
- data / event: 33 MB; max. 80 KB; some bytes
- peak rate: 16 TB/s; 600 GB/s
- mean rate: 257 GB/s
- reduction: 1; ~400; to trigger decision

Detector Readout

[Timing diagram, time axis in µs: pretrigger at 0, drift until 2, MCM processing, merger at 4, transmission time until 4.6, GTU processing, GTU finished at 6]
- 1,200,000 channels, 1,400,000 ADCs, 10 bit 10 MSPS, 20 samples/event, preprocessing
- 65,000 MCMs, 18+3 channels/MCM, 4 CPUs at 120 MHz
- 4,100 Readout Boards, 16+1(+1) MCMs, 8 bit 120 MHz DDR readout tree
- 1,080 Optical Readout Interface links, 2 links/chamber at 2.5 Gb/s
- 90 Track Matching Units (GTU), 1 TMU/module, FPGA based
- Central Trigger Processor
- Merger tree: MCM A + D 3+own:1, BM 4:1, HCM 4:1, ORI, 12 optical links, TMU 5:1, SMU (DDL), x18 ...
TGU, CTP

TRAP development – a long long way

- Beginning of 2001: MCM for 8 channels with the first prototypes of the digital chip and preamplifier, commercial ADCs. Chips: TRAP1 (UMC 0.18 µm), PASA (AMS 0.35 µm), FaRo1 (AMS 0.35 µm)
- Summer of 2002: first tested TRAP chip, in "spider" mode

Multi Chip Module

[Photo of the MCM, 4 cm wide, with labeled blocks]
- PASA
- Internal ADCs (Kaiserslautern)
- Digital Frontend and Tracklet Preprocessor
- Master State Machine
- CPU cores, memories & periphery
- Global I/O-Bus
- Serial Interface slave
- Network Interface
- External: Pretrigger, Serial Interface, Readout Network

PASA - Preamplifier and Shaping Amplifier

Signal path: input pads x18 → charge sensitive preamplifier → P/Z cancellation → Shaper 1 → Shaper 2 → differential output to the ADC
- FWHM (shaping time): 120 ns
- ENC: 850 electrons at 25 pF
- Gain: 12.5 mV/fC
- Integral nonlinearity: 0.3%
- Power: 12 mW / channel
- Process: 0.35 µm AMS
- Area: 21.3 mm²
V. Catanescu, H.K.Soltveit
[Plot: differential output voltage (V) vs. time (s), 0 to 2 µs]
[Schematic: programmable test generator with DAC[7..0], EN[17..0], CLK, DIN, STR, Vref, 18x]

TRAP block diagram

- 21 ADCs: 10 bit, 10 MHz, 12.5 mW, low latency
- Digital filters: nonlinearity correction, pedestal correction, gain correction, tail cancellation, crosstalk suppression
- Event buffer: 64 samples
- Memory: 4 x 4k for instructions, 1k x 32 quad ported for data, Hamming protected
- Hit detection, hit selection, 4 fitting units, fit register file
- 4x RISC CPU @ 120 MHz, each with its own IMEM; shared DMEM and GRF; CPU flags; CFG
- CPU: PC, decoder, ALU, Pipe 1, Pipe 2; registers FRF, PRF, GRF, CONST
- NI: 4 x 8 bit 120 MHz DDR inputs with FIFOs; 8 bit 120 MHz DDR
- GSM states: Standby, Armed, Acquire, Process, Send
- SCSN: 24 Mb/s serial network

ADC – principle of operation

Essential operations:
- comparison
- +/- Vref
- x2
- (Sample & Hold)
Designed in Uni-Kaiserslautern, R.Tielert, D.Muthers
[Figure: ADC stage with S&H, comparator threshold, DAC, +/- Vref and x2; example of the quantization process as a sequence of binary decisions; principle of the cyclic AD conversion]

ADC parameters

[Plot: ADC output and residuals (RES) vs. samples for a 110 kHz sine wave at 10 MSPS, CPU running; RMS = 0.39, ENOB = 9.57]

Parameter               Value
Resolution              10 bit
Sampling rate           10.4 MS/s
Process                 0.18 µm CMOS + MIMCAPS
Size (per ADC)          0.11 mm²
Power                   12.5 mW @ 10.4 MS/s
Input (pro.)            +/-1 V to +/-1.4 V
Supply                  1.8 V, 3.3 V
DNL                     -0.4 / +0.6 LSB
INL                     -0.8 / +0.7 LSB
ENOB @ 1 MHz signal     9.5 bit
SNR @ 1 MHz signal      59.2 dB
SFDR @ 1 MHz signal     73.0 dB
THD @ 1 MHz signal      69.5 dB

Filter and Tracklet Preprocessor

[Block diagram: 18+1 channels, each ADC → DFIL → event buffer, with condition check and Q hit per channel]
- DFIL (Digital FILter): nonlinearity, offset, gain, tail cancellation, crosstalk
- Event buffer: 64 timebins deep
- Hit select unit (max. 4 hits)
- 4x COG position calculation with LUT and parameter calculation
- FIT register file and tracklet selection, read by CPU0 to CPU3
- For the CPUs the FIT register file is a read-only register file

[Figure: filter and preprocessor pipeline, listed here from output to input]
- to processor: max. 4 candidates
- calculate sums for regression
- select candidates (max. 4)
- position correction
- position calculation (COG from amplitudes at y-1, y, y+1)
- channel selection
- 21x event buffer
- crosstalk cancellation
- tail cancellation
- gain adjustment
- pedestal correction
- nonlinearity correction
- history FIFO
- input: 21 digital channels from ADCs (10 MHz)
- preprocessor outputs: deflection, origin
(Figure title: Filter & Preprocessor)

MIMD Processor

- Preprocessor: 4 sets of fit data
- MIMD processor:
  - 4 CPUs
  - shared memory / register file
  - global I/O bus arbiter
  - separate instruction memory
  - coupled data & control paths
- CPU:
  - Harvard style architecture
  - two stage pipeline
  - 32 bit data path
  - register to register operations
  - fast ALU
  - 32x32 multiplication
  - 64/32 radix-4 divider
  - maskable interrupts
  - synchronization mechanisms
[Block diagram of CPU0: IMEM, decoder, PC, pipeline register, operand select from CON, GRF, FIT and PRF, ALU, write back, DMEM, interrupt, clocks and power control, reset, external interrupts, local I/O busses, I/O bus arbiter, global I/O bus]

SCSN - Slow Control Serial Network

- Up to 126 slaves per ring
- CRC protected
- 24 MBit/s transfer rate
- 16 addr., 32 data bits/frame
[Figure: serial bridged ring; the master (DCS) drives ring 0 and ring 1 through a chain of slaves]

NI Datapath

- local & global I/O interfaces
- input port with data resynchronization and DDR decoding
- input FIFOs (zero latency)
- port mux to define readout order
- output port with DDR encoding and programmable delay unit
[Block diagram: four 10 bit input ports (port0 to port3), each with a 64x16 FIFO; output port4 with delay units; local I/O 0 to 3 and global I/O as 16 bit interfaces to the CPUs; config; global bus arbiter]

Readout Boards and Readout Tree

[Figure: half-chamber readout tree with optical output at 2.5 Gbps, 850 nm]
MCM as readout network:
• data source only
• data source and data merge
• data merge only

Radiation tests - TRAP

[Plots: total bit errors vs. time (0 to 1000 s), chip "isofruit", I = 60 pA. Left: total errors in the CPU registers (reg CPU0 to reg CPU3). Right: total errors in the instruction memories im0 to im3 (each 4k x 32), Hamming protection switched off]
- 11 registers x 32 bit / CPU
- Typical size of the real time CPU program: fewer than 256 words.
- Memory is fully protected from 1 bit errors and will be refreshed periodically.
- The data in event buffers remain less than 100s; a parity bit is used for error detection.

Test flow of the MCM testing

• Apply voltages, control the currents
• JTAG connectivity test
• Basic test using SCSN (serial configuration bus)
• Test of all internal components using the CPUs
• Test of the fast readout
• Test of the ADCs by applying a 200 kHz sine wave
• Test of the PASA by applying voltage steps through serial capacitors
[Test setup: programmable supply voltages with current control; LVDS 8 bit 120 MHz DDR; programmable step generator; programmable sine-wave generator; FPGA1, FPGA2; MCM (DUT); MCM0, MCM1]
• Store all data for each MCM in a separate directory, store the essential results in an XML file
• Export the result for MCM marking and sorting
[Test setup, continued: MCM2, MCM3; SCSN; PCI (PC); supplies +1.8Vd, +3.3Vd, +1.8Va, +3.3Va]

TRAP internal tests

[Diagram of the tested blocks on the global bus]
- Memories: IM0 and further instruction memories 4 x 4k x 24, DM 1k x 32, DB 256 x 32
- ADC, 21x event buffer, CPU0, Port 3, Arbiter / SCSN slave
- In total 434 configuration registers
- ~130: StateM (~25), NI (~12), IRQ (64), Counters (8), Const (20)
- ~280: LUT-nonl (64), Gain corr (42), LUT-pos (128), Fil/Pre (~44)

TRAP wafer test and results

- Setup: programmable power supply, programmable sine-wave generator, parallel readout, serial configuration interface
- 576 TRAP chips/wafer
- Fully automatic partial test of the TRAP.
- Up to now produced and tested 201 wafers with ~115,000 TRAPs, of them ~86,000 usable
[Plot: yield between 75% and 100% per wafer lot: 06/02 Prep, 06/06, 06/09, 07/01, 07/03-1, 07/03-2, 25 new wafers; wafer counts shown: 25, 49, 49, 3, 25, 25]

MCM Tester and results

• test of 3x3 or 4x4 MCMs
• digital camera with pattern recognition software for precise positioning using an X-Y table
• vertical lift for contacting
• about 1 min/MCM for positioning and test
• store the result into a DB
• mark later the tested MCMs with serial Nr. and test result code
T.Blank, FZK IPE (Karlsruhe)
Results: GOOD 86%, BM 6%, BAD 4%, NM 3%, CM 1%

TRAP ADC – conversion gain variation

- Channel-channel variation on the same TRAP: +/- 2 %
- Chip-chip variation: +/- 5 %
- Essential for tracking!
- Statistics based on ~27000 TRAPs
[Plots: histograms (%, 100% = 81408) of the amplitude in %; TRAP ADC Vref (1.2 V to 1.4 V) vs. amplitude in %]

TRAP ADC – programmable conv. gain

The conversion gain can be changed by the ADCDAC parameter (5 bit) for all ADCs in the chip. This can be used to adjust the conversion gain according to the gas gain variation on the chamber. The variations within one chip can be compensated by the digital gain correction in a range of +/- 12%.
[Plots: histograms (%, 100% = 81408) of Ampl(DAC=0)/Ampl(DAC=10000b), 115.5% to 118.5%, and of Ampl(DAC=11111b)/Ampl(DAC=10000b), 86% to 88%]

TRAP ADC – summary

• The ADC reference voltage varies about +/- 5%.
• The ADC conversion gain varies about +/- 5% and corresponds to the variation of the Vref.
• The ADC conversion gain variation from channel to channel in the same TRAP is about +/- 2%.
• The relative conversion gain controlled by the ADCDAC parameter has small variation (about +/- 0.5%).
• At ADCDAC=0 the measured amplitude is 1.170 times the amplitude at ADCDAC=10000b.
• At ADCDAC=11111b the measured amplitude is 0.870 times the amplitude at ADCDAC=10000b.

Optical Readout Interface (ORI)

How to push the data out of the FEE?
- no handshaking possible
- magnetic field
- don't disturb the sensitive PASA
Data rate 240 MB/s
- because of 8b/10b encoding this means at least 2.4 Gbps
- full custom chip started, but not finished at the right time (OASE)
- simple board based on commercially available chips
- use local high stability quartz oscillator instead of LHC clock
- use custom laser driver circuit to avoid using commercial SFP modules with unspecified stability in magnetic field

Optical Readout Interface (ORI)

[Block diagram: HCM (TRAP) → 120 MHz 8 bit DDR, LVDS-TTL → CPLD (DDR to SDR, 16 bit, resynchronization, status, counters) → 125 MHz → SERDES 2.5 GBits/s → laser driver; configuration memory; I2C; latency +24 ns, +24 ns, +300 ns]
Magnetic field & radiation tolerant!
- VCSEL laser diode, 850 nm
- All 1200 produced and tested, 1199 of them fully functional

ORI – production test (laser diode)

Several parameters controlled:
- Supply currents at various operation mode conditions
- Voltages on the board in enabled/disabled state
- Source, bias and modulation currents through the laser diode
- Temperature of the laser driver chip
- Optical output power
- Photodiode current measured by the laser driver chip
[Histograms (% of ORIs, 100% = 969): Ibias in mA at optical power 200 µW; Ireset in A; Imonitor in µA at optical power 200 µW]

ORI radiation test in Oslo

No permanently damaged components. Equivalent time in years in blue, * - the device fails. Recovery after power cycle or switching off for up to 12 hours (VR and Laser Driver). The configuration of the CPLD and EEPROM was not damaged.
Component            Part                                          Equivalent time, years (* = device fails)
EEPROM               24LC01                                        30*
LVDS Transceiver     National Semiconductor DS90LV048A             30
CPLD                 Lattice LC4256V                               15-50*
Laser Driver         Linear Technology LTC5100                     100*
VCSEL Diode          ULM Photonics ULM850-02-LC-TOSA               20
Serializer           Texas Instruments TLK2501                     250*
Voltage Regulators   National Semiconductor LP3962-3.3 and -2.5    60-110*

F.Rettig

TRD Trigger Timing

Charge Cluster to Tracklet
• Local tracking units on detector perform linear fits and reject uninteresting data
♦ "Tracklet" (32-Bit word):
  • y position (origin)
  • slope (deflection)
  • z position (padrow number)
  • charge
[Figure: charge clusters over time bins with fitted deflection and origin]
Global Tracking
• Inside GTU (Global Tracking Unit)
• Objective: find high momentum tracks
• Search for tracklets belonging together
• Combine tracklets from all six layers
• Reconstruct pt, compare to threshold and generate trigger
• Constraint: only approx. 1.5 µs processing time
Trigger decision after 6 µs. Charge drift and data pre-processing use most of the time.

Online Track Reconstruction

Track Re-assembly
• Search for tracklets belonging together (3-dimensional matching task)
• Projection of tracklets to virtual plane
• Sliding window algorithm
• A track is found, if ≥ 4 tracklets from different layers inside same multidimensional window
[Figure: module profile with projected tracklets, virtual plane and track]
Reconstruction of the Transverse Momentum
• Calculate linear fit of (unprojected) y positions of tracklets.
• Estimate transverse momentum from line parameter a; uses look-up tables, additions and multiplications
[Figure: line fit y = a·x + b]

GTU in the Framework

[Block diagram of the trigger signal paths between TRD systems and other ALICE systems, inside and outside the magnet]
- TRD systems: pre-trigger; GTU with TMUs (online tracking), SMUs and TGU (trigger)
- Other ALICE systems: CTP, TTC (L0, L1, L2, …)
- Other detectors: T0, TOF, V0, ACORDE, scintillators
- Signals: pre-trigger, L0, L1; L0 contribution and busy; L1 contribution and busy; modified tracklet data
- Pre-trigger based on TOF, L1 contribution based on GTU
- Cosmic trigger

Global Tracking Unit in the TRD Readout Chain

[Block diagram of the readout chain]
- Front-end electronics within L3 magnet: supermodule, half chambers with ORI, 12 fibres
- GTU racks below muon system, 18 x GTU segment:
  - 5x TMU: RX; tracklets only to the trigger design; tracklets & raw data to the event buffering design
  - SMU: track concentrator, trigger handling & control, DCS board, TTCrx (TTC trigger signals)
  - TGU: L1 contribution + busy to the Central Trigger Processor
- DAQ system: DDL (SIU, DIU), D-RORC, DATE software stack, storage
♦ Track Matching Unit (TMU)
♦ Supermodule Unit (SMU)
♦ Trigger Generation Unit (TGU)

Event Buffering Design

[Block diagram: 12 links, each with 16/128 data stream merging; event shaper; event info FIFO; scheduling; buffer memory management with write and read addresses; SRAM controller; data formatting; TMU/SMU interface to the Supermodule Unit; SMU control (L0-/L1-trigger); status. Buffer layout: data block per link, link 0 to link 11, holding Event (n, 0) ... Event (n, 11), Event (n+1, 0) ... Event (n+1, 11), empty]
[Block diagram, continued: links from one stack; 4-Mbit SRAM; read address logic & control; readout unit; SMU control (L2-message)]

Trigger – Simulation Framework

AliRoot-based framework of simulation classes: AliEn integration, Offline-RawReader, TRAP/GTU simulation
[Block diagram of the simulation framework]
- Input of "empty" and interesting events from CASTOR / AliEn or files, via the offline raw stream reader
- TRD response model: 18 SM, 5 stacks, 6 layers, 2 half-chambers
- Half-chamber models, TRAP models
- GTU trigger model: TMU models, SMU models, TGU model
- Configuration parameters
- Output: L1 contribution, trigger efficiency/purity assessment

Trigger Schemes

♦ Di-Lepton Trigger
  • Tracking developed by Jan de Cuveland
  • Transverse momentum cuts
  • simple coincidence of "high-pt seen flags" from different supermodules
  • Optimized for minimum latency → L1 contribution
♦ Jet-Trigger
  • PhD thesis: Concept and Implementation of a Jet trigger in the GTU
  • fast selection of events (L1 contribution) based on
    • thresholds for numbers of high-pt tracks within certain space regions/angular ranges
    • charge sums (if available) within certain space regions/angular ranges
  • more complex calculations (invariant mass, …) and correlations are currently under consideration → idea of L2 contribution by GTU
♦ PhD theses on GTU trigger at KIP: S. Kirsch (A-A), F. Rettig (pp)

Track Matching Unit (TMU)

• Trigger design currently optimized for high pt cut only
• Simple di-lepton trigger foreseen in trigger unit (based on 'binary' local decisions)

Track Matching Unit (TMU) Board (6U Height, Single Width)

[Annotated board photo]
- Virtex-4 FX100 FPGA (Xilinx XC4VFX100 FF1152): 95k LCs, 768 I/Os, 20 internal multi-gigabit serializer/deserializer units, 2 PowerPC cores
- 12 fiber optical serial links (2.5 GBit/s) from 1 detector stack, 850 nm SFP transceivers
- DDR2 SRAM: high bandwidth (28.8 GBit/s) data buffer
- On-board memory: DDR2 SDRAM 64 MB, DDR2 SRAM 4 MB
- 3 parallel links (240 MHz DDR, 8 bit LVDS): to right TMU, from left TMU, to SMU board
- Custom LVDS I/O, 72 pairs
- CompactPCI bus, JTAG, FPGA config PROM

TRD Beam Test at CERN (2007-11)

1 TRD supermodule, 1 GTU segment
♦ Accelerator: CERN Proton Synchrotron (PS)
♦ Particles: electrons, pions (transverse momenta: 0.5 – 6 GeV/c)
♦ Good statistics for detector calibration (more than 1 million events per momentum value)
♦ 8 days of continuous operation
♦ First run with tracklets, consistent with raw data
[Photo: November 2007 beam test setup at CERN Proton Synchrotron (not to scale)]
[Plot: first tracklets from TRD, single tracklet deflection precision; count vs. deflection error / cm; mean: -0.0916 cm, RMS: 0.119 cm]

Commissioning at CERN / ALICE

2008 Cosmics Runs
♦ Read-out chain running continuously
♦ Writing cosmics data to DAQ via GTU
♦ 60,000 events with pseudo-random test pattern: no errors
♦ Measured TRD readout timing: fully able to saturate DAQ link
TRD Readout Timing
• Measured readout time for maximum black event (30 time bins):
  • L0 → event at GTU: 245 µs (4 kHz)
  • L0 → DDL Tx done: ~ 13 ms (~ 75 Hz)
  • Divide times by a factor of 5–500 for zero-suppressed data
• DDL performance: 120 Mbyte/s
• DDL utilization: 99 %
[Photo: GTU installation at CERN (end of 2007, work in progress)]

Summary GTU

• The TRD Global Tracking Unit (GTU) is the central element of the ALICE TRD trigger system and readout chain
• It provides both complex online data analysis within tight low-latency requirements and high bandwidth raw data handling
• The GTU system has been fully installed
• Commissioning and extensive read-out and interoperability tests have been performed successfully
• Next steps
  • Further tests of the online tracking and trigger functionality
  • Preparations for first LHC operation in 2008
[Photo: a GTU segment in operation]

Questions?
SPARES

A Large Ion Collider Experiment

♦ p-p or Pb-Pb collision
♦ Creation of quark gluon plasma, a state of matter existing during the first few microseconds after the big bang
♦ TRD is used as a trigger detector due to its fast readout time (2 µs):
  • Transverse momentum
  • Electron/pion separation
[Figure: ALICE with Inner Tracking System (ITS), Time Projection Chamber (TPC) and Transition Radiation Detector (TRD)]

Transition Radiation Detector

TRD - Transition Radiation Detector
- used as trigger and tracking detector
- more than 24000 particles / interaction in acceptance of detector
- up to 8000 charged particles within the TRD
- trigger task is to find specific particle pairs within 6 µs
ITS - Inner Tracking System
- event trigger
- vertex detection
TPC - Time Projection Chamber
- high resolution tracking detector
- but too slow for 8000 collisions / second

Tail Cancellation Filter

[Equation residue: time response function TRF(t) with a tail term; three parameters with subscripts L, L and S]
[Block diagram: unfiltered input (12 bit) → sub, add, register and mult stages with 12 to 15 bit data paths → filtered output (12 bit); long exponential coefficients and short exponential coefficients]

The TRAP chip bonded on the MCM

[Photo]

TRAP Layout

[Chip layout, 5 x 7 mm]
- 21 ADC channels
- 21 independent digital filter channels
- Event buffers: 13 channels and 8 channels
- Data buffer
- CPU 0 to CPU 3, quad port memory, GRF
- IMEM 0 to IMEM 3
- Network interface FIFO

Delay unit

- NI_out(9:0) + strobe
- del[2:0] per bit, delay stages of ~860 ps, ~1720 ps and ~3440 ps
[Plot: delay (ns) vs. programmed delay (0..7) for out(0), out(4) and simulation; experimental setup with 50 cm tp cable]

Tail Cancellation Optimization

[Equation residue: TRF(t) as on the Tail Cancellation Filter slide]
- Average signal before tail cancellation and average signal after tail cancellation
- Mean: mean signal during drift time
- Decay: signal decay after drift time
- Minimization of the error measure M, built from Decay and Mean, each a function of the three tail parameters (subscripts L, L, S)

Tracking Arithmetics

[Data flow: hit selection, timer, pedestal subtraction of Q_{i-1}(t), Q_i(t), Q_{i+1}(t), look-up table]
Accumulated into the fit register file: +1, X(t), X²(t), Y(t), Y²(t), X(t)·Y(t), Q(t)

DCS board, slow control in ALICE

- ARM + FPGA, embedded Linux
- Ethernet 10
- TTC optical receiver
- ADCs, temperature
- TRAP configuration, SCSN interface

The MIMD Architecture

- Four RISC CPUs
- Coupled by registers (GRF) and quad ported data memory
- Register coupling to the preprocessor
- Global bus for periphery
- Local busses for communication, event buffer read and direct ADC read
- I-MEM: 4 single ported SRAMs
- Serial interface for configuration
- IRQ controller for each CPU
- Counter/Timer/PsRG for each CPU and one on the global bus
- Low power design, CPU clocks gated individually
[Block diagram: CPU 0 to CPU 3, each with IMEM and local bus; GRF; FIT bus; constants; interrupt; counter/timer; configuration; network interface; event buffer; D-MEM]

TRAP internal configuration

[Diagram: IM0 and further instruction memories 4 x 4k x 24, DM 1k x 32, DB 256 x 32, CPU0, global bus, Arbiter / SCSN slave]
- ~120: StateM (~25), NI (~12), IRQ (64), Counters (8), Const (20)
- ~280: LUT-nonl (64), Gain corr (42), LUT-pos (128), Fil/Pre (~44)

The TRAP1 chip bonded on the MCM

[Photo with numbered blocks and on-chip FuseID]
1. ADC
2. Filter/Preprocessor
3. DMEM
4. CPUs
5. Network Interface
6. IMEM

Wafer test – bad contacts

[Figure: two probe layouts of the 5 mm x 7 mm die with the ADC side marked; needle counts shown: 26, 40, 40, 29, 42, 39]
Balance!
[Figure, continued: needle counts 20, 26; label: number of needles]

Integration equations, simple but…

♦ PASA + TRAP = MCM
♦ 17 or 18 MCMs connected properly = ROB (readout board)
♦ 6 or 8 ROBs + 1 ORI + 1 DCS connected properly = module (chamber) electronics
♦ 5 modules in one layer = one layer of the supermodule
♦ x 6 layers = supermodule!
♦ 18 supermodules = TRD!

Integration prototype, Hd 2004

- MCM = PASA + TRAP
- DCS board: trigger & clock distribution, ARM CPU + FPGA, embedded Linux, Ethernet, serial link to the TRAPs …
- Detector prototype prepared for the beam test at CERN
[Photo]

Supermodule I in Heidelberg, 2006

[Photo]

Installation of the first Supermodule

[Photo]

TMU - Momentum Reconstruction

• Estimate transverse momentum from line parameter a: p_t = const / a
• Fixed point arithmetic, look-up tables, additions and multiplications only
• Fast cut condition for trigger: compare a with const / p_t,min (full calculation of p_t too slow)

TMU - Track Re-assembly

[Figure: projected tracklets and track]
• Search for tracklets belonging together (3-dimensional matching task)
• Projection of tracklets to virtual plane
• Sliding window algorithm
• A track is found, if ≥ 4 tracklets from different layers inside same multidimensional window
• Algorithm highly optimized for hardware primitives available
• only quantities really needed are computed!
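The track re-assembly steps above can be illustrated with a minimal software sketch. This is not the GTU firmware: it reduces the multidimensional window to one dimension (projected y only), and the `Tracklet` fields, `project`, `find_tracks`, the window width and the layer positions are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Tracklet:
    layer: int    # detector layer, 0..5
    x: float      # radial position of the layer (assumed units: cm)
    y: float      # y position (origin) measured in that layer
    slope: float  # deflection dy/dx


def project(t: Tracklet, x_plane: float) -> float:
    """Extrapolate a tracklet along its own slope to the virtual plane."""
    return t.y + t.slope * (x_plane - t.x)


def find_tracks(tracklets: List[Tracklet], x_plane: float,
                window: float, min_layers: int = 4) -> List[List[Tracklet]]:
    """Slide a window of width `window` over the projected y positions.

    Assumption: one-dimensional matching only; the real task is
    three-dimensional.
    """
    projected = sorted(((project(t, x_plane), t) for t in tracklets),
                       key=lambda p: p[0])
    tracks: List[List[Tracklet]] = []
    i = 0
    while i < len(projected):
        y_start = projected[i][0]
        j = i
        by_layer: Dict[int, Tracklet] = {}
        # Collect everything inside [y_start, y_start + window],
        # keeping the first tracklet seen in each layer.
        while j < len(projected) and projected[j][0] - y_start <= window:
            by_layer.setdefault(projected[j][1].layer, projected[j][1])
            j += 1
        if len(by_layer) >= min_layers:
            tracks.append([by_layer[k] for k in sorted(by_layer)])
            i = j      # consume the tracklets of this window
        else:
            i += 1     # move the window start to the next tracklet
    return tracks
```

Sorting by projected position keeps the search linear after the sort, which loosely mirrors why a sliding window suits a hardware pipeline; the requirement of at least 4 different layers is the only condition taken directly from the slide.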
GTU: Hardware Production / Installation

♦ 5 TMU boards and 1 SMU board per supermodule
♦ All components produced and installed
♦ 4 types of PCBs
♦ In total: 108 FPGA nodes
• TMU/SMU hardware: 130 PCBs (90 TMUs + 18 SMUs + 1 TGU + spares)
• Main GTU (LVDS) backplane: 18 PCBs (+ spares)
• Trigger concentrator backplane: 1 PCB (+ spare)
• CTP interface module: 1 PCB (+ spare)
• 9 Wiener cPCI crates
• Mechanical components, custom front panels, ...
[Photo: one of the 18 GTU segments, labeled: TTC input, 60 fiber optical links, FPGA/PROM programming and control via Ethernet, uplink to data acquisition]

Cosmic Trigger – Current Selection Criteria

Currently used:
• TRAP cosmic mode: hit detection, fit selection and dedicated tracklet building
• TMU: upper limit on charge/hits for selection of individual tracklets → sort out perturbations like hot pads, high amplitude border noise and 600 kHz oscillations
• SMU: threshold on charge/hit sums for each chamber, currently nearly open (approx. one true tracklet)
• SMU: threshold on number of "active" layers per stack, currently 4

SMU Concentrator Board (6U Height, Double Width)

[Annotated board photo]
- FPGA: Xilinx XC4VFX100 FF1152, config PROM, JTAG
- On-board memory: DDR2 SDRAM 64 MB, DDR2 SRAM 4 MB
- 5 parallel links (120 MHz DDR, 8 bit LVDS): from TMU 0 to TMU 4
- Custom LVDS I/O, 72 pairs; trigger out
- CompactPCI bus
- 850 nm SFP transceiver
- Ethernet (RJ 45): system configuration and control; Detector Control System (DCS) board
- ALICE Timing, Trigger & Control input (TTC)
- ALICE Detector Data Link (DDL): DDL source interface unit (SIU)

Cosmic Trigger – Nice Examples for Tuning

[Event displays]

Cosmic Trigger – First Results

[Event displays]

Cosmic Trigger – First Results

[Event displays: ev 18, ev 18, ev 20]

Cosmic Trigger – Additional Trigger Criteria

As soon as TOF delivers proper L0 trigger or pre-trigger input:
• Stronger settings for all thresholds/limits, e.g. requirement of 5 active layers/stack
• Coincidence between supermodules

Cosmic Trigger – TOF Pre-trigger

TOF coincidence for pre-trigger box input
Picture by MinJung, TRD Annual Meeting 2008

Cosmic Trigger – MCM Level

Modified tracklets for cosmics trigger: no tracklet parameters, instead…
- Tracklet word with four 8 bit fields: scaled charge sum of the relevant MCM over all time bins and channels, number of hits, and two fields of configuration parameters
[Equation residue: scaling of the charge sum, involving N, 255 and 360.000]
Details: Presentation by Venelin Angelov, 30.06.2008

Cosmic Trigger – Chamber Level (TMU)

Summation of tracklet numbers, charge and hit numbers over all tracklets from both half-chamber links of each chamber.
Threshold conditions for trigger:
• Charge summed up over all tracklets per chamber
• Hits summed up over all tracklets per chamber
• Number of tracklets per chamber (normal tracklets)
[Diagram: six chambers, each with two half-chamber (HC) links, thresholds and a trigger output]

Cosmic Trigger – Stack Level (SMU)

Trigger condition: number of "active" chambers per stack
Optionally: minimum/maximum limit for number of stacks hit
[Diagram: 5 stacks of 6 chambers (C), thresholds and one trigger output per stack]

Cosmic Trigger – Supermodule Level (TGU)

Trigger conditions:
♦ single supermodule
♦ one-to-many correlations between supermodules
Optionally:
♦ "same-side veto" in one-to-many correlation
Events like shown above may be very rare → experience, rates? Such events rejected by TOF-/ACORDE-coincidence settings?
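The three cosmic trigger levels described above (chamber, stack, supermodule) can be sketched as nested threshold checks. This is only an illustration under stated assumptions: the class and function names, the threshold values, the rule that all three chamber thresholds must pass together, and the `partners` map for one-to-many correlations are hypothetical and not taken from the GTU design.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class ChamberSums:
    charge: int     # charge summed over all tracklets of the chamber
    hits: int       # hits summed over all tracklets of the chamber
    tracklets: int  # number of tracklets from both half-chamber links


@dataclass
class Thresholds:   # hypothetical parameter set
    min_charge: int
    min_hits: int
    min_tracklets: int
    min_active_chambers: int = 4


def chamber_active(s: ChamberSums, th: Thresholds) -> bool:
    """Chamber level (TMU). Assumption: all three thresholds must pass."""
    return (s.charge >= th.min_charge and s.hits >= th.min_hits
            and s.tracklets >= th.min_tracklets)


def stack_trigger(chambers: Sequence[ChamberSums], th: Thresholds) -> bool:
    """Stack level (SMU): count the "active" chambers of one stack."""
    active = sum(chamber_active(c, th) for c in chambers)
    return active >= th.min_active_chambers


def supermodule_trigger(stacks: Sequence[Sequence[ChamberSums]],
                        th: Thresholds) -> bool:
    """A supermodule fires if any of its stacks fires."""
    return any(stack_trigger(s, th) for s in stacks)


def tgu_trigger(sm_fired: Sequence[bool],
                partners: Optional[Dict[int, Sequence[int]]] = None) -> bool:
    """Supermodule level (TGU).

    Without `partners`: single-supermodule condition.
    With `partners`: one-to-many correlation, i.e. supermodule i and at
    least one of its listed partner supermodules must both have fired.
    """
    if partners is None:
        return any(sm_fired)
    return any(sm_fired[i] and any(sm_fired[j] for j in others)
               for i, others in partners.items())
```

Raising `min_active_chambers` from 4 to 5 corresponds to the stronger setting mentioned for operation with a proper TOF pre-trigger; a "same-side veto" would be an extra condition on the `partners` map and is left out here.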
Cosmic Trigger – Next Steps

♦ Precise measurements of all contributions to latency in pretrigger, tracklet calculation and GTU tracking/trigger needed
♦ Increase the number of time bins with pre-trigger in operation
♦ Shift CTP L1 contribution time window to 6.2 µs?
♦ Need for additional cosmic trigger schemes?
♦ Update Münster setup
  • benchmarking of cosmic trigger performance with Münster setup, L1 trigger decision recorded in SMU header → offline comparison: easy to determine purity and efficiency
  • use L1 contribution as trigger too? → adaptation of setup to L1 contribution and generate L1a/r

Cosmic Trigger – Bad Examples for Tuning

[Event displays]

ORI - production test (data path)

Independently tested the following interfaces and components:
• I2C interface between the TRAP chips on the readout board and the configuration memory of the laser driver chip
• I2C interface between the TRAP chips on the readout board and the configuration and status registers in the CPLD
• JTAG to the CPLD
• 120 MHz DDR parallel data between the half-chamber merger TRAP and the CPLD on ORI
• Data word and parity error counters in the CPLD
• Optical data transmission using a receiver module mounted on a PCI card
The optimal configuration for two optical power settings, 200 and 300 µW, is stored together with all other test data in a database.