A Full Mesh ATCA-Based General Purpose Data Processing Board

- Designed for silicon based track trigger -
奥村恭幸
(OKUMURA, Yasuyuki)
University of Chicago / Fermilab
Ted Liu, Jamieson Olsen, Hang Yin
1/15
Tracking Trigger @ high luminosity
•  The high-luminosity LHC is a requisite approach
•  A track trigger is a key technique for maintaining an efficient
trigger at high luminosity (80 collisions every 25 ns)
Beam dispersion σz ~ 10 cm
2/15
Data Formatting challenges
in silicon based track trigger
Technical challenges in data formatting prior to tracking
–  The tracking trigger must be implemented as parallel processors
(trigger η-φ towers), and the detected hits must be sent to the
appropriate processors (more than 100k hits / event)
•  In the ATLAS Fast Tracker (FTK) case, which uses the existing silicon
detectors, neither the detector nor the readout is designed for trigger η-φ towers
–  A huge number of hits has to be shared among the processors to
avoid inefficiency in the boundary regions
[Diagram labels: routing and copying of hits to parallel processing units (Tracking Processors)]
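The routing-and-copying requirement above can be illustrated with a minimal sketch. It assumes a regular grid of η-φ towers whose acceptance is widened by a fixed overlap margin, so that a hit near a boundary is copied to every neighbouring tower. The function name, the grid size (4 x 16), the η range, and the overlap value are illustrative assumptions, not the actual FTK tower definition.

```python
import math
from typing import List, Tuple


def towers_for_hit(eta: float, phi: float,
                   n_eta: int = 4, n_phi: int = 16,
                   eta_max: float = 2.5, overlap: float = 0.05
                   ) -> List[Tuple[int, int]]:
    """Return (i_eta, i_phi) for every tower whose widened region holds the hit.

    All default values are hypothetical; the same overlap margin is used in
    eta and in phi purely to keep the sketch short.
    """
    d_eta = 2.0 * eta_max / n_eta
    d_phi = 2.0 * math.pi / n_phi

    # Towers along eta: plain intervals widened by the overlap on both sides.
    eta_idx = []
    for i in range(n_eta):
        lo = -eta_max + i * d_eta - overlap
        hi = lo + d_eta + 2.0 * overlap
        if lo <= eta < hi:
            eta_idx.append(i)

    # Towers along phi: compare wrapped angular distance to the tower centre.
    phi_idx = []
    for j in range(n_phi):
        centre = (j + 0.5) * d_phi
        delta = (phi - centre + math.pi) % (2.0 * math.pi) - math.pi
        if abs(delta) <= 0.5 * d_phi + overlap:
            phi_idx.append(j)

    # A hit in an overlap region is copied to every matching tower.
    return [(i, j) for i in eta_idx for j in phi_idx]


print(towers_for_hit(0.6, 0.2))  # hit well inside one tower
print(towers_for_hit(0.0, 0.2))  # hit on an eta boundary: copied to two towers
```

A hit inside a tower is routed to one processor, while a hit inside the overlap margin is returned for both neighbours, which is the copying that drives the connectivity requirement between data formatter boards.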
3/15
ATLAS FTK data formatter
Hit sharing between ATLAS FTK η-φ towers
•  Required connectivity in the ATLAS FTK data formatter case
Full mesh network topology is the natural solution
4/15
Full mesh ATCA
•  The Advanced Telecommunications Computing
Architecture (AdvancedTCA / ATCA)
backplane supports a full mesh network
topology at high speed
Network topology of a 14-board system in full mesh
[Figure label: Zone 2 connector for full mesh]
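As a small worked example of what "full mesh" means for a 14-board system, the sketch below enumerates every board pair. The slot numbering from 1 to 14 and the function name are assumptions for illustration only.

```python
from itertools import combinations
from typing import List, Tuple


def full_mesh_links(n_boards: int) -> List[Tuple[int, int]]:
    """List every unordered pair of boards; slots are numbered 1..n_boards."""
    return list(combinations(range(1, n_boards + 1), 2))


links = full_mesh_links(14)
# Each board has a direct channel to each of the other 13 boards.
peers_of_slot_1 = [b for a, b in links if a == 1]
print(len(links), len(peers_of_slot_1))
```

In a full mesh of n boards there are n(n-1)/2 point-to-point connections, so a hit can reach any other board in one hop without passing through a central switch.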
5/15
Board design concept
•  Two FPGAs (running up to ~10 Gb/s)
•  They are interfaced to mezzanine cards (FMC), a Rear
Transition Module (RTM), and the ATCA full mesh backplane
[Block diagram labels: local bus, ATCA backplane, RTM, optical fiber (inter-crate)]
6/15
Prototype (Pulsar 2a) design
[Block diagram labels: Mezzanine Interface, Zone 3 RTM, Local Bus, Zone 2 Full Mesh, GTX, LVDS]
7/15
First prototype board
[Photo labels: test mezzanine cards (two), top FPGA, bottom FPGA, Zone 1, Zone 2, Zone 3, RTM, mini backplane]
Figure 16: Prototype DF with its RTM and two test mezzanine cards.
8/15
Very first prototype works well
•  High speed GTX lines w/ loopback modules
–  Local bus (BER < 1.4 x 10^-15 @ 10 Gb/s)
–  ATCA fabric channels (BER < 4.2 x 10^-17 @ 6.25 Gb/s)
–  RTM channel (BER < 8.3 x 10^-17 @ 6.25 Gb/s)
•  Mezzanine interface
–  400 Mb/s per LVDS pair
•  13.6 Gb/s per mezzanine
•  Crate level operation
–  Control with network switch (hub board)
–  Full mesh backplane test
[Photo labels: RTM, mezzanine card, ATCA fabric, local bus]
[Figure watermark: DRAFT, May 3, 2013 – 21:57]
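To give a feeling for what a BER limit of this size implies, here is a rough sketch of the standard zero-error estimate: if N bits are transferred without any error, the upper limit at confidence level CL is BER < -ln(1 - CL) / N. The slides do not state which confidence level or how many parallel lanes were used, so the 95% value and the single-lane default below are assumptions.

```python
import math


def bits_needed(ber_limit: float, cl: float = 0.95) -> float:
    """Error-free bits required to claim BER < ber_limit at confidence cl."""
    return -math.log(1.0 - cl) / ber_limit


def hours_of_running(ber_limit: float, line_rate_gbps: float,
                     n_lanes: int = 1, cl: float = 0.95) -> float:
    """Error-free running time in hours, summing bits over n_lanes lanes.

    n_lanes and cl are assumptions; the source quotes only the limits.
    """
    seconds = bits_needed(ber_limit, cl) / (line_rate_gbps * 1e9 * n_lanes)
    return seconds / 3600.0


# Local bus limit quoted on the slide: BER < 1.4e-15 at 10 Gb/s.
print(f"{hours_of_running(1.4e-15, 10.0):.1f} h on a single lane")
```

Under these assumptions the local bus limit corresponds to a few days of error-free running on one lane; summing the bits over several lanes tested in parallel shortens the time in proportion.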
9/15
DF Board Zone2/3 (6.25Gb/s)
[Eye diagrams at 6.25 Gb/s: RTM Zone 3 and ATCA Zone 2; axis labels: voltage offset, sample point]

allowing further optimization of the routing algorithm and improvement in the bandwidth use in the system. The current data-driven estimate does not include the IBL, which will be added in the future after final decisions are made for the IBL module-to-fiber mapping.

4.5 AUX

Pattern recognition and first stage track fitting are done in the Processor Units (PU) in the core crates. Each PU consists of an AMB VME card with a large auxiliary board (AUX) behind it. The AMB is described in section 4.6.

4.5.1 AUX functionality

The AUX receives hits from the DFs for the 8 silicon layers used in the first stage of track reconstruction. It stores the hits in the Data Organizer (DO), a smart database built on the fly that allows rapid retrieval of hits in a road, and sends the hits to the AMB with coarser resolution appropriate to pattern recognition (superstrips or SS). When the AMB finds a road with hits on at least 7 of the 8 layers, the road number is sent to the AUX which then retrieves all of the hits in the road. Rapid hit retrieval is possible because hits are stored in the DO by SS address, and a road consists of a single SS in each layer. The hits, the road number, and the sector number are transferred to the track fitter (TF).

The TF calculates the χ² for each combination of one hit per layer in the road (Nominal fit) using linear constants stored by sector number. The same constants are used when there are hits in only 7 of the 8 layers (Majority fit) by calculating the hit location in the missing layer that would minimize the overall χ². Any combination with χ² < χ²_cut is sent out of the TF. If a combination with hits on all 8 layers has a χ² above χ²_cut but below χ²_high, that track candidate is refit 8 times with one hit dropped each time in order to allow for detector inefficiency and picking up a random hit (Recovery fit). If a refit satisfies the goodness of fit requirement, it is accepted.

The tracks forwarded from the TF go to the Hit Warrior (HW) function for duplicate track removal. If two tracks in the same road share more than a programmable number of hits, only the higher quality track is kept. The quality is defined based on the χ² and which silicon layers have a hit. Tracks exiting
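The accept / recovery-fit decision described in the track fitter text can be sketched as follows. This is only an illustration of the control flow: the fit itself is passed in as a hypothetical callable, the toy χ² used in the example is not the linear fit with per-sector constants, and choosing the lowest-χ² refit when several pass is an assumption that the text does not specify.

```python
from typing import Callable, List, Optional, Sequence

Hits = Sequence[Optional[float]]  # one entry per layer; None marks a dropped layer


def select_candidate(hits: Hits,
                     fit_chi2: Callable[[Hits], float],
                     chi2_cut: float,
                     chi2_high: float) -> Optional[List[Optional[float]]]:
    """Return the accepted hit combination, or None if it is rejected."""
    chi2 = fit_chi2(hits)
    if chi2 < chi2_cut:
        return list(hits)  # passes the goodness-of-fit requirement directly

    all_layers_hit = all(h is not None for h in hits)
    if all_layers_hit and chi2 < chi2_high:
        # Recovery fit: refit once per layer with that layer's hit dropped.
        best, best_chi2 = None, chi2_cut
        for i in range(len(hits)):
            trial = list(hits)
            trial[i] = None
            trial_chi2 = fit_chi2(trial)
            if trial_chi2 < best_chi2:  # assumption: keep the best passing refit
                best, best_chi2 = trial, trial_chi2
        return best
    return None


def toy_chi2(hits: Hits) -> float:
    """Toy stand-in: each hit is a residual; chi2 is the sum of squares."""
    return sum(h * h for h in hits if h is not None)


# Seven good hits plus one outlier in the last layer.
print(select_candidate([0.1] * 7 + [3.0], toy_chi2, chi2_cut=1.0, chi2_high=20.0))
```

With the toy inputs, the outlier hit is the one that ends up dropped, which mirrors the stated purpose of the recovery fit: tolerating one random hit picked up in place of a missing real one.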
10/15
DF local bus vs KC705 @ 10Gb/s
[Eye diagrams at 10 Gb/s: reference (Xilinx Kintex-7 Evaluation Kit) and DF board local bus; figure note: w/ SFP+ 0 dB loopback adapter; axis labels: voltage offset, sample point]
11/15
Mezzanine interface & Test mezzanine card
•  LVDS (400 Mb/s/pair) through FMC
connector to main board
–  Double Data Rate (DDR) running at
200 MHz shared between mezzanine
& main board
–  In total 13.6 Gb/s with 34 pairs
•  Four SFP+ I/O GTX lines tested at
6.25 Gb/s
•  Pattern finding associative memory
chip capability
–  FPGA + DDR3
[Photo labels: Pattern Recognition Associative Memory chip socket, 4 high speed lines (by GTX), FPGA, DDR3, FMC connector]
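The mezzanine bandwidth figures quoted above follow from simple arithmetic: a 200 MHz clock with double data rate carries 2 bits per clock per pair, and 34 pairs add up to the total. The short sketch below only reproduces that arithmetic; the function name is hypothetical.

```python
def lvds_bandwidth_gbps(clock_mhz: float = 200.0,
                        bits_per_clock: int = 2,
                        n_pairs: int = 34) -> float:
    """Aggregate LVDS bandwidth in Gb/s for DDR signalling on n_pairs pairs."""
    per_pair_mbps = clock_mhz * bits_per_clock  # DDR: one bit on each clock edge
    return per_pair_mbps * n_pairs / 1000.0


print(lvds_bandwidth_gbps())  # 200 MHz x 2 x 34 pairs, in Gb/s
```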
12/15
Successful full crate testing
•  All the GTX lines activated at 6.25 Gb/s
on 7 boards in one crate
[Photo labels: hub board, 7 Pulsar 2a boards in slots #4 to #10, Ethernet connection]
13/15
Pulsar 2b
Pulsar 2b board
•  Virtex-7 FPGA
•  80 GTH lines are
implemented in total
(cf. 32 GTX in Pulsar 2a)
•  Improved design, based on
the valuable experience
gained with the prototype
14/15
Summary
•  An ATCA general purpose board has been designed
–  High speed FPGAs interfaced to various I/O & internal
communication lines
•  Our first ATCA prototype works
well at speeds up to 10 Gb/s
•  We tested the high speed performance in detail
•  The Pulsar 2b design is a work in progress
–  The first application will be the ATLAS Fast TracKer
(from 2015)
15/15