Data Driven Front-End Architectures David Christian Fermilab

advertisement
Data Driven Front-End
Architectures
David Christian
Fermilab
January 12, 2007
LHCb Upgrade Workshop, Edinburgh
1
Evolution of data acquisition systems in
response to higher data rates
• DAQ Computer reads out contents of every
channel of every front-end module.
• CPU reads out modules which say that they
have data.
• Front-end modules digitize and zero-suppress
data – CPU reads out compressed, formatted
data.
• Front-end modules digitize, zero-suppress, etc.
AND send data to next processing level, all
without instruction from a central control CPU --(data driven).
2
Example – BTeV Pixels
• Pixel detector front-end electronics
performs three functions:
– Signal processing (amplify, discriminate,
digitize).
– Zero suppression (pixels make pattern
reconstruction easy because occupancy is
very low Æ data is very sparse).
– Readout.
• This presentation focuses on zero
suppression and readout.
3
BTeV Pixel chip (FPIX2)
• 0.25μ CMOS (rad hard).
• 128 rows x 22 columns – 50μ x 400μ pixels.
• Data driven, zero-suppressed readout – no
trigger; every hit is read out.
• Readout off chip is point-to-point, using a
configurable number of 140 Mbs serial links
(1,2,4, or 6).
• All I/O is LVDS (Low Voltage Differential
Signaling).
4
FPIX2 Block Diagram
Pixel Unit Cells (22 columns of 128 rows each)
Fabricated by TSMC
(through MOSIS).
FPIX2 2003
FPIX2.1 2005.
Core
End-of-Column logic (22 copies)
Core Logic
DAC’s
Clock
Control
Logic
Next
Word
Block
Data Output Interface
Programmable Registers
Word Serializer
Programming Interface
BCO Clock
Steering Logic
Serial Clock
Input/Output
High Speed Output
5
FPIX2 Layout
Debugging
Outputs
Pixel array
+ End-of-Column
Logic + ε = “Core”
Registers
and DAC’s
Command
Interface
Internal bond
pads for Chip ID
128x22
Pixel array
End-of-Column
Logic
Data Output
Interface
LVDS Drivers
and I/O pads
6
Pixel Cells (four 50 x 400 μm cells)
12 µm bump pads
Preamp 2nd stage
+disc
ADC
Kill/
inject
ADC
encoder
Digital interface
7
Readout details – column
•
•
•
•
I will illustrate FPIX2 operation by focusing first at the base of a
column of pixels.
Each column contains four “command sets” and can hold data
associated with four separate beam crossings.
Command set actions are determined by two state machines,
one which changes state on Beam Cross Over clock edges
(1/132 ns), and one which changes on Readout clock edges
(max frequency = 35 MHz = 1/29 ns).
EOC logic is constructed so that the BCO clock & Readout clock
need not be related to one another, either in frequency or in
phase.
8
No BCO# latched yet
L–0
E–0
E–0
E–0
NTS
(at bco clock edge)
Pixel(s) are hit
(at bco clock edge)
L–0
NTS
F – #1
NTS
E–0
E–0
E–0
NTS
NTS
NTS
NTS
L–0
E–0
E–0
NTS
NTS
NTS
NTS
(at readout clock edge)
(at readout clock edge, after
horiz. token has arrived)
(token)
O – #1
NTS
O – #1
STS
O – #1
Talking
L–0
E–0
E–0
NTS
NTS
NTS
NTS
L–0
E–0
E–0
NTS
STS
NTS
NTS
L–0
E–0
E–0
NTS
Talking
NTS
NTS
States: (Empty/Listen/Full/Output) & (Nothing To Say/Something to Say/Talking/Silent) 9
Priority logic insures only 1 command set at a time in Listen or Output.
Core Readout
•
•
•
•
Readout is controlled by two sets of tokens, horizontal &
vertical.
The vertical tokens are launched (independently) when a
command set enters the “Output” state. They move again as
data is driven off of a pixel.
When any EOC command set enters the “Something to Say”
state, the core logic launches the horizontal token on the next
readout clock edge (if it is not already active).
When the token arrives at a column with something to say, the
command set in STS goes to “Talking” at the next readout clock
edge, and data is driven onto the Core output bus on this
readout cycle.
10
Core Readout (continued)
•
•
When the vertical token arrives at the last hit pixel associated
with the command set in the “Output” state, the horizontal token
is released and stops at the next column with something to say.
This insures that there will not be an empty readout cycle
between the last hit in one column and the first hit in the next
column.
When the horizontal token passes the last column, the core
goes silent for one readout cycle. Another readout cycle is
required to relaunch the horizontal token, so there are always
two empty readout cycles every time the horizontal token
sweeps past all 22 columns. These are the only “wasted”
readout cycles.
11
Data Output
•
Data is driven off of a hit pixel onto the Core output bus, which is
23 bits wide. The data word consists of the information
generated in the pixel unit cell (7 bit row number, and 3 bit ADC
value), plus a 5 bit column number and an 8 bit BCO number,
which are added by the end of column logic.
•
The Data Output Interface latches data from the Core output bus
on the falling edge of the readout clock, serializes the data, and
drives it off chip.
12
Output Data Format
b23 b22 b21 b20 b19 b18 b17 b16 b15 b14 0
Status
0
0
0 0
0
0
Sync
0
0
0
0 0
0 1
Word Mark
b23 b22 b21 b20 b19 b18 b17 b16 b15 b14 b13 b12 b11 b10 b09 b08 b07 b06 b05 b04 b03 b02 b01 1
Row Number
Column
BCO
ADC
Word Mark
• Five bits are used to encode 22 columns. The column numbering
scheme has no column number ending in 00. This ensures that a data
word can never have 0’s in b01 – b13. This feature distinguishes a data
word from a sync/status word.
•Synchronization between the FPIX2 and the Pixel Data Combiner Board
is established and maintained using the “sync/status” word. Whenever no
data is available for output, the FPIX2 transmits the sync/status word. At
least two sync/status words are guaranteed to be output every time the
column number decreases. In addition, 23 bit hit data is transferred using
a 24 bit word. The PDCB uses the word mark bit as a sync check on every
word transfer.
13
Programming Interface
•
•
•
•
Each FPIX2 has a chip id, which is set by wire bonds on internal
bond pads.
I/O is bussed on three pairs of lines: shift control, shift in, shift out.
I/O is synchronous – clocked by the BCO clock.
Commands can be addressed to a single chip, or broadcast to all
chips on the bus.
Shift Control
BCO
Shift In
C5 C4
C3
C2
C1 R5
R4
R3 R2
R1
I3
I2
I1
1
Chip ID
Register #
Instruction
14
Programming Interface
Instructions
•
•
•
•
•
Write (followed by 2, 8, or 2816 bits of data)
Set (all bits in register = 1)
Read
Reset (all bits in register = 0)
Default (set register to default value)
15
Registers and DAC’s
•
•
•
•
23 of 32 possible registers are used.
14 are 8 bit registers that control Digital to Analog Converters –
used to set bias currents and voltages, and comparator
thresholds.
2 are serpentine registers (kill and inject) running up and down
the pixel columns, with 1 bit in each pixel.
6 control facets of chip operation (# of output pairs, BCO sync
check, SendData, RejectHits, Core Reset, Programming Reset).
16
Recap: FPIX2 Block Diagram
Pixel Unit Cells (22 columns of 128 rows each)
Core
End-of-Column logic (22 copies)
Core Logic
DAC’s
Clock
Control
Logic
Next
Word
Block
Data Output Interface
Programmable Registers
Word Serializer
Programming Interface
BCO Clock
Steering Logic
Serial Clock
Input/Output
High Speed Output
17
Outlook
•
FPIX2 serial output is slow compared to “modern” commercial
electronics – 1.5 Gbps is common for relatively short copper links
(x10 performance boost is possible).
•
FPIX2 zero suppression scheme could probably can be made 10
times faster, but only by adding complexity (grouping pixels within a
column & adding another level of token, for example) or reducing the
number of pixels per readout chip.
•
Signal processing must be optimized:
– Time between crossings?
– Maximum leakage current?
– ADC? # of bits?
•
Radiation environment must be understood:
– Total dose.
– Mitigation of single event upsets.
18
Backup slides
19
Data Driven Architectures
OP
OP
OP
Pipeline
OP
OP
Parallel
OP
20
Data Driven Architectures
OP
OP
Parallel
Pipeline
OP
OP
OP
OP
21
Zero Suppression (1/2)
• Pixel array is organized into semiautonomous columns.
• Data is held at the individual pixel until it is
read out.
• Each hit pixel “raises its hand” & each
column with a hit “raises its hand.”
22
Zero Suppression (2/2)
• Pixels not hit are skipped using “tokens.”
• A token is dropped down each column &
stops if there is a hit pixel.
• A token is launched across the base of the
columns & stops at the first column with a
hit.
• While a hit is read out, the token advances
to the next hit (skipping zeros).
23
Readout
• Each pixel chip sends its data out on a
dedicated point-to-point serial link.
• When readout is enabled, output
continues as long as the clock is present
and there is data to send.
• Each BTeV pixel readout chip can output
~ 30 MHz of pixel hits (1 hit per 30 ns).
24
Radiation Hard
• 0.25μ CMOS design started in 1998; verified
in a series of small test chips.
– Tested to 87 Mrad with no degradation in analog
performance and only minor changes required to
bias conditions.
– Digital cells insensitive to total dose.
– No latch-up, no gate rupture.
– Single event upset cross sections measured,
typically < 10-15 cm-2 per bit.
• Measured with 200 MeV protons; about the same for MIP’s.
• Expected upset rates: ~10 pixel kill bits/hr, 2 DAC register bits/hr, 1
data serializer bit/hr.
• Will reload kill pattern each fill, & monitor other registers & reset as
necessary (can be done without interrupting data taking for most
errors).
25
Aside: could the pixel be smaller?
• Amplifier takes < 25% of area.
• ~ 50% of area is associated w/3-bit FADC.
• Could be made smaller by:
– Using fewer ADC bits.
– Using newer process (digital sections would
get much smaller).
26
Pixel Unit Cell
Vdda
Thresholds (Vth1-Vth7)
Token Out
Iff
ADC
Thermometer
Vfb2
Flash
Latch
to Binary
Encoder
Vfb
Resets
Ibp1
Row
Address
Bus
Ibb
-
Ibp2
Command Interpreter
+
Sensor
Controller
00 - idle
01 - reset
Kill
10 - output
11 - listen
Inject
Test
Vref
Threshold (Vth0)
4 pairs of
Command Lines
RFastOR
Throttle
HFastOR
Read Clock
Read Reset
Token In
Token Reset
27
RCLK and SCLK
•
•
The core readout clock (RCLK) is derived from the serial clock
(SCLK). SCLK is constructed from external clocks and is
nominally 140 MHz.
The frequency of RCLK depends on the number of output pairs
being used. This relationship means that no buffer memory is
required in the Data Output Interface.
Configuration
SCLK Frequency
RCLK Frequency
6 output pairs
140 MHz
35 MHz
140(6/24)
4 output pairs
140 MHz
23.3 MHz
140(4/24)
2 output pairs
140 MHz
11.7 MHz
140(2/24)
1 output pair
140 MHz
5.8 MHz
140(1/24)
35 MHz = 4.6/132 ns
28
Download