TileCAL LVPS Discussion of SEU Problem

ATLAS TileCAL
TileCAL LVPS
Discussion of SEU Problem
Gary Drake, Bob Stanek
Argonne National Laboratory, USA
May 3, 2011
LVPS Radiation Test Session Summary
 Session #1: Protons – Dec. 4, 2010
– Massachusetts General Hospital, Cancer Treatment Facility
– 188 MeV protons from DC cyclotron
– Tested: (3) V7.3.1 +15MB, (2) V7.3.1 +5DIG
 Session #2: Neutrons – Dec. 6, 2010
– Univ. of Mass., Lowell, MA, Training Reactor
– 1 MeV-equivalent neutrons from reactor core
– Tested: (3) V7.3.1 +15MB, (2) V7.3.1 +5DIG
 Session #3: Gammas – Dec. 16, 2010
– Brookhaven National Lab, Co-60 Source Facility
– 1.2 MeV photons from source
– Tested: (2) V7.3.1 +15MB, (2) V7.3.1 +5DIG, (1) V6.5.4 +5V
[Slide callouts alongside the session list: "Better Cooling…", "Greater Sophistication…"]
 Session #4: Protons – Feb. 27, 2011  Repeat to study components
– Massachusetts General Hospital, Cancer Treatment Facility
– Tested: (1) V7.3.1 +15MB, (1) V7.3.1 +5DIG , (1) V6.5.4 +5V, (2) Components Bds
 Session #5: Neutrons – Feb. 28, 2011  Repeat to study components
– Univ. of Mass., Lowell, MA, Training Reactor
– Tested: (3) V7.3.1 +15MB, (2) V7.3.1 +5DIG , (1) V6.5.4 +5V, (2) Components Bds
 Session #6: Protons – Apr. 17, 2011  Confirmed cause of SEU problem
– Massachusetts General Hospital, Cancer Treatment Facility
– Tested: (1) V7.3.2 +15MB, (1) V7.3.2 +5DIG, (1) V7.3.1 +5DIG, (2) Components Bds
 Session #7: Protons  June 15, 2011 ?
Summary of LVPS Radiation Test Results So Far
 Radiation Tests
– Gamma Tests
• No trips through dose; No deaths through dose range of interest
• Observe general degradation in noise, stability, & efficiency vs. dose
• Calibration constants mostly OK; Some changes in offset voltages
• Saw OVP & OCP failures after high dose
• Saw failures of opto-isolators after high dose
 Probably OK, but analysis of data continuing
– Neutron Tests
• Original test apparatus flawed (oven effect caused tripping from heat),
but neutron damage is independent of powering brick  Improved in Session 5
• Bricks die at ~85% of target  probably OK
• Observe general degradation in noise, stability, & efficiency vs. dose
• Calibration constants OK; OVP & OCP OK
 Probably OK; Data analysis in progress
– Proton Tests
• Single Event Upset (SEU) problem exists in both new and old designs
– We have confirmed that it is due to the soft-start feature of the controller chip, the heart of the brick design
– 3 choices: 1) Live with it; 2) Modify current design; 3) Redesign with new controller
This is the schedule driver for production
Discussion of SEU Problem
 Brick Block Diagram
[Figure: brick block diagram. Blocks shown: 200V input, LC Filter, FET Driver, Transformer, LC Buck, RSHUNT (primary and secondary), Vout, LT1681 Controller Chip, OpAmps (VFB, IFB), Opto Isolators, Startup & Shutdown Control (Run, Stop, Over Temp), OVP / OCP / Temp monitor, monitor voltages to ELMB (VIN*, VOUT*, IIN*, IOUT*), GNDPRI, GNDSEC]
 Feedback loop makes it hard to diagnose tripping problems…
 Component Test Board Block Diagram
[Figure: component test board block diagram. Blocks shown: 200V input, LC Filter, LT1681 Controller Chip, OpAmps, Opto Isolators (with Bias), FET Driver, All Diode Types, Temp, Startup & Shutdown Control (Run, Stop, Over Temp), GNDPRI; monitored outputs VOUT1–VOUT9; references VREF1–VREF4]
 All Active Components Represented
 All Critical Passive Components Represented (Except Transformer)
Discussion of SEU Problem (Cont.)
[Figures: Brick Block Diagram and Component Test Board Block Diagram, repeated from the previous slide]
 Focus on LT1681 Controller Chip
Discussion of SEU Problem (Cont.)
 Summary of results from Proton Irradiation Studies
– Observe tripping in V7.3.1 bricks and also the V6.5.4 bricks
– Observe resetting of LT1681 chip in components board
• Clocks stop for a short time, then restart
• Conclusion: SEU in LT1681 initiates a soft-start
– Rates ~same for all brick types, versions, and components board
• LT1681 is common to all…
Plot from Bob
– From most recent tests, observe that when soft-start delay is made very short
in V7.3.1 bricks, then No tripping is observed
• i.e., brick trips, but soft-start restarts brick faster than our DAQ (and DCS) can measure
• Measurement done with resistive load, not in drawer (yet)
Discussion of SEU Problem (Cont.)
 What is happening:
[Figure: brick schematic with the LT1681 and its soft-start capacitor marked (value controls startup delay), next to the LT1681 block diagram. Callout on the flip-flop in the block diagram: "This FF is being reset by protons"]
– Generally, the soft-start feature is used to restart a circuit gracefully after a fault
– Not used this way in LVPS: no hardware automatic restart after a trip
– V6.5.4: Used to control startup sequence, ~0.1 – 0.8 Sec
– V7.3.1: Set to 30 mSec (startup sequencing done another way…)
– Bricks will trip off after ~10 mSec due to dissipation of energy on primary side, which is why this feature does not work in this design
Discussion of SEU Problem (Cont.)
 Soft-Start Feature
– When brick trips due to OVLO or TEMP or IMAX (or SEU), FF1 is SET
• Causes clocks to stop
• Causes voltage on CSS to be reset
– CSS recharges from internal 10 uA current source
• When VSS reaches ~1.3V, clocks restart with reduced duty factor
• When VSS reaches ~2V, clocks fully restarted
• Restart time depends on CSS
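[Figure: timing diagram of SEU, FF1, VSS, and CLK, with curves for decreasing CSS and voltage levels marked at 0.225V, 2V, and 5V. After the SEU, VSS discharges (discharge time depends lightly on CSS), then recharges with slope = 10 uA / CSS (recharge time depends dominantly on CSS). Clocks restart with reduced DF when VSS reaches ~1.3V and are normal when VSS reaches ~2V. The variable-DF time depends dominantly on CSS and affects the overshoot of Vo.]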
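The recharge ramp described on this slide can be turned into a quick estimate. The sketch below is a minimal Python illustration, assuming an ideal constant 10 uA source that charges CSS linearly from a starting voltage `v_start` (the default of 0 V is an assumption; the actual discharge floor is not stated here). The function names and the 100 nF example value are hypothetical and are not taken from the LT1681 datasheet or the brick schematic.

```python
# Idealized soft-start ramp: VSS rises linearly with slope = I_SS / CSS.
I_SS = 10e-6      # A, internal current source quoted on the slide
V_RESTART = 1.3   # V, approximate level where clocks restart at reduced duty factor
V_FULL = 2.0      # V, approximate level where clocks are fully restarted


def recharge_time(css, v_target, v_start=0.0, i_ss=I_SS):
    """Seconds for VSS to ramp from v_start to v_target on capacitance css (farads)."""
    return css * (v_target - v_start) / i_ss


def css_for_delay(delay, v_target, v_start=0.0, i_ss=I_SS):
    """Capacitance (farads) that gives the requested ramp time (seconds) to v_target."""
    return i_ss * delay / (v_target - v_start)


if __name__ == "__main__":
    css = 100e-9  # hypothetical 100 nF soft-start capacitor, for illustration only
    print(f"to {V_RESTART} V: {recharge_time(css, V_RESTART) * 1e3:.1f} ms")
    print(f"to {V_FULL} V: {recharge_time(css, V_FULL) * 1e3:.1f} ms")
```

Under these assumptions a 100 nF capacitor gives about 13 ms to the 1.3V threshold and 20 ms to 2V. The discharge time and any internal chip delays are not modeled, so measured delays may differ.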
Solutions to SEU Problem Under Study
 Eliminate the soft-start feature in the LT1681
– We have explored this, and it cannot be done
– The soft-start feature is a basic function of the chip needed to start the
brick
 Reduce the effect of SEU in the LT1681
– Work is in progress; 3 techniques being studied
– Best so far: 200 uSec stoppage of clock
– May have consequences for performance of front-end electronics
• Causes an overshoot of output voltage from cold start
• Will have droop of output voltage during 200 uSec dead-time
Subject of discussion today
 Redesign the brick using a different controller chip
– Have identified a different controller that does not have the soft-start
feature
– Major effort…
• Prototypes; Testing in Building 175; Tests on detector; Radiation tests…
• Would make production schedule for 2013 installation very tight…
Summary of Bench Studies
Modification Scenario
 Soft-start feature cannot be disabled in the LT chip
– It is an integral part of the operation of the device
 Soft-start delay affects how fast the brick starts up
– CSS voltage modulates duty factor of the clock
– Starts with low DF, and gradually increases
– If soft-start delay is too short, then output voltage rises too fast, and
can cause an overshoot
– Amount of overshoot depends on load current & load capacitance
 If soft-start cap is removed, then we are left with only parasitic capacitance
– Minimum delay, nominal 200 uSec, spread 150 uSec – 300 uSec
 When in soft-start mode, output voltage sags
– Clocks stop, so switching stops
– Amount of sag depends on load current and load capacitance
Summary of Current Approach to Problem
 Eliminate CSS  Soft-Start Delay determined by parasitic capacitance
– When SEU occurs
• Brick trips off for 200 uSec, then restarts
• Since 200 uSec << ~10 mSec trip-off time, the soft-start operation restarts the brick before it trips off for good
– 2 caveats:
• Results in fast start from cold-start, creating an overshoot on output voltage
• Causes droop in output voltage for 200 uSec dead-time
[Figures: scope traces of (1) output voltage overshoot on cold-start and (2) output voltage during a simulated SEU, with the soft-start delay marked. Note: overshoot lasts < ~5 mSec]
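The ride-through argument on this slide compares the soft-start delay with the ~10 mSec after which a brick trips off. A minimal Python sketch of that comparison follows, using only the delays quoted in these slides. The function name `rides_through` and the plain less-than threshold are assumptions for illustration; the slides argue from 200 uSec << 10 mSec, so the margin actually required may be stricter.

```python
# Compare each soft-start delay with the ~10 mSec primary-side hold-up time.
HOLDUP_S = 10e-3  # s, bricks trip off after ~10 mSec (from the slides)


def rides_through(soft_start_delay_s, holdup_s=HOLDUP_S):
    """True if the soft-start restart completes before the brick trips off."""
    return soft_start_delay_s < holdup_s


# Soft-start delays quoted in the slides, in seconds.
CASES = {
    "V6.5.4 (min)": 0.1,
    "V6.5.4 (max)": 0.8,
    "V7.3.1": 30e-3,
    "No CSS (min)": 150e-6,
    "No CSS (nominal)": 200e-6,
    "No CSS (max)": 300e-6,
}

if __name__ == "__main__":
    for name, delay in CASES.items():
        verdict = "restarts" if rides_through(delay) else "trips off"
        print(f"{name:18s} {delay * 1e3:8.2f} ms  {verdict}")
```

With this threshold, only the no-CSS cases restart; the V6.5.4 and V7.3.1 delays exceed the hold-up time, which is consistent with the tripping observed in those bricks.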
Summary of Current Approach to Problem (Cont.)
 Addressing the issues
– Overshoot
• Current values with no Css shown in Table 1
• We are working on a way to reduce the overshoot through other means, but do
not have a solution yet
Summary of Current Approach to Problem (Cont.)
 Addressing the issues (Cont.)
– Droop – Add additional load capacitance to increase energy
storage during dead-time
• Current values with no Css shown in Table 2
• For a target of 10% droop (somewhat arbitrary),
additional Cload values needed shown in Table 3
 Cannot quite meet 10% spec for +5MB…
 Note: Have not repeated Overshoot Tests with additional CLOAD…
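The trade-off between droop and load capacitance can be estimated to first order. The sketch below assumes a constant load current supplied only by CLOAD during the dead-time (dV = I * t / C), ignoring energy stored in the output inductor and capacitor ESR. The function names and the 5 V / 10 A example are hypothetical; they are not the values in Tables 2 and 3.

```python
# First-order droop estimate while the clocks are stopped.
T_DEAD = 200e-6  # s, nominal dead-time with CSS removed (spread 150-300 uSec)


def droop_volts(i_load, c_load, t_dead=T_DEAD):
    """Voltage drop on c_load (farads) supplying i_load (amps) for t_dead seconds."""
    return i_load * t_dead / c_load


def required_cload(v_nom, i_load, droop_frac=0.10, t_dead=T_DEAD):
    """Total load capacitance (farads) that limits droop to droop_frac * v_nom."""
    return i_load * t_dead / (droop_frac * v_nom)


def additional_cload(v_nom, i_load, c_existing, droop_frac=0.10, t_dead=T_DEAD):
    """Extra capacitance (farads) needed beyond c_existing; zero if already enough."""
    needed = required_cload(v_nom, i_load, droop_frac, t_dead)
    return max(0.0, needed - c_existing)


if __name__ == "__main__":
    v_nom, i_load = 5.0, 10.0  # hypothetical brick output, for illustration only
    c_total = required_cload(v_nom, i_load)
    print(f"CLOAD for 10% droop: {c_total * 1e6:.0f} uF")
    # Worst case of the quoted dead-time spread with the same capacitance.
    print(f"droop at 300 uSec: {droop_volts(i_load, c_total, 300e-6):.2f} V")
```

Because the dead-time spread extends to 300 uSec, sizing CLOAD from the nominal 200 uSec leaves the worst-case droop about 50% larger than the target in this simple model.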
Discussion Points
 Component damage due to overshoot at cold start. Consider the magnitude of the peak value, and the duration of the overshoot before returning to nominal value.
 Pedestal effects due to the droop from SEU.
 False positive signals from the droop from SEU.
 Changes to gain or operating points from the droop from SEU.
 False or missing triggers from the droop from SEU.
 Component damage from the droop from SEU.
 Loss of timing synchronization from the droop from SEU.
 Loss of serial transmission synchronization from the droop from SEU.
 Loss of FPGA programming from the droop from SEU.
 Others?
 Would like hard limits from FEE groups, for overshoot & droop
 Some optimization might be possible
– e.g., increased soft-start delay, a little more droop, less overshoot…