Reliability Defining - Designing

Reliability Overview
Brad Beaird
Last revised 30 June 2014
Agenda- Reliability Overview
After completing this module, you will be able to:
• Understand the role of reliability in design
• Set system reliability goals
• Allocate reliability in a design to subsystems
• Conduct Weibull life analysis
• Construct/evaluate test plans
Definition
RELIABILITY is . . .
• Probability that a component or system will
not fail (probability of survival)
• Under specified operating conditions
• Through a given point in time, R(t)
Reliability: It’s About PERFORMANCE over Time
• Reliability: The probability “R” that the item will perform its intended function
• Confidence: The chance “C” that the reliability will be as good as specified
• Time: At what point in time “t” do we need to specify operation?
• Tools: What analyses and tests allow us to make the prediction?
Quality over Time: R(t) at C% confidence
How will my design operate over time?
Reliability not modeled, predicted or verified in the development process is left to the customer to determine!!
Reliability versus Durability
Key Takeaways:
• Reliability testing without testing to failure provides little benefit
• Durability measures how long a product will last until it cannot be repaired.
Reliability measures intermittent interruptions during this usage period.
• We can estimate durability from a reliability test but not the other way around
• We should test similar to the customers’ environment
• The customers’ experience is based primarily on reliability
• Reliability tests are shorter and more efficient than durability tests
Reliability Concepts
Typical “Quality Over Time” follows a Bathtub Curve
[Figure: bathtub curve of failures versus time, with infant mortality, useful life, and wearout regions]
• Infant mortality: Initially, failures are due to problems in workmanship or poor quality control. Reduced through burn-in testing, quality control, error-proofing.
• Useful life: Then, most systems reach a constant rate; failures are caused by environment and chance events. Reduced by design, redundancy.
• Wearout: Finally, systems wear out; failures are caused by fatigue, corrosion, aging. Reduced by derating, PM, parts replacement, design technology.
Quiz- Reliability Concepts
1. Define reliability
2. In the infant mortality phase of the bathtub curve, the failure rate is:
– Increasing
– Decreasing
– Constant
3. In the wear-out phase of the bathtub curve, the MTBF is:
– Increasing
– Decreasing
– Constant
Reliability Planning, Goals &
Growth
Reliability Planning & Goals
• No goal → worst case
• “As good as current” → also pretty lousy
• MTBF > 1000 hours → a quantified goal for overall system
• MTBF > 1000 hours, with 90% confidence → even better, speaks to sample size
• B10 life (time at which 10% of population will fail) > 1000 hours with 95% confidence for a 90th percentile user → Yes! Very specific, measurable, stated in terms of customer usage
Reliability Goals & Growth – MTBF example
Given the following field trial data, estimate the MTBF:
Unit    Days    Comment
1       61      failed
2       35      failed
3       59      failed
4       8       failed
5       90      suspended
Exercise: Calculate the mean time between failures (MTBF).
The previous series of field tests revealed an MTBF of 50. Is there growth in this reliability parameter?
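One way to work the exercise is sketched below. It assumes the usual point estimate, total accumulated time divided by the number of failures, so the suspended unit contributes its 90 days but no failure. The variable names are illustrative, not from the slides.

```python
# Field trial data from the table above: (days on test, failed?)
units = [(61, True), (35, True), (59, True), (8, True), (90, False)]

# Suspended time still counts toward exposure; it just adds no failure.
total_days = sum(days for days, _ in units)
failures = sum(1 for _, failed in units if failed)
mtbf = total_days / failures

previous_mtbf = 50  # from the previous series of field tests
print(f"MTBF = {total_days} / {failures} = {mtbf:.2f}")
print("Growth" if mtbf > previous_mtbf else "No growth",
      f"relative to the previous MTBF of {previous_mtbf}")
```

With only four failures the estimate carries wide uncertainty, so an apparent increase over 50 is suggestive rather than proven.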
Reliability Goals & Growth
Given the following program test data, calculate and plot the cumulative mean time between failures:
Test Hours    Repairs/Failures    Cumulative MTBF
0-100         12                  100/12 = 8.3 hours per failure
101-200       7                   200/19 = 10.5
201-300       4                   300/23 = 13.0
301-400       3                   400/26 = 15.4
401-500       3                   500/29 = 17.2
Program MTBF goal = 22
[Figure: plot of cumulative MTBF versus test hours, annotated with the growth parameter]
• Called a “Duane” model, can be plotted on a log-log scale to straighten out the line
• Alternatively could have plotted the reciprocal failure rate (e.g., failures per 100 hrs)
• Will we achieve the goal by the end of the program after 800 hours of testing?
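The 800-hour question can be explored with a short sketch of the Duane approach: fit a straight line to log(cumulative MTBF) versus log(test hours) and extrapolate. The ordinary least-squares fit and the function name `projected_cum_mtbf` are assumptions for illustration. The slides do not say whether the goal of 22 refers to cumulative or instantaneous MTBF, so both are computed.

```python
import math

test_hours = [100, 200, 300, 400, 500]        # end of each test interval
failures_per_interval = [12, 7, 4, 3, 3]

# Cumulative MTBF = cumulative hours / cumulative failures.
cum_failures, running = [], 0
for f in failures_per_interval:
    running += f
    cum_failures.append(running)
cum_mtbf = [t / n for t, n in zip(test_hours, cum_failures)]

# Least-squares line on log-log axes; the slope is the growth parameter.
xs = [math.log(t) for t in test_hours]
ys = [math.log(m) for m in cum_mtbf]
x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
alpha = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - alpha * x_bar

def projected_cum_mtbf(hours):
    """Extrapolate the fitted Duane line to a given number of test hours."""
    return math.exp(intercept + alpha * math.log(hours))

goal = 22
cum_at_800 = projected_cum_mtbf(800)
# In the Duane model, instantaneous MTBF = cumulative MTBF / (1 - alpha).
inst_at_800 = cum_at_800 / (1 - alpha)
print(f"growth parameter = {alpha:.2f}")
print(f"cumulative MTBF at 800 h = {cum_at_800:.1f}, instantaneous = {inst_at_800:.1f}, goal = {goal}")
```

Under these assumptions the projected cumulative MTBF falls slightly short of 22 at 800 hours while the instantaneous MTBF exceeds it, so the answer depends on which measure the program goal is written against.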
Reliability Allocation
& Modeling
Reliability Allocation-Example
System Level: Car Engine, needs R = 0.90 at 1000 hrs (i.e., B10 > 1000 hrs). System goal could also have been an MTBF figure.
Subsystem Level:
– Engine Block subsystem, R = 0.925
– Fuel & Air subsystem, R = 0.973
Component Level:
– Connecting Rod component, R = 0.999
– Fuel Injector component, R = 0.995
Reliability Allocation is about cascading down a System goal into subsystems &
components. Q: Why do numbers get bigger at lower levels of the model?
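A small numeric check can make the cascade concrete. The sketch below assumes the two subsystems shown are the only ones and that they act in series, which the slide does not state explicitly. It also shows equal apportionment, one simple allocation rule that is not taken from the slides.

```python
import math

system_goal = 0.90
subsystems = {"Engine Block": 0.925, "Fuel & Air": 0.973}

# In series, every element must survive, so the reliabilities multiply.
rolled_up = math.prod(subsystems.values())
print(f"Rolled-up system reliability = {rolled_up:.3f}")

def equal_allocation(goal, n):
    """Reliability each of n series elements needs if all are allocated equally."""
    return goal ** (1.0 / n)

for n in (2, 10, 100):
    print(f"{n} series elements -> each needs R = {equal_allocation(system_goal, n):.4f}")
```

Because each factor is below 1, a product can only shrink as more elements are added, which is why the allocated numbers rise toward 1 at lower levels of the model.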
Reliability Block Diagrams, Series
R1 = 0.95, R2 = 0.97, R3 = 0.99 (in series)
What is the system reliability?
Reliability of System = 0.95 x 0.97 x 0.99 = 0.91
We use Reliability Block Diagrams to model our system
from the bottom up using estimates on components and
subsystems
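The series calculation is a plain product, sketched here with a hypothetical helper name and the three component values from the slide.

```python
import math

def series_reliability(reliabilities):
    """All blocks must survive, so a series system is the product of its blocks."""
    return math.prod(reliabilities)

r_system = series_reliability([0.95, 0.97, 0.99])
print(f"Series system reliability = {r_system:.3f}")
```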
Reliability Block Diagrams, parallel
R1 = 0.75 and R2 = 0.75 in parallel, in series with R3 = 0.99
We can design using redundant, relatively
low reliability components in parallel to
achieve overall system reliability goals
What is the system reliability?
Hint: Figure out the parallel subsystem reliability,
then multiply in series with component 3.
Reliability Block Diagrams, Parallel
(redundancy)
R1 = 0.75 and R2 = 0.75 in parallel, in series with R3 = 0.99
The trick for this subsystem reliability is:
Probability (subsystem survives)
= Probability (1 or more survives)
= 1 – probability (R1 and R2 fail)
= 1 – 0.25^2
= 0.9375
RS = [1-(1-R1)(1-R2)] x R3
= [1-(0.25)^2] x 0.99
= 0.9375 x 0.99
= 0.928
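The same calculation can be written as two small helpers, a sketch that assumes the redundant components fail independently. The function names are illustrative.

```python
import math

def parallel_reliability(reliabilities):
    """A redundant group fails only if every member fails (independence assumed)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)

def series_reliability(reliabilities):
    """A series chain survives only if every block survives."""
    return math.prod(reliabilities)

r_parallel = parallel_reliability([0.75, 0.75])      # R1 and R2 in parallel
r_system = series_reliability([r_parallel, 0.99])    # then in series with R3
print(f"Parallel subsystem = {r_parallel:.4f}, system = {r_system:.3f}")
```

Nesting the two helpers handles any block diagram that reduces to series and parallel groups.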
Reliability Block Diagrams- Exercise
R= 0.90
R= 0.90
R= 0.90
R= 0.85
R= 0.95
R= 0.92
Calculate the reliability of this system
Reliability Allocation Exercise
R= 0.90
R= 0.90
R= 0.90
R= ?
R= 0.95
R= ?
Q: If the overall system reliability goal is 0.98, what
should the reliability be for the two redundant
components in the far right subsystem? Assume
both components have the same reliability.
Weibull Analysis
History on Weibull
• Waloddi Weibull- Swedish
Engineer
• Famous for pioneering work on
reliability and life analysis
• The Weibull distribution is named
after him, and is a popular tool for
modeling lifetimes
Probability Density Function: $f(t)$
Cumulative Distribution Function: $F(t) = \int_0^t f(s)\,ds$
Reliability or Survival Function: $R(t) = 1 - F(t)$
Hazard Function: $h(t) = \dfrac{f(t)}{R(t)}$
More on The Weibull Distribution
• Well suited for modeling lifetime data
  – Components
  – Systems
• Weibull Reliability equation: $R(t) = e^{-(t/\eta)^{\beta}}$
• Parameters
  – Slope of the line (shape, β)
  – Characteristic life (measures dispersion of data, η), B63.2 life
  – Optional, guaranteed life (time before anything will fail, γ)
• Mimics many distribution shapes (skewed left, skewed right, symmetric)
• Characterizes the failure distribution so we can make predictions
• Tells us where we are in the bathtub curve so we can fix problems
β<1 means infant mortality, β=1 means useful life, β>1 means wear-out
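The link between the slope and the bathtub curve can be checked numerically. The sketch below assumes the two-parameter Weibull form (no guaranteed life) and uses illustrative function names. It evaluates the hazard h(t) = f(t)/R(t) at an early and a late time for three slopes.

```python
import math

def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t/eta)^beta), two-parameter form."""
    return math.exp(-((t / eta) ** beta))

def weibull_pdf(t, beta, eta):
    """f(t) = (beta/eta) * (t/eta)^(beta-1) * R(t), valid for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1) * weibull_reliability(t, beta, eta)

def weibull_hazard(t, beta, eta):
    """h(t) = f(t) / R(t)."""
    return weibull_pdf(t, beta, eta) / weibull_reliability(t, beta, eta)

def bathtub_region(beta):
    """Map the Weibull slope to the bathtub-curve phase named on the slide."""
    if beta < 1:
        return "infant mortality"
    if beta == 1:
        return "useful life"
    return "wear-out"

eta = 100.0  # characteristic life, arbitrary units (illustrative value)
for beta in (0.5, 1.0, 2.0):
    early, late = weibull_hazard(50, beta, eta), weibull_hazard(150, beta, eta)
    print(f"beta={beta}: h(50)={early:.4f}, h(150)={late:.4f} -> {bathtub_region(beta)}")

# At t = eta, 63.2% of the population has failed, whatever the slope.
print(f"F(eta) = {1 - weibull_reliability(eta, 2.0, eta):.3f}")
```

The last line shows why the characteristic life is also called the B63.2 life.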
Weibull Reliability Practice Calcs
Weibull Reliability equation: $R(t) = e^{-(t/\eta)^{\beta}}$
Time, t    Slope, β    Characteristic life, η    Reliability, R
50         2.0         100                       ?
50         3.0         100                       ?
100        2.0         100                       ?
100        3.0         100                       ?
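The practice calculations can be checked with a few lines, assuming the two-parameter Weibull reliability equation shown on the slide. The `cases` list simply restates the table rows.

```python
import math

# (time t, slope beta, characteristic life eta) for each row of the practice table
cases = [(50, 2.0, 100), (50, 3.0, 100), (100, 2.0, 100), (100, 3.0, 100)]

results = [math.exp(-((t / eta) ** beta)) for t, beta, eta in cases]
for (t, beta, eta), r in zip(cases, results):
    print(f"t={t}, beta={beta}, eta={eta} -> R = {r:.4f}")
```

Note that at t equal to the characteristic life the reliability is exp(-1), about 0.368, regardless of the slope.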
Example Weibull analysis
B10 – Point at which 10% of failures predicted
Failure Number    Time (Hours)    Median Rank Value
1                 2.8             0.07
2                 4.3             0.18
3                 5.2             0.29
4                 5.7             0.39
5                 8.2             0.50
6                 8.7             0.61
7                 9.8             0.71
8                 12.3            0.82
9                 18.5            0.93
[Figure: Weibull probability plot of the data above on a log time axis from 1 to 50 hours; the B10 life read from the fitted line is about 3 hours]
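The plotted example can be reproduced numerically. The sketch below assumes Benard's approximation for the median ranks, which matches the tabulated values to two decimals, and an ordinary least-squares line on the linearized Weibull scale. Fitting by eye on probability paper, as on the slide, would give slightly different numbers.

```python
import math

# Failure times (hours) from the example table, already in ascending order.
times = [2.8, 4.3, 5.2, 5.7, 8.2, 8.7, 9.8, 12.3, 18.5]
n = len(times)

# Benard's approximation to the median rank of the i-th failure.
median_ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

# Linearized Weibull CDF: ln(-ln(1 - F)) = beta*ln(t) - beta*ln(eta).
xs = [math.log(t) for t in times]
ys = [math.log(-math.log(1.0 - f)) for f in median_ranks]

x_bar, y_bar = sum(xs) / n, sum(ys) / n
beta = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs))       # slope of the fitted line
eta = math.exp(x_bar - y_bar / beta)               # characteristic life

# B10 life: time at which 10% of the population is predicted to have failed.
b10 = eta * (-math.log(0.90)) ** (1.0 / beta)
print(f"beta = {beta:.2f}, eta = {eta:.1f} h, B10 = {b10:.1f} h")
```

The computed B10 comes out close to the 3 hours read from the plot.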
Information We Can Get From Completed Plot
We can compare two products or processes
Example- compare components from 2 suppliers
Q: Which one is “better” (hint- it’s an open-ended question)
[Figure: Weibull plot comparing components from the two suppliers]
Information We Can Get From Completed Plot
We can show objective evidence of reliability GROWTH
On the Weibull plot, “growth” means pushing the plotted line to the right and flattening it
[Figure: Weibull plot with fitted lines labeled DV1 and DV2]
Weibull Analysis Exercise
• Statement on GE light bulb package: Lifetime 1000 hrs (what does this mean?)
Let’s do Weibull Analysis by hand to understand the technique
• Data (time to fail)
  – 450 hrs
  – 2100 hrs
  – 1200 hrs
  – 805 hrs
1. Gather data & sort in ascending order
2. Get median rank values from table
3. Plot ordered pairs (time, median rank)
4. Fit a straight line
5. Estimate distribution parameters
  • What is the slope?
  • What is the characteristic life?
  • What is the B10 life? B50 life?
  • What part of the bathtub curve are we in?
Looks like the GE figure of 1000 hours is a median (B50) life
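The five hand steps can be mirrored in code as a cross-check. This is a sketch that assumes Benard's approximation in place of the median rank table and a least-squares line in place of a hand-drawn one. `fit_weibull` and `b_life` are hypothetical helper names.

```python
import math

def fit_weibull(failure_times):
    """Median rank regression for complete (no suspension) data. Returns (beta, eta)."""
    times = sorted(failure_times)                                    # step 1
    n = len(times)
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]         # step 2 (Benard)
    xs = [math.log(t) for t in times]                                # step 3
    ys = [math.log(-math.log(1.0 - f)) for f in ranks]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    beta = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
            / sum((x - x_bar) ** 2 for x in xs))                     # step 4
    eta = math.exp(x_bar - y_bar / beta)                             # step 5
    return beta, eta

def b_life(percent_failed, beta, eta):
    """Time by which the given percentage of the population is predicted to fail."""
    return eta * (-math.log(1.0 - percent_failed / 100.0)) ** (1.0 / beta)

beta, eta = fit_weibull([450, 2100, 1200, 805])
b10, b50 = b_life(10, beta, eta), b_life(50, beta, eta)
print(f"slope = {beta:.2f}, characteristic life = {eta:.0f} h")
print(f"B10 = {b10:.0f} h, B50 = {b50:.0f} h")
```

With only four data points the estimates are rough, but the B50 lands near 1000 hours and the slope is above 1, which points to the wear-out region.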
Median Rank Table Examples
Previous MTBF example
Given the following field trial data, estimate the MTBF:
Unit    Days    Comment
1       61      failed
2       35      failed
3       59      failed
4       8       failed
5       90      suspended
• Weibull gives much more info than MTBF alone
• We can characterize the entire life distribution
• When we have suspended data (like unit #5 above), software is usually used to do the Weibull analysis
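For readers curious what such software does with suspensions, the sketch below uses one common approach, Johnson's adjusted rank method combined with Benard's approximation. This is an assumption for illustration; the slides do not say which method any particular package applies.

```python
def adjusted_median_ranks(records):
    """records: list of (time, failed) tuples.
    Returns [(time, median_rank)] for the failures only, using Johnson's
    adjusted ranks so that suspended units shift the ranks of later failures."""
    ordered = sorted(records)
    n = len(ordered)
    previous_rank = 0.0
    plotted = []
    for position, (time, failed) in enumerate(ordered):
        reverse_rank = n - position          # n for the earliest item, 1 for the last
        if not failed:
            continue                         # suspensions are not plotted
        previous_rank = (reverse_rank * previous_rank + n + 1) / (reverse_rank + 1)
        plotted.append((time, (previous_rank - 0.3) / (n + 0.4)))
    return plotted

field_data = [(61, True), (35, True), (59, True), (8, True), (90, False)]
for time, rank in adjusted_median_ranks(field_data):
    print(f"{time:>3} days -> median rank {rank:.3f}")
```

In this data set the suspension is the longest time, so the failure ranks stay at 1 through 4, but they are computed against a sample size of 5 rather than 4.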
Accelerated Life Testing Example
• We can mimic the life of our product in the field via testing
• Problem is we only have a short time to test for a long life
• Solution → accelerate the testing via higher stresses:
  – Stress in this example is temperature
  – We ran tests to fail at 4 higher temps and did Weibull analysis
  – We then predicted the life at a normal temp of 80 degrees
System Maintenance Applications
• Used to determine optimal parts replacement strategies
• The approach
1. Collect time to fail data, construct a Weibull plot, and verify you are in the wearout phase of the bathtub curve.
2. The optimal time to replace a part is based on Weibull parameters and the ratio of the cost of unplanned maintenance to planned maintenance.
3. Overall goal is to reduce total cost of downtime and improve system availability. Availability = MTBF / (MTBF + MTTR).
4. To improve system availability, you either increase the mean time between failures or decrease the mean time to repair or both.
System Availability, Exercise
• If MTBF = 100 hours and MTTR = 24 hours, A = ?
• If MTBF = 200 hours and MTTR = 24 hours, A = ?
• If MTTR = 24 hours and Availability goal = 0.97 (97%), MTBF = ?
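The three availability questions follow directly from Availability = MTBF / (MTBF + MTTR). The sketch below also rearranges the formula to solve for MTBF. Both function names are illustrative.

```python
def availability(mtbf, mttr):
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)

def required_mtbf(target_availability, mttr):
    """Solve A = MTBF / (MTBF + MTTR) for MTBF."""
    return target_availability * mttr / (1.0 - target_availability)

a1 = availability(100, 24)
a2 = availability(200, 24)
mtbf_needed = required_mtbf(0.97, 24)
print(f"A(100, 24) = {a1:.3f}")
print(f"A(200, 24) = {a2:.3f}")
print(f"MTBF needed for A = 0.97 with MTTR = 24 h: {mtbf_needed:.0f} h")
```

Doubling the MTBF raises availability by less than nine points here, while reaching 97% with a 24-hour repair time needs an MTBF in the high hundreds of hours, which shows how strongly MTTR drives the result.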
Reliability Demonstration
Testing
Zero Failure Acceptance Testing
• Calculate $n = \dfrac{\ln(1 - \text{confidence})}{\ln(R)}$
Acceptance testing (versus testing to failure) is sometimes necessary, though we do not learn as much.
Exercise: The requirement for a component states that the
supplier must demonstrate at least 95% reliability with 90%
confidence (R95 C90). How many units should the supplier test
with no failures in order to pass?
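The exercise can be worked with the zero-failure formula n = ln(1 - confidence) / ln(R), rounding up because a fraction of a unit cannot be tested. The function name is illustrative.

```python
import math

def zero_failure_sample_size(reliability, confidence):
    """Units to test with zero failures: n = ln(1 - confidence) / ln(R), rounded up."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))

n_units = zero_failure_sample_size(reliability=0.95, confidence=0.90)
print(f"R95 C90 requires {n_units} units with no failures")
```

The formula comes from requiring that R^n, the chance that n units all pass when the true reliability is only R, be no larger than 1 - confidence.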
Exercise- Demonstration Testing
Customer acceptance criteria require us to demonstrate a B10 life > 500 hours, with 95% confidence. Past testing shows the Weibull slope is 1.6. There are 1000 hours available in the test lab. How many units must be tested, with zero failures allowed?
β = 1.6
Confidence = 95% (0.95)
R = 0.90 (B10 → 90% reliability)
Target time = 500 hrs, Actual time = 1000 hrs, k is the multiple of the required time (k = 1000/500 = 2)
$n = \dfrac{1}{k^{\beta}} \times \dfrac{\ln(1 - \text{conf})}{\ln(R)}$
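A short sketch of this calculation follows, using the slide's formula n = (1/k^β) × ln(1 - conf) / ln(R) and rounding up to whole units. The function and argument names are illustrative, and the result relies on the assumed Weibull slope of 1.6 being correct.

```python
import math

def extended_test_sample_size(reliability, confidence, beta, target_time, test_time):
    """Zero-failure sample size when each unit runs for test_time instead of target_time."""
    k = test_time / target_time                       # multiple of the required time
    n = (1.0 / k ** beta) * math.log(1.0 - confidence) / math.log(reliability)
    return math.ceil(n)

n_extended = extended_test_sample_size(0.90, 0.95, beta=1.6, target_time=500, test_time=1000)
n_baseline = extended_test_sample_size(0.90, 0.95, beta=1.6, target_time=500, test_time=500)
print(f"Testing each unit for 1000 h: {n_extended} units")
print(f"Testing each unit for only 500 h: {n_baseline} units")
```

Running each unit for twice the target time cuts the sample size to roughly a third, which is the trade between test duration and number of units that the formula captures.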
Agenda- Reliability Overview
Our learning objectives were:
• Understand the role of reliability in design
• Set system reliability goals
• Allocate reliability in a design to subsystems
• Conduct Weibull life analysis
• Construct/evaluate test plans
Do you understand?
Questions?