Statistics 533
Spring 2012
Final Examination

Name

Instructions

When asked to explain something, provide an explanation that could be understood by someone who does not have formal training in statistical methods. Your explanations should be clear but concise. You must show all of your work. Students may use two sheets (8.5 by 11 inches, both sides) of paper containing equations or notes. Students may have up to 120 minutes (two hours) to complete the exam. There are 26 questions or question parts. Students should clearly mark a total of 6 single-part questions or parts of multi-part questions as “Do not grade” (leaving a total of 100 points for grading). As soon as possible after the exam is completed, it should be scanned and emailed to wqmeeker@iastate.edu or faxed to my attention at 515-294-4040. In the case of a fax, please send email to indicate that the fax has been sent.

1. A computer system has three main subsystems that are prone to failure: CPU, GPU, and disk storage. The computer has to complete a mission of time $t_M$. Repair during a mission is not possible. The CPU and GPU are single components, but the disk-storage subsystem is “RAID 1,” having two drives, so that the subsystem will operate even if one of the drives fails during the mission. Let $F_{\mathrm{CPU}}(t_M)$, $F_{\mathrm{GPU}}(t_M)$, and $F_{\mathrm{Disk}}(t_M)$ denote the probabilities of failing during the mission for the CPU, the GPU, and the individual disk drives, respectively.

(a) Draw a block diagram for the computer system.

(b) Derive an expression for the reliability of the system (probability that the system does not go down during the mission), assuming that all component failure times are independent.

2. Describe the limitations of optimum Accelerated Life Test plans. Also, explain the primary purpose of computing optimum ALT test plans.

3. The delta method is widely used in statistics to estimate standard errors of nonlinear functions of parameters.
The Wald method of computing confidence intervals is computationally simple, but depends on a good choice of transformation. For the one-parameter exponential distribution with cdf

$$\Pr(T \le t) = F(t) = 1 - \exp\left(-\frac{t}{\theta}\right), \quad t > 0,$$

statistical theory predicts that, in small samples, the log likelihood will be approximately symmetric for the transformation $\theta^{1/3}$. Suppose that you have an estimate of $\mathrm{Var}(\hat{\theta})$ but that you need an estimate for $\mathrm{Var}(\hat{\theta}^{1/3})$ in order to compute an accurate Wald-based confidence interval for $\theta$.

(a) Give an intuitive explanation for why it is that the Wald-based interval will provide a good procedure when based on $\theta^{1/3}$.

(b) Give a delta-method expression that can be used to compute an estimate of $\mathrm{Var}(\hat{\theta}^{1/3})$ as a function of an estimate of $\mathrm{Var}(\hat{\theta})$.

(c) Explain why the delta-method approximation tends to work better (i.e., is more accurate) in large samples. Draw a picture to help you explain.

(d) Show how you would use an estimate of $\mathrm{Var}(\hat{\theta}^{1/3})$ to compute a Wald-based approximate confidence interval for $\theta$.

(e) What is the distributional basis for the Wald-based approximate confidence interval based on the transformation $\theta^{1/3}$? That is, what random variable is being assumed to approximately follow a NOR(0,1) distribution?

4. A company maintains a large number of systems and, because of Drenick’s theorem, it is known that the exponential distribution provides a good approximation to the distribution of times between failures for the systems. An analyst has done a Bayesian analysis to estimate the mean time between failures, based on the data from the past month. The analyst has been given a prior distribution, based on reliable expert opinion. In a report describing the results of the Bayesian analysis, the analyst says that the posterior distribution represents variability from system to system in the population of systems.
(a) Explain the conditions required for Drenick’s theorem to allow the exponential distribution to be used as a model for times between failures.

(b) Comment on the analyst’s statement that the posterior distribution represents variability from system to system in the population of systems.

5. Explain, intuitively, why it is that one should test more units at lower levels of the accelerating variable and fewer at the high levels when conducting an accelerated life test.

6. Masked failure modes are often present in accelerated testing applications. Briefly explain the concept of a “masked failure mode.” Draw a picture to help your explanation.

7. A system can fail from either failure Mode A or failure Mode B. Under the assumption that the failure time for Mode A is independent of the failure time for Mode B, one can estimate separately the marginal distributions for failure Mode A and failure Mode B.

(a) You have access to software that can estimate a single failure-time distribution, but cannot do multiple failure mode analysis automatically. Give a simple example (with, say, 8 observations) to explain how you can estimate the marginal distributions of the two different failure modes using this software.

(b) Explain the practical interpretation of the marginal distribution of Mode A. That is, how can an engineer make use of this information?

8. In some cases when one uses an analysis where failure mode information is used to estimate separate distributions for each mode, the resulting estimate of the failure time distribution with both failure modes active will be almost exactly the same as when the failure mode information is ignored and a single distribution is fit to the data. In other cases the difference between the two analyses can be dramatically large. Explain this.

9. An analyst has suggested using the one-parameter exponential distribution as a model to describe the failure times of incandescent light bulbs. Comment on this choice of a distribution.

10.
The Arrhenius relationship can be written as

$$R(\mathrm{temp}) = \gamma_0 \exp\left(\frac{-E_a \times 11605}{\mathrm{temp}\ \mathrm{K}}\right)$$

where $R$ is the reaction rate, temp K = temp °C + 273.15 is temperature in the absolute Kelvin scale, 11605 is the reciprocal of “Boltzmann’s constant,” and $E_a$ is called the “effective activation energy.”

(a) Engineers who conduct accelerated tests often think in terms of acceleration factors. Starting with the Arrhenius relationship above, derive an expression for the acceleration factor that one would have for testing at $\mathrm{temp}_A$ versus a use condition of $\mathrm{temp}_U$.

(b) What is the purpose of using the constant 11605 in this model?

(c) Suppose that an integrated circuit has a lognormal failure time distribution with parameters $\mu_U$ and $\sigma_U$ at the use conditions $\mathrm{temp}_U$. What will be the distribution of life at $\mathrm{temp}_A$?

(d) Suppose that, when looking at data from two different levels of temperature used in an accelerated test, there is strong evidence that the values of $\sigma$ are different at different levels of temperature. Explain what important implication or implications this has for extrapolating from high-temperature tests to the low-temperature use conditions.

11. A failure-time distribution for the random variable $T$ has a hazard function $h(t) = \theta t^2$, $t > 0$.

(a) Derive an expression for the corresponding cdf for $T$.

(b) What is the name of the distribution of the random variable $T$?

12. You have been asked to design a test to demonstrate that a newly developed circuit breaker will survive 10 thousand operations with probability (reliability) 0.90. The Weibull shape parameter is unknown, but the engineers are willing to assume that the failure time distribution is Weibull with a shape parameter $\beta > 1$. Suppose that it is possible to test units simultaneously for 300 thousand cycles.

(a) How many units need to be tested in a minimum sample size demonstration test?

(b) What is the usual justification for the common assumption that shape parameter $\beta > 1$?
(c) Minimum sample size demonstration tests are appealing because such demonstration tests can be conducted with fewer test units. What is the main disadvantage of such a test?

(d) If an engineer finds that the minimum sample size test is not suitable, what might be a suitable alternative and what would be the price to be paid for moving to the alternative kind of demonstration test plan?
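
As a hedged illustration of the kind of calculation behind question 12(a), the sketch below uses the standard zero-failure argument. If the true reliability at the demonstration time were exactly the target value R, the probability that all n units survive a test lasting k times the demonstration time would be R^(n k^β) under a Weibull model, and n is chosen so that this probability is at most 1 minus the confidence level. The exam does not state a confidence level, so the 95% used here is an assumption, as are the function name `min_sample_size` and its parameter names. The value β = 1 is used as the conservative boundary of the assumed range β > 1, since with k > 1 any larger β would call for no more units.

```python
import math


def min_sample_size(reliability, confidence, test_ratio, beta):
    """Smallest n for a zero-failure Weibull demonstration test.

    reliability: target survival probability at the demonstration time
    confidence:  desired confidence level (an assumption; not given in the exam)
    test_ratio:  k = (test length) / (demonstration time)
    beta:        assumed Weibull shape parameter
    """
    alpha = 1.0 - confidence
    # Require reliability ** (n * k**beta) <= alpha and solve for n.
    n = math.log(alpha) / (test_ratio ** beta * math.log(reliability))
    return max(1, math.ceil(n))


# Testing only to the demonstration time (k = 1) at the assumed 95% confidence.
print(min_sample_size(0.90, 0.95, 1.0, 1.0))   # 29

# Testing for 300 thousand cycles against a 10 thousand cycle target (k = 30).
print(min_sample_size(0.90, 0.95, 30.0, 1.0))  # 1
```

Under these assumptions the extended test length sharply reduces the number of units required, which is the trade-off that parts (c) and (d) ask about. A different confidence level or a different assumed β would change the result.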