ISEN 614 Homework #3
(Due: March 8)
Question 1. [8 pts]
You are given the South African Heart Disease data (available at the course website). The response is in
column K of the EXCEL file, called chd (standing for "coronary heart disease"). A value of 1 means that
the patient has the heart disease; a value of 0 means that the patient does not have the
heart disease. There are a total of 462 records, where 160 patients have chd = 1, whereas the other
302 patients have chd = 0. The dataset also contains nine explanatory variables collected for each
patient. These nine variables can be grouped into two categories: the patient's own risk and the patient's
acquired risk. The patient's own risk includes genetic cause and age. The genetic cause is
measured by the family history (famhist, column F in the dataset), and the ages are in column J.
The other variables are considered acquired risk, including blood pressure (sbp), use of tobacco and
alcohol, LDL level, etc., which are more or less related to the lifestyle of a patient.
Our objective is to set up a CUSUM chart to monitor the acquired risk for this heart disease while
adjusting for the patient's own risk. We therefore do a risk adjustment based on a logistic regression. Here
is what you need to do:
(1) Establish a logistic regression model that connects the probability of having chd = 1 to the patient's two
own-risk factors: famhist and age. This means that your model will have only two input factors (not
nine). Please ignore all other columns of explanatory variables when establishing this model.
(2) Following Example 2.11, construct a risk-adjusted CUSUM chart (for the increase only) for all
462 patients. Please use R0 = 1 and R1 = 2. Because the procedure for deciding the control limit is
time-consuming, let us skip it here. So what you need to construct is a CUSUM chart without the control limit.
(3) Once you have the CUSUM chart, pick the three highest CUSUM scores. For each of them, look at
column C (the use of tobacco) to see whether the use of tobacco in those cases (flagged by the
CUSUM as relatively high acquired risk) is higher than in the other cases. Conventional wisdom says that
smoking increases the risk of heart disease. Does our analysis confirm it?
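As a rough illustration of step (2), the sketch below computes risk-adjusted CUSUM scores. It assumes that Example 2.11 uses the usual log-likelihood-ratio weights for testing an odds ratio R0 against R1; check this against the example before relying on it. The logistic coefficients B0, B1, B2 and the four patient records are made-up placeholders, not values fitted to the South African data.

```python
import math

def predicted_risk(famhist, age, b0, b1, b2):
    """Logistic model for P(chd = 1 | famhist, age)."""
    z = b0 + b1 * famhist + b2 * age
    return 1.0 / (1.0 + math.exp(-z))

def cusum_weight(y, p, r0=1.0, r1=2.0):
    """Log-likelihood ratio of odds ratio r1 versus r0 for one patient.

    y is the observed outcome (0 or 1), p is the model-predicted risk.
    """
    num = (1.0 - p + r0 * p) * (r1 ** y)
    den = (1.0 - p + r1 * p) * (r0 ** y)
    return math.log(num / den)

def risk_adjusted_cusum(outcomes, risks, r0=1.0, r1=2.0):
    """Upper CUSUM: S_t = max(0, S_{t-1} + W_t), starting from S_0 = 0."""
    scores, s = [], 0.0
    for y, p in zip(outcomes, risks):
        s = max(0.0, s + cusum_weight(y, p, r0, r1))
        scores.append(s)
    return scores

# Hypothetical coefficients and records: (famhist, age, chd).
B0, B1, B2 = -3.5, 0.9, 0.06
patients = [(1, 52, 1), (0, 63, 1), (0, 46, 0), (1, 58, 1)]

risks = [predicted_risk(f, a, B0, B1, B2) for f, a, _ in patients]
outcomes = [y for _, _, y in patients]
scores = risk_adjusted_cusum(outcomes, risks)

# Indices of the three highest CUSUM scores, as asked for in step (3).
top3 = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:3]
print(scores, top3)
```

With R0 = 1 the weight reduces to log(2 / (1 + p)) when chd = 1 and log(1 / (1 + p)) when chd = 0, so an event in a low-risk patient raises the chart more than the same event in a high-risk patient.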
Question 2. [6 pts]
There are two normally distributed, independent quality characteristics x1 and x2 that are measured and
monitored simultaneously. A quality engineer proposes using two individual control charts with 3-sigma (3σ_x̄) control limits to monitor the two variables at the same time. Let us focus on the x-bar chart in
this problem. For each individual x-bar chart, the out-of-control condition is that one sample point falls
outside the 3-sigma control limits. The sample size n may not be 1 but is not specified here.
(a) For each individual x-bar chart, please find the α-error and the β-error when a 2-sigma (2σ_x̄) mean
shift has occurred.
(b) (continued from part (a)) For the two x-bar charts, the quality engineer decides that if either one of the
x-bar charts detects an out-of-control condition, then the whole system is considered to be out-of-control.
Find the α-error and the β-error (of detecting a 2-sigma shift) for this situation.
(c) (continued from part (a)) For the two x-bar charts, the quality engineer decides that only if both x-bar
charts detect the out-of-control condition is the whole system considered to be out-of-control.
Find the α-error and the β-error (of detecting a 2-sigma shift) for this situation.
(d) Please complete the following table, which lists the ARL0 and ARL1 for each situation. Based
on this table, comment on why we need to develop multivariate process control techniques rather than
jointly use individual control charts.
[Note: please show your intermediate steps to receive credit]
                                    | ARL0 | ARL1 (of detecting a 2-sigma mean shift)
Individual x-bar chart              |      |
Joint charts with rule in part (b)  |      |
Joint charts with rule in part (c)  |      |
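The quantities in this question can be checked numerically along the following lines. This is a sketch only: it assumes the shift is measured in units of σ_x̄, that the two charts are independent, and, for the joint rules, that both means shift by 2 sigma at the same time (the problem statement does not say which variable shifts). The function names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def alpha_error(k=3.0):
    """False-alarm probability of one chart with +/- k sigma limits."""
    return 2.0 * (1.0 - Z.cdf(k))

def beta_error(shift, k=3.0):
    """Probability that a point stays inside the limits after the mean
    has shifted by `shift` standard errors."""
    return Z.cdf(k - shift) - Z.cdf(-k - shift)

def arl(p_signal):
    """Average run length when each point signals with probability p_signal."""
    return 1.0 / p_signal

a = alpha_error()
b = beta_error(2.0)

# Joint rules for two independent charts (assumption: both means shift).
either = {"alpha": 1.0 - (1.0 - a) ** 2, "beta": b ** 2}
both = {"alpha": a ** 2, "beta": 1.0 - (1.0 - b) ** 2}

rows = [("individual", {"alpha": a, "beta": b}),
        ("either signals", either),
        ("both signal", both)]
for name, r in rows:
    # ARL0 = 1 / alpha, ARL1 = 1 / (1 - beta)
    print(name, round(arl(r["alpha"]), 1), round(arl(1.0 - r["beta"]), 2))
```

If only one of the two means shifts, the joint β values change (the unshifted chart then signals only with probability α), so the assumption above should be stated explicitly in any write-up.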
Question 3. [4 pts]
Suppose X1 and X2 are two random measurements from a process that each follow a normal distribution
X1 ~ N(2, 1) and X2 ~ N(5, 4). Suppose also that X1 and X2 are not independent but have covariance
Cov(X1, X2) = 1.5. Please write down
(1) The mean vector and covariance matrix of the 2-dimensional random vector X = [X1 X2]^T,
(2) The joint probability density function of X,
(3) Draw the contour plot of X for α = 0.01.
(4) If Cov(X1, X2)=0, please repeat (1) - (3).
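A minimal numerical sketch of these quantities follows. It assumes that the second parameter of N(·, ·) is the variance, and that the contour for α is the ellipse on which the squared Mahalanobis distance equals the upper-α quantile of a chi-square distribution with 2 degrees of freedom (which is −2 ln α in closed form). The helper names are illustrative, and the ellipse points can be handed to any plotting tool.

```python
import math

MU = (2.0, 5.0)
VAR1, VAR2 = 1.0, 4.0  # assumed to be variances, not standard deviations

def mahalanobis_sq(x1, x2, cov12):
    """(x - mu)^T Sigma^{-1} (x - mu) for a 2x2 covariance matrix."""
    det = VAR1 * VAR2 - cov12 ** 2
    d1, d2 = x1 - MU[0], x2 - MU[1]
    return (VAR2 * d1 * d1 - 2.0 * cov12 * d1 * d2 + VAR1 * d2 * d2) / det

def density(x1, x2, cov12):
    """Bivariate normal density at (x1, x2)."""
    det = VAR1 * VAR2 - cov12 ** 2
    return math.exp(-0.5 * mahalanobis_sq(x1, x2, cov12)) / (2.0 * math.pi * math.sqrt(det))

def contour_level(alpha):
    """Upper-alpha quantile of chi-square with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

def contour_points(cov12, alpha, n_points=100):
    """Points on the ellipse mahalanobis_sq = contour_level(alpha),
    built from the Cholesky factor of the covariance matrix."""
    r = math.sqrt(contour_level(alpha))
    l11 = math.sqrt(VAR1)
    l21 = cov12 / l11
    l22 = math.sqrt(VAR2 - l21 ** 2)
    pts = []
    for k in range(n_points):
        t = 2.0 * math.pi * k / n_points
        u, v = r * math.cos(t), r * math.sin(t)
        pts.append((MU[0] + l11 * u, MU[1] + l21 * u + l22 * v))
    return pts

correlated = contour_points(1.5, 0.01)   # Cov(X1, X2) = 1.5
independent = contour_points(0.0, 0.01)  # Cov(X1, X2) = 0
print(contour_level(0.01), density(MU[0], MU[1], 1.5))
```

With zero covariance the ellipse axes line up with the coordinate axes; with Cov(X1, X2) = 1.5 the ellipse is tilted along the direction of positive correlation.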
Question 4. [7 pts]
A T² statistic is used to monitor a bivariate normally distributed process. A random sample of size 10 is
collected and used to get
$$\bar{x} = \begin{bmatrix} 1 \\ 4 \end{bmatrix} \quad \text{and} \quad S = \begin{bmatrix} 2/3 & 1/3 \\ 1/3 & 1/3 \end{bmatrix}.$$
We want to test $H_0\colon \mu_0 = \begin{bmatrix} 2 \\ 4 \end{bmatrix}$.
(a) What is the value of the T² statistic, given the above parameters?
(b) What is the distribution of T² for the situation in (a)?
(c) Using (a) and (b), test H0 at α = 0.01.
(d) Assume that the in-control mean is μ0 and the in-control covariance matrix is Σ0. We also
know that Σ0 has eigenvalues {λ1, λ2} = {1, 1/3}. Let e1 and e2 denote the eigenvectors
corresponding to λ1 and λ2, respectively, scaled to have unit norm. Consider two mean shifts: μ1
= μ0 + c e1 and μ2 = μ0 + c e2, where c is some scalar constant. Which of the two mean shifts would
have a higher probability of being detected? You must justify your answer to receive credit.
[You do NOT need to actually calculate the eigenvectors to solve this problem. Recall that Σ0⁻¹
can be decomposed as $\Sigma_0^{-1} = \sum_{i=1}^{p} \frac{1}{\lambda_i} e_i e_i^T$.]
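The computations in this question can be sketched as below. The matrix S is the one given in the problem; the vectors xbar and mu0 are assumed values used for illustration and should be replaced by the ones in the original handout. The sketch also assumes the standard one-sample result T² ~ [p(n−1)/(n−p)] F(p, n−p), and uses the fact that for numerator degrees of freedom 2 the F tail probability is P(F > f) = (1 + 2f/m)^(−m/2), which gives the critical value in closed form without a statistics library.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad_form(v, m):
    """v^T m v for a 2-vector v and a 2x2 matrix m."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

def t2_statistic(xbar, mu0, s, n):
    """Hotelling T^2 = n (xbar - mu0)^T S^{-1} (xbar - mu0)."""
    diff = [xbar[0] - mu0[0], xbar[1] - mu0[1]]
    return n * quad_form(diff, inv2(s))

def t2_critical(alpha, n):
    """Upper-alpha critical value of T^2 for p = 2 variables."""
    m = n - 2                                    # denominator df of F(2, n - 2)
    f_crit = (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)
    return 2.0 * (n - 1) / (n - 2) * f_crit

def shift_noncentrality(c, eigenvalue):
    """(c e_i)^T Sigma0^{-1} (c e_i) = c^2 / lambda_i for a unit eigenvector e_i."""
    return c * c / eigenvalue

S = [[2.0 / 3.0, 1.0 / 3.0], [1.0 / 3.0, 1.0 / 3.0]]
xbar, mu0 = [1.0, 4.0], [2.0, 4.0]               # assumed values, see lead-in

t2 = t2_statistic(xbar, mu0, S, n=10)
limit = t2_critical(0.01, n=10)
print(t2, limit, t2 > limit)
print(shift_noncentrality(1.0, 1.0), shift_noncentrality(1.0, 1.0 / 3.0))
```

The last line relates to part (d): because the eigenvectors are orthonormal, only the term for e_i survives in the decomposition of Σ0⁻¹, so the size of the shift in the T² metric depends on c and λ_i alone.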