Exam III Key

Applied Business Statistics
Exam III
December 8, 2006
Answer all of the following (25) questions
Use the following to answer questions 1-2:
Let p1 represent the population proportion of U.S. Senate and Congress (House of
Representatives) Democrats who are in favor of a new modest tax on "junk food". Let p2
represent the population proportion of U.S. Senate and Congress (House of
Representatives) Republicans who are in favor of a new modest tax on "junk food". Out of
the 265 Democratic senators and congressmen, 106 are in favor of a "junk food"
tax. Out of the 285 Republican senators and congressmen, only 57 are in favor of a
"junk food" tax.
1. Find a 95 percent confidence interval for the difference between proportions p1 and p2.
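\[ \hat{p}_1 = \frac{106}{265} = 0.4, \qquad \hat{p}_2 = \frac{57}{285} = 0.2 \]
\[ s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{(0.4)(0.6)}{265} + \frac{(0.2)(0.8)}{285}} \approx 0.0383 \]
Confidence interval:
\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\, s_{\hat{p}_1 - \hat{p}_2} = 0.2 \pm 1.96(0.0383) \]
Answer: (0.125, 0.275)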
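The interval in question 1 can be checked with a short script. This is a minimal sketch using only the counts given in the problem; the multiplier 1.96 is the usual table value for 95% confidence and is an assumption, since the key does not print it.

```python
import math

# Counts taken from the problem statement.
x1, n1 = 106, 265   # Democrats in favor, total Democrats
x2, n2 = 57, 285    # Republicans in favor, total Republicans
z = 1.96            # assumed table value z_{.025} for 95% confidence

p1_hat = x1 / n1
p2_hat = x2 / n2

# Standard error of the difference between two sample proportions.
std_err = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

diff = p1_hat - p2_hat
ci_low = diff - z * std_err
ci_high = diff + z * std_err
print(round(ci_low, 3), round(ci_high, 3))
```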
2. At α = .01, can we conclude that the proportion of Democrats who favor the "junk food"
tax is more than 5% higher than the proportion of Republicans who favor the new tax?
\[ H_0: p_1 - p_2 \le 0.05, \qquad H_A: p_1 - p_2 > 0.05 \]
\[ z = \frac{(0.4 - 0.2) - 0.05}{0.0383} \approx 3.92 \]
At α = 0.01, 3.92 > z_{.01} = 2.33.
Answer: Reject H0
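A sketch of the same test in code, assuming the unpooled standard error used in the key and the critical value 2.33 quoted there. The variable names are illustrative only.

```python
import math

# Sample proportions and sizes from the problem statement.
p1_hat, n1 = 106 / 265, 265
p2_hat, n2 = 57 / 285, 285
d0 = 0.05       # hypothesized difference under H0
z_crit = 2.33   # critical value quoted in the key for alpha = .01

std_err = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_calc = ((p1_hat - p2_hat) - d0) / std_err

# Right-tailed test: reject H0 when the statistic exceeds the critical value.
reject_h0 = z_calc > z_crit
print(round(z_calc, 2), reject_h0)
```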
3. Test H0: μ1 ≤ μ2, HA: μ1 > μ2 at α = .10, where X̄1 = 77.4, X̄2 = 72.2, s1 = 3.3, s2 =
2.1, n1 = 6, n2 = 6. Assume the population standard deviations are equal.
Answer: Reject H0
\[ t_{10,\,.10} = 1.372 \]
\[ s_p^2 = \frac{(6 - 1)(3.3)^2 + (6 - 1)(2.1)^2}{6 + 6 - 2} = 7.65 \]
\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{7.65\left(\frac{1}{6} + \frac{1}{6}\right)} = 1.597 \]
\[ t_{calc} = \frac{77.4 - 72.2}{1.597} = 3.256 \]
Since 3.256 > 1.372, reject H0.
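The pooled-variance computation for question 3 can be sketched as a small helper. The function name `pooled_t` is hypothetical, and the critical value is the table value quoted in the key rather than something the code derives.

```python
import math

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Return (t statistic, degrees of freedom) for an equal-variance two-sample t test."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_err = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df

t_calc, df = pooled_t(77.4, 72.2, 3.3, 2.1, 6, 6)
t_crit = 1.372  # t_{10,.10} as quoted in the key

# Right-tailed test, matching HA: mu1 > mu2.
reject_h0 = t_calc > t_crit
print(round(t_calc, 3), df, reject_h0)
```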
Use the following to answer questions 4-5
A fast food company uses two management-training methods. Method 1 is a traditional
method of training and Method 2 is a new and innovative method. The company has just
hired 31 new management trainees. Fifteen of the trainees are randomly selected and
assigned to the first method, and the remaining 16 trainees are assigned to the second
training method. After three months of training, the management trainees took a
standardized test. The test was designed to evaluate their performance and learning from
training. The sample mean score and sample standard deviation of the two methods are
given below. The management wants to determine if the company should implement the
new training method.
                     Method 1   Method 2
Mean                 69         72
Standard deviation   3.4        3.8
4. Write the null hypothesis and the alternative hypothesis.
Answer: H0: μ1 - μ2 ≥ 0, HA: μ1 < μ2
5.
\[ s_p^2 = \frac{(15 - 1)(3.4)^2 + (16 - 1)(3.8)^2}{16 + 15 - 2} = 13.05 \]
\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{13.05\left(\frac{1}{15} + \frac{1}{16}\right)} = 1.298 \]
\[ t_{calc} = \frac{69 - 72}{1.298} = -2.311 \]
-2.311 < -1.6699, so reject H0.
Use the following to answer questions 6-7:
The mid-distance running coach, Zdravko Popovich, for the Olympic team of an eastern
European country claims that his six-month training program significantly reduces the
average time to complete a 1500-meter run. Five mid-distance runners were randomly
selected before they were trained with coach Popovich's six-month training program, and
their completion times for the 1500-meter run were recorded (in minutes). After six months of
training under coach Popovich, the same five runners' 1500-meter run times were recorded
again. The results are given below.
Runner   Completion time before training   Completion time after training
1        5.9                               5.4
2        7.5                               7.1
3        6.1                               6.2
4        6.8                               6.5
5        8.1                               7.8
6. At an alpha level of .05, can we conclude that there has been a significant decrease in
the mean time per mile?
Answer: Reject H0; there is a significant decrease in completion time after training.
Let D_i = (Time before) - (Time after).
\[ H_0: \mu_d \le 0, \qquad H_A: \mu_d > 0 \]
\[ t_{4,\,.05} = 2.132 \]
\[ \bar{d} = \frac{.5 + .4 + (-.1) + .3 + .3}{5} = .28 \]
\[ s_d = .228, \qquad \frac{s_d}{\sqrt{n}} = \frac{.228}{\sqrt{5}} = .102 \]
\[ t = \frac{.28 - 0}{.102} = 2.746 \]
2.746 > 2.132, reject H0.
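The paired-difference test can be reproduced from the raw times. This is a sketch using Python's standard library; the critical value 2.132 is the table value quoted in the key.

```python
import math
import statistics

# Completion times in minutes, from the table in the problem.
before = [5.9, 7.5, 6.1, 6.8, 8.1]
after = [5.4, 7.1, 6.2, 6.5, 7.8]

# Paired differences, defined as (time before) - (time after).
diffs = [b - a for b, a in zip(before, after)]

d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)            # sample standard deviation (n - 1 divisor)
std_err = s_d / math.sqrt(len(diffs))
t_calc = d_bar / std_err

t_crit = 2.132  # t_{4,.05} as quoted in the key
reject_h0 = t_calc > t_crit
print(round(d_bar, 2), round(s_d, 3), round(t_calc, 3), reject_h0)
```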
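7. Construct the appropriate 95% confidence interval.
Answer: .063 minutes to .497 minutes
Let D_i = (Time before) - (Time after).
\[ t_{4,\,.05} = 2.132 \]
\[ \bar{d} = \frac{.5 + .4 + (-.1) + .3 + .3}{5} = .28 \]
\[ s_d = .228, \qquad \frac{s_d}{\sqrt{n}} = \frac{.228}{\sqrt{5}} = .102 \]
\[ \bar{d} \pm t\,\frac{s_d}{\sqrt{n}} = .28 \pm (2.132)(.102) = .063 \text{ min. to } .497 \text{ min.} \]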
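The interval arithmetic can be sketched as below. The first call uses the multiplier 2.132 that the key uses. The second call uses 2.776, which is the standard table value t_{4,.025} for a two-sided 95% interval; it is included only for comparison and is not part of the key. The helper name `interval` is hypothetical.

```python
import math
import statistics

# Paired differences (before - after) from question 6.
diffs = [0.5, 0.4, -0.1, 0.3, 0.3]

d_bar = statistics.mean(diffs)
std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))

def interval(t_value):
    """Return (lower, upper) limits of d_bar +/- t_value * std_err."""
    margin = t_value * std_err
    return d_bar - margin, d_bar + margin

key_interval = interval(2.132)   # multiplier used in the key, t_{4,.05}
two_sided_95 = interval(2.776)   # assumed comparison value, t_{4,.025}
print([round(v, 3) for v in key_interval])
print([round(v, 3) for v in two_sided_95])
```

With the larger multiplier the lower limit falls slightly below zero, so the choice of table value matters for how this interval is read.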
Use the following to answer question 8:
An experiment was performed on a certain metal to determine if the strength is a function
of heating time. Results based on 10 metal sheets are given below. Use the simple linear
regression model.
\[ \sum X = 30, \quad \sum X^2 = 104, \quad \sum Y = 40, \quad \sum Y^2 = 178, \quad \sum XY = 134 \]
8. Find the estimated y-intercept.
Answer: b0 = 1
\[ SS_{XY} = 134 - \frac{(30)(40)}{10} = 14 \]
\[ SS_{XX} = 104 - \frac{(30)^2}{10} = 14 \]
\[ b_1 = \frac{14}{14} = 1 \]
\[ b_0 = \frac{40}{10} - 1\left(\frac{30}{10}\right) = 1 \]
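The least squares estimates follow directly from the summary sums. A minimal sketch, with variable names chosen here for illustration:

```python
# Summary statistics from the problem (n = 10 metal sheets).
n = 10
sum_x, sum_x2 = 30, 104
sum_y, sum_xy = 40, 134

# Corrected sums of squares and cross-products.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

b1 = ss_xy / ss_xx                  # estimated slope
b0 = sum_y / n - b1 * sum_x / n     # estimated y-intercept
print(b1, b0)
```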
9-11. Complete the following partial ANOVA table from a simple linear regression
analysis with a sample size of 15 observations. Use the F test to test the significance of
the model at α = .05.
Source       SS       DF   MS      F
Regression   309.9    1    309.9   5.87
Error        686.05   13   52.77
Total        995.95   14   71.14
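The table entries can be rebuilt from the regression and total sums of squares. This sketch takes Error SS as Total SS minus Regression SS, which is the value that reproduces the key's MS of 52.77 and F of 5.87. The critical value 4.67 for F with (1, 13) degrees of freedom at α = .05 is an assumed table value; the key does not print it.

```python
# Values given in the partial ANOVA table.
n = 15
ss_regression = 309.9
ss_total = 995.95

# Degrees of freedom for simple linear regression.
df_regression = 1
df_error = n - 2
df_total = n - 1

# Error sum of squares obtained by subtraction.
ss_error = ss_total - ss_regression

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_calc = ms_regression / ms_error

f_crit = 4.67  # assumed table value F_{.05,1,13}; not stated in the key
model_significant = f_calc > f_crit
print(round(ss_error, 2), round(ms_error, 2), round(f_calc, 2), model_significant)
```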
Consider the following partial computer output for a multiple regression model.
Predictor   Coefficient (bi)   Standard Dev (sb)
Constant    99.3883
X1          -0.007207          0.0031
X2          0.0011336          0.00122
X3          0.9324             0.373
12. The calculated value of the t statistic for X1 is ________.
Answer: -0.007207/0.0031 = -2.325
Use the following to answer questions 13-18:
Below is a partial multiple regression ANOVA table.
Source   SS           df
X1       535.9569     1
X2       1,167.5634   1
X3       18.9886      1
Error    3,459.6803   8
13. How many observations were in the sample?
Answer: n-(3+1) = 8 , n=12
14. What is the total sum of squares and the degrees of freedom for total sum of squares?
SS Total = 535.9569 + 1167.5634 + 18.9886 + 3459.6803 = 5182.19, with 1 + 1 + 1 + 8 = 11 degrees of freedom
15. What is the mean square error?
MSE = 3459.6803/8 = 432.46
16. Calculate the explained variation.
Explained variation = SSR = 535.9569 + 1167.5634 + 18.9886 = 1722.51
17. Calculate the proportion of the variation explained by the multiple regression model.
Explained variation = SSR = 535.9569 + 1167.5634 + 18.9886 = 1722.51
\[ R^2 = \frac{SSR}{SST} = \frac{1722.51}{5182.19} = .3324 \]
18. Test the overall usefulness of the model at α = .01. Calculate F and make your
decision about whether the model is useful for prediction purposes.
Answer:
\[ F_{.01,\,3,\,8} = 7.59 \]
\[ MS_{Regression} = \frac{535.9569 + 1167.5634 + 18.9886}{3} = 574.17 \]
\[ MSE = \frac{3459.6803}{8} = 432.46 \]
\[ F = \frac{574.17}{432.46} = 1.33 \]
1.33 < 7.59, fail to reject H0.
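Questions 13 through 18 all derive from the same four rows of the ANOVA table, so they can be checked together. A minimal sketch; the critical value 7.59 is the one quoted in the key, and the variable names are illustrative.

```python
# Sequential sums of squares for the three predictors, plus the error row.
ss_predictors = {"X1": 535.9569, "X2": 1167.5634, "X3": 18.9886}
ss_error, df_error = 3459.6803, 8

k = len(ss_predictors)             # number of predictors
n = df_error + k + 1               # Q13: sample size
ssr = sum(ss_predictors.values())  # Q16: explained variation
sst = ssr + ss_error               # Q14: total sum of squares
df_total = n - 1
mse = ss_error / df_error          # Q15: mean square error
msr = ssr / k
r_squared = ssr / sst              # Q17: proportion of variation explained
f_calc = msr / mse                 # Q18: overall F statistic

f_crit = 7.59                      # F_{.01,3,8} as quoted in the key
model_useful = f_calc > f_crit
print(n, round(sst, 2), round(mse, 2), round(r_squared, 4), round(f_calc, 2), model_useful)
```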
Use the following to answer questions 19-21:
The management of a professional baseball team is in the process of determining the
budget for next year. A major component of future revenue is attendance at the home
games. In order to predict attendance at home games the team statistician has used a
multiple regression model with dummy variables. The model is of the form: y = β0 +
β1x1 + β2x2 + β3x3 + ε, where:
y = attendance at a home game
x1 = current power rating of the team on a scale from 0 to 100 before the game.
x2 and x3 are dummy variables, and they are defined below.
x2 = 1, if weekend
x2 = 0, otherwise
x3 = 1, if weather is favorable
x3 = 0, otherwise
After collecting the data based on 30 games from last year, and implementing the above
stated multiple regression model, the team statistician obtained the following least
squares multiple regression equation: ŷ = -1050 + 250x1 + 2200x2 + 5400x3
The multiple regression computer output also indicated the following:
sb1 = 800, sb2 = 1000, sb3 = 1850
19. Interpret the estimated model coefficient b1
Answer: For each additional rating point the baseball team receives, the average
attendance is expected to increase by 250 people when the independent variable (x1) is
within the experimental region and the other two independent variables are held constant.
Interpret the estimated model coefficient b2.
Answer: The estimated average attendance for weekend home games is 2200 people
more than the estimated average attendance for weekday home games when the
independent variable (x2) is within the experimental region and the other two independent
variables are held constant.
20. Assume that the overall model is useful in predicting the game attendance and the
team statistician wants to know if the mean attendance is higher on the weekends as
compared to the weekdays. State the appropriate null and alternative hypotheses.
Answer: H0: β2 ≤ 0, HA: β2 > 0
21. Assume that the overall model is useful in predicting the game attendance. Assume
today is Wednesday morning and the weather forecast indicates sunny, excellent weather
conditions for the rest of the day. Later today, there is a home baseball game for this
team. Assume that the current power rating of the team is 85 and predict the attendance
for today's game.
Answer: ŷ = -1050 + 250(85) + 2200(0) + 5400(1) = 25,600
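The prediction can be sketched as a small function. The intercept is taken as -1050 here, an assumption based on that being the sign that reproduces the key's answer of 25,600. The function name `predict_attendance` is hypothetical.

```python
def predict_attendance(power_rating, weekend, good_weather):
    """Predicted home-game attendance from the fitted equation.

    weekend and good_weather are dummy variables (1 or 0).
    The intercept -1050 is an assumption; see the lead-in.
    """
    return -1050 + 250 * power_rating + 2200 * weekend + 5400 * good_weather

# Wednesday game (weekend = 0), favorable weather (1), power rating 85.
wednesday_sunny = predict_attendance(85, 0, 1)
print(wednesday_sunny)
```

Changing only the weekend dummy from 0 to 1 raises the prediction by 2200, which is the interpretation of b2 given in question 19.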
22. SSE = 1.10, n = 5
\[ s = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{1.10}{3}} = 0.6055 \]
23.
\[ \hat{y} = 6.72 + 1.39x, \qquad s_{b_1} = 0.06 \]
\[ b_1 \pm t\, s_{b_1} = 1.39 \pm 1.67(0.06) \]
(1.29, 1.49)
24.
\[ t = \frac{b_1}{s_{b_1}} = \frac{-1.2024}{0.2934} = -4.098 \]
There is a significant negative relationship.
25.
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(13 - 1)(5)^2 + (10 - 1)(3)^2}{23 - 2} = 18.14 \]
\[ s_p = \sqrt{18.14} = 4.259 \]
85. Calculate the coefficient of determination.
Answer: .7777