252x0421

252x0421 3/17/04
(Page layout view!)
ECO252 QBA2                          Name: ____________________
SECOND HOUR EXAM                     Hour of Class Registered (Circle): 10am 11am
March 24, 2004
Show your work! Make Diagrams! Exam is normed on 50 points. Answers without reasons are not
usually acceptable.
I. (8 points) Do all the following.
x ~ N 2,9 - If you are not using the supplement table, make sure that I know it.
1. P27.88  x  0
2. P1  x  16 
3. F 16  (The cumulative probability up to 16)
4.
x.115
II. (24+ points) Do all the following. (2 points each unless noted otherwise.)
Note the following:
1. You will be penalized if you do not compute the sample variance of the $x_L$ column in
question 1.
2. This test is normed on 50 points, but there are more points (48 plus extra credit!) possible
including the take-home. You are unlikely to finish the exam and might want to skip some
questions.
3. A table identifying methods for comparing 2 samples is at the end of the exam.
4. If you answer ‘None of the above’ in any question, you should provide an alternative
answer and explain why. You may receive credit for this even if you are wrong.
Questions 1-7 refer to Exhibit 1.
Exhibit 1: (Edited from problems presented by Samuel Wathen (for Lind et al. 2002),
with one small error corrected)
The first two columns below are evaluations of a sample of five products, first at
FIFO and, second, at LIFO. Based on the results shown, is LIFO more effective than
FIFO in keeping the value of inventory lower? (Assume that the underlying
distribution is Normal.)
Product    x_F    x_L    d = x_F - x_L     x_F^2     x_L^2    d^2
1          225    221          4           50625     48841     16
2          119    100         19           14161     10000    361
3          100    113        -13           10000     12769    169
4          212    200         12           44944     40000    144
5          248    245          3           61504     60025      9
Sum        904    879         25          181234    171635    699
Minitab calculated the following sample statistics:
Variable    n    Mean    Median    StDev    SE Mean
x_F         5    180.8    212.0     66.7       29.8
x_L         5    175.8    200.0     ____
d           5     5.00     4.00    11.98       5.36
1. Compute the standard deviation of $x_L$. You may use any of the material given in Exhibit 1.
2. What is the null hypothesis?
a) $\mu_F$   $\mu_L$
b) $\mu_F$   $\mu_L$
c) $\mu_F$   $\mu_L$
d) $\mu_F$   $\mu_L$
e) None of the above.
3. What is (are) the degrees of freedom?
a) 4
b) 5
c) 8
d) 15
e) 10
4. If you used the 5% level of significance, what is the appropriate t or z value from the tables?
a) ±2.571
b) ±2.776
c) 2.262
d) ±2.228
e) 1.645
f) 1.960
g) None of the above.
5. What is the value of your calculated t or z?
a) 0.933
b) 2.776
c) 0.477
d) 2.028
e) None of the above.
6. What is your decision at the 5% significance level?
a) Do not reject the null hypothesis and conclude that LIFO is more effective in keeping the
value of the inventory lower.
b) Reject the null hypothesis and conclude that LIFO is more effective in keeping the value
of the inventory lower.
c) Reject the alternative hypothesis and conclude that LIFO is more effective in keeping the
value of the inventory lower.
d) Do not reject the null hypothesis and conclude that LIFO is not more effective in keeping
the value of the inventory lower.
e) None of the above.
7. Find an approximate p-value for the null hypothesis that you tested. Please explain your result!
8. A manufacturer revises a manufacturing process and finds a fall in the defect rate of 4% ± 5%.
a) The fall in defects is statistically significant because 5% is larger than 4%.
b) The fall in defects is statistically significant because the confidence interval supports H0.
c) The fall in defects is not statistically significant because 4% is smaller than 5%.
d) The fall in defects is not statistically significant because the confidence interval would
lead us to reject H0.
Questions 9-11 refer to Exhibit 2.
Exhibit 2: (Edited from problems presented by Samuel Wathen) A group of adults and a
group of children both tried Wow! Cereal. Was there a difference in how adults and
kids responded to it?
                      Number in    Number who    Fraction of sample
                      sample       liked it      who liked it
Adults (Group 1)      250          187           .748                  .748(.252)/250 = .000754
Children (Group 2)    100           66           .660                  .660(.340)/100 = .002244
Total                 350          253           .723                  .723(.277)/350 = .0005722
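The right-hand entries in Exhibit 2 are the quantity p(1 - p)/n for each row. A minimal Python sketch of that arithmetic, using only the counts shown in the exhibit, follows. The function name `proportion_variance` is illustrative and not part of the exam. Because the exhibit rounds each fraction to three decimals before multiplying, the last digit can differ slightly.

```python
def proportion_variance(liked, n):
    """Return (p, p*(1-p)/n) for a sample proportion p = liked/n."""
    p = liked / n
    return p, p * (1 - p) / n

# Counts taken from Exhibit 2.
groups = {"Adults": (187, 250), "Children": (66, 100), "Total": (253, 350)}
for name, (liked, n) in groups.items():
    p, var = proportion_variance(liked, n)
    print(f"{name:9s} p = {p:.3f}   p(1-p)/n = {var:.7f}")
```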
9. What is the null hypothesis?
a) $\mu_1$   $\mu_2$
b) $\mu_1$   $\mu_2$
c) $\mu_1$   $\mu_2$
d) $p_1$   $p_2$
e) $p_1$   $p_2$
f) $p_1$   $p_2$
g) None of the above.
10. Calculate a 99% confidence interval for the difference between the fraction of adults and fraction
of kids that liked Wow! Explain why you reject or do not reject the null hypothesis. (4)
11. (Extra Credit) Calculate a 77% confidence interval for the difference between the fraction of adults
and fraction of kids that liked Wow! (2)
Questions 12-14 refer to Exhibit 3.
Exhibit 3: (Edited from problems presented by Samuel Wathen)
A survey was taken of 100 randomly selected property owners to see if opinion about a
street widening was related to the amount of front footage they owned. The results
appear below.
                          Opinion
Front-Footage      For    Undecided    Against
Under 45 feet       12        4           4
45-120 feet         35        5          30
Over 120 feet        3        2           5
12. How many degrees of freedom are there?
a) 2
b) 3
c) 4
d) 5
e) 9
f) None of the above.
13. What is the value of E for people in favor of the project who own less than 45 feet of frontage?
a) 10
b) 12
c) 35
d) 50
e) None of the above.
14. Assume that the computed value of chi-square is 8.5.
a) What is the null hypothesis that you are testing? (2)
b) What is your conclusion? Why? (3)
15. Turn in your computer output from computer problem 1 only tucked inside this exam paper. (3
points - 2 point penalty for not handing this in.)
16. The following output is from a computer problem very much like the one you did to compare two
sets of data. Two production processes are in use. I wish to compare numbers of defects in Process
A and Process B to test the statement “The number of defects in process A is significantly lower
than in process B.” Three tests are done. Assume that the underlying distribution is Normal.
a) Which of the three tests should we use? b) What is the null hypothesis as we use it? c) Should we
reject the null hypothesis? Why?
Test 1:
MTB > twosamplet 'A' 'B'
Two-Sample T-Test and CI: A, B
Two-sample T for A vs B
     N   Mean  StDev  SE Mean
A   90  220.5   34.7      3.7
B  110  300.5   82.7      7.9
Difference = mu A - mu B
Estimate for difference: -79.98
95% CI for difference: (-97.15, -62.81)
T-Test of difference = 0 (vs not =): T-Value = -9.20 P-Value = 0.000 DF = 152
Test 2:
MTB > twosamplet 'A' 'B';
SUBC> alter 1.
Two-Sample T-Test and CI: A, B
Two-sample T for A vs B
     N   Mean  StDev  SE Mean
A   90  220.5   34.7      3.7
B  110  300.5   82.7      7.9
Difference = mu A - mu B
Estimate for difference: -79.98
95% lower bound for difference: -94.36
T-Test of difference = 0 (vs >): T-Value = -9.20 P-Value = 1.000 DF = 152
Test 3:
MTB > Twosamplet 'A' 'B';
SUBC> alter -1.
Two-Sample T-Test and CI: A, B
Two-sample T for A vs B
     N   Mean  StDev  SE Mean
A   90  220.5   34.7      3.7
B  110  300.5   82.7      7.9
Difference = mu A - mu B
Estimate for difference: -79.98
95% upper bound for difference: -65.59
T-Test of difference = 0 (vs <): T-Value = -9.20 P-Value = 0.000 DF = 152
17. (Extra credit) My boss objects that he thinks that the variances are equal, so that I used the wrong
test. I go back to the computer and do the following. (The null hypothesis is equal variances.) Was
I right? Why?
MTB > %VarTest c3 c4;
SUBC> Unstacked.
Test for Equal Variances
F-Test (normal distribution)
Test Statistic: 0.176
P-Value       : 0.000
18. (Extra Credit) Now my beloved boss says that maybe the underlying distribution is not Normal. I
go back to the computer and run the following. Process A results are in C3. Process B results are in
C4. Remember that there are 90 data items for process A and 110 for process B. What are our
hypotheses and results?
MTB > Stack c3 c4 c5;
SUBC> Subscripts c6;
SUBC> UseNames.
     This stacks the 2 sets of results together so they can be ranked.
MTB > Rank c5 c7.
     C7 now contains the ranks.
MTB > Unstack (c7);
SUBC> Subscripts c6;
SUBC> After;
SUBC> VarNames.
     Ranks for A are now in C7_A. Ranks for B are now in C7_B.
MTB > sum c8
Sum of C7_A
Sum of C7_A = 6008.0
MTB > sum c9
Sum of C7_B
Sum of C7_B = 14092
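The Stack, Rank and Unstack sequence above pools the two samples, ranks the pooled values, and then sums the ranks by group. A rough Python sketch of the same bookkeeping is given here on made-up toy data rather than the exam's C3 and C4 columns. The function name `rank_sums`, the sample values, and the choice to give tied values their average rank are assumptions for illustration.

```python
def rank_sums(a, b):
    """Pool two samples, rank the pooled values (average rank for ties),
    and return the sum of ranks belonging to each sample."""
    pooled = sorted([(v, "a") for v in a] + [(v, "b") for v in b])
    sums = {"a": 0.0, "b": 0.0}
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j
        for _, group in pooled[i:j]:
            sums[group] += avg_rank
        i = j
    return sums["a"], sums["b"]

# Hypothetical toy data, not the exam's defect counts.
sum_a, sum_b = rank_sums([3, 5, 5, 8], [5, 9, 10])
n = 7
print(sum_a, sum_b)
# Sanity check: the two rank sums must add up to n(n+1)/2.
assert sum_a + sum_b == n * (n + 1) / 2
```

The same check applies to the exam output: 6008.0 + 14092 = 20100 = 200(201)/2, which agrees with the N = 90 and N = 110 shown in the output for Tests 1 to 3.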
Questions 19-22 refer to Exhibit 4.
Exhibit 4: (Edited from problems presented by Samuel Wathen)
A professor asserts that she uses a Normal curve with a mean of 75 and a standard
deviation of 10 to grade students. Last year’s grades are below. Test to see if the
professor’s assertions are correct at the 99% confidence level.
Row    Grade    Interval      O        E           O^2/E
1      A        90+           15       7.6820      29.2892
2      B        80-90         20      27.7955      14.3908
3      C        70-80         40      44.0450      36.3265
4      D        60-70         30      27.7955      32.3793
5      F        Below 60      10       7.6820      13.0174
Total                        115     115.0000     125.4032
19. Show the calculations necessary to get the number that were expected to get B’s.
20. What table value of chi-square would you use to test the professor’s assertion?
21. What is the calculated value of chi-square?
22. Explain your conclusion.
                                        Paired Samples    Independent Samples
Location - Normal distribution.
  Compare means.                        Method D4         Methods D1-D3
Location - Distribution not Normal.
  Compare medians.                      Method D5b        Method D5a
Proportions                                               Method D6
Variability - Normal distribution.
  Compare variances.                                      Method D7
ECO252 QBA2
SECOND EXAM
March 24, 2004
TAKE HOME SECTION
Name: _________________________
Student Number: _________________________
III. Neatness Counts! Show your work! Always state your hypotheses and conclusions clearly.
(19+ points)
1) Chi-squared and Related Tests (Bassett et al.) To personalize the data below, change the number
of stations reporting 4 thunderstorms to the second-to-last digit of your student number. This will
change the total number of stations reporting. For example, Seymour Butz’s student number is 976500,
so he will change the number of stations reporting 4 thunderstorms to zero and the total number of
stations reporting will be 22 + 37 + 20 + 13 + 0 + 2 = 94.
a) 100 weather stations reported the following in August 2003:
Number of thunderstorms (x)                        0     1     2     3     4     5
Number of stations reporting x thunderstorms (O)  22    37    20    13     6     2
In the region in question, the number of thunderstorms per month is believed to have a Poisson
distribution with a mean of 1. Test to see if this is appropriate using a chi-squared method. For example,
if 5 stations reported 2 thunderstorms and 5 stations reported 3 thunderstorms and there were only 10
stations, the total number of storms reported would be $2(5) + 3(5) = 25$, and the average number of
storms reported would be $\frac{25}{10} = 2.5$. (4)
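The worked example above is a frequency-weighted mean: multiply each storm count by the number of stations reporting it, add the products, and divide by the number of stations. A small Python sketch using the example's numbers follows; the helper name `mean_from_frequencies` is an illustrative assumption.

```python
def mean_from_frequencies(freq):
    """freq maps a value x to the number of observations O reporting it.
    Returns (total of x*O, number of observations, mean)."""
    total = sum(x * o for x, o in freq.items())
    n = sum(freq.values())
    return total, n, total / n

# Example from the text: 5 stations report 2 storms, 5 report 3.
total, n, mean = mean_from_frequencies({2: 5, 3: 5})
print(total, n, mean)  # 25 10 2.5
```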
b) Repeat the test using the Kolmogorov-Smirnov method. (3)
c) Find the average number of storms per station and use it to generate a Poisson table on Minitab.
To do so follow the example below, replacing 0.732 with your mean (a number like 1.723). Head
column 1 (C1) 'k', column 2 'P(k)' and column 3 'P(x le k)' or something similar ('le' stands for '≤').
In column 1 place the numbers 0 through 10.
MTB > PDF c1 c2;
SUBC> Poisson 0.732.
MTB > CDF c1 c3;
SUBC> Poisson 0.732.
MTB > print c1 - c3
Data Display
Row     k    P(k)        P(x le k)
  1     0    0.480946    0.48095
  2     1    0.352053    0.83300
  3     2    0.128851    0.96185
  4     3    0.031440    0.99329
  5     4    0.005753    0.99904
  6     5    0.000842    0.99989
  7     6    0.000103    0.99999
  8     7    0.000011    1.00000
  9     8    0.000001    1.00000
 10     9    0.000000    1.00000
 11    10    0.000000    1.00000
This table tells us that, for a Poisson distribution with a mean of 0.732, $P(x = 3) = .031440$ and
$P(x \le 3) = .99329$. To keep the numbers correct, you could merge the data for k = 5 to 10 into a
category of ‘5 or more storms.’ Decide whether a chi-squared or K-S method is appropriate (Only one
method is!) and test for a Poisson distribution with your mean, remembering that you estimated the
mean from your data. (4)
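For readers without Minitab, the PDF and CDF commands in part c) can be approximated directly from the Poisson formula $P(k) = e^{-m} m^k / k!$. The following Python sketch uses only the standard library and the example mean 0.732; the function names are illustrative, and you would substitute your own mean.

```python
import math

def poisson_pmf(k, mean):
    """P(x = k) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_table(mean, k_max=10):
    """Return rows (k, P(k), P(x <= k)) for k = 0..k_max."""
    rows, cumulative = [], 0.0
    for k in range(k_max + 1):
        p = poisson_pmf(k, mean)
        cumulative += p
        rows.append((k, p, cumulative))
    return rows

# 0.732 is the example mean from the Minitab session, not your data's mean.
for k, p, cdf in poisson_table(0.732):
    print(f"{k:2d}  {p:.6f}  {cdf:.5f}")
```

Summing the rows for k = 5 through 10 gives the probability for a merged '5 or more storms' category, apart from the negligible tail beyond 10.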
d) (Extra Credit) Two dice were thrown 180 times with the results below. Test the hypothesis that
the distribution follows the binomial distribution with $n = 2$ and $p = .15$. (2)
Number of sixes (x)      0      1      2
Frequency (O)          105     70      5
e) (Extra extra credit) Test the data in d) for a binomial distribution in general by using
$\hat{p} = \frac{\text{Total number of sixes}}{\text{Total number of throws}}$. (2)
2) (Meyer and Krueger) WEFA compiled the following random samples of single-family home prices
in the eastern and western parts of the US (in $thousands.). (Note – in this problem it is OK to use
Excel or Minitab as a help – but you must fool me into believing that you did it by hand.)
Row    City - E              x1         City-No    City-W              x2
 1     Albany NY             108.607     1         Bakersfield CA      137.171
 2     Allentown PA           85.250     2         Fresno CA           107.627
 3     Baltimore MD          112.747     3         Orange C. CA        204.862
 4     Bergen NJ             195.232     4         Portland OR         123.605
 5     Boston MA             180.865     5         Riverside CA        123.836
 6     Buffalo NY             83.122     6         Sacramento CA       120.232
 7     Charlestown SC         92.840     7         San Diego CA        172.601
 8     Charlotte NC          104.433     8         San Francisco CA    220.067
 9     Greensboro NC          97.638     9         San Jose CA         224.828
10     Greenville SC          88.355    10         Seattle WA          147.854
11     Harrisburg PA          79.846    11         Stockton CA          98.440
12     Hartford CT           129.130    12         Tacoma WA           119.884
13     Middlesex NJ          169.540
14     Monmouth NJ           137.859
15     New Haven CT          134.856
16     New York NY           170.830
17     Newark NJ             187.128
18     Philadelphia PA       114.553
19     Raleigh/Durham NC     119.355
20     Rochester NY           85.043
21     Springfield MA        102.678
22     Syracuse NY            82.372
23     Washington DC         155.176
These are available on the website in Minitab. Minitab reports the following sample statistics.
Variable     n     Mean      Median    StDev
x1          23    122.50    112.75    37.20
x2          12    150.10    130.50    44.50
You may use the statistics given for x1, but personalize the data for Western cities as follows: Use the
fourth digit of your student number to pick the first city to be eliminated and then eliminate the third city
after that. (You may, if you wish, drop the last two digits of the prices in the Western Cities.) For example,
Seymour Butz’s student number is 976500, so he will use the number 5 to eliminate cities 5 (Riverside) and
8 (San Francisco). If the fourth digit of your student number is zero, eliminate cities 10 and 1. You will thus
have only 10 cities in your second sample.
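The elimination rule can be written out explicitly. The sketch below assumes the twelve western cities are numbered 1 to 12 as in the City-No column, that a fourth digit of zero means city 10, and that counting wraps around past city 12 (which matches the stated zero case of cities 10 and 1). The function name is hypothetical.

```python
def cities_to_eliminate(student_number, n_cities=12):
    """Return the two City-No values to drop from the western sample."""
    fourth_digit = int(str(student_number)[3])
    first = fourth_digit if fourth_digit != 0 else 10
    # "The third city after that", wrapping around past the last city.
    second = (first + 3 - 1) % n_cities + 1
    return first, second

print(cities_to_eliminate("976500"))  # (5, 8): Riverside and San Francisco
```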
a. Compute a (mean and) standard deviation for your personalized second sample. Show your
work! (2)
b. Test to see if there is a significant difference between the mean home prices in the eastern and
western US. You may assume that the samples come from Normal populations with equal variances, though
there are 2 points extra credit if you do not assume equal variances. You may use a test ratio, critical value
or a confidence interval (4 points) or all three of these (6 points – assuming that you get the same
conclusion for all of them).
c. Test the variances to find out if you were or would have been justified to assume equality of
variances. Were you? (2)
d. (Extra Credit) Use a Lilliefors test to see if the Western data is Normally distributed. (2)
e. (Extra Credit) Assume that the data is not normally distributed and test to see if there is a
significant difference between the medians. (3)