252 1takehome081 2/22/08
ECO252 QBA2
FIRST EXAM
February, 2008
TAKE HOME SECTION
Name: _________________________
Student Number and class time: _________________________
IV. Do at least 3 problems (at least 7 points each), or do sections adding to at least 20 points. Anything extra
you do helps, and grades wrap around. Show your work! State H0 and H1 where appropriate. You
have not done a hypothesis test unless you have stated your hypotheses, run the numbers and stated
your conclusion. (Use a 95% confidence level unless another level is specified.) Answers without
reasons usually are not acceptable. Neatness and clarity of explanation are expected. This must be
turned in when you take the in-class exam. Note that answers without reasons and citation of
appropriate statistical tests receive no credit. Failing to be transparent about which section of which
problem you are doing can lose you credit. Many answers require a statistical test, that is, stating or
implying a hypothesis and showing why it is true or false by citing a table value or a p-value. If you haven’t
done it lately, take a fast look at ECO 252 - Things That You Should Never Do on a Statistics Exam (or
Anywhere Else).
Problem 1: (Doane and Seward) A fast food restaurant has just started serving hot cocoa. The management
wishes to serve cocoa of an average temperature of 142 degrees. 24 measurements of the temperature in 10
stores are taken. You are manager of store a and will use the corresponding column, where a is the
second to last digit of your student number. (For example, Seymour Butz’ student number is 543987 so he
uses column x8.) If that number is zero, use column 10. You are testing to see if the mean for your store is
142. There will be a penalty if you do not make it clear what column you are using.
Row   x1   x2   x3   x4   x5   x6   x7   x8   x9  x10
  1  140  142  142  143  144  142  143  146  143  143
  2  142  143  143  138  139  142  144  145  144  139
  3  141  141  141  140  144  142  144  144  141  145
  4  142  142  142  140  143  140  145  144  144  144
  5  141  139  142  142  141  141  146  145  142  145
  6  141  142  140  139  142  141  142  144  140  142
  7  145  144  143  142  141  137  145  141  142  141
  8  142  145  142  139  138  142  142  141  141  140
  9  142  143  141  145  144  139  145  144  146  142
 10  142  143  136  139  145  141  143  144  140  142
 11  137  141  142  142  139  141  142  139  143  142
 12  139  139  138  138  141  143  142  142  144  146
 13  139  143  142  142  145  141  141  141  142  142
 14  144  144  139  141  142  142  147  142  143  143
 15  140  140  140  142  144  140  141  144  142  142
 16  141  138  143  141  145  142  137  145  141  140
 17  140  141  139  141  142  142  142  139  142  144
 18  140  140  139  140  144  142  140  142  144  143
 19  140  142  139  142  136  139  144  143  144  141
 20  139  140  141  141  138  142  142  146  145  144
 21  146  138  143  143  141  143  147  142  145  143
 22  138  139  141  141  142  143  146  144  141  141
 23  139  142  140  140  140  142  140  144  143  139
 24  142  141  143  140  141  140  143  146  142  142
Assume that the Normal distribution applies to the data and use a 98% confidence level.
a. Find the sample mean and sample standard deviation of the temperatures in your data, showing
your work. (1) (Your mean should be between 140 and 146 and your sample standard deviation
should be around 2.)
b. State your null and alternative hypotheses (1)
c. Test the hypothesis using a test ratio (1)
d. Test the hypothesis using a critical value for a sample mean. (1)
e. Test the hypothesis using a confidence interval (1)
f. Find an approximate p-value for the null hypothesis. (1)
g. On the basis of your tests, is the mean temperature correct in your restaurant? Why? (1)
h. How do your conclusions change if the random sample of 24 temperatures is taken on a day on
which only 48 cups of cocoa are sold? (2)
i. Assume that the Normal distribution does not apply and test to see if the median is 142. Be
careful! What should you do with numbers that are exactly 142? (2)
[12]
j. (Extra Credit) Do a 98% confidence interval for the median. (2)
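The arithmetic behind parts a) through e) and part i) can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses column x1, the function names `t_test_pieces` and `sign_test_p_value` are made up for this sketch, the t-table value is typed in by hand rather than computed, and dropping observations equal to the hypothesized median is one common convention for the sign test, not necessarily the treatment expected here.

```python
import math
import statistics

# Column x1 from the Problem 1 table, used purely as an illustration.
x1 = [140, 142, 141, 142, 141, 141, 145, 142, 142, 142, 137, 139,
      139, 144, 140, 141, 140, 140, 140, 139, 146, 138, 139, 142]


def t_test_pieces(data, mu0, t_table):
    """Ingredients of a two-sided test of H0: mu = mu0.

    t_table is the two-sided critical value read from a t table for
    n - 1 degrees of freedom. It is an input here, not something computed.
    """
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)          # sample sd, divisor n - 1
    se = sd / math.sqrt(n)               # standard error of the mean
    return {
        "mean": mean,
        "sd": sd,
        "t_ratio": (mean - mu0) / se,
        "critical_means": (mu0 - t_table * se, mu0 + t_table * se),
        "confidence_interval": (mean - t_table * se, mean + t_table * se),
    }


def sign_test_p_value(data, median0):
    """Two-sided exact sign-test p-value.

    Assumption: values equal to median0 are dropped before counting.
    """
    above = sum(1 for v in data if v > median0)
    below = sum(1 for v in data if v < median0)
    n = above + below
    k = min(above, below)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# 2.500 is the table value assumed for 23 degrees of freedom with 0.01 in
# each tail (98% confidence); check it against your own t table.
pieces = t_test_pieces(x1, mu0=142, t_table=2.500)
for name, value in pieces.items():
    print(name, value)
print("sign test p-value:", sign_test_p_value(x1, 142))
```

The test ratio, the critical values for the sample mean and the confidence interval all use the same standard error, so the three methods should lead to the same decision.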
Problem 2: Once again assume that the Normal distribution applies to the data in Problem 1, but that we
know that the population standard deviation is 2. Our confidence level remains 98%, but we are now testing
the hypothesis that the mean is below 143 degrees.
a. State your null and your alternative hypotheses. (1)
b. Find the value of z that you need for a critical value for a 1-sided test if the confidence level is
98%. You may use a confidence level of 99% if you wish for slightly less credit.
c. Find a critical value for the sample mean to test if the mean is below 143 degrees. (1)
d. Test the hypothesis that the mean is below 143 degrees using an appropriate confidence
interval. (2)
e. Using your critical value from 2b, create a power curve for your test. (6)
f. Assume that the population standard deviation is 2. How large a sample do you need to get a
two-sided 98% confidence interval with an error not exceeding 0.5 degrees? (2)
[22]
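A minimal sketch of the calculations in Problem 2, assuming a sample size of 24, a known population standard deviation of 2, and a left-tailed test with H1: mean below 143. The function names are hypothetical, and `statistics.NormalDist` from the Python standard library stands in for the Normal table.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_upper(tail_area):
    """z value with the given area in the upper tail."""
    return STD_NORMAL.inv_cdf(1.0 - tail_area)


def critical_mean_left(mu0, sigma, n, alpha):
    """Critical value for the sample mean when H1 is mu < mu0."""
    return mu0 - z_upper(alpha) * sigma / math.sqrt(n)


def power_left(mu1, mu0, sigma, n, alpha):
    """Probability of rejecting H0 when the true mean is mu1."""
    crit = critical_mean_left(mu0, sigma, n, alpha)
    return STD_NORMAL.cdf((crit - mu1) / (sigma / math.sqrt(n)))


def sample_size(sigma, max_error, confidence):
    """Smallest n whose two-sided interval has half-width <= max_error."""
    z = z_upper((1.0 - confidence) / 2.0)
    return math.ceil((z * sigma / max_error) ** 2)


crit = critical_mean_left(mu0=143, sigma=2, n=24, alpha=0.02)
print("critical value for the sample mean:", round(crit, 4))
# Points for a power curve: the null value, the critical value,
# a point between them, and points further into the alternative.
for mu1 in (143, 142.5, crit, 142, 141.5):
    print(f"true mean {mu1:8.3f}  power {power_left(mu1, 143, 2, 24, 0.02):.4f}")
print("n needed:", sample_size(sigma=2, max_error=0.5, confidence=0.98))
```

Two checks are built into any such curve: the power at the null value equals the significance level, and the power at the critical value is one half.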
Problem 3: According to Doane and Seward about 13% of goods bought at a department store are
returned. An organization called Return Exchange will sell you a software product called Verify-1, for which
it makes the claims below.
Verify-1® is quickly operational. And it authorizes returns even quicker
Verify-1® identifies fraud and abuse at the point of return before they become liabilities to your brand equity or profits. In
stand-alone mode, this easy-to-use, turnkey solution can be operational in 30 days and will reduce your return rate
immediately, without disrupting your business or IT configuration. Verify-1® also integrates easily into your existing POS
platform.
You set the policy, Verify-1® enforces it
With Verify-1®, your returns are dealt with consistently utilizing advanced statistical modeling in combination with state
return laws and your existing return policies. At the point of return, using the customer’s driver’s license or other valid
identification, Verify-1® automatically checks prior return behavior and authorizes or declines the transaction. Customers
identified as risks for presenting fraudulent returns are declined, while legitimate returns are speedily accepted.
You take a sample of n items and find that there were x returns (about 9%). You are the manager of store
a, where a is the last digit of your student number. (For example, Seymour Butz’ student number is 543987, so
he manages store 7.) The sample size and number of returns for your store are given below. On the basis of
this sample, can you say that the return rate is now below 13%? Use a confidence level of 95%.
Store    n    x
    1  275   25
    2  250   22
    3  225   20
    4  200   18
    5  175   16
    6  150   13
    7  125   11
    8  100    9
    9   75    7
   10   50    4
a) State your null and alternative hypotheses. (1) Make sure I know which store you manage.
b) Test the hypothesis using a test ratio or a critical value for the observed proportion. (1) Make a diagram
showing clearly where your ‘reject’ region is. (Do not round excessively. If you compute proportions carry
at least 3 significant figures.)
c) Find a p-value for your null hypothesis. (1)
d) Test your hypothesis using an appropriate confidence interval. (2) [5]
e) Using the 13% proportion as an estimate of the true proportion, find out how large a sample you need to
create a 95% confidence interval with an error of no more than 1% (2)
f) (Extra credit) Remember that the method that you have been using to deal with proportions substitutes
the Normal distribution for the binomial distribution. In general the p-values that you have computed are
lower than you would get if you used the binomial distribution. Verify this by making a continuity
correction as described in the outline and repeating your test in c). (2)
g) (Extra credit) Using 13%, your critical value, a point between your critical value and 13% and one or
two other points on the side of the critical value implied by the alternative hypothesis (only one point on
this side may give a reasonable value for a proportion) put together a power curve for your test. Remember
that your standard error will change if the true proportion changes. (8)
h) Go back to the test in parts a), b) and c) of this problem. Take your values of n and x and multiply them
by 1.6, rounding your values to the nearest whole number (or numbers) if necessary. Find the new value of
the test ratio and get a p-value. What does the change in p-value between parts c) and h) suggest about the
effect of increased sample size on the power of the test? (3) [32]
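The test ratio, p-value and sample-size calculations in Problem 3 can be sketched as below. This is an illustration only: it uses store 1 (n = 275, x = 25), the function names are hypothetical, it uses the Normal approximation without a continuity correction, and `statistics.NormalDist` replaces the Normal table.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def proportion_test_left(x, n, p0):
    """Observed proportion, test ratio and p-value for H1: p < p0."""
    p_bar = x / n
    se0 = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_bar - p0) / se0
    return p_bar, z, STD_NORMAL.cdf(z)   # left-tail area is the p-value


def proportion_sample_size(p, max_error, confidence):
    """Smallest n giving a two-sided interval half-width <= max_error."""
    z = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(p * (1 - p) * (z / max_error) ** 2)


# Store 1 from the table, then the same store with n and x scaled by 1.6.
for x, n in ((25, 275), (round(25 * 1.6), round(275 * 1.6))):
    p_bar, z, p_value = proportion_test_left(x, n, p0=0.13)
    print(f"n={n} x={x} p_bar={p_bar:.4f} z={z:.4f} p-value={p_value:.4f}")

print("n needed:", proportion_sample_size(0.13, 0.01, 0.95))
```

Scaling n and x together leaves the observed proportion unchanged, so any change in the test ratio comes entirely from the smaller standard error.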
Problem 4: According to Doane and Seward both the mean and the standard deviation of pH (a measure
of acidity) are of interest to winemakers. Assume that your firm (store from the last problem) has gotten
into the wine business. A sample of 16 wine bottles is taken. Your column has the same number as your
store. Minitab has calculated all sorts of sample statistics on your data. These are listed below. Use them.
Row    C1    C2    C3    C4    C5    C6    C7    C8    C9   C10
  1  3.41  3.44  3.61  3.39  3.41  3.43  3.40  3.56  3.53  3.17
  2  3.45  3.42  3.59  3.37  3.39  3.41  3.38  3.53  3.56  3.21
  3  3.51  3.45  3.63  3.41  3.43  3.45  3.42  3.59  3.63  3.27
  4  3.52  3.48  3.65  3.44  3.46  3.47  3.45  3.63  3.65  3.28
  5  3.68  3.68  3.87  3.66  3.69  3.69  3.68  3.95  3.82  3.44
  6  3.29  3.45  3.62  3.41  3.43  3.44  3.42  3.58  3.39  3.05
  7  3.39  3.42  3.59  3.37  3.39  3.41  3.38  3.53  3.50  3.15
  8  3.57  3.50  3.67  3.45  3.48  3.49  3.47  3.65  3.70  3.33
  9  3.38  3.41  3.58  3.36  3.38  3.40  3.37  3.52  3.49  3.14
 10  3.14  3.36  3.52  3.30  3.32  3.34  3.31  3.43  3.23  2.90
 11  3.61  3.69  3.87  3.66  3.70  3.70  3.68  3.95  3.75  3.37
 12  3.23  3.40  3.57  3.35  3.37  3.39  3.36  3.51  3.32  2.99
 13  3.48  3.48  3.66  3.44  3.46  3.48  3.45  3.63  3.59  3.24
 14  3.39  3.48  3.65  3.43  3.45  3.47  3.44  3.62  3.51  3.15
 15  3.49  3.45  3.62  3.40  3.42  3.44  3.41  3.57  3.61  3.25
 16  3.50  3.63  3.81  3.60  3.63  3.64  3.62  3.87  3.62  3.26

Variable   N  N*    Mean  SE Mean   StDev  Minimum      Q1  Median      Q3  Maximum
C1        16   0  3.4400   0.0347  0.1387   3.1400  3.3825  3.4650  3.5175   3.6800
C2        16   0  3.4837   0.0245  0.0980   3.3600  3.4200  3.4500  3.4950   3.6900
C3        16   0  3.6569   0.0259  0.1037   3.5200  3.5900  3.6250  3.6675   3.8700
C4        16   0  3.4400   0.0268  0.1072   3.3000  3.3700  3.4100  3.4475   3.6600
C5        16   0  3.4631   0.0281  0.1124   3.3200  3.3900  3.4300  3.4750   3.7000
C6        16   0  3.4781   0.0265  0.1061   3.3400  3.4100  3.4450  3.4875   3.7000
C7        16   0  3.4525   0.0278  0.1110   3.3100  3.3800  3.4200  3.4650   3.6800
C8        16   0  3.6325   0.0388  0.1553   3.4300  3.5300  3.5850  3.6450   3.9500
C9        16   0  3.5562   0.0382  0.1528   3.2300  3.4925  3.5750  3.6450   3.8200
C10       16   0  3.2000   0.0347  0.1387   2.9000  3.1425  3.2250  3.2775   3.4400
You must state H0 and H1 where applicable to get credit for any of the tests below. Make sure that I
know which column you are using!
a) The acceptable standard deviation for wine pH is 0.10. Using the data for your store, test the hypothesis
that the standard deviation is 0.10 using a 95% confidence level. (2)
b) Test the hypothesis that the standard deviation is below 0.14. (1)
c) Repeat a) and b) using the sample (mean and) variance you used in a) and b) but assuming a sample size
of 100. Find p-values. (4)
d) Find a 2-sided 95% confidence interval for the standard deviation using data from your store and assuming
a sample size of 16. (2)
e) Repeat d) for a sample size of 100. (1)
[41]
f) Here’s the easiest question on the exam. By now you should have figured out that you don’t have to
understand a statistical test at all if you know i) what it assumes, ii) what the null hypothesis is and iii) what
the p-value is associated with the null hypothesis. So, I am going to do a test that the standard deviation is
0.1 on the following data set.
C11
3.53
3.49
3.51
3.57
3.54
3.57
3.57
3.54
3.78
3.72
3.54
3.51
3.59
3.50
3.44
3.78
Then I am going to run a Lilliefors test on these data using Minitab. The null hypothesis of the Lilliefors
test is that the sample comes from the Normal distribution. The test makes no assumptions about the mean
and standard deviation of the population and computes these as sample statistics from the data. After it
printed ‘Probability plot of C11,’ the computer printed a graph of the data, but the only thing I looked at
was the p-value which was less than .01. After the Lilliefors test, the computer printed out the results of
two versions of a statistical test on the standard deviation. The ‘Standard’ version is the method that you
learned and is only applicable if the data comes from a Normal distribution. The ‘Adjusted’ version is for
all other cases. So explain what p-value I look at and what it tells me.
MTB > NormTest c11;
SUBC>   KSTest.
Probability Plot of C11
MTB > OneVariance c11;
SUBC>   Test .1;
SUBC>   Confidence 95.0;
SUBC>   Alternative 0;
SUBC>   StDeviation.
Test and CI for One Standard Deviation: C11

Method
Null hypothesis         Sigma = 0.1
Alternative hypothesis  Sigma not = 0.1
The standard method is only for the normal distribution.
The adjusted method is for any continuous distribution.

Statistics
Variable   N  StDev  Variance
C11       16  0.100    0.0100

95% Confidence Intervals
                                         CI for
Variable  Method    CI for StDev        Variance
C11       Standard  (0.074, 0.155)  (0.0055, 0.0240)
          Adjusted  (0.071, 0.170)  (0.0050, 0.0288)

Tests
Variable  Method    Chi-Square     DF  P-Value
C11       Standard       15.06  15.00    0.895
          Adjusted       11.12  11.07    0.880
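As a check on how the 'Standard' rows of the Minitab output above arise, the chi-square statistic and the interval for the standard deviation can be recomputed from the C11 data. This is a sketch only: the function names are hypothetical, and the two chi-square table values for 15 degrees of freedom are typed in by hand as assumptions rather than computed. It does not reproduce the 'Adjusted' method.

```python
import math
import statistics

# The C11 data listed in part f).
c11 = [3.53, 3.49, 3.51, 3.57, 3.54, 3.57, 3.57, 3.54,
       3.78, 3.72, 3.54, 3.51, 3.59, 3.50, 3.44, 3.78]


def chi_square_for_sigma(data, sigma0):
    """Test statistic (n - 1) s^2 / sigma0^2 for H0: sigma = sigma0."""
    n = len(data)
    return (n - 1) * statistics.variance(data) / sigma0 ** 2


def sigma_interval(data, chi2_upper, chi2_lower):
    """Two-sided interval for sigma under the Normal assumption.

    chi2_upper and chi2_lower are table values for n - 1 degrees of
    freedom; they are inputs, not computed here.
    """
    n = len(data)
    ss = (n - 1) * statistics.variance(data)   # sum of squared deviations
    return math.sqrt(ss / chi2_upper), math.sqrt(ss / chi2_lower)


chi2 = chi_square_for_sigma(c11, sigma0=0.1)
# Assumed table values for 15 df: 27.4884 (0.025 in the upper tail) and
# 6.2621 (0.975 in the upper tail); check them against your own table.
low, high = sigma_interval(c11, chi2_upper=27.4884, chi2_lower=6.2621)
print(f"chi-square = {chi2:.2f}")
print(f"95% interval for sigma = ({low:.3f}, {high:.3f})")
```

The same two functions apply to parts a) and d) of this problem once the sample standard deviation for your own column replaces the C11 data.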