    6

advertisement
252probl
10/19/07
1
ECO 252 PROBLEMS
A1.
Let $x \sim N(5, 6)$
Find:
a. Px  3
b. Px  5
c. Px  6
d. Px  8
e. Px  2
f. P3  x  6
g. P6  x  8
h. P1  x  3
i. P3  x  27 
j. A symmetrical region about the mean with 60% of the probability
k. The 30th percentile of the distribution
l. The 70th percentile
m. $x_{.05}$
n. $x_{.07}$
A2. If $n = 64$ and $\bar{x} = 11.50$, find 95% confidence intervals for the mean under the following circumstances:
a. $\sigma = 6.30$, $N = 3000$
b. $\sigma = 6.30$, $N = 300$
c. $s = 6.30$, $N = 3000$
d. $s = 6.30$, $N = 300$
A3. In a study of a grain market in an African country we want to figure out how large a sample we must take to
find a daily average price for a grain transaction. (Assume a standard deviation of 5 cents.)
a. We want a 99% confidence interval for the mean with an error of ±1 cent.
b. What if the error is to be ±1/2 cent?
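As a rough sketch of the sample-size calculation in A3, the usual formula $n = (z\sigma/e)^2$ can be evaluated as below. It assumes a known standard deviation and a normally distributed sample mean; the function name `sample_size` is only illustrative.

```python
import math
from statistics import NormalDist


def sample_size(sigma, error, confidence):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the error."""
    # Two-sided interval, so put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / error) ** 2)


# Standard deviation of 5 cents, 99% confidence.
n_one_cent = sample_size(sigma=5, error=1.0, confidence=0.99)
n_half_cent = sample_size(sigma=5, error=0.5, confidence=0.99)
print(n_one_cent, n_half_cent)
```

Halving the error roughly quadruples the required sample, because the error enters the formula squared.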
A4. If $s = 15$, find a 95% confidence interval for $\sigma$ if a) $n = 26$, b) $n = 99$.
A5. a. Find the confidence level for an interval for the median using binomial tables, if from a sample of 12 we
take the third observation from both ends.
b. Do the same for the 19th observation from both ends in a sample of 50.
c. Do the same for an interval using the 10th observation from both ends in a sample of 40, using the normal
approximation to the binomial distribution.
d. In part c, try to find a 95% confidence interval for the median.
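For parts a and b of A5, one way to check a value read from binomial tables is to sum the binomial probabilities directly. The sketch below assumes the number of observations below the median is Binomial(n, 0.5); the function name is hypothetical.

```python
from math import comb


def median_interval_confidence(n, k):
    """Confidence level of the interval running from the k-th smallest
    to the k-th largest of n observations, read as an interval for the median."""
    # P(X <= k - 1) for X ~ Binomial(n, 0.5): the chance that fewer than k
    # observations fall below the median, so the interval misses on that side.
    tail = sum(comb(n, i) for i in range(k)) / 2 ** n
    return 1 - 2 * tail


level_a = median_interval_confidence(12, 3)   # third observation from each end
level_b = median_interval_confidence(50, 19)  # 19th observation from each end
print(round(level_a, 4), round(level_b, 4))
```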
B1. (Leabo) A sample of married couples are asked their opinion of a new township noise ordinance. 40 couples are
asked to rate the ordinance on a 1-10 scale. Of the couples, 4 husbands and wives liked the proposal equally well, 11
husbands liked it better than their wives, 25 wives liked it better than their husbands. Our null hypothesis is that
husbands who like the proposal more than their wives are as likely as wives who like the proposal more than their
husbands.
B2. A firm claims that its median wage is $32000. The union claims that the median is lower. A random
sample of 100 employees shows that 40% are above $32000. Set this up as two hypotheses and test with a
significance level of 5%.
B3. We are testing that the median is 14. Let $x$ be the number of items above 14. From a sample of size $n = 30$,
we find $x = 25$. Use $p$ for the proportion of the population over 14 and $\bar{p}$ for the proportion of the sample over
14.
a) Test: median = 14
b) Test: median > 14
c) Test: median < 14
B4. A bank's average default rate on loans is supposedly 6 per month. In the first month there are 12 defaults.
Test the first assertion assuming a Poisson distribution. Use a two-sided test with a 5% significance level.
B5. a. I claim that x is binomially distributed with p = .01. Test this assertion using a 2-sided 5% test if there are 3
successes in 10 trials.
b. Test for a binomial distribution with p = 0.10 when n = 10 and x = 4.
c. If n = 100 and x = 9, test to see if p is at least 0.4.
d. Calls coming into a switchboard in an hour presumably have a Poisson distribution with a mean of 144. Test
this hypothesis if, in a given hour, 200 calls come in.
B6. If $\sum (x - \bar{x})^2 = \sum x^2 - n\bar{x}^2 = 40$ and the confidence level is 95%, test if it is true that the variance is 2 when
a) $n = 10$, b) $n = 20$, c) $n = 40$.
B7. Test the following using (i) a test ratio with a p-value, (ii) a confidence interval for the sample mean and (iii) a
critical value for the sample mean. $\alpha = .05$.
a. $H_0$: $\mu$  5, $H_1$: $\mu$  5 when $n = 49$, $\sigma_x = 8.40$ and $\bar{x} = 5.92$.
b. $H_0$: $\mu$  5, $H_1$: $\mu$  5 when $n = 49$, $\sigma_x = 8.40$ and $\bar{x} = 5.92$.
c. $H_0$: $\mu$  5, $H_1$: $\mu$  5 when $n = 49$, $\sigma_x = 8.40$ and $\bar{x} = 5.92$.
d. $H_0$: $\mu$  5, $H_1$: $\mu$  5 when $n = 49$, $s_x = 8.40$ and $\bar{x} = 5.92$.
e. $H_0$: $\mu$  5, $H_1$: $\mu$  5 when $n = 49$, $s_x = 8.40$ and $\bar{x} = 5.92$.
f. $H_0$: $\mu$  5, $H_1$: $\mu$  5 when $n = 49$, $s_x = 8.40$ and $\bar{x} = 5.92$.
B8. I am testing the hypothesis $H_0$: $\mu$  5. The result of the test is a p-value of .99. What are the p-values for
$H_0$: $\mu$  5 and $H_0$: $\mu$  5?
C1. Assume that $\sigma = 4$ and $n = 70$. Find the critical values, power function and operating characteristic curve for:
$H_0$: $\mu$  50
$H_1$: $\mu$  50
Use a significance level of 5 percent.
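A minimal sketch of how critical values, a power function and an operating characteristic curve could be tabulated for C1. It assumes a two-sided test of $\mu = 50$ (the plural "critical values" suggests this, but it is an assumption) and a known $\sigma$; the grid of alternative means is arbitrary.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 50.0, 4.0, 70, 0.05

se = sigma / sqrt(n)                     # standard error of the sample mean
z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test assumed
lower, upper = mu0 - z * se, mu0 + z * se


def power(mu1):
    """Probability that the sample mean falls in the rejection region
    when the true mean is mu1."""
    dist = NormalDist(mu1, se)
    return dist.cdf(lower) + (1 - dist.cdf(upper))


def oc(mu1):
    """Operating characteristic: probability of not rejecting H0 at mu1."""
    return 1 - power(mu1)


print(f"critical values: {lower:.3f}, {upper:.3f}")
for mu1 in (48.5, 49.0, 49.5, 50.0, 50.5, 51.0, 51.5):
    print(f"mu1 = {mu1:5.1f}  power = {power(mu1):.4f}  OC = {oc(mu1):.4f}")
```

At the null value itself the power equals the significance level, which is a quick sanity check on the critical values.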
C2. A hardware firm charges a flat rate for mailing of small tools based on an average weight of 20 oz. with a
standard deviation of 3.60 oz. A consultant challenges this assumption and a sample of 100 packages is taken. Find
critical values for a significance level of 1% and compute the power function and operating characteristic curve.
D1. A trucking company wishes to compare mileage per gallon on its current air filter with a new product. Results
are as below. See if the new filter actually gives better mileage. Assume that the underlying distribution is normal.
Use a 5% significance level.
Current Filter
8.6 6.1 11.4 7.9 6.6 8.9 6.4 6.5 6.3
New Filter
8.3 8.2 7.8 11.6 9.8 9.7 6.7 9.9 8.1
a. Assume each pair of numbers represents experience on a single truck.
b. Assume that these represent two independent random samples, but $\sigma_1 = \sigma_2$.
c. (Optional) Again assume two random samples, but that the variances are not equal.
d. Test that the mean is 8.2 for each filter.
D2. Do 2-sided and (if appropriate) 1-sided confidence intervals in D1.
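For the paired case in D1 (each pair of numbers from a single truck), the test ratio can be sketched as follows. The direction of the difference (new minus current) is a choice made here for illustration, and the result would still have to be compared with a one-tailed 5% critical value from a t table with 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

current = [8.6, 6.1, 11.4, 7.9, 6.6, 8.9, 6.4, 6.5, 6.3]
new = [8.3, 8.2, 7.8, 11.6, 9.8, 9.7, 6.7, 9.9, 8.1]

# Differences taken as new minus current, so a positive mean favors the new filter.
d = [b - a for a, b in zip(current, new)]
d_bar = mean(d)
s_d = stdev(d)                           # sample standard deviation of the differences
t_stat = d_bar / (s_d / sqrt(len(d)))    # paired t ratio
df = len(d) - 1

print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}, t = {t_stat:.3f}, df = {df}")
```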
D3. (Optional): A secretary types 16 pages on word processor 1 and 16 pages on word processor 2. Her times
are: $\bar{x}_1 = 8.2$, $s_1^2 = 4.1$, $\bar{x}_2 = 7.1$, $s_2^2 = 4.2$.
If $\Delta = \mu_1 - \mu_2$, test $\Delta = 0$ at the 90 per cent confidence level. Assume
that these are independent samples and that $\sigma_1^2$  $\sigma_2^2$. ($\alpha = .10$)
D4. (Old Minitab Manual - modified) In a study of tool life, two independent samples of wear are taken. The first of
these represents volume loss in millionths of a cubic inch from 10 untreated tools. The second represents loss in
the same units from 10 tools that were treated by a new wear retardant process.
Untreated
.56 .50 .69 .59 .47 .42 .45 .47 .50 .50
Treated
.13 .13 .18 .23 .18 .31 .35 .23 .31 .33
On the assumption that the parent populations are Normal, test the hypothesis that the means are equal and do a
confidence interval for the difference between the means (a) assuming that the variances are equal and (b) assuming
that the variances are not equal.
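The two versions of the D4 test ratio might be computed along these lines. This is only a sketch: the pooled-variance formula is used for (a) and the Welch-Satterthwaite degrees of freedom for (b), which is one common way to handle unequal variances and may differ from the formula in the course outline.

```python
from math import sqrt
from statistics import mean, variance

untreated = [0.56, 0.50, 0.69, 0.59, 0.47, 0.42, 0.45, 0.47, 0.50, 0.50]
treated = [0.13, 0.13, 0.18, 0.23, 0.18, 0.31, 0.35, 0.23, 0.31, 0.33]

n1, n2 = len(untreated), len(treated)
m1, m2 = mean(untreated), mean(treated)
v1, v2 = variance(untreated), variance(treated)   # sample variances

# (a) Equal variances: pool the two sample variances.
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_pooled = (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2

# (b) Unequal variances: separate variances, approximate degrees of freedom.
se_welch = sqrt(v1 / n1 + v2 / n2)
t_welch = (m1 - m2) / se_welch
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

print(f"difference = {m1 - m2:.3f}")
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
```

Because the two samples are the same size, the two test ratios coincide; only the degrees of freedom, and therefore the critical values and interval widths, differ.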
D5. a) If our sample consists of the numbers 9, 14, 16, 16, 18, 19, 22, 23, 25, 26, test the hypotheses
$H_0$: median  15
$H_1$: median  15
by computing $x - 15$ for each value of $x$ and using the magnitude and sign of the results to rank them and perform a
Wilcoxon signed rank test. ($\alpha = .05$)
b) For the following data, test the hypotheses
$H_0$: median of $x_1$  median of $x_2$
$H_1$: median of $x_1$  median of $x_2$
on the following paired samples
$x_1$: 09 14 16 16 18 19 22 23 25 78
$x_2$: 14 10 08 14 13 16 12 40 13 24
using a Wilcoxon signed rank test. ($\alpha = .05$)
D.6 We have the following data for returns on two stocks:
Stock A 7, 8, -5, 9, 11 nA = 5
Stock B 6, 7, 0, 4, 9, 15 nB = 6
a. Find a 95% interval for $\sigma_A^2 / \sigma_B^2$.
b. Test the following at a 95% level:
$H_0$: $\sigma_A^2$  $\sigma_B^2$
$H_1$: $\sigma_A^2$  $\sigma_B^2$
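Both parts of D.6 start from the two sample variances and their ratio, which can be sketched as below. The critical F values for the interval and the test still have to come from a table with the degrees of freedom shown.

```python
from statistics import variance

stock_a = [7, 8, -5, 9, 11]
stock_b = [6, 7, 0, 4, 9, 15]

var_a = variance(stock_a)   # sample variance, divisor n - 1
var_b = variance(stock_b)
f_ratio = var_a / var_b
df = (len(stock_a) - 1, len(stock_b) - 1)   # (numerator, denominator)

print(f"s_A^2 = {var_a:.4f}, s_B^2 = {var_b:.4f}, ratio = {f_ratio:.4f}, df = {df}")
```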
D.7 In a study of sleep gotten with a sleeping pill and with a placebo the results were as below (Keller, Warren,
Bartel, 2nd ed. p. 354). Test for a difference between means and medians as appropriate.
x1 (Pill)    x2 (Placebo)    d (difference)
7.3          6.8             .5
8.5          7.9             .6
6.4          6.0             .4
9.0          8.4             .6
6.9          6.5             .4
$\bar{x}_1 = 7.620$, $\bar{x}_2 = 7.120$, $\bar{d} = 0.500$
$s_1^2 = 1.197$, $s_2^2 = 0.997$, $s_d^2 = 0.010$
a. Assume that these are independent samples from a normal distribution and that $\sigma_1^2 = \sigma_2^2$ (test if $\sigma_1^2 = \sigma_2^2$).
b. Assume that these are independent samples and that $\sigma_1^2 \ne \sigma_2^2$.
c. Assume these are paired samples.
In each case do (i) a 99% confidence interval for $\Delta = \mu_1 - \mu_2$, (ii) test if $\mu_1 = \mu_2$. (iii) In case a test
if $\sigma_1^2 = \sigma_2^2$.
d. Redo part a(ii) assuming that the parent population is not normal.
e. Redo part c(ii) assuming that the parent distribution is not normal.
D8. (2001 Graded Assignment 3) In your outline there are 6 methods to compare means or medians, methods D1,
D2, D3, D4, D5a and D5b. Method D6 compares proportions and method D7 compares variances or standard
deviations. In the following cases, identify $H_0$ and $H_1$ and identify which method to use. If the hypotheses involve
a mean, state the hypotheses in terms of both $\mu$ and $\Delta = \mu_1 - \mu_2$. If the hypotheses involve a proportion, state
them in terms of both $p$ and $\Delta p = p_1 - p_2$. If the hypotheses involve standard deviations or variances, state them in
terms of both $\sigma^2$ and $\sigma_1^2 / \sigma_2^2$ or $\sigma_2^2 / \sigma_1^2$. All the questions involve means, medians, proportions or variances.
Note: Look at 252thngs on the syllabus supplement part of the website before you start (and before you
take exams) - especially the new rules.
a. You have data on income in two villages ($x_1$ in village 1, $x_2$ in village 2). You want to test the hypothesis that
village 1 has higher earnings than village 2. You know that income has an extremely skewed distribution, and you
have to decide whether to use the mean or the median income.
b. You have a sample of earned incomes for 25 couples, both of whom are teachers. ($x_1$ is the women's incomes
in a column, $x_2$ is the men's. Each line represents one couple.) Test to see if the women make more than the men.
c. You have interviewed a sample of 80 small businesses in the Northeast and 75 small businesses in the Southeast.
Each business has indicated whether they sell in foreign markets. You want to show that businesses in the Northeast
are more likely to export. ($x_1$ is the total number of firms that export in the Northeast sample, $x_2$ in the Southeast).
d. You have profit rates, $x_1$, for a sample of 20 pharmaceutical firms in Europe and profit rates, $x_2$, for a sample
of 17 pharmaceutical firms in the US. You believe that they are normally distributed and you wish to see whether the
European firms were more profitable than the American firms.
e. In order to see which garage to use under contract for automobile repairs, 10 cars are towed first to garage 1 and
then to garage 2. You end up with two data sets, the first data column, $x_1$, is estimates from the first garage and the
second data column, $x_2$, is estimates for the second garage. Each of the 10 lines of data refers to one car. You
believe that the estimates are approximately normally distributed. Compare the estimates in garage 1 and 2.
f. You are having a part produced in two different machines. $x_1$ is 200 randomly selected data points that represent
the length of parts from machine one, $x_2$ is 200 randomly selected data points that represent the length of parts from
machine two. You want to test your suspicion that parts from machine 2 are longer than parts from machine 1. In a
problem of this type you would assume that the lengths are normally distributed.
g. You also suspect that parts from machine two are more variable in length than parts from machine one. Test this
suspicion.
E1. (Sincich) An Ernst and Young survey of 126 warehouses operated by retail stores tests the independence of the
number of deliveries to stores per week to warehouse size. Use $\alpha = .05$ for a test of independence.
Deliveries/week
1 or fewer
2-3
4-5
Below 100
5
12
9
Size (thousands of square feet)
100-249.9
250-400
13
9
11
13
14
13
Above 400
5
6
11
E2. A random sample of 64 cans of each of 3 brands of canned fruit is examined. The proportion that are not as
labeled is .1094 for brand 1, .0781 for brand 2 and .1563 for brand 3. Is the proportion the same for each brand?
$\alpha = .01$
E3. A real estate firm wants to check whether selling price is related to the number of days a home is on the market.
A random sample of 100 homes is taken and divided into three classes according to selling price. The realtor
discovers that 57% of the 30 homes in the under $100,000 class were on the market for 60 days or fewer. 38% of the
50 homes in the $100,000 - $200,000 class were on the market for 60 days or fewer. Finally, in the above $200,000
class, 35% of 20 homes were on the market for 60 days or fewer.
a. Do a test of the equality of proportions for the $100,000-$200,000 class and the above $200,000 class.
Repeat this test as a chi-squared test.
b. Do a test of equality of proportions for all three classes.
E4. Check to see if the following 1000 tax payments come from the distribution N(25000, 10000).
Amount in thousands    Number
Below 10               40
10-15                  60
15-20                  170
20-25                  140
25-30                  180
30-35                  210
35-40                  80
Above 40               120
E5. See if Frunzi earthquakes fit a Poisson distribution with parameter of 1.
Earthquakes Per Day    Number of Days Observed
0                      25
1                      17
2                      5
3                      2
4 or more              1
E6. Check to see if the earthquakes in Frunzi fit a Poisson Distribution.
Earthquakes Per Day    Number of Days Observed
0                      40
1                      45
2                      7
3                      4
4                      2
5                      0
6                      1
7                      1
E7. Redo E5 using the Kolmogorov-Smirnov method
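A sketch of the Kolmogorov-Smirnov calculation asked for in E7, using the E5 counts and a Poisson distribution with parameter 1. The statistic is the largest gap between the observed and expected cumulative distributions; the critical value must still be taken from a K-S table (for larger samples a commonly quoted 5% approximation is $1.36/\sqrt{n}$).

```python
from math import exp, factorial

observed = [25, 17, 5, 2, 1]   # days with 0, 1, 2, 3 and "4 or more" earthquakes
n = sum(observed)
lam = 1.0                      # hypothesized Poisson parameter

# Poisson probabilities for 0..3, with the remainder assigned to "4 or more".
probs = [exp(-lam) * lam ** k / factorial(k) for k in range(4)]
probs.append(1 - sum(probs))

cum_obs, cum_exp = [], []
running_obs = running_exp = 0.0
for count, p in zip(observed, probs):
    running_obs += count / n
    running_exp += p
    cum_obs.append(running_obs)
    cum_exp.append(running_exp)

d_max = max(abs(o - e) for o, e in zip(cum_obs, cum_exp))
print(f"n = {n}, maximum difference = {d_max:.4f}")
```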
E8. Using data from E6 check for a Poisson Distribution with parameter of 0.9.
E9. Is the following data normal?
420, 440, 445, 450, 460, 475, 480, 500, 520, 530
E10. Consider the following data:
65, 67, 69, 70, 73, 75
a. Test to see if this data is $N(70, 3.5)$
b. Test to see if this data is normal.
E11. This is the problem I used to use to introduce Kolmogorov-Smirnov. I stopped because everyone seemed to
assume that all K-S tests were tests of uniformity.
Five different formulas are used for a new cola. Ten tasters are asked to sample all five formulas and to
indicate which one they preferred. The sponsors of the test assume that equal numbers will prefer each formula - this
is the assumption of uniformity. Instead none prefer Formulas 1 and 3; 1 person prefers Formula 2; 5 people prefer
formula 4 and 4 people prefer Formula 5. Test the responses for uniformity.
F1. In a 2-way analysis of variance there are 5 rows, 10 columns and 5 observations per cell.
a. $SST = 1000$, $SSW = 100$, $SSR = 200$, $SSC = 300$
b. $SST = 272.5$, $SSW = 100$, $SSR = 6$, $SSC = 22.5$
Complete the ANOVA using a 1 percent significance level. State the hypotheses you test.
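The arithmetic of the F1 tables could be sketched as follows. It assumes the model includes a row-by-column interaction whose sum of squares is whatever remains after SSR, SSC and SSW are removed from SST; the function name and dictionary layout are illustrative only, and the F ratios still need 1 percent critical values from a table.

```python
def two_way_anova(rows, cols, per_cell, sst, ssr, ssc, ssw):
    """Return {source: (SS, df, MS, F)} for rows, columns and interaction."""
    ssi = sst - ssr - ssc - ssw   # interaction SS taken as the remainder (assumed)
    ss = {"rows": ssr, "columns": ssc, "interaction": ssi}
    df = {
        "rows": rows - 1,
        "columns": cols - 1,
        "interaction": (rows - 1) * (cols - 1),
    }
    df_within = rows * cols * (per_cell - 1)
    msw = ssw / df_within
    table = {}
    for source in ("rows", "columns", "interaction"):
        ms = ss[source] / df[source]
        table[source] = (ss[source], df[source], ms, ms / msw)
    table["within"] = (ssw, df_within, msw, None)
    return table


table_a = two_way_anova(5, 10, 5, sst=1000, ssr=200, ssc=300, ssw=100)
table_b = two_way_anova(5, 10, 5, sst=272.5, ssr=6, ssc=22.5, ssw=100)
for name, table in (("a", table_a), ("b", table_b)):
    for source, (ss_val, df_val, ms, f) in table.items():
        f_text = "" if f is None else f"F = {f:.3f}"
        print(name, source, ss_val, df_val, round(ms, 4), f_text)
```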
F2. 5 workers are trained to use 3 different data management systems. Management wishes to know if there is a
significant difference between the amount of time it takes to train for each method. Do a 2-way analysis of variance
using systems as treatments and workers as blocks. Provide estimates of the differences between means that are, first,
individually valid and, second, simultaneously valid. ($\alpha = .05$)
Worker    System 1    System 2    System 3
1         17          17          25
2         20          18          23
3         15          14          20
4         14          13          19
5         19          18          23
F3. 48 measurements describe the time it took a group of truckers to get from their terminal to a destination. The
trip times were characterized by driver’s experience (Factor A – 2 levels), route (Factor B – 3 levels) and season
(Factor C – 2 levels). For each combination of factors there are 4 measurements. Set up the ‘degrees of freedom’
column of an ANOVA table showing all interactions.
F4. A computation job is done by 4 students on 4 computers. Times are below. Does the computer make a
difference? Handle this as a 2-way ANOVA.
Computer
Student    1     2     3     4
1          20    18    16    10
2          15    12    9     8
3          25    20    18    10
4          40    35    30    29
J1. Use the following data
Row    y      x1    x2
1      1.0    0     0
2      2.7    1     1
3      3.8    2     4
4      4.5    3     9
5      5.0    4     16
6      5.3    5     25
7      5.2    6     36
a) Do the regression of $y$ against $x_1$ and $x_2$. Compute b) $R^2$ and c) $s_e$. d) Do the ANOVA and, e) following the
formulas in the outline, try to find approximate confidence and prediction intervals when $x_1 = 5$ and $x_2 = 25$. f) You
may also run this on Minitab using c1, c2 and c3 with the command
Regress c1 on 2 c2 c3
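As a cross-check on hand or Minitab results for parts a) to c) of J1, the coefficients can be obtained by solving the normal equations directly. This is a minimal sketch using only the standard library; the small `solve` helper is written here for illustration and is not part of any package.

```python
y = [1.0, 2.7, 3.8, 4.5, 5.0, 5.3, 5.2]
x1 = [0, 1, 2, 3, 4, 5, 6]
x2 = [0, 1, 4, 9, 16, 25, 36]
n, k = len(y), 2   # k counts the independent variables, excluding the constant

# Design matrix with a column of ones for the intercept.
X = [[1.0, a, b] for a, b in zip(x1, x2)]


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]


# Normal equations (X'X) b = X'y.
xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
b0, b1, b2 = solve(xtx, xty)

fitted = [b0 + b1 * a + b2 * c for a, c in zip(x1, x2)]
y_bar = sum(y) / n
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r_squared = 1 - sse / sst
s_e = (sse / (n - k - 1)) ** 0.5   # standard error of the estimate

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
print(f"R^2 = {r_squared:.4f}, s_e = {s_e:.4f}")
```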
J2. $n = 80$, $k = 3$, $R^2 = .95$
$n = 80$, $k = 4$, $R^2 = .99$
Use an F test to show if the second regression is an improvement.
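The F ratio for J2 can be computed from the two $R^2$ values alone. The sketch assumes that $k$ counts independent variables (not the constant) and that the second regression simply adds one variable to the first; the function name is hypothetical.

```python
def f_for_added_variables(n, k_small, r2_small, k_large, r2_large):
    """F ratio for the improvement in R^2 when variables are added."""
    added = k_large - k_small
    df_error = n - k_large - 1
    numerator = (r2_large - r2_small) / added
    denominator = (1 - r2_large) / df_error
    return numerator / denominator, (added, df_error)


f_stat, df = f_for_added_variables(80, 3, 0.95, 4, 0.99)
print(f"F = {f_stat:.2f} with df = {df}")
```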
K1. (Levin and Rubin) ($\alpha = .05$)
a. Men and women are admitted to a training program in the following order. Is it random?
MWWMMMMWWWWMMWMWWM.
b. What about WMWMWMWMWMWMWMWWMM?
c. A professor hypothesizes that the more able students will tend to turn their exams in either earlier or later
than the majority and that less able students tend to be more apt to turn exams in after an average amount of
time. If we only count grades above 89 as high, test the following sequence for randomness.
94 70 85 89 92 98 63 88 74 85
69 90 57 86 79 72 80 93 66 74
50 55 47 59 68 63 89 51 90 98
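For parts a and b of K1, counting runs is the error-prone step, so a short sketch may help. It also computes the usual large-sample z ratio; with only 9 of each symbol a runs table may be the more appropriate reference, so the z value is shown only as an approximation.

```python
from math import sqrt


def runs_test(sequence):
    """Count runs in a two-symbol sequence and return the normal-approximation z."""
    symbols = sorted(set(sequence))
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    n = n1 + n2
    # A new run starts wherever a symbol differs from the one before it.
    runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    mean_runs = 2 * n1 * n2 / n + 1
    var_runs = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - mean_runs) / sqrt(var_runs)
    return runs, n1, n2, mean_runs, z


result_a = runs_test("MWWMMMMWWWWMMWMWWM")
result_b = runs_test("WMWMWMWMWMWMWMWWMM")
print(result_a)
print(result_b)
```

Too few runs suggests clustering and too many suggests alternation; part b has far more runs than the expected 10.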
L1. Assume that for n = 49 r = .24. Test for
a. Correlation of zero
b. Correlation of 0.3
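The two tests in L1 use different statistics, which the following sketch illustrates. It assumes the standard t ratio for a zero correlation and the Fisher z transformation for a nonzero hypothesized value; the outline may present the second test differently.

```python
from math import atanh, sqrt

n, r = 49, 0.24

# a. Correlation of zero: t ratio with n - 2 degrees of freedom.
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)

# b. Correlation of 0.3: Fisher z transformation, approximately standard normal.
rho0 = 0.3
z_stat = (atanh(r) - atanh(rho0)) * sqrt(n - 3)

print(f"t = {t_stat:.3f} (df = {n - 2}), z = {z_stat:.3f}")
```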
L2. The following are rankings given by 3 judges
Swimmer    Judge A    Judge B    Judge C
1          2          1          2
2          1          2          1
3          3          3          4
4          4          4          3
5          5          5          5
Is there significant agreement?
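Agreement among several judges is often measured with Kendall's coefficient of concordance, sketched below for the L2 rankings. Treating this as the intended method is an assumption, and with only 5 swimmers the chi-squared approximation shown is rough, so a table of critical values for W may be preferable.

```python
# Ranks given to swimmers 1..5 by each judge.
rankings = {
    "A": [2, 1, 3, 4, 5],
    "B": [1, 2, 3, 4, 5],
    "C": [2, 1, 4, 3, 5],
}
k = len(rankings)   # number of judges
n = 5               # number of swimmers

rank_sums = [sum(col) for col in zip(*rankings.values())]
mean_sum = k * (n + 1) / 2
s = sum((r - mean_sum) ** 2 for r in rank_sums)

w = 12 * s / (k ** 2 * (n ** 3 - n))   # Kendall's W, between 0 and 1
chi2 = k * (n - 1) * w                 # approximate chi-squared, n - 1 df

print(rank_sums, f"W = {w:.4f}", f"chi2 = {chi2:.3f}")
```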
L3. In order to validate an aptitude test, a random sample of 15 salespersons is selected by an agency and their scores
on the test are compared with their sales during their first year. Scores are as follows:
Row    Score    Sales
1      71.0     225
2      87.5     244
3      69.0     218
4      86.0     246
5      70.0     205
6      84.0     243
7      88.0     249
8      92.0     251
9      97.0     250
10     95.0     250
11     85.0     245
12     81.0     238
13     87.0     248
14     82.0     234
15     79.0     237
The correlation is .911, but the statistician believes that a rank correlation is more appropriate. Calculate a rank
correlation, and test it for significance. Try to explain why the rank correlation is higher than the correlation.
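A sketch of the rank correlation calculation for L3. Tied values receive the average of the ranks they occupy, and the familiar $1 - 6\sum d^2 / (n(n^2 - 1))$ formula is used, which is only approximate when ties are present; the helper `average_ranks` is written here for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank from smallest (1) to largest, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1   # average of positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


score = [71.0, 87.5, 69.0, 86.0, 70.0, 84.0, 88.0, 92.0,
         97.0, 95.0, 85.0, 81.0, 87.0, 82.0, 79.0]
sales = [225, 244, 218, 246, 205, 243, 249, 251,
         250, 250, 245, 238, 248, 234, 237]

n = len(score)
d2 = sum((a - b) ** 2
         for a, b in zip(average_ranks(score), average_ranks(sales)))
r_s = 1 - 6 * d2 / (n * (n ** 2 - 1))
t_stat = r_s * sqrt((n - 2) / (1 - r_s ** 2))   # one common large-sample test ratio

print(f"sum d^2 = {d2}, r_s = {r_s:.4f}, t = {t_stat:.2f}")
```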