252grass3-051 3/24/05 (Open this document in 'Page Layout' view!)
Name:
Class days and time:
Please include this on what you hand in!
Graded Assignment 3
In your outline there are 6 methods to compare means or medians, methods D1, D2, D3, D4, D5a and D5b. Method D6 compares
proportions and method D7 compares variances or standard deviations. In all the following cases, identify $H_0$ and $H_1$ and identify
which method to use. If the hypotheses involve a mean, state the hypotheses in terms of both $\mu$ and $D = \mu_1 - \mu_2$. If the
hypotheses involve a proportion, state them in terms of both $p$ and $\Delta p = p_1 - p_2$. If the hypotheses involve standard deviations
or variances, state them in terms of both $\sigma^2$ and $\frac{\sigma_1^2}{\sigma_2^2}$ or $\frac{\sigma_2^2}{\sigma_1^2}$. All the questions involve means, medians, proportions or
variances. (Most problems are highly edited versions of problems in McClave, et al.)
Note: Look at 252thngs in the syllabus supplement before you start (and before you take exams).
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1. We have the amount spent by a sample of 25 pharmaceutical firms on research in 2002 and again for the
same firms in 2004. We assume that the underlying distributions are Normal. Was the average amount spent
in 2004 above the average amount spent in 2002?
Solution: You are comparing means before and after the passing of 2 years. You can get away with using
means because the parent distributions are Normal. If $\mu_2$ is the mean of the 2004 sample, you are testing
that $\mu_2 > \mu_1$, which, because it contains no equality, is an alternate hypothesis. So your hypotheses are
$\begin{cases} H_0: \mu_1 \ge \mu_2 \\ H_1: \mu_1 < \mu_2 \end{cases}$ or $\begin{cases} H_0: \mu_1 - \mu_2 \ge 0 \\ H_1: \mu_1 - \mu_2 < 0 \end{cases}$. If $D = \mu_1 - \mu_2$, then $\begin{cases} H_0: D \ge 0 \\ H_1: D < 0 \end{cases}$. The important thing to notice
here is that the data are in before and after pairs, so you use Method D4. Of course, if you took $\mu_2$ as the
mean of the 2002 sample, you have $\begin{cases} H_0: \mu_1 \le \mu_2 \\ H_1: \mu_1 > \mu_2 \end{cases}$ etc. Also, you might have found it more natural to
define $D$ as $\mu_2 - \mu_1$. In these cases we have $\begin{cases} H_0: D \le 0 \\ H_1: D > 0 \end{cases}$.
2. Assume that you had the same data and a similar task to Problem 1 but your preliminary analysis
indicated that the underlying distributions were highly skewed to the right.
Solution: The presence of skewness, especially with small samples, means that you should be using a
nonparametric (rank) method. There are two reasons for this. First, for small samples the sample mean is not
reliably Normally distributed, and, second, even for a large sample, the median of a skewed
distribution is more 'typical' than the mean. If $\eta$ is the median, $\begin{cases} H_0: \eta_1 \ge \eta_2 \\ H_1: \eta_1 < \eta_2 \end{cases}$. Since we are comparing
medians and the data are paired, use Method D5b.
3. You have the grades of a sample of 15 traditional university students and the grades of a sample of 13
non-traditional university students. Do these differ on average? Assume that preliminary analysis indicates
that the variances of grades of the two groups are similar and presume an underlying Normal distribution.
Solution: $\begin{cases} H_0: \mu_1 = \mu_2 \\ H_1: \mu_1 \ne \mu_2 \end{cases}$ or $\begin{cases} H_0: \mu_1 - \mu_2 = 0 \\ H_1: \mu_1 - \mu_2 \ne 0 \end{cases}$. If $D = \mu_1 - \mu_2$, then $\begin{cases} H_0: D = 0 \\ H_1: D \ne 0 \end{cases}$. Because you believe
that the Normal distribution applies, you use a method that compares means. The total sample size is too
small to use Method D1, which means that D2 or D3 should work. You have tested the variances and not
disproved equality so use D2.
4. Since the results of Problem 3 were inconclusive, you take new samples of 250 traditional university
students and 150 non-traditional students. Again you assume that the underlying distribution is Normal, but
you do not bother to compare variances. You are still trying to find out if grades of the two groups differ.
Solution: $\begin{cases} H_0: \mu_1 = \mu_2 \\ H_1: \mu_1 \ne \mu_2 \end{cases}$ or $\begin{cases} H_0: \mu_1 - \mu_2 = 0 \\ H_1: \mu_1 - \mu_2 \ne 0 \end{cases}$. If $D = \mu_1 - \mu_2$, then $\begin{cases} H_0: D = 0 \\ H_1: D \ne 0 \end{cases}$. Because you believe
that the Normal distribution applies, you use a method that compares means. The total sample size is large,
so this is the place where most people would use Method D1. But, if a computer is handy, there is no reason
not to use D3, nor would you expect different results from D1.
5. A recheck of the data in Problem 4 indicates that the underlying distribution is far from Normal, but you
are still trying to find out if the grades of the two groups differ.
Solution: For the second reason given in Problem 2, caution should rule out method D1. If $\eta$ is the
median, $\begin{cases} H_0: \eta_1 = \eta_2 \\ H_1: \eta_1 \ne \eta_2 \end{cases}$. Since we are comparing medians and the data are not paired, use Method D5a.
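The reasoning used in Solutions 1 through 5 can be summarized as a small decision rule. The sketch below is only an illustration of that reasoning: the function name `choose_method` and its parameters are hypothetical, and what counts as a "large" sample or "similar" variances is left to the outline, so both are passed in as plain yes/no flags.

```python
def choose_method(quantity, paired=False, normal=True,
                  large_samples=False, equal_variances=False):
    """Return the outline's method label for a two-sample comparison.

    quantity is "location" (means or medians), "proportion" or "variance".
    All names here are illustrative, not taken from the outline.
    """
    if quantity == "proportion":
        return "D6"
    if quantity == "variance":
        return "D7"
    if quantity != "location":
        raise ValueError("quantity must be 'location', 'proportion' or 'variance'")
    if not normal:
        # Skewed or otherwise non-Normal parents: compare medians with a rank method.
        return "D5b" if paired else "D5a"
    if paired:
        return "D4"      # before-and-after pairs, Normal parents
    if large_samples:
        return "D1"      # D3 would also be acceptable here
    return "D2" if equal_variances else "D3"


# The five situations discussed so far.
print(choose_method("location", paired=True))                    # Problem 1
print(choose_method("location", paired=True, normal=False))      # Problem 2
print(choose_method("location", equal_variances=True))           # Problem 3
print(choose_method("location", large_samples=True))             # Problem 4
print(choose_method("location", normal=False, large_samples=True))  # Problem 5
```

The order of the checks mirrors the solutions: the shape of the parent distribution is settled first, then pairing, then sample size, and only then the equality of variances.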
6. If 10 out of a group of 80 randomly chosen people who have received information on your product are
inclined to buy it and 12 out of a group of 81 randomly chosen people are inclined to buy your product after
receiving the same information and seeing your commercial, does the commercial increase the proportion
who will buy your product?
Solution: If group 2 are those who saw the commercial, the sample proportions are $\bar{p}_1 = \frac{x_1}{n_1}$ and $\bar{p}_2 = \frac{x_2}{n_2}$. We have asked if the commercial
increased the proportion, $p_1 < p_2$, which must be an alternate hypothesis because it does not contain an
equality. $\begin{cases} H_0: p_1 \ge p_2 \\ H_1: p_1 < p_2 \end{cases}$ or $\begin{cases} H_0: p_1 - p_2 \ge 0 \\ H_1: p_1 - p_2 < 0 \end{cases}$. If $\Delta p = p_1 - p_2$, then $\begin{cases} H_0: \Delta p \ge 0 \\ H_1: \Delta p < 0 \end{cases}$. Since we are
comparing proportions, use Method D6. Note that some people may find it more natural to define $\Delta p$ as
$p_2 - p_1$, so that we have $\begin{cases} H_0: \Delta p \le 0 \\ H_1: \Delta p > 0 \end{cases}$.
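As a rough numerical illustration of the proportion comparison in Problem 6, the sketch below computes a two-sample z statistic for the difference in proportions. It assumes the common pooled-proportion standard error; the outline's Method D6 may define the standard error differently, and the function names are hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for p1 - p2, using a pooled proportion (an assumption)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_err


def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# Group 1: information only (10 of 80). Group 2: information plus commercial (12 of 81).
z = two_proportion_z(10, 80, 12, 81)
p_value = normal_cdf(z)  # lower tail, because H1 says p1 - p2 < 0
print(round(z, 3), round(p_value, 3))
```

With these counts z is only slightly below zero, so under these assumptions the data give little support to the claim that the commercial raised the proportion.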
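7. Have new procedures decreased the variability of delivery times? Two samples have been taken and you
know $\bar{x}_1$ and $s_1$, taken before the new procedures were instituted, and $\bar{x}_2$ and $s_2$, taken afterwards. There
seems to be little difference between average delivery times before and after.
Solution: If $\sigma_1$ is the standard deviation after the new procedures are instituted (note that this reverses the
subscripts used in the problem statement), $\begin{cases} H_0: \sigma_1^2 \ge \sigma_2^2 \\ H_1: \sigma_1^2 < \sigma_2^2 \end{cases}$ or
$\begin{cases} H_0: \sigma_1 \ge \sigma_2 \\ H_1: \sigma_1 < \sigma_2 \end{cases}$. In terms of the variance ratio $\frac{\sigma_1^2}{\sigma_2^2}$ or $\frac{\sigma_2^2}{\sigma_1^2}$, the alternate hypothesis rules: choose the ratio that $H_1$ says is above 1, so $H_0: \frac{\sigma_2^2}{\sigma_1^2} \le 1$
and $H_1: \frac{\sigma_2^2}{\sigma_1^2} > 1$. Since you are comparing variances, use Method D7.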
8. A paper company wishes to know whether a new procedure has decreased the amount of time it takes to
unload trucks. What are the hypotheses this implies? Two samples are taken with the results below. How do
we decide whether to use Method D2 or D3? What hypotheses, etc do we test to make this decision?
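                  Old Method                    New Method
Sample size       $n_1 = 50$                    $n_2 = 50$
Sample mean       $\bar{x}_1 = 25.4$ minutes    $\bar{x}_2 = 27.3$ minutes
Sample std. dev.  $s_1 = 3.1$ minutes           $s_2 = 3.7$ minutes
Solution: We have been asked to decide between Methods D2 and D3. The choice depends on the equality
of variances. If $\sigma_1$ is the standard deviation after the new procedure is instituted (the labeling does not matter for this two-sided test), $\begin{cases} H_0: \sigma_1^2 = \sigma_2^2 \\ H_1: \sigma_1^2 \ne \sigma_2^2 \end{cases}$ or
$\begin{cases} H_0: \sigma_1 = \sigma_2 \\ H_1: \sigma_1 \ne \sigma_2 \end{cases}$. In terms of the variance ratio $\frac{\sigma_1^2}{\sigma_2^2}$ or $\frac{\sigma_2^2}{\sigma_1^2}$, both are equivalent in a 2-sided test, so
$\begin{cases} H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1 \\ H_1: \frac{\sigma_1^2}{\sigma_2^2} \ne 1 \end{cases}$ or $\begin{cases} H_0: \frac{\sigma_2^2}{\sigma_1^2} = 1 \\ H_1: \frac{\sigma_2^2}{\sigma_1^2} \ne 1 \end{cases}$ are both fine. In principle, you would test both $\frac{s_1^2}{s_2^2}$ against $F_{\alpha/2}^{(DF_1, DF_2)}$ and
$\frac{s_2^2}{s_1^2}$ against $F_{\alpha/2}^{(DF_2, DF_1)}$, but only one of these ratios will be above 1 and actually needs testing. Since you
are comparing variances, use Method D7.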
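A minimal sketch of the variance-ratio step for the Problem 8 data follows. It only forms the ratio that is above 1 and the matching degrees of freedom; the critical value $F_{\alpha/2}$ still has to come from an F table or statistical software. The variable names are illustrative.

```python
# Sample standard deviations and sizes, indexed by subscript as in the problem.
s1, n1 = 3.1, 50
s2, n2 = 3.7, 50

var1, var2 = s1 ** 2, s2 ** 2

# Put the larger sample variance on top so that the ratio is at least 1;
# the numerator's degrees of freedom come first.
if var2 >= var1:
    f_ratio = var2 / var1
    df = (n2 - 1, n1 - 1)
else:
    f_ratio = var1 / var2
    df = (n1 - 1, n2 - 1)

print(round(f_ratio, 4), df)
```

If the ratio stays below the tabled $F_{\alpha/2}$ value for these degrees of freedom, equality of variances is not rejected and D2 is reasonable; otherwise D3 is the safer choice.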
9. You have the following data and want to see if means are similar before and after. Assume that the parent
distributions are Normal.
Person   Before   After   Difference   Squared
1        82       92      -10          100
2        60       72      -12          144
3        55       57      -2           4
4        97       104     -7           49
5        79       89      -10          100
Sum                       -41          397
Solution: You are comparing means before and after. If $\mu_2$ is the mean of the second sample, you are
testing that $\mu_2 = \mu_1$, which, because it contains an equality, is a null hypothesis. So your hypotheses
are $\begin{cases} H_0: \mu_1 = \mu_2 \\ H_1: \mu_1 \ne \mu_2 \end{cases}$ or $\begin{cases} H_0: \mu_1 - \mu_2 = 0 \\ H_1: \mu_1 - \mu_2 \ne 0 \end{cases}$. If $D = \mu_1 - \mu_2$, then $\begin{cases} H_0: D = 0 \\ H_1: D \ne 0 \end{cases}$. The important thing to
notice here is that the data are in before and after pairs, so you use Method D4. I have added the material
on the right of the table (the Difference and Squared columns) to remind you that $\bar{x}_1 - \bar{x}_2 = \bar{d} = \frac{1}{n}\sum d = \frac{1}{5}(-41) = -8.20$
and that $s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n - 1} = \frac{397 - 5(-8.20)^2}{4} = 15.2$.
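The paired-difference arithmetic for Problem 9 can be checked directly from the Before and After columns. The sketch below recomputes the differences and then forms the usual paired t statistic, $t = \bar{d} / (s_d / \sqrt{n})$ with $n - 1$ degrees of freedom; treating that statistic as what Method D4 uses is an assumption here.

```python
import math

before = [82, 60, 55, 97, 79]
after = [92, 72, 57, 104, 89]

# Differences are Before minus After, matching the Difference column.
d = [b - a for b, a in zip(before, after)]
n = len(d)

sum_d = sum(d)
sum_d_sq = sum(x * x for x in d)

d_bar = sum_d / n
s_d_sq = (sum_d_sq - n * d_bar ** 2) / (n - 1)   # sample variance of the differences
t_stat = d_bar / math.sqrt(s_d_sq / n)           # paired t statistic, n - 1 df

print(sum_d, sum_d_sq, d_bar, round(s_d_sq, 2), round(t_stat, 3))
```

The statistic is then compared with a t table using 4 degrees of freedom; for a two-sided test at the 5% level the tabled value is 2.776.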
10. An experiment at Duke University was conducted to see if ability to identify food by taste and smell
decreases with age. One food (mushed apple) was correctly identified by 81 out of 100 students and by 51 out
of 100 older people.
Solution: If group 1 is the students and group 2 the older people, so that $\bar{p}_1 = \frac{x_1}{n_1}$ and $\bar{p}_2 = \frac{x_2}{n_2}$, then $\begin{cases} H_0: p_1 \le p_2 \\ H_1: p_1 > p_2 \end{cases}$
or $\begin{cases} H_0: p_1 - p_2 \le 0 \\ H_1: p_1 - p_2 > 0 \end{cases}$. If $\Delta p = p_1 - p_2$, then $\begin{cases} H_0: \Delta p \le 0 \\ H_1: \Delta p > 0 \end{cases}$.
Since we are comparing proportions, use Method D6.
Important note: Two problems from last year’s exam should serve as a warning, especially on the final
exam.
11. You have interviewed a sample of 80 small businesses in the Northeast and 75 small businesses in the
Southeast. Each business has indicated whether they sell in foreign markets. You want to show that
businesses in the Northeast are more likely to export. ($x_1$ is the total number of firms that export in the
Northeast sample, $x_2$ in the Southeast.)
Solution: If $\bar{p}_1 = \frac{x_1}{n_1}$ and $\bar{p}_2 = \frac{x_2}{n_2}$, then $\begin{cases} H_0: p_1 \le p_2 \\ H_1: p_1 > p_2 \end{cases}$
or $\begin{cases} H_0: p_1 - p_2 \le 0 \\ H_1: p_1 - p_2 > 0 \end{cases}$. If $\Delta p = p_1 - p_2$, then $\begin{cases} H_0: \Delta p \le 0 \\ H_1: \Delta p > 0 \end{cases}$.
Since we are comparing proportions, use Method D6.
12. You expand the sample in Problem 11 by adding 60 small businesses in the Midwest ($x_3$ is the number of
these that export). You test the hypothesis that the same fraction of businesses export in each region.
Solution: If $\bar{p}_1 = \frac{x_1}{n_1}$, $\bar{p}_2 = \frac{x_2}{n_2}$ and $\bar{p}_3 = \frac{x_3}{n_3}$, then $\begin{cases} H_0: p_1 = p_2 = p_3 \\ H_1: \text{not all } p\text{s equal} \end{cases}$.
A single difference such as $p_1 - p_2 = 0$ cannot express this hypothesis, because three proportions are involved. This is a chi-squared test of
homogeneity. Since we are comparing multiple proportions, use a chi-squared test. O.K. How would you
set it up?
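One possible setup for the homogeneity test is sketched below. The problem gives only the sample sizes (80, 75 and 60), so the exporter counts used in the example call are made-up placeholders, and the function name is hypothetical. Each region contributes an "exports" cell and a "does not export" cell, and the expected counts come from the pooled export fraction.

```python
def chi_squared_homogeneity(exporters, sample_sizes):
    """Chi-squared statistic and degrees of freedom for equal proportions."""
    pooled = sum(exporters) / sum(sample_sizes)   # estimate of the common p under H0
    chi_sq = 0.0
    for x, n in zip(exporters, sample_sizes):
        cells = (
            (x, n * pooled),            # observed and expected exporters
            (n - x, n * (1 - pooled)),  # observed and expected non-exporters
        )
        for observed, expected in cells:
            chi_sq += (observed - expected) ** 2 / expected
    df = len(sample_sizes) - 1          # (rows - 1)(columns - 1) with 2 rows
    return chi_sq, df


# Placeholder counts for x1, x2, x3; only the sample sizes come from the problem.
chi_sq, df = chi_squared_homogeneity([40, 30, 24], [80, 75, 60])
print(round(chi_sq, 3), df)
```

With three regions there are 2 degrees of freedom, so at the 5% level the statistic would be compared with the tabled chi-squared value of 5.991.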
Extra Credit: In Method D6, we assume that we are comparing proportions from two independent
samples. In the McNemar Test we compare two proportions taken from the same sample. Assume that two
different questions are asked of the same group with the following responses.
                     question 2
                     yes        no
question 1   yes     $x_{11}$   $x_{12}$
             no      $x_{21}$   $x_{22}$
So, for example, $x_{21}$ is the number of people who answered no to question 1 and yes to question 2.
$x_{11} + x_{12} + x_{21} + x_{22} = n$, $p_1 = \frac{x_{11} + x_{12}}{n}$ and $p_2 = \frac{x_{11} + x_{21}}{n}$. If we wish to test $\begin{cases} H_0: p_1 = p_2 \\ H_1: p_1 \ne p_2 \end{cases}$, let
$z = \frac{x_{12} - x_{21}}{\sqrt{x_{12} + x_{21}}}$.
(The test is valid only if $x_{12} + x_{21} \ge 10$.)
A famous example of this concerns a debate between candidates. Question 1 is whether the respondent
supports candidate 1 before the debate and question 2 is whether the respondent supports candidate 1 after
the debate. The data are
                     question 2
                     yes   no
question 1   yes     27    7
             no      13    28
and the question is whether the debate has changed the
fraction supporting candidate 1. Write this out as a hypothesis test and do the test.
Solution: $\begin{cases} H_0: p_1 = p_2 \\ H_1: p_1 \ne p_2 \end{cases}$ or $\begin{cases} H_0: p_1 - p_2 = 0 \\ H_1: p_1 - p_2 \ne 0 \end{cases}$. This is a two-sided test, so if we use a 5% significance
level, our rejection regions are below $-z_{.025} = -1.96$ and above $z_{.025} = 1.96$.
$z = \frac{x_{12} - x_{21}}{\sqrt{x_{12} + x_{21}}} = \frac{7 - 13}{\sqrt{7 + 13}} = \frac{-6}{\sqrt{20}} = -\sqrt{\frac{36}{20}} = -\sqrt{1.8} = -1.34$, and we cannot reject the null hypothesis. If we use a p-value,
$2P(z \le -1.34) = 2(.5 - .4099) = 2(.0901) = 0.1802$, so we could not reject the null hypothesis even at a 10% significance level.
If you (wrongly, but understandably) thought that the hypotheses were $\begin{cases} H_0: p_1 \ge p_2 \\ H_1: p_1 < p_2 \end{cases}$ or
$\begin{cases} H_0: p_1 - p_2 \ge 0 \\ H_1: p_1 - p_2 < 0 \end{cases}$, the 5% rejection region would be below $-z_{.05} = -1.645$ and we still could not reject the
null hypothesis.
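The McNemar calculation for the debate data can be reproduced in a few lines. This is a sketch using only the formula $z = (x_{12} - x_{21}) / \sqrt{x_{12} + x_{21}}$ and the Normal distribution; the function names are hypothetical, and the p-value uses the unrounded z, so it differs slightly from a table lookup at 1.34.

```python
import math


def mcnemar_z(x12, x21):
    """McNemar z statistic from the two off-diagonal (changed answer) counts."""
    if x12 + x21 < 10:
        raise ValueError("test is valid only if x12 + x21 >= 10")
    return (x12 - x21) / math.sqrt(x12 + x21)


def two_sided_p_value(z):
    """2 * P(Z >= |z|) for a standard Normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))


# Debate data: 7 respondents switched from yes to no, 13 from no to yes.
z = mcnemar_z(7, 13)
p_value = two_sided_p_value(z)
print(round(z, 2), round(p_value, 4))
```

The statistic is about -1.34 and the two-sided p-value is about 0.18, so the null hypothesis of no change is not rejected at the 5% or the 10% level.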
Note: This is a version of the Chi-Square Test. Recall that $\chi^2 = \sum \frac{(O - E)^2}{E}$. If we take $x_{11}$ and $x_{22}$ as
given, and assume that the null hypothesis is correct, then the table already given,
                     question 2
                     yes        no
question 1   yes     $x_{11}$   $x_{12}$
             no      $x_{21}$   $x_{22}$
is our $O$, and the numbers in the $x_{12}$ and the $x_{21}$ slots must be equal for there to be no change in
preferences, so that our $E$ is
                     question 2
                     yes                             no
question 1   yes     $x_{11}$                        $\frac{x_{12} + x_{21}}{2}$
             no      $\frac{x_{12} + x_{21}}{2}$     $x_{22}$
This means that two of the four terms in $\sum \frac{(O - E)^2}{E}$
are zero and the remaining terms are
$\chi^2 = \frac{\left(x_{12} - \frac{x_{12} + x_{21}}{2}\right)^2}{\frac{x_{12} + x_{21}}{2}} + \frac{\left(x_{21} - \frac{x_{12} + x_{21}}{2}\right)^2}{\frac{x_{12} + x_{21}}{2}} = \frac{(x_{12} - x_{21})^2}{x_{12} + x_{21}}$.
But $\chi^2$ has only one degree of freedom, and, since $\chi^2$ is defined as a sum of $z^2$, we can
take a square root and say $z = \frac{x_{12} - x_{21}}{\sqrt{x_{12} + x_{21}}}$.
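The equivalence between the chi-squared form and the z form can be checked numerically on the debate data. The sketch below builds the observed and expected tables described in the note and sums $(O - E)^2 / E$ cell by cell; the function name is hypothetical, and the code assumes all expected counts are positive.

```python
def mcnemar_chi_squared(x11, x12, x21, x22):
    """Sum of (O - E)^2 / E with the off-diagonal cells averaged under H0."""
    observed = [x11, x12, x21, x22]
    half = (x12 + x21) / 2          # expected count in each off-diagonal cell
    expected = [x11, half, half, x22]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi_sq = mcnemar_chi_squared(27, 7, 13, 28)
z_squared = (7 - 13) ** 2 / (7 + 13)
print(chi_sq, z_squared)
```

Both quantities come out to 1.8, whose square root is the magnitude of the z statistic, 1.34.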