252solngr3-061 11/01/06 Name: Class days and time:

252solngr3-061 11/01/06 (Open this document in 'Page Layout' view!)
Name:
Class days and time:
Please include this on what you hand in!
Graded Assignment 3
In your outline there are 6 methods to compare means or medians, methods D1, D2, D3, D4, D5a and D5b.
Methods D6a and D6b compare proportions and method D7 compares variances or standard deviations. In
the following cases, identify $H_0$ and $H_1$ and identify which method to use. Method E1 is a chi-squared test
that compares multiple proportions. Method F1 is analysis of variance (ANOVA), which compares multiple
means. If the hypotheses involve two means, state the hypotheses in terms of both $\mu$ and the difference $D = \mu_1 - \mu_2$. If
the hypotheses involve two proportions, state them in terms of both $p$ and $\Delta p = p_1 - p_2$. If the hypotheses
involve two standard deviations or variances, state them in terms of both $\sigma^2$ and $\sigma_1^2/\sigma_2^2$ or $\sigma_2^2/\sigma_1^2$. All the
questions involve means, medians, proportions or variances. One of these problems is a chi-squared test.
Note: Look at 252thngs (252thngs) on the syllabus supplement part of the website before you start (and
before you take exams). Neatness and clarity of explanation are expected. Note that from now on
neatness means paper neatly trimmed on the left side if it has been torn, multiple pages stapled and
paper written on only one side.
----------------------------------------------------------------------------------------------------------------------------
Example: This may seem long but it appears on a previous graded assignment 3.
A group of supervisors are given exams on management skills before and after taking a course in
management. Scores are as follows.
Supervisor   Before   After
 1           63       78
 2           93       92
 3           84       91
 4           72       80
 5           65       69
 6           72       85
 7           91       99
 8           84       82
 9           71       81
10           80       87
11           68       93
If we assume that the distribution of results is Normal, what method should we use to answer the question
“Has the course improved the scores of the managers?” What are our hypotheses?
Solution: You are comparing means before and after the course. You can get away with using means
because the parent distributions are Normal. If $\mu_2$ is the mean of the second sample, you are hoping that
$\mu_2 > \mu_1$, which, because it contains no equality, is an alternate hypothesis. So your hypotheses are
$H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$ or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$.
If $D = \mu_1 - \mu_2$, then $H_0: D \ge 0$, $H_1: D < 0$. The important thing to notice
here is that the data are in before and after pairs, so you use Method D4.
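The example only asks for hypotheses and a method, but the paired calculation can be sketched in a few lines. This is a minimal illustration, not part of the assignment: it assumes a 5% significance level (the example does not state one) and takes 1.812 as the t-table value for 10 degrees of freedom. The function name `paired_t` is made up for this sketch.

```python
import math

# Scores from the example table (supervisors 1-11).
before = [63, 93, 84, 72, 65, 72, 91, 84, 71, 80, 68]
after = [78, 92, 91, 80, 69, 85, 99, 82, 81, 87, 93]


def paired_t(x1, x2):
    """Return (mean difference, its standard error, t ratio) for d = x1 - x2 with D0 = 0."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = math.sqrt(sum((v - d_bar) ** 2 for v in d) / (n - 1))  # sample std. dev. of the differences
    se = s_d / math.sqrt(n)
    return d_bar, se, d_bar / se


d_bar, se, t = paired_t(before, after)

# ASSUMPTION: alpha = .05 is not given in the example; 1.812 is the table value for 10 DF.
T_CRIT = 1.812

print(f"mean difference = {d_bar:.4f}, standard error = {se:.4f}, t = {t:.3f}")
# Left-sided test, because H1 is D < 0.
print("reject H0" if t < -T_CRIT else "do not reject H0")
```

Because $H_1: D < 0$, only a large negative t ratio counts as evidence that scores improved.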
1. West Chester University plays 11 football games a year. Though it is a questionable practice, these 11
games can be regarded as a random sample taken from an infinite number of games that they might have
played. Statistics can be computed from them like proportion of games won or the mean and standard
deviation of touchdowns or completed passes per game. At Christmas 2007, you look at these statistics and
assert that the team did significantly better in 2007 than it did in 2006.
a. The statistic you wish to use is completed passes. If $x_1$ is a column listing the number of passes
completed in each game in 2006 in random order and $x_2$ is a column listing the number of passes
completed in each game in 2007 in random order and you consider these as independent random samples
and wish to compare the mean passes completed per game and to get results that verify your statement, what
are your hypotheses? What is your method?
b. Your roommate, Gigglebutz, says that you have done this all wrong, because the distributions
have not been shown to be Normal. You do the test described in section 6.3 of the text and sadly agree with
Gigglebutz that the distributions are probably not Normal. What is your test and method now?
c. Gigglebutz is still not satisfied and points out that these are not independent random samples and
what you should do is rearrange all your data so that each column represents a year and each row an
opponent. Any school that was not played in both years should be dropped from your data. What is your test
and method now?
2. Standard deviation is often a measure of reliability. A manufacturer is providing a connector and is
getting complaints that the connector is often too large or too small for the intended use. A sample of 250
connectors produced by the process in current use is taken. Then a new process is tried and a sample of 45
connectors is taken. Find the hypotheses and method to show that the new process produces a more reliable
product.
3. (Dummeldinger pg 148) We are interested in attitudes about sexual discrimination. Various groups were
asked if they believed sexual discrimination is a problem in the United States. ($\alpha = .05$)
a. Out of 50 men 11 believed that sexual discrimination is a problem. Out of 40 women 19 said
that sexual discrimination is a problem. Is there a significant difference between male and female attitudes?
b. A group of 100 men are picked from upper levels of a group of large American firms. They are
asked if sexual discrimination is a problem in the United States. 38 said ‘yes.’ They are then sent to a
program on mentoring female executives where they discuss, among other things, what problems seem to be
peculiar to young female executives. Afterwards the men are asked the same question. This time 41 said
‘yes.’ Assuming that these 100 men are representative of all executives who are trained to mentor female
executives, has the training increased the proportion that believes there is a problem?
c. Random samples (each 100 people) of male 1) high school graduates, 2) college graduates and
3) people with MBAs are asked the same question. Out of the first group 35 said ‘yes.’ Out of the second
group 38 said ‘yes.’ Out of the third group 39 said ‘yes.’ Is there a significant difference between the
attitudes of the three groups?
Extra credit: (Place answers at the end of the assignment)
d. You have enough information to do two of the three problems above. What
additional information do you need?
e. Do one of the three problems. State your conclusion clearly.
4. (Dummeldinger) A researcher took a random sample of n graduates of MBA programs, which included
$n_1$ women and $n_2$ men. Their starting salaries were recorded. Use $\mu_1$ and/or $\sigma_1$ for population parameters
for women and $\mu_2$ and/or $\sigma_2$ for men. Choose hypotheses and methods for the following. ($\alpha = .10$)
a. n1  n 2  150 . The researcher wants to show that men have a higher mean starting salary than
women.
b. $n_1 = 18$ and $n_2 = 12$. The researcher wants to compare the dispersion of men’s and women’s
salaries. The researcher has no prior opinion as to which is more variable.
c. $n_1 = 18$ and $n_2 = 12$ and the null hypothesis in 4b) has not been rejected. Dummeldinger gives
the following data: $\bar{x}_1 = 48266.70$, $s_1 = 13577.63$ and $\bar{x}_2 = 55000.0$, $s_2 = 11741.25$.
The researcher wants to show that men have a higher mean starting salary than women.
Extra credit: (Place answers at the end of the assignment)
d. Do problem 4c. (Hint: Before you start, move the decimal point so that the sample
means and standard deviations are in thousands.) State your conclusion clearly.
5. (Dummeldinger) In order to find the effect of membership in a fraternity or sorority on grades, 250
students who had joined fraternities or sororities were picked as a random sample.
a. GPAs were recorded for each student for the semester before ($x_1$ column) and the semester
after they joined ($x_2$ column). Mean GPAs were compared. Choose hypotheses and method.
b. Dummeldinger says that a 95% confidence interval for the difference between the means was
0.24 to 1.02. What is your conclusion?
Solutions
General considerations.
1) All methods in section D are methods that can only be used for comparison of 2 samples. This is
because, if $\theta$ (theta) is a parameter like $\mu$ or $p$, $\Delta\theta = \theta_1 - \theta_2$ is easy to define and will be zero if $\theta_1$ and
$\theta_2$ are equal. If we go to more than two samples, say 3, we need something like
$(\theta_1 - \theta_0)^2 + (\theta_2 - \theta_0)^2 + (\theta_3 - \theta_0)^2$, where $\theta_0$ is some sort of average of the parameters of the samples.
This will equal zero if all the parameters are equal and will not allow positive discrepancies in one sample
to cancel out negative discrepancies in another. This is what takes us to chi-squared and ANOVA methods.
Saying $(\theta_1 - \theta_0)^2 + (\theta_2 - \theta_0)^2 + (\theta_3 - \theta_0)^2 = 0$ is not the same as saying $D_3 = \theta_1 - \theta_2 - \theta_3 = 0$, because
$D_3$ would be negative if $\theta_1 = \theta_2 = \theta_3$ (for positive parameters), but saying $(\theta_1 - \theta_0)^2 + (\theta_2 - \theta_0)^2 = 0$ is the same as saying
$D_2 = \theta_1 - \theta_2 = 0$. (Try proving this; it’s simple algebra.)
2) You can always substitute a method for the median for a method for the mean, but not vice versa.
However, if a Normal distribution applies, a method involving means will be more efficient and powerful.
3) The computer will use Method D3 when it is not told what method to use. This is quite general because
if the sample variances are similar, it gives results like D2 and if the sample sizes are large, it gives results
like D1. However, if variances are equal D2 is easier to use and if the samples are large D1 is easier to use.
4) The K-S and Lilliefors methods only exist because chi-square performs so poorly for small samples. K-S
needs $\mu$, $\sigma$ or other parameters. Lilliefors uses $\bar{x}$ or $s$ and only works to test for a Normal distribution.
5) ‘Significant’ in statistics means that we have rejected a hypothesis like $H_0: \mu = 0$ and ‘significantly
different’ means that we have rejected a hypothesis like $H_0: \mu_1 = \mu_2$. Of course, if two parameters are
significantly different, their difference is significant.
6) Be careful of inequalities. If $\mu_1 < \mu_2$ or $\mu_2 > \mu_1$ and $D = \mu_1 - \mu_2$, then $D < 0$.
7) In most problems you are better off trying to figure out what the alternative hypothesis is before
you try to state the null hypothesis.
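The algebra that point 1) invites can be sketched as follows. The sketch assumes that $\theta_0$ is the simple average of $\theta_1$ and $\theta_2$, which the text does not state explicitly.

```latex
\theta_0 = \tfrac{1}{2}(\theta_1 + \theta_2)
\quad\Longrightarrow\quad
\theta_1 - \theta_0 = \tfrac{1}{2}(\theta_1 - \theta_2),
\qquad
\theta_2 - \theta_0 = -\tfrac{1}{2}(\theta_1 - \theta_2)

(\theta_1 - \theta_0)^2 + (\theta_2 - \theta_0)^2
  = \tfrac{1}{4}(\theta_1 - \theta_2)^2 + \tfrac{1}{4}(\theta_1 - \theta_2)^2
  = \tfrac{1}{2} D_2^{\,2}
```

Under that assumption the sum of squares is zero exactly when $D_2 = \theta_1 - \theta_2 = 0$.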
1. West Chester University plays 11 football games a year. Though it is a questionable practice, these 11
games can be regarded as a random sample taken from an infinite number of games that they might have
played. Statistics can be computed from them like proportion of games won or the mean and standard
deviation of touchdowns or completed passes per game. At Christmas 2007, you look at these statistics and
assert that the team did significantly better in 2007 than it did in 2006.
a. The statistic you wish to use is completed passes. If $x_1$ is a column listing the number of passes
completed in each game in 2006 in random order and $x_2$ is a column listing the number of passes
completed in each game in 2007 in random order and you consider these as independent random samples
and wish to compare the mean passes completed per game and to get results that verify your statement, what
are your hypotheses? What is your method?
Solution: If this is a valid method for testing for improvement, use $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$
or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$. If $D = \mu_1 - \mu_2$, then $H_0: D \ge 0$, $H_1: D < 0$.
If you have decided to use means, you must believe that the Normal
distribution applies. The total sample size is too small to use Method D1, which means that D2 or D3
should work. You could test the variances for equality and use D2, or not bother and use D3. You may want
to see the methodology below.
If we use a confidence interval it will be of the form $D \le d + t_\alpha s_d$.
If we use a test ratio, we will compare $t = \dfrac{d - D_0}{s_d}$ with $-t_\alpha$.
If we use a critical value for $d$, we will use $d_{cv} = D_0 - t_\alpha s_d$.
b. Your roommate, Gigglebutz, says that you have done this all wrong, because the distributions
have not been shown to be Normal. You do the test described in section 6.3 of the text and sadly agree with
Gigglebutz that the distributions are probably not Normal. What is your test and method now?
Solution: If the sample is small and we have no reason to believe that a symmetrical distribution applies,
we should compare medians. If $\eta$ is the median, use $H_0: \eta_1 \ge \eta_2$, $H_1: \eta_1 < \eta_2$.
Since we are comparing medians and we
still think that the data are independent random samples, use Method D5a.
c. Gigglebutz is still not satisfied and points out that these are not independent random samples and
what you should do is rearrange all your data so that each column represents a year and each row an
opponent. Any school that was not played in both years should be dropped from your data. What is your test
and method now?
Solution: If the sample is small and we have no reason to believe that a symmetrical distribution applies,
we should compare medians. If $\eta$ is the median, use $H_0: \eta_1 \ge \eta_2$, $H_1: \eta_1 < \eta_2$.
Since we are comparing medians and we
have just put the data into pairs, use Method D5b.
2. Standard deviation is often a measure of reliability. A manufacturer is providing a connector and is
getting complaints that the connector is often too large or too small for the intended use. A sample of 250
connectors produced by the process in current use is taken. Then a new process is tried and a sample of 45
connectors is taken. Find the hypotheses and method to show that the new process produces a more reliable
product.
Solution: When you see words like reliability or variability, think variance or standard deviation. The
statement that a new process is more reliable means that the variance or standard deviation of the new
process is smaller. Since this statement does not include an equality, it must be an alternate hypothesis.
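With the current process as sample 1 and the new process as sample 2, the hypotheses are
$H_0: \sigma_1 \le \sigma_2$, $H_1: \sigma_1 > \sigma_2$ or $H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$.
In terms of the variance ratio $\sigma_1^2/\sigma_2^2$ or $\sigma_2^2/\sigma_1^2$, the alternate hypothesis
rules, so $H_1: \sigma_1^2/\sigma_2^2 > 1$. This means that the null hypothesis is $H_0: \sigma_1^2/\sigma_2^2 \le 1$. Since you are comparing
variances, use Method D7. Compare the ratio $s_1^2/s_2^2$ against $F_\alpha$.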
3. (Dummeldinger pg 148) We are interested in attitudes about sexual discrimination. Various groups were
asked if they believed sexual discrimination is a problem in the United States. ($\alpha = .05$)
a. Out of 50 men 11 believed that sexual discrimination is a problem. Out of 40 women 19 said
that sexual discrimination is a problem. Is there a significant difference between male and female attitudes?
Solution: The only possible measure that can come out of this statement is the proportions of men and
women that believe sexual discrimination is a problem. ‘Difference’ means not equal.
If $\bar{p}_1 = x_1/n_1$ and $\bar{p}_2 = x_2/n_2$, use $H_0: p_1 = p_2$, $H_1: p_1 \ne p_2$
or $H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 \ne 0$. If $\Delta p = p_1 - p_2$, then $H_0: \Delta p = 0$, $H_1: \Delta p \ne 0$. Since we are
comparing proportions from independent samples, use Method D6a.
If we use a confidence interval it will be of the form $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$.
If we use a test ratio, we will compare $z = \dfrac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}}$ with $\pm z_{\alpha/2}$.
If we use a critical value for $\Delta\bar{p}$, we will use $\Delta\bar{p}_{cv} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$.
b. A group of 100 men are picked from upper levels of a group of large American firms. They are
asked if sexual discrimination is a problem in the United States. 38 said ‘yes.’ They are then sent to a
program on mentoring female executives where they discuss, among other things, what problems seem to be
peculiar to young female executives. Afterwards the men are asked the same question. This time 41 said
‘yes.’ Assuming that these 100 men are representative of all executives who are trained to mentor female
executives, has the training increased the proportion that believes there is a problem?
Solution: It clearly says that we are comparing proportions and that we want to see if $p_2 > p_1$. Since this
does not contain an equality, it must be an alternate hypothesis. If $\bar{p}_1 = x_1/n_1$ and $\bar{p}_2 = x_2/n_2$,
we can use $H_0: p_1 \ge p_2$, $H_1: p_1 < p_2$ or $H_0: p_1 - p_2 \ge 0$, $H_1: p_1 - p_2 < 0$.
If $\Delta p = p_1 - p_2$, then $H_0: \Delta p \ge 0$, $H_1: \Delta p < 0$. We do not have independent samples but are
sampling twice from the same sample. Since we are not comparing proportions from independent samples,
use Method D6b. Note that our setup is

                      question 2
                      yes      no
question 1   yes      x11      x12
             no       x21      x22

We know that $x_{11} + x_{12} = x_1 = 38$ and
that $x_{11} + x_{21} = x_2 = 41$, but we do not know $x_{21}$ or $x_{12}$. Since these numbers are missing, we will have to
go back to our original data and compute them.
We will then compare $z = \dfrac{x_{12} - x_{21}}{\sqrt{x_{12} + x_{21}}}$ against $-z_\alpha$.
c. Random samples (each 100 people) of male 1) high school graduates, 2) college graduates and
3) people with MBAs are asked the same question. Out of the first group 35 said ‘yes.’ Out of the second
group 38 said ‘yes.’ Out of the third group 39 said ‘yes.’ Is there a significant difference between the
attitudes of the three groups?
Solution: If $\bar{p}_1 = x_1/n_1$, $\bar{p}_2 = x_2/n_2$ and $\bar{p}_3 = x_3/n_3$, use $H_0: p_1 = p_2 = p_3$, $H_1$: not all ps equal.
This is a chi-squared test of homogeneity. Since we are
comparing multiple proportions, use a chi-squared test. The O that we need must account for all of each
sample and is

O      HSGrads   CollegeGr   MBAs
Yes    35        38          39
No     65        62          61

Extra credit is at the end of the assignment.
4. (Dummeldinger) A researcher took a random sample of n graduates of MBA programs, which included
$n_1$ women and $n_2$ men. Their starting salaries were recorded. Use $\mu_1$ and/or $\sigma_1$ for population parameters
for women and $\mu_2$ and/or $\sigma_2$ for men. Choose hypotheses and methods for the following. ($\alpha = .10$)
a. n1  n 2  150 . The researcher wants to show that men have a higher mean starting salary than
women.
Solution: If this is a valid method for testing for a difference in starting salaries, use $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$
or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$. If $D = \mu_1 - \mu_2$, then $H_0: D \ge 0$, $H_1: D < 0$.
If you have decided to use means, you must believe
that the Normal distribution applies. The total sample size is large enough for Method D1, which means that
D2 or D3 should work as well. You could test the variances for equality and use D2, or not bother and use
D3.
b. $n_1 = 18$ and $n_2 = 12$. The researcher wants to compare the dispersion of men’s and women’s
salaries. The researcher has no prior opinion as to which is more variable.
Solution: When you see words like reliability or variability, think variance or standard deviation. Since the
researcher has no idea of which dispersion is larger, this is a two-sided test of equality of variances unless
there is a good reason to believe that the Normal distribution does not apply. We can use
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \ne \sigma_2^2$ or $H_0: \sigma_1 = \sigma_2$, $H_1: \sigma_1 \ne \sigma_2$.
In terms of the variance ratio $\sigma_1^2/\sigma_2^2$ or $\sigma_2^2/\sigma_1^2$, it doesn’t matter which ratio we use, so we
have $H_0: \sigma_1^2/\sigma_2^2 = 1$ and $H_1: \sigma_1^2/\sigma_2^2 \ne 1$ or $H_0: \sigma_2^2/\sigma_1^2 = 1$ and $H_1: \sigma_2^2/\sigma_1^2 \ne 1$. Since you are comparing variances,
use Method D7. In practice this means to compare the ratio $s_1^2/s_2^2$ against $F_{\alpha/2}^{(n_1-1,\,n_2-1)}$
and to compare the ratio $s_2^2/s_1^2$ against $F_{\alpha/2}^{(n_2-1,\,n_1-1)}$.
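As a rough illustration of the two ratios, the sketch below borrows the sample standard deviations that problem 4c supplies. It stops short of the decision, because the F-table values for these degrees of freedom are not quoted in this document and would have to be looked up.

```python
# Sample sizes and standard deviations as given in problem 4c.
n1, s1 = 18, 13577.63  # women
n2, s2 = 12, 11741.25  # men

# Both variance ratios used by Method D7 in a two-sided test.
ratio_12 = s1 ** 2 / s2 ** 2
ratio_21 = s2 ** 2 / s1 ** 2

# Degrees of freedom (numerator, denominator) for each ratio.
df_12 = (n1 - 1, n2 - 1)
df_21 = (n2 - 1, n1 - 1)

print(f"s1^2/s2^2 = {ratio_12:.4f}, compare with F(alpha/2) for DF {df_12}")
print(f"s2^2/s1^2 = {ratio_21:.4f}, compare with F(alpha/2) for DF {df_21}")
```

Only the ratio that exceeds 1 can land in the upper rejection region, so in practice a single table lookup decides the test.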
c. $n_1 = 18$ and $n_2 = 12$ and the null hypothesis in 4b) has not been rejected. Dummeldinger gives
the following data: $\bar{x}_1 = 48266.70$, $s_1 = 13577.63$ and $\bar{x}_2 = 55000.0$, $s_2 = 11741.25$.
The researcher wants to show that men have a
higher mean starting salary than women.
Solution: If this is a valid method for testing for a difference in starting salaries, use $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$
or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$. If $D = \mu_1 - \mu_2$, then $H_0: D \ge 0$, $H_1: D < 0$.
If you have decided to use means, you must believe
that the Normal distribution applies. The total sample size is not large enough for Method D1, which
means that D2 or D3 should work. But you have tested the variances for equality and shown that you could
use Method D2.
Extra credit is at the end of the assignment.
5. (Dummeldinger) In order to find the effect of membership in a fraternity or sorority on grades, 250
students who had joined fraternities or sororities were picked as a random sample.
a. GPAs were recorded for each student for the semester before ($x_1$ column) and the semester
after they joined ($x_2$ column). Mean GPAs were compared. Choose hypotheses and method.
Solution: If this is a valid method for testing for a difference in GPAs, use $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$
or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$. If $D = \mu_1 - \mu_2$, then $H_0: D = 0$, $H_1: D \ne 0$.
If you have decided to use means, you must believe
that the Normal distribution applies. Each student supplies both a before and an after GPA, so the data are
in before and after pairs rather than independent samples. As in the example at the beginning of the assignment,
use Method D4.
b. Dummeldinger says that a 95% confidence interval for the difference between the means was
0.24 to 1.02. What is your conclusion?
Solution: The confidence interval has been thrown at us, so we must assume that the author knew what he
was doing. A 2-sided test is appropriate if the hypotheses are as in 5a). Since the interval 0.24 to 1.02 lies
entirely above zero, it does not include zero, so we reject the null hypothesis and
can say that there was a significant difference between the before and after means.
Extra Credit Problems
3d. (Dummeldinger pg 148) We are interested in attitudes about sexual discrimination. Various groups were
asked if they believed sexual discrimination is a problem in the United States. ($\alpha = .05$)
a. Out of 50 men 11 believed that sexual discrimination is a problem. Out of 40 women 19 said
that sexual discrimination is a problem. Is there a significant difference between male and female attitudes?
b. A group of 100 men are picked from upper levels of a group of large American firms. They are
asked if sexual discrimination is a problem in the United States. 38 said ‘yes.’ They are then sent to a
program on mentoring female executives where they discuss, among other things, what problems seem to be
peculiar to young female executives. Afterwards the men are asked the same question. This time 41 said
‘yes.’ Assuming that these 100 men are representative of all executives who are trained to mentor female
executives, has the training increased the proportion that believes there is a problem?
c. Random samples (each 100 people) of male 1) high school graduates, 2) college graduates and
3) people with MBAs are asked the same question. Out of the first group 35 said ‘yes.’ Out of the second
group 38 said ‘yes.’ Out of the third group 39 said ‘yes.’ Is there a significant difference between the
attitudes of the three groups?
d. You have enough information to do two of the three problems above. What additional
information do you need?
Solution: In 3e) I will demonstrate that we have enough information to do 3a) and 3c). If you look at the
solution to 3b) above you will notice that I complained that we do not know $x_{21}$ or $x_{12}$. These represent
people who have changed their answers between Question 1 and Question 2.
3e. Do one of the three problems. State your conclusion clearly. ($\alpha = .05$)
Solution: We cannot do 3b.
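If we wish to finish 3a), remember that $x_1 = 11$, $n_1 = 50$, $x_2 = 19$ and $n_2 = 40$. The proportions are
$\bar{p}_1 = \frac{11}{50} = .2200$ and $\bar{p}_2 = \frac{19}{40} = .4750$. $\bar{q}_1 = 1 - \bar{p}_1 = 1 - .2200 = .7800$ and
$\bar{q}_2 = 1 - \bar{p}_2 = 1 - .4750 = .5250$. $\Delta\bar{p} = \bar{p}_1 - \bar{p}_2 = .2200 - .4750 = -.255$. The Formula Table says the
following.

Interval for: Difference between proportions, $\Delta p = p_1 - p_2$, with $q = 1 - p$.
Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$, where $s_{\Delta p} = \sqrt{\dfrac{\bar{p}_1\bar{q}_1}{n_1} + \dfrac{\bar{p}_2\bar{q}_2}{n_2}}$.
Hypotheses: $H_0: \Delta p = \Delta p_0$, $H_1: \Delta p \ne \Delta p_0$, where $\Delta p_0 = p_{01} - p_{02}$ or $\Delta p_0 = 0$.
Test Ratio: $z = \dfrac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}}$.
   If $\Delta p = 0$: $\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$, with $\bar{p}_0 = \dfrac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2}$.
   If $\Delta p \ne 0$: $\sigma_{\Delta p} = \sqrt{\dfrac{p_{01}q_{01}}{n_1} + \dfrac{p_{02}q_{02}}{n_2}}$, or use $s_{\Delta p}$.
Critical Value: $\Delta\bar{p}_{cv} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$.

Our hypotheses are $H_0: p_1 = p_2$, $H_1: p_1 \ne p_2$ or $H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 \ne 0$ or, if $\Delta p = p_1 - p_2$, $H_0: \Delta p = 0$, $H_1: \Delta p \ne 0$.
$z_{\alpha/2} = z_{.025} = 1.960$. $\bar{p}_0 = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{11 + 19}{50 + 40} = \dfrac{30}{90} = 0.3333 = \dfrac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2} = \dfrac{50(.2200) + 40(.4750)}{50 + 40}$.
$\bar{q}_0 = 1 - \bar{p}_0 = 1 - .3333 = .6667$.
$\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{.3333(.6667)\left(\dfrac{1}{50} + \dfrac{1}{40}\right)} = \sqrt{.22222(.02 + .025)} = \sqrt{.0099999} = .1000$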
(Only one of the following methods is needed!) Test Ratio: $z = \dfrac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \dfrac{-.255 - 0}{.1000} = -2.55$. Make a
diagram showing a 95% 'accept' region between -1.960 and +1.960. Shade the 'reject' regions below -1.960
and above 1.960. Since -2.55 lies in the lower 'reject' region, reject $H_0$.
or Critical Value: $\Delta\bar{p}_{cv} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p} = 0 \pm 1.960(0.1000) = \pm 0.196$. Make a diagram showing a 95%
'accept' region between -0.196 and +0.196. Shade the 'reject' regions below -0.196 and above 0.196. Since
$\Delta\bar{p} = -.255$ lies in the lower 'reject' region, reject $H_0$.
or Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$.
$\bar{p}_1 = .2200$, $\bar{p}_2 = .4750$, $\bar{q}_1 = .7800$ and $\bar{q}_2 = .5250$.
$s_{\Delta p} = \sqrt{\dfrac{\bar{p}_1\bar{q}_1}{n_1} + \dfrac{\bar{p}_2\bar{q}_2}{n_2}} = \sqrt{\dfrac{.2200(.7800)}{50} + \dfrac{.4750(.5250)}{40}} = \sqrt{.003432 + .006234} = \sqrt{.0096664} = 0.0983$.
So $\Delta p = -.255 \pm 1.960(0.0983) = -.255 \pm .193$ or -.448 to -.062. Note that this interval does not include
zero and thus causes us to reject the null hypothesis.
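The three approaches above can be checked with a short script. This is only a sketch of the hand calculation for 3a); the variable names are chosen for this illustration and are not taken from the formula table.

```python
import math

# Data for problem 3a.
x1, n1 = 11, 50  # men answering 'yes'
x2, n2 = 19, 40  # women answering 'yes'
Z_HALF_ALPHA = 1.960  # z for alpha/2 = .025

p1, p2 = x1 / n1, x2 / n2
dp = p1 - p2  # observed difference between the sample proportions

# Pooled proportion and standard error, used when H0 says the difference is zero.
p0 = (x1 + x2) / (n1 + n2)
sigma_dp = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
z = dp / sigma_dp

# Critical values for the observed difference.
cv_low, cv_high = -Z_HALF_ALPHA * sigma_dp, Z_HALF_ALPHA * sigma_dp

# Unpooled standard error, used for the confidence interval.
s_dp = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci = (dp - Z_HALF_ALPHA * s_dp, dp + Z_HALF_ALPHA * s_dp)

print(f"z = {z:.2f}")
print(f"critical values for the difference: {cv_low:.3f} and {cv_high:.3f}")
print(f"95% confidence interval: {ci[0]:.3f} to {ci[1]:.3f}")
print("reject H0" if abs(z) > Z_HALF_ALPHA else "do not reject H0")
```

All three methods agree, as they must: the test ratio is beyond the table value, the observed difference is beyond the critical value, and the interval excludes zero.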
Minitab output follows.
MTB > PTwo 50 11 40 19.      #Format of this instruction is n1, x1, n2, x2.
Test and CI for Two Proportions
Sample    X    N   Sample p
1        11   50   0.220000      #Just prints out the two proportions.
2        19   40   0.475000
Difference = p (1) - p (2)
Estimate for difference: -0.255      #Prints the difference between the sample proportions, .2200 - .4750 = -.255
95% CI for difference: (-0.447699, -0.0623008) #Exactly the same as the interval above.
Test for difference = 0 (vs not = 0): Z = -2.59 P-Value = 0.009
#The z was evidently computed using $s_{\Delta p}$ instead of $\sigma_{\Delta p}$. Bad!
To finish 3c) note that we had $H_0: p_1 = p_2 = p_3$, $H_1$: not all ps equal. The O that we found accounted for all of each
sample and was, with row and column totals, as below.

O       HSGrads   CollegeGr   MBAs   Total   p_r
Yes     35        38          39     112     .3733
No      65        62          61     188     .6267
Total   100       100         100    300     1.0000

Row proportions are gotten by
dividing row totals by the overall total, for example $.3733 = \frac{112}{300}$. We now get our Expected (E) table by
using the row proportions to multiply the column totals. For example we replace 35 by $.3733(100) = 37.33$.
The expected array is

E       HSGrads   CollegeGr   MBAs    Total   p_r
Yes     37.33     37.33       37.33   112     .3733
No      62.67     62.67       62.67   188     .6267
Total   100       100         100     300     1.0000

The formula for the chi-squared statistic is $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ or $\chi^2 = \sum \dfrac{O^2}{E} - n$. Both of these two
formulas are shown below. $DF = (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2$.

index   O     E        O - E    (O - E)^2 / E   O^2 / E
1       35    37.33    -2.33    0.145430        32.8154
2       65    62.67     2.33    0.086627        67.4166
3       38    37.33     0.67    0.012025        38.6820
4       62    62.67    -0.67    0.007163        61.3372
5       39    37.33     1.67    0.074709        40.7447
6       61    62.67    -1.67    0.044501        59.3745
Total   300   300.00    0.00    0.370456        300.3704

So we have $\chi^2 = \sum \dfrac{(O - E)^2}{E} = 0.3705$ or $\chi^2 = \sum \dfrac{O^2}{E} - n = 300.3704 - 300 = 0.3704$. If we compare
our results with $\chi^{2(2)}_{.05} = 5.9915$, since our computed value of chi-squared is less than the table value, we
do not reject our null hypothesis. Minitab output for the same problem follows.
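The expected counts and both versions of the chi-squared formula can be reproduced with a few lines of code. The sketch below is an illustration of the hand calculation, with variable names chosen here. It keeps unrounded expected counts, so its result matches the Minitab value and differs from the hand table only in the fourth decimal place.

```python
# Observed counts for problem 3c: rows are Yes/No, columns are HSGrads, CollegeGr, MBAs.
observed = [
    [35, 38, 39],
    [65, 62, 61],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = row proportion times column total.
expected = [[rt / n * ct for ct in col_totals] for rt in row_totals]

# First formula: sum of (O - E)^2 / E.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Second formula: sum of O^2 / E, minus n.
chi2_alt = sum(
    o ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
) - n

df = (len(observed) - 1) * (len(observed[0]) - 1)
CHI2_CRIT = 5.9915  # table value for alpha = .05 and 2 degrees of freedom, as quoted above

print(f"chi-squared = {chi2:.4f} (alternative formula: {chi2_alt:.4f}), DF = {df}")
print("reject H0" if chi2 > CHI2_CRIT else "do not reject H0")
```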
MTB > print c11-c13      #Columns are in c11-c13 and are labeled O1-O3.
Data Display
Row   O1   O2   O3
1     35   38   39
2     65   62   61
MTB > ChiSquare C11-C13.
Chi-Square Test: O1, O2, O3
Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts
          O1      O2      O3   Total
1         35      38      39     112
       37.33   37.33   37.33
       0.146   0.012   0.074
2         65      62      61     188
       62.67   62.67   62.67
       0.087   0.007   0.044
Total    100     100     100     300
#You should be able to find the elements of E, O and (E - O)^2 / E in the
#table above.
Chi-Sq = 0.370, DF = 2, P-Value = 0.831
4d. Do problem 4c. (Hint: before you start, move the decimal point so that the sample means and standard
deviations are in thousands.) State your conclusion clearly.
Solution: We already have our hypotheses $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$ or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$ or, if $D = \mu_1 - \mu_2$,
$H_0: D \ge 0$, $H_1: D < 0$. We know that we cannot reject
$H_0: \sigma_1 = \sigma_2$ (against $H_1: \sigma_1 \ne \sigma_2$) or $H_0: \sigma_1^2 = \sigma_2^2$ (against $H_1: \sigma_1^2 \ne \sigma_2^2$). Finally we have
$n_1 = 18$, $n_2 = 12$, $\bar{x}_1 = 48266.70$, $s_1 = 13577.63$ and $\bar{x}_2 = 55000.0$, $s_2 = 11741.25$.
The means and standard deviations are to be expressed in thousands.
The formula table gives us the formulas below.

Interval for: Difference between Two Means ($\sigma$ unknown, variances assumed equal), $D = \mu_1 - \mu_2$.
Confidence Interval: $D = d \pm t_{\alpha/2}\, s_d$, where $s_d = \hat{s}_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$ and $DF = n_1 + n_2 - 2$.
Hypotheses: $H_0: D = D_0$, $H_1: D \ne D_0$.
Test Ratio: $t = \dfrac{d - D_0}{s_d}$.
Critical Value: $d_{cv} = D_0 \pm t_{\alpha/2}\, s_d$.
Pooled variance: $\hat{s}_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$.

$DF = (n_1 - 1) + (n_2 - 1) = 17 + 11 = 28 = n_1 + n_2 - 2$. $d = \bar{x}_1 - \bar{x}_2 = 48.26670 - 55.00000 = -6.7333$.
$\hat{s}_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{17(13.57763)^2 + 11(11.74125)^2}{28} = \dfrac{3133.984619 + 1516.426467}{28} = 166.0861$.
This is the pooled variance. $\hat{s}_p = 12.8874$. $\alpha = .10$ so $t^{(28)}_{.10} = 1.313$.
$s_d = \hat{s}_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = \sqrt{\hat{s}_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{166.0861\left(\dfrac{1}{18} + \dfrac{1}{12}\right)} = \sqrt{166.0861(.0555555 + .0833333)} = \sqrt{23.0675} = 4.8029$.
Recall that our alternate hypothesis is $H_1: D < 0$ so this is a left-sided test.
(Only one of the following methods is needed!) Test Ratio: $t = \dfrac{d - D_0}{s_d}$ or $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$.
If this test ratio lies below $-t^{(28)}_{.10} = -1.313$, reject $H_0$.
$t = \dfrac{d - D_0}{s_d} = \dfrac{-6.7333 - 0}{4.8029} = -1.401$. Make a
diagram with zero in the middle showing a shaded ‘reject’ region below -1.313. Since -1.401 falls in the
'reject' region, reject $H_0$. Or you can say that, since 1.401 falls between $t^{(28)}_{.10} = 1.313$ and $t^{(28)}_{.05} = 1.701$, for
a one-sided test, $.05 < p\text{-value} < .10$. Since the p-value is below $\alpha = .10$, reject $H_0$.
Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\, s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_d$. For a left-sided test we want a
critical value below $D_0 = 0$. If $d = \bar{x}_1 - \bar{x}_2$ is below the critical value, reject $H_0$.
$d_{CV} = D_0 - t_\alpha s_d = 0 - 1.313(4.8029) = -6.3062$. Make a diagram with 0 in the middle showing a
shaded 'reject' region below -6.3062. Since $d = -6.7333$ falls in the 'reject' region, reject $H_0$.
Confidence Interval: $D = d \pm t_{\alpha/2}\, s_d$ becomes $D \le d + t_\alpha s_d$ for a left sided test. We already know that
$t_\alpha s_d = 1.313(4.8029) = 6.3062$ so we can say $D \le -6.7333 + 6.3062$ or $D \le -0.4271$. Make a diagram
with -6.7333 in the middle. Represent the confidence interval by shading the area below -0.4271. Since zero
is not in this area, reject $H_0$. Or simply note that it is impossible for $D \ge 0$ as stated by the null hypothesis
and $D \le -0.4271$.
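The pooled-variance arithmetic for 4d can be reproduced as follows. This is a sketch of the hand calculation with the data in thousands; the variable names are chosen for this illustration, and 1.313 is the table value already quoted above.

```python
import math

# Problem 4c data, expressed in thousands as the hint suggests.
n1, xbar1, s1 = 18, 48.26670, 13.57763  # women
n2, xbar2, s2 = 12, 55.00000, 11.74125  # men
T_CRIT = 1.313  # t for alpha = .10 with 28 degrees of freedom

df = n1 + n2 - 2
d = xbar1 - xbar2

# Pooled variance and the standard error of the difference between sample means.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
s_d = math.sqrt(sp2 * (1 / n1 + 1 / n2))

t = (d - 0) / s_d             # test ratio with D0 = 0
d_cv = 0 - T_CRIT * s_d       # critical value for a left-sided test
upper_bound = d + T_CRIT * s_d  # one-sided confidence bound for D

print(f"pooled variance = {sp2:.4f}, s_d = {s_d:.4f}")
print(f"t = {t:.3f}, critical value for d = {d_cv:.4f}, upper bound for D = {upper_bound:.4f}")
print("reject H0" if t < -T_CRIT else "do not reject H0")
```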
The Minitab run gives us the following. The instruction TwoT is followed by the size, mean and standard
deviation of sample 1 followed by the size, mean and standard deviation of sample 2. The ‘alternative -1’ is
always used to set a left sided test. Note that the pooled standard deviation and the value of t are identical
to the values that I got using my calculator.
MTB > TwoT 18 48266.70 13577.63 12 55000.00 11741.25;
SUBC>   Pooled;
SUBC>   Alternative -1.
Two-Sample T-Test and CI
Sample    N    Mean   StDev   SE Mean
1        18   48267   13578      3200
2        12   55000   11741      3389
Difference = mu (1) - mu (2)
Estimate for difference: -6733.30
95% upper bound for difference: 1437.00
T-Test of difference = 0 (vs <): T-Value = -1.40  P-Value = 0.086  DF = 28
Both use Pooled StDev = 12887.4400