Tests of Significance - Sakai

Tests of Significance
I. 1 Sample z-Test
Consider the following question:
A coin is tossed 100 times, giving 58 heads. Does the coin seem to be “fair”?
Test at the 1% level of significance.
(1) Every legitimate test of significance involves a box model.
Box model (fair coin): one ticket marked 1 (heads) and one ticket marked 0 (tails).
AV = .5, SD = .5
100 draws: EV = (100)(.5) = 50, SE = $\sqrt{100}\,(.5) = 5$
(2) To make a test of significance, the null hypothesis has to be formulated as a
statement about the box model.
$H_0$: The observed number of heads (58) is not significantly greater than the
expected number of heads (50); any difference is due to chance variation.
(3) Calculate the appropriate test statistic.
z-Statistic = $\dfrac{\text{Observed} - \text{Expected}}{\text{SE}} = \dfrac{58 - 50}{5} = \dfrac{8}{5} = 1.6$
(4) Calculate the P-value from the normal table.
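[Figure: normal curve centered at the expected value 50 (z = 0), with the observed value 58 at z = 1.6. The area between z = -1.6 and z = 1.6 is 89%, so the right-tail area is P ≈ (100% - 89%)/2 = 5.5%.]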
Note: The P-value is the probability of getting a value more extreme than the
observed value; you will be finding the area of 1 tail of the normal curve.
(5) Compare the P-value to the level of significance. If the P-value is less than the
given level of significance, then reject the null; this would suggest that the
difference between the observed value and the expected value is due to some
factor other than chance variation. If the P-value is not less than the given level
of significance, then fail to reject the null; this would suggest that the difference
is probably due to chance variation.
P ≈ 5.5% > 1% ⇒ fail to reject the null ⇒ there is no reason to believe that the
coin is not fair.
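The five steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the variable names are assumptions made for this sketch, and `NormalDist` stands in for the normal table, so the P-value may differ from the table value in the last digit.

```python
from math import sqrt
from statistics import NormalDist

# Box model for a fair coin: tickets 1 (heads) and 0 (tails).
box_av, box_sd = 0.5, 0.5
draws, observed = 100, 58
level = 0.01                       # 1% level of significance

expected = draws * box_av          # EV = (100)(.5)
se = sqrt(draws) * box_sd          # SE = sqrt(100)(.5)
z = (observed - expected) / se     # z-statistic

# One-tailed P-value: area under the normal curve to the right of z.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P = {p_value:.1%}")
print("reject the null" if p_value < level else "fail to reject the null")
```

Changing `observed`, `draws`, or `level` reruns the same test for a different experiment with the same box.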
Note: The 1 Sample z-Test is used when comparing the results of an experiment
(sample) with an established standard or ideal situation (population box model).
When comparing percents, use $z = \dfrac{\text{Observed (\%)} - \text{Expected (\%)}}{\text{SE (\%)}}$.
When comparing averages, use $z = \dfrac{\text{Observed (Ave)} - \text{Expected (Ave)}}{\text{SE (Ave)}}$.
II. 1 Sample t-Test
Consider the following question:
A car company states that its 2002 Celantra gets an average of 32 mpg. A consumer
testing company randomly selects 8 Celantras with the following gas mileages (in
mpg): 29, 33, 30, 30, 28, 31, 34, and 29. Do the data support the car company’s
claim? Test at the 5% level of significance.
(1) Make a box model.
Box (company's standard): AV = 32 mpg, SD+ = $(1.94)\sqrt{8/7} \approx 2.07$ mpg
8 draws
Sample: AV = 30.5 mpg, SD = 1.94 mpg
SE (Ave) = $\dfrac{2.07}{\sqrt{8}} \approx .73$ mpg
(2) Formulate the null hypothesis.
$H_0$: The observed average of 30.5 mpg is not significantly less than the expected
average of 32 mpg.
(3) Calculate the t-Statistic.
t-Statistic = $\dfrac{\text{Observed (Ave)} - \text{Expected (Ave)}}{\text{SE (Ave)}} = \dfrac{30.5 - 32}{.73} \approx -2.05$
(4) Calculate the P-value from the Student’s t-curve with df = 7.
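[Figure: Student's t-curve with df = 7, centered at 32 mpg (t = 0), with the observed average 30.5 mpg at t = -2.05. The area to the left of t = -1.89 is 5% and the area to the left of t = -2.36 is 2.5%; P is the area to the left of t = -2.05.]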
Thus, 2.5% < P < 5%.
(5) Since P-value < 5%, reject the null. The data do not seem to support the car
company’s claim.
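As a rough check on this example, the sketch below recomputes the t-statistic from the raw mileages and approximates the tail area numerically instead of reading it from a t-table. The helper names `t_pdf` and `t_upper_tail` are assumptions of this sketch, not library functions; the tail area comes from Simpson's rule applied to the Student's t density.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the Student's t-curve with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area to the right of t >= 0, by Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

mileages = [29, 33, 30, 30, 28, 31, 34, 29]
claimed_av = 32
n = len(mileages)

sample_av = mean(mileages)                # observed average
sd_plus = stdev(mileages)                 # SD+ (divides by n - 1)
se_ave = sd_plus / math.sqrt(n)           # SE of the average
t = (sample_av - claimed_av) / se_ave

# The curve is symmetric, so the area to the left of t equals the area to the right of |t|.
p_value = t_upper_tail(abs(t), df=n - 1)

print(f"t = {t:.2f}, P = {p_value:.1%}")
```

The computed P-value should fall between the 2.5% and 5% bounds found from the table.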
Note: The 1 sample t-test will be used instead of the 1 sample z-test whenever the
sample size is small (N ≤ 26), the SD of the population is unknown, and the
population is normally distributed.
III. 2 Sample z-Test
Consider the following question:
A testing laboratory is testing the life of air conditioning compressors produced
by two different companies. A random sample of 400 compressors was taken
from company A, giving an average life of 110 months with SD of 60 months.
Also, a random sample of 100 compressors was taken from company B, giving an
average life of 90 months with SD of 40 months. Do the compressors of company
A last significantly longer than those of company B? Test at the 1% level of
significance.
(1) Make the box models.
Company A (all compressors): AV = ?, SD = 60
400 draws
Sample (Company A): AV = 110, SD = 60
SE (Ave) = $\dfrac{60}{\sqrt{400}} = 3$

Company B (all compressors): AV = ?, SD = 40
100 draws
Sample (Company B): AV = 90, SD = 40
SE (Ave) = $\dfrac{40}{\sqrt{100}} = 4$

SE for Difference of Averages = $\sqrt{3^2 + 4^2} = 5$
(2) Formulate the null hypothesis.
$H_0$: The observed difference of the averages (110 months – 90 months =
20 months) is not significantly greater than the expected difference of the
averages (0 months).
(3) Calculate the z-statistic.
$z = \dfrac{\text{Observed Difference} - \text{Expected Difference}}{\text{SE for Difference}} = \dfrac{20 - 0}{5} = \dfrac{20}{5} = 4$
(4) Calculate the P-value from the normal curve.
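[Figure: normal curve centered at a difference of 0 (z = 0), with the observed difference 20 at z = 4. The area between z = -4 and z = 4 is 99.9937%, so the right-tail area is P ≈ (100% - 99.9937%)/2 = .00315%.]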
(5) Since P-value < 1%, reject null. Thus, the difference between the averages seems
to be significant; company A’s compressors seem to last longer than those from
company B.
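The SE for a difference of two independent averages combines the two separate SEs as the square root of the sum of their squares. A minimal sketch of the whole calculation follows; the function name `se_of_average` is an assumption of this sketch, and `NormalDist` replaces the normal table.

```python
from math import sqrt
from statistics import NormalDist

def se_of_average(sd, n):
    """SE of a sample average: SD divided by the square root of the number of draws."""
    return sd / sqrt(n)

se_a = se_of_average(60, 400)             # Company A
se_b = se_of_average(40, 100)             # Company B
se_diff = sqrt(se_a**2 + se_b**2)         # SE for the difference of averages

observed_diff = 110 - 90
expected_diff = 0                         # the null hypothesis says there is no difference
z = (observed_diff - expected_diff) / se_diff

# One-tailed P-value: area under the normal curve to the right of z.
p_value = 1 - NormalDist().cdf(z)

print(f"SE(diff) = {se_diff}, z = {z}, P = {p_value:.5%}")
```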
IV. 2 Sample t-Test
Consider the following question:
A high school counselor wishes to test the effectiveness of a SAT prep course.
One class of 10 students takes the prep course and another class of 15 students
does not receive any special instruction. The SAT is given to both classes at the
end of 8 weeks. The first class scored an average of 1180 with SD of 120, and the
second scored an average of 1000 with SD of 160. Is the difference in the
average scores significant? Test at the 1% level of significance.
(1) Make the box models.
Population for SAT Prep Class: AV = ?, SD+ = $120\sqrt{10/9} \approx 126.5$
10 draws
SAT Prep Class (sample): AV = 1180, SD = 120
SE (Ave) = $\dfrac{126.5}{\sqrt{10}} \approx 40$

Population for Regular Class: AV = ?, SD+ = $160\sqrt{15/14} \approx 165.6$
15 draws
Regular Class (sample): AV = 1000, SD = 160
SE (Ave) = $\dfrac{165.6}{\sqrt{15}} \approx 42.76$

SE for Difference of Averages = $\sqrt{40^2 + (42.76)^2} \approx 58.6$
(2) Formulate the null hypothesis.
$H_0$: The observed difference of the averages (1180 – 1000 = 180) is not
significantly greater than the expected difference of the averages (0).
(3) Calculate the t-statistic.
$t = \dfrac{\text{Observed (Diff)} - \text{Expected (Diff)}}{\text{SE (Diff)}} = \dfrac{180 - 0}{58.6} \approx 3.07$
(4) Calculate the P-value from the Student’s t-curve with combined df =
df (1st sample) + df (2nd sample) = 9 + 14 = 23.
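[Figure: Student's t-curve with df = 23, centered at a difference of 0 (t = 0), with the observed difference 180 at t = 3.07. The area to the right of t = 2.81 is 0.5%; P is the area to the right of t = 3.07.]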
(5) Since P-value < 0.5% < 1%, reject the null. Thus, the difference between the averages
seems to be significant; the SAT prep course seems to have improved the scores.
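The calculation in this example can be sketched as follows. The code follows this handout's procedure (separate SEs built from SD+, combined df = 9 + 14), which is not necessarily what statistical software does by default; packages often use a pooled SD or a Welch-adjusted df instead. The function name and the constant holding the table value are assumptions of this sketch.

```python
from math import sqrt

def se_from_sample(sd, n):
    """SE of the average for a small sample, using SD+ = SD * sqrt(n / (n - 1))."""
    sd_plus = sd * sqrt(n / (n - 1))
    return sd_plus / sqrt(n)

se_prep = se_from_sample(120, 10)         # SAT prep class
se_regular = se_from_sample(160, 15)      # regular class
se_diff = sqrt(se_prep**2 + se_regular**2)

t = ((1180 - 1000) - 0) / se_diff
df = (10 - 1) + (15 - 1)                  # combined df, as in step (4)

# Table value taken from the example: with df = 23, the area to the right of t = 2.81 is 0.5%.
T_CRIT_HALF_PERCENT = 2.81
reject_at_1_percent = t > T_CRIT_HALF_PERCENT

print(f"SE(diff) = {se_diff:.1f}, t = {t:.2f}, df = {df}, reject: {reject_at_1_percent}")
```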
V. Matched Difference (1 Sample z-Test or t-Test)
Consider the following question:
A systems analyst is testing the possibility of using a new computer system. In
order to make a decision, a sample of seven jobs was selected and the processing
time in seconds was recorded on the old and the new systems with the following
results:
Job    1    2    3    4    5    6    7
Old    8    4   10    9    8    7   12
New    6    3    7    8    5    8    9
Is there sufficient evidence to conclude that the old system uses more processing
time? Test at the 0.5% level of significance.
We could use a 2 sample t-test to compare the average processing time on the old
system with the average processing time on the new system. However, there is a
more effective technique that can be used. Since the same seven jobs are used for
both samples of processing times, we can use the natural pairing or matching between
these times and do a 1 sample t-test on the seven differences. This technique
eliminates chance variation between two different jobs which would occur if we use
the 2 sample test.
(1) Make the box model.
The seven differences in processing times are: 8 – 6 = 2, 4 – 3 = 1, 10 – 7 = 3,
9 – 8 = 1, 8 – 5 = 3, 7 – 8 = – 1, and 12 – 9 = 3.
Box (matched differences in processing times for all jobs): AV = 0 (*), SD+ = $1.385\sqrt{7/6} \approx 1.496$ sec
7 draws
Sample (differences in processing times for the 7 jobs): AV = 1.714 sec, SD = 1.385 sec
SE (Ave) = $\dfrac{1.496}{\sqrt{7}} \approx 0.565$ sec
Note (*): In this problem, we assume that the box model is “theoretical”,
consisting of a large number of tickets that are normally distributed and have
an average of 0.
(2) Formulate the null hypothesis.
$H_0$: The observed average of the differences (1.714 sec) is not significantly
greater than the expected average of the differences (0 sec).
(3) Calculate the t-statistic.
$t = \dfrac{\text{Observed (Ave of Differences)} - \text{Expected (Ave of Differences)}}{\text{SE (Ave of Differences)}} = \dfrac{1.714 \text{ sec} - 0 \text{ sec}}{0.565 \text{ sec}} \approx 3.03$
(4) Calculate the P-value from the Student’s t-curve with df = 6.
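[Figure: Student's t-curve with df = 6, centered at 0 sec (t = 0), with the observed average difference 1.714 sec at t = 3.03. The area to the right of t = 2.45 is 2.5% and the area to the right of t = 3.14 is 1%; P is the area to the right of t = 3.03.]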
Thus, 1% < P < 2.5%.
(5) Since P > 0.5%, fail to reject the null. Thus, the difference in processing times
between the old system and the new system does not seem to be significant.
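To illustrate why matching helps, the sketch below computes the matched t-statistic from the seven differences and, for comparison, the 2 sample t-statistic obtained by treating the old and new times as unrelated samples. The variable names are assumptions of this sketch; `stdev` from the standard library divides by n - 1, so it plays the role of SD+.

```python
from math import sqrt
from statistics import mean, stdev

old = [8, 4, 10, 9, 8, 7, 12]
new = [6, 3, 7, 8, 5, 8, 9]
n = len(old)

# Matched approach: a 1 sample t-test on the seven differences.
diffs = [o - w for o, w in zip(old, new)]
se_matched = stdev(diffs) / sqrt(n)
t_matched = (mean(diffs) - 0) / se_matched

# 2 sample approach, ignoring the pairing: the job-to-job variation inflates the SE.
se_two_sample = sqrt(stdev(old) ** 2 / n + stdev(new) ** 2 / n)
t_two_sample = (mean(old) - mean(new)) / se_two_sample

print(f"matched:  SE = {se_matched:.3f}, t = {t_matched:.2f}")
print(f"2 sample: SE = {se_two_sample:.3f}, t = {t_two_sample:.2f}")
```

Both statistics have the same numerator (the average difference equals the difference of the averages); only the SE changes, which is why the matched test gives the larger t.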
VI. $\chi^2$-Test (Goodness of Fit)
Consider the following question:
A die is rolled 60 times with the following results:
Number on Die        1    2    3    4    5    6
Observed Frequency   4    6   17   16    8    9
Are these results significantly different from what we would expect from a fair
die? Test at the 1% level of significance.
We could use a 1 sample z-test to test each of the 6 numbers individually. In each of
the 6 cases, the box model would be the same: one ticket marked 1 and five tickets marked 0.
AV = $\dfrac{1}{6}$, SD = $\sqrt{\dfrac{1}{6} \cdot \dfrac{5}{6}} \approx .373$
60 draws: EV = $\dfrac{1}{6}(60) = 10$, SE = $.373\sqrt{60} \approx 2.9$
For instance, if we test whether the results for “6” are significantly different from
what we would expect, we would get $z = \dfrac{9 - 10}{2.9} \approx -.34$ with a corresponding P-value
of 37% and no significant difference. However, if we test the results for “3”, we
would get $z = \dfrac{17 - 10}{2.9} \approx 2.41$ with a corresponding P-value of 0.8% and a significant
difference. The $\chi^2$-test allows us to test all 6 numbers at once.
(1) Make a box model for each of the 6 numbers. We need to calculate the EV, not
the SE.
Box: one ticket marked 1 and five tickets marked 0; AV = $\dfrac{1}{6}$
60 draws: EV = $\dfrac{1}{6}(60) = 10$
(2) Formulate the null hypothesis.
$H_0$: There is no significant difference between the observed frequencies and the
expected frequencies for a fair die.
(3) Calculate the $\chi^2$-statistic.
Number on Die        1    2    3    4    5    6
Observed Frequency   4    6   17   16    8    9
Expected Frequency  10   10   10   10   10   10
$\chi^2 = \sum \dfrac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \dfrac{(4 - 10)^2}{10} + \dfrac{(6 - 10)^2}{10} + \dfrac{(17 - 10)^2}{10} + \dfrac{(16 - 10)^2}{10} + \dfrac{(8 - 10)^2}{10} + \dfrac{(9 - 10)^2}{10} = 14.2$
(4) Calculate the P-value from the Chi-Square table with df = 6 – 1 = 5.
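[Figure: $\chi^2$-curve with df = 5. The area to the right of $\chi^2 = 11.07$ is 5% and the area to the right of 15.09 is 1%; P is the area to the right of the observed value 14.2.]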
Thus, 1% < P < 5%.
(5) Since P > 1%, then fail to reject null. There is no reason to believe that the die is
not fair.
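The goodness-of-fit calculation can be sketched as below. Instead of a Chi-Square table, the sketch approximates the tail area by integrating the chi-square density with Simpson's rule; `chi2_pdf` and `chi2_upper_tail` are hypothetical helper names written for this sketch, and the integration assumes df of at least 2 so the density stays finite at 0.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square curve with df degrees of freedom."""
    return x ** (df / 2 - 1) * math.exp(-x / 2) / (2 ** (df / 2) * math.gamma(df / 2))

def chi2_upper_tail(x, df, steps=2000):
    """Area to the right of x, by Simpson's rule on [0, x] (assumes df >= 2, steps even)."""
    h = x / steps
    total = chi2_pdf(0.0, df) + chi2_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, df)
    return 1 - total * h / 3

observed = [4, 6, 17, 16, 8, 9]
# Fair die: each face is expected in 1/6 of the rolls.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi2_upper_tail(chi2, df)

print(f"chi-square = {chi2:.1f}, df = {df}, P = {p_value:.1%}")
```

The computed P-value should land between the 1% and 5% bounds read from the table.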
VII. $\chi^2$-Test (Independence Between 2 Attributes)
Consider the following question:
Wake Forest University recorded the statistics shown below in the table for its
1992 – 1993 annual giving campaign:
Class                1970   1980   1990
Contributed           192    182    325
Did Not Contribute    174    260    586
Are the giving patterns independent of the class year? Test at the 5% level of
significance.
(1) Make the box models. First, we must get the column totals and the row totals.
Class                1970   1980   1990   Total
Contributed           192    182    325     699
Did Not Contribute    174    260    586    1020
Totals                366    442    911    1719
Thus, the ratio of “contributed” in the population to “did not contribute” in the
population is 699 to 1020. If the giving patterns are independent of the class year,
then the ratio of “contributed” in each individual class to “did not contribute” in
that same class should be the same as for the population. In order to calculate the
expected number of “contributed” in the class 1970, use the following box model:
Box: 699 tickets marked 1 and 1020 tickets marked 0; AV = $\dfrac{699}{1719}$
366 draws: EV = $\dfrac{699}{1719}(366) \approx 149$
Since the expected number of “contributed” in the class 1970 is 149, then the
expected number of “did not contribute” in 1970 is 366 – 149 = 217. In order to
calculate the expected number of “contributed” in the classes 1980 and 1990, use
the same box model except with 442 draws and 911 draws:
1980: EV = $\dfrac{699}{1719}(442) \approx 180$
1990: EV = $\dfrac{699}{1719}(911) \approx 370$
Also, the expected number of “did not contribute” in 1980 is 442 – 180 = 262
and in 1990 is 911 – 370 = 541. Thus, the completed contingency table would be:
Class                    1970          1980          1990       Total
                      Obs   Exp     Obs   Exp     Obs   Exp
Contributed           192   149     182   180     325   370       699
Did Not Contribute    174   217     260   262     586   541      1020
Totals                366   366     442   442     911   911      1719
(2) Formulate the null hypothesis.
$H_0$: The giving patterns are independent of the class year.
(3) Calculate the $\chi^2$-statistic.
$\chi^2 = \dfrac{(192 - 149)^2}{149} + \dfrac{(182 - 180)^2}{180} + \dfrac{(325 - 370)^2}{370} + \dfrac{(174 - 217)^2}{217} + \dfrac{(260 - 262)^2}{262} + \dfrac{(586 - 541)^2}{541} \approx 30.18$
(4) Calculate the P-value. If the contingency table has C columns and R rows, then
df = (C – 1)(R – 1). Thus, in this example, df = (3 – 1)(2 –1) = 2(1) = 2.
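[Figure: $\chi^2$-curve with df = 2. The area to the right of $\chi^2 = 9.21$ is 1%; P is the area to the right of the observed value 30.18.]
Thus, P < 1%.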
(5) Since P < 5%, reject the null. It seems as though the giving patterns are not
independent of the class year.
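The expected counts and the statistic for this table can be reproduced with the short sketch below. It rounds each expected count to a whole number, as the worked example does; keeping the unrounded expected counts would give a slightly different statistic. The variable names are assumptions of this sketch. For df = 2 the tail area of the chi-square curve has the closed form exp(-x/2), so no table or numerical integration is needed here.

```python
from math import exp

# Observed counts per class: (contributed, did not contribute).
observed = {1970: (192, 174), 1980: (182, 260), 1990: (325, 586)}

total_contributed = sum(c for c, _ in observed.values())
grand_total = sum(c + d for c, d in observed.values())

chi2 = 0.0
expected = {}
for year, (contrib, did_not) in observed.items():
    class_total = contrib + did_not
    # EV of "contributed" for this class, rounded to a whole number as in the example.
    exp_contrib = round(class_total * total_contributed / grand_total)
    exp_did_not = class_total - exp_contrib
    expected[year] = (exp_contrib, exp_did_not)
    chi2 += (contrib - exp_contrib) ** 2 / exp_contrib
    chi2 += (did_not - exp_did_not) ** 2 / exp_did_not

columns, rows = len(observed), 2
df = (columns - 1) * (rows - 1)

# Valid only for df = 2: the area to the right of x is exp(-x / 2).
p_value = exp(-chi2 / 2)

print(expected)
print(f"chi-square = {chi2:.2f}, df = {df}, P = {p_value:.2e}")
```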