3.1 The C.I. and Hypothesis Testing: Normal and T

3. Basic Statistical Inference
3.1. The C.I. and Hypothesis Testing: Normal and T
(I) $100(1-\alpha)\%$ confidence intervals:
\[
(\text{point estimate}) \pm \left[ \left( z_{\alpha/2},\ t_{n-1,\,\alpha/2},\ \text{or } t_{n_1+n_2-2,\,\alpha/2} \right) \times (\text{standard deviation (error) of point estimate}) \right]
\]
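One sample:
(A) Population mean $\mu$:
(i) C.I.

Large sample ($n \ge 30$):
\[
\bar{x} \pm z_{\alpha/2}\, s_{\bar{X}} = \bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}
= \left[ \bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}} \right]
\]

Small sample ($n < 30$), normal population:
\[
\bar{x} \pm t_{n-1,\,\alpha/2}\, s_{\bar{X}} = \bar{x} \pm t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}}
= \left[ \bar{x} - t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}} \right]
\]

(ii) Sample Size and Margin of Error:
\[
n = \frac{z_{\alpha/2}^{2}\, s^{2}}{E^{2}}
\]
where $E$ is the margin of error.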
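The sample-size formula in (ii) can be read as the large-sample margin of error solved for $n$. The short derivation below is a sketch under two assumptions: $s$ is treated as a known planning value, and the result is rounded up to the next integer, as the worked examples in this section do.

```latex
E = z_{\alpha/2}\,\frac{s}{\sqrt{n}}
\;\Longrightarrow\;
\sqrt{n} = \frac{z_{\alpha/2}\, s}{E}
\;\Longrightarrow\;
n = \frac{z_{\alpha/2}^{2}\, s^{2}}{E^{2}}
```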
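(B) Population proportion $p$:
(i) C.I.:
\[
\bar{p} \pm z_{\alpha/2}\, s_{\bar{P}} = \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}}
= \left[ \bar{p} - z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}},\ \bar{p} + z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \right]
\]

(ii) Sample Size and Margin of Error:
\[
n = \frac{z_{\alpha/2}^{2}\, \bar{p}(1-\bar{p})}{E^{2}}
\]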
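Two samples:
(A) Population Mean Difference $\mu_1 - \mu_2$:

Large sample ($n_1 \ge 30,\ n_2 \ge 30$):
\[
(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\, s_{\bar{X}_1 - \bar{X}_2}
= (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\]
\[
= \left[ \bar{x}_1 - \bar{x}_2 - z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},\ 
\bar{x}_1 - \bar{x}_2 + z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \right]
\]

Small sample ($n_1 < 30,\ n_2 < 30$), normal populations:
\[
(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\,\alpha/2}\, s_{\bar{X}_1 - \bar{X}_2}
= (\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\,\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
\]
\[
= \left[ \bar{x}_1 - \bar{x}_2 - t_{n_1+n_2-2,\,\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)},\ 
\bar{x}_1 - \bar{x}_2 + t_{n_1+n_2-2,\,\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \right],
\]
where
\[
s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}
= \frac{\sum_{i=1}^{n_1} (x_{1,i} - \bar{x}_1)^2 + \sum_{i=1}^{n_2} (x_{2,i} - \bar{x}_2)^2}{n_1+n_2-2}.
\]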
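(B) Population Proportion Difference $p_1 - p_2$:
\[
(\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2}\, s_{\bar{P}_1 - \bar{P}_2}
= (\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1-\bar{p}_1)}{n_1} + \frac{\bar{p}_2(1-\bar{p}_2)}{n_2}}
\]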
(II) Hypothesis testing:
\[
\text{test statistic} = \frac{\text{point estimate} - \text{mean of point estimate under } H_0}{\text{standard deviation (error) of point estimate under } H_0}
\]
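One sample:
(A) Population mean $\mu$:

Large sample ($n \ge 30$):
\[
z = \frac{\bar{x} - \mu_0}{s_{\bar{X}}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
\]

Small sample ($n < 30$), normal population:
\[
t = \frac{\bar{x} - \mu_0}{s_{\bar{X}}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
\]

(B) Population proportion $p$:
\[
z = \frac{\bar{p} - p_0}{\sigma_{\bar{P}}} = \frac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}
\]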
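Two samples:
(A) Population Mean Difference $\mu_1 - \mu_2$:
(a) Large sample ($n_1 \ge 30,\ n_2 \ge 30$):
\[
z = \frac{\bar{x}_1 - \bar{x}_2 - \mu_0}{s_{\bar{X}_1 - \bar{X}_2}}
= \frac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
\]
(b) Small sample ($n_1 < 30,\ n_2 < 30$), normal populations:
\[
t = \frac{\bar{x}_1 - \bar{x}_2 - \mu_0}{s_{\bar{X}_1 - \bar{X}_2}}
= \frac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}
\]
2. Matched samples for testing $\mu_1 - \mu_2$:
\[
z = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}} \quad \text{or} \quad t = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}}.
\]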
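(B) Population Proportion Difference $p_1 - p_2$:

$H_0: p_1 - p_2 = 0$:
\[
z = \frac{\bar{p}_1 - \bar{p}_2}{s_{\bar{P}_1 - \bar{P}_2}}
= \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1-\bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}},
\]
where $\bar{p} = \dfrac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$ is the pooled sample proportion.

$H_0: p_1 - p_2 = c \ne 0$:
\[
z = \frac{\bar{p}_1 - \bar{p}_2 - c}{s_{\bar{P}_1 - \bar{P}_2}}
= \frac{\bar{p}_1 - \bar{p}_2 - c}{\sqrt{\dfrac{\bar{p}_1(1-\bar{p}_1)}{n_1} + \dfrac{\bar{p}_2(1-\bar{p}_2)}{n_2}}}
\]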
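Summary table:

| | Point estimate | Classical approach (critical values) | Null distribution (p-value) |
|---|---|---|---|
| One sample | $\bar{x},\ \bar{p}$ | $-z_{\alpha},\ z_{\alpha},\ z_{\alpha/2}$; $-t_{n-1,\alpha},\ t_{n-1,\alpha},\ t_{n-1,\alpha/2}$ | $Z,\ T(n-1)$ |
| Two samples | $\bar{x}_1 - \bar{x}_2,\ \bar{d},\ \bar{p}_1 - \bar{p}_2$ | $-z_{\alpha},\ z_{\alpha},\ z_{\alpha/2}$; $-t_{n_1+n_2-2,\alpha},\ t_{n_1+n_2-2,\alpha},\ t_{n_1+n_2-2,\alpha/2}$ | $Z,\ T(n_1+n_2-2)$ |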
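The two decision routes in the summary table (compare the statistic with a critical value, or compare a p-value with $\alpha$) can be sketched for the $Z$ null distribution as follows. This is a minimal illustration using only Python's standard library; the function name `z_test` and the tail labels are assumptions made for this sketch, not notation from the notes.

```python
from statistics import NormalDist


def z_test(z, alpha, tail):
    """Return (critical value, p-value, reject H0?) for a z statistic.

    tail is "lower" (Ha: <), "upper" (Ha: >) or "two" (Ha: !=).
    These labels are illustrative names chosen for this sketch.
    """
    std_normal = NormalDist()  # null distribution Z ~ N(0, 1)
    if tail == "lower":
        crit = -std_normal.inv_cdf(1 - alpha)        # -z_alpha
        p_value = std_normal.cdf(z)
        reject = z < crit
    elif tail == "upper":
        crit = std_normal.inv_cdf(1 - alpha)         # z_alpha
        p_value = 1 - std_normal.cdf(z)
        reject = z > crit
    elif tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)     # z_{alpha/2}
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        reject = abs(z) > crit
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    return crit, p_value, reject


# Hypothetical statistic z = 2.1, two-sided test at alpha = 0.05
crit, p_value, reject = z_test(2.1, 0.05, "two")
print(round(crit, 3), round(p_value, 4), reject)
```

Both routes give the same decision: the statistic falls beyond the critical value exactly when the p-value is below $\alpha$. The $T$ rows of the table work the same way, but the standard library has no t distribution, so critical values would have to come from a t table.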
Example 1:
A sample of size 40 provides a sample mean of 16.5 and a sample standard
deviation of 7.
(a) Find the 94% confidence interval for population mean.
(b) With a length of 2 at 90% confidence, what size sample would be required to
estimate the population mean?
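[solution:]
(a) A 94% confidence interval is
\[
\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}} = 16.5 \pm z_{0.03} \frac{7}{\sqrt{40}}
= 16.5 \pm 1.88 \times 1.107 = [14.42,\ 18.58].
\]
(b) A length of 2 gives $E = \frac{2}{2} = 1$, and 90% confidence gives $\alpha/2 = 0.05$, so
\[
n = \frac{z_{\alpha/2}^{2}\, s^{2}}{E^{2}} = \frac{z_{0.05}^{2} \times 7^{2}}{1^{2}}
= \frac{1.645^{2} \times 49}{1^{2}} = 132.59 \ \Rightarrow\ n = 133.
\]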
Example 2:
It is believed that the running time of movies is normally distributed with mean
equal to 140 minutes (i.e., $H_0: \mu = 140$ vs. $H_a: \mu \ne 140$). A sample of 4 movies
was taken and the following running times were obtained,
150 , 150, 180, 170.
At the 5% level of significance,
(a) test the hypothesis based on the classical hypothesis test procedure.
(b) using a p-value, test the hypothesis.
(c) using a confidence interval, test the hypothesis.
(d) With a margin of error of 5 or less at 95% confidence, what size sample is
required?
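[solution:]
\[
n = 4, \quad \bar{x} = \frac{150 + 150 + 180 + 170}{4} = 162.5, \quad
s = \sqrt{\frac{\sum_{i=1}^{4} (x_i - \bar{x})^2}{4 - 1}} = 15.
\]
(a)
\[
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{162.5 - 140}{15 / \sqrt{4}} = \frac{22.5}{7.5} = 3,
\qquad |t| = 3 < t_{n-1,\,\alpha/2} = t_{3,\,0.025} = 3.182
\]
$\Rightarrow$ do not reject $H_0$.
(b)
\[
p\text{-value} = P(|T(n-1)| > |t|) = P(|T(3)| > 3) > P(|T(3)| > 3.182) = 0.05 = \alpha
\]
$\Rightarrow$ do not reject $H_0$.
(c) A 95% confidence interval for $\mu$ is
\[
\bar{x} \pm t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}} = 162.5 \pm t_{3,\,0.025} \frac{15}{\sqrt{4}}
= 162.5 \pm 3.182 \times 7.5 = [138.63,\ 186.36].
\]
Since $140 \in [138.63,\ 186.36]$, we do not reject $H_0$.
(d)
\[
n = \frac{z_{\alpha/2}^{2}\, s^{2}}{E^{2}} = \frac{z_{0.025}^{2} \times 15^{2}}{5^{2}}
= \frac{1.96^{2} \times 225}{25} = 34.57 \ \Rightarrow\ n = 35.
\]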
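Parts (a), (c) and (d) of Example 2 can be reproduced as a sketch in standard-library Python. The critical value $t_{3,0.025} = 3.182$ is copied from the solution because the standard library has no t distribution; for the same reason the p-value in (b) is not computed here.

```python
import math
from statistics import NormalDist, mean, stdev

times = [150, 150, 180, 170]
mu0 = 140
t_crit = 3.182  # t_{3, 0.025}, table value taken from the worked solution

n = len(times)
xbar = mean(times)
s = stdev(times)            # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)       # estimated standard error of the mean

# (a) classical two-sided t test
t_stat = (xbar - mu0) / se
reject = abs(t_stat) > t_crit

# (c) 95% confidence interval; H0 is kept when mu0 lies inside it
ci = (xbar - t_crit * se, xbar + t_crit * se)
mu0_inside = ci[0] <= mu0 <= ci[1]

# (d) sample size for margin of error 5 at 95% confidence
z = NormalDist().inv_cdf(0.975)
n_required = math.ceil((z * s / 5) ** 2)

print(t_stat, reject, ci, mu0_inside, n_required)
```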
Example 3:
A random sample of 400 people was taken. 228 of the people in the sample
favored candidate A. We are interested in determining whether or not the
proportion in favor of candidate A is significantly more than 50%,
$H_0: p \le 0.5$ vs. $H_a: p > 0.5$.
(a) With $\alpha = 0.1$, test the hypothesis based on the classical hypothesis test
procedure.
(b) With $\alpha = 0.03$, test the hypothesis based on the p-value.
(c) Develop a 90% confidence interval estimate for the proportion in favor of
candidate A.
(d) With a margin of error of 0.01 or less at 95% confidence, what size sample
would be required to estimate the proportion in favor of candidate A?
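[solution:]
\[
n = 400, \quad p_0 = 0.5, \quad \bar{p} = \frac{228}{400} = 0.57, \quad
z = \frac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.57 - 0.5}{\sqrt{\dfrac{0.5(1-0.5)}{400}}} = 2.8
\]
(a) $\alpha = 0.1$:
\[
z = 2.8 > z_{\alpha} = z_{0.1} = 1.28 \ \Rightarrow\ \text{reject } H_0.
\]
(b) $\alpha = 0.03$:
\[
p\text{-value} = P(Z > z) = P(Z > 2.8) = 0.5 - 0.4974 = 0.0026 < \alpha = 0.03
\ \Rightarrow\ \text{reject } H_0.
\]
(c) A 90% confidence interval for $p$ is
\[
\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.57 \pm z_{0.05} \sqrt{\frac{0.57(1-0.57)}{400}}
= 0.57 \pm 1.645 \times 0.02475 = [0.5293,\ 0.6107].
\]
(d) $E = 0.01$, $\alpha = 0.05$. Then,
\[
n = \frac{z_{\alpha/2}^{2}\, \bar{p}(1-\bar{p})}{E^{2}} = \frac{z_{0.025}^{2} \times 0.57 \times (1-0.57)}{0.01^{2}}
= \frac{1.96^{2} \times 0.57 \times (1-0.57)}{0.01^{2}} = 9415.76 \ \Rightarrow\ n = 9416.
\]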
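As a check on Example 3, the sketch below recomputes all four parts with the standard library. It follows the solution in treating the test as upper-tailed; the exact normal quantiles replace the table values 1.28, 1.645 and 1.96, so intermediate figures differ slightly from the rounded ones shown.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()
n, favor, p0 = 400, 228, 0.5
p_bar = favor / n

# Test statistic: the standard error is evaluated under H0 (p = p0)
z = (p_bar - p0) / math.sqrt(p0 * (1 - p0) / n)

# (a) classical approach: upper-tail critical value z_{0.1}
reject_classical = z > std_normal.inv_cdf(1 - 0.10)

# (b) p-value approach: upper-tail area beyond z, compared with alpha = 0.03
p_value = 1 - std_normal.cdf(z)
reject_p = p_value < 0.03

# (c) 90% confidence interval: standard error evaluated at p_bar
se = math.sqrt(p_bar * (1 - p_bar) / n)
z05 = std_normal.inv_cdf(0.95)
ci = (p_bar - z05 * se, p_bar + z05 * se)

# (d) sample size for margin of error 0.01 at 95% confidence
z025 = std_normal.inv_cdf(0.975)
n_required = math.ceil(z025 ** 2 * p_bar * (1 - p_bar) / 0.01 ** 2)

print(z, reject_classical, p_value, reject_p, ci, n_required)
```

Note the two different standard errors: the test uses $p_0$ because it is computed under $H_0$, while the interval uses $\bar{p}$.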
Example 4:
Consider the following hypothesis test.
$H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \ne 0$.
The following data are for two independent samples taken from the two normal
populations with equal variances.

| Sample 1 | Sample 2 |
|---|---|
| 6, 10, 9, 8, 7 | 9, 12, 10, 11, 9 |

(a) With $\alpha = 0.05$, test the hypothesis based on the classical hypothesis test
procedure.
(b) With $\alpha = 0.01$, test the hypothesis based on the p-value.
(c) With $\alpha = 0.05$, use the confidence interval method to test the hypothesis.
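[solution:]
\[
n_1 = 5, \quad n_2 = 5, \quad \bar{x}_1 = 8, \quad \bar{x}_2 = 10.2, \quad s_1^2 = 2.5, \quad s_2^2 = 1.7, \quad \mu_0 = 0
\]
\[
s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \frac{4 \times 2.5 + 4 \times 1.7}{5+5-2} = 2.1
\]
(a)
\[
t = \frac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}
= \frac{8 - 10.2 - 0}{\sqrt{2.1 \left( \dfrac{1}{5} + \dfrac{1}{5} \right)}} = -2.4,
\qquad |t| = 2.4 > t_{n_1+n_2-2,\,\alpha/2} = t_{8,\,0.025} = 2.306.
\]
Therefore, we reject $H_0$.
(b)
\[
p\text{-value} = P(|T(n_1+n_2-2)| > |t|) = P(|T(8)| > 2.4) > P(|T(8)| > 3.355) = 0.01.
\]
Therefore, we do not reject $H_0$.
(c) A 95% confidence interval for $\mu_1 - \mu_2$ is
\[
(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\,\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
= (8 - 10.2) \pm t_{8,\,0.025} \sqrt{2.1 \left( \frac{1}{5} + \frac{1}{5} \right)}
= -2.2 \pm 2.306 \times 0.9165 = [-4.313,\ -0.086].
\]
Since $0 \notin [-4.313,\ -0.086]$, we reject $H_0$.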
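The pooled-variance calculation in Example 4 can be verified from the raw data. The sketch below uses the standard library only and takes the critical value $t_{8,0.025} = 2.306$ from the solution rather than computing it; the p-value in (b) is omitted for the same reason.

```python
import math
from statistics import mean, variance

sample1 = [6, 10, 9, 8, 7]
sample2 = [9, 12, 10, 11, 9]
t_crit = 2.306  # t_{8, 0.025}, table value taken from the worked solution

n1, n2 = len(sample1), len(sample2)
x1, x2 = mean(sample1), mean(sample2)
s1_sq, s2_sq = variance(sample1), variance(sample2)  # sample variances (n - 1)

# Pooled variance and the standard error of the difference in means
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))

# (a) two-sided t test of H0: mu1 - mu2 = 0
t_stat = (x1 - x2 - 0) / se
reject = abs(t_stat) > t_crit

# (c) 95% confidence interval for mu1 - mu2
diff = x1 - x2
ci = (diff - t_crit * se, diff + t_crit * se)
zero_inside = ci[0] <= 0 <= ci[1]

print(sp_sq, t_stat, reject, ci, zero_inside)
```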
Example 5:
To determine the effectiveness of a new weight control diet, 8 randomly selected
students observed the diet for 4 weeks with the results shown below.
| Dieter | Weight (before) | Weight (after) |
|---|---|---|
| A | 138 | 135 |
| B | 151 | 147 |
| C | 129 | 132 |
| D | 125 | 127 |
| E | 168 | 155 |
| F | 139 | 131 |
| G | 152 | 144 |
| H | 140 | 142 |
We would like to test the hypothesis $H_0: \mu_1 - \mu_2 = 2$, where $\mu_1$ and $\mu_2$ are
the mean weights of the students before and after taking the weight control diet,
respectively. The above data can be considered matched-sample data.
(a) For $\alpha = 0.1$, test the above hypothesis using the classical hypothesis
test.
(b) For $\alpha = 0.05$, use the confidence interval method to test the
above hypothesis.
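[solution:]
(a) The differences $d_i = \text{(weight before)} - \text{(weight after)}$ are

| $d_1$ | $d_2$ | $d_3$ | $d_4$ | $d_5$ | $d_6$ | $d_7$ | $d_8$ |
|---|---|---|---|---|---|---|---|
| 3 | 4 | -3 | -2 | 13 | 8 | 8 | -2 |

Therefore, $\bar{d} = 3.625$, $s_d = 5.7802$. Thus,
\[
t = \frac{\bar{d} - 2}{s_d / \sqrt{n}} = \frac{3.625 - 2}{5.7802 / \sqrt{8}} = 0.795,
\qquad |t| = 0.795 < 1.895 = t_{7,\,0.05} = t_{n-1,\,\alpha/2}.
\]
We do not reject $H_0$.
(b) A 95% confidence interval for $\mu_1 - \mu_2$ is
\[
\bar{d} \pm t_{n-1,\,\alpha/2} \frac{s_d}{\sqrt{n}} = 3.625 \pm t_{7,\,0.025} \times \frac{5.7802}{\sqrt{8}}
= 3.625 \pm 2.365 \times \frac{5.7802}{\sqrt{8}} = [-1.2081,\ 8.4581].
\]
Since $2 \in [-1.2081,\ 8.4581]$, we do not reject $H_0$.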
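The matched-sample calculation in Example 5 reduces to a one-sample analysis of the differences. The sketch below rebuilds the differences from the weight table and repeats both parts; the critical values 1.895 and 2.365 are copied from the solution, since the standard library offers no t quantiles.

```python
import math
from statistics import mean, stdev

before = [138, 151, 129, 125, 168, 139, 152, 140]  # dieters A..H
after = [135, 147, 132, 127, 155, 131, 144, 142]
d = [b - a for b, a in zip(before, after)]          # d_i = before - after

n = len(d)
d_bar, s_d = mean(d), stdev(d)
se = s_d / math.sqrt(n)

# (a) two-sided test of H0: mu1 - mu2 = 2 at alpha = 0.1
t_stat = (d_bar - 2) / se
reject_a = abs(t_stat) > 1.895       # t_{7, 0.05} from the solution

# (b) 95% confidence interval; reject only if 2 lies outside it
margin = 2.365 * se                  # t_{7, 0.025} from the solution
ci = (d_bar - margin, d_bar + margin)
reject_b = not (ci[0] <= 2 <= ci[1])

print(d, d_bar, round(s_d, 4), round(t_stat, 3), ci, reject_a, reject_b)
```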
Example 6:
The results of a recent poll on the preference of shoppers regarding two products
are shown below.
| Product | Shoppers Surveyed | Shoppers Favoring This Product |
|---|---|---|
| A | 800 | 560 |
| B | 900 | 612 |

Let $p_1$ be the proportion favoring product A and $p_2$ be the proportion
favoring product B.
(a) Develop a 90% confidence interval estimate for the difference $p_1 - p_2$
between the proportions favoring each product.
(b) Test $H_0: p_1 = p_2$ at $\alpha = 0.05$ based on the classical approach.
(c) Test $H_0: p_1 = p_2$ at $\alpha = 0.05$ based on the p-value method.
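[solution:]
\[
n_1 = 800, \quad n_2 = 900, \quad \bar{p}_1 = \frac{560}{800} = 0.7, \quad \bar{p}_2 = \frac{612}{900} = 0.68.
\]
\[
s_{\bar{P}_1 - \bar{P}_2} = \sqrt{\frac{\bar{p}_1(1-\bar{p}_1)}{n_1} + \frac{\bar{p}_2(1-\bar{p}_2)}{n_2}}
= \sqrt{\frac{0.7(1-0.7)}{800} + \frac{0.68(1-0.68)}{900}} = 0.02246
\]
(a) Thus, a 90% confidence interval is
\[
(\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2}\, s_{\bar{P}_1 - \bar{P}_2} = (0.7 - 0.68) \pm z_{0.05} \times 0.02246
= 0.02 \pm 0.03695 = [-0.01695,\ 0.05695].
\]
(b) The pooled sample proportion is
\[
\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} = \frac{560 + 612}{800 + 900} = \frac{1172}{1700} = 0.689
\]
and
\[
s_{\bar{P}_1 - \bar{P}_2} = \sqrt{\bar{p}(1-\bar{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
= \sqrt{0.689 (1 - 0.689) \left( \frac{1}{800} + \frac{1}{900} \right)} = 0.02249.
\]
Therefore,
\[
z = \frac{\bar{p}_1 - \bar{p}_2}{s_{\bar{P}_1 - \bar{P}_2}} = \frac{0.7 - 0.68}{0.02249} = 0.89,
\qquad |z| = 0.89 < 1.96 = z_{0.025} = z_{\alpha/2},
\]
and we do not reject $H_0$.
(c)
\[
p\text{-value} = P(|Z| > 0.89) = 0.3734 > 0.05 \ \Rightarrow\ \text{do not reject } H_0.
\]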
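Example 6 uses two different standard errors: the unpooled one for the confidence interval and the pooled one for the test of $H_0: p_1 = p_2$. The sketch below, in standard-library Python, recomputes both. Because it carries full precision instead of the rounded values 0.689 and 0.89, its results agree with the solution only to about three decimal places.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()
n1, x1 = 800, 560   # product A: surveyed, favoring
n2, x2 = 900, 612   # product B: surveyed, favoring
p1, p2 = x1 / n1, x2 / n2

# (a) 90% confidence interval with the unpooled standard error
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z05 = std_normal.inv_cdf(0.95)
ci = (p1 - p2 - z05 * se, p1 - p2 + z05 * se)

# (b) classical two-sided test with the pooled standard error
p_pool = (x1 + x2) / (n1 + n2)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se_pool
reject = abs(z) > std_normal.inv_cdf(0.975)

# (c) two-sided p-value
p_value = 2 * (1 - std_normal.cdf(abs(z)))

print(ci, round(z, 2), reject, round(p_value, 4))
```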