Session 17: Comparison of Two Populations, Part 1 (Course: A0064 / Economic Statistics)

Course: A0064 / Economic Statistics
Year: 2005
Version: 1/1
Session 17
Comparison of Two Populations, Part 1
Learning Outcomes
By the end of this session, students are expected to be able to:
• Compare two paired observations and test the difference between two population means
Outline
• Comparison of Paired Observations
• Testing the Difference between Two Population Means
Complete Business Statistics, 5th edition
Chapter 8: The Comparison of Two Populations
• Using Statistics
• Paired-Observation Comparisons
• A Test for the Difference between Two Population Means Using Independent Random Samples
• A Large-Sample Test for the Difference between Two Population Proportions
• The F Distribution and a Test for the Equality of Two Population Variances
• Summary and Review of Terms
McGraw-Hill/Irwin, Aczel/Sounderpandian, © The McGraw-Hill Companies, Inc., 2002
8-1 Using Statistics
• Inferences about differences between parameters of two populations
• Paired observations: observe the same group of persons or things
  – At two different times: “before” and “after”
  – Under two different sets of circumstances or “treatments”
• Independent samples: observe different groups of persons or things
  – At different times or under different sets of circumstances
8-2 Paired-Observation Comparisons
• Population parameters may differ at two different times or under two different sets of circumstances or treatments because:
  – The circumstances differ between times or treatments
  – The people or things in the different groups are themselves different
• By looking at paired observations, we are able to minimize the “between group,” extraneous variation.
Paired-Observation Comparisons
of Means
Test statistic for the paired-observations t test:

$$t = \frac{\bar{D} - \mu_{D_0}}{s_D / \sqrt{n}}$$

where $\bar{D}$ is the sample average difference between each pair of observations, $s_D$ is the sample standard deviation of these differences, and the sample size, $n$, is the number of pairs of observations. The symbol $\mu_{D_0}$ is the population mean difference under the null hypothesis. When the null hypothesis is true and the population mean difference is $\mu_{D_0}$, the statistic has a t distribution with $(n - 1)$ degrees of freedom.
Example 8-1
A random sample of 16 viewers of Home Shopping Network was selected for an experiment. All viewers in the
sample had recorded the amount of money they spent shopping during the holiday season of the previous year.
The next year, these people were given access to the cable network and were asked to keep a record of their total
purchases during the holiday season. Home Shopping Network managers want to test the null hypothesis that
their service does not increase shopping volume, versus the alternative hypothesis that it does.
Shopper  Previous  Current  Diff
1        334       405      71
2        150       125      -25
3        520       540      20
4        95        100      5
5        212       200      -12
6        30        30       0
7        1055      1200     145
8        300       265      -35
9        85        90       5
10       129       206      77
11       40        18       -22
12       440       489      49
13       610       590      -20
14       208       310      102
15       880       995      115
16       25        75       50

$H_0: \mu_D \le 0$
$H_1: \mu_D > 0$
df = (n - 1) = (16 - 1) = 15
Test statistic: $t = \dfrac{\bar{D} - \mu_{D_0}}{s_D / \sqrt{n}}$
Critical value: $t_{0.05} = 1.753$
Do not reject $H_0$ if $t \le 1.753$
Reject $H_0$ if $t > 1.753$
Example 8-1: Solution
$$t = \frac{\bar{D} - \mu_{D_0}}{s_D / \sqrt{n}} = \frac{32.81 - 0}{55.75 / \sqrt{16}} = 2.354$$

t = 2.354 > 1.753, so $H_0$ is rejected and we conclude that there is evidence that shopping volume by network viewers has increased, with a p-value between 0.01 and 0.025. The Template output gives a more exact p-value of 0.0163. See the next slide for the output.

[Figure: t distribution with df = 15, showing the nonrejection region below $t_{0.05} = 1.753$ and the rejection region above it. The test statistic 2.354 lies between $t_{0.025} = 2.131$ and $t_{0.01} = 2.602$.]
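The calculation in Example 8-1 can be checked with a short script. This is only a sketch using Python's standard library; the function name `paired_t` is chosen here for illustration, and the critical value 1.753 is copied from the slide rather than computed.

```python
import math
import statistics

# Amounts spent by the 16 shoppers in Example 8-1 (previous vs. current season).
previous = [334, 150, 520, 95, 212, 30, 1055, 300, 85, 129, 40, 440, 610, 208, 880, 25]
current = [405, 125, 540, 100, 200, 30, 1200, 265, 90, 206, 18, 489, 590, 310, 995, 75]


def paired_t(before, after, mu_d0=0.0):
    """Return (mean difference, sd of differences, t statistic) for paired data."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation (divisor n - 1)
    t = (d_bar - mu_d0) / (s_d / math.sqrt(n))
    return d_bar, s_d, t


d_bar, s_d, t_stat = paired_t(previous, current)

T_CRIT = 1.753  # t_0.05 with 15 df, taken from the slide (not computed here)
reject_h0 = t_stat > T_CRIT  # right-tailed test: H1 is mu_D > 0

print(f"mean diff = {d_bar:.2f}, s_D = {s_d:.2f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Working from the raw differences rather than the rounded summary figures avoids carrying rounding error into the t statistic.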
Example 8-1: Template for Testing
Paired Differences
Example 8-2
It has recently been asserted that returns on stocks may change once a story about a company appears in The Wall Street Journal column “Heard on the Street.” An investments analyst collects a random sample of 50 stocks that were recommended as winners by the editor of “Heard on the Street,” and proceeds to conduct a two-tailed test of whether or not the annualized return on stocks recommended in the column differs between the month before and the month after the recommendation. For each stock the analyst computes the return before and the return after the event, and computes the difference in the two return figures. He then computes the average and standard deviation of the differences.
$H_0: \mu_D = 0$
$H_1: \mu_D \ne 0$
n = 50
$\bar{D}$ = 0.1%
$s_D$ = 0.05%

Test statistic:

$$z = \frac{\bar{D} - \mu_{D_0}}{s_D / \sqrt{n}} = \frac{0.1 - 0}{0.05 / \sqrt{50}} = 14.14$$

p-value: $2\,P(z > 14.14) \approx 0$
This test result is highly significant, and $H_0$ may be rejected at any reasonable level of significance.
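As a rough check of Example 8-2, the z statistic and its two-tailed p-value can be computed from the summary figures. This sketch assumes the large-sample normal approximation used on the slide; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

n = 50        # number of stocks (pairs of before/after returns)
d_bar = 0.1   # mean difference in returns, in percent
s_d = 0.05    # sample standard deviation of the differences, in percent
mu_d0 = 0.0   # mean difference under the null hypothesis

# Large-sample z statistic for paired differences.
z = (d_bar - mu_d0) / (s_d / math.sqrt(n))

# Two-tailed p-value from the standard normal distribution.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-tailed p-value = {p_two_tailed:.3g}")
```

With a z value this far in the tail, the p-value is zero to floating-point precision, which matches the slide's statement that it is approximately 0.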
Confidence Intervals for Paired
Observations
A $(1 - \alpha)100\%$ confidence interval for the mean difference $\mu_D$:

$$\bar{D} \pm t_{\alpha/2} \frac{s_D}{\sqrt{n}}$$

where $t_{\alpha/2}$ is the value of the t distribution with $(n - 1)$ degrees of freedom that cuts off an area of $\alpha/2$ to its right. When the sample size is large, we may use $z_{\alpha/2}$ instead.
Confidence Intervals for Paired
Observations – Example 8-2
95% confidence interval for the data in Example 8-2:

$$\bar{D} \pm z_{\alpha/2} \frac{s_D}{\sqrt{n}} = 0.1 \pm 1.96 \frac{0.05}{\sqrt{50}} = 0.1 \pm (1.96)(0.0071) = 0.1 \pm 0.014 = [0.086, 0.114]$$

Note that this confidence interval does not include the value 0.
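The interval for Example 8-2 can be reproduced as follows. This is a sketch, not the textbook template: the helper name `paired_ci_large_sample` is hypothetical, and it uses the normal quantile because the sample is large (n = 50).

```python
import math
from statistics import NormalDist


def paired_ci_large_sample(d_bar, s_d, n, confidence=0.95):
    """Large-sample confidence interval for the mean paired difference."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    half_width = z * s_d / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width


low, high = paired_ci_large_sample(d_bar=0.1, s_d=0.05, n=50)
contains_zero = low <= 0 <= high

print(f"95% CI: [{low:.3f}, {high:.3f}], contains 0: {contains_zero}")
```

Because the interval excludes 0, it agrees with the two-tailed test, which rejects the hypothesis of no difference at the 5% level.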
Confidence Intervals for Paired
Observations – Example 8-2 Using the
Template
8-3 A Test for the Difference between Two Population
Means Using Independent Random Samples
• When paired data cannot be obtained, use independent random samples drawn at different times or under different circumstances.
  – Large-sample test if:
    • Both $n_1 \ge 30$ and $n_2 \ge 30$ (Central Limit Theorem), or
    • Both populations are normal and $\sigma_1$ and $\sigma_2$ are both known
  – Small-sample test if:
    • Both populations are normal and $\sigma_1$ and $\sigma_2$ are unknown
Comparisons of Two Population Means:
Testing Situations
• I: Difference between two population means is 0 ($\mu_1 = \mu_2$)
  – $H_0: \mu_1 - \mu_2 = 0$
  – $H_1: \mu_1 - \mu_2 \ne 0$
• II: Difference between two population means is less than or equal to 0 ($\mu_1 \le \mu_2$)
  – $H_0: \mu_1 - \mu_2 \le 0$
  – $H_1: \mu_1 - \mu_2 > 0$
• III: Difference between two population means is less than or equal to D ($\mu_1 \le \mu_2 + D$)
  – $H_0: \mu_1 - \mu_2 \le D$
  – $H_1: \mu_1 - \mu_2 > D$
Comparisons of Two Population Means:
Test Statistic
Large-sample test statistic for the difference between two population means:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

The term $(\mu_1 - \mu_2)_0$ is the difference between $\mu_1$ and $\mu_2$ under the null hypothesis. It is equal to zero in situations I and II, and it is equal to the prespecified value D in situation III. The term in the denominator is the standard deviation of the difference between the two sample means (it relies on the assumption that the two samples are independent).
Two-Tailed Test for Equality of Two
Population Means: Example 8-3
Is there evidence to conclude that the average monthly charge in the entire population of American Express Gold
Card members is different from the average monthly charge in the entire population of Preferred Visa
cardholders?
Population 1: Preferred Visa
$n_1 = 1200$, $\bar{x}_1 = 452$, $\sigma_1 = 212$
Population 2: Gold Card
$n_2 = 800$, $\bar{x}_2 = 523$, $\sigma_2 = 185$

$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(452 - 523) - 0}{\sqrt{\dfrac{212^2}{1200} + \dfrac{185^2}{800}}} = \frac{-71}{\sqrt{80.2346}} = \frac{-71}{8.96} = -7.926$$

p-value: $2\,P(z < -7.926) \approx 0$
$H_0$ is rejected at any common level of significance.
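The arithmetic in Example 8-3 can be verified with a small helper. This is a minimal sketch; `two_sample_z` is a name introduced here for illustration, and the 1% two-tailed critical value is obtained from the standard normal quantile function.

```python
import math
from statistics import NormalDist


def two_sample_z(x1, x2, sigma1, sigma2, n1, n2, diff0=0.0):
    """Large-sample z statistic for the difference between two means."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # sd of (x1_bar - x2_bar)
    return ((x1 - x2) - diff0) / se


# Population 1: Preferred Visa; population 2: Gold Card.
z = two_sample_z(x1=452, x2=523, sigma1=212, sigma2=185, n1=1200, n2=800)

p_value = 2 * NormalDist().cdf(-abs(z))   # two-tailed p-value
z_crit = NormalDist().inv_cdf(0.995)      # about 2.576 for a 1% two-tailed test
reject_h0 = abs(z) > z_crit

print(f"z = {z:.3f}, p-value = {p_value:.3g}, reject H0 at 1%: {reject_h0}")
```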
Example 8-3: Carrying Out the Test
[Figure: standard normal distribution with rejection regions below $-z_{0.01} = -2.576$ and above $z_{0.01} = 2.576$, and the nonrejection region between them. The test statistic, -7.926, lies far in the lower rejection region.]

Since the value of the test statistic is far below the lower critical point, the null hypothesis may be rejected, and we may conclude that there is a statistically significant difference between the average monthly charges of Gold Card and Preferred Visa cardholders.
Example 8-3: Using the Template
One-Tailed Test for the Difference Between Two Population Means: Example 8-4
Is there evidence to substantiate Duracell’s claim that their batteries last, on average, at least 45 minutes longer
than Energizer batteries of the same size?
Population 1: Duracell
$n_1 = 100$, $\bar{x}_1 = 308$, $\sigma_1 = 84$
Population 2: Energizer
$n_2 = 100$, $\bar{x}_2 = 254$, $\sigma_2 = 67$

$H_0: \mu_1 - \mu_2 \le 45$
$H_1: \mu_1 - \mu_2 > 45$

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(308 - 254) - 45}{\sqrt{\dfrac{84^2}{100} + \dfrac{67^2}{100}}} = \frac{9}{\sqrt{115.45}} = \frac{9}{10.75} = 0.838$$

p-value: $P(z > 0.838) = 0.201$
$H_0$ may not be rejected at any common level of significance.
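Example 8-4 differs from the previous example in two ways: the hypothesized difference is 45 rather than 0, and the test is right-tailed. The sketch below shows both adjustments; the helper name `two_sample_z` is illustrative, not part of the textbook template.

```python
import math
from statistics import NormalDist


def two_sample_z(x1, x2, sigma1, sigma2, n1, n2, diff0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = diff0 (or <= diff0)."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((x1 - x2) - diff0) / se


# Population 1: Duracell; population 2: Energizer; claimed difference D = 45 minutes.
z = two_sample_z(x1=308, x2=254, sigma1=84, sigma2=67, n1=100, n2=100, diff0=45)

# Right-tailed p-value, because H1 is mu1 - mu2 > 45.
p_value = 1 - NormalDist().cdf(z)
reject_at_10_percent = p_value < 0.10

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0 at 10%: {reject_at_10_percent}")
```

The observed difference of 54 minutes exceeds 45, but by less than one standard error, which is why the claim is not substantiated.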
One-Tailed Test for the Difference Between Two Population Means: Example 8-4 – Using the Template
Is there evidence to substantiate Duracell’s claim that their batteries last, on average, at least 45 minutes longer
than Energizer batteries of the same size?
Confidence Intervals for the Difference
between Two Population Means
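A large-sample $(1 - \alpha)100\%$ confidence interval for the difference between two population means, $\mu_1 - \mu_2$, using independent random samples:

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

A 95% confidence interval using the data in Example 8-3 (difference taken as Gold Card minus Preferred Visa, that is, $\mu_2 - \mu_1$):

$$(\bar{x}_2 - \bar{x}_1) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (523 - 452) \pm 1.96 \sqrt{\frac{212^2}{1200} + \frac{185^2}{800}} = [53.44, 88.56]$$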
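The 95% interval for the Example 8-3 data can be reproduced with the sketch below. The function name `two_sample_ci` is hypothetical, and the difference is taken as Gold Card minus Preferred Visa to match the slide's (523 - 452).

```python
import math
from statistics import NormalDist


def two_sample_ci(xa, xb, sigma_a, sigma_b, na, nb, confidence=0.95):
    """Large-sample confidence interval for mu_a - mu_b (independent samples)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(sigma_a**2 / na + sigma_b**2 / nb)
    diff = xa - xb
    return diff - half_width, diff + half_width


# a = Gold Card, b = Preferred Visa
low, high = two_sample_ci(xa=523, xb=452, sigma_a=185, sigma_b=212, na=800, nb=1200)

print(f"95% CI for the difference in mean monthly charge: [{low:.2f}, {high:.2f}]")
```

The interval lies entirely above 0, consistent with rejecting equality of the two means in Example 8-3.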
8-4 A Test for the Difference between Two Population
Means: Assuming Equal Population Variances
• If we can assume that the population variances $\sigma_1^2$ and $\sigma_2^2$ are equal (even though unknown), then the two sample variances, $s_1^2$ and $s_2^2$, provide two separate estimators of the common population variance. Combining the two separate estimates into a pooled estimate should give us a better estimate than either sample variance by itself.

[Figure: dot plots of Sample 1 and Sample 2 around their means $\bar{x}_1$ and $\bar{x}_2$, showing the deviation from the mean for each sample data point.]

From sample 1 we get the estimate $s_1^2$ with $(n_1 - 1)$ degrees of freedom.
From sample 2 we get the estimate $s_2^2$ with $(n_2 - 1)$ degrees of freedom.
From both samples together we get a pooled estimate, $s_p^2$, with $(n_1 - 1) + (n_2 - 1) = (n_1 + n_2 - 2)$ total degrees of freedom.
Pooled Estimate of the Population
Variance
A pooled estimate of the common population variance, based on a sample variance $s_1^2$ from a sample of size $n_1$ and a sample variance $s_2^2$ from a sample of size $n_2$, is given by:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

The degrees of freedom associated with this estimator is:
df = $(n_1 + n_2 - 2)$

The pooled estimate of the variance is a weighted average of the two individual sample variances, with weights proportional to the degrees of freedom of the two samples, $(n_1 - 1)$ and $(n_2 - 1)$. That is, larger weight is given to the variance from the larger sample.
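A brief sketch may help show the weighted-average interpretation. The numbers below are hypothetical and chosen only so the arithmetic is easy to follow; `pooled_variance` is an illustrative name.

```python
def pooled_variance(s1, n1, s2, n2):
    """Pooled estimate of a common variance from two sample standard deviations."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * s1**2 + df2 * s2**2) / (df1 + df2)


# Hypothetical samples: a large sample with s1 = 10 and a small one with s2 = 20.
s1, n1 = 10.0, 31
s2, n2 = 20.0, 11

sp2 = pooled_variance(s1, n1, s2, n2)

# The same value written as a weighted average of the two sample variances,
# with weights equal to each sample's share of the total degrees of freedom.
w1 = (n1 - 1) / (n1 + n2 - 2)
weighted_average = w1 * s1**2 + (1 - w1) * s2**2

print(f"pooled variance = {sp2}, weight on sample 1 = {w1}")
```

Here sample 1 contributes 30 of the 40 degrees of freedom, so the pooled variance sits closer to its variance of 100 than to the smaller sample's variance of 400.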
Using the Pooled Estimate of the
Population Variance
The estimate of the standard deviation of $(\bar{x}_1 - \bar{x}_2)$ is given by:

$$\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

Test statistic for the difference between two population means, assuming equal population variances:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

where $(\mu_1 - \mu_2)_0$ is the difference between the two population means under the null hypothesis (zero or some other number D).
The number of degrees of freedom of the test statistic is df = $(n_1 + n_2 - 2)$ (the number of degrees of freedom associated with $s_p^2$, the pooled estimate of the population variance).
Example 8-5
Do the data provide sufficient evidence to conclude that average percentage increase in the CPI differs when oil
sells at these two different prices?
Population 1: Oil price = $27.50
$n_1 = 14$, $\bar{x}_1 = 0.317\%$, $s_1 = 0.12\%$
Population 2: Oil price = $20.00
$n_2 = 9$, $\bar{x}_2 = 0.21\%$, $s_2 = 0.11\%$

$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$
df = $(n_1 + n_2 - 2) = (14 + 9 - 2) = 21$

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\left( \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \right) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{0.107}{\sqrt{0.00247}} = \frac{0.107}{0.0497} = 2.154$$

Critical point: $t_{0.025} = 2.080$
$H_0$ may be rejected at the 5% level of significance.
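The pooled-variance t statistic for Example 8-5 can be recomputed from the summary figures. This is a sketch under the equal-variance assumption stated in this section; `pooled_t` is an illustrative name and the critical value 2.080 is copied from the slide rather than computed.

```python
import math


def pooled_t(x1, s1, n1, x2, s2, n2, diff0=0.0):
    """Pooled-variance t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))           # sd of (x1_bar - x2_bar)
    return ((x1 - x2) - diff0) / se, df


# Percentage increase in the CPI at the two oil prices (values in percent).
t_stat, df = pooled_t(x1=0.317, s1=0.12, n1=14, x2=0.21, s2=0.11, n2=9)

T_CRIT = 2.080  # t_0.025 with 21 df, taken from the slide (not computed here)
reject_h0 = abs(t_stat) > T_CRIT  # two-tailed test at the 5% level

print(f"t = {t_stat:.3f}, df = {df}, reject H0 at 5%: {reject_h0}")
```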
Example 8-5: Using the Template
Do the data provide sufficient evidence to conclude that average percentage increase in the CPI differs when oil
sells at these two different prices?
P-value = 0.0430, so reject $H_0$ at the 5% significance level.
Example 8-6
The manufacturers of compact disk players want
to test whether a small price reduction is enough
to increase sales of their product. Is there
evidence that the small price reduction is enough
to increase sales of compact disk players?
Population 1: Before Reduction
$n_1 = 15$, $\bar{x}_1 = \$6598$, $s_1 = \$844$
Population 2: After Reduction
$n_2 = 12$, $\bar{x}_2 = \$6870$, $s_2 = \$669$

$H_0: \mu_2 - \mu_1 \le 0$
$H_1: \mu_2 - \mu_1 > 0$
df = $(n_1 + n_2 - 2) = (15 + 12 - 2) = 25$

$$t = \frac{(\bar{x}_2 - \bar{x}_1) - (\mu_2 - \mu_1)_0}{\sqrt{\left( \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \right) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(6870 - 6598) - 0}{\sqrt{\left( \dfrac{(14)844^2 + (11)669^2}{15 + 12 - 2} \right) \left( \dfrac{1}{15} + \dfrac{1}{12} \right)}} = \frac{272}{\sqrt{89375.25}} = \frac{272}{298.96} = 0.91$$

Critical point: $t_{0.10} = 1.316$
$H_0$ may not be rejected even at the 10% level of significance.
Example 8-6: Using the Template
P-value = 0.1858, so do not reject $H_0$ at the 5% significance level.
Example 8-6: Continued
[Figure: t distribution with df = 25, showing the nonrejection region below $t_{0.10} = 1.316$ and the rejection region above it. The test statistic, 0.91, lies in the nonrejection region.]

Since the test statistic is less than $t_{0.10}$, the null hypothesis cannot be rejected at any reasonable level of significance. We conclude that the price reduction does not significantly affect sales.
Confidence Intervals Using the Pooled
Variance
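A $(1 - \alpha)100\%$ confidence interval for the difference between two population means, $\mu_1 - \mu_2$, using independent random samples and assuming equal population variances:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

A 95% confidence interval using the data in Example 8-6 (difference taken as after minus before the reduction, that is, $\mu_2 - \mu_1$):

$$(\bar{x}_2 - \bar{x}_1) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = (6870 - 6598) \pm 2.06 \sqrt{(595835)(0.15)} = [-343.85, 887.85]$$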
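The interval for Example 8-6 can be checked numerically. In this sketch the helper name `pooled_ci` is hypothetical, the t value 2.06 (25 df, upper 2.5% point) is copied from the slide, and the difference is taken as after minus before.

```python
import math


def pooled_ci(xa, sa, na, xb, sb, nb, t_crit):
    """Confidence interval for mu_a - mu_b using the pooled variance."""
    df = na + nb - 2
    sp2 = ((na - 1) * sa**2 + (nb - 1) * sb**2) / df  # pooled variance
    half_width = t_crit * math.sqrt(sp2 * (1 / na + 1 / nb))
    diff = xa - xb
    return diff - half_width, diff + half_width, sp2


# a = after the price reduction, b = before the price reduction
low, high, sp2 = pooled_ci(xa=6870, sa=669, na=12, xb=6598, sb=844, nb=15, t_crit=2.06)

print(f"pooled variance = {sp2:.0f}, 95% CI: [{low:.2f}, {high:.2f}]")
```

The interval contains 0, which agrees with the test result that the increase in sales is not statistically significant.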
Confidence Intervals Using the Pooled Variance and the Template: Example 8-6
Closing
• The discussion continues with Topic 18 (Comparison of Two Populations, Part 2)