1 - a - Binus Repository

Course: A0392 - Economic Statistics
Year: 2010
Session 12
One-Way and Two-Way Analysis of Variance
Outline:
• One-way classification ANOVA table model
• ANOVA with equal replications
• ANOVA with unequal replications
Analysis of Variance
• Analysis of variance (ANOVA) is a method for testing the hypothesis that the means of three or more populations are equal.
• Assumptions
– Samples are drawn randomly and are mutually independent
– The populations are normally distributed
– The populations have equal variances
• Hypotheses
H0: μ1 = μ2 = … = μk
H1: at least two of the means are not equal
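Analysis of Variance
Sample from population number:
| | 1 | 2 | … | i | … | k | Total |
|---|---|---|---|---|---|---|---|
| | x11 | x21 | … | xi1 | … | xk1 | |
| | x12 | x22 | … | xi2 | … | xk2 | |
| | : | : | : | : | : | : | |
| | x1n | x2n | … | xin | … | xkn | |
| Total | T1 | T2 | … | Ti | … | Tk | T |
Ti is the total of all observations from population i.
T is the total of all observations from all populations.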
Computational Formulas for the Sums of Squares
To test the hypothesis above, the sum of squares for each source of variation must be determined.
Total sum of squares:
$$JKT = \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}^2 - \frac{T^2}{nk}$$
Treatment sum of squares:
$$JKP = \frac{\sum_{i=1}^{k} T_i^2}{n} - \frac{T^2}{nk}$$
Error sum of squares:
$$JKG = JKT - JKP$$
ANOVA Table and Rejection Region
| Source of variation | Degrees of freedom | Sum of squares | Mean square | F statistic |
|---|---|---|---|---|
| Treatment | k – 1 | JKP | KRP = JKP/(k – 1) | F = KRP/KRG |
| Error | k(n – 1) | JKG | KRG = JKG/(k(n – 1)) | |
| Total | nk – 1 | JKT | | |
H0 is rejected if F > F(α; k – 1; k(n – 1))
Example 1
As production manager, you want to compare the mean filling times of three filling machines. The data obtained are shown in the table. At the 0.05 significance level, is there a difference in the mean filling times?
| Machine 1 | Machine 2 | Machine 3 |
|---|---|---|
| 25.40 | 23.40 | 20.00 |
| 26.31 | 21.80 | 22.20 |
| 24.10 | 23.50 | 19.75 |
| 23.74 | 22.75 | 20.60 |
| 25.10 | 21.60 | 20.40 |
Solution
• Hypotheses:
H0: μ1 = μ2 = μ3
H1: not all of the means are equal
• Significance level α = 0.05
• Since df1 = treatment degrees of freedom = 2 and df2 = error degrees of freedom = 12, f(0.05; 2; 12) = 3.89.
So the rejection region is:
H0 is rejected if F > 3.89
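Data
| | Population 1 | Population 2 | Population 3 | Total |
|---|---|---|---|---|
| | 25.40 | 23.40 | 20.00 | |
| | 26.31 | 21.80 | 22.20 | |
| | 24.10 | 23.50 | 19.75 | |
| | 23.74 | 22.75 | 20.60 | |
| | 25.10 | 21.60 | 20.40 | |
| Total | 124.65 | 113.05 | 102.95 | 340.65 |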
Total Sum of Squares
$$JKT = \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}^2 - \frac{T^2}{nk}$$
$$= 25.40^2 + 26.31^2 + 24.10^2 + 23.74^2 + 25.10^2 + 23.40^2 + 21.80^2 + 23.50^2 + 22.75^2 + 21.60^2 + 20.00^2 + 22.20^2 + 19.75^2 + 20.60^2 + 20.40^2 - \frac{340.65^2}{5 \times 3}$$
$$= 58.2172$$
Treatment Sum of Squares and Error Sum of Squares
$$JKP = \frac{\sum_{i=1}^{k} T_i^2}{n} - \frac{T^2}{nk} = \frac{124.65^2 + 113.05^2 + 102.95^2}{5} - \frac{340.65^2}{5 \times 3} = 47.1640$$
$$JKG = JKT - JKP = 58.2172 - 47.1640 = 11.0532$$
ANOVA Table and Conclusion
| Source of variation | Degrees of freedom | Sum of squares | Mean square | F statistic |
|---|---|---|---|---|
| Treatment | 3 - 1 = 2 | 47.1640 | 23.5820 | F = 25.60 |
| Error | 15 - 3 = 12 | 11.0532 | 0.9211 | |
| Total | 15 - 1 = 14 | 58.2172 | | |
Since the computed F = 25.60 > 3.89, H0 is rejected.
So not all of the means are equal.
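The equal-sample-size formulas for JKT, JKP, and JKG can be checked with a short Python sketch. This is only an illustration under stated assumptions: the function name `one_way_anova_equal_n` and the dictionary keys are hypothetical, and the data are the three machine samples from Example 1.

```python
def one_way_anova_equal_n(samples):
    """One-way ANOVA via the computational formulas, for k samples of equal size n.

    Hypothetical helper written for illustration; not taken from the slides.
    """
    k = len(samples)
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same size")

    totals = [sum(s) for s in samples]            # T_i for each population
    grand_total = sum(totals)                     # T
    correction = grand_total ** 2 / (n * k)       # T^2 / (nk)

    jkt = sum(x * x for s in samples for x in s) - correction   # total SS
    jkp = sum(t * t for t in totals) / n - correction           # treatment SS
    jkg = jkt - jkp                                             # error SS

    krp = jkp / (k - 1)            # treatment mean square
    krg = jkg / (k * (n - 1))      # error mean square
    return {"JKT": jkt, "JKP": jkp, "JKG": jkg,
            "KRP": krp, "KRG": krg, "F": krp / krg}


machines = [
    [25.40, 26.31, 24.10, 23.74, 25.10],   # Machine 1
    [23.40, 21.80, 23.50, 22.75, 21.60],   # Machine 2
    [20.00, 22.20, 19.75, 20.60, 20.40],   # Machine 3
]
result = one_way_anova_equal_n(machines)
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

The printed sums of squares and the F ratio should agree with the ANOVA table for Example 1 up to rounding; the critical value 3.89 still has to come from an F table.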
Computational Formulas for the Sums of Squares
For unequal sample sizes
Total sum of squares:
$$JKT = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \frac{T^2}{N}$$
Treatment sum of squares:
$$JKP = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - \frac{T^2}{N}$$
Error sum of squares:
$$JKG = JKT - JKP$$
with $N = \sum_{i=1}^{k} n_i$
ANOVA Table
For unequal sample sizes
| Source of variation | Degrees of freedom | Sum of squares | Mean square | F statistic |
|---|---|---|---|---|
| Treatment | k – 1 | JKP | KRP = JKP/(k – 1) | F = KRP/KRG |
| Error | N – k | JKG | KRG = JKG/(N – k) | |
| Total | N – 1 | JKT | | |
Example 2
• In a biology experiment, four concentrations of a chemical are used to stimulate the growth of a certain type of plant over a given period of time. The following growth data, in centimeters, were recorded from the plants that survived.
• Is there a significant difference in mean growth caused by the four concentrations of the chemical?
• Use a significance level of 0.05.
| Concentration 1 | Concentration 2 | Concentration 3 | Concentration 4 |
|---|---|---|---|
| 8.2 | 7.7 | 6.9 | 6.8 |
| 8.7 | 8.4 | 5.8 | 7.3 |
| 9.4 | 8.6 | 7.2 | 6.3 |
| 9.2 | 8.1 | 6.8 | 6.9 |
| | 8.0 | 7.4 | 7.1 |
| | | 6.1 | |
Solution
• Hypotheses:
H0: μ1 = μ2 = μ3 = μ4
H1: not all of the means are equal
• Significance level α = 0.05
• Since df1 = treatment degrees of freedom = 3 and df2 = error degrees of freedom = 16, f(0.05; 3; 16) = 3.24.
So the rejection region is:
H0 is rejected if F > 3.24
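Data
| | Population 1 | Population 2 | Population 3 | Population 4 | Total |
|---|---|---|---|---|---|
| | 8.2 | 7.7 | 6.9 | 6.8 | |
| | 8.7 | 8.4 | 5.8 | 7.3 | |
| | 9.4 | 8.6 | 7.2 | 6.3 | |
| | 9.2 | 8.1 | 6.8 | 6.9 | |
| | | 8.0 | 7.4 | 7.1 | |
| | | | 6.1 | | |
| Total | 35.5 | 40.8 | 40.2 | 34.4 | 150.9 |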
Total Sum of Squares
$$JKT = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \frac{T^2}{N}$$
$$= 8.2^2 + 8.7^2 + 9.4^2 + 9.2^2 + 7.7^2 + 8.4^2 + 8.6^2 + 8.1^2 + 8.0^2 + 6.9^2 + 5.8^2 + 7.2^2 + 6.8^2 + 7.4^2 + 6.1^2 + 6.8^2 + 7.3^2 + 6.3^2 + 6.9^2 + 7.1^2 - \frac{150.9^2}{20}$$
$$= 19.350$$
Treatment Sum of Squares and Error Sum of Squares
$$JKP = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - \frac{T^2}{N} = \frac{35.5^2}{4} + \frac{40.8^2}{5} + \frac{40.2^2}{6} + \frac{34.4^2}{5} - \frac{150.9^2}{20} = 15.462$$
$$JKG = JKT - JKP = 19.350 - 15.462 = 3.888$$
ANOVA Table and Conclusion
| Source of variation | Degrees of freedom | Sum of squares | Mean square | F statistic |
|---|---|---|---|---|
| Treatment | 4 - 1 = 3 | 15.462 | 5.154 | F = 21.213 |
| Error | 20 - 4 = 16 | 3.888 | 0.243 | |
| Total | 20 - 1 = 19 | 19.350 | | |
Since the computed F = 21.213 > 3.24, H0 is rejected.
So not all of the means are equal.
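For unequal sample sizes the same calculation only changes the correction term (T²/N) and the divisors (n_i and N – k). The sketch below is a hedged illustration of those formulas applied to the Example 2 data; the function name `one_way_anova_unequal_n` is an assumption made for this example.

```python
def one_way_anova_unequal_n(samples):
    """One-way ANOVA via the computational formulas, allowing unequal n_i.

    Hypothetical helper written for illustration; not taken from the slides.
    """
    k = len(samples)
    sizes = [len(s) for s in samples]             # n_i
    big_n = sum(sizes)                            # N = sum of n_i
    totals = [sum(s) for s in samples]            # T_i
    correction = sum(totals) ** 2 / big_n         # T^2 / N

    jkt = sum(x * x for s in samples for x in s) - correction
    jkp = sum(t * t / n for t, n in zip(totals, sizes)) - correction
    jkg = jkt - jkp

    krp = jkp / (k - 1)          # treatment mean square
    krg = jkg / (big_n - k)      # error mean square
    return {"JKT": jkt, "JKP": jkp, "JKG": jkg,
            "KRP": krp, "KRG": krg, "F": krp / krg}


growth = [
    [8.2, 8.7, 9.4, 9.2],              # concentration 1
    [7.7, 8.4, 8.6, 8.1, 8.0],         # concentration 2
    [6.9, 5.8, 7.2, 6.8, 7.4, 6.1],    # concentration 3
    [6.8, 7.3, 6.3, 6.9, 7.1],         # concentration 4
]
anova2 = one_way_anova_unequal_n(growth)
for name, value in anova2.items():
    print(f"{name} = {value:.3f}")
```

With equal sample sizes this function reduces to the equal-n formulas, since N = nk and every n_i = n.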
Exercise 1
A contractor in the transport services business wants to know whether there is a significant difference in the load capacity of three brands of truck: Mitsubishi, Toyota, and Honda. The contractor samples 5 trucks of each brand, giving the data shown alongside.
If the three populations are normally distributed with equal variances, test at the 5% significance level whether the load capacity of the three truck brands differs.
Capacity
Mitsubishi
(A)
Toyota
(B)
Honda
(C)
44
42
46
43
45
47
48
44
45
45
45
44
46
44
43
Exercise 2
A high school teacher studies the effectiveness of several teaching methods. Given the data in the table alongside, test at the 5% significance level whether the four teaching methods produce the same results. (Assume the four sets of data are normally distributed with equal variances.)
Method
A
B
C
D
70
68
76
67
76
75
87
66
77
74
78
78
78
67
77
57
67
57
68
89
TWO-WAY ANALYSIS OF VARIANCE
(Randomized Block Design)
The ANOVA Procedure
• The ANOVA procedure for the randomized block design requires us to partition the sum of squares total (SST) into three groups: sum of squares due to treatments, sum of squares due to blocks, and sum of squares due to error.
• The formula for this partitioning is
SST = SSTR + SSBL + SSE
• The total degrees of freedom, n_T - 1, are partitioned such that k - 1 degrees of freedom go to treatments, b - 1 go to blocks, and (k - 1)(b - 1) go to the error term.
ANOVA Table for a Randomized Block Design
| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F |
|---|---|---|---|---|
| Treatments | SSTR | k - 1 | MSTR = SSTR/(k - 1) | MSTR/MSE |
| Blocks | SSBL | b - 1 | MSBL = SSBL/(b - 1) | |
| Error | SSE | (k - 1)(b - 1) | MSE = SSE/[(k - 1)(b - 1)] | |
| Total | SST | n_T - 1 | | |
Randomized Block Design
• Example: Crescent Oil Co.
Crescent Oil has developed three new blends of gasoline and must decide which blend or blends to produce and distribute. A study of the miles per gallon ratings of the three blends is being conducted to determine if the mean ratings are the same for the three blends.
Five automobiles have been tested using each of the three gasoline blends, and the miles per gallon ratings are shown in the table that follows.
Randomized Block Design
Type of Gasoline (Treatment)
| Automobile (Block) | Blend X | Blend Y | Blend Z | Block Means |
|---|---|---|---|---|
| 1 | 31 | 30 | 30 | 30.333 |
| 2 | 30 | 29 | 29 | 29.333 |
| 3 | 29 | 29 | 28 | 28.667 |
| 4 | 33 | 31 | 29 | 31.000 |
| 5 | 26 | 25 | 26 | 25.667 |
| Treatment Means | 29.8 | 28.8 | 28.4 | |
Randomized Block Design
• Mean Square Due to Treatments
The overall sample mean is 29. Thus,
$SSTR = 5[(29.8 - 29)^2 + (28.8 - 29)^2 + (28.4 - 29)^2] = 5.2$
$MSTR = 5.2/(3 - 1) = 2.6$
• Mean Square Due to Blocks
$SSBL = 3[(30.333 - 29)^2 + \dots + (25.667 - 29)^2] = 51.33$
$MSBL = 51.33/(5 - 1) = 12.8$
• Mean Square Due to Error
With SST = 62 (the sum of squared deviations of all 15 ratings from the overall mean),
$SSE = SST - SSTR - SSBL = 62 - 5.2 - 51.33 = 5.47$
$MSE = 5.47/[(3 - 1)(5 - 1)] = .68$
Randomized Block Design
• ANOVA Table
| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F |
|---|---|---|---|---|
| Treatments | 5.20 | 2 | 2.60 | 3.82 |
| Blocks | 51.33 | 4 | 12.80 | |
| Error | 5.47 | 8 | .68 | |
| Total | 62.00 | 14 | | |
Randomized Block Design
• Rejection Rule
p-Value Approach: Reject H0 if p-value < .05
Critical Value Approach: Reject H0 if F > 4.46
For α = .05, F.05 = 4.46 (2 d.f. numerator and 8 d.f. denominator)
Randomized Block Design
• Test Statistic
F = MSTR/MSE = 2.6/.68 = 3.82
• Conclusion
The p-value is greater than .05 (where F = 4.46) and less than .10 (where F = 3.11). (Excel provides a p-value of .07). Therefore, we cannot reject H0. There is insufficient evidence to conclude that the miles per gallon ratings differ for the three gasoline blends.
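The randomized block partition SST = SSTR + SSBL + SSE can be reproduced from the Crescent Oil table with a few lines of Python. This is a sketch, not the slides' own method: the function name `randomized_block_anova` and the row-per-automobile layout are assumptions. Because the code does not round MSE to .68 before dividing, its F ratio comes out near 3.80 rather than the 3.82 shown on the slide; the conclusion is unchanged.

```python
def randomized_block_anova(table):
    """ANOVA for a randomized block design.

    table[i][j] is the observation for block i under treatment j.
    Hypothetical helper written for illustration.
    """
    b = len(table)          # number of blocks
    k = len(table[0])       # number of treatments
    grand_mean = sum(sum(row) for row in table) / (b * k)

    treatment_means = [sum(row[j] for row in table) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]

    sst = sum((x - grand_mean) ** 2 for row in table for x in row)
    sstr = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssbl = k * sum((m - grand_mean) ** 2 for m in block_means)
    sse = sst - sstr - ssbl

    mstr = sstr / (k - 1)
    msbl = ssbl / (b - 1)
    mse = sse / ((k - 1) * (b - 1))
    return {"SST": sst, "SSTR": sstr, "SSBL": ssbl, "SSE": sse,
            "MSTR": mstr, "MSBL": msbl, "MSE": mse, "F": mstr / mse}


# Rows are automobiles (blocks); columns are Blend X, Blend Y, Blend Z.
mpg = [
    [31, 30, 30],
    [30, 29, 29],
    [29, 29, 28],
    [33, 31, 29],
    [26, 25, 26],
]
rbd = randomized_block_anova(mpg)
for name, value in rbd.items():
    print(f"{name} = {value:.3f}")
```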
• Happy studying, and good luck.
Supplementary Material:
Analysis of Variance
• The Completely Randomized Design:
One-Way Analysis of Variance
– ANOVA Assumptions
– F Test for Difference in c Means
– The Tukey-Kramer Procedure
General Experimental Setting
• Investigator Controls One or More
Independent Variables
– Called treatment variables or factors
– Each treatment factor contains two or more
groups (or levels)
• Observe Effects on Dependent Variable
– Response to groups (or levels) of independent
variable
• Experimental Design: The Plan Used to
Test Hypothesis
Completely Randomized Design
• Experimental Units (Subjects) are
Assigned Randomly to Groups
– Subjects are assumed to be homogeneous
• Only One Factor or Independent Variable
– With 2 or more groups (or levels)
• Analyzed by One-Way Analysis of
Variance (ANOVA)
Randomized Design Example
[Figure: a single factor (training method) with three factor levels (groups); units are randomly assigned to the groups, and the dependent variable (response) is recorded in hours: 21, 17, 31, 27, 25, 28, 29, 20, 22.]
One-Way Analysis of Variance
F Test
• Evaluate the Difference Among the Mean Responses of 2 or More (c) Populations
– E.g., Several types of tires, oven temperature
settings
• Assumptions
– Samples are randomly and independently drawn
• This condition must be met
– Populations are normally distributed
• F Test is robust to moderate departure from
normality
– Populations have equal variances
• Less sensitive to this requirement when samples
are of equal size from each population
Why ANOVA?
• Could Compare the Means One by One using Z or t Tests for Difference of Means
• Each Z or t Test Contains Type I Error
• The Total Type I Error with k Pairs of Means (treating the k tests as independent) is $1 - (1 - \alpha)^k$
– E.g., If there are 5 means and use α = .05
• Must perform 10 comparisons
• Type I Error is $1 - (.95)^{10} = .40$
• 40% of the time you will reject the null hypothesis of equal means in favor of the alternative when the null is true!
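The inflation of the Type I error under repeated pairwise tests is easy to tabulate. The sketch below assumes, as the formula on the slide does, that the pairwise tests behave as independent tests at the same α; the function name `familywise_error_rate` is hypothetical.

```python
from math import comb


def familywise_error_rate(num_means, alpha):
    """Return (number of pairwise comparisons, total Type I error).

    Uses 1 - (1 - alpha)^k with k = C(num_means, 2), which treats the
    k pairwise tests as independent (an approximation).
    """
    comparisons = comb(num_means, 2)
    return comparisons, 1 - (1 - alpha) ** comparisons


pairs, total_error = familywise_error_rate(5, 0.05)
print(pairs, round(total_error, 2))

# The error rate grows quickly as more means are compared.
for c in range(2, 8):
    k, rate = familywise_error_rate(c, 0.05)
    print(f"{c} means: {k} comparisons, total Type I error = {rate:.3f}")
```

A single F test at α = .05 avoids this growth, which is the motivation for using ANOVA instead of many two-sample tests.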
Hypotheses of One-Way ANOVA
• $H_0: \mu_1 = \mu_2 = \dots = \mu_c$
– All population means are equal
– No treatment effect (no variation in means among groups)
• $H_1$: Not all $\mu_i$ are the same
– At least one population mean is different (others may be the same!)
– There is a treatment effect
– Does not mean that all population means are different
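One-Way ANOVA (No Treatment Effect)
$H_0: \mu_1 = \mu_2 = \dots = \mu_c$
$H_1$: Not all $\mu_i$ are the same
[Figure: the null hypothesis is true; the group distributions coincide, $\mu_1 = \mu_2 = \mu_3$.]
One-Way ANOVA (Treatment Effect Present)
$H_0: \mu_1 = \mu_2 = \dots = \mu_c$
$H_1$: Not all $\mu_i$ are the same
[Figure: the null hypothesis is NOT true; at least one group distribution is shifted, e.g. $\mu_1 = \mu_2 \neq \mu_3$ or $\mu_1 \neq \mu_2 \neq \mu_3$.]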
One-Way ANOVA (Partition of Total Variation)
Total Variation SST = Variation Due to Group SSA + Variation Due to Random Sampling SSW
• SSA is commonly referred to as:
– Among Group Variation
– Sum of Squares Among
– Sum of Squares Between
– Sum of Squares Model
– Sum of Squares Explained
– Sum of Squares Treatment
• SSW is commonly referred to as:
– Within Group Variation
– Sum of Squares Within
– Sum of Squares Error
– Sum of Squares Unexplained
Total Variation
$$SST = \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$
$X_{ij}$: the i-th observation in group j
$n_j$: the number of observations in group j
$n$: the total number of observations in all groups
$c$: the number of groups
$\bar{\bar{X}} = \frac{1}{n}\sum_{j=1}^{c}\sum_{i=1}^{n_j} X_{ij}$: the overall or grand mean
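Total Variation (continued)
$$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{21} - \bar{\bar{X}})^2 + \dots + (X_{n_c c} - \bar{\bar{X}})^2$$
[Figure: response X plotted for Group 1, Group 2, and Group 3, with deviations measured from the grand mean.]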
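Among-Group Variation
$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2 \qquad MSA = \frac{SSA}{c - 1}$$
$\bar{X}_j$: The sample mean of group j
$\bar{\bar{X}}$: The overall or grand mean
Variation Due to Differences Among Groups
[Figure: two group distributions with means $\mu_i$ and $\mu_j$.]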
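Among-Group Variation (continued)
$$SSA = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \dots + n_c(\bar{X}_c - \bar{\bar{X}})^2$$
[Figure: response X for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ compared with the grand mean.]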
Within-Group Variation
$$SSW = \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \qquad MSW = \frac{SSW}{n - c}$$
$\bar{X}_j$: The sample mean of group j
$X_{ij}$: The i-th observation in group j
Summing the variation within each group and then adding over all groups
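Within-Group Variation (continued)
$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + \dots + (X_{n_c c} - \bar{X}_c)^2$$
[Figure: response X for Group 1, Group 2, and Group 3, with deviations measured from each group mean $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$.]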
Within-Group Variation (continued)
$$MSW = \frac{SSW}{n - c} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2 + \dots + (n_c - 1)S_c^2}{(n_1 - 1) + (n_2 - 1) + \dots + (n_c - 1)}$$
For c = 2, this is the pooled-variance in the t test.
• If more than 2 groups, use F Test.
• For 2 groups, use t test. F Test more limited.
One-Way ANOVA F Test Statistic
• Test Statistic
– $F = \frac{MSA}{MSW}$
• MSA is mean squares among
• MSW is mean squares within
• Degrees of Freedom
– $df_1 = c - 1$
– $df_2 = n - c$
One-Way ANOVA Summary Table
| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Squares (Variance) | F Statistic |
|---|---|---|---|---|
| Among (Factor) | c – 1 | SSA | MSA = SSA/(c – 1) | MSA/MSW |
| Within (Error) | n – c | SSW | MSW = SSW/(n – c) | |
| Total | n – 1 | SST = SSA + SSW | | |
Features of One-Way ANOVA
F Statistic
• The F Statistic is the Ratio of the Among
Estimate of Variance and the Within
Estimate of Variance
– The ratio must always be positive
– df1 = c -1 will typically be small
– df2 = n - c will typically be large
• The Ratio Should Be Close to 1 if the Null
is True
Features of One-Way ANOVA
F Statistic
(continued)
• If the Null Hypothesis is False
– The numerator should be greater than the
denominator
– The ratio should be larger than 1
One-Way ANOVA F Test Example
As production manager, you want to see if 3 filling machines have different mean filling times. You assign 15 similarly trained & experienced workers, 5 per machine, to the machines. At the .05 significance level, is there a difference in mean filling times?
| Machine1 | Machine2 | Machine3 |
|---|---|---|
| 25.40 | 23.40 | 20.00 |
| 26.31 | 21.80 | 22.20 |
| 24.10 | 23.50 | 19.75 |
| 23.74 | 22.75 | 20.60 |
| 25.10 | 21.60 | 20.40 |
One-Way ANOVA Example: Scatter Diagram
[Figure: scatter diagram of the filling times for Machine1, Machine2, and Machine3 on a vertical axis running from 19 to 27, with the group means and the grand mean marked.]
$\bar{X}_1 = 24.93$, $\bar{X}_2 = 22.61$, $\bar{X}_3 = 20.59$, $\bar{\bar{X}} = 22.71$
One-Way ANOVA Example Computations
| Machine1 | Machine2 | Machine3 |
|---|---|---|
| 25.40 | 23.40 | 20.00 |
| 26.31 | 21.80 | 22.20 |
| 24.10 | 23.50 | 19.75 |
| 23.74 | 22.75 | 20.60 |
| 25.10 | 21.60 | 20.40 |
$\bar{X}_1 = 24.93$, $\bar{X}_2 = 22.61$, $\bar{X}_3 = 20.59$, $\bar{\bar{X}} = 22.71$
$n_j = 5$, $c = 3$, $n = 15$
$SSA = 5[(24.93 - 22.71)^2 + (22.61 - 22.71)^2 + (20.59 - 22.71)^2] = 47.164$
$SSW = 4.2592 + 3.112 + 3.682 = 11.0532$
$MSA = SSA/(c - 1) = 47.1640/2 = 23.5820$
$MSW = SSW/(n - c) = 11.0532/12 = .9211$
Summary Table
| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Squares (Variance) | F Statistic |
|---|---|---|---|---|
| Among (Factor) | 3 - 1 = 2 | 47.1640 | 23.5820 | MSA/MSW = 25.60 |
| Within (Error) | 15 - 3 = 12 | 11.0532 | .9211 | |
| Total | 15 - 1 = 14 | 58.2172 | | |
One-Way ANOVA Example Solution
H0: μ1 = μ2 = μ3
H1: Not All Equal
α = .05
df1 = 2
df2 = 12
Critical Value(s): 3.89 (upper-tail F at α = 0.05)
Test Statistic: $F = \frac{MSA}{MSW} = \frac{23.5820}{.9211} = 25.6$
Decision: Reject at α = 0.05.
Conclusion: There is evidence that at least one μi differs from the rest.
The Tukey-Kramer Procedure
• Tells which Population Means are Significantly Different
– E.g., μ1 = μ2 ≠ μ3
– 2 groups whose means may be significantly different
[Figure: density curves f(X) illustrating μ1 = μ2 ≠ μ3.]
• Post Hoc (A Posteriori) Procedure
– Done after rejection of equal means in ANOVA
• Pairwise Comparisons
– Compare absolute mean differences with critical range
The Tukey-Kramer Procedure: Example
| Machine1 | Machine2 | Machine3 |
|---|---|---|
| 25.40 | 23.40 | 20.00 |
| 26.31 | 21.80 | 22.20 |
| 24.10 | 23.50 | 19.75 |
| 23.74 | 22.75 | 20.60 |
| 25.10 | 21.60 | 20.40 |
1. Compute absolute mean differences:
$|\bar{X}_1 - \bar{X}_2| = |24.93 - 22.61| = 2.32$
$|\bar{X}_1 - \bar{X}_3| = |24.93 - 20.59| = 4.34$
$|\bar{X}_2 - \bar{X}_3| = |22.61 - 20.59| = 2.02$
2. Compute critical range:
$$\text{Critical Range} = Q_{U(c,\,n-c)} \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 1.618$$
3. All of the absolute mean differences are greater than the critical range. There is a significant difference between each pair of means at the 5% level of significance.
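The three Tukey-Kramer steps can be sketched in Python. The slide gives only the resulting critical range (1.618), so the studentized range value `Q_UPPER = 3.77` used below is an assumption read from a standard table for α = .05 with c = 3 and n - c = 12; the function name `critical_range` is also hypothetical. MSW = .9211 and the group means are taken from the ANOVA example.

```python
from itertools import combinations
from math import sqrt


def critical_range(q_upper, msw, n_j, n_j_prime):
    """Tukey-Kramer critical range for one pair of groups."""
    return q_upper * sqrt(msw / 2 * (1 / n_j + 1 / n_j_prime))


# Assumed tabled value of the upper-tail studentized range statistic
# for alpha = .05, c = 3 groups and n - c = 12 degrees of freedom.
Q_UPPER = 3.77
MSW = 0.9211

means = {"Machine1": 24.93, "Machine2": 22.61, "Machine3": 20.59}
sizes = {"Machine1": 5, "Machine2": 5, "Machine3": 5}

significant = {}
for a, b in combinations(means, 2):
    cr = critical_range(Q_UPPER, MSW, sizes[a], sizes[b])
    diff = abs(means[a] - means[b])
    significant[(a, b)] = diff > cr
    print(f"{a} vs {b}: |diff| = {diff:.2f}, critical range = {cr:.3f}, "
          f"significant = {diff > cr}")
```

Because every group has 5 observations, the critical range is the same for all three pairs; with unequal group sizes it would be recomputed per pair, which is what the loop does.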
Levene’s Test for Homogeneity of Variance
• The Null Hypothesis
– $H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_c^2$
– The c population variances are all equal
• The Alternative Hypothesis
– $H_1$: Not all $\sigma_j^2$ are equal ($j = 1, 2, \dots, c$)
– Not all the c population variances are equal
Levene’s Test for
Homogeneity of Variance:
Procedure
1. For each observation in each group,
obtain the absolute value of the
difference between each observation and
the median of the group.
2. Perform a one-way analysis of variance
on these absolute differences.
Levene’s Test for Homogeneity of Variances: Example
As production manager, you want to see if 3 filling machines have different variance in filling times. You assign 15 similarly trained & experienced workers, 5 per machine, to the machines. At the .05 significance level, is there a difference in the variance in filling times?
| Machine1 | Machine2 | Machine3 |
|---|---|---|
| 25.40 | 23.40 | 20.00 |
| 26.31 | 21.80 | 22.20 |
| 24.10 | 23.50 | 19.75 |
| 23.74 | 22.75 | 20.60 |
| 25.10 | 21.60 | 20.40 |
Levene’s Test: Absolute Difference from the Median
Time
| | Machine1 | Machine2 | Machine3 |
|---|---|---|---|
| | 25.4 | 23.4 | 20 |
| | 26.31 | 21.8 | 22.2 |
| | 24.1 | 23.5 | 19.75 |
| | 23.74 | 22.75 | 20.6 |
| | 25.1 | 21.6 | 20.4 |
| median | 25.1 | 22.75 | 20.4 |
abs(Time - median(Time))
| Machine1 | Machine2 | Machine3 |
|---|---|---|
| 0.3 | 0.65 | 0.4 |
| 1.21 | 0.95 | 1.8 |
| 1 | 0.75 | 0.65 |
| 1.36 | 0 | 0.2 |
| 0 | 1.15 | 0 |
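Summary Table
SUMMARY
| Groups | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Machine1 | 5 | 3.87 | 0.774 | 0.35208 |
| Machine2 | 5 | 3.5 | 0.7 | 0.19 |
| Machine3 | 5 | 3.05 | 0.61 | 0.5005 |
ANOVA
| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 0.067453 | 2 | 0.033727 | 0.097048 | 0.908218 | 3.88529 |
| Within Groups | 4.17032 | 12 | 0.347527 | | | |
| Total | 4.237773 | 14 | | | | |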
Levene’s Test Example: Solution
$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2$
$H_1$: Not All Equal
α = .05
df1 = 2
df2 = 12
Critical Value(s): 3.89 (upper-tail F at α = 0.05)
Test Statistic: $F = \frac{MSA}{MSW} = \frac{0.0337}{0.3475} = 0.0970$
Decision: Do not reject at α = 0.05.
Conclusion: There is no evidence that at least one $\sigma_j^2$ differs from the rest.
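The two-step Levene procedure (absolute deviations from each group median, then a one-way ANOVA on those deviations) can be sketched with the standard library alone. The function name `levene_f` is hypothetical, and the sketch returns only the F statistic; the critical value 3.89 is taken from the slide rather than computed.

```python
from statistics import mean, median


def levene_f(samples):
    """F statistic of the median-based Levene test (illustrative sketch)."""
    # Step 1: absolute difference between each observation and its group median.
    abs_dev = [[abs(x - median(s)) for x in s] for s in samples]

    # Step 2: one-way ANOVA on the absolute differences.
    c = len(abs_dev)
    n = sum(len(g) for g in abs_dev)
    grand_mean = sum(sum(g) for g in abs_dev) / n
    group_means = [mean(g) for g in abs_dev]

    ssa = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(abs_dev, group_means))
    ssw = sum((x - m) ** 2
              for g, m in zip(abs_dev, group_means) for x in g)
    return (ssa / (c - 1)) / (ssw / (n - c))


filling_times = [
    [25.40, 26.31, 24.10, 23.74, 25.10],   # Machine1
    [23.40, 21.80, 23.50, 22.75, 21.60],   # Machine2
    [20.00, 22.20, 19.75, 20.60, 20.40],   # Machine3
]
f_stat = levene_f(filling_times)
print(f"F = {f_stat:.4f}")
print("reject H0" if f_stat > 3.89 else "do not reject H0")
```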