Chi-Square Test and
Goodness-of-Fit Testing
Ming-Tsung Hsu
OPLab@im.ntu.edu.tw
Outline





Goal of Hypothesis Test
Terms & Notation
Chi-Square Test
Goodness-of-Fit Testing
Example
Goal of Hypothesis Test

To examine statistical evidence, and to
determine whether it supports or contradicts
a claim



The life of lamps is more than 10,000 hours
The data are from a normal distribution
To reduce the directly-relevant data to a “level
of suspicion” based purely on the data
Terms & Notation

Null Hypothesis (H0) vs. Alternative Hypothesis (H1 or HA)



Parametric Test vs. Non-Parametric Test
Significance level (α) and Critical Region


“Reject H0” vs. “Do not reject H0”
Central Limit Theorem


Type I Error vs. Type II Error
Sampling distribution of the sample mean
Test Statistic vs. Table Value

P-value
Null Hypothesis vs. Alternative hypothesis
$$H_0: \mu = \mu_0 \qquad H_1: \mu \neq \mu_0$$
H0: The data are from a normal distribution
H1: Not H0
Type I Error vs. Type II Error

Type I error



H0 is true but reject H0
Pr(reject H0 | H0) = α
Type II error


H1 is true but do not reject H0
Pr(do not reject H0 | H1) = β
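The definition Pr(reject H0 | H0) = α can be checked by simulation. The sketch below is an illustration only, not taken from the slides: it assumes a two-sided z-test of H0: μ = 0 with known σ = 1, and the function name `type_i_error_rate`, the sample size, the number of trials, and the seed are all hypothetical choices.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=30, trials=20000, seed=0):
    """Estimate Pr(reject H0 | H0) for a two-sided z-test of H0: mu = 0."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided table value
    rejections = 0
    for _ in range(trials):
        # Data are generated under H0: N(0, 1), so sigma = 1 is known.
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = sample_mean / (1.0 / n ** 0.5)
        if abs(z) > crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"estimated Type I error rate: {rate:.3f}")
```

With data generated under H0, the rejection fraction should land near the chosen α. Estimating β the same way would require picking a specific alternative, which the slides do not fix.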
Parametric Test vs. Non-Parametric Test

Parametric Test



Parameters of population
Mean test, variance test, etc.
Non-Parametric Test


Make no assumptions about the frequency
distributions of the variables being assessed
Independence test, distribution test, etc.
Significance level (α) and Critical Region
Central Limit Theorem
Central Limit Theorem (CLT): If $X_1, \dots, X_n$ is a random sample from a distribution with mean $\mu$ and variance $\sigma^2 < \infty$, then the limiting distribution of
$$Z_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n}\,\sigma}$$
is the standard normal:
$$Z_n \xrightarrow{d} Z \sim N(0, 1) \quad \text{as } n \to \infty$$
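A quick numerical check of the theorem, offered as a sketch under assumptions rather than as part of the original material: the samples are drawn from an exponential distribution (a deliberately skewed choice), and the rate 0.25, n = 200, the 5000 repetitions, and the seed are hypothetical. If Z_n is close to N(0, 1), about 95% of the standardized sums should fall within ±1.96.

```python
import random


def standardized_sum(rng, n, rate):
    """Z_n = (sum X_i - n*mu) / (sqrt(n)*sigma) for X_i ~ Exponential(rate)."""
    mu = sigma = 1.0 / rate  # mean and standard deviation of the exponential
    total = sum(rng.expovariate(rate) for _ in range(n))
    return (total - n * mu) / (n ** 0.5 * sigma)


rng = random.Random(1)
z_values = [standardized_sum(rng, n=200, rate=0.25) for _ in range(5000)]

# Fraction of standardized sums inside the central 95% band of N(0, 1).
coverage = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)
print(f"fraction of |Z_n| <= 1.96: {coverage:.3f}")
```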
Test Statistic vs. Table Value
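Test statistic (T.S.):
$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$
Table value (T.V.): $Z_{1-\alpha}$ (one-sided), $Z_{1-\alpha/2}$ (two-sided)
$Z_{0.95} = 1.645$, $Z_{0.975} = 1.96$, $Z_{0.99} = 2.326$
Decision rule: $|Z| > \text{T.V.} \Rightarrow$ reject $H_0$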
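The table values can be recovered from the standard normal quantile function instead of a printed table. The following is a minimal sketch using Python's standard library; the helper names `table_value` and `z_statistic` and the sample numbers are hypothetical and chosen only to exercise the decision rule.

```python
from statistics import NormalDist


def table_value(alpha, two_sided=False):
    """Standard normal quantile Z_{1-alpha} (one-sided) or Z_{1-alpha/2} (two-sided)."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / n ** 0.5)


# Hypothetical numbers, not from the slides.
z = z_statistic(sample_mean=10.4, mu0=10.0, sigma=1.2, n=36)
reject = abs(z) > table_value(0.05, two_sided=True)
print(f"Z = {z:.3f}, reject H0: {reject}")
```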
P-value
$$p\text{-value} = P(X \ge x \mid \theta)$$
Decision rule: p-value $< \alpha$ (one-sided) or $< \alpha/2$ (two-sided) $\Rightarrow$ reject $H_0$
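For a z statistic, this rule can be sketched as follows. This is an illustration under assumptions: the p-value is taken to be the upper-tail probability of the standard normal at |z|, and the function names `upper_tail_p_value` and `reject_h0` are hypothetical.

```python
from statistics import NormalDist


def upper_tail_p_value(z):
    """P(Z >= |z|) under the standard normal distribution."""
    return 1.0 - NormalDist().cdf(abs(z))


def reject_h0(z, alpha, two_sided):
    """Compare the one-tail p-value with alpha (one-sided) or alpha/2 (two-sided)."""
    threshold = alpha / 2 if two_sided else alpha
    return upper_tail_p_value(z) < threshold


print(f"p-value at z = 1.96: {upper_tail_p_value(1.96):.4f}")
```

Comparing the one-tail probability with α/2 is equivalent to comparing a doubled (two-tail) p-value with α.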
Chi-Square Test

Non-Parametric Test


Goodness-of-Fit Test



T.S. ~ χ²(ν)
Also known as “Pearson's chi-square test”
Independence Test
Homogeneity Test
Goodness-of-Fit Testing

Used to test if a sample of data came from a
population with a specific distribution
Test statistic (T.S.):
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2(k - 1 - m)$$
$O_i$: Observed frequency of the i-th group
$E_i$: Expected frequency of the i-th group
$k$: Number of groups
$m$: Number of estimated parameters
$k - 1 - m$: Degrees of freedom
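The statistic and its degrees of freedom are simple to compute directly. The sketch below is illustrative: the function name `goodness_of_fit` and the die-roll counts are hypothetical and do not come from the slides.

```python
def goodness_of_fit(observed, expected, n_estimated=0):
    """Return (chi-square statistic, degrees of freedom k - 1 - m)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Sum of (O_i - E_i)^2 / E_i over the k groups.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - 1 - n_estimated
    return statistic, dof


# Hypothetical counts: 60 rolls of a die tested against a fair die (E_i = 10).
stat, dof = goodness_of_fit([8, 12, 9, 11, 10, 10], [10.0] * 6)
print(f"chi-square = {stat:.3f} with {dof} degrees of freedom")
```

No parameter is estimated in this toy case, so m = 0 and the degrees of freedom are k - 1.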
Example
Parameter Estimation - λ
The MLE of $\lambda$:
$$\hat{\lambda} = \frac{1}{\bar{t}} = \frac{1}{4.06} = 0.246$$
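As a small sketch of this estimate, assuming the data are modeled as exponential with rate λ (an assumption consistent with the F(t) values tabulated in this example): the function name `exponential_rate_mle` and the toy sample are hypothetical, since the slides report only the sample mean of 4.06.

```python
def exponential_rate_mle(times):
    """MLE of lambda for exponential data: n / sum(t_i), i.e. 1 / t_bar."""
    if not times:
        raise ValueError("need at least one observation")
    return len(times) / sum(times)


# Hypothetical toy sample with mean 4.0, used only to exercise the function.
toy_rate = exponential_rate_mle([2.0, 4.0, 6.0])

# The slide's value, computed from the reported sample mean.
slide_rate = 1 / 4.06
print(f"toy estimate: {toy_rate:.3f}, slide estimate: {slide_rate:.3f}")
```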
Observations and Expected Frequencies
Interval      Obs   t      F(t) = P(T < t)   C.F.       Frequency
0 ~ < 1       14    1      0.218078          12.86659   12.86659
1 ~ < 2.5     12    2.5    0.459359          27.10219   14.2356
2.5 ~ < 5     18    5      0.707707          41.75474   14.65255
5 ~ < 7.5     5     7.5    0.841975          49.67651   7.921768
7.5 ~ < 10    5     10     0.914565          53.95934   4.282832
≧10           5     ≧10    1                 59         5.040662

C.F.: cumulative expected frequency, 59 × F(t). Frequency: expected frequency of the interval.

?!
Test Statistic and P-value
T.S.:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = 2.4137$$
p-value:
$$P(\chi^2 \ge 2.4137 \mid \chi^2(6 - 1 - 1)) = 0.66$$
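The figures on this slide can be reproduced from the interval bounds and observed counts of the preceding table. This is a sketch under assumptions: the fitted model is taken to be exponential with the rounded rate 0.246, and the p-value uses the closed-form upper tail of a chi-square distribution with 4 degrees of freedom, exp(-x/2)(1 + x/2), rather than a statistics library. Variable names are illustrative.

```python
import math

LAMBDA_HAT = 0.246   # rounded MLE used in the example
N = 59               # total number of observations
upper_bounds = [1, 2.5, 5, 7.5, 10, math.inf]
observed = [14, 12, 18, 5, 5, 5]

# Cumulative expected frequency N * F(t), with F(t) = 1 - exp(-lambda * t).
cum = [N * (1.0 - math.exp(-LAMBDA_HAT * t)) for t in upper_bounds]

# Expected frequency of each interval = difference of consecutive cumulatives.
expected = [c - p for c, p in zip(cum, [0.0] + cum[:-1])]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1 - 1   # k - 1 - m, with one estimated parameter

# Upper-tail probability of chi-square with 4 degrees of freedom (closed form).
p_value = math.exp(-statistic / 2) * (1 + statistic / 2)
print(f"chi-square = {statistic:.4f}, dof = {dof}, p-value = {p_value:.2f}")
```

For degrees of freedom where no simple closed form is convenient, a library routine such as `scipy.stats.chi2.sf` would be the usual alternative.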
Observations and Expected Frequencies Paper
18
Expected frequencies: 12.87, 14.24, 14.65, 7.92, 4.28, 5.04
T.S.:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = 2.043$$
p-value:
$$P(\chi^2 \ge 2.043 \mid \chi^2(6 - 1 - 1)) = 0.72785$$
Re-Grouping
# of groups = 1 + 3.322*log10(n), with n = 59

ID   lower   upper   Freq.
1    0.3     3.3     30
2    3.3     6.3     17
3    6.3     9.3     6
4    9.3     12.3    3
5    12.3    15.3    1
6    15.3    18.3    1
7    18.3    21.3    1

After re-grouping:

Obs   t       F(x)    C.F.     E.F.
30    3.3     0.556   32.812   32.813
17    6.3     0.788   46.486   13.673
6     9.3     0.899   53.020   6.534
6     ≧9.3    1       59       5.980

T.S.:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = 1.0942$$
p-value:
$$P(\chi^2 \ge 1.0942 \mid \chi^2(4 - 1 - 1)) = 0.579$$
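The re-grouping step can be sketched as follows. The group-count formula is Sturges' rule with a base-10 logarithm. The merge of all intervals above 9.3 into one open-ended group follows the slide; the usual motivation is to keep expected frequencies from getting too small, although the slides do not state the reason. The expected frequencies are copied from the slide rather than recomputed, and the p-value uses the closed-form upper tail exp(-x/2) of a chi-square distribution with 2 degrees of freedom. Variable names are illustrative.

```python
import math

n = 59
sturges_groups = 1 + 3.322 * math.log10(n)   # about 6.88, rounded up to 7 intervals

# Observed frequencies of the seven original intervals (width 3, starting at 0.3).
freqs = [30, 17, 6, 3, 1, 1, 1]

# Merge the four sparse upper intervals into a single group (t >= 9.3).
observed = freqs[:3] + [sum(freqs[3:])]

# Expected frequencies (E.F.) as given on the slide for the four groups.
expected = [32.813, 13.673, 6.534, 5.980]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1 - 1                   # k - 1 - m, one estimated parameter

# Upper-tail probability of chi-square with 2 degrees of freedom (closed form).
p_value = math.exp(-statistic / 2)
print(f"groups = {math.ceil(sturges_groups)}, chi-square = {statistic:.4f}, "
      f"p-value = {p_value:.3f}")
```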