Testing Hypothesis That Data Fits a Given Probability Distribution

• Problem: We have a sample of size n. Determine whether the data fit a given probability distribution.
• Null Hypothesis, H0: The data fit the distribution.
• Fact: Divide the range into k intervals. If the data fit the distribution, then the following random variable approximately follows the chi-square distribution with k-1 degrees of freedom:
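$$
\chi^2 = \sum_{j=1}^{k} \frac{(\text{observed number of values in the } j\text{th interval} - \text{expected number of values in the } j\text{th interval})^2}{\text{expected number of values in the } j\text{th interval}} = \sum_{j=1}^{k} \frac{(n_j - n p_j)^2}{n p_j}
$$
Here n_j is the observed number of values in the jth interval and n p_j is the expected number, where p_j is the probability of the jth interval under the hypothesized distribution.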
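As a worked illustration of the formula, consider a hypothetical sample (not from the source): a die rolled n = 60 times, tested against the fair-die distribution with k = 6 intervals (one per face) and p_j = 1/6, so that every expected count is n p_j = 10. Assuming the observed counts are 8, 12, 9, 11, 10, 10, the statistic works out as:

```latex
\chi^2 = \frac{(8-10)^2 + (12-10)^2 + (9-10)^2 + (11-10)^2 + (10-10)^2 + (10-10)^2}{10}
       = \frac{4 + 4 + 1 + 1 + 0 + 0}{10} = 1.0
```

This value would then be compared against the chi-square distribution with k - 1 = 5 degrees of freedom.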
• The value of the above variable computed in a hypothesis test is called the chi-square statistic.
• If the chi-square statistic is too large (far in the right tail of the chi-square distribution), this is a surprising result: the evidence from the test contradicts the hypothesis that the data fit the probability distribution.
Algorithm
1. Perform a visual test first. If it gives no reason to reject the hypothesis, proceed as follows.
2. Divide the range of values in the sample into k adjacent intervals.
3. Tally the number of observations in each interval.
4. Calculate the chi-square statistic.
5. Calculate the p-value of the test (the probability that a chi-square variable with k-1 degrees of freedom is at least as large as the statistic).
6. Decide whether the hypothesis should be rejected.
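Steps 2 to 6 can be sketched in Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names `chi2_sf` and `goodness_of_fit`, the sample values, the interval edges, and the choice of a uniform distribution on [0, 1) as the hypothesized distribution are all assumptions made for the example. The visual test of step 1 is not covered.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from a closed form, then use Q(a+1, h) = Q(a, h) + h^a e^-h / Gamma(a+1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q

def goodness_of_fit(sample, cdf, edges, alpha=0.05):
    """Chi-square test of `sample` against the distribution with the given cdf.

    `edges` holds the k+1 boundaries of k adjacent intervals (step 2) and is
    assumed to cover every value in the sample, with each interval [lo, hi).
    """
    n = len(sample)
    k = len(edges) - 1
    # Step 3: tally the observations falling in each interval.
    counts = [0] * k
    for value in sample:
        for j in range(k):
            if edges[j] <= value < edges[j + 1]:
                counts[j] += 1
                break
    # Probability of each interval under the hypothesized distribution.
    probs = [cdf(edges[j + 1]) - cdf(edges[j]) for j in range(k)]
    # Step 4: chi-square statistic, sum of (n_j - n p_j)^2 / (n p_j).
    stat = sum((counts[j] - n * probs[j]) ** 2 / (n * probs[j]) for j in range(k))
    # Step 5: p-value from the chi-square distribution with k-1 degrees of freedom.
    p_value = chi2_sf(stat, k - 1)
    # Step 6: reject H0 when the p-value is at or below the significance level.
    return stat, p_value, p_value <= alpha

def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1) (the hypothesized distribution)."""
    return min(max(x, 0.0), 1.0)

# Hypothetical sample of 20 values, made up for illustration only.
sample = [0.03, 0.08, 0.11, 0.15, 0.19, 0.22, 0.27, 0.34, 0.41, 0.48,
          0.52, 0.57, 0.63, 0.69, 0.74, 0.77, 0.81, 0.86, 0.92, 0.98]
edges = [0.0, 0.25, 0.5, 0.75, 1.0]

stat, p_value, reject = goodness_of_fit(sample, uniform_cdf, edges)
print(f"statistic={stat:.3f}  p-value={p_value:.3f}  reject H0: {reject}")
```

With these made-up values the observed counts are 6, 4, 5, 5 against an expected 5 per interval, so the statistic is small and the hypothesis is not rejected at the 0.05 level.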
Decision Rule
• Reject the hypothesis if the p-value is less than or equal to some low significance level α (e.g. 0.05). Otherwise, do not reject the hypothesis.
[Figure: chi-square density dchisq(x, 7) plotted against x, for x from 0 to 20. The critical value qchisq(0.95, 7) = 14.067 (probability of exceedance 0.05) separates the "Do not reject H0" region on the left from the "Reject H0" region on the right.]
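The critical value quoted above can be checked numerically. The sketch below, again limited to the Python standard library, finds the point whose probability of exceedance equals the significance level by bisection. The names `chi2_sf` and `chi2_critical_value` are illustrative assumptions and are not the functions used to produce the figure.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from a closed form, then use Q(a+1, h) = Q(a, h) + h^a e^-h / Gamma(a+1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q

def chi2_critical_value(alpha, df, tol=1e-9):
    """Value x with P(X >= x) = alpha, found by bisection on the upper tail."""
    lo, hi = 0.0, 1.0
    # Grow the bracket until the upper end has tail probability at most alpha.
    while chi2_sf(hi, df) > alpha:
        hi *= 2.0
    # The tail probability decreases in x, so bisection converges to the quantile.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

critical = chi2_critical_value(0.05, 7)
print(f"critical value for alpha=0.05, 7 degrees of freedom: {critical:.3f}")
```

Rejecting when the statistic exceeds this critical value is equivalent to rejecting when the p-value is at or below the significance level, since the tail probability decreases as the statistic grows.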