Fundamentals of Research Project Planning: Hypotheses

Comparing Population Parameters
(Z-test, t-tests and Chi-Square test)
Dr. M. H. Rahbar
Professor of Biostatistics
Department of Epidemiology
Director, Data Coordinating Center
College of Human Medicine
Michigan State University
Is there an association between
Drinking and Lung Cancer?
Suppose a case-control study is
conducted to test the above
hypothesis.
QUESTION: Is there a difference
between the proportion of drinkers among
cases and controls?
Group 1
Disease
P1= proportion of drinkers
Group 2
No Disease
P2= proportion of drinkers
Elements of Testing hypothesis
• Null Hypothesis
• Alternative hypothesis
• Level of significance
• Test statistics
• P-value
• Conclusion
Case Control Study of Drinking
and Lung Cancer
Null Hypothesis: There is no
association between Drinking and
Lung cancer, P1=P2 or P1-P2=0
Alternative Hypothesis: There is
some kind of association between
Drinking and Lung cancer, P1≠P2 or
P1-P2≠0
Based on the data in the following contingency
table, we estimate the proportion of drinkers
among those who develop Lung Cancer and
those without the disease.
                 Lung Cancer
Drinker     Case        Control     Total
Yes         A=33        B=27        60
No          C=1667      D=2273      3940
Total       1700        2300        4000
eP1=33/1700
eP2=27/2300
Test Statistic
How many standard deviations has our
estimate deviated from the hypothesized
value if the null hypothesis was true?
$$Z = \frac{eP_1 - eP_2 - 0}{\sqrt{(1/n_1 + 1/n_2)\, p(1 - p)}}$$
where
$$p = \frac{33 + 27}{1700 + 2300} = \frac{60}{4000} = \frac{3}{200} = 0.015$$
$$Z = \frac{(33/1700) - (27/2300) - 0}{\sqrt{(1/1700 + 1/2300)(0.015)(0.985)}}$$
$$Z \approx 1.97$$
P-value for a two tailed test
P-value = 2 P[Z > 1.97] = 2(.024) = 0.048
How does this p-value compare with α=0.05?
Since p-value=0.048 < α=0.05, reject the null
hypothesis H0 in favor of the alternative
hypothesis Ha.
Conclusion:
There is an association between drinking and
lung cancer.
Is this relationship causal?
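The two-proportion z-test above can be checked numerically. The following is a minimal sketch, assuming the pooled-proportion standard error from the Test Statistic slide and only the Python standard library; the function and variable names are illustrative and not part of the original slides.

```python
import math

# Counts from the case-control table in the slides.
drinkers_cases, n_cases = 33, 1700
drinkers_controls, n_controls = 27, 2300


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: P1 = P2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt((1 / n1 + 1 / n2) * pooled * (1 - pooled))
    z = (p1 - p2 - 0) / se
    # Two-sided standard normal tail area: 2 * P(Z > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z, p_value = two_proportion_z(drinkers_cases, n_cases, drinkers_controls, n_controls)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.3f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

The comparison with 0.05 in the last line mirrors the level of significance used in the slides.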
Chi-Square Test of Independence
(based on a Contingency Table)
$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
$$df = (r - 1)(c - 1)$$
In the following contingency table, estimate the
proportion of drinkers among those who develop
Lung Cancer and those without the disease.
                 Lung Cancer
Drinker     Case          Control       Total
Yes         O11=33        O12=27        R1=60
No          O21=1667      O22=2273      R2=3940
Total       C1=1700       C2=2300       n=4000
E11=1700(60)/4000=25.5    E12=34.5
E21=1674.5                E22=2265.5

k 4
2
obs

(Observed  E xp ected )
Expected
k 1
2

(33  25.5) (27  34.5)


25.5
34.5
2
2
(1667  1674.5) (2273  2265.5)

1674.5
2265.5
 4.0
2
2
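The expected counts and the chi-square sum above can be reproduced in a few lines. This is a sketch under the assumption that each expected count is (row total)(column total)/n, as in E11=1700(60)/4000; the helper name `chi_square_statistic` is hypothetical.

```python
# Observed counts: rows = drinker yes / no, columns = case / control.
observed = [[33, 27],
            [1667, 2273]]


def chi_square_statistic(table):
    """Return (chi-square, df, expected counts) for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


chi2, df, expected = chi_square_statistic(observed)
print("expected counts:", expected)
print(f"chi-square = {chi2:.2f}, df = {df}")
```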
How do we calculate P-value?
• Statistical packages such as SPSS and Epi-Info
can be used to calculate the p-value for
various tests, including the Chi-Square
Test
• If p-value is less than 0.05, then reject the
null hypothesis that row and column
variables are independent
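For a 2 x 2 table (df = 1) the p-value can also be obtained without a statistical package, because a chi-square variable with one degree of freedom is the square of a standard normal variable. The sketch below relies on that identity and applies only to df = 1; the function name is illustrative, and the observed and expected counts are those given in the slides.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    # With df = 1: P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Observed and expected counts from the drinking / lung cancer table.
observed = [33, 27, 1667, 2273]
expected = [25.5, 34.5, 1674.5, 2265.5]
chi2_obs = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

p_value = chi2_p_value_df1(chi2_obs)
print(f"chi-square = {chi2_obs:.2f}, p-value = {p_value:.3f}")
print("reject independence" if p_value < 0.05 else "do not reject independence")
```

For tables with more than one degree of freedom this shortcut does not hold, and a statistical package remains the practical route.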
Testing Hypothesis When Two
Population Means are Compared
H0: μ1 = μ2
Ha: μ1 ≠ μ2
QUESTION: Is there an association
between age and Lung Cancer?
Group 1
Disease
Mean age of the cases
Group 2
No Disease
Mean age of the controls
Use Two-sample t-test when both
samples are independent
• H0: μ1 = μ2 vs Ha: μ1 ≠ μ2
• H0: μ1 - μ2 = 0 vs Ha: μ1 - μ2 ≠ 0
• t = (difference in sample means - hypothesized diff.) / (SE of the difference in means)
• Statistical packages provide p-values and
degrees of freedom
• Conclusion: If p-value is less than 0.05, then
reject the equality of the means
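The t statistic above can be sketched as follows. The slides do not say whether the variances are pooled, so this example assumes the unequal-variance (Welch) form of the standard error with the Welch-Satterthwaite degrees of freedom. The ages are hypothetical values invented for illustration; the slides give no age data. The p-value would then be read from a statistical package, as noted above.

```python
import math
import statistics

# HYPOTHETICAL ages, invented for illustration only (not from the slides).
ages_cases = [61, 58, 67, 70, 64, 59, 72, 66]
ages_controls = [55, 60, 52, 63, 57, 49, 61, 58]


def welch_t(sample1, sample2, hypothesized_diff=0.0):
    """Return (t, df) for the unequal-variance (Welch) two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1  # squared SE of the first mean
    v2 = statistics.variance(sample2) / n2  # squared SE of the second mean
    se_diff = math.sqrt(v1 + v2)            # SE of the difference in means
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    t = (diff - hypothesized_diff) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t, df = welch_t(ages_cases, ages_controls)
print(f"t = {t:.3f}, df = {df:.1f}")
```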
Paired t-test for
Matched case control study
• H0: μ1 = μ2 vs Ha: μ1 ≠ μ2
• H0: μ1 - μ2 = 0 vs Ha: μ1 - μ2 ≠ 0
• Paired t-test = (mean of the differences - 0) / (SE of the mean of the differences)
• Statistical packages provide p-values for
paired t-test
• Conclusion: If p-value is less than 0.05,
then reject the equality of the means
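The paired statistic can be sketched in the same way: take the within-pair differences, then divide their mean by its standard error, with n - 1 degrees of freedom. The matched pairs below are hypothetical measurements invented for illustration, and the function name is an assumption of this sketch.

```python
import math
import statistics

# HYPOTHETICAL matched pairs (case value, control value), for illustration only.
pairs = [(140, 132), (128, 130), (151, 143), (135, 129), (147, 141), (138, 139)]


def paired_t(matched_pairs):
    """Return (t, df) for a paired t-test of H0: mean difference = 0."""
    diffs = [case - control for case, control in matched_pairs]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_mean_diff = statistics.stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return (mean_diff - 0) / se_mean_diff, n - 1


t, df = paired_t(pairs)
print(f"paired t = {t:.3f}, df = {df}")
```

Working on the differences is what distinguishes this from the two-sample test: each pair contributes one observation.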