Hypothesis Testing (Inference)
By Abebe Megerso (BSc in PH., MPH in Epid., Asst. Prof.)

Session Objectives
• Define hypothesis & describe types of hypothesis,
• Describe steps in hypothesis testing,
• Discuss rules for stating statistical hypotheses,
• Explain the hypothesis testing process,
• Describe types of errors in hypothesis tests,
• Test hypotheses on single & double populations.

Definition of Hypothesis
• Is a claim (assumption) about a population parameter.
• Is a statement about one or more populations.
• Is frequently concerned with the parameters of the population about which the statement is made.

Examples of Hypotheses
Population Mean:
• The average length of stay of patients admitted to the hospital is five days.
• The mean birth weight of babies delivered by mothers with low socioeconomic status (SES) is lower than that of babies from higher SES.
Population Proportion:
• The proportion of adult smokers in Adama is 40% (p = 0.40).
• The prevalence of HIV among non-married adults is higher than that in married adults.

Types of Hypothesis
1. The Null Hypothesis, Ho
• Is a statement claiming that there is no difference between the hypothesized value & the population value (the effect of interest is zero; no difference).
• States the assumption (hypothesis) to be tested.
• Is a statement of agreement (or no difference).
• Is always about a population parameter, not about a sample statistic.
• Always contains the “=”, “≤” or “≥” sign.
2. The Alternative Hypothesis, HA
• Is a statement of what we will believe is true if our sample data cause us to reject Ho.
• Is generally the hypothesis that is believed (or that needs to be supported) by the researcher.
• Is a statement that disagrees with (opposes) Ho (the effect of interest is not zero).
• Never contains the “=”, “≤” or “≥” sign.
• May or may not be accepted.

Hypothesis Testing
• The majority of statistical analyses involve comparison (e.g. between treatments or procedures, or between groups of subjects).
• Hypotheses are formulated, experiments are performed, & results are evaluated for their consistency or inconsistency with a hypothesis.
• Hypothesis Testing (HT) provides an objective framework for making decisions using probabilistic methods.
• The purpose of HT is to aid clinicians, researchers or administrators in reaching a decision concerning a population by examining a sample from that population.
• Begin with the assumption that Ho is true:
  – Similar to the notion of innocent until proven guilty.
• Ho may or may not be rejected.

Steps in Hypothesis Testing
1. Formulate the appropriate statistical hypotheses clearly.
• Specify Ho & HA:
  – Ho: μ = μo, HA: μ ≠ μo (two-tailed)
  – Ho: μ ≤ μo, HA: μ > μo (one-tailed)
  – Ho: μ ≥ μo, HA: μ < μo (one-tailed)
2. State the assumptions necessary for computing probabilities.
• The distribution is approximately normal.
• The variance is known or unknown.
3. Select a sample & collect data.
• Categorical or continuous.
4. Decide on the appropriate test statistic for the hypothesis.
• E.g., for one population: Z = (x̄ - μo)/(σ/√n), or t = (x̄ - μo)/(s/√n).
5. Specify the desired level of significance for the statistical test (α = 0.05, 0.01, etc.).
6. Determine the critical value.
  – A value the test statistic must attain to be declared significant, e.g. at α = 0.05: -1.96 & 1.96 (two-tailed), or 1.645 or -1.645 (one-tailed).
7. Obtain sample evidence & compute the test statistic.
8. Reach a decision & draw the conclusion.
• If Ho is rejected, we conclude that HA is true (or accepted).
• If Ho is not rejected, we conclude that Ho may be true.

Rules for Stating Statistical Hypotheses
1. One population:
• An indication of equality (either =, ≤ or ≥) must appear in Ho.
Ho: μ = μo, HA: μ ≠ μo
• Can we conclude that a certain population mean is
  – not 30? Ho: μ = 30 & HA: μ ≠ 30, OR
  – greater than 50?
    Ho: μ ≤ 50 & HA: μ > 50
Population Proportions:
Ho: P = Po, HA: P ≠ Po
E.g. Can we conclude that the proportion of patients with leukemia who survive more than six years is not 60%?
Ho: P = 0.6 & HA: P ≠ 0.6
2. Two populations:
Mean Difference: Ho: μ1 = μ2 & HA: μ1 ≠ μ2
Proportion Difference: Ho: P1 = P2 & HA: P1 ≠ P2

In summary:
1. What you hope to conclude as a researcher should be placed in HA.
2. Ho should contain a statement of equality, either =, ≤ or ≥.
3. Ho is the hypothesis that is tested.
4. Ho & HA are complementary.

Hypothesis Testing Process
• Now think about how the hypothesis test should be carried out.
• We draw a random sample of size n from the underlying population & calculate its sample mean x̄.
• We compare the sample mean to the postulated mean μo.
• Is the difference between x̄ & μo too large to be attributed to chance alone?

Decision Rule:
• Results used for the decision are computed from the sample data.
• The decision to reject or not to reject Ho is based on the magnitude of the test statistic.
• An example of a test statistic is the quantity Z = (x̄ - μo)/(σ/√n).
• When the variance of the population is unknown & the sample is small, we use t = (x̄ - μo)/(s/√n).

Rejection & Non-Rejection Regions
• The values of the test statistic are the points on the horizontal axis of its sampling distribution (e.g. the normal curve) & are divided into two groups: the rejection region & the non-rejection region.
• The values of the test statistic forming the rejection region are less likely to occur if Ho is true.
• The values making up the non-rejection (acceptance) region are more likely to occur if Ho is true.
[Figure: two-sided test at α = 5%; rejection regions of area α/2 = 0.025 below -1.96 & above 1.96, non-rejection region of area 0.95 between them.]

Statistical Decision
• Reject Ho if the value of the test statistic that we compute from our sample is one of the values in the rejection region.
• Don’t reject Ho if the computed value of the test statistic is one of the values in the non-rejection region.

Level of Significance, α
• Is the probability of rejecting a true Ho.
• Defines unlikely values of the sample statistic if Ho is true.
  – Defines the rejection region of the sampling distribution.
• The decision is made on the basis of the level of significance, designated by α.
• Frequently used values of α are 0.01, 0.05 & 0.10.
• α is selected by the researcher at the beginning.

One tail & two tail tests
• In a one tail test, the rejection region is at one end of the distribution or the other.
• In a two tail test, the rejection region is split between the two tails.
• Which one to use depends on the way Ho is written.

Level of Significance & the Rejection Region
• Example: the claim that the average survival time after cancer diagnosis is less than 3 years is placed in HA (Ho: μ ≥ 3, HA: μ < 3), so the whole rejection region of area α lies in the lower tail.

Another way to state the conclusion
• Reject Ho if P-value < α
• Do not reject Ho if P-value ≥ α
• The P-value is the probability of obtaining a test statistic as extreme as or more extreme than the one actually obtained, if Ho is true.
• The larger the absolute value of the test statistic, the smaller the P-value.
• The smaller the P-value, the stronger the evidence against Ho.

Types of Errors in Hypothesis Tests
• Whenever we reject or fail to reject Ho, we risk committing an error.
• Two types of errors can be committed: Type I Error & Type II Error.

Type I Error
• The error committed when a true Ho is rejected.
• Considered a serious type of error.
• The probability of a type I error is the probability of rejecting Ho when it is true.
• The probability of a type I error is α.
• Called the level of significance of the test.
• Set by the researcher in advance.

Type II Error
• The error committed when a false Ho is not rejected (fail to reject a false Ho).
• The probability of a type II error is β.
• Usually unknown but larger than α.
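
The two error probabilities can be made concrete for a two-sided Z test. The sketch below is illustrative only: it assumes a normally distributed population with known σ, and the function name and the numbers (μo = 30, true mean 28, σ = 5) are hypothetical, not taken from the slides. It computes β, the probability of not rejecting Ho when the true mean differs from μo.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_prob(mu0, mu1, sigma, n, alpha=0.05):
    """P(do not reject Ho | true mean is mu1) for a two-sided Z test of Ho: mu = mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    # When the true mean is mu1, Z is normal with this mean and unit variance.
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # Ho is not rejected when -z_crit < Z < z_crit.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Hypothetical illustration: beta shrinks as the sample size grows.
for n in (10, 30, 100):
    beta = type_ii_error_prob(mu0=30, mu1=28, sigma=5, n=n)
    print(f"n = {n:3d}  beta = {beta:.3f}")
```

When the true mean equals μo, the function returns 1 - α, the probability of correctly not rejecting a true Ho.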

Power
• The probability of rejecting Ho when it is false.
• Power = 1 - β = 1 - probability of type II error
• We would like to maintain a low probability of a type I error (α) & a low probability of a type II error (β) [high power = 1 - β].

Action (Conclusion) | Reality: Ho True | Reality: Ho False
Do not reject Ho | Correct action (Prob. = 1 - α) | Type II error (Prob. = β = 1 - Power)
Reject Ho | Type I error (Prob. = α = Sign. level) | Correct action (Prob. = Power = 1 - β)

[Figure: Type I & II Error Relationship]
[Figure: Factors Affecting Type II Error]

Factors affecting the Power of the Test
The power depends on the following:
1. As n↑, power ↑
2. As |μ1 - μo|↑, power ↑
3. As σ↑, power ↓
4. As α↓, power ↓

Hypothesis Testing approach
Hypothesis Test for One Sample
• Test for single mean,
• Test for single proportion,
Hypothesis Test for Two Samples
• Test for the difference between two population means,
• Test for the difference between two population proportions.

1. Hypothesis Testing of a Single Mean (Normally Distributed)
1.1 Known Variance

Example: Two-Tailed Test
1. A simple random sample of 10 people from a certain population has a mean age of 27. Can we conclude that the mean age of the population is not 30? The variance is known to be 20; let the level of significance be α = 0.05.
A. Hypotheses: Ho: μ = 30, HA: μ ≠ 30
B. Assumptions:
• Simple random sample,
• Normally distributed population.
C. Data: n = 10, sample mean x̄ = 27, σ² = 20, α = 0.05
D. Test statistic: As the population variance is known, we use Z = (x̄ - μo)/(σ/√n) as the test statistic.
E. Decision rule:
• Reject Ho if the Z value falls in the rejection region.
• Don’t reject Ho if the Z value falls in the non-rejection region.
• Because of the structure of Ho, it is a two tail test.
• Therefore, reject Ho if Z ≤ -1.96 or Z ≥ 1.96.
F. Calculation of test statistic: Z = (27 - 30)/√(20/10) = -3/1.4142 = -2.12
G. Statistical decision: We reject Ho because Z = -2.12 is in the rejection region; the value is significant at α = 5%.
H. Conclusion: We conclude that μ is not 30; P-value = 0.0340 < α. The area under the standard normal curve to the left of Z = -2.12 is 0.0170. Since there are two parts to the rejection region in a two tail test, the P-value is twice this, which is 0.0340.

Hypothesis test using confidence interval
• A problem like the above example can also be solved using a confidence interval.
• A confidence interval will show that the hypothesized value μo = 30 does not fall within the boundaries of the interval; however, it will not give a probability.
• Confidence interval: x̄ ± 1.96 σ/√n = 27 ± 1.96(1.4142) = 27 ± 2.77 = (24.23, 29.77)

Example: One-Tailed Test
• A simple random sample of 10 people from a certain population has a mean age of 27.
• Can we conclude that the mean age of the population is less than 30? The variance is known to be 20; let α = 0.05.
• Data: n = 10, sample mean x̄ = 27, σ² = 20, α = 0.05
• Hypotheses: Ho: μ ≥ 30, HA: μ < 30
• Test statistic: Z = (27 - 30)/√(20/10) = -2.12
• Rejection region: lower tail test.
• With α = 0.05 & the inequality in HA, we have the entire rejection region at the left.
• The critical value is Z = -1.645; we reject Ho if Z < -1.645.
• Statistical decision: We reject Ho because -2.12 < -1.645.
• Conclusion: We conclude that μ < 30. P-value = 0.0170, because this is a one tail test & the tail area is not doubled.
• Suppose that Ho & HA take the form Ho: μ = μo, HA: μ > μo (upper tail test).
• In this case, Ho would be rejected for large values of the test statistic (critical values > 0).
• The P-value would correspond to the area in the upper tail of the standard normal distribution (SND), to the right of the value of the test statistic.

1.2 Unknown Variance
• In most practical applications the standard deviation of the underlying population is not known.
• In this case, σ can be estimated by the sample standard deviation s.
• If the underlying population is normally distributed, then the test statistic is t = (x̄ - μo)/(s/√n), which follows a t distribution with n - 1 degrees of freedom when Ho is true.

Example: Two-Tailed Test
• A simple random sample of 14 people from a certain population gives a sample mean body mass index (BMI) of 30.5 & s of 10.64.
• Can we conclude that the mean BMI is not 35 at α = 5%?
• Ho: μ = 35, HA: μ ≠ 35
• Test statistic: t = (30.5 - 35)/(10.64/√14) = -4.5/2.844 = -1.58
• If the assumptions are correct & Ho is true, the test statistic follows Student's t distribution with 13 degrees of freedom.
• Decision rule: We have a two tailed test; with α = 0.05, each tail has area 0.025. The critical t values with 13 df are -2.1604 & 2.1604. We reject Ho if t ≤ -2.1604 or t ≥ 2.1604.
• Do not reject Ho because -1.58 is not in the rejection region.
• Based on the data of the sample, it is possible that μ = 35. P-value = 0.1375

2. Hypothesis Testing about the Difference Between Two Population Means
• When studying one-sample tests for a continuous random variable, the unknown mean μ of a single population was compared to some known value μo.
• We are usually interested in comparing the means of two different populations when the values of both means are unknown.

Two Population Means, Independent Samples
2.1 Known Variances (Independent Samples)
• When two independent samples are drawn from normally distributed populations with known variances, the test statistic for testing the Ho of equal population means is: Z = ((x̄1 - x̄2) - (μ1 - μ2))/√(σ1²/n1 + σ2²/n2)

Example:
• Researchers wish to know whether mean serum uric acid (SUA) levels differ between normal individuals & those with Down’s syndrome.
• The mean SUA levels of 12 individuals with Down’s syndrome & 15 normal individuals are 4.5 & 3.4 mg/100 ml, respectively, with variances σ1² = 1 & σ2² = 1.5, respectively.
• Is there a difference between the means of the two groups at α = 5%?
• Hypotheses: Ho: μ1 - μ2 = 0 (or μ1 = μ2); HA: μ1 - μ2 ≠ 0 (or μ1 ≠ μ2)
• Test statistic: Z = (4.5 - 3.4)/√(1/12 + 1.5/15) = 1.1/0.4282 = 2.57
• With α = 0.05, the critical values of Z are -1.96 & +1.96. We reject Ho if Z < -1.96 or Z > +1.96.
• Reject Ho because 2.57 > 1.96.
• From these data, it can be concluded that the population means are not equal.
• A 95% CI would give the same conclusion; P-value = 0.01.

2.2 Unknown Variances
i. Equal variances (Independent samples)
• With equal population variances, we can obtain a pooled value from the sample variances.
• The test statistic for μ1 - μ2 is: t = ((x̄1 - x̄2) - (μ1 - μ2))/(sp√(1/n1 + 1/n2))
• where t has (n1 + n2 - 2) df, & the pooled variance is sp² = ((n1 - 1)s1² + (n2 - 1)s2²)/(n1 + n2 - 2).

Example:
• We wish to know if we may conclude, at the 95% confidence level, that smokers, in general, have greater lung-cell damage than do non-smokers.
• Calculation of pooled variance: sp² = ((n1 - 1)s1² + (n2 - 1)s2²)/(n1 + n2 - 2), with n1 + n2 - 2 = 23 df.
• Hypotheses: Ho: μ1 ≤ μ2, HA: μ1 > μ2
• With α = 0.05 & df = 23, the critical value of t is 1.7139; we reject Ho if t > 1.7139.
• Test statistic: t = 2.6563
• Reject Ho because 2.6563 > 1.7139; on the basis of the data, we conclude that μ1 > μ2.

ii. Unequal variances (Independent samples)
• We are still interested in testing Ho: μ1 = μ2 vs HA: μ1 ≠ μ2
• The test statistic used is: t = ((x̄1 - x̄2) - (μ1 - μ2))/√(s1²/n1 + s2²/n2)
• To compute the test statistic, we simply substitute s1² for σ1² & s2² for σ2².
• The approximate degrees of freedom (d’) are given by: d’ = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 - 1) + (s2²/n2)²/(n2 - 1)], with d’’ denoting d’ rounded down to the nearest integer.
• If t > td’’,1-α/2 or t < -td’’,1-α/2, then reject Ho.
• If -td’’,1-α/2 ≤ t ≤ td’’,1-α/2, then do not reject Ho.

Example:
• Suppose we want to compare the characteristics of tuberculosis meningitis for patients infected with HIV & those not infected with HIV.
• In particular, we are interested in comparing age at diagnosis.
• A random sample of n1 = 37 HIV infected patients has mean age at diagnosis x̄1 = 27.9 years & s1 = 5.6 years.
• A sample of n2 = 19 uninfected patients has mean age at diagnosis x̄2 = 38.8 years & s2 = 21.7 years.
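
The summary figures of this example are enough to compute the unequal-variance statistic. The sketch below is a minimal illustration, not the author's worked solution: it assumes the Satterthwaite approximation for the degrees of freedom, and the function and variable names are made up for the example.

```python
from math import floor, sqrt


def unequal_variance_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, d_prime) for Ho: mu1 = mu2 without assuming equal variances."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Satterthwaite approximation (assumed form of the degrees-of-freedom formula).
    d_prime = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, d_prime


# Age at diagnosis: HIV infected (group 1) vs uninfected (group 2) patients.
t_stat, d_prime = unequal_variance_t(27.9, 5.6, 37, 38.8, 21.7, 19)
print(f"t = {t_stat:.2f}, df = {floor(d_prime)}")
```

Rounding the degrees of freedom down is the conservative choice, since fewer degrees of freedom give a wider non-rejection region.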
• The test statistic is: t = (27.9 - 38.8)/√(5.6²/37 + 21.7²/19) = -10.9/5.06 = -2.15
• Note that s1²/n1 = 0.848 & s2²/n2 = 24.784, so the approximate degrees of freedom are (0.848 + 24.784)²/(0.848²/36 + 24.784²/18) = 19.2, rounded down to 19.
• For a t distribution with 19 df, the area to the left of -2.15 is between 0.01 & 0.025.
• Therefore, for the two-sided test, 0.02 < p < 0.05.
• For a test conducted at α = 0.05, Ho is rejected.
• We conclude that among patients diagnosed with tuberculosis meningitis, those who are infected with HIV tend to be younger than those who are not.

Hypothesis Testing for Paired Samples
• Two samples are paired when each data point of the first sample is matched & is related to a unique data point of the second sample.
• Tests means of two related populations:
  – Paired or matched samples,
  – Repeated measures (before/after),
  – Longitudinal or follow-up study.
• Assumptions:
  – Both populations are normally distributed,
  – Or, if not normal, use large samples.

The Paired t Test
• Test statistic: t = (d̄ - μd)/(Sd/√n), with n - 1 df, where
  – d̄ = mean of the paired differences,
  – n = number of pairs in the paired sample,
  – Sd = sample standard deviation of the differences.

Example:
• The following data show the SBP levels (mm Hg) in 10 women while not using (baseline) & while using (follow-up) oral contraceptives (OC).
• Can we conclude that there is a difference between mean baseline & follow-up SBP at α = 5%? (Ho: μd = 0, HA: μd ≠ 0)
• di = follow-up - baseline

i                 1    2    3    4    5    6    7    8    9    10
SBP (baseline)    115  112  107  119  115  138  126  105  104  115
SBP (follow-up)   128  115  106  128  122  145  132  109  102  117
di                13   3    -1   9    7    7    6    4    -2   2

• d̄ = (13 + 3 + … + 2)/10 = 4.80
• Sd² = [(13 - 4.8)² + … + (2 - 4.8)²]/9 = 20.844
• Sd = √20.844 = 4.566
• t = 4.80/(4.566/√10) = 4.80/1.44 = 3.32
• From the table, t9,α/2 = 2.262
• Since t = 3.32 > t9,α/2 (2.262), Ho is rejected.
• P-value is between 0.001 & 0.01.
• Since 3.32 falls in the rejection region, there is a significant difference between the population means of SBP while not using & while using OC.
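
The paired calculation above can be checked in a few lines. This is a minimal sketch using only the Python standard library; the function name is illustrative, the differences are taken as follow-up minus baseline to match the tabulated di values, and the critical value 2.262 is copied from the slide's t table, not computed.

```python
from math import sqrt
from statistics import mean, stdev

baseline = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
follow_up = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]


def paired_t(before, after):
    """Return (d_bar, s_d, t, df) for the paired t test of Ho: mu_d = 0."""
    diffs = [a - b for b, a in zip(before, after)]  # follow-up minus baseline
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (divides by n - 1)
    t = d_bar / (s_d / sqrt(n))
    return d_bar, s_d, t, n - 1


d_bar, s_d, t_stat, df = paired_t(baseline, follow_up)
T_CRITICAL = 2.262  # t with 9 df at alpha/2 = 0.025, taken from the slide
reject_ho = abs(t_stat) > T_CRITICAL
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_stat:.2f}, df = {df}, reject Ho: {reject_ho}")
```

Working from the raw pairs avoids rounding the intermediate standard error, which is why the script carries full precision until the final print.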

Hypothesis Tests for Proportions
• Involves categorical values,
• Two possible outcomes:
  – “Success” (possesses a certain characteristic)
  – “Failure” (does not possess that characteristic)
• The fraction or proportion of the population in the “success” category is denoted by p.

3. Hypothesis Testing about a Single Population Proportion (Normal Approximation to Binomial Distribution)
• Test statistic: Z = (p̂ - po)/√(po(1 - po)/n), where p̂ is the sample proportion.

Example
• We are interested in the probability of developing asthma over a given one-year period for children 0 to 4 years of age whose mothers smoke in the home.
• In the general population of 0 to 4-year-olds, the annual incidence of asthma is 1.4%.
• If 10 cases of asthma are observed over a single year in a sample of 500 children whose mothers smoke, can we conclude that this is different from the underlying probability of po = 0.014? α = 5%
• Ho: p = 0.014, HA: p ≠ 0.014
• The test statistic is given by: p̂ = 10/500 = 0.02, so Z = (0.02 - 0.014)/√(0.014 × 0.986/500) = 1.14
• The critical value of Zα/2 at α = 5% is ±1.96.
• Don’t reject Ho since Z = 1.14 is in the non-rejection region between ±1.96.
• P-value = 2 × 0.1271 = 0.2542
• We do not have sufficient evidence to conclude that the probability of developing asthma for children whose mothers smoke in the home is different from the probability in the general population.

4. Hypothesis Tests about the Difference Between Two Population Proportions
• Sample proportions: p̂1 = X1/n1 & p̂2 = X2/n2, where X1 = the observed number of events in the first sample & X2 = the observed number of events in the second sample.
• Pooled proportion under Ho: p̂ = (X1 + X2)/(n1 + n2)
• Test statistic: Z = (p̂1 - p̂2)/√(p̂(1 - p̂)(1/n1 + 1/n2))

Example
• A study was conducted to investigate the possible cause of a gastroenteritis outbreak following a lunch served in a high school cafeteria.
• Among the 225 students who ate the sandwiches, 109 became ill; while, among the 38 students who did not eat the sandwiches, 4 became ill.
• Is there a significant difference between the two groups at α = 5%?
• We wish to test: Ho: p1 = p2 against the alternative HA: p1 ≠ p2
• Assume that the sample sizes are large enough & the normal approximation to the binomial distribution is valid.
• If Ho is true, then p1 = p2 = p, estimated by the pooled proportion p̂ = (109 + 4)/(225 + 38) = 0.430.
• Test statistic: with p̂1 = 109/225 = 0.484 & p̂2 = 4/38 = 0.105, Z = (0.484 - 0.105)/√(0.430 × 0.570 × (1/225 + 1/38)) = 4.36
• The area under the standard normal curve to the right of 4.36 is less than 0.0001, so the two-sided P-value is p < 0.0002.
• We reject Ho at the 0.05 level.
• We conclude that the proportion of students who became ill differs in the two groups; those who ate the prepared sandwiches were more likely to develop gastroenteritis.

Thank You!