Section 9 - Introduction to Hypothesis Testing

9 – Introduction to Hypothesis Testing
9.1 – Basic Steps in a Hypothesis/Significance Test
Before we look at hypothesis testing for a single population mean we will examine the five basic
steps in a hypothesis test and introduce some important terminology and concepts.
Steps in a Hypothesis Test
1. State Hypotheses
2. Determine Test Criteria/Procedure
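Type I and Type II Errors ($\alpha$ & $\beta$)

                        Truth
Decision                $H_o$ true                  $H_a$ true
Reject $H_o$            Type I error ($\alpha$)     Correct decision
Fail to Reject $H_o$    Correct decision            Type II error ($\beta$)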
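One way to make the Type I error rate $\alpha$ concrete is by simulation. The sketch below is only an illustration under assumed settings that do not come from these notes: a normal population with known standard deviation 1, samples of size n = 25, an upper-tail z-test, and the hypothetical names `type_i_error_rate`, `reps`, and `seed`. It repeatedly draws samples with the null hypothesis true and records how often the test wrongly rejects.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=25, reps=2000, seed=1):
    """Estimate how often an upper-tail z-test rejects a TRUE null (H_o: mu <= 0)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # reject when z exceeds this value
    rejections = 0
    for _ in range(reps):
        # H_o is true here: the population mean really is 0 (sigma = 1, assumed known)
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(sample) / n
        z = (xbar - 0.0) / (1.0 / n ** 0.5)   # (estimate - hypothesized) / SE
        if z > z_crit:
            rejections += 1                   # rejecting a true null is a Type I error
    return rejections / reps

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen $\alpha$, since $\alpha$ is exactly the probability of rejecting a true null hypothesis.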
Step 2 (cont’d)
Example of Type I and II Errors:
Testing Wells for Perchlorate in Morgan Hill & Gilroy, CA.
EPA guidelines suggest that drinking water should not have a perchlorate level exceeding 4 ppb
(parts per billion). Perchlorate contamination in California water (ground, surface, and well) is
becoming a widespread problem. The Olin Corp., which manufactured road flares in the Morgan
Hill area from 1955 to 1996, is the source of the perchlorate contamination in this area.
Suppose you are a resident of the Morgan Hill area. Which alternative do you want well testers to
use, and why?
$H_o: \mu \le 4$ ppb
$H_a: \mu > 4$ ppb
or
$H_o: \mu \ge 4$ ppb
$H_a: \mu < 4$ ppb
Test Statistic (in general)
In general the basic form of most test statistics is given by:
$$\text{Test Statistic} = \frac{(\text{estimate}) - (\text{hypothesized value})}{SE(\text{estimate})}$$
(think “z-score”)
which measures the discrepancy between the estimate from our sample and the hypothesized
value under the null hypothesis.
Intuitively, if our sample-based estimate is “far away” from the hypothesized value assuming the
null hypothesis is true, we will reject the null hypothesis in favor of the alternative or research
hypothesis. Extreme test statistic values occur when our estimate is a large number of standard
errors away from the hypothesized value under the null.
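As a small illustration of the general form, the sketch below computes how many standard errors an estimate lies from the hypothesized value. The function name `compute_test_statistic` and the numbers in the usage line are hypothetical and are not taken from any example in these notes.

```python
def compute_test_statistic(estimate, hypothesized, se):
    """Number of standard errors between the estimate and the hypothesized value."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return (estimate - hypothesized) / se

# Hypothetical numbers: sample mean 5.1 ppb, null value 4 ppb, standard error 0.5 ppb
stat = compute_test_statistic(5.1, 4.0, 0.5)
print(round(stat, 2))  # 2.2
```

A value of about 2.2 says the estimate sits a little more than two standard errors above the hypothesized value.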
Step 3. Collect Data and Compute Test Statistic
Step 4. Compute p-value
The p-value is the probability that, by chance variation alone, we would get a test statistic as
extreme or more extreme than the one observed, assuming the null hypothesis is true. If this
probability is “small”, then we have evidence against the null hypothesis; in other words, we have
evidence to support our research hypothesis.
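The idea of “as extreme or more extreme” depends on the direction of the alternative. The sketch below assumes, purely for illustration, that the test statistic is approximately standard normal when the null hypothesis is true; the function name `p_value` and the labels `"upper"`, `"lower"`, and `"two-sided"` are hypothetical.

```python
from statistics import NormalDist

def p_value(stat, alternative):
    """p-value for a statistic assumed to be standard normal under H_o."""
    std_normal = NormalDist()
    if alternative == "upper":       # H_a: parameter > hypothesized value
        return 1 - std_normal.cdf(stat)
    if alternative == "lower":       # H_a: parameter < hypothesized value
        return std_normal.cdf(stat)
    if alternative == "two-sided":   # H_a: parameter != hypothesized value
        return 2 * (1 - std_normal.cdf(abs(stat)))
    raise ValueError("alternative must be 'upper', 'lower', or 'two-sided'")

# The same statistic gives very different p-values depending on the alternative
for alt in ("upper", "lower", "two-sided"):
    print(alt, round(p_value(2.2, alt), 4))
```

The two-sided p-value is twice the smaller one-sided area, which is why the direction of the alternative has to be fixed before looking at the data.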
Step 5. Make Decision and Interpret
Step 6. Quantify the Size of Significant Effects (using confidence interval)
9.2 - Hypothesis Testing for a Single Population Mean ($\mu$)

Null Hypothesis ($H_o$)    Alternative Hypothesis ($H_a$)    p-value area
$\mu \le \mu_o$            $\mu > \mu_o$                     Upper-tail
$\mu \ge \mu_o$            $\mu < \mu_o$                     Lower-tail
$\mu = \mu_o$              $\mu \ne \mu_o$                   Two-tailed (perform test using CI for $\mu$)
Test Statistic for Testing a Single Population Mean ($\mu$) (t-test)
$$t = \frac{\bar{X} - \mu_o}{s/\sqrt{n}} \quad \text{or} \quad t = \frac{\bar{X} - \mu_o}{SE(\bar{X})} \;\sim\; \text{t-distribution with } df = n - 1.$$
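The t statistic can be computed directly from summary statistics. The following is a minimal sketch; the function name `t_statistic` and the summary values in the usage line are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def t_statistic(xbar, s, n, mu0):
    """Return (t, df) for a one-sample t-test computed from summary statistics."""
    if n < 2 or s <= 0:
        raise ValueError("need n >= 2 and a positive sample standard deviation")
    se = s / math.sqrt(n)            # estimated standard error of the sample mean
    return (xbar - mu0) / se, n - 1  # test statistic and its degrees of freedom

# Hypothetical summary values, not taken from any example in these notes
t, df = t_statistic(xbar=10.5, s=2.0, n=16, mu0=10.0)
print(t, df)  # 1.0 15
```

Here the standard error is 2.0 / 4 = 0.5, so a sample mean 0.5 above the null value is exactly one standard error away.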
Assumptions:
When making inferences about a single population mean we assume the following:
1. The sample constitutes a random sample from the population of interest.
2. The population distribution is normal. This assumption can be relaxed when our
sample size is sufficiently “large”. How large the sample size needs to be depends
upon how “non-normal” the population distribution is.
Example 1: Mercury Levels in Boulder Reservoir Walleyes
Fish consumption guidelines suggest you should limit the number of fish you eat with Hg levels
above 0.25 ppm. Is there evidence to suggest that walleyes from Boulder Reservoir have a mean
Hg content exceeding 0.25 ppm?
Hypothesis Test:
1) $\mu$ =
Ho :
Ha :
2) Choose $\alpha$
Test statistic
3) Compute test statistic
4) Find p-value (use t-Probability Calculator.JMP)
5) Make decision and interpret
To perform a t-test in JMP, select Test Mean from the HGPPM pull-down menu and enter the
value for the mean under the null hypothesis, 0.25 in this example.
Example 2: Length of Stay in a Nursing Home
In the past the average number of nursing home days required by elderly patients before they
could be released to home care was 17 days. It is hoped that a new program will reduce this
figure. Do these data support the research hypothesis?
3 5 12 7 22 6 2 18 9 8 20 15 3 36 38 43


Normality does not appear to be satisfied here!
Notice the CI for the mean length of stay is (8.38 days, 22.49 days).
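The quoted interval can be checked from the 16 lengths of stay listed in this example. The sketch below is one possible check, not the JMP procedure itself: it uses Python's standard library, and the critical value 2.131 is the tabled 97.5th percentile of the t-distribution with 15 degrees of freedom, entered by hand as an assumption.

```python
import math
from statistics import mean, stdev

los = [3, 5, 12, 7, 22, 6, 2, 18, 9, 8, 20, 15, 3, 36, 38, 43]  # days, from Example 2

n = len(los)
xbar = mean(los)                 # sample mean
s = stdev(los)                   # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)            # standard error of the mean

t_crit = 2.131                   # tabled t quantile: 97.5th percentile, df = 15
ci = (xbar - t_crit * se, xbar + t_crit * se)

t = (xbar - 17) / se             # test statistic for a null value of 17 days

print(f"mean = {xbar:.2f}, s = {s:.2f}, SE = {se:.2f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"t = {t:.2f} with df = {n - 1}")
```

The interval reproduces the one reported above, and the same standard error feeds directly into the test statistic for step 3.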
Hypothesis Test:
1) $\mu$ =
Ho :
Ha :
2) Choose $\alpha$
Test statistic
3) Compute test statistic
4) Find p-value (use t-Probability Calculator.JMP)
5) Make decision and interpret
To perform a t-test in JMP, select Test Mean from the LOS pull-down menu and enter the value for
the mean under the null hypothesis, 17.0 in this example.
Conclusion:
Example 3: Creatinine Levels in End-Stage Renal Disease Patients
A nephrology nurse believes that the population mean creatinine level for end-stage renal disease
patients is greater than 8.4 mg/dl. A sample of n = 12 end-stage renal disease patients
undergoing hemodialysis was taken and their creatinine levels were recorded, resulting in the data
below:
6.7 8.0 13.4 12.4 14.9 6.3 16.5 13.5 12.4 16.9 9.1 13.0
Do these data provide evidence in support of the nurse’s research hypothesis?
Hypothesis Test:
1) $\mu$ =
Ho :
Ha :
2) Choose $\alpha$
Test statistic
3) Compute test statistic
4) Find p-value (use t-Probability Calculator.JMP)

5) Make decision and interpret
In JMP, click High Side for an upper-tail test; the other two types of alternatives are handled similarly.
6) Quantify Using a Confidence Interval
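For readers working outside JMP, the steps of this example can be sketched with Python's standard library. This is an illustration under stated assumptions rather than the procedure used in class: the helper `t_cdf` is a hypothetical function that approximates the t-distribution's cumulative probability by Simpson's rule, and 2.201 is the tabled 97.5th percentile of the t-distribution with 11 degrees of freedom, entered by hand.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """P(T <= t) for a t-distribution, by Simpson's rule on the density over [0, |t|]."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda x: coef * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # area under the density between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

creatinine = [6.7, 8.0, 13.4, 12.4, 14.9, 6.3, 16.5, 13.5, 12.4, 16.9, 9.1, 13.0]  # mg/dl
mu0 = 8.4                              # hypothesized mean under H_o

n = len(creatinine)
xbar = mean(creatinine)
se = stdev(creatinine) / math.sqrt(n)
t_stat = (xbar - mu0) / se
p_upper = 1 - t_cdf(t_stat, n - 1)     # upper-tail ("High Side") p-value

t_crit = 2.201                         # tabled t quantile: 97.5th percentile, df = 11
ci = (xbar - t_crit * se, xbar + t_crit * se)

print(f"t = {t_stat:.2f}, df = {n - 1}, upper-tail p-value = {p_upper:.4f}")
print(f"95% CI for the mean = ({ci[0]:.2f}, {ci[1]:.2f}) mg/dl")
```

The p-value addresses steps 3 to 5, while the interval addresses step 6 by giving a range of plausible values for the mean creatinine level.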
9.3 – Test for Single Population Proportion (p)
Hypotheses
$H_o: p = p_o$
$H_a: p > p_o$ or $p < p_o$ or $p \ne p_o$ (use CI for two-sided, which is rarely of interest for p anyway)
Test Statistic (using approximate normality)
$$z = \frac{\hat{p} - p_o}{\sqrt{\dfrac{p_o(1 - p_o)}{n}}} \;\sim\; \text{standard normal } N(0,1)$$
provided $np_o \ge 5$ and $n(1 - p_o) \ge 5$.
When our sample size is small or we want an exact test, we can use the binomial distribution to
calculate the p-value; this is called the Binomial Exact Test.
Example 1: Hypertension During Finals Week
In the college-age population in this country (18 – 24 yr. olds), about 9.2% have hypertension
(systolic BP > 140 mmHg and/or diastolic BP > 90 mmHg). Suppose a sample of n = 196 WSU
students is taken during finals week and 29 have hypertension.
Do these data provide evidence that the percentage of students who are hypertensive during
finals week is higher than 9.2%?
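The large-sample z-test for this question can be sketched as follows. The function name `proportion_z_test` is hypothetical, and the check on $np_o$ and $n(1 - p_o)$ uses the threshold of 5 as a rule of thumb for when the normal approximation is reasonable.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Large-sample z statistic for H_o: p = p0, with the rule-of-thumb check."""
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation questionable: need n*p0 and n*(1-p0) >= 5")
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)   # standard error computed under H_o
    return (p_hat - p0) / se0

z = proportion_z_test(29, 196, 0.092)    # 29 hypertensive students out of 196
p_upper = 1 - NormalDist().cdf(z)        # upper-tail area, H_a: p > 0.092

print(f"z = {z:.2f}, upper-tail p-value = {p_upper:.4f}")
```

Note that the standard error uses the hypothesized value 0.092 rather than the sample proportion, because the test statistic is computed assuming the null hypothesis is true.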
Hypothesis Test:
1) p =
Ho :
Ha :
2) Choose $\alpha$
Test statistic
3) Compute test statistic
4) Find p-value (use Normal Probability Calculator or Binomial Table Generator)
Binomial Exact Test
Use n = 196 and p = .092 (hypothesized value under Ho)
Exact p-value =
5) Make decision and interpret
6) Find a Confidence Interval for p
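The exact binomial p-value and a confidence interval for this example can also be sketched with the standard library. Two assumptions are made here that are not stated in these notes: the helper name `binom_upper_tail` is hypothetical, and the interval shown is the usual large-sample (Wald) interval, which may differ from the interval method used elsewhere in the course.

```python
import math
from statistics import NormalDist

def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

n, x, p0 = 196, 29, 0.092
exact_p = binom_upper_tail(x, n, p0)          # exact upper-tail p-value under H_o

p_hat = x / n
z_star = NormalDist().inv_cdf(0.975)          # about 1.96 for 95% confidence
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)         # large-sample (Wald) interval

print(f"exact p-value = {exact_p:.4f}")
print(f"95% CI for p = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Unlike the test statistic, the interval uses the sample proportion in its standard error, since no hypothesized value is assumed when estimating p.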
Example 2: Swain vs. Alabama
In 1965, an appeal to the Supreme Court was made by a black man (the petitioner) sentenced to
death after being convicted in the Circuit Court of Talladega County, Alabama, of the rape of a
17 year-old white girl by an all-white jury. At the time of the crime, the defendant was 19. The
petitioner alleged (among other things) that the entire process of selecting eligible jurors, from
the jury pool of eligible jurors to the jury panel (from which the trial jury is chosen), was racially
discriminatory.
At the time, in the early 1960’s, eligible jurors in Alabama were males over 21. According to
Census figures available then, there were 16,406 individuals that fit the profile in Talladega
County. Twenty-six percent of these individuals were African-Americans. When 100
individuals were chosen to serve on the jury panel, only 8 of them were African-Americans*.
Is there evidence to suggest that in the jury panel selection process the proportion of African-Americans selected is less than 26%?
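A lower-tail version of both tests from this section can be sketched for these figures. The sketch treats the 100 panel members as independent draws with selection probability 0.26, which is a modeling assumption; since the panel is small relative to the 16,406 eligible individuals, the binomial model should be a reasonable approximation.

```python
import math
from statistics import NormalDist

n, x, p0 = 100, 8, 0.26            # panel size, African-Americans chosen, population share

# Exact lower-tail probability P(X <= 8) for X ~ Binomial(100, 0.26)
exact_p = sum(math.comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(0, x + 1))

# Normal approximation (n*p0 = 26 and n*(1 - p0) = 74 both exceed 5)
z = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)
approx_p = NormalDist().cdf(z)     # lower-tail area, H_a: p < 0.26

print(f"exact p-value = {exact_p:.2e}")
print(f"z = {z:.2f}, approximate p-value = {approx_p:.2e}")
```

Both p-values are lower-tail areas because the alternative hypothesis is that the selection proportion is less than 26%.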