Methods for a Single Numeric Variable – Hypothesis Testing & Confidence Intervals
We’ve already discussed how to summarize a single numeric variable both numerically and graphically.
In this set of notes we will look at inferential procedures (hypothesis testing and confidence intervals)
for a single numeric variable.
Example: Consider a study in which the weight of insulin-dependent diabetics is being investigated. The
variable of interest is the percent of their ideal body weight. For example, a value of 120% implies that
an individual weighs 20% more than their ideal body weight, and a value of 95% implies an individual
weighs 5% less than their ideal body weight. The data below can be found in the file diabetics.jmp on
the course website.
107   100   116   104   119   125
101    88    99   114   121   114
114    95   152   124   120   117
We can use JMP to summarize the data as follows:
Questions:
1. What is the mean percent ideal weight of the observed data? The standard deviation?
2. If another sample of n = 18 patients was obtained, would these new individuals have a mean
exactly the same as the mean from this sample? Why or why not?
3. Given your answer to the previous question, do you think it is appropriate to use only this
sample mean to make inferences about the true ideal body weights in the greater population of
insulin-dependent diabetics? Explain.
Sampling Distribution of the Sample Mean
The sample mean is a random quantity. That is, it changes from _______________ to ______________.
Therefore, the sample mean HAS a distribution. The distribution will tell us two things:
1. What values the sample mean can ________________.
2. How __________ the sample mean will assume these values.
The distribution is referred to as the sampling distribution of the sample mean. An understanding of
this sampling distribution allows us to make decisions about a population mean for a single numeric
variable. When we make decisions about a population mean, we will use both of the following:
1. The sample __________ (from the observed data).
2. The sampling ____________________ of the sample mean.
Exploring the Sampling Distribution of the Sample Mean
Before we discuss the procedure for hypothesis testing, let’s consider the next activity to gain a better
understanding of how this particular sampling distribution works.
Example: Simulation study – suppose we set up a hypothetical population of 500 insulin-dependent
diabetics. This population was purposefully created so that the mean percent ideal body weight is
100%.
Questions:
4. Looking at the histogram, how would you describe the shape of the distribution of percent ideal
body weight values?
5. In this simulation, what is the value of the true population mean, µ?
Note that in reality, the true population mean is usually an _______________ quantity which we are
trying to estimate. If it were impossible or not feasible to collect data on the entire population, we
would take a sample from the population in order to estimate the average percent ideal body weight.
Let’s see what happens when we take various samples of size 5 from this population. The “population”
can be found in the file diabetic_population.jmp on the course website.
To take a random sample of size 5, we need to perform the following steps in JMP.
1. Choose Tables → Subset and the following dialogue box should appear.
2. You’ll then want to choose the Random – sample size: option and enter 5 in the box. You may
also want to name the output table Sample 1.
3. Click OK and a new data table should pop up with only 5 observations in it. This is your random
sample of size 5 from the “population” of insulin-dependent diabetic patients.
4. Next, find the mean of your observations and record your sample mean below. Then repeat this
procedure 2 more times so you have 3 sample means total.
Sample          1          2          3
Sample Mean     ______     ______     ______
Next repeat this procedure using a random sample of size 10.
Sample          1          2          3
Sample Mean     ______     ______     ______
Questions:
6. Consider the means calculated from your random samples of size 5. We will use the entire
class’s data to construct a histogram of the sample means. Sketch the plot below. Based on the
histogram, what can you say about the shape of the distribution?
7. How does the amount of variability in this sampling distribution compare to the amount of
variability in the original population? Why does this happen?
8. Next, consider the means calculated from your random samples of size 10. Again, we will use
the class data to construct a histogram of the sample means. What can you say about the shape
of this distribution?
9. How does the amount of variability in this sampling distribution compare to the amount of
dispersion in the sampling distribution for a random sample of size 5? Why does this happen?
Next, let’s see what happens when we take many more samples than we did in class. The following
output shows the results from one thousand random samples of size 5 and one thousand random
samples of size 10.
Questions:
10. How does the shape of the sampling distribution for the mean change as sample size increases?
11. How does the amount of variability of the sampling distribution for the mean change as the
sample size increases?
12. How does the center of the sampling distribution for the mean compare to the true population
mean?
13. Suppose a sample of size 10 was taken for this study. Does a sample mean of x̄ = 112.78 seem
likely to occur by chance if the true mean is really 100? What does this say about the research
question?
14. What do you expect the sampling distribution of the sample mean to look like with a sample size
of n = 18? Recall, this is the sample size used in the actual study.
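The repeated-sampling activity above can be imitated in code. The sketch below is not the class data: the file diabetic_population.jmp is not reproduced in these notes, so a stand-in population of 500 bell-shaped values with mean exactly 100 is generated instead. The standard deviation of 15 and the function name `simulate_sample_means` are assumptions made purely for illustration.

```python
import random
import statistics

random.seed(2024)  # fixed seed so the sketch is repeatable

# ASSUMPTION: stand-in for diabetic_population.jmp, which is not reproduced here.
# 500 roughly bell-shaped values, shifted so the population mean is exactly 100.
raw = [random.gauss(100, 15) for _ in range(500)]
shift = 100 - statistics.fmean(raw)
population = [value + shift for value in raw]

def simulate_sample_means(pop, n, reps=1000):
    """Draw `reps` random samples of size n (without replacement); return their means."""
    return [statistics.fmean(random.sample(pop, n)) for _ in range(reps)]

means_n5 = simulate_sample_means(population, 5)
means_n10 = simulate_sample_means(population, 10)

print(f"population: mean = {statistics.fmean(population):.2f}, "
      f"sd = {statistics.pstdev(population):.2f}")
for label, means in (("n = 5", means_n5), ("n = 10", means_n10)):
    print(f"{label}: mean of sample means = {statistics.fmean(means):.2f}, "
          f"sd of sample means = {statistics.stdev(means):.2f}")
```

Comparing the three printed standard deviations is one way to check your answers to Questions 11 and 12.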
Characteristics of the Sampling Distribution for the Sample Mean
To characterize the sampling distribution of the sample mean, we need to describe its center, shape,
and amount of variability (or dispersion).

The ____________ (or center) of the sampling distribution for the sample mean is the ________
as the mean from the original distribution.
Comment: We often use µ to denote the mean of the original distribution. Therefore,
the mean of the sampling distribution for the sample mean is also µ.

The standard deviation (which measures the variability or dispersion) of the sampling
distribution for the sample mean ______________ as the sample size gets larger. Specifically, if
we let σ denote the standard deviation of the ____________________ distribution, then the
standard deviation of the sample mean is given by: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
This quantity is referred to as the standard error of the sample mean.
Central Limit Theorem for the Sample Mean
Consider a random sample of _____ observations from _____ population with mean _____ and standard
deviation _____. Then, when n is sufficiently large, the sampling distribution of the sample mean (x̄)
will be an approximately normal distribution with a mean of µ and standard deviation $\sigma / \sqrt{n}$.
Note: This approximation gets better as the sample size (n) increases.
Question:
15. How large does n have to be?

If the original population is normally distributed, then the sampling distribution of the
sample mean will also be normally distributed REGARDLESS of the sample size.

For most populations, a sample size of n ≥ 30 or 40 will be large enough to
say the sampling distribution of the sample mean is approximately normally distributed.

The more skewed the distribution, the larger the sample size must be before the normal
approximation fits the sampling distribution for the sample mean well. If the
distribution is very skewed, the sample size may have to be MUCH larger than 30!!!
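The effect of skewness can be explored with a small simulation. The sketch below assumes an exponential population as a stand-in for "a very skewed distribution" (the notes do not name one) and measures how lopsided the sampling distribution of the mean still is at several sample sizes. The helper names are hypothetical.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    m2 = statistics.fmean([(v - m) ** 2 for v in values])
    m3 = statistics.fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

def sample_mean_skewness(n, reps=3000):
    """Skewness of `reps` sample means, each based on n draws from a skewed population."""
    # ASSUMPTION: an exponential population with mean 1 plays the skewed population.
    means = [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
             for _ in range(reps)]
    return skewness(means)

skew_by_n = {n: sample_mean_skewness(n) for n in (5, 30, 200)}
for n, skew in skew_by_n.items():
    print(f"n = {n:3d}: skewness of the sample means = {skew:.2f}")
```

A skewness near 0 indicates a symmetric, bell-like shape. In this sketch the value shrinks toward 0 only slowly as n grows, which is the point of the third bullet above.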
Example: Recall the study where the insulin-dependent diabetics are being investigated. Some
summaries from JMP are shown below.
Research Question – Do the data provide evidence that the mean percent ideal body weight for
insulin dependent diabetics differs from 100?
As mentioned earlier, we know the sample mean is a random variable. For our sample, x̄ = 112.78. This
would probably not be the case if we took another sample. Our goal is to use what we just learned
about the sampling distribution of the sample mean, in addition to our sample mean (x̄ = 112.78), to
decide whether we have evidence the true POPULATION mean percent of ideal body weight (µ) differs
from 100.
Consider these characteristics of the sampling distribution for the sample mean for this example:

Center: The sampling distribution for the sample mean will be centered at a mean of
µ = 100 (assuming the true mean percent ideal body weight is 100).

Shape: Based on the Central Limit Theorem, the sampling distribution for the sample
mean will be approximately normal if:
o The sample size is sufficiently large OR
o The original population is normally distributed
At this point we have established that the sampling distribution of the sample mean will approximately
follow a normal, bell-shaped curve centered at µ = 100.

Variability: The standard error of the sampling distribution for the sample mean is $\sigma / \sqrt{n}$.
Our next step is to determine where our sample mean falls on this sampling distribution. We can then
use this information to find the p-value for the test. To determine whether or not the distance between
µ (the hypothesized mean) and x̄ (the mean from our observed data) is larger than what we would
expect under repeated sampling, we can consider using the z-score for a sample mean:
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
The z-score comes from what is called the standard normal distribution. Let’s look at the formal
hypothesis test.
Step 0: Define the research question.
Do the data provide evidence that the mean percent ideal body weight for insulin
dependent diabetics differs from 100?
Step 1: Determine the appropriate null and alternative hypotheses.
H0: The mean percent ideal body weight of insulin-dependent diabetics is 100.
Ha: The mean percent ideal body weight of insulin-dependent diabetics differs from 100.
Equivalently, we could state the hypotheses as follows:
H0: µ = 100
Ha: µ ≠ 100
Step 2: Check the assumptions behind the test and calculate the test statistic
For this hypothesis test, we must check that one of the two assumptions has been met:

The sample is sufficiently large
OR

It is reasonable to assume the distribution of the population is normally distributed
Since n = 18 is NOT sufficiently large, we can graphically assess whether it is reasonable to assume
the population is normally distributed.

Histogram: A histogram of the data with a smooth curve (which represents the
underlying distribution of the population from which the data came) and the
normal distribution can be used to assess normality.
If the red and green lines are roughly the same, then it is reasonable to assume the
population the data came from is normally distributed.

Normal Quantile Plot: This is another plot which can be used to assess normality. In this
plot you’re looking for the points to lie on or very close to the straight reference line. To get this plot
click on the little red arrow next to the variable name and choose Normal Quantile Plot.
Question:
16. Looking at the histogram and the quantile plot, is it reasonable to assume the population the
data came from is normally distributed?
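The idea behind a normal quantile plot can be sketched numerically: pair each ordered observation with the value a normal distribution would predict at the same position, then see how closely the pairs track a straight line. The plotting positions (i − 0.5)/n used below are one common convention and an assumption here; JMP may use a different one, so its plot need not match these numbers exactly.

```python
import math
import statistics

weights = [107, 100, 116, 104, 119, 125, 101, 88, 99,
           114, 121, 114, 114, 95, 152, 124, 120, 117]

n = len(weights)
mean = statistics.fmean(weights)
sd = statistics.stdev(weights)
standard_normal = statistics.NormalDist()

ordered = sorted(weights)
# ASSUMPTION: plotting positions (i - 0.5) / n; other conventions exist.
theoretical = [mean + sd * standard_normal.inv_cdf((i - 0.5) / n)
               for i in range(1, n + 1)]

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = correlation(theoretical, ordered)
for expected, observed in zip(theoretical, ordered):
    print(f"normal quantile {expected:7.2f}   observed {observed:4d}")
print(f"correlation between observed values and normal quantiles: {r:.3f}")
```

A correlation close to 1 means the points fall near a straight line. The printed pairs also show which observations pull away from the line, which is worth comparing against your reading of the JMP plots in Question 16.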
We discussed using the z-score to determine if our observed sample mean is unusual to observe by
chance alone. However, do you see a problem with this formula, $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$? What is σ?
In practice, we have to use the sample standard deviation, s, which is our best guess for the
population standard deviation. That is, we’ll have to use $\frac{\bar{x} - \mu}{s / \sqrt{n}}$, which no longer follows the
standard normal distribution. Once we have to use s to estimate σ, the new statistic comes from
the t-distribution. This distribution is also _________________________, __________________,
and centered at _____ (just like the standard normal distribution). The difference is that the
t-distribution is more variable than the standard normal distribution. The amount of variability in the
t-distribution depends on the sample size n. Therefore, this distribution is indexed by its degrees of
freedom (df). For inference regarding a single mean, df = n – 1.
Consider the following t-distributions.
Question:
17. Calculate the test statistic for the diabetic example.
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
18. What does the numerator of the test statistic tell us?
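As a check on the hand calculation in Question 17, the t statistic can be computed directly from the 18 observations. This is a minimal sketch using Python's standard library, which is an assumption (the notes use JMP); `mu_0` is an illustrative name for the hypothesized mean of 100.

```python
import math
import statistics

weights = [107, 100, 116, 104, 119, 125, 101, 88, 99,
           114, 121, 114, 114, 95, 152, 124, 120, 117]
mu_0 = 100  # hypothesized mean under H0

n = len(weights)
x_bar = statistics.fmean(weights)       # sample mean
s = statistics.stdev(weights)           # sample standard deviation
standard_error = s / math.sqrt(n)       # estimated standard error of the mean
t_stat = (x_bar - mu_0) / standard_error
df = n - 1                              # degrees of freedom for a single mean

print(f"x-bar = {x_bar:.2f}, s = {s:.2f}, standard error = {standard_error:.2f}")
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

The numerator `x_bar - mu_0` is the quantity Question 18 asks about; dividing by the standard error rescales it into units of sample-to-sample variation.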
Step 3: Find the p-value.
As we’ve already seen, the p-value is the probability (assuming H0 is true) of observing results at least as
extreme as those in the observed data.
Lower-Tailed Test (Ha contains <)
Upper-Tailed Test (Ha contains >)
Two-Tailed Test (Ha contains ≠)
We can use JMP to find the p-value for us. Click on the red drop-down arrow and choose Test Mean.
You should then get a dialogue box that looks like the one given below. You’ll want to enter the
hypothesized value (in this case 100) in the top box as shown below.
Click OK and you should get the following JMP output.
p-value = ________________
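To see where such a p-value comes from, the tail areas of the t-distribution can be approximated numerically. The sketch below is not how JMP computes its output; it integrates the t density with Simpson's rule using only the standard library, and the helper names `t_pdf` and `t_cdf` are hypothetical.

```python
import math
import statistics

weights = [107, 100, 116, 104, 119, 125, 101, 88, 99,
           114, 121, 114, 114, 95, 152, 124, 120, 117]
mu_0 = 100  # hypothesized mean under H0

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using symmetry about 0 and Simpson's rule on [0, |x|]."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

n = len(weights)
t_stat = (statistics.fmean(weights) - mu_0) / (statistics.stdev(weights) / math.sqrt(n))
df = n - 1

p_lower = t_cdf(t_stat, df)                     # Ha contains <
p_upper = 1 - t_cdf(t_stat, df)                 # Ha contains >
p_two_sided = 2 * (1 - t_cdf(abs(t_stat), df))  # Ha contains ≠

print(f"lower-tailed p = {p_lower:.4f}")
print(f"upper-tailed p = {p_upper:.4f}")
print(f"two-tailed   p = {p_two_sided:.4f}")
```

Only the two-tailed value matches the hypotheses stated in Step 1 for the diabetic example; the other two are shown to connect with the three tail pictures above.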
Step 4: Report the conclusion in context of the research question.
Example: A physician has noticed that a large number of adults tend to have a body temperature less
than 98.6°F. Therefore, the physician decides to conduct a study to examine true average body
temperature in adults. A random sample of 75 patients was taken and each had their body temperature
taken. The data can be found in the file Temp.jmp on the course website.
Research Question – Is there evidence that the average body temperature of adults is less than
the conjectured 98.6°F?
Step 0: Define the research question.
Is there evidence that the average body temperature of adults is less than the conjectured
98.6°F?
Step 1: Determine the appropriate null and alternative hypotheses.
Step 2: Check the assumptions behind the test and calculate the test statistic.
Step 3: Find the p-value.
Step 4: Report the conclusion in context of the research question.
Example: The State Environmental Protection Agency (SEPA) is responsible for monitoring the air
pollution level for a large western metropolis. The air pollution level is considered to be acceptable (or
safe) if the mean pollution level is at or below a reading of 100 mg of pollution per cubic yard of air. Air
pollution levels substantially above 100 mg/yd³ are considered to be dangerous. To monitor air pollution
levels, the SEPA will take a pollution reading 10 times a day. If the evidence from this sample suggests
that the air pollution levels are unacceptable, then the SEPA must declare an air pollution emergency
and impose emergency measures to reduce pollution levels in the air. Suppose the readings for one day
are given in the following table (and can also be found on the course website in the file pollution.jmp).
Pollution Level (mg/yd³)
 98.6   100.2   101.1   109.4    99.4
110.5    95.6   108.9   112.9   110.5
Consider the following summary statistics and graphics for this example:
Research Question – Is there evidence that the pollution levels are unacceptable?
Step 0: Define the research question.
Is there evidence that the pollution levels are unacceptable?
Step 1: Determine the appropriate null and alternative hypotheses.
Step 2: Check the assumptions behind the test and calculate the test statistic.

n = 10 < 30

The data do not appear to be normally distributed
It appears neither assumption has been satisfied in this example. Therefore, a t-test is not
appropriate to carry out the analysis. If the data are not normally distributed but are symmetric,
we can conduct what is called the Wilcoxon Signed Rank Test. To carry out this test,
click on the red drop-down arrow and choose Test Mean as done before. However, this time
check the box for the Wilcoxon Signed Rank Test in the dialogue box as shown below.
Click OK and you’ll get the following output from JMP.
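The logic of the signed rank test can be sketched by hand: rank the distances of the readings from 100, add up the ranks belonging to readings above 100, and ask how often a sum that large would arise if each rank were equally likely to fall above or below. The sketch assumes an upper-tailed alternative (pollution above 100), which is one reading of "unacceptable"; Step 1 is left for you to state. JMP may report a differently scaled statistic and may compute its p-value by another method, so the numbers need not match its output.

```python
from itertools import product

readings = [98.6, 100.2, 101.1, 109.4, 99.4, 110.5, 95.6, 108.9, 112.9, 110.5]
hypothesized = 100

# Differences from the hypothesized value; exact zeros would be dropped.
diffs = [x - hypothesized for x in readings if x != hypothesized]

def average_ranks(values):
    """Rank values from smallest to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1      # average of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)   # sum of ranks above 100

# Exact null distribution: every assignment of +/- signs to the ranks is equally likely.
total = 0
as_extreme = 0
for signs in product((0, 1), repeat=len(ranks)):
    total += 1
    if sum(r for r, keep in zip(ranks, signs) if keep) >= w_plus:
        as_extreme += 1
p_upper = as_extreme / total

print(f"sum of positive ranks = {w_plus}")
print(f"upper-tailed exact p-value = {p_upper:.4f}")
```

Enumerating all sign patterns is feasible here because n = 10 gives only 1,024 of them; for larger samples software relies on approximations instead.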
Step 3: Find the p-value.
Step 4: Report the conclusion in context of the research question.
Types of Errors Encountered in Hypothesis Testing
After examining evidence from a sample, we will make one of two decisions when carrying out a
hypothesis test:

Evidence for RQ (Reject H0): This indicates that we have enough evidence to conclude
the _____________________ (research question) is true.

No Evidence for RQ (Do Not Reject H0): This indicates we do not have enough evidence
to ____________ the null hypothesis (i.e. no evidence for the RQ).
The two possible outcomes:
Ideally, these outcomes would occur:
Reject H0
Do not Reject H0
However, we know that hypothesis tests are not error proof! The following table summarizes all
possible scenarios when carrying out a hypothesis test (the probability of each occurring is listed in
parentheses).
                        Null is true          Alternative is true
Reject H0               Type I Error (α)      No Error (1 – β)
Do not Reject H0        No Error (1 – α)      Type II Error (β)
You’ll see from the above table that two types of errors exist:

Type I: This error occurs when we falsely reject the null hypothesis. That is, we _______
the null hypothesis when it is true. The probability of committing this error is α.
Note: We can control the Type I Error rate by our selection of α prior to
conducting the experiment.

Type II: This occurs when we fail to reject the null hypothesis when a particular
alternative scenario is true. The probability of committing this error is β.
There is a relationship between α, β, and n (sample size):

We have __________________ control over _____.

A decrease in _____ results in an increase in _____.

An increase in the sample size will decrease both _____ and _____.
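Both error rates can be estimated by simulation. The sketch below repeatedly draws samples of size 18 and applies a two-tailed t-test of H0: µ = 100 at α = 0.05, first when H0 is true and then under one particular alternative. The normal population, its standard deviation of 15, and the alternative mean of 110 are all assumptions chosen for illustration; 2.110 is the usual t-table critical value for df = 17.

```python
import math
import random
import statistics

random.seed(11)  # fixed seed so the sketch is repeatable

N = 18
T_CRITICAL = 2.110   # t-table value, two-tailed alpha = 0.05, df = 17
MU_0 = 100           # hypothesized mean under H0
SIGMA = 15           # ASSUMPTION: population standard deviation for illustration

def rejects_h0(true_mean):
    """Draw one sample from a normal population and report whether the t-test rejects H0."""
    sample = [random.gauss(true_mean, SIGMA) for _ in range(N)]
    t_stat = (statistics.fmean(sample) - MU_0) / (statistics.stdev(sample) / math.sqrt(N))
    return abs(t_stat) > T_CRITICAL

def rejection_rate(true_mean, reps=5000):
    """Proportion of simulated samples in which H0 is rejected."""
    return sum(rejects_h0(true_mean) for _ in range(reps)) / reps

# H0 true: every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=100)
# ASSUMPTION: one particular alternative (mu = 110); every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=110)

print(f"estimated Type I error rate:  {type_1_rate:.3f}")
print(f"estimated Type II error rate: {type_2_rate:.3f}")
```

Changing `N`, the critical value, or the alternative mean and rerunning is one way to explore the relationships among α, β, and n listed above.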
Example: The MedAssist Pharmaceutical Company makes a pill intended for children susceptible to
seizures. The pill is supposed to contain 20 mg of Phenobarbital. A random sample of pills is selected
and tested to see that the average amount of the drug is correct.
H0:
Ha:
Questions:
19. Describe a Type I Error in context.
20. Give one consequence/implication of making a Type I Error.
21. Describe a Type II Error in context.
22. Give one consequence/implication of making a Type II Error.
Example: The State Environmental Protection Agency (SEPA) is responsible for monitoring the air
pollution level for a large western metropolis. The air pollution level is considered to be acceptable (or
safe) if the mean pollution level is at or below a reading of 100 mg of pollution per cubic yard of air. Air
pollution levels substantially above 100 mg/yd³ are considered to be dangerous. To monitor air pollution
levels, the SEPA will take a pollution reading 10 times a day. If the evidence from this sample suggests
that the air pollution levels are unacceptable, then the SEPA must declare an air pollution emergency
and impose emergency measures to reduce pollution levels in the air.
H0:
Ha:
Questions:
23. Describe a Type I Error in context.
24. Give one consequence/implication of making a Type I Error.
25. Describe a Type II Error in context.
26. Give one consequence/implication of making a Type II Error.
Confidence Interval for a Single Population Mean
In the hypothesis testing set of notes we found evidence that the mean percent ideal body weight of
insulin-dependent diabetics differs from 100. Our next question is obvious: How much does it differ?
To answer this question, we must first construct a confidence interval.
Confidence Intervals
• This procedure does not require any hypotheses concerning our population parameter of
interest, in this case µ.
• We will use both sample data, in particular the observed _______________, and the
appropriate sampling distribution to obtain a range of likely values for the population mean.
• A confidence interval allows us to estimate the population parameter of interest (recall a
hypothesis test will NOT allow us to do this). Therefore, when available, a confidence interval
should accompany a hypothesis test.
• Because the confidence interval does not require any hypothesized value for the population
parameter, we can’t center the sampling distribution about the “true” or population
hypothesized mean. However, the confidence interval will still incorporate both the data
collected in our sample and what we know about sample-to-sample variation. Consider the
following example.
Example: Our goal is to construct a 95% confidence interval for the mean percent ideal body weight of
insulin-dependent diabetics. To do this, we will center our sampling distribution at the observed mean.
Then, we will find the lower and upper endpoints that separate the middle 95% of the distribution from
the rest (since we are constructing a 95% confidence interval).
JMP automatically provides the endpoints of the 95% confidence interval:
Questions:
27. Interpret this interval. What does this interval tell us about the true percent ideal body weight
of insulin-dependent diabetics?
28. Does this interval agree with what you learned from the hypothesis test? Explain.
29. What additional information is gained by using a confidence interval over a simple hypothesis
test? Explain.
Margin of Error
The margin of error is defined as the difference between the center of the confidence interval and either
endpoint. For this problem we have:
Upper Endpoint – Center of Interval = 119.951 – 112.778 = 7.173
Center of Interval – Lower Endpoint = 112.778 – 105.605 = 7.173
So, the margin of error for this problem is ± 7.173.
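The same margin of error can be recovered from the formula for a t-based interval: the critical value times the estimated standard error. In the worked line below, s ≈ 14.42 is recomputed from the 18 observations and t* ≈ 2.110 is the usual t-table value for 95% confidence with df = 17; neither number is printed in these notes, so treat both as assumptions to verify.

```latex
E = t^{*} \cdot \frac{s}{\sqrt{n}}
  \approx 2.110 \times \frac{14.42}{\sqrt{18}}
  \approx 2.110 \times 3.40
  \approx 7.17
```

Writing the margin this way shows which quantities it depends on: the confidence level (through t*), the variability in the data (s), and the sample size (n).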
Question: Can you identify at least two ways to make this margin of error smaller?
1.
2.
Changing the Confidence Level in JMP
Recall the 95% confidence interval for the mean:
To change the level of confidence, click on the red drop-down arrow and choose Confidence Interval
and choose 0.90.
You should get the following output:
Question:
30. Did the margin of error change as you thought it would?
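The effect of the confidence level can also be checked numerically. The sketch below rebuilds the interval for the diabetic data at three levels, using critical values read from a standard t-table for df = 17 (1.740, 2.110, 2.898). Those table values and the dictionary name `T_STAR` are assumptions added here; JMP computes its own critical values, so its endpoints may differ in the last decimal place.

```python
import math
import statistics

weights = [107, 100, 116, 104, 119, 125, 101, 88, 99,
           114, 121, 114, 114, 95, 152, 124, 120, 117]

# ASSUMPTION: two-sided critical values from a t-table with df = 17.
T_STAR = {0.90: 1.740, 0.95: 2.110, 0.99: 2.898}

n = len(weights)
x_bar = statistics.fmean(weights)
std_error = statistics.stdev(weights) / math.sqrt(n)

intervals = {}
for level, t_star in T_STAR.items():
    margin = t_star * std_error               # margin of error at this confidence level
    intervals[level] = (x_bar - margin, x_bar + margin)
    print(f"{level:.0%} interval: {x_bar - margin:.2f} to {x_bar + margin:.2f} "
          f"(margin of error {margin:.2f})")
```

The 95% row should agree with the endpoints 105.605 and 119.951 used in the margin of error calculation earlier.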
More on the Interpretation of Confidence Intervals
Consider the 95% confidence interval from the diabetic example.
• Correct interpretation: We’re 95% confident the true mean percent ideal body weight
of insulin-dependent diabetics is between 105.6% and 119.95%.
• Incorrect interpretation: There is a 95% probability that the true mean percent of ideal body
weight of insulin-dependent diabetics is between 105.6% and 119.95%.
The 95% refers to the process of constructing the confidence interval. This means that if we were to
take 100 samples of size 18, constructing a confidence interval each time, we would expect about 95 of them
to capture the true population mean. Consider the following example:
Example: Our goal is to take samples from a population in order to estimate the true population mean.
Shown below are 10 random samples of size n = 5. Construct a confidence interval for each of the
samples.
Sample ID   Data from Sample                                     Mean        Std Dev    90% Confidence Interval
1           12.49983, 11.4342, 8.210933, 7.373925, 8.776002      9.65898     2.19771    7.56 ≤ μ ≤ 11.75
2           5.655407, 8.903349, 12.98215, 10.22548, 6.172528     8.78778     3.01349    5.91 ≤ μ ≤ 11.66
3           8.181802, 12.08606, 6.176875, 5.556382, 5.822172     7.56466     2.73035    4.96 ≤ μ ≤ 10.17
4           13.19405, 5.122735, 2.469639, 7.373925, 6.401793     6.91243     3.96465    3.13 ≤ μ ≤ 10.69
5           9.293009, 10.52984, 7.260893, 10.50763, 7.431728     9.00462     1.59555    7.31 ≤ μ ≤ 10.71
6           9.303573, 2.354969, 8.811873, 17.06401, 10.45554     9.59799     5.23552    4.6 ≤ μ ≤ 14.6
7           10.91127, 8.023941, 8.432168, 14.17466, 8.603912     10.02919    2.57711    7.58 ≤ μ ≤ 12.48
8           11.53353, 5.782364, 11.44628, 10.61424, -1.68752     7.53778     5.67659    2.13 ≤ μ ≤ 12.95
9           8.197059, 6.193274, 9.114461, 6.290799, 9.661013     7.89132     1.59424    6.37 ≤ μ ≤ 9.41
10          6.53196, 12.08221, 6.81856, 13.46314, 9.183324       9.61584     3.09866    6.67 ≤ μ ≤ 11.75
A graphical representation of the confidence intervals is given below.
Questions:
31. Why are some of the 90% confidence intervals wider than others?
32. In truth, these 10 random samples were generated from a population with a mean of 10. How
many of the confidence intervals contain the true mean? What does it mean to say that we are
90% confident?
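The meaning of "90% confident" can be illustrated in two parts: first by counting how many of the ten tabulated intervals capture the true mean of 10, then by repeating the whole procedure many times. In the sketch below, 2.132 is the usual t-table value for 90% confidence with df = 4. The normal population with standard deviation 3 used in the long-run part is an assumption, since the notes state only that the population mean is 10.

```python
import math
import random
import statistics

T_STAR = 2.132    # t-table value for 90% confidence with df = 4
TRUE_MEAN = 10    # stated in Question 32
N = 5

# (mean, std dev) for the 10 samples tabulated in the notes.
summaries = [
    (9.65898, 2.19771), (8.78778, 3.01349), (7.56466, 2.73035),
    (6.91243, 3.96465), (9.00462, 1.59555), (9.59799, 5.23552),
    (10.02919, 2.57711), (7.53778, 5.67659), (7.89132, 1.59424),
    (9.61584, 3.09866),
]

def interval(mean, sd, n=N):
    """90% t-based confidence interval from a sample mean and standard deviation."""
    margin = T_STAR * sd / math.sqrt(n)
    return mean - margin, mean + margin

def covers(mean, sd):
    """True if the interval built from (mean, sd) contains the true mean."""
    low, high = interval(mean, sd)
    return low <= TRUE_MEAN <= high

hits = sum(covers(m, s) for m, s in summaries)
print(f"{hits} of the {len(summaries)} tabulated intervals contain {TRUE_MEAN}")

# Long-run behaviour of the procedure.
random.seed(3)
SIGMA = 3         # ASSUMPTION: population standard deviation for illustration
reps = 5000
long_run_hits = 0
for _ in range(reps):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    long_run_hits += covers(statistics.fmean(sample), statistics.stdev(sample))
coverage = long_run_hits / reps
print(f"long-run coverage over {reps} simulated samples: {coverage:.3f}")
```

Any single interval either contains the true mean or it does not; the 90% describes how often the procedure succeeds across repeated samples.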
Example: Recall the Student Data Survey completed at the beginning of the semester. Using JMP,
construct a 99% confidence interval for the true average number of hours spent on Facebook in a day.
The data can be found in the file facebook.jmp on the course website.

Interval:

Interpretation: