Hypothesis Testing of
Qualitative Data
Connecting Probability Concepts in a
logical way to develop the formulas
used for testing hypotheses
By Pete Kaslik
Pierce College, Ft. Steilacoom
The Need To Do Statistics
Begins With A Question That
Leads to a Decision
Typically, the decision has
financial or health
implications.
For the purposes of this
demonstration, the question is
whether the US should allow
drilling for oil in the Pacific
Ocean and in the Arctic Ocean
We will assume there are only 2
possible points of view on this.
1. Our lifestyles require oil and
therefore we should get oil no
matter where it is and no matter
the possible consequences.
2. We have already gotten the
easiest oil and the potential
consequences of drilling in the
ocean outweigh the benefits.
Therefore we should transition
to an oil-free society instead of
drilling for more.
Before analyzing data, it is
useful to understand the topic.
Following is a brief explanation
of the oil issue.
Hubbert
• In 1956, M. King Hubbert, a petroleum
geologist, explained that oil field production
follows a bell-shaped curve, much like a normal
curve. That is, it starts slow, increases to a
peak and then declines. The same curve applies to
countries and the world.
• He predicted that the US would peak in
the early 1970s.
US Oil Production and
Consumption (and population)
[Figure: U.S. Crude Oil Daily Production and Consumption and US Population (1920-Oct 2011). Left axis: Million Barrels per Day, for U.S. Field Production of Crude Oil and U.S. Field Consumption of Crude Oil. Right axis: US Population. Horizontal axis: Month-Year.]
Notice how much more we consume than produce, hence the reason for all the imports.
Notice the peak in US oil production around 1970.
World Oil Production
World oil production history and projections (left)
are produced by the US Energy Information
Administration (EIA) and are available on the
website:
http://www.eia.gov/pub/oil_gas/petroleum/presentations/2000/long_term_supply/sld001.htm
These projections were made in 2000, so
we can now see if they are on track by looking at
data obtained between the time of
the predictions and 2010.
[Figure: World Oil Supply (Billion BBL/year), 1975 to 2015, with a vertical axis from 20 to 35.]
Data from http://www.eia.gov/cfapps/ipdbproject/IEDIndex3.cfm
New Oil Fields
• There are known reserves off the west
coast of the US and in the Arctic.
• There is conflict about allowing drilling in
the ocean, particularly after the incident in
the Gulf in 2010.
What if US citizens decided
whether to drill in the oceans or
transition to a more environmentally
sustainable society?
Our Choice
• If a super majority of Americans wanted to
drill, then we drill.
• If those who want to drill are not a super
majority, then we should take a more
conservative approach to impacting our
planet and begin our transition to a more
sustainable society.
• We will consider a super majority to be
66.7%.
Our Question
• Does a super majority of adult Americans
want to drill for oil in the ocean waters, in
spite of the risks?
A Census
• To know the absolute answer to this
question would require a census – that is
asking every single adult in the country
(approximately 230 million people).
• Since a census is expensive, time
consuming and generally not possible, we
will make a hypothesis about the opinions
of people and then take a sample to test
our hypothesis.
Our Hypotheses
• H0: p = 0.667
• H1: p > 0.667
• p is the proportion of all adult Americans
who think we should drill in marine waters.
• A super majority is over 66.7%, thus the
null hypothesis indicates there isn’t a
super majority and the alternate indicates
a super majority.
3 Methods
• We will test these hypotheses using 3
different, but related methods.
– Binomial distribution – gives exact results
– Normal approximation to the binomial
distribution – gives approximate results
– Sampling distribution of sample proportions –
gives approximate results.
Visualizing the Hypotheses
[Figure: the null distribution (H0: p = 0.667) and three possible alternate distributions (H1: p > 0.667), each divided into Drill and Don't Drill.]
Picture the Opinion of the Entire
Adult US Population
Use your
imagination to
picture 230
million black
or green
circles on this
map, one for
each adult’s
opinion.
Legend: black = drill, green = don't drill
Image of US Map from http://www.thinkstockphotos.com/search/#us map/f=PIHV
The Sample
• Since we can’t do a census, our next best
alternative to understanding the population
is to take a random sample. We are going
to then have to use this sample as a way
of determining which hypothesis to
support.
Error
• Because we are going to make a judgment
about the entire population based on a sample,
it is possible that we will make an error as a
result of the data we randomly select.
• If the data supports the alternate hypothesis, we
could make a Type I Error.
• If the data supports the null hypothesis, we could
make a Type II Error.
• We will not know if we make an error but there
are consequences if we do.
Consequences of Errors
• The consequence of a Type I error would
be that we would drill for oil when we don’t
have a super majority.
• The consequence of a Type II error would
be that we wouldn’t drill for oil when the
super majority thinks we should.
Our Sample
This is our sample, in the order in which the responses were collected
(B = black, G = green): BBGBBBBGBBBBBGGBBBBB
Black represents drill, green represents not drilling.
What we have to decide is how to use this sample to
determine which hypothesis to support.
One way to use this sample is to count the number in favor of
drilling (black) and the number opposed to drilling (green).
Our sample has 16 in favor, 4 opposed.
Disclaimer: Obviously, this is a very small sample. This sample
size is being used to keep this demonstration reasonable.
Hypothesis Testing Theory
• When testing hypotheses, we start with
the assumption the null hypothesis is true.
• We reject the null hypothesis only if we get
data that is unlikely. That is, we get data
that would be considered a rare event.
• To determine what is and isn’t a rare
event, we must first recognize that random
samples do not look exactly like the
population from which they were drawn.
The Null Distribution
• Our objective is to create the null distribution.
This distribution shows the complete set of
possible outcomes and gives the probability
of each of the outcomes.
• Probability = (Number of Favorable Outcomes) / (Number of Possible Outcomes)
This assumes that every outcome is equally
likely, which in a simple random sample
would be the case (theoretically).
The Starting Point
• If we assume the null hypothesis is true,
that is, we assume that exactly two thirds
(0.667) of the population wants to drill and
exactly one third (0.333) does not want to
drill, and if we randomly select from this
population, then the probability that we
select someone who wants to drill is
0.667.
P(A or B)=P(A) + P(B)
• When one selection is made from a
population, the probability it has one of two
mutually exclusive characteristics is found
by adding the probabilities of each
characteristic.
• This rule is useful for understanding
complements.
P(A or B)=P(A) + P(B)
• Assuming everyone in the country has an
opinion on this topic, then the probability
we select someone who wants to drill or
doesn’t want to drill = 1, which is a
certainty. Therefore:
• P(Drill or Don’t Drill)=P(Drill) + P(Don’t Drill)
• 1=P(Drill) + P(Don’t Drill)
Complements
• 1=P(Drill) + P(Don’t Drill)
• With a little algebra, we see that
P(Don’t Drill) = 1 – P(Drill).
Therefore, if the probability of selecting
someone who wants to drill is 0.667, the
probability of selecting someone who
doesn’t want to drill is 1- 0.667 = 0.333.
P(A and B) = P(A)P(B)
• For independent events such as selecting
people and asking their opinion about
drilling, the probability of any specific
sequence of responses is found by
multiplying the probabilities of each
individual response.
P(A and B) = P(A)P(B)
• Thus, if we selected 2 people from the
population, it is possible to get the following four
combinations of opinions (B = black = drill, G = green = don't drill).
• These can be shown as
P(B and B), P(B and G), P(G and B), P(G and G)
however, this will be abbreviated by the removal
of the word “and” to give:
P(BB), P(BG), P(GB), P(GG)
P(A and B) = P(A)P(B)
• Using the above rule we have:
P(BB) = P(B)P(B) = 0.667 • 0.667 = 0.445
P(BG) = P(B)P(G) = 0.667 • 0.333 = 0.222
P(GB) = P(G)P(B) = 0.333 • 0.667 = 0.222
P(GG) = P(G)P(G) = 0.333 • 0.333 = 0.111
Notice that the sum of all these probabilities
is 1, since they are the complete set of
possible outcomes for a sample size of 2.
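The multiplication rule for a sample of size 2 can be checked with a short Python sketch. The names `P` and `sequence_probability` are illustrative choices made here and do not come from the slides; the probabilities 0.667 and 0.333 are the null-hypothesis values used above.

```python
from itertools import product

# Probabilities under the null hypothesis (values from the slides).
# B = black = wants to drill, G = green = does not want to drill.
P = {"B": 0.667, "G": 0.333}


def sequence_probability(sequence):
    """Multiply the probabilities of independent responses, in order."""
    prob = 1.0
    for response in sequence:
        prob *= P[response]
    return prob


# Every ordered outcome for a sample of two people: BB, BG, GB, GG.
outcomes = {
    "".join(pair): sequence_probability(pair)
    for pair in product("BG", repeat=2)
}

for name, prob in outcomes.items():
    print(f"P({name}) = {prob:.3f}")
print(f"Total = {sum(outcomes.values()):.3f}")
```

The four printed probabilities should match the values above, and the total should be 1.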
Applying this to our sample
• Remember that our sample was BBGBBBBGBBBBBGGBBBBB.
This can be represented mathematically as:
P(BBGBBBBGBBBBBGGBBBBB) =
P(B)P(B)P(G)P(B)P(B)P(B)P(B)P(G)P(B)P(B)P(B)P(B)P(B)P(G)P(G)P(B)P(B)P(B)P(B)P(B) =
(0.667)(0.667)(0.333)(0.667)(0.667)(0.667)(0.667)(0.333)(0.667)(0.667)(0.667)(0.667)(0.667)(0.333)(0.333)(0.667)(0.667)(0.667)(0.667)(0.667) =
0.0000189
Therefore, the probability of getting our exact sequence of drill and don’t drill
opinions, assuming the null hypothesis is true, is 0.0000189.
Applying this to our sample
If the probability of this exact sequence
is 0.0000189, because multiplication is
commutative, does it seem reasonable that the
probability of each of the following exact
sequences is also 0.0000189?
Using Exponents
The last sequence
is a convenient way of representing 16 black and 4 green
circles because we can use exponents to make our
calculations faster. Thus we have:
$P(\text{drill})^{16}\,P(\text{don't drill})^{4} = (0.667)^{16}(0.333)^{4} = 0.0000189$
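As a rough check that the exponent shortcut gives the same answer as multiplying one factor per response, here is a minimal Python sketch. The variable names are assumptions made for illustration; the sample string is the sequence of B (drill) and G (don't drill) responses given earlier.

```python
import math

P_DRILL = 0.667  # P(B) under the null hypothesis
P_DONT = 0.333   # P(G) under the null hypothesis

# The sample in the order collected (B = drill, G = don't drill).
sample = "BBGBBBBGBBBBBGGBBBBB"

# Long way: one factor for each of the 20 responses.
long_way = math.prod(P_DRILL if r == "B" else P_DONT for r in sample)

# Short way: count each opinion and use exponents.
n_drill = sample.count("B")
n_dont = sample.count("G")
short_way = P_DRILL ** n_drill * P_DONT ** n_dont

print(n_drill, n_dont)      # 16 4
print(f"{long_way:.7f}")    # 0.0000189
print(f"{short_way:.7f}")   # 0.0000189
```

Because multiplication is commutative, any rearrangement of the same 16 B's and 4 G's produces the same product.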
A shift in our thinking
• When we take a sample, we don’t really
care about the order in which the data are
collected. What we want to know is the
probability of getting a particular number of
people who want to drill. In this example,
we might want to know the probability that
exactly 16 out of 20 people in the sample
will want to drill.
Adding Probabilities
• We now need to return to the rule for
mutually exclusive events.
• P(A or B) = P(A) + P(B)
• Our use of this formula involves the
different combinations of outcomes to the
survey:
Of course, there are many more arrangements (combinations) of 16
black and 4 green circles than are shown here.
Multiplication instead of addition
• Since each combination has exactly the same
probability (0.0000189) then instead of finding each
combination and adding the probabilities of each, we
could simply multiply that probability by the number of
combinations. Combinations are found using
$$ {}_{n}C_{r} = \binom{n}{r} = \frac{n!}{(n-r)!\,r!} $$
Since our sample size is n = 20 and the number of people who want to
drill is r = 16, then
$$ {}_{20}C_{16} = \binom{20}{16} = \frac{20!}{(20-16)!\,16!} = 4845 $$
TI 83/84: 20 Math PRB #3 16 Enter
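For readers without a TI calculator, the same count can be sketched in Python. This assumes Python 3.8 or later, where the standard library provides `math.comb`; the variable names are illustrative.

```python
import math

n, r = 20, 16

# Directly from the formula n! / ((n - r)! r!).
by_formula = math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

# Using the standard-library helper.
by_library = math.comb(n, r)

print(by_formula, by_library)  # 4845 4845
```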
Contrast
• Make sure you understand the difference
between the probability of one particular
sequence of outcomes and the probability of a
particular number of people who want to drill.
• The probability of one particular sequence
with 16 in favor and 4 opposed is 0.0000189.
• The probability of exactly 16 successes in a
sample of size 20 with a probability as defined in
the null hypothesis is 4845 • 0.0000189 = 0.0916.
Focus on what’s important
• We care about the number of people who want
to drill more than the order in which their opinion
was recorded.
• Recall that our objective is to determine the
likelihood of particular outcomes.
• For a sample of size 20, there are 21 possible
outcomes for the number of people who want to
drill. These possible outcomes are:
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
Binomial Distribution Formula
• We found the probability of exactly 16 people
who want to drill by multiplying the number of
combinations times the probability of a specific
combination. Formally, this is:
$$ \frac{n!}{(n-r)!\,r!}\,p^{r}q^{n-r} $$
Where n = sample size, r = number who want to drill, p = probability of
selecting someone who wants to drill and q = probability of selecting
someone who doesn’t want to drill.
TI 83/84: 2nd Distr Binompdf(n,p,r): Example: Binompdf(20,0.667,16)
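The formula can also be written as a small Python function. This is a sketch that mirrors the calculator's Binompdf(n,p,r) argument order; the function name `binomial_pmf` is an assumption made here, not part of the original material.

```python
import math


def binomial_pmf(n, p, r):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p  # probability of selecting someone who doesn't want to drill
    return math.comb(n, r) * p**r * q**(n - r)


prob_16 = binomial_pmf(20, 0.667, 16)
print(f"{prob_16:.4f}")  # 0.0914
```

The result differs slightly from 4845 • 0.0000189 = 0.0916 because that hand calculation uses the rounded sequence probability.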
Binomial Distribution
• If we apply the binomial distribution formula to
each possible number of successes, we can
create a binomial distribution.
Number  Probability    Number  Probability    Number  Probability
0       2.81E-10       7       0.003          14      0.182
1       1.13E-08       8       0.009          15      0.146
2       2.14E-07       9       0.025          16      0.091
3       2.58E-06       10      0.054          17      0.043
4       2.19E-05       11      0.098          18      0.014
5       1.41E-04       12      0.148          19      0.003
6       7.04E-04       13      0.182          20      0.000
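A table like this one can be regenerated with a few lines of Python. The sketch below assumes the null value p = 0.667 and a sample of size 20; `binomial_pmf` and `distribution` are illustrative names.

```python
import math


def binomial_pmf(n, p, r):
    """Probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)


n, p = 20, 0.667

# One probability for each of the 21 possible counts, 0 through 20.
distribution = [binomial_pmf(n, p, r) for r in range(n + 1)]

for r, prob in enumerate(distribution):
    print(f"{r:2d}  {prob:.3g}")

# The probabilities of all possible outcomes add up to 1.
print(f"Total = {sum(distribution):.3f}")
```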
Binomial Distribution Stick Graph
[Figure: Binomial Distribution For The Number Of People Who Want To Drill If Exactly 66.7% of the Population Wants To Drill. Horizontal axis: Number of People in a Sample of Size 20 Who Want To Drill (0 to 20). Vertical axis: Probability (0.00 to 0.20).]
Testing Our Hypotheses
Review:
The principle behind hypothesis testing is to assume the null
hypothesis is true and then determine if selecting our data was a
rare event.
The probability of selecting our data or more extreme data is called the
p-value.
Since our hypotheses are
H0: p = 0.667
H1: p > 0.667
The direction of our extreme is to the right because it is high values that
would lead us to conclude the proportion is more than 0.667 and
high values are to the right on a number line.
Testing Our Hypotheses
• Rare events have small p-values.
• A p-value is considered to be small
enough to be regarded as a rare event if it
is less than or equal to alpha (α), the level of
significance, which is the probability of making
a Type I error when the null hypothesis is true.
If p-value ≤ α, accept H1. The data are significant.
Testing Our Hypotheses
• Let α = 0.05.
• Since our data was 16 people who wanted to drill, we can find the probability of
getting 16 or more by adding up the probabilities on the binomial distribution.
Number  Probability
16      0.091
17      0.043
18      0.014
19      0.003
20      0.000
Sum:    0.152 (computed from the unrounded probabilities)
Our p-value = 0.152. Since the p-value is greater than α, we conclude
that it would not be a rare event to get 16 out of 20 people to support
drilling if exactly 66.7% of the adult American population supports
drilling. Therefore, the evidence from our sample supports the null
hypothesis, which indicates there is not a super majority.
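The p-value calculation above can be sketched in Python by summing the binomial probabilities from the observed count up to the sample size. The helper names `binomial_pmf` and `upper_tail_p_value` are assumptions made for this illustration.

```python
import math


def binomial_pmf(n, p, r):
    """Probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)


def upper_tail_p_value(n, p, observed):
    """P(X >= observed): the data plus everything more extreme to the right."""
    return sum(binomial_pmf(n, p, r) for r in range(observed, n + 1))


alpha = 0.05
p_value = upper_tail_p_value(20, 0.667, 16)

print(f"p-value = {p_value:.3f}")  # p-value = 0.152
if p_value <= alpha:
    print("Rare event: the data support H1.")
else:
    print("Not a rare event: the data support H0.")
```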
Using the Calculator to Test a Hypothesis
Using the Binomial Distribution
• Rather than creating an entire binomial
distribution and adding up the probabilities for
the data and more extreme values, we can use
the binomcdf function on the calculator.
Binomcdf always adds the probabilities to the
left, so if the direction of the extreme is to the
right, we need to use the complement rule, and
also subtract one from our x value. That means
the probability of getting 16 or more equals 1
minus the probability of getting 15 or less.
• 1 – binomcdf(n,p,r-1) = 1 – binomcdf(20,0.667,15)=0.152
Using the Calculator to Test a Hypothesis
Using the Binomial Distribution
• If the direction of the extreme had been to
the left because of a < symbol in the
alternate hypothesis, we would use:
binomcdf(n,p,r) to get the p-value.
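The complement rule used with binomcdf can be illustrated with a Python stand-in for the calculator function. This is only a sketch: `binomial_cdf` is a hypothetical helper written here to add probabilities to the left, the way the slides describe binomcdf.

```python
import math


def binomial_pmf(n, p, r):
    """Probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)


def binomial_cdf(n, p, r):
    """P(X <= r): adds the probabilities from 0 up to and including r."""
    return sum(binomial_pmf(n, p, k) for k in range(r + 1))


n, p, observed = 20, 0.667, 16

# Extreme to the right (H1 uses >): 1 minus the probability of 15 or less.
right_tail = 1 - binomial_cdf(n, p, observed - 1)

# Extreme to the left (H1 uses <): the probability of 16 or less.
left_tail = binomial_cdf(n, p, observed)

print(f"right tail = {right_tail:.3f}")  # right tail = 0.152
print(f"left tail  = {left_tail:.3f}")
```

Both tails include the observed count of 16, which is why 1 is subtracted from r before taking the complement.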
Summary
• A binomial distribution includes all possible
outcomes and their probabilities if the null
hypothesis is true. Using this distribution
allows us to find the exact p-value for our
data and thereby determine if the data is
rare enough to cause us to reject the null
hypothesis.
The Second Method
Normal Approximation of the Binomial Distribution
[Figure: Binomial Distribution For The Number Of People Who Want To Drill If Exactly 66.7% of the Population Wants To Drill, with a normal curve (mean 13.34, standard deviation 2.1077) drawn over it. Horizontal axis: Number of People Who Want To Drill (0 to 20). Vertical axis: probability (0.00 to 0.20).]
Notice how nicely the normal curve fits the binomial distribution.
Binomial Distribution Parameters
• The expected value, or mean of this
distribution is μ = np = 20(0.667) = 13.34
• The standard deviation of this distribution is
$$ \sigma = \sqrt{np(1-p)} = \sqrt{20(0.667)(0.333)} = 2.1077 $$
This mean and standard deviation can be used to label the x axis of
the normal curve.
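A short Python sketch of these two parameters, along with the x-axis labels at whole numbers of standard deviations from the mean, might look like the following. The variable names are illustrative assumptions.

```python
import math

n, p = 20, 0.667

mean = n * p                      # expected number who want to drill
sd = math.sqrt(n * p * (1 - p))   # standard deviation of that count

print(f"mean = {mean:.2f}")  # mean = 13.34
print(f"sd   = {sd:.4f}")    # sd   = 2.1077

# Labels for the normal curve: the mean plus or minus 1, 2 and 3 standard deviations.
labels = [mean + k * sd for k in range(-3, 4)]
print([f"{value:.1f}" for value in labels])
```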
Normal Approximation
[Figure: Binomial Distribution For The Number Of People Who Want To Drill If Exactly 66.7% of the Population Wants To Drill, with the normal curve drawn over it. The horizontal axis (Number of People Who Want To Drill, 0 to 20) is also labeled at the mean and at 1, 2 and 3 standard deviations on either side: 7.0, 9.1, 11.2, 13.3, 15.4, 17.6, 19.7. Vertical axis: probability (0.00 to 0.20).]
Using the Normal Distribution
• Since the mean is 13.34, the standard
deviation is 2.1077 and our sample
number of people who want to drill is 16,
then we can use the z formula to
determine how many standard deviations
our sample is from the mean. We can use
that value and the standard normal z
tables to find the probability of getting 16
or more successes.
Using the Normal Distribution
$$ z = \frac{x - \mu}{\sigma} = \frac{16 - 13.34}{2.1077} = 1.26 $$
From the tables, we find the area to the right of this z-value
(hint: 1 – area to the left) is 0.1038, which is our p-value. The TI
83/84 calculator gives a more precise version of this if you enter
2nd distr #2 Normalcdf(low, high, mean, standard deviation) =
Normalcdf(16, 1E99, 20*0.667,sqrt(20*0.667*0.333)) = 0.10346
As when the hypothesis was tested using the binomial distribution,
this p-value is greater than α, so the data supports the null
hypothesis – there isn’t a super majority who want to drill.
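The normal approximation can be sketched in Python without statistical tables. The example below uses the standard-library function `math.erfc` to get the area to the right of z; the helper name `normal_upper_tail` is an assumption made for this illustration.

```python
import math


def normal_upper_tail(x, mean, sd):
    """Return z and the area to the right of x under a normal curve."""
    z = (x - mean) / sd
    area_right = 0.5 * math.erfc(z / math.sqrt(2))
    return z, area_right


n, p, x = 20, 0.667, 16
mean = n * p
sd = math.sqrt(n * p * (1 - p))

z, p_value = normal_upper_tail(x, mean, sd)

print(f"z = {z:.2f}")              # z = 1.26
print(f"p-value = {p_value:.3f}")  # p-value = 0.103
```

The result should agree with the Normalcdf value above to about four decimal places.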
Counts are difficult to compare
Both of the methods used so far to test hypotheses
rely on counting the number of people who
want to drill. If every sample taken
was the same size, then this would be a
good way to compare results from different
samples. However, if our sample had 16
people who want to drill and someone else
had 42 people who want to drill, how could
we compare these numbers?
Counts are difficult to compare
The obvious question that would be asked is
42 out of how many? If the sample size
was 48, we would come to one conclusion,
if the sample size was 100, we would
come to a different conclusion.
From Counts to Proportions
• Comparing the results of two samples with
different sample sizes leads us to the concept of
proportions.
• Proportions result from dividing the number of
people who want to drill by the sample size. Thus:
16/20 = 0.8     42/100 = 0.42     42/48 = 0.875
We will use this approach of dividing the counts by the sample size to
convert our distribution from counts to proportions. We will also convert
the mean and standard deviation of the number of people who want to
drill.
From Counts to Proportions
Remember that the mean of a binomial distribution is μ = np. If
we divide the mean by the sample size n we get:
$$ \frac{\mu}{n} = \frac{np}{n} = p $$
Remember the standard deviation of a binomial distribution is
$$ \sigma = \sqrt{np(1-p)} $$
We can divide the standard deviation by the sample size n to get:
$$ \frac{\sigma}{n} = \frac{\sqrt{np(1-p)}}{n} = \sqrt{\frac{np(1-p)}{n^{2}}} = \sqrt{\frac{p(1-p)}{n}} $$
Changing to Proportions
[Figure: Distribution of Sample Proportion of People Who Want To Drill. The horizontal axis shows each count with its proportion in parentheses, from 0 (0.00) to 20 (1.00), for example 13.3 (0.667) and 16 (0.80). Vertical axis: probability (0.00 to 0.20).]
Using the Normal Distribution
For the distribution of sample proportions, x = p̂, μ = p and
$$ \sigma = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.667(1-0.667)}{20}} = 0.10538 $$
$$ \hat{p} = \frac{x}{n} = \frac{16}{20} = 0.8 $$
$$ z = \frac{x-\mu}{\sigma} = \frac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{0.8-0.667}{0.10538} = 1.26 $$
Notice that this is exactly the same z value we got when using
the Normal Approximation for the Binomial Distribution.
Consequently, the p-value will be the same as will be our
conclusion.
From the tables, we find the area to the right of this z-value
(hint: 1 – area to the left) is 0.1038, which is our p-value. The TI
83/84 calculator gives a more precise version of this if you enter
2nd distr #2 Normalcdf(low, high, mean, standard deviation) =
Normalcdf(16/20, 1E99, 0.667,sqrt(0.667*0.333/20)) = 0.10346
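To see that working with proportions gives the same z value as working with counts, here is a minimal Python sketch. It computes z both ways and then the area to the right using `math.erfc`; all variable names are illustrative assumptions.

```python
import math

n, x, p0 = 20, 16, 0.667  # sample size, number who want to drill, null proportion

# Proportion version: sample proportion and its standard deviation.
p_hat = x / n
sd_proportion = math.sqrt(p0 * (1 - p0) / n)
z_proportion = (p_hat - p0) / sd_proportion

# Count version, for comparison.
z_count = (x - n * p0) / math.sqrt(n * p0 * (1 - p0))

# Area to the right of z under the standard normal curve.
p_value = 0.5 * math.erfc(z_proportion / math.sqrt(2))

print(f"p_hat = {p_hat:.2f}")                     # p_hat = 0.80
print(f"sd of p_hat = {sd_proportion:.5f}")       # sd of p_hat = 0.10538
print(f"z (proportions) = {z_proportion:.2f}")    # z (proportions) = 1.26
print(f"z (counts)      = {z_count:.2f}")         # z (counts)      = 1.26
print(f"p-value = {p_value:.3f}")                 # p-value = 0.103
```

Dividing both the numerator and the denominator of the count-based z by n leaves the ratio unchanged, which is why the two z values agree.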
Surprised?
• It should come as no surprise that the
normal approximation for the binomial
distribution and the sampling distribution of
sample proportions yield exactly the
same result. This is because the normal
approximation uses counts while the
sampling distribution represents the
counts as proportions. Nothing else is
different.
Summary
• Binomial distributions produce exact p-values.
• The normal approximation to the binomial
distribution and the sampling distribution of
sample proportions produce identical p-values (0.103) that will be close to, but not
the same as, those produced by the
binomial distribution (0.152).
Conclusion
• Ultimately, we use the last method to test
hypotheses because it is easier to discuss
proportions than counts, since sample
sizes can vary. Keep in mind, however,
that the method produces approximate p-values. The approximation improves as
the sample size increases.
Conclusion
• Because the p-value is greater than the
level of significance (0.05),
– we select the null hypothesis.
– the data are not significant.
– we will not drill.
– there is the possibility we made a Type II
error.