Chapter 2 Comparing Two Proportions

CHAPTER 2
COMPARING TWO
PROPORTIONS
Objectives:
Students will be able to:
1) Test a difference in proportions
2) Use technology to simulate a difference in
proportions
3) Make conclusions from experiments
• In Chapter 1, we examined if an athlete’s or
team’s ABILITY increased or decreased from an
early time period to a later time period.
• In Chapter 2, we will examine if an athlete or
team has a greater ABILITY in one context or
another, during the same time period.
• Examples
• If a baseball player’s ABILITY to hit is greater
against left-handed pitchers than against
right-handed pitchers during his career.
• If a team’s ABILITY to win at home is greater than
their ABILITY to win on the road during the
regular season.
Is there a home field advantage in the
NFL?
• It is assumed in the NFL (and most other leagues)
that the home team has an advantage.
• Here are the results of the Arizona Cardinals’
games from the 2008 NFL season:
          Home   Road   Total
Wins        6      3      9
Losses      2      5      7
Total       8      8     16
• This table is known as a two-way table.
• A two-way table displays the distribution of a
categorical variable in at least two different contexts.
• The team’s regular season overall record was 9-7,
meaning they won 9 games and lost 7, thus they
played a total of 16 games.
• How would we find their winning percentage?
• To calculate winning percentage, divide the total number of
wins by the total number of games played, and then multiply
by 100.
• Let’s calculate their different winning percentages:
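As a quick check of the rule above, here is a minimal Python sketch. The function name `winning_percentage` is only illustrative, and the home (6-2) and road (3-5) records are the ones given later in this chapter.

```python
def winning_percentage(wins, games):
    """Wins divided by games played, then multiplied by 100."""
    return wins / games * 100

# 2008 Cardinals: 6-2 at home, 3-5 on the road, 9-7 overall
home_pct = winning_percentage(6, 8)
road_pct = winning_percentage(3, 8)
overall_pct = winning_percentage(9, 16)

print(home_pct, road_pct, overall_pct)  # 75.0 37.5 56.25
```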
• What’s one way we can compare these
PERFORMANCES?
• It certainly looks like there is evidence that the
Cardinals had a greater ABILITY to win at home.
However, is our evidence convincing evidence?
Performance vs Ability
• If the Cardinals could continue playing the 2008
season indefinitely, in the same conditions, their
overall winning percentage in home games would
be their ABILITY to win at home.
• The Cardinals’ PERFORMANCES at home (and
on the road) in the actual 2008 season will vary
from their ABILITY due to random chance.
Why were the Cardinals better at home?
• We have two possible explanations:
• Claim 1: The Cardinals had the same ABILITY
to win at home and on the road, and the
difference in PERFORMANCES was due to
RANDOM CHANCE.
• Claim 2: The Cardinals had a greater ABILITY
to win at home than on the road.
• Remember the comparison to our judicial system.
We must start with the assumption that someone is
innocent until proven guilty. Only if we have
convincing evidence can we claim someone is guilty.
• In this instance, we must start with the assumption
that the Cardinals had the same ABILITY to win at
home and on the road, and the difference in
PERFORMANCES was due to RANDOM CHANCE.
• If our simulations show that the difference in
PERFORMANCES is unlikely to occur by RANDOM
CHANCE, then we will have convincing evidence that
the Cardinals had a greater ABILITY to win at home.
Stating Hypotheses
• The two competing claims are known as
hypotheses.
• Null hypothesis: describes the initial belief that
there has been no change in ABILITY or that
there is no difference in ABILITY in two different
contexts
• The null hypothesis is denoted as H₀.
• This is read as H-naught, or H-sub 0.
• Alternative hypothesis: describes what we want to
establish or what we suspect is true
• The alternative hypothesis is denoted as Hₐ.
• This is read as H-sub a.
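• Let’s formally state our hypotheses:
• H₀: The Cardinals had the same ABILITY to win at
home and on the road.
• Hₐ: The Cardinals had a greater ABILITY to win at
home than on the road.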
Things to keep in mind…
• Hypotheses are always stated in terms of ABILITY, which
is what we are trying to estimate.
• The null hypothesis should always claim equality.
• The alternative hypothesis will either claim an inequality, or
claim non-equality. For this chapter, most alternative
hypotheses will contain a “greater” inequality.
• The alternative hypothesis is what we suspect is true, not
the null hypothesis.
• A hypothesis test is the formal process used to decide
whether the data provide convincing evidence to support
the alternative hypothesis.
• Before we talk more about this, let’s practice writing
hypotheses…
Example 1
• Do golfers have a lower ABILITY to land on the green
when it rains? To investigate, you compare the
percentage of time landing on the green for Tiger
Woods in tournaments when it rained and in
tournaments when it was sunny. State the null and
alternative hypotheses for investigating this claim.
Example 2
• Do quarterbacks have a greater ABILITY to complete
passes at home as opposed to on the road? To
investigate, you compare the pass completion
percentages for Shane Falco (of the Washington
Sentinels) in games at home and in games on the
road. State the null and alternative hypotheses for
investigating this claim.
Test Statistics
• A test statistic is a measure calculated from a
team’s (or player’s) PERFORMANCES that is
used as evidence in a hypothesis test.
• The test statistic in this chapter is the difference
in proportions (percentage of successes).
• The Cardinals home winning percentage was
75%, and their road winning percentage was
37.5%. Since we want the difference in
proportions, subtract the road percentage from the
home percentage: 75% − 37.5% = 37.5%. This is our
test statistic.
• We will be testing to see how likely it is for a team
that goes 9-7 overall to have a difference in
PERFORMANCES at home and on the road of at
least 37.5% due to RANDOM CHANCE alone,
assuming there is no difference in their ABILITY
at home and on the road.
• In other terms, we want to see how likely it is to
get a difference of at least 37.5%, assuming the
null hypothesis is true.
• To do this, let’s run a simulation! How would we
set up our spinner???
• Problem: We do not have an assumed ABILITY.
• Solution: We have to use a different type of simulation.
• We are going to use index cards to run this
simulation. This will tell us what values of the test
statistic could occur by RANDOM CHANCE alone.
Here’s what you will do:
1) Start with 16 equally sized index cards.
2) The outcomes will go on the cards.
-On 9 of the cards, write a “W” on one side to
represent the 9 wins. On the remaining 7 cards, write
an “L” on one side to represent the 7 losses.
3) Next, fully shuffle the cards. Then deal 8 cards into
one pile to represent the team’s simulated
PERFORMANCE at home, and the other 8 cards into
another pile to represent the simulated road
PERFORMANCE.
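The card procedure above can be sketched in Python. This is only an illustration of one shuffle and deal, not part of the class activity; the function name `simulate_season` and the seed value are arbitrary choices.

```python
import random

def simulate_season(wins=9, losses=7, home_games=8, rng=random):
    """Shuffle W/L cards and deal them into a home pile and a road pile.

    Returns the number of wins in each pile.
    """
    cards = ["W"] * wins + ["L"] * losses
    rng.shuffle(cards)
    home = cards[:home_games]   # first 8 cards: simulated home games
    road = cards[home_games:]   # remaining 8 cards: simulated road games
    return home.count("W"), road.count("W")

rng = random.Random(2008)  # arbitrary seed so the run can be repeated
home_wins, road_wins = simulate_season(rng=rng)

# Difference in winning percentages (home minus road); it can be negative
diff = home_wins / 8 * 100 - road_wins / 8 * 100
print(f"home wins: {home_wins}, road wins: {road_wins}, difference: {diff}%")
```

Each run with a different seed plays the role of one new shuffle of the index cards.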
• Here are the results for my simulation:
• Here, the Cardinals still performed better at home,
but only by 12.5%, not the 37.5% we saw in the
actual 2008 season.
• Let’s record some of our observations. Note: you can
have a negative difference! This would occur if the
road winning percentage was higher than the home
winning percentage.
• Here are 100 simulated seasons.
• From this simulation, does it look like we have
convincing evidence to claim that the Cardinals had a
greater ABILITY to win at home than on the road?
• 16 out of 100 (16%) simulations produced a
difference greater than or equal to the actual
difference of 37.5%. As a result, we do not have
convincing evidence to state that the Cardinals
have a greater ABILITY to win at home than on
the road.
• The value 16/100 (0.16) is called our approximate
p-value (probability value).
• A p-value measures how likely it is to get a test
statistic at least as extreme as the observed test
statistic by RANDOM CHANCE, assuming that
the null hypothesis is true. The results of a
simulation provide an estimate of the p-value.
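As a cross-check on the simulated estimate, the p-value for the card setup can also be counted exactly. This counting approach is not part of the slides. It assumes, as the card simulation does, that every way of dealing 8 of the 16 cards to the home pile is equally likely. A difference of at least 37.5% corresponds to 6 or more wins in the home pile.

```python
from math import comb

wins, losses = 9, 7
home_games = 8
observed_home_wins = 6  # 6-2 at home gives the observed 37.5% difference

# Number of ways to choose which 8 of the 16 cards land in the home pile
total_deals = comb(wins + losses, home_games)

# Deals with at least 6 wins in the home pile
extreme_deals = sum(
    comb(wins, k) * comb(losses, home_games - k)
    for k in range(observed_home_wins, home_games + 1)
)

exact_p_value = extreme_deals / total_deals
print(extreme_deals, total_deals, round(exact_p_value, 3))  # 2025 12870 0.157
```

The exact value of about 0.157 is close to the 16% found in the 100 simulated seasons.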
Making Conclusions
• There are two possible conclusions that we can
reach:
• 1) Reject the null hypothesis
• 2) Fail to reject the null hypothesis
• The p-value gives us the information we need
about the null hypothesis to reach a conclusion.
• If the p-value is small, we will reject the null
hypothesis.
• If the p-value is large, we will fail to reject the null
hypothesis.
• The significance level of a test is a
predetermined level of evidence that is required
to essentially rule out RANDOM CHANCE as a
plausible explanation.
• Most of the time we use a significance level of
5%. (Sometimes you will see other values.
Other common values are 1% and 10%).
• The p-value is considered small if it is less than
the significance level.
• The smaller the p-value, the more convincing our
evidence is.
• For example, a p-value of 0.002 would be more
convincing than a p-value of 0.03.
• This video best sums it all up:
• Understanding p-value
• Remember, if p is low, null must go!!!
• Our initial hypotheses:
• The p-value in our simulation was 16%.
• Conclusion: Fail to reject the null hypothesis.
• At the 5% level of significance, we do not have
convincing evidence to support the claim that the
2008 Cardinals had a greater ABILITY to win at home
than on the road.
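The decision rule from this section can be written as a small function. This is a sketch only; the name `conclusion` and the default level of 0.05 are illustrative choices that follow the 5% convention used above.

```python
def conclusion(p_value, significance_level=0.05):
    """Reject the null hypothesis only when the p-value is below the level."""
    if p_value < significance_level:
        return "Reject the null hypothesis"
    return "Fail to reject the null hypothesis"

print(conclusion(0.16))   # Fail to reject the null hypothesis
print(conclusion(0.002))  # Reject the null hypothesis
```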
Tennis anyone???
Using Technology for Simulation
• Even though index cards are super fun, they are
also super time consuming. Let’s look at a
quicker way we can simulate a difference in
proportions by using technology.
• We will go back to the example in the beginning
of the chapter of the 2008 Arizona Cardinals.
Reminder: they were 6-2 at home, and 3-5 on the
road. We are going to need this data!
Steps to simulate a difference in proportions by using
technology:
1) Either go to http://lock5stat.com/statkey/ , or click on
my teacher page and click the link “Statkey”
2) In the right-hand column under “Randomization
Hypothesis Tests” click “Test for Difference in
Proportions.”
3) On the top of the page click on “Edit Data”
4) Group 1 count: 6
Group 1 sample size: 8
5) Group 2 count: 3
Group 2 sample size: 8
6) Click on “right tail”
7) Click “generate 100 samples”
• It might seem a little odd that we did not find
convincing evidence for home field advantage,
seeing as how most fans believe it exists.
• Is there anything we can do to try and improve this
study?
• Perhaps if we increased the sample size, we may
find convincing evidence for home field advantage.
• Sample size: the number of observations (games,
shots, etc…)
• Remember, when sample size is increased, the
effects of RANDOM CHANCE are decreased, and
the PERFORMANCE will be closer to the actual
ABILITY.
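One way to see this numerically is to compute how much a winning proportion typically varies around the ABILITY for different numbers of games. The sketch below assumes each game is an independent trial with the same chance of winning (like the spinner model), and the ABILITY of 0.5 is a made-up value for illustration. The function name `performance_spread` is hypothetical.

```python
from math import sqrt

def performance_spread(ability, games):
    """Standard deviation of the winning proportion over `games` games,
    assuming independent games that each have the same chance of a win."""
    return sqrt(ability * (1 - ability) / games)

for games in (8, 40):
    print(games, round(performance_spread(0.5, games), 3))
# 8 0.177
# 40 0.079
```

With 40 games instead of 8, the typical gap between PERFORMANCE and ABILITY is less than half as large.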
• Instead of looking at the PERFORMANCES of the
Cardinals for just the 2008 season, let’s look at the
PERFORMANCES for the 2004-2008 seasons.
• What are the hypotheses we want to test?
• Before we test these hypotheses, let’s calculate
our test statistic: 23/40 = 57.5% at home and
10/40 = 25% on the road, a difference of 32.5%.
• Now let’s head back to Statkey and test the
likelihood of a difference of at least 32.5% due to
RANDOM CHANCE.
• Edit Data
Group 1 Count: 23
Group 2 Count: 10
Group 1 Sample size: 40
Group 2 Sample size: 40
Right tail
Generate 100 samples
Try it again and generate 1000 samples
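For readers without access to StatKey, the same kind of randomization test can be sketched in Python. This is not StatKey's actual implementation. It is a minimal version of the shuffle-and-deal idea from the index card activity, and the function name `randomization_p_value` and its parameters are hypothetical.

```python
import random

def randomization_p_value(count1, n1, count2, n2, reps=1000, seed=None):
    """Estimate the right-tail p-value for a difference in proportions.

    Pools all outcomes, reshuffles them into groups of size n1 and n2,
    and counts how often the shuffled difference is at least as large
    as the observed difference.
    """
    rng = random.Random(seed)
    observed = count1 / n1 - count2 / n2
    successes = count1 + count2
    pooled = [1] * successes + [0] * (n1 + n2 - successes)
    at_least_as_extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        group1 = sum(pooled[:n1])  # simulated successes in group 1
        simulated = group1 / n1 - (successes - group1) / n2
        if simulated >= observed - 1e-12:  # small tolerance for rounding
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / reps

# 2004-2008 Cardinals: 23 of 40 wins at home, 10 of 40 on the road
observed, p_value = randomization_p_value(23, 40, 10, 40, reps=1000, seed=1)
print(round(observed, 3), p_value)
```

Calling the function with 6, 8, 3, 8 repeats the single-season test from earlier in the chapter.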
• Here are the results of 1000 trials of the simulation:
• Estimate the p-value from this simulation.
Make a Conclusion
• Conclusion: Reject the null hypothesis
• At the 5% level of significance, there is convincing
evidence to support the claim that the 2004-2008
Cardinals had a greater ABILITY to win at home than
on the road.
Something to think about…
• When we used 5 years’ worth of data, a difference of
32.5% was convincing evidence of home field
advantage.
• When we used 1 year’s worth of data, a difference of
37.5% was not convincing evidence of home field
advantage. So a larger difference was actually not
convincing.
• One of the most important concepts of this course is
that if there really is a difference in ABILITY, we are
more likely to find convincing evidence of the
difference when we use a larger sample size.
• Have you ever watched a college basketball game
where a player on the visiting team is shooting
free-throws and the home student section is going
absolutely crazy trying to distract the shooter?
• This raises the question: “Do basketball players have a
lower ABILITY to make free-throws in a hostile
environment?”
• If you have never seen this in action, here are a few
examples of college fans distracting free-throw
shooters:
• Sport Science
• Speedo Guy
• Wild Bill
Experiments
• The way we can test something like whether
environment makes a difference is by performing an
experiment.
• In an experiment, treatments are deliberately imposed
on individuals to measure their responses.
• Let’s talk about some of the key elements of
experiments, and how we would set up an experiment
to test if there is a difference in environments when
shooting free throws…
• Treatment: what each individual in the experiment is
assigned to do
• This experiment would compare shooting percentage using
two different treatments:
1) shooting free-throws in a distraction free environment
2) shooting free-throws in an environment with distractions
• Response variable: measures the outcome of interest
• What would be the response variable here?
• Whether or not the shot was made
• Explanatory variable: what is deliberately changed to
see if this change causes a change in the response
variable
• What would be the explanatory variable here?
• The type of environment
Setup
• It is important to think about the setup of the
experiment. Would it be fair to have a participant
shoot 50 shots in a distraction-free environment and
then 50 shots with distractions?
• No, because the participant may get tired and shoot
worse in the second set.
• How can we avoid this?
• We need to randomly assign our treatments, meaning
we need to use a chance process to assign the
treatments to individuals so that no treatment is given
an advantage.
• One way to randomly assign treatments for this
experiment would be to use notecards. If the
participant will take 100 total shots, then write a
“D” for distraction on 50 cards and an “N” for no
distraction on the other 50 cards. Then shuffle up
the cards and reveal them one at a time to
determine which environment each shot will be
taken in. Doing this will ensure that one
environment isn’t given a disadvantage (or
advantage) over the other.
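The notecard procedure can also be carried out with a short Python sketch. The function name `assign_treatments` and the seed are illustrative assumptions; any chance process that shuffles 50 “D” and 50 “N” labels would serve the same purpose.

```python
import random

def assign_treatments(shots_per_treatment=50, seed=None):
    """Return a shuffled order of 'D' (distraction) and 'N' (no distraction)."""
    cards = ["D"] * shots_per_treatment + ["N"] * shots_per_treatment
    random.Random(seed).shuffle(cards)
    return cards

order = assign_treatments(50, seed=7)  # arbitrary seed
print("".join(order[:10]))  # environment for the first 10 shots
```

Reading the list from left to right is the same as revealing the shuffled cards one at a time.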
• Another important aspect of an experiment is making
sure to control the environment. This means that all
conditions are kept exactly the same for each trial,
except for the treatments being compared.
• In this context, you want to make sure that the
conditions the participant is shooting in are the same
for all shots.
• Some things that should be kept the same:
• Same basketball
• Same sneakers
• Shoot at the same hoop
• Shoot in the same gym
• Shoot in the same time period
• By controlling for other conditions, we will be able to
say the explanatory variable is the cause for the
change in ABILITY (provided we have convincing
evidence).
• In this example, if we found convincing evidence, we
would be able to say that the reason is the shooting
environment. However, if we had not controlled for
something (say the same hoop), we would not know
if a change in ABILITY was due to the shooting
environment or simply shooting on a different hoop.
• What would be our hypotheses for this
experiment?
• By setting them up this way, we should be dealing
with a positive test statistic.
Time for the experiment
• Most of the examples we have dealt with have
been about one athlete, so we will have one
athlete perform the experiment. Then, we can
use our hypothesis testing procedure to see if we
have any convincing evidence. Afterwards, we
can run the experiment again with a new athlete.
• In the interest of time, we will do 10 shots with
distractions, and 10 shots without distractions.
• Here are the results from someone who took 100
attempts.
The p-value is 2%. What is the conclusion?
Reject the null hypothesis.
At the 5% level of significance, there is convincing
evidence to conclude that this particular student has a
greater ABILITY to make free throws when shooting
without distractions.
Making Conclusions from Experiments
• Up until this point, we have only been dealing with
observational studies.
• Observational studies use available data and do not
impose treatments.
• As a result, even if we do have convincing evidence
of a difference in ABILITY, we cannot conclude that a
change in one variable causes a change in another
variable. This is because other variables are not
controlled in an observational study and these
variables might be confounded with the explanatory
variable.
• This contrasts with experiments, which control all
variables (other than the explanatory variable), as
well as randomly assign the treatments. This then
allows us to make a cause-and-effect conclusion.
• In the free-throw experiment, everything stayed the
same except the environment. If we find convincing
evidence (which we did with the 100 shot example),
then we can conclude that the only plausible
explanation for shooting percentage being higher was
the environment without distractions.
• If this happened to be an observational study, we
would be able to say that we have convincing
evidence that ABILITY to make free-throws
increases when shooting without distractions, but
we would not be able to conclude why. We did
not control for any variables, so perhaps the
environment could have caused the ABILITY to
increase, but other things could have caused it to
increase as well, such as the basketball being
used, the hoop being shot on, etc…