Teacher Notes
What’s Your Statistic?
Using Coin Tosses to Generate and Use an Unfamiliar Sampling Distribution
Abstract: This lesson is designed to give students experience with generating a sampling
distribution. Students will invent their own test statistic for measuring randomness of a coin toss.
They will use this statistic to generate a sampling distribution using Fathom. Then, given both
randomly and non-randomly generated coin toss sequences, students will develop a more
authentic understanding of p-values as they examine the likelihood of each sample statistic
when compared to their sampling distribution. Students can discuss the reasonableness and
reliability of their test statistic to decide whether a sequence of coin tosses was randomly
determined, or created by a non-random process. By having students use unconventional test
statistics, this lesson pushes students to a deeper understanding of sampling distributions and
p-values.
Objectives:
- To give students opportunities to explore the idea of inventing a test statistic, use the test
statistic to generate a sampling distribution, and make decisions from the sampling distribution.
- To help students find and interpret p-values, particularly when working off of unfamiliar
sampling distributions.
References:
Class Notes from Park City Mathematics Institute Secondary School Teachers Program 2007.
Online at http://mathforum.org/pcmi/hstp/sum2007/morning/bowen/day1handout.pdf
Clements, Cindy. (2007). Exploring Statistics with Fathom. Emeryville, CA: Key Curriculum
Press.
CME Project (2009). Precalculus. Boston, MA: Pearson.
Scheaffer, Richard L., Watkins, Ann, Witmer, Jeffrey, Gnanadesikan, Mrudulla. (2004). Activity-Based Statistics. Emeryville, CA: Key Curriculum Press.
Watkins, Ann E., Scheaffer, Richard L., Cobb, George W. (2008). Statistics in Action:
Understanding a World of Data, 2nd ed. Emeryville, CA: Key Curriculum Press.
Institute for Advanced Study/Park City Mathematics Institute
Secondary School Teachers Program/Data
Summer 2010
Key Definition:
The p-value for a test is the probability of seeing a result from a random sample that is as
extreme as or more extreme than the result you got from your random sample if the null
hypothesis is true. (Statistics in Action, p. 835)
Prerequisites:
Students will need an understanding of sampling distributions and some familiarity with Fathom.
This lesson was designed to be used prior to a unit on formal hypothesis testing. It is most
appropriate for an AP Statistics or a high school statistics class studying hypothesis testing.
Part 1: Discussion and Homework (approx. 20 minutes)
Discussion:
What does a random sequence look like?
Write out a random sequence of 200 digits, consisting of 0’s and 1’s. Compare your sequences
with other students. Again, discuss – what does a random sequence look like?
Expected student responses:
Students will likely start to get ideas about the patterns and non-patterns they will see in
sequences. We expect students to generate sequences that don’t have many streaks in them
because such a pattern is not “random” enough. Students may start to have conversations
about how many “0”s or “1”s they might expect to see in a row, or whether they expect to see
alternating patterns. Perhaps students will start thinking about test statistics they would measure
from the sequence, and this discussion can be furthered the next day. To press students to
think further, the teacher may choose to show a sequence such as 101001000100001… and
ask students to consider the randomness of this sequence.
The homework assignment asks students to examine two different sequences and then use a
penny to help them make a decision as to which of the sequences is random.
Expected student responses:
Students should explain their choice as to which sequence is random. Students should
reference their experience with tossing a coin in justifying their response. Students who have
tossed a coin should see that tossing a coin can result in “streaks,” and that the first sequence is
actually the randomly generated sequence.
Part 2: Discussion and Fathom Activity (approx. 90 minutes)
Opening Discussion (10 minutes):
On the homework assignment, what are some different ways you could tell that the sequence
was real?
What kinds of measures could you collect that could serve as a test to see whether a sequence
is “truly” random? Invent a summary statistic that will allow you to determine whether a new
sequence you see comes from a fair coin.
Expected student responses:
Length of longest streak
Number of streaks
Mean length of streak
Median length of streak
Proportion of H’s (or T’s)
Number of switches (how many times the sequence changes from H to T (or T to H))
Proportion of streaks more than 3 long
(and others)
Student Handout #1:
Ask each student to choose a summary statistic they think will work best to test for randomness.
They should calculate this statistic for the sequence they generated at the end of the previous
class and both sequences from the homework assignment.
They should also answer the questions included on the handout:
1. Describe your summary statistic. What kind of information does it give you about the
sequence?
2. What kind of values do you expect your statistic to take for a truly random sequence?
Why?
Student responses will depend on statistic chosen. Students should back up claims about
values they expect their statistic to take for a truly random sequence.
Fathom Activity (approx. 80 minutes):
Have students pair up according to the statistic they chose to examine (so that both students
are examining the same statistic). The Fathom activity guides students to use their statistic to
generate an approximate sampling distribution by looking at 120 different randomly generated
sequences of 200 digits.
There are two Fathom files available. One file (wg10_data_coinstatistic_fathom.ftm) should be
given to the students. The other file (wg10_data_coinstatistic_fathomteacher.ftm) is for the
teacher to see what the final product may look like. The teacher file uses the proportion of
heads as an example of a statistic that is used to test for randomness. This example can also
be used as a class demonstration.
Student Directions:
1. Open the Fathom file wg10_data_coinstatistic_fathom.ftm.
- You should see a collection called Coin. It has the attribute CoinToss, which picks
randomly between the choices “Head” and “Tail.” If you inspect the cases, you will see
one case for CoinToss, displaying either “Head” or “Tail.”
- You will also see a collection called Sample of Coin. In this collection, there are 200
cases, representing 200 tosses of a coin. Double-click on the collection, which opens up
the Inspector. “With replacement” and “Replace existing cases” are selected. You can
press “Sample More Cases” to fill your collection.
2. With the Sample of Coin collection selected, drag a Table onto your workspace. You should
see that the collection has three attributes, CoinToss, Streak, and StreakLength. CoinToss is
the result of each coin toss and was described in the Coin collection. Streak is an attribute that
keeps track of the number of times Head or Tail appears in succession. For example, the first
time “Head” appears, Streak = 1. Then if, on the next toss, “Head” appears again, then the
Streak value will become 2. If “Head” appears a third time, the Streak value changes to 3. If,
instead, the next toss results in “Tail,” then the Streak value will go back to 1. StreakLength
records the final length of each streak, and only on the last toss of that streak. On every other
toss, StreakLength returns no value. For example, if there were three heads in a row, you will see
no value next to the first two times “Head” appears, and then StreakLength = 3 next to the last
time “Head” appears in the streak.
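For teachers who want to check the two attributes outside Fathom, the following is a minimal Python sketch of the behavior described above. It is not the formula stored in the Fathom file; the function name streak_columns and the use of None for "no value" are assumptions made for illustration.

```python
def streak_columns(tosses):
    """Return (streak, streak_length) lists that mirror the two attributes."""
    streak = []
    for i, toss in enumerate(tosses):
        if i > 0 and toss == tosses[i - 1]:
            streak.append(streak[-1] + 1)  # same face again: the streak grows
        else:
            streak.append(1)               # first toss or a change of face: restart at 1

    streak_length = []
    for i, s in enumerate(streak):
        # A streak ends on the last toss overall, or when the next toss restarts at 1.
        ends_here = (i == len(streak) - 1) or (streak[i + 1] == 1)
        streak_length.append(s if ends_here else None)  # None stands in for "no value"
    return streak, streak_length


tosses = ["Head", "Head", "Head", "Tail", "Head", "Head"]
streak, streak_length = streak_columns(tosses)
print(streak)         # [1, 2, 3, 1, 1, 2]
print(streak_length)  # [None, None, 3, 1, None, 2]
```

The three heads in a row produce no value on the first two tosses and 3 on the third, matching the example in the student directions.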
Questions:
(a) Describe the table. How many cases are there? What is the longest streak of Heads you
see? What is the longest streak of Tails you see? What else do you notice? Any other
interesting features in your table?
Students should see that there are 200 rows, each row representing 1 coin toss, for a total of
200 coin tosses. This table represents one sample. The length of the longest streaks will vary.
Students may also comment on other patterns they see.
3. Now, you will collect measures to calculate the summary statistic your group chose. Double-click on the Sample of Coin to open the Inspector. Go to the “Measures” tab. You will define
your measure here. The formula you choose will depend on the summary statistic you invented.
For example, if your summary statistic was “Proportion of Heads,” under Measures, you might
name your statistic ProportionOfHeads. Then, under the Formula heading, double-click to
open the formula editor. Type in proportion(CoinToss="Head"). Press “Ok.” The value should
appear.
Of course, your summary statistic will need a different formula. You can experiment with the
formula editor to calculate your measure. You can also reduce the sample size and collect a
new sample, then check your formula by hand on the smaller sample to see if it is working as you intend.
At the end of this document in Appendix A, there is a list of possible statistics students may
choose and their corresponding formulas in case students need additional help.
4. Close the Inspector. Right-click on the Sample of Coin collection, and choose “Collect
Measures.” By default, Fathom collects 5 measures. With the collection “Measures from Sample
of Coin” selected, drag a Graph onto your workspace. In the Inspector, go to the Cases tab.
Grab your summary statistic and drop it into the graph. Examine the dotplot.
Question
(b) What information does the graph tell you? Be specific.
Students should see that there are only 5 measures displayed. They should recognize that each
dot represents the value of their statistic taken on the entire sample of 200 coin tosses. They
should recognize that there are 5 dots because they have examined 5 completely different
samples of 200 coin tosses.
(c) Do the values displayed make sense? What is the highest value your summary statistic
takes? Describe what this number represents. Do the same for your lowest value.
Students should pay attention to the scale of the graph and what that means in the context of
their statistic.
5. In the Inspector, go back to the “Collect Measures” tab. Turn off the animation, and choose
“replace existing cases.” Then set the number of measures to 120. Collect the measures.
[A collection of 120 measures is a nice size to work with, but you might want to try larger
collections of 300 or 500 measures, perhaps depending on how fast your technology works.
Experiment with the number of measures and decide which size makes the most sense for your
students to work with.]
(d) What does your dotplot represent?
This is a display of 120 dots. Each dot represents one value of the statistic evaluated from one
sample of 200 cases. It is important to emphasize that this is an approximation of a sampling
distribution.
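The same idea can be sketched in Python for a teacher demonstration without Fathom: generate 120 samples of 200 fair coin tosses and record one value of the statistic per sample. This is only an illustrative sketch. The function names, the choice of longest streak as the statistic, and the seed value are assumptions, not part of the Fathom files.

```python
import random


def longest_streak(tosses):
    """Length of the longest run of identical consecutive tosses."""
    best = current = 1
    for previous, following in zip(tosses, tosses[1:]):
        current = current + 1 if following == previous else 1
        best = max(best, current)
    return best


def collect_measures(statistic, n_samples=120, n_tosses=200, seed=None):
    """Evaluate `statistic` on n_samples simulated samples of fair coin tosses."""
    rng = random.Random(seed)  # a seed makes a classroom demo repeatable
    measures = []
    for _ in range(n_samples):
        sample = [rng.choice(["Head", "Tail"]) for _ in range(n_tosses)]
        measures.append(statistic(sample))  # one dot on the dotplot
    return measures


measures = collect_measures(longest_streak, seed=2010)
print(len(measures), min(measures), max(measures))
```

Each entry of measures plays the role of one dot: a single value of the statistic computed from one whole sample of 200 tosses.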
(e) Describe the dotplot.
Answers will vary, but should include a discussion of shape, center, and spread. Pay particular
attention to the language used. Students should be talking about the distribution of their statistic
(e.g. looking at all the proportions collected from the 120 samples, the center of the graph
represents the mean of these proportions). Students could recognize that their sampling
distribution is not necessarily approximately normal.
(f) What values would you consider likely on your dotplot? What values would you consider
not as likely to occur? Why?
Students don’t have to talk about specific percentages (although they could). This question is
meant to get students to start making a judgment about which values are and are not as likely to
occur.
6. Right-click on the graph. Choose “Duplicate Graph.” Move the second graph so that both are
visible. Then turn the second graph into a histogram. Readjust the histogram if desired.
(g) Use the histogram to fill out a table with the following headings. Hint: To find Frequency,
if you highlight a bin on your histogram, you can either count the number of red dots or
look in the lower left hand corner of the Fathom screen to see the frequency and bin
width. You must compute the values for the other three columns by hand. (Think about
it! What should be the last value in the Cumulative Frequency column? In the
Cumulative Percent column?)
Test Statistic Values (Name it!) | Frequency | Cumulative Frequency | Percent | Cumulative Percent
The values depend on the students’ histograms. They can use the bin width values to fill in the
first column. The Frequency values also come from Fathom. Then students will add up the
frequencies to get the cumulative frequencies. The percent column is their frequency column
divided by 120 (not 100!!). They will add up the percent column to get the cumulative percent
column. The size of the table will depend on how students adjust their histogram’s bin width.
Students should see that the last value of the Cumulative Frequency column must be 120 and
the last value of the Cumulative Percent column must be 100%.
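As a check on students' hand calculations, the table can be sketched in Python. The 120 measures below are made-up illustrative values for a longest-streak statistic, not results from any class or from the Fathom files, and the sketch uses one row per distinct value rather than Fathom's histogram bins.

```python
from collections import Counter

# Hypothetical measures: value of the statistic -> how many of the 120 samples gave it.
counts = {5: 18, 6: 36, 7: 30, 8: 24, 9: 8, 10: 4}
measures = [value for value, freq in counts.items() for _ in range(freq)]


def cumulative_table(measures):
    """Rows of (value, frequency, cumulative frequency, percent, cumulative percent)."""
    tally = Counter(measures)
    total = len(measures)  # 120 here, so percents divide by 120, not 100
    rows = []
    running = 0
    for value in sorted(tally):
        freq = tally[value]
        running += freq
        rows.append((value, freq, running, 100 * freq / total, 100 * running / total))
    return rows


rows = cumulative_table(measures)
for value, freq, cum_freq, pct, cum_pct in rows:
    print(f"{value:>5} {freq:>5} {cum_freq:>5} {pct:>7.2f} {cum_pct:>7.2f}")
```

The last row always ends with a cumulative frequency equal to the number of measures and a cumulative percent of 100.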
(h) Recall that you have made this sampling distribution to help you determine whether a
sample of 200 coin tosses is truly random. Based on your histogram, what kind of values
for your statistic would you have to get to make the decision that a sample of coin tosses
is truly random? What kind of values for your statistic would make you decide that a
sample of coin tosses was not random?
Students will be looking at their distributions. They should provide a reasonable range of values
for both questions and understand that they are making a decision as to the values that they
consider reasonable.
(i) What is the smallest percentage likelihood of getting a value for your statistic that would
make you decide that the sample of coin tosses was not random? What is the largest
percentage likelihood of getting a value for your statistic that would make you decide that
the sample of coin tosses was not random? Describe how you came up with each one of
these values.
Students should recognize the values nearer to the ends of the distributions are less likely.
[Note that students may choose a value for one end of their distribution outside the range of
their observed values; that is, they might say “17 or larger” when the largest value they observed
in their distribution was only 16. Be prepared to discuss what this means for their situation. (For
example, it could mean that they will be fooled more often than otherwise by non-random
sequences.)] Students should look at both ends of their distributions. They should turn each of
their cutoff values from each end of their distribution into a percentage. Students should be very
specific as to where each of their percentages comes from. (Again, it is possible that a student
might have a percentage of 0% or of 100%, depending on their distribution and their decision of
non-random values.)
(j) Relate the percentages from (i) to the values in the Cumulative Percent column of your
table. What do these percentages tell you about the information the Cumulative Percent
column provides?
Students should recognize that their answers to (i) are actually point values, and that (i) or (100
– (i)) are the values they want to relate to on their cumulative percent column. They shouldn’t be
concerned about getting specific values (such as 5% (or 95%)), but they should have relatively
small cumulative percentage values for each end of their distribution.
Students should be comfortable with understanding the cumulative percentages on the ends of
the distribution. They should see these percentages are for the less likely values of their test
statistic.
[Note: It could be interesting to discuss the symmetry of these percentages on the ends of the
distributions. Since these simulated distributions approximate sampling distributions, they could
be approximately symmetric (depending on how many measures were collected and what
summary statistic was chosen). An interesting discussion might be, “Should these percentages
be similar? Why or why not?” However, you may want to have this discussion at another time,
depending on your circumstances.]
7. Right-click on your dotplot and choose “Plot Value”. Type percentile([value],[summary
statistic name]). “[value]” is the percentage you found in (i), entered without the percent symbol,
that corresponds to the largest percentage in (j). “[summary statistic name]” is whatever name you
gave your summary statistic when you collected measures. Notice that the line representing the
value of that percentage is drawn onto the dotplot.
Repeat this procedure to draw the line representing the same percentage for the other end of
your distribution.
Do the same for your histogram.
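To show what a percentile cutoff means, here is a small Python sketch using a nearest-rank rule. This is an assumption: Fathom's percentile function may interpolate between values and so give slightly different cutoffs. The measures are made-up illustrative values, and the function name is hypothetical.

```python
import math

# Hypothetical measures: value of the statistic -> how many of the 120 samples gave it.
counts = {5: 18, 6: 36, 7: 30, 8: 24, 9: 8, 10: 4}
measures = [value for value, freq in counts.items() for _ in range(freq)]


def percentile_nearest_rank(p, values):
    """Smallest observed value with at least p percent of the values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))  # 1-based position in the sorted list
    return ordered[rank - 1]


low_cutoff = percentile_nearest_rank(5, measures)
high_cutoff = percentile_nearest_rank(95, measures)
print(low_cutoff, high_cutoff)  # 5 9
```

With these made-up values, the lower line would be drawn at 5 and the upper line at 9.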
8. Now look back at the three samples we made at the beginning of the activity (your sample of
0’s and 1’s and the two samples from the homework assignment). You already computed the
test statistic for each one (on Student Handout #1). Plot the value of your test statistic from your
sample of 0’s and 1’s on your dotplot and histogram. You can right-click on the graph and
choose “Plot Value” and type the number into the formula editor.
(k) Look at your dotplot closely. Count the number of dots on the line you just plotted and all
dots that are as or more unlikely. What percent of the dots is this? Statisticians call this
percentage a p-value.
For example, if a student counted 12 dots, then their percent (p-value) is 12/120 = 0.10 or 10%.
This assumes a one-sided test.
For a two-sided test, you will need the student to additionally find the value that is symmetric to
their test statistic across the mean of the simulated sampling distribution and repeat the above
process (“Plot Value” the symmetric value and count the number of dots on that line and all the
dots that are as or more unlikely for it.) The total of the dots on the two lines and the more
unlikely dots from both lines divided by 120 (the total number of dots) is the p-value for the two-sided test.
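The dot-counting procedure can be sketched in Python as follows. The measures are made-up illustrative values chosen so that the one-sided count matches the 12-dot example above; the function names and the observed value of 9 are assumptions for illustration only.

```python
# Hypothetical measures: value of the statistic -> how many of the 120 samples gave it.
counts = {5: 18, 6: 36, 7: 30, 8: 24, 9: 8, 10: 4}
measures = [value for value, freq in counts.items() for _ in range(freq)]


def one_sided_p_value(measures, observed, tail="upper"):
    """Fraction of dots on the plotted line or farther out in the chosen tail."""
    if tail == "upper":
        extreme = sum(1 for m in measures if m >= observed)
    else:
        extreme = sum(1 for m in measures if m <= observed)
    return extreme / len(measures)


def two_sided_p_value(measures, observed):
    """Fraction of dots at least as far from the mean as the observed value, on either side."""
    center = sum(measures) / len(measures)
    distance = abs(observed - center)
    extreme = sum(1 for m in measures if abs(m - center) >= distance)
    return extreme / len(measures)


observed = 9  # hypothetical test statistic from a student's sequence
print(one_sided_p_value(measures, observed))  # 0.1
print(two_sided_p_value(measures, observed))  # 0.1
```

In this skewed example the value symmetric to 9 across the mean falls below every observed dot, so the lower tail adds nothing and the two p-values agree. With a more symmetric distribution the two-sided value would be larger.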
(l) Write a sentence explaining what your p-value represents. What does this percentage
tell you about the test statistic generated from your sample?
My p-value of 0.10 represents the percent of values in the sampling distribution that are at my
test statistic or more unlikely than my test statistic.
(m) Recall that you have made this sampling distribution to help you determine whether a
sample of 200 coin tosses is truly random. Using your p-value, decide whether the
sample of 0’s and 1’s you made is truly random. Also use your graph to justify your
decision.
For example, if the student had said that they would decide a sample was not random if the
statistic was in the top 5% of values in their sampling distribution, then, since 10% of the values
in the sampling distribution were at or above the test statistic for their sample, the student
would decide the sample was truly random.
Students should be uncomfortable because they know that the sample wasn’t random. But this
will lead them to make decisions about the reliability of their statistic, not just the methods used
here.
9. Repeat step 8 and answer (k) through (m) for both homework samples as well.
10. Did your choice for a summary statistic help you determine if a sample was random? What
is important when considering the summary statistic to use in making decisions about
randomness? Compare your statistic with those of your classmates. How could you improve or
revise your summary statistic to provide a more reliable test of randomness?
Part 3: Wrap-Up Discussions, Further Extensions
Wrap-Up Discussion Suggestions:
- Students can share their results.
- Are you convinced that any of these summary statistics are better than the others for
determining randomness? Do you have other ideas for summary statistics now that
you’ve seen these results?
- As a class, students could determine what characteristics the ‘best’ statistics for
determining randomness would have.
- Students should recognize that sometimes, just by chance, they will make incorrect
conclusions. Sometimes they will conclude ‘non-random’ when the sequence was, in fact,
random. They can determine this quantity from the area (percentage of dots) under their
simulated sampling distribution where this would occur (the tails of the distribution that
are not likely but happen occasionally anyway). Also, sometimes they will conclude
‘random’ when, in fact, the sequence was non-random. This quantity cannot be
determined from the simulated sampling distributions.
- Have different students describe or define the terms “statistic,” “summary statistic,” “test
statistic,” “sampling distribution,” and “p-value.”
- Though this is not the central learning focus of this activity, a discussion of what is truly
“random” or “randomness” is always interesting.
A follow-up homework assignment is also included.
Further Extensions:
- Students can examine free response question #6 from the 2010 AP Statistics exam and
free response question #6 from the 2009 AP Statistics exam. The 2010 exam question
deals with the idea of gathering a statistic, and finding a p-value from a histogram or
cumulative frequency/percent table. The 2009 exam question deals with the idea of
reasoning through the expected behavior of an unfamiliar test statistic.
Appendix A: Formulas for Different Possible Test Statistics
We expect students to come up with the following test statistics. Others are possible, and may
require more creative formula-writing on the part of the student. The measure names given here
are only suggestions, and students should create as much of this on their own as possible.
Length of longest streak
LongestStreak = max(Streak)
or
LongestStreak = max(StreakLength)
Number of streaks
NumberStreaks = count(Streak = 1)
or
NumberStreaks = count(StreakLength)
Mean length of streak
MeanStreak = mean(StreakLength)
Median length of streak
MedianStreak = median(StreakLength)
Proportion of H’s (or T’s)
ProportionH = proportion(CoinToss="Head")
Number of switches (how many times the sequence changes from H to T (or T to H))
NumberSwitch = count(Streak = 1) - 1
Proportion of streaks 3 or more long (use StreakLength>3 instead for streaks strictly longer than 3)
Proportion3orMore = proportion(StreakLength>2)
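For teachers who want to verify a student's hand calculation outside Fathom, the statistics above can be sketched in Python. These are not the Fathom formulas. The short sequence is made up for illustration, and the last statistic divides by the number of streaks, which is an assumption about the intended denominator.

```python
from itertools import groupby
from statistics import mean, median


def run_lengths(tosses):
    """Length of each streak, in order (the non-missing StreakLength values)."""
    return [len(list(group)) for _, group in groupby(tosses)]


tosses = list("HHHTHHTTTT")  # hypothetical short sequence: H = Head, T = Tail
runs = run_lengths(tosses)   # [3, 1, 2, 4]

stats = {
    "LongestStreak": max(runs),                                  # 4
    "NumberStreaks": len(runs),                                  # 4
    "MeanStreak": mean(runs),                                    # 2.5
    "MedianStreak": median(runs),                                # 2.5
    "ProportionH": tosses.count("H") / len(tosses),              # 0.5
    "NumberSwitch": len(runs) - 1,                               # 3
    "Proportion3orMore": sum(r >= 3 for r in runs) / len(runs),  # 0.5 (denominator: streaks)
}
for name, value in stats.items():
    print(name, value)
```

Passing any of these calculations as the statistic in a simulation of many 200-toss samples gives the corresponding approximate sampling distribution.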