Bootstrap Confidence Intervals - Sharon J. Lane

+
Using StatCrunch to Teach Statistics Using
Resampling Techniques
Webster West
Texas A&M University
+
Background

George Cobb sparked interest in using resampling
methods for introductory statistics with his plenary talk at the
First USCOTS in 2007.

Several groups are now working on integrating these
approaches into the curriculum.

I have added numerous resampling procedures to
StatCrunch, which is widely used in teaching introductory
statistics.

Roger Woodard and I have our INCIST NSF grant to develop
teaching materials which incorporate these methods.

We have conducted numerous workshops around the country
where we have presented these materials to statistics
teachers.
+
A Randomization Activity

Students were randomly assigned to a version of an exam
(Yellow or Green) when they entered the classroom.

Afterwards, both sets of students complained about the exam
saying their version was harder.

Students investigate the possibility that the observed
difference between means of 6.3 might occur due to random
chance.

They shuffle cards with scores written on them into yellow
and green groups and then calculate the difference between
the two means.

They place a Post-it note on a whiteboard in the proper
location and evaluate the resulting randomization
distribution.
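
The card-shuffling steps above can be sketched in code. This is a minimal illustration, not the StatCrunch applet: the exam scores are invented, and the function name `randomization_test`, the number of shuffles, and the two-sided p-value are assumptions made for the example.

```python
import random
from statistics import mean


def randomization_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided randomization test for a difference between two means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # shuffle the "cards" and deal two new piles
        diffs.append(mean(pooled[:n_a]) - mean(pooled[n_a:]))
    # Share of shuffles at least as extreme as the observed difference
    extreme = sum(abs(d) >= abs(observed) for d in diffs)
    return observed, diffs, extreme / n_shuffles


# Hypothetical exam scores, invented purely for illustration
yellow = [78, 85, 69, 91, 74, 88, 80, 72]
green = [70, 65, 83, 77, 62, 71, 79, 68]

observed, diffs, p_value = randomization_test(yellow, green)
print(f"observed difference = {observed:.2f}, p-value = {p_value:.3f}")
```

Each entry of `diffs` plays the role of one Post-it note on the whiteboard, and the p-value is the fraction of notes at least as far from zero as the observed difference.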
+
The randomization applet
+
A Sampling Distribution Activity

A very inconvenient printed roster of 12,000 students at a
fictitious university is provided to the students.

They build a sampling distribution by each collecting a
random sample of size 30 and reporting the mean number of
Facebook friends for their sample via a StatCrunch survey.

From the sampling distribution, students see the normal
curve is a good descriptor of the sampling variability of the
sample mean.

This leads to the CLT and the idea of statistic ± 2×standard
error as a 95% confidence interval for an unknown
population mean.
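
Written out, the rule of thumb on this slide looks as follows. Here \bar{x} is the sample mean, \sigma the population standard deviation, s the sample standard deviation, and n the sample size (30 in the activity). Replacing \sigma with s is the usual textbook step and is an assumption here, not something stated on the slide.

```latex
\bar{x} \pm 2 \cdot \mathrm{SE}(\bar{x}),
\qquad
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}}
```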
+
A Sampling Distribution Activity
Mary’s Sample Mean
+
A Sampling Distribution Activity

The instructor then uses their access to the data in electronic
form within StatCrunch to compute 1000 sample means with
each sample mean based on a sample of 30 students.

This is like doing the activity with 1000 students instead of
10.
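
A rough sketch of what the instructor's step might look like outside StatCrunch. The population below is simulated from a right-skewed distribution because the real roster is not available, so its shape, the seed, and the helper `sample_mean` are all assumptions for illustration.

```python
import random
from statistics import mean, pstdev, stdev

rng = random.Random(2024)

# Hypothetical population: friend counts for 12,000 students, drawn from
# a right-skewed distribution purely for illustration.
population = [round(rng.expovariate(1 / 300)) for _ in range(12_000)]


def sample_mean(pop, n, rng):
    """Mean of one simple random sample of size n (without replacement)."""
    return mean(rng.sample(pop, n))


# 1000 sample means, each based on a sample of 30 students
sample_means = [sample_mean(population, 30, rng) for _ in range(1_000)]

print(f"population mean      = {mean(population):.1f}")
print(f"mean of sample means = {mean(sample_means):.1f}")
print(f"SD of sample means   = {stdev(sample_means):.1f}")
print(f"sigma / sqrt(30)     = {pstdev(population) / 30 ** 0.5:.1f}")
```

The last two printed values should be close to each other, which is the standard error idea the activity builds toward.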
+
A Bootstrapping Activity

Building off the sampling distribution activity, students are
then tasked with estimating the standard error of the sample
mean using a single sample.

Using the sample as a proxy for the population, each student
draws a resample of 30 values, taken with replacement, from
the common sample data.

Each student reports the mean of their bootstrap sample.

The student results are then augmented with applet results to
compute a 95% confidence interval for the population mean.
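
The bootstrapping steps above can be sketched as follows. The sample values are invented, and the slides do not say which interval the applet reports, so the sketch shows two common choices (mean ± 2 × bootstrap standard error, and the percentile interval) as assumptions.

```python
import random
from statistics import mean, quantiles, stdev


def bootstrap_means(sample, n_resamples=1_000, seed=0):
    """Means of resamples drawn with replacement, each the size of the sample."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


# Hypothetical sample of 30 friend counts, invented for illustration
sample = [120, 340, 85, 410, 230, 560, 95, 310, 275, 150,
          480, 60, 205, 330, 720, 180, 390, 45, 260, 515,
          140, 295, 370, 110, 640, 225, 190, 430, 75, 350]

boot = bootstrap_means(sample)
x_bar = mean(sample)
se = stdev(boot)  # bootstrap estimate of the standard error of the mean

# Option 1: statistic ± 2 × standard error
lower, upper = x_bar - 2 * se, x_bar + 2 * se

# Option 2: percentile interval (2.5th and 97.5th percentiles)
cuts = quantiles(boot, n=40)  # 39 cut points in steps of 2.5%
pct_lower, pct_upper = cuts[0], cuts[-1]

print(f"bootstrap SE        = {se:.1f}")
print(f"mean ± 2 SE         = ({lower:.1f}, {upper:.1f})")
print(f"percentile interval = ({pct_lower:.1f}, {pct_upper:.1f})")
```

Either interval is an interval for the unknown population mean, not for the sample mean, which is already known.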
+
The bootstrapping applet
+
What we have learned about the
randomization approach

This approach appeals nicely to a basic intuition that most
people have about the problem.

The tactile simulation adds a great deal of value in terms of
students' understanding of what is taking place in the applet.

It can be introduced with little or no background required in
terms of other statistical concepts such as normal theory or
even the jargon of hypothesis testing.

It can be easily used at a variety of time points in the
standard introductory course.

Students and instructors seem to really take to it!
+
What we have learned about the
bootstrapping approach

The bootstrap can be used to reinforce this idea of a sampling
distribution.

The bootstrap approach requires a great deal of backstory
before it can be effectively introduced.

It is probably best to rely on technology alone for the bootstrap
after the student does the sampling activity in a more tactile
way.

Students seem to like it, but instructors not so much!

The bootstrap may also lead to possible misconceptions on the
part of students: “Am I getting a confidence interval for the
sample mean?”
+
For discussion

With these approaches, two people can get different results
even when using the same data set.

Taking a simple random sample from a large list of values is
difficult to do in a tactile way. Must we always rely on help
from the computer?

How can we make one-sample problems more interesting to
students?

Students are interested in samples but not interested in
population parameters! Why do we care about the mean
number of Facebook friends?

Are we too focused on inference in the introductory course?
How often will the average student ever be confronted with a
random sample?
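
On the first discussion point, a small sketch of why two people can get different results from the same data, and how a fixed random seed makes a resampling result repeatable. The data, the function name `bootstrap_interval`, and the choice of a percentile interval are assumptions for illustration.

```python
import random
from statistics import mean, quantiles


def bootstrap_interval(sample, n_resamples=1_000, seed=None):
    """95% percentile bootstrap interval for the mean."""
    rng = random.Random(seed)
    n = len(sample)
    means = [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]
    cuts = quantiles(means, n=40)  # cut points in steps of 2.5%
    return cuts[0], cuts[-1]


# Hypothetical sample of 30 friend counts, invented for illustration
sample = [120, 340, 85, 410, 230, 560, 95, 310, 275, 150,
          480, 60, 205, 330, 720, 180, 390, 45, 260, 515,
          140, 295, 370, 110, 640, 225, 190, 430, 75, 350]

person_a = bootstrap_interval(sample, seed=1)
person_b = bootstrap_interval(sample, seed=2)  # typically differs slightly
person_a_again = bootstrap_interval(sample, seed=1)  # same seed, same result

print("person A:      ", person_a)
print("person B:      ", person_b)
print("person A again:", person_a_again)
```

The differences between seeds shrink as the number of resamples grows, so they reflect simulation noise rather than disagreement about the data.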