Discovery Activity for Sampling Material

Name:
Section:
Recitation 11 – Time to Regroup
Purposes for Today’s Recitation
There are three things we need to do. One is clerical, one is backward-looking but absolutely essential,
and the third is to provide a venue in which you can get help for Connections 4. Don’t put off asking for
help on this latest Connections assignment.
Part I – Reading Some Language
Please read the following brief reminder about basic language. We saw this on a content video.
Population
A population is any entire collection of people, animals, plants or things from which we may collect data.
It is the entire group we are interested in, which we wish to describe or draw conclusions about.
In order to make any generalizations about a population, we often study a sample that is, ideally,
representative of the population. For each population there are many possible samples.
A sample statistic gives information about a corresponding population parameter. For example, the
sample mean for a set of data would give information about the overall population mean.
Sample
A sample is a group of units selected from a larger group (the population). By studying the sample, we
hope to draw valid conclusions about the larger group.
A sample is generally selected for study because the population is too large to study in its entirety. The
sample should be representative of the general population. This is often best achieved by random
sampling. Also, before collecting the sample, it is important that the researcher carefully and completely
defines the population, including a description of the members to be included.
Parameter
A parameter is a value, usually unknown (and which therefore has to be estimated), used to represent a
certain population characteristic. For example, the population mean is a parameter that is often used to
indicate the average value of a quantity.
Within a population, a parameter is a fixed value which does not vary. Each sample drawn from the
population has its own value of any statistic that is used to estimate this parameter. For example, the
mean of the data in a sample is used to give information about the overall mean in the population from
which that sample was drawn.
Statistic
A statistic is a quantity that is calculated from a sample of data. It is used to give information about
unknown values in the corresponding population. For example, the average of the data in a sample is
used to give information about the overall average in the population from which that sample was drawn.
It is possible to draw more than one sample from the same population and the value of a statistic will in
general vary from sample to sample. For example, the average value in a sample is a statistic. The
average values in more than one sample, drawn from the same population, will not necessarily be equal.
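To see a parameter stay fixed while a statistic varies, here is a minimal Python sketch. The population
is made up (10,000 simulated values, which you could think of as heights in cm), and the sample size of
100 and the seed are arbitrary choices for illustration, not anything taken from the course data.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up values (say, heights in cm).
population = [rng.gauss(170, 10) for _ in range(10_000)]

# The parameter: one fixed number that describes the whole population.
population_mean = statistics.mean(population)

# Three separate random samples of size 100 drawn from the same population.
sample_means = [statistics.mean(rng.sample(population, 100)) for _ in range(3)]

print(f"parameter (population mean): {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"statistic from sample {i}:   {m:.2f}")
```

Each run of the loop gives a slightly different sample mean, while the population mean never changes.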
Sampling Distribution
The sampling distribution of a sample statistic is the distribution of the values of that statistic
computed from many different (random) samples of the same size from the population; we usually picture
it as a plot. Those statistics don’t just appear in a haphazard way. Indeed, for statistics such as a
sample mean or proportion from reasonably large random samples, the plot is approximately bell-shaped
and peaks above the parameter, but that is not what defines the parameter. Many of you said that on the
second exam. Rather, the sampling distribution is a mathematical tool (sort of like the proof of the
Pythagorean theorem) that helps us understand something that is true, so that we don’t have to
construct it every time to know it.
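If you want to watch a sampling distribution form, the following Python sketch simulates one for a
sample proportion. Everything in it is an assumption made for illustration: the true proportion of
0.30, the sample size of 100, and the 2,000 hypothetical samples. In real life you never get to draw
all these samples; the simulation only shows what the mathematics already guarantees.

```python
import random
import statistics
from collections import Counter

rng = random.Random(1)  # fixed seed so the run is repeatable

TRUE_P = 0.30       # hypothetical parameter: true share of "yes" in the population
N = 100             # size of each sample
NUM_SAMPLES = 2000  # number of hypothetical samples, all taken at the same moment


def count_yes(rng, p, n):
    """Draw one random sample of size n and count the 'yes' answers."""
    return sum(rng.random() < p for _ in range(n))


counts = [count_yes(rng, TRUE_P, N) for _ in range(NUM_SAMPLES)]
p_hats = [c / N for c in counts]  # one statistic per sample

center = statistics.mean(p_hats)
spread = statistics.stdev(p_hats)
theory_spread = (TRUE_P * (1 - TRUE_P) / N) ** 0.5  # standard error of a proportion

print(f"center of the sample proportions: {center:.3f} (parameter is {TRUE_P})")
print(f"spread of the sample proportions: {spread:.3f} (theory: {theory_spread:.3f})")

# Crude text histogram: one row per bin of 4 'yes' counts, one * per 20 samples.
bins = Counter(c // 4 * 4 for c in counts)
for start in sorted(bins):
    print(f"{start / N:.2f}-{(start + 3) / N:.2f} {'*' * (bins[start] // 20)}")
```

The rows of stars should pile up around 0.30 and thin out on both sides, which is the bell shape
described above.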
Here’s the kicker: because you know it is true (it = this predictable bell shape), you know
something about where statistics are likely to be EVEN WHEN YOU ONLY DO ONE SAMPLE. I didn’t say a
sample of size one. I said one sample. It may be a sample of size 100. But you don’t have to do 40 of
those to know this bell shape and to get some idea about what the parameter is, an idea whose quality
you can quantify using the margin of error (MOE). Abstract? Maybe it is, a little. But it is essential
if you want to understand any poll in any publication or outlet anywhere.
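As a rough illustration of getting an MOE from a single sample, here is a short Python sketch. It
assumes the common 95% formula for a proportion, 1.96 times the square root of p-hat(1 - p-hat)/n;
your course materials may use a different version (for example the conservative 1/sqrt(n)), so treat
the formula as an assumption. The sample size of 100 and the sample proportion of 0.40 are made up.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion (assumed formula)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


n = 100        # ONE sample of size 100, not 100 samples
p_hat = 0.40   # hypothetical statistic from that one sample

moe = margin_of_error(p_hat, n)
low, high = p_hat - moe, p_hat + moe

print(f"statistic: {p_hat:.2f}, MOE: {moe:.3f}")
print(f"plausible range for the parameter: {low:.3f} to {high:.3f}")
```

The function never sees more than one sample; the bell shape of the sampling distribution is what
justifies the formula.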
Part II – Do you get it?
1. A recent poll surveyed 500 new mothers in the U.S. and found that only 18% chose to breastfeed
their infants. Identify the likely population, the sample, the statistic, and the parameter. You don’t
have to turn anything in, but answers need to be given in recitation and discussed.
2. Back to Mr. Niles and that open response question on the exam. Why is the concept of a sampling
distribution about different hypothetical samples taken at a single snapshot in time, and why is this
idea muddied and confused if you start talking about different samples taken over time?
Part III – Help with Beyond the Class 4
This assignment tries to capture the skills you honed last recitation (with the FST data) and in class,
and seeks to uncover a more subtle point. In particular, you will see that if we define “accuracy” the
way the FST studies often do, then we can generate “accurate” decision processes that really don’t
make a whole lot of sense, particularly when the concentration of positives in the validation group is
overwhelmingly large.
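To get a feel for the point before you start, here is a small Python sketch. It assumes “accuracy”
means the fraction of all decisions that are correct, which may not match the exact definition in the
FST studies, and the validation group (95 positives, 5 negatives) is invented for illustration.

```python
# Hypothetical validation group: True = truly positive, False = truly negative.
truth = [True] * 95 + [False] * 5


def always_positive(_case):
    """A 'decision process' that ignores the evidence and always says positive."""
    return True


decisions = [always_positive(case) for case in truth]

# Assumed definition: accuracy = correct decisions / all decisions.
accuracy = sum(d == t for d, t in zip(decisions, truth)) / len(truth)

# How often does the rule correctly clear someone who is truly negative?
num_negatives = truth.count(False)
negatives_cleared = sum((not d) and (not t) for d, t in zip(decisions, truth))
specificity = negatives_cleared / num_negatives

print(f"accuracy:    {accuracy:.2f}")
print(f"specificity: {specificity:.2f}")
```

The rule is right 95% of the time without looking at anything, yet it never clears a single negative
case.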