biol.283.s2013.lab5.exercise

BIOL 283 Lab 5: Confidence Intervals
Lab Objectives:
1. Review frequency distributions
2. Practice making confidence intervals for population means
3. Practice making confidence intervals for differences in
population means
4. Get a feel for the R language
5. Develop a resourceful attitude
Background: A species of bear, Ursus gumbis, is found in both North America and Europe. It is
believed that the North American bears grow larger than the European bears (although the
reason for this phenomenon is unclear). It is also believed that they have different distributions
of color morphs.
You and your team of researchers will sample bears from the Kroger National Forest of Ohio
and the Black Forest of Germany to (1) determine their color and (2) measure their mass (you
might try to imagine what this would really involve). In this and subsequent labs, you will
perform inferential statistical tests to determine if current conjectures are substantiated. Make
sure to keep these data handy for the future!
Part I. Defining two populations.
Open up one bag of gummi bears, representing one entire population (from either North
America or Europe). Pour the gummi bears into a plastic cup. One by one, weigh each bear on
the provided scale (make sure that the scale is zeroed), by placing it in the provided plastic dish.
Once measured, record the mass and color in the table provided in the Excel file found on
Blackboard. Place the measured gummi bear in another cup (do not return it to the source
cup). After every bear is measured, repeat with the second population. After both populations
are measured and all data are collected, simulate a mass extinction event. Hint: using extra
cups to sort the bears by color will save time later.
Make sure to save a file with a name that will allow you to find it again. (You might consider
saving it on the desktop during lab. At the end of lab, save it to a jump drive or email it to
yourself as a back-up.) Have one person per group email the file to the instructor before leaving
class.
Part II. Summarizing populations.
The Excel spreadsheet asks you to create frequency and relative frequency tables, as well as
find some summary statistics for the populations. In R, create variables for color and mass for
each population, and make sure that you can re-create the same information in tabular or
graphical form. You may need to refer to previous labs or the R help files to find the
appropriate functions.
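One possible way to build these variables and summaries is sketched below. The values are hypothetical placeholders, not real measurements, and the name color.American.bears is an assumption chosen to match the mass.American.bears naming used later in this lab. Replace the vectors with the data from your spreadsheet.

```r
# Hypothetical placeholder data; replace with the values you recorded.
color.American.bears <- c("red", "green", "red", "yellow",
                          "orange", "red", "clear", "green")
mass.American.bears  <- c(2.31, 2.28, 2.40, 2.35,
                          2.22, 2.37, 2.30, 2.26)   # grams

# Frequency and relative frequency tables for color
color.freq     <- table(color.American.bears)
color.rel.freq <- prop.table(color.freq)
print(color.freq)
print(round(color.rel.freq, 3))

# Summary statistics for mass.
# Note: sd() divides by n - 1, which is the sample formula.
mass.mean <- mean(mass.American.bears)
mass.sd   <- sd(mass.American.bears)
print(c(mean = mass.mean, sd = mass.sd))

# The same information in graphical form
barplot(color.freq, xlab = "Color", ylab = "Frequency")
hist(mass.American.bears, xlab = "Mass (g)", main = "American bears")
```

The same steps would be repeated for the European population.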
How did you create these variables?
Copy and paste tabular or graphical output to the right.
Part III. Creating population confidence intervals
Recall that the way to create a sample from a population in R is to do something like the
following:
a.s.10 = sample(mass.American.bears, 10)
e.s.10 = sample(mass.European.bears, 10)
Explain whether your populations and/or your samples are reasonably normally distributed.
(Hint: you might need to recall the function for normal probability plots.)
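As a reminder of what a normal probability plot looks like in R, here is a minimal sketch using the base functions qqnorm() and qqline(). The population below is simulated with rnorm() purely as a stand-in (its mean and standard deviation are arbitrary assumptions); use your own measured masses instead.

```r
set.seed(283)  # arbitrary seed so the sketch is reproducible

# Simulated stand-in for a measured population (hypothetical values)
mass.American.bears <- rnorm(60, mean = 2.3, sd = 0.1)
a.s.10 <- sample(mass.American.bears, 10)

# Normal probability (Q-Q) plots: points falling close to the
# reference line suggest an approximately normal distribution.
par(mfrow = c(1, 2))
qqnorm(mass.American.bears, main = "Population")
qqline(mass.American.bears)
qqnorm(a.s.10, main = "Sample (n = 10)")
qqline(a.s.10)
par(mfrow = c(1, 1))
```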
Create two samples of mass, one from each bear population, of the same size (you decide the
sample size), and record the statistics in the table below.
Sample Statistics:
          American    European
n:
ȳ:
s:
SE:
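The four sample statistics can be computed along these lines. This is a sketch on simulated stand-in data (the rnorm() parameters are arbitrary assumptions); the standard error of the mean is taken as s divided by the square root of n.

```r
set.seed(283)
# Simulated stand-in for a measured population (hypothetical values)
mass.American.bears <- rnorm(60, mean = 2.3, sd = 0.1)
a.s.10 <- sample(mass.American.bears, 10)

n    <- length(a.s.10)   # sample size
ybar <- mean(a.s.10)     # sample mean
s    <- sd(a.s.10)       # sample standard deviation
SE   <- s / sqrt(n)      # standard error of the mean
print(c(n = n, ybar = ybar, s = s, SE = SE))
```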
Based on the sample size you chose, find the t_0.025 value from a t-table and write it below.
t_0.025:
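If you want to check the value you read from the t-table, R's qt() function returns quantiles of the t distribution. The sketch below assumes a sample size of 10 as an example, so the degrees of freedom are n - 1 = 9. An upper-tail area of 0.025 corresponds to the 0.975 quantile.

```r
n      <- 10                    # example sample size; use your own
df     <- n - 1                 # degrees of freedom for one sample
t.crit <- qt(1 - 0.025, df)     # critical value with 0.025 in the upper tail
print(round(t.crit, 3))         # 2.262 for df = 9
```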
As you now have all the information you need, calculate a 95% confidence interval for each
population mean, and record it in the table below.
                            American    European
LCL = ȳ - t_0.025 × SE:
UCL = ȳ + t_0.025 × SE:
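The lower and upper confidence limits can be computed by hand in R and then cross-checked against the interval reported by t.test(). The data here are simulated stand-ins (hypothetical values), so only the procedure carries over to your own samples.

```r
set.seed(283)
# Simulated stand-in for a measured population (hypothetical values)
mass.American.bears <- rnorm(60, mean = 2.3, sd = 0.1)
a.s.10 <- sample(mass.American.bears, 10)

n      <- length(a.s.10)
ybar   <- mean(a.s.10)
SE     <- sd(a.s.10) / sqrt(n)
t.crit <- qt(0.975, df = n - 1)

LCL <- ybar - t.crit * SE   # lower confidence limit
UCL <- ybar + t.crit * SE   # upper confidence limit
print(c(LCL = LCL, UCL = UCL))

# Cross-check: t.test() reports the same 95% interval
ci.check <- t.test(a.s.10, conf.level = 0.95)$conf.int
print(ci.check)

# Because the whole population was measured, its mean is known
print(mean(mass.American.bears))
```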
Where do the population means lie relative to the confidence intervals? Explain.
Part IV. Varying sample size and α.
Using the accompanying R script, repeat the procedure above multiple times and record the
effects below. You can choose one population or comment on how the choice of population
alters your conclusion. Make sure you read the directions in the R script!
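For orientation only, the kind of repetition involved might look like the sketch below. It is not the accompanying script: ci.mean() is a hypothetical helper written for this illustration, the population is simulated, and the sample sizes and α values are arbitrary choices.

```r
set.seed(283)
# Simulated stand-in for a measured population (hypothetical values)
mass.American.bears <- rnorm(60, mean = 2.3, sd = 0.1)

# Hypothetical helper: t-based confidence interval for a mean
ci.mean <- function(y, alpha = 0.05) {
  n      <- length(y)
  SE     <- sd(y) / sqrt(n)
  t.crit <- qt(1 - alpha / 2, df = n - 1)
  c(LCL = mean(y) - t.crit * SE, UCL = mean(y) + t.crit * SE)
}

# One sample per sample size, then intervals at several alpha levels
for (n in c(5, 10, 30)) {
  y <- sample(mass.American.bears, n)
  for (alpha in c(0.10, 0.05, 0.01)) {
    ci <- ci.mean(y, alpha)
    cat(sprintf("n = %2d  alpha = %.2f  CI = [%.3f, %.3f]  width = %.3f\n",
                n, alpha, ci["LCL"], ci["UCL"], ci["UCL"] - ci["LCL"]))
  }
}
```

Comparing the printed widths across rows is one way to organize your answers to the questions that follow.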
What is the effect of increasing sample size for a constant value of α?
What is the effect of changing α for small sample sizes? For large sample sizes?
What general conclusions can you make about estimation of population means?
Part V. Comparing two population means.
From the Sample Statistics table in Part III and the formula in your textbook or notes, make the
calculations that allow you to fill in the following table.
ȳ_A - ȳ_E:
Pooled SE:
df:
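The sketch below shows one common textbook form of these calculations, assuming the pooled-variance approach: the pooled variance is a weighted average of the two sample variances with n_A + n_E - 2 degrees of freedom. Check it against the formula in your own text or notes, since that is the one you should use. The data are simulated stand-ins.

```r
set.seed(283)
# Simulated stand-ins for the two measured populations (hypothetical values)
mass.American.bears <- rnorm(60, mean = 2.3, sd = 0.1)
mass.European.bears <- rnorm(60, mean = 2.2, sd = 0.1)
a.s.10 <- sample(mass.American.bears, 10)
e.s.10 <- sample(mass.European.bears, 10)

nA <- length(a.s.10)
nE <- length(e.s.10)

diff.means <- mean(a.s.10) - mean(e.s.10)   # difference in sample means
df         <- nA + nE - 2                   # pooled degrees of freedom

# Pooled variance: sample variances weighted by their degrees of freedom
s2.pooled <- ((nA - 1) * var(a.s.10) + (nE - 1) * var(e.s.10)) / df
SE.pooled <- sqrt(s2.pooled * (1 / nA + 1 / nE))

print(c(diff = diff.means, SE.pooled = SE.pooled, df = df))
```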
As you now have all the information you need, calculate a 95% confidence interval for the
difference in population means, and record it in the table below.
LCL = (ȳ_A - ȳ_E) - t_0.025 × SE:
UCL = (ȳ_A - ȳ_E) + t_0.025 × SE:
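A hand calculation of this interval, cross-checked against t.test() with var.equal = TRUE (which requests the pooled-variance version), might look like the following. It assumes the pooled-variance formulas with n_A + n_E - 2 degrees of freedom and uses simulated stand-in data.

```r
set.seed(283)
# Simulated stand-ins for the two measured populations (hypothetical values)
mass.American.bears <- rnorm(60, mean = 2.3, sd = 0.1)
mass.European.bears <- rnorm(60, mean = 2.2, sd = 0.1)
a.s.10 <- sample(mass.American.bears, 10)
e.s.10 <- sample(mass.European.bears, 10)

nA <- length(a.s.10)
nE <- length(e.s.10)
df <- nA + nE - 2

diff.means <- mean(a.s.10) - mean(e.s.10)
s2.pooled  <- ((nA - 1) * var(a.s.10) + (nE - 1) * var(e.s.10)) / df
SE.pooled  <- sqrt(s2.pooled * (1 / nA + 1 / nE))
t.crit     <- qt(0.975, df)

LCL <- diff.means - t.crit * SE.pooled
UCL <- diff.means + t.crit * SE.pooled
print(c(LCL = LCL, UCL = UCL))

# Cross-check against R's pooled-variance two-sample interval
ci.check <- t.test(a.s.10, e.s.10, var.equal = TRUE,
                   conf.level = 0.95)$conf.int
print(ci.check)

# True difference, known here because both populations were fully measured
print(mean(mass.American.bears) - mean(mass.European.bears))
```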
Does this 95% confidence interval contain the true difference between population means? Whether it does or not, what does that suggest about your confidence interval?
Does this 95% confidence interval contain 0? If it did contain 0, what would that say about the two population means?
Repeat the Part V procedure above (using the second accompanying R script), this time also
varying the sample sizes, and answer the questions below.
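As a rough illustration of varying the sample sizes, including making them unequal, consider the sketch below. It is not the accompanying script: ci.diff() is a hypothetical helper, the pooled-variance formula is assumed, and the populations and sample-size pairs are arbitrary stand-ins.

```r
set.seed(283)
# Simulated stand-ins for the two measured populations (hypothetical values)
mass.American.bears <- rnorm(60, mean = 2.3, sd = 0.1)
mass.European.bears <- rnorm(60, mean = 2.2, sd = 0.1)

# Hypothetical helper: pooled-variance CI for a difference in means
ci.diff <- function(yA, yE, alpha = 0.05) {
  nA <- length(yA)
  nE <- length(yE)
  df <- nA + nE - 2
  s2.pooled <- ((nA - 1) * var(yA) + (nE - 1) * var(yE)) / df
  SE     <- sqrt(s2.pooled * (1 / nA + 1 / nE))
  t.crit <- qt(1 - alpha / 2, df)
  d      <- mean(yA) - mean(yE)
  c(LCL = d - t.crit * SE, UCL = d + t.crit * SE)
}

# Equal, unequal, and larger sample sizes (arbitrary example choices)
sizes <- list(c(10, 10), c(5, 15), c(30, 30))
for (nn in sizes) {
  yA <- sample(mass.American.bears, nn[1])
  yE <- sample(mass.European.bears, nn[2])
  ci <- ci.diff(yA, yE)
  cat(sprintf("nA = %2d  nE = %2d  CI = [%.3f, %.3f]  width = %.3f\n",
              nn[1], nn[2], ci["LCL"], ci["UCL"], ci["UCL"] - ci["LCL"]))
}
```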
What is the effect of increasing sample size for a constant value of α?
What is the effect of changing the similarity of sample sizes?
What is the effect of changing α for small sample sizes? For large sample sizes?
What general conclusions can you make about estimation of the difference between population means?
CHALLENGE
What would happen if you took two samples of the same size from one population and
calculated a 95% confidence interval for the difference in population means? What is the
expected difference? What do you notice about the confidence interval? If you repeated the
procedure 100 times, how many times does the confidence interval not contain 0? How does
this compare to α?
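One way to explore this in R is a simple loop, sketched here under several assumptions: a simulated stand-in population, samples of size 10, and the pooled-variance interval from t.test() with var.equal = TRUE. The count it prints varies with the random seed, so run it a few times with different seeds before drawing conclusions.

```r
set.seed(283)  # change or remove the seed to see run-to-run variation
# Simulated stand-in for a measured population (hypothetical values)
mass.American.bears <- rnorm(60, mean = 2.3, sd = 0.1)

n.reps <- 100   # number of repetitions
n      <- 10    # size of each sample
misses <- 0     # intervals that do not contain 0

for (i in seq_len(n.reps)) {
  # Two independent samples drawn from the SAME population
  s1 <- sample(mass.American.bears, n)
  s2 <- sample(mass.American.bears, n)
  ci <- t.test(s1, s2, var.equal = TRUE, conf.level = 0.95)$conf.int
  # Count the interval as a miss if 0 lies outside it
  if (ci[1] > 0 || ci[2] < 0) misses <- misses + 1
}

cat("Intervals not containing 0:", misses, "out of", n.reps, "\n")
```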