Lab Assignment 2

Math 58B - Introduction to Biostatistics
Spring 2015
Jo Hardin
Lab Goals:
1. To understand why we might need an interval estimate (as opposed to a point
   estimate).
2. To understand the connection between interval estimates and hypothesis testing.
3. To understand how confidence intervals vary from sample to sample, just like
   statistics vary from sample to sample.
In class
During the lab, go through sections (a) through (f) of Investigation 1.5. Make sure to create
both 95% and 99% intervals using both the applet and R (iscambinomtest).
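For intuition about where such an interval can come from, here is a minimal Python sketch. It is not the iscambinomtest function or the applet, and their methods may differ. It builds an equal-tailed exact binomial interval by collecting every candidate value of pi that a one-sided binomial test at level alpha/2 would fail to reject in either direction. The helper names (binom_cdf, bisect_root, exact_interval) are made up for this illustration, and the count x = 80 is a hypothetical value; substitute the count from Investigation 1.5.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) when X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_root(f, iters=60):
    """Root on [0, 1] of a function f that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(x, n, conf_level=0.95):
    """Equal-tailed exact interval: the values of pi not rejected by
    either one-sided binomial test at level alpha/2."""
    alpha = 1 - conf_level
    # Lower endpoint: P(X >= x | p) equals alpha/2 (this tail increases with p).
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper endpoint: P(X <= x | p) equals alpha/2 (this tail decreases with p).
    upper = 1.0 if x == n else bisect_root(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper

x, n = 80, 124  # x is a hypothetical count; n = 124 couples as in the lab
for level in (0.95, 0.99):
    lo, hi = exact_interval(x, n, level)
    print(f"{level:.0%} interval: ({lo:.4f}, {hi:.4f})")
```

Because the 99% interval must keep every value the 95% interval keeps, plus values rejected only at the stricter level, it always comes out at least as wide.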
To turn in
1. Suppose X is the random variable that counts the number of kissing couples who turn
   right in a sample of size 124. What assumptions do you have to make in order for X to
   have a binomial distribution (report all 4, even the obvious ones)? Do you believe all of
   those assumptions? Comment on any assumptions that might not hold.
2. What is the population parameter of interest? What is its symbol? What is your point
   estimate of it? (A point estimate is an estimate of the parameter using a single number,
   not an interval.)
3. Report your 95% and 99% confidence intervals using both the applet and R. Describe
   in words what these intervals mean; it may help to recall how you computed them.
Now suppose that you repeat the experiment (observe 124 kissing couples
and record how many of them lean right) 10,000 times. Then for each of
the 10,000 datasets, you compute a 95% confidence interval.
4. Do you expect to get the same confidence interval for each of these 10,000 datasets?
   Why or why not?
5. Suppose pi = 0.77 is the ACTUAL TRUE parameter value in the population. How many of
   the 10,000 confidence intervals that you computed would you expect to contain 0.77?
   Explain.
6. Continue to suppose pi = 0.77 is the true population parameter value. Now, for each of
   the 10,000 datasets, you conduct a hypothesis test with H0: pi = 0.77 and Ha: pi != 0.77
   at an alpha = 0.05 significance level; that is, you plan to reject the null hypothesis
   whenever your p-value is less than 0.05. How many times do you expect to
   (erroneously) reject the (true) null hypothesis? Explain.
7. How do the datasets (the 10,000 samples) connect in questions #5 and #6 above?
8. When, in scientific research, would we want to create a confidence interval instead of
   performing a hypothesis test? When would we prefer to perform a hypothesis test
   instead of creating a confidence interval?
9. NOT DUE (just to think about!!) With a sample of 124 kissing couples, does the
   Central Limit Theorem predict that the normal probability distribution will be a
   reasonable model for the distribution of the sample proportion? (You need calculations
   here but no R or applet; see part (a) in Investigation 1.10, page 70.)
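The repeated-sampling thought experiment (124 couples, 10,000 repetitions, pi = 0.77) can also be tried on a computer. The Python sketch below is one possible setup, not the method used by the applet or by iscambinomtest. It assumes an equal-tailed exact binomial interval and the matching equal-tailed exact test; the function names and the random seed are arbitrary choices made for this illustration. It prints how many intervals contain 0.77 and how many tests reject H0: pi = 0.77. No expected output is shown here, so that you can predict the counts before running it.

```python
import random
from functools import lru_cache
from math import comb

N, PI_TRUE, ALPHA, REPS = 124, 0.77, 0.05, 10_000

def binom_cdf(k, n, p):
    """P(X <= k) when X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_root(f, iters=60):
    """Root on [0, 1] of a function f that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

@lru_cache(maxsize=None)  # only 125 distinct counts are possible, so cache by x
def interval(x):
    """Equal-tailed exact (1 - ALPHA) confidence interval for the observed count x."""
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1 - binom_cdf(x - 1, N, p)) - ALPHA / 2)
    upper = 1.0 if x == N else bisect_root(
        lambda p: ALPHA / 2 - binom_cdf(x, N, p))
    return lower, upper

@lru_cache(maxsize=None)
def rejects(x):
    """Equal-tailed exact test of H0: pi = PI_TRUE; reject if either tail < ALPHA/2."""
    upper_tail = 1 - binom_cdf(x - 1, N, PI_TRUE)  # P(X >= x) under H0
    lower_tail = binom_cdf(x, N, PI_TRUE)          # P(X <= x) under H0
    return min(upper_tail, lower_tail) < ALPHA / 2

rng = random.Random(58)  # arbitrary seed, fixed so the run is reproducible
covered = rejected = 0
for _ in range(REPS):
    # One simulated dataset: 124 couples, each turning right with probability PI_TRUE.
    x = sum(rng.random() < PI_TRUE for _ in range(N))
    lo, hi = interval(x)
    covered += lo <= PI_TRUE <= hi
    rejected += rejects(x)

print(f"intervals containing {PI_TRUE}: {covered} of {REPS}")
print(f"tests rejecting H0: {rejected} of {REPS}")
```

With this pairing of interval and test, a dataset whose interval misses 0.77 is exactly a dataset whose test rejects H0, so the two counts add up to 10,000. A different interval method (for example, a normal-approximation interval) would not necessarily match this test dataset by dataset.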