Math 58B - Introduction to Biostatistics
Spring 2015
Jo Hardin

Lab Assignment 2

Lab Goals:
1. To understand why we might need an interval estimate (as opposed to a point estimate).
2. To understand the connection between interval estimates and hypothesis testing.
3. To understand how confidence intervals vary from sample to sample, just as statistics vary from sample to sample.

In class

During the lab, go through sections (a) through (f) of Investigation 1.5. Make sure to create both 95% and 99% intervals using both the applet and R (iscambinomtest).

To turn in

1. Suppose X is the random variable that counts the number of kissing couples who turn right in a sample of size 124. What assumptions do you have to make in order for X to have a binomial distribution (report all 4, even the obvious ones)? Do you believe all of those assumptions? Comment on any assumptions that might not hold.

2. What is the population parameter of interest? What is its symbol? What is your point estimate of it? (A point estimate is an estimate of the parameter using a single number, not an interval.)

3. Report your 95% and 99% confidence intervals using both the applet and R. Describe in words what these intervals mean; it may help to recall how you computed them.

Now suppose that you repeat the experiment (observe 124 kissing couples and record how many of them lean right) 10,000 times. Then, for each of the 10,000 datasets, you compute a 95% confidence interval.

4. Do you expect to get the same confidence interval for each of these 10,000 datasets? Why or why not?

5. Suppose pi = 0.77 is the ACTUAL TRUE parameter value in the population. How many of the 10,000 confidence intervals that you computed would you expect to contain 0.77? Explain.

6. Continue to suppose pi = 0.77 is the true population parameter value. Now, for each of the 10,000 datasets, you conduct a hypothesis test with H0: pi = 0.77 and Ha: pi != 0.77 at an alpha = 0.05 significance level; that is, you plan to reject the null hypothesis whenever your p-value is less than 0.05. How many times do you expect to (erroneously) reject the (true) null hypothesis? Explain.

7. How do the datasets (the 10,000 samples) connect in questions #5 and #6 above?

8. When, in scientific research, would we want to create a confidence interval instead of performing a hypothesis test? When would we prefer to perform a hypothesis test instead of creating a confidence interval?

9. NOT DUE (just to think about!!) With a sample of 124 kissing couples, does the Central Limit Theorem predict that the normal probability distribution will be a reasonable model for the distribution of the sample proportion? (You need calculations here but no R or applet; see part (a) in Investigation 1.10, page 70.)
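
The repeated-sampling setup in questions 4 through 7 can also be explored by simulation. The following is a minimal sketch, not the method used by the applet or by iscambinomtest. It rests on several assumptions: it is written in Python (standard library only) rather than R, it uses the Wilson score interval and the matching one-proportion z-test in place of the binomial-based calculations from lab, and the function names (wilson_interval, score_test_rejects, simulate) and the seed are made up for illustration. The values n = 124, pi = 0.77, and 10,000 repetitions come from the questions above.

```python
import math
import random
from statistics import NormalDist


def wilson_interval(x, n, conf_level=0.95):
    """Wilson (score) interval for a proportion.

    Assumption: this stands in for the interval from the applet or R.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def score_test_rejects(x, n, pi_null, alpha=0.05):
    """Two-sided z-test of H0: pi = pi_null; True means reject H0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_stat = (x / n - pi_null) / math.sqrt(pi_null * (1 - pi_null) / n)
    return abs(z_stat) > z_crit


def simulate(n_reps=10_000, n=124, pi_true=0.77, seed=58):
    """Repeat the study n_reps times.

    Returns (number of intervals containing pi_true,
             number of tests rejecting H0: pi = pi_true).
    """
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    covered = 0
    rejected = 0
    for _ in range(n_reps):
        # One dataset: n independent couples, each leaning right with prob. pi_true
        x = sum(rng.random() < pi_true for _ in range(n))
        lo, hi = wilson_interval(x, n)
        if lo <= pi_true <= hi:
            covered += 1
        if score_test_rejects(x, n, pi_true):
            rejected += 1
    return covered, rejected


if __name__ == "__main__":
    covered, rejected = simulate()
    print("intervals containing 0.77:", covered)
    print("tests rejecting H0:       ", rejected)
```

Running the script prints two counts that can be compared with your answers to questions 5 and 6. Both counts are computed from the same 10,000 simulated datasets, which is the connection question 7 asks about. Changing conf_level or alpha shows how the counts move together.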