BIOL 933 Fall 2015
Problem Set 1
Topics 1 & 2

Due Thursday, September 17, at the beginning of lecture.

Answer all parts of the questions completely, and clearly document the procedures used in each exercise. Please refer to the "Homework Tips" on the course website. R can generate prolific output; it is your job to wade through it and submit only that which is relevant to the question being asked.

Quantity ≠ Quality

Question 1 (now) Fundamental statistical concepts

The following six samples of 10 observations each were randomly drawn from a population with mean = 40 and variance = 100.

Sample 1   Sample 2   Sample 3   Sample 4   Sample 5   Sample 6
   38         37         42         39         49         39
   20         24         46         45         36         40
   22         27         45         31         33         50
   50         25         40         42         50         30
   46         38         50         21         42         47
   25         58         43         60         42         72
   45         50         45         48         46         60
   40         32         29         26         33         34
   39         57         38         51         44         47
   43         38         51         38         25         34

1.1 What is the mean, standard deviation, coefficient of variation (CV), and standard error of the mean (SE) for each sample? [You can do this with a calculator or with Excel, but I encourage/challenge you to write and use an R script for this!] Show one sample calculation for each of the four requested statistics.
1.2 Add 15 to each value in the data set and determine the effect on the mean, standard deviation, CV, and SE for each sample.
1.3 Now consider the six means as a sample of six values, and calculate the overall mean and standard deviation of this new dataset. Comment on the relationship of the mean of the six means with both the overall mean of the 60 observations and the theoretical (i.e. given) mean for the population. Comment on the relationship of the standard deviation of the six means with the theoretical SE for the population (i.e. the SE based on the given population parameters and the given sample sizes).
1.4 Explain in your own words why the variation among the six sample means is smaller than the average sample variation.
1.5 The following table was derived by subtracting Sample 1 from Sample 2 (S2-S1), Sample 3 from Sample 4 (S4-S3), and Sample 5 from Sample 6 (S6-S5):

S2-S1   S4-S3   S6-S5
  -1      -3     -10
   4      -1       4
   5     -14      17
 -25       2     -20
  -8     -29       5
  33      17      30
   5       3      14
  -8      -3       1
  18      13       3
  -5     -13       9

Calculate the mean and variance for each of these three derived samples, and then find the averages of these statistics. Comment on how these averages compare with what you would expect. [Phrased another way: What mean and variance do you expect when you subtract one random variable from another?]

Question 2 (Tues, 9/8) Distributions and probabilities

For these questions, use R to find the exact probabilities associated with critical values and vice-versa. [Hint: With questions like these, it is very helpful to draw a figure first...see the reading from Lecture 1 for examples.]

2.1 Find Z0 such that P(Z0 ≤ Z ≤ 1.66) = 0.200.
2.2 Given a normal distribution Y with mean = -12 and variance = 16, find Y0 such that P(Y ≤ Y0) = 0.02.
2.3 Y is normally distributed with mean = 15 and variance = 9. For a random sample of 10 observations, find Y0 such that P(Y ≥ Y0) = 0.72.
2.4 The mean weight of a Conserviola olive is 1.9 g, with a sample variance of 0.4 g² (population variance unknown). What is the probability that a bag of 18 randomly-picked Conserviola olives has an average per-olive weight of more than 2.0 g?
2.5 Given a t distribution, find t0 such that 60% of the values are within the (-t0, t0) interval for df = 24.
2.6 Describe in words the relationship between the Z and t distributions. Provide a numerical example to illustrate this relationship.

Question 3 (Thurs, 9/10) t-test; independent observations

To test the lateral effect of a new sprayed herbicide on insect diversity, a researcher selected 32 fields and randomly assigned them to two treatments.
One week after application, she examined five random 1 m² sections in each field and calculated the average number of insect species present. Her data:

Field   Control   Sprayed
  1       4.6       6.4
  2       2.8       4.3
  3       6.4       2.9
  4       4.7       4.3
  5       4.8       2.4
  6       6.8       4.4
  7       1.1       2.8
  8       7.1       1.6
  9       9.5       0.6
 10      11.0       4.9
 11       9.1       4.0
 12       3.8       5.0
 13       6.6       5.7
 14       3.9       1.5
 15       5.8       3.3
 16       4.8       0.2

Answer the following questions using R.
3.1 For each sample (i.e. treatment), test normality using the Shapiro-Wilk test and present a Quantile-Quantile plot. Comment.
3.2 What is the probability that these two samples are different just by chance?
3.3 Calculate the power of the test with R. Confirm the R result with a hand calculation of the power; show formulas and intermediate steps.
3.4 How many replications would be required to detect a significant difference between these two groups with a power of 95% and α = 0.01? Refer to lecture notes section 2.4.4; you can solve either by hand or using R.

Question 4 (Tues, 9/8) Power and sample size

4.1 How many baby chipmunks must you weigh to achieve 90% confidence that their average weight deviates no more than 1% from the true mean weight of the entire newborn chipmunk population? Assume a CV of less than 5%.
4.2 Prepare a graph showing the number of replicates required to detect significant differences between means that are ¼, ½, ¾, 1, 1¼, 1½, 1¾, and 2 standard deviations apart, with α = 0.05 and Power = 0.80. [You can do this by hand, with Excel, or with R (if you're feeling brave)...whatever works for you].
4.3 Researchers want to determine whether over-irrigation decreases antioxidant levels in kiwis. They conclude that an irrigation rate of 15% over the ET rate does not significantly decrease antioxidant levels at α = 0.05. Assume the true mean of the over-irrigated treatment is 1.4 standard deviations less than the untreated mean and that there were 5 replications per treatment.
Assuming a one-tailed test, does the experiment have adequate power (i.e. > 80%) to support the researchers' claim?

Question 5 t-test; paired observations

A researcher sets out to study the effect of a vegetarian diet on the level of triglycerides in the blood. To do this, he measures triglyceride levels in 15 volunteers before and after two months on a strict vegetarian diet. The data:

Volunteer   Before   After
    1        252.7   209.6
    2        215.6   256.4
    3        214.5   169.9
    4        249.9   228.1
    5        219.6   223.9
    6        232.6   234.3
    7        267.3   217.3
    8        223.7   201.4
    9        258.0   214.5
   10        228.0   200.0
   11        226.9   218.2
   12        232.3   216.2
   13        242.6   207.9
   14        271.7   220.8
   15        232.4   224.2

Since both measures were taken from the same individual, they are not independent (i.e. these are paired comparisons).
5.1 (Tues, 9/8) Use R to calculate the mean and standard deviation of each treatment and of their differences (Before - After); comment on the relative sizes of these statistics.
5.2 (Thurs, 9/10) Use R to determine the power of this two-tailed t-test (use α = 5%). Now create a new variable EFFECT = BEFORE - AFTER, and calculate the power of the test for that new variable (H0: μ = 0; α = 5%). Compare the powers of the two analyses and comment. Finally, did the treatment affect triglyceride levels?
5.3 (Tues, 9/8) Prepare a graphical depiction of the power of the test for variable EFFECT by hand, using Figure 3 in lecture topic 2 as a template. Use the numerical values from the previous analysis of variable EFFECT (H0: μ = 0; H1: μ = 21.673), and assume normal distributions about these means. Indicate the areas corresponding to α, β, and power.
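
As a rough illustration of the arithmetic behind 5.1, the sketch below computes the mean and sample standard deviation of Before, After, and their paired differences. The assignment itself asks for R; Python's standard `statistics` module is used here only to show the calculation, and the names `before`, `after`, `effect`, `describe`, and `summary` are illustrative choices, not part of the course materials.

```python
from statistics import mean, stdev

# Before/After values for the 15 volunteers, copied from the Question 5 table.
before = [252.7, 215.6, 214.5, 249.9, 219.6, 232.6, 267.3, 223.7,
          258.0, 228.0, 226.9, 232.3, 242.6, 271.7, 232.4]
after = [209.6, 256.4, 169.9, 228.1, 223.9, 234.3, 217.3, 201.4,
         214.5, 200.0, 218.2, 216.2, 207.9, 220.8, 224.2]

# Paired differences, one per volunteer (EFFECT = BEFORE - AFTER).
effect = [b - a for b, a in zip(before, after)]


def describe(values):
    """Return (mean, sample standard deviation); stdev uses the n - 1 denominator."""
    return mean(values), stdev(values)


# Hypothetical container for the three summaries, keyed by variable name.
summary = {
    "Before": describe(before),
    "After": describe(after),
    "Effect": describe(effect),
}

for name, (m, s) in summary.items():
    print(f"{name:>6}: mean = {m:.3f}, sd = {s:.3f}")
```

The mean of `effect` equals the difference between the Before and After means, and it agrees with the value H1: μ = 21.673 quoted in 5.3. The standard deviation of the differences is not determined by the two separate standard deviations alone, which is the comparison 5.1 asks you to comment on.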
