Biometry 109 Final
Fall 2001
Name:________________

You may use a calculator and a page of notes. Ask the instructor if you need any clarification of an exam question. Show all work. If calculator statistics functions are used, specify this in your answer. Write your final answer in the space provided. Unless specified otherwise, use a level of significance of 0.05 for all statistical tests. This test is worth 70% of the final exam score; i.e., 70 points.

Probability

(1) Fill in both blanks. The smallest value a probability can be is ______ and the largest value is ________, and any probability value stated outside of this range is a mistake.

(2) Suppose the following probabilities hold among an animal population:
    P( diseased | male ) = 0.2
    P( diseased | female ) = 0.1
    P( male ) = 0.30
    P( female ) = 0.70

(2i) True or False: Disease and sex are independent.
Defend your answer:_________

(2ii) If an animal were sampled at random, what is the probability of the animal being a diseased male?
Answer: ____________

(2iii) If an animal were sampled at random, what is the probability of the animal being diseased? (Hint: Drawing a probability tree is one way to find the answer.)
Answer:____________

(3) Fill in both blanks. Suppose the height of men is shown by the density curve below. The probability of a randomly chosen man having a height between 5’6” and 6’ is ___________. Also, the probability of a man being shorter than 5’6” is __________.

[Figure: density curve of men's heights (not reproduced)]

(4) Suppose X is distributed according to the standard normal distribution (mean = 0, sd = 1). Find P(X < 2).
Answer:______________

(5) Suppose X is distributed with a mean of 50 and standard deviation of 5. If 16 X’s were sampled and their mean, X̄, calculated, find P( X̄ < 52.5 ).
Answer:__________

(6) Fill in both blanks. Suppose X has any distribution with mean μ and standard deviation σ. Then, as n gets large, the distribution of ________________ will become “approximately _________________” with mean μ and standard deviation σ/√n.

(7) Suppose the probability of a newborn calf being male is 0.3; i.e., P(male) = 0.3. If six calves were born, and their sexes are independent, what is the probability of exactly two males and four females; i.e., find P(#males = 2).
Answer:_____________

(8) Suppose you were to calculate a 1-standard deviation window by calculating x̄ ± s. Assuming the dot plot of the data is approximately “bell-shaped”, you would expect there to be roughly _______% of the data within the 1-standard deviation window.

Statistics

(9) Suppose 36 island foxes were captured and weighed. The mean weight was 10 pounds with a sample standard deviation of 3 pounds. Calculate a 95% confidence interval for the mean weight of island foxes.
Answer: ______________

(10) Suppose 71 randomly sampled adult orangutans were weighed. The calculated 95% confidence interval for the mean weight of orangutans was (210, 226) pounds. Which one of the following statements is false?
    (a) There is about a 95% chance that the true population mean weight of adult orangutans is between 210 and 226 pounds.
    (b) About 95% of adult orangutans weigh between 210 and 226 pounds.
    (c) If we were to weigh more orangutans, the confidence interval would most likely become narrower.
    (d) The sample mean weight of the 71 orangutans was 218 pounds.

(11) Suppose a drug was designed to lower the heart rate. The heart rates of 30 people were measured after taking a placebo drug and also when on the new drug. The order was known to neither the patients nor the nurses (double-blind study). Which one of the following analyses would be most appropriate?
    (a) ANOVA
    (b) 2-sample t-test
    (c) paired t-test
    (d) chi-square test of independence

(12) A plant physiologist conducted an experiment to determine whether mechanical stress can retard the growth of soybean plants. Young plants were randomly allocated to two groups of 13 plants each.
Plants in one group were mechanically agitated by shaking for 20 minutes twice daily, while plants in the other group were not agitated. After 16 days of growth, the total stem length (cm) of each plant was measured. Minitab analysis follows.

    Two-sample T for Control vs Stress

               N    Mean   StDev   SE Mean
    Control   13   30.59    2.13      0.59
    Stress    13   27.78    1.73      0.48

    Difference = mu Control - mu Stress
    Estimate for difference: 2.815
    95% lower bound for difference: 1.508
    T-Test of difference = 0 (vs >): T-Value = 3.70  P-Value = 0.001  DF = 22

(12i) Show how the standard error of 0.59 was calculated for the Control beans.

(12ii) Which one of the following is false?
    (a) On average, control beans grew 2.815 cm more than stressed beans.
    (b) The standard deviation describes the dispersion of the data, while the standard error describes the dispersion (due to sampling error) in the sample means.
    (c) There is insufficient statistical evidence to suggest that stressed beans grow less than the control beans.
    (d) There is statistically significant evidence that stressed beans grow less than control beans.

(13) Which one of the following is false?
    (a) Power is the probability of rejecting the null hypothesis when the alternative hypothesis is true.
    (b) If the assumptions for a parametric test are valid, the parametric test will have more power than its corresponding nonparametric test.
    (c) Typically the more data available, the more powerful the test.
    (d) Not rejecting H0 is equivalent to concluding that H0 is true.

(14) There is a genetic model which assumes that black coat color in mice is inherited as a simple dominant trait, and that brown color is inherited as a recessive trait. A cross between pairs of heterozygous black mice produced an F2 generation of 220 black mice and 60 brown mice. The genetic model would have us expect a ratio of three black mice to one brown mouse; i.e., P(black) = 0.75 and P(brown) = 0.25.

(14i) If the genetic model were true, how many black and brown mice would have been expected?
___________ black mice and ____________ brown mice.

(14ii) Which test would be most appropriate?
    (a) 1-sample t-test
    (b) paired t-test
    (c) chi-square goodness-of-fit test
    (d) chi-square test of independence
    (e) ANOVA

(15) Suppose a two-tailed 1-sample t-test was performed with H0: μ = 80, HA: μ ≠ 80, giving a t-statistic of t = 1.4 with 30 degrees of freedom. Give as much detail as you can about the P-value.
Answer: ______________

(16) Suppose a two-tailed 1-sample t-test was performed with H0: μ = 80, HA: μ ≠ 80. Which one of the following is false?
    (a) If H0 is true, decreasing α from 5% to 1% decreases the chance of a rejection error (type I error).
    (b) If HA is true, decreasing α from 5% to 1% decreases the test’s power.
    (c) If HA is true, decreasing α from 5% to 1% decreases the chance of an acceptance error (type II error).
    (d) The further away the true mean, μ, is from 80, the greater the power.

(17) A study was carried out in which weight (pounds) and cholesterol level (mg/100 ml) were compared. Of interest was whether cholesterol is associated with weight. A simple linear regression analysis was performed by a statistician. Below is some of the Minitab output.

    The regression equation is
    cholesterol = - 128 + 2.03 wt

    Predictor      Coef   SE Coef       T       P
    Constant    -127.57     78.90   -1.62   0.130
    wt           2.0320    0.4447    4.57   0.001

    Regression Plot
    cholesterol = -127.567 + 2.03199 wt
    S = 36.8697   R-Sq = 61.6 %   R-Sq(adj) = 58.7 %

    [Plot not reproduced: cholesterol (vertical axis, 100 to 300) against wt (horizontal axis, 140 to 220)]

(17i) If a randomly sampled man weighed 180 pounds, using the regression analysis, what would you expect his cholesterol to be?
Answer: ___________________

(17ii) For each pound increase in weight, you would expect cholesterol to
    (a) Decrease about 128 mg/100ml
    (b) Increase about 128 mg/100ml
    (c) Increase about 2.0 mg/100ml
    (d) Increase about 0.4 mg/100ml
    (e) Increase about 4.6 mg/100ml

(17iii) The correlation coefficient between weight and cholesterol is about:
    (a) -2.0
    (b) -0.75
    (c) -0.06
    (d) 0
    (e) +0.06
    (f) +0.75
    (g) +2.0

(18) A fisheries student studied the weights of 2-year old rainbow trout raised in three different creeks. 35 fish were captured in Creek A, 33 in Creek B, and 35 in Creek C. Below is Minitab ANOVA output for the data.

    One-way ANOVA: A, B, C

    Analysis of Variance
    Source    DF       SS     MS      F       P
    Factor     2     4.09   2.04   2.02   0.138
    Error    100   100.93   1.01
    Total    102   105.02

    Level    N    Mean   StDev
    A       35   2.503   0.997
    B       33   2.985   1.033
    C       35   2.656   0.985

    Pooled StDev = 1.005

    Individual 95% CIs For Mean
    Based on Pooled StDev

           ---------+---------+---------+------
    A      (---------*--------)
    B                    (---------*---------)
    C           (---------*---------)
           ---------+---------+---------+------
                  2.45      2.80      3.15

(18i) For the above ANOVA test, specify the critical value when using a level of significance (α) of 5%.
Answer:_____________

(18ii) Besides independence and random samples, the two primary assumptions for one-way ANOVA are that the means are normally distributed and that the ______________ of the populations are equal.

(18iii) Why was it probably not necessary to test these data for normality?
Answer:_______________

(18iv) Which conclusion is most appropriately inferred from the ANOVA output?
    (a) There is insufficient statistical evidence to suggest that the mean weight of 2-year old rainbow trout differs between creeks A, B, and C. (One-way ANOVA, P=0.138)
    (b) There is statistically significant evidence that the mean weight of 2-year old rainbow trout differs between creeks A, B, and C.
        (One-way ANOVA, P=0.138)
    (c) There is insufficient statistical evidence to suggest that the standard deviations of 2-year old rainbow trout differ between creeks A, B, and C. (One-way ANOVA, P=0.138)
    (d) There is statistically significant evidence that the standard deviations of 2-year old rainbow trout differ between creeks A, B, and C. (One-way ANOVA, P=0.138)

(18v) The value 2.04 in the MS column is: (circle one)
    (a) an estimate of the mean trout weights.
    (b) an estimate of the variance of the trout weights.
    (c) a measure of the variability within the different groups.
    (d) a measure of the variability between sample means of the groups.

(19) Suppose all the data for Creek C were thrown out, leaving only the data for Creeks A and B (35 trout from Creek A and 33 trout from Creek B). The student wanted to estimate the difference between the mean weight of Creek A trout and that of Creek B trout. The resulting 95% confidence interval was (-0.973, 0.011).
True or False: If a two-tailed (μA ≠ μB) two-sample t-test (α = 5%) were performed, the student should keep the null hypothesis that there is not a statistically significant difference between the two means.
Defend your answer:_________________

(20) The contingency table below is chi-square test output from Minitab with some parts deleted. It involves the student data and compares hair color against gender.

    Rows: sex   Columns: hair

              black   blond   brown  lightbro     red     All
    female        3      15       9         4       3      34
               2.96   15.28     WWW      VVVV    1.97   UUUUU
    male          3      16     XXX         4       1      35
               3.04   15.72     YYY      TTTT    2.03   35.00
    All           6      31      20         8       4      69
               6.00   31.00   20.00      8.00    4.00   69.00

    Chi-Square = 1.218, DF = ZZZ, P-Value = 0.875
    6 cells with expected counts less than 5.0

    Cell Contents --
        Count
        Exp Freq

(20i) For brown-hair males, XXX = __________ and YYY = ___________.

(20ii) For degrees of freedom, ZZZ = ______________.

(20iii) Why is Minitab stating “6 cells with expected counts less than 5.0”?
Answer:__________________

(20iv)
The critical value for the above chi-square test is ___________________.

(20v) Which conclusion is most appropriate?
    (a) There is statistically significant evidence that the mean hair color of males is equal to females (P=0.875).
    (b) There is not statistically significant evidence that the mean hair color of males is equal to females (P=0.875).
    (c) There is statistically significant evidence that gender and hair color are dependent (P=0.875).
    (d) There is not statistically significant evidence that gender and hair color are dependent (P=0.875).