Statistics 515 – Statistical Methods I
Practice Test for Exam 1
E. A. Pena’s Class
______________________________________________________________________

Part I (24 points). For questions 1-8 please refer to the following information:

Petroleum pollution in seas and oceans stimulates the growth of some types of bacteria. A count of petroleumlytic microorganisms (bacteria per 100 milliliters) in n = 10 portions of seawater gave the following readings.

Raw Data: 49, 70, 54, 67, 59, 40, 61, 69, 71, 52

The associated ordered/arranged values are given below.

Ordered Values: 40, 49, 52, 54, 59, 61, 67, 69, 70, 71

Furthermore, for this data set,

$\sum X_i$ = Sum of the Observations = 592
$\sum (X_i)^2$ = Sum of the Squared Observations = 36014

1. Construct a stem-and-leaf or dot plot for this data set.

2. Compute the sample mean.

3. Determine the sample median.

4. Compute the sample variance. [You may use the information given above!]

5. Compute the sample standard deviation.

6. Determine the first quartile.

7. Determine the third quartile.

8. Draw the boxplot.
________________________________________________________________________

Part II (18 points). For questions 9-14 please refer to the following information:

Americium 241 (241Am) is a radioactive material used in the manufacture of smoke detectors. The article "Retention and Dosimetry of Injected 241Am in Beagles" [a beagle is a small short-legged smooth-coated hound] published in Radiation Research (1984), pp. 564-575, described a study in which 55 beagles were injected with a dose of 241Am (proportional to the animals' weights). Skeletal retention of 241Am (Ci/kg) was recorded for each of the 55 beagles. The following summary information pertains to these 55 observations.

[Figure: Frequency Histogram for the Amount of Americium Retained in 55 Beagles. Vertical axis: Frequency (0 to 15). Horizontal axis: Amount of Americium Retained, labeled 0.175, 0.225, ..., 0.625.]

Numerical Summary Measures

Type of Summary Measure      Value of Americium Retained
n (# of Observations)        55
Sample Mean                  0.3489
Sample Median                0.3370
Sample Standard Deviation    0.0800
Minimum                      0.1860
First Quartile (Q1)          0.3030
Third Quartile (Q3)          0.4080
Maximum                      0.5850

[Figure: Boxplot for the Amount of Americium Retained by the 55 Beagles. Axis: Americium Retained (0.2 to 0.6).]

9. Describe the shape of the distribution for the Amount of Americium Retained by these 55 beagles. Provide explanations and/or reasons for your answer.

10. Based on the information provided, are there any outliers in the data? If so, what is the approximate value of the outlier(s)?

11. Approximately what percentage of the 55 observations are between 0.3030 (the first quartile) and 0.4080 (the third quartile)?

12. Provide a plausible explanation why the sample mean is larger than the sample median.

13. Based on the histogram, approximately how many observations exceed the value of 0.375?

14. The interval around the sample mean whose limits are two sample standard deviations away from the sample mean is [.3489 - 2(.08), .3489 + 2(.08)] = [.1889, .5089]. What could you say about the percentage of observations that will fall in this interval? Provide a reason for your answer.
________________________________________________________________________

Part III (21 points). For questions 15-21 please refer to the following information.

In a three-year study of cocaine addiction by D. M. Barnes as reported in the article "Breaking the cycle of addiction" which appeared in Science, 241(1988), pp. 1029-1030, 72 chronic cocaine users were either given the antidepressant desipramine, lithium (the standard drug to treat cocaine addiction), or a placebo. The 72 subjects were randomly divided into three equal groups.
The purpose of the study was to determine whether giving a cocaine addict an antidepressant will help in breaking the addiction. The following table presents the results of the study.

Cocaine Relapse?    Desipramine    Lithium    Placebo    Total
Yes                 10             18         20         48
No                  14              6          4         24

15. Compare the relapse rates for the three groups. Which among desipramine, lithium, or placebo is most effective in lowering the relapse rate among cocaine addicts?

16. Consider the experiment of choosing at random one of the subjects in the above study and then determining the treatment given (which is either desipramine, lithium, or placebo) and observing whether the subject has a relapse. The sample space of this experiment is:

S = {(Desipramine, Yes), (Desipramine, No), (Lithium, Yes), (Lithium, No), (Placebo, Yes), (Placebo, No)}

What would be the appropriate probabilities to assign to these six outcomes in this sample space? Note that these probabilities should be based on the number of individuals in the different cells of the table and the overall total.

17. Define event A to be the event that "Desipramine" was assigned, and B to be the event that the subject had a relapse. What are P(A) and P(B)?

18. Find P(A or B), that is, the probability that either A or B occurs.

19. Find P(B|A), the conditional probability of B given A.

20. Are events A and B independent? Provide a reason for your answer.

21. Find the probabilities a) P(Desipramine was assigned | B); and b) P(Lithium was assigned | B). Based on these probabilities, if you are given the information that the subject had a relapse, is it more likely that the subject was assigned desipramine or lithium?
________________________________________________________________________

Part IV (12 points). For questions 22-24 please refer to the following information.

ELISA tests are used to screen donated blood for the presence of the AIDS virus. The test actually detects antibodies, substances that the body produces when the virus is present.
If the antibodies are present, ELISA is positive with probability of .997 and negative with probability of .003. If the blood being tested is not contaminated with AIDS antibodies, ELISA gives a positive result with probability of .015 and a negative result with probability of .985. Assume that 1% of a large population carries the AIDS antibody in their blood. Suppose that one individual is randomly chosen from this population.

22. Draw a tree diagram which depicts the outcomes of this two-step experiment, with step 1 being the process of choosing the person (outcomes: the person does or does not carry the antibody) and step 2 being the process of performing the ELISA test on the person’s blood (outcomes: positive or negative).

23. What is the probability that the ELISA test for the AIDS virus will show a positive result?

24. Given that the ELISA test is positive, what is the probability that the chosen person has the AIDS antibody?
________________________________________________________________________

Part V (16 points). For questions 25-28 please refer to the following information.

Let X be the random variable denoting the number of revisions (including the original version) before a manuscript is accepted for publication in a scientific journal. Suppose that the probability function of X is given by:

x = number of revisions            1       2       3       4       5
p(x) = P{x revisions needed}       0.10    0.30    0.35    0.15    0.10

25. Find P{2 ≤ X ≤ 4} = probability that the manuscript will take between 2 and 4 revisions, inclusive, before getting accepted for publication.

26. Determine the mean of X.

27. Determine the standard deviation of X.

28. Suppose that we define the variable Y = 2X + 5. By simply using the mean and standard deviation of X, what will be the mean and standard deviation of Y?
________________________________________________________________________

Part VI (12 points). For questions 29-32 please refer to the following information.
A psychiatrist believes that 80% of all people [a very large population] who visit doctors have problems of a psychosomatic nature. She decides to select 25 patients at random to test her theory. Let X denote the number of patients out of the 25 who have problems of a psychosomatic nature, so that X has a binomial distribution. Assume that the psychiatrist's theory is correct.

29. What is the mean of X?

30. What is the standard deviation of X?

31. Find the probability that X = 20. [You may just write this in formula form.]

32. By using a table of binomial probabilities or a calculator, we find that P{X ≤ 14} = .0056 when the psychiatrist’s theory is correct. Suppose that when the sample of 25 patients was actually taken, only 14 had problems of a psychosomatic nature. What conclusions could you make about the psychiatrist's theory?

Some Formulas That May Be Useful

$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

$S^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} X_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} X_i \right)^2 \right] = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$

M = value that divides arranged data into two equal parts
Q1 = Divides arranged data into 25:75 split
Q3 = Divides arranged data into 75:25 split

P(A or B) = P(A) + P(B) - P(A and B)
P(B|A) = P(A and B)/P(A)
P(B) = P(A)P(B|A) + P(A^c)P(B|A^c)
P(A|B) = P(A)P(B|A)/P(B)
P(A and B) = P(A)P(B) if A and B are independent

$\mu = \sum x \, p(x)$

$\sigma^2 = \sum (x - \mu)^2 p(x) = \sum x^2 p(x) - \mu^2$

$p(x) = \binom{n}{x} p^x (1 - p)^{n - x}$

$\mu = np; \quad \sigma^2 = np(1 - p)$

n! = (n)(n-1)(n-2)...(2)(1) with 0! = 1

$C^n_r = \binom{n}{r} = \frac{n!}{r!(n - r)!}$
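
The formulas above can be checked numerically. The following is a minimal Python sketch, assuming small made-up inputs (not the data from this test). All function names are hypothetical and chosen only for illustration. It shows that the two forms of the sample variance agree, and how the discrete mean and standard deviation, the binomial probability function, and the rule P(A|B) = P(A)P(B|A)/P(B) translate into code.

```python
import math


def sample_mean(xs):
    # X-bar = (1/n) * sum of the observations
    return sum(xs) / len(xs)


def sample_variance(xs):
    # Definitional form: sum of squared deviations divided by (n - 1)
    n = len(xs)
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    # Computing form: [sum of squares - (sum)^2 / n] divided by (n - 1)
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


def discrete_mean_sd(pmf):
    # pmf maps each value x to p(x); returns (mu, sigma)
    mu = sum(x * p for x, p in pmf.items())
    var = sum(x * x * p for x, p in pmf.items()) - mu ** 2
    return mu, math.sqrt(var)


def binomial_pmf(x, n, p):
    # p(x) = C(n, x) * p^x * (1 - p)^(n - x)
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def posterior(p_a, p_b_given_a, p_b_given_not_a):
    # P(B) by the total probability rule, then P(A|B) = P(A)P(B|A)/P(B)
    p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a
    return p_a * p_b_given_a / p_b


if __name__ == "__main__":
    data = [2, 4, 4, 5, 7, 8]  # hypothetical observations
    print(sample_mean(data))
    print(sample_variance(data), sample_variance_shortcut(data))
    print(discrete_mean_sd({0: 0.5, 1: 0.5}))  # hypothetical pmf
    print(binomial_pmf(1, 2, 0.5))
    print(posterior(0.5, 0.8, 0.2))
```

The computing form of the variance needs only the sum and the sum of squares, which is why those two totals are useful when working by hand.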