Faculty of Information Technology
Probability and Statistics Exercises
Lecturer: Mr. David Hagumuwumva
November 19, 2023

Question 1
The mean height of 6 children is 1.63 m. What's the mean if they're joined by another child who's 1.75 m tall?

Question 2
Suppose that the annual income of the residents of a certain country has a mean of $48,000 and a median of $34,000. What is the shape of the distribution?

Question 3
The data below represent the ages of a set of 50 pennies.

41  9  0  4  3  0  3  8 21  3
 2 10  4  0 14  0 25 12 24 19
 3  1 14  7  2  4  4  5  1 20
14  9  3  5  3  0  8 17 16  0
 0  7  3  5 23  7 28 17  9  2

a. Draw a relative frequency histogram to describe the distribution of penny ages.
b. Draw a stem and leaf plot to describe the penny ages. Are there any unusually large or small measurements in the set?
c. Draw a box plot and use it to identify any outliers.

Question 4
The lengths of a large shipment of chromium strips have a mean of 0.44 m and a standard deviation of 0.001 m. At least what percentage of these lengths must lie between
a. 0.438 and 0.442 m?
b. 0.436 and 0.444 m?
c. 0.430 and 0.450 m?

Question 5
A set of n = 10 measurements consists of the values 5, 2, 3, 6, 1, 2, 4, 5, 1, 3.
a. Use the range approximation to estimate the value of s for this set.
b. Use your calculator to find the actual value of s. Is the actual value close to your estimate in part a?
c. Draw a dotplot of this data set. Are the data mound-shaped?
d. Can you use Tchebysheff's Theorem to describe this data set? Why or why not?
e. Can you use the Empirical Rule to describe this data set? Why or why not?

Question 6
Is your breathing rate normal? Actually, there is no standard breathing rate for humans. It can vary from as low as 4 breaths per minute to as high as 70 or 75 for a person engaged in strenuous exercise.
Suppose that the resting breathing rates for college-age students have a relative frequency distribution that is mound-shaped, with a mean equal to 12 and a standard deviation of 2.3 breaths per minute. What fraction of all students would have breathing rates in the following intervals?
a. 9.7 to 14.3 breaths per minute
b. 7.4 to 16.6 breaths per minute
c. More than 18.9 or less than 5.1 breaths per minute

Question 7
The number of passes completed by Brett Favre, quarterback for the Green Bay Packers, was recorded for each of the 16 regular season games in the fall of 2006 (www.espn.com).

15 31 25 22 22 19 17 28
24  5 22 24 22 20 26 21

a. Draw a stem and leaf plot to describe the data.
b. Calculate the mean and standard deviation for Brett Favre's per-game pass completions.
c. What proportion of the measurements lie within two standard deviations of the mean?

Question 8
Given the following data set: 8, 7, 1, 4, 6, 6, 4, 5, 7, 6, 3, 0
a. Find the five-number summary and the IQR.
b. Calculate x̄ and s.
c. Calculate the z-score for the smallest and largest observations. Is either of these observations unusually large or unusually small?

Question 9
The data listed here are the weights (in pounds) of 27 packages of ground beef in a supermarket meat display:

1.08  .99  .97 1.18 1.41 1.28  .83
1.38  .75  .96 1.08  .87  .89  .89
1.12  .93 1.24  .89  .98 1.14  .92
1.06 1.14  .96 1.12 1.18 1.17

a. Find the mean and standard deviation of the data set.
b. Find the percentage of measurements in the intervals x̄ ± s, x̄ ± 2s, and x̄ ± 3s.
c. How do the percentages obtained in part b compare with those given by the Empirical Rule? Explain.

Question 10
The data below are 30 waiting times between eruptions of the Old Faithful geyser in Yellowstone National Park.

56 89 51 79 58 82 52 88 52 78
69 75 77 72 71 55 87 53 85 61
93 54 76 80 81 59 86 78 71 77

a. Calculate the range.
b. Use the range approximation to approximate the standard deviation of these 30 measurements.
c.
Calculate the sample standard deviation s.
d. What proportion of the measurements lie within two standard deviations of the mean? Within three standard deviations of the mean? Do these proportions agree with the proportions given in Tchebysheff's Theorem?

Question 11
Male and female respondents to a questionnaire about gender differences are categorized into three groups according to their answers to the first question:

          Men   Women
Group 1    37       7
Group 2    49      50
Group 3    72      31

a. Create side-by-side pie charts to describe these data.
b. Create a side-by-side bar chart to describe these data.
c. Draw a stacked bar chart to describe these data.
d. Which of the three charts best depicts the difference or similarity of the responses of men and women?

Question 12
When you were growing up, did you feel that you did not have enough free time? Parents and children have differing opinions on this subject. A research group surveyed 198 parents and 200 children and recorded their responses to the question, "How much free time does your child have?" or "How much free time do you have?" The responses are shown in the table below:

                        Parents   Children
Just the right amount       138        130
Not enough                   14         48
Too much                     40         16
Don't know                    6          6

a. Define the sample and the population of interest to the researchers.
b. Describe the variables that have been measured in this survey. Are the variables qualitative or quantitative? Are the data univariate or bivariate?
c. What do the entries in the cells represent?
   Answer: The entry in a cell represents the number of people who fell into that relationship-opinion category.
d. Use comparative pie charts to compare the responses for parents and children.
e. What other graphical techniques could be used to describe the data? Would any of these techniques be more informative than the pie charts constructed in part d?
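
Comparative pie charts in Question 12 part d are built from relative frequencies rather than raw counts, because the two groups have different sizes (198 and 200). The following is a minimal Python sketch of that conversion; the helper name `relative_frequencies` and the dictionary layout are illustrative assumptions, not part of the exercise.

```python
# Counts from the Question 12 table (parents: n = 198, children: n = 200).
counts = {
    "Parents": {
        "Just the right amount": 138,
        "Not enough": 14,
        "Too much": 40,
        "Don't know": 6,
    },
    "Children": {
        "Just the right amount": 130,
        "Not enough": 48,
        "Too much": 16,
        "Don't know": 6,
    },
}


def relative_frequencies(group_counts):
    """Convert raw counts to proportions that sum to 1 (hypothetical helper)."""
    total = sum(group_counts.values())
    return {category: n / total for category, n in group_counts.items()}


rel_freq = {group: relative_frequencies(c) for group, c in counts.items()}

# Print one block per group; each share is the pie-slice fraction.
for group, freqs in rel_freq.items():
    print(group)
    for category, share in freqs.items():
        print(f"  {category:<22} {share:6.1%}")
```

Each printed share is the fraction of the pie that the corresponding slice would occupy, so the two charts can be compared slice by slice.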

Question 13
The number of passes completed and the total number of passing yards were recorded for Brett Favre for each of the 16 regular season games in the fall of 2006.

Week   Completions   Total Yards
  1         15           170
  2         31           340
  3         25           340
  4         22           205
  5         22           220
  6         19           206
  7         17           180
  8         28           287
  9         24           347
 10          5            73
 11         22           266
 12         24           214
 13         22           293
 14         20           174
 15         26           285
 16         21           285

a. Draw a scatterplot to describe the relationship between number of completions and total passing yards for Brett Favre.
b. Describe the plot in part a. Do you see any outliers? Do the rest of the points seem to form a pattern?
c. Calculate the correlation coefficient, r, between the number of completions and total passing yards.
d. What is the regression line for predicting total number of passing yards y based on the total number of completions x?
e. If Brett Favre had 20 pass completions in his next game, what would you predict his total number of passing yards to be?

Question 14
Evaluate the following permutations, where $P^{n}_{r}$ denotes the number of permutations of n objects taken r at a time:
a. $P^{5}_{3}$
b. $P^{10}_{9}$
c. $P^{6}_{6}$
d. $P^{20}_{19}$

Question 15
Evaluate these combinations, where $C^{n}_{r}$ denotes the number of combinations of n objects taken r at a time:
a. $C^{5}_{3}$
b. $C^{5}_{2}$
c. $C^{15}_{6}$
d. $C^{20}_{5}$

Question 16
Two city council members are to be selected from a total of five to form a subcommittee to study the city's traffic problems.
a. How many different subcommittees are possible?
b. If all possible council members have an equal chance of being selected, what is the probability that members Smith and Jones are both selected?

Question 17
Suppose that P(A) = 0.4 and P(B) = 0.2. If events A and B are independent, find these probabilities: P(Aᶜ), P(A ∩ B), P(A ∪ B), P((A ∪ B)ᶜ)

Question 18
Suppose that P(A) = 0.4 and P(A ∩ B) = 0.12.
a. Find P(B|A).
b. Are events A and B mutually exclusive?
c. If P(B) = 0.3, are events A and B independent?

Question 19
Jane has three children, each of whom is equally likely to be a boy or a girl independently of the others.
Define the events:
A = {all the children are of the same sex}
B = {there is at most one boy}
C = {the family includes a boy and a girl}
a. Show that A is independent of B, and that B is independent of C.
b. Is A independent of C?
c. Do these results hold if boys and girls are not equally likely?
d. Do these results hold if Jane has four children?

Question 20
Two people enter a room and their birthdays (ignoring years) are recorded.
• Find the probability that both people have the same birthday.
• Find the probability that the two people have different birthdays.

Question 21
Medical case histories indicate that different illnesses may produce identical symptoms. Suppose a particular set of symptoms, which we will denote as event H, occurs only when any one of three illnesses (A, B, or C) occurs. (For the sake of simplicity, we will assume that illnesses A, B, and C are mutually exclusive.) Studies show these probabilities of getting the three illnesses:

P(A) = 0.01    P(B) = 0.005    P(C) = 0.02

The probabilities of developing the symptoms H, given a specific illness, are

P(H|A) = 0.90    P(H|B) = 0.95    P(H|C) = 0.75

Assuming that an ill person shows the symptoms H, what is the probability that the person has illness A?

Question 22
A coin is tossed three times. If X denotes the number of heads and Y denotes the absolute difference between the number of heads and the number of tails, find:
• P(X = 1)
• P(X = 2)
• P(X < 2)
• P(Y = 1)
• P(Y = 0)
• P(Y ≥ 1)

Question 23
A fire-detection device uses three temperature-sensitive cells acting independently of one another in such a manner that any one or more can activate the alarm. Each cell has a probability p = 0.8 of activating the alarm when the temperature reaches 100°F or higher. Let x equal the number of cells activating the alarm when the temperature reaches 100°F.
a. Find the probability distribution of x.
b. Find the probability that the alarm will function when the temperature reaches 100°F.
c.
Find the expected value and the variance for the random variable x.

Question 24
Two cold tablets are accidentally placed in a box containing two aspirin tablets. The four tablets are identical in appearance. One tablet is selected at random from the box and is swallowed by the first patient. A tablet is then selected at random from the three remaining tablets and is swallowed by the second patient. Define the following events as specific collections of simple events:
a. The sample space S.
b. The event A that the first patient obtained a cold tablet.
c. The event B that exactly one of the two patients obtained a cold tablet.
d. The event C that neither patient obtained a cold tablet.
e. By summing the probabilities of simple events, find P(A), P(B), P(A ∩ B), P(A ∪ B), P(C), P(A ∩ C), and P(A ∪ C).

Question 25 (Binomial)
Car color preferences change over the years and according to the particular model that the customer selects. In a recent year, suppose that 10% of all luxury cars sold were black. If 25 cars of that year and type are randomly selected, find the following probabilities:
a. At least five cars are black.
b. At most six cars are black.
c. More than four cars are black.
d. Exactly four cars are black.
e. Between three and five cars (inclusive) are black.
f. More than 20 cars are not black.

Question 26 (Binomial)
A biased coin is flipped independently 10,000 times. Find the mean and standard deviation of the number of heads, assuming the probability of a head is 0.2.

Question 26 (Binomial)
According to the Humane Society of the United States, there are approximately 65 million owned dogs in the United States, and approximately 40% of all U.S. households own at least one dog. Suppose that the 40% figure is correct and that 15 households are randomly selected for a pet ownership survey.
a. What is the probability that exactly eight of the households have at least one dog?
b. What is the probability that at most four of the households have at least one dog?
c.
What is the probability that more than 10 households have at least one dog?

Question 27 (Poisson)
Let x be a Poisson random variable with mean µ = 2. Calculate these probabilities: P(x = 0), P(x = 1), P(x ≥ 1), P(x = 5)

Question 28
Parents who are concerned that their children are "accident prone" can be reassured, according to a study conducted by the Department of Pediatrics at the University of California, San Francisco. Children who are injured two or more times tend to sustain these injuries during a relatively limited time, usually 1 year or less. If the average number of injuries per year for school-age children is two, what are the probabilities of these events?
a. A child will sustain two injuries during the year.
b. A child will sustain two or more injuries during the year.
c. A child will sustain at most one injury during the year.

Question 28 (Hypergeometric)
Let x be the number of successes observed in a sample of n = 5 items selected from N = 10. Suppose that, of the N = 10 items, 6 are considered "successes."
a. Find the probability of observing no successes.
   Answer: Only 4 of the 10 items are failures, so any selection of 5 items must include at least one "success"; therefore P(x = 0) = 0.
b. Find the probability of observing at least two successes.
c. Find the probability of observing exactly two successes.

Question 29 (Binomial)
The 10-year survival rate for bladder cancer is approximately 50%. If 20 people who have bladder cancer are properly treated for the disease, what is the probability that:
a. At least 1 will survive for 10 years?
b. At least 10 will survive for 10 years?
c. At least 15 will survive for 10 years?

Question 30 (Binomial)
A new surgical procedure is said to be successful 80% of the time. Suppose the operation is performed five times and the results are assumed to be independent of one another. What are the probabilities of these events?
a. All five operations are successful.
b. Exactly four are successful.
c. Fewer than two are successful.
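
Question 30 describes five independent trials with a fixed success probability, which is a binomial setting with n = 5 and p = 0.8. One way to check hand calculations is the short Python sketch below; the function name `binomial_pmf` is an illustrative assumption, and only the standard library is used.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 5, 0.8  # five operations, each successful with probability 0.8

p_all_five = binomial_pmf(5, n, p)        # part a: all five successful
p_exactly_four = binomial_pmf(4, n, p)    # part b: exactly four successful
# part c: fewer than two successful means zero or one success
p_fewer_than_two = binomial_pmf(0, n, p) + binomial_pmf(1, n, p)

print(f"P(x = 5) = {p_all_five:.5f}")
print(f"P(x = 4) = {p_exactly_four:.5f}")
print(f"P(x < 2) = {p_fewer_than_two:.5f}")
```

The same helper applies to the other binomial exercises by changing n and p and summing the relevant terms.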

Question 31
Calculate the area under the standard normal curve to the left of these values:
a. z = 1.6
b. z = 1.83
c. z = 0.9
d. z = 4.18

Question 32
Find the following probabilities for the standard normal random variable z:
• P(0.58 < z < 1.74)
• P(−1.55 < z < −0.44)
• P(z > 1.34)
• P(z < −4.32)

Question 33
Cerebral blood flow (CBF) in the brains of healthy people is normally distributed with a mean of 74 and a standard deviation of 16.
a. What proportion of healthy people will have CBF readings between 60 and 80?
b. What proportion of healthy people will have CBF readings above 100?
c. If a person has a CBF reading below 40, he is classified as at risk for a stroke. What proportion of healthy people will mistakenly be diagnosed as "at risk"?

Question 34
An article in American Demographics claims that more than twice as many shoppers are out shopping on the weekends than during the week. Not only that, such shoppers also spend more money on their purchases on Saturdays and Sundays! Suppose that the amount of money spent at shopping centers between 4 P.M. and 6 P.M. on Sundays has a normal distribution with a mean of $85 and a standard deviation of $20. A shopper is randomly selected on a Sunday between 4 P.M. and 6 P.M. and asked about his spending patterns.
a. What is the probability that he has spent more than $95 at the mall?
b. What is the probability that he has spent between $95 and $115 at the mall?
c. If two shoppers are randomly selected, what is the probability that both shoppers have spent more than $115 at the mall?

Question 35
A biased coin is flipped independently 10,000 times. Determine the probability that the number of heads is between 1950 and 2100, assuming the probability of a head is 0.2.

Question 36
A random sample of public opinion in a small town was obtained by selecting every 10th person who passed by the busiest corner in the downtown area. Will this sample have the characteristics of a random sample selected from the town's citizens?
Explain.

Question 37
Random samples of size n were selected from populations with the means and variances given here. Find the mean and standard deviation of the sampling distribution of the sample mean in each case:
a. n = 36, µ = 10, σ² = 9
b. n = 100, µ = 5, σ² = 4
c. n = 8, µ = 120, σ² = 1

Question 38
Suppose a random sample of n = 25 observations is selected from a population that is normally distributed with mean equal to 106 and standard deviation equal to 12.
a. Give the mean and the standard deviation of the sampling distribution of the sample mean x̄.
b. Find the probability that x̄ exceeds 110.
c. Find the probability that the sample mean deviates from the population mean µ = 106 by no more than 4.

Question 39
Suppose that college faculty with the rank of professor at two-year institutions earn an average of $64,571 per year with a standard deviation of $4,000. In an attempt to verify this salary level, a random sample of 60 professors was selected from a personnel database for all two-year institutions in the United States.
a. Describe the sampling distribution of the sample mean x̄.
b. Calculate the probability that the sample mean is greater than $66,000.
c. If your random sample actually produced a sample mean of $66,000, would you consider this unusual? What conclusion might you draw?

Question 41
Random samples of size n were selected from binomial populations with population parameters p given here. Find the mean and the standard deviation of the sampling distribution of the sample proportion p̂ in each case:
a. n = 100, p = .3
b. n = 400, p = .1
c. n = 250, p = .6

Question 42
Random samples of size n = 75 were selected from a binomial population with p = .4. Use the normal distribution to approximate the following probabilities:
a. P(p̂ ≤ .43)
b. P(.35 ≤ p̂ ≤ .43)

Question 42
News reports tell us that the average American is overweight. Many of us have tried to trim down to our weight when we finished high school or college.
And, in fact, only 20% of adults say they do not suffer from weight-loss woes. Suppose that the 20% figure is correct, and that a random sample of n = 120 adults is selected.
a. Does the distribution of p̂, the sample proportion of adults who do not suffer from excess weight, have an approximate normal distribution? If so, what are its mean and standard deviation?
b. What is the probability that the sample proportion p̂ exceeds .25?
c. What is the probability that p̂ lies within the interval .25 to .30?
d. What might you conclude about p if the sample proportion exceeded .30?

Question 43
Explain what is meant by "margin of error" in point estimation.

Question 44
A random sample of n = 900 observations from a binomial population produced x = 655 successes. Estimate the binomial proportion p and calculate the margin of error.

Question 45
A random sample of n = 50 observations from a quantitative population produced x̄ = 56.4 and s² = 2.6. Give the best point estimate for the population mean µ, and calculate the margin of error.

Question 46
Find a 90% confidence interval for a population mean µ for these values:
a. n = 125, x̄ = .84, s² = .086
b. n = 50, x̄ = 21.9, s² = 3.44
c. Interpret the intervals found in parts a and b.

Question 47
A random sample of n = 300 observations from a binomial population produced x = 263 successes. Find a 90% confidence interval for p and interpret the interval.

Question 48
Independent random samples were selected from populations 1 and 2. The sample sizes, means, and variances are as follows:
• Find a 95% confidence interval for estimating the difference in the population means (µ1 − µ2).
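
For the large-sample intervals asked for in Question 46, a common approach is x̄ ± z·s/√n, where z is the standard normal critical value for the chosen confidence level. The Python sketch below applies this to part a (n = 125, x̄ = .84, s² = .086); it assumes the large-sample normal approximation is acceptable, and the function name `mean_confidence_interval` is an illustrative choice rather than anything defined in the exercise.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(n, xbar, s2, confidence=0.90):
    """Large-sample CI for a population mean: xbar +/- z * sqrt(s2 / n).

    Assumes n is large enough for the normal approximation to hold.
    """
    alpha = 1 - confidence
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sqrt(s2 / n)
    return xbar - margin, xbar + margin


# Question 46, part a
lower, upper = mean_confidence_interval(n=125, xbar=0.84, s2=0.086)
print(f"90% CI for the mean: ({lower:.4f}, {upper:.4f})")
```

Changing the `confidence` argument gives other levels, and the margin term on its own corresponds to the margin of error discussed in Questions 43 to 45 when the 95% level is used.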