Math 203: Principles of Statistics I
Midterm exam

Start time: October 21, 2020, 9:00, Montreal time
End time: October 24, 2020, 9:00, Montreal time

INSTRUCTIONS
1. Open book
2. Answer all 5 questions
3. Show your reasoning briefly. An answer alone is not sufficient
4. Combinatorial numbers can be left unsimplified
5. You can leave answers as fractions
6. The marks allocated are shown in [brackets]
7. Answers can be hand-written, typed, or a combination of both
8. The solved exam must be uploaded in .pdf format
9. Non-pdf documents will not be accepted

Math 203, Midterm Exam Page 2 Fall 2020

Question 1: The distribution of age at graduation from medical school for a group of 100 unrelated medical residents in a hospital is given in the following plots:

(a) Briefly describe the distribution of age at graduation for the 100 medical residents (provide the units), in terms of [8 points]
i. Centre
The median is approximately 31 years.
ii. Spread or variability
The IQR is approximately 33 − 30 = 3 years.
iii. “Shape”
The data are skewed to the right.
iv. Presence of outliers
There are four outliers at the upper end, none of them extreme.

(b) Suppose we are told that the mean age and standard deviation of this group of 100 residents are 32.0 and 3.2 years, respectively.
i. Give an approximate value for a robust measure of central tendency in the data. Justify your answer [4 points]
Because the data are skewed, use the median (approximately 31 years): it is robust to extreme observations, whereas the mean is not.
ii. Give an approximate value for a robust measure of variability in the data. Justify your answer [4 points]
Because the data are skewed, use the IQR (approximately 3 years): it is robust to extreme observations, whereas the standard deviation is not.
Note: A measure is called robust if it is not affected by extreme observations.

Question 2: Three Statistics classes all took the same test. Boxplots of the scores for each class are shown below.
Histograms are also shown but they are not identified by class.

[Figure: boxplots of the scores for Class 1, Class 2 and Class 3, and histograms labelled A, B and C with Frequency on the vertical axis]

Interpreting the boxplots

a) [8] In no more than a sentence each, describe 3 similarities or differences between the scores for Class 1 and Class 3 in
i. Centre
Median of Class 3 ≈ 0.3, median of Class 1 ≈ 0.7.
ii. Spread or variability
IQR of Class 3 ≈ 0.4 − 0.18 = 0.22, IQR of Class 1 ≈ 0.85 − 0.6 = 0.25.
iii. “Shape”
Class 3 is skewed to the right, Class 1 to the left.
iv. Presence of outliers
Class 1 has outliers at the lower end; Class 3 has outliers at the upper end.
Note: There are many possible answers for question 2a). The ones presented here are just some of them.

b) [2] What percentage of the scores (approximately) falls within (0.4, 0.6) for Class 2?
These are approximately the observations between the first and the third quartiles, so about 50%.

c) [3] Is it appropriate to say that approximately 95% of observations from Class 3 fall in the interval x̄ ± 2s? Justify your answer in one sentence.
No. The distribution of the observations is skewed, so the Empirical Rule does not apply. Chebyshev’s rule only guarantees that at least 75% of the data fall inside this interval, which is not a good enough approximation.

d) [2] Match each histogram with the corresponding boxplot.
Class 1: C; Class 2: B; Class 3: A.

Question 3: A distribution of the measurements is approximately mound (bell) shaped and symmetric with mean 50 and standard deviation of 10. Approximately

(a) what percentage of the measurements will fall between 40 and 60? [2 Points]
The interval from 40 = 50 − 10 to 60 = 50 + 10 represents (µ − σ, µ + σ).
Since the distribution is relatively mound-shaped and symmetric, the proportion of measurements between 40 and 60 is approximately 68% according to the Empirical Rule.

(b) what percentage of the measurements will fall between 30 and 70? [2 Points]
The interval from 30 = 50 − 20 to 70 = 50 + 20 represents (µ − 2σ, µ + 2σ). Using the Empirical Rule, the proportion of measurements between 30 and 70 is approximately 95%.

(c) what percentage of the measurements will fall between 30 and 60? [4 Points]
Proportion of measurements between 40 and 60 = 68% approx.
Proportion of measurements between 50 (= µ) and 60 (= µ + σ) = 68/2 = 34% approx.
Proportion of measurements between 30 and 70 = 95% approx.
Proportion of measurements between 30 (= µ − 2σ) and 50 (= µ) = 95/2 = 47.5% approx.
Approximately 34% + 47.5% = 81.5% of the measurements will fall between 30 and 60.

(d) what percentage of measurements fall below 30? [2 Points]
Proportion of measurements below 50 (= µ) = 50% approx.
Proportion of measurements between 30 (= µ − 2σ) and 50 (= µ) = 95/2 = 47.5% approx.
Proportion of measurements below 30 = 50% − 47.5% = 2.5% approx.

(e) If a measurement is chosen at random from this distribution, what is the (approximate) probability that it will be greater than 60? [2 Points]
Proportion of measurements above 50 (= µ) = 50% approx.
Proportion of measurements between 50 (= µ) and 60 (= µ + σ) = 68/2 = 34% approx.
Proportion of measurements above 60 = 50% − 34% = 16% approx.
The probability that a randomly chosen measurement lies above 60 is approximately 0.16.

Question 4: On Valentine’s day, a restaurant offers a special that could save couples money on their romantic dinners. When the waiter brings the check, he will also bring the four aces from a deck of cards. He will shuffle them and lay them face down on the table. The couple will then get to turn one card over. If it is a black ace, they will get no discount.
If it is the ace of hearts, they will get a $20 discount. If they turn over the ace of diamonds, they will get to turn one of the remaining cards, earning a $10 discount if they get the ace of hearts this time. Let X be the discount couples get.

(a) What is the probability distribution of X? [10 points]
Values of X: {0, 10, 20}
The probability of getting a $20 discount is
P{Ace of hearts} = 1/4
The probability of getting a $10 discount is
P{Ace of hearts on 2nd ∩ Ace of diamonds on 1st}
= P{Ace of hearts on 2nd ∣ Ace of diamonds on 1st} × P{Ace of diamonds on 1st}
= 1/3 × 1/4 = 1/12
The easiest way of getting the probability of no discount is by using complements. The probability of getting a $0 discount is
1 − (1/4 + 1/12) = 1 − 4/12 = 1 − 1/3 = 2/3
The other way of getting the probability of no discount is to add the probabilities of the two mutually exclusive ways of getting no discount:
P{(Black ace on 1st) ∪ (Ace of diamonds on 1st ∩ Black ace on 2nd)}
= P{Black ace on 1st} + P{(Ace of diamonds on 1st) ∩ (Black ace on 2nd)}
Now, P{Black ace on 1st} = 1/2, and
P{(Ace of diamonds on 1st) ∩ (Black ace on 2nd)}
= P{Black ace on 2nd ∣ Ace of diamonds on 1st} × P{Ace of diamonds on 1st}
= 2/3 × 1/4 = 1/6
Thus,
P{(Black ace on 1st) ∪ (Ace of diamonds on 1st ∩ Black ace on 2nd)} = 1/2 + 1/6 = 2/3

(b) What is the expected discount? [4 points]
The expected discount for couples is
(20 × 1/4) + (10 × 1/12) + (0 × 2/3) = 70/12 ≈ $5.83

(c) What is the standard deviation of the discounts? [6 points]
The variance is
((20 − 5.83)² × 1/4) + ((10 − 5.83)² × 1/12) + ((0 − 5.83)² × 2/3) ≈ 74.306.
The standard deviation is √74.306 ≈ $8.62.

Question 5: Eighty percent of all the apples picked on Andy’s Apple Acres are satisfactory, but the other 20% are not suitable for market. Andy inspects each apple his pickers pick. Even though he has been in the apple business for 40 years, Andy is prone to make mistakes.
He judges as suitable for market 5% of all unsatisfactory apples and judges as unsuitable for market 1% of all satisfactory apples.

Let A = satisfactory, B = judged satisfactory/marketed.
Information given: P(A) = 0.80; P(A^c) = 0.20; P(B∣A^c) = 0.05; P(B^c∣A) = 0.01, so P(B∣A) = 1 − 0.01 = 0.99.

(a) What percentage of all apples picked on Andy’s Apple Acres are both satisfactory and judged by Andy to be satisfactory for market? [5 points]
P(A ∩ B) = P(B∣A) P(A) = 0.99 × 0.80 = 0.792
About 79% of all apples picked on Andy’s Apple Acres are both satisfactory and judged by Andy to be satisfactory for market.

(b) What percentage of all apples picked are marketed? [5 points]
P(B) = P(B∣A) P(A) + P(B∣A^c) P(A^c) = 0.99 × 0.80 + 0.05 × 0.20 = 0.802
About 80% of all apples picked are marketed.

(c) What percentage of all apples that are marketed are actually satisfactory? [5 points]
P(A∣B) = P(A ∩ B) / P(B) = 0.792 / 0.802 ≈ 0.988
About 99% of all apples that are marketed are actually satisfactory.
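
As a check on the arithmetic in Question 5, the three answers can be recomputed with exact fractions. This is only a minimal sketch in Python; the variable names are illustrative assumptions and are not part of the exam.

```python
from fractions import Fraction

# Quantities stated in Question 5 (variable names are illustrative).
p_a = Fraction(80, 100)              # P(A): apple is satisfactory
p_b_given_not_a = Fraction(5, 100)   # P(B | A^c): unsatisfactory apple judged suitable
p_not_b_given_a = Fraction(1, 100)   # P(B^c | A): satisfactory apple judged unsuitable

# Complement rule: P(B | A) = 1 - P(B^c | A)
p_b_given_a = 1 - p_not_b_given_a

# (a) Multiplication rule: P(A and B) = P(B | A) P(A)
p_a_and_b = p_b_given_a * p_a

# (b) Law of total probability: P(B) = P(B | A) P(A) + P(B | A^c) P(A^c)
p_b = p_a_and_b + p_b_given_not_a * (1 - p_a)

# (c) Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

print(float(p_a_and_b))              # part (a)
print(float(p_b))                    # part (b)
print(round(float(p_a_given_b), 3))  # part (c)
```

Exact fractions are used so that rounding enters only when the results are printed.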