Test 1 Applied Statistics - Tuesday April 7, 11.30-1.00 in KJ 17

Use of a calculator with statistical functions (mean, standard deviation, linear regression) is advised.

1. In a health care survey the researchers tried, among other things, to establish a relation between age and body fat (as a percentage of the body weight). Here are the observations of age and body fat for a random sample of 11 nurses.

Age (x)    23    23    24    27    39    41    49    53    53    54    60
Fat% (y)   27.9  9.5   7.8   17.8  31.4  25.9  25.2  34.7  42.0  29.1  41.1

Furthermore $\bar{x} = 40.545$, $s_x = 14.201$, $\bar{y} = 26.582$, $s_y = 11.268$ and $\sum x_i y_i = 13151.8$ are given.

a. Compute the correlation coefficient r, the regression coefficient and the regression constant (in three decimals).
In b. and c. you can use the rounded values r = 0.81 and regression line $\hat{y} = 0.64x + 0.52$.
b. Predict the fat percentage of a 50-year-old nurse.
c. Compute $r^2$ and describe its meaning in terms of the dependence of fat percentage on age.

2. X is N(20, 16)-distributed and Y is B(900, 0.6)-distributed. Compute or approximate the probabilities:
a. P(X > 27)
b. P(Y ≤ 570)
c. P(X > 27 | X > 20)

3. Radon detectors are sold to house owners who want to monitor the quantity of radon in their homes. But how accurate are these simple devices? Researchers placed 12 of these detectors in a room and exposed them to a constant level of radon. These were the 12 observed levels (in picocurie per litre):

91.9  103.8  97.8  99.6  111.4  96.6  122.3  119.3  105.4  104.8  95.0  101.7

a. Give the 5-number summary of these observations.
b. Compute the sample mean and the sample standard deviation.
c. State the necessary assumptions for computing a confidence interval for the expected level of radon in the room.
d. Compute a confidence interval for the expected level of radon in the room (according to the detectors), using a confidence level of 95%.
e. Compute a 95% confidence interval for the standard deviation of the radon measurements.
f.
The researchers actually used a controlled level of 98 picocurie radon per litre: do these observations show what the researchers suspected in advance, an overestimation of the actual level of radon? Test this conjecture using a level of significance α = 1% and report steps 3-8 of the testing procedure (below).

1. The research question (in words)
2. The statistical assumptions (model)
3. The hypotheses and level of significance
4. The test statistic and its distribution
5. The observed value (of the statistic)
6. The rejection region (for H0) or the p-value
7. The statistical conclusion
8. The conclusion in words (answer to the question)

Formulas:
Descriptive statistics: [formulas missing]
Regression: [formulas missing]
Probability: Bayes' rule [formula missing] and X ~ B(n, p) => X is approximately N(np, np(1-p))-distributed (for large n)
Statistics: [formula missing] and MSE = [formula missing]
Confidence intervals: $\bar{X} \pm c \cdot S/\sqrt{n}$ with $P(T_{n-1} \ge c) = \alpha/2$, and $\left( \frac{(n-1)S^2}{c_2}, \frac{(n-1)S^2}{c_1} \right)$ for $\sigma^2$
Tests:
If H0: µ = µ0 is true, then $T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$ ~ t(n-1)
If H0: $\sigma^2 = \sigma_0^2$ is true, then $\frac{(n-1)S^2}{\sigma_0^2}$ ~ $\chi^2(n-1)$
If H0: p = p0 is true, then $Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}$ is approximately N(0, 1)-distributed

Marks:
Exercise   1           2           3                       Total
Part       a  b  c     a  b  c     a  b  c  d  e  f
Points     3  2  2     2  3  2     3  2  2  3  3  6        33

Mark = #Points/3.3, on a scale from 0.0 to 10.0. This mark counts for 40% of the final result of this course.

Solutions:

Exercise 1
a. r = 0.810, regression coefficient = 0.643 and regression constant = 0.520 (0.512 if you used the rounded numbers), so the regression line is $\hat{y} = 0.643x + 0.520$.
b. x = 50 => $\hat{y} = 0.64 \cdot 50 + 0.52 = 32.52$
c. $r^2 = 0.81^2 = 0.6561$ = 65.61%. So 65.6% of the variation in the fat percentages of the nurses can be explained by the (linear) relation between fat percentage and age.

Exercise 2
a. P(X > 27) = P(Z > (27 - 20)/4) = 1 - Φ(1.75) = 1 - 0.9599 = 0.0401 (4.01%)
b. Y is approximately N(900·0.6, 900·0.6·0.4) = N(540, 216)-distributed. Computation with continuity correction (without it the answer is 97.93%):
P(Y ≤ 570) = P(Y ≤ 570.5) ≈ $\Phi\left( \frac{570.5 - 540}{\sqrt{216}} \right)$ ≈ Φ(2.08) ≈ 0.981 (98.1%)
c. P(X > 27 | X > 20) = P(X > 27) / P(X > 20) = 0.0401 / 0.5 = 0.0802 (8.02%)

Exercise 3
Rank          1     2     3     4     5     6      7      8      9      10     11     12
Observation   91.9  95.0  96.6  97.8  99.6  101.7  103.8  104.8  105.4  111.4  119.3  122.3

a. The 5-number summary: minimum = 91.9, Q1 = 97.2 (mean of the observations with rank 3 and 4), median = mean of the observations with rank 6 and 7 = 102.75, Q3 = 108.4 (mean of the observations with rank 9 and 10) and maximum = 122.3
b. $\bar{x} = 104.133$ and s = 9.397
c.
The 12 observations can be modelled as independent and normally distributed radon measurements $X_1, \ldots, X_{12}$, where the normal distribution has unknown parameters µ and $\sigma^2$.
d. 95%-CI(µ) = $\bar{x} \pm c \cdot \frac{s}{\sqrt{n}}$ = (98.16, 110.10), where n = 12, $\bar{x} = 104.13$, s = 9.397 and c = 2.201 such that $P(T_{12-1} \ge c) = \alpha/2 = 0.025$
e. 95%-CI(σ) = $\left( \sqrt{\frac{(n-1)s^2}{c_2}}, \sqrt{\frac{(n-1)s^2}{c_1}} \right)$ ≈ (6.66, 15.96), where n = 12, s = 9.397, and $c_1 = 3.816$ and $c_2 = 21.920$ are such that $P(\chi^2_{11} \le c_1) = P(\chi^2_{11} \ge c_2) = 0.025$
f.
3. We test H0: µ = 98 (no systematic deviation) versus H1: µ > 98 (measurements are systematically higher) with α = 0.01
4. Test statistic: $T = \frac{\bar{X} - 98}{S/\sqrt{12}}$ ~ t(12-1) if H0: µ = 98 is true
5. Observed value of T: $\bar{x} = 104.133$ and s = 9.397, so $t = \frac{104.133 - 98}{9.397/\sqrt{12}} \approx 2.261$
6. It is a right-sided test: reject H0 if T ≥ c, where P(T11 ≥ c | H0) = α = 0.01, so c = 2.718
6'. Or, using the p-value: reject H0 if P(T ≥ 2.261 | H0) ≤ α. The p-value P(T11 ≥ 2.261 | H0) is between 1% and 2.5%
7. t ≈ 2.261 < c = 2.718, so H0 is not rejected
7'. Or: p-value > α, so H0 is not rejected
8. The test does not show convincingly (at the 1% level) that the radon detectors overestimate the radon level.
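
The numerical answers above can be re-checked with a short script. What follows is only a sketch, assuming Python 3.8 or later and its standard library. The function and variable names (`regression`, `five_number_summary`, `t_interval`, `sigma_interval`, `t_statistic`) are made up for this illustration. The critical values are not computed but typed in from tables: 2.201 and 2.718 for t(11) as used in the solutions, and 3.816 and 21.920 as assumed chi-square(11) table values. The quartiles are taken as medians of the lower and upper half of the sorted data, which is the convention the 5-number summary above appears to use.

```python
from math import sqrt
from statistics import NormalDist, mean, median, stdev

# Data copied from the questions of Exercise 1 and Exercise 3.
age = [23, 23, 24, 27, 39, 41, 49, 53, 53, 54, 60]
fat = [27.9, 9.5, 7.8, 17.8, 31.4, 25.9, 25.2, 34.7, 42.0, 29.1, 41.1]
radon = [91.9, 103.8, 97.8, 99.6, 111.4, 96.6,
         122.3, 119.3, 105.4, 104.8, 95.0, 101.7]


def regression(x, y):
    """Return (r, slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(x)
    sxy = sum(a * b for a, b in zip(x, y)) - n * mean(x) * mean(y)
    r = sxy / ((n - 1) * stdev(x) * stdev(y))
    slope = r * stdev(y) / stdev(x)
    intercept = mean(y) - slope * mean(x)
    return r, slope, intercept


def five_number_summary(data):
    """Min, Q1, median, Q3, max; quartiles are medians of the lower/upper half."""
    s = sorted(data)
    half = len(s) // 2
    return s[0], median(s[:half]), median(s), median(s[-half:]), s[-1]


def t_interval(data, c):
    """Confidence interval for the mean; c is a t critical value from a table."""
    margin = c * stdev(data) / sqrt(len(data))
    return mean(data) - margin, mean(data) + margin


def sigma_interval(data, c1, c2):
    """Confidence interval for sigma; c1 < c2 are chi-square table values."""
    ss = (len(data) - 1) * stdev(data) ** 2
    return sqrt(ss / c2), sqrt(ss / c1)


def t_statistic(data, mu0):
    """Observed value of the one-sample t statistic under H0: mu = mu0."""
    return (mean(data) - mu0) / (stdev(data) / sqrt(len(data)))


# Exercise 1
r, slope, intercept = regression(age, fat)

# Exercise 2: N(20, 16) has sigma = 4; B(900, 0.6) is approximated by N(540, 216).
X = NormalDist(20, 4)
Y = NormalDist(900 * 0.6, sqrt(900 * 0.6 * 0.4))
p_2a = 1 - X.cdf(27)
p_2b = Y.cdf(570.5)              # continuity correction
p_2c = p_2a / (1 - X.cdf(20))    # conditional probability

# Exercise 3 (critical values are table look-ups, not computed here)
summary = five_number_summary(radon)
ci_mu = t_interval(radon, c=2.201)
ci_sigma = sigma_interval(radon, c1=3.816, c2=21.920)
t_obs = t_statistic(radon, mu0=98)
reject_h0 = t_obs >= 2.718       # right-sided test at alpha = 0.01

print("Exercise 1:", r, slope, intercept)
print("Exercise 2:", p_2a, p_2b, p_2c)
print("Exercise 3:", summary, ci_mu, ci_sigma, t_obs, reject_h0)
```

The critical values are hard-coded on purpose so that the script needs no third-party packages. With a statistics library that offers t and chi-square quantiles they could be computed instead of looked up.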