Chapter 2

Exercise 1
> a=c(21,36,42,24,25,36,35,49,32)
> mean(a)
[1] 33.33333
> tmean(a)
[1] 32.85714
> median(a)
[1] 35

Exercise 2
> a=c(21,36,42,24,25,36,35,200,32)
> mean(a)
[1] 50.11111
> tmean(a)
[1] 32.85714
> median(a)
[1] 35
The sample mean is not resistant: changing a single value (49 to 200) inflates it from 33.33 to 50.11, while the 20% trimmed mean and the median are unchanged.

Exercise 3
The resistance of the 20% trimmed mean is 0.2, meaning that more than 20% of the observations must change to produce a large change in its value. Here n=9, so 9×0.2=1.8, which rounds down to 1. A large change in the 20% trimmed mean therefore requires changing at least 2 observations.

Exercise 4
The resistance of the median is 0.5, meaning that more than 50% of the observations must change to produce a large change in its value. Here n=9, so 9×0.5=4.5. A large change in the median therefore requires changing at least 5 observations.

Exercise 5
> a=c(21,36,42,24,25,36,35,49,32)
> tmean(a, tr=.1)
[1] 33.33333
> b=c(6,3,2,7,6,5,8,9,8,11)
> mean(b)
[1] 6.5
> tmean(b)
[1] 6.666667
> median(b)
[1] 6.5

Exercise 6
> c=c(250,220,281,247,230,209,240,160,370,274,210,
    204,243,251,190,200,130,150,177,475,221,350,
    224,163,272,236,200,171,98)
> mean(c)
[1] 229.1724
> tmean(c)
[1] 220.7895
> median(c)
[1] 221

Exercise 7
f_1 = 5, f_2 = 8, f_3 = 20, f_4 = 32, f_5 = 23, so n = 5+8+20+32+23 = 88.
$$\bar{X} = \frac{5\times 1 + 8\times 2 + 20\times 3 + 32\times 4 + 23\times 5}{88} = \frac{324}{88} \approx 3.68$$

Exercise 8
f_1 = 12, f_2 = 18, f_3 = 15, f_4 = 10, f_5 = 8, f_6 = 5, so n = 68.
$$\bar{X} = \frac{12\times 1 + 18\times 2 + 15\times 3 + 10\times 4 + 8\times 5 + 5\times 6}{68} = \frac{203}{68} \approx 2.985$$

Exercise 9
> d=c(21,36,42,24,25,36,35,49,32)
> var(d)
[1] 81
> winvar(d)
[1] 51.36111

Exercise 10
> d=c(21,36,42,24,25,36,35,102,32)
> winvar(d)
[1] 51.36111
The Winsorized variance remains the same.

Exercise 11
Yes. Winsorizing shifts the extreme values closer to the mean, which reduces the dispersion in the data, so the mean squared distance from the mean decreases accordingly.

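The tmean and winvar calls used in these solutions are not part of base R. For readers who want to check Exercises 1, 9 and 10 without them, here is a minimal base-R sketch. The helper names `winsorize` and `win_var` are hypothetical (they are not the functions used above), and the sketch assumes the usual definition in which g = floor(0.2 n) values are pulled in on each side.

```r
# Hypothetical helpers, written for illustration only.
# Winsorize: replace the g smallest values by the (g+1)th smallest
# and the g largest values by the (g+1)th largest.
winsorize <- function(x, tr = 0.2) {
  y <- sort(x)
  n <- length(x)
  g <- floor(tr * n)        # number of values pulled in on each side
  lo <- y[g + 1]
  hi <- y[n - g]
  pmin(pmax(x, lo), hi)
}

# Winsorized variance: ordinary sample variance of the Winsorized values.
win_var <- function(x, tr = 0.2) var(winsorize(x, tr))

d  <- c(21, 36, 42, 24, 25, 36, 35, 49, 32)   # Exercise 9
d2 <- c(21, 36, 42, 24, 25, 36, 35, 102, 32)  # Exercise 10

print(mean(d, trim = 0.2))  # base-R 20% trimmed mean of d
print(win_var(d))           # 20% Winsorized variance of d
print(win_var(d2))          # unchanged when 49 is replaced by 102
```

Because the largest value is replaced by the second largest before the variance is taken, moving 49 out to 102 has no effect on the result, which is the point of Exercise 10.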
Exercise 12
The variance has a sample breakdown point of 1/n, so a single observation can make its value arbitrarily large.

Exercise 13
The sample breakdown point of the 20% Winsorized variance is 0.2. With n=25, 25×0.2=5, so at least 6 observations must be changed to make the Winsorized variance arbitrarily large.

Exercise 14
> e=c(6,3,2,7,6,5,8,9,8,11)
> var(e)
[1] 7.388889
> winvar(e)
[1] 1.822222

Exercise 15
> c=c(250,220,281,247,230,209,240,160,370,274,
    210,204,243,251,190,200,130,150,177,475,221,
    350,224,163,272,236,200,171,98)
> var(c)
[1] 5584.933
> winvar(c)
[1] 1375.606

Exercise 16
> e=c(6,3,2,7,6,5,8,9,8,11)
> idealf(e)
$ql
[1] 4.833333
$qu
[1] 8.083333
IQR = 8.08 - 4.83 = 3.25

Exercise 17
> c=c(250,220,281,247,230,209,240,160,370,274,
    210,204,243,251,190,200,130,150,177,475,221,
    350,224,163,272,236,200,171,98)
> out(c)
$out.val
[1] 370 475 350 98

Exercise 18
$$\mathrm{var}(x) = \frac{1}{n}\sum_i (x_i - \bar{X})^2 f_{x_i}$$

Exercise 19
x:      0     1     2     3     4     5
f_x/n:  0.1   0.2   0.25  0.29  0.12  0.04
$$\bar{X} = 0\times 0.1 + 1\times 0.2 + 2\times 0.25 + 3\times 0.29 + 4\times 0.12 + 5\times 0.04 = 2.25$$
$$s^2 = 2.25^2\times 0.1 + 1.25^2\times 0.2 + 0.25^2\times 0.25 + 0.75^2\times 0.29 + 1.75^2\times 0.12 + 2.75^2\times 0.04 = 1.6675$$

Exercise 20
x:      0     1     2     3     4
f_x/n:  0.2   0.4   0.2   0.15  0.05
μ = 1.45
$$s^2 = 1.45^2\times 0.2 + 0.45^2\times 0.4 + 0.55^2\times 0.2 + 1.55^2\times 0.15 + 2.55^2\times 0.05 = 1.2475$$

Exercise 21
> f=c(90,76,90,64,86,51,72,90,95,78)
> out(f)
$out.val
[1] 51
[Figure: plot of f; axis ticks at 50, 60, 70, 80, 90]

Exercise 22
With 3 outliers (none detected):
> print(boxplot(g,plot=F)$out)
numeric(0)
With 2 outliers (both detected; one is masked on the graph):
> print(boxplot(g,plot=F)$out)
[1] 20 20
[Figure: two boxplots of g; axis ticks at 5, 10, 15, 20]

Exercise 23
The boxplot has a sample breakdown point of 0.25. The number of outliers it detects does not exceed 25% of the sample. For example, when we had 3 outliers with n=10 (30% of the sample), none of the outliers were detected.

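The out() results in Exercise 17 can be cross-checked with a short base-R sketch of the MAD-median rule. This is an illustration under stated assumptions, not the source of out(): the helper name `mad_median_out` and the variable `x17` are hypothetical, and the cutoff 2.24 is the value commonly quoted for this rule and is assumed here.

```r
# Hypothetical helper: flag x as an outlier when
# |x - median| / MADN > crit, where MADN = MAD / 0.6745.
# Base R's mad() already applies this rescaling (constant 1.4826).
mad_median_out <- function(x, crit = 2.24) {
  M <- median(x)
  madn <- mad(x)
  x[abs(x - M) / madn > crit]
}

# Data from Exercise 17 (renamed to avoid masking the c() function)
x17 <- c(250, 220, 281, 247, 230, 209, 240, 160, 370, 274,
         210, 204, 243, 251, 190, 200, 130, 150, 177, 475, 221,
         350, 224, 163, 272, 236, 200, 171, 98)

print(median(x17))          # 221
print(mad_median_out(x17))  # 370 475 350 98
```

For these data the median is 221 and the unscaled MAD is 30, so any value more than about 99.6 away from 221 is flagged. The sketch does not guard against MAD = 0, which can occur when more than half of the values are tied.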
Exercise 24
> m=c(0,0.12,.16,.19,.33,.36,.38,.46,.47,.60,.61,.61,.66,.67,.68,.69,.75,.77,.81,.81,.82,
    .87,.87,.87,.91,.96,.97,.98,.98,1.02,1.06,1.08,1.08,1.11,1.12,1.12,1.13,1.2,1.2,1.32,
    1.33,1.35,1.38,1.38,1.41,1.44,1.46,1.51,1.58,1.62,1.66,1.68,1.68,1.70,1.78,1.82,1.89,
    1.93,1.94,2.05,2.09,2.16,2.25,2.76,3.05)
> print(boxplot(m,plot=F)$out)
[1] 3.05

Exercise 25
The upper and lower quartiles in Figure 2.2 are 125 and 50, respectively, so a value x is declared an outlier when
1. x > 125 + 1.5(125 - 50) = 237.5, or
2. x < 50 - 1.5(125 - 50) = -62.5.

Exercise 26
> out(z)
$out.val
[1] 1 2 2 2 3 3 3 6 8 9 11 11 11 12 18 32 32 41
> outbox(z)
$out.val
[1] 18 32 32 41
[Figure: Histogram of z; x-axis: z, y-axis: Frequency]
The histogram detected far fewer outliers than the other methods.
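The boxplot rule from Exercise 25 can be written as a short base-R sketch that computes the two fences from the quartiles (50 and 125, as read from Figure 2.2). The helper names `boxplot_fences` and `is_outlier` are hypothetical and used only for illustration, and the four test values are made up to show one point in each region.

```r
# Hypothetical helpers illustrating the boxplot rule:
# a value is an outlier if it lies more than k * IQR beyond a quartile.
boxplot_fences <- function(q1, q3, k = 1.5) {
  iqr <- q3 - q1
  c(lower = q1 - k * iqr, upper = q3 + k * iqr)
}

is_outlier <- function(x, fences) {
  x < fences[["lower"]] | x > fences[["upper"]]
}

fences <- boxplot_fences(50, 125)   # quartiles from Exercise 25
print(fences)                       # lower = -62.5, upper = 237.5

# Made-up values: below the lower fence, inside, inside, above the upper fence
print(is_outlier(c(-100, 0, 200, 240), fences))  # TRUE FALSE FALSE TRUE
```

Because the fences depend only on the quartiles, observations in the outer quarters of the data can move freely without changing them, which is why the rule tolerates up to about 25% outliers before it starts to miss them.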