Solution to Assignment 1, winter 2015
Introduction to Statistics/MAT2375

1. Textbook problems.

5.4-22: (a) We have $E(\exp(t(X_1 + X_2))) = (1 - 2t)^{-r/2}$ and $E(\exp(tX_1)) = (1 - 2t)^{-r_1/2}$ for $t < 1/2$. Since $X_1$ and $X_2$ are independent,

$$E(\exp(t(X_1 + X_2))) = E(\exp(tX_1))\,E(\exp(tX_2)).$$

Therefore

$$(1 - 2t)^{-r/2} = (1 - 2t)^{-r_1/2}\,E(\exp(tX_2)),$$

which gives

$$E(\exp(tX_2)) = (1 - 2t)^{-(r - r_1)/2}.$$

(b) This is the m.g.f. of a chi-square distribution, so $X_2 \sim \chi^2(r - r_1)$.

5.5-16: (a) Since d.f. = 8, from the t table we have $t_{0.025} = 2.306$.
(b) $P(\bar{X} - 2.306\,S/\sqrt{8} < \mu < \bar{X} + 2.306\,S/\sqrt{8}) = 0.95$.

5.6-14: We need $a$ such that

$$P\left(\sum_{i=1}^{20} X_i > a\right) = 0.2.$$

By the central limit theorem, $\sum_{i=1}^{20} X_i$ is approximately $N(200, 80)$. Therefore, approximately,

$$P\left(\frac{\sum_{i=1}^{20} X_i - 200}{\sqrt{80}} > \frac{a - 200}{\sqrt{80}}\right) = 0.2.$$

This gives

$$\frac{a - 200}{\sqrt{80}} = 0.842,$$

where 0.842 is the upper 20% point of the standard normal distribution. Solving for $a$ gives $a = 207.53$.

6.2-10:

> x
 [1]  9.5 10.7  8.3  9.8  9.1  9.3  8.1  8.7  9.5 12.6 10.5  8.9 11.4 12.0 12.4
[16]  9.9 10.9 12.3  9.2  9.4  9.6 11.9  9.3  9.4  8.2 10.4  9.3  8.7  9.8  9.1
[31]  2.9  9.8  5.7  8.2 10.5  9.4  8.8  9.7  8.8 10.3  8.6 10.2  9.4 14.8  9.9
[46]  9.3  8.2  9.9 11.6  8.1  5.0  9.9  6.3  6.5 10.2  8.8  8.0  8.7  8.9  6.8
[61]  6.6  7.3 16.7 11.0
> hist(x)
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  2.900   8.675   9.400   9.422  10.220  16.700
> mean(x)
[1] 9.421875
> var(x)
[1] 4.32872
> sd(x)
[1] 2.080558
> mean(x)-sd(x)
[1] 7.341317
> sum(x<mean(x)+sd(x)&x>mean(x)-sd(x))/length(x)
[1] 0.75
> sum(x<mean(x)+2*sd(x)&x>mean(x)-2*sd(x))/length(x)
[1] 0.9375
> stem(x)

  The decimal point is at the |

   2 | 9
   4 | 07
   6 | 35683
   8 | 01122236777888991123333444455678889999
  10 | 223455790469
  12 | 0346
  14 | 8
  16 | 7

See the histogram of x below. In a normal population, the proportions of observations within one and two standard deviations of the mean are about 68% and 95%. For the data in this question they are 75% and 93.75%.

6.3-6: The p.d.f. of $W_r$ is

$$g(w) = \frac{n!}{(r-1)!\,(n-r)!}\,(F(w))^{r-1} f(w)\,(1 - F(w))^{n-r} = \frac{n!}{(r-1)!\,(n-r)!}\,w^{r-1}(1 - w)^{n-r}, \quad 0 < w < 1,$$

where the second equality uses $F(w) = w$ and $f(w) = 1$ on $0 < w < 1$. This shows that $W_r \sim \beta(r, n - r + 1)$. Therefore

$$E(W_r) = \frac{r}{n+1}$$

(see the cover pages of your textbook for details). For part (a), take $r = 1$ and $r = n$ as special cases.

Question 2.
[Figure: Histogram of x for problem 6.2-10. Horizontal axis: x. Vertical axis: Frequency.]

> x=rbinom(20000,1,0.46)
> X=matrix(x,ncol=200)
> m=apply(X, 1,mean)
> m
  [1] 0.545 0.475 0.505 0.425 0.415 0.455 0.435 0.430 0.460 0.460 0.485 0.450
 [13] 0.490 0.475 0.455 0.490 0.455 0.505 0.415 0.485 0.400 0.490 0.490 0.505
 [25] 0.460 0.480 0.405 0.435 0.515 0.465 0.460 0.430 0.480 0.490 0.465 0.455
 [37] 0.500 0.445 0.465 0.395 0.400 0.455 0.480 0.415 0.515 0.440 0.470 0.470
 [49] 0.460 0.505 0.475 0.490 0.445 0.465 0.435 0.425 0.450 0.460 0.470 0.445
 [61] 0.445 0.480 0.485 0.480 0.475 0.495 0.385 0.470 0.480 0.470 0.485 0.440
 [73] 0.460 0.465 0.490 0.450 0.465 0.485 0.425 0.525 0.475 0.440 0.465 0.445
 [85] 0.445 0.505 0.495 0.515 0.415 0.445 0.420 0.390 0.485 0.465 0.435 0.410
 [97] 0.455 0.470 0.455 0.485
> hist(m)
> plot(qqnorm(m))

As we can see, the histogram is approximately symmetric and the Q-Q plot is almost linear, which supports normality of the sample means. This is what the central limit theorem predicts.

[Figure: Histogram of m (horizontal axis: m, vertical axis: Frequency) and Normal Q-Q Plot of m (horizontal axis: Theoretical Quantiles, vertical axis: Sample Quantiles).]

Question 3.

As the Q-Q plot and histogram show, the observations do not come from a normal distribution.

> r1=rnorm(500,10,5)
> r2=rnorm(500,20,5)
> r=r1*r2/(r1+r2)
> par(mfrow=c(2,1))
> sum(r<8&r>6)/500
[1] 0.348
> hist(r)
> plot(qqnorm(r))

[Figure: Histogram of r (horizontal axis: r, vertical axis: Frequency) and Normal Q-Q Plot of r (horizontal axis: Theoretical Quantiles, vertical axis: Sample Quantiles).]
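The visual conclusion in Question 3 can be backed up with a couple of numerical summaries. The following is a minimal sketch, not part of the original solution: the seed value, the variable names `prop_6_8`, `sw`, and `skew`, and the use of the Shapiro-Wilk test and a hand-computed sample skewness are all assumptions added for illustration. Because the original run was unseeded, the numbers printed here will not match the 0.348 reported above exactly.

```r
# Reproducible variant of the Question 3 simulation (illustrative sketch).
set.seed(2375)  # arbitrary seed, an assumption; the original run was unseeded

n  <- 500
r1 <- rnorm(n, mean = 10, sd = 5)   # same parameters as rnorm(500,10,5)
r2 <- rnorm(n, mean = 20, sd = 5)   # same parameters as rnorm(500,20,5)
r  <- r1 * r2 / (r1 + r2)

# Proportion of simulated values strictly between 6 and 8
prop_6_8 <- mean(r > 6 & r < 8)

# Shapiro-Wilk test: a small p-value is evidence against normality
sw <- shapiro.test(r)

# Sample skewness computed by hand (base R has no skewness function);
# a value far from 0 indicates an asymmetric distribution
skew <- mean((r - mean(r))^3) / sd(r)^3

print(prop_6_8)
print(sw$p.value)
print(skew)
```

Running `hist(r)` and `qqnorm(r); qqline(r)` after this block gives the same kind of plots as in the solution, with a reference line added to the Q-Q plot to make departures from linearity easier to see.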