Bootstrap Estimation Suppose a simple random sample (with replacement) X1; X2; : : : ; Xn is available from some population with distribution function (cdf) F (x): A statistic tn is computed from the observed data: Sample mean: Objective: Make inferences about some feature of the population mean or average value P tn = n1 nj=1 Xj Standard deviation: v u u 1 tn = u t n Correlation: " # X 1j Xj = X2j variance ratio of variance components median tn = 98th percentile n X 1 j=1(Xj X )2 j = 1; 2; : : : ; n X 2: ) 1:)(X2j j =1(X1j Xq Pn 2 2 j =1(X1j X 1:) j =1(X2j X 2:) qP n Pn correlation 1111 Some feature of the sample, say tn, is used to estimate a corresponding feature of the population. What can you say about the distribution of tn, with respect to all possible samples of size n from the population? Expectation of tn Standard deviation 1112 Simulation: (The population c.d.f. is known) For a univariate normal distribution with mean and variance 2, the cdf is F (x) = P rfX xg Z = x1 p21 e 12 ( w )2dw Simulate B samples of size n and compute the value of tn for each sample: Distribution function tn;1; tn;2; : : : ; tn;B How can a condence interval be constructed? 1113 1114 Approximate EF (tn) with the simulated mean t= 1 B X Approximate the c.d.f. for tn with t B k=1 n;k Fbn(t) = Approximate V arF (tn) with B 1 B X v u u u t largest 1 k=1 B 1 B Order the B values of tn from smallest to (tn;k t)2 Approximate the standard deviation of tn with number of samples with tn;k < t B X tn(1) tn(2) : : : tn(B ) and approximate percentiles of the distribution of tn. 2 1 k=1(tn;k t) 1116 1115 Basic idea: (1) Approximate the population c.d.f. F (x) with the empirical c.d.f. F^n(x) for the observed sample What if F (X), the population c.d.f., is unknown? X1; X2; : : : ; Xn You cannot use a random number generator to simulate samples of size n, and values of tn, from the population. Use a bootstrap (or resampling) method? 1117 Assign probability n1 to each observation in the sample. Then b Fbn(x) = n where b is the number of observations in the sample with Xi < x for i = 1; 2; : : : ; n 1118 Approximate the act of simulating B samples of size n from a population with c.d.f. F (x) by simulating B samples of size n from a population with c.d.f. Fbn(x) { The \approximating" population is the original sample. { Sample n observations from the original sample using simple random sampling with replacement. This will be called a boostrap sample. { repeat this B times to obtain B bootstrap samples of size n. Sample 1: tn;1 Sample 2: tn;2 .. Sample B: tn;B Evaluate bootstrap estimates of features of the sampling distribution for tn, when the c.d.f. is F^n(x). B 1X E d(tn) = t B b=1 n;b Fbn V ard Fbn (tn) = B 1 B h X 1 b=1 tn;b Ed F n (tn) 1120 1119 This resampling procedure is called a The bootstrap is a large sample method: nonparametric bootstrap. If used properly it provides consistent large sample results: As n ! 1 (and B ! 1) Ed Fbn (tn) = B1 B X b=1 tn;b Vd arFbn (tn) = B X 1 b=1 ! V arF (tn) B Large number of bootstrap samples: As B ! 1 Ed F^n (tn) Vd arF^n (tn) ! EF (tn) 1 2 4tn;b 1 B X i2 32 t 5 B b=1 n;b 1121 B = B1 X tn;b ! EF^n(tn) b=1 = 1 B X 1 b=1(tn;b ! V arF^n(tn) B 2 Ed F^n (tn)) What is a good value for B? standard deviation: B= 200 condence interval: B= 1000 more demanding applications: B 5000 1122 Consistency: Original sample size must become large As n ! 1; F^n(x) ! F (x) for any x: Then, EF^n (tn) ! EF (tn) V arF^n (tn) ! V arF (tn) .. .. For small values of n; F^n(x) could deviate substantially from F (x). The bootstrap can fail! Example 12.1: Average values for GPA and LSAT scores for students admitted to n=15 Law Schools in 1973. School LSAT GPA 1 576 3.39 2 635 3.30 3 558 2.81 4 579 3.03 5 666 3.44 6 580 3.07 7 555 3.00 8 661 3.43 9 651 3.36 10 605 3.13 11 653 3.12 12 575 2.74 13 545 2.76 14 572 2.88 15 594 2.96 1123 1124 These schools were randomly selected from Law School Data 3.4 a larger population of law schools. 3.2 We want to make inferences about the cor- 3.0 GPA relation between GPA and LSAT scores for the population of law schools The sample correlation (n=15) is r = 0:7764 tn 2.8 560 580 600 620 640 660 LSAT score 1125 1126 Bootstrap samples: Take samples of n=15 schools, using simple random sampling with replacement. Sample 2: School 12 11 12 8 9 5 9 1 5 5 1 2 10 14 Sample 1: School 7 10 14 8 5 4 1 2 13 3 15 4 9 3 3 LSAT 555 605 572 661 666 578 576 635 545 558 594 573 651 558 558 GPA 3.00 3.13 2.88 3.43 3.44 3.03 3.39 3.30 2.76 2.81 2.96 3.03 3.36 2.81 2.81 LSAT 575 653 575 661 651 666 651 576 666 666 576 635 605 572 GPA 2.74 3.12 2.74 3.43 3.36 3.44 3.36 3.39 3.44 3.44 3.39 3.30 3.13 2.88 tn;2 = 0:6673 = r15;2 Repeat this B = 5000 times to obtain tn;1; tn;2; : : : ; tn;5000 tn;1 = 0:8586 = r15;1 1127 1128 Estimated correlation (from the original sample of n = 15 law schools) r = 0:7764 100 150 200 250 300 5000 Bootstrap Correlations 0 50 Bootstrap standard error (from B = 5000 bootstrap samples v 2 32 u u B B X X 1 1 u 4t Sr = t t 5 B 1 b=1 n;b B j =1 n;j 0.2 0.4 0.6 0.8 v u u u t 2 3 2 5000 5000 X X 1 1 4 5 r15;b = 4999 5000 j=1 r15;j b=1 1.0 = 0:1341 1129 1130 Number of Bootstrap Bootstrap estimate of samples Standard error 2.6 2.8 3.0 3.2 3.4 0:1108 0:0985 0:1334 0:1306 0:1328 0:1299 0:1366 0:1341 GPA B = 25 B = 50 B = 100 B = 250 B = 500 B = 1000 B = 2500 B = 5000 Law School Data 500 \Very seldom are more than B = 200 replications needed for estimating a standard error." { Efron & Tibshirani (1993) (page 52) 550 600 School 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 LSAT 622 542 579 653 606 576 620 615 553 607 558 596 635 581 661 547 599 646 622 611 546 GPA Type 3.23 2 2.83 2 3.24 2 3.12 1 3.09 2 3.39 1 3.10 2 3.40 2 2.97 2 2.91 2 3.11 2 3.24 2 3.30 1 3.22 2 3.43 1 2.91 2 3.23 2 3.47 2 3.15 2 3.33 2 2.99 2 1132 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 1133 700 LSAT score 1131 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 650 School 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 LSAT 614 628 575 662 627 608 632 587 581 605 704 477 591 578 572 615 606 603 535 595 575 GPA Type 3.19 2 3.03 2 3.01 2 3.39 2 3.41 2 3.04 2 3.29 2 3.16 2 3.17 2 3.13 1 3.36 2 2.57 2 3.02 2 3.03 1 2.88 1 3.37 2 3.20 2 3.23 2 2.98 2 3.11 2 2.92 2 1134 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 School 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 LSAT 573 644 545 645 651 562 609 555 586 580 594 594 560 641 512 631 597 621 617 637 572 GPA Type 2.85 2 3.38 2 2.76 1 3.27 2 3.36 1 3.19 2 3.17 2 3.00 1 3.11 2 3.07 1 2.96 1 3.05 2 2.93 2 3.28 2 3.01 2 3.21 2 3.32 2 3.24 2 3.03 2 3.33 2 3.08 2 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 School 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 LSAT 610 562 635 614 546 598 666 570 570 605 565 686 608 595 590 558 611 564 575 GPA Type 3.13 2 3.01 2 3.30 2 3.15 2 2.82 2 3.20 2 3.44 1 3.01 2 2.92 2 3.45 2 3.15 2 3.50 2 3.16 2 3.19 2 3.15 2 2.81 1 3.16 2 3.02 2 2.74 1 1135 The correlation coeÆcient for the population of 82 Law Schools in 1973 is = 0:7600 The exact distribution of estimated correlation coeÆcients for random samples of size n=15 from this population involves More than 3.610110 possible samples of size n=15 Results from 100,000 samples of size n=15. Percentiles .95 0.9128 .75 0.8404 .50 0.7699 .25 0.6777 .05 0.5010 min = -0.3647 max = 0.9856 mean = 0.7464 std. error = 0.1302 1137 1136 Bias: From a sample of size n, tn is used to estimate a population parameter . BiasF (tn) = EF (tn) " " average across true all possible parameter samples of size n value from the population Law School example: BiasF (r15) = EF (r15) 0:7600 = :7464 :7600 = 0:0136 1138 Bootstrap estimate of bias: \true value" if you take the original sample as the population # BiasFb (tn) = EFb (tn) tn " approximate this with the average of results from B bootstrap samples B1 PBb=1 tn;b Improved bootstrap bias estimation: Efron & Tibshirani (1993), Section 10.4 Bias corrected estimates: d (tn) ten = tn Bias b 0 B = 2tn @ 1 X t 1 A B b=1 n;b Law School example: (B = 5000 bootstrap samples) X 1 5000 d (r15) = Bias b F 5000 b=1 r15;b r15 = :7692 :7764 = :0072 Law School example: d (r15) r~15 = r15 Bias b = :7764 ( :0072) = 0:7836 " here we moved farther away from = :7600: 1139 1140 \Empirical Percentile Bootstrap Condence Intervals Bias correction can be dangerous in practice: ten may have a substantially larger variance than tn Construct an approximate (1 condence interval for . ) 100% tn is an estimator for Compute B bootstrap samples to obtain MSEF (ten) = [BiasF (ten)]2 + V arF (ten) tn;1; : : : ; tn;B is often larger than MSEF (tn) = [BiasF (tn)]2 + V arF (tn) Order the bootstrapped values from small- est to largest tn;(1) tn;(2) : : : tn;(B ) 1141 1142 Compute upper and lower 2 100th percentiles Compute kL = (B + 1) 2 = largest integer (B + 1) 2 kU = B + 1 kL Then, an approximate (1 ) 100% condence interval for is [tn;(kL); tn;(kU )] Law School example: (B = 5000 bootstrap samples) Construct an approximate 90% condence interval for = population correlation = :10 kL = [(5001)(:05)] = [250:05] = 250 kU = 5001 250 = 4751 An approximate large sample 90% condence interval is [r15;(250); r15;(4751)] = [0:5207; 0:9487] 1144 1143 Law School example: Fisher Z -transformation Zn = m(rn) 100 150 200 250 300 5000 Bootstrap Correlations = 12 log 11 + rrnn ! ! _ N 12 log 11 + ; n 1 3 An approximate (1 ) 100% condence 0 50 interval for 21 log 1+ 1 lower limit: Zn Z=2 pn1 3 = ZL upper limit: Zn + Z=2 pn1 3 = ZU 0.2 0.4 0.6 0.8 1.0 Transform back to the original scale " 1145 e2ZL 1 e2Zu 1 ; e2ZL + 1 e2Zu + 1 # 1146 For n = 15 and r15 = :7764 we have 1 1 + :7764 = 1:03624 Z15 = log 2 1 :7764 and 1 ZL = Z15 Z:05 p 15 3 = :561372 Percentile Bootstrap Condence Intervals = Z15 + Z:05 p151 3 = 1:511113 and an approximate 90% condence interval for the correlation is (0:509; 0:907) The bootstrap percentile interval would approximate this interval if the original sample was taken from a bivariate normal approximation. ZU Suppose there is a transformation = m() such that b = m(tn) N (; !2) for some standard deviation !. Then, an approximate condence interval for is [m 1(b Z(=2)!); m 1(b + Z=2!)] The bootstrap percentile interval is a con- sistent approximation. (You do not have to identify the m() transformation.) 1148 1147 Bias-corrected and accelerated (BCa) bootstrap percentile condence intervals The bootstrap approximation becomes more accurate for larger samples simulate B bootstrap samples and order the resulting estimates tn;(1) tn;(2) tn;(B ) For smaller samples, the coverage proba- bility of the bootstrap percentile interval tends to be smaller than the nominal level (1 ) 100. A percentile interval is entirely inside the parameter space. A percentile condence interval for a correlation lies inside the interval [ 1; 1]. The BCa interval of intended coverage 1 is given by [tn;([1(B+1)]); tn;([2(B+1)])] where 1 2 1149 = = 0 @Zb0 0 @Zb0 1 +1 Zb0 Z=2 A ab(Zb0 Z=2) +1 Zb0 + Z=2 A ab(Zb0 + Z=2) 1 1150 where () is the c.d.f. for the standard normal distribution Z=2 is an \upper" percentile of the standard normal distribution,e.g. Z:05 = 1:645 and (Z=2) = 1 2 proportion of t n;b Zb0 = 1 values smaller than tn is roughly a measure of \median bias" of tn in \normal units." when exactly half of the bootstrap samples have tn;b values less than tn, then Zb0 = 0. 1151 ab = j =1(tn;() Pn 6 hP n j =1(tn;() tn; j )3 i3=2 tn; j )2 is the estimated acceleration, where tn; j is the value of tn when the j -th case is removed from the sample and tn;() = n1 Pnj=1(tn; j ) 1152 BCa intervals are second order accurate P rf < lower end of BCa intervalg = 2 + Clower n P rf > upper end of BCa intervalg = 2 + Cupper n Bootstrap percentile intervals are rst order accurate Clower P rf < lower endg = + p 2 n P rf < upper ABC intervals are approximations to BCa intervals { second order accurate { only use about 3% of the computation time (See Efron & Tibshirani (1993) Chapter 14) endg = 2 + Cpupper n 1153 1154 # # # # # # # # # # # # # This is S-plus code for creating bootstrapped confidence intervals for a correlation coefficient. It is stored in the file lawschl.ssc Any line preceded with a pound sign is a comment that is ignored by the program. The law school data are read from the file lawschl.dat # Enter the law school data into a data frame School LSAT 1 1 576 2 2 635 3 3 558 4 4 578 5 5 666 6 6 580 7 7 555 8 8 661 9 9 651 10 10 605 11 11 653 12 12 575 13 13 545 14 14 572 15 15 594 GPA 3.39 3.30 2.81 3.03 3.44 3.07 3.00 3.43 3.36 3.13 3.12 2.74 2.76 2.88 2.96 laws <- read.table("lawschl.dat", col.names=c("School","LSAT","GPA")) laws 1155 # First test for zero correlation # Plot the data par(fin=c(7.0,7.0),pch=16,mkh=.15,mex=1.5) plot(laws$LSAT,laws$GPA, type="p", xlab="GPA",ylab="LSAT score", main="Law School Data") # Compute the sample correlation matrix n <tt<pval pval length(laws$LSAT); sqrt(n-2)*rr/sqrt(1 - rr*rr) <- 1 - pt(tt,n-2) <- round(pval,digits=5) cat("t-test for zero correlation: ", round(tt,4), fill=T) t-test for zero correlation: 4.4413 rr<-cor(laws$LSAT,laws$GPA) cat("Estimated correlation: ", round(rr,5), fill=T) Estimated correlation: 1156 cat("p-values for the t-test for zero correlation: ", pval, fill=T) p-values for the t-test for zero correlation: 0.00033 0.77637 1157 1158 # Use Fisher's z-transformation to construct # approximate confidence intervals. # First set the level of confidence at 1-alpha. alpha <- .10 z <- 0.5*log((1+rr)/(1-rr)) zl <- z - qnorm(1-alpha/2)/sqrt(n-3) zu <- z + qnorm(1-alpha/2)/sqrt(n-3) rl <- round((exp(2*zl)-1)/(exp(2*zl)+1), digits=4) ru <- round((exp(2*zu)-1)/(exp(2*zu)+1), digits=4) per <- (1-alpha)*100; cat( per,"% confidence interval: (", rl,", ",ru,")",fill=T) # Compute bootstrap confidence intervals. # Use B=5000 bootstrap samples. nboot <- 5000 rboot <- bootstrap(data=laws, statistic=cor(GPA,LSAT),B=nboot) Forming Forming Forming . . . Forming Forming replications 1 to 100 replications 101 to 200 replications 201 to 300 replications 4801 to 4900 replications 4901 to 5000 90 % confidence interval: ( 0.509 , 0.9071 ) 1159 # limits.emp(): Calculates empirical percentiles # for the bootstrapped parameter # estimates in a resamp object. # The quantile function is used to # calculate the empirical percentiles. # usage: # limits.emp(x, probs=c(0.025, 0.05, 0.95, 0.975)) limits.emp(rboot, probs=c(0.05,0.95)) 5% 95% Param 0.523511 0.9473424 # Do another set of 5000 bootstrapped values rboot <- bootstrap(data=laws, statistic=cor(GPA,LSAT),B=nboot) limits.emp(rboot, probs=c(0.05,0.95)) 1160 # limits.bca(): Calculates BCa (bootstrap # bias-correct, adjusted) # confidence limits. # usage: # limits.bca(boot.obj, # probs=c(0.025, 0.05, 0.95, 0.975), # details=F, z0=NULL, # acceleration=NULL, # group.size=NULL, # frame.eval.jack=sys.parent(1)) limits.bca(rboot,probs=c(0.05,0.95),detail=T) $limits: 5% 95% Param 0.442216 0.9306627 $emp.probs: 5% 95% Param 0.520661 0.9486835 5% 95% Param 0.01954177 0.9065055 1161 1162 # # # Both sets of confidence intervals could have been obtained from the summary( ) function summary(rboot) Call: bootstrap(data = laws, statistic = cor(GPA, LSAT), B = nboot) $z0: Param -0.07979538 Number of Replications: 5000 $acceleration: Param -0.07567156 Summary Statistics: Observed Bias Mean SE Param 0.7764 -0.007183 0.7692 0.1341 $group.size: [1] 1 Empirical Percentiles: 2.5% 5% 95% 97.5% Param 0.4638 0.5207 0.9487 0.9616 BCa Percentiles: 2.5% 5% 95% 97.5% Param 0.3462 0.4422 0.9307 0.9449 1164 1163 50 hist(rboot$rep,nclass=50, xlab=" ", main="5000 Bootstrap Correlations", density=.0001) 0 # Make a histogram of the bootstrapped # correlations. 100 150 200 250 300 5000 Bootstrap Correlations 0.2 1165 0.4 0.6 0.8 1.0 1166 The bootstrap can fail: Example: X1; X2; : : : ; Xn are sampled from a uniform (0; ) distribution: true density: f (x) = 1 ; 0 < x < 8 > > > < 0 x0 true c.d.f.: F (x) = >> x 0 < x > : 1 x> Bootstrap percentile condence intervals for tend to be too short. Application of the bootstrap must adequately replicate the random process that produced the original sample Simple random samples \Nested" experiments Sample plants from a eld { Sample leaves from plants { Curve tting (existence of covariates) 1168 1167 Parametric Bootstrap Obtain a bootstrap sample of size n by Suppose you \knew" that (X1j ; X2j ) j = 1; : : : ; n were obtained from a simple random sample from a bivariate normal distribution, i.e., " # " # ! Xj = XX12jj NID 12 ; " # 2 Repeat this to obtain rn;1; rn;2; : : : ; rn;B Compute bootstrap Estimate unknown parameters b = bb 1 sampling from a N (^; ^ ) distribution, i.e. X1;b; : : : ; Xn;b and compute tn;b = rn;b standard errors { bias estimators { condence intervals = n1 X Xj = X { n j =1 c = n 1 1 X (Xj X)(Xj X)T n j =1 1169 1170 References Davison, A.C. and Hinkley, D.V. (1997) Boot, Cambridge Series in Statistical and Probabilistic Mathematics 1, Cambridge University Press, New York. Efron, B. (1982) The Jackknife, The Bootstrap and other resampling plans, CBMS, 38, SIAM-NSF, Philadelphia. Efron, B. and Gong, G. (1983) The American Statistician, , 36-48. Efron, B. (1987) Better Bootstrap condence intervals (with discussion) Journal of the American Statistical Association, 82, 171-200. Efron, B. and Tibsharani, R. (1993) An Introduction to the Bootstrap, Chapman and Hall, New York. Shao, J. and Tu, D. (1995) The Jackknife and Bootstrap, New York, Springer. strap Methods and Their Applications Example 12.2: Stormer viscometer data (Venables & Ripley, Chapter 8) measure viscosity of a uid measure time taken for an inner cyclinder in the mechanism to complete a specic number of revolutions in response to an actuating weight calibrate the viscometer using runs with varying weights (W ) (g) { uids with known viscosity (V ) (poise) { record the time (T ) (sec) { theoretical model T = W1V + 1171 # # # # # # # # # # This code is used to explore the Stormer viscometer data. It is stored in the file stormer.ssc Enter the data into a data frame. The data are stored in the file stormer.dat library(MASS) stormer <- read.table("stormer.dat") stormer 1173 2 1172 Viscosity 1 14.7 2 27.5 3 42.0 4 75.7 5 89.7 6 146.6 7 158.3 8 14.7 9 27.5 10 42.0 11 75.7 12 89.7 13 146.6 14 158.3 15 161.1 16 298.3 17 75.7 18 89.7 19 146.6 20 158.3 21 161.1 22 298.3 23 298.3 Wt 20 20 20 20 20 20 20 50 50 50 50 50 50 50 50 50 100 100 100 100 100 100 100 Time 35.6 54.3 75.6 121.2 150.8 229.0 270.0 17.6 24.3 31.4 47.2 58.3 85.6 101.1 92.2 187.2 24.6 30.0 41.7 50.3 45.1 89.0 86.5 1174 Starting values for 1 and 2: Fit an approximate linear model. Note that V Ti = 1 i + i W i 2 ) (Wi 2)Ti = 1Vi + i(Wi 2) ) WiTi = 1Vi + 2Ti + i(W1 2) % - this is the new response variable this is the new error Use OLS estimation to obtain b1(0) = 28:876 b2(0) = 2:8437 # Use a linear approximation to obtain # starting values for least squares # estimation in the non-linear model fm0 <- lm(Wt*Time ~ Viscosity + Time - 1, data=stormer) b0 <- coef(fm0) names(b0) <- c("b1","b2") # Fit the non-linear model storm.fm <- nls( formula = Time ~ b1*Viscosity/(Wt-b2), data = stormer, start = b0, trace = T) 885.365 : 28.8755 2.84373 825.11 : 29.3935 2.23328 825.051 : 29.4013 2.21823 1176 1175 # Create a bivariate confidence region # for the the (b1,b2) parameters. # First set up a grid of (b1,b2) values bc <- coef(storm.fm) se <- sqrt(diag(vcov(storm.fm))) dv <- deviance(storm.fm) summary(storm.fm)$parameters Value b1 29.401328 b2 2.218226 Std. Error 0.9155353 0.6655234 gsize<-51 b1 <- bc[1] + seq(-3*se[1], 3*se[1], length = gsize) b2 <- bc[2] + seq(-3*se[2], 3*se[2], length = gsize) bv <- expand.grid(b1, b2) t value 32.113813 3.333054 # Create a function to evaluate sums of squares ssq <- function(b) sum((stormer$Time - b[1] * stormer$Viscosity/ (stormer$Wt-b[2]))^2) 1177 1178 # Create the plot # Evalute the sum of squared residuals and # approximate F-ratios for all of the # (b1,b2) values on the grid dbeta <- apply(bv, 1, ssq) n<-length(stormer$Viscosity) df1<-length(bc) df2<-n-df1 fstat <- matrix( ((dbeta - dv)/df1) / (dv/df2), gsize, gsize) motif() par(fin=c(7.0,7.0), mex=1.5,lwd=3) plot(b1, b2, type="n", main="95% Confidence Region") contour(b1, b2, fstat, levels=c(1,2,5,7,10,15,20), labex=0.75, lty=2, add=T) contour(b1, b2, fstat, levels=qf(0.95,2,21), labex=0, lty=1, add=T) text(31.6,0.3,"95% CR", adj=0, cex=0.75) points(bc[1], bc[2], pch=3, mkh=0.15) # remove b1,b2, and bv rm(b1,b2,bv,fstat) 1180 1179 A joint condence region for (1; 2): Deviance (residual sum of squares): j =1 4 1Vj W j 2 Tj 95% Confidence Region #2 3 d(1; 2) = n " X 2 d(1;2) d(b1;b2) 2 d(b1;b2) n 20 15 1 20 10 1 F (1; 2) = b2 Approximate F-statistic: 5 2 7 5 2 20 15 An approximate (1 ) 100% condence interval consists of all (1; 2) such that F (1; 2) < F(2;n 2); 1181 27 28 29 30 10 7 31 5 95% CR 32 b1 1182 #========================================= # Bootstrap I: # Treat the regressors as random and # resample the cases (y,x_1,x_2) #========================================= Bootstrap Estimation storm.boot1 <- bootstrap(stormer, coef(nls(Time~b1*Viscosity/(Wt-b2), data=stormer,start=bc)),B=1000) Bootstrap I: Sample n = 23 cases (Ti; Wi; Vi) from the original data set (using simple random sampling with replacement) summary(storm.boot1) Call: bootstrap(data = stormer, statistic = coef(nls(Time ~ (b1 * Viscosity)/(Wt - b2), data = stormer, start = bc)), B = 1000) Number of Replications: 1000 1183 Summary Statistics: Observed Bias Mean SE b1 29.401 -0.08665 29.315 0.7194 b2 2.218 0.09412 2.312 0.8424 1184 # Produce histograms of the bootstrapped # values of the regression coefficients # # # # Empirical Percentiles: 2.5% 5% 95% 97.5% b1 27.8202 28.113 30.402 30.613 b2 0.8453 1.057 3.513 3.779 The following code is used to draw non-parametric estimated densities, normal densities, and a histogram on the same graph. # truehist(): Plot a Histogram (prob=T # by default) For the function # hist(), probability=F by # default. # width.SJ(): Bandwidth Selection by Pilot # Estimation of Derivatives. # Uses the method of Sheather # & Jones (1991) to select the # bandwidth of a Gaussian kernel # density estimator. BCa Percentiles: 2.5% 5% 95% 97.5% b1 27.9368 28.242 30.525 30.747 b2 0.7462 0.899 3.409 3.604 Correlation of Replicates: b1 b2 b1 1.0000 -0.8628 b2 -0.8628 1.0000 1185 1186 b1.boot <- storm.boot1$rep[,1] b2.boot <- storm.boot1$rep[,2] library(MASS) motif() par(fin=c(7.0,7.0),mex=1.5) # density() : Estimate Probability Density # Function. Returns x and y # coordinates of a non-parametric # estimate of the probability # density of the data. Options # include the type of window to # use and the number of points # at which to estimate the density. # n = the number of equally spaced # points at which the density # is evaluated. mm <- range(b1.boot) min.int <- floor(mm[1]) max.int <- ceiling(mm[2]) truehist(b1.boot,xlim=c(min.int,max.int), density=.0001) width.SJ(b1.boot) [1] 0.6918326 b1.boot.dns1 <- density(b1.boot,n=200,width=.6) b1.boot.dns2 <list( x = b1.boot.dns1$x, y = dnorm(b1.boot.dns1$x,mean(b1.boot), sqrt(var(b1.boot)))) 1187 1188 # Draw non-parametric density lines(b1.boot.dns1,lty=3,lwd=2) truehist(b2.boot,xlim=c(min.int,max.int), density=.001) # Draw normal densities width.SJ(b2.boot) [1] 0.7339957 lines(b1.boot.dns2,lty=1,lwd=2) legend(27.,-0.30,c("Nonparametric", "Normal approx."), lty=c(3,1),bty="n",lwd=2) # Display the distribution of the estimate # of the second parameter b2.boot.dns1 <- density(b2.boot,n=200,width=.8) b2.boot.dns2 <list( x = b2.boot.dns1$x, y = dnorm(b2.boot.dns1$x,mean(b2.boot), sqrt(var(b2.boot)))) lines(b2.boot.dns1,lty=3,lwd=2) lines(b2.boot.dns2,lty=1,lwd=2) legend(1.,-0.30,c("Nonparametric", "Normal approx."), lty=c(3,1),bty="n",lwd=2) motif() par(fin=c(7.0,7.0),mex=1.5) mm <- range(b2.boot) min.int <- floor(mm[1]) max.int <- ceiling(mm[2]) 1189 1190 0.5 0.6 0.4 0.5 0.3 0.4 0.2 0.3 0.1 0.2 0.0 0.1 0.0 26 27 28 29 30 31 32 0 2 b1.boot 4 6 b2.boot Nonparametric Normal approx. Nonparametric Normal approx. 1191 Bootstrap II: Fix the values of the explanatory variables f(Wj ; Vj ) : j = 1; : : : ; ng Create new observations: Compute residuals from tting the model to the original sample b1Vj ej = Tj W b j 2 1192 j = 1; : : : ; n where (Tj;b ; Wj ; Vj ) = Tj;b j = 1; : : : ; n b1Vj + e Wj b2 j Fit the model to the j -th bootstrap sample to obtain Approximate sampling from the population of random errors by taking a sample (with replacement) from fe1; : : : ; eng say, b1;b and b2;b Repeat this for b = 1; : : : ; B bootstrap samples e1;b; e2;b; : : : ; en;b 1193 1194 #======================================= # Bootstrap II: # Treat the regressors as fixed and # resample from the residuals #======================================= # # # # # # # Center the residuals at zero and divide residuals by linear approximations to a multiple of the standard errors. These centered and scaled residuals approximately have the same first two moments as the random errors, but they are not quite uncorrelated. g2 <- grad.f(b[1], b[2], stormer$Viscosity, stormer$Wt, stormer$Tim) D <- attr(g2, "gradient") h <- 1-diag(D%*%solve(t(D)%*%D)%*%t(D)) rs <- rs/sqrt(h) dfe <- length(rs)-length(coef(storm.fm)) vr <- var(rs) rs <- rs%*%sqrt(deviance(storm.fm)/dfe/vr) rs <- scale(resid(storm.fm),center=T,scale=F) grad.f <- deriv3( expr = Y ~ (b1*X1)/(X2-b2), namevec = c("b1", "b2"), function.arg = function(b1, b2, X1, X2, Y) NULL) 1196 1195 # Compute 1000 bootstrapped values of the # regression parameters storm.boot2 <- bootstrap(data=rs, statistic=storm.bf2(rs),B=1000) # Create a function to use in fitting # the model to bootstrap samples storm.bf2 <- function(rs) {assign("Tim", fitted(storm.fm) + rs, frame = 1) nls(formula = Tim ~ (b1*Viscosity)/(Wt-b2), data = stormer, start = coef(storm.fm) )$parameters } summary(storm.fm)$parameters Value b1 29.401328 b2 2.218226 Std. Error 0.9155353 0.6655234 Forming Forming Forming Forming Forming Forming Forming Forming Forming Forming replications replications replications replications replications replications replications replications replications replications 1 to 100 101 to 200 201 to 300 301 to 400 401 to 500 501 to 600 601 to 700 701 to 800 801 to 900 901 to 1000 Call: bootstrap(data = rs, statistic = storm.bf2(rs), B = 1000) t value 32.113813 3.333054 b1.boot <- storm.boot2$rep[,1] b2.boot <- storm.boot2$rep[,2] 1197 1198 # The BCA intervals may not be correctly # computed by the following function summary(storm.boot2) Empirical Percentiles: 2.5% 5% 95% 97.5% b1 27.58 27.847 30.690 30.924 b2 1.03 1.241 3.295 3.459 BCa Confidence Limits: 2.5% 5% 95% 97.5% b1 26.236 26.267 29.698 29.985 b2 1.316 1.524 3.563 3.683 Number of Replications: 1000 Summary Statistics: Observed Bias Mean SE b1 28.745 0.5719 29.317 0.8892 b2 2.432 -0.1682 2.264 0.6353 Correlation of Replicates: b1 b2 b1 1.0000 -0.9194 1199 # # # # 1200 The following code is used to draw non-parametric estimated densities, normal densities, and histograms on the same graphic window. motif() par(fin=c(7.0,7.0),mex=1.3) # Draw a non-parametric density lines(b1.boot.dns1,lty=3,lwd=2) mm <- range(b1.boot) min.int <- floor(mm[1]) max.int <- ceiling(mm[2]) # draw normal densities truehist(b1.boot,xlim=c(min.int,max.int), density=.0001) b1.boot.dns1 <- density(b1.boot,n=200, width=width.SJ(b1.boot)) b1.boot.dns2 <list( x = b1.boot.dns1$x, y = dnorm(b1.boot.dns1$x,mean(b1.boot), sqrt(var(b1.boot)))) 1201 lines(b1.boot.dns2,lty=1,lwd=2) legend(27,-0.22,c("Nonparametric", "Normal approx."), lty=c(3,1),bty="n",lwd=2) 1202 # Display the distribution for the other # parameter estimate motif() par(fin=c(7.0,7.0),mex=1.3) 0.5 mm <- range(b2.boot) min.int <- floor(mm[1]) max.int <- ceiling(mm[2]) 0.3 0.4 truehist(b2.boot,xlim=c(min.int,max.int), density=.00001) 0.0 0.1 0.2 b2.boot.dns1 <- density(b2.boot,n=200, width=width.SJ(b2.boot)) b2.boot.dns2 <list( x = b2.boot.dns1$x, y = dnorm(b2.boot.dns1$x,mean(b2.boot), sqrt(var(b2.boot)))) 26 27 28 29 30 31 32 b1.boot lines(b2.boot.dns1,lty=3,lwd=2) lines(b2.boot.dns2,lty=1lwd=2) legend(0.5,-0.25,c("Nonparametric", "Normal approx."),lty=c(3,1), bty="n",lwd=2) Nonparametric Normal approx. 1204 1203 Fixed Bootstrap 0.889 0.635 0.6 Comparison of standard errors: Asymptotic normal Random Parameter approx. Bootstrap 1 0.916 0.719 2 0.666 0.842 0.2 0.4 Comparison of approximate 95% condence intervals: Parameter 0.0 1 2 0 1 2 3 Asymptotic normal approx. (27.50, 31.31) (0.83, 3.60) Random Bootstrap (28.13, 30.40) (1.06, 3.51) Fixed Bootstrap (27.42, 31.03) (1.03, 3.46) 4 " bi t21;:025Sbi b2.boot Nonparametric Normal approx. 1205 1206