Topic 10: Miscellaneous Topics

Outline
• Joint estimation of β0 and β1
• Multiplicity
• Regression through the origin
• Measurement error
• Inverse predictions

Joint Estimation of β0 and β1
• Confidence intervals are used for a single parameter
• Confidence regions are used for two or more parameters
• The region for (β0, β1) defines a set of lines that form a band about the estimated regression line (Topic 5)

Joint Estimation of β0 and β1
• Since the estimators b0 and b1 are jointly Normal, the natural (i.e., smallest) confidence region is an ellipse (STAT 524)
• The text considers rectangles (KNNL 4.1), i.e., the region formed by the Cartesian product of two separate intervals
• Need to adjust the confidence level of each CI so that the region attains the desired joint confidence level 1 - α

Bonferroni Correction
• We want the probability that both intervals are correct to be ≥ 0.95
• Basic idea is an error budget
• Spend half on β0 and half on β1
• Since α = 0.05, we use α* = 0.025 for each CI (i.e., consider 97.5% CIs)

Bonferroni Correction
• For the joint region of (β0, β1), use
  $b_0 \pm t_c\, s(b_0)$
  $b_1 \pm t_c\, s(b_1)$
  where $t_c = t(0.9875;\, n-2)$
• Note: $0.9875 = 1 - 0.05/(2 \cdot 2)$

Expanding on the Note
• We start with a 5% error budget
• We have two intervals, so we give 0.05/2 = 2.5% to each
• Each interval is two-sided, so we again divide by 2
• Thus $0.9875 = 1 - 0.05/(2 \cdot 2)$

Bonferroni Concept
• Theory behind this correction
• Let the two intervals be I1 and I2
• We write c if the interval contains the true parameter value and nc if it does not

Bonferroni Inequality
• P(both c) = 1 - P(at least one nc)
• P(at least one nc) = P(I1 nc) + P(I2 nc) - P(both nc) ≤ P(I1 nc) + P(I2 nc)
• Thus, P(both c) ≥ 1 - (P(I1 nc) + P(I2 nc))

[Figure: the green area on the left is greater than the green area on the right; region labels .025, .025, <.025, .025]

Bonferroni Inequality
• P(both c) ≥ 1 - (P(I1 nc) + P(I2 nc))
• So if we use 0.05/2 for each interval, 1 - (P(I1 nc) + P(I2 nc)) = 1 - 0.05 = 0.95
• So P(both c) is at least 0.95
• We will use this same idea when we do multiple comparisons in ANOVA

Joint Estimation of β0 and β1
• For the Toluca example, the rectangular region is
  8.20 ≤ β0 ≤ 116.5
  2.85 ≤ β1 ≤ 4.29
• Region shown on next page: all lines that, for positive X, lie between 8.2 + 2.85X and 116.5 + 4.29X
• Definitely not as small as the confidence band, nor symmetric about the mean of X

Mean Response CIs
• Simultaneous estimation for all Xh uses Working-Hotelling (KNNL 2.6): $\hat{\mu}_h \pm W\, s(\hat{\mu}_h)$, where $W^2 = 2F(1-\alpha;\, 2,\, n-2)$
• For simultaneous estimation for a few Xh, use Bonferroni. Let g = number of Xh.
  Then $\hat{\mu}_h \pm B\, s(\hat{\mu}_h)$, where $B = t(1-\alpha/(2g);\, n-2)$
• Use this when B < W, since it gives narrower CIs

Simultaneous PIs
• For simultaneous prediction for a few Xh, use either
• Bonferroni: $\hat{\mu}_h \pm B\, s(\text{pred})$, where $B = t(1-\alpha/(2g);\, n-2)$
• Scheffé: $\hat{\mu}_h \pm S\, s(\text{pred})$, where $S^2 = gF(1-\alpha;\, g,\, n-2)$
• Again, choose the one with narrower intervals

Regression through the Origin
• $Y_i = \beta_1 X_i + \varepsilon_i$
• NOINT option in PROC REG
• Generally not a good idea
• Might be forcing the model to behave a certain way in an area with no data
• Problems with R² and other statistics
• See cautions, KNNL p 164

Measurement Error
• For Y, this is usually not a problem; it just adds to the variance σ²
• For X, we can get biased estimators of our regression parameters
• See KNNL 4.5, pp 165-168
• Berkson model: special case where measurement error in X is no problem

Inverse Predictions
• Sometimes called calibration
• Given Yh, predict the corresponding value of X, $\hat{X}_h$
• Solve the fitted equation for Xh
• $\hat{X}_h = (Y_h - b_0)/b_1$, $b_1 \neq 0$
• Approximate CI can be obtained; see KNNL, p 169

Background Reading
• Next class we will do simple regression with vectors and matrices so that we can generalize to multiple regression
• Look at KNNL 5.1 to 5.7 if this is unfamiliar to you
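
Worked sketch (not part of the original slides): the Bonferroni error-budget arithmetic and the inverse-prediction formula above can be checked with a few lines of Python. The function names and the numeric inputs in the usage note are hypothetical, chosen only for illustration. The sketch uses the standard library only, so it computes the probability level passed to the t quantile rather than the quantile itself.

```python
from typing import Iterable


def bonferroni_quantile_level(alpha: float, g: int) -> float:
    """Cumulative probability for the t quantile when g two-sided
    intervals share a family error budget alpha: 1 - alpha / (2 g)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if g < 1:
        raise ValueError("g must be a positive number of intervals")
    # Split the budget across g intervals, then across the two tails.
    return 1.0 - alpha / (2 * g)


def joint_coverage_lower_bound(miss_probs: Iterable[float]) -> float:
    """Bonferroni inequality: P(all intervals correct) is at least
    1 minus the sum of the individual noncoverage probabilities."""
    return 1.0 - sum(miss_probs)


def inverse_prediction(y_h: float, b0: float, b1: float) -> float:
    """Solve the fitted line y = b0 + b1 * x for x at a given y_h."""
    if b1 == 0:
        raise ValueError("b1 must be nonzero to invert the fitted line")
    return (y_h - b0) / b1
```

Up to floating-point rounding, `bonferroni_quantile_level(0.05, 2)` gives the 0.9875 used for the joint region of (β0, β1), and `joint_coverage_lower_bound([0.025, 0.025])` gives the 0.95 guarantee. With hypothetical estimates b0 = 10 and b1 = 2, `inverse_prediction(30.0, 10.0, 2.0)` returns 10.0.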