FORMULA SHEET - SIMPLE LINEAR REGRESSION

1. Prediction Equation
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$

2. Sample Slope
$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} = \frac{cov(x, y)}{var(x)} = r \times \frac{s_y}{s_x}$
where $SS_{xy} = \sum xy - \frac{\sum x \sum y}{n}$ and $SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = (n - 1)s_x^2$

3. Sample Y Intercept
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

4. Coefficient of Determination
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$

5. Standard Error of Estimate
$s_e = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n - k - 1}}$

6. Standard Error of $\beta_0$ and $\beta_1$
$s(\hat{\beta}_0) = s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{(n - 1)s_x^2}} = s_e \times \frac{\sqrt{\sum x^2}}{\sqrt{n}\,\sqrt{SS_{xx}}}$
$s(\hat{\beta}_1) = \frac{s_e}{\sqrt{SS_{xx}}} = \frac{s_e}{\sqrt{n - 1}\; s_x}$

7. Test Statistic for $\hat{\beta}_1$
$t_{(n-2)} = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} = \frac{\text{Estimate} - \text{Parameter}}{\text{Est. Std. Error of estimate}}$

8. Confidence Interval for $\beta_0$ and $\beta_1$
$\hat{\beta}_0 \pm t_{(\alpha/2,\, n-2)} \times se(\hat{\beta}_0)$
$\hat{\beta}_1 \pm t_{(\alpha/2,\, n-2)} \times se(\hat{\beta}_1)$

9. Confidence Interval for the Mean Value of Y Given x
A $(1 - \alpha)\,100\%$ confidence interval for $E(Y|X)$:
$\hat{y}_p \pm t_{(\alpha/2,\, n-2)}\, s_e \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$
Here $\hat{y}_p$ estimates $E(Y|X)$, $x_p$ is the observed value of the independent variable, $n$ is the sample size, and $s_e$ is the standard error of estimate.

10. Prediction Interval for a Random Value of Y Given x
A $(1 - \alpha)\,100\%$ prediction interval for $Y$:
$\hat{y}_p \pm t_{(\alpha/2,\, n-2)}\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$

11. Coefficient of Correlation (for simple regression)
$r = \sqrt{R^2} = \frac{SS_{xy}}{\sqrt{SS_{xx}}\,\sqrt{SS_{yy}}}$

12. Adjusted $R^2$
$R_A^2 = 1 - \frac{SSE / (n - k - 1)}{SST / (n - 1)} = 1 - (1 - R^2) \times \frac{n - 1}{n - (k + 1)}$
$R_A^2$ = adjusted coefficient of determination, $R^2$ = unadjusted coefficient of determination, $n$ = number of observations, $k$ = number of explanatory variables

13. Variance Inflation Factor
$VIF(X_j) = \frac{1}{1 - R_j^2}$
$R_j^2$ is the coefficient of determination for the regression of $X_j$ as the dependent variable on all other $X_i$ as independent variables. If $VIF > 10$, multicollinearity is suspected.

14. Tolerance Factor
$1 - R_j^2 = \frac{1}{VIF}$

15. Beta Weights (Standardized Beta)
$Beta_j = \beta_j \times \frac{s_x}{s_y}$
$s_x$ = standard deviation of $X$, $s_y$ = standard deviation of $Y$

ANOVA TABLE
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square             | F Statistic
Regression          | SSR            | k                  | MSR = SSR / k           | F(k, n-(k+1)) = MSR / MSE
Error               | SSE            | n - (k + 1)        | MSE = SSE / (n - (k+1)) |
Total               | SST            | n - 1              |                         |

16. Partial F Test
$F_{(c,\, n-(k+1))} = \frac{(SSE_R - SSE_F)/c}{SSE_F / (n - (k + 1))} = \frac{(R_F^2 - R_R^2)/c}{(1 - R_F^2)/(n - k - 1)}$
$SSE_R$ = sum of squared errors for the reduced model, $SSE_F$ = sum of squared errors for the full model, $c$ = number of variables dropped from the full model / added to the reduced model.
Forward regression: enter a variable if $F_{in} > 3.84$ ($p_{in} < 0.05$). Backward regression: remove a variable if $F_{out} < 2.71$ ($p_{out} > 0.10$).

17. F Test (Overall Significance of the Model)
$F_{(k,\, n-(k+1))} = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - (k + 1))} = \frac{R^2 / k}{(1 - R^2)/(n - (k + 1))}$
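The simple regression formulas above (items 1-10 and the overall F test) can be checked numerically. The sketch below is illustrative only and is not part of the original sheet: the x/y values are made-up data, and it assumes numpy and scipy are available.

```python
# Minimal sketch of the SLR formulas on this sheet (assumed, made-up data).
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
n, k = len(x), 1                              # k = 1 explanatory variable

# 2. Sample slope and 3. sample intercept
ss_xy = np.sum(x * y) - n * x.mean() * y.mean()
ss_xx = np.sum(x**2) - n * x.mean()**2
b1 = ss_xy / ss_xx
b0 = y.mean() - b1 * x.mean()

# 1. Predictions, then the sums of squares
y_hat = b0 + b1 * x
sse = np.sum((y - y_hat)**2)
sst = np.sum((y - y.mean())**2)
ssr = sst - sse

# 4. Coefficient of determination and 5. standard error of estimate
r2 = ssr / sst
se = np.sqrt(sse / (n - k - 1))

# 6.-8. Standard error of b1, t statistic, 95% confidence interval
se_b1 = se / np.sqrt(ss_xx)
t_stat = b1 / se_b1                           # H0: beta1 = 0
t_crit = stats.t.ppf(0.975, df=n - 2)         # alpha = 0.05, two-sided
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

# ANOVA / 17. overall F test; for SLR, F equals t_stat**2
msr, mse = ssr / k, sse / (n - k - 1)
f_stat = msr / mse

# 10. Prediction interval for a new observation at x_p = 4.5
x_p = 4.5
y_p = b0 + b1 * x_p
half = t_crit * se * np.sqrt(1 + 1/n + (x_p - x.mean())**2 / ss_xx)
print(b1, b0, r2, ci_b1, f_stat, (y_p - half, y_p + half))
```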
MULTIPLE LINEAR REGRESSION

18. Prediction Interval
A $(1 - \alpha)\,100\%$ PI (prediction interval) for the value of a randomly chosen $Y$, given values of the $X_i$:
$\hat{y} \pm t_{(\alpha/2,\, n-(k+1))} \sqrt{s^2(\hat{y}) + MSE}$

19. Confidence Interval
A $(1 - \alpha)\,100\%$ CI (confidence interval) for the conditional mean of $Y$, given values of the $X_i$:
$\hat{y} \pm t_{(\alpha/2,\, n-(k+1))}\, s[\hat{E}(Y)]$

20. Partial Correlation
Correlation between $y$ and $x_1$ when the influence of $x_2$ is removed from both $y$ and $x_1$:
$pr_{y1,2} = \frac{r_{y1} - (r_{y2})(r_{12})}{\sqrt{1 - r_{y2}^2}\,\sqrt{1 - r_{12}^2}}$

21. Semi-Partial Correlation (Part Correlation)
Correlation between $y$ and $x_1$ when the influence of $x_2$ is removed from $x_1$ (but not from $y$):
$sr_{y1,2} = \frac{r_{y1} - (r_{y2})(r_{12})}{\sqrt{1 - r_{12}^2}}$
The square of the part correlation of an explanatory variable is the unique contribution of that variable to $R^2$. When this variable is added:
$sr^2 = \text{change in } R^2 = R_{new}^2 - R_{old}^2$
$pr^2 = \frac{\text{change in } R^2}{1 - R_{old}^2}$

22. Omitted Variable Bias
Actual relationship: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
Fitted model: $Y = \alpha_0 + \alpha_1 X_1$
Then $\alpha_1 = \beta_1 + \beta_2 \times \frac{Cov(x_1, x_2)}{Var(x_1)}$

JD Sir Formulae (slide notes)
The slides restate formulas given above (OLS estimators, covariance, correlation coefficient, standard error of coefficients for SLR, confidence intervals for $E[\hat{Y}|X]$ and for $\beta_1$, prediction interval for a specific $\hat{Y}$, hypothesis tests for $\beta_1 = 0$ and $\beta_1 \le a$, the overall F test, omitted variable bias, and the t value adjusted using VIF). The notes that survive are:
- Summary output: $b_1$ is the coefficient and $Se(b_1)$ its corresponding standard error; the t test gives the significance of the regression (e.g. for $\alpha = 0.05$ and $n = 10$).
- Multiple R $= \sqrt{R^2}$; for simple linear regression this is the correlation coefficient.
- $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$; $SST = SSR + SSE$; $MSR = SSR/k$; $MSE = SSE/(n - k - 1)$; standard error $S_e = \sqrt{MSE}$.
- $F = MSR/MSE$; for SLR, $F = t^2$ for $\beta_1$, so the p-values of the F test and of the t test for $\beta_1$ are identical.
- P-value of a coefficient in Excel: =T.DIST.2T(|tStat|, n-k-1), where k is the number of independent variables; if it is below 0.05, reject $H_0$: coefficient $= 0$.
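Two of the multiple regression results above, the VIF (item 13) and the omitted variable bias identity (item 22), can also be verified numerically. The sketch below is an illustration, not part of the sheet: the simulated data, the true coefficients (2.0 and 3.0), and the helper function `ols` are all assumptions made here, and plain numpy least squares stands in for a statistics package.

```python
# Sketch: check alpha1 = beta1 + beta2 * Cov(x1, x2) / Var(x1) and compute VIF(x1).
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)   # x2 correlated with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients with an intercept column prepended."""
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Full model vs. model that omits x2
beta = ols(np.column_stack([x1, x2]), y)        # [b0, b1, b2]
alpha = ols(x1.reshape(-1, 1), y)               # [a0, a1]

# 22. Omitted variable bias: a1 = b1 + b2 * Cov(x1, x2) / Var(x1)
bias_pred = beta[1] + beta[2] * np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)
print(alpha[1], bias_pred)                      # should match

# 13. VIF for x1: regress x1 on x2, then VIF = 1 / (1 - R^2)
coef = ols(x2.reshape(-1, 1), x1)
resid = x1 - (coef[0] + coef[1] * x2)
r2_1 = 1 - resid @ resid / ((x1 - x1.mean()) @ (x1 - x1.mean()))
print(1 / (1 - r2_1))
```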