FORMULA SHEET - SIMPLE LINEAR REGRESSION

1. Prediction Equation
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$

2. Sample Slope
$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} = \frac{cov(x, y)}{var(x)} = r \times \frac{s_y}{s_x}$
where $SS_{xy} = \sum xy - \frac{\sum x \sum y}{n}$ and $SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = (n - 1)s_x^2$

3. Sample Y Intercept
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

4. Coefficient of Determination
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$

5. Standard Error of Estimate
$s_e = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n - k - 1}}$

6. Standard Error of $\beta_0$ and $\beta_1$
$s(\hat{\beta}_0) = s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{(n - 1)s_x^2}} = s_e \times \frac{\sqrt{\sum x^2}}{\sqrt{n}\,\sqrt{SS_{xx}}}$
$s(\hat{\beta}_1) = \frac{s_e}{\sqrt{SS_{xx}}} = \frac{s_e}{\sqrt{n - 1}\; s_x}$

7. Test Statistic for $\hat{\beta}_1$
$t_{(n-2)} = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} = \frac{\text{Estimate} - \text{Parameter}}{\text{Est. Std. Error of estimate}}$

8. Confidence Interval for $\beta_0$ and $\beta_1$
$\hat{\beta}_0 \pm t_{(\alpha/2,\, n-2)} \times se(\hat{\beta}_0)$
$\hat{\beta}_1 \pm t_{(\alpha/2,\, n-2)} \times se(\hat{\beta}_1)$

9. Confidence Interval for the Mean Value of Y Given x
A $(1 - \alpha)\,100\%$ confidence interval for $E(Y|X)$:
$\hat{y}_p \pm t_{(\alpha/2,\, n-2)}\, s_e \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$
Here $\hat{y}_p$ estimates $E(Y|X)$, $x_p$ is the observed value of the independent variable, $n$ is the sample size, and $s_e$ is the standard error of estimate.

10. Prediction Interval for a Random Value of Y Given x
A $(1 - \alpha)\,100\%$ prediction interval for $Y$:
$\hat{y}_p \pm t_{(\alpha/2,\, n-2)}\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$

11. Coefficient of Correlation (for simple regression)
$r = \sqrt{R^2} = \frac{SS_{xy}}{\sqrt{SS_{xx}}\,\sqrt{SS_{yy}}}$

12. Adjusted $R^2$
$R_A^2 = 1 - \frac{SSE / (n - k - 1)}{SST / (n - 1)} = 1 - (1 - R^2) \times \frac{n - 1}{n - (k + 1)}$
$R_A^2$ = adjusted coefficient of determination, $R^2$ = unadjusted coefficient of determination, $n$ = number of observations, $k$ = number of explanatory variables

13. Variance Inflation Factor
$VIF(X_j) = \frac{1}{1 - R_j^2}$
$R_j^2$ is the coefficient of determination for the regression of $X_j$ as the dependent variable on all other $X_i$ as independent variables. If $VIF > 10$, multicollinearity is suspected.

14. Tolerance Factor
$1 - R_j^2 = \frac{1}{VIF}$

15. Beta Weights (Standardized Beta)
$Beta_j = \beta_j \times \frac{s_x}{s_y}$
$s_x$ = standard deviation of $X$, $s_y$ = standard deviation of $Y$

ANOVA TABLE
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square             | F Statistic
Regression          | SSR            | k                  | MSR = SSR / k           | F(k, n-(k+1)) = MSR / MSE
Error               | SSE            | n - (k + 1)        | MSE = SSE / (n - (k+1)) |
Total               | SST            | n - 1              |                         |

16. Partial F Test
$F_{(c,\, n-(k+1))} = \frac{(SSE_R - SSE_F)/c}{SSE_F / (n - (k + 1))} = \frac{(R_F^2 - R_R^2)/c}{(1 - R_F^2)/(n - k - 1)}$
$SSE_R$ = sum of squared errors for the reduced model, $SSE_F$ = sum of squared errors for the full model, $c$ = number of variables dropped from the full model / added to the reduced model.
Forward regression: enter a variable if $F_{in} > 3.84$ ($p_{in} < 0.05$). Backward regression: remove a variable if $F_{out} < 2.71$ ($p_{out} > 0.10$).

17. F Test (Overall Significance of the Model)
$F_{(k,\, n-(k+1))} = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - (k + 1))} = \frac{R^2 / k}{(1 - R^2)/(n - (k + 1))}$
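The simple regression formulas above (items 1-10 and the overall F test) can be checked numerically. The sketch below is illustrative only and is not part of the original sheet: the x/y values are made-up data, and it assumes numpy and scipy are available.

```python
# Minimal sketch of the SLR formulas on this sheet (assumed, made-up data).
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
n, k = len(x), 1                              # k = 1 explanatory variable

# 2. Sample slope and 3. sample intercept
ss_xy = np.sum(x * y) - n * x.mean() * y.mean()
ss_xx = np.sum(x**2) - n * x.mean()**2
b1 = ss_xy / ss_xx
b0 = y.mean() - b1 * x.mean()

# 1. Predictions, then the sums of squares
y_hat = b0 + b1 * x
sse = np.sum((y - y_hat)**2)
sst = np.sum((y - y.mean())**2)
ssr = sst - sse

# 4. Coefficient of determination and 5. standard error of estimate
r2 = ssr / sst
se = np.sqrt(sse / (n - k - 1))

# 6.-8. Standard error of b1, t statistic, 95% confidence interval
se_b1 = se / np.sqrt(ss_xx)
t_stat = b1 / se_b1                           # H0: beta1 = 0
t_crit = stats.t.ppf(0.975, df=n - 2)         # alpha = 0.05, two-sided
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

# ANOVA / 17. overall F test; for SLR, F equals t_stat**2
msr, mse = ssr / k, sse / (n - k - 1)
f_stat = msr / mse

# 10. Prediction interval for a new observation at x_p = 4.5
x_p = 4.5
y_p = b0 + b1 * x_p
half = t_crit * se * np.sqrt(1 + 1/n + (x_p - x.mean())**2 / ss_xx)
print(b1, b0, r2, ci_b1, f_stat, (y_p - half, y_p + half))
```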
MULTIPLE LINEAR REGRESSION

18. Prediction Interval
A $(1 - \alpha)\,100\%$ PI (prediction interval) for the value of a randomly chosen $Y$, given values of the $X_i$:
$\hat{y} \pm t_{(\alpha/2,\, n-(k+1))} \sqrt{s^2(\hat{y}) + MSE}$

19. Confidence Interval
A $(1 - \alpha)\,100\%$ CI (confidence interval) for the conditional mean of $Y$, given values of the $X_i$:
$\hat{y} \pm t_{(\alpha/2,\, n-(k+1))}\, s[\hat{E}(Y)]$

20. Partial Correlation
Correlation between $y$ and $x_1$ when the influence of $x_2$ is removed from both $y$ and $x_1$:
$pr_{y1,2} = \frac{r_{y1} - (r_{y2})(r_{12})}{\sqrt{1 - r_{y2}^2}\,\sqrt{1 - r_{12}^2}}$

21. Semi-Partial Correlation (Part Correlation)
Correlation between $y$ and $x_1$ when the influence of $x_2$ is removed from $x_1$ (but not from $y$):
$sr_{y1,2} = \frac{r_{y1} - (r_{y2})(r_{12})}{\sqrt{1 - r_{12}^2}}$
The square of the part correlation of an explanatory variable is the unique contribution of that variable to $R^2$. When this variable is added:
$sr^2 = \text{change in } R^2 = R_{new}^2 - R_{old}^2$
$pr^2 = \frac{\text{change in } R^2}{1 - R_{old}^2}$

22. Omitted Variable Bias
Actual relationship: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
Fitted model: $Y = \alpha_0 + \alpha_1 X_1$
Then $\alpha_1 = \beta_1 + \beta_2 \times \frac{Cov(x_1, x_2)}{Var(x_1)}$

JD Sir Formulae (slide notes)
The slides restate formulas given above (OLS estimators, covariance, correlation coefficient, standard error of coefficients for SLR, confidence intervals for $E[\hat{Y}|X]$ and for $\beta_1$, prediction interval for a specific $\hat{Y}$, hypothesis tests for $\beta_1 = 0$ and $\beta_1 \le a$, the overall F test, omitted variable bias, and the t value adjusted using VIF). The notes that survive are:
- Summary output: $b_1$ is the coefficient and $Se(b_1)$ its corresponding standard error; the t test gives the significance of the regression (e.g. for $\alpha = 0.05$ and $n = 10$).
- Multiple R $= \sqrt{R^2}$; for simple linear regression this is the correlation coefficient.
- $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$; $SST = SSR + SSE$; $MSR = SSR/k$; $MSE = SSE/(n - k - 1)$; standard error $S_e = \sqrt{MSE}$.
- $F = MSR/MSE$; for SLR, $F = t^2$ for $\beta_1$, so the p-values of the F test and of the t test for $\beta_1$ are identical.
- P-value of a coefficient in Excel: =T.DIST.2T(|tStat|, n-k-1), where k is the number of independent variables; if it is below 0.05, reject $H_0$: coefficient $= 0$.
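Two of the multiple regression results above, the VIF (item 13) and the omitted variable bias identity (item 22), can also be verified numerically. The sketch below is an illustration, not part of the sheet: the simulated data, the true coefficients (2.0 and 3.0), and the helper function `ols` are all assumptions made here, and plain numpy least squares stands in for a statistics package.

```python
# Sketch: check alpha1 = beta1 + beta2 * Cov(x1, x2) / Var(x1) and compute VIF(x1).
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)   # x2 correlated with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients with an intercept column prepended."""
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Full model vs. model that omits x2
beta = ols(np.column_stack([x1, x2]), y)        # [b0, b1, b2]
alpha = ols(x1.reshape(-1, 1), y)               # [a0, a1]

# 22. Omitted variable bias: a1 = b1 + b2 * Cov(x1, x2) / Var(x1)
bias_pred = beta[1] + beta[2] * np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)
print(alpha[1], bias_pred)                      # should match

# 13. VIF for x1: regress x1 on x2, then VIF = 1 / (1 - R^2)
coef = ols(x2.reshape(-1, 1), x1)
resid = x1 - (coef[0] + coef[1] * x2)
r2_1 = 1 - resid @ resid / ((x1 - x1.mean()) @ (x1 - x1.mean()))
print(1 / (1 - r2_1))
```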