Simple Linear Regression: Model & Analysis

Chapter 12 Simple Linear Regression         Simple Linear Regression Model Least Squares Method Coefficient of Determination Model Assumptions Testing for Significance Using the Estimated Regression Equation for Estimation and Prediction Computer Solution Residual Analysis: Validating Model Assumptions Simple Linear Regression Model   The equation that describes how y is related to x and an error term is called the regression model. The simple linear regression model is: y = b0 + b1x +e • b0 and b1 are called parameters of the model. • e is a random variable called the error term. Simple Linear Regression Equation  The simple linear regression equation is: E(y) = b0 + b1x • Graph of the regression equation is a straight line. • b0 is the y intercept of the regression line. • b1 is the slope of the regression line. • E(y) is the expected value of y for a given x value. Simple Linear Regression Equation  Positive Linear Relationship E(y) Regression line Intercept b0 Slope b1 is positive x Simple Linear Regression Equation  Negative Linear Relationship E(y) Intercept b0 Regression line Slope b1 is negative x Simple Linear Regression Equation  No Relationship E(y) Regression line Intercept b0 Slope b1 is 0 x Estimated Simple Linear Regression Equation  The estimated simple linear regression equation is: ŷ  b0  b1 x • The graph is called the estimated regression line. • b0 is the y intercept of the line. • b1 is the slope of the line. • ŷ is the estimated value of y for a given x value. Estimation Process Regression Model y = b0 + b1x +e Regression Equation E(y) = b0 + b1x Unknown Parameters b0, b1 b0 and b1 provide estimates of b0 and b1 Sample Data: x y x1 y1 . . . . xn yn Estimated Regression Equation ŷ  b0  b1 x Sample Statistics b0, b1 Least Squares Method  Least Squares Criterion min  (y i  y i ) 2 where: yi = observed value of the dependent variable for the ith observation yî = estimated value of the dependent variable for the ith observation The Least Squares Method  Slope for the Estimated Regression Equation  xi y i  (  xi  y i ) / n b1  2 2  xi  (  xi ) / n where: xi = value of independent variable for ith observation yi = value of dependent variable for ith observation n = total number of observations The Least Squares Method  y-Intercept for the Estimated Regression Equation b0  y  b1 x where: _ x = mean value for independent variable _ y = mean value for dependent variable n = total number of observations Example: Reed Auto Sales  Simple Linear Regression Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide. Example: Reed Auto Sales  Simple Linear Regression Number of TV Ads 1 3 2 1 3 Number of Cars Sold 14 24 18 17 27 Example: Reed Auto Sales  Slope for the Estimated Regression Equation b1 = 220 - (10)(100)/5 = 5 24 - (10)2/5  y-Intercept for the Estimated Regression Equation b0 = 20 - 5(2) = 10  Estimated Regression Equation y^ = 10 + 5x Example: Reed Auto Sales Scatter Diagram 30 25 Cars Sold  20 ^ y = 10 + 5x 15 10 5 0 0 1 2 TV Ads 3 4 The Coefficient of Determination  Relationship Among SST, SSR, SSE SST = SSR + SSE 2 2 ^ )2  ( y i  y )   ( yî  y )   ( y i  y i where: SST = total sum of squares SSR = sum of squares due to regression SSE = sum of squares due to error The Coefficient of Determination  The coefficient of determination is: r2 = SSR/SST where: SST = total sum of squares SSR = sum of squares due to regression Example: Reed Auto Sales  Coefficient of Determination r2 = SSR/SST = 100/114 = .8772 The regression relationship is very strong because 88% of the variation in number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold. The Correlation Coefficient  Sample Correlation Coefficient rxy  (sign of b1 ) Coefficien t of Determinat ion rxy  (sign of b1 ) r 2 where: b1 = the slope of the estimated regression equation yˆ  b0  b1 x Example: Reed Auto Sales  Sample Correlation Coefficient rxy  (sign of b1 ) r 2 The sign of b1 in the equation yˆ  10  5 x is “+”. rxy = + .8772 rxy = +.9366 Model Assumptions  Assumptions About the Error Term e 1. The error e is a random variable with mean of zero. 2. The variance of e , denoted by  2, is the same for all values of the independent variable. 3. The values of e are independent. 4. The error e is a normally distributed random variable. Testing for Significance    To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of b1 is zero. Two tests are commonly used • t Test • F Test Both tests require an estimate of  2, the variance of e in the regression model. Testing for Significance  An Estimate of  2 The mean square error (MSE) provides the estimate of  2, and the notation s2 is also used. s2 = MSE = SSE/(n-2) where: SSE   (yi  yî ) 2   ( yi  b0  b1 xi ) 2 Testing for Significance  An Estimate of  • To estimate  we take the square root of  2. • The resulting s is called the standard error of the estimate. SSE s  MSE  n2 Testing for Significance: t Test  Hypotheses H 0 : b1 = 0 H a : b1 = 0  Test Statistic b1 t sb1 Testing for Significance: t Test  Rejection Rule Reject H0 if t < -t or t > t where: t is based on a t distribution with n - 2 degrees of freedom Example: Reed Auto Sales  t Test • Hypotheses H 0 : b1 = 0 H a : b1 = 0 • Rejection Rule For  = .05 and d.f. = 3, t.025 = 3.182 Reject H0 if t > 3.182 Example: Reed Auto Sales  t Test • Test Statistics t = 5/1.08 = 4.63 • Conclusions t = 4.63 > 3.182, so reject H0 Confidence Interval for b1   We can use a 95% confidence interval for b1 to test the hypotheses just used in the t test. H0 is rejected if the hypothesized value of b1 is not included in the confidence interval for b1. Confidence Interval for b1  The form of a confidence interval for b1 is: b1  t / 2 sb1 where b1 is the point estimate t / 2 sb1 is the margin of error t / 2 is the t value providing an area of /2 in the upper tail of a t distribution with n - 2 degrees of freedom Example: Reed Auto Sales   Rejection Rule Reject H0 if 0 is not included in the confidence interval for b1. 95% Confidence Interval for b1 b1  t / 2 sb1 = 5 +/- 3.182(1.08) = 5 +/- 3.44 or 1.56 to 8.44  Conclusion 0 is not included in the confidence interval. Reject H0 Testing for Significance: F Test  Hypotheses H 0 : b1 = 0 H a : b1 = 0  Test Statistic F = MSR/MSE Testing for Significance: F Test  Rejection Rule Reject H0 if F > F where: F is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator Example: Reed Auto Sales  F Test • Hypotheses • Rejection Rule H 0 : b1 = 0 H a : b1 = 0 For  = .05 and d.f. = 1, 3: F.05 = 10.13 Reject H0 if F > 10.13. Example: Reed Auto Sales  F Test • Test Statistic F = MSR/MSE = 100/4.667 = 21.43 • Conclusion F = 21.43 > 10.13, so we reject H0. Some Cautions about the Interpretation of Significance Tests   Rejecting H0: b1 = 0 and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y. Just because we are able to reject H0: b1 = 0 and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y. Example: Reed Auto Sales  Point Estimation If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be: y^ = 10 + 5(3) = 25 cars

Simple Linear Regression: Model & Analysis

Related documents

Products

Support

Simple Linear Regression: Model & Analysis

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib