Fitting a regression model

• We wish to fit a simple linear regression model: y = β0 + β1x + ε.
• Fitting a model means obtaining estimators for the unknown population parameters β0 and β1 (and also for the variance of the errors, σ²).
• First step: obtain a sample of size n from the relevant population.
• For each sample unit, obtain measurements (y1, x1), (y2, x2), ..., (yn, xn).
• How do we use the sample values to estimate the model parameters? We wish to find estimators b0, b1 that are best in some sense.

Stat 328 - Fall 2004

The Method of Least Squares

• The method that produces the 'best' estimators we are seeking is called the method of Least Squares (LS), sometimes also known as Ordinary Least Squares (OLS).
• By 'best' we mean the values of β0, β1 that produce the line closest to all n observations. This means that we find the line that minimizes the distances of the observations to the line.
• Formal definition of LS estimators: the values of β0, β1 that minimize the sum of squared deviations of the observations from the line.

Note: the textbook uses β̂0, β̂1 to denote the estimators of β0, β1, whereas I have used b0, b1. We mean the same thing and you can use either notation.

The Method of Least Squares (cont'd)

• Steps to obtain LS estimators of (β0, β1):
1. For each observation (yi, xi), consider the error εi:
\[ \varepsilon_i = y_i - E(y_i) = y_i - (\beta_0 + \beta_1 x_i). \]
2. Find the values of β0, β1 that minimize the sum of the squared errors (SSE):
\[ SSE = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2. \]

The Method of Least Squares (cont'd)

• It can be shown that the LS estimators of β0, β1 are given by
\[ b_1 = \frac{SS_{xy}}{SS_{xx}}, \qquad b_0 = \bar{y} - b_1 \bar{x}, \]
where SSxy is the sum of cross-deviations of y and x:
\[ SS_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \]
and SSxx is the sum of squared deviations of the x:
\[ SS_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2. \]
• Formulas for SSxy and SSxx that are easier for computation are
\[ SS_{xy} = \sum_{i} y_i x_i - n \bar{x} \bar{y}, \qquad SS_{xx} = \sum_{i} x_i^2 - n (\bar{x})^2. \]
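As a rough illustration, the closed-form LS formulas can be sketched in a few lines of Python. The function name `least_squares` and all variable names are hypothetical, chosen only for this sketch; the data are the five-store advertising sample used in the worked example in these notes. The sketch computes SSxy and SSxx from the deviation form and then again from the computational form, which should agree up to rounding.

```python
def least_squares(x, y):
    """Return (b0, b1, ss_xy, ss_xx) for a simple linear regression of y on x.

    Hypothetical helper for illustration. It uses the deviation formulas
    SSxy = sum((xi - x_bar) * (yi - y_bar)) and SSxx = sum((xi - x_bar)^2).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    if ss_xx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    b1 = ss_xy / ss_xx          # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    return b0, b1, ss_xy, ss_xx


# Five-store sample from the notes: x in $1,000, y in 100s of units.
x = [2, 3, 4, 5, 6]
y = [5, 7, 6, 7, 9]

b0, b1, ss_xy, ss_xx = least_squares(x, y)

# Computational forms: SSxy = sum(xi*yi) - n*x_bar*y_bar,
#                      SSxx = sum(xi^2)  - n*x_bar^2.
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xy_short = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
ss_xx_short = sum(xi ** 2 for xi in x) - n * x_bar ** 2

print(f"SSxy = {ss_xy:.2f} (computational form: {ss_xy_short:.2f})")
print(f"SSxx = {ss_xx:.2f} (computational form: {ss_xx_short:.2f})")
print(f"b1 = {b1:.2f}, b0 = {b0:.2f}")
```

The computational forms are convenient by hand, but they subtract two numbers of similar size, so on a computer the deviation form is usually the safer choice.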
[Those of you who know some calculus might be interested in the companion set of notes, LS-derivation, on the course web site. Everyone else: the material in LS-derivation is NOT part of the course, so don't faint.]

Method of LS - Example

• Suppose that we have the following data on a sample of n = 5 stores, where y represents the number of units sold (in 100s) of a product over a certain period and x represents the amount (in $1,000) spent by the store on advertising the product:

Store    x    y
1        2    5
2        3    7
3        4    6
4        5    7
5        6    9

Method of LS - Example (cont'd)

• We wish to answer the following questions:
1. How many units can a store expect to sell if it spends $5,000 on advertising?
2. What might be the expected sales if a store were to increase advertising by $1,000?
3. Would it be possible to sell more than 1000 units if advertising were increased? By how much?
• To answer all of those questions, we need to get b0 and b1.
• Use the computational formulas for SSxy and SSxx. We need x̄ and ȳ, the products xiyi and the squares xi².
• From the table above: x̄ = 4 and ȳ = 6.8.

Method of LS - Example (cont'd)

• To get SSxy and SSxx we expand the data table:

Store    x    y    xy    x²
1        2    5    10     4
2        3    7    21     9
3        4    6    24    16
4        5    7    35    25
5        6    9    54    36

• Now:
\[ SS_{xy} = \sum_{i} x_i y_i - n \bar{x} \bar{y} = (10 + 21 + 24 + 35 + 54) - 5 \times 4 \times 6.8 = 144 - 136 = 8. \]

Method of LS - Example (cont'd)

• We get the sum of squared deviations of x in a similar manner:
\[ SS_{xx} = \sum_{i} x_i^2 - n (\bar{x})^2 = (4 + 9 + 16 + 25 + 36) - 5 \times 16 = 90 - 80 = 10. \]
• We can now compute the estimators for β0, β1:
\[ b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{8}{10} = 0.8, \qquad b_0 = \bar{y} - b_1 \bar{x} = 6.8 - 0.8 \times 4 = 3.6. \]

Example - Interpreting results

• β1 represents the expected change in y when x increases by one unit. Thus in the example, since b1 = 0.8 and y is measured in 100s of units, every $1,000 increase in advertising expenditures is expected to result in an additional 80 units of the product sold.
• A store that spends nothing on advertising can expect to sell about 360 units of the product.
• How many units can a store that spends $5,000 expect to sell? We need to compute ŷ, the predicted value of y for x = 5:
\[ \hat{y} = 3.6 + 0.8 \times 5 = 7.6. \]
Thus a store that spends $5,000 on advertising can expect to sell about 760 units in the period under consideration.

Example - Interpreting results (cont'd)

• What might be the expected change in sales at a store that increases advertising by $1,000? Since every additional $1,000 represents an increase of about 80 units sold, a store that increases ads by $1,000 can expect to sell its current amount + 80 units (that is, ŷ + 0.8 when y is measured in 100s).
• Would it be possible to sell more than 1000 units if advertising were increased? By how much? By trial and error:
– For $6,000 in ads we can expect to sell ŷ = 3.6 + 0.8 × 6 = 8.4, or 840 units.
– For $8,000 we can expect to sell ŷ = 3.6 + 0.8 × 8 = 10, or 1,000 units.

Example - Interpreting results (cont'd)

• More formally: for a given ŷ, solve for x from ŷ = b0 + b1x. If I know what ŷ I want and I have b0, b1, I can solve for x as
\[ x = \frac{\hat{y} - b_0}{b_1}. \]
• In the example, for ŷ = 10 and b0 = 3.6, b1 = 0.8, I get
\[ x = \frac{10 - 3.6}{0.8} = 8, \]
or $8,000, the same value we obtained earlier by trial and error. Selling more than 1,000 units would therefore require spending more than $8,000.

Residuals or errors

• Earlier we computed ŷ, the predicted value of y for a given x, as ŷ = b0 + b1x.
• Note that ŷ is an estimator of E(y), the expected value of y for a given x.
• Since we had defined ε = y − E(y), we can now estimate the errors or residuals for each observation as
\[ e_i = y_i - \hat{y}_i = y_i - b_0 - b_1 x_i. \]
• Note that the sum of the residuals is equal to 0:
\[ \sum_{i} e_i = 0. \]

Example: Tampa home sales

• Data are appraised values (x) and sale prices (y) (both in $1,000) of n = 92 residential properties sold in Tampa, FL in 1999.
• Questions of interest might be:
1. Are appraised value and sale price associated?
2. What is the expected change in sale price if the appraised value of a home increases by $20,000?
3. What sale price can a home owner expect if the house she owns is appraised at $180,000?
4. A home owner is hoping to sell his home for $500,000 or more. How much would his house need to be appraised at for his hopes to be realistic?
• See JMP and SAS outputs. SAS code is on the web site under Examples.

Example: Tampa home sales (cont'd)

1. Are appraised value and sale price associated? It appears so. The estimated regression coefficient b1 is 1.07, apparently different from 0.
2. Since b1 = 1.07, the expected change in sale price for every $1,000 increase in appraised value is b1 × 1,000 = $1,070. Thus an increase in appraised value of $20,000 is associated with an increase in sale price of about 20 × b1 × 1,000 = $21,400.
3. We compute ŷ for x = 180:
\[ \hat{y} = 20.94 + 1.07 \times 180 = 213.54 \]
(in $1,000). The owner of a home appraised at $180,000 can expect to get about $213,500 for it.
4. The owner wishes to make $500,000: we need to find the x for which ŷ = 500:
\[ x = \frac{500 - b_0}{b_1} = \frac{500 - 20.94}{1.07} = 447.72. \]
His hopes would be realistic if his home is appraised at $448,000 or more.

Final comments

• We can predict y for any x. However, if the x of interest is larger or smaller than all the x's included in the sample, this is called extrapolation.
• It is always dangerous to extrapolate beyond the range of the sample. We do not know whether our model holds outside of the range of the x in the sample. See figure.
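The prediction, inverse-prediction, and residual calculations used in these notes, together with a simple check for extrapolation, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `predict`, `solve_for_x`, and `is_extrapolation` are hypothetical, and the estimates b0 = 3.6, b1 = 0.8 and the data are taken from the five-store advertising example.

```python
# Estimates from the five-store example in the notes
# (x in $1,000 of advertising, y in 100s of units sold).
B0, B1 = 3.6, 0.8
x = [2, 3, 4, 5, 6]
y = [5, 7, 6, 7, 9]


def predict(x_new, b0=B0, b1=B1):
    """Predicted value y_hat = b0 + b1 * x_new (hypothetical helper)."""
    return b0 + b1 * x_new


def solve_for_x(y_target, b0=B0, b1=B1):
    """Invert the fitted line: the x at which y_hat equals y_target."""
    if b1 == 0:
        raise ValueError("slope is zero; the fitted line cannot be inverted")
    return (y_target - b0) / b1


def is_extrapolation(x_new, x_sample):
    """True when x_new lies outside the range of x values in the sample."""
    return x_new < min(x_sample) or x_new > max(x_sample)


# Residuals e_i = y_i - y_hat_i; their sum should be zero up to rounding.
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]

print(f"y_hat at x = 5: {predict(5):.1f}")
print(f"x needed for y_hat = 10: {solve_for_x(10):.1f}")
print(f"sum of residuals: {sum(residuals):.6f}")
print(f"x = 8 is an extrapolation: {is_extrapolation(8, x)}")
```

In this sample x ranges from 2 to 6, so both the $8,000 answer to the third question and the prediction at x = 0 fall outside the observed range and deserve the caution described above.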