Chapter 1  Linear Regression with One Predictor Variable
許湘伶
Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li)

Regression analysis is a statistical methodology that utilizes the relation between two or more quantitative variables so that a response or outcome variable can be predicted from the other, or others.

Regression analysis is a statistical method for analyzing data. Its aim is to determine whether two or more variables are related, and in what direction and with what strength, and to build a mathematical model so that the variable of interest can be predicted from observed variables. (Wiki)

Origin: the term "regression" was first used by Francis Galton. Studying the heights of parents and their children, he found that although parents pass their height on to their children, the children's heights tend to "regress" toward the population average (regression to the mean). The meaning of "regression" then was not quite the same as its meaning today. (Wiki)

The "regression to the mean" phenomenon: very tall parents tend to have children somewhat shorter than themselves, while very short parents tend to have children somewhat taller than themselves; heights are pulled from the two extremes toward the average of all people. (統計改變了世界)

Regression analysis is applied in many fields:
- business
- social science
- biological sciences
- engineering
- chemical science
- economics
- management
- etc.

Relations between Variables

Functional Relation between Two Variables
- functional relation vs. statistical relation (contrasted in the sketch at the end of this part)
- Example: hours of part-time work vs. wages earned
- Y = dollar sales of a product sold at a fixed price; X = number of units sold
- If the selling price is $2 per unit, then Y = 2X.

Figure: Example of a Functional Relation, Y = f(X). (a) plot; (b) data.

Statistical Relation between Two Variables
- The observations for a statistical relation do not fall directly on the curve of relationship.
- Example: employee performance evaluations. Y = year-end evaluation; X = midyear evaluation.
- What pattern can be observed in the figure? What kind of relation holds between the midyear and year-end evaluations?

Figure: (a) scatter plot; (b) a line of relationship plotted to describe the statistical relation between X and Y.

Figure: Curvilinear Statistical Relation between Age and Steroid Level in Healthy Females Aged 8 to 25.

Regression Models and Their Uses

Basic Concepts
A regression model is a formal means of expressing the two essential ingredients of a statistical relation:
- a tendency of the response variable Y to vary with the predictor variable X in a systematic fashion;
- a scattering of points around the curve of statistical relationship.

A regression model assumes:
- a probability distribution of Y for each level of X;
- the means of these probability distributions vary in some systematic fashion with X.

Figure: Pictorial Representation of the Regression Model.

Construction of Regression Models
- Y: the dependent or response variable; X: the independent, explanatory, or predictor variable.
- Selection of predictor variables: there may be more than one predictor variable.
- Functional form of the regression relation: linear, quadratic regression functions, ...
- Scope of the model: some interval or region of values of the predictor variables; the shape of the regression function outside this range would be in serious doubt.
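As a quick illustration of the distinction drawn above, the following minimal Python sketch simulates a functional relation (every point exactly on Y = 2X) next to a statistical relation (points scattering around the same line). The sample size, the error standard deviation, and the use of numpy/matplotlib are illustrative assumptions, not part of the text.

```python
# Minimal sketch: functional relation Y = 2X vs. a statistical relation
# (the same line plus random scatter). All numbers are illustrative.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = np.linspace(25, 250, 20)                        # e.g., number of units sold
Y_functional = 2 * X                                # every point falls exactly on Y = 2X
Y_statistical = 2 * X + rng.normal(0, 40, X.size)   # points scatter around the line

fig, axes = plt.subplots(1, 2, figsize=(9, 4))
axes[0].plot(X, Y_functional, "o-")
axes[0].set_title("Functional relation: Y = f(X)")
axes[1].scatter(X, Y_statistical)
axes[1].plot(X, 2 * X, "--")
axes[1].set_title("Statistical relation")
for ax in axes:
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
plt.tight_layout()
plt.show()
```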
Extrapolation

Figure: (a) extrapolation beyond the scope of the model; (b) the danger of extrapolating.

Uses
Three major purposes of regression analysis:
1. description
2. control
3. prediction

Regression and Causality
- The existence of a statistical relation between Y and X does not imply that Y depends causally on X. (Regression does not by itself establish a cause-and-effect relation in which the explanatory variable is the cause and the response variable the effect.)
- Correlation ≠ Causation: a relation is not the same as a causal relation. Knowing that X and Y are related, we can use a known X to predict Y, but this does not mean that changing X will necessarily produce a particular change in Y.
- Examples: smoking vs. cancer; speed vs. number of traffic accidents.

Simple Linear Regression Model with Distribution of Error Terms Unspecified

Statement of Model
The linear regression model with one predictor variable:

Yi = β0 + β1 Xi + εi,  i = 1, ..., n

- Yi: the value of the response variable in the ith trial
- β0, β1: parameters
- Xi: a known constant; the value of the predictor variable in the ith trial
- εi: random error term with E{εi} = 0, σ²{εi} = σ², and uncorrelated error terms: σ{εi, εj} = 0 for all i ≠ j

The model is "simple" (one predictor variable), linear in the parameters, and linear in the predictor variable.

Features
1. Yi is the sum of two components: (1) the constant term β0 + β1 Xi and (2) the random term εi.
2. E{εi} = 0 implies E{Yi} = E{β0 + β1 Xi + εi} = β0 + β1 Xi.
3. The regression function is E{Y} = β0 + β1 X. (The regression function relates the means of the probability distributions of Y for given X to the level of X.)
4. Yi in the ith trial exceeds or falls short of the value of the regression function by the error term amount εi.
5. σ²{εi} = σ² implies σ²{Yi} = σ²{β0 + β1 Xi + εi} = σ²{εi} = σ².
6. Since the error terms are assumed to be uncorrelated, so are the responses Yi and Yj.

Figure: Illustration of the Simple Linear Regression Model.

Meaning of Regression Parameters
Regression coefficients: β0 (intercept), β1 (slope). β1 gives the change in the mean of Y per unit increase in X; β0 gives the mean of Y at X = 0 (when X = 0 lies within the scope of the model).

Figure: Meaning of the Parameters of the Simple Linear Regression Model.

Alternative Versions of the Model
1. Original regression model: Yi = β0 + β1 Xi + εi, i = 1, ..., n.
2. With a dummy constant predictor: Yi = β0 X0 + β1 Xi + εi, where X0 ≡ 1.
3. With a centered predictor:
   Yi = β0 + β1 Xi + εi
      = β0 + β1 (Xi − X̄) + β1 X̄ + εi
      = (β0 + β1 X̄) + β1 (Xi − X̄) + εi
      = β0* + β1 (Xi − X̄) + εi,  where β0* = β0 + β1 X̄.

Data for Regression Analysis
- The regression parameters β0, β1 are unknown.
- They are estimated from relevant data.
- We rely on an analysis of the data for developing a suitable regression model.

Figure: Typical Strategy for Regression Analysis.

Estimation of Regression Function

Method of Least Squares
- Observations: (Xi, Yi), i = 1, ..., n.
- Deviation of Yi from the line: Yi − β0 − β1 Xi.
- The least squares criterion (the sum runs over i = 1, ..., n):

  Q = Σ (Yi − β0 − β1 Xi)²

- What properties should "good" estimators have? (The sketch below evaluates Q for several candidate lines.)
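To make the criterion concrete, here is a minimal Python sketch that evaluates Q(b0, b1) for a few candidate lines on simulated data. The generating parameters (62.37 and 3.5702, borrowed from the lot size example below as stand-ins), the error standard deviation, and the sample size are illustrative assumptions.

```python
# Minimal sketch of the least squares criterion
#   Q(b0, b1) = sum_i (Yi - b0 - b1*Xi)^2
# evaluated at several candidate lines; all numbers are illustrative.
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(20, 120, size=25)
Y = 62.37 + 3.5702 * X + rng.normal(0, 48, size=25)   # line plus additive error

def Q(b0, b1, X, Y):
    """Sum of squared deviations of the observations from the line b0 + b1*X."""
    return np.sum((Y - b0 - b1 * X) ** 2)

for b0, b1 in [(0.0, 4.0), (62.37, 3.5702), (100.0, 3.0)]:
    print(f"b0 = {b0:6.2f}, b1 = {b1:6.4f}  ->  Q = {Q(b0, b1, X, Y):10.1f}")
```

The candidate closest to the generating line should give the smallest Q; minimizing Q over all (b0, b1) yields the least squares estimators discussed next.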
The least squares estimators b0, b1 minimize the criterion Q for the given sample observations. How are they obtained? Setting the partial derivatives of Q to zero at the minimizers,

∂Q/∂β0 = 0 and ∂Q/∂β1 = 0 at (b0, b1),

gives

b1 = Σ (Xi − X̄)(Yi − Ȳ) / Σ (Xi − X̄)²,  b0 = Ȳ − b1 X̄,

where X̄ and Ȳ are the means of the Xi and the Yi, respectively, and the sums run over i = 1, ..., n.

Data on Lot Size and Work Hours
- Refrigeration equipment manufacturing.
- Xi: lot size (i = 1, ..., 25); Yi: labor work hours.
- b1 = 3.5702; b0 = 62.37.

Normal Equations
Σ Yi = n b0 + b1 Σ Xi
Σ Xi Yi = b0 Σ Xi + b1 Σ Xi²
A test of the second partial derivatives shows that a minimum of Q is obtained with the least squares estimators b0 and b1.

Properties of Least Squares Estimators (Chap. 2)
Gauss–Markov theorem:
- Unbiased: E{b0} = β0; E{b1} = β1.
- More precise: b0 and b1 are more precise than any other estimators in the class of unbiased estimators that are linear functions of the observations Y1, ..., Yn.
- Linear functions of the Yi: b1 = Σ ki Yi with ki = (Xi − X̄) / Σⱼ (Xj − X̄)², so b0 and b1 are linear estimators.

Point Estimation of Mean Response
- Response: a value of the response variable. Mean response: E{Y}.
- Estimated regression function: Ŷ = b0 + b1 X, where Ŷ is the value of the estimated regression function at the level X of the predictor variable.
- Ŷ is an unbiased estimator of E{Y}.
- Fitted values: Ŷi = b0 + b1 Xi, i = 1, ..., n.

Residuals
- Residual: ei = Yi − Ŷi = Yi − (b0 + b1 Xi), the vertical deviation of Yi from the fitted value Ŷi on the estimated regression line; it is known.
- Model error term: εi = Yi − E{Yi} = Yi − (β0 + β1 Xi), the vertical deviation of Yi from the unknown true regression line; it is unknown.

Properties of the Fitted Regression Line
(all verified numerically in the sketch after this section; sums run over i = 1, ..., n)
- The sum of the residuals is zero: Σ ei = 0. (Rounding errors may be present.)
- The sum of the squared residuals, Σ ei², is a minimum, because the criterion Q to be minimized equals Σ ei² when b0, b1 are used for estimating β0, β1.
- The sum of the observed values equals the sum of the fitted values: Σ Yi = Σ Ŷi.
- The sum of the residuals weighted by the level of the predictor variable is zero: Σ Xi ei = 0.
- The sum of the residuals weighted by the fitted value of the response variable is zero: Σ Ŷi ei = 0.
- The regression line always passes through the point (X̄, Ȳ).

Estimation of σ²
- σ²{Yi} = σ².
- The error sum of squares or residual sum of squares: SSE = Σ (Yi − Ŷi)² = Σ ei².
- SSE has n − 2 degrees of freedom: two degrees of freedom are associated with the estimates b0 and b1 involved in obtaining Ŷi.
- E{SSE} = (n − 2)σ² (to be proved).
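The following minimal Python sketch computes b0 and b1 from the closed-form expressions above and checks the residual properties just listed on simulated data. The generating parameter values, the error standard deviation, and the sample size are illustrative assumptions.

```python
# Minimal sketch: least squares estimates, fitted values, residuals, and a
# numerical check of the fitted-line properties; all numbers are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n = 25
X = rng.uniform(20, 120, size=n)
Y = 62.37 + 3.5702 * X + rng.normal(0, 48, size=n)

# b1 = sum (Xi - Xbar)(Yi - Ybar) / sum (Xi - Xbar)^2,  b0 = Ybar - b1*Xbar
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

Y_hat = b0 + b1 * X          # fitted values
e = Y - Y_hat                # residuals

SSE = np.sum(e ** 2)         # residual sum of squares, n - 2 degrees of freedom
MSE = SSE / (n - 2)          # unbiased estimator of sigma^2 (next section)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"sum(e)         = {e.sum():+.2e}   (zero up to rounding)")
print(f"sum(Xi * ei)   = {(X * e).sum():+.2e}")
print(f"sum(Yhat * ei) = {(Y_hat * e).sum():+.2e}")
print(f"line through (Xbar, Ybar): {np.isclose(b0 + b1 * X.mean(), Y.mean())}")
print(f"SSE = {SSE:.1f}, MSE = {MSE:.1f}, sqrt(MSE) = {np.sqrt(MSE):.2f}")
```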
The error mean square or residual mean square (the sums run over i = 1, ..., n):

MSE = SSE / (n − 2) = Σ (Yi − Ŷi)² / (n − 2) = Σ ei² / (n − 2)

- MSE is an unbiased estimator of σ²: E{MSE} = σ².
- An estimate of σ is √MSE.

Normal Error Regression Model
The normal error regression model:

Yi = β0 + β1 Xi + εi

- Yi: the observed response
- Xi: a known constant
- β0, β1: parameters
- εi, i = 1, ..., n: independent N(0, σ²)

What are the properties of the normal distribution? The parameters β0, β1, and σ² can be estimated by the method of maximum likelihood (MLE).

The method of maximum likelihood chooses as the maximum likelihood estimate the parameter value for which the likelihood is largest. Two methods for finding the MLE:
- a systematic numerical search
- use of an analytical solution
(In the one-sample illustration, the MLE of the normal mean µ is the sample mean Ȳ.)

Figure: illustration with σ = 2.5, β0 = 0, β1 = 0.5.

The density of an observation Yi under the normal error regression model (E{Yi} = β0 + β1 Xi; σ²{Yi} = σ²):

fi = (1/(√(2π) σ)) exp[ −(1/2) ((Yi − β0 − β1 Xi)/σ)² ]

The likelihood function for the n observations Y1, ..., Yn:

L(β0, β1, σ²) = ∏ fi = (1/(2πσ²)^(n/2)) exp[ −(1/(2σ²)) Σ (Yi − β0 − β1 Xi)² ]

Under the normal error model the MLEs of β0 and β1 coincide with the least squares estimators b0 and b1, while the MLE of σ² is biased; the unbiased version is

MSE = (n/(n − 2)) σ̂²,  where σ̂² = SSE/n.

Ex: β̂0 = b0 = 2.81; β̂1 = b1 = 0.177. (See the sketch below.)
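To tie the pieces together, here is a minimal Python sketch of the normal-error log-likelihood and of the bias relation MSE = (n/(n − 2)) σ̂². The data-generating values and the sample size are illustrative assumptions; the MLEs of β0 and β1 are computed via the least squares formulas, with which they coincide under this model.

```python
# Minimal sketch: normal-error log-likelihood and the biased MLE of sigma^2;
# all numbers are illustrative.
import numpy as np

rng = np.random.default_rng(3)
n = 25
X = rng.uniform(20, 120, size=n)
Y = 62.37 + 3.5702 * X + rng.normal(0, 48, size=n)

def log_likelihood(b0, b1, sigma2, X, Y):
    """log L(b0, b1, sigma2) for the normal error regression model."""
    resid = Y - b0 - b1 * X
    return -0.5 * len(Y) * np.log(2 * np.pi * sigma2) - np.sum(resid ** 2) / (2 * sigma2)

# Under the normal error model, the MLEs of beta0, beta1 equal the LS estimators.
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)

sigma2_mle = np.sum(e ** 2) / n      # biased: divides SSE by n, not n - 2
MSE = n / (n - 2) * sigma2_mle       # = SSE / (n - 2), unbiased for sigma^2

print(f"log L at the MLE  : {log_likelihood(b0, b1, sigma2_mle, X, Y):.2f}")
print(f"sigma2_mle = {sigma2_mle:.1f}  vs.  MSE = {MSE:.1f}")
```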