Stat 401 B – Lecture 6

Simple Linear Regression

Question
Is annual carbon dioxide concentration related to annual global temperature?

Simple Linear Regression
• Response variable, Y: annual global temperature (°C).
• Explanatory (predictor) variable, x: annual atmospheric CO2 concentration.

Regression model
$Y = \mu_{y|x} + \varepsilon$
• $Y$ represents a value of the response variable.
• $\mu_{y|x}$ represents the population mean response for a given value of the explanatory variable, x.
• $\varepsilon$ represents the random error.

Linear Model
$Y = \mu_{y|x} + \varepsilon = \beta_0 + \beta_1 x + \varepsilon$
• $\beta_0$: the Y-intercept parameter.
• $\beta_1$: the slope parameter.

Conditions
• The relationship is linear.
• The random error term, $\varepsilon$, is independent, identically distributed, and normally distributed with standard deviation $\sigma$.

[Figure: scatterplot of Temp (vertical axis, 13.5 to 15.0) versus CO2 (horizontal axis, 300 to 400).]

Describe the plot.
• Direction: positive/negative.
• Form: linear/non-linear.
• Strength.
• Unusual points?

Method of Least Squares
Find estimates of $\beta_0$ and $\beta_1$ such that the sum of squared vertical deviations from the estimated straight line is the smallest possible.

Least Squares Estimates
$\hat{\beta}_1 = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

[Figure: Bivariate Fit of Temp By CO2, scatterplot with the fitted line (Linear Fit).]

Linear Fit
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
Predicted Temp = 9.8815 + 0.012584*CO2

Interpretation
Estimated Y-intercept. This does not have an interpretation within the context of the problem. Having no CO2 in the atmosphere is not reasonable given the data.

Interpretation
Estimated slope. For each additional 1 ppmv of CO2, the annual global temperature goes up 0.012584 °C, on average.

[Figure: Bivariate Fit of Temp By CO2, scatterplot with the fitted line (Linear Fit).]

How Strong?
The strength of a linear relationship can be measured by $R^2$, the coefficient of determination. RSquare in JMP output.

How Strong?
$R^2 = \dfrac{\text{SS Model}}{\text{SS Total}}$
$R^2 = \dfrac{0.80145}{0.99450} = 0.806$

Interpretation
80.6% of the variation in the global temperature can be explained by the linear relationship with carbon dioxide concentration. 19.4% is unexplained.

Interpretation
There is a fairly strong positive linear relationship between carbon dioxide concentration and global temperature. Cause and effect?

Cause and Effect?
There is a strong positive linear relationship between the number of 2nd graders in communities and the number of crimes committed in those communities.

Connection to Correlation
If you square the correlation coefficient, r, relating carbon dioxide to global temperature you get $R^2$, the coefficient of determination.
$r = \pm\sqrt{R^2} = +\sqrt{0.806} = +0.898$

Connection to Correlation
$\hat{\beta}_1 = r \left( \dfrac{s_y}{s_x} \right)$
• $s_y$ is the standard deviation of the y values.
• $s_x$ is the standard deviation of the x values.
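
Worked example (sketch)
The least squares formulas, the R-squared ratio, and the two correlation identities above can be checked numerically. The following is a minimal sketch in plain Python, not JMP output. The function names `least_squares` and `sample_sd` and the five-point toy data set are assumptions made for illustration; the toy data are not the lecture's CO2 and temperature data. The last few lines only redo arithmetic with the values quoted in the lecture (0.80145, 0.99450, 9.8815, 0.012584); the CO2 value of 350 is an arbitrary choice within the plotted range.

```python
import math


def least_squares(x, y):
    """Fit y = b0 + b1*x by least squares; return (b0, b1, r_squared, r)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)

    b1 = sxy / sxx            # slope estimate
    b0 = y_bar - b1 * x_bar   # intercept estimate

    # SS Model: variation of the fitted values about the mean response.
    ss_model = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
    r_squared = ss_model / ss_total

    # Correlation coefficient; it carries the sign of the slope.
    r = sxy / math.sqrt(sxx * ss_total)
    return b0, b1, r_squared, r


def sample_sd(values):
    """Sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


# Hypothetical toy data, NOT the lecture's CO2/temperature data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1, r_squared, r = least_squares(x, y)
slope_from_r = r * sample_sd(y) / sample_sd(x)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"R^2 = {r_squared:.4f}, r^2 = {r * r:.4f}")
print(f"r * s_y / s_x = {slope_from_r:.4f}")

# Arithmetic with the values quoted in the lecture.
r2_lecture = 0.80145 / 0.99450          # SS Model / SS Total
r_lecture = math.sqrt(r2_lecture)       # positive root, because the slope is positive
temp_at_350 = 9.8815 + 0.012584 * 350   # 350 lies within the plotted CO2 axis (300 to 400)

print(f"lecture R^2 = {r2_lecture:.3f}, r = {r_lecture:.3f}")
print(f"Predicted Temp at CO2 = 350: {temp_at_350:.4f}")
```

For the toy data, the squared correlation equals R-squared and r times s_y/s_x reproduces the slope estimate, which is what the two "Connection to Correlation" slides state. The sign of r is taken from the slope, which is why the lecture uses the positive square root.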