Worksheet 11 – Chapter 10 – Simple Linear Regression
Name:____________________ Section: __________________

1. You want to develop a model to predict the selling price of homes based on assessed value. A random sample of 30 recently sold single-family houses in a small city is selected to study the relationship between selling price (in thousands of dollars) and assessed value (in thousands of dollars). The data are in the HOUSE file. Using the steps shown in the notes and the flow chart from the notes and book, perform a simple linear regression analysis of these data. Use DDXL to perform the initial analysis, then interpret the values found from DDXL within the context of this problem. If appropriate, predict the selling price for a house whose assessed value is $170,000 and create a 95% confidence interval for that value.

   1. Hypothesize the deterministic component of the model
      \(E(y) = \beta_0 + \beta_1 x\)

   2. Use the sample data to estimate the unknown parameters in the model
      a. Plot the data in a scatter plot. Determine whether fitting a line to the data seems appropriate based on the graph.
         The data appear to follow a linear pattern.
      b. Find the slope and interpret
         slope = \(\hat{\beta}_1 = 1.78171\)
         For every $1,000 increase in assessed value, the mean selling price is estimated to increase by 1.7817 thousand dollars (about $1,782).
      c. Find the y-intercept and interpret
         y-intercept = \(\hat{\beta}_0 = -122.344\)
         There is no valid interpretation of the y-intercept for this problem, because an assessed value of $0 is neither within the scope of the sampled data nor plausible.
      d. Prediction equation:
         \(\hat{y} = -122.344 + 1.78171\,(\text{Assessed Value})\)

   3. Specify the probability distribution of the random error term and estimate the standard deviation of this distribution
      a. Check the assumptions about the probability distribution of the random error
         i. The mean of ε (random error) is 0
         ii. The variance of ε is constant
         iii. The probability distribution of ε is normal
         iv. The ε's are independent of one another
      b. Find the estimated variability of the random error and interpret
         \(s = 3.475\)
         \(s^2 = (3.475)^2 = 12.0756\)
         We expect approximately 95% of the observed selling prices to lie within 2(3.475) = 6.95 thousand dollars of their respective least squares predicted selling prices.

   4. Statistically evaluate the usefulness of the model
      a. Hypothesis test for \(\beta_1\)
         Hypotheses: \(H_0: \beta_1 = 0\), \(H_A: \beta_1 \neq 0\)
         Assumptions: (See step 3a above)
         Test: Test statistic: t = 18.7, p-value ≤ 0.0001
         Summary: At the 5% significance level, the p-value is less than \(\alpha\), so we reject \(H_0\). There is sufficient evidence to suggest that the population slope of the regression line predicting selling price ($1000) from assessed value ($1000) is different from 0. Our model is statistically useful.
      b. Calculate the coefficient of correlation (r) and interpret
         \(r = \sqrt{0.926} = 0.9623\)
         There is a strong positive linear association between selling price ($1000) and assessed value ($1000).
      c. Calculate the coefficient of determination (\(r^2\)) and interpret
         \(r^2 = 0.926\)
         About 92.6% of the sample variation in selling price ($1000) can be explained by using assessed value ($1000) to predict selling price in our linear model.
      d. Is the model practically useful?
         Since \(r^2\) is large and s is small relative to the possible values for y, the model is practically useful.

   5. Use the model for prediction and estimation
      a. Use the line for prediction
         \(\hat{y} = -122.344 + 1.78171\,(\text{Assessed Value})\)
         \(\hat{y} = -122.344 + 1.78171\,(170)\)
         \(\hat{y} = 180.5467\) thousand dollars
      b. Create confidence intervals for estimation and interpret
         We are 95% confident that the true mean selling price for a house whose assessed value is $170,000 lies between $178,710 and $182,390.
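
The arithmetic in steps 3 through 5 can be double-checked with a short script. The following is only a sketch: it does not refit the model (the HOUSE data are not reproduced here), it simply reuses the values reported from DDXL above. The variable and function names are illustrative. It also assumes the standard simple-linear-regression identity t = r·sqrt(n − 2)/sqrt(1 − r²) as a consistency check on the reported test statistic, with n = 30 taken from the problem statement.

```python
import math

# Values reported in the worksheet (from the DDXL output); names are illustrative.
b0 = -122.344       # estimated y-intercept
b1 = 1.78171        # estimated slope
s = 3.475           # estimated standard deviation of the random error
r_squared = 0.926   # coefficient of determination
n = 30              # sample size stated in the problem


def predict(assessed_value):
    """Predicted selling price (in $1000s) for an assessed value given in $1000s."""
    return b0 + b1 * assessed_value


# Step 5a: prediction for a house assessed at $170,000 (x = 170).
y_hat = predict(170)

# Step 3b: estimated error variance and the "2s" rule-of-thumb margin.
s_squared = s ** 2
two_s = 2 * s

# Step 4b: r is the positive square root of r^2 because the slope is positive.
r = math.sqrt(r_squared)

# Step 4a (consistency check, assumed identity for simple linear regression):
# t = r * sqrt(n - 2) / sqrt(1 - r^2) should be close to the reported t = 18.7.
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

# Step 5b: the reported 95% interval should be centered near the prediction.
ci_low, ci_high = 178.71, 182.39
ci_midpoint = (ci_low + ci_high) / 2

print(f"y_hat       = {y_hat:.4f} thousand dollars")
print(f"s^2         = {s_squared:.4f}")
print(f"2s          = {two_s:.2f} thousand dollars")
print(f"r           = {r:.4f}")
print(f"t (from r)  = {t_from_r:.2f}")
print(f"CI midpoint = {ci_midpoint:.2f} thousand dollars")
```

If the values printed for the prediction, r, and the t statistic agree with the worksheet to rounding, the reported results are internally consistent. Reproducing the confidence interval itself would require the raw HOUSE data.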