Multiple Linear Regression in R

Assignment
Problem: For the homeprice dataset,
1. What does a half bathroom do for the sale price?
2. How do the coefficients change if you force the intercept, b0, to be 0? (Use a -1 in the
model formula notation.) Does it make any sense for this model to have no intercept
term?
3. What is the effect of neighbourhood on the difference between sale price and list
price?
4. Do nicer neighbourhoods mean it is more likely to have a house go over the asking
price?
5. Is there a relationship between houses which sell for more than predicted (a positive
residual) and houses which sell for more than asking? (If so, then perhaps the real
estate agents aren't pricing the home correctly.)
Ans:
Description
The homeprice data frame has 29 rows and 7 columns, representing the sale prices of homes in
New Jersey in the year 2001. This dataset is a random sample of the homes sold in
Maplewood, NJ during the year 2001. Of course the prices will seem either incredibly high or
fantastically cheap depending on where you live, and on whether you have recently purchased a home.
This data frame contains the following columns:
list: list price of home (in thousands)
sale: actual sale price
full: number of full bathrooms
half: number of half bathrooms
bedrooms: number of bedrooms
rooms: total number of rooms
neighborhood: subjective assessment of the neighbourhood on a scale of 1-5
1. Half bathroom and the Sale price
> plot(sale~half, data=homeprice)
> hlm <- lm(sale~half , data=homeprice)
> abline(hlm)
> summary(hlm)
Call:
lm(formula = sale ~ half, data = homeprice)

Residuals:
    Min      1Q  Median      3Q     Max
-180.27  -75.27  -22.34   72.66  246.58

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   228.27      28.78   7.932 1.59e-08 ***
half           69.08      31.00   2.229   0.0344 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 109.8 on 27 degrees of freedom
Multiple R-squared: 0.1554, Adjusted R-squared: 0.1241
F-statistic: 4.966 on 1 and 27 DF, p-value: 0.03436
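From the plot and the fitted line we can observe that the sale price increases with the number
of half bathrooms. The slope estimate is 69.08, so each additional half bathroom is associated
with a sale price about 69.08 higher (in the units of sale), and the effect is significant at
the 5% level (p = 0.0344). The fit is loose, however: the model explains only about 16% of the
variation in sale price (R-squared 0.1554).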
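The slope and its uncertainty can also be read off the fitted model directly rather than from the printed summary. The following is a minimal sketch on a small invented data frame (`toy` is hypothetical and is not the homeprice data), so that it runs without any extra package; with the real data, `toy` would be replaced by `homeprice`.

```r
# Hypothetical toy data standing in for homeprice (values are invented).
toy <- data.frame(
  half = c(0, 0, 1, 1, 1, 2, 2, 0, 1, 2),
  sale = c(210, 250, 280, 300, 260, 390, 350, 190, 310, 420)
)

fit <- lm(sale ~ half, data = toy)

# Coefficient table: estimate, standard error, t value, p-value.
coefs <- coef(summary(fit))
print(coefs)

# 95% confidence interval for the change in sale price per half bathroom.
ci <- confint(fit, "half", level = 0.95)
print(ci)
```

A confidence interval that excludes 0 tells the same story as a small p-value, and it also shows how imprecisely the size of the effect is known.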
2. Forcing the intercept b0 to be 0
> hlm <- lm(sale~half-1 , data=homeprice)
> abline(hlm, col="red") # no-intercept line, drawn in red on the existing plot
> summary(hlm)
Call:
lm(formula = sale ~ half - 1, data = homeprice)

Residuals:
    Min      1Q  Median      3Q     Max
-222.62    6.44  117.70  215.00  450.00

Coefficients:
     Estimate Std. Error t value Pr(>|t|)
half   242.56      39.36   6.163 1.18e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 196.8 on 28 degrees of freedom
Multiple R-squared: 0.5757, Adjusted R-squared: 0.5605
F-statistic: 37.98 on 1 and 28 DF, p-value: 1.181e-06
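Forcing the intercept to 0 changes the fit substantially. With an intercept, the estimates are
b0 = 228.27 and a slope of 69.08 for half; without one, b0 is fixed at 0 and the slope rises to
242.56.
It does not make sense for this model to have no intercept term: a line through the origin
predicts a sale price of 0 for a house with no half bathrooms. The residual standard error
also rises from 109.8 to 196.8. The larger R-squared (0.5757 against 0.1554) is not evidence of
a better fit, because for a model without an intercept R computes R-squared relative to 0
rather than to the mean sale price. In the plot, the no-intercept line (red line) departs
clearly from the original regression line (black line).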
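The R-squared values of the two models are not directly comparable, because R uses a different baseline when the intercept is removed. The sketch below recomputes both values by hand on a small invented data frame (`toy` is hypothetical, not the homeprice data) to show which baseline each one uses.

```r
# Hypothetical toy data (invented values, not the homeprice data).
toy <- data.frame(
  half = c(0, 0, 1, 1, 1, 2, 2, 0, 1, 2),
  sale = c(210, 250, 280, 300, 260, 390, 350, 190, 310, 420)
)

fit1 <- lm(sale ~ half, data = toy)      # with intercept
fit0 <- lm(sale ~ half - 1, data = toy)  # intercept forced to 0

rss1 <- sum(resid(fit1)^2)
rss0 <- sum(resid(fit0)^2)

# With an intercept, R-squared compares the fit against the mean of y.
r2_with <- 1 - rss1 / sum((toy$sale - mean(toy$sale))^2)

# Without an intercept, R reports R-squared relative to zero instead.
r2_without <- 1 - rss0 / sum(toy$sale^2)

print(c(reported = summary(fit1)$r.squared, by_hand = r2_with))
print(c(reported = summary(fit0)$r.squared, by_hand = r2_without))

# Removing the intercept can only keep or increase the residual sum of squares.
print(c(rss_with = rss1, rss_without = rss0))
```

Comparing the residual sums of squares, or the residual standard errors, is therefore a safer way to judge the two fits than comparing the reported R-squared values.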
3. Effect of neighbourhood on the difference between sale price and list price
> library(lattice) # provides xyplot
> xyplot(sale ~ list | neighborhood, panel=panel.lm, data=homeprice)
> nbd = as.numeric(cut(homeprice$neighborhood, c(0,2,3,5), labels=c(1,2,3)))
> table(nbd) # check that we partitioned well
nbd
1 2 3
10 12 7
> xyplot(sale ~ list | nbd, data=homeprice, panel=panel.lm, layout=c(3,1))
Within each neighbourhood group, the sale price is close to a linear function of the list
price.
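The grouping step relies on how cut() assigns values to intervals. As a small illustration, the sketch below applies the same breaks to one rating of each level (the `rating` vector is illustrative only, not taken from the data).

```r
# Neighbourhood ratings on the 1-5 scale (illustrative values only).
rating <- c(1, 2, 3, 4, 5)

# Breaks c(0, 2, 3, 5) define the intervals (0,2], (2,3] and (3,5].
grp <- as.numeric(cut(rating, c(0, 2, 3, 5), labels = c(1, 2, 3)))
print(grp)
```

Ratings 1 and 2 fall in group 1, rating 3 in group 2, and ratings 4 and 5 in group 3, which is the three-way split tabulated above.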
> a = homeprice$list - homeprice$sale # list price minus sale price
> plot(a~neighborhood, data=homeprice)
> hlm <- lm(a~neighborhood , data=homeprice)
> abline(hlm)
> summary(hlm)
Call:
lm(formula = a ~ neighborhood, data = homeprice)

Residuals:
   Min     1Q Median     3Q    Max
-33.05  -5.80   0.85   7.50  30.05

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)    -7.800      7.435  -1.049    0.303
neighborhood    3.150      2.428   1.298    0.205

Residual standard error: 13 on 27 degrees of freedom
Multiple R-squared: 0.0587, Adjusted R-squared: 0.02383
F-statistic: 1.684 on 1 and 27 DF, p-value: 0.2054
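The fitted slope is 3.150 per neighbourhood level, so the difference a rises with the quality
of the neighbourhood, but the trend is weak: the slope is not statistically significant
(p = 0.205) and the model explains only about 6% of the variation (R-squared 0.0587). Because
a is computed as list price minus sale price, a larger a means the house sold further below
its list price.
4. Not on this evidence. If nicer neighbourhoods made it more likely for a house to sell over
the asking price, a (list minus sale) would fall as the neighbourhood rating rises. The fitted
slope is positive and not significant, so this regression does not support that conclusion.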
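The question of whether a house is more likely to go over asking can also be examined directly, by turning "sold over asking" into a yes/no variable. The following is a rough sketch on invented data (`toy` and its values are hypothetical, not the homeprice data); the logistic regression is one possible approach and is not taken from the analysis above.

```r
# Hypothetical toy data (invented values, not the homeprice data).
toy <- data.frame(
  list = c(100, 150, 200, 250, 300, 350, 400, 450, 500, 550),
  sale = c(105, 155, 210, 245, 310, 340, 395, 460, 490, 560),
  neighborhood = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
)

# TRUE when the house sold for more than its asking (list) price.
toy$over <- toy$sale > toy$list

# Share of houses sold over asking within each neighbourhood rating.
share_over <- tapply(toy$over, toy$neighborhood, mean)
print(share_over)

# Logistic regression of "sold over asking" on neighbourhood rating.
fit <- glm(over ~ neighborhood, family = binomial, data = toy)
print(coef(summary(fit)))
```

With only 29 houses in the real data, the per-rating shares would rest on a handful of sales each, so any pattern should be read cautiously.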