Math 58B - Introduction to Biostatistics
Spring 2015
Jo Hardin
Lab Assignment 12
Lab Goals:
1. To understand the concept of a null sampling distribution for the slope of a regression
line.
2. To be able to run & interpret a correlation in R.
3. To be able to run & interpret a regression line in R.
Note 1: if you can't remember what an ISCAM function does, pass it the argument "?". Also,
remember to look at the back of each chapter for R summaries.
load(url("http://www.rossmanchance.com/iscam2/ISCAM.RData"))
Note 2: Although you'll primarily use the applet for this lab, use the function lm to run
linear models in R.
help(lm)
# page 326
## starting httpd help server ... done
install.packages("car", repos="http://cran.us.r-project.org")
## Installing package into '\\wells/fac-staff$/jsh04747/My Documents/R/winlibrary/3.1'
## (as 'lib' is unspecified)
## package 'car' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\jsh04747\AppData\Local\Temp\RtmpieQbEL\downloaded_packages
library(car)
## Warning: package 'car' was built under R version 3.1.3
In class
In this lab, we will investigate a new setting: inference for simple linear regression. Go
through Investigation 4.10 & 4.11.
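As a rough picture of what a null sampling distribution for the slope looks like, here is a minimal sketch that shuffles the response values to break any association with the explanatory variable and refits the line each time. The data are simulated and every name (x, y, null_slopes, p_value) is hypothetical. The applet used in the investigations may set up its simulation differently, so treat this only as an illustration of the idea.

```r
# Simulated data (hypothetical; not from the ISCAM investigations)
set.seed(58)
n <- 30
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 2)

# Observed least squares slope
obs_slope <- unname(coef(lm(y ~ x))[2])

# Shuffle the response, refit the line, and keep the slope; repeat 1000 times
null_slopes <- replicate(1000, unname(coef(lm(sample(y) ~ x))[2]))

# The shuffled slopes should pile up around 0
hist(null_slopes, xlab = "slope when there is no association",
     main = "Simulated null distribution of the slope")
abline(v = obs_slope, lty = 2)  # mark the observed slope

# Proportion of shuffled slopes at least as extreme as the observed slope
p_value <- mean(abs(null_slopes) >= abs(obs_slope))
p_value
```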
To remove the 47th row from a dataset called "mydata":
datanew = mydata[-47,]
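To check that the negative index did what you expect, compare the number of rows before and after. The small data frame below is a made-up stand-in for "mydata".

```r
# Hypothetical data frame standing in for "mydata"
mydata <- data.frame(id = 1:50, value = (1:50)^2)

# A negative row index drops that row; the empty slot after the comma keeps all columns
datanew <- mydata[-47, ]

nrow(mydata)         # 50
nrow(datanew)        # 49
47 %in% datanew$id   # FALSE
```

The original dataset is left unchanged, so you can fit the model to both versions and compare.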
To turn in
Consider the cat data from Investigation 4.6. Use R to create the scatterplot as well as to
find the correlation and least squares regression line.
1. Create a scatterplot with the explanatory variable (you choose!) on the x-axis, and the
response variable on the y-axis. Be sure to have your axes labeled.
plot(explanatory, response, xlab="my x label", ylab="my y label")
2. Superimpose the regression line onto your scatterplot.
plot(explanatory, response)
abline(lm(response~explanatory))
3. Calculate and interpret the correlation coefficient (sign, strength, linearity).
cor(explanatory, response)
4. Square the correlation coefficient to obtain r^2. Interpret the coefficient of determination
in context (i.e., using words like "square feet" and "price").
5. Determine and interpret the slope of the least squares line in context. The interpretation
should be of the form: for every additional ____ we estimate that ____ changes by ____.
lm(response ~ explanatory)
summary(lm(response ~ explanatory))
6. Give a confidence interval for the slope. Interpret the confidence interval. Based only on
the confidence interval, is your slope statistically significant? Explain how you know. (Note
that I haven't been explicit about the hypothesis test, but you should know the implicit
hypothesis test to which I'm referring.)
7. Determine and interpret the intercept of the least squares regression line. Explain what
this value might signify in this context. Is the interpretation meaningful in this context?
Explain.
8. Going back to (u) of 4.10, give the intuition as to why each of the observations in the box
makes sense. That is, why does the sampling variability of b1 change (in the appropriate
direction) for each of n, σ, and s_x?
9. Describe to someone who has taken the first 3/4 of Math 58B the differences between
the sampling distributions in Investigations 4.10 and 4.11.
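The R commands used in the questions above can be strung together as in the sketch below. It uses R's built-in cars data (speed and stopping distance) purely as a stand-in, not the cat data from Investigation 4.6, and the object names fit, r, r2, b1, se, tstar, and ci are hypothetical. The confint function is one way to get a confidence interval for the slope; the same interval is also built by hand from the slope, its standard error, and a t multiplier.

```r
# Stand-in data: R's built-in 'cars' (speed in mph, stopping distance in ft).
# This is NOT the cat data; swap in your own explanatory and response variables.
explanatory <- cars$speed
response <- cars$dist

# Scatterplot with labeled axes and the least squares line superimposed
plot(explanatory, response, xlab = "speed (mph)", ylab = "stopping distance (ft)")
fit <- lm(response ~ explanatory)
abline(fit)

# Correlation and coefficient of determination
r <- cor(explanatory, response)
r2 <- r^2   # same value as "Multiple R-squared" in summary(fit)

# Coefficient table: estimates, standard errors, t statistics, p-values
summary(fit)

# 95% confidence interval for the slope
ci <- confint(fit, "explanatory", level = 0.95)
ci

# The same interval by hand: slope +/- (t multiplier) x (standard error)
b1 <- unname(coef(fit)["explanatory"])
se <- summary(fit)$coefficients["explanatory", "Std. Error"]
tstar <- qt(0.975, df = fit$df.residual)
c(b1 - tstar * se, b1 + tstar * se)
```

Storing the fitted model in an object such as fit avoids refitting it for every command.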