Worksheet on Using Excel to do Regression

AP Stats
Using Excel
Name_______________________
1. A researcher wants to know if there is a relationship between the number of shopping centers in a
state and the retail sales (in billions of dollars) of that state. Data from a random sample of 8 states
are listed below. After determining, via a scatterplot, that the data followed a linear pattern, the
regression line was found. Using the given data and the given regression output, answer the following
questions.
# Shoppers    Sales (in Billions)
630           15.5
370           7.5
616           13.9
700           18.7
430           8.2
568           13.2
1200          23
2976          87.3

a. Find the least squares line.
b. Is this linear model a good fit? Explain.
c. Can we assume the data is approximately normal? Explain.
d. What is the slope? Interpret it.
e. What is the y-intercept? Interpret it.
f. What is the strength of the relationship?
g. Find the residual for 700 shoppers.
h. What is the coefficient of determination? Interpret it.
i. Find the SSresid, SSTo, and se.
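The quantities asked for in Question 1 can also be checked outside Excel. The following is a minimal Python sketch, not part of the original worksheet, that assumes the usual least-squares formulas: slope b = Sxy/Sxx, intercept a = y-bar - b(x-bar), SSresid as the sum of squared residuals, SSTo as the sum of squared deviations of y from its mean, and se = sqrt(SSresid/(n - 2)). The helper name `least_squares` and the variable names are illustrative only.

```python
from math import sqrt

# Data from the Question 1 table.
# x is the first column (labeled "# Shoppers" in the worksheet).
x = [630, 370, 616, 700, 430, 568, 1200, 2976]
y = [15.5, 7.5, 13.9, 18.7, 8.2, 13.2, 23, 87.3]  # sales, billions of dollars


def least_squares(x, y):
    """Fit y-hat = a + b*x and return summary quantities (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_to = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares

    b = s_xy / s_xx                # slope
    a = y_bar - b * x_bar          # y-intercept
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    ss_resid = sum(e ** 2 for e in residuals)  # residual sum of squares

    return {
        "intercept": a,
        "slope": b,
        "r": s_xy / sqrt(s_xx * ss_to),        # correlation coefficient
        "r_squared": 1 - ss_resid / ss_to,     # coefficient of determination
        "residuals": residuals,
        "ss_resid": ss_resid,
        "ss_to": ss_to,
        "s_e": sqrt(ss_resid / (n - 2)),       # standard error of the estimate
    }


fit = least_squares(x, y)
residual_700 = fit["residuals"][x.index(700)]  # observed minus predicted at x = 700

print(f"y-hat = {fit['intercept']:.4f} + {fit['slope']:.5f} x")
print(f"r = {fit['r']:.4f}, r^2 = {fit['r_squared']:.4f}")
print(f"residual at x = 700: {residual_700:.4f}")
print(f"SSresid = {fit['ss_resid']:.3f}, SSTo = {fit['ss_to']:.3f}, se = {fit['s_e']:.3f}")
```

Replacing `x` and `y` with the two columns of another data table in this worksheet should give the corresponding values for that question.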
2. Coffee is a leading export from several developing countries. When coffee prices are
high, farmers often clear forest to plant more coffee trees. Below are five years’ data
on prices paid to coffee growers in Indonesia and the percent of forest area lost in a
national park that lies in a coffee-producing region.
Price (cents per pound)    Forest lost (percent)
29                         0.49
40                         1.59
54                         1.69
55                         1.82
71                         3.1

a. Find the least squares line.
b. Is the linear model a good model to use? Explain.
c. Can we assume the data is approximately normal? Explain.
d. What is the slope? Interpret it.
e. What is the y-intercept? Interpret it.
f. What percent of the variation in percent of forest lost can be explained by the
least squares line?
g. What is the strength of the relationship?
h. Find the residual for 54 cents per pound.
i. Find the SSresid, SSTo, and se.
Answer the following parts for Questions 3-4 below.
a. Find the least squares line.
b. What is the slope? Interpret it.
c. What is the y-intercept? Interpret it.
d. What is the strength of the relationship?
e. What is the coefficient of determination? Interpret it.
f. Find the SSresid, SSTo, and se.
3. A chemical company wants to study the effect of extraction time on the efficiency of an extraction
process. They obtained a random sample of extraction times and the corresponding efficiency scores.
The output from Excel is given below.
Regression Statistics
Multiple R    0.864
R Square      0.746
Std Error     5.139
Obs           15

            Coefficients  Std Error  t Stat    P-value   Lower 95%  Upper 95%
Intercept   39.022        4.173079   9.350943  3.9E-07   30.00684   48.03761
Time        0.764         0.123639   6.178365  3.33E-05  0.496782   1.030995
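The Question 3 output does not list SSresid or SSTo directly, but both can be recovered from the values it does give. The sketch below is not part of the original worksheet; it assumes the standard identities se = sqrt(SSresid/(n - 2)) and R Square = 1 - SSresid/SSTo, and it also checks that each t Stat equals the coefficient divided by its standard error. All variable names are illustrative.

```python
# Values copied from the Question 3 Excel output.
n = 15               # Obs
r_squared = 0.746    # R Square
s_e = 5.139          # Std Error (regression statistics)

intercept, se_intercept, t_intercept = 39.022, 4.173079, 9.350943
slope, se_slope, t_slope = 0.764, 0.123639, 6.178365

# se = sqrt(SSresid / (n - 2))  =>  SSresid = se^2 * (n - 2)
ss_resid = s_e ** 2 * (n - 2)

# R^2 = 1 - SSresid / SSTo  =>  SSTo = SSresid / (1 - R^2)
ss_to = ss_resid / (1 - r_squared)

# Each t statistic is the coefficient divided by its standard error.
t_intercept_check = intercept / se_intercept
t_slope_check = slope / se_slope

print(f"SSresid = {ss_resid:.2f}")
print(f"SSTo    = {ss_to:.2f}")
print(f"t (intercept) = {t_intercept_check:.3f} vs reported {t_intercept}")
print(f"t (slope)     = {t_slope_check:.3f} vs reported {t_slope}")
```

Because the output values are rounded, the recovered sums of squares are approximate.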
4. The following is output from Excel for a regression analysis. The researcher wanted to predict total
cholesterol (mg/100ml) using weight (kg) as the predictor variable. Using the output, please answer
the following questions.
SUMMARY OUTPUT

Regression Statistics
Multiple R        0.265293
R Square          0.070381
Standard Error    76.65431
Observations      25

ANOVA
Source      df    SS        MS       F
Regress     1     10231     10231    1.741
Residual    23    135145    5875.8
Total       24    145377

            Coeff     Std Err   t Stat   P-value   Lower 95%   Upper 95%
Intercept   199.30    85.82     2.322    0.0294    21.77       376.825
Weight      1.62      1.229     1.320    0.1999    -0.921      4.1656
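The ANOVA table in the Question 4 output already contains SSresid and SSTo, and the other summary values follow from it. The sketch below is not part of the original worksheet; it assumes the standard relationships R Square = SSRegress/SSTotal, se = sqrt(MSResidual), and F = MSRegress/MSResidual, plus the fact that in simple linear regression F equals the square of the slope's t statistic. Variable names are illustrative, and small differences from the reported values are expected because the table entries are rounded.

```python
from math import sqrt

# Values copied from the Question 4 Excel output.
ss_regress, ss_resid, ss_total = 10231, 135145, 145377
df_regress, df_resid = 1, 23
reported_r_squared = 0.070381
reported_s_e = 76.65431
reported_f = 1.741
t_weight = 1.320

ms_regress = ss_regress / df_regress   # mean square for regression
ms_resid = ss_resid / df_resid         # mean square for residuals

r_squared = ss_regress / ss_total      # coefficient of determination
s_e = sqrt(ms_resid)                   # standard error of the estimate
f_stat = ms_regress / ms_resid         # F statistic

print(f"R^2 = {r_squared:.6f} (reported {reported_r_squared})")
print(f"se  = {s_e:.4f} (reported {reported_s_e})")
print(f"F   = {f_stat:.3f} (reported {reported_f})")
print(f"t^2 for Weight = {t_weight ** 2:.3f}")
```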