Political Science 30: Political Inquiry

Linear Regression II: Making Sense
of Regression Results

Interpreting SPSS regression output
• Coefficients for independent variables
• Fit of the regression: R Square
Statistical significance
• How to reject the null hypothesis
Multivariate regressions
• College graduation rates
• Ethnicity and voting

Linear Regression: Review

We want to draw the line that best represents the
relationship between the IV (X) and the DV (Y).
• Y = a + b*X
• The line allows us to predict the DV given a value of the IV.
• Regression finds the values of a and b that
minimize the total squared vertical distance between
the points and the line.
• Technically, a and b are population
parameters. We only get to calculate sample
statistics, a-hat and b-hat.
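To make the idea of "finding a-hat and b-hat" concrete, here is a minimal sketch of the least-squares calculation in Python with NumPy. The five colleges below are hypothetical numbers made up for illustration (they are not the course data), and the function name `fit_line` is an assumption of this sketch, not something SPSS exposes.

```python
import numpy as np

def fit_line(x, y):
    """Return (a_hat, b_hat) for the least-squares line y = a_hat + b_hat * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()          # deviations of X from its mean
    y_dev = y - y.mean()          # deviations of Y from its mean
    b_hat = np.sum(x_dev * y_dev) / np.sum(x_dev ** 2)   # slope
    a_hat = y.mean() - b_hat * x.mean()                  # y-intercept
    return float(a_hat), float(b_hat)

# Hypothetical sample of five colleges (made up for illustration).
sat_scores = [900, 1000, 1100, 1200, 1300]
grad_rates = [55, 62, 66, 75, 82]

a_hat, b_hat = fit_line(sat_scores, grad_rates)
print(f"a-hat = {a_hat:.2f}, b-hat = {b_hat:.3f}")  # a-hat = -5.70, b-hat = 0.067
```

The slope is the sum of cross-deviations divided by the sum of squared X deviations, and the intercept is chosen so the line passes through the point of means.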
Interpreting SPSS regression output
[Figure: scatterplot of Graduation Rate (y-axis, 0 to 100) against Average SAT Score (x-axis, 0 to 1600) with the fitted regression line. Callouts: Slope or “coefficient”; Y-intercept or “constant”; How tight is the fit? Rsq = 0.3454]
Interpreting SPSS regression output

An SPSS regression output includes two
key tables for interpreting your results:

A “Coefficients” table that contains the y-intercept (or “constant”) of the regression, a
coefficient for every independent variable,
and the standard error of that coefficient.

A “Model Summary” table that gives you
information on the fit of your regression.
Interpreting SPSS regression output:
Coefficients
Coefficients (a)

Model 1             B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.
(Constant)          4.236                7.048                              .601    .549
Average SAT Score   5.88E-02             .007         .588                  8.778   .000

a. Dependent Variable: Graduation Rate
In this class, we will ONLY LOOK AT
UNSTANDARDIZED COEFFICIENTS!
• The y-intercept is 4.2% with a standard error of 7.0%
• The coefficient for SAT Scores is 0.059%, with a
standard error of 0.007%.
Interpreting SPSS regression output:
Coefficients
Est. Graduation Rate = 4.2 + 0.059 * Average SAT Score
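As a quick illustration of how this equation is used, the sketch below plugs the unstandardized coefficients from the Coefficients table into Y = a + b*X. The SAT averages of 1000 and 1400 are hypothetical inputs chosen for the example, and the function name is an assumption of this sketch.

```python
A_HAT = 4.236    # constant (y-intercept) from the Coefficients table
B_HAT = 0.0588   # coefficient for Average SAT Score (5.88E-02 in SPSS notation)

def predicted_graduation_rate(avg_sat):
    """Predicted graduation rate (in percent) for a given average SAT score."""
    return A_HAT + B_HAT * avg_sat

# Hypothetical colleges with average SAT scores of 1000 and 1400.
rate_1000 = predicted_graduation_rate(1000)
rate_1400 = predicted_graduation_rate(1400)
print(f"{rate_1000:.1f}")  # 63.0
print(f"{rate_1400:.1f}")  # 86.6
```

Note that SPSS prints small numbers in scientific notation: 5.88E-02 means 0.0588.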
Interpreting SPSS regression output:
Coefficients

The y-intercept or constant is the
predicted value of the dependent variable
when the independent variable takes on
the value of zero.
• This basic model predicts that when a
college admits a class of students who
averaged zero on their SAT, 4.2% of them
will graduate.
• The constant is not the most helpful statistic.

Interpreting SPSS regression output:
Coefficients

The coefficient of an independent variable
is the predicted change in the dependent
variable that results from a one unit
increase in the independent variable.
• A college with students whose SAT scores are
one point higher on average will have a predicted
graduation rate that is 0.059 percentage points higher.
• Increasing SAT scores by 200 points leads to a predicted
(200)(0.059) = 11.8 percentage point rise in graduation rates.
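The arithmetic above can be checked with a short sketch. It also shows why the constant drops out when comparing two colleges: the predicted difference depends only on the coefficient and the size of the change in the IV. The function names are assumptions of this sketch, and it uses the unrounded coefficient 0.0588 from the SPSS table, so the result is 11.76 rather than the rounded 11.8.

```python
A_HAT = 4.236    # constant from the Coefficients table
B_HAT = 0.0588   # coefficient for Average SAT Score

def predict(avg_sat):
    """Predicted graduation rate for a given average SAT score."""
    return A_HAT + B_HAT * avg_sat

def predicted_change(delta_sat):
    """Predicted change in graduation rate for a change in average SAT score."""
    return B_HAT * delta_sat

change = predicted_change(200)
print(f"{change:.2f}")                           # 11.76
print(f"{predict(1200) - predict(1000):.2f}")    # 11.76, same gap from any starting point
print(f"{predict(900) - predict(700):.2f}")      # 11.76
```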

Interpreting SPSS regression
output: Fit of the Regression
Model Summary

Model   R (a)   R Square   Adjusted R Square   Std. Error of the Estimate
1       .588    .345       .341                12.45%

a. Predictors: (Constant), Average SAT Score
The R Square measures how closely a regression line
fits the data in a scatterplot.
• It can range from zero (no explanatory power) to one
(perfect prediction).
• An R Square of 0.345 means that differences in SAT
scores can explain 35% of the variation in college
graduation rates. Key sentence for your homework!
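For readers who want to see where an R Square comes from, here is a minimal sketch that computes it as one minus the ratio of unexplained variation to total variation. The five data points are hypothetical (made up for illustration, not the course data), and `r_square` is a name chosen for this sketch.

```python
import numpy as np

def r_square(x, y):
    """Share of the variation in y explained by a straight-line fit on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b_hat, a_hat = np.polyfit(x, y, 1)           # polyfit returns slope first, then intercept
    predicted = a_hat + b_hat * x
    ss_residual = np.sum((y - predicted) ** 2)   # variation the line does not explain
    ss_total = np.sum((y - y.mean()) ** 2)       # total variation in y
    return float(1.0 - ss_residual / ss_total)

# Hypothetical sample of five colleges (made up for illustration).
sat_scores = [900, 1000, 1100, 1200, 1300]
grad_rates = [55, 62, 66, 75, 82]

r2 = r_square(sat_scores, grad_rates)
print(round(r2, 3))  # 0.989
```

With a single independent variable, R Square is simply R squared: in the Model Summary table, 0.588 * 0.588 is about 0.345.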
R Square Examples
Statistical Significance

What would the null hypothesis look like
in a scatterplot?
• If the independent variable has no effect on
the dependent variable, the scatterplot
should look random, the regression line
should be flat, and its slope should be zero.
• Null hypothesis: The regression coefficient
(b) for an independent variable equals zero.
• Can we reject the null hypothesis b = 0 based on our
estimate, b-hat?

Statistical Significance

Our formal test of statistical significance
asks whether we can be sure that a
regression coefficient for the population
differs from zero.
• Just like in a difference in means/proportions
test, the “standard error” is the standard
deviation of the sampling distribution.
• If a coefficient is more than two standard
errors away from zero, we can reject the null
hypothesis (that it equals zero).

Statistical Significance

So, if a coefficient is more than twice the
size of its standard error, we reject the null
hypothesis with 95% confidence.
• This works whether the coefficient is negative
or positive.
• The coefficient/standard error ratio is called
the “test statistic” or “t-stat.”
• A t-stat bigger than 2 or less than -2 indicates
a statistically significant correlation.

Interpreting SPSS regression output:
T-Stats
Coefficients (a)

Model 1             B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.
(Constant)          4.236                7.048                              .601    .549
Average SAT Score   5.88E-02             .007         .588                  8.778   .000

a. Dependent Variable: Graduation Rate
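The t column can be reproduced by hand from the B and Std. Error columns. The sketch below applies the "more than two standard errors from zero" rule to both rows of the table. The function names are assumptions of this sketch. For SAT scores, the hand calculation gives 8.4 rather than the reported 8.778, presumably because SPSS divides the unrounded values while the table displays the standard error rounded to .007.

```python
def t_stat(coefficient, std_error):
    """Test statistic: how many standard errors the coefficient is from zero."""
    return coefficient / std_error

def is_significant(coefficient, std_error, threshold=2.0):
    """Rule of thumb from the slides: reject the null if |t| is greater than 2."""
    return abs(t_stat(coefficient, std_error)) > threshold

# Values copied from the Coefficients table.
constant_t = t_stat(4.236, 7.048)
sat_t = t_stat(0.0588, 0.007)

print(round(constant_t, 3), is_significant(4.236, 7.048))   # 0.601 False
print(round(sat_t, 1), is_significant(0.0588, 0.007))       # 8.4 True
```

Taking the absolute value is what makes the rule work for negative coefficients as well as positive ones.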
Multivariate Regressions

A “multivariate regression” uses more than
one independent variable (or confound) to
explain variation in a dependent variable.
• The coefficient for each independent variable
reports its effect on the DV, holding constant all
of the other IVs in the regression.
• Thought experiment: Comparing two colleges
founded in the same year with the same
student/faculty ratio, what is the effect of SATs?

Multivariate Regressions
[Diagram: four independent variables (Year of Founding, SAT Scores, Tuition, Student/Faculty Ratio) and the dependent variable, Graduation Rates]
Multivariate Regressions

Again, we want to estimate the coefficients:
Est. Grad. Rate =
a + b1*SAT Score + b2*Year Founded + b3*Tuition + b4*Student/Faculty Ratio
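Outside SPSS, the same kind of equation can be estimated with an ordinary least-squares solver. The sketch below uses NumPy's `np.linalg.lstsq` on eight hypothetical colleges. Everything here is made up for illustration: the data, and the "true" coefficients used to generate noise-free graduation rates, so that the fit recovers them exactly and the mechanics are easy to follow.

```python
import numpy as np

# Hypothetical data for eight colleges (made up for illustration).
sat = np.array([950, 1010, 1100, 1180, 1250, 1320, 1050, 1400], dtype=float)
year = np.array([1850, 1905, 1880, 1790, 1965, 1820, 1930, 1701], dtype=float)
tuition = np.array([4000, 5200, 3800, 9000, 6100, 12000, 4500, 20000], dtype=float)
ratio = np.array([20, 18, 17, 12, 15, 10, 19, 7], dtype=float)

# Made-up coefficients (a, b1, b2, b3, b4) used only to generate the outcomes.
true_coefs = np.array([60.0, 0.04, -0.02, 0.001, -0.2])

# One column per term in the equation; the column of ones stands for the constant a.
X = np.column_stack([np.ones_like(sat), sat, year, tuition, ratio])
grad_rate = X @ true_coefs

# Least-squares estimates of a, b1, b2, b3, b4.
estimates, *_ = np.linalg.lstsq(X, grad_rate, rcond=None)

for name, value in zip(["a", "b1 (SAT)", "b2 (Year)", "b3 (Tuition)", "b4 (Ratio)"], estimates):
    print(f"{name}: {value:.4f}")
```

Real data contain noise, so real estimates would not match any underlying coefficients exactly and would come with standard errors, which this sketch does not compute.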
Multivariate Regressions
Coefficients (a)

Model 1                   B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.
(Constant)                59.187               47.203                             1.254   .212
Year school was founded   -2.1E-02             .023         -.072                 -.917   .361
Average SAT Score         4.2E-02              .010         .410                  4.224   .000
In-state Tuition          8.4E-04              .000         .208                  2.109   .037
Student/faculty ratio     -.206                .329         -.054                 -.626   .533

a. Dependent Variable: Graduation Rate
Model Summary

Model   R (a)   R Square   Adjusted R Square   Std. Error of the Estimate
1       .630    .397       .377                12.11%

a. Predictors: (Constant), Student/faculty ratio, Year school
was founded, Average SAT Score, In-state Tuition
Multivariate Regressions

Holding all other factors constant, a 200 point
increase in SAT scores leads to a predicted
(200)(0.042) = 8.4 percentage point increase in the graduation
rate, and this effect is statistically significant.

Controlling for other factors, a college that is
100 years younger should have a graduation
rate that changes by (100)(-0.021) = -2.1, that is,
2.1 percentage points lower, but this
effect is not significantly different from zero.
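Both interpretations follow the same two steps: multiply the coefficient by the size of the change, then check the t-stat against the threshold of 2. A minimal sketch using the B and t values reported in the multivariate Coefficients table (the dictionary layout and function names are assumptions of this sketch):

```python
# B and t values copied from the multivariate Coefficients table.
coefficients = {
    "Average SAT Score": {"b": 0.042, "t": 4.224},
    "Year school was founded": {"b": -0.021, "t": -0.917},
}

def predicted_change(variable, delta):
    """Predicted change in graduation rate, holding the other IVs constant."""
    return coefficients[variable]["b"] * delta

def is_significant(variable, threshold=2.0):
    """Rule of thumb: significant if the t-stat is above 2 or below -2."""
    return abs(coefficients[variable]["t"]) > threshold

sat_change = predicted_change("Average SAT Score", 200)
year_change = predicted_change("Year school was founded", 100)

print(round(sat_change, 1), is_significant("Average SAT Score"))          # 8.4 True
print(round(year_change, 1), is_significant("Year school was founded"))   # -2.1 False
```

A college that is 100 years younger has a founding year that is 100 higher, which is why the change in the IV is +100 and the predicted change in the graduation rate is negative.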