Multiple Regression

Correlations

Dependent Variable (DV): batting average
Independent Variables (IVs): runs scored, doubles, triples, homeruns, strike outs
Pearson Correlation, batting average with doubles: this value shows the relationship between batting average (DV) and doubles (IV), r = .609. Squaring it gives r2 = .371, meaning that 37.1% of the variance in batting average is accounted for by the doubles variable. This correlation is significant. In fact, every IV here is significantly correlated with batting average.
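As a quick check of the arithmetic above, the share of variance in batting average accounted for by a single IV is the square of its correlation with the DV. The sketch below squares each r reported in the Correlations table; the dictionary names and layout are illustrative and are not part of the SPSS output.

```python
# Correlations of each IV with batting average (DV), copied from the Correlations table.
r_with_dv = {
    "runs scored": 0.825,
    "doubles": 0.609,
    "triples": 0.662,
    "homeruns": 0.336,
    "strike outs": -0.621,
}

# r squared is the proportion of DV variance shared with that single IV.
r_squared = {iv: r ** 2 for iv, r in r_with_dv.items()}

for iv, r2 in r_squared.items():
    print(f"{iv:12s} r = {r_with_dv[iv]:+.3f}  r^2 = {r2:.3f}  ({r2:.1%} of variance)")
```

These single-predictor percentages overlap because the IVs are correlated with one another, so they cannot simply be added up to obtain the R-Squared of the full model.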
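Pearson’s Correlation (r) evaluates the strength and direction of the linear relationship between two variables. It ranges from -1 to +1. A perfect positive relationship is 1, a perfect negative relationship is -1, and 0 means no linear relationship.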
                                      batting   runs                          home      strike
                                      average   scored    doubles   triples   runs      outs
batting average  Pearson Correlation  1         .825**    .609**    .662**    .336*     -.621**
                 Sig. (2-tailed)                .000      .000      .000      .024      .000
                 N                    45        45        45        45        45        45
runs scored      Pearson Correlation  .825**    1         .517**    .599**    .496**    -.366*
                 Sig. (2-tailed)      .000                .000      .000      .001      .013
                 N                    45        45        45        45        45        45
doubles          Pearson Correlation  .609**    .517**    1         .487**    .299*     -.157
                 Sig. (2-tailed)      .000      .000                .001      .046      .304
                 N                    45        45        45        45        45        45
triples          Pearson Correlation  .662**    .599**    .487**    1         -.037     -.487**
                 Sig. (2-tailed)      .000      .000      .001                .808      .001
                 N                    45        45        45        45        45        45
homeruns         Pearson Correlation  .336*     .496**    .299*     -.037     1         .197
                 Sig. (2-tailed)      .024      .001      .046      .808                .194
                 N                    45        45        45        45        45        45
strike outs      Pearson Correlation  -.621**   -.366*    -.157     -.487**   .197      1
                 Sig. (2-tailed)      .000      .013      .304      .001      .194
                 N                    45        45        45        45        45        45
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

N indicates the number of participants in the study. There were 45 participants.

The p value (Sig.) for each pair of variables is the probability of observing a correlation at least this strong if the null hypothesis of no relationship were true. It does not tell us whether the null is true. The null will be rejected if p is less than .05. The p values for the relationship between the DV and each IV are all below .05 (.000 for runs scored, doubles, triples, and strike outs; .024 for homeruns), meaning they are significant and the null is rejected. A printed value of .000 means p < .0005 after rounding, not exactly zero.
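The decision rule above can be sketched by converting each correlation to a t statistic with N - 2 degrees of freedom and comparing it with a critical value. This is a minimal illustration, not the SPSS procedure itself: the critical value 2.017 is an assumed table lookup for a two-tailed test at .05 with df = 43, and the function names are hypothetical.

```python
import math

N = 45             # participants, from the N rows of the Correlations table
T_CRIT_05 = 2.017  # assumed two-tailed .05 critical t for df = 43 (table lookup)


def t_for_r(r, n=N):
    """t statistic for testing the null of zero correlation, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def is_significant(r, n=N):
    """Reject the null at the .05 level when |t| exceeds the critical value."""
    return abs(t_for_r(r, n)) > T_CRIT_05


# Three correlations from the table: two flagged with *, one not flagged.
for label, r in [
    ("batting average ~ homeruns", 0.336),
    ("doubles ~ homeruns", 0.299),
    ("homeruns ~ strike outs", 0.197),
]:
    print(f"{label:28s} t = {t_for_r(r):6.3f}  significant: {is_significant(r)}")
```

The outcome agrees with the flags in the table: .336 and .299 are significant at the .05 level, while .197 (Sig. = .194) is not.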
Collinearity between IVs. When one or more predictors are perfectly correlated with other predictors, there is no unique mathematical solution in multiple regression/correlation (MRC). The greater the collinearity, the more unstable the partial regression coefficients. Collinearity is usually considered to be at an acceptable level if the correlation between IVs is less than .8 in absolute value.
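The .8 rule of thumb can be applied directly to the IV-by-IV correlations in the Correlations table. The sketch below is only an illustration of that screening step; the variable names are hypothetical and the threshold is the one stated in the text.

```python
# Correlations between pairs of IVs, copied from the Correlations table.
iv_correlations = {
    ("runs scored", "doubles"): 0.517,
    ("runs scored", "triples"): 0.599,
    ("runs scored", "homeruns"): 0.496,
    ("runs scored", "strike outs"): -0.366,
    ("doubles", "triples"): 0.487,
    ("doubles", "homeruns"): 0.299,
    ("doubles", "strike outs"): -0.157,
    ("triples", "homeruns"): -0.037,
    ("triples", "strike outs"): -0.487,
    ("homeruns", "strike outs"): 0.197,
}

THRESHOLD = 0.8  # rule of thumb stated in the text

# Pairs whose absolute correlation reaches the threshold would be a concern.
flagged = {pair: r for pair, r in iv_correlations.items() if abs(r) >= THRESHOLD}

# The most strongly related pair of predictors, for reference.
strongest_pair = max(iv_correlations, key=lambda pair: abs(iv_correlations[pair]))

print("flagged pairs:", flagged)
print("strongest pair:", strongest_pair, iv_correlations[strongest_pair])
```

No pair of IVs reaches .8 here. The .825 in the table is between runs scored and batting average, and batting average is the DV, so it is not a collinearity concern.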
These p values represent the level of significance of the correlations between pairs of IVs (strike outs and doubles, p = .304; homeruns and triples, p = .808). Because p is greater than .05, these correlations are not significant and the null hypothesis is not rejected.
Variables Entered/Removed (a)

Model  Variables Entered                                         Variables Removed  Method
1      strike outs, doubles, homeruns, triples, runs scored (b)  .                  Enter

a. Dependent Variable: batting average
b. All requested variables entered.

Model: the model being reported.
Variables Entered: all independent variables.
Variables Removed: variables that were removed from the current regression; empty unless stepwise regression is used.
Method: the method used to run the regression; “Enter” is also known as a direct regression.
Std. Error of the Estimate: also referred to as the root mean squared error. It is the standard deviation of the error term and the square root of the Mean Square for the Residuals in the ANOVA table. It indicates the typical size of the error in predicting or estimating a person’s score on the criterion using the multiple regression equation.
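The relationship between this value and the ANOVA table can be checked by hand. The sketch below starts from the rounded Residual Sum of Squares (.012) and df (39) printed in the ANOVA table, so it only approximates the reported .01748; the variable names are illustrative.

```python
import math

ss_residual = 0.012  # Residual Sum of Squares, rounded, from the ANOVA table
df_residual = 39     # Residual degrees of freedom

# Mean Square for the Residuals, then its square root (root mean squared error).
ms_residual = ss_residual / df_residual
std_error_of_estimate = math.sqrt(ms_residual)

print(f"Mean Square Residual: {ms_residual:.6f}")
print(f"Std. Error of the Estimate (approx.): {std_error_of_estimate:.5f}")
```

The small gap from .01748 comes from rounding in the printed sum of squares. It also shows why the Residual Mean Square prints as .000 at three decimals.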
Model Summary

Model  R         R Square  Adjusted R Square  Std. Error of the Estimate
1      .927 (a)  .860      .842               .01748

a. Predictors: (Constant), strike outs, doubles, homeruns, triples, runs scored

Model: the model being reported.

R: the degree of linear relationship between the criterion and the weighted combination of predictors as specified by the regression equation. It is the square root of R-Squared. The multiple correlation is .927.

R-Squared indicates the variance accounted for, or the proportionate reduction in error. This value indicates that 86% of the variance in a player’s batting average can be predicted from the variables runs scored, doubles, triples, homeruns, and strike outs. R-Squared is also referred to as the coefficient of determination.

Adjusted R-Squared (.842, read as 84.2%) is the value of R-Squared after an adjustment that penalizes the addition of extraneous predictors to the model.
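The Model Summary values are tied together by simple arithmetic. The sketch below recomputes R and Adjusted R-Squared from R Square, assuming the usual adjustment 1 - (1 - R2)(n - 1)/(n - k - 1) with n = 45 participants and k = 5 predictors; the variable names are illustrative.

```python
import math

r_squared = 0.860  # R Square from the Model Summary
n = 45             # participants
k = 5              # predictors in the model

# R is the square root of R Square.
multiple_r = math.sqrt(r_squared)

# Assumed standard adjustment that penalizes extra predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R = {multiple_r:.3f}")
print(f"Adjusted R Square = {adjusted_r_squared:.3f}")
```

Both results agree with the printed table to three decimals (.927 and .842).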
ANOVA (a)

Model          Sum of Squares  df  Mean Square  F       Sig.
1  Regression  .073            5   .015         47.965  .000 (b)
   Residual    .012            39  .000
   Total       .085            44

a. Dependent Variable: batting average
b. Predictors: (Constant), strike outs, doubles, homeruns, triples, runs scored

Model: the model being reported.

Source of variance: Regression, Residual, and Total. The Total variance is split into the variance that can be explained by the independent variables (Regression) and the variance that is not explained by the independent variables (Residual).

df: degrees of freedom associated with the sources of variance.

Mean Square: another term for variance. These are estimates of population variance.

F: the F ratio is the mean square for the effect (Regression) divided by the mean square for the error term (Residual).

Sig.: the p value is the probability of obtaining an F value at least this large, given that the null is true. The null hypothesis states that the IVs have no effect on the DV. If p is less than .05, then the null is rejected. This p value is reported as .000, which is less than .05, meaning that it is significant. Therefore, we can say that the variables runs scored, doubles, triples, homeruns, and strike outs, taken together, can be used to reliably predict batting average.
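The ANOVA columns can be reproduced from the sums of squares and degrees of freedom. The sketch below uses the rounded values printed in the table, so the recomputed F is only close to the reported 47.965, which SPSS derives from unrounded figures; the variable names are illustrative.

```python
# Rounded values copied from the ANOVA table.
ss_regression, df_regression = 0.073, 5
ss_residual, df_residual = 0.012, 39
ss_total = 0.085

# Mean Square = Sum of Squares / df for each source of variance.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F ratio = Mean Square (Regression) / Mean Square (Residual).
f_ratio = ms_regression / ms_residual

# R Square is the explained share of the total sum of squares.
r_squared = ss_regression / ss_total

print(f"Mean Square Regression = {ms_regression:.4f}")
print(f"Mean Square Residual   = {ms_residual:.6f}")
print(f"F (from rounded inputs) = {f_ratio:.2f}")
print(f"R Square (from rounded inputs) = {r_squared:.3f}")
```

The recomputed R Square also lands near the .860 in the Model Summary, which is a useful consistency check between the two tables.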
Model column: the predictor variables. The first entry is the constant (Y intercept). This is the predicted value of batting average when all predictor variables are 0.
Standardized Coefficients (Beta): these are the coefficients that you would obtain if you standardized all of the variables in the regression, including the DV and IVs, and ran the regression. By standardizing before running the regression, we have put all of the variables on the same scale, and we can compare the magnitude of the coefficients to see which predictor has more of an effect.
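Unstandardized Coefficients (B): these are the values for the regression equation for predicting the DV from the IVs. The constant is the intercept and is not multiplied by a predictor. The equation is:

Ypredicted = .183 + .447(x1) + .991(x2) + .622(x3) + .274(x4) - .285(x5)

where x1 = runs scored, x2 = doubles, x3 = triples, x4 = homeruns, and x5 = strike outs.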
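A prediction from these coefficients is the constant plus each B multiplied by the player's value on that predictor, treating .183 as the intercept, as the (Constant) row of the Coefficients table indicates. The sketch below is a minimal illustration: the function name is hypothetical, and the input values for the example player are made up, since the source does not state the units of the predictors.

```python
INTERCEPT = 0.183  # (Constant) row, column B

# Unstandardized coefficients (B) from the Coefficients table.
B = {
    "runs scored": 0.447,
    "doubles": 0.991,
    "triples": 0.622,
    "homeruns": 0.274,
    "strike outs": -0.285,
}


def predict_batting_average(player):
    """Intercept plus the sum of B times the player's value on each predictor."""
    return INTERCEPT + sum(B[name] * player[name] for name in B)


# Hypothetical player; these values are invented purely for illustration.
hypothetical_player = {
    "runs scored": 0.15,
    "doubles": 0.05,
    "triples": 0.01,
    "homeruns": 0.03,
    "strike outs": 0.20,
}

print(f"{predict_batting_average(hypothetical_player):.3f}")
```

With every predictor set to 0 the function returns the constant, which matches the interpretation of the Y intercept.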
Statistically significant because p < .05 (constant, runs scored, doubles, strike outs)
Coefficients (a)

                  Unstandardized Coefficients  Standardized Coefficients
Model             B        Std. Error          Beta                       t         Sig.
1  (Constant)     .183     .017                                           10.685    .000
   runs scored    .447     .110                .426                       4.074     .000
   doubles        .991     .313                .235                       3.165     .003
   triples        .622     .581                .098                       1.070     .291
   homeruns       .274     .169                .138                       1.616     .114
   strike outs    -.285    .052                -.408                      -5.497    .000

a. Dependent Variable: batting average
The t and p values give a rough indication of the impact of each predictor variable. A large absolute t value and a small p value suggest that a predictor variable is having a large impact on the criterion variable, with the other predictors held constant.
Statistically insignificant because p > .05 (triples, homeruns)
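Each t value in the Coefficients table is the unstandardized coefficient divided by its standard error. The sketch below recomputes t from the printed B and Std. Error; because those are rounded to three decimals, the results match the reported t values only approximately. The dictionary layout is illustrative.

```python
# (B, Std. Error, reported t) for each row of the Coefficients table.
coefficients = {
    "(Constant)": (0.183, 0.017, 10.685),
    "runs scored": (0.447, 0.110, 4.074),
    "doubles": (0.991, 0.313, 3.165),
    "triples": (0.622, 0.581, 1.070),
    "homeruns": (0.274, 0.169, 1.616),
    "strike outs": (-0.285, 0.052, -5.497),
}

# t = B / Std. Error
recomputed_t = {name: b / se for name, (b, se, _) in coefficients.items()}

for name, (b, se, reported) in coefficients.items():
    print(f"{name:12s} B/SE = {recomputed_t[name]:7.3f}   reported t = {reported:7.3f}")
```

The predictors with the largest absolute t values (strike outs, runs scored, doubles) are the ones with p below .05, while triples and homeruns have small t values and p above .05.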