Supplemental Material: Hypothesis Testing for Correlation

Ch 11: Correlations (pt. 2)
and Ch 12: Regression (pt.1)
Nov. 13, 2014
Hypothesis Testing for Correlation
• Same hypothesis testing process as before:
• 1) State research & null hypotheses:
– Null hypothesis states there is no relationship between
variables (correlation in pop = 0)
– Notation for population corr is rho (ρ)
– Null: ρ = 0 (no relationship betw gender & ach)
– Research hyp: ρ ≠ 0 (there is a signif
relationship betw gender & ach)
(cont.)
• The appropriate statistic for testing the
signif of a correlation (r) is a t statistic
• Formula changes slightly to calculate t for
a correlation: $t = \dfrac{r}{\sqrt{(1 - r^2)/(N - 2)}}$
• Need to know r and sample size (N)
• Find the critical value to use for your
comparison distribution – it will be a t value
from your t table, with N-2 df
• Use same decision rule as with t-tests:
– If (abs value of) t obtained > (abs value of) t critical
→ reject the null hypothesis and conclude the correlation
is significantly different from 0.
Example
• For sample of 35 employees, correlation
between job dissatisfaction & stress = .48
• Is that significantly greater than 0?
• Research hyp: job dissat & stress are significantly
positively correlated (ρ > 0)
• Null hyp: job dissat & stress are not correlated (ρ = 0)
• Note 1-tailed test, use alpha = .05
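The example above can be worked through numerically. The sketch below assumes the usual t formula for a correlation, t = r / sqrt((1 - r²)/(N - 2)), with N - 2 degrees of freedom. The critical value 1.692 is an assumption read from a t table (1-tailed, alpha = .05, df = 33); many printed tables skip df = 33, in which case the nearest smaller df is the conservative choice. The function name is illustrative only.

```python
import math

def t_for_correlation(r: float, n: int) -> float:
    """t statistic for testing the null hypothesis rho = 0 (df = n - 2)."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

r, n = 0.48, 35              # values given in the example
df = n - 2                   # degrees of freedom
t_obtained = t_for_correlation(r, n)

# Assumption: 1-tailed critical t for alpha = .05, df = 33, taken from a t table.
t_critical = 1.692

reject_null = t_obtained > t_critical
print(f"df = {df}, t = {t_obtained:.2f}, reject null: {reject_null}")
# prints: df = 33, t = 3.14, reject null: True
```

Under these assumptions the obtained t exceeds the critical t, so the null hypothesis would be rejected and the correlation judged significantly greater than 0.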
Regression
• Predictor and Criterion Variables
• Predictor variable (X) – variable used to
predict something (the criterion)
• Criterion variable (Y) – variable being
predicted (from the predictor!)
– Use GRE scores (predictor) to predict your
success in grad school (criterion)
Prediction Model
• Direct raw-score prediction model
– Predicted raw score (on criterion variable) =
regression constant plus the result of multiplying a
raw-score regression coefficient by the raw score on
the predictor variable
– Formula
$\hat{Y} = a + (b)(X)$
a = regression constant
b = regression coefficient (not standardized)
• The regression constant (a)
– Predicted raw score on criterion variable
when raw score on predictor variable is 0
(where regression line crosses y axis)
• Raw-score regression coefficient (b)
– How much the predicted criterion variable
increases for every increase of 1 on the
predictor variable (slope of the reg line)
Correlation Example: Info needed to
compute Pearson’s r correlation
x     y     (x-Mx)   (x-Mx)²   (y-My)   (y-My)²   (x-Mx)(y-My)
6     6      2.4      5.76       2        4         4.8
1     2     -2.6      6.76      -2        4         5.2
5     6      1.4      1.96       2        4         2.8
3     4     -0.6      0.36       0        0         0
3     2     -0.6      0.36      -2        4         1.2
Means: Mx = 3.6, My = 4.0
Column totals: Σ(x-Mx) = 0, SSx = 15.2, Σ(y-My) = 0, SSy = 16, SP = 14.0
Refer to the last total as SP (sum of products).
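As a check on the quantities in this example, the sketch below recomputes SSx, SSy, and SP from the raw scores and then combines them into Pearson's r. It assumes the standard formula r = SP / sqrt(SSx × SSy), which does not survive in the text here; the variable names are illustrative.

```python
import math

x = [6, 1, 5, 3, 3]
y = [6, 2, 6, 4, 2]

mx = sum(x) / len(x)   # mean of x
my = sum(y) / len(y)   # mean of y

# Sums of squared deviations and the sum of products of deviations
ss_x = sum((xi - mx) ** 2 for xi in x)
ss_y = sum((yi - my) ** 2 for yi in y)
sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

# Assumed formula: r = SP / sqrt(SSx * SSy)
r = sp / math.sqrt(ss_x * ss_y)
print(f"SSx = {ss_x:.1f}, SSy = {ss_y:.1f}, SP = {sp:.1f}, r = {r:.2f}")
# prints: SSx = 15.2, SSy = 16.0, SP = 14.0, r = 0.90
```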
Formulas for a and b
• First, start by finding the regression coefficient (b):
$b = \dfrac{SP}{SS_X}$
• Next, find the regression constant or intercept (a):
$a = M_Y - b(M_X)$
This is known as the “Least Squares Solution” or
‘least squares regression’
Computing regression line
(with raw scores)
X     Y
6     6
1     2
5     6
3     4
3     2
mean  3.6   4.0
SSX = 15.20, SSY = 16.0, SP = 14.0
slope: $b = \dfrac{SP}{SS_X} = \dfrac{14}{15.2} = 0.92$
intercept: $a = M_Y - b(M_X) = 4.0 - (0.92)(3.6) = 0.688$
Ŷ = .688 + .92(x)
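The slope and intercept can be reproduced in a few lines. This is a minimal sketch using the totals from the worked example (SP = 14.0, SSX = 15.2, means 3.6 and 4.0). Note that the intercept of .688 depends on rounding b to 0.92 first; carrying b unrounded gives a slightly different intercept.

```python
ss_x = 15.2          # sum of squared deviations on X
sp = 14.0            # sum of products
mx, my = 3.6, 4.0    # means of X and Y

b = sp / ss_x        # slope (regression coefficient)
a = my - b * mx      # intercept (regression constant)

# The worked example rounds b to 0.92 before computing a
b_rounded = round(b, 2)
a_rounded = my - b_rounded * mx

print(f"unrounded: a = {a:.3f}, b = {b:.3f}")
print(f"rounded b: a = {a_rounded:.3f}, b = {b_rounded:.2f}")
# prints:
# unrounded: a = 0.684, b = 0.921
# rounded b: a = 0.688, b = 0.92
```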
Interpreting ‘a’ and ‘b’
• Let’s say that x=# hrs studied and y=test
score (on 0-10 scale)
• Interpreting ‘a’:
– when x=0 (study 0 hrs), expect a test score of
.688
• Interpreting ‘b’
– for each extra hour you study, expect an
increase of .92 pts
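To make the interpretation concrete, the prediction equation Ŷ = .688 + .92(x) can be wrapped in a small function. The function name and the sample inputs are illustrative assumptions, not part of the original material.

```python
def predict_score(hours: float, a: float = 0.688, b: float = 0.92) -> float:
    """Predicted test score for a given number of hours studied."""
    return a + b * hours

# a: predicted score when hours studied is 0
print(f"{predict_score(0):.3f}")                      # prints: 0.688

# b: change in predicted score for one extra hour of study
print(f"{predict_score(4) - predict_score(3):.2f}")   # prints: 0.92
```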
Correlation in SPSS
• Analyze → Correlate → Bivariate
– Choose as many variables as you’d like in your
correlation matrix → OK
– Will get matrix with 3 rows of output for each combination
of variables
• Notice that the diagonal contains corr of variable with itself, we’re
not interested in this…
• 1st row reports the actual correlation
• 2nd row reports the significance value (compare to alpha – if <
alpha → reject the null and conclude the correlation differs
significantly from 0)
• 3rd row reports sample size used to calculate the correlation
Simple Regression in SPSS
– Analyze → Regression → Linear
– Note that terms used in SPSS are “Independent
Variable” (this is x or predictor) and “Dependent Variable”
(this is y or criterion)
– Class handout of output – what to look for:
• “Model Summary” section – shows R²
• ANOVA section – 1st line gives ‘sig value’, if < .05 → signif
– This tests the significance of the R² for the regression.
(If significant → x does predict y)
• Coefficients section – 1st line gives ‘constant’ = a (listed under ‘B’
column)
– Other line gives ‘unstandardized coefficient’ = b
– Can write the regression/prediction equation from this info…