Chapter 3 Investigating the Relationship of Scores

©2013, The McGraw-Hill Companies, Inc. All Rights Reserved
Chapter Objectives
After completing this chapter, you should be able to
1. Define correlation and linear correlation, interpret the
correlation coefficient, and use the rank-difference and
product-moment methods to determine the relationship
between two variables.
2. Construct a scattergram and interpret it.
Linear Correlation
• Correlation - a statistical technique used to express the
relationship between two sets of scores (two variables)
• Linear correlation – the degree to which a straight line
best describes the relationship between two variables
Examples: longevity and exercise, smoking and cancer,
intramural participation and grades, number of miles run
per week and time on 5K.
• Correlation coefficient - number that represents the
correlation
Correlation Coefficient
1. The values of the coefficient always range from
+1.00 to -1.00. Coefficients of exactly +1.00, -1.00,
or 0.00 are rarely found.
2. A positive coefficient indicates a direct relationship.
3. A negative coefficient indicates an inverse relationship.
4. A correlation coefficient near .00 indicates no
relationship.
5. The number indicates the degree of relationship and the
sign indicates the type of relationship. The number
+.88 indicates the same degree of relationship as the
number -.88. The signs indicate that the directions of
the relationships are different.
6. A correlation coefficient indicates relationship. After
determining a correlation coefficient, you cannot infer
that one variable causes something to happen to the
other variable.
Scattergram
Graph used to illustrate the relationship between two
variables.
See Figure 3.1.
A scattergram can indicate a positive relationship, a negative
relationship, or a zero relationship.
Positive relationship - points will tend to cluster along a
diagonal line that runs from the lower left-hand corner
of scattergram to the upper right-hand corner.
Negative relationship - points will tend to cluster along a
diagonal line that runs from the upper left-hand corner to the
lower right-hand corner.
The closer the points cluster along the diagonal line, the stronger
the relationship.
Zero relationship - points are scattered throughout the
scattergram.
See Figure 3.2.
Spearman Rank-Difference
Correlation Coefficient
• Also called rank-order.
• Used when one or both variables are rank or ordinal
scales.
• Difference (D) between ranks of two sets of scores is used
to determine the correlation coefficient.
Examples - golf driving distance and order of finish in a golf
tournament; height and IQ score; weight and
order of finish in a 400-meter race; number of
calories consumed and weight lost
Symbol: Greek rho (ρ) or r_rho
To determine ρ:
1. List each set of scores in a column.
2. Rank the two sets of scores.
3. Place the appropriate rank beside each score.
4. Head a column D and determine the difference in rank for
each pair of scores. (The sum of the D column should always be 0.)
5. Square each number in the D column and sum the
values (ΣD²).
6. Calculate the correlation coefficient by substituting the
values into the formula
$$\rho = 1.00 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
where N is the number of paired scores.
Table 3.1 illustrates use of the rank-difference correlation
coefficient for sit-up and push-up scores.
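The six steps above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked Table 3.1: the sit-up and push-up scores are made up, the function names are hypothetical, and two assumptions are built in (the highest score receives rank 1, and tied scores share the average of the ranks they occupy).

```python
def average_ranks(scores):
    """Rank scores so the highest gets rank 1 (assumption).

    Tied scores share the mean of the rank positions they occupy (assumption).
    """
    ordered = sorted(scores, reverse=True)
    rank_of = {}
    for value in set(ordered):
        first_position = ordered.index(value) + 1   # 1-based position of first tie
        tie_count = ordered.count(value)
        rank_of[value] = first_position + (tie_count - 1) / 2
    return [rank_of[score] for score in scores]


def rank_difference_rho(x_scores, y_scores):
    """Rank-difference coefficient: rho = 1 - 6 * sum(D^2) / (N * (N^2 - 1))."""
    if len(x_scores) != len(y_scores):
        raise ValueError("scores must be paired")
    n = len(x_scores)
    x_ranks = average_ranks(x_scores)
    y_ranks = average_ranks(y_scores)
    # Step 4: difference in rank for each pair (this column sums to 0).
    differences = [rx - ry for rx, ry in zip(x_ranks, y_ranks)]
    # Step 5: square each difference and sum.
    sum_d_squared = sum(d * d for d in differences)
    # Step 6: substitute into the formula.
    return 1.0 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical scores for five students (not from Table 3.1).
sit_ups = [50, 45, 40, 35, 30]
push_ups = [30, 32, 25, 20, 22]
print(f"rho = {rank_difference_rho(sit_ups, push_ups):.2f}")  # prints "rho = 0.80"
```

For these made-up scores the rank differences are -1, 1, 0, -1, 1, so ΣD² = 4 and ρ = 1 - 24/120 = .80.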
Pearson Product-Moment
Correlation Coefficient
• Also called Pearson r.
• Used when measurement results are reported in interval
or ratio scale scores.
• Has many variations.
• Symbol is r.
Examples - study time and test grade; leg strength and
standing long jump; running long jump and time for 100
meters
Calculation procedure (see Table 3.2)
1. Label columns for name, X, X², Y, Y², and XY.
2. Designate one set of scores as X, designate the other set
as Y, and place the appropriate paired scores by the
individual’s name.
3. Find the sums of the X and Y columns (ΣX and ΣY).
4. Square each X score, place the squared scores in the X²
column, and find the sum of the column (ΣX²).
5. Square each Y score, place the squared scores in the Y²
column, and find the sum of the column (ΣY²).
6. Multiply each X score by its paired Y score, place the
product in the XY column, and find the sum of the
column (ΣXY).
7. Substitute the values into the formula
$$r = \frac{N(\sum XY) - (\sum X)(\sum Y)}{\sqrt{N(\sum X^2) - (\sum X)^2}\,\sqrt{N(\sum Y^2) - (\sum Y)^2}}$$
Interpretation of the Correlation Coefficient
*The purpose for which the correlation is computed must be
considered.
The following ranges can be used as general guidelines for
interpretation of the correlation coefficient.
r = below .20 (extremely low relationship)
r = .20 to .39 (low relationship)
r = .40 to .59 (moderate relationship)
r = .60 to .79 (high relationship)
r = .80 to 1.00 (very high relationship)
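As a small illustration, the guideline ranges can be written as a lookup. The function name is hypothetical, and two assumptions are made: the ranges are applied to the size of r regardless of sign (since the sign indicates only the type of relationship), and the cut points are taken as .20, .40, .60, and .80 so that values such as .395 are not left unclassified.

```python
def describe_relationship(r):
    """Return the guideline label for a correlation coefficient.

    Assumption: the label depends on |r|, with cut points at .20, .40, .60, .80.
    """
    size = abs(r)
    if size > 1.0:
        raise ValueError("a correlation coefficient must lie between -1.00 and +1.00")
    if size < 0.20:
        return "extremely low"
    if size < 0.40:
        return "low"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "high"
    return "very high"


for r in (0.88, -0.88, 0.45, 0.10):
    print(f"r = {r:+.2f}: {describe_relationship(r)} relationship")
```

These labels remain general guidelines; the purpose for which the correlation is computed still determines whether a given coefficient is acceptable.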
Significance of the Correlation Coefficient
*Statistical significance, or reliability, of the correlation
coefficient should be considered.
*In determining statistical significance, you are answering
the question: If the study were repeated, what is the
probability of obtaining a similar relationship?
*When r is calculated, the number of paired scores is
important. With a small number of paired scores, it is possible
that a high r can occur by chance.
*With a small number of paired scores, r must be large to be
significant.
*With a large number of paired scores, a small r may be
significant.
*A table of values is used to determine the statistical
significance of a correlation coefficient.
*You must determine the degrees of freedom and the level of
significance.
*Degrees of freedom (df) equal N - 2, where N is the number of
paired scores; the .05 and .01 levels of significance are used.
*If the correlation coefficient is significant at the .05 level, a
coefficient that large will occur only 5 in 100 times by chance.
*If it is significant at the .01 level, a coefficient that large will
occur only 1 in 100 times by chance.
*Use Appendix A to compare the obtained r with the table values.
*If the obtained r is larger than the table values found at both the .05
and .01 levels, r is significant at the .01 level.
*If the obtained r falls between these two table values, r is
significant at the .05 level.
*Significant correlation coefficients lower than .50 can
be useful for indicating nonchance relationships among
variables, but they probably are not large enough to be
useful in predicting individual scores.
*Example from Table 3.2: r = .90; df = 15 - 2 = 13.
*Note the difference in table values as the number of paired
scores increases and between the .05 and .01 levels.
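The table-lookup rule can be sketched as a simple comparison. The function name is hypothetical, and the critical values are passed in by the caller because they must come from a table. The values used in the example (.514 and .641 for df = 13) are an assumption taken from a commonly published two-tailed table of critical values of r; they are not copied from Appendix A and should be checked against it.

```python
def significance_level(obtained_r, n_pairs, critical_05, critical_01):
    """Compare an obtained r with table values at the .05 and .01 levels.

    Returns (df, level), where level is ".01", ".05", or None (not significant).
    The critical values are supplied by the caller from a table for this df.
    """
    df = n_pairs - 2                      # degrees of freedom = N - 2
    size = abs(obtained_r)                # the sign does not affect significance
    if size > critical_01:
        level = ".01"
    elif size > critical_05:
        level = ".05"
    else:
        level = None
    return df, level


# Table 3.2 example: r = .90 with 15 paired scores.
# ASSUMED table values for df = 13 (verify against Appendix A).
df, level = significance_level(0.90, 15, critical_05=0.514, critical_01=0.641)
print(f"df = {df}, significant at the {level} level")
```

Under those assumed table values, r = .90 exceeds both critical values and is therefore significant at the .01 level.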
Coefficient of Determination
*Square of the correlation coefficient (r²).
*Represents the common variance between two variables
(the proportion of variance in one variable that can be
accounted for by the other variable).
*Example: Correlation of .85 between long jump test and
leg strength test; r² = .72; 72% of the variability in the
standing long jump scores is associated with leg strength;
72% of the variance in the two variables comes from common factors.
*Use of the coefficient of determination shows that a high
correlation coefficient is needed to indicate a substantial
to high correlation between two variables.
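A short calculation shows why a high r is needed before much variance is shared. The sketch below squares several coefficients, including the .85 from the long jump example; the function name is hypothetical and the other r values are chosen only for comparison.

```python
def coefficient_of_determination(r):
    """Proportion of common variance: the square of the correlation coefficient."""
    return r ** 2


for r in (0.50, 0.70, 0.85, 0.90):
    shared = coefficient_of_determination(r)
    print(f"r = {r:.2f}  r^2 = {shared:.2f}  ({shared:.0%} common variance)")
```

A correlation of .50 accounts for only 25% of the variance, while .85 accounts for about 72%.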
Negative Correlation Coefficients
*There are occasions when a negative correlation coefficient is
expected.
*Negative correlation: A small score that is considered to
be a better score is correlated with a large score that
also is considered to be a better score.
Examples: Relationship between time to run 5K and
maximum O2 consumption; weight and pull-ups
Correlation, Regression, and Prediction
Linear correlation – how close the relationship between two
variables is to a straight line
If a relationship is found, a score for one variable can be used to
predict the score for the other variable – linear regression
analysis
Standard error of estimate - numerical value that indicates
the amount of error to be expected in a predicted score;
provides confidence limits; used in the same way as the standard
deviation is used with a group of scores
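A minimal sketch of linear regression and the standard error of estimate follows. The slides do not give the formulas, so the ones used here are assumptions: the least-squares line Y' = bX + a, and the standard error of estimate computed as s_Y * sqrt(1 - r²) with s_Y taken as the sample standard deviation (N - 1 in the denominator). The function name and the study-time scores are hypothetical.

```python
import math


def fit_line(x_scores, y_scores):
    """Return (slope, intercept, standard_error_of_estimate) for predicting Y from X.

    Assumed formulas: least-squares line and SEE = s_Y * sqrt(1 - r^2).
    """
    n = len(x_scores)
    mean_x = sum(x_scores) / n
    mean_y = sum(y_scores) / n
    ss_x = sum((x - mean_x) ** 2 for x in x_scores)
    ss_y = sum((y - mean_y) ** 2 for y in y_scores)
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_scores, y_scores))
    slope = sp_xy / ss_x
    intercept = mean_y - slope * mean_x
    r = sp_xy / math.sqrt(ss_x * ss_y)
    s_y = math.sqrt(ss_y / (n - 1))                   # sample standard deviation of Y
    see = s_y * math.sqrt(max(0.0, 1.0 - r ** 2))     # guard against rounding below 0
    return slope, intercept, see


# Hypothetical paired scores: hours of study (X) and test grade (Y).
study_hours = [2, 4, 6, 8, 10]
test_grades = [65, 70, 80, 85, 95]
slope, intercept, see = fit_line(study_hours, test_grades)
predicted = slope * 7 + intercept                     # predicted grade for 7 hours
print(f"predicted grade = {predicted:.2f}, standard error of estimate = {see:.2f}")
```

Used like a standard deviation, the standard error of estimate gives confidence limits around a prediction: roughly 68% of actual scores would be expected to fall within one standard error of the predicted score, assuming the errors are normally distributed.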
Through multiple correlation-regression analysis, we can
predict a score using several other scores.
May predict college freshman year grade point average
with SAT or ACT score, high school grade point average,
and class rank.
May predict health problems through lifestyle.