
Industrial/Organizational Psychology.
Assignment 5, 15 Feb 2000
Jeffrey Absher xxx-xx-xxxx
Define/discuss test-retest reliability, equivalent-form reliability, internal-consistency reliability, inter-rater reliability, criterion-related validity, content validity, and construct validity.
Test-retest reliability is the degree to which scores from the same test, given to the same group on two separate occasions, correlate with one another. In a trivial case, if one weighs two cubic centimeters of water today, stores it, weighs it tomorrow, and gets the same value as today, then the test (weighing the water) has test-retest reliability. The coefficient of stability is the correlation coefficient between the two temporally separated tests.
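The computation itself is simple: the coefficient of stability is just the Pearson correlation between the two administrations. A minimal sketch in Python, using made-up scores (all data and variable names here are hypothetical):

    def pearson_r(x, y):
        # Pearson correlation coefficient between two lists of scores.
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        var_x = sum((a - mean_x) ** 2 for a in x)
        var_y = sum((b - mean_y) ** 2 for b in y)
        return cov / (var_x * var_y) ** 0.5

    # Hypothetical scores: the same group, the same test, two weeks apart.
    scores_time1 = [78, 85, 62, 90, 71]
    scores_time2 = [80, 83, 65, 88, 70]
    print(pearson_r(scores_time1, scores_time2))  # coefficient of stability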
Equivalent-form reliability is the degree to which multiple tests (different forms of the same test) designed to measure the same attribute correlate over the same set of test subjects. For example, if one used calipers to measure a person's body-fat percentage and then used water displacement to measure the same person's body fat, the two measurements should agree. The coefficient of equivalence is the correlation coefficient between the results of the tests.
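The coefficient of equivalence is computed the same way as the coefficient of stability, only across forms rather than across time. A sketch with hypothetical scores for two forms, using NumPy's corrcoef for the correlation:

    import numpy as np

    # Hypothetical scores for the same subjects on two forms of one test.
    form_a = np.array([78, 85, 62, 90, 71])
    form_b = np.array([76, 88, 60, 91, 69])

    # corrcoef returns a 2x2 matrix; the off-diagonal entry is the
    # coefficient of equivalence between the two forms.
    print(np.corrcoef(form_a, form_b)[0, 1])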
Internal-Consistency reliability is the degree to which the items within a single test all measure the same attribute, so that the items correlate well with one another. It can be estimated by correlating one half of the test with the other half (split-half reliability) or, across all items at once, with a statistic such as Cronbach's alpha.
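A minimal sketch of Cronbach's alpha, assuming a small matrix of hypothetical item responses (subjects as rows, items as columns):

    import numpy as np

    def cronbach_alpha(scores):
        # scores: (subjects x items) matrix of item-level responses.
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]                         # number of items
        item_vars = scores.var(axis=0, ddof=1)      # per-item variances
        total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # Hypothetical data: 5 subjects answering 4 items on the same attribute.
    responses = [[4, 5, 4, 4],
                 [2, 2, 3, 2],
                 [5, 5, 5, 4],
                 [3, 3, 2, 3],
                 [4, 4, 4, 5]]
    print(cronbach_alpha(responses))  # near 1.0 for highly consistent items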
Inter-Rater reliability applies when human judgment plays a significant role in the values obtained for results and multiple “raters” do the judging. The correlation of values between raters should be high. As an example of inter-rater reliability, all judges should give Mary Lou Retton a 10/10 score if her performance is perfect; there should be little inter-judge variation in her score.
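When the ratings are categorical rather than numeric scores, a common agreement statistic is Cohen's kappa, which corrects raw agreement for chance; kappa is an aside here, not part of the definition above, and the pass/fail judgments below are hypothetical:

    from collections import Counter

    def cohens_kappa(rater_a, rater_b):
        # Agreement between two raters, corrected for chance agreement.
        n = len(rater_a)
        observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        freq_a, freq_b = Counter(rater_a), Counter(rater_b)
        expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
        return (observed - expected) / (1 - expected)

    # Hypothetical pass/fail ratings of ten performances by two judges.
    judge_1 = ["pass", "pass", "fail", "pass", "fail",
               "pass", "pass", "fail", "pass", "pass"]
    judge_2 = ["pass", "fail", "fail", "pass", "fail",
               "pass", "pass", "pass", "pass", "pass"]
    print(cohens_kappa(judge_1, judge_2))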
Criterion-Related validity – There are two types of criterion-related validity. Concurrent criterion-related validity exists when a “predictor” is measured at the same time the subjects are exercising the construct that we wish to measure. For example, I could measure a drafter's ability to draw a straight line freehand while he is engaged in drafting a print, and then use that measurement as a predictor of how many drawings the drafter can produce per week. If the quality of the straight line correlates strongly with the number of drawings produced per week, then the quality of the straight line is a good concurrent predictor. Predictive criterion-related validity exists when a predictor test is given before the subjects exercise the construct that we are trying to predict. A classic example of this is SAT scores used as a predictor of college success. With both types of criterion-related validity, we desire a high correlation between predictor scores and actual performance.
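Numerically, the validity coefficient is the predictor-criterion correlation, and a least-squares line turns the predictor into a forecast. A sketch with invented SAT/GPA pairs (all numbers are hypothetical):

    import numpy as np

    # Hypothetical predictor (SAT score) and criterion (later college GPA).
    sat = np.array([1050, 1200, 980, 1340, 1100, 1260])
    gpa = np.array([2.8, 3.2, 2.5, 3.7, 3.0, 3.4])

    # The validity coefficient is the predictor-criterion correlation.
    print(np.corrcoef(sat, gpa)[0, 1])

    # A least-squares fit then yields a prediction for a new applicant.
    slope, intercept = np.polyfit(sat, gpa, 1)
    print(slope * 1150 + intercept)  # predicted GPA for an SAT of 1150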
Content Validity is how well a predictor covers the scope of the behavior or the knowledge area that it is supposed to test. A test on computer internetworking support that contained only questions on TCP/IP addressing, but failed to ask about Ethernet troubleshooting or Netscape support, would have low content validity. The content validity of a test is usually assessed by Subject Matter Experts in the field. In an employment-testing situation, content validity can be rephrased as “How well does the test measure the knowledge and skills that are actually used by those already performing the job?”
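One way SME judgments are aggregated is Lawshe's content validity ratio, CVR = (n_e − N/2) / (N/2), where n_e experts out of N rate an item “essential”; this ratio is not part of the definition above, so the following is a side sketch with a hypothetical panel of ten experts:

    def content_validity_ratio(n_essential, n_experts):
        # Lawshe's CVR: ranges from -1 (none say essential) to +1 (all do).
        return (n_essential - n_experts / 2) / (n_experts / 2)

    # Hypothetical votes from a panel of 10 SMEs on each test topic.
    essential_votes = {"TCP/IP addressing": 9,
                       "Ethernet troubleshooting": 8,
                       "Netscape support": 7,
                       "COBOL syntax": 2}  # an off-topic item rates poorly
    for topic, votes in essential_votes.items():
        print(topic, content_validity_ratio(votes, 10))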
Construct Validity is how well a test measures the underlying constructs that it is designed to measure. An example construct is morality. A department store may be interested in the morality of its future employees in order to prevent employee theft. A morality test with construct validity would provide approximately the same scores as established morality tests (convergent validity coefficients) and should not be biased by aspects of the subjects such as gender, race, or nationality (divergent validity coefficients).
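In practice the convergent and divergent coefficients are both just correlations, interpreted against opposite expectations. A sketch for the department-store scenario, with all scores and the irrelevant attribute invented:

    import numpy as np

    # Hypothetical scores on the new morality test and an established one.
    new_test = np.array([72, 55, 88, 61, 79, 66])
    established_test = np.array([70, 58, 85, 60, 81, 64])
    shoe_size = np.array([9, 9, 10, 11, 8, 10])  # attribute the test should ignore

    # Convergent coefficient: should be high.
    print(np.corrcoef(new_test, established_test)[0, 1])
    # Divergent coefficient: should be near zero.
    print(np.corrcoef(new_test, shoe_size)[0, 1])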