Chapter 4: Reliability & Validity
Reliability:
“To what extent can we say that the data are consistent?”
- Reliability is about consistency.
- Different statistical techniques are available; each produces a reliability coefficient ranging from 0.00 (totally inconsistent) to +1.00 (totally consistent).
Different techniques to measure reliability

Test-Retest (coefficient of stability)
- Description: measures consistency across time.
- Remarks: the article needs to state the time lapse between testing and retesting. Stability usually drops as the interval grows, so a high coefficient is more impressive when the interval is longer.

Parallel-Forms (coefficient of equivalence)
- Description: measures consistency across forms.
- Remarks: uses two forms of the same instrument, both supposedly measuring the same thing.

Internal Consistency
1. Split-half
- Examines performance on the odd-numbered and even-numbered items separately and correlates the two sets of scores.
- The coefficient is expected to be high (a low value would be unusual).
2. Kuder-Richardson #20 (K-R 20)
- The order of the items does not matter (in effect, all possible split-half combinations are computed).
3. Cronbach’s alpha
- Same as K-R 20 if the items are dichotomous; more versatile because it also handles items with more than 2 possible values.

Interrater Reliability
1. Kendall’s coefficient of concordance: for ranked (ordinal) data
2. Cohen’s kappa: for nominal (categorical) data
3. Intraclass correlation (ICC): reliability of ratings (for raw scores)
4. Pearson’s product-moment correlation: for raw scores
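The internal-consistency methods listed above can be illustrated with a small sketch. It uses the standard textbook formulas for Cronbach's alpha and K-R 20, which these notes do not spell out, and a made-up set of right/wrong (1/0) responses; the function names and the data are assumptions for illustration only. It shows that alpha and K-R 20 coincide for dichotomous items, and computes an uncorrected odd/even split-half correlation.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half(scores):
    """Correlate totals on odd-numbered items with totals on even-numbered items."""
    odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, ...
    even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, ...
    return pearson_r(odd_totals, even_totals)


def cronbach_alpha(scores):
    """Cronbach's alpha; works for items with any number of score values."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """K-R 20; only meaningful for dichotomous (0/1) items."""
    n, k = len(scores), len(scores[0])
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n  # proportion scoring 1 on item i
        sum_pq += p * (1 - p)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_pq / total_var)


# Hypothetical data: 6 respondents (rows) by 4 dichotomous items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

r_half = split_half(responses)
alpha = cronbach_alpha(responses)
kr = kr20(responses)
print(f"split-half r = {r_half:.3f}, alpha = {alpha:.3f}, K-R 20 = {kr:.3f}")
```

The two coefficients agree here because the variance of a 0/1 item equals p(1 - p). Published split-half coefficients are often adjusted upward with the Spearman-Brown formula; that correction is deliberately left out of this sketch.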
Standard Error of Measurement (SEM):
Used to estimate the range within which a score would likely fall if a given measured object were to be remeasured.
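As a rough illustration of that range, the sketch below uses the classical test theory formula SEM = SD × sqrt(1 − reliability), which is not stated in these notes. The standard deviation, reliability coefficient, observed score, and function names are all hypothetical values chosen for the example.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical formula: SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)


def score_band(observed, sd, reliability, z=1.0):
    """Band of +/- z SEMs around an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical test: SD of 10 points, reliability coefficient of 0.84.
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)
low, high = score_band(observed=75, sd=10.0, reliability=0.84)
print(f"SEM is about {sem:.1f}; band is about {low:.1f} to {high:.1f}")
```

With these numbers the SEM is about 4 points, so a remeasured score for someone who scored 75 would likely land between roughly 71 and 79. A higher reliability coefficient shrinks the band.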
Some warnings about reliability
1. Different methods of assessing reliability consider the issue of consistency from different perspectives. E.g. a high coefficient of stability does not necessarily mean high internal consistency.
2. Reliability coefficients really apply to data, not to the measuring instruments. They are characteristics of the data rather than of the instruments that produce the data.
3. Place more faith in good results for large groups than for small groups.
4. If a test is administered under time pressure, the various estimates of internal consistency (split-half, K-R 20, alpha) will be spuriously high. So don’t be overly impressed.
5. Reliability is not the only criterion used to assess the quality of data.
Validity
- Concerned with accuracy, i.e. whether the measuring instrument measures what it purports to measure.
- Reliability is a necessary but not sufficient condition for validity: valid data are reliable, but not all reliable data are valid.
Three kinds of validity:
1. Content Validity
o Content experts evaluate the instrument; we should ask:
- Who did the evaluations?
- What did they check/do?
- What was the outcome?
2. Criterion-Related Validity
o Compare scores on the instrument with scores on a relevant, established criterion variable.
o Correlate these two sets of scores to produce the validity coefficient.
o Two kinds:
- Concurrent validity (the two measures are administered at the same or nearly the same time)
- Predictive validity (the test is given before the criterion is measured)
3. Construct Validity
o 3 ways to establish it:
- Provide correlations with convergent and discriminant variables; scores should correlate highly with convergent variables (convergent validity) and weakly with discriminant variables (discriminant validity).
- Show that certain groups obtain higher mean scores on the new instrument than other groups, with the groups determined on logical grounds prior to the test.
- Conduct a factor analysis.
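The correlational evidence described above (a validity coefficient, or convergent and discriminant correlations) comes down to computing Pearson correlations between sets of scores. The sketch below is a minimal illustration with made-up scores; the variable names and the data are assumptions, not results from any real instrument.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical scores for the same 8 people on three measures.
new_scale = [10, 12, 15, 18, 20, 23, 25, 30]     # instrument being validated
convergent = [11, 14, 14, 19, 22, 22, 27, 29]    # measure of a related construct
discriminant = [5, 9, 4, 8, 6, 9, 5, 7]          # measure of an unrelated construct

r_convergent = pearson_r(new_scale, convergent)
r_discriminant = pearson_r(new_scale, discriminant)
print(f"convergent r = {r_convergent:.2f}, discriminant r = {r_discriminant:.2f}")
```

In this toy data the new scale correlates strongly with the related measure and only weakly with the unrelated one, which is the pattern construct validation looks for. The same `pearson_r` call applied to test scores and criterion scores would give a criterion-related validity coefficient.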
Warnings about Validity
1. Validity (like reliability) is a characteristic of the data, not the instrument.
2. Correlation plays a central role in assessing criterion-related validity and construct validity (the first two ways); hence remember the warnings about correlation.
Final Comments
1. How high should reliability and validity coefficients be? Answer: they should be judged relative to other available instruments.
2. Researchers should use multiple methods to assess reliability and validity.
3. Reliability and validity relate to data quality, which by itself does not determine the degree to which a study’s results can be trusted. Conclusions can still be worthless because the wrong statistical procedure was used or the study design was deficient. In other words, reliability and validity are important, but other important concerns must be attended to as well.