Uploaded by Miles Justine Rivera

RELIABILITY

Reliability

Refers to the degree to which test scores are
consistent, dependable, or repeatable. It is a function
of the degree to which test scores are free from
errors of measurement (Drummond, 2000)

Refers to the consistency of test scores obtained by
the same persons when they are re-examined with
the same test on different occasions, or with
different sets of equivalent items, or under other
variable examining conditions (Anastasi & Urbina,
1997)

Underlies the error of measurement of a single
score, whereby we can predict the range of
fluctuation likely to occur in a single individual’s
scores as a result of irrelevant chance factors
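
One common way to express this range of fluctuation is the standard error of measurement (SEM). The sketch below assumes the usual formula SEM = SD * sqrt(1 - reliability), which is not stated on the slide, and uses hypothetical values for the standard deviation, the reliability coefficient, and the observed score.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability); assumed formula, not from the slide."""
    return sd * math.sqrt(1.0 - reliability)

def score_band(observed, sem, z=1.96):
    """Approximate band around an observed score (z = 1.96 for about 95%)."""
    return observed - z * sem, observed + z * sem

# Hypothetical test: SD of 10 points and a reliability coefficient of 0.84.
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)
low, high = score_band(observed=100.0, sem=sem)
print(f"SEM = {sem:.1f}, band = {low:.1f} to {high:.1f}")
```

A lower reliability coefficient widens the band, so a single score should be read as a range rather than a fixed point.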

Refers to the internal consistency of a test, based on
the number of items in the test and the average of
the intercorrelations among all test items
Some Sources of
Inconsistency
(Error Variance)
Characteristics of an individual

General skills and techniques of taking
tests (test-wiseness or test naivete)

Stable response sets (e.g., marking option A
more frequently than the other options of
multiple-choice items, or marking true-false items
“true” when undecided)
Level of practice on the specific skills involved
(especially psychomotor skills)


Factors affecting performance on many
or all tests at a particular time
-Health
-Fatigue
-Noises and distraction
-Motivation
-Emotional Strain
-Anxiety
Administration of the test/
Appraisal of test performance
• Conditions of testing: adherence to time
limits, freedom from distraction, clarity
of instructions, etc.
• Interaction of personality, sex, or race of
examiner with that of examinee that
facilitates or inhibits performance
• Unreliability or bias in grading or rating
performances

Others
• Luck in selection of answers by sheer
guessing
• Momentary distraction

Psychological Assessment for the Guidance Practitioner by Alexa PrielaAbrenica (2009)
Key Concepts
Sources of Error
• Time Sampling: Test-Retest Method
• Item Sampling: Parallel Forms Method
• Internal Consistency
◦ Split-half Method
◦ KR20 Formula
◦ Coefficient Alpha
• Inter-rater Reliability

• Measurement error will have an effect
on reliability.
• True scores vs. observed scores
• Reliability coefficient: ratio of the variance
of true scores on a test to the variance
of observed scores
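
The variance ratio can be illustrated with a small simulation. This is only a sketch: true scores are never observable in practice, so the data here are hypothetical, generated as a true score plus independent random error.

```python
import random
import statistics

def reliability_coefficient(true_scores, observed_scores):
    """Ratio of true-score variance to observed-score variance."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed_scores)

# Hypothetical data: true scores with SD 10, measurement error with SD 5.
rng = random.Random(0)
true_scores = [rng.gauss(50, 10) for _ in range(5000)]
observed_scores = [t + rng.gauss(0, 5) for t in true_scores]

r_xx = reliability_coefficient(true_scores, observed_scores)
# In theory the ratio is 100 / (100 + 25) = 0.80; the sample value is close to it.
print(round(r_xx, 2))
```

Increasing the error SD in the simulation lowers the ratio, which is the sense in which measurement error affects reliability.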

Source of Error: Time
Test-retest reliability
• Same test is given twice
• Find correlation of scores on both
instances
• Disadvantage: Carryover effect

Source of Error: Items
Parallel Forms
• Different items are used to measure the
same attribute
• Pearson Product-Moment Correlation
Coefficient
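
Both the test-retest and the parallel forms methods come down to correlating two sets of scores from the same examinees. A minimal sketch of the Pearson product-moment coefficient follows; the function name and the six pairs of scores are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical scores of six examinees on two administrations (or two forms).
first_scores = [12, 15, 9, 20, 17, 11]
second_scores = [13, 14, 10, 19, 18, 12]

reliability = pearson_r(first_scores, second_scores)
print(round(reliability, 3))
```

With scores from the same test given twice, the result is read as a coefficient of stability; with scores from two equivalent forms, as a coefficient of equivalence.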

Source of Error: Internal Consistency
Split-half
• Tests are given and divided into halves
that are scored separately
• Compare scores on the first half with scores
on the second half
• Cronbach’s coefficient alpha: gives the lowest
estimate of reliability that can be expected
• Kuder-Richardson 20 (for dichotomous items)
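
The split-half comparison and coefficient alpha can be sketched as follows. The item matrix is hypothetical (rows are examinees, columns are items rated 1 to 5), and alpha is computed with the standard formula k/(k-1) * (1 - sum of item variances / variance of total scores), which the slide does not spell out.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def split_half_correlation(rows):
    """Correlate scores on the first half of the items with the second half."""
    half = len(rows[0]) // 2
    first = [sum(row[:half]) for row in rows]
    second = [sum(row[half:]) for row in rows]
    return pearson_r(first, second)

def cronbach_alpha(rows):
    """Coefficient alpha from item variances and total-score variance."""
    k = len(rows[0])
    item_vars = [statistics.pvariance([row[j] for row in rows]) for j in range(k)]
    total_var = statistics.pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

# Hypothetical responses of six examinees to a four-item scale.
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
    [3, 4, 3, 3],
]

half_r = split_half_correlation(responses)
alpha = cronbach_alpha(responses)
print(round(half_r, 3), round(alpha, 3))
```

The split-half value describes a test only half as long as the original, which is one reason alpha is often preferred as a summary.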

KR20 (Kuder-Richardson Formula
20)
• An index of the internal consistency
reliability of a measurement instrument,
such as a test, questionnaire, or
inventory
• Can be applied to any test item responses
that are dichotomously scored
• Values of KR-20 generally range from
0.0 to 1.0, with higher values
representing a more internally
consistent instrument

KR20
• In very rare cases, typically with very
small samples, values less than 0.0 can
occur, which indicates an extremely
unreliable measurement
• 0.7 is an acceptable value
• 0.8 for longer tests of 50 items or
more
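
A minimal sketch of the KR-20 computation, assuming the standard formula k/(k-1) * (1 - sum of p*q / variance of total scores), where p is the proportion answering an item correctly and q = 1 - p. The response matrix is hypothetical (1 = correct, 0 = incorrect).

```python
import statistics

def kr20(rows):
    """Kuder-Richardson Formula 20 for dichotomously scored items."""
    n = len(rows)      # number of examinees
    k = len(rows[0])   # number of items
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in rows) / n  # proportion correct on item j
        pq_sum += p * (1.0 - p)
    total_var = statistics.pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - pq_sum / total_var)

# Hypothetical answers of six examinees to five right/wrong items.
answers = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

kr20_value = kr20(answers)
print(round(kr20_value, 3))
```

The population variance is used for the total scores so that it matches p*q, which is the population variance of a 0/1 item.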

http://knowledge.sagepub.com/view/researchdesign/n205.xml
Methods of assessing reliability
(Drummond, 2006)
Test-Retest
• Procedure: same test given twice with
a time interval between testings
• Coefficient: stability
• Problems: memory effect, practice
effect, change over time

Alternate Forms-Version 1
• Procedure: equivalent tests given
one after the other
• Coefficient: equivalence
• Problems: hard to develop equivalent
tests

Alternate Forms-Version 2
• Procedure: equivalent tests given with
a time interval between testings
• Coefficient: equivalence and stability
• Problems: hard to develop equivalent
tests, may reflect change in behavior
over time

Internal Consistency
• Procedure: one test given at one time
only (test divided into parts in split-half)
• Coefficient: equivalence and internal
consistency
• Problems: uses shortened forms (split-half),
only good if traits are unitary or
homogeneous, gives high estimates on a
speeded test, hard to compute by hand

Psychological Assessment for the Guidance Practitioner by Alexa PrielaAbrenica (2009)
Inter-Rater/Inter-Observer Reliability

• The degree of agreement among
raters
• Percentage of agreement
• Correlations
• Cohen’s Kappa
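
Percentage of agreement and Cohen's kappa can be sketched for two raters as follows. The ratings are hypothetical, and kappa uses the standard definition (observed agreement minus chance agreement) / (1 - chance agreement).

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters give the same rating."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (observed - expected) / (1.0 - expected)

# Hypothetical yes/no ratings of ten performances by two raters.
rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(round(agreement, 2), round(kappa, 2))
```

Kappa comes out lower than the raw percentage because part of the observed agreement would be expected by chance alone.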
Factors Affecting Reliability
• Length increases reliability
• Homogeneity increases reliability
• Shorter time interval between testings, higher reliability
• Type of reliability estimate used
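
The effect of test length is often illustrated with the Spearman-Brown formula, which the slide does not name; it is used here as an assumption. It predicts the reliability of a test lengthened by a given factor with comparable items. The starting reliability of 0.60 is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)

# Hypothetical test with reliability 0.60, doubled and tripled in length.
doubled = spearman_brown(0.60, 2)
tripled = spearman_brown(0.60, 3)
print(round(doubled, 2), round(tripled, 2))
```

The gain shrinks with each added block of items, so lengthening a test helps most when the starting reliability is low.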

How reliable is reliable?

Reliability coefficients of at least
0.70 to 0.80 are usually expected