Test Administration

The Examiner & the Subject

Relationship between examiner &
subject
• Feldman & Sullivan (1960)
• WISC administered to children in two
conditions: enhanced rapport & neutral

Race of tester
• Little evidence that race of tester
significantly affects test performance (at
least with black & white American
children)

Training of testers
• Patterson et al. (1995)

Expectancy or Rosenthal effects
Rosenthal & Jacobson (1968)




Classic study conducted at a public
elementary school, described as “lower
class”
School used a tracking system where
children were sorted into one of three
tracks (fast, medium & slow) based on
reading performance
About 600 students in the school
Students at the school described as “low
achievers”; pre-test data showed an
average IQ of 98 for boys and 99 for girls



In spring of 1964, students were given the
“Harvard Test of Inflected Acquisition”, which
supposedly predicted academic “blooming”
At the beginning of the following year, researchers
randomly selected 20% of the students at
the school, and stated that these students
were “academic bloomers”
At the end of the year, IQ scores of
children labeled as “bloomers” were
compared with the IQ scores of children
who were not given this label
Results

Students labeled as “bloomers” gained
significantly more IQ points over the year,
compared to control students

Reinforcing responses
• Sweet (1970)
• Terrell et al. (1978)

Administered WISC-R to lower-income
African-American 2nd graders under one of 4
conditions:
• No feedback
• Verbal praise
• Candy
• Culturally relevant verbal praise (“nice job,
blood”)
Standardized Administration


Standardized testing procedures are
so important that they are listed as
an essential criterion for valid testing
in the Standards for Educational &
Psychological Testing
Requires that the tester be very
familiar with the materials, which takes
several test administrations
Background & Motivation of
Examinee

Test anxiety
• When taking an important examination I
sweat a great deal
• I freeze up when I take intelligence
tests or school exams
• I really don’t understand why some
people get so upset about tests
• I dread courses in which the instructor
likes to give “pop” quizzes
Test Anxiety (cont’d)


Test anxiety is negatively correlated
with school achievement, aptitude
test scores, & measures of
intelligence
Test anxiety is exacerbated by tests
with time limits
Siegman (1956)

Compared performance levels of high- and
low-anxious medical/psychiatric patients
on timed & untimed subtests from the
WAIS
[Figure: mean WAIS subtest scores (axis from 9 to 12.5) for low-anxious
and high-anxious subjects on untimed tests and timed tests]
Test-Smart or “Coached”
Individuals

Powers & Swinton (1984)
• Mailed test preparation materials,
including extra practice tests,
explanations to practice test questions,
hints or strategies for answering
different item types
• Special coaching and preparation yielded
scores 53 points higher than those of
uncoached respondents
Fair Test Fact Sheet


The GRE can be conquered with tricks
having nothing to do with the knowledge,
persistence, thoughtfulness, and other
qualities that are vital to graduate study
and professional performance.
One coaching book advises: "Taking the
GRE is a game with its own rules, traps,
and measures of success…How you do on
the GRE is an indication of how well you
play the game, but it is not an indication
of how 'intelligent' you are, or what kind
of student you will make."
Fair Test on Coaching (cont’d)

The exam's susceptibility to coaching undermines
educational equity by advantaging students who can afford
test prep materials - many of whom already score in the
upper percentiles - over those who cannot. The most
comprehensive coaching classes (which generally offer the
greatest score gains) cost upwards of $1,000 or more. One
coaching company claims its students gain on average 212
points on the GRE - a substantial advantage in the graduate
school application process. While ETS asserts that the GRE
is not coachable, it promotes its own materials: test takers
can purchase a diagnostic service for $15, Preparing to
Take the General Test for $18, or can use the free
POWERPREP software package. While there are no
independent studies on coaching's impact on the GRE,
independent studies of coaching for the similar SAT exams
demonstrate that coaching can improve scores (see
FairTest's The SAT Coaching Cover-Up).
Computer-Administered Tests







Improves standardization
Can individually tailor administration of test items
Allows for precise timing of questions &
responses
Saves time & expense
Allows as much time as necessary for responses
Reduces bias in testing
Examinees often disclose more in response to a
computer-administered test than to a human-administered test
Computerized Adaptive Testing



An adaptive test is one in which the
questions are tailored specifically to the
individual being examined
An adaptive test of mental ability, for
example, will include items that are
neither too easy nor too difficult for the
respondent
As a result, when two individuals of
different ability take the same test, they
might respond to completely different
questions
Computerized Adaptive Testing
[Figure: a ten-item routing test leads to one of seven second-stage
blocks of ten items each]
• Most difficult items (10)
• Difficult items (10)
• Less difficult items (10)
• Average items (10)
• Easier items (10)
• Easy items (10)
• Easiest items (10)
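
The routing-test layout listed above can be sketched in a few lines of Python. This is only an illustration: the block labels come from the figure, but the function name `route` and the rule that spreads routing scores evenly across the seven blocks are assumptions, since the slide does not give the actual cut-off scores.

```python
# Block labels are taken from the figure, ordered from easiest to hardest.
BLOCKS = [
    "Easiest items",
    "Easy items",
    "Easier items",
    "Average items",
    "Less difficult items",
    "Difficult items",
    "Most difficult items",
]


def route(routing_score: int, n_routing_items: int = 10) -> str:
    """Map a routing-test score (0..n_routing_items) to a second-stage block.

    The even spread of scores over blocks is a hypothetical rule,
    not the cut-offs used by any real test.
    """
    if not 0 <= routing_score <= n_routing_items:
        raise ValueError("routing score out of range")
    # There are n_routing_items + 1 possible scores; scale them onto block indices.
    index = routing_score * len(BLOCKS) // (n_routing_items + 1)
    return BLOCKS[index]


if __name__ == "__main__":
    for score in (0, 5, 10):
        print(score, "->", route(score))
```

Under this assumed rule, two examinees with different routing scores answer different second-stage items, which is the point the slide makes about adaptive tests.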
Computerized Adaptive Testing
Using Item Response Theory


Theory allows for a calculation of the
difficulty of each item, the discriminating
power of the item, and the probability of
guessing the correct answer
It then provides procedures for
• Estimating the respondent’s ability on the basis
of his or her response to each item
• Choosing the optimal test items on the basis of
that estimate
• Revising that estimate on the basis of
responses to each new item
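
The three procedures above can be illustrated with a small Python sketch. It assumes the three-parameter logistic (3PL) model, in which each item has a discrimination `a`, a difficulty `b`, and a guessing probability `c`. The item pool, the ability grid, the standard-normal prior, and all function names are hypothetical choices made for illustration, not the procedure of any particular testing program.

```python
import math


def p_correct(theta, a, b, c):
    """3PL model: probability of a correct answer at ability theta."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p_correct(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


# Candidate ability values from -4.0 to 4.0 (an assumed grid).
GRID = [g / 10.0 for g in range(-40, 41)]


def estimate_ability(responses):
    """Posterior-mean ability estimate with an assumed standard-normal prior.

    responses: list of ((a, b, c), correct) pairs.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # prior weight
        for (a, b, c), correct in responses:
            p = p_correct(theta, a, b, c)
            w *= p if correct else (1.0 - p)  # likelihood of each response
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(GRID, weights)) / total


def next_item(theta, pool, used):
    """Choose the unused item that is most informative at the current estimate."""
    candidates = [i for i in range(len(pool)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))


def run_cat(pool, answer, n_items=3):
    """Administer n_items adaptively; answer(i) returns True if item i is correct."""
    theta, used, responses = 0.0, set(), []
    for _ in range(n_items):
        i = next_item(theta, pool, used)        # choose the optimal item
        used.add(i)
        responses.append((pool[i], answer(i)))  # record the response
        theta = estimate_ability(responses)     # revise the estimate
    return theta, responses


# Hypothetical item pool: (discrimination a, difficulty b, guessing c).
POOL = [(1.0, -2.0, 0.2), (1.0, 0.0, 0.2), (1.0, 2.0, 0.2), (1.2, 1.0, 0.2), (1.2, -1.0, 0.2)]

if __name__ == "__main__":
    theta, _ = run_cat(POOL, answer=lambda i: True)
    print("estimate after three correct answers:", round(theta, 2))
```

A grid-based posterior mean is used here because it stays finite even when every answer is correct or every answer is wrong; operational programs may use other estimators and stopping rules.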
Advantages of Computerized
Adaptive Testing




Makes it possible to achieve high levels of
accuracy using an extremely small set of
test questions (reducing fatigue, boredom)
Improves item security
Saves time & money
Research shows that CAT versions of tests
produce scores that correlate highly
(about .90) with scores from the
paper-and-pencil versions
Behavioural Assessment
Methodology


Research has shown that in certain
areas (e.g., estimating job aptitude),
the best kind of test involves giving a
person some of the tasks he/she will
be required to perform on the job,
and then observing & rating that
performance
Reliability of observers’ ratings is
critical in these kinds of measures
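
One common way to quantify the agreement between two observers is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The slides do not name a specific index, so the choice of kappa, the function name `cohens_kappa`, and the example ratings below are assumptions for illustration only.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two observers rating the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Proportion of cases on which the two observers gave the same rating.
    observed = sum(x == y for x, y in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each observer's category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical ratings of four work samples by two observers.
    observer_1 = ["pass", "pass", "fail", "fail"]
    observer_2 = ["pass", "fail", "fail", "fail"]
    print(cohens_kappa(observer_1, observer_2))
```

In the hypothetical example the observers agree on three of four cases (0.75), but half of that agreement is expected by chance, so kappa is 0.5.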
Issues in Behavioural Assessment

Reactivity
• Reliability and accuracy are highest when
someone is monitoring the observers; they
decrease when observers' work is not monitored

Drift
• After training, observers tend to “drift” away
from the standards or rules they followed in
training

Expectancies
• Behavioural observers will pay more attention
to & notice behaviour they expect