Lecture 3: Psychometric Testing 1

Evaluating Psychological Tests
Psychological testing
• Suffers a credibility problem in the eyes
of the general public
• Two main problems
– Tests used inappropriately
• Goddard (1912) used a translation of Binet’s test to
test ability of American immigrants - conclusion 79%
of Italian immigrants = ‘feeble-minded’ - bias
– Tests themselves can be flawed
• Often measure supposed constructs which are not
supported by proper factor analysis (e.g., internal locus
of control)
External bias in tests
• Do group differences imply test bias (difficulty
unrelated to characteristic being assessed)?
– V1 - innate abilities can be different across
groups (Reynolds, 1995; Kline, 1993)
• Japanese have higher than average spatial abilities
• African Americans have ‘lower IQ’ (Herrnstein &
Murray, 1996)
– V2 – Ethnic and gender groups must have the
same underlying abilities – evidence to the
contrary must be a product of measuring
something other than what is relevant
• Kline – ‘egalitarian fallacy’
Dealing with differences
• Bias is detected through different regression
equations across groups, not through different group means
• What purpose does research in this area
serve?
– Within group differences far outweigh between
group differences
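
The regression approach above can be sketched as follows. This is a minimal illustration, not a full bias analysis: it assumes an external criterion score is available for each person, the data are made up, and fit_line is a hypothetical helper. A separate line predicting the criterion from the test score is fitted for each group, and the slopes and intercepts are compared.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: test scores and a criterion (e.g. later performance).
group_a_test = [10, 12, 14, 16, 18]
group_a_crit = [20, 24, 28, 32, 36]
group_b_test = [10, 12, 14, 16, 18]
group_b_crit = [25, 27, 29, 31, 33]

slope_a, intercept_a = fit_line(group_a_test, group_a_crit)
slope_b, intercept_b = fit_line(group_b_test, group_b_crit)

# The same test score predicts a different criterion score in each group,
# which is the pattern the regression approach looks for.
print("Group A: slope", slope_a, "intercept", intercept_a)
print("Group B: slope", slope_b, "intercept", intercept_b)
```

In practice the difference between the two equations would be tested statistically rather than judged by eye.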
Detecting internal bias
• If only gross scores are considered, hard and easy
items for each group might balance themselves out
giving a false impression of the test’s ‘health’
• Alternative – Run a mixed factorial ANOVA
– Each test item (question) is entered as a level of the
repeated-measures factor
– Group = between subjects variable
• Main effect of item – expected
• Main effect of group shows external bias
• Interaction shows internal bias in that the pattern of responding
is different across the groups
• Such a method is susceptible to power manipulation
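
A rough sketch of the mixed factorial ANOVA described above, written out by hand for a balanced design (the same number of people in each group). The function name mixed_anova and the data are made up for illustration. Only F ratios and degrees of freedom are returned; p-values would need the F distribution from a statistics package. The example data are chosen so that the two groups have identical total scores but opposite item patterns, the situation in which gross scores hide the problem.

```python
def mixed_anova(data):
    """F ratios for a balanced group (between) x item (within) design.

    data[g][s][i] is the score of subject s in group g on item i.
    Returns {effect: (F, df_effect, df_error)}.
    """
    n_groups, n_subj, n_items = len(data), len(data[0]), len(data[0][0])

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    grand = mean(x for grp in data for subj in grp for x in subj)
    group_means = [mean(x for subj in grp for x in subj) for grp in data]
    item_means = [mean(subj[i] for grp in data for subj in grp)
                  for i in range(n_items)]
    cell_means = [[mean(subj[i] for subj in grp) for i in range(n_items)]
                  for grp in data]
    subj_means = [[mean(subj) for subj in grp] for grp in data]

    # Partition the total sum of squares.
    ss_total = sum((x - grand) ** 2
                   for grp in data for subj in grp for x in subj)
    ss_subjects = n_items * sum((m - grand) ** 2
                                for grp in subj_means for m in grp)
    ss_group = n_subj * n_items * sum((m - grand) ** 2 for m in group_means)
    ss_subj_within = ss_subjects - ss_group
    ss_item = n_groups * n_subj * sum((m - grand) ** 2 for m in item_means)
    ss_inter = n_subj * sum(
        (cell_means[g][i] - group_means[g] - item_means[i] + grand) ** 2
        for g in range(n_groups) for i in range(n_items))
    ss_error = ss_total - ss_subjects - ss_item - ss_inter

    df_group = n_groups - 1
    df_subj = n_groups * (n_subj - 1)
    df_item = n_items - 1
    df_inter = df_group * df_item
    df_error = df_subj * df_item

    ms_subj = ss_subj_within / df_subj   # error term for the group effect
    ms_error = ss_error / df_error       # error term for item and interaction

    return {
        "group": (ss_group / df_group / ms_subj, df_group, df_subj),
        "item": (ss_item / df_item / ms_error, df_item, df_error),
        "group_x_item": (ss_inter / df_inter / ms_error, df_inter, df_error),
    }


# Made-up item scores: 3 people per group, 3 items each.
group_a = [[5, 3, 1], [4, 3, 2], [5, 2, 1]]
group_b = [[1, 3, 5], [2, 3, 4], [1, 2, 5]]

for effect, (f_ratio, df1, df2) in mixed_anova([group_a, group_b]).items():
    print(f"{effect}: F({df1}, {df2}) = {f_ratio:.2f}")
```

With these data the group main effect is zero because the totals match, while the group by item interaction is large, which is the internal-bias pattern.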
Bias - performance characteristics
• Response bias
– individuals are more likely to agree than
disagree (Cronbach, 1946) – response set of
acquiescence
• Does not cause a problem if everyone behaves in
same manner – standard score will be unaffected
• But there are considerable individual differences in
acquiescence, so it can cause a major problem
– Changing the polarity of some items (reverse scoring) removes this difficulty
• Social desirability
– Counteracted by lie scales and consistency
measures
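
Two of the points about acquiescence can be illustrated with a short sketch. It assumes a 1 to 5 agreement scale, and the function names and data are made up. The first part shows that adding the same amount to everyone's score leaves standard (z) scores unchanged; the second shows how reverse-scoring half of the items stops a person who agrees with everything from getting an extreme total.

```python
import statistics


def z_scores(scores):
    """Standard scores: distance from the mean in standard deviation units."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]


def reverse_score(response, scale_min=1, scale_max=5):
    """Flip a response on the assumed 1-5 scale (5 becomes 1, 4 becomes 2, ...)."""
    return scale_max + scale_min - response


def score_scale(responses, reversed_items):
    """Total score, reverse-scoring the items whose index is in reversed_items."""
    return sum(reverse_score(r) if i in reversed_items else r
               for i, r in enumerate(responses))


# If everyone acquiesces to the same degree, z-scores do not change.
raw = [10, 12, 14]
inflated = [s + 3 for s in raw]
print(z_scores(raw))
print(z_scores(inflated))

# Someone who answers "strongly agree" (5) to all four items:
yea_sayer = [5, 5, 5, 5]
print(score_scale(yea_sayer, reversed_items=set()))    # no reversed items
print(score_scale(yea_sayer, reversed_items={1, 3}))   # half reversed
```

With half of the items reversed, blanket agreement lands at the scale midpoint instead of the maximum.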
Obvious influences
• Motivation
• Expectation
• Anxiety
• Test-specific practice
Revisiting Validity
Validity – different definitions
• Correctness or truth of an inference
• Validity with respect to IV
– Are we truly manipulating what we think we
are manipulating?
• Often relies on the construct of interest being adequately
described
• How do you manipulate something like the
unconscious?
• Validity with respect to the DV
– Extent to which you are measuring what you claim
to measure
Different types of validity
• Content validity
– Whether the target construct is adequately
addressed
– When measuring depression, the test should assess
aspects such as fatigue, anxiety, appetite,
motivation and libido
• Is assessed through expert opinion
– Has a certain amount of subjectivity
Different types of validity
• Criterion-Related validity
– How measure compares to some already
validated measure
• Two types
– Predictive
– Concurrent
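
Criterion-related validity is usually summarised as a correlation between the new measure and the criterion. The sketch below is illustrative only: the scores are made up and pearson_r is a hand-written helper. For concurrent validity both measures are taken at about the same time; for predictive validity the criterion is collected later.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up scores for five people on a new test and on an
# already validated measure of the same construct.
new_test = [12, 15, 11, 18, 14]
validated_measure = [30, 36, 29, 41, 33]

validity_coefficient = pearson_r(new_test, validated_measure)
print(f"validity coefficient r = {validity_coefficient:.2f}")
```

A coefficient close to 1 suggests the new test ranks people in much the same way as the validated measure.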
Different types of validity
• Construct validity
– Most important – Are the experimental
manipulations that we make really manipulating
the construct of interest?
– Evaluation requires
• Clear definition of the construct
– Can be difficult e.g., IQ – has many different facets
• Assess match between construct and operations used to
represent it (exp manipulations)
– Can involve criterion and content validity
– Viewed as an evolving never ending process
Different types of validity
• Internal validity – degree to which the
independent and dependent variables are
causally linked
• External validity – degree to which causal
relationship holds across different settings
How relevant is validity to you?
• Reviewing articles is essentially addressing
validity and reliability issues
– In an examination it would be useful,
although not essential, to talk about the different
forms of validity
• In discussion sections of reports again you
are essentially evaluating the results with
respect to validity and reliability
– You would not really use the formal language used
here; this is a style issue