Psychology - University of Toledo

Joni L Mihura:
The Validity of Psychological Tests as Measures of Aggressive Behavior:
A Review of the Literature
E.M. Farrer & J.L. Mihura
University of Toledo
Abstract
This study reviews the empirical literature on the validity of
psychological tests as measures of aggressive behavior. The
psychological tests were categorized into two groups: (a) self-report
questionnaires (e.g., BDHI, JI, PAI) and (b) performance personality
tests (e.g., Rorschach and Hand Test). For the criterion variable of
aggressive behavior, only studies using observational measures are
included in the review (e.g., ward reports, patient records, chart
reviews). The effect sizes of the psychological tests compared to
observational measures are presented and then compared using a
monotrait-multimethod approach. Also of interest are similar studies
using a multitrait-multimethod approach, comparing and contrasting
similar constructs (e.g., aggression, anger, antisocial behavior) using
the same or different methods (self-report measures, performance
personality tests, and observational measures). The goal of the
study is therefore twofold: (1) to review the literature on the validity of
psychological tests as measures of aggressive behavior and (2) to
place this aggression literature in a psychometric context regarding
more general issues of monomethod-heteromethod approaches to
validity.
Method cont.
The observational measures had to specify clearly how aggressive
behavior was measured. Records of past aggressive behavior (e.g.,
chart reviews, criminal file reviews) also had to use a well-defined
method of measuring aggressive behavior, including “objective” systems
such as the number of institutional infractions for forensic samples.
Only studies written and conducted in English with no clear
criterion contamination (e.g., behavior ratings made blind to
psychological test data) were included.
For comparative purposes, the findings are reported in effect
sizes, converted where necessary to use Pearson r as the common
metric. As a rule of thumb, the magnitude of effect sizes (r) can be
classified as (a) small = .10, (b) moderate = .30, and (c) large = .50.
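The conversion to a common r metric and the rule of thumb above can be sketched as follows. This is a minimal illustration, not the procedure the authors used: the source does not say which statistics were converted, so the two textbook conversions shown (Cohen's d with equal group sizes, and a t statistic with its degrees of freedom) are assumptions, as is the choice to label an effect by its nearest benchmark. All function names are hypothetical.

```python
import math

# Rule-of-thumb benchmarks quoted in the text.
BENCHMARKS = {"small": 0.10, "moderate": 0.30, "large": 0.50}


def d_to_r(d: float) -> float:
    """Convert Cohen's d to r, assuming two groups of equal size."""
    return d / math.sqrt(d * d + 4.0)


def t_to_r(t: float, df: int) -> float:
    """Convert a t statistic to r; the sign of r follows the sign of t."""
    return t / math.sqrt(t * t + df)


def classify_r(r: float) -> str:
    """Label |r| by the nearest benchmark (an assumed convention)."""
    size = abs(r)
    return min(BENCHMARKS, key=lambda label: abs(size - BENCHMARKS[label]))


if __name__ == "__main__":
    r = t_to_r(2.0, 96)  # hypothetical t(96) = 2.0
    print(f"r = {r:.2f}, classified as {classify_r(r)}")
```

Labeling by the nearest benchmark is only one reading of the rule of thumb; treating the benchmarks as lower bounds of bands would label some values differently.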
Table 2.
Aggressive Behavior as Measured by Performance Personality and
Self-Report Tests: Summary Statistics

Measurement Method               k      N     rw
Performance Personality Tests    3    246    .31
Self-Report Tests                5    498    .27
Total Tests                      8    744    .29

Note: k = number of effect sizes included in the summary statistic;
rw = mean effect size (r) weighted by N
Figure 1.
Multitrait-Multimethod Table (rows: measurement method; columns:
construct overlap)
Introduction
In psychology, information about a person’s level of functioning and
personality is most often obtained from the person him- or herself.
This can be done using self-report measures, such as the Personality
Assessment Inventory (PAI), or performance personality tests, such as
the Rorschach. These measures, however, rely heavily on the respondent
as the source of information, whereas behavior measures rely on others
as the source of information.
Results cont.
Figure 1 cells, by measurement method (rows) and construct overlap
(columns):

Measurement   Construct Overlap
Method        Moderate                     High
Different     E.g., self-report anger      Major Study Question:
              measure compared to          self-report or performance
              aggressive behavior          personality aggression
                                           measures compared to
                                           aggressive behavior
Same          E.g., self-report anger      E.g., self-report aggression
              measure compared to          measure compared to
              self-report aggression       self-report aggression
              measure                      measure
Table 3.
MTMM Results: Weighted Mean Effect Sizes

Measurement   Construct Overlap
Method        Moderate                   High
Different     .16 (k = 8, N = 924)       .28 (k = 9, N = 814)
Same          .46 (k = 6, N = 1,771)     .77 (k = 4, N = 371)
Many self-report measures used for screening are broadband
inventories such as the PAI or the Jesness Inventory (JI). Several are
also specifically designed to measure the construct of interest. The
construct of particular interest to this review is aggression.
Aggression can be defined as “the act or practice of attacking
without provocation” (Coccaro et al., 1997). Aggression can be
verbal or physical and, for this study, directed outwardly. The reliance
on self-report measures and performance personality tests of
aggression is of particular interest due to the implications that could
arise if the aggression is carried out. How well can self-report
measures and performance personality tests designed to measure
aggression actually predict aggressive behavior?
Further, aggression has related constructs with similar
implications, among them anger and antisocial behavior. How
well do tests specifically measuring those related constructs predict
aggressive behavior? Also, how well do the same constructs
measured by different methods compare? This multitrait-multimethod
approach is of particular interest to the study. According to Campbell
and Fiske (1959), the same construct measured by different methods
should agree and should agree better than different constructs
measured by different methods.
Thus, the study has two goals. The first is to review how
self-report and performance personality measures of aggression compare
to observable behavior. The second is to compare similar but slightly
different constructs with themselves and with each other using the same
and different methods.
Method
Studies were located by conducting a PsycINFO search of
articles published within the past 30 years with either Aggressive
Behavior or Antisocial Behavior or Violence as Subject terms. These
were further limited by a classification code of personality scales and
inventories or clinical psychological testing. This limit was applied
because the search without the classification code retrieved a high
volume of articles (19,651). The limit reduced the number of articles
to 387.
These remaining articles were kept or eliminated based on the
following criteria. First, the study had to include a self-report or
performance personality measure of aggression, anger, or antisocial
behavior. Second, the study had to include an observational measure
that could be correlated with the self-report or performance
personality test.
Discussion
Results
For studies that reported more than one effect size, these were
averaged and reported as one effect size per study.
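A minimal sketch of this per-study averaging step is shown below. It assumes a simple arithmetic mean of the r values; the source does not say whether a Fisher z transformation was applied first. The study labels and values are hypothetical and are not taken from the reviewed studies.

```python
from statistics import mean


def one_effect_per_study(effects):
    """Collapse {study: [r, ...]} into {study: mean r} (simple mean)."""
    return {study: mean(rs) for study, rs in effects.items()}


# Hypothetical values for illustration only.
example = {"study_A": [0.20, 0.30], "study_B": [0.40]}
per_study = one_effect_per_study(example)

for study, r in per_study.items():
    print(f"{study}: r = {r:.2f}")
```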
Table 1.
Aggressive Behavior as Measured by Performance Personality and
Self-Report Tests

Study   Measures                     Sample     N      r
Performance Personality Tests
 3.     Rorschach AgC                C          94    .27
14.     Hand Test AOS&ACT-MOV        MR         36    .57
13.     Hand Test AOS&ACT-MOV        MR        116    .27
Self-Report Tests
18.     PAI AGG                      F         169    .26
 4.     Aggression Questionnaire PA  S          91    .33
17.     BDHI Assault                 F          60    .26
15.     BDHI Assault                 C          51    .40
 7.     PAI AGG-P                    F         127    .20

Note: C = Clinical; MR = Mentally Retarded; F = Forensic; S = Student
Table 1 shows the effect sizes for performance personality tests
and self-report measures as compared to behavior; these range from
.20 to .57. Summary statistics were also computed for self-report and
performance personality test effect sizes by taking the mean of each
grouping of tests weighted by N. As shown in Table 2, both performance
personality and self-report tests had overall medium effect size
relationships with aggressive behavior (r = .31 and r = .27,
respectively).
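The weighting step can be illustrated with the N and r values listed in Table 1. This is a sketch under two assumptions: that the weights are the raw sample sizes applied directly to r (no Fisher z transformation), and that the Table 1 rows group into three performance personality effects and five self-report effects, which matches the k and N reported in Table 2. The function name is hypothetical.

```python
def weighted_mean_r(rows):
    """Sample-size-weighted mean of r over (N, r) pairs."""
    total_n = sum(n for n, _ in rows)
    return sum(n * r for n, r in rows) / total_n


# (N, r) pairs as listed in Table 1; the grouping is an assumption.
performance = [(94, 0.27), (36, 0.57), (116, 0.27)]
self_report = [(169, 0.26), (91, 0.33), (60, 0.26), (51, 0.40), (127, 0.20)]
all_tests = performance + self_report

for label, rows in [
    ("Performance personality", performance),
    ("Self-report", self_report),
    ("Total", all_tests),
]:
    n_total = sum(n for n, _ in rows)
    print(f"{label}: k = {len(rows)}, N = {n_total}, "
          f"rw = {weighted_mean_r(rows):.2f}")
```

Under these assumptions the rounded results agree with the summary values reported in Table 2 (.31, .27, and .29).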
Table 3 shows what happens to the effect sizes when slightly
different constructs are measured using different methods and when
the same constructs are measured by the same methods. Again, the
effect sizes in the table are weighted to compensate for the varying
sample sizes.
According to Jacob Cohen (1988), “…when one looks at near-maximum
correlation coefficients, of personality measures…with real-life
criteria, the values one encounters fall at the order of r = .30.” This
corresponds to the findings above, which were obtained using
personality measures of constructs that are the same as, or highly
overlapping with, the real-life criterion in question. Self-report and
performance personality measures also did not differ notably in their
effect sizes (r = .27 and r = .31, respectively).
The data also correspond quite well to Campbell and Fiske’s (1959)
MTMM approach. Constructs with moderate overlap measured by different
methods yielded low effect sizes (r = .16). On the other hand, with
high construct overlap and the same method, the correlation was quite
high (r = .77), about what one would expect for test-retest
reliability. This also corresponds with Meyer et al.’s (2001) findings
that a single measure represents only a certain portion of one’s
personality and that different sources of information tend to provide
their own unique interpretation of someone’s personality or behavior.
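The ordering expected under the MTMM logic can be checked against the weighted means reported in Table 3. The sketch below only compares the four cell means; it is not a significance test, and the particular comparisons chosen are one assumed way of turning the Campbell and Fiske expectation into checks. The names are hypothetical.

```python
# Weighted mean r per cell of Table 3: (measurement method, construct overlap).
TABLE_3 = {
    ("different", "moderate"): 0.16,
    ("different", "high"): 0.28,
    ("same", "moderate"): 0.46,
    ("same", "high"): 0.77,
}


def mtmm_checks(cells):
    """Return the ordering checks implied by the MTMM expectation."""
    return {
        "high overlap > moderate overlap (different methods)":
            cells[("different", "high")] > cells[("different", "moderate")],
        "high overlap > moderate overlap (same method)":
            cells[("same", "high")] > cells[("same", "moderate")],
        "same method > different methods (moderate overlap)":
            cells[("same", "moderate")] > cells[("different", "moderate")],
        "same method > different methods (high overlap)":
            cells[("same", "high")] > cells[("different", "high")],
    }


for name, holds in mtmm_checks(TABLE_3).items():
    print(f"{name}: {holds}")
```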
Further work remains. While performance personality tests and
self-report measures were compared to each other, to behavioral
measures, and to themselves, the next step would be to see how well
observational measures compare to themselves.
References
1. Campbell, D.T., & Fiske, D.W. (1959). Convergent and discriminant
validation by the multitrait-multimethod matrix. Psychological Bulletin,
56, 81-105.
2. Cohen, J. (1988). Set correlation and contingency tables. Applied
Psychological Measurement, 12(4), 425-434.
3. Meyer, G.J., Finn, S.E., Eyde, L.D., Kay, G.G., Moreland, K.L., Dies,
R.R., Eisman, E.J., Kubiszyn, T.W., & Reed, G.M. (2001). Psychological
testing and psychological assessment. American Psychologist, 56(2),
128-165.
See Handout for the List of Reviewed Studies.