VALIDITY OF THE LIE DETECTOR
A Psychophysiological Perspective
JOHN J. FUREDY
University of Toronto
RONALD J. HESLEGRAVE
Defense and Civil Institute of Environmental Medicine, Toronto
The profession of polygraphy uses physiological measures to improve the detection of
deception. The science most relevant for assessing the utility of these measures is that of
psychophysiology. This article examines the validity of polygraphy from the perspective
of the science of psychophysiology, which employs physiological measures to study and
differentiate psychological processes. The focus is on the version practiced currently by
most members of the American Polygraph Association, the "control question technique"
(CQT). A brief consideration of some critical terms is followed by a description of CQT
polygraphy, and then by a review of the literature. We conclude that as a scientific tool,
CQT polygraphy is of questionable validity, although it is probably a better-than-chance
detector of guilt.
Especially in the United States, but also in Canada, Israel,
Japan, and the United Kingdom, the forensic use of the lie
detector is increasing (Office of Technology Assessment Report,
1983). The evaluation of polygraphic assessment continues to be a
matter of heated and complex controversy among both laymen
and experts in cognate fields. The aim of this review is to shed
some light on the validity of the lie detector, and the practice of
polygraphy as established by the American Polygraph Association
(APA).1
AUTHORS' NOTE: The preparation of this article was supported by grants to
JJF from the National Research Council of Canada and the Social Sciences
Research Council of Canada (Sabbatical leave Fellowship). We are indebted to
Caroline Davis for comments on an earlier draft. Correspondence can be
CRIMINAL JUSTICE AND BEHAVIOR, Vol. 15 No. 2, June 1988 219-246
© 1988 American Association for Correctional Psychology
In any controversial area, the basic terms are often only
emotively rather than intellectually understood. Usage may
hallow a particular meaning that is not justified by the facts. This
is the case with the meaning of polygraphy. Through association
with the APA, this term has come to mean the specific procedures
of the profession of polygraphers that are used, purportedly, to
detect deception. As a result, it has become common to talk about
"administering a polygraph test," and polygraphic organizations
have indicated that such administration should be restricted to
bona fide experts in "polygraphy." One effect of this usage is that
it implies that a polygraph test is a standardized technique for
differentiating honesty from guilt (or, at the very least, innocence
from guilt) by an expert. In fact, as will become apparent, the
"test" is not standardized, and it is doubtful whether the examiners
can be considered to be (scientific) experts in the legal sense of the
term.
More generally, the term polygraph refers to an instrument
that has far wider uses than lie detection, uses that are also far
less mysterious. The polygraph monitors a number of subtle
physiological changes in functions (such as heart rate, skin
resistance, and blood pressure) that the body constantly undergoes
when responding to internal needs and external demands from
the environment. These small changes are influenced by the
autonomic nervous system; they are not under voluntary control,
and we are not generally aware of them. The changes in
autonomic functions are detected by amplifying and recording
the functions on a multichannel instrument, the polygraph. It is
so called because it has many (poly) pens, with each pen dedicated
to recording (or graphing) a specific physiological function being
measured.
Because the topic under discussion has a number of other
terminological difficulties, we shall begin our treatment with a
section that considers some of the critical terms and the relations
between them.
addressed to John Furedy at the Department of Psychology, University of
Toronto, Toronto, Ontario, M5S 1A1, Canada.
Furedy, Heslegrave / LIE DETECTOR VALIDITY
TERMS OF REFERENCE
The expression lie detector will be used here to refer to the
methods used to establish guilt or innocence by the polygraphic
profession as practiced by members of the APA. This restriction
is important because there are, of course, many other logically
possible ways of attempting to differentiate guilt from innocence.
We focus on the APA type of polygraphic methods both because
these methods are primarily in use in Western democratic
societies and because it is these methods that have evolved out of
the science of psychophysiology.
The term validity requires a little elaboration. In the psychological testing literature (the polygraph being, essentially, a psychological test—see Lykken, 1981), reliability is regarded as
necessary, though not sufficient, for validity. A test is reliable
when repeated administrations of the test to the same individual
(assumed to be unchanged) from one occasion to another yield
comparable results. For the test to be valid, it must also measure
what it purports to measure. So, to take a simple example from
physics, a faulty thermometer that has constant measurement
error (say, 10 degrees) due to a manufacturing flaw, may be
perfectly reliable, but quite invalid or inaccurate; it will reliably
show temperatures 10 degrees different from the actual
temperature.
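
The thermometer example can be put in numbers. The following is a minimal sketch, not part of the original article: the true temperature, the 10-degree offset, and the function name `summarize` are all illustrative assumptions. It treats the spread of repeated readings as a stand-in for reliability and the average offset from the true value as a stand-in for validity.

```python
from statistics import mean, pstdev

def summarize(readings, true_value):
    """Return (spread, bias) for repeated readings of one quantity.

    spread: population standard deviation of the readings (low = reliable)
    bias:   mean reading minus the true value (zero = valid/accurate)
    """
    return pstdev(readings), mean(readings) - true_value

# Hypothetical values, chosen only to mirror the example in the text.
TRUE_TEMP = 20.0
OFFSET = 10.0  # assumed constant manufacturing error

# Five repeated readings from the faulty thermometer.
faulty_readings = [TRUE_TEMP + OFFSET for _ in range(5)]
spread, bias = summarize(faulty_readings, TRUE_TEMP)

print(f"spread = {spread}, bias = {bias}")
```

The readings agree with one another perfectly (no spread), yet every one of them is wrong by the same amount: reliability without validity.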
The modern instruments (polygraphs) available to most professional polygraphers are reliable in the sense of providing an
adequate record of the actual physiological functions. From the
point of view of reliability, therefore, there is no great problem
with these instruments. Rather, it is their validity or accuracy that
is at issue, and we shall refer to accuracy rather than validity,
because this more clearly allows us to discuss the possible errors
in classification, and, more important, to make a critical distinction between two sorts of errors.
When polygraphers are asked to give a single percentage figure
on the "accuracy" of polygraphy, the misconception is promoted
that either the errors are of a single type, or that differences
between error types are unimportant. On the contrary, it is critical
to distinguish between at least two major sorts of errors: false
negative and false positive. The distinction between these error
classes was strongly emphasized by the modern approach to
psychophysical problems (Tanner & Swets, 1954): the theory of
signal detection (TSD). Although the status of TSD and its
treatment of the concept of psychophysical threshold is still a
matter of controversy, the distinction between the two sorts of
errors is a widely accepted and sound one. In the context of
detecting deception, false negative errors are instances where the
deceptive suspect is declared to be truthful. There is a tendency to
assume that any shortfall from perfection of the polygraph is
represented only by errors in favor of the accused, that is, of
letting the guilty go free, which is consistent with the intention of
the justice system. However, false positive errors also occur where
the truthful are wrongly judged to be deceptive. These errors lead
to the innocent being falsely accused, and, from a judicial point of
view, are more serious than the false negative errors. Moreover,
the severity of this problem is magnified by evidence that in
polygraphy, false positive errors may be more likely than false
negative errors (Horvath, 1977). Another complicating aspect in
polygraphy is that in addition to the truthful and deceptive
decisions, there is also a third sort of decision that is made when
the examiner is unwilling to assign the examinee to the truthful or
deceptive categories. This third type of decision—the "inconclusive" result—would appear to be an error from a scientific point
of view, because neither guilt nor innocence has been detected.
However, from a legal point of view, the outcome is consistent
with the intention of justice, according to which the accused
cannot be considered guilty regardless of the "true" nature of this
circumstance.
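
To see why a single "accuracy" percentage can mislead, the three kinds of outcome can be tallied separately. The sketch below is illustrative only: the counts are invented for the example and do not come from any study cited here, and the names `Outcomes` and `error_rates` are assumptions of this sketch.

```python
from dataclasses import dataclass

@dataclass
class Outcomes:
    """Examiner decisions, split by the examinee's actual status."""
    guilty_called_deceptive: int    # correct detection
    guilty_called_truthful: int     # false negative
    guilty_inconclusive: int
    innocent_called_truthful: int   # correct clearance
    innocent_called_deceptive: int  # false positive
    innocent_inconclusive: int

def error_rates(o: Outcomes) -> dict:
    guilty = (o.guilty_called_deceptive + o.guilty_called_truthful
              + o.guilty_inconclusive)
    innocent = (o.innocent_called_truthful + o.innocent_called_deceptive
                + o.innocent_inconclusive)
    inconclusive = o.guilty_inconclusive + o.innocent_inconclusive
    decided = guilty + innocent - inconclusive
    correct = o.guilty_called_deceptive + o.innocent_called_truthful
    return {
        "false_negative_rate": o.guilty_called_truthful / guilty,
        "false_positive_rate": o.innocent_called_deceptive / innocent,
        "inconclusive_rate": inconclusive / (guilty + innocent),
        # The single figure often quoted: correct decisions among
        # examinations that reached a decision.
        "accuracy_among_decided": correct / decided,
    }

# Hypothetical counts for 50 guilty and 50 innocent examinees.
example = Outcomes(45, 3, 2, 30, 15, 5)
rates = error_rates(example)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the single accuracy figure looks respectable, while the false positive rate is several times the false negative rate, which is exactly the asymmetry the single figure hides.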
It is also relevant to consider the qualitative implications of the
distinction between false negative and false positive errors. These
implications concern the subjective value that is placed on the two
sorts of errors, which is determined both by the circumstances
and by the ethical views of the person evaluating those circumstances. As an illustration, contrast a habitual thief with a record
of previous thefts with a person without any previous convictions
who is in a position of responsibility and authority (e.g., a bank
manager). Most would argue that false positive errors are less
serious than false negative errors in the case of the habitual thief. That
is, even if the thief were to be wrongly accused inasmuch as he or
she had not committed that particular theft, he or she probably
committed other similar crimes without being caught. In contrast,
the false negative error of not catching the habitual criminal for
an actual crime is a more serious problem, especially from the
point of view of future victims. Just the reverse holds for the
"solid-citizen" case, where an incorrect accusation (even if later
proved wrong) can have a dramatically deleterious effect on
the life of the unfortunate innocent accused.
Not only the circumstances but also the views of the evaluator
are important. Hence while most evaluators would agree that
there is a difference between the habitual-criminal and respectable-executive circumstances, the size of this difference is very
much in the eye of the beholding evaluator. An evaluator who is
for "law and order" would probably argue that false positives in
the case of the criminal are almost completely unimportant,
rather than being merely less important than in the case of the
respectable executive. An evaluator whose values lay a greater
emphasis on civil liberties would, on the other hand, argue that
false positives should never be viewed as unimportant even in the
case of habitual criminals, because such a position is contrary to
the principle that all persons are entitled to a fair hearing. In
addition, it is also likely that polygraphers tend to lean, in general,
toward undervaluing the importance of false positive errors, since
polygraph results represent only one source of evidence contributing to the eventual outcome of the accusation. As detailed
below, a recent Israeli experiment showed that in situations
where the risk of falsely classifying innocent subjects was high,
professional polygraphers were very ready to accept such false
positive errors as long as they thought that their rate of detecting
the guilty was also high (Ben-Shakhar, Lieblich, & Bar-Hillel,
1982). This discussion makes clear why statements of the form,
"polygraphy is X% accurate," are too simplistic, masking as they
do a number of technical as well as ethical complexities.
It is also important to recognize the role played by
the attitudes of evaluators toward accepting errors, and weighing
these errors depending upon the circumstances. Because the
circumstances are known by the evaluator prior to the administration of the polygraph, and because the test is not standardized, it
is likely that not only will the outcome be judged on the basis of
examinee circumstance and examiner attitude, but also the
administration of the test will be shaped by these prejudices.
Because the test is psychological in the sense of involving a
complex interview-like interaction between examiner and examinee, any biases in designing and administering the test are
likely to produce outcomes that are consistent with those biases.
So different individuals accused of different crimes may be given
quite different tests, even though all those tests are called by a
single name—a polygraph test. Indeed, the term test itself is
potentially misleading, because it suggests relatively standardized
instruments such as IQ tests that, though controversial, give
essentially the same results across competent operators.
Although there is much dispute about the accuracy of polygraphy, there is agreement that this profession offers a technical
tool rather than a novel scientific area of investigation. All
technical tools are founded on basic scientific fields. Because the
central technological claim of polygraphy is that it uses physiological measures to detect deception, the scientific field that is most
cognate to polygraphy is psychophysiology. Psychophysiology
may be defined as that branch of psychology that uses unobtrusively measured physiological changes (of which we are not
normally aware) to study psychological processes (for elaboration, see, e.g., Furedy, 1983). Because deception is a psychological
process, at least that part of polygraphy that uses the physiological
measures to detect deception is appropriately conceived as a
technique that attempts to apply the principles of psychophysiology.
Of course the polygraphic situation has other components to
be described in greater detail below. These components include
the pretest interview (where the examiner tries to establish the
content for the test), and the confession-inducing function of the
polygraph. Concerning the first component, scientific information
from the fields of psychological testing and clinical psychology
appears germane, while the field of social psychology is relevant to
the second component. Accordingly, the scientific underpinnings
of the profession of polygraphy are varied and complicated, so
that there is unlikely to be a simple answer to the question of
polygraphy's scientific status.
THE CONTEMPORARY PRACTICE OF POLYGRAPHY
The prescientific and modern origins of polygraphy have been
summarized elsewhere (Furedy, 1986; for a more detailed account,
see Heslegrave, 1981). In this article we shall move directly to
describe the current version.
DESCRIPTION OF THE CURRENT VERSION
OF THE POLYGRAPHIC EXAMINATION
The polygraphic examination as practiced by the APA is a
process that lasts from one to one-and-a-half hours. It has two
phases: the initial interview and the polygraphic test. The first
phase, lasting between 30 and 60 minutes, is used to establish
rapport with the interviewee, to work out the exact form of the
questions to be asked, to convince the interviewee (sometimes by
way of "demonstration") of the infallibility of the polygraph, and
to ensure that the polygraph is in working order in the sense that
the physiological changes are being clearly recorded. During this
time, the polygrapher seeks to present a very "professional"
image.
Nowadays polygraphic equipment is sufficiently standardized
that four functions are recorded. Probably the most sensitive
function is one commonly referred to as the "GSR" or the
"galvano" channel by polygraphers. It is the short-term change in
skin resistance (or conductance) that is elicited by most stimuli
one or two seconds following the stimulus, and lasting two or
three seconds. This is the "galvanic skin response," hence the
abbreviation "GSR." Another channel, known as the "cardio"
function by polygraphers, is the approximate mean of the systolic
and diastolic blood pressures. This is recorded by a pressure cuff
on the arm, but by settling for mean pressure, the polygrapher is
able to get a continuous, beat-by-beat estimate of relative (up
versus down) changes in pressure. To obtain this "cardio"
measure, the cuff has to be kept continuously inflated between the
systolic and diastolic levels. Although not painful, this procedure
certainly detracts to a significant extent from the ideal psychophysiological measure, which should be completely unobtrusive
(see Furedy, 1983). A third function is respiration, in which
relatively small changes in amplitude and frequency are under at
least partial autonomic control. Because of the complexity of the
respiratory waveform, and because autonomic control is only
partial, this function has not been extensively studied by psychophysiologists. Polygraphers, however, make significant use of
this channel, and when only three functions are recorded, it is
usual to have two respiratory channels (obtained by having two
recording couplers around the thoracic and abdominal areas on
the torso). When a fourth channel records an additional function,
this has been changes in the peripheral vasculature, as represented
by the blood flow in the tip of the index finger. The vasomotor
response (VMR) behaves rather similarly to the GSR, inasmuch
as it is activated as a short-term change by any change in
stimulation, and it increases as a function of stimulus intensity.
Except for respiration, the measured changes are under
autonomic rather than voluntary control, and the subject is not
normally aware of them. Even in the case of respiration, given
that the changes being measured are quite small, and that the
interviewee is told not to make any gross changes in breathing, the
autonomic, nonvoluntarily controlled description is reasonably
accurate. However, another reason for this channel is to discover
noncompliance with instructions, which presumably may be used
as another index of guilt or deception, with deceptive suspects
being less compliant.
The polygraphic test proper is based on the rationale that
questions ("stimuli") that are relevant will elicit greater physiological responses in the guilty than in the innocent. However, because
individual differences in physiological responsivity are great, the
comparison needs to be made within subjects. That is, one must
compare the responses to relevant questions to those of other
questions for each person, rather than contrasting the
responses of different persons. Early polygraphers simply
compared crime-relevant questions (e.g., Did you kill X?) to
irrelevant ones (e.g., Were you born in year Y?), but this poses the
obvious problem that innocent subjects may respond more to the
relevant question simply because it is more emotionally arousing or
anxiety provoking than the irrelevant question. The current
polygraphic solution to this problem is to use "control" questions.
These control questions are made up in consultation with the
interviewee during the first phase. They are designed to elicit at
least as much emotion as the relevant questions do for the
innocent. For example, the polygrapher may establish during the
first phase that the interviewee has stolen something on a
previous occasion. Then the polygrapher asks the interviewee to
answer "no" (i.e., lie) to the following control question: "Apart
from the present problem, did you ever steal anything in your
life?" The deception on this question by an innocent subject is
presumed to produce a larger response than the (truthful) answer
by him or her to the relevant question.
In this modern "control-question" approach, the interviewee
can be judged to be deceptive if responses to the relevant
questions exceed those to the control questions by a reasonable
margin. It is important to recognize that, as emphasized by
Lykken (1981), the term control here is not used in the scientific
sense. In the scientific sense of the term, assuming the phenomenon
to be investigated is deception, the control question should be
identical in every respect to the relevant question except for the
presence of deception in the latter question; in terms of the logic
of scientific experimentation, the relevant question is like the
experimental condition.2 The scientific sense of "control" is not
properly applicable, in our opinion, to the polygraphic field
situation, but only to a certain laboratory paradigm designed
specifically to study the psychological process of deception
(Hemsley, 1977; Heslegrave, 1981). Nevertheless, even though the
polygraphic control-question technique (CQT) provides no control in the scientific sense of the term, it may still prove an
accurate way of detecting guilt, by attempting to control the
differences in the emotional content or the psychological significance between the relevant and control questions.
In the modern CQT examination, there are about ten questions.
The form and content of all of these questions are determined in
the interview phase in conjunction with the interviewee. About
three each of the questions are relevant, control, and irrelevant. In
a single "test," the questions are presented about 15-30 seconds
apart, and they are formulated so that the subject can answer yes
or no to each question. After three of these "test" series, there is a
pause. During the pause the polygrapher usually leaves the
interrogation room to interpret the tests. If a decision cannot be
made by the polygrapher, additional tests may be administered.
During these tests, but most commonly following the pause
after the third test, the confession-inducing function of the
polygraph is brought into play. By this point the examiner has
had a lot of information on which to "judge" the examinee. The
available information includes not only the physiological responses, but possibly the past criminal records, the present
behavior of the subject during the interview, and the current facts
about the case. Even if the examiner is not sure that the examinee
is lying, that doubt is resoluble (according to professional
polygraphers) by pressing the examinee into a confession of guilt.
Usually this involves asserting that the "machine" indicates (to
the examiner) that the subject has been lying to the relevant
questions. According to anecdotal evidence of polygraphers,
about 30% of the subjects break down at this point and admit
their guilt. Polygraphers argue that this confession-inducing
function is really a part of polygraphy's detection function.
However, that argument rests on the very debatable assumption
that all such induced confessions are true. We shall consider this
problem in greater detail below.
An important aspect of the polygraphic examination is the
interpretation of the physiological responses. Currently, there are
two basic methods of scoring: subjective and objective. The
subjective method, which used to be the only method used by
most polygraphers (whose training is often limited to 7-8 weeks of
course work to cover physiology, psychology, psychophysiology,
and the polygraphic techniques, plus a 6-month internship),
consists of simply inspecting the shape of the responses and
deciding, in a general sort of way, whether the person has been
deceptive. This qualitative scoring method rests on the assumption
that there is a unique pattern of physiological responding
associated with lying, namely, the "specific lie response." This
notion used to be popular with polygraphers, but has no
evidential basis in the science of psychophysiology. Nevertheless,
perhaps because individual physiological recordings are intuitively
unique and striking, the notion persists among some professional
polygraphers that such qualitative scoring is accurate.
The "objective," quantitative method of scoring has more
recently been adopted as the preferred method, especially since
polygraphers with sound psychophysiological research credentials
have documented its utility. The rationale for this method—
originated by Backster (1962) and improved on by Raskin
(1976)—does not require the assumption of a specific lie response,
but only that, in terms of some specifiable quantity, the physiological responses to the relevant questions exceed those to the control
questions. The method itself classifies the differences between
pairs of relevant and control questions in each response channel
as a function of magnitude ranging from +3 to -3. The algebraic
value of these numbers is positive to the extent that the control
response exceeds the relevant response. For example, the algebraic
sum of these scores over 3 response channels, 3 question pairs,
and 3 tests determines how the subject is classified. In the
example, the range of scores is from +81 to -81, and cutoff points
are as follows: truthful (+6 or more), deceptive (-6 or less), and
"inconclusive" (between +5 and -5). Strictly speaking, the last
classification refers really to the outcome of the polygraphic
examination rather than to the interviewee, because it indicates
merely that the test score does not permit the polygrapher to come
to a decision concerning whether the interviewee is truthful or
deceptive. Polygraphers, therefore, claim that the inconclusive
category is not a real "decision," and that only a binary,
truthful/deceptive decision is involved in undergoing a polygraphic test. However, the inconclusive outcome is clearly a third
category that is applied to the examinee by both the polygrapher
and the polygrapher's clients, and it is also obvious that the
outcome affects the examinee differently from both a truthful and
a deceptive outcome. One such differential consequence for the
examinee of an inconclusive outcome is that the polygrapher may
decide to give one or two additional tests, or even another
complete examination on another occasion.
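
The arithmetic of the quantitative method can be sketched as follows. This is an illustration of the summing and cutoff rules described above, not the scoring software of any polygrapher; the function name `classify` and the example score lists are assumptions made for the sketch. Each entry is one relevant/control comparison scored from -3 to +3, with positive values meaning the control response exceeded the relevant response.

```python
def classify(scores, cutoff=6):
    """Sum per-comparison scores and apply the cutoffs from the text.

    scores: iterable of integers in -3..+3, one per
            (channel, question pair, test) comparison.
    Returns (total, label), where label is "truthful", "deceptive",
    or "inconclusive".
    """
    scores = list(scores)
    if any(not -3 <= s <= 3 for s in scores):
        raise ValueError("each score must lie between -3 and +3")
    total = sum(scores)
    if total >= cutoff:
        return total, "truthful"
    if total <= -cutoff:
        return total, "deceptive"
    return total, "inconclusive"

# 3 channels x 3 question pairs x 3 tests = 27 comparisons,
# so totals can range from -81 to +81.
N_COMPARISONS = 3 * 3 * 3

print(classify([3] * N_COMPARISONS))          # every comparison at the maximum
print(classify([1] * 5 + [0] * 22))           # a total of +5 falls short of +6
print(classify([-1] * 6 + [0] * 21))          # a total of -6 reaches the cutoff
```

Note that the sketch takes the individual -3 to +3 scores as given; how those scores are assigned in the first place is the step that remains a matter of examiner judgment.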
It is important to recognize that, compared to the scoring
methods used in the science of psychophysiology, the "objectivity"
of the above polygraphic scoring method is severely limited, and
has been labeled by its originators (Barland & Raskin, 1975) as
"semi-objective." One basic problem is that the score (ranging
from +3 to -3) is arrived at by subjective and qualitative means.
Another problem is that the setting of the cutoff points (±6) for
inconclusives is arbitrary. On the other hand, polygraphers point
out that their task is more difficult than that of the psychophysiologist, who can average over many subjects and draw statistical
inferences concerning whether or not there is a significant
difference between two conditions. The polygrapher is required
to make a decision concerning a specific individual.
Another arbitrary aspect of the polygraphic cutoff criteria is
that there is no allowance for number of channels and number of
tests. While it is true that the first number is usually three or four,
and that the second number varies, in fact, usually between three
and five, this still remains a problem at least in principle. This is so
because, mathematically, the chances of scoring an examination
inconclusive decrease as a function of the sum of the number of
channels used and tests administered. These chances, indeed,
asymptote toward zero as that sum approaches infinity. However,
it would be possible for the polygraphers to counter that the same
sort of difficulty holds for traditional group significance testing in
experimental psychophysiology, where the chances of finding a
"significant" difference become near-perfect as the sample size
becomes very large. It is because of this that a statistically
significant difference between two groups on an IQ test is often
considered to be psychologically insignificant if the difference is
small and the group samples are large (e.g., in the hundreds).3
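
The point about fixed cutoffs can be illustrated with a toy calculation. The sketch below rests on a strong simplifying assumption that is not in the original text: each scored comparison is treated as an independent draw, equally likely to be any integer from -3 to +3. Real polygraph scores are neither independent nor uniform, so the numbers are only meant to show the direction of the effect. The function names are likewise hypothetical.

```python
from fractions import Fraction

def total_distribution(n_comparisons):
    """Exact distribution of the summed score under the toy assumption
    that each comparison is uniform on -3..+3 and independent."""
    cell = {s: Fraction(1, 7) for s in range(-3, 4)}
    dist = {0: Fraction(1)}
    for _ in range(n_comparisons):
        nxt = {}
        for total, p in dist.items():
            for s, q in cell.items():
                nxt[total + s] = nxt.get(total + s, Fraction(0)) + p * q
        dist = nxt
    return dist

def p_inconclusive(n_comparisons, cutoff=6):
    """Probability that the total falls strictly between -cutoff and +cutoff."""
    dist = total_distribution(n_comparisons)
    return sum(p for total, p in dist.items() if abs(total) < cutoff)

# More channels and more tests mean more scored comparisons in the total.
for n in (9, 27, 45):
    print(n, float(p_inconclusive(n)))
```

Under this assumption the probability of landing in the fixed inconclusive band shrinks as more comparisons are summed, which is the sense in which a cutoff that ignores the number of channels and tests is arbitrary.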
Accordingly, we suggest that it is only the basic scoring method
rather than the cutoff criteria that suffers uniquely from subjectivity. That problem, however, is exacerbated by the fact that it is
polygraphic practice to have the records scored by the examiner,
rather than being "blindly" scored by an individual who has
access only to the physiological records themselves. Even when
measurement is fully objective, errors from bias can creep in. That
is a principle that holds not only in the biological sciences but also
in such "hard" sciences as astronomy. However, when judgment
is required of the sort involved in deciding to characterize a
relevant question as "clearly" (2) versus "slightly" (1) greater than
its paired control, it is obvious that the biases of the observer can
significantly affect the numerical values assigned. In this regard,
professional polygraphers are loath to give up their privilege of
scoring their own records, if only because they need the information on the spot. Yet in terms of objectivity or lack of bias, it
would seem that this problem is a considerable one for the
apparently "objective" mode of polygraphic scoring. Despite this
problem, field scoring is almost exclusively done by the examiner,
and it is only in research that blind-scoring studies have been
undertaken (e.g., Horvath, 1977). One potential amelioration of
this problem is represented by the efforts of Raskin and his
colleagues (e.g., Kircher & Raskin, 1983) to provide computer
scoring of tests. This kind of objective scoring may result in
superior decisions by individual polygraphers, although it must
be stressed that no degree of sophistication on the measurement
side can overcome the other problems such as those discussed
above with regard to the scientific shortcomings of the so-called
control-question technique.
Before concluding this description, two variants of traditional
lie detection should be mentioned. These variants are not part of
the current professional polygrapher's package, but they are
intended for the same purpose—detection of the guilty. The first
variant is a psychophysiologically based method known as the
Guilty Knowledge Technique (GKT). The GKT was introduced
by Lykken (1959) as an alternative to the standard CQT used by
professional polygraphers. The rationale is to focus on the guilty
knowledge (i.e., significance of the question) rather than the
guilty person (i.e., emotional content of the question), and
therefore to ask questions to which only the guilty can know the
truthful answer. Both the rationale and the accuracy of the GKT
are superior to those of the CQT (see Bradley & Janisse, 1981;
Lykken, 1981; Podlesny & Raskin, 1977), although from a purely
psychophysiological perspective it is important to note that it is
probably not the process of deception but rather an orienting
process that is being detected (Ben-Shakhar, 1977; Furedy, 1986;
Heslegrave, 1981). However, the evidence favoring the GKT is
based only on laboratory studies, and this fact is no accident. For
its implementation, the GKT requires that the details of the crime
to be used in the polygraph test be kept secret, in order that the
guilty knowledge be truly unique to the perpetrator of the crime.
Because this requirement runs counter to normal police procedures, as well as being quite difficult to implement, the GKT is
seldom used by current polygraphers. However, it is clear that if
police and polygraphic methods are modified, the GKT is a
promising future method for the accurate detection of the guilty
when a specific crime has been committed (see also Lykken, 1981,
for a hypothetical case).
The second variant is the Psychological Stress Evaluator
(PSE). This seeks to make use of the fact that there are small
changes in voice inflection that a speaker is unaware of, and that
respond to stress by changes in the microtremors found in the
larynx (Lippold, 1971). These changes can be amplified and
displayed either on an oscilloscope or a graphic recorder. As a
response to stress, or to the emotional content of the question,
these changes can be taken as an indication of lying in the same
way as other measures in the CQT polygraph test. The PSE is very
convenient to use. Not only is it unnecessary to affix electrodes to
the body, but the PSE can even be used on tapes or telephone
conversations. Fortunately for those who would regret the
decline of personal freedom if all our conversations could be so
monitored for their truthfulness, research has shown the PSE to
be no better than chance (Brenner, Branscomb, & Schwartz,
1979; Horvath, 1978; Lynch & Henry, 1979; Nachshon &
Feldman, 1980). In this regard, the PSE has to be sharply
distinguished from modern CQT polygraphy, for even polygraphy's severest critics (e.g., Lykken, 1981) do not argue that
polygraphy's accuracy is not better than chance, and the APA
itself has officially rejected the use of the PSE (Horvath, 1982).
VALIDITY
In this section we shall briefly review some of the literature
pertinent to the accuracy and validity of current detection of
deception techniques. The literature on accuracy is vast, and there
is no pretension of exhaustiveness in this brief review. This review
will cover several of the more important issues related to
determining the validity of detection of deception techniques.
Our intention is to provide the reader with information that will
facilitate a critical evaluation of accuracy claims.
One necessary prolegomenon, however, is that the accuracy
estimates of polygraphic detection of deception are dependent
upon a great many factors. The skill level of the examiner, the
psychological state of the subject, the scoring procedures, the
questioning techniques, and the particular physiological variables
measured are but a few of the variables that must be taken into
account when one is attempting to determine the accuracy or
validity of the procedure. As an overriding caveat, it should be
clear that most studies have provided insufficient control over the
many factors that can influence the accuracy of detection of
deception procedures. Accordingly, the validity of these procedures remains an unresolved issue and estimates of accuracy
range from chance to perfection.
IS DECEPTION ACTUALLY DETECTED?
Polygraphy has come to be known as the "detection of
deception," but this still leaves open the question of whether, in
fact, it is deception that is being detected by the procedure. We
shall focus on variants of the control question technique (CQT) in
this regard, although we shall also consider the guilty knowledge
technique (GKT) at the end of this subsection.
In the CQT, the control questions are paired with the relevant
questions by being temporally adjacent in the question series; this
temporal proximity minimizes differential habituation effects.
The control questions are designed in a pretest interview and deal
with similar circumstances to those covered by the relevant
questions "so that the subject is very likely to be deceptive to them
or very concerned about them" (Podlesny & Raskin, 1977, p.
786). Although there is some dispute over the exact theoretical
formulation underlying the CQT (Lykken, 1978, 1979, 1981;
Raskin, 1978; Raskin & Podlesny, 1979), in general the theory is
that guilty subjects will be deceptive to relevant questions and
show stronger autonomic responding to the relevant than the
control questions. In contrast, the control question is meant to be
"a stronger stimulus for the innocent subject because he knows he
is truthful to the relevant questions; he has been led to believe that
the control questions are also very important in assessing his
veracity . . . and he is either deceptive in his answers, very
concerned about his answers, or unsure of his truthfulness
because of the vagueness of the questions and problems in
recalling the events" (Raskin & Podlesny, 1979, p. 54).
Although a number of practical and theoretical problems with
the CQT have been identified by Lykken (1974, 1978, 1979), the
main point is that the procedure does not provide an adequate
scientific control for detecting deception, because it is impossible
to estimate what the relevant response would have been if the
answer to the relevant question had been honest. Indeed, Raskin
and Podlesny (1979) have argued that the control question is not
meant as a scientific control for deception. Rather, it is meant as a
stronger emotional stimulus than the relevant question for
innocent subjects. Therefore, it is meant as an "emotional
standard" (Barland & Raskin, 1973, p. 43) designed to enhance
the innocent subject's responses to control questions. In fact, in
Raskin and Podlesny's terms (1979), quoted above, the control
questions are meant to be of great concern to all subjects (since
guilty and innocent subjects cannot be discriminated beforehand).
The users of the CQT, then, do not employ the technique to
detect deception per se, but rather employ it to detect the guilty.
One can, indeed, generate the seemingly paradoxical conclusion
that deception could be detected only in those innocent subjects
who give larger responses to the control than to the relevant
questions. In this case, innocent individuals would be those who,
being asked to be deceptive to the control questions and being
truthful (and hence innocent) to the relevant questions, would be
classified by the CQT user (the polygrapher) as "nondeceptive,"
whereas they were actually deceptive (as instructed) with respect
to the control question.
Lykken's (1959, 1960) Guilty Knowledge Technique (GKT)
does not suffer from the scientific problems of control methodology that weaken the CQT. This is because the GKT does provide a
control comparison that is a reasonable estimate of the subjects'
responses to relevant questions if they were being honest to those
questions. However, the GKT is also not designed to detect
deception per se. The rationale, rather, is that because of the
guilty person's special knowledge about some crime-related issue,
the response to a question about that issue will exceed that of a
person who does not possess any such guilty knowledge, because
the relevant knowledge would be more salient to the guilty.4 So
even if the GKT were to be commonly used in the field, which it is
not, it would still not serve to detect deception per se, despite the
fact that the term detection of deception has been accepted into
current usage. However, the fact that deception is not being
detected highlights the problems with this technique. If an
innocent suspect has crime-relevant, but not genuinely guilty,
knowledge that has been acquired in ways not associated with the
commission of the crime (e.g., if the police who were at the scene
of the crime inadvertently release information that is later used
to construct a relevant question in the GKT), then he or she may
be misclassified as guilty.5
LABORATORY VERSUS FIELD VALIDITY
Even if current techniques do not detect deception per se, it is
still possible that they do detect the guilty, and differentiate them
from those who did not commit specific criminal acts. However, it
is very difficult to get a precise estimate of the accuracy of
polygraphy. Polygraphers themselves who write in polygraphy
journals are, not surprisingly, very sanguine about the level of
their profession's accuracy. Recently, Ansley (1983) has provided
a review in which he reports an accuracy of 96.3% for field cases.
However, this review has failed to cite a number of studies that
reported quite low accuracy rates, and also includes reference to
many studies that lacked any semblance of scientific control.
However, the overall accuracy issue appears confused even
when only scientifically respectable studies are considered. Whereas Lykken (1978, 1979) estimates accuracy to be only slightly
above chance (i.e., 64%-71%), Raskin and his associates (Podlesny
& Raskin, 1978; Raskin, 1978), reviewing the same body of
literature, put the figure as near perfect, that is, 90%-95%. In what
follows, we consider some of the complex factors that are
responsible for this great variation in estimates between two
respected members of the scientific psychophysiological
community.
One of the most significant factors that affect conclusions
concerning the validity of the CQT is whether the accuracy is
determined in laboratory and mock-event studies, or in actual
field investigations. There appears to be a consensus that the
differences between the laboratory and field situations are
sufficiently great that laboratory results should not be generalized
to the validity of these procedures in field settings. Most would
agree that the subjects undergoing real-life polygraphic interviews
would be considerably more aroused and concerned than those
subjects involved in laboratory experiments. In addition, in field
situations, subjects would vary in many ways: subjects would
view the examination, examiner, and the purpose of the test
differently; they would probably be more heterogeneous and vary
from laboratory subjects on such factors as age, background,
intelligence, and personality; the events preceding the examination
would differ as well as the time period between the critical event
and the examination; in the field, the guilty subjects would be
more motivated to deceive; and the anxiety or stress levels of
guilty and innocent subjects may differ more extensively in the
field than in the lab.
It should also be noted that although a number of factors have
been listed that may differ from the laboratory to the field
situation, the direction of these effects has not been specified.
For example, although we can probably assume that subjects
undergoing an actual criminal investigation would be more
aroused and stressed, we cannot assume that guilty subjects
would therefore be either more or less detectable. On the one
hand, the additional arousal could lead the guilty to respond
more strongly, but it is also possible that the innocent would come
to be anxious about the relevant question and hence show greater
responses to it. Indeed, it is not unreasonable to suppose that the
move from lab to field increases and decreases, respectively, the
significance of the relevant and control questions. Because CQT
polygraphy's rationale depends crucially on the relevant-control
comparison, the above supposition would in itself be sufficient to
produce a reversal of direction of effects as one moved from the
lab to the field.
More generally, any views on the effects of these factors on
polygraphy's accuracy are no more than guesses. There is no
adequate research that allows one to even begin to estimate these
differences between laboratory and field situations in order for
laboratory results to be validly generalized to field settings.
Finally, we must be clear that field validation reports cannot
include studies such as card tests on criminal suspects (e.g.,
Kugelmass, Lieblich, Ben-Ishai, Opatowski, & Kaplan, 1968)
or mock crime investigations on convicted psychopaths (Raskin
& Hare, 1978).
Accurate estimates of the validity of polygraphy, then, can be
based only on field examinations of actual criminal cases.
However, this restriction is not sufficient: The outcomes of the
polygraph examinations must also be verified against some
criterion of "ground truth," which has usually taken the form of
judicial outcome or confession. If these necessary restrictions are
accepted, then there are only a handful of reports that are relevant
as evidence on this issue.
Lykken (1974) cited Bersh (1969) as the only adequate study of
validity in the area. In that study Bersh obtained 323 criminal
investigations conducted by the military, and had a four-member
panel of lawyers reach a decision (disregarding legal technicalities)
concerning the guilt or innocence of the accused. After discarding
80 cases on the grounds of insufficient evidence, the panel
produced unanimous, majority, and split decisions, respectively,
for 157, 59, and 27 cases. Using the judicial decision as the
criterion, the polygrapher's decision was correct in 92% of the
unanimous cases, but only in 75% of the majority cases. While
these statistics seem to provide impressive levels of validity, there
are several key problems to consider.
The first problem is that the adequacy of the criterion "ground
truth" measure is questionable. It cannot be assumed that all
judicial decisions were correct, because there is no way of
independently estimating the "ground truth." It may even be
possible that the 8% error in unanimous cases occurred through
mistakes made by the panel rather than by the polygrapher! The
second problem is that, as Lykken (1974) has pointed out, the
polygrapher's decision was made on the basis of the facts
concerning the case as well as clinical impressions of the subject
under investigation at the time of examination. In that case, it is
possible to view the study as one of reliability, with the
polygrapher serving as an additional judge. The fact that the
polygrapher obtained more "correct" decisions in the unanimous
than in the majority cases then can be viewed as simply
illustrating that as the agreement among the original four judges
rose, so too did the agreement of the polygrapher-judge with the
panel of original judges. It is true that both Raskin (1978) and
Barland (1982) have correctly indicated that there is no evidence
from Bersh's study to support Lykken's interpretation that the
polygrapher functioned as yet another judge. However, because
the polygrapher followed the usual professional practice of
basing decisions not only on the records but also on other factors,
Lykken's interpretation cannot be ruled out of consideration.
The third problem is that from all the cases on which the
polygrapher made a decision, one-third were discarded through
receiving split decisions from the panel of judges. If we assume
that the polygrapher's accuracy for this one-third of the cases was
no better than chance, then over all the cases selected (323 cases)
the number of correct detections based on the panel decisions and
indecisions (corrected for total cases in each category) would have
been 75% against a chance rate of 50%, a less impressive statistic.
It might also be observed that the polygrapher seemed to be able
to make guilty versus innocent decisions in that one-third of the
cases where, according to the panel, the evidence was not
sufficient to yield a unanimous legal decision. At least to critics of
polygraphy, this suggests that polygraphers are apt to make
decisions in situations where the ground truth is not determinable.
Only if polygraphy is regarded as a sort of magic path to truth
would this possibility be an untroubling one.
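The pooled figure of 75% can be reproduced with a short calculation. The sketch below is illustrative only: it assumes that the "one-third" of cases comprises both the 80 cases discarded for insufficient evidence and the 27 split decisions, and it takes the unanimous count to be whatever remains of the 323 cases (157). The variable and function names are hypothetical.

```python
# Case counts and agreement rates as reported for Bersh (1969) in the text.
TOTAL_CASES = 323
INSUFFICIENT_EVIDENCE = 80
MAJORITY = 59
SPLIT = 27
# Assumption: the unanimous cases are the remainder of the 323 cases.
UNANIMOUS = TOTAL_CASES - INSUFFICIENT_EVIDENCE - MAJORITY - SPLIT

# Cases with no usable criterion: insufficient evidence plus split decisions.
UNDECIDED = INSUFFICIENT_EVIDENCE + SPLIT


def pooled_accuracy(chance_rate=0.5):
    """Expected proportion correct over all 323 cases.

    Unanimous and majority cases use the reported agreement rates
    (92% and 75%); undecided cases are assumed to be at chance.
    """
    expected_correct = (
        0.92 * UNANIMOUS + 0.75 * MAJORITY + chance_rate * UNDECIDED
    )
    return expected_correct / TOTAL_CASES


print(f"unanimous cases: {UNANIMOUS}")
print(f"undecided share: {UNDECIDED / TOTAL_CASES:.2f}")
print(f"pooled accuracy: {pooled_accuracy():.2f}")
```

Under these assumptions the undecided cases make up about one-third of the total, and the pooled accuracy comes out at roughly 75%, in line with the figure given above.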
In a follow-up study, Horvath (1977) examined judgments
from 10 polygraph examiners of law enforcement agencies. An
important methodological advance over the earlier Bersh (1969)
study was that in half the cases the "ground truth" was verified by
confession of the guilty person. Another methodologically advantageous feature was that the polygraphers used only the physiological records for their judgments. This feature is important because
it can be argued (see, e.g., Lykken, 1981) that without such
"blind" record reading a study can at best produce information on
the shared prejudice among polygraphers (i.e., reliability), rather
than on accuracy (i.e., validity). In the 560 judgments made by the
10 examiners for verified cases, correct decisions occurred only
64% of the time. Moreover, there were no differences between
high- and low-experience examiners (greater than versus less than
3 years' experience). Raskin (1978) has argued that this unusually low
accuracy rate may be partly attributable to less experience, poor
training, and the fact that polygraphers usually do not operate
simply on the basis of the records, but have other behavioral
symptoms to consider. However, especially as the physiological
recordings are supposed to be the essence of polygraphy, these
results cannot be viewed as very supportive of the notion that
polygraphy's accuracy is very high.
In another extension of Bersh's work, Barland and Raskin
(1976) reported a study in which Barland administered Control-Question tests to 92 criminal suspects and then had a panel of
experts review the cases with the polygraph evidence removed.
The panel's decisions were the criterion against which the
polygraph decisions would be judged. In only 64 cases could the
panel achieve a majority decision, and of those a clear polygraphic
decision could be achieved by blind scoring of charts by Raskin in
only 51 of the cases. Of those 51 cases, 40 were criterion guilty and
11 were criterion innocent. Raskin scored 86% of the cases
correctly. However, when the guilty and innocent subjects are
considered separately, he scored 98% of the guilty correctly, but
only 45% of the innocent correctly, which yields an average of
71.5% correct classifications, if these sample accuracy rates are
representative of true (population) accuracy. Consideration of
these various statistics indicates that the high degree of accuracy
at detecting guilty subjects must be balanced against excessively
high false positive outcomes. Barland (1982) has noted, however,
that unlike the Bersh (1969) study, the case histories that were
given to his panel were often incomplete, and there were an
unusually large number of cases that were not classified because
of inconclusive evidence. He has raised the possibility that this
incompleteness coupled with the philosophy of American jurisprudence to protect the innocent may have caused bias in favor of
innocent decisions. While this argument may have some merit,
the data suggest that this caused a bias toward generating
inconclusive decisions rather than those of innocence. Those
classed as innocent were probably innocent "beyond a reasonable
doubt." Again, however, the validity of the panel decisions as to
"ground truth" is questionable.
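The gap between the 86% overall figure and the 71.5% average arises because the sample contained far more criterion-guilty than criterion-innocent cases. The following sketch makes the arithmetic explicit. The raw counts of correct decisions (39 of 40 guilty, 5 of 11 innocent) are not given in the text; they are inferred here from the reported percentages and should be read as an assumption, as should the function name.

```python
def accuracy_summary(guilty_correct, guilty_total, innocent_correct, innocent_total):
    """Return overall and per-class accuracy for a two-class criterion."""
    guilty_rate = guilty_correct / guilty_total
    innocent_rate = innocent_correct / innocent_total
    return {
        # Pooled accuracy is dominated by the larger (guilty) group.
        "overall": (guilty_correct + innocent_correct)
        / (guilty_total + innocent_total),
        "guilty": guilty_rate,
        "innocent": innocent_rate,
        # Unweighted mean of the two class accuracies.
        "balanced": (guilty_rate + innocent_rate) / 2,
        # Innocent suspects not classified as innocent.
        "false_positive": 1 - innocent_rate,
    }


# Assumed counts: 39/40 guilty and 5/11 innocent scored correctly.
summary = accuracy_summary(39, 40, 5, 11)
for name, value in summary.items():
    print(f"{name:>14}: {value:.1%}")
```

With these assumed counts the pooled accuracy is about 86%, whereas the unweighted mean of the two class accuracies is about 71.5%, because more than half of the criterion-innocent suspects were not scored as innocent.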
In contrast to these somewhat low accuracy estimates, proponents of polygraphy have argued that several other studies
reporting high accuracies do meet Lykken's "criteria of blind
interpretation of confirmed polygraphy charts from criminal
suspects" (Raskin & Podlesny, 1979, p. 56). These studies
(Horvath & Reid, 1972; Hunter & Ash, 1973; Raskin, 1976;
Slowik & Buckley, 1975; Wicklander & Hunter, 1975) report, on
the average, 90% and 89% accuracy, respectively, for the detection
of guilty and innocent suspects. However, Lykken has claimed
that the charts were not randomly selected, but rather chosen
from the subset that showed "clear" truthful or deceptive
patterns. On the other hand, this charge of nonrandom chart
selection could also be leveled at the (low accuracy) study of
Horvath (1977).
More recently, a study by Ginton, Daie, Elaad, and Ben-Shakhar (1982) appears to provide relevant data on accuracy. An
important advantage of this work is that there was clear and
independent evidence for what the "ground truth" was, and yet
the situation was a real, field situation instead of a simulated,
laboratory arrangement. In their arrangements, subjects did or
did not commit some act; the act was committed freely rather
than being simulated; the act was verifiable; subjects were
concerned about the outcome of the polygraphic examination,
and believed that the experimenter did not know who had and
who had not committed the act, and that the polygrapher did not
know the proportion of guilty people.
In total, 21 Israeli police officers participated in the study as
part of a police course. During part of the course, subjects were
given tests that required written answers. Unbeknownst to the
subjects, the answers were secretly recorded on a hidden chemical
page underneath their exam page. Later the subjects scored their
own test with answer sheets under conditions that physically
allowed alteration of their original test sheets. A few days later all
subjects were told that cheating had occurred on the tests and
were given an opportunity to clear themselves of suspicion by
taking a polygraphy exam. However, it was also made clear that a
negative polygraphic outcome would adversely affect their future
careers in the force. In all, 7 subjects cheated, but of the 21
subjects only 15 took the polygraph exam; 3 confessed, 1 guilty
subject did not show up for the exam, and 2 (1 guilty and 1
innocent) refused to take the exam, leaving only 2 guilty subjects
and 13 innocent subjects. The evaluations of the subjects were
made blindly on the charts alone, on the behavior of the subjects
alone, and on both the charts and behavior (which is the standard
polygraphic practice). In addition, charts were scored both
globally and by the field-numerical-scoring technique proposed
by Barland and Raskin (1975).
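The attrition from 21 subjects to the 15 who were actually tested can be checked with a simple tally. This is only a bookkeeping sketch with hypothetical names; it assumes that all 3 subjects who confessed were among the 7 who cheated, which the text does not state explicitly but which the final count of 2 guilty subjects requires.

```python
# Counts taken from the description of Ginton et al. (1982) above.
TOTAL_SUBJECTS = 21
CHEATERS = 7

# Each entry: (reason, guilty subjects lost, innocent subjects lost).
# Assumption: all 3 subjects who confessed were cheaters.
DROPOUTS = [
    ("confessed", 3, 0),
    ("did not show up", 1, 0),
    ("refused the exam", 1, 1),
]


def remaining_sample():
    """Return (guilty, innocent) counts left after all dropouts."""
    guilty = CHEATERS
    innocent = TOTAL_SUBJECTS - CHEATERS
    for _reason, guilty_lost, innocent_lost in DROPOUTS:
        guilty -= guilty_lost
        innocent -= innocent_lost
    return guilty, innocent


guilty, innocent = remaining_sample()
print(f"tested: {guilty + innocent} (guilty={guilty}, innocent={innocent})")
```

The tally leaves 2 guilty and 13 innocent subjects, which matches the 15 examinations reported.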
The 2 guilty subjects form too small a sample to base any
conclusions on, but suffice it to say that the various methods did
result in misclassifications (innocent or inconclusive). For the 13
innocent subjects, one relevant comparison is that among the
chart-alone, behavior-alone, and chart-and-behavior conditions,
with global chart scoring. Here the respective frequencies of
innocent (correct), guilty, and inconclusive (which would probably
be sufficiently serious as to affect an innocent policeman's career
adversely) categories were as follows: for the chart-alone condition, 7, 3, and 3; for the remaining two conditions, 11, 2, and 0.
Accordingly, with the global method of scoring, addition of the
charts has no effect on correct or incorrect decisions, while
behavioral information (with or without the chart information)
appears to provide more accurate and less ambiguous (fewer
inconclusive) decisions. The behavioral information appears to reduce
the frequency of inconclusives obtained from using the charts
alone.
The other issue of interest is to compare the accuracy of the
(older) global method of chart scoring with the "semi-objective"
numerical system. In the chart-alone (blind) condition, the
respective frequencies of the innocent, guilty, and inconclusive
categories were 7, 3, and 3 (global) and 5, 1, and 7 (numerical).
The corresponding frequencies for the chart-and-behavior condition were 11, 2, and 0 (global) and 6, 1, and 6 (numerical). The
most obvious feature of these results is that the numerical method
produces more inconclusive decisions and fewer "hits" (innocent
classifications) than the global method. This sort of "trade-off"
relation is expected on the basis of signal-detection theory. Of
course the sample size involved in this study is far too small for
any definitive fine-grained conclusions. However, what is clear
from what appears to be the only field study that was both
sufficiently real life and adequately controlled, is that CQT
polygraphy, though better than chance, produces a significant
percentage of wrong decisions.
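The trade-off between the two scoring methods is easier to see when the frequencies for the 13 innocent subjects are converted to rates. The sketch below uses only the counts reported above for the chart-alone and chart-and-behavior conditions; the dictionary layout, the function name, and the "accuracy when decided" measure (correct decisions divided by all conclusive decisions) are illustrative choices, not measures used by Ginton et al.

```python
# (correct, false positive, inconclusive) counts for the innocent subjects.
OUTCOMES = {
    ("chart alone", "global"): (7, 3, 3),
    ("chart alone", "numerical"): (5, 1, 7),
    ("chart and behavior", "global"): (11, 2, 0),
    ("chart and behavior", "numerical"): (6, 1, 6),
}


def rates(counts):
    """Convert raw outcome counts into proportions."""
    correct, false_positive, inconclusive = counts
    total = correct + false_positive + inconclusive
    decided = correct + false_positive
    return {
        "n": total,
        "hit_rate": correct / total,
        "false_positive_rate": false_positive / total,
        "inconclusive_rate": inconclusive / total,
        # Accuracy among conclusive decisions only.
        "accuracy_when_decided": correct / decided,
    }


for (condition, scoring), counts in OUTCOMES.items():
    r = rates(counts)
    print(
        f"{condition:<19} {scoring:<10} "
        f"hits={r['hit_rate']:.0%} "
        f"false positives={r['false_positive_rate']:.0%} "
        f"inconclusive={r['inconclusive_rate']:.0%} "
        f"decided-accuracy={r['accuracy_when_decided']:.0%}"
    )
```

On these small counts, numerical scoring yields fewer hits and more inconclusives than global scoring, but also fewer false positives, which is the pattern one would expect if the numerical method simply applies a stricter decision criterion.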
In summary, the primary scientific authorities on the detection
of deception have different estimates of the validity of the
techniques for field use. Raskin (Raskin & Podlesny, 1979)
believes the accuracy to be approximately 90% in field situations,
while Lykken has repeatedly (e.g., Lykken, 1974) stated that
detection of deception techniques have an accuracy rate of 64% to
71% (against chance rates of 50%). There is a dearth of adequate
research on the subject, and that research, moreover, is complicated and difficult to do. At this stage it appears likely that the
accuracy of polygraphy is somewhere between Raskin's and
Lykken's estimates. That is, the technique is better than chance,
but not so foolproof that one does not need to be critical of the
procedures and the results of those procedures. Moreover, as also
discussed elsewhere in this review, there are special problems
having to do with the gravity of making false positive errors, as
well as other issues that render the evaluation of polygraphy a
complicated and controversial matter.6 In addition, the use of
polygraphy raises other ethico-legal issues in liberal democratic
societies. We have not discussed these here (for a brief account,
see Furedy, 1986), but their existence makes it important to
recognize that conclusions regarding polygraphy are conditioned
not only by the facts but also by the differing values of various
assessors.
NOTES
1. Even this qualified assertion is not meant to imply that the practices and techniques
employed by members of the APA are scientifically based, and therefore reliable and
valid. Rather, this assertion implies only that the measure employed and the interview
techniques used on any occasion are drawn from a finite common pool of measures and
techniques. As will become apparent, different individual situations can result in different
measures and techniques being employed by different examiners, and the specific
interview will undoubtedly vary with different examiners. In other words, the level of
standardization of the polygraph "test" does not approach the level reached by other
psychological tests designed to measure such concepts as IQ, personality, and
temperament.
2. In addition to the obvious fact that control questions vary in more ways from
relevant questions than the presence or absence of deception, there are other unique
difficulties with the rationale of the CQT that actually generate the paradoxical conclusion
that deception is detected in the innocent rather than the guilty. The paradox arises from
the fact that examinees are supposed to be deceptive to the control question. The CQT
involves comparing responses to the relevant and control questions, and "decisions
of truthful or deceptive require substantially larger overall responses to control or
relevant questions, respectively" (Podlesny & Raskin, 1977, p. 786). Accordingly, if
one assumes that it is deception as a psychological process rather than guilt about
the commission of a specific crime that is to be detected (as the claim that polygraphy
involves the "detection of deception" would suggest), then larger autonomic
responses to the control relative to the relevant question would be interpretable as
detecting deception (in the innocent suspect). The fact that it is possible to generate
this paradoxical conclusion indicates that the rationale of the CQT is beset with
complications that are not apparent to one who simply assumes that the "control"
question is simply a scientific control.
3. The argument that increasing the number of measures and subjects in the
sample reduces the chances of finding a no-difference result holds only if the
population difference is not precisely zero. However, that assumption probably
holds for most natural phenomena. Specifically, the difference in IQ between
males and females is probably not precisely zero, but it is also probably not large
enough to be psychologically meaningful, even though it would produce a
statistically significant sample difference with sufficiently large sample sizes.
Similarly, by increasing the number of channels and tests, the polygrapher
increases the chances of a noninconclusive outcome (either deceptive or truthful) under
conditions where the "true" ("population") relevant and control responses do not
differ by a "reasonable margin" (analogous to a "psychologically meaningful"
difference in the IQ example).
4. This is a rather simplified description of the findings concerning the GKT.
For a more thorough discussion, see the work of Ben-Shakhar and his colleagues
(e.g., Ben-Shakhar, 1977; Ben-Shakhar, Lieblich, & Kugelmass, 1975).
5. A recent study by Bradley and Warfield (1984) indicates that the GKT may
work even when the innocent possess crime-relevant knowledge, and this aspect of
the study is certainly stressed in the abstract's "major conclusion that subject may have
crime-relevant information and not be classed, based on detection scores, as guilty"
(Bradley & Warfield, 1984, p. 683). However, in one group of 10 innocent subjects,
the rate of misclassification was 60% (Bradley & Warfield, 1984, p. 687),
suggesting that there are conditions where possession of crime-relevant knowledge
by the innocent certainly does invalidate the GKT.
6. For an exhaustive, recent review of the studies reporting accuracy rates, see
Office of Technology Assessment report (1983).
REFERENCES
Ansley, N. (1983). A compendium of polygraphy validity. Polygraph, 12, 53-61.
Backster, C. (1962). Methods of strengthening our polygraph technique. Police, 6(5),
61-68.
Barland, G. H. (1982). On the accuracy of the polygraph: An evaluative review of
Lykken's tremor in the blood. Polygraph, 11, 215-224.
Barland, G. H., & Raskin, D. C. (1973). Detection of deception. In W. F. Prokasy &
D. C. Raskin (Eds.), Electrodermal activity in psychological
research. New York: Academic Press.
Barland, G. H., & Raskin, D. C. (1975). An evaluation of field techniques in
detection of deception. Psychophysiology, 12, 321-330.
Barland, G. H., & Raskin, D. C. (1976, March). Validity and reliability of polygraph
examinations of criminal suspects (Report No. 76-1, Contract 75-NI-99-0001, U.S. Department of Justice). Salt Lake City: University of Utah.
Ben-Shakhar, G. (1977). A further study on the dichotomization theory in detection
of information. Psychophysiology, 14, 408-413.
Ben-Shakhar, G., Lieblich, I., & Bar-Hillel, M. (1982). An evaluation of
polygraphers' judgments: A review from a decision theoretic perspective.
Journal of Applied Psychology, 67, 701-703.
Ben-Shakhar, G., Lieblich, I., & Kugelmass, S. (1975). Detection of information and
GSR habituation: An attempt to derive detection efficiency from two
habituation curves. Psychophysiology, 12, 283-288.
Bersh, P. J. (1969). A validation of polygraph examiner judgments. Journal of
Applied Psychology, 53, 399-403.
Bradley, M. T., & Janisse, M. P. (1981). Accuracy demonstrations, threats, and the
detection of deception: Cardiovascular, electrodermal, and pupillary
measures. Psychophysiology, 18, 307-315.
Bradley, M. T., & Warfield, J. F. (1984). Innocence, information, and the guilty
knowledge test in the detection of deception. Psychophysiology, 21, 683-689.
Brenner, M., Branscomb, H. H., & Schwartz, G. E. (1979). Psychological stress evaluator:
Two tests of a vocal measure. Psychophysiology, 16, 351-357.
Furedy, J. J. (1983). Operational, analogical, and genuine definitions of
psychophysiology. International Journal of Psychophysiology, 1, 13-19.
Furedy, J. J. (1986). Lie detection as psychophysiological differentiation: Some fine
lines. In M. Coles, E. Donchin, & S. Porges (Eds.), Psychophysiology:
Systems, processes, and applications—A handbook. New York:
Guilford.
Ginton, A., Daie, N., Elaad, E., & Ben-Shakhar, G. (1982). A method of
evaluating the use of the polygraph in a real-life situation. Journal of
Applied Psychology, 67, 131-137.
Hemsley, G. D. (1977). Experimental studies in the behavioral indicants of
deception. Unpublished doctoral dissertation, University of Toronto.
Heslegrave, R. J. (1981). A psychophysiological analysis of the detection of
deception: The role of information, retrieval, novelty and conflict
mechanism. Unpublished Ph.D. thesis, University of Toronto.
Horvath, F. S. (1977). The effect of selected variables on interpretation of polygraph
records. Journal of Applied Psychology. 62,127-136.
Horvarth, F. S. (1978). An experimental comparison of the psychological stress
evaluator and the galvanic skin response in the detection of deception.
Journal of Applied Psychology. 63, 338-344.
Horvath, F. S. (1982). Detecting deception: The promise and the reality of voice
stress analysis. Polygraph. 11, 304-308.
Horvath, F. S., ft Reid, J. E. (1972). The polygraph silent answer test. Journal of
Criminal Law. Criminology. & Police Science, 63, 285-293.
Hunter, F. L., ft Ash, P. (1973). The accuracy of consistency of polygraph examiners'
diagnosis. Journal of Police Science A Administration, 1, 370-375.
Kircher, J. C., & Raskin, D. C. (1983). Clinical versus statistical lie detection
revisited: Through a lens sharply. Psychophysiology, 20, 452 (Abstract).
Kugelmass, S., Lieblich, I., Ben-Ishai, A., Opatowski, A., & Kaplan, M. (1968).
Experimental evaluation of galvanic skin response and blood pressure
change indices during criminal interrogation. Journal of Criminal Law,
Criminology, & Police Science, 59, 632-635.
Lippold, O. (1971). Physiological tremor. Scientific American, 224, 65-73.
Lykken, D. T. (1959). The GSR in the detection of guilt. Journal of
Applied Psychology, 43, 385-388.
Lykken, D. T. (1960). The validity of the guilty knowledge technique: The effects of
faking. Journal of Applied Psychology, 44, 258-262.
Lykken, D. T. (1974). Psychology and the lie detector industry. American
Psychologist, 29, 725-739.
Lykken, D. T. (1978). The psychopath and the lie detector. Psychophysiology, 15, 137-142.
Lykken, D. T. (1979). The detection of deception. Psychological
Bulletin, 86, 47-53.
Lykken, D. T. (1981). A tremor in the blood: Uses and abuses of the lie detector.
New York: McGraw-Hill.
Lynch, B. E., & Henry, D. R. (1979). A validity study of the Psychological Stress
Evaluator. Canadian Journal of Behavioural Science, 11, 89-94.
Nachshon, I., & Feldman, B. (1980). Vocal indices of psychological stress: A
validation study of the psychological stress evaluator. Journal of Police
Science & Administration, 8, 40-53.
Office of Technology Assessment. (1983, November). Scientific Validity of
Polygraph Testing. Washington, DC: Government Printing Office.
Podlesny, J. A., & Raskin, D. C. (1977). Physiological measures and the detection of
deception. Psychological Bulletin, 84, 782-799.
Podlesny, J. A., & Raskin, D. C. (1978). Effectiveness of techniques and
physiological measures in the detection of deception. Psychophysiology,
15, 344-359.
Raskin, D. C. (1976, June). Reliability of chart interpretations and sources of errors
in polygraphy examinations (Report No. 76-3, contract 75-Nii-99-0001,
U.S. Department of Justice). Salt Lake City: University of Utah.
Raskin, D. C. (1978). Scientific assessment of the accuracy of detection of
deception: A reply to Lykken. Psychophysiology, 15, 143-147.
Raskin, D. C., & Hare, R. D. (1978). Psychopathy and detection of deception in a
prison population. Psychophysiology, 15, 126-136.
Raskin, D. C., & Podlesny, J. A. (1979). Truth and deception: A reply to Lykken.
Psychological Bulletin, 86, 54-59.
Slowik, S. M., & Buckley, J. P. (1975). Relative accuracy of polygraph examiner
diagnosis of respiration, blood pressure, and GSR recordings. Journal of
Police Science & Administration, 3, 305-309.
Tanner, W. P., Jr., & Swets, J. A. (1954). A decision-making theory of visual
detection. Psychological Review, 61, 401-409.
Wicklander, D. E., & Hunter, F. L. (1975). The influence of auxiliary sources of
information in polygraph diagnoses. Journal of Police Science &
Administration, 3, 405-409.