OBSERVATIONAL STUDIES AND EXPERIMENTAL STUDIES IN THE
INVESTIGATION OF CONFERENCE INTERPRETING
Daniel Gile
Université Lumière Lyon 2 and ISIT, Paris
Published in Target 10:1.69-93. 1998
Abstract: In conference interpreting research (IR), empirical investigation can be classified as
observational or experimental. The former can be used for exploration, analysis and
hypothesis-testing, and is either interactive or non-interactive. Besides its conventional role of
hypothesis-testing, the latter can be exploratory. The main methodological problems in both
are related to validity and representativity and to quantification. At this stage, the most
important contribution to interpreting research can be expected from observational
procedures. Simple descriptive statistics and uncomplicated quantitative processing of the
data still have much to offer.
Résumé: Empirical research on interpreting can be classified as observational or experimental. Observational research is interactive or non-interactive and can serve exploration, analysis and hypothesis-testing. Though generally used for hypothesis-testing, experimental research can also be exploratory. The main methodological problems in both approaches arise with regard to validity and representativity and to quantification. At this stage, it is still observational research that offers the greatest potential contribution. Descriptive statistics and simple quantitative methods remain of great interest.
1. Introduction
As Translation Studies evolves into an academic (inter)discipline (Snell-Hornby,
Pöchhacker and Kaindl 1994, Wilss 1996), it has also been migrating from a predominantly
'philosophical' and prescriptive approach towards forms of investigation more consistent with
the norms of research in more established disciplines (see Holmes, Lambert and Van den
Broeck 1978, Toury 1995). At the same time, there is more empirical research as opposed to
essay-type reflection, as exemplified inter alia by Tirkkonen-Condit's regular endeavors in
this direction (Tirkkonen-Condit and Condit 1989, Tirkkonen-Condit 1991, Tirkkonen-Condit
and Laffling 1993) and by the work on Think Aloud Protocols spearheaded by German
investigators (see Krings 1986, Lörscher 1986, Königs 1987, Kiraly 1995, Wilss 1996) - also
see the collection of papers in Meta 41:1 (1996) and Séguinot 1989. While the discipline is
maturing, it is still vulnerable to various methodological weaknesses, as discussed in Toury
1991, 1995. Conference interpreting research (hereafter IR) is less mature and may be facing
more fundamental questions about its identity and the way ahead (Gile 1995a, Gambier et al
1997).
One fundamental decision to be made by investigators selecting empirical research
projects is choosing between the experimental and observational paradigm. In IR, the present
trend seems to favor experimental studies, thus possibly depriving the discipline of much
useful data, and even of a large number of empirical research projects, since beginners and
students who may be able and willing to carry out simple observational projects could hesitate
to take on, or fail to complete, experimental ones. This paper discusses methodological issues
linked to this choice.
2. Definitions
In the literature, a fundamental distinction is made between "theoretical research" which,
roughly speaking, focuses on the intellectual processing of ideas, and "empirical research",
which centers around the collection and processing of data. In principle, this distinction refers
to different relative weights in a bipolar activity, not to the existence of two separate
paradigms, one ignoring theory and the other ignoring data.
D.Gile 1996
Obsvsexp
page 1
Empirical investigation can be further classified into two broad categories: observational
research, also called 'naturalistic', which consists in studying situations and phenomena as
they occur 'naturally' in the field, and experimental research, in which situations and
phenomena are generated for the specific purpose of studying them. The study of situations
and phenomena initially generated for other purposes will be defined in this paper as
observational. Classroom experiments will be considered experimental in this sense of the
word if they are created for research purposes, and observational if they are initially designed
for didactic purposes and are processed scientifically as an afterthought. As is the case in any
classification, class boundaries can be fuzzy. For instance, having translators do their
professional work in a room with a camera so that their activity can be observed with some
precision (Séguinot 1989) could be considered an observational operation in spite of the
controlled spatial environment, while having them work with a carefully controlled set of
dictionaries and other documents could be enough to have the study cross the line into
experimentation. However, boundary problems should not cause difficulty in the present
discussion.
As a footnote, one might add that in literary and historical studies, including those with a
translation focus, the divisions explained above are less relevant, since research is based on
pre-existing data and is therefore almost necessarily observational (though one could conceive
of experimental setups in literary research).
3. Basic approaches in observational research
Over the years, research methods have become increasingly sophisticated. While in
sociology, in ethnology, in ethnolinguistics, in political science and in other disciplines,
observational investigation is still the core approach, in other sciences, and in particular in
psychology and in psychology-related disciplines, there has been a shift towards experimental
procedures to the detriment of observational studies, which now tend to be viewed as pre-experimental, and often as pre-scientific: "...exploratory activities that are not experimental
are often denied the right to be classified as sciences at all" (Medawar 1981:69). In IR, after a
long period of anecdote-, personal experience- and speculation-based writing, the research
community is aspiring towards solid scientific status (see Arjona-Tseng 1989, Moser-Mercer
1991, Kondo and Mizuno 1995) and seems to be particularly sensitive to the scientific
prestige factor. This may strengthen the attractiveness of the experimental paradigm, which
has become prominent in conjunction with the recognition of the relevance of cognitive
psychology to interpreting - cognitive psychology uses almost exclusively the experimental
paradigm.
As explained in the introduction, the predominance of the experimental approach may
have deleterious effects. Most scientific disciplines need a descriptive foundation (cf.
Beaugrand 1988:33, McBurney 1990:27). In interpreting research, there is not enough
observational data yet, and Stenzl's call for more descriptive studies (1983:47) is still topical.
It is therefore important to give such studies a chance by recalling their legitimacy as
scientific endeavours per se even when they are not followed by hypothesis testing (cf.
Fourastié 1966:134, Mialaret 1984:25, Grawitz 1986:430).
Observational research can be broadly classified into three types of approach:
3.1 The exploratory approach
In this paper, the adjective 'exploratory' refers to endeavors primarily concerned with the
analysis of situations and events in the field without any prior intent to make a specific point,
ask a specific question or test a specific hypothesis. Again, the popularity of experimental
research in its hypothesis-testing variety (see below) has been detrimental to the exploratory
approach, in that one often hears or reads that investigation around well-defined hypotheses is
a sine qua non condition for "scientific work". This opinion is not shared by all (cf. Babbie
1992:91). Besides the authority argument, examples from mathematics, medicine and
psychology show that good results can be obtained through the exploratory approach (also see
Kourganoff 1971:54-55), though it must be stressed that:
a. The absence of a well-defined and explicitly formulated hypothesis does not mean that
exploration is done at random; some expectations always guide the investigator,
b. Exploratory projects can, and in fact most often do lead to precise hypotheses and their
subsequent testing.
One example of such an exploratory approach can be found in a study of interpreting
research productivity in the 80's and 90's undertaken by Franz Pöchhacker (1995a,b).
Pöchhacker collected publication data, and then analysed it in terms of individual authors, of
'schools', of evolution over time, and drew a number of conclusions. He did not start with a
specific hypothesis to test.
3.2 The focused analytical approach
This title refers to research efforts focusing on a particular phenomenon through the analysis
of observational data. This is the type of observational research most frequently encountered
in IR. For example, Thiéry (1975) sent out a questionnaire to the 48 interpreters classified as
'true bilingual' members of the International Association of Conference Interpreters in order to
identify their biographical, professional and 'linguistic' idiosyncrasies. A more recent example
is a study of how interpreting affected the role of participants in a war crime trial (Morris
1989). The idea was to analyse data obtained during the trial to see whether and how the
interpreters' intervention actually changed the courtroom situation, especially as regards the
role and interaction between the various actors. Another relatively recent analytical study is
Gile's case study of his own terminological preparation for a medical conference and its
'efficiency' (1989, ch.14).
3.3 The hypothesis-testing approach
This approach is similar to that usually found in experimental studies, in that the researcher
seeks to collect data that will strengthen or weaken specific hypotheses about a particular
phenomenon. The difference is that here the data are derived from field observation rather
than experimentation, with the obvious limitations which arise from the fact that
environmental conditions are given, not controllable, hence the risks associated with
uncontrolled variability.
Observational hypothesis-testing is rare in interpreting research. One example is
Shlesinger's M.A. thesis (1989), in which she used field data to test the hypothesis that
interpreting turns rather literate original speeches into more oral-like target speeches and vice versa.
Observational studies can be methodologically simple. Generally speaking, as research in a
field makes headway, observation and analysis tend to become finer and require more
sophisticated tools and methods. In IR, the paucity of empirical research at this stage implies
that the complexity of methods cannot be taken as an indicator for the relative value of their
contribution: there are still many important aspects of interpretation to discover with simple
methods. And yet, when such simple projects are suggested to beginners, they often balk at
the idea and look for more ambitious projects which mostly turn out to be out of their reach.
3.4 Interactive and non-interactive observational research
Another methodologically important distinction can be made between 'interactive' and
'non-interactive' observational research. The latter consists in observing situations and
phenomena as they happen without the observed subjects playing an active role in the
collection, analysis and assessment of the data (as opposed to interview-based and
questionnaire-based studies), and with minimum data-distorting influence from the observer
during the collection of the data. Actually, one could claim that there is always some
interaction: the terms "interactive" and "non-interactive" are simplifications. It would be more
appropriate, but less convenient, to talk of higher-interaction vs. lower-interaction. An
example of such "non-interactive" or lower-interaction observational research is Strolz's
doctoral dissertation (1992), in which she tested a number of hypotheses on the basis of the
recordings of two interpreters' rendering of the same speech on the media. The two
interpreters were not involved in the collection of the data; they only 'provided' the corpus by
doing their job in a professional situation. Similarly, Williams's 1995 study of anomalous
stress in interpreting consisted of an analysis of recordings of source texts and target texts in
an 'authentic' conference.
"Interactive" observational research in translation and interpreting studies generally
involves participation of the investigator in the process under study and/or questionnaires or
interviews, and thus entails the risk of interference from the researcher and/or a significant
influence of the research procedure on the phenomenon under study, or the risk of
interference from the subjects' personal perception, interpreting and reporting of facts. For
example, Thiéry's questionnaire was sent out to his 'true bilingual' colleagues who were also
his competitors. Since the issue of language classification and in particular of 'true
bilingualism' is a sensitive one in the conference interpreters' community, with concrete
professional implications, it is difficult to rule out an influence of the very question on the
responses on one hand, and on the investigator's interpretation thereof on the other. As to
Gile's case study on conference preparation (1989), the knowledge that his preparatory work
would be examined for research purposes is likely to have had some influence on the way he
worked.
The potential effects of such interaction may be taken on board usefully in the discussion
of results if the direction of the bias is known or strongly suspected, as in the case of self-observation of preparation for a conference (the assumption being that the preparation is done
at least as seriously as, and possibly more seriously than, under 'normal' circumstances). In other
cases, its effects can be difficult to assess.
Issues associated with interaction can be illustrated with a recent, rather large study of users' expectations of conference interpreters (Moser 1995), based on user interviews
performed by interpreters. Though the interviews followed a carefully designed questionnaire,
one may question the validity and representativity of some of the findings for the following
reasons (not to mention the fact that the interviewers were volunteers and had received written
instructions, but no training - see Shipman 1988:85 on this subject):
- It is a strong possibility that interpreters selected their respondents among users who
seemed to show a friendly or otherwise positive attitude towards them or towards interpreting,
rather than among indifferent or hostile delegates.
- During the interview, the identity of the interviewers as interpreters was clear to the
respondents, hence a possible effect on the answers given to the questions.
- Because of the interpreters' subjective position on the matter, one cannot rule out bias in
their interpretation of the responses.
The risks associated with interaction in social anthropology, where the researcher's
presence may influence the observed group's behaviour, have long been acknowledged. The numerous TAP (think-aloud protocol) studies performed on translators over the past
few years also entail a strong possibility of interaction between the research process and the
translation processes under study (see for instance Toury 1991, Kiraly 1995 and Kussmaul
1995).
4. Experimental research: statistical hypothesis-testing vs. 'open experimenting'.
Statistical hypothesis-testing is the most frequently used approach in experimental research
in the social sciences. The procedure involves the selection of a sample of subjects who
perform a task or are submitted to certain conditions in a 'controlled environment' in order to
test a hypothesis on the basis of a 'statistical test' ('Chi square', 't test', 'F test', analysis of
variance, etc.) showing 'significance' or 'non-significance' of measured and calculated values.
The hypothesis is then either rejected or 'not rejected'. Most of the experimental research done
on interpreting is of this type (see Gran and Taylor 1990, Lambert and Moser-Mercer 1994
and Tommola 1995).
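For readers unfamiliar with the tests named above, the following sketch shows what such a statistical test involves in its simplest form. The scenario (professionals vs. students, counts of major omissions) and all figures are invented for the illustration and are not drawn from any of the studies cited:

```python
# Hedged illustration: a chi-square test of independence on invented data.
# Suppose 20 professionals and 20 students each interpreted the same speech,
# and we recorded how many produced at least one major omission.
#                 omission   no omission
# professionals       4          16
# students           11           9

observed = [[4, 16], [11, 9]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under the null hypothesis of no association
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(2)
)

# Critical value of the chi-square distribution, 1 degree of freedom, 0.05 level
CRITICAL_0_05 = 3.841
significant = chi_square > CRITICAL_0_05
print(f"chi-square = {chi_square:.3f}, significant at 0.05: {significant}")
```

If the computed statistic exceeds the critical value, the null hypothesis is rejected; otherwise it is 'not rejected', exactly in the sense described above.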
For many behavioral scientists, 'experimental research' has become a synonym for
statistical hypothesis-testing. Researchers tend to forget that experiments can also be used for
other purposes. In particular, they can have an exploratory function and help find answers to
questions of the type "what will happen if..." (Fraisse, Piaget and Reuchlin 1989:I-92). Such
experiments will be called 'open experiments' here. Again, present social norms make
hypothesis-testing more appealing, and open experimenting appears as 'naive'. However, in
industrial research, which is anything but naive, and in physics, chemistry and biology, which
do not have to worry about their scientific status, researchers have no qualms about its use.
The same applies to medicine, to agronomic research, and even to education research. It may
also be worthwhile to recall that one of the main criticisms levelled against open
experimenting, namely its inefficiency, attributable to the fact that the absence of a precise
hypothesis to test leads to possible waste of time, material means and efforts, loses much of
its relevance when computer simulation can be performed. Last but not least, advocates of the
primacy of the hypothesis-testing paradigm point out repeatedly that "you only find what you
are looking for", meaning that researchers need a precise hypothesis or research question to
work on for fear of missing the significant phenomenon in a sea of data. The principle is
generally true, although major discoveries have also been made at some distance from the
researcher's focal point. On the other hand, focusing on one item may prevent the observer
from noticing the presence of another, which may be important as well (see Reuchlin
1969:29-30).
One example of open experimenting in translation research is a study in which an idea
was presented graphically and subjects were asked to verbalise it in their native language
(Gile 1995b, Ch. 3). The initial objective was to identify phenomena that could help justify
some liberty in the production of the target text. Findings included a confirmation of inter-individual and intra-individual variability in production, but also the identification of several
types of information, partly language-dependent, that subjects added to the information they
aimed to get across to their receivers (the 'message'), without necessarily being aware of it or
controlling this added information.
Open experimenting provides an observation platform which may then lead to more
precise hypotheses and to statistical hypothesis-testing. In terms of experimental design, it is
less difficult than the latter, because it requires less stringent controls (Fraisse, Piaget and
Reuchlin 1989), but it does provide the basic advantages of experimental procedure as a
whole, and in particular the possibility of having the same task performed under identical
conditions by more than one person, something which in interpreting only rarely happens in
the field.
5. Fundamental methodological issues in experimental and observational research in IR
In principle, the major advantage of observational research is that it allows the
investigation of phenomena as they occur naturally, with no distortion induced by the study.
However, as already pointed out, this is not always the case, as many observational
procedures do generate interference with the 'natural' phenomena (see also Shipman 1988).
This is particularly true in interpreting: due to the evanescent nature of both input and output,
on-line, on-site observation is often required, but when interpreters know they are being
'observed' for the purposes of close scrutiny, this may change their behavior, and not telling
them is ethically problematic. The obvious way around the problem is to use data which are
'naturally' recorded and non-confidential, as is the case in TV interpreting, but samples thus
obtained are not representative of all practitioners and all professional environments (see
Kurz forthcoming).
Another problem is associated with the diversity of working conditions (see section 6.2),
which makes it difficult to identify the contribution of individual variables to the phenomena
observed. Ideally, experimental research offers the solution to the problem: by creating an
environment where all relevant factors are controlled, it is possible to 'isolate' and measure the
effect of the variable under study (the 'experimental' or 'independent' variable). Unfortunately,
the implementation of this principle is often far from simple. The following obstacles are
particularly salient and have been generating much criticism among interpreters against
experimental research:
a. The experimental situation itself, being 'unnatural' (if only because the subjects know that
they are being observed by researchers as opposed to being listened to by clients for
remuneration within a professional framework), may generate processes somewhat different
from the ones occurring in the natural environment (see section 6.1). Not enough data are
available about differences in interpreting performance depending on the context being
natural or experimental to decide that experiments do or do not distort processes significantly,
but the possibility must be considered.
b. The very selection of the relevant variables to be controlled is determined by subjective,
non-scientific considerations, whenever not enough solid evidence is available (see for
example Moravcsik 1980:46). As pointed out earlier, this is the case in IR. By controlling
some less relevant variables and ignoring more important ones, investigators may well lose
much of the potential benefit of the experimental paradigm. This is a serious risk in research
performed by non-interpreters, who may be unaware of factors deemed relevant by
interpreters and fail to control them.
c. To address the issue of variability, experiments are done on samples. In interpreting, access
to data and to subjects is problematic (see for example Gile 1995a), which leads to small, non-random samples, and to the possibility of interference from uncontrolled variables associated
with the sampling procedure (see Section 6.2).
6. Methodological issues in the literature
Though interpreting and translation investigators are vulnerable to a wide range of possible
methodological weaknesses and errors, in the context of observational studies vs.
experimental studies, three types of problems seem to be particularly salient in their research:
6.1 Weaknesses related to the validity of the data
One of the most frequently encountered problems in IR has to do with the validity of the
data as actual interpreting data. While in non-interactive observational research, data can be
considered 'authentic', validity problems start with interactive observational research,
precisely because responses elicited 'artificially' tend to be influenced by the way questions
are put, by the interaction between the interviewer and the interviewee, and by the image of
the interviewee or respondent that will be reflected back onto him by his own responses (see
for instance Frankfort-Nachmias and Nachmias 1992). At an even more basic level, the
subjective perception of 'reality' by respondents and interviewees can also be misleading.
For instance, in Déjean Le Féal 1978, it turned out that interpreters may perceive delivery
speed as the number one difficulty in the interpretation of a speech read from a text, while the
actual delivery rate is not high - other features of the speech make it appear to be high.
Similarly, experiments by Collados Ais (1996) and Gile (1985 and 1995c) suggest that
informants assessing the quality of an interpreter's linguistic output or its fidelity to the
original can be very unreliable. This misleading nature of personal perception is one of the
main reasons why scientific methods in general can be advocated in the field of interpreting
despite the limitations of the scientific approach and the significant contribution of the
'personal theorizing' approach (Gile 1990).
A validity problem that deserves special attention is associated with self-observation, as
mentioned above. Not only is the analysis of mental processes through introspection
obviously vulnerable to personal bias, but even attempts at more objective observational
research on one's own work are probably less reliable than studies of other peoples' work.
This does not imply that self-observation should be banned, as it offers advantages in terms of
motivation, availability and access to a corpus. Nevertheless, attention to potential
methodological problems arising in such research is called for.
Validity problems become more severe in experimental research, especially in hypothesis
testing and in research done by non-interpreters. Practitioners challenge the validity of data
obtained with student interpreters and with amateurs with no training in interpreting as
opposed to professionals (e.g. in Treisman 1965, Goldman-Eisler 1972, Lawson 1967,
Chernov 1969, Barik 1971, Kopczynski 1980). They also express doubts as to the
comparability of 'real' interpreting and the tasks given to subjects in experiments: interpreting
written texts (Gerver 1974), some artificially composed with the aim of obtaining certain
linguistic characteristics (Dillinger 1989), or even 'interpreting' random strings of words
(Treisman 1965). Finally, as mentioned earlier, they have doubts as to the validity of data
obtained in a laboratory situation where the speaker is not actually seen and where the
interpreter does not 'feel the atmosphere' of a real conference. References to this issue are
numerous: Politi 1989, Shlesinger 1989 (she rejected experimental procedures and later
changed her mind and started performing experimental research herself), Dillinger 1990,
Lambert 1994, Jörg 1995 (who reports that one of his subjects complained about the 'artificial'
nature of the experiment).
6.2 Representativity of the data
Another major methodological problem lies with the representativity of the data even when
they are considered valid as interpreting data:
- As already mentioned, samples in interpreting research are generally small, with a few
exceptions in observational research such as Thiéry's dissertation (1975), a Japanese study on
189 subjects (Yoshitomi and Arai 1991), and Moser's user expectation survey (1995). In small
samples, the statistical procedure only partly smoothes out possible inter-individual variation.
- The fact that samples are not random undermines the validity of statistical tests performed
on them (Hansen et al. 1953, Snedecor and Cochran 1967, Charbonneau 1988, Babbie
1992:448), as non-random sampling does not exclude bias, and bias implies that sample
means do not tend to approximate the population mean. The same problem is found in
research in many other fields, in particular medicine and psychology, and the latter has
traditionally come under criticism for studying not the human population at large but the
population of first-year psychology students (Mc Nemar 1946, Smart 1966 and Sears 1986 are
quoted on this point in Charbonneau 1988:333). An acceptable approximation can sometimes
be found in the form of critically controlled convenience sampling, in which subjects are
selected because they are easy to access but are screened on the basis of the researcher's
knowledge of the field and (ideally) of empirical data derived from observation and
experimentation. In interpreting, it is sometimes risky to make similar claims, as interpreters
do have direct knowledge of the field, but are also highly involved, and may therefore lack
objectivity when investigating issues on which they have strong views.
- Interpreting conditions are diverse with respect to many parameters widely viewed as
significant, such as the subject of the speech, previous knowledge of the subject by the
interpreter, the syntactic and lexical make up of the speech, delivery parameters, interpreter
fatigue, interpreter motivation, working conditions, etc. In particular, the source-language and
target-language variables are poorly taken into account, as most experiments and
observational projects are carried out with two languages only, and hardly ever with 'exotic'
languages such as Chinese, Japanese, Arabic, Polynesian languages, etc. Each experiment
represents only one, two or at most a minute fraction of the total number
possible combinations of values of the significant parameters. Generalising is therefore
hazardous.
- Another problem is linked to the possible variation over time in individual interpreting
performance (the interpreter's "ups and downs"), a phenomenon intuitively perceived by
interpreters, which may also limit the representativity of any single experiment. The
amplitude of and precise reasons for such variations are not known, but the phenomenon is
perceived as sufficiently salient by the practitioner's community to require caution.
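The small-sample problem raised in the first point above can be made concrete with a textbook formula: the standard error of a sample mean is σ/√n, so quadrupling the sample size only halves the uncertainty around the mean. The figures below are invented for the illustration:

```python
import math

# Hedged illustration with invented figures: if interpreters' fidelity scores
# have an inter-individual standard deviation of 10 points, the standard error
# of the sample mean shrinks only with the square root of the sample size.
sigma = 10.0  # assumed standard deviation of individual scores

standard_errors = {n: sigma / math.sqrt(n) for n in (4, 16, 64)}

for n, se in standard_errors.items():
    print(f"n = {n:2d}: standard error of the mean = {se:.2f}")
```

With the four- or five-subject samples common in interpreting experiments, inter-individual variation is therefore only weakly smoothed out, as the text notes.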
6.3 Quantification
Interpretation is a complex phenomenon which interpreters consider difficult to measure
(see for instance Pöchhacker 1994). Though they may accept the validity of a word or syllable
count as a measure of 'linguistic throughput' (with reservations, as explained in Pöchhacker
1994), no such yardstick exists for the measurement of information or 'message' throughput.
Neither have satisfactory ways been found to measure fidelity or the message reception rate
among delegates, and fidelity indicators such as errors are challenged because of allegedly
inadequate definitions (see for example Bros-Brann 1976 and Stenzl 1983).
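The word-count yardstick mentioned above is at least straightforward to compute. The following sketch (transcript fragment and timing both invented for the illustration) turns a fragment and its duration into a words-per-minute 'linguistic throughput' figure:

```python
# Hedged illustration: words per minute as a crude 'linguistic throughput'
# measure, with an invented transcript fragment and an assumed duration.
transcript = (
    "ladies and gentlemen the committee has examined the proposal "
    "and recommends that it be adopted without amendment"
)
duration_seconds = 6.0  # assumed duration of this fragment

word_count = len(transcript.split())
words_per_minute = word_count * 60.0 / duration_seconds
print(f"{word_count} words in {duration_seconds} s -> {words_per_minute:.0f} wpm")
```

No comparably simple yardstick exists for information or 'message' throughput, which is precisely the problem the text describes.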
In this context, observational research is less vulnerable to errors than experimental
research. This is obvious as regards the validity of data, but is also easily explained for the
two other types of problems. Observational studies generally do not rely on quantification to
the same extent as experimental studies, and tend to be content with rather gross order-ofmagnitude indications. Experimental studies of the hypothesis-testing type are based on
mathematical comparisons, and uncertainty and variability in the measurement of the relevant
values have more serious consequences. Furthermore, while such studies are designed for the
purpose of drawing inferences on the population from the samples used (hence the importance
of proper statistical procedure, including sampling methods), generalizations are not
intrinsically part of observational research. When authors of such studies do generalize, their
inferencing is intuitive rather than mathematical, and easier to assess for readers than in
experimental research, because the data are known to be authentic rather than derived from an
'artificial' experimental environment.
7. Strategies
7.1 Validity of the data
Validity problems in interactive observational interpreting research as described above are
essentially the same as in questionnaire- and interview-based studies in other disciplines, and
countering them is possible with the strategies and tactics developed for such studies in other
fields such as sociology and psychology.
Validity problems in experimental interpreting research are more difficult to deal with: too
little solid data are available for assertions to be made regarding the comparability of sight
translation data with interpreting data, of student performance with professional performance,
of work into a regular active language with experimental interpreting into a usually passive
language, etc.
Under the circumstances, I believe that in research on interpreting (as opposed to linguistic,
psycholinguistic or neurolinguistic research performed on interpreters for the purpose of
investigating cognitive processes rather than interpreting per se), more weight should be given
to the study of phenomena as they occur in the field, in other words to observational research.
Experimental set-ups create 'artificial' environments by definition. Some would probably be
considered an acceptable approximation for the phenomena under study (even Seleskovitch,
the most outspoken anti-experimentalist, based her own dissertation on an experiment - cf.
Seleskovitch 1975), while others have been rejected as invalid. The uncertainty zone
between the two may be quite wide, and it would appear reasonable to try as much as possible
to design experiments in which factors intuitively deemed relevant by practitioners are not
ignored (see Dillinger 1990 for a challenge regarding the interpreters' doubts and an answer in
Gile 1991).
One constructive way of addressing the problem would be to compare experimental results
and field performance, and study the amplitude and direction of possible differences:
recordings of actual speeches can be used for experiments, and the product of their 'real life'
interpretation and laboratory interpretation can be compared. As such comparisons
accumulate, a solid foundation for assumptions on the comparability of experimental findings
and interpretation in the field will gradually be constructed. Studies in which professional
interpreters and students, amateurs and other subjects are given the same tasks (see for
example Dillinger 1989 and Viezzi 1989) are also useful insofar as they provide data on
comparability (or lack of comparability) in the performance of various categories of subjects
and may thus open new avenues for extending virtual sample size.
7.2 Representativity of the data
D.Gile 1996
Obsvsexp
page 8
There is no easy solution to the problem of representativity, because of the small size of
samples and the large number of different working environments and conditions, as well as
the variety of individual interpreters' personal parameters. I believe the best way to address
this problem is replication, in both experimental and observational procedures. Even if it is
only partial (with somewhat different procedures and/or conditions), replication generates
data comparable to at least some extent with the initial data, and thus may be said to 'increase'
sample size, though admittedly not in the strict sense, which prevents the indiscriminate use
of the data as if they came from a single sample (see Gay 1990:115). Unfortunately, at
present, the number of interpreting investigators who perform experimental or observational
research is very small (see Pöchhacker 1995a,b), and all but a handful engage in a single
effort for the purpose of obtaining a degree and then stop. Such investigators are interested in
original research, much less in replication. The situation may change if interpreting schools
establish research centers or research programs (see Gile 1995a).
Another potentially useful resource for the assessment of data representativity in a study is
the corpus of previously published studies, which may differ in their objectives and in
conditions, but offer relevant data nevertheless. For instance, publications which contain
transcripts of original speeches and their interpretation can be used to study the typology and
distribution of errors, though the scope for inferencing varies from one publication to the
next, depending on the information available on how the data were obtained.
7.3 Quantification
Admittedly, complex aspects of human behaviour such as interpreting are difficult to
measure directly in a precise, thorough and monolithic way. There are, however, specific
aspects of interpreting which are easier to quantify, such as the number and duration of pauses
in the interpreter's delivery, the proportion of proper names correctly rendered, the number of
lexical units in source and target texts and the relative frequency of specific lexical units in
each, etc. Other aspects are more difficult to quantify. In a study on the quality of interpreting
students' linguistic output (Gile 1987), the sensitivity and criteria of native informants for
deviations from acceptable linguistic usage were found to be highly variable (Gile 1985).
Similarly, the quality of interpreting is difficult to assess accurately, as stressed inter alia by
Pöchhacker (1994) and demonstrated in Collados Aís (1996). Measurements can be
inaccurate because of a lack of precise definitions (in error counts, quality assessment, etc.),
because of observational inaccuracies or errors (due to personal bias or shifts in attention - see
Gile 1995c), and because of the natural variability of the phenomena under study. In cases
where such inaccuracies are likely, this must be taken into account when drawing conclusions.
For instance, in a study on interpreting, error rates for two groups were respectively 27.22 and
23.45 (see Politi 1989, reporting on a study by Balzani), and the difference, about 15%, was
checked for statistical significance. One may wonder about the usefulness of such statistical
significance, considering the variability and possible non-random inaccuracies that may have
crept into this small-sample experiment, including possible variability in error identification
(in her 1983 M.A. thesis, Stenzl reports that when she applied Barik's error criteria to the
corpus and tried to replicate the procedure, she found different results from his - see Barik
1971). The effect of a single outlier in a small sample can be large, as pointed out by Jörg
(1995) in his data from a sample of 6 interpreters. No matter how powerful and refined new
statistical methods are, they cannot overcome the fundamental limitations associated with
high variability in small non-random samples.
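The leverage a single subject exerts on such small samples is easy to demonstrate with a toy computation (the figures below are invented, not those of Balzani or Jörg):

```python
from statistics import mean

# Invented error counts for six interpreters (illustrative only).
baseline = [10, 11, 9, 10, 12, 10]
with_outlier = [10, 11, 9, 10, 12, 30]  # same sample, one interpreter off form

m0, m1 = mean(baseline), mean(with_outlier)
shift = (m1 - m0) / m0 * 100  # percent change caused by a single subject

print(f"mean without outlier: {m0:.2f}")  # 10.33
print(f"mean with outlier:    {m1:.2f}")  # 13.67
print(f"shift due to one subject: {shift:.0f}%")  # 32%
```

One atypical performance thus moves the group mean by roughly a third, more than the entire inter-group difference discussed above.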
On the other hand, such inaccuracies and the limitations they entail need not be a major
problem in research projects. Less accurate quantification can be performed in many cases
using the tools that have been developed for similar problems in the social sciences, and be
sufficient to provide the required answers. For instance, in a study on the deterioration of
linguistic output quality during interpreting exercises (Gile 1987), in spite of poor informant
reliability, the existence of 3 distinct levels of linguistic deviation frequency depending on the
type of exercise (simultaneous interpreting, consecutive interpreting and ad-libbed
presentations) could be established and proved robust when submitted to sensitivity analysis
(sensitivity analysis checks to what extent results and conclusions change when relevant
parameters vary within specified intervals, as can happen inter alia because of variability in
human evaluation). Similarly, when trying to decide whether proper names pose a 'real'
problem in interpreting, the point will have been made regardless of whether 80, 90 or 100%
of the names in a speech are incorrectly rendered in the experiment (Gile 1984). If the
question is whether or not interpreters lose some information in simultaneous under given
conditions, the result is significant regardless of whether 10, 20 or 30 errors are detected in a
half-hour interpreting segment (Gile 1989 - also see Babbie 1992:127).
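Such a sensitivity analysis can be sketched in a few lines. The deviation frequencies below are invented, not the actual figures from Gile 1987; the check asks whether the ranking of the three exercise types survives a worst-case ±20% variation in informant severity:

```python
# Invented deviation frequencies per exercise type (illustrative only).
measured = {"simultaneous": 24.0, "consecutive": 15.0, "ad-libbed": 6.0}
TOL = 0.20  # assumed bound on informant-to-informant variability

def order_is_robust(hi: float, lo: float, tol: float) -> bool:
    """Worst case for the ranking: deflate the larger value, inflate the smaller."""
    return hi * (1 - tol) > lo * (1 + tol)

for a, b in [("simultaneous", "consecutive"), ("consecutive", "ad-libbed")]:
    robust = order_is_robust(measured[a], measured[b], TOL)
    print(f"{a} > {b} holds under +/-{TOL:.0%} variation: {robust}")
```

If both comparisons hold at the worst case, the three-level conclusion can be called robust; if not, only the extreme contrast should be reported.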
Here as elsewhere, common sense is paramount, and more important than statistical
techniques (see Medawar 1981:27,93, McBurney 1990:5). One might also recall in this
respect that 'statistically significant' differences may have no practical significance: the 20%
difference between an average of 10 and an average of 12 errors per half an hour of
interpreting may not be enough to provide 'significantly better' interpreting to a client; the
30% difference between having an average of 150 and 200 new terms to learn while preparing
for a technical conference may not change significantly the difficulty of the endeavor or the
choice of appropriate interpreting strategies.
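The asymmetry between statistical and practical significance can be made concrete with a back-of-the-envelope t statistic (the means and standard deviations below are hypothetical):

```python
import math

def t_stat(m1: float, m2: float, sd: float, n: int) -> float:
    """Two-sample t statistic, assuming equal group sizes and equal SDs."""
    se = math.sqrt(2 * sd ** 2 / n)  # standard error of the difference in means
    return (m2 - m1) / se

# Hypothetical: 10 vs. 12 errors per half hour, SD of 3 in each group.
# The practical difference (20%) is the same in both runs; only n changes.
for n in (6, 600):
    print(f"n = {n:3d} per group -> t = {t_stat(10, 12, 3, n):.2f}")
```

With 6 subjects per group the t value (about 1.15) falls far short of conventional significance thresholds; with 600 it is overwhelming (about 11.55), yet the client receives the same 20% difference in error rate either way.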
Most of the debates in interpreting research still focus not on minute differences, but on
basic principles of 'doctrine': does work into one's non-native language produce a target
speech of significantly poorer quality than that produced in one's native language? Should
consecutive interpreting be taught for a whole year before training in simultaneous
interpreting starts? Are shadowing exercises useful? Small random differences between
conditions do not contribute much to the debate, while wide, regular differences can have
practical significance. Interpretation being largely unexplored, I believe that such wide,
regular differences can still be found with simple quantitative methods, and advocate their use
rather than more complex techniques which may yield less reliable data. Science proceeds by
"successive approximations" (Kourganoff 1971, Camus 1996:8). First approximations,
achieved with simple methods, are necessary to provide the basis for finer approximations,
which will eventually require more sophisticated methods. At this stage, due to the paucity of
data on variability and on co-variability between a larger number of potentially important
variables, descriptive statistics (which describe phenomena) may be a more appropriate tool
than inferential statistics (which attempt to make mathematics-based inferences on the
population from the data collected on samples). Explicit descriptive statistics enable readers
of published studies to decide for themselves whether samples and conditions are
representative and whether results are significant, while statistical tests may give an
undeserved seal of 'scientific objectivity' to a procedure which is not valid by strict standards
(see Gore and Altman 1982).
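The descriptive reporting advocated here amounts to publishing the raw distributional facts alongside (or instead of) any test, so that readers can judge for themselves. A minimal sketch, with invented sample values:

```python
from statistics import mean, median, stdev

# Invented error counts for a small sample of interpreted speeches.
errors = [10, 14, 9, 22, 11, 13]

# Report the distribution itself rather than only a test verdict.
print(f"n      = {len(errors)}")
print(f"mean   = {mean(errors):.1f}")
print(f"median = {median(errors):.1f}")
print(f"sd     = {stdev(errors):.1f}")
print(f"range  = {min(errors)}-{max(errors)}")
```

A reader shown these five numbers can immediately see the small n, the outlier at 22 and the spread, which no single p value conveys.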
8. Observational studies vs. experimental studies in translation research
Similar issues exist in translation research, but their relative weights are different. The
following are some differences which can have implications on research strategies, and in
some cases on the choice of observational vs. experimental studies:
8.1 Factors facilitating observational research:
- The product of translation, basically a text, is easy to store and readily available to investigators on a large scale at little cost, especially as regards literary translation. The product of interpreting, composed of spoken discourse plus body language (a "hypertext", Pöchhacker 1994), is evanescent. Its observation requires recordings, preferably of both image and sound. There are relatively few recordings of interpreted speeches, and existing ones are difficult to access.
- Technically speaking, translated texts are easier to process, with the possibility of rapid visual access to all parts of the corpus, while recordings of interpreted speeches require time-consuming linear listening, or equally time-consuming transcription work, which inevitably leads to a loss of information. From this viewpoint, the experimental approach, which involves a 'controlled' limited corpus, is likely to be preferred by interpreters.
- In translation, the time elapsed between the moment the translator reads a source-text
segment and the moment s/he produces its final target-language version can last anywhere
from a few seconds to hours, days or weeks. A corollary is that in field conditions, reactions
to specific stimuli (such as manipulated independent variables) can be offset by further
thinking and strategies or 'drown' in the general noise associated with the operation, and be
undetectable to the researcher. This makes experimental procedures more difficult to conduct
when studying the translation process. The TAP (think-aloud protocol) paradigm is a clever attempt to address the
problem, but it is not without its weaknesses (cf. Toury 1991). In simultaneous interpreting,
the EVS (ear-voice span, or time elapsed between the moment the interpreter detects a speech
segment and the moment s/he produces its target-language reformulation) rarely exceeds a
few seconds, and reactions to stimuli are likely to be more salient.
8.2 Factors facilitating experimental research:
- Translators are numerous and are used to their product being exposed, as opposed to
interpreters, whose population is much smaller, who feel vulnerable and who dislike
exposure. Access to samples is therefore easier in translation.
- 'Authentic' replication of an interpreting task in the field is essentially limited to speeches
made during international events which are interpreted by several interpreters and broadcast
over several radio and television channels. The validity of any further replication in the
laboratory can be challenged because of the 'artificial' environment. In translation, it is
possible to generate 'authentic' replication of translated tasks, for instance by commissioning
the translation of the same text into the same language from several translators or translation
agencies. The cost of a similar operation replicating field conditions in interpreting would be
much higher.
- In translation, text manipulations are easy to perform without the subject noticing them and
without jeopardizing the validity of the translation task. In interpreting, text manipulations
have been criticized for possibly generating interpreting behavior inconsistent with 'authentic'
behavior.
- Translation performance is less dependent on cognitive load management than interpreting
performance, in which phenomena lasting a fraction of a second can lead to major irreversible
consequences (Gile 1995a,b). It follows that:
(1) Strict control of environmental conditions is less critical in translation than in
interpreting. A gradual increase of sample size over time in several locations and under
slightly different conditions is therefore less problematic.
(2) Intra-individual performance variability is probably lower in translation than in
interpreting, and may well lead to solid conclusions on the basis of smaller samples.
The preceding analysis underscores two major differences between translation and
interpretation as regards research possibilities:
- On the whole, access to data on the product and their analysis are easier in translation.
- The observation of the process is more difficult in translation.
This may offer a partial explanation for the fact that translation is far ahead of interpretation in observational product-centered studies, while it is struggling with process-oriented studies.
9. Conclusion
Observational and experimental studies should be seen as mutually reinforcing, not
mutually exclusive. Some issues are better addressed by the observational paradigm, and
some by experimental investigations, especially in fields where a considerable corpus of
research already exists, as is the case of neurophysiology. This paper does not challenge the
basic advantages of the experimental paradigm, nor does it advocate the choice of this or that
method in any individual research endeavor. It does claim that in IR, there is a lack of
descriptive data obtainable through simple methods, and that this weakens the power of more
complex methods. In problem-solving oriented research disciplines such as mathematics,
medicine, chemistry, etc., self-regulation occurs naturally as a function of results. In the
behavioral sciences, including IR, it is more difficult to assess the value or 'truthfulness' of
research findings, and paradigmatic directions are to a larger extent a function of social norms
and research policy. Underlining problematic aspects of sophisticated methods and stressing
the value of simple methods may encourage leaders of the IR community to shift somewhat
the balance of efforts in favor of the latter, in particular among students working on a
graduation thesis, which may help fill the existing gap in descriptive data.
References
Arjona-Tseng, Etilvia. 1989. "Preparing for the XXIst Century". Proceedings of the Twentieth
Anniversary Symposium of the Monterey Institute of International Studies on The Training of
Teachers of Translation and Interpretation. Monterey. pages not numbered.
Babbie, Earl. 1992. The practice of social research. Belmont, California: Wadsworth. Sixth
Edition.
Barik, Henri. 1971. "A description of various types of omissions, additions and errors
encountered in simultaneous interpretation". Meta 15:1. 199-210.
Beaugrand, Jacques P. 1988. "Démarche scientifique et cycle de la recherche". Robert, ed.
1988: 1-35.
Bros-Brann, Eliane. 1976. "Critical Comments on H.C. Barik's "Interpreters Talk a Lot,
Among Other Things"". Bulletin de l'AIIC 4:1. 16-18.
Camus, Jean-François. 1996. La psychologie cognitive de l'attention. Paris: Armand Colin.
Charbonneau, Claude. 1988. "Problématique et hypothèses d'une recherche". Robert, ed.
1988: 66-77.
Chernov, Ghelly V. 1969. "Linguistic Problems in the Compression of Speech in
Simultaneous Interpretation". Tetradi Perevodchika Vol.6: 52-65.
Collados Aís, Ángela. 1996. La entonación monótona como parametro de calidad en
interpretación simultánea: la evaluación de los receptores. Unpublished doctoral dissertation.
Universidad de Granada.
Déjean Le Féal, Karla. 1978. Lectures et improvisations: incidences de la forme de
l'énonciation sur la traduction simultanée. Unpublished doctoral dissertation. University of
Paris III.
Dillinger, Michael L. 1989. Component Processes of Simultaneous Interpreting. Unpublished
PhD dissertation. McGill University, Montreal.
Dillinger, Michael L. 1990. "Comprehension during interpreting: What do interpreters know
that bilinguals don't ?" The Interpreter's Newsletter 3: 41-55.
Fourastié, Jean. 1966. Les conditions de l'esprit scientifique. Paris: Gallimard.
Fraisse, Paul, Jean Piaget and Maurice Reuchlin. 1989. Traité de psychologie expérimentale.
Vol. I. Histoire et méthode. Paris: PUF. Sixth Edition.
Frankfort-Nachmias, Chava and David Nachmias. 1992. Research Methods in the Social
Sciences. London, Melbourne, Auckland: Edward Arnold. Fourth Edition.
Gambier, Yves, Daniel Gile and Christopher Taylor, eds. 1997. Conference Interpreting:
Current Trends in Research. Amsterdam/Philadelphia: John Benjamins.
Gay, L.R. 1990. Educational Research. Competencies for Analysis and Application. New
York: Merril, MacMillan Publishing Company. Third Edition.
Gerver, David. 1974. "The effects of noise on the performance of simultaneous interpreters:
accuracy of performance". Acta Psychologica 38: 159-167.
Gile, Daniel. 1984. "Les noms propres en interprétation simultanée". Multilingua 3:2. 79-85.
Gile, Daniel. 1985. "La sensibilité aux écarts de langue et la sélection d'informateurs dans
l'analyse d'erreurs: une expérience". The Incorporated Linguist 24:1. 29-32.
Gile, Daniel. 1987. "Les exercices d'interprétation et la dégradation du français: une étude de
cas". Meta 31:4. 363-369.
Gile, Daniel. 1989. La communication linguistique en réunion multilingue, Les difficultés de
la transmission informationnelle en interprétation simultanée. Unpublished PhD dissertation.
Université Paris III.
Gile, Daniel. 1990. "Scientific Research vs. Personal Theories in the investigation of
interpretation". Gran and Taylor, eds. 1990: 28-41.
Gile, Daniel. 1991. "Methodological Aspects of Translation (and Interpretation) Research".
Target 3:2. 153-174.
Gile, Daniel. 1995a. Regards sur la recherche en interprétation de conférence. Lille: Presses
Universitaires de Lille.
Gile, Daniel. 1995b. Basic Concepts and Models for Interpreter and Translator Training.
Amsterdam/Philadelphia: John Benjamins Publishing Company.
Gile, Daniel. 1995c. "Fidelity Assessment in Consecutive Interpretation: An Experiment".
Target 7:1. 151-164.
Goldman-Eisler, Frieda. 1972. "Segmentation of input in simultaneous interpretation".
Psycholinguistic Research 1. 127-140.
Gore, Sheila M. and Douglas G. Altman. 1982. Statistics in Practice. London: British
Medical Journal.
Gran, Laura and Christopher Taylor, eds. 1990. Aspects of applied and experimental research
on conference interpretation. Udine: Campanotto Editore.
Grawitz, Madeleine. 1986. Méthodes des sciences sociales. Paris: Dalloz.
Hansen, M., W. Horwitz and W. Madow. 1953. Sample Surveys: Methods and Theory. New
York, London, Sydney: John Wiley and Sons.
Holmes, James S., José Lambert and Raymond van den Broeck, eds. 1978. Literature and
Translation: New Perspectives in Literary Studies. Leuven: acco.
Jörg, Udo. 1995. Verb anticipation in German-English simultaneous interpreting.
Unpublished M.A. dissertation. University of Bradford.
Kiraly, Donald. 1995. Pathways to Translation. Pedagogy and Process. Kent, Ohio and
London, England: The Kent State University Press.
Kondo, Masaomi and Akira Mizuno. 1995. "Interpretation Research in Japan". Target 7:1. 91-106.
Königs, Frank G. 1987. "Was beim Übersetzen passiert". Die Neueren Sprachen 86. 162-185.
Kopczynski, Andrzej. 1980. Conference Interpreting (Some Linguistic and Communicative
Problems). Nauk, Poznan: Wydawn.
Kourganoff, Vladimir. 1971. La recherche scientifique. Paris: PUF.
Krings, Hans P. 1986. Was in den Köpfen von Übersetzern vorgeht. Tübingen: Gunter Narr.
Kurz, Ingrid. forthcoming. "Getting the Message Across: Simultaneous Interpreting for the
Media". Proceedings of the EST conference in Prague, September 1995.
Kussmaul, Paul. 1995. Training the Translator. Amsterdam/Philadelphia: John Benjamins
Publishing Company.
Lambert, Sylvie. 1994. "Foreword". Lambert, Sylvie and Barbara Moser-Mercer, eds. 1994:
5-14.
Lambert, Sylvie and Barbara Moser-Mercer, eds. 1994. Bridging the Gap: Empirical research
on simultaneous interpretation. Amsterdam and Philadelphia: John Benjamins Publishing
Company.
Lawson, Everdina. 1967. "Attention and Simultaneous Translation". Language and Speech
10:1. 29-35.
Lörscher, Wolfgang. 1986. "Linguistic aspects of translation processes". House, J. and S.
Blum-Kulka, eds. 1986. Interlingual and intercultural communication. Tübingen: Gunter
Narr: 277-292.
McBurney, Donald H. 1990. Experimental Psychology. Belmont, California: Wadsworth.
McNemar, Q. 1946. "Opinion-attitude methodology". Psychological Bulletin 43. 289-374.
Medawar, Peter B. 1981. Advice to a Young Scientist. London and Sydney: Pan Books.
Meta 41:1. 1996. Special issue on Translation processes, guest-edited by Frank Königs.
Mialaret, Gaston. 1984. La pédagogie expérimentale. Paris: PUF.
Moravcsik, Michael J. 1980. How to Grow Science. New York: Universe Books.
Morris, Ruth. 1989. The impact of interpretation on legal proceedings. Unpublished M.A.
thesis. Communications Institute, Hebrew University of Jerusalem.
Moser, Peter. 1995. Survey on Expectations of Users of Conference Interpretation. Final
Report. Produced by SRZ Stadt + Regionalforschung GmbH, Vienna. Mimeographed. English
translation.
Moser-Mercer, Barbara. 1991. "Paradigms gained or the art of productive disagreement".
AIIC Bulletin 19:2. 11-15.
Pöchhacker, Franz. 1994. Simultandolmetschen als komplexes Handeln. Tübingen: Gunter
Narr.
Pöchhacker, Franz. 1995a. ""Those Who Do...": A Profile of Research(ers) in Interpreting".
Target 7:1. 47-64.
Pöchhacker, Franz. 1995b. "Writings and Research on Interpreting: A Bibliographic
Analysis". The Interpreter's Newsletter 6. 17-32.
Politi, Monique. 1989. "Le signal non verbal en interprétation simultanée". The Interpreter's
Newsletter 2. 6-10.
Reuchlin, Maurice. 1969. Les méthodes en psychologie. Paris: PUF.
Robert, Michèle, ed. 1988. Fondements et étapes de la recherche scientifique en psychologie.
St-Hyacinte, Québec: Edisem and Paris: Maloine.
Schjoldager, Anne. 1995. "Interpreting Research and the 'Manipulation School' of Translation
Studies". Target 7:1. 29-45.
Sears, D.O. 1986. "College sophomores in the laboratory: influences of a narrow data base on
social psychology's view of human nature". Journal of personality and social psychology 51.
515-530.
Séguinot, Candace. 1989. "The Translation Process: An Experimental Study". Séguinot,
Candace, ed. 1989. The Translation Process. Toronto: H.G. Publications, School of
Translation, York University.
Seleskovitch, Danica. 1975. Langages, langues et mémoire. Paris: Minard.
Shipman, Martin. 1988. The Limitations of Social Research. London and New York:
Longman. Third Edition.
Shlesinger, Miriam. 1989. Simultaneous Interpretation as a Factor in Effecting Shifts in the
Position of Texts on the Oral-Literate Continuum. Unpublished M.A. thesis. Tel Aviv
University.
Smart, R. 1966. "Subject selection bias in psychological research". Canadian psychologist 7.
115-121.
Snedecor, George W. and William Cochran. 1967. Statistical Methods. Ames, Iowa: The
Iowa State University. Sixth Edition.
Snell-Hornby, Mary, Franz Pöchhacker and Klaus Kaindl, eds. 1994. Translation Studies: An
Interdiscipline. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Stenzl, Catherine. 1983. Simultaneous Interpretation - Groundwork towards a Comprehensive
Model. Unpublished M.A. thesis. University of London.
Target 7:1. Special Issue on Interpreting Research. John Benjamins Library.
Strolz, Birgit. 1992. Theorie und Praxis des Simultandolmetschens. Unpublished PhD
Dissertation. Geisteswissenschaftliche Fakultät der Universität Wien.
Thiéry, Christopher. 1975. Le bilinguisme chez les interprètes de conférence professionnels.
Unpublished PhD dissertation. University of Paris III.
Tirkkonen-Condit, Sonja, ed. 1991. Empirical Research in Translation and Intercultural
Studies. Tübingen: Gunter Narr Verlag.
Tirkkonen-Condit, Sonja and John Laffling, eds. 1993. Recent Trends in Empirical
Translation Research. University of Joensuu. Faculty of Arts.
Tirkkonen-Condit, Sonja and Stephen Condit, eds. 1989. Empirical Studies in Translation
and Linguistics. University of Joensuu. Faculty of Arts.
Tommola, Jorma, ed. 1995. Topics in Interpreting Research. University of Turku. Centre for
Translation and Interpreting.
Toury, Gideon. 1991. "Experimentation in Translation Studies: Achievements, Prospects and
some Pitfalls". Tirkkonen-Condit, Sonja, ed. 1991: 45-66.
Toury, Gideon. 1995. Descriptive Translation Studies and Beyond. Amsterdam and
Philadelphia: John Benjamins Publishing Company.
Treisman, Anne. 1965. "The Effects of Redundancy and Familiarity on Translation and
Repeating Back a Native and Foreign Language". British Journal of Psychology 56. 369-379.
Viezzi, Maurizio. 1989. "Information retention as a parameter for the comparison of sight
translation and simultaneous interpretation: an experimental study". The Interpreter's
Newsletter 2. 65-69.
Williams, Sarah. 1995. "Observations on Anomalous Stress in Interpreting". The Translator
1:1. 47-64.
Wilss, Wolfram. 1996. Knowledge and Skills in Translator Behavior.
Amsterdam/Philadelphia: John Benjamins Publishing Company.
Yoshitomi, Asako and Kiwa Arai. 1991. "Tsûyaku katei ni kansuru kisô jikken" ("A basic
experiment on the interpreting process"), in Watanabe, Shoichi, ed. 1991. Mombushô josei
kagaku kenkyû: gaikokugokyôiku no ikkan to shite no tsûyakuyôsei no tame no
kyôikunaiyôhôhô no kaihatsu ni kansuru sôgôteki kenkyû (Report on a Project funded by the
Ministry of Education: A Comprehensive Study of the Development of Interpretation Training
Methodology as a Part of Foreign Language Training). Unpublished manuscript. Sophia
University, Tokyo. 373-439.