Corpora in Translation Studies - The Translation Research Summer

Using Corpora in Translation
Studies Research
Dorothy Kenny
SALIS/Centre for Translation and
Textual Studies, Dublin City University
Outline
Part One:
 General introduction to corpora
– What they’re good for
– How to get your hands on one
Part Two:
 Corpora in translation studies research
To warm you up:
 What does ‘cause’ (verb) mean?
 What does ‘provide’ mean?
 How often is the passive voice used in English?
Handouts
Tell me something I don’t know…
 Not all facts about language are accessible through native-speaker intuition
 It’s often impossible to describe meanings of words without reference to the contexts those words occur in
 Sometimes you need a lot of data to see patterns
 Sometimes you need to count
What a corpus can help us see…
 cause has negative ‘semantic prosody’
 provide has positive ‘semantic prosody’
(see Louw 1993; Stubbs 1995)
 in general, passive is used around 10% of the time in English
(see Halliday 1991)
Corpus
a collection of naturally-occurring texts that are the object of literary or linguistic study
 held in electronic form
– thus amenable to (semi-)automatic processing
 usually assembled in some principled way
– often highly structured, containing very carefully selected texts
 compiled for a purpose
– eg linguistic analysis, lexicography, Natural Language Processing, translation research…
 or a by-product of another activity
– eg parliamentary debates (and their translations in bilingual or multilingual parliaments)
Well-known monolingual corpora
 British National Corpus (100m words)
– http://www.natcorp.ox.ac.uk/
 The Bank of English (524m words)
– http://www.collins.co.uk/books.aspx?group=153
– free access to a sample of 56m words at http://www.collins.co.uk/Corpus/CorpusSearch.asp
 The entire World Wide Web
– http://www.webcorp.org.uk/
Corpora and corpus processing
software
Provide:
 a (relatively) objective basis for commentary on
linguistic phenomena, or linguistic realizations of, eg,
social phenomena
 a resource for quantitative and qualitative research
 the ability to access and manipulate (sort, display,
annotate) vast quantities of data, thus facilitating
analysis by humans
Access to corpora
 web access
– Bank of English, BNC, IDS
– free access to ‘sampler’
– access to larger corpus by subscription/permission
 purchase own copy (eg BNC)
 create your own corpus
Web access to corpora: corpus processing software
 Web-accessible corpora have dedicated interfaces that allow users, eg, to:
– choose texts/sub-corpora
– search the corpus for instances of a particular word (results in KWIC concordance format)
– display and sometimes sort results
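To make the KWIC (key word in context) format concrete, here is a minimal sketch in Python. It is not the interface of any of the corpora named above: the function name `kwic`, the `width` parameter and the sample text are all assumptions made for illustration.

```python
import re

def kwic(text, node, width=30):
    """Return one KWIC line per whole-word, case-insensitive match of `node`."""
    text = " ".join(text.split())  # collapse line breaks and runs of spaces
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every node starts in column `width`.
        lines.append(f"{left:>{width}}{match.group(0)}{right}")
    return lines

# Hypothetical sample text, reusing wording from these slides.
sample = ("Corpora can help us answer questions, but they also generate "
          "questions. Sometimes you need a lot of data to see patterns.")

# Sorting on the right-hand context groups similar continuations together.
for line in sorted(kwic(sample, "questions"), key=lambda l: l[30:].lower()):
    print(line)
```

Because the search word always starts in the same column, patterns to its left and right are easy to scan by eye, which is the point of the format.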
Sample search: Cobuild Bank of English
Sample results: Cobuild Bank of English
Web access to corpora: corpus processing software
 More sophisticated interfaces, eg, Cosmas I* (IDS in Mannheim), allowed searches for:
 all forms of a lemma
– gehen gets gehe, ging, gehst, geht, etc.
 all compounds formed from a search word
– Möbel gets Gartenmöbel, Möbellager, etc.
 all forms derived from a search word
– Kind gets Kindchen, kindlich, etc.
*replaced in 2003 by Cosmas II
Building your own corpus - basic design issues
 Written vs spoken vs written and spoken
 Static vs dynamic (monitor)
 Synchronic (time of production?) vs diachronic
 General reference vs specialised
 Monolingual vs bilingual vs multilingual
 Domains to be covered
 Text types to be covered
 Level of annotation: raw vs annotation with extra-textual info (headers to indicate text title, author, speakers, etc) vs detailed linguistic annotation
Raw corpus/clean text
Example:
As he weakened, Moran became afraid of his daughters. This once
powerful man was so implanted in their lives that they had never really left
Great Meadow, in spite of jobs and marriages and children and houses of
their own in Dublin and London. Now they could not let him slip away.
from John McGahern’s Amongst Women
Corpus annotation:
structural and POS tagging
Extract from the British National Corpus:
<p>
<s n=0001><w CJS>As <w PNP>he <w VVD>weakened<c PUN>, <w
NP0>Moran <w VVD>became <w AJ0>afraid <w PRF>of <w DPS>his
<w NN2>daughters<c PUN>.
<s n=0002><w DT0>This <w AV0>once <w AJ0>powerful <w NN1>man
<w VBD>was <w AV0>so <w VVD-VVN>implanted <w PRP>in <w
DPS>their <w NN2>lives <w CJT>that <w PNP>they <w VHD>had
<w AV0>never <w AV0>really <w VVN>left <w AJ0>Great <w NN1-NP0>Meadow<c PUN>, <w PRP>in spite of <w NN2>jobs <w
CJC>and <w NN2>marriages <w CJC>and <w NN2>children <w
CJC>and <w NN2>houses <w PRF>of <w DPS>their <w DT0>own <w
PRP>in <w NP0>Dublin <w CJC>and <w NP0>London<c PUN>.
<s n=0003><w AV0>Now <w PNP>they <w VM0>could <w XX0>not <w
VVI>let <w PNP>him <w VVB>slip <w AV0>away<c PUN>.
</p>
from John McGahern’s Amongst Women
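One reason to tag a corpus is that the annotation can then be processed automatically. The sketch below, assuming only the `<w TAG>word` and `<c TAG>punctuation` conventions visible in the extract, pulls out (word, tag) pairs with a regular expression and counts the tags. The names `TOKEN` and `parse_tagged` are illustrative, and this is not an official BNC tool.

```python
import re
from collections import Counter

# Matches the <w TAG>word and <c TAG>punctuation elements seen in the extract.
# \s+ allows an element to be split across a line break, as in the wrapped lines.
TOKEN = re.compile(r"<([wc])\s+([A-Z0-9-]+)>([^<]*)")

def parse_tagged(sgml):
    """Return (word, tag) pairs from text tagged in the style shown above."""
    return [(text.strip(), tag) for _kind, tag, text in TOKEN.findall(sgml)]

# First sentence of the extract, copied with its original line breaks.
sentence = """<s n=0001><w CJS>As <w PNP>he <w VVD>weakened<c PUN>, <w
NP0>Moran <w VVD>became <w AJ0>afraid <w PRF>of <w DPS>his
<w NN2>daughters<c PUN>."""

pairs = parse_tagged(sentence)
tag_counts = Counter(tag for _word, tag in pairs)

print(pairs)
print(tag_counts)
```

With pairs like these, a question such as "how many past-tense lexical verbs (VVD) does this text contain?" becomes a simple count.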
Building your own corpus - Getting the Texts
 Full texts vs text samples
 Sampling
– random sampling or handpicking?
 Copyright permission
 Representativeness
– difficult concept to apply to textual data
– onus is on researcher to document corpus contents very carefully
Conversion to electronic form?
 Text not available in electronic form:
– Scanning + OCR vs keyboarding
 Text available in electronic form:
– Web pages (see Kilgarriff et al 2006)
– Downloads from websites
– Full-text databases
– Donations from authors/translators
 Format texts should be saved in?
– .txt? .html? .xml? (know what your software can handle!)
 Is alignment necessary?
Doing Research Using Corpora
 Corpora can help us answer questions, but they also generate questions
 In general, passive is used around 10% of the time in English; but nearly 20% of the uses of cause as a verb in my scientific corpus are in the passive voice
 Does this tell me something about
– the verb cause?
– scientific English?
– my sample of scientific English?
Hypothesis generation and testing
Hypothesis – a tentative claim
eg cause tends to be used more in the passive than other English verbs
 hypotheses sometimes emerge from corpus data
– corpus-driven research
 hypotheses sometimes formed before we look at corpus data
– corpus-based research
 newly generated hypotheses can be tested against other (often bigger) corpora
 cycles of hypothesis formation, testing, refinement, testing… common in corpus research
Some early descriptive hypotheses in translation studies
 Translations tend to be simpler/more explicit/more conventional than:
– other texts in the same language
– their source texts
 But how do you operationalize notions like simplification, explicitation, normalization in corpus-based research?
– Eg, what concrete features of a text show it to be ‘simpler’ than another text?
(See especially, Baker 1993, 1995, 1996)
Early phase: Developing
methodologies and resources
Eg Question: Are translations really ‘simplified’?
Eg Answer: Compared to what?
–
source texts?
=> parallel corpus methodology
–
other texts in the target language?
=> comparable corpus methodology
Parallel Corpus
set of source texts in language A alongside their translations into language B (and perhaps languages C, D, E …)
 can be bilingual or multilingual
– eg English-Norwegian Parallel Corpus (ENPC) http://www.hf.uio.no/ilos/forskning/forskningsprosjekter/enpc/
– eg European Parliament Proceedings Parallel Corpus http://www.statmt.org/europarl/
 can be unidirectional or bidirectional
– eg German-English Parallel Corpus of Literary Texts de->en
– eg ENPC en->no and no->en
Comparable Corpus
Monolingual Comparable Corpus
set of texts translated into a language A alongside texts originally written in that same language
– Translational English Corpus (TEC) http://www.monabaker.com/tsresources/TranslationalEnglishCorpus.htm + comparable subset of the BNC
– Finnish Comparable Corpus (Mauranen 2004)
Early Corpus-based TS: Quantitative Bias?
 Attempt to approach translation objectively
 Reliance on properties of text that can be measured
– Average word length
– Average sentence length
– Lexical density
– Type-token ratio, etc
 Focus on monolingual comparable corpora
 Focus on general tendencies (or ‘universals’) in translation (Laviosa 2002; Mauranen and Kujamäki 2004)
Lexical Density
 the ratio of content words to the total number of words in a text
– I made the chemicals hotter so they changed. (4/8 = 50%)
– Raising the temperature produced a chemical change. (5/7 = 71%)
(from Gibbons 2003:20)
 the lower the lexical density, the ‘simpler’ the text.
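The two figures above can be reproduced with a few lines of Python. This is a minimal sketch: the function-word list is a tiny, hand-picked assumption that happens to cover the two example sentences, whereas a real study would rely on a full function-word list or a POS tagger. The names `FUNCTION_WORDS`, `tokenize` and `lexical_density` are illustrative.

```python
import re

# Assumption: a deliberately tiny function-word list, enough for the two examples.
FUNCTION_WORDS = {"i", "the", "a", "so", "they"}

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def lexical_density(text):
    """Return (content_words, total_words) for a text."""
    tokens = tokenize(text)
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content), len(tokens)

for sentence in ("I made the chemicals hotter so they changed.",
                 "Raising the temperature produced a chemical change."):
    content, total = lexical_density(sentence)
    print(f"{content}/{total} = {content / total:.0%}  {sentence}")
```

The result depends entirely on what is counted as a content word, so the word list (or tagset) used should always be documented alongside the figures.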
Type-Token Ratio
 the ratio of types to tokens in a corpus
– eg the TTR for The cat sat on the mat near the log fire is 8/10 (or 80%)
 assumed to measure the (lexical) variety in a text:
– the higher the TTR, the more varied the text’s vocabulary
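As a quick check on the cat-and-mat example, the sketch below counts types and tokens in Python. It assumes a simple tokenization (lowercased runs of word characters); tools such as WordSmith may count slightly differently, so the function name `type_token_ratio` and its behaviour are illustrative only.

```python
import re

def type_token_ratio(text):
    """Return (types, tokens), treating upper and lower case as the same word."""
    tokens = re.findall(r"\w+", text.lower())
    return len(set(tokens)), len(tokens)

types, tokens = type_token_ratio("The cat sat on the mat near the log fire")
print(f"TTR = {types}/{tokens} = {types / tokens:.0%}")
```

The three occurrences of "the" count as three tokens but only one type, which is why the ratio falls below 100%.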
TTR: An Example
In this piece I rail against the tendency of linguists to
write about the philosophy of science as applied to
their subject field instead of writing about what
languages are like, which is what linguists are
supposed to be good at. Unsympathetic critics will no
doubt charge that by doing so I instantiate the very kind
of behavior that I am railing against.
From Pullum 1991:123
TTR = 49/63 = 78%
TTR continued
In this piece I rail against the tendency of linguists to write
about the philosophy of science as applied to their subject field
instead of writing about what languages are like, which is what
linguists are supposed to be good at. Unsympathetic critics will
no doubt charge that by doing so I instantiate the very kind of
behavior that I am railing against. This is not so. I am
complaining about unproductive metalevel discussion, which
consists of linguists talking about doing linguistics instead of
doing it. By offering a critique of such work, I am operating at a
meta-metalevel, talking about linguists talking about doing
linguistics instead of doing it. There is a difference.
TTR = 66/115 = 57%
Type/Token Ratio
 is extremely sensitive to text length
– the longer the text is, the lower the TTR is (normally)
 solution: calculate the TTR for successive chunks of texts (eg every 100 or 1,000 words), and then take an average count at the end (standardized TTR)
 but even standardized TTRs are problematic as a marker of simplicity/complexity as they can’t capture the difference between hard words and easy words
– eg fire vs conflagration
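The chunk-and-average idea can be sketched as follows. One detail is an assumption of this sketch rather than something stated on the slide: a final chunk shorter than `chunk_size` is simply dropped, so that every chunk contributing to the average has the same length. The function name `standardized_ttr` is illustrative.

```python
def standardized_ttr(tokens, chunk_size=1000):
    """Average the TTR over successive full chunks of `chunk_size` tokens.

    Assumption: a trailing chunk shorter than `chunk_size` is ignored.
    Returns None if the text is too short to fill a single chunk.
    """
    chunks = [tokens[i:i + chunk_size]
              for i in range(0, len(tokens) - chunk_size + 1, chunk_size)]
    if not chunks:
        return None
    ratios = [len(set(chunk)) / len(chunk) for chunk in chunks]
    return sum(ratios) / len(ratios)

# Toy token list: two full chunks of four tokens, plus one leftover token.
toy_tokens = "a b a b c d c d e".split()
print(standardized_ttr(toy_tokens, chunk_size=4))  # mean of 2/4 and 2/4
print(len(set(toy_tokens)) / len(toy_tokens))      # plain TTR, for comparison
```

Because every chunk has the same length, standardized TTRs from texts of different sizes can be compared, which plain TTRs cannot.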
Operationalizing ‘Simplification’
 If translated texts are somehow ‘simpler’ than original texts in the same language, then they might have:
– shorter average word length
– shorter average sentence length
– lower lexical density
– lower standardized type-token ratios
compared to originals…
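The first two measures in the list are easy to compute. The sketch below is one possible operationalization, not the one used in any study cited here: it assumes that words are runs of letters (optionally joined by a hyphen or apostrophe) and that sentences end at ".", "!" or "?" followed by white space, which is too naive for abbreviations and similar cases. Function names are illustrative.

```python
import re

WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")

def average_word_length(text):
    """Mean number of characters per word."""
    words = WORD.findall(text)
    return sum(len(w) for w in words) / len(words)

def average_sentence_length(text):
    """Mean number of words per sentence, using a naive sentence splitter."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return sum(len(WORD.findall(s)) for s in sentences) / len(sentences)

sample = "This is not so. There is a difference."
print(average_word_length(sample))      # characters per word
print(average_sentence_length(sample))  # words per sentence
```

Run over a translated and a non-translated subcorpus, the same two functions give figures that can be compared directly, provided both subcorpora are tokenized in the same way.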
Initial Results: Investigating Simplification (Laviosa 1998a)
 Corpus: newspaper articles translated into English, and newspaper articles originally written in English
 Tool: WordList in WordSmith Tools
 Results
– lexical density: lower in translated articles
– average sentence length: lower in translated articles
– standardized type/token ratio: no significant difference
Initial Results: Investigating Simplification (Laviosa 1998b)
 Corpus: fiction translated into English, and fiction originally written in English
 Tool: WordSmith Tools WordList
 Results
– lexical density: lower in translated texts
– average sentence length: higher in translated texts
– standardized type/token ratio: no significant difference
– but translations make greater use of high frequency words than do non-translations = core patterns of lexical use
Combining Quantitative and Qualitative Research
Investigating Explicitation (Olohan and Baker 2000)
 Hypothesis: Translators’ (unconscious?) tendency to be more explicit than other writers in the same language will be visible in their greater use of optional explicitating grammatical elements
– Eg: that after says in ‘He says that Mozart was a wonderful composer’
– Compare: ‘He says Mozart was a wonderful composer’
 Corpus: TEC and BNC
Optional that as seen in concordance lines
 Jeux à deux is the circle, which is to say that the duration of the game has ne
nocuous actions in reality, which is to say that the distance must also be maint
  sped you hideously. Rock of branches, say that it roars. The fields, as the f
ue ink, Yevgenia Wdowin, but it doesn't say that the eggs in the basement are go
sk what smells so awful, and it doesn't say that my mother hasn't uttered a word
ad in mind for you, then the people who say that Russians stink are right." Dear
upstairs to wait for a moment. He is to say that the stairs are blocked because
n his cigar. It occurred to Vogtmann to say that an apt mode of expression avoid
company with her share. Today she could say that she had made the right decision
 must change its owner. I believe I can say that Herr Urban's thoughts are also
n to express our concern. I think I may say that he shares our view.' 'I've got
cking off immediately, I just wanted to say that you would get all the support y
e was not there, but had left a note to say that she had driven to the hospital
 happened to Franz prompts me to lie. I say that he has gone off to work in his
orm of institution. What I am trying to say, though, is that this institution is
tecting suspicion, although he couldn't say that he made a deliberate habit of
Investigating Explicitation
(Olohan and Baker 2000; Olohan 2003, 2004)
Optional that in reporting structures built around SAY and TELL
 that far more common in translated English than in original English
 factors conditioning inclusion of that:
– formality;
– matrix verb;
– potential ambiguity;
– long adverbials; cognitive complexity
 attempt not just to describe, but also to explain the specificities of translation
But what about the Source Texts?
(Kenny 2005)
 Can patterns of inclusion/exclusion of optional that be influenced by patterns for dass in German source texts?
– Oberflächlich zufriedengestellt sagte er, dass er sowieso Zeitungen kaufen und telefonieren wolle.
– Superficially satisfied, he said that he wanted to buy newspapers and make a telephone call anyway.
 Corpus – German-English Parallel Corpus of Literary Texts (GEPCOLT)
 Looked at SAY + zero connective vs SAY + that
SAY + that vs zero-connective:
Results (Kenny 2005)
optional that used with SAY in 157 cases
 78 optional uses of that with SAY coincide with a dass
in the ST
 79 coincide with a zero-connective in the ST
 If you see that, there’s a 50/50 chance that dass was
in the ST
zero-connective used with SAY in 219 cases
 46 coincide with a dass in the ST
 173 coincide with a zero-connective in the ST
 If you don’t see that, then eight out of ten times there
was no dass in the ST either
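The "50/50" and "eight out of ten" statements follow directly from the four counts above. The short sketch below recomputes them; the counts are taken from this slide, while the dictionary layout and the name `row_percentages` are just one convenient, assumed way of arranging them.

```python
# Counts from the slide: target-text connective -> source-text connective.
counts = {
    "that": {"dass": 78, "zero": 79},   # 157 cases of SAY + that
    "zero": {"dass": 46, "zero": 173},  # 219 cases of SAY + zero-connective
}

def row_percentages(table):
    """For each TT choice, give the share of each ST connective behind it."""
    result = {}
    for tt_choice, st_counts in table.items():
        total = sum(st_counts.values())
        result[tt_choice] = {st: n / total for st, n in st_counts.items()}
    return result

shares = row_percentages(counts)
print(f"that in TT, dass in ST: {shares['that']['dass']:.1%}")
print(f"zero in TT, zero in ST: {shares['zero']['zero']:.1%}")
```

Reading the table by row (what lies behind each target-text choice) rather than by column is what supports the two "If you see…" statements.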
SAY + that vs zero-connective:
Results (Kenny 2005)
 patterns of omission of that follow patterns of omission of dass; but patterns of inclusion of that cannot be explained by the ST
 dass to 0 shifts do happen (46), but they’re not as common as 0 to that shifts (79)
 overall tendency is towards grammatical explicitation
From the General to the Particular
 Much CTS focuses on general patterns in translated text (see Mauranen and Kujamäki 2004)
 Search for generalizations inevitably leads to the recognition of particularities, eg the distinctive behaviour of individual translators
 Translators’ individual styles come into focus
Baker (2000), Kenny (2001), Saldanha (2004, 2005), Winters (2004, 2005), Olohan (2004:Chapter 8), Bosseaux (2007)
General Features of Translation vs Features of Translator Style
Eg Kenny (2001)
 focuses on lexical creativity/normalisation
 uses German-English Parallel Corpus of Literary Texts
 uses frequency-ranked word list (WordSmith) to find
– creative once-off words in the STs, and
– words or clusters of words peculiar to one writer
 Uses concordancer (WordSmith) to find creative collocations (node=Auge ‘eye’)
 Uses parallel concordancers (Multiconcord) to find translations in the TTs
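The first step, a frequency-ranked word list from which once-off words are read off, can be approximated in a few lines of Python. This is only a sketch of the general idea, not a reconstruction of WordSmith's WordList: the tokenization is simplistic, and the function names `frequency_list` and `hapax_legomena` are assumptions.

```python
import re
from collections import Counter

def frequency_list(text):
    """Return (word, count) pairs, most frequent first."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens).most_common()

def hapax_legomena(text):
    """Return the words that occur exactly once, in alphabetical order."""
    return sorted(word for word, count in frequency_list(text) if count == 1)

# Hypothetical miniature 'corpus' for illustration.
mini_corpus = "the cat sat on the mat"
print(frequency_list(mini_corpus)[0])
print(hapax_legomena(mini_corpus))
```

Words that occur once are only candidates for creativity: most will be ordinary rare words, so the list still has to be checked by hand, for example against a large reference corpus.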
Clusters: Sample Results for John Brownjohn
(from Natascha Wodin’s (1983) Die Gläserne Stadt)
ST: Und plötzlich im Parterre das Schrillen des Telefons. Helmut. Um diese Zeit konnte nur er es sein.
TT: Suddenly the phone rang--Helmut--it couldn't be anyone else at this hour.
ST: Das Schrillen des Telefons. Nur zwanzig Minuten diesmal. Ihre Anmeldung Moskau. L's russisches Allo ... die rauhe Oberfläche einer Birne.
TT: The phone rang. Only a twenty-minute delay this time. Your call to Moscow, L's Russian "Allo?"--husky as only his voice could be.
ST: Das Schrillen des Telefons unter der Bettdecke.
TT: The telephone rang beneath the bedclothes.
ST: Das Schrillen des Telefons unter der Bettdecke.
TT: Again the telephone rang beneath the bedclothes.
Writer-specific forms: Sample Results for Malcolm Green
(from Gerhard Roth (1986) Am Abgrund)
ST: Da kommt der Staatsbeamte herein und verlangt den Herrn Irrenwäscher zu sprechen.
TT: Then the government official enters and asks to speak to the head madmen-washer.
ST: Der Herr Irrenwäscher kommt auf einem Fahrrad angeradelt und hat eine Fetten.
TT: The head madmen-washer arrives by bicycle, he's as pissed as a fart.
ST: »Das weiß ich selber«, gibt der Herr Irrenwäscher frech zurück
TT: "You're telling me!" the madmen-washer replies impudently.
ST: Der Herr Irrenwäscher ist ein fülliger, behäbiger Mann.
TT: The head madmen-washer is a portly, comfort-loving man.
Homing in on the translator’s voice
 Whose voice do we hear in translation? Whose point of view do we share? The author’s? The translator’s? Both?
 Can we develop methodologies that allow us to show systematic differences between translators, and thus the specific voice of each individual translator?
(handout)
 Bosseaux (2007), Saldanha (2005), Winters (2004, 2005)
– integrate narratological structure, deixis, modality, typography, etc into data- and theory-rich studies of the translator’s presence in translation.
– B. and W. look at different translations of the same source text(s)
– S. looks at several translations by two translators of different source texts, but from the same source language(s)
Winters 2005
 Bilingual parallel corpus
 One ST
– F Scott Fitzgerald’s (1922) The Beautiful and Damned
 Two TTs
– both into German
– both published in 1998
– TT1 by Hans-Christian Oeser
– TT2 by Renate Orth-Guttmann
 Focuses on aspect of the TL that does not have a counterpart in the SL – German modal particles
 Research initially data-driven
Winters 2005
FSF: He considered, nevertheless, that he had given her
an object-lesson and that the matter was closed…
ROG: Immerhin, sagte er sich, habe er ihr eine Lektion
erteilt und damit sei der Fall wohl erledigt.
HCO: Nichtsdestoweniger war er der Meinung, daß er
ihr einen Denkzettel verpaßt hatte und daß die Sache
damit erledigt war…
Interplay between corpora and theoretical translation studies: Translation Units
(Kenny 2004 & forthcoming)
 Bennett (1994) distinguishes:
– translation atoms
– translation focus
– text as macro-unit
 For technologists working in Natural Language Processing translation units are segments of ST aligned with segments of their TT
Translation Units

DTS approach (Toury 1995)
– Coupled pairs of ST and TT segments whose
boundaries are identified by the analyst, according
to ‘no-leftovers principle’
‘Thus, the analyst will go about establishing a segment of
the target text, for which it would be possible to claim that –
beyond its boundaries – there are no leftovers of the
solution to a translation problem which is represented by
one of the source text’s segments, whether similar or
different in rank and scope.’
(Toury 1995:78-79).
Translation Units in NLP (Kraif 2003)
 Lexical correspondences
– Stable semantic equivalents
– Pairing in a corpus a regular occurrence
– Useful for bilingual lexicon extraction
 Translation Equivalents
– Context-bound, possibly once-off pairings
– Considered as ‘noise’ in bilingual lexicon extraction
Looking for Translation Units in a Parallel Corpus
Interested in:
– extended units of meaning and translation units
– context ‘independence’ and context dependence
– text as macro-translation unit
– harnessing potential of corpora to uncover groups of related instances
Assume:
– TUs are mutually defining ST-TT segments
– search will begin in one language but shuttle back and forth between the SL and the TL
– variable TUs (though not normally exceeding clause)
Concordance of mit aller Kraft in
Gepcolt
ff angefroren sein mußte. Sie versuchte mit aller Kraft, bei klarem Verstand zu
n aus den Gedanken, in denen er steckt, mit aller Kraft herausziehen, doch selbs
e Stopfnadel aus meinem Handarbeitszeug mit aller Kraft, und wenn ich sage Kraft
ahrtendolch in Pension ist, und posiert mit aller Kraft wie auf Sophies Bruderfo
hließend ersticht er die tote Schwester mit aller Kraft. Dann ist er endlich dam
der Kurt Lukas war, schien sein Messer mit aller Kraft in die Hündin zu treiben
hle Bäume vor dem Feld, der Mann rannte mit aller Kraft, das sah ich an den gewö
tarre der Meereisdecke ziehen die Hunde mit aller Kraft nach der nächstgelegenen
ultern gegen die Matratze, stemmte mich mit aller Kraft gegen seinen Amoklauf, e
hte lebenden Embryo-Gesichter, scheinen mit aller Kraft von der Sonne angezogen
eren Glas-Röhrchen auf ihr Blut warten, mit aller Kraft und Empörung auf den Fuß
Mit aller Kraft examples (Tables 1 and 2)
What can we say about the:
– boundaries of TUs?
No left-overs? Attentional focus on ST? Macro-unit?
– stability of translation equivalents/lexical correspondences?
– context (or co-text) dependence
Context Dependence
 Er_i wehrte sich zwar mit aller Macht dagegen, verurteilt zu werden
 he_i resisted being convicted with all his_i might
 Sie_j stößt mit dem Stuhl mit aller Gewalt gegen die Türe.
 She_j beats the chair against the door with all her_j might
 man hieb auf ihn von der Seite ein, mit aller Wucht
 he was being lashed by someone at his side with the utmost weight and force
Text as Macro-Unit
(from Wodin 1983/1986)
Sein gebräuntes Gesicht lacht in den Falten. Das Lachen
eines Apfeldiebes. (Par. 125)
His tanned face was creased with laughter, the laughter
of a boy stealing apples.
L mit seinem neuen, schlanken, verjüngten Körper, mit
seinem alten Apfeldieblachen im Gesicht. (Par. 582)
I couldn’t take my eyes off L, with his new, slim,
rejuvenated body and his old, inimitable, mischievous
smile.
Translation Units - Conclusions
 ST segment subject to attentional focus, translation atoms and translation macro-unit can all differ, even on a single occasion
 Context dependence can’t be assumed to indicate a once-off idiosyncratic translation; sometimes it’s the norm
Conclusions
Corpora
– provide a more objective basis for studies of translation, statements about translator behaviour
– allow us to zoom in and out, from general to particular and back again
– allow synthesis of different approaches (cognitive, empirical, etc)
– help us to refine key notions in translation theory
The Future?
 Better contextualization of studies/results, integrating other sources of data
 More sophisticated methodologies and analyses
 More varied corpus types
– Eg multimodal corpora – audiovisual corpora, sign language corpora, etc
 Mainstreaming of basic corpus processing
References
Baker, Mona (1993) ‘Corpus Linguistics and Translation Studies. Implications and Applications’, in Mona Baker,
Gill Francis and Elena Tognini-Bonelli (eds), Text and Technology: In Honour of John Sinclair Amsterdam and
Philadelphia: John Benjamins 233-250.
Baker, Mona (1995) ‘Corpora in Translation Studies: An Overview and Some Suggestions for Future
Research’, Target 7(2):223-243.
Baker, Mona (1996) ‘Corpus-based Translation Studies: The Challenges that Lie Ahead’, in Harold Somers (ed)
Terminology, LSP and Translation: Studies in Language Engineering, in Honour of Juan C. Sager Amsterdam
and Philadelphia: John Benjamins, 175-186.
Baker, Mona (2000) ‘Towards a Methodology for Investigating the Style of a Literary Translator’, Target
12(2):241-266.
Bennett, Paul (1994) ‘The Translation Unit in Human and Machine’ Babel 40:1, 12-20.
Bosseaux, Charlotte (2007) How Does it Feel? Point of View in Translation Amsterdam/New York: Rodopi.
Gibbons, John (2003) Forensic Linguistics. An Introduction to Language in the Justice System Oxford:
Blackwell Publishing.
Daller, Helmut, Roeland Van Hout and Jeanine Treffers-Daller (2003) ‘Lexical Richness in the Spontaneous Speech of
Bilinguals’, Applied Linguistics 24(2):197-222.
Kenny, Dorothy (2001) Lexis and Creativity in Translation: A Corpus-based Study Manchester: St. Jerome.
Kenny, Dorothy (2004) ‘Die Übersetzung von usuellen und nicht usuellen Wortverbindungen vom Deutschen
ins Englische’ in Kathrin Steyer (ed) Wortverbindungen – mehr oder weniger fest. Institut für Deutsche Sprache
Jahrbuch 2003 Berlin/New York: Walter de Gruyter, 335-347.
Kenny, Dorothy (2005) ‘Parallel Corpora and Translation Studies: old questions, new perspectives? Reporting
that in Gepcolt. A case study’ in Geoff Barnbrook, Pernilla Danielsson and Michaela Mahlberg (eds) Meaningful
texts: the extraction of semantic information from monolingual and multilingual corpora London/New York:
Continuum, 154-165.
Kenny, Dorothy (forthcoming) ‘Translation Units and Corpora’ in Alet Kruger and Kim Wallmach (eds) Corpus-based Translation Studies: More Research and Applications Manchester: St. Jerome.
Kilgarriff, Adam, Michael Rundell and Elaine Uí Dhonnchadha (2006) ‘Efficient corpus development for
lexicography: building the New Corpus for Ireland’, Language Resources and Evaluation 40(2): 127-52.
Kilgarriff, Adam and Gregory Grefenstette (2003) ‘Web as Corpus’, Computational Linguistics 29(3): 333-47.
Kraif, Olivier (2003) ‘From translational data to contrastive knowledge’ International Journal of Corpus
Linguistics 8:1, 1-29.
Laviosa, Sara (1998a) ‘The English Comparable Corpus: A Resource and a Methodology’ in Lynne Bowker,
Michael Cronin, Dorothy Kenny and Jennifer Pearson (eds), Unity in Diversity? Current Trends in Translation
Studies Manchester: St. Jerome, 101-112.
Laviosa, Sara (1998b) ‘Core patterns of lexical use in a comparable corpus of English narrative prose’ Meta
43(4): 557-570.
Laviosa, Sara (2002) Corpus-based Translation Studies. Theories, Findings, Applications Amsterdam/New
York: Rodopi.
Mauranen, Anna (2004) ‘Corpora, universals and interference’, in Anna Mauranen and Pekka Kujamäki (eds)
Translation Universals. Do they exist?, Amsterdam and Philadelphia: John Benjamins, 65-82.
Mauranen, Anna and Pekka Kujamäki (2004) (eds) Translation Universals. Do they exist? Amsterdam and
Philadelphia: John Benjamins.
Olohan, Maeve (2003) ‘How frequent are the contractions? A study of contracted forms in the Translational
English Corpus’ Target 15:59-89.
Olohan, Maeve (2004) Introducing Corpora in Translation Studies London and New York: Routledge.
Olohan, Maeve and Mona Baker (2000) ‘Reporting that in Translated English: Evidence for Subconscious
Processes of Explicitation?’ Across Languages and Cultures 1:141-72.
Saldanha, Gabriela (2004) 'Accounting for the Exception to the Norm: a Study of Split Infinitives in Translated
English', Language Matters 35(1):39-53.
Saldanha, Gabriela (2005) The Translator's Style: A Corpus-based Exploration Unpublished PhD thesis.
Dublin: Dublin City University.
Toury, Gideon (1995) Descriptive Translation Studies and Beyond Amsterdam and Philadelphia: John
Benjamins.
Winters, Marion (2004) ‘German Translations of F. Scott Fitzgerald’s The Beautiful and Damned – A Corpus-based Study of Modal Particles as Features of Translators’ Style’. In: Ian Kemble (ed.) Using Corpora and
Databases in Translation Portsmouth: University of Portsmouth, 71-88.
Winters, Marion (2005) A corpus-based study of translator style: Oeser’s and Orth-Guttmann’s German
translations of F. Scott Fitzgerald’s The Beautiful and Damned Unpublished PhD thesis. Dublin: Dublin City
University.