Corpora, registers, and metaphor: What every translator should know but was afraid to ask
Estudos em Tradução – Teorias, Práticas e Tecnologias
PUCRS, August 31, 2012
Tony Berber Sardinha

Multidimensional analysis and translation

Main points

Translation is normally thought of as depending mostly on lexis, but in fact it also involves structure (grammar), among other levels. Grammar is affected by the choices translators make. These choices can have a cumulative effect on the constitution of a text, and this in turn can make the text ‘feel different’ from what the translator had in mind. It is hard to control the cumulative effect of translation choices, so the goal of this part of the mini-course is to raise awareness of the importance of grammar choices in translation rather than to present ‘the solution’ to this issue or ‘the technique’ for handling such cases.

Register shift: If the typical characteristics of a register, e.g. conversation, are translated into a different set of characteristics (typical of another register), then the resulting text will not be a natural conversation, but something else.

Translators translate texts, and texts are shaped by register. Hence it is important that translators become aware of register characteristics. One of these is that they vary systematically: particular linguistic features co-occur systematically in particular registers, and not in others.

Functional interpretation of structural characteristics

Biber, D., & Conrad, S. (2009). Register, genre, and style. Cambridge; New York: Cambridge University Press.

What impact might a translation have on a text based on the information on this table?

Register

‘The register perspective combines an analysis of linguistic characteristics that are common in a text variety with analysis of the situation of use of the variety.
The underlying assumption of the register perspective is that core linguistic features like pronouns and verbs are functional, and, as a result, particular features are commonly used in association with the communicative purposes and situational context of texts.’
Biber, D., & Conrad, S. (2009). Register, genre, and style. Cambridge; New York: Cambridge University Press, p. 2.

Dimensions and Multidimensional Analysis

‘Multidimensional (MD) analysis is a quantitative approach that allows the researcher to compare many different registers, with respect to several different linguistic parameters – the “dimensions.” Two registers can be more or less different with respect to each dimension. By considering all linguistic dimensions, it is possible to describe both the ways and the extent to which registers differ from one another, and ultimately, the overall patterns of register variation in a language. As shown in the last section, the relative distribution of common linguistic features, considered individually, cannot reliably distinguish among registers. There are simply too many different linguistic characteristics to consider. However, these features work together as distinct underlying dimensions. Each of these dimensions represents a group of features that co-occur: the features – as a group – are frequent in some registers and rare in other registers.’
Biber, D., & Conrad, S. (2009). Register, genre, and style. Cambridge; New York: Cambridge University Press, p. 223.
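The idea that features ‘co-occur’ can be made concrete with a small sketch. The Python below computes a plain Pearson correlation between feature rates across texts. This is only the simplest way to look at co-occurrence and is not the procedure MD analysis itself uses to derive dimensions. The feature names and the rates per 1,000 words are invented for illustration and do not come from any corpus.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# INVENTED rates per 1,000 words for five hypothetical texts,
# ordered from most conversational to most informational.
rates = {
    "contractions": [40.0, 35.0, 10.0, 2.0, 1.0],
    "second_person_pronouns": [30.0, 28.0, 8.0, 1.0, 0.5],
    "nouns": [150.0, 160.0, 250.0, 300.0, 320.0],
}

same_pole = pearson(rates["contractions"], rates["second_person_pronouns"])
opposite_poles = pearson(rates["contractions"], rates["nouns"])
print(round(same_pole, 2), round(opposite_poles, 2))
```

In this toy data the two ‘involved’ features rise and fall together (a correlation close to +1), while contractions and nouns move in opposite directions (close to -1), which is the kind of pattern a dimension summarizes.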
Dimensions of register variation for English (Biber 1988, 2009)

Dimension 1
  Biber 1988: Involved versus Informational Production (Portuguese: Produção marcada por envolvimento versus informacional)
  Biber 2009: Involved versus Informational Production (Portuguese: Produção marcada por envolvimento versus informacional)
Dimension 2
  Biber 1988: Narrative versus Non-narrative Concerns (Portuguese: Propósitos narrativos versus não narrativos)
  Biber 2009: Narrative versus Non-narrative discourse (Portuguese: Discurso narrativo versus não narrativo)
Dimension 3
  Biber 1988: Explicit versus Situation-Dependent Reference (Portuguese: Referência explícita versus dependente de situação)
  Biber 2009: Situation-dependent versus elaborated reference (Portuguese: Referência dependente de situação versus elaborada)
Dimension 4
  Biber 1988: Overt Expression of Persuasion (Portuguese: Persuasão explícita)
  Biber 2009: Overt expression of argumentation (Portuguese: Argumentação explícita)
Dimension 5
  Biber 1988: Abstract versus Non-abstract Information (Portuguese: Informação abstrata versus não-abstrata)
  Biber 2009: Abstract versus non-abstract style (Portuguese: Estilo abstrato versus não-abstrato)
Dimension 6
  Biber 1988: On-line Informational Elaboration (Portuguese: Elaboração em tempo real)
  Biber 2009: (not present)

Linguistic characteristics associated with each dimension

Features and weights shown in parentheses are reproduced as marked in the original table. A hyphen in the Tag column means that no tag is listed for that feature.

Dimension 1, positive pole
Feature                                 Portuguese gloss                           Tag       Weight
Private verbs                           Verbo privado                              prv_vb    0.96
THAT deletion                           Apagamento de THAT                         that_del  0.91
Contractions                            Contração                                  contrac   0.90
Second person pronouns                  Pronome de segunda pessoa                  pro2      0.86
Present tense verbs                     Verbo no tempo presente                    pres      0.86
DO as pro-verb                          Verbo DO                                   pro_do    0.82
Analytic negation                       Negação analítica                          -         0.78
Demonstrative pronouns                  Pronome demonstrativo                      pdem      0.76
General emphatics                       Enfatizador                                gen_emph  0.74
First person pronouns                   Pronome de primeira pessoa                 pro1      0.74
BE as main verb                         Verbo to be                                be_state  0.71
Pronoun IT                              Pronome IT                                 it        0.71
Discourse particles                     Partícula discursiva                       prtcle    0.66
Causative subordination                 Subordinação causativa                     sub_cos   0.66
Indefinite pronouns                     Pronome indefinido                         pany      0.62
General hedges                          Atenuador                                  gen_hdg   0.58
Amplifiers                              Advérbio / qualificador amplificador       amplifr   0.56
Sentence relatives                      Pronome relativo                           -         0.55
WH questions                            Pergunta WH                                wh_ques   0.52
Possibility modals                      Verbo modal de possibilidade               pos_mod   0.50
Non-phrasal coordination                Coordenação não-frasal                     o_and     0.48
WH clauses                              Oração WH                                  wh_cl     0.47
Final prepositions                      Preposição final                           fnlprep   0.43
(Adverbs)                               Advérbios                                  advs      (0.42)
(Conditional subordination)             Subordinação condicional                   sub_cnd   (0.32)

Dimension 1, negative pole
Feature                                 Portuguese gloss                           Tag       Weight
Attributive adjectives                  Adjetivo em posição atributiva             adj_attr  -0.47
Prepositions                            Preposição                                 prep      -0.54
Type-token ratio                        Razão Forma-Ocorrência                     ttr       -0.54
Word length                             Tamanho de palavra                         wrlength  -0.58
Nouns                                   Substantivo                                n         -0.80
(Present participial WHIZ deletion)     Oração adjetiva reduzida de gerúndio       -         (-0.32)
(Past participial WHIZ deletion)        Oração adjetiva reduzida de particípio     -         (-0.38)
(Agentless passives)                    Voz passiva sem agente                     agls_psv  (-0.39)
(Place adverbials)                      Advérbio de lugar                          pl_adv    (-0.42)

Dimension 2, positive pole
Feature                                 Portuguese gloss                           Tag       Weight
Past tense verbs                        Verbo no tempo passado                     pasttnse  0.90
Perfect aspect verbs                    Verbo no aspecto perfeito                  perfects  0.48
Third person pronouns                   Pronome de terceira pessoa                 pro3      0.43
Public verbs                            Verbo público                              pub_vb    0.43
Synthetic negation                      Negação sintética                          -         0.40
Present participial clauses             Oração reduzida de gerúndio                -         0.39

Dimension 2, negative pole
Feature                                 Portuguese gloss                           Tag       Weight
(Word length)                           Tamanho de palavra                         wrlength  (-0.31)
(Past participial WHIZ deletion)        Oração adjetiva reduzida de particípio     -         (-0.34)
(Attributive adjectives)                Adjetivo em posição atributiva             adj_attr  (-0.41)
(Present tense verbs)                   Verbos no tempo presente                   pres      (-0.47)

Dimension 3, positive pole
Feature                                 Portuguese gloss                           Tag       Weight
WH relative clauses on object position  Oração WH em posição de objeto             rel_obj   0.63
Pied piping constructions               Oração WH com preposição inicial           rel_pipe  0.61
WH relative clauses on subject position Oração WH em posição de sujeito            rel_sub   0.45
Nominalizations                         Nominalização                              n_nom     0.36
Phrasal coordination                    Coordenação frasal                         p_and     0.36

Dimension 3, negative pole
Feature                                 Portuguese gloss                           Tag       Weight
Time adverbials                         Advérbio de tempo                          tm_adv    -0.46
Place adverbials                        Advérbio de lugar                          pl_adv    -0.49
Adverbs                                 Advérbios                                  advs      -0.60

Dimension 4, positive pole (the only pole)
Feature                                 Portuguese gloss                           Tag       Weight
Infinitives                             Verbo no infinitivo                        inf       0.76
Prediction modals                       Verbo modal de antecipação                 prd_mod   0.54
Suasive verbs                           Verbo suasivo                              sua_vb    0.49
Conditional subordination               Conjunção subordinativa condicional        sub_cnd   0.47
Necessity modals                        Verbo modal de necessidade                 nec_mod   0.46
Split auxiliaries                       Advérbio encaixado no auxiliar             spl_aux   0.44
(Possibility modals)                    Verbo modal de possibilidade               pos_mod   (0.37)

Dimension 5, positive pole
Feature                                 Portuguese gloss                           Tag       Weight
Conjuncts                               Conjuntivos                                conjncts  0.48
Agentless passives                      Voz passiva sem agente                     agls_psv  0.43
Past participial clauses                Orações adjetivas reduzidas de particípio  -         0.42
BY-passives                             Voz passiva com preposição BY              by_pasv   0.41
Past participial WHIZ deletions         Modificador pós-nominal passivo            whiz_vbn  0.40
Other adverbial subordinators           Outros subordinativos                      sub_othr  0.39
(Predicative adjectives)                Adjetivo em posição predicativa            pred_adj  (0.31)

Dimension 5, negative pole
Feature                                 Portuguese gloss                           Tag       Weight
(Type-token ratio)                      Razão Forma-Ocorrência                     ttr       (-0.31)

Biber, D., & Conrad, S. (2009). Register, genre, and style. Cambridge; New York: Cambridge University Press.

Working with tagged texts: The Biber tagger

5 ^zz++++=5
dead ^jj++++=dead,
, ^zz++++=EXTRAWORD
21 ^cd++++=21
wounded ^jj+++xvbn+=wounded
in ^in++++=in
holiday ^nn++++=holiday
shootings ^nns+++??+=shootings
across ^in+pl+++=across
city ^nn++++=city
Girl ^nn++++=Girl,
, ^zz++++=EXTRAWORD
10 ^cd++++=10,
, ^zz++++=EXTRAWORD
now ^rn+dspt+++=now
' ^zz++++='smiling
smiling ^jj+++xvbg+=EXTRAWORD
and ^cc+phrs+++=and
laughing ^jj+++xvbg+=laughing'
' ^zz++++=EXTRAWORD
after ^cs+sub+++=after
being ^xvbg+++xvbg+=being
shot ^nn+++xvbn+=shot
playing ^vwbg+++xvbg+=playing
in ^in++++=in
hydrant ^nn+++??+=hydrant

Text Scores

How to calculate: For each text, calculate the factor score by summing the normalized frequencies of the features that make up each pole, separately, and then subtracting the negative-pole sum from the positive-pole total.
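The scoring rule just stated can be sketched in a few lines of Python. This is a minimal illustration, not the software used to produce the handout's scores: the function name `dimension_score` and the feature labels are hypothetical, and the four input values are the ones used in this section's worked example (nouns 3.5, prepositions 2.0, verbs 2.5, personal pronouns 1.0).

```python
def dimension_score(frequencies, positive_pole, negative_pole):
    """Positive-pole sum minus negative-pole sum for one text.

    `frequencies` maps a feature label to that text's standardized
    frequency. A feature missing from the text counts as 0.0, which
    is an assumption of this sketch.
    """
    positive_total = sum(frequencies.get(f, 0.0) for f in positive_pole)
    negative_total = sum(frequencies.get(f, 0.0) for f in negative_pole)
    return positive_total - negative_total


# Values taken from the worked example in this section.
text_frequencies = {
    "nouns": 3.5,
    "prepositions": 2.0,
    "verbs": 2.5,
    "personal_pronouns": 1.0,
}

score = dimension_score(
    text_frequencies,
    positive_pole=["nouns", "prepositions"],
    negative_pole=["verbs", "personal_pronouns"],
)
print(score)  # (3.5 + 2.0) - (2.5 + 1.0) = 2.0
```

Keeping the two poles as separate lists mirrors the wording of the rule: each pole is summed on its own before the subtraction.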
For example, if the positive pole of a dimension contains the features nouns and prepositions, and the negative pole contains verbs and personal pronouns, then the factor score of each text is calculated by the formula (standardized frequency of nouns + standardized frequency of prepositions) - (standardized frequency of verbs + standardized frequency of personal pronouns). Thus, a text with standardized frequencies of nouns = 3.5, prepositions = 2.0, verbs = 2.5, and personal pronouns = 1.0 has a factor score of 2.0 on the dimension, since (3.5 + 2.0) - (2.5 + 1.0) = 5.5 - 3.5 = 2.0.

Look in the file dim_scores_pucspmac.xlsx for some actual scores of the PUCSP Metaphor Annotated Corpus (PUCSPMAC). What patterns can you identify with respect to the distribution of scores on the dimensions and how they vary across registers?

PUCSPMAC composition

Working on the involved (interactive) vs. informational dimension (Dimension 1)

Why is this dimension important? It is stable across languages, and it is the most significant of all dimensions. A translation can have a major impact on a text's score on this dimension, e.g. by reducing the interactive features and increasing the informational features (or perhaps the other way around, although that is less likely).

Features marked with < > below are the most characteristic on a dimension pole. How would you translate the samples below? Do the relevant dimension features change significantly in the translation, or do they hold?

Dimension 1, positive pole

(1) Register = Conversation
Score = 52.01
Text = -------- sbc050.trn.clean ---------

I <pro1> thought they were gonna <contrac> be <pres> back by now. You <pro2> did n't <contrac> hear <pres> <prv_vb> <that_del> them playing last <advs> night . I <pro1> know <pres> <prv_vb> . Where <wh_ques> did they go <pres> ? They went out to dinner with Arianna 's parents . Arianna 's parents . Yeah . That <pdem> was her grandma on the phone .
They left at <advs> like <fnlprep> , quarter of eight . Mm . Maybe <gen_hdg> they went shopping . First <advs> , and <o_and> then went to dinner . I <pro1> think <pres> <prv_vb> <that_del> they 're <contrac> hanging out .

(2) Register = Fiction
Score = 16.97
Text = -------- fifty_shades_of_grey.txt ---------

I <pro1> scowl with frustration at myself in the mirror . Damn my hair ' it just <gen_emph> will n't <contrac> behave <pres> , and <o_and> damn Katherine Kavanagh for being ill and subjecting me <pro1> to this ordeal . I <pro1> should be <pres> studying for my final exams , which are <pres> next week , yet <advs> here I <pro1> am <pres> trying to brush my hair into submission . I <pro1> must not sleep <pres> with it wet . I <pro1> must not sleep <pres> with it wet . Reciting this mantra several times , I <pro1> attempt <pres> , once more , to bring it under control with the brush .

Negative pole

(3) Register = news
Score = -25
Text = -------- news_000011.txt ---------

Five men <n> were killed <agls_psv> and at <prep> least 21 were wounded <agls_psv> in <prep> separate shootings <n> from <prep> Wednesday <n> morning <n> to <prep> early <adj_attr> Thursday <n> . The first fatality <n> occurred about <prep> 10:30 a.m . Wednesday <n> , when Robert <n> Snipes <n> , 31 , was shot <agls_psv> in <prep> the arm <n> and back during <prep> an argument <n> with <prep> another man <n> in <prep> the 1700 block <n> of <prep> North <n> <pl_adv> Pulaski <n> Road <n> , authorities <n> said . Snipes <n> , of <prep> the 200 block <n> of <prep> North <n> <pl_adv> Kostner <n> Avenue <n> , was taken <agls_psv> to Mount Sinai <n> Hospital <n> , where he later died . At 11:20 p.m . , a 35 - year <n> - old <adj_attr> man <n> was found <agls_psv> dead with <prep> a gunshot <n> wound to <prep> the head <n> in <prep> the 100 block <n> of <prep> East <n> <pl_adv> 68th Street <n> , according <prep> to <prep> police.
(4) Register = academic
Score = -24.43
Text = -------- chromosome.txt ---------

Genome <n> scan <n> of <prep> human <adj_attr> systemic <n> lupus <n> erythematosus <n> : Evidence <n> for <prep> linkage <n> on <prep> chromosome <n> 1q in <prep> African - american <adj_attr> pedigrees <n> Systemic <n> lupus <n> erythematosus <n> ( sle <n> ) is an autoimmune <n> disorder <n> characterized <n> by <prep> production <n> of <prep> autoantibodies <n> against <prep> intracellular <n> antigens <n> including <prep> Dna <n> , ribosomal <n> P , Ro <n> ( ss <n> - a ) , La ( ss <n> - b ) , and the spliceosome <n> . Etiology <n> is suspected <agls_psv> to involve genetic <adj_attr> and environmental <adj_attr> factors <n> . Evidence <n> of <prep> genetic <adj_attr> involvement <n> includes : associations <n> with <prep> Hla <n> dr3 <n> , Hla <n> - dr2 <n> , Fc <n> receptors <n> ( fcr <n> ) Iia <n> and Iiia <n> , and hereditary <adj_attr> complement <n> component <n> deficiencies <n> , as well as familial aggregation <n> , monozygotic <n> twin <n> concordance <n>.
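To check whether the marked features hold in a translation, one option is to count the < > annotations in a sample and in its tagged translation and compare the rates. The Python below is a rough sketch under stated assumptions: it expects text annotated in the same `<tag>` style as the samples above, it treats any token containing a letter or digit as a word (so `did n't` counts as two words and punctuation counts as none), and it assumes the translation has been tagged with a comparable tag set. All function names are hypothetical, and the tagger's own word counts may differ.

```python
import re
from collections import Counter

# Matches annotations such as <pro1> or <agls_psv> in the tagged samples.
TAG_PATTERN = re.compile(r"<([a-z0-9_]+)>")


def count_tags(tagged_text):
    """Return how many times each <tag> annotation occurs."""
    return Counter(TAG_PATTERN.findall(tagged_text))


def count_words(tagged_text):
    """Count tokens with at least one letter or digit, ignoring tags.

    Assumption of this sketch: punctuation tokens are not words.
    """
    tokens = TAG_PATTERN.sub(" ", tagged_text).split()
    return sum(1 for token in tokens if any(ch.isalnum() for ch in token))


def rates_per_thousand(tagged_text):
    """Normalize tag counts to occurrences per 1,000 words."""
    total = count_words(tagged_text)
    if total == 0:
        return {}
    counts = count_tags(tagged_text)
    return {tag: 1000 * n / total for tag, n in counts.items()}


def rate_changes(source_text, translated_text):
    """Per-tag rate difference (translation minus source)."""
    src = rates_per_thousand(source_text)
    tgt = rates_per_thousand(translated_text)
    tags = sorted(set(src) | set(tgt))
    return {tag: tgt.get(tag, 0.0) - src.get(tag, 0.0) for tag in tags}


# Two sentences from the conversation sample (1) above.
sample = "I <pro1> know <pres> <prv_vb> . Where <wh_ques> did they go <pres> ?"
print(count_tags(sample))
print(count_words(sample))
```

A large negative value from `rate_changes` for tags such as `contrac` or `pro2`, together with a positive value for `n` or `prep`, would suggest that the translation has moved the text toward the informational pole.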