Pushing back the limits of phraseology:
How far can we go?
Sylviane Granger
Centre for English Corpus Linguistics, Université catholique de Louvain – Belgium
1. Phraseology wide and narrow
Phraseology is pervasive in all language fields and yet despite this fact – or perhaps precisely
because of it – it has only relatively recently become established as a discipline in its own right.
The phraseology literature represents it as a subfield of lexicology dealing with the study of word
combinations rather than single words. These multi-word units (MWUs) are classified into a
range of subtypes in accordance with their degree of semantic non-compositionality, syntactic
fixedness, lexical restrictions and institutionalization. However, as phraseology has strong links but fuzzy
borders with several other fields of linguistics, notably morphology, syntax, semantics
and discourse, linguists vary in their opinions as to which subsets of these MWUs should be
included in the field of phraseology. Compounds and grammatical collocations are cases in point.
This difficulty in establishing what exactly falls under the umbrella of phraseology is
compounded by the fact that phraseology is a dynamic phenomenon, and displays both
synchronic and diachronic variations (Moon 1998; Giegerich 2004).
Although there is still considerable disagreement among linguists as regards the
terminology and typology of word combinations and the limits of phraseology itself, there is
general agreement that phraseology constitutes a continuum along which word combinations are
situated, with the most opaque and fixed ones at one end and the most transparent and variable
ones at the other (Cowie 1998: 4-7; Howarth 1998: 168-171; Gross 1996: 78). One of the main
preoccupations of linguists working within this tradition has been to find linguistic criteria to
distinguish one type of phraseological unit from another (e.g. collocations vs. idioms or full
idioms vs. semi-idioms) and especially to distinguish the most variable and transparent multi-word units from free combinations, which only have syntactic and semantic restrictions and are
therefore considered as falling outside the realm of phraseology (Cowie 1998: 6).
As Cowie (1998) points out, it is this approach, itself greatly indebted to the Russian tradition,
which deserves much of the credit for having established phraseology as a discipline in its own
right. It has provided linguists with a set of discrete criteria which can be used to categorize and
analyze word combinations as well as provide thorough descriptions of phraseological units. At
the same time though, establishing non-compositionality and fixedness as key indices of
phraseology has placed focus firmly on units such as proverbs, idioms or phrasal verbs to the
detriment of more variable combinations, which, because they are considered less ‘core’, tend to
be dealt with less in reference and teaching tools, a state of affairs which is reflected in the large
number of books devoted to idioms or phrasal verbs currently on the market.
A more recent approach to phraseology, which originated with Sinclair’s pioneering
lexicographic work (Sinclair 1987) and is usually referred to as the statistical or frequency-based
approach (Nesselhauf 2005), has turned phraseology on its head. Instead of resorting to a top-down approach which identifies phraseological units on the basis of linguistic criteria, it uses a
bottom-up corpus-driven approach to identify lexical co-occurrences. This inductive approach
generates a wide range of word combinations, which do not all fit predefined linguistic categories
(Moon 1998: 39). It has opened up a “huge area of syntagmatic prospection” (Sinclair 2004: 19)
encompassing sequences like frames and colligations as well as institutionalized phrases, which
are “syntactically and semantically compositional, but occur with markedly high frequency (in a
given context)” (Sag et al. 2002). Such units, traditionally considered as peripheral or falling
outside the limits of phraseology, have recently revealed themselves to be pervasive in language,
while many of the most restricted units have proved to be highly infrequent.
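The bottom-up identification of lexical co-occurrences described above can be illustrated with a minimal sketch. It assumes whitespace-tokenized text, looks only at adjacent word pairs, and uses pointwise mutual information (PMI) as the association measure; the function name `score_bigrams`, the `min_count` threshold and the toy corpus are all hypothetical choices made for illustration, not the procedure of any of the studies cited here.

```python
import math
from collections import Counter


def score_bigrams(tokens, min_count=2):
    """Return (bigram, count, pmi) for adjacent word pairs seen at least min_count times."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)  # number of adjacent pairs in the text

    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # frequency filter: keep recurrent pairs only
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently of each other.
        pmi = math.log2(p_pair / (p_w1 * p_w2))
        scored.append(((w1, w2), count, pmi))

    # Most frequent pairs first, ties broken by stronger association.
    return sorted(scored, key=lambda item: (-item[1], -item[2]))


# Toy corpus, invented for illustration only.
toy_corpus = "on the other hand it is clear that on the other hand we see".split()
results = score_bigrams(toy_corpus)
for bigram, count, pmi in results:
    print(" ".join(bigram), count, round(pmi, 2))
```

No linguistic category is consulted at any point: the output is simply whatever recurs, which is why such lists contain units that fit no predefined category and need subsequent linguistic interpretation.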
Unlike proponents of the classical approach to phraseology, Sinclair and his followers are much
less preoccupied with distinguishing between different categories and subcategories of word
combinations or more generally, with setting clear boundaries to phraseology. In Sinclair’s
framework, phraseology is central: phraseological items take precedence over lexical items. This
radical view has been criticized. Gaatone (1997: 168), for instance, welcomes the growing
importance attached to multi-word units but warns against considering everything as
phraseological. However, there is now some strong support for the ubiquity and centrality of
phraseology, both from corpus-based linguistic studies and also from recent psycholinguistic
studies, such as Wray’s (this volume), which present holistic storage as the default type of
processing.
2. Reconciling the two approaches
If phraseology is to be successfully integrated into both theoretical second language acquisition
(SLA) studies and pedagogical applications, the most promising avenue would seem to be one
that combines the benefits of the two approaches: the fine-grained linguistic analysis of the
traditional approach and the heuristic value of the statistical approach.
The traditional approach provides SLA researchers with a keener awareness of the different
categories of MWUs. Current studies either make do with one overarching notion of ‘formulaic
sequence’ or completely disregard the impact of phraseology on speakers’ word knowledge
scores. Fresh light would be shed on the results of Wolter’s (2001) study of the syntagmatic vs.
paradigmatic organization of the L1 and L2 mental lexicon if the phraseological profile of the
prompt words was taken into account.
The traditional approach also has much to contribute to pedagogical research as “different kinds
of MWUs suggest different kinds of learning” (Grant & Bauer 2004: 51). While it is neither
realistic nor desirable to expect materials designers, teachers and learners to master the full range
of fine-grained categories and subcategories of MWUs, all these groups would benefit from a
good understanding of the major categories and accessible terminology (Lewis 2000: 129-130).
This said, gaining a good grasp of the contextual use of words involves much more than the
traditional bona fide categories of multi-word units. In an applied perspective, the frequency-based approach has an undeniable advantage as it covers the whole range of co-occurrence
patterns with no a priori exclusions. Even so-called free combinations have a place. While they
are often presented as predictable and hence not worthy of attention, they have been reinstated by
recent studies of learner language which have shown that what is felt to be predictable by native
speakers of the language may in fact present problems for foreign learners (Lea & Runcie 2002:
823-824). Nesselhauf’s (2005) study of V + N combinations has demonstrated that free
combinations are not always used correctly by learners: she identified an error rate that was lower
than for collocations but by no means negligible (17% vs. 25%). The frequency-based approach
has also highlighted the importance of another category of MWU, what Biber (2004) calls
‘lexical bundles’, compositional recurrent sequences which he describes as “the most important
textual building blocks used in spoken and written discourse.” Similar studies based on learner
corpora of academic writing (De Cock 2003 and this volume; Paquot 2005 and this volume;
Flowerdew 1998 and 2003; Granger & Paquot 2005) have shown that it is precisely these
building blocks which cause learners difficulty. It follows that if learners are to become more
fluent speakers and writers, these types of unit have to be included in any course or textbook
alongside fully-fledged idioms and other traditionally recognized units.
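To make the notion of compositional recurrent sequences more concrete, the following sketch extracts word sequences of a fixed length that recur across several texts. It is a simplified illustration only: the function name `extract_bundles`, the sequence length, the two thresholds (`min_freq`, `min_texts`) and the toy texts are assumptions made here, and published studies typically work with frequencies normalized to corpus size rather than raw counts.

```python
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_freq=2, min_texts=2):
    """Return n-word sequences occurring at least min_freq times in at least min_texts texts."""
    freq = Counter()            # total occurrences of each sequence
    spread = defaultdict(set)   # ids of the texts in which each sequence occurs

    for text_id, text in enumerate(texts):
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            freq[gram] += 1
            spread[gram].add(text_id)

    # The dispersion condition guards against sequences that are frequent
    # only because a single text repeats them.
    return {
        " ".join(gram): count
        for gram, count in freq.items()
        if count >= min_freq and len(spread[gram]) >= min_texts
    }


# Toy texts, invented for illustration only.
toy_texts = [
    "on the other hand the results are clear",
    "the data on the other hand are mixed",
    "as a result of the change prices fell",
]
bundles = extract_bundles(toy_texts)
print(bundles)
```

With these toy texts the only sequence that passes both thresholds is "on the other hand", which occurs once in each of the first two texts. Running the same procedure on a native and a learner corpus and comparing the resulting lists is one simple way of locating the building blocks that learners overuse, underuse or misuse.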
What we need, then, is a combination of the two approaches. While it is advisable to start from a
very wide notion of phraseology, the frequency-based information should be complemented with
insights drawn from other disciplines as not all units identified by quantitative methods are
pedagogically valuable. Traditional phraseological theory is essential here as it provides the
necessary apparatus to break down the statistical units into linguistically-defined categories, an
essential step towards optimal pedagogical integration. In fact, statistical multi-word units should
be viewed as raw material which needs to be refined using a series of filters: linguistic (types of
MWUs), cognitive (notions of salience, animacy, etc.), cross-linguistic (degree of congruence
with learner’s L1) and didactic (teaching objective).
3. Conclusion
The existence of two widely different approaches to phraseology is an undeniable asset for a field
whose importance is now universally acknowledged. SLA theoreticians and practitioners have all
the necessary ingredients – a solid theoretical apparatus, large native and learner corpora and
powerful extraction tools – to integrate phraseology more solidly into SLA theory and teaching
practice. It is to be hoped that they will avail themselves of these resources so that phraseology
can at long last have the place it deserves in language education.
References
Biber D. (2004) Lexical bundles in academic speech and writing. In Lewandowska-Tomaszczyk B. (ed.)
Practical Applications in Language and Computers. Frankfurt: Peter Lang, 165-178.
Cowie A.P. (1998) Introduction. In Cowie A.P. (ed.) Phraseology: Theory, Analysis, and Applications.
Oxford: OUP, 1-20.
De Cock S. (2003) Recurrent Sequences of Words in Native Speaker and Advanced Learner Spoken and
Written English: a Corpus-driven Approach. Unpublished PhD dissertation. Louvain-la-Neuve:
Université catholique de Louvain.
Flowerdew L. (1998) Integrating ‘Expert’ and ‘Interlanguage’ Computer Corpora Findings on Causality:
Discoveries for Teachers and Students. English for Specific Purposes 17(4): 329-345.
Flowerdew L. (2003) A Combined Corpus and Systemic-Functional Analysis of the Problem-Solution
Pattern in a Student and Professional Corpus of Technical Writing. TESOL Quarterly 37(3): 489-511.
Gaatone D. (1997) La locution : analyse interne et analyse globale. In Martins-Baltar M. (ed.) La
locution entre langue et usages. Langages. Fontenay-Saint Cloud: ENS éditions, 165-177.
Giegerich H. J. (2004) Compound or phrase? English noun-plus-noun constructions and the stress
criterion. English Language and Linguistics 8: 1-24.
Granger S. & M. Paquot (2005) The phraseology of EFL academic writing: Methodological issues and
research findings. Paper presented at ICAME 26 – AAACL 6 (International Computer Archive of
Modern and Medieval English - American Association of Applied Corpus Linguistics), 12-15 May
2005, University of Michigan, USA.
Grant L. & L. Bauer (2004) Criteria for Re-defining Idioms: Are we Barking up the Wrong Tree? Applied
Linguistics 25(1): 38-61.
Gross G. (1996) Les expressions figées en français. Noms composés et autres locutions. Paris: Ophrys.
Howarth P. (1998) The Phraseology of Learners’ Academic Writing. In Cowie A.P. (ed.) Phraseology:
Theory, Analysis, and Applications. Oxford: OUP, 161-186.
Lea D. & M. Runcie (2002) Blunt instruments and Fine Distinctions: a Collocations Dictionary for
Students of English. In Braasch A. & C. Povlsen (eds) Proceedings of the Tenth EURALEX
International Congress. Copenhagen: Center for Sprogteknologi, 819-829.
Lewis M. (2000) Teaching collocation. Further Developments in the Lexical Approach. Boston: Heinle.
Moon R. (1998) Fixed expressions and idioms in English. Oxford: Clarendon Press.
Nesselhauf N. (2005) Collocations in a Learner Corpus. Amsterdam & Philadelphia: Benjamins.
Paquot M. (2005) Towards a productively-oriented academic word list. Paper presented at Practical
Applications in Language and Computers, 7-9 April 2005, Łódź, Poland.
Sag I. A., T. Baldwin, F. Bond, A. Copestake & D. Flickinger (2002) Multiword Expressions: A Pain in
the Neck for NLP. In Proceedings of the Third International Conference on Intelligent Text Processing
and Computational Linguistics (CICLING 2002), Mexico City, 1-15.
Sinclair J. (1987) Looking Up. An account of the COBUILD Project in lexical computing. London: Collins
ELT.
Sinclair J. (2004) Trust the Text – Language, corpus and discourse. London: Routledge.
Wolter B. (2001) Comparing the L1 and L2 Mental Lexicon. A Depth of Individual Word Knowledge
Model. Studies in Second Language Acquisition 23: 41-69.