Learner Corpora and English Language Teaching

Learner Corpora and English Language Teaching: Checkup Time
FANNY MEUNIER, Louvain-la-Neuve, Belgium
Prepublication draft
<to appear in Anglistik: International Journal of English Studies 21.1 (March 2010): 209-220>
1. Introduction
A checkup is usually a routine visit to a specialist in order to assess one's state of
health at a particular point in time with a view to becoming an informed patient and,
if necessary, taking appropriate measures to improve one's health. The checkup metaphor is
used in the title to refer to the assessment of the links between learner corpus research
(LCR) and English language teaching (ELT). As is the case for medical checkups,
however, no claim of exhaustive analysis will be made here. The focus will rather be
on a limited number of issues which are considered vital for the future of LCR and
suggestions for improvement will be provided in areas where the situation could, at
least in my view, be improved.
2. Learner corpora, SLA and ELT: general overview
Ten years ago, Tono (1999) listed the areas of second language acquisition (SLA) and
ELT that learner corpora could contribute to. Learner corpus studies, Tono (1999)
argued, can foster SLA research by: 1. describing the developmental stages of interlanguage
(IL), 2. studying the effect of L1 transfer, 3. identifying the overuse and
underuse of linguistic features, 4. discriminating between universal and L1 specific
errors, and 5. distinguishing native from non-native-like performance.
2.1. Learner corpora and SLA
Ten years later, it would probably be overoptimistic to claim that LCR has provided a
clear description of the developmental stages of interlanguage, one of the reasons
being the lack of availability of large longitudinal learner corpora. SLA researchers
have long been working on developmental stages of acquisition but many of the SLA
studies are not corpus-based and are limited to the study of a small number of informants.
The advantages of corpus use (size, storage, computer-aided annotation and
analysis, etc.) could nicely complement the many existing non-corpus based longitudinal
SLA studies. Some large-scale projects have been carried out, but they have
typically focussed on early stages of acquisition and/or on research topics traditionally
anchored in the generative tradition such as the acquisition order of morphemes,
negation or phrase structure (see for instance Vainikka and Young-Scholten 1998 for
an illustration of the phrase structure approach). Famous longitudinal studies include
Klein and Perdue's analysis of the second language acquisition of adult immigrant
workers between 1981 and 1988 (see Perdue 1993a; Perdue 1993b; Klein and
Perdue 1997). In this extensive crosslinguistic study, learners from different mother
tongue backgrounds learning different second languages were all found to go through
three relatively stable stages (pre-basic, basic and post-basic) generally characterized
by a limited lexicon (with learners using mainly open part-of-speech classes), simple
sentence structures and the absence of inflectional morphology. Other researchers,
such as Myles et al. (1998, 1999), for instance, have used the CHILDES database
format, initially developed for the study of L1 acquisition (MacWhinney 1999), to
analyse classroom foreign language learning. Here again, the main focus was on early
stages of learning rather than on more advanced proficiency levels. Some recent research
projects have been launched to compensate for the lack of longitudinal learner
corpora targeting more advanced stages of acquisition. Reder et al. (2003) present a
multimedia adult ESL learner corpus (MAELC) in which initially low-level adult ESL
classroom learners have been recorded with multiple video cameras and microphones
during a 5-year period (2001-2006). A transcription framework has been developed
and transcripts include information not only on what students said but also on
what their target utterance was in order to facilitate the identification of certain types
of errors. As indicated on the MAELC website, the multimedia aspect of the corpus
also allows for close examinations of dyadic and small-group interactions between
students from different language backgrounds1. A more recent project is
LONGDALE, initiated in January 2008 at the Centre for English Corpus Linguistics
in Louvain, Belgium2. The aim of the project is to build a large longitudinal database
of learner English containing data from learners from a wide range of mother tongue
backgrounds. The same students will be followed over a period of at least three years
(from year 1 to year 3 of their studies at the University of Louvain) and data collections
will be organised at least once a year. It is to be hoped that the collection and
analysis of such corpora will help researchers shed more light on the developmental
stages that learners with different mother tongues go through.
In contrast to the study of the developmental stages of interlanguage, the study of
the effect of L1 transfer, the identification of the overuse and underuse of linguistic
features, of universal and L1 specific errors, and of native vs non-native performance
(see above) have largely benefited from the insights gained from LCR 3.
2.2. Learner corpora and ELT
As for the focus of the present paper, i.e. the links between LCR and ELT, Tono (1999)
states that learner corpora could be used as input to inform L2 lexicography, syllabus
design, material design, data-driven learning and self-learning. He also argues that
explaining the potential and use of learner corpora should be part of teacher education.
The evaluation of those scholarship4 issues is probably less satisfactory than that of the research
issues linked to SLA. Apart from lexicography, which has benefited from learner
corpus analysis, the other ELT-related domains have not really been fed by LCR:
The field in which advances have been quickest is pedagogical lexicography. The latest
editions of the Longman Dictionary of Contemporary English (LDOCE) (2003) and the
Cambridge Advanced Learner's Dictionary (CALD) (2003) both contain error notes
based on their respective learner corpora, which are intended to help learners to avoid
making common mistakes (Granger 2008, 345).
More recently still, Granger (2009, 24) provides a critical evaluation of the contribution
of learner corpora to second language acquisition and foreign language teaching,
and, in the section devoted to learner corpus research and foreign language teaching,
she stresses the fact that "there is undeniably very little evidence of fully-fledged
up-and-running applications". In that state-of-the-art article, Granger also distinguishes
between the use of learner corpora for delayed or immediate pedagogical use
(DPU or IPU corpus use). In a DPU situation, learner corpora
are not used directly as teaching/learning materials by the learners who have produced
the data [but] are compiled by academics or publishers with a view to providing a better
description of one specific interlanguage and/or designing tailor-made pedagogical
tools which will benefit similar-type learners. (Granger 2009, 24-25).
Learner corpora for immediate pedagogical use (IPU) are "collected by teachers as
part of their normal classroom activities […] and the learners are at the same time
producers and users of the corpus data" (Granger 2009, 25).
This distinction between DPU and IPU will be used to organise the following sections of
the present paper: section 3 will examine two DPU issues, namely syllabus design and
material design; section 4 will address the IPU side of learner corpora and will present
a few studies which report on learners who have analysed their own productions.
Here, I will also touch upon the need for (learner) corpus literacy in teacher education.
The final section will provide both a summary and a list of priority issues for the
future.
3. LCR for syllabus and materials design
Graves (1996, 3) defines the term syllabus as "the specification and ordering of content
of a course or courses". As for the term materials, it refers to the various and
specific written, audio and/or video inputs which are part of the syllabus and which are
provided to learners, together with exercises and tasks presented to exploit the input.
In the ELT circles promoting the use of authentic materials, it has been suggested that
spoken and/or written excerpts from native corpora can be used as raw teaching input
(e.g. the transcript of a postcard from the BNC)5. Lee (2001) argues that "genre analyses
of relatively small, focussed and manageable sets of texts are now possible with
the help of the BNC Index, opening up a rich resource for all kinds of learning and
research activities". Teachers and researchers may have access to some rare subgenres,
such as postcards or shopping lists, which were not included in traditional
general corpora. The outcomes of native corpus research can also feed materials and
syllabus design: e.g. frequency-based vocabulary selection and grading, results of
corpus-based genre analysis for text-type awareness exercises, use of concordances
for data-driven learning activities, etc.
Recourse to learner corpora for syllabus and materials design is still relatively rare
in ELT. Several reasons may account for this. First, the belief that the provision of
incomprehensible input is detrimental to learning is still present in ELT circles despite
the many studies highlighting the benefits of incomprehensible input and erroneous
output in interactional feedback (cf. Gass 1997; Nassaji 2007; Carroll 2000; White
1987; VanPatten 2002; Braidi 2002; Mackey et al. 2003)6. As for the use of the outcomes
of LCR to feed materials or syllabus design, the main problem lies in the lack
of relevance and generalizability of the results provided in LCR studies pertaining
both to the types of learner corpora used and to the research focus. As already stated
in section 2.1, some learner corpora focus on the early stages of acquisition and/or on
research topics traditionally anchored in the generative tradition; the results of such
studies cannot easily be included in current syllabuses or materials which largely
advocate a communicative approach to language teaching. Other types of learner
corpora target upper-intermediate to advanced levels of proficiency: for example, the
International Corpus of Learner English (ICLE) 7, which contains over 3 million
words of writing by learners of English from 14 different mother tongue backgrounds
at university level (Granger et al. 2009); sections of the Michigan Corpus of Academic
Spoken English (MICASE)8 or the Michigan Corpus of Upper-level Student
Papers (MICUSP)9, which contain spoken and written academic productions by both
native and non-native speakers. Some other learner corpora contain data from one
mother tongue population only (e.g. Kojiro's English learner corpus produced by
Japanese college students)10, which also reduces the generalizability of the results.
Another reason which accounts for the lack of direct influence of LCR studies on
ELT syllabuses and materials is that, apart from the EAP/ESP teachers who do use
EAP or advanced learner corpora (see for instance Flowerdew 2003; Gilquin et al.
2007; Paquot 2008), the topics covered in most learner corpora are often miles away
from the everyday needs of a vast majority of ESL or EFL school teachers who target
English for general purposes for a teenage audience. Finding a learner corpus that
meets their needs comes close to looking for a needle in a haystack. The Common
European Framework of Reference for Languages (CEFR, Council of Europe 2001:
52) suggests that the following thematic categories should be addressed in English for
General Purposes (EGP): personal identification; house and home, environment; daily
life; free time, entertainment; travel; relations with other people; health and body
care; education; shopping; food and drink; services; places; language; weather. Subcategories
can also be established (e.g. leisure, hobbies and interests, radio and TV,
cinema, theatre, concert, etc. for 'free time and entertainment'). To my knowledge, no
native or learner corpus study provides easily transferrable research results which
could be integrated in a syllabus addressing the above-mentioned themes. It must also
be kept in mind that besides containing up-to-date materials which meet the pupils'
interests and which can be linked to their everyday life preoccupations, a syllabus
should also include text types representative of the meaningful communicative tasks
that learners are called upon to perform (e.g. writing an email, a letter of excuse to a
friend, the summary of a book, his/her opinion on a film; chatting with a friend, describing where he/she lives, etc.). All this makes the task of finding large, freely
available and easily searchable corpora which would be appropriate for the teaching
of EGP even more daunting, if not impossible.
Some projects and/or computer tools have been developed with the aim of facilitating
the collection of topic specific corpora. One such project is Kilgarriff's Web
BootCaT (see Baroni and Bernardini 2004, Kilgarriff and Grefenstette 2003), which
makes it possible to quickly collect a large corpus of web data on the basis of representative
keywords. RSS technology is also one way of promoting teachers' and
learners' corpus literacy and of automating tailor-made corpus collection (Meunier
and Fairon 2006). This said, those tools can only be used with web data, with all the
advantages and disadvantages this entails, such as for instance a lack of information
on the author of the text, a 'you-can-only-get-what-is-on-the-web' bias, and an additional
bias on the written mode (despite the many examples of speech-like materials
that can be found in chat rooms).
Such limitations notwithstanding, it could be argued that learner corpora could be
used not so much for the interest of the topics, text types or tasks covered but rather as
a source of information on less topic- or genre-dependent issues such as sentence
structure and grammatical problems. McCarthy (2008) has recently acknowledged the
fact that what he calls 'non-native user corpora' are under-developed and underexploited.
He states, however, that
learner corpora (usually in the form of examination scripts or essays or classroom transcripts)
have been used a great deal and are frequently a resource for evidence in error
warnings in teaching materials (e.g. Carter & McCarthy 2006) or as sources for the targeting
of particular language features in materials (McCarthy & O'Dell 2005).
(McCarthy 2008, 570)
The question of the status of errors, McCarthy (2008, 570) stresses, is "one area
where corpus linguistics overlaps with extant and long-standing preoccupations in
teacher education". It should, however, be added that very few large-scale error
tagged learner corpora are freely available11 and that too little research has been carried
out on part-of-speech tagged and syntactically parsed learner corpora. In this
context, Meunier (1995 and 2002) addresses general issues in tagging and parsing
interlanguage, and the role of learner and native corpora in grammar teaching.
Granger (2003), Granger and Thewissen (2005) and Thewissen (2008) deal with the
value of error tagged corpora in CALL and in the assessment of language proficiency,
and Meunier (in preparation) focuses on interlanguage syntactic complexity.
The lack of spoken learner corpora also reinforces the underuse of learner corpora
in syllabus and materials design. Referring to native corpora, Shirato and Stapleton
(2007) write:
[A] major concern is that most emphasis to date has been put on written texts with very
few attempts made to analyze spoken data (the British National Corpus (BNC) is 90%
written vs. 10% spoken data) for the purposes of developing pedagogical materials.
(Shirato and Stapleton 2007, 394)
This statement of fact is all the more true for learner corpora. Whilst spoken interaction
is a core issue in ELT, very few spoken learner corpora have been compiled
and analyzed to date. A few exceptions include the LINDSEI corpus (cf. De Cock
2007) and the SST corpus (cf. Tono et al. 2001). In addition to the limitations already presented
for written learner corpora (see above), spoken LCR has to deal with further demands:
the recording, transcription and annotation of spoken corpora are even more
difficult and time-consuming than is the case for written corpora. Once spoken learner
corpora are collected, transcribed and analyzed (which often includes a comparison
with native spoken corpora), many spoken-specific features are brought to light. Shirato
and Stapleton (2007), in their comparison of English vocabulary in a spoken
learner and native speaker corpus, point out the following:
[T]he results obtained lead [them] to the conclusion that the vocabulary currently acquired
by Japanese NNS differs markedly from the NS norm. [Their] qualitative analysis
has illustrated significant aspects of NS's conversational vocabulary in which softness,
indirectness, hedges, and vagueness are abundant. These characteristics may be
considered a basic defining feature of spoken lexis, although they have long been given
only scant attention in Japanese formal education, in which a mastery of written language
has been a major priority. (Shirato and Stapleton 2007, 410)
Those speech-specific language features will only receive appropriate treatment if
more spoken corpora are collected and analyzed in the future. The results of such
studies should also find their way to the ELT syllabus and materials which still tend
to give a rather monolithic view of the language. Very few textbooks feature and/or
explain speech-specific grammatical features such as ellipsis, left dislocation, tail slot,
hesitation features, vagueness hedges or overtures (see Hewings and Hewings 2005
for more examples).
A last reason for the scarce use of corpus-based material (native and learner corpora
alike) by teachers has been put forward by McCarthy (2008) who mentions a
lack of awareness of what exactly corpora can bring to ELT:
[T]eachers have heard of corpora, but they are not quite sure what they are. They are
sometimes frightened of what their use might imply: Does a teacher need to have a high
level of expertise in computational linguistics or information technology in order to be
part of this pedagogical revolution? Does one have to be a native speaker of a particular
language in order to understand and use corpus information in that language?
(McCarthy 2008, 563-564)
This relationship between teachers and corpora will be touched upon in the following
section, together with the assessment of the IPU of learner corpora.
4. Learner corpora for immediate pedagogical use (IPU)
Whilst the previous section addressed DPU issues, the present section focuses on
studies or experiments where learners are asked to analyse their own productions as
part of their learning activities. Mukherjee and Rohrbach (2006, 205) mention "a
widening gap and a growing lag between on-going and intensive corpus linguistic
research on the one hand and classroom teaching on the other" and state that the exploitation
of learner corpora in the EFL classroom is still marginal. Granger (2008)
observes that when teachers do use learner corpora to develop their own in-house
teaching materials, the latter share a number of characteristics:
(1) they tend to be based on learner corpora for immediate pedagogical use; (2) they are
often L1-specific rather than generic; (3) they are designed with a clear teaching objective
in a well-defined teaching context; and (4) they tend to be electronic rather than
paper tools. (Granger 2008, 348)
She quotes the web-based writing environment of Wible, Kuo, Chien, Liu and
Tsao (2001) as "the perfect example of a tool, which allows for the generation, annotation
and pedagogical exploitation of learner corpora" (Granger 2008, 348). Whilst
Wible et al.'s (2001) project is impressive in size, it is probably also rather unique in
its kind as the human and computing resources needed are high. Many other projects
are more limited in scope. Braun (2005), for instance, pleads for the use of pedagogically
relevant corpora which require what Widdowson (2003) calls 'pedagogic mediation'.
She uses a small English Interview Corpus (ELISA) to outline possible solutions
for a pedagogic mediation and shows that learner corpora can be used to address discourse
issues. This use of small learner corpora has already been recommended by
Flowerdew (2001) for EAP materials design and by Tribble (2001) for the teaching of
writing. Belz and Vyatkina's (2008) work is another excellent illustration of the value
of pedagogical mediation. The authors explore the pedagogically mediated application
of a learner corpus in language teaching and in the developmental analysis of
SLA. Belz and Vyatkina's (2008) English-German bilingual corpus contains the complete
record of the native and non-native speaker interactions over a two-month telecollaborative
partnership (see Belz 2002; Belz and Thorne 2006). Monolingual and bilingual
learner corpora of computer-mediated-communication (CMC) are ideal pedagogical
sources. The learners can revisit their own productions in the form of pedagogically
mediated teaching materials. Belz and Vyatkina (2008) provide many examples
of other CMC studies and argue that such an approach facilitates the close,
corpus-driven tracking of micro-changes in learner language use over time and that
such CMC data is a source of robust material for focused instruction. The learning-driven
data methodology advocated by Seidlhofer (2002) finds its realization in such
a learner-centred, context-dependent and culture-bound approach.
Following the medical metaphor used in the title of the present paper, it seems fitting
to use a business-related metaphor with regard to the successful IPU of learner
corpora, namely "small is the new big" (Godin 2006). Seth Godin, described as a
marketing guru by Business Week in September 2008 (Scanlon 2008), argues that
whilst "big used to matter" and "get big fast was the motto", 'get small' now seems to
be the new motto
because small gives you the flexibility to change the business model when your competition
changes theirs. […] A small restaurant has an owner who greets you by name
[and] [a] small church has a minister with the time to visit you in the hospital when
you're sick. (Godin 2006)
The 'get small' motto seems to be a key to successful classroom use of learner corpora,
and is probably also one way of drawing teachers' attention to learner corpora.
As Römer (2009) puts it:
Corpus researchers often claim that corpus linguistics can make a difference for language
teaching and that it has an immense potential to improve pedagogy, but perhaps
do not focus enough on the interface of research and practice. They do not make
sufficient efforts to reach practitioners, especially teachers, with the 'corpus mission',
do not know enough about the needs of teachers. (Römer 2009, 83)
Whilst her article focuses mainly on how the use of native corpora can help teachers
meet some of their needs, the crucial importance of a needs analysis is highlighted,
and, to my knowledge, no such survey exists for learner corpora. Another key reference
book which aims to help corpora reach classrooms is O'Keeffe et al. (2007, xi).
In the introduction, the authors mention the "frequent mismatch between CL research
and what goes on into materials and resources, and what goes on in the language
classroom". The book draws primarily on spoken language corpora, which constitutes
a welcome slant given the numerous calls for more focus on speech in educational
environments, especially in instructed settings where lack of exposure to speech often
turns out to be detrimental to the learners' communicative competence (cf. Meunier
2007). Despite the fact that the book mainly deals with native corpora, it can also help
promote the use of learner corpora as the authors argue that one of their aims was to
encourage teachers to use language corpora when pursuing their own enquiries and
enhance their professional development.
Providing an all-encompassing and ready-made answer on how to best use learner
corpora for immediate pedagogic use is no easy task as the options available will
depend on the learners' needs, teachers' needs, and on the human and computational
resources at their disposal. It seems, however, that the experiments carried out with
smaller corpora satisfy some of the teachers' and learners' immediate needs, probably
because such experiments are considered feasible (in terms of computer and human
resources needed) and manageable, and because they are inherently learner-centred,
context-dependent and culture-bound. The fact that learners analyse their own productions
also favours the individualization of learning and teaching and helps learners
monitor their own production and the effects of their own production on others. Promoting
the use of small learner corpora does not imply that the idea of large annotated
learner corpora should be abandoned but rather that a small-step approach should
probably be recommended as an appetizer to classroom corpus use. Citing Godin
(2006) once more, the key to success could be to "get small and think big".
5. Concluding remarks
After a general overview of the place of learner corpus research in SLA and ELT, the
article has presented concrete examples of delayed and immediate pedagogical use of
learner corpora. Coming back to the checkup metaphor, it can be stated that LCR is a
healthy and dynamic field of activity. The work carried out so far in LCR is impressive.
Ongoing and future projects will undoubtedly shed more light on second/foreign
language acquisition processes and will benefit ELT. To go on developing healthily,
priority should, at least in my view, be given to the following issues: the collection,
transcription and annotation of longitudinal and spoken learner corpora; the collection,
transcription and annotation of new types of learner data (including new topics and
new text types); the promotion of studies on POS-tagged and syntactically parsed corpora;
the promotion of small-scale learner corpus projects, anchored in local contexts
and in line with teachers' and learners' immediate needs.
End notes
1 See http://www.labschool.pdx.edu/maelc_access.html.
2 For more information on the LONGDALE project, see http://cecl.fltr.ucl.ac.be/LONGDALE.html.
3 A closer look at the learner corpus bibliography reveals the impressive range of issues
addressed (see http://cecl.fltr.ucl.ac.be/learner%20corpus%20bibliography.html).
4 Research and scholarship are usually distinguished in second and foreign language teaching and
learning. Whilst research looks for the explanation or understanding of learning and teaching principles,
scholarship advocates what developments should be pursued in the future and why. The assessment
of developments is also part of scholarly issues (see Byram and Feng 2004 for more details).
5 For an excellent review of the place of authentic discourse and materials in language learning, see
Gilmore (2007).
6 Incomprehensible input may be correct input which is beyond the learner's level of competence or
incorrect input leading to comprehension problems or gaps. Learner corpora, as they are error-prone,
belong to the second category.
7 See http://cecl.fltr.ucl.ac.be/Cecl-Projects/Icle/icle.htm for more information on ICLE.
8 See http://lw.lsa.umich.edu/eli/micase/index.htm for more information on MICASE.
9 See http://lw.lsa.umich.edu/eli/eli1/micusp/index.htm for more information on MICUSP.
10 See http://www.eng.ritsumei.ac.jp/asao/lcorpus/ for more information on the corpus.
11 Some publishers use their in-house error tagged learner corpus as a source of inspiration to include
error warnings or notes in teaching materials but very little is known about the error tagging procedures
used and the selection principles adopted for the inclusion of those warnings or error notes.
References
Baroni, Marco and Silvia Bernardini. "BootCaT: Bootstrapping corpora and terms
from the web". Proceedings of LREC 2004, Lisbon: ELDA, 2004. 1313-1316.
Belz, Julie A. and Nina Vyatkina. "The pedagogical mediation of a developmental
learner corpus for classroom-based language instruction". Language Learning &
Technology 12.3 (2008): 33-52.
—. "Social dimensions of telecollaborative foreign language study". Language
Learning & Technology 6.1 (2002): 60-81.
— and Steven L. Thorne, eds. Computer-mediated intercultural foreign
language education. Boston, MA: Heinle & Heinle, 2006.
Braidi, Susan. "Reexamining the role of recasts in native-speaker/non-native-speaker
interactions". Language Learning 52.1 (2002): 1-42.
Braun, Sabine. "From pedagogically relevant corpora to authentic language learning
contents". ReCALL 17.1 (2005): 47-64.
Byram, Mike and Anwei Feng. "Culture and language learning: teaching, research
and scholarship". Language Teaching 37.3 (2004): 149-168.
Carroll, Susanne. Input and evidence: The raw material of second language acquisition.
Philadelphia: John Benjamins, 2000.
Carter, Ronald and McCarthy, Michael. Cambridge Grammar of English: A Comprehensive
Guide to Spoken and Written Grammar and Usage. Cambridge: Cambridge
University Press, 2006.
Council of Europe. Common European Framework of Reference for Languages:
Learning, Teaching, Assessment. Cambridge: Cambridge University Press, 2001.
Available online at http://culture.coe.int/portfolio
De Cock, Sylvie. "Routinized Building Blocks in Native Speaker and Learner Speech:
Clausal Sequences in the Spotlight". Spoken Corpora in Applied Linguistics. Eds.
Mari C. Campoy and María J. Luzón. Bern: Peter Lang, 2007. 217-233.
Flowerdew, Lynne. "The exploitation of small learner corpora in EAP materials design".
Small Corpus Studies and ELT. Theory and practice. Eds. Mohsen
Ghadessy, Alex Henry and Robert Roseberry. Studies in Corpus Linguistics 5.
Amsterdam: John Benjamins, 2001. 363-380.
—. "A Combined Corpus and Systemic-Functional Analysis of the Problem-Solution
Pattern in a Student and Professional Corpus of Technical Writing". TESOL Quarterly
37.3 (2003): 489-511.
Gass, Susan. Input, Interaction, and the Second Language Learner. Mahwah, NJ:
Lawrence Erlbaum Associates, 1997.
Gilmore, Alex. "Authentic materials and authenticity in foreign language learning".
Language Teaching 40.2 (2007): 97-118.
Gilquin, Gaëtanelle, Sylviane Granger and Magali Paquot. "Learner corpora: the
missing link in EAP pedagogy". Corpus-based EAP Pedagogy. Ed. Paul Thompson.
Special issue of Journal of English for Academic Purposes 6.4 (2007): 319-335.
Godin, Seth. Small Is the New Big: and 183 Other Riffs, Rants, and Remarkable Business
Ideas. Penguin, 2006.
Granger, Sylviane. "Error-tagged learner corpora and CALL: a promising synergy".
CALICO (special issue on Error Analysis and Error Correction in Computer-Assisted Language Learning) 20.3 (2003): 465-480.
—. "Learner Corpora in Foreign Language Education". Encyclopedia of Language
and Education. Volume 4. Second and Foreign Language Education. Eds. Nelleke
Van Deusen-Scholl and Nancy H. Hornberger. Berlin: Springer, 2008. 337-351.
—. "The contribution of learner corpora to second language acquisition and foreign
language teaching: A critical evaluation". Corpora and Language Teaching.
Ed. Karin Aijmer. Studies in Corpus Linguistics 33.
Amsterdam: John Benjamins, 2009. 13-33.
— and Jennifer Thewissen. "The contribution of error-tagged learner corpora to the
assessment of language proficiency. Evidence from the International Corpus of
Learner English". Paper presented at the 27th Language Testing Research Colloquium,
Ottawa (Canada), 18-22 July 2005.
—, Estelle Dagneaux, Fanny Meunier and Magali Paquot. The International Corpus
of Learner English – Version 2. Handbook and CD-ROM. Louvain-la-Neuve:
Presses Universitaires de Louvain, 2009.
Graves, Kathleen. Teachers as Course Developers. Cambridge: Cambridge University
Press, 1996.
Hewings, Ann and Martin Hewings. Grammar and context. An advanced resource
book. London and New York: Routledge, 2007.
Kilgarriff, Adam and Gregory Grefenstette. "Introduction" to the Special Issue on
Web as Corpus. Computational Linguistics 29.3 (2003): 1-15.
Klein, Wolfgang and Clive Perdue. "The Basic Variety (or: Couldn't natural languages
be much simpler?)". Second Language Research 13.4 (1997): 301-347.
Lee, David. "Genres, registers, text types, domains, and styles: clarifying the concepts
and navigating a path through the BNC jungle". Language, Learning & Technology
5.3 (2001): 37-72, available at <http://llt.msu.edu/vol5num3/pdf/lee.pdf>
Mackey, Alison, Rhonda Oliver and Jennifer Leeman. "Interactional input and the
incorporation of feedback: An exploration of NS-NNS and NNS-NNS adult and
child dyads". Language Learning 53.1 (2003): 35-66.
MacWhinney, Brian. "The CHILDES system". Handbook of child language acquisition.
Ed. Tej Bhatia. San Diego: Academic Press, 1999. 457-494.
McCarthy, Michael. "Accessing and interpreting corpus information in the teacher
education context". Language Teaching 41.4 (2008): 563-574.
McCarthy, Michael and Felicity O'Dell. English Collocations in Use. Cambridge:
Cambridge University Press, 2005.
Meunier, Fanny. "Tagging and Parsing Interlanguage". La Linguistique Appliquée
dans les Années 90. Ed. L. Beheydt. ABLA Review 16 (1995): 21-29.
—. "The pedagogical value of native and learner corpora in EFL grammar teaching".
Computer Learner Corpora, Second Language Acquisition and Foreign Language
Teaching. Eds. S. Granger, J. Hung and S. Petch-Tyson. Amsterdam and Philadelphia:
Benjamins, 2002. 119-142.
— and Cédrick Fairon. "Empowering teachers' and learners' corpus literacy: using the
RSS technology to automate tailor-made corpus collection". Proceedings of the
Seventh Teaching and Language Corpora Conference, TALC 2006, Paris.
—. Review of From Corpus to Classroom. Language use and language teaching, by
Anne O'Keeffe, Michael McCarthy and Ronald Carter. ReCALL Journal 19.3
(September 2007).
—. Corpora, SLA and EFL. Assessing interlanguage syntactic complexity. (in preparation).
Mukherjee, Joybrato and Jan-Marc Rohrbach. "Rethinking applied corpus linguistics
from a language-pedagogical perspective: new departures in learner corpus research".
Planing, Gluing and Painting Corpora: Inside the Applied Corpus Linguist's
Workshop. Eds. Bernhard Kettemann and Georg Marko. Frankfurt/Main: Peter
Lang, 2006. 205-232.
Myles, Florence, Janet Hooper and Rosamond Mitchell. "Rote or rule? Exploring the
role of formulaic language in classroom foreign language learning". Language
Learning 48.3 (1998): 323-363.
Myles, Florence, Rosamond Mitchell and Janet Hooper. "Interrogative chunks in
French L2: A basis for creative construction?" Studies in Second Language Acquisition
21.1 (1999): 49-80.
Nassaji, Hossein. "Elicitation and reformulation and their relationship with learner
repair in dyadic interaction". Language Learning 57.4 (2007): 511-548.
O'Keeffe, Anne, Michael McCarthy and Ronald Carter. From Corpus to Classroom.
Language use and language teaching. Cambridge: Cambridge University Press,
2007.
Paquot, Magali. "Exemplification in learner writing: a cross-linguistic perspective".
Phraseology in Foreign Language Learning and Teaching. Eds. Sylviane Granger
and Fanny Meunier. Amsterdam: Benjamins, 2008. 101-119.
Perdue, Clive, ed. Adult language acquisition: cross-linguistic perspectives. Volume
1. Cambridge: Cambridge University Press, 1993a.
—. Adult language acquisition: cross-linguistic perspectives. Volume 2. Cambridge:
Cambridge University Press, 1993b.
Reder, Stephen, Kathryn Harris and Kristen Setzler. "The Multimedia Adult Learner
Corpus". TESOL Quarterly 37.3 (2003): 546-557.
Römer, Ute. "Corpus research and practice: What help do teachers need and what can
we offer?" Corpora and Language Teaching. Ed. Karin Aijmer. Amsterdam and
Philadelphia: Benjamins, 2009. 83-98.
Scanlon, Jessie. "Seth Godin: Profile of a Marketing Guru". BusinessWeek, September 24,
2008. Available at <http://www.businessweek.com/innovate/content/sep2008/id20080924_140114.htm>.
Seidlhofer, Barbara. "Pedagogy and local learner corpora: Working with learning-driven
data". Computer Learner Corpora, Second Language Acquisition and Foreign
Language Teaching. Eds. S. Granger, J. Hung and S. Petch-Tyson. Amsterdam and
Philadelphia: Benjamins, 2002. 213-234.
Shirato, Junko and Paul Stapleton. "Comparing English vocabulary in a spoken learner
corpus with a native speaker corpus: Pedagogical implications arising from an empirical
study in Japan". Language Teaching Research 11 (2007): 393-413.
Thewissen, Jennifer. "The phraseological errors of French-, German-, and Spanish-speaking
EFL learners: Evidence from an error-tagged learner corpus". Proceedings
from the 8th Teaching and Language Corpora Conference (TaLC8), Lisbon
(Portugal), 3-6 July 2008. Ed. Associação de Estudos e de Investigação Científica
do ISLA-Lisboa. 300-306.
Tono, Yukio. "Using Learner Corpora in ELT and SLA Research". Paper presented at
the Symposium on the Roles of Corpora in Language Teaching and Language Engineering
of the 12th World Congress of Applied Linguistics (AILA), 1-6 August
1999, Tokyo, Japan.
Tono, Yukio, Tomoko Kaneko, Hitoshi Isahara, Toyomi Saiga, Emi Izumi, Masumi
Narita and Emiko Kaneko. "The Standard Speaking Test (SST) Corpus: A 1 million-word
spoken corpus of Japanese learners of English and its implications for
L2 lexicography". Proceedings of the 2001 ASIALEX Biennial Conference. Ed. S.
Lee. Seoul: ASIALEX, 2001. 257-262.
Tribble, Chris. "Small corpora and teaching writing". Small Corpus Studies and ELT.
Theory and practice. Eds. Mohsen Ghadessy, Alex Henry and Robert Roseberry.
Studies in Corpus Linguistics 5. Amsterdam: John Benjamins, 2001. 381-407.
Vainikka, Anne and Martha Young-Scholten. "The initial state in the L2 acquisition
of phrase structure". The generative study of second language acquisition. New
Jersey: Erlbaum, 1998. 17-34.
VanPatten, Bill. "Processing instruction: An update". Language Learning 52 (2002):
755-803.
White, Lydia. "Against comprehensible input: The input hypothesis and the development
of second language competence". Applied Linguistics 8 (1987): 95-110.
Wible, David, Chin-Hwa Kuo, Feng-yi Chien, Anne Liu and Nai-Lung Tsao. "A web-based
EFL writing environment: integrating information for learners, teachers, and
researchers". Computers and Education 37 (2001): 297-315.
Widdowson, Henry. Defining Issues in English Language Teaching. Oxford: Oxford
University Press, 2003.