Learner Corpora and English Language Teaching: Checkup Time

FANNY MEUNIER, Louvain-la-Neuve, Belgium

Prepublication draft (to appear in Anglistik: International Journal of English Studies 21.1 (March 2010): 209-220)

1. Introduction

A checkup is usually a routine visit to a specialist in order to assess one's state of health at a particular point in time, with a view to becoming an informed patient and, if necessary, taking appropriate measures to improve one's health. The checkup metaphor is used in the title to refer to the assessment of the links between learner corpus research (LCR) and English language teaching (ELT). As is the case for medical checkups, however, no claim of exhaustive analysis will be made here. The focus will rather be on a limited number of issues which are considered vital for the future of LCR, and suggestions for improvement will be provided in areas where the situation could, at least in my view, be improved.

2. Learner corpora, SLA and ELT: general overview

Ten years ago, Tono (1999) listed the areas of second language acquisition (SLA) and ELT that learner corpora could contribute to. Learner corpus studies, Tono (1999) argued, can foster SLA research by:
1. describing the developmental stages of interlanguage (IL),
2. studying the effect of L1 transfer,
3. identifying the overuse and underuse of linguistic features,
4. discriminating between universal and L1 specific errors, and
5. distinguishing native from non-native-like performance.

2.1. Learner corpora and SLA

Ten years later, it would probably be overoptimistic to claim that LCR has provided a clear description of the developmental stages of interlanguage, one of the reasons being the lack of availability of large longitudinal learner corpora. SLA researchers have long been working on developmental stages of acquisition but many of the SLA studies are not corpus-based and are limited to the study of a small number of informants.
The advantages of corpus use (size, storage, computer-aided annotation and analysis, etc.) could nicely complement the many existing non-corpus-based longitudinal SLA studies. Some large-scale projects have been carried out, but they have typically focussed on early stages of acquisition and/or on research topics traditionally anchored in the generative tradition such as the acquisition order of morphemes, negation or phrase structure (see for instance Vainikka and Young-Scholten 1998 for an illustration of the phrase structure approach). Famous longitudinal studies include Klein and Perdue's analysis of the second language acquisition of adult immigrant workers between 1981 and 1988 (see Perdue 1993a; Perdue 1993b; Klein and Perdue 1997). In this extensive crosslinguistic study, learners from different mother tongue backgrounds learning different second languages were all found to go through three relatively stable stages (pre-basic, basic and post-basic) generally characterized by a limited lexicon (with learners using mainly open part-of-speech classes), simple sentence structures and the absence of inflectional morphology. Other researchers, such as Myles et al. (1998, 1999), for instance, have used the CHILDES database format, initially developed for the study of L1 acquisition (MacWhinney 1999), to analyse classroom foreign language learning. Here again, the main focus was on early stages of learning rather than on more advanced proficiency levels.

Some recent research projects have been launched to compensate for the lack of longitudinal learner corpora targeting more advanced stages of acquisition. Reder et al. (2003) present a multimedia adult ESL learner corpus (MAELC) in which initially low-level adult ESL classroom learners have been recorded with multiple video cameras and microphones during a 5-year period (2001-2006).
A transcription framework has been developed and transcripts include information not only on what students said but also on what their target utterance was in order to facilitate the identification of certain types of errors. As indicated on the MAELC website, the multimedia aspect of the corpus also allows for close examinations of dyadic and small-group interactions between students from different language backgrounds1.

A more recent project is LONGDALE, initiated in January 2008 at the Centre for English Corpus Linguistics in Louvain, Belgium2. The aim of the project is to build a large longitudinal database of learner English containing data from learners from a wide range of mother tongue backgrounds. The same students will be followed over a period of at least three years (from year 1 to year 3 of their studies at the University of Louvain) and data collections will be organised at least once a year. It is to be hoped that the collection and analysis of such corpora will help researchers shed more light on the developmental stages that learners with different mother tongues go through.

In contrast to the study of the developmental stages of interlanguage, the study of the effect of L1 transfer, the identification of the overuse and underuse of linguistic features, of universal and L1 specific errors, and of native vs non-native performance (see above) have largely benefited from the insights gained from LCR3.

2.2. Learner corpora and ELT

As for the focus of the present paper, i.e. the links between LCR and ELT, Tono (1999) states that learner corpora could be used as input to inform L2 lexicography, syllabus design, material design, data-driven learning and self-learning. He also argues that explaining the potential and use of learner corpora should be part of teacher education. The evaluation of those scholarship4 issues is probably less satisfactory than the research issues linked to SLA.
Apart from lexicography, which has benefited from learner corpus analysis, the other ELT-related domains have not really been fed by LCR: The field in which advances have been quickest is pedagogical lexicography. The latest editions of the Longman Dictionary of Contemporary English (LDOCE) (2003) and the Cambridge Advanced Learner's Dictionary (CALD) (2003) both contain error notes based on their respective learner corpora, which are intended to help learners to avoid making common mistakes (Granger 2008, 345). More recently still, Granger (2009, 24) provides a critical evaluation of the contribution of learner corpora to second language acquisition and foreign language teaching, and, in the section devoted to learner corpus research and foreign language teaching, she stresses the fact that "there is undeniably very little evidence of fully-fledged up-and-running applications". In that state-of-the-art article, Granger also distinguishes between the use of learner corpora for delayed or immediate pedagogical use (DPU or IPU corpus use). In a DPU situation, learner corpora are not used directly as teaching/learning materials by the learners who have produced the data [but] are compiled by academics or publishers with a view to providing a better description of one specific interlanguage and/or designing tailor-made pedagogical tools which will benefit similar-type learners. (Granger 2009, 24-25). Learner corpora for immediate pedagogical use (IPU) are "collected by teachers as part of their normal classroom activities […] and the learners are at the same time producers and users of the corpus data" (Granger 2009, 25). This distinction of DPU and IPU will be used to organise the following sections of the present paper: section 3 will examine two DPU issues, namely syllabus design and material design; section 4 will address the IPU side of learner corpora and will present a few studies which report on learners who have analysed their own productions. 
Here, I will also touch upon the need for (learner) corpus literacy in teacher education. The final section will provide both a summary and a list of priority issues for the future.

3. LCR for syllabus and materials design

Graves (1996, 3) defines the term syllabus as "the specification and ordering of content of a course or courses". As for the term materials, it refers to the various and specific written, audio and/or video input which are part of the syllabus and which are provided to learners, together with exercises and tasks presented to exploit the input. In the ELT circles promoting the use of authentic materials, it has been suggested that spoken and/or written excerpts from native corpora can be used as raw teaching input (e.g. the transcript of a postcard from the BNC)5. Lee (2001) argues that "genre analyses of relatively small, focussed and manageable sets of texts are now possible with the help of the BNC Index, opening up a rich resource for all kinds of learning and research activities". Teachers and researchers may have access to some rare subgenres, such as postcards or shopping lists, which were not included in traditional general corpora. The outcomes of native corpus research can also feed materials and syllabus design: e.g. frequency-based vocabulary selection and grading, results of corpus-based genre analysis for text-type awareness exercises, use of concordances for data-driven learning activities, etc.

Recourse to learner corpora for syllabus and materials design is still relatively rare in ELT. Several reasons may account for this. First, the belief that the provision of incomprehensible input is detrimental to learning is still present in ELT circles despite the many studies highlighting the benefits of incomprehensible input and erroneous output in interactional feedback (cf. Gass 1997; Nassaji 2007; Carroll 2000; White 1987; VanPatten 2002; Braidi 2002; Mackey et al. 2003)6.
As for the use of the outcomes of LCR to feed materials or syllabus design, the main problem lies in the lack of relevance and generalizability of the results provided in LCR studies pertaining both to the types of learner corpora used and to the research focus. As already stated in section 2.1, some learner corpora focus on the early stages of acquisition and/or on research topics traditionally anchored in the generative tradition; the results of such studies cannot easily be included in current syllabuses or materials which largely advocate a communicative approach to language teaching. Other types of learner corpora target upper-intermediate to advanced levels of proficiency: for example, the International Corpus of Learner English (ICLE)7, which contains over 3 million words of writing by learners of English from 14 different mother tongue backgrounds at university level (Granger et al. 2009); sections of the Michigan Corpus of Academic Spoken English (MICASE)8 or the Michigan Corpus of Upper-level Student Papers (MICUSP)9, which contain spoken and written academic productions by both native and non-native speakers. Some other learner corpora contain data from one mother tongue population only (e.g. Kojiro's English learner corpus produced by Japanese college students)10, which also reduces the generalizability of the results.

Another reason which accounts for the lack of direct influence of LCR studies on ELT syllabuses and materials is that, apart from the EAP/ESP teachers who do use EAP or advanced learner corpora (see for instance Flowerdew 2003; Gilquin et al. 2007; Paquot 2008), the topics covered in most learner corpora are often miles away from the everyday needs of a vast majority of ESL or EFL school teachers who target English for general purposes for a teenage audience. Finding a learner corpus that meets their needs comes close to looking for a needle in a haystack.
The Common European Framework of Reference for Languages (CEFR, Council of Europe 2001: 52) suggests that the following thematic categories should be addressed in English for General Purposes (EGP): personal identification; house and home, environment; daily life; free time, entertainment; travel; relations with other people; health and body care; education; shopping; food and drink; services; places; language; weather. Subcategories can also be established (e.g. leisure, hobbies and interests, radio and TV, cinema, theatre, concert, etc. for 'free time and entertainment'). To my knowledge, no native or learner corpus study provides easily transferable research results which could be integrated in a syllabus addressing the above-mentioned themes. It must also be kept in mind that besides containing up-to-date materials which meet the pupils' interests and which can be linked to their everyday life preoccupations, a syllabus should also include text types representative of the meaningful communicative tasks that learners are called upon to perform (e.g. writing an email, a letter of excuse to a friend, the summary of a book, his/her opinion on a film; chatting with a friend, describing where he/she lives, etc.). All this makes the task of finding large, freely available and easily searchable corpora which would be appropriate for the teaching of EGP even more daunting, if not impossible.

Some projects and/or computer tools have been developed with the aim of facilitating the collection of topic specific corpora. One such project is Kilgarriff's Web BootCaT (see Baroni and Bernardini 2004; Kilgarriff and Grefenstette 2003), which makes it possible to quickly collect a large corpus of web data on the basis of representative keywords. RSS technology is also one way of promoting teachers' and learners' corpus literacy and of automating tailor-made corpus collection (Meunier and Fairon 2006).
This said, those tools can only be used with web data, with all the advantages and disadvantages this entails, such as for instance a lack of information on the author of the text, a 'you-can-only-get-what-is-on-the-web' bias, and an additional bias on the written mode (despite the many examples of speech-like materials that can be found in chat rooms). Such limitations notwithstanding, it could be argued that learner corpora could be used not so much for the interest of the topics, text types or tasks covered but rather as a source of information on less topic- or genre-dependent issues such as sentence structure and grammatical problems. McCarthy (2008) has recently acknowledged the fact that what he calls 'non-native user corpora' are under-developed and underexploited. He states, however, that learner corpora (usually in the form of examination scripts or essays or classroom transcripts) have been used a great deal and are frequently a resource for evidence in error warnings in teaching materials (e.g. Carter & McCarthy 2006) or as sources for the targeting of particular language features in materials (McCarthy & O'Dell 2005). (McCarthy 2008, 570) The question of the status of errors, McCarthy (2008, 570) stresses, is "one area where corpus linguistics overlaps with extant and long-standing preoccupations in teacher education". It should, however, be added that very few large-scale error tagged learner corpora are freely available11 and that too little research has been carried out on part-of-speech tagged and syntactically parsed learner corpora. In this context, Meunier (1995 and 2002) addresses general issues in tagging and parsing interlanguage, and the role of learner and native corpora in grammar teaching. Granger (2003), Granger and Thewissen (2005) and Thewissen (2008) deal with the value of error tagged corpora in CALL and in the assessment of language proficiency, and Meunier (in preparation) focuses on interlanguage syntactic complexity. 
The lack of spoken learner corpora also reinforces the underuse of learner corpora in syllabus and materials design. Referring to native corpora, Shirato and Stapleton (2007) write: [A] major concern is that most emphasis to date has been put on written texts with very few attempts made to analyze spoken data (the British National Corpus (BNC) is 90% written vs. 10% spoken data) for the purposes of developing pedagogical materials. (Shirato and Stapleton 2007, 394) This statement of fact is all the more true for learner corpora. Whilst spoken interaction is a core issue in ELT, very few spoken learner corpora have been compiled and analyzed to date. A few exceptions include the LINDSEI corpus (cf. De Cock 2007) and the SST corpus (cf. Tono et al. 2001). In addition to the limitations already presented for written learner corpora (see above), spoken LCR has to deal with further demands: the recording, transcription and annotation of spoken corpora are even more difficult and time-consuming than is the case for written corpora. Once spoken learner corpora are collected, transcribed and analyzed (which often includes a comparison with native spoken corpora), many spoken-specific features are brought to light. Shirato and Stapleton (2007), in their comparison of English vocabulary in a spoken learner and native speaker corpus, point out the following: [T]he results obtained lead [them] to the conclusion that the vocabulary currently acquired by Japanese NNS differs markedly from the NS norm. [Their] qualitative analysis has illustrated significant aspects of NS's conversational vocabulary in which softness, indirectness, hedges, and vagueness are abundant. These characteristics may be considered a basic defining feature of spoken lexis, although they have long been given only scant attention in Japanese formal education, in which a mastery of written language has been a major priority.
(Shirato and Stapleton 2007, 410)

Those speech-specific language features will only receive appropriate treatment if more spoken corpora are collected and analyzed in the future. The results of such studies should also find their way to the ELT syllabus and materials which still tend to give a rather monolithic view of the language. Very few textbooks feature and/or explain speech-specific grammatical features such as ellipsis, left dislocation, tail slot, hesitation features, vagueness hedges or overtures (see Hewings and Hewings 2005 for more examples).

A last reason for the scarce use of corpus-based material (native and learner corpora alike) by teachers has been put forward by McCarthy (2008) who mentions a lack of awareness of what exactly corpora can bring to ELT: [T]eachers have heard of corpora, but they are not quite sure what they are. They are sometimes frightened of what their use might imply: Does a teacher need to have a high level of expertise in computational linguistics or information technology in order to be part of this pedagogical revolution? Does one have to be a native speaker of a particular language in order to understand and use corpus information in that language? (McCarthy 2008, 563-564) This relationship between teachers and corpora will be touched upon in the following section, together with the assessment of the IPU of learner corpora.

4. Learner corpora for immediate pedagogical use (IPU)

Whilst the previous section addressed DPU issues, the present section focuses on studies or experiments where learners are asked to analyse their own productions as part of their learning activities. Mukherjee and Rohrbach (2006, 205) mention "a widening gap and a growing lag between on-going and intensive corpus linguistic research on the one hand and classroom teaching on the other" and state that the exploitation of learner corpora in the EFL classroom is still marginal.
Granger (2008) observes that when teachers do use learner corpora to develop their own in-house teaching materials, the latter share a number of characteristics: (1) they tend to be based on learner corpora for immediate pedagogical use; (2) they are often L1-specific rather than generic; (3) they are designed with a clear teaching objective in a well-defined teaching context; and (4) they tend to be electronic rather than paper tools. (Granger 2008, 348) She quotes the web-based writing environment of Wible, Kuo, Chien, Liu and Tsao (2001) as "the perfect example of a tool, which allows for the generation, annotation and pedagogical exploitation of learner corpora" (Granger 2008, 348). Whilst Wible et al.'s (2001) project is impressive in size, it is probably also rather unique of its kind as the human and computing resources needed are high. Many other projects are more limited in scope. Braun (2005), for instance, pleads for the use of pedagogically relevant corpora which require what Widdowson (2003) calls 'pedagogic mediation'. She uses a small English Interview Corpus (ELISA) to outline possible solutions for a pedagogic mediation and shows that learner corpora can be used to address discourse issues. This use of small learner corpora has already been recommended by Flowerdew (2001) for EAP materials design and by Tribble (2001) for the teaching of writing. Belz and Vyatkina's (2008) work is another excellent illustration of the value of pedagogical mediation. The authors explore the pedagogically mediated application of a learner corpus in language teaching and in the developmental analysis of SLA. Belz and Vyatkina's (2008) English-German bilingual corpus contains the complete record of the native and non-native speaker interactions over a two-month telecollaborative partnership (see Belz 2002; Belz and Thorne 2006). Monolingual and bilingual learner corpora of computer-mediated communication (CMC) are ideal pedagogical sources.
The learners can revisit their own productions in the form of pedagogically mediated teaching materials. Belz and Vyatkina (2008) provide many examples of other CMC studies and argue that such an approach facilitates the close, corpus-driven tracking of micro-changes in learner language use over time and that such CMC data is a source of robust material for focused instruction. The learning-driven data methodology advocated by Seidlhofer (2002) finds its realization in such a learner-centred, context-dependent and culture-bound approach.

Following the medical metaphor used in the title of the present paper, it seems fitting to use a business-related metaphor with regard to the successful IPU of learner corpora, namely "small is the new big" (Godin 2006). Seth Godin, described as a marketing guru by Business Week in September 2008 (Scanlon 2008), argues that whilst "big used to matter" and that "get big fast was the motto", get small seems to be the new motto because small gives you the flexibility to change the business model when your competition changes theirs. […] A small restaurant has an owner who greets you by name [and] [a] small church has a minister with the time to visit you in the hospital when you're sick. (Godin 2006) The 'get small' motto seems to be a key to successful classroom use of learner corpora, and is probably also one way of drawing teachers' attention to learner corpora. As Römer (2009) puts it: Corpus researchers often claim that corpus linguistics can make a difference for language teaching and that it has an immense potential to improve pedagogy, but perhaps do not focus enough on the interface of research and practice. They do not make sufficient efforts to reach practitioners, especially teachers, with the 'corpus mission', do not know enough about the needs of teachers.
(Römer 2009, 83) Whilst her article focuses mainly on how the use of native corpora can help teachers meet some of their needs, the crucial importance of a needs analysis is highlighted, and, to my knowledge, no such survey exists for learner corpora. Another key reference book which aims to help corpora reach classrooms is O'Keeffe et al. (2007, xi). In the introduction, the authors mention the "frequent mismatch between CL research and what goes on into materials and resources, and what goes on in the language classroom". The book draws primarily on spoken language corpora, which constitutes a welcome slant given the numerous calls for more focus on speech in educational environments, especially in instructed settings where lack of exposure to speech often turns out to be detrimental to the learners' communicative competence (cf. Meunier 2007). Despite the fact that the book mainly deals with native corpora, it can also help promote the use of learner corpora as the authors argue that one of their aims was to encourage teachers to use language corpora when pursuing their own enquiries and enhance their professional development.

Providing an all-encompassing and ready-made answer on how to best use learner corpora for immediate pedagogic use is no easy task as the options available will depend on the learners' needs, teachers' needs, and on the human and computational resources at their disposal. It seems, however, that the experiments carried out with smaller corpora satisfy some of the teachers' and learners' immediate needs, probably because such experiments are considered feasible (in terms of computer and human resources needed) and manageable, and because they are inherently learner-centred, context-dependent and culture-bound. The fact that learners analyse their own productions also favours the individualization of learning and teaching and helps learners monitor their own production and the effects of their own production on others.
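As a rough illustration of how modest the computational resources for such small-scale classroom work can be, the sketch below builds a keyword-in-context (KWIC) display from a handful of learner sentences, using only the Python standard library. It is a minimal sketch and not a tool used in any of the studies cited here: the function name kwic, the width parameter and the two learner sentences are all hypothetical and were invented for this example.

```python
import re


def kwic(texts, keyword, width=30):
    """Return keyword-in-context lines for each whole-word match of `keyword`.

    `texts` is a list of strings (e.g. one string per learner essay);
    `width` is the number of characters of context kept on each side.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so that the keywords line up.
            lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Hypothetical learner sentences, invented for illustration only.
class_corpus = [
    "I am agree with this opinion because it is true.",
    "Many people agree that television is important.",
]

for line in kwic(class_corpus, "agree"):
    print(line)
```

A teacher could paste a class set of essays into class_corpus and let learners inspect every occurrence of a form they tend to misuse. Anything beyond plain string matching (lemmatization, part-of-speech or error tagging) would require additional tools and is deliberately left out of the sketch.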
Promoting the use of small learner corpora does not imply that the idea of large annotated learner corpora should be abandoned but rather that a small-step approach should probably be recommended as an appetizer to classroom corpus use. Citing Godin (2006) once more, the key to success could be to "get small and think big".

5. Concluding remarks

After a general overview of the place of learner corpus research in SLA and ELT, the article has presented concrete examples of delayed and immediate pedagogical use of learner corpora. Coming back to the checkup metaphor, it can be stated that LCR is a healthy and dynamic field of activity. The work carried out so far in LCR is impressive. Ongoing and future projects will undoubtedly shed more light on second/foreign language acquisition processes and will benefit ELT. To go on developing healthily, priority should, at least in my view, be given to the following issues: the collection, transcription and annotation of longitudinal and spoken learner corpora; the collection, transcription and annotation of new types of learner data (including new topics and new text types); the promotion of studies on POS-tagged and syntactically parsed corpora; and the promotion of small-scale learner corpus projects, anchored in local contexts and in line with teachers' and learners' immediate needs.

End notes

1 See http://www.labschool.pdx.edu/maelc_access.html.
2 For more information on the LONGDALE project, see http://cecl.fltr.ucl.ac.be/LONGDALE.html.
3 A closer look at the learner corpus bibliography reveals the impressive range of issues addressed (see http://cecl.fltr.ucl.ac.be/learner%20corpus%20bibliography.html).
4 Research and scholarship are usually distinguished in second and foreign language teaching and learning. Whilst research looks for the explanation or understanding of learning and teaching principles, scholarship advocates what developments should be pursued in the future and why.
The assessment of developments is also part of scholarly issues (see Byram and Feng 2004 for more details). 5 For an excellent review of the place of authentic discourse and materials in language learning, see Gilmore (2007) 6 Incomprehensible input may be correct input which is beyond the learner's level of competence or incorrect input leading to comprehension problems or gaps. Learner corpora, as they are error-prone, belong to the second category. 7 See http://cecl.fltr.ucl.ac.be/Cecl-Projects/Icle/icle.htm for more information on ICLE. 8 See http://lw.lsa.umich.edu/eli/micase/index.htm for more information on MICASE. 9 See http://lw.lsa.umich.edu/eli/eli1/micusp/index.htm for more information on MICUSP. 10 See http://www.eng.ritsumei.ac.jp/asao/lcorpus/ for more information on the corpus. 11 Some publishers use their in-house error tagged learner corpus as a source of inspiration to include error warnings or notes in teaching materials but very little is known about the error tagging procedures used and the selection principles adopted for the inclusion of those warnings or error notes. References Baroni, Marco and Silvia Bernardini. "BootCaT: Bootstrapping corpora and terms from the web". Proceedings of LREC 2004, Lisbon: ELDA, 2004. 1313-1316. Belz, Julie A. and Nina Vyatkina. "The pedagogical mediation of a developmental learner corpus for classroom-based language instruction". Language Learning & Technology 12.3 (2008): 33-52. —. "Social dimensions of telecollaborative foreign language study". Language Learning & Technology 6.1 (2002): 60-81. — and Steven L. Thorne, eds. Computer-mediated intercultural foreign language education. Boston, MA: Heinle & Heinle, 2006. Braidi, Susan. "Reexamining the role of recasts in native-speaker/non-native-speaker interactions". Language Learning 52.1 (2002): 1-42. Braun, Sabine. "From pedagogically relevant corpora to authentic language learning contents". ReCALL 17.1 (2005): 47-64. Byram, Mike and Anwei Feng. 
"Culture and language learning: teaching, research and scholarship". Language Teaching 37.3 (2004): 149-168.
Carroll, Susanne. Input and evidence: The raw material of second language acquisition. Philadelphia: John Benjamins, 2000.
Carter, Ronald and Michael McCarthy. Cambridge Grammar of English: A Comprehensive Guide to Spoken and Written Grammar and Usage. Cambridge: Cambridge University Press, 2006.
Council of Europe. Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge: Cambridge University Press, 2001. Available online at http://culture.coe.int/portfolio
De Cock, Sylvie. "Routinized Building Blocks in Native Speaker and Learner Speech: Clausal Sequences in the Spotlight". Spoken Corpora in Applied Linguistics. Eds. Mari C. Campoy and María J. Luzón. Bern: Peter Lang, 2007. 217-233.
Flowerdew, Lynne. "The exploitation of small learner corpora in EAP materials design". Small Corpus Studies and ELT. Theory and practice. Eds. Mohsen Ghadessy, Alex Henry and Robert Roseberry. Studies in Corpus Linguistics 5. Amsterdam: John Benjamins, 2001. 363-380.
—. "A Combined Corpus and Systemic-Functional Analysis of the Problem-Solution Pattern in a Student and Professional Corpus of Technical Writing". TESOL Quarterly 37.3 (2003): 489-511.
Gass, Susan. Input, Interaction, and the Second Language Learner. Mahwah, NJ: Lawrence Erlbaum Associates, 1997.
Gilmore, Alex. "Authentic materials and authenticity in foreign language learning". Language Teaching 40.2 (2007): 97-118.
Gilquin, Gaëtanelle, Sylviane Granger and Magali Paquot. "Learner corpora: the missing link in EAP pedagogy". Corpus-based EAP Pedagogy. Ed. Paul Thompson. Special issue of Journal of English for Academic Purposes 6.4 (2007): 319-335.
Godin, Seth. Small Is the New Big: and 183 Other Riffs, Rants, and Remarkable Business Ideas. Penguin, 2006.
Granger, Sylviane. "Error-tagged learner corpora and CALL: a promising synergy". CALICO (special issue on Error Analysis and Error Correction in Computer-Assisted Language Learning) 20.3 (2003): 465-480.
—. "Learner Corpora in Foreign Language Education". Encyclopedia of Language and Education. Volume 4. Second and Foreign Language Education. Eds. Nelleke Van Deusen-Scholl and Nancy H. Hornberger. Berlin: Springer, 2008. 337-351.
—. "The contribution of learner corpora to second language acquisition and foreign language teaching: A critical evaluation". Corpora and Language Teaching. Ed. Karin Aijmer. Studies in Corpus Linguistics 33. Amsterdam: John Benjamins, 2009. 13-33.
— and Jennifer Thewissen. "The contribution of error-tagged learner corpora to the assessment of language proficiency. Evidence from the International Corpus of Learner English". Paper presented at the 27th Language Testing Research Colloquium, Ottawa (Canada), 18-22 July 2005.
—, Estelle Dagneaux, Fanny Meunier and Magali Paquot. The International Corpus of Learner English – Version 2. Handbook and CD-ROM. Louvain-la-Neuve: Presses Universitaires de Louvain, 2009.
Graves, Kathleen. Teachers as Course Developers. Cambridge: Cambridge University Press, 1996.
Hewings, Ann and Martin Hewings. Grammar and context. An advanced resource book. London and New York: Routledge, 2007.
Kilgarriff, Adam and Gregory Grefenstette. "Introduction" to the Special Issue on Web as Corpus. Computational Linguistics 29.3 (2003): 1-15.
Klein, Wolfgang and Clive Perdue. "The Basic Variety (or: Couldn't natural languages be much simpler?)". Second Language Research 13.4 (1997): 301-347.
Lee, David. "Genres, registers, text types, domains, and styles: clarifying the concepts and navigating a path through the BNC jungle". Language Learning & Technology 5.3 (2001): 37-72. Available at http://llt.msu.edu/vol5num3/pdf/lee.pdf
Mackey, Alison, Rhonda Oliver and Jennifer Leeman. "Interactional input and the incorporation of feedback: An exploration of NS-NNS and NNS-NNS adult and child dyads". Language Learning 53.1 (2003): 35-66.
MacWhinney, Brian. "The CHILDES system". Handbook of child language acquisition. Ed. Tej Bhatia. San Diego: Academic Press, 1999. 457-494.
McCarthy, Michael. "Accessing and interpreting corpus information in the teacher education context". Language Teaching 41.4 (2008): 563-574.
McCarthy, Michael and Felicity O'Dell. English Collocations in Use. Cambridge: Cambridge University Press, 2005.
Meunier, Fanny. "Tagging and Parsing Interlanguage". La Linguistique Appliquée dans les Années 90. Ed. L. Beheydt. ABLA Review 16 (1995): 21-29.
—. "The pedagogical value of native and learner corpora in EFL grammar teaching". Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Eds. S. Granger, J. Hung and S. Tyson. Amsterdam and Philadelphia: Benjamins, 2002. 119-142.
— and Cédrick Fairon. "Empowering teachers' and learners' corpus literacy: using the RSS technology to automate tailor-made corpus collection". Proceedings of the Seventh Teaching and Language Corpora Conference, TALC 2006, Paris.
—. Review of From Corpus to Classroom. Language use and language teaching, by Anne O'Keeffe, Michael McCarthy and Ronald Carter. ReCALL Journal 19.3 (September 2007).
—. Corpora, SLA and EFL. Assessing interlanguage syntactic complexity. (in preparation).
Mukherjee, Joybrato and Jan-Marc Rohrbach. "Rethinking applied corpus linguistics from a language-pedagogical perspective: new departures in learner corpus research". Planing, Gluing and Painting Corpora: Inside the Applied Corpus Linguist's Workshop. Eds. Bernhard Kettemann and Georg Marko. Frankfurt/Main: Peter Lang, 2006. 205-232.
Myles, Florence, Janet Hooper and Rosamond Mitchell. "Rote or rule? Exploring the role of formulaic language in classroom foreign language learning". Language Learning 48.3 (1998): 323-363.
Myles, Florence, Rosamond Mitchell and Janet Hooper. "Interrogative chunks in French L2: A basis for creative construction?" Studies in Second Language Acquisition 21.1 (1999): 49-80.
Nassaji, Hossein. "Elicitation and reformulation and their relationship with learner repair in dyadic interaction". Language Learning 57.4 (2007): 511-548.
O'Keeffe, Anne, Michael McCarthy and Ronald Carter. From Corpus to Classroom. Language use and language teaching. Cambridge: Cambridge University Press, 2007.
Paquot, Magali. "Exemplification in learner writing: a cross-linguistic perspective". Phraseology in Foreign Language Learning and Teaching. Eds. Sylviane Granger and Fanny Meunier. Amsterdam: Benjamins, 2008. 101-119.
Perdue, Clive, ed. Adult language acquisition: cross-linguistic perspectives. Volume 1. Cambridge: Cambridge University Press, 1993a.
—. Adult language acquisition: cross-linguistic perspectives. Volume 2. Cambridge: Cambridge University Press, 1993b.
Reder, Stephen, Kathryn Harris and Kristen Setzler. "The Multimedia Adult Learner Corpus". TESOL Quarterly 37.3 (2003): 546-557.
Römer, Ute. "Corpus research and practice: What help do teachers need and what can we offer?" Corpora and Language Teaching. Ed. K. Aijmer. Amsterdam and Philadelphia: Benjamins, 2009. 83-98.
Scanlon, Jessie. "Seth Godin: Profile of a Marketing Guru". BusinessWeek, September 24, 2008. Available at http://www.businessweek.com/innovate/content/sep2008/id20080924_140114.htm.
Seidlhofer, Barbara. "Pedagogy and local learner corpora: Working with learning-driven data". Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Eds. S. Granger, J. Hung and S. Tyson. Amsterdam and Philadelphia: Benjamins, 2002. 213-234.
Shirato, Junko and Paul Stapleton. "Comparing English vocabulary in a spoken learner corpus with a native speaker corpus: Pedagogical implications arising from an empirical study in Japan". Language Teaching Research 11 (2007): 393-413.
Thewissen, Jennifer. "The phraseological errors of French-, German-, and Spanish-speaking EFL learners: Evidence from an error-tagged learner corpus". Proceedings from the 8th Teaching and Language Corpora Conference (TaLC8), Lisbon (Portugal), 3-6 July 2008. Ed. Associação de Estudos e de Investigação Científica do ISLA-Lisboa. 300-306.
Tono, Yukio. "Using Learner Corpora in ELT and SLA Research". Paper presented at the Symposium on the Roles of Corpora in Language Teaching and Language Engineering of the 12th World Congress of Applied Linguistics (AILA), 1-6 August 1999, Tokyo, Japan.
Tono, Yukio, Tomoko Kaneko, Hitoshi Isahara, Toyomi Saiga, Emi Izumi, Masumi Narita and Emiko Kaneko. "The Standard Speaking Test (SST) Corpus: A 1 million-word spoken corpus of Japanese learners of English and its implications for L2 lexicography". Proceedings of the 2001 ASIALEX Biennial Conference. Ed. S. Lee. Seoul: ASIALEX, 2001. 257-262.
Tribble, Chris. "Small corpora and teaching writing". Small Corpus Studies and ELT. Theory and practice. Eds. Mohsen Ghadessy, Alex Henry and Robert Roseberry. Studies in Corpus Linguistics 5. Amsterdam: John Benjamins, 2001. 381-407.
Vainikka, Anne and Martha Young-Scholten. "The initial state in the L2 acquisition of phrase structure". The generative study of second language acquisition. New Jersey: Erlbaum, 1998. 17-34.
VanPatten, Bill. "Processing instruction: An update". Language Learning 52 (2002): 755-803.
White, Lydia. "Against comprehensible input: The input hypothesis and the development of second language competence". Applied Linguistics 8 (1987): 95-110.
Wible, David, Chin-Hwa Kuo, Feng-yi Chien, Anne Liu and Nai-Lung Tsao. "A web-based EFL writing environment: integrating information for learners, teachers, and researchers". Computers and Education 37 (2001): 297-315.
Widdowson, Henry. Defining Issues in English Language Teaching. Oxford: Oxford University Press, 2003.