Przemysław Kaszubski
IFA UAM Poznań
email@example.com

The English Day (10 Dec., 2002)
Instytut Neofilologii, Państwowa Wyższa Szkoła Zawodowa w Koninie

The use of electronic text, or corpora, in the teaching of English

What is a corpus?
"a collection of pieces of language, selected and ordered according to explicit linguistic criteria in order to be used as a sample of the language" (Sinclair 1996 - EAGLES96).
- naturally-occurring / authentic text/discourse (NOT citations)
- usually machine-readable / electronic / computer-stored and -processable
- compiled according to design criteria
- (ideally) representative of the sampled language variety

Why bother with a corpus?
- Expert speakers have only partial knowledge
- Expert speakers think of what is possible
- Expert speakers cannot quantify their knowledge
- Expert speakers cannot make up natural examples
- Corpus is more comprehensive and balanced
- Corpus shows us what is common and typical
- Corpus can give us fairly accurate statistics
- Corpus can give us many natural examples

Some basic types of (monolingual) corpora
- written / spoken
- general / reference
- special(ised)
- sample
- monitor

Other corpora
- bilingual and multilingual (comparable & parallel)
- special / non-standard (e.g. child language)
- (non-native) learner (or interlanguage)

Representativeness. Why are the design criteria important? (Meyer 2002)
- whose language (range of text sources; time-frame; sociolinguistic variables: gender, age, education, dialect, social relationships)
- production or reception
- spoken / written medium
- genre / text-type
- general or specialised
- size vs balance
- sample size vs whole texts

Useful & reliable corpus annotation
- POS-tagging
- lemmatisation

ELT: some questions asked of corpora
- does an item exist (in general; in a genre; typicality/variation)
- are the synonyms interchangeable
- typical lexical/grammatical context(s) for an item
- teaching grammar through lexis
- false friends or true friends (bilingual corpora)
- study of literary texts through concordancing

Pedagogical approaches to using corpora
- teacher-controlled use of corpus-based resources (dictionaries, coursebooks, exercises)
- data-driven learning / classroom concordancing

Advantages of small (<0.5M) over large (>100M) corpora (Aston 1997)
- easier to manage
- more fully analysable
- easier to become familiar with
- easier to interpret
- easier to construct
- easier to reconstruct
- more clearly patterned
- limits are clearer

Where are the corpora? Where are the tools?
[Some demos.]
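Classroom concordancing rests on the keyword-in-context (KWIC) display: every hit of a search word is lined up with a little text on either side, so that typical contexts become visible. The Python sketch below is only a minimal illustration of that idea, not the method used by any of the tools named in these slides. The function name `kwic`, the `width` parameter and the sample sentences are made up for the example.

```python
import re

def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for whole-word, case-insensitive hits.

    `width` is the number of characters of context kept on each side
    (an arbitrary illustrative default).
    """
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so that the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Invented toy text; a real study would load corpus files instead.
sample = (
    "She made strong tea and put forward a strong argument. "
    "He drove a powerful car with a powerful engine."
)

for word in ("strong", "powerful"):
    print(f"--- {word} ---")
    for line in kwic(sample, word, width=25):
        print(line)
```

Comparing the two sets of lines is one small way of approaching the question of whether synonyms are interchangeable: learners can look at which nouns follow each adjective.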
Free online corpora access
English
- BNC Online Sampler: http://sara.natcorp.ox.ac.uk/lookup.html
- COBUILD Concordance & Collocation Sampler: http://titania.cobuild.collins.co.uk/form.html
- WebCorp: http://www.webcorp.org.uk/
Polish
- Korpus PWN: http://korpus.pwn.pl/
- Pseudo-korpus IPI-PAN: http://www.ipipan.waw.pl/~corpus/

Free online text resources for offline research
English
- Project Gutenberg: http://www.promo.net/pg/
- Oxford Text Archive: http://ota.ahds.ac.uk/
- Miscellaneous online sources (press, encyclopedias)
Polish
- PWN links: http://slowniki.pwn.pl/korpus/linki.php
Polish-English
- Bilingual documents: http://www.zbiordokumentow.pl/

Other affordable text resources
- CD-ROMs (encyclopedias, newspapers)

Free tools for offline use
concordancer:
- Concordancer for Windows: http://www.linglit.tu-darmstadt.de/wconcord.htm
other:
- XCLOZE & CONTEXTS: http://web.bham.ac.uk/johnstf/timcall.htm
- TestBuilder: <soon available, e-mail me for info>

Conclusions: the contributions of corpora to remember
- Lexical and lexicogrammatical access to text
- Context
- Supersede human intuition of commonness/variation
- Anyone can research

WWW info on corpora (selection):
- David Lee's Bookmarks for Corpus-Based Linguists: http://devoted.to/corpora
- M. Barlow's Corpus Linguistics page: http://www.ruf.rice.edu/~barlow/corpus.html
- Tim Johns CALL Page: http://web.bham.ac.uk/johnstf/timcall.htm
- P. Kaszubski's (Learner) corpora page: http://main.amu.edu.pl/~przemka
- PELCRA Home Page (Polish-English Language Corpora for Research and Applications): http://www1.uni.lodz.pl/pelcra/index.htm

Recommended books & articles:
- Kennedy, G. 1992. "Preferred ways of putting things with implications for language teaching". In: Svartvik, J. (ed.). 1992. Directions in corpus linguistics. Proceedings of the Nobel Symposium 82. The Hague: Mouton de Gruyter. 335-373.
- Partington, A. 1998. Patterns and meaning: using corpora for English language research and teaching. Amsterdam: John Benjamins Publishing Company.
- Tribble, C. & Jones, G. 1997. Concordances in the classroom: using corpora. A resource guide for teachers [new edition]. Houston, TX: Athelstan.
- Wichmann, A., Fligelstone, S., McEnery, T. & Knowles, G. (eds). 1997. Teaching and Language Corpora. London & New York: Longman.