"The Learner Corpus:
Its Creation and Use in Language
Learning and in the Development of Teaching Materials"
Prof. Karin Aijmer
University of Gothenburg, Sweden
• While the use of computer text corpora in
research is now well established, they are now
being used increasingly for teaching purposes.
This includes the use of corpus data to inform
and create teaching materials; it also includes the
direct exploration of corpora by students, both
in the study of linguistics and of foreign
languages. (Stewart, Bernardini and Aston 2004)
• Corpora are collections of texts stored
electronically which provide access to authentic
language use.
• The corpus revolution has now also reached
the pedagogical sphere.
• Corpora provide authentic data
• Corpora are important for language learning
and teaching and in the classroom
• Corpora are important for the writing of
dictionaries, grammars and textbooks
– The Cobuild Dictionary
– The Cambridge Grammar of English
(Carter and McCarthy 2006)
• Publications
– Corpora and Language Learners (2004)
– How to use Corpora in Language Teaching (2004)
– Corpora and Language Teaching (2009)
– Corpus Linguistics and Language Teachers (2010)
• Conferences (TaLC)
• In addition to native corpora we now have
learner corpora
• According to Leech (1998), ‘we may claim that
the concept of a learner corpus is an idea
whose hour has come’
• Computer learner corpora are electronic
collections of authentic FL/SL textual data
assembled according to explicit design criteria
for a particular SLA/FLT purpose. They are
encoded in a standardised and homogeneous
way and documented as to their origin and
provenance.
• A so-called learner profile gives information
about name, age, sex, native language, language
spoken at home, education (how many years of
English), stay in an English-speaking country
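One way to picture the learner profile is as a small record attached to each text in the corpus. The sketch below is only an illustration: the field names, the unit for the stay abroad (months) and the helper select_by_native_language are assumptions made here, not the format of any existing corpus.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LearnerProfile:
    # Fields follow the slide; names and units are hypothetical.
    name: str
    age: int
    sex: str
    native_language: str
    home_language: str
    years_of_english: int
    months_in_english_speaking_country: int = 0  # assumed unit: months


@dataclass
class LearnerText:
    # One corpus text together with the profile of its author.
    profile: LearnerProfile
    text: str


def select_by_native_language(texts: List[LearnerText], language: str) -> List[LearnerText]:
    """Return the texts whose author has the given native language."""
    wanted = language.lower()
    return [t for t in texts if t.profile.native_language.lower() == wanted]
```

With profiles stored this way, a sub-corpus (for example all Swedish learners) can be selected before any comparison with native-speaker data.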
• The learner corpus can be used with special tools
(WordSmith, Error editor, PoS-tagger)
• Methodology: comparison with a native
speaker corpus
• Makes it possible to identify features
characteristic of the learners’ interlanguage
• What do learners need help with?
• The International Corpus of Learner English
(ICLE)
• Initiated by Professor Sylviane Granger at
the University of Louvain-la-Neuve.
• It totals 3.5 million words and includes
essays written by learners from 16 different
backgrounds.
• Advantages of a learner corpus
• Corpus-linguistic methods and tools can be
used (WordSmith, PoS taggers)
• Error tagging
– How should errors be coded and classified?
• The ICLE corpora can be used together with a
native speaker corpus
• Makes it possible to identify features which are
typical of non-native speakers
• Overuse – a feature is more frequent in the
non-native speaker corpus than in the
native-speaker corpus
• Underuse – a feature is less frequent in the
non-native speaker corpus than in the
native-speaker corpus
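The comparison behind overuse and underuse can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular tool: the tokenizer, the function names and the normalisation per 1,000 words are assumptions made here, and a real study would normally add a significance test before speaking of overuse or underuse.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())


def count_phrase(tokens, phrase):
    """Count occurrences of a (possibly multi-word) phrase in a token list."""
    target = tokenize(phrase)
    n = len(target)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target)


def per_thousand(count, total_tokens):
    """Normalise a raw count to a rate per 1,000 tokens."""
    return 1000.0 * count / total_tokens if total_tokens else 0.0


def compare_usage(learner_text, native_text, phrase):
    """Return (learner rate, native rate, label) for one phrase.

    The label only states which rate is higher; it is a hypothetical
    simplification with no significance testing.
    """
    learner_tokens = tokenize(learner_text)
    native_tokens = tokenize(native_text)
    learner_rate = per_thousand(count_phrase(learner_tokens, phrase), len(learner_tokens))
    native_rate = per_thousand(count_phrase(native_tokens, phrase), len(native_tokens))
    if learner_rate > native_rate:
        label = "overuse"
    elif learner_rate < native_rate:
        label = "underuse"
    else:
        label = "no difference"
    return learner_rate, native_rate, label
```

Normalising by corpus size matters because the learner corpus and the native-speaker corpus are rarely the same length, so raw counts alone cannot be compared directly.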
• Example of overuse
• Learners generally overuse I think in
argumentative writing
(Swedish learners 41 times; native speakers 3 times)
• Can be explained as overuse of patterns from
spoken language in written language
• Compare very and really
• Large learner corpora have been
assembled by publishers
(Longman, Cambridge University Press)
• The Cambridge Grammar of English (2006)
• Advanced Learners’ dictionaries (Longman)
• There are several new developments in
learner corpus studies
• Longitudinal corpora
• Corpora of novice learners in addition to
advanced learners
• The advent of spoken learner corpora
• Spoken learner corpora
• The pedagogical applications are linked to the
teaching of communicative practices
• How does one create a spoken learner corpus?
• The LINDSEI Corpus (Louvain International
Database of Spoken English Interlanguage)
• Spoken learner corpora
• Based on interviews
• Advanced learners
• Comparisons can be made with a comparable
corpus with native speaker students
• Makes it possible to consider the ’errors’ made
by non-native speakers in their conversational
behaviour
• Includes overuse and underuse
• Phenomena which are difficult for learners
include discourse features such as little words
like well, you know and I mean
• Little words such as well, you know and I mean
seem to mean very little
• They are above all fluency devices contributing
to coherence in the discourse
• They are generally underused
• A: (I mean) I mean she’s so little I mean you
you know sort of one can imagine a sort of
middle-aged woman with a coat that seemed
you know sort of just slightly exaggerated her
form you know (I mean) she could sort of slip
things in inside pockets but
• C: m m
• B: no she just carried it all home in a carrier
bag didn’t she
• Non-native speakers generally underuse
’pragmatic markers’
• Well is an exception
he came to visit sometimes . [uhm] he was fifteen
years old I think . but [uhm]. well family life
[uhm]. that’s difficult to … hm . well I don’t know
really well it was it was exciting to . to see em
different habits and so on but nothing in particular
• Well is overused by Swedish, German and
French learners
• Well is underused by Chinese learners
• Should pragmatic markers be taught or are
they acquired implicitly by exposure?
• They are overused and misused and should
therefore be taught
• The teaching of spoken phenomena is now
generally recognized
• Supported by results from learner corpora
• Learner corpora can target what is to be taught
by pointing to ’errors’
• Errors can be made explicit as warning notes
• Types of applications of learner corpora
• How can they inform teaching in the classroom?
• How can they be a resource for textbook writers?
• Learner corpora have both direct and indirect
uses
• Direct uses: hands-on uses of learner corpora in
the classroom
What happens for example if a pragmatic marker
is omitted?
What is the function of the pragmatic marker?
What will happen in the future?
• More and different learner corpora
• Learner corpora of different languages
• More use of learner corpora to target teaching
objectives and inform teaching aids
• More attention to learners’ errors and difficulties
in dictionaries, grammars and textbooks
(warning notes)