here - IVACS

advertisement
Annual IVACS One-Day Symposium.
School of Education, Queen’s University Belfast.
Thursday, January 23rd, 2014
The Applications of Corpus Linguistics for research in Education, Teaching and
Learning
Provisional Programme
From 9.00
Arrival (Tea and Coffee)
9.15-9.30
Welcome Address
9.30-10.00
Responding to task: chaotic uniqueness in philosophy
undergraduate essays
James Binchy
10.00-10.30
Dealing with Dirty Data
Ivor Timmis
10.30-11.00
The Project of the Corpus of Learner English in Malta (CLEM):
compiling the corpus and decisions on the error-tagging system.
Odette Vassallo
COFFEE BREAK: 11.00-11.30
11.30-12.00
Collocation and the learner: wading into the depths
Michael McCarthy
12.00-12.30
A comparison of Native and Non-native speaker undergraduate
presentations in the CLAS Corpus
Margaret Healy, Kristin Horan, Anne O’Keeffe
12.30-1.00
‘When some pieces are missing’: The construction of non-native
spoken discourse with a limited set of pragmatic markers
Òscar Bladas and Aisling O’Boyle
LUNCH BREAK: 1.00-2.30 (Sandwich lunch provided)
2.30-3.00
Confidence through Corpora: Supporting native speaker EAP with a
corpus of student-generated texts
Megan Bruce
3.00-3.30
Focus group data investigating ESOL speakers’ talk
Róisín Ní Mhocháin
3.30
Round up
Abstracts
Responding to task: chaotic uniqueness in philosophy undergraduate essays
James Binchy, Mary Immaculate College
Essay writing is used as a common form of assessment for students engaged in third-level
education. It is often the only form of assessment in a module. In the writing that students
produce for their assessments, they are often expected by those correcting the essays to
appropriate the norms of both academic writing and the norms of the particular discipline.
However, at the same time, each essay must be unique and were it not to be, it would be
considered plagiarism. This must be done while responding to a task set by the lecturer.
This paper uses a corpus of 94 undergraduate student essays collected throughout the
philosophy degree programme of one cohort of students, including both the first and the
final essay assignment in the degree programme. The corpus shows that each student
responds to the task, yet does so in a unique manner. The corpus also shows that for each
pattern there is successful student who did not follow that pattern. This paper will examine
the corpus both quantitatively and qualitatively to show that individual writer choices with
regard to lexis can seem chaotic despite similarity of task.
************
Dealing with Dirty Data
Ivor Timmis, Leeds Metropolitan University
In this paper I will discuss some of the challenges involved in working with historical data,
with a particular focus on historical sociolinguistics. The data I will discuss is taken from the
Bolton/Worktown corpus, a corpus of conversations which took place between working
class people in Bolton, an industrial town in the North of England, between 1937 and 1940.
I will consider two specific aspects of the challenge this kind of data poses:
1) How far can we judge whether the data, for the most transcribed ‘live’ by nonlinguists, is ‘authentic’?
2) Is it possible to apply social categories such as ‘working class’ retrospectively?
In relation to the question of authenticity, I will ask whether it is legitimate to speak of
degrees of authenticity, and whether professionally and locally informed intuition has a
legitimate role to play in making judgements on authenticity. In relation to social
categorisation, I will argue that social history can come to the aid of corpus linguistics to
help us meet this aspect of the challenge of historical sociolinguistics.
************
The Project of the Corpus of Learner English in Malta (CLEM): compiling the corpus and
decisions on the error-tagging system.
Odette Vassallo, University of Malta
Malta, where Maltese and English have official status, has long been considered a bilingual
country. There is a growing national concern based on anecdotal evidence that the standard
of English is in decline, especially among learners. To date, no attempt has been made to
substantiate or refute this perception through more concrete evidence. This paper presents
a work-in-progress of a project set up to address this information gap. It outlines the
compilation of the Corpus of Learner English in Malta (CLEM) which is a project funded by
the University of Malta (ENGRP02-02/03) launched to create a baseline study for present
and future investigation of learner English in Malta. To date, CLEM contains 447,925 words
and consists of 985 language essays in English written by Maltese learners collected from
examination scripts representing two national benchmarks. The web-based corpus analysis
system CQPweb was selected for CLEM. Examination scripts were keyed in and the corpus
was tagged for parts-of-speech (POS) adopting the Penn Treebank tagset. CLEM has now
secured more texts and is set to expand and include the national benchmark of younger
learners and of university students. The paper also presents the dilemma the project team
currently faces in the selection of the right error-tagging system.
************
Collocation and the learner: wading into the depths
Michael McCarthy
In most vocabulary teaching, there is understandably an emphasis on increasing the size of
the learner’s lexicon, to help learners over the 2-3,000 word threshold that facilitates
successful everyday communication. While the size (breadth) of a learner’s vocabulary is
crucial, what learners know about the behaviour of the words in their lexicon (depth of
vocabulary knowledge) becomes more and more important as they strive to achieve more
natural-sounding communication. However, collocation can remain problematic even for
higher-level learners. I use evidence from the error-coded segment of the Cambridge
Learner Corpus to examine three persistent problem areas under the general heading of
collocation: (1) binomial ordering, where problems with the ordering of fixed binomial
expressions (e.g. safe and sound, peace and quiet) persist as learners move up the CEFR
levels, (2) tautological collocations, where (near-)synonymous words are collocated in
unexpected ways (e.g. a stench smell, urban cities), which also persist up the levels and (3)
delexical verb collocations (verbs such as make, take, get and their complements), where
progress in depth of knowledge can be observed as learners move up the levels but where
interesting problems remain, even at the highest levels.
************
A comparison of Native and Non-native speaker undergraduate presentations in the CLAS
Corpus
Margaret Healy, Mary Immaculate College
Kristin Horan, Shannon College of Hotel Management
Anne O’Keeffe, Mary Immaculate College
This paper will use the Cambridge Limerick and Shannon Corpus (CLAS), a one-million word
database of spoken academic English collected at the Shannon College of Hotel
Management. These data are set in an interesting language context where roughly half of
the cohort is native speakers of English and half is non-native speakers of English. In this
paper, we will isolate the context of student presentations, in one subject area, Hotel
Management Information Systems, and we will examine in detail the presentations of native
and non-native speakers of English. Our analysis will begin by comparing word frequency
lists and patterns of collocation but we will also look at the differing identities which belie
what is said in different ways.
************
When some pieces are missing: The construction of non-native
spoken discourse with a limited set of pragmatic markers
Òscar Bladas, University of Barcelona
Aisling O’Boyle, Queen’s University Belfast
Pragmatic markers are key pieces of spoken discourse. Literature in second language
acquisition shows that non-native speakers tend to underuse these particles or to use them
differently from native speakers. However, most studies in the field pay little or no attention
to the consequences of underusing pragmatic markers in terms of discourse organisation.
Hence it remains unclear how non-native speakers manage to build coherent second
language discourse with a lower number of these particles. This presentation aims to shed
light on this issue by comparing both quantitatively and qualitatively L3 English discourse
produced by L1 Catalan and L1 Spanish speakers with discourse produced by native English
speakers. The analysis suggests that non-native speakers may, in fact, use a significant
amount of PMs, though in a different way to that of native speakers.
************
Confidence through Corpora: Supporting native speaker EAP with a corpus of studentgenerated texts
Megan Bruce, Durham University Foundation Centre
Durham University Foundation Centre has a student cohort comprising both home and
international students. Over the years, we have started to deliver EAP modules to the home
students alongside their international counterparts and have found that we need to adopt
some different strategies in order to support our home students in acquiring the academic
writing skills they need.
This talk explores one initiative that we have established to induct our students into the
Community of Practice of their progressing department: a Foundation Corpus known as the
FOCUS project. This project began with us building a corpus of student-generated texts
from different academic disciplines which students can search to help them learn how
language is really used in their subject area.
Alongside the development of the corpus, we are designing a suite of self-access activities to
allow students to make the best use of the corpus data to improve their own writing and
use of language. This talk focuses in particular on how we have decided to teach some EAP
skills through the searching of corpus data rather than explicit use of grammatical metalanguage, which is a less threatening approach for our home students who have little or no
background grammatical knowledge, and illustrates it with examples of the approach.
************
Focus group data investigating ESOL speakers’ talk
Róisín Ní Mhocháin, Bath Spa University
This paper involves an initial analysis of focus group data of ESOL speakers in Ireland. The
participants were resident in Ireland and were not at the time enrolled in language classes.
The focus group discussion revolved around their experiences of conversation in Ireland and
how previous language education classes benefited, or not, in conversational situations. This
initial analysis, carried out using Wmatrix as a starting point, aims to explore the resultant
talk by quantifying recurring linguistic features with the added expectation of identifying
areas for further in-depth qualitative analysis. The learning from the initial Wmatrix analysis
and its relevance for trainee and in-service language teachers will be discussed.
************
Download