Abstract - ACORN Aston Corpus Network

Postgraduate Conference in Corpus Linguistics, Aston University
Multi-modal spoken corpus analysis and its relevance for key issues in language
description: the case of multi-word expressions
Methodologies in corpus linguistics have revolutionised the way in which we
study and describe language, allowing us to make objective observations and
analyses using a range of written and spoken data from naturally occurring contexts.
Yet most current corpora are concerned only with textual representations and do not
take account of other aspects that generate meaning in conjunction with text, such as
gestures, prosody and kinesics, all of which contribute to utterances and discourse
as a whole. Recent research in the area of spoken corpus analysis has started to
explore the potential impact of drawing on multi-modal corpus resources on our
descriptions of spoken language (see for example Knight et al. 2006).
In this paper I contrast a purely text-based analysis of spoken corpora with an
analysis which uses the additional parameter of pauses measured and integrated
into a multi-modal corpus resource. The unit of analysis I focus on is the multi-word
expression (MWE), in particular the very frequent string 'I think'. The
description and extraction of MWEs has been a key topic in a variety of areas within
applied linguistics and natural language processing for some time; however, there
seem to be a number of problems associated with a purely textual, frequency-based
approach. One of the main problems with computational extraction methods is
that we cannot be sure whether corpus-derived MWEs are psycholinguistically valid.
In this study I argue that an analysis of the placement of pauses represented within a
multi-modal corpus resource can contribute to our understanding of MWEs, their
boundaries and their psycholinguistic reality.
Knight, D. et al. (2006). Beyond the Text: Construction and Analysis of Multi-Modal
Linguistic Corpora. 2nd Annual International e-Social Science Conference.