4 - University of Reading

Corpora: Resources for the study of language
Paul Thompson, Applied Linguistics (p.a.thompson@reading.ac.uk)

British Academic Spoken English corpus (BASE)
  160 lectures, 39 seminars
  Transcripts, video and audio
  199 XML files:
    Transcripts with detailed annotation
    Metadata included in header
    160 lecture transcripts are tagged for Part-of-Speech
  www.reading.ac.uk/AcaDepts/ll/base_corpus/
  Funded by AHRB, Euralex, BALEAP and university sources

British Academic Written English corpus (BAWE)
  A corpus of assessed student writing at university level
  Texts collected at Warwick, Reading and Oxford Brookes University
  Funded by Economic and Social Research Council of England (ESRC), RES-000-23-0800

BAWE figures
  6.5 million words
  2,896 texts
  2,761 assignments
  XML files, POS-tagged
  30+ disciplines
  4 levels of study
  Query interface: Sketch Engine
    Commercial service: Applied Linguistics pays annual subscription

BAWE: it BE ADJ that (e.g., 'it is important that')
  Level   Raw   Rel %
  3       225   121.7
  2       275   107.7
  1       255    96.0
  PG       66    62.1

Further possibilities
  BASE: Linking audio and video to the transcripts, either online or on hard drives
  Insertion of timestamp data into transcripts
    Example
    Why?
      Access to temporal, spatial, paralinguistic, phonological information
      Studies of speech rate, for example

Uses of corpora
  Comparison between languages
  Historical linguistics
  Stylistics
  Studies of language in use
  Specialised language use [e.g., doctor-patient interactions]
  Investigations of multimodality

Projects in mind
  PhD thesis corpus
    Electronic submission
  Academic speech events
    Seminars, tutorials, etc.
  Student use of computers in preparing assignments [video and text]
  Reading and writing of undergraduates

Desiderata
  Hosting corpus resources at Reading or other university (preferably on Linux servers) with customisable interfaces
    BASE, BAWE, and other corpora that Reading possesses
    For use by all departments at Reading and also elsewhere
    Varied levels of user access
  Centralised support needed: lack of continuity with project staff
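
Sketch: speech rate from timestamped transcripts
  The timestamp proposal under "Further possibilities" mentions studies of speech rate. A minimal illustration of how that could work is given here, assuming each utterance carries start and end times in seconds. The element name u and the attributes start and end are hypothetical choices made for this sketch; they are not taken from the BASE markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the <u> element and its start/end attributes
# (seconds from the beginning of the recording) are assumptions for
# this sketch, not the actual BASE schema.
SAMPLE = """
<transcript>
  <u who="lecturer" start="0.0" end="6.0">so today we are going to look at corpora</u>
  <u who="lecturer" start="6.0" end="12.0">and how they are used in language research today</u>
</transcript>
"""


def speech_rate(xml_text):
    """Return words per minute across all timestamped utterances."""
    root = ET.fromstring(xml_text)
    words = 0
    seconds = 0.0
    for u in root.iter("u"):
        # Count whitespace-separated tokens in the utterance text.
        words += len((u.text or "").split())
        # Add the duration of this utterance.
        seconds += float(u.get("end")) - float(u.get("start"))
    # Avoid division by zero when no utterance has a duration.
    return 60.0 * words / seconds if seconds else 0.0


print(round(speech_rate(SAMPLE), 1))
```

  With timestamps in place, the same loop could be grouped by speaker or by lecture to compare rates, which is not possible from untimed transcripts alone.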
