SAMUELS Closing Symposium Huddersfield Project Lesley Jeffries, Brian Walker and Jane Demmen

Huddersfield Project Aims
1. Investigate language used to talk about labour
relations, particularly trades unions, across time in
parliamentary language
– project title: Is there a Baron in the Commons?
– builds on previous work into the way unions & their leaders
are discussed in the British press (Language Unlocked 2012)
2. Include the analysis of semantic collocates
– builds on previous work into lexical collocates of keywords in
the British press when Tony Blair was UK prime minister
(Jeffries & Walker 2012)
3. In the course of carrying out 1 & 2, assist with testing
the Hansard Corpus data and the HTST (tagger)
Progress and interim findings
• Data needed to meet aims 1 & 2:
– frequency counts of lexical items concerning
labour relations from Hansard, HT-tagged
– broken down by diachronic periods
– semantic collocates enabled
• Background research and literature review ran until
January 2015
• Corpus querying began in early February with
CQPWeb Hansard V3.0
• Currently working with V3.1 to overcome
some technical problems and access the full data
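The data requirement above (frequency counts broken down by diachronic period) can be illustrated with a minimal Python sketch. It assumes query hits have been exported as a list of years, one per matched token, and that the token total for each decade is known. The function names, the export format and the toy figures are all hypothetical and are not taken from CQPWeb or from the project.

```python
from collections import Counter


def decade_of(year):
    """Map a year to the first year of its decade, e.g. 1979 -> 1970."""
    return (year // 10) * 10


def frequency_by_decade(hit_years, tokens_per_decade, per=1_000_000):
    """Return hits per `per` tokens for each decade that has hits.

    hit_years: iterable of years, one per matched token (assumed export format).
    tokens_per_decade: dict mapping decade -> total tokens in that decade.
    """
    raw = Counter(decade_of(y) for y in hit_years)
    return {d: raw[d] * per / tokens_per_decade[d] for d in sorted(raw)}


# Invented toy data, for illustration only.
hit_years = [1978, 1979, 1979, 1984, 1985]
sizes = {1970: 2_000_000, 1980: 3_000_000}
result = frequency_by_decade(hit_years, sizes)
```

Normalising by decade size matters here because the amount of text differs between decades, so raw counts alone would not be comparable across time.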
Initial data and methods
• Preliminary analysis of Callaghan/Thatcher
data extract began September 2014
• Testing of lexical searches and USAS (Rayson
et al. 2004) semantic domains to find language
used around unions/labour relations
• Prototypical items (strike, union) mainly in
– I3.1 Work and employment: Generally
– G1.2 Politics
• These are broad categories, which would require a lot of
manual screening
• Diverted to analysis of formulaic language
(in progress)
Advantages of using HT tags
for this study
• HT offers more specific categories, so should
enable a more nuanced analysis with less need
for manual filtering of irrelevant items
• HT overarching structure:
03 (Society)
-> 03.11 (Occupation and work)
-> 03.11.04 (working)
• 82 HT sub-categories relating to labour relations
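Because HT codes are hierarchical, a code's parent categories can be read off its dotted prefix. The sketch below shows this for the codes on this slide. It assumes plain dotted codes only (it does not interpret the hyphenated sub-category suffixes that appear in some HT codes), and the helper names are hypothetical.

```python
def ancestors(code):
    """List the parent categories of a dotted HT code, broadest first."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def is_under(code, parent):
    """True if `code` is `parent` itself or sits below it in the hierarchy."""
    return code == parent or code.startswith(parent + ".")


labour_relations = "03.11.04.03"
chain = ancestors(labour_relations)
```

Here `chain` holds the three levels shown above (03, 03.11, 03.11.04). Testing against `parent + "."` rather than the bare prefix avoids matching a sibling code that merely begins with the same digits.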
HT categories relating to
labour relations
• 03.11.04.03 (Labour relations)
-> 24 subcategories
• 03.11.04.04 (Association of
employers/employees)
-> 32 subcategories
• 03.11.09.05 (Those involved in labour
relations/associations)
-> 23 subcategories
Methods using CQPWeb
Hansard (3.1, HT)
1. Building up the diachronic view
of labour relations talk in Hansard
2. Identifying semantic collocates
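The idea of a semantic collocate can be sketched in Python as follows: instead of counting the words that occur near a node word, count the HT tags that occur near tokens carrying the node tag. This is a simplified illustration and not CQPWeb's own procedure. It assumes the text is available as (word, tag) pairs, uses raw co-occurrence counts rather than an association statistic, and the toy tokens and their tag assignments are invented.

```python
from collections import Counter


def has_tag(tag, node):
    """True if `tag` is the node category or one of its dotted sub-codes."""
    return tag == node or tag.startswith(node + ".")


def semantic_collocates(tokens, node, window=5):
    """Count the tags found within `window` tokens of each node-tagged token.

    tokens: list of (word, tag) pairs (assumed format of the tagged text).
    """
    counts = Counter()
    for i, (_, tag) in enumerate(tokens):
        if not has_tag(tag, node):
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j][1]] += 1
    return counts


# Invented toy sequence; the tags are placeholders for illustration.
tokens = [
    ("the", "04.03"),
    ("conciliation", "03.11.04.03"),
    ("system", "01.12.05.25.04.01-18.04"),
    ("is", "NULL"),
]
result = semantic_collocates(tokens, "03.11.04.03", window=1)
```

A full analysis would rank the counted tags with an association measure rather than raw frequency, since grammatical items and pronouns will otherwise dominate any window.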
Semantic collocates of
03.11.04.03 and sub-categories
01.12.05.25.04.01-18.04
The world > Space > Relative position > Arrangement/fact of
being arranged > State of being gathered together > An
assemblage/collection: group: a set of things forming a
complex unity
01.13.10-05
The world > Time > Frequency > rhythm/measure
02.01.13.08.09
The mind > Mental capacity > Belief > Uncertainty, doubt,
hesitation > Possibility
04.01.02
Geographical names (extra HT code for SAMUELS project)
04.03
Grammatical items (extra HT code for SAMUELS project)
NULL
(Not recognised by the tagger)
01.11.01.07
The world > Existence and causation > Existence >
State/condition
04.06
Pronouns (extra HT code for SAMUELS project)
Collocation of labour relations
with 01.12.05.25.04.01-18.04
An assemblage/collection: group: a set of things
forming a complex unity
1 word at this level: system
Collocation of labour relations
with 01.11.01.07 State/condition
Interim findings and progress
to date
• Some cases of measure (as a collocate) are
tagged incorrectly.
• According to the online HT, conciliation does not
have a sense associated with labour relations
until 1876.
• We may find evidence of earlier cases, but, as
with striking, we know some cases of conciliation
are not tagged correctly.
• These would need manual filtering (conciliation
was the most frequently-occurring item when the
Labour Relations HT code was queried).
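One way to prioritise the manual filtering described above is to flag hits that pre-date the first attested use of the relevant sense. The sketch below assumes hits are available as (word, year) pairs and that first-attestation dates have been looked up by hand. Only the 1876 date for conciliation comes from this slide; the function name and the example hits are hypothetical.

```python
def flag_early_hits(hits, first_attested):
    """Return hits dated before the sense is first attested.

    hits: list of (word, year) pairs (assumed export format).
    first_attested: dict mapping word -> first year of the labour relations sense.
    """
    return [
        (word, year)
        for word, year in hits
        if word in first_attested and year < first_attested[word]
    ]


first_attested = {"conciliation": 1876}  # date as reported from the online HT
hits = [("conciliation", 1850), ("conciliation", 1900)]  # invented examples
to_check = flag_early_hits(hits, first_attested)
```

Flagged hits are candidates for checking, not automatic errors: as noted above, earlier genuine uses may exist in the data.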
Interim findings and progress
to date
• We hope to provide some feedback on accuracy
of the tagging, once we can get data for all
decades
• The larger decades have proved problematic in
processing, so we are trying to create subcorpora
(rather than use the Restricted Query form) to
see if this works better
• We hope to complete the analysis in due course,
for conferences and publications
Outputs: Conference papers
• Abstracts accepted for:
– The 13th International Cognitive Linguistics Conference (ICLC13), 20-25 July 2015, Northumbria University, Newcastle
– Poetics and Linguistic Association Conference (PALA), 15-18
July 2015, University of Kent, Canterbury
• Abstract submitted for:
– Political Discourse: Multidisciplinary Approaches, 26-27 June
2015, University College London
Outputs: Publications
• Language styles at the dispatch box: comparing the
language used by two former UK Prime Ministers (in
preparation; proposed submission for the Journal of
Language & Politics)
• “Is there a Baron in the Commons?” An investigation of
the way industrial unions and their leaders are
discussed in the UK House of Commons (details to be
decided once data retrieved and analysis under way)
• A diachronic study of language used to talk about
labour relations in UK House of Commons debates
1803-2005 (details to be decided once data retrieved
and analysis under way)
References
Jeffries, L. and Walker, B. (2012) “Key words in the press”. English Text
Construction 5(2): 208-29.
Language Unlocked (2012) 20 years of Unions21: Union identity in print
media. Report to Unions21, Stylistics Research Centre, University of
Huddersfield. See
https://www.hud.ac.uk/media/universityofhuddersfield/content/image/research/mhm/stylisticsresearchcentre/Unions21report12062013.pdf
Rayson, P. (2008) “From key words to key semantic domains”. International
Journal of Corpus Linguistics 13(4): 519-549.
Rayson, P., Archer, D., Piao, S. and McEnery, T. (2004) “The UCREL semantic
analysis system”. In Proceedings of the workshop on Beyond Named
Entity Recognition: Semantic labelling for NLP tasks, in association with
the 4th International Conference on Language Resources and
Evaluation (LREC 2004), Lisbon, Portugal, 25 May 2004, pp. 7-12.
See http://eprints.lancs.ac.uk/1783/1/usas_lrec04ws.pdf (last accessed
March 2015).