Corpus Linguistics and ESP - is there a link?

Exploring the Key Words of Shakespeare
Mike Scott
School of English
University of Liverpool
Charles University, Prague 25.5.06
This presentation is at
www.lexically.net/downloads/corpus_linguistics
Starting Point 1
Scott and Tribble (2006) studying Romeo and
Juliet:
1. “All Shakespeare plays” is a suitable reference corpus
2. A large number of KWs are proper nouns: characters in the play
3. Others:
   1. theme KWs (love, death etc.)
   2. exclamations
   3. pronouns
   4. copula verbs
Starting Point 2
Culpeper (2002) studying Romeo and
Juliet:
• if is a KW for Juliet and reflects her anxiety
• the is a KW for Mercutio, who “has a ‘noun-y’ style […] Mercutio is in the play to give dazzling rhetorical displays” (2002:22)
Aims of the Paper
• To investigate KWs in all of Shakespeare's plays
• To identify proportions of
  • character/place KWs
  • expected KWs i.e. matching themes
  • interesting KWs i.e. unexpected

Methods
• Obtain all plays from Project Gutenberg
• Clean them up
• Use WordSmith’s WordList tool to compute word-lists
• Use KeyWords tool to compute KWs for each using all the plays as a reference corpus
• Export the KWs for each into an Excel spreadsheet
• Identify KW types
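The keyness step can be illustrated outside WordSmith. The sketch below is not WordSmith's implementation: it assumes the log-likelihood statistic (one of the standard keyness measures), and the function names, the `min_freq` cut-off and the 6.63 threshold (roughly p < 0.01) are all illustrative assumptions rather than settings used in this study.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) for a word seen a times in a study text of
    c tokens and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study text
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2

def keywords(study, reference, min_freq=2, threshold=6.63):
    """Return (word, G2) pairs for positive keywords, most key first.
    min_freq and threshold are assumed values, not the study's settings."""
    c = sum(study.values())
    d = sum(reference.values())
    found = []
    for word, a in study.items():
        if a < min_freq:
            continue
        b = reference.get(word, 0)
        if a / c <= b / d:
            continue  # not relatively more frequent in the study text
        g2 = log_likelihood(a, b, c, d)
        if g2 >= threshold:
            found.append((word, g2))
    return sorted(found, key=lambda pair: -pair[1])

# Toy word-lists, invented purely for illustration
study = Counter({"romeo": 30, "the": 70})
reference = Counter({"romeo": 30, "the": 970})
print(keywords(study, reference))
```

With these toy counts only romeo is returned, since the is relatively less frequent in the study list than in the reference list.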
Clean up: original
[Enter Hamlet.]
Ham.
To be, or not to be,--that is the question:--
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune
Or to take arms against a sea of troubles,
And by opposing end them?--To die,--to sleep,--
No more; and by a sleep to say we end
The heartache, and the thousand natural shocks
That flesh is heir to,--'tis a consummation
Devoutly to be wish'd. To die,--to sleep;--
To sleep! perchance to dream:--ay, there's the rub;
For in that sleep of death what dreams may come,
…
Oph.
Good my lord,
How does your honour for this many a day?
Ham.
I humbly thank you; well, well, well.
Cleaned up (1)
<Header>
HAMLET, PRINCE OF DENMARK
by William Shakespeare
PERSONS REPRESENTED.
Claudius, King of Denmark.
Hamlet, Son to the former, and Nephew to the present King.
Polonius, Lord Chamberlain.
Horatio, Friend to Hamlet.
Laertes, Son to Polonius.
Voltimand, Courtier.
Cornelius, Courtier.
Rosencrantz, Courtier.
Guildenstern, Courtier.
Osric, Courtier.
A Gentleman, Courtier.
A Priest.
Marcellus, Officer.
Bernardo, Officer.
Francisco, a Soldier
Reynaldo, Servant to Polonius.
Players.
Two Clowns, Grave-diggers.
Fortinbras, Prince of Norway.
A Captain.
English Ambassadors.
Ghost of Hamlet's Father.
Gertrude, Queen of Denmark, and Mother of Hamlet.
Ophelia, Daughter to Polonius.
Lords, Ladies, Officers, Soldiers, Sailors, Messengers, and other
Attendants.
</Header>
Cleaned up (2)
[Enter Hamlet.]
<Hamlet>
To be, or not to be,--that is the question:--
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune
Or to take arms against a sea of troubles,
And by opposing end them?--To die,--to sleep,--
No more; and by a sleep to say we end
The heartache, and the thousand natural shocks
That flesh is heir to,--'tis a consummation
Devoutly to be wish'd. To die,--to sleep;--
To sleep! perchance to dream:--ay, there's the rub;
For in that sleep of death what dreams may come,
…
<Ophelia>
Good my lord,
How does your honour for this many a day?
<Hamlet>
I humbly thank you; well, well, well.
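The change from speaker prefixes (Ham., Oph.) to speaker tags (<Hamlet>, <Ophelia>) could be automated along these lines. This is a minimal sketch, not the procedure actually used: it assumes each prefix sits on a line of its own, as in the extract above, and the `SPEAKERS` table is a hypothetical mapping that would need an entry for every character.

```python
# Hypothetical prefix-to-name table; a real play needs one entry per speaker.
SPEAKERS = {"Ham.": "Hamlet", "Oph.": "Ophelia"}

def tag_speakers(text, speakers=SPEAKERS):
    """Replace a line consisting only of a speaker prefix with a <Name> tag."""
    tagged = []
    for line in text.splitlines():
        name = speakers.get(line.strip())
        tagged.append(f"<{name}>" if name else line)
    return "\n".join(tagged)

original = "[Enter Hamlet.]\nHam.\nTo be, or not to be\nOph.\nGood my lord,"
print(tag_speakers(original))
```

Lines that are not an exact match for a known prefix, including stage directions, pass through unchanged.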
Further cleaning up needed
• Check Folio and other edition aspects
  • completeness
  • punctuation
  • spellings (US honor / UK honour?)
• Correct < > markup to exclude stage directions and Act and Scene numbers.
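One way to see the effect of excluding non-speech material is to strip it before counting words. The sketch below is only an approximation under stated assumptions: stage directions are taken to be in square brackets, the cast list to be inside <Header>…</Header>, and Act and Scene headings to start a line with "Act" or "Scene" plus a numeral. The function name and patterns are illustrative, not part of WordSmith.

```python
import re

def spoken_words(text):
    """Return lower-cased word tokens with non-speech material removed."""
    # Drop the whole header block (title, cast list) first
    text = re.sub(r"<Header>.*?</Header>", " ", text, flags=re.DOTALL)
    # Drop stage directions such as [Enter Hamlet.]
    text = re.sub(r"\[[^\]]*\]", " ", text)
    # Drop remaining angle-bracket markup such as <Hamlet>
    text = re.sub(r"<[^>]*>", " ", text)
    # Drop Act / Scene heading lines (assumed heading format)
    text = re.sub(r"(?im)^\s*(?:act|scene)\s+[ivx\d]+\b.*$", " ", text)
    # Simple tokeniser: letters with optional internal apostrophes
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

sample = (
    "<Header>\nHAMLET, PRINCE OF DENMARK\n</Header>\n"
    "ACT III.\nScene I. A room in the Castle.\n"
    "[Enter Hamlet.]\n<Hamlet>\n"
    "To be, or not to be,--that is the question:--\n"
)
print(spoken_words(sample))
```

The heading pattern is deliberately narrow, and would need checking against each play's actual layout before the counts are trusted.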
Results
KW types:
• Characters/Places = 50% (30% in BNC, Scott & Tribble 2006:71)
• a large number of expected KWs matching each play’s theme and plot
• some unexpected KWs
  • very, e’en, most in Hamlet (truth/epistemology?)
  • the, it in Hamlet (why twice as many?)

Conclusions
• Further cleaning up of the source texts is needed.
• Using the set of KWs for all the plays presents patterns that are similar to those of one play alone (Scott & Tribble 2006, Culpeper 2002).
• Character and place names take up about 50% of the KWs overall, ranging from about 40% to 70%.
• Other expected KWs account for most of the remainder.
• There is a rich harvest of unexpected KWs
• …which merit extensive further investigation
References



Crystal, David & Ben Crystal, 2002. Shakespeare’s
Words. London: Penguin.
Culpeper, Jonathan, 2002. 'Computers, language and
characterisation: An Analysis of six characters in
Romeo and Juliet'. In: U. Melander-Marttala, C.
Östman and M. Kytö (eds.), Conversation in Life and
in Literature: Papers from the ASLA Symposium,
Association Suedoise de Linguistique Appliquée
(ASLA), 15. Universitetstryckeriet: Uppsala, pp.11-30.
Scott, Mike & Chris Tribble, 2006. Textual Patterns:
key words and corpus analysis in language education.
Amsterdam: Benjamins.