Exploring the Key Words of Shakespeare

Mike Scott
School of English
University of Liverpool
Charles University, Prague 25.5.06
This presentation is at www.lexically.net/downloads/corpus_linguistics

Starting Point 1

Scott and Tribble (2006) studying Romeo and Juliet:
1. “All Shakespeare plays” is a suitable reference corpus
2. A large number of KWs are proper nouns: characters in the play
3. Others:
    1. theme KWs (love, death etc.)
    2. exclamations
    3. pronouns
    4. copula verbs

Starting Point 2

Culpeper (2002) studying Romeo and Juliet:
    “if” is a KW for Juliet and reflects her anxiety
    “the” is a KW for Mercutio, who “has a ‘noun-y’ style […] Mercutio is in the play to give dazzling rhetorical displays” (2002:22)

Aims of the Paper

To investigate KWs in all of Shakespeare's plays
To identify proportions of
    character/place KWs
    expected KWs i.e. matching themes
    interesting KWs i.e. unexpected

Methods

Obtain all plays from Project Gutenberg
Clean them up
Use WordSmith’s WordList tool to compute word-lists
Use KeyWords tool to compute KWs for each using all the plays as a reference corpus
Export the KWs for each into an Excel spreadsheet
Identify KW types

Clean up: original

[Enter Hamlet.]
Ham. To be, or not to be,--that is the question:--
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune
Or to take arms against a sea of troubles,
And by opposing end them?--To die,--to sleep,--
No more; and by a sleep to say we end
The heartache, and the thousand natural shocks
That flesh is heir to,--'tis a consummation
Devoutly to be wish'd. To die,--to sleep;--
To sleep! perchance to dream:--ay, there's the rub;
For in that sleep of death what dreams may come,
…
Oph. Good my lord,
How does your honour for this many a day?
Ham. I humbly thank you; well, well, well.

Cleaned up (1)

<Header>
HAMLET, PRINCE OF DENMARK
by William Shakespeare
PERSONS REPRESENTED.
Claudius, King of Denmark.
Hamlet, Son to the former, and Nephew to the present King.
Polonius, Lord Chamberlain.
Horatio, Friend to Hamlet.
Laertes, Son to Polonius.
Voltimand, Courtier.
Cornelius, Courtier.
Rosencrantz, Courtier.
Guildenstern, Courtier.
Osric, Courtier.
A Gentleman, Courtier.
A Priest.
Marcellus, Officer.
Bernardo, Officer.
Francisco, a Soldier
Reynaldo, Servant to Polonius.
Players.
Two Clowns, Grave-diggers.
Fortinbras, Prince of Norway.
A Captain.
English Ambassadors.
Ghost of Hamlet's Father.
Gertrude, Queen of Denmark, and Mother of Hamlet.
Ophelia, Daughter to Polonius.
Lords, Ladies, Officers, Soldiers, Sailors, Messengers, and other Attendants.
</Header>

Cleaned up (2)

[Enter Hamlet.]
<Hamlet> To be, or not to be,--that is the question:--
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune
Or to take arms against a sea of troubles,
And by opposing end them?--To die,--to sleep,--
No more; and by a sleep to say we end
The heartache, and the thousand natural shocks
That flesh is heir to,--'tis a consummation
Devoutly to be wish'd. To die,--to sleep;--
To sleep! perchance to dream:--ay, there's the rub;
For in that sleep of death what dreams may come,
…
<Ophelia> Good my lord,
How does your honour for this many a day?
<Hamlet> I humbly thank you; well, well, well.

Further cleaning up needed

Check Folio and other edition aspects
    completeness
    punctuation
    spellings (US honor / UK honour?)
Correct < > markup to exclude stage directions and Act and Scene numbers.

Results

KW types:
    Characters/Places = 50% (30% in BNC, Scott & Tribble 2006:71)
    a large number of expected KWs matching each play’s theme and plot
    some unexpected KWs
        “very”, “e’en”, “most” in Hamlet (truth/epistemology?)
        “the”, “it” in Hamlet (why twice as many?)

Conclusions

Further cleaning up of the source texts is needed.
Using the set of KWs for all the plays presents patterns that are similar to those of one play alone (Scott & Tribble 2006, Culpeper 2002).
Character and place names take up about 50% of the KWs overall, ranging from about 40% to 70%.
Other expected KWs account for most of the remainder.
There is a rich harvest of unexpected KWs
…which merit extensive further investigation

References

Crystal, David & Ben Crystal, 2002. Shakespeare’s Words. London: Penguin.
Culpeper, Jonathan, 2002. 'Computers, language and characterisation: An Analysis of six characters in Romeo and Juliet'. In: U. Melander-Marttala, C. Östman and M. Kytö (eds.), Conversation in Life and in Literature: Papers from the ASLA Symposium, Association Suedoise de Linguistique Appliquée (ASLA), 15. Universitetstryckeriet: Uppsala, pp.11-30.
Scott, Mike & Chris Tribble, 2006. Textual Patterns: key words and corpus analysis in language education. Amsterdam: Benjamins.
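
Sketch: the clean-up step

The clean-up slides show speech prefixes such as "Ham." being replaced by tags such as <Hamlet>, with the cast list wrapped in <Header>. The Python sketch below is one possible way to do that and then to count only the spoken words. It is not the procedure actually used for the talk: the SPEAKERS mapping and the function names tag_speakers and spoken_words are hypothetical, and a real run would need a mapping for every play.

```python
import re

# Hypothetical mapping from abbreviated speech prefixes to full names.
# Only the two prefixes shown on the slides are included here.
SPEAKERS = {"Ham.": "Hamlet", "Oph.": "Ophelia"}


def tag_speakers(text, speakers=SPEAKERS):
    """Replace a line-initial speech prefix such as 'Ham.' with '<Hamlet>'."""
    tagged = []
    for line in text.splitlines():
        for prefix, name in speakers.items():
            # Match the prefix only at the start of a line, followed by a
            # space or standing alone, so 'Ham.' inside a speech is untouched.
            if line == prefix or line.startswith(prefix + " "):
                line = f"<{name}>" + line[len(prefix):]
                break
        tagged.append(line)
    return "\n".join(tagged)


def spoken_words(tagged_text):
    """Return lower-cased words, ignoring header, markup and stage directions."""
    # Drop the whole <Header>...</Header> block (cast list), if present.
    body = re.sub(r"<Header>.*?</Header>", " ", tagged_text, flags=re.S)
    # Drop remaining <...> speaker tags and [...] stage directions.
    body = re.sub(r"<[^>]*>|\[[^\]]*\]", " ", body)
    # Keep word-internal apostrophes, as in wish'd.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", body.lower())


sample = (
    "[Enter Hamlet.]\n"
    "Ham. I humbly thank you; well, well, well.\n"
    "Oph. Good my lord,"
)
tagged = tag_speakers(sample)
words = spoken_words(tagged)
```

Keeping the speaker tags in the text rather than deleting them leaves open the option of building per-character word-lists later, as in Culpeper (2002).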
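
Sketch: computing KWs against a reference corpus

The Methods slide compares each play's word-list with a reference corpus of all the plays. The slides do not say which keyness statistic or cut-offs were used, so the Python sketch below makes its own assumptions: it uses the two-term log-likelihood formula, one widely used keyness measure, and the names log_likelihood and key_words, the min_freq value and the threshold of 6.63 (roughly p < 0.01 at one degree of freedom) are illustrative choices only.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for one word.

    a = frequency in the play, b = frequency in the reference corpus,
    c = total words in the play, d = total words in the reference corpus.
    """
    # Expected frequencies if the word were equally common in both texts.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / e1)
    if b > 0:
        score += b * math.log(b / e2)
    return 2 * score


def key_words(play_tokens, reference_tokens, min_freq=2, threshold=6.63):
    """Return (word, score) pairs that are unusually frequent in the play."""
    c, d = len(play_tokens), len(reference_tokens)
    if c == 0 or d == 0:
        return []
    play = Counter(play_tokens)
    ref = Counter(reference_tokens)
    results = []
    for word, a in play.items():
        b = ref[word]
        # Skip rare words and words that are not relatively more
        # frequent in the play (only positive KWs are kept here).
        if a < min_freq or a / c <= b / d:
            continue
        score = log_likelihood(a, b, c, d)
        if score >= threshold:
            results.append((word, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy data, invented purely for illustration.
play_tokens = ["romeo"] * 10 + ["the"] * 10
reference_tokens = ["the"] * 1000 + ["romeo"] * 10 + ["love"] * 990
toy_kws = key_words(play_tokens, reference_tokens)
```

In the toy data "the" has the same relative frequency in both texts and so is not key, while "romeo" is. This mirrors the finding that character names make up a large share of the KWs.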