Corpus Linguistics (2): The Tools of the Trade
Session 2

Corpus Linguistics: http://tinyurl.com/669o4zt
martin.wynne@it.ox.ac.uk
ylva.berglund@it.ox.ac.uk

Today’s session
• An introduction to some features of tools
• Demo of different (kinds of) tools
• Hands-on practice with one tool

AIM: Help you know what to look for in a tool for your work (and what options there are)

There are different TYPES OF TOOLS

Different kinds of tools
• Online / offline
• For one particular corpus / for any corpus or text
• Use straight away / need to prepare corpus
• 'Free' / licence conditions and costs

Tools may
• have different functions: concordance, wordlist, statistics, collocation, keywords…
• handle annotation: interpret tags, ignore tags, treat tags as text
• take different text formats: .txt, .xml, .html

Different tools have different functions. TYPICAL FUNCTIONS

Concordance
• Search word + context
• Can be displayed as KWIC
• Can usually be sorted
• Used to see patterns of use

KWIC Concordance

Wordlist
List all words in the corpus
• alphabetically
• by frequency
Used as starting point for further functions
• keywords
• lexical density/readability calculations

Sampler AntConc wordlist

Collocations
Co-occurrence patterns
borrow money
borrow books
borrow a car
May I borrow
(more in Session 3)

Collocates: adjectives immediately preceding BUSINESS
Corpus of Contemporary American English
http://www.americancorpus.org/

Visualization
Graphs
Word clouds
Distribution displays
Etc.

Example: BNCweb
borrow

Example: Voyant Tools
http://voyant-tools.org

‘borrow’
Compare your intuition to what you find in the corpus
What is borrowed and by whom?
What words do you expect to find together with borrow?
Can these words be grouped in some way, for example based on their word class, function, or meaning?
Where would you expect to find these words (e.g. before or after borrow? Immediately adjacent or not?)
Who do you think uses the word borrow?
In what context or type of language would you find borrow?
Are there any words that are NOT used with borrow?

AntConc
Download AntConc for free from: http://www.antlab.sci.waseda.ac.jp/antconc_index.html (or just search for AntConc)
Use your own texts and corpora. Find some examples at: http://www.ota.ox.ac.uk/

Tip of the week
Register to use the BYU corpora for free. http://corpus.byu.edu

Next week (Session 3)
Collocation
Corpus linguists claim to have identified an important principle that is responsible for the creation of much of the meaning of texts: collocation (co-occurrence). What is it, and are the claims true?

Optional reading:
* Xiao, Richard, and Tony McEnery (2006). "Collocation, Semantic Prosody, and Near Synonymy: A Cross-Linguistic Perspective." Applied Linguistics 27(1): 103-129. http://applij.oxfordjournals.org/cgi/content/full/27/1/103
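
Sketch: typical functions in code
The typical functions covered in this session (KWIC concordance, frequency wordlist, collocates) can be sketched in a few lines of Python. This is only an illustration of the ideas, not how AntConc, BNCweb or any other tool works internally. The function names, the deliberately naive tokenizer and the sample sentence (built from the borrow examples) are all assumptions made for this sketch.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, node, width=3):
    """Return one (left context, node, right context) tuple per hit for `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def wordlist(tokens):
    """Return (word, frequency) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def right_collocates(tokens, node):
    """Count the words that immediately follow `node`."""
    return Counter(tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == node)


# Hypothetical mini-corpus, invented for illustration only.
sample = "May I borrow a car? You can borrow money or borrow books. I never borrow money."
tokens = tokenize(sample)

# Concordance lines with the search word aligned in the middle (KWIC layout).
for left, node, right in kwic(tokens, "borrow"):
    print(f"{left:>20}  {node}  {right}")

# Top of the frequency wordlist, then the most common word right after "borrow".
print(wordlist(tokens)[:3])
print(right_collocates(tokens, "borrow").most_common(1))
```

Sorting the concordance lines by the right-hand context (for example `sorted(kwic(tokens, "borrow"), key=lambda line: line[2])`) groups similar patterns together, which is what the sort options in concordance tools are used for.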