Constructions in Wonderland: Exploring the functionality of constructions through N-grams
Functionality of Language | Sprogets Funktionalitet
Aalborg University, 10 December 2014
Yoshikata Shibuya Kyoto University of Foreign Studies
Kim Ebensgaard Jensen Aalborg University

Introduction
• Is there a way to (semi-)automatically identify constructions in texts, discourses, and corpora?
• Could N-gram analysis and N-gram-based network analysis be ways to do that?
• How can (semi-)automatic identification of constructions help us learn about the functional contributions of constructions in discourse?
• To this end, we will analyze two literary classics and four political speeches.

Outline
• Theoretical framework
  • Positioning of paper
  • (Very) basic principles of construction grammar
  • Functionality of constructions
• Method
  • Data collection and methods
  • Wordclouds
  • N-grams
  • Network analysis
• Analyses
  • Wordclouds
  • N-grams
  • Networks

Theoretical grounding

Theoretical and methodological positioning of this paper
• Theoretical positioning
  • Construction grammar (e.g. Goldberg 1995; Croft 2001)
  • Cognitive poetics/stylistics (e.g. Stockwell 2002)
  • Cognitive discourse analysis (e.g. Hart 2013)
• Methodological positioning
  • Corpus stylistics (e.g. Semino & Short 2004)
  • Corpus-aided discourse studies (e.g. Baker 2012)
  • Text mining (e.g. Miner et al. 2012)

Constructions
In construction grammar, a construction is a functional unit that pairs form and semantic and/or discourse-pragmatic function (Goldberg 1995, 2006; Croft 2001, 2005; Hilpert 2014).
Examples from English - [form]/[function] (cf. Langacker 1987):
• [S V IO DO]/[TRANSFER OF POSSESSION] (Goldberg 1995)
• [X BE so Y that Z]/[SCALAR CAUSATION] (Bergen & Binsted 2004)
• [you don't want me to V]/[THREATENING SPEECH ACT] (Martínez 2013)
• [to begin with]/[INTRODUCTION OF LIST OF ITEMS] (Lipka & Schmid 1994)
• [PROacc CLinf (or NP)]/[DISBELIEF TOWARDS PROPOSITION] (Lambrecht 1990)
  • 'What, me worry?!', 'Him a doctor?!' etc.

Constructions
• Constructions may be schematic, substantive (fixed), or something in between (Fillmore et al. 1988).
• Constructions may be atomic/simple, complex, or something in between, and they form a lexicon-syntax continuum (e.g. Goldberg 1995; Croft 2001).
• Language competence is an inventory of constructions (a.k.a. the construct-i-con) of varying degrees of abstraction, which are instantiated in language use (e.g. Goldberg 1995).
• In most contemporary incarnations of construction grammar, the construct-i-con is usage-based (e.g. Croft 2001).
• Constructions are subject to general human cognitive processes and principles, such that language is not a separate, autonomous cognitive faculty; thus, construction grammar is part of the overall endeavor of cognitive linguistics.

The discursive and stylistic functionality of constructions
• If constructions are functional units (pairings of form and meaning/function), then they must contribute to discourse as part of a speaker's linguistic repertoire. Here are two examples:
• Writers of fiction may use constructions in...
  • descriptions of actions and happenings;
  • characterizations (Culpeper 2009) and mind styles (Fowler 1977), by having characters use certain constructions in their dialog and narrative, or by using certain constructions in the descriptions of characters or of their actions;
  • setting up the text-world and specifying temporal relations in the narrative;
  • foregrounding, deviation, parallelism etc. (e.g. Short & Leech 2007).
• In political speeches, speakers may use constructions in...
  • framing of issues and other ideologically based representations;
  • organization of topics/issues;
  • rhetorical strategies (including parallelisms etc.).

Method

Data collection and methods
• Text data
  • Alice's Adventures in Wonderland by Lewis Carroll (1865), obtained via Project Gutenberg = AW
  • Adventures of Huckleberry Finn by Mark Twain (1884), obtained via Project Gutenberg = HF
  • Inaugural speeches by US Presidents, obtained via Bartleby
• Text mining and corpus-linguistic methods
  • Wordclouds (R package 'wordcloud')
  • N-grams (R package 'tau', AntConc)
  • Network analysis (R package 'igraph')
  • Concordances (AntConc)

Wordclouds
• Wordclouds are graphical representations of the lexical texture of a text, based on frequencies.
• They are visual versions of frequency lists.
• ... except that they do not display the exact frequency values.

N-grams
• An N-gram is a contiguous string of words; the N-grams of interest here are those that occur frequently in a data set (such as a corpus or a text).
• N-grams are named according to the number of words in the string in question (N = number of words).
• Monogram (1-gram) = one word, bigram (2-gram) = two words, trigram (3-gram) = three words, four-gram (4-gram) = four words, five-gram (5-gram) = five words etc.
• Examples: 'damn you' (2-gram), 'what the hell' (3-gram), 'pick up the phone' (4-gram) (e.g. Stubbs 2009)

N-grams
• N-gram retrieval is a text mining technique used in the identification of N-grams in a data set, text, or corpus.
• How it works (in a nutshell): ask a computer to find strings of N words, and it returns a list of N-grams ranked in terms of frequency.
• An example: 4-grams in all of Shakespeare's plays
  • Find all instances of word + word + word + word combinations in the collective body of Shakespeare's plays.
  • Calculate the frequency of each word + word + word + word combination in the collective body of Shakespeare's plays.
  • List the word + word + word + word combinations in order of frequency in the collective body of Shakespeare's plays.
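The three retrieval steps above (find, count, rank) can be sketched in a few lines of Python. This is only an illustration: the slides name the R package 'tau' and AntConc as the tools used, not this code. The tokenizer is a deliberately naive assumption (lowercasing, punctuation dropped, so N-grams may run across sentence boundaries), and the function names and the toy sample are hypothetical.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes as word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Step 1: return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ranked_ngrams(text, n):
    """Steps 2 and 3: count each n-gram and rank by descending frequency."""
    counts = Counter(ngrams(tokenize(text), n))
    return counts.most_common()


# Toy sample (not taken from either novel), just to show the three steps.
sample = "said the king . said the cat . said the king again"
top_trigrams = ranked_ngrams(sample, 3)
```

Here `top_trigrams[0]` is the trigram `('said', 'the', 'king')` with a count of 2. Changing `n` gives 2-grams, 4-grams, and so on from the same text.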

N-grams
• N-gram analysis can be used for a wide range of things:
  • Identification of the "aboutness" of a text or discourse.
  • Phraseological analysis.
  • Probabilistic language modeling.
  • Analysis of aspects of style, genre and register.
  • Identification of various types of anonymous writers.
  • Identification of constructions and their functionality in a text or discourse.
  • Etc.

Network analysis
• Network analysis sets up data points as nodes in a network and calculates the strengths of association between them.
• As a text mining method, it represents word types (as opposed to tokens) in a text as nodes and, based on N-gram relations, sets up relations between the nodes, or words.

Analysis

Wordclouds: a first look

Wordclouds
[Figures: Wordcloud HF; Wordcloud AW]
• No details (e.g. frequency)
• Informative to some extent
• Visually attractive

N-grams in AW and HF

N-grams in AW
[Figure: lists of 2-grams, 3-grams, and 4-grams, annotated 'speech / dialog', 'definite NP', and 'said']
[DIALOG said NPdef/unique]/[TOPICALIZATION OF DIALOG]

N-grams in HF
[Figure: lists of 2-grams, 3-grams, 4-grams, and 5-grams, annotated 'productive' and 'less productive']
• Two different constructions:
  • [it BEn't no X]
  • [there BEn't no X]
• Event-relating constructions used to temporally structure events in the narrative:
  • [X and then Y]/[EVENT1 FOLLOWED BY EVENT2]
  • [by and by X]/[EVENT HAPPENING AFTER SOME TIME]

N-gram analysis: pros and cons 1
Pro:
• Simple N-gram analysis can help us identify and address constructions and their functionality in one text or discourse, as it identifies frequent combinations of words in the text or discourse in question.
Con: Problem #1:
• What simple N-gram analysis does not tell us is whether those frequent combinations of words can also be found in other texts.
• Our N-gram analyses of AW and HF do not tell us whether the findings are just general patterns in English or whether they actually delineate the two texts.
Solution to problem #1:
• To obtain a list of N-grams that really delineate a given text (so that we can identify which constructions are characteristically associated with the text), a comparative analysis can be useful.

N-grams: AW vs. HF
• Note that here we focus on 2-grams
• Procedures
  1. 2-gram analysis of AW and HF
  2. Normalization of frequencies for AW and HF: per 10,000 words
  3. Comparison between the 2-grams of AW and those of HF with Fisher's exact test

Other texts: a quick look at inaugural speeches

What about other texts? Political texts: inaugural speeches by US Presidents
• Procedures
  1. 2-gram analysis of 4 inaugural speeches by George Bush (GB), Bill Clinton (BC), George W. Bush (GWB), and Barack Obama (BO).
  2. Normalization: per 1,000 words
  3. Fisher's exact test

Results

N-gram analysis: pros and cons 2
• Pros
  • N-grams can help us identify and address constructions and their functionality in one text or discourse.
  • Comparative N-gram analysis can help us find N-grams that delineate texts or discourses.
  • So far, we have found 2-grams that delineate AW and HF, and also some political speeches.
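The comparison procedure listed for AW vs. HF (normalize, then apply Fisher's exact test) might look roughly like the sketch below. It rests on several assumptions. The slides do not say which counts were fed to the test; here the test runs on raw counts, with the normalized rates computed only for readability. The 2x2 table layout (this 2-gram vs. all other 2-gram tokens, text A vs. text B) is one common choice, not necessarily the authors'. The counts are invented for illustration, and the test is written out with the standard library rather than calling an R or SciPy routine.

```python
from math import comb


def per_n_words(count, total_words, base=10_000):
    """Normalize a raw count to a rate per `base` words."""
    return count * base / total_words


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


# Hypothetical counts: one 2-gram occurs 50 times in text A and 10 times
# in text B, each text having 2,000 tokens in total (invented numbers).
count_a, total_a = 50, 2_000
count_b, total_b = 10, 2_000
rate_a = per_n_words(count_a, total_a)
rate_b = per_n_words(count_b, total_b)
p_value = fisher_exact_two_sided(count_a, total_a - count_a,
                                 count_b, total_b - count_b)
```

A small p-value suggests that the 2-gram is distributed differently in the two texts. Because one test is run per 2-gram, a correction for multiple comparisons may be worth considering; the slides do not mention one.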
• Cons: Problem #2:
  • Shorter N-grams are embedded in longer N-grams (2-grams can be found inside some of the 3-grams and 4-grams).
  • Consequently, our N-gram lists contain some redundancy.
  • Our comparative N-gram analyses focused on 2-grams, but texts contain longer strings of words (3-grams, 4-grams, etc.) and also shorter N-grams (1-grams).
  • From the perspective of construction grammar, we should also talk about those longer/shorter N-grams, as we did in our isolated N-gram analyses.

N-gram analysis: pros and cons 2
• Solution to problem #2
  • We can apply the methods we have used for 2-grams to other N-grams, but is there a simpler way to find both short and long N-grams?
  • That is, can we find 1-grams, 2-grams, 3-grams, 4-grams, etc. all at once?
  • Is there a way to provide a descriptively more efficient analysis of frequently co-occurring combinations of words (constructions)?
  • Our suggestion here is to use network analysis based on 2-grams.
  • Other N-grams emerge in the network, since longer N-grams can show up as chains of linked 2-grams.
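The idea that longer N-grams emerge from a network of 2-grams can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the slides use the R package 'igraph', whereas this uses a plain adjacency map. The 2-gram list is hypothetical, modeled on the AW examples in these slides, and the function names are made up for the illustration.

```python
from collections import defaultdict


def build_network(bigrams):
    """Turn (first_word, second_word) pairs into a directed adjacency map."""
    graph = defaultdict(set)
    for first, second in bigrams:
        graph[first].add(second)
    return graph


def paths_from(graph, start, max_len):
    """List every simple path of 2..max_len words that begins at `start`."""
    found = []

    def walk(path):
        if len(path) >= 2:
            found.append(tuple(path))
        if len(path) == max_len:
            return
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in path:  # do not revisit a word (avoids cycles)
                walk(path + [nxt])

    walk([start])
    return found


# Hypothetical top 2-grams, modeled on the AW examples in the slides.
top_bigrams = [
    ("said", "the"), ("said", "alice"), ("the", "king"),
    ("the", "mock"), ("mock", "turtle"), ("the", "march"), ("march", "hare"),
]
network = build_network(top_bigrams)
candidates = paths_from(network, "said", max_len=4)
```

With this input, `candidates` includes `('said', 'alice')`, `('said', 'the', 'king')`, and `('said', 'the', 'march', 'hare')`. A path is only a candidate N-gram: two frequent 2-grams can link up without the full string ever occurring in the text, so candidates still need to be checked against a concordance.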

Network analysis of AW and HF

Network analysis of AW
Top 96 2-grams [network figure]
A closer look at some N-grams:
• 'said the march hare' (4-gram)
• 'said the mock turtle' (4-gram)
• 'said the king' (3-gram)
• 'said the caterpillar' (3-gram)
• 'said the cat' (3-gram)
• 'said alice' (2-gram)
[the N]/[DEFINITE NOMINAL REFERENCE]
[DIALOG said NPdef/unique]/[TOPICALIZATION OF DIALOG]

Network analysis of HF
Top 99 2-grams [network figure]
• Negation constructions, including double negation

Network analysis (HF)
Two interesting 2-grams of 'i':
• 'i reckon'
• 'says i'
• 'i says'

Network analysis (HF)
Three N-grams of 'and':
• 'and then'
• 'by and by'
• 'and so'
[X and so Y]/[EVENT1 AMOUNTS TO EVENT2]

Network analysis
• At the end of the day, short and long N-grams can be found automatically.
• A list of N-grams can be used to compute p-values with Fisher's exact test, for instance.
• Then, what's good about using network analysis?
• Firstly, it allows us to capture several N-gram types simultaneously without too much redundancy.
• Secondly, the real advantage of network analysis is not just its visual effect: it also tells us a lot about the internal functionality of the network.
• For instance, it is possible to compute the centrality of nodes in a network. Centrality measures estimate how much each node contributes to the functioning of the network as a whole.
• A simple N-gram analysis does not provide answers to these issues.

Centrality
• Betweenness centrality (Dehmer & Basak 2012: 70-1):
  "... the betweenness centrality is based on the shortest path between nodes. ... The betweenness centrality focuses on the number of visits through the shortest paths. If a walker moves from one node to another node via the shortest path, then the nodes with a large number of visits by the walker may have high centrality."

Centrality
• Computing betweenness centrality: Top 15 [figure]

Concluding remarks
• Can N-grams be used as a means of identifying constructions?
  • Yes, partially.
• Is network analysis useful in the identification of constructions?
  • Yes, partially.
• What about functionality?
  • Neither N-grams nor N-gram-based networks tell us much about functionality, as they show us purely formal relations.
  • However, they guide us in terms of connections between words that are salient in a given text and may be indicative of constructions as functional units.
  • We can then look at the discursive behavior of such N-grams and extrapolate constructions and their functionality in the text or discourse (and, depending on the corpus, in general).

Bibliography
Baker, P. (2012). Acceptable bias? Using corpus linguistic methods with critical discourse analysis. Critical Discourse Studies, 9(3): 247-256.
Bergen, B. & K. Binsted (2004). The cognitive linguistics of scalar humor. In M. Achard & S. Kemmer (eds.), Language, Culture, and Mind. Stanford, CA: CSLI. 79-91.
Croft, W. A. (2001). Radical Construction Grammar: Syntactic Theory in Typological Perspective. Oxford: Oxford University Press.
Culpeper, J. (2009). In G. Brône & J. Vandaele (eds.), Cognitive Poetics: Goals, Gains and Gaps. Berlin: Mouton de Gruyter.
Dehmer, M. & S. C. Basak (2012). Statistical and Machine Learning Approaches for Network Analysis. Chichester: Wiley-Blackwell.
Fillmore, C., P. Kay & M. C. O'Connor (1988). Regularity and idiomaticity in grammatical constructions: The case of let alone. Language, 64: 501-38.
Fowler, R. (1977). Linguistics and the Novel. London: Methuen.
Goldberg, A. (1995). Constructions: A Construction Grammar Approach to Argument Structure. Chicago: Chicago University Press.
Hart, C. (2013). Constructing contexts through grammar: Cognitive models and conceptualisation in British newspaper reports on political protests. In J. Flowerdew (ed.), Discourse in Context. London: Bloomsbury. 159-183.
Lambrecht, K. (1990). "What, me worry?": Mad Magazine sentences revisited. BLS, 16: 215-228.
Langacker, R. (1987). Foundations of Cognitive Grammar. Vol. 1: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Lipka, L. & H.-J. Schmid (1994). To begin with: Degrees of idiomaticity, textual functions and pragmatic exploitations of a fixed expression. ZAA, 42: 6-15.
Martínez, N. D. C. (2013). Illocutionary Constructions in English: Cognitive Motivation and Linguistic Realization. Bern: Peter Lang.
Miner, G., J. Elder, T. Hill, R. Nisbet, D. Delen & A. Fast (2012). Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications. Elsevier Academic Press.
Semino, E. & M. Short (2004). Corpus Stylistics: Speech, Writing and Thought Presentation in a Corpus of English Writing. London: Routledge.
Short, M. & G. Leech (2007). A Linguistic Introduction to English Fictional Prose (2nd ed.). Harlow: Pearson Longman.
Stockwell, P. (2002). Cognitive Poetics. London: Routledge.
Stubbs, M. (2009). Technology and phraseology. In U. Römer & R. Schulze (eds.), Exploring the Lexis-Grammar Interface. Amsterdam: John Benjamins. 15-31.