Identity's identities: An empirical study of the distributional effects of polysemy
Aalborg Languages and Linguistics Research Group
Seminar on Language and Identity
Kim Ebensgaard Jensen, CGS, Aalborg University

Introduction

'Identity' is polysemous and has a number of functions:
• 'The informant, whose identity was protected, said that he or she was involved with Lowery and another man …' (COCA 2011 NEWS AssocPress) - [NAME, INFORMATION, BACKGROUND]
• 'Egan has been highly praised for her searching and unconventional narratives about modern angst and identity.' (COCA 2001 NEWS AssocPress) - [PERSONALITY]
• '...it is not always biodiversity per se that explains these relationships, because the identity and biology of the species involved can influence the outcome of the interactions' (COCA 2011 ACAD Bioscience) - [CHARACTERISTICS]
• '...but that she – felt so invested in her identity as a mother' (COCA 2011 SPOK CNN_Behar) - [SOCIAL ROLE]

This study provides an analysis of the distributional behavior of each sense of 'identity' (i.e. a behavioral profile [Gries 2010, fc: §2]).
• Towards an understanding of the lexical concept(s) of IDENTITY in (American) English.
• Towards an understanding of the concept of IDENTITY as such in (American) Anglophone culture(s).

DISCLAIMER! This work is far from complete and has not yet been fully error-checked – the results should be taken with a grain of salt and, if anything, merely as documentation of the method and research work in progress!

Outline

• The complexity of identity
• Polysemy and distribution
• Data and method
  • Data
  • Behavioral profiling
• Findings

The complexity of identity

The lexeme 'identity' and the concept it covers are tricky.
As Fearon (1999) points out, 'identity' has a number of specialized uses in the humanistic and social sciences. On top of that, the lexeme figures in everyday discourse and, of course, in other more specialized discourses.

The problem is that:

  Our present idea of "identity" is a fairly recent social construct, and a rather complicated one at that. Even though everyone knows how to use the word properly in everyday discourse, it proves quite difficult to give a short and adequate summary statement that captures the range of its present meanings. (Fearon 1999: 4)

And:

  Given the centrality of the concept to so much recent research – and especially in social science where scholars take identities both as things to be explained and things that have explanatory force – this amounts almost to a scandal. At a minimum, it would be useful to have a concise statement of the meaning of the word in simple language that does justice to its present intension. (Fearon 1999: 4)

'Identity' is simply caught in the reality of language. The lexical concept IDENTITY is, in reality, a set of lexical concepts which are associated with a number of different functions and different contexts. Consequently, the 'short and adequate' statement that Fearon (1999: 4) calls for is ultimately impossible if it is to do justice to 'identity' in all its aspects of use. However, using techniques of analysis and description from corpus linguistics, it is possible to analyze the use of 'identity' and to get an overview of the lexical concepts it covers and of how they behave in actual language use.

Polysemy and distribution

Polysemy: when a lexical or constructional item is associated with two or more interrelated senses.
The senses are, in the perspective of cognitive linguistics, organized in prototype categorial networks (Geeraerts 1997).

Meaning cannot be directly observed, but “the distributional characteristics of a linguistic expression reveal many if not most of its semantic and functional properties” (Gries 2012: 57).

The senses of a polysemic item may be reflected in different distributional patterns. For instance, the INGESTION sense of 'feed' is reflected in a preference for intransitive contexts, and the PROVIDE WITH FOOD / MAKE EAT sense is reflected in a preference for transitive contexts.

Data and method

Data:
• Source: COCA (2013), 2011 section
• Academic texts, newspapers, magazines, fiction, speech
• Query: identity, identities, identity's, identities'
• Hits: 1330 (no genitives found)

A behavioral profile is a fine-grained analysis of the distributional patterns (a.k.a. behavior) of a lexical item based on the assumption that distributional similarity reflects, or is indicative of, functional similarity; our understanding of functional similarity is rather broad, i.e., encompassing any function of a particular expression, ranging from syntactic over semantic to discourse-pragmatic. (Berez & Gries 2009: 159)

Three steps of behavioral profiling (BP):
1) Qualitative analysis
   • Identification of senses
   • ID tagging
2) Quantification of ID tags – using Gries (2010)
3) Evaluation of quantitative data – using Gries (2010)

Towards a behavioral profile: senses and ID tags

Senses (will definitely have to be reworked!):
• DISTINCT PERSONALITY (DistPersonality)
• INDIVIDUAL CHARACTERISTICS (IndCharacteristics)
• IDENTITY OPERATOR (IdenOperator)
• GROUP MEMBERSHIP/ALIGNMENT (GroupMemb)
• HANDLE OR USERNAME (Handle)
• NAME AND BACKGROUND INFORMATION (NameBack)
• SELF-PERCEPTION (Self)
• GEOGRAPHICAL BELONGING (GeoBel)
• UNIQUENESS (Unique)
• STATE OF EXISTENCE (SoExist)
• PERCEIVED/ASSIGNED IDENTITY (Perce)
• DEFINING FEATURE (DefFeat)
• SOCIAL ROLE (SocRole)
• GENDER AND SEXUAL ORIENTATION (GenderSex)
• CULTURAL DEFINITION OF SOMEONE/SOMETHING (Culture)

Interestingly, no instances of the SAMENESS sense were found.

ID tagging: Each instance of 'identity' was assigned ID tags after identification of its sense. Each ID tag is divided into levels, i.e. the specific categories that the tag can take.

Some examples of ID tags and levels:
• Number (plural, singular)
• Determiner (definite article, indefinite article, indefinite pronoun, possessive pronoun, zero etc.)
• Semantics of premodifier (ethnicity, ideology, gender, temporality etc.)
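One way to picture the tagging scheme is as a small record per corpus hit: the sense it was assigned plus one level for each ID tag. The Python sketch below is only an illustration under stated assumptions. The names `TaggedInstance`, `ALLOWED_LEVELS` and `validate` are hypothetical and not part of the annotation setup actually used in the study, and only the levels named in the examples above are included.

```python
from dataclasses import dataclass, field

# Hypothetical subset of ID tags and their levels, taken from the examples above.
ALLOWED_LEVELS = {
    "number": {"singular", "plural"},
    "determiner": {
        "definite article", "indefinite article",
        "indefinite pronoun", "possessive pronoun", "zero",
    },
}


@dataclass
class TaggedInstance:
    """One corpus hit of 'identity' with its sense and its ID tag levels."""
    text: str
    sense: str
    tags: dict = field(default_factory=dict)


def validate(instance):
    """Return a list of problems; an empty list means every tag has a known level."""
    problems = []
    for tag, level in instance.tags.items():
        if tag not in ALLOWED_LEVELS:
            problems.append(f"unknown ID tag: {tag}")
        elif level not in ALLOWED_LEVELS[tag]:
            problems.append(f"unknown level for {tag}: {level}")
    return problems


# The SOCIAL ROLE example from the introduction, tagged for two ID tags.
example = TaggedInstance(
    text="felt so invested in her identity as a mother",
    sense="SocRole",
    tags={"number": "singular", "determiner": "possessive pronoun"},
)
print(validate(example))
```

Keeping the allowed levels in one table makes it easy to catch typos in the annotation before any counting is done.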
The ID tags, grouped by area of description:

Syntax:
• Syntactic function
• Determiner
• Word class of premodifier
• Postmodifier
• Diathesis

Morphology:
• Number
• Function in nominal compound
• Lexeme in modifier in nominal compound

Semantics:
• Semantics of prepositional complement in postmodifying prepositional phrase
• Semantics of premodifier

Collocation:
• Lexeme in head postmodified by 'identity'
• Lexeme in main verb when 'identity' is subject
• Lexeme in main verb when 'identity' is passive subject, object or similar syntactic function

Discourse-pragmatics:
• Textual function
• Speech act
• Domain

Quantification of ID tags

After being assigned to identified senses, the ID tags are quantified. Quantification of ID tags is essentially the identification of association patterns - “the systematic ways in which linguistic features are used in association with other linguistic and nonlinguistic features” (Biber et al. 1998: 5). The result is the behavioral profile. The quantification was carried out with BehavioralProfiles 1.01 (Gries 2010).
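The quantification step can be sketched as follows. This is a minimal Python illustration, not the BehavioralProfiles program itself: it assumes that the profile consists of within-sense relative frequencies, so that the levels of one ID tag sum to 1 for each sense. The function name `profile` and the toy annotations are invented for the example.

```python
from collections import Counter, defaultdict


def profile(instances, tag):
    """Relative frequency of each level of `tag` within each sense."""
    counts = defaultdict(Counter)
    for sense, tags in instances:
        counts[sense][tags[tag]] += 1
    result = {}
    for sense, level_counts in counts.items():
        total = sum(level_counts.values())
        result[sense] = {lvl: n / total for lvl, n in level_counts.items()}
    return result


# Invented toy annotations as (sense, {ID tag: level}); not data from the study.
toy = [
    ("NameBack", {"number": "singular"}),
    ("NameBack", {"number": "singular"}),
    ("NameBack", {"number": "plural"}),
    ("Self", {"number": "singular"}),
]

for sense, levels in profile(toy, "number").items():
    print(sense, levels)
```

Running the same function once per ID tag and stacking the results gives one vector of proportions per sense, which is the form that a later similarity comparison needs.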
Output (figure not reproduced): work in progress, not fully error-checked.

Behavioral profile

ID tag: discourse-pragmatics – domain

          Culture          DistPersonality  GenderSex        GeoBel           GroupMemb        NameBack
academic  0.8983050847     0.6373626374     0.7272727273     0.6517857143     0.8246268657     0.1639344262
fiction   0.0338983051     0.0659340659     0.0519480519     0.0446428571     0.0111940299     0.2090163934
magazine  0.0169491525     0.1318681319     0.0649350649     0.1607142857     0.0858208955     0.1557377049
news      0.0169491525     0.0695970696     0.025974026      0.1071428571     0.0634328358     0.2336065574
spoken    0.0338983051     0.0952380952     0.1298701299     0.0357142857     0.0149253731     0.237704918

ID tag: syntax – function

          Culture          DistPersonality  GenderSex        GeoBel           GroupMemb        Handle           NameBack
A-PC      0.0677966102     0.1135531136     0.1168831169     0.0714285714     0.1231343284     0                0.0983606557
APP       0                0.0036630037     0                0.0267857143     0                0                0.0040983607
Co        0                0.0073260073     0                0                0                0                0
Cs        0.0338983051     0.021978022      0.025974026      0.0267857143     0.0037313433     0                0.0081967213
INP       0.1186440678     0.043956044      0.1168831169     0.1964285714     0.0820895522     0.0833333333     0.0327868852
Od        0.3050847458     0.2783882784     0.2597402597     0.25             0.2276119403     0.8333333333     0.5204918033
Oi        0                0                0                0                0                0                0
PoM-PC    0.3728813559     0.3626373626     0.2857142857     0.3571428571     0.4141791045     0                0.1352459016
PreM      0.0508474576     0.0476190476     0.012987013      0.0089285714     0.0559701493     0                0.0327868852
S         0.0508474576     0.1208791209     0.1818181818     0.0625           0.0932835821     0.0833333333     0.1680327869

ID tag: syntax – diathesis

          Culture          DistPersonality  GenderSex        GeoBel           GroupMemb        Handle           NameBack
active    0.7796610169     0.8498168498     0.7402597403     0.7410714286     0.7649253731     0.9166666667     0.8442622951
passive   0.0677966102     0.0805860806     0.0909090909     0.0357142857     0.1044776119     0                0.1229508197
NA        0.1525423729     0.0695970696     0.1688311688     0.2232142857     0.1305970149     0.0833333333     0.0327868852

ID tag: collocate in head – 'sense'

Culture          0
DefFeat          0
DistPersonality  0.0036630037
GenderSex        0
GeoBel           0.0535714286
GroupMemb        0.026119403
Handle           0
IdenOperator     0
IndChar          0
NameBack         0
Perce            0
Self             0.2222222222
SocRole          0
SoExist          0
Unique           0

Clustering the senses of 'identity'

Behavioral profiles offer a fine-grained view of the distributional behavior of the senses of polysemic lexemes. This opens up several ways of gaining insight into the actual use of the lexeme in question and, of course, its senses. Behavioral profiling may also give us an idea of the structure of the category network that relates the senses of the lexeme.

The behavioral profile itself technically also constitutes a usage-based network of senses, as it associates senses with distributional aspects (represented by ID tag levels), thus offering a complex overview of entrenchment patterns. However, because of the complexity and detail of a fully fledged behavioral profile, it is difficult to use it to get an overview of how the senses relate to each other in such a network. Fortunately, there are statistical methods of identifying category structures on the basis of calculations of similarity, such as cluster analysis.
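To make the idea of similarity-based clustering concrete, here is a small Python sketch that clusters six senses on the basis of the domain proportions from the behavioral profile alone. It is an illustration under stated assumptions, not the procedure used for the study: Euclidean distance and average linkage are chosen here for simplicity, the function names are invented, and the proportions are the provisional values reported in the domain table, rounded to four decimals.

```python
from itertools import combinations
from math import dist

# Domain proportions per sense (academic, fiction, magazine, news, spoken),
# copied from the domain table and rounded to four decimals (provisional data).
PROFILES = {
    "Culture":         (0.8983, 0.0339, 0.0169, 0.0169, 0.0339),
    "DistPersonality": (0.6374, 0.0659, 0.1319, 0.0696, 0.0952),
    "GenderSex":       (0.7273, 0.0519, 0.0649, 0.0260, 0.1299),
    "GeoBel":          (0.6518, 0.0446, 0.1607, 0.1071, 0.0357),
    "GroupMemb":       (0.8246, 0.0112, 0.0858, 0.0634, 0.0149),
    "NameBack":        (0.1639, 0.2090, 0.1557, 0.2336, 0.2377),
}


def average_linkage(a, b):
    """Mean Euclidean distance over all cross-cluster pairs of senses."""
    pairs = [(x, y) for x in a for y in b]
    return sum(dist(PROFILES[x], PROFILES[y]) for x, y in pairs) / len(pairs)


def cluster(names):
    """Agglomerative clustering; returns the merges in the order performed."""
    clusters = [frozenset([n]) for n in names]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters that are closest under average linkage.
        a, b = min(combinations(clusters, 2), key=lambda p: average_linkage(*p))
        merges.append((a, b))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


merges = cluster(PROFILES)
for a, b in merges:
    print(sorted(a), "+", sorted(b))
```

With these figures NameBack is the last sense to be merged, since its domain distribution is far from that of the other five senses. A full analysis would use all ID tags rather than one, and the choice of distance measure and linkage method can change the resulting tree.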
Concluding remarks

'Identity' is polysemic. Our (provisional) behavioral profile shows that:
• IDENTITY is a very complex lexical concept, or set of lexical concepts, strongly associated with factors of distribution.
• There is a sharp distinction between the NameBack sense and the other senses, with the latter being associated in particular with academic discourse.
• The NameBack and Handle senses are more frequently direct objects than the other senses, suggesting perhaps a more concrete or entity-like conceptualization.
• The Self sense collocates with 'sense' (in 'sense of') much more strongly than any other sense.

Although far from complete and still not fully error-checked, this analysis also shows that behavioral profiling is an empirically powerful way to do lexicological analysis due to its multifactorial nature.

Bibliography

Berez, Andrea L. & Stefan Th. Gries (2009). In defense of corpus-based methods: A behavioral profile analysis of polysemous get in English. In Steven Moran, Darren S. Tanner & Michael Scanlon (eds.), Proceedings of the 24th Northwest Linguistics Conference. University of Washington Working Papers in Linguistics Vol. 27. Seattle, WA: Department of Linguistics. 157-166.
Biber, Douglas, Susan Conrad & Randi Reppen (1998). Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
Davies, Mark (2013). Corpus of Contemporary American English (COCA). corpus.byu.edu/coca/
Fearon, James D. (1999). What is identity (as we now use the word)? Unpublished manuscript, Department of Political Science, Stanford University.
Geeraerts, Dirk (1997). Diachronic Prototype Semantics: A Contribution to Historical Lexicology. Oxford: Oxford University Press.
Gries, Stefan Th. (2010). BehavioralProfiles 1.01: A program for R 2.7.1 and higher.
Gries, Stefan Th. (2012). Behavioral profiles: A fine-grained and quantitative approach in corpus-based lexical semantics. In Gonia Jarema, Gary Libben & Chris Westbury (eds.), Methodological and Analytic Frontiers in Lexical Research. Amsterdam & Philadelphia: John Benjamins. 57-80.
Gries, Stefan Th. (fc). Corpus and quantitative linguistics. In John Taylor & Jeannette Littlemore (eds.), Companion to Cognitive Linguistics. London & New York: Bloomsbury.