Identity's identities:
An empirical study of the distributional effects of polysemy
Aalborg Languages and Linguistics Research Group Seminar on Language and Identity
Kim Ebensgaard Jensen
CGS, Aalborg University
Introduction
'Identity' is polysemous and has a number of functions:
• 'The informant, whose identity was protected, said that he or she was involved with Lowery and another man …' (COCA 2011 NEWS AssocPress) - [NAME, INFORMATION, BACKGROUND]
• 'Egan has been highly praised for her searching and unconventional narratives about modern angst and identity.' (COCA 2001 NEWS AssocPress) - [PERSONALITY]
• '...it is not always biodiversity per se that explains these relationships, because the identity and biology of the species involved can influence the outcome of the interactions' (COCA 2011 ACAD Bioscience) - [CHARACTERISTICS]
• '...but that she – felt so invested in her identity as a mother' (COCA 2011 SPOK CNN_Behar) - [SOCIAL ROLE]
Introduction
This study provides an analysis of the
distributional behavior of each sense of
'identity' (i.e. a behavioral profile [Gries
2010, fc: §2]).
 Towards an understanding of the lexical
concept(s) of IDENTITY in (American) English.
 Towards an understanding of the concept of
IDENTITY as such in (American) Anglophone
culture(s).

Introduction
DISCLAIMER!
This work is far from complete and has not
yet been fully error checked – the results
should be taken with a grain of salt and, if
anything, merely as documentation of the
method and research work in progress!
Outline
The complexity of identity
 Polysemy and distribution
 Data and method
 Data
 Behavioral profiling
 Findings

The complexity of identity
The lexeme 'identity' and the concept it
covers are tricky. As Fearon (1999) points
out, 'identity' has a number of specialized
uses in the humanities and social sciences.
On top of that, the lexeme figures in everyday
discourse and, of course, in other more
specialized discourses.
The complexity of identity
The problem is that
Our present idea of "identity" is a fairly recent
social construct, and a rather complicated one
at that. Even though everyone knows how to
use the word properly in everyday discourse,
it proves quite difficult to give a short and
adequate summary statement that captures the
range of its present meanings. (Fearon 1999: 4)
The complexity of identity
And...
Given the centrality of the concept to so much
recent research – and especially in social
science where scholars take identities both as
things to be explained and things that have
explanatory force – this amounts almost to a
scandal. At a minimum, it would be useful to
have a concise statement of the meaning of
the word in simple language that does justice
to its present intension. (Fearon 1999: 4)
The complexity of identity
'Identity' is simply caught in the reality of language. The
lexical concept IDENTITY is, in reality, a set of lexical concepts
which are associated with a number of different functions
and different contexts. Consequently, the 'short and
adequate' statement that Fearon (1999: 4) calls for is
ultimately impossible – if it is to do any justice to 'identity' in
all its aspects of use.
However, it is possible to analyze the use of 'identity' and to
get an overview of the lexical concepts it covers and how
they behave in actual language use, deploying techniques of
analysis and description from corpus linguistics.
Polysemy and distribution
Polysemy: when a lexical or constructional
item is associated with two or more
interrelated senses.
The senses are, in the perspective of
cognitive linguistics, organized in prototype
categorial networks (Geeraerts 1997)
Polysemy and distribution
Meaning cannot be directly observed, but “the
distributional characteristics of a linguistic
expression reveal many if not most of its
semantic and functional properties” (Gries
2012: 57)
Polysemy and distribution
The senses of a polysemous item may be
reflected in different distributional patterns.
For instance, the INGESTION sense of 'feed' is
reflected in a preference for intransitive
contexts, and the PROVIDE WITH FOOD / MAKE EAT
sense is reflected in a preference for
transitive contexts.
Data and method
Data:
 Source: COCA (2013)
 2011 section
 Academic texts, newspapers, magazines, fiction, speech
 Query: identity, identities, identity's, identities'
 Hits: 1330 (no genitives found)
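As a rough illustration of what the query covers, the four word forms could be matched in a local plain-text sample along the following lines. This is only a sketch: COCA itself is searched through its own interface, and the pattern, the names QUERY, SAMPLE and HITS, and the choice of a regular expression are assumptions made here, not part of the study.

```python
import re

# Matches the four queried word forms. The genitive forms are listed first
# so that they are not cut short to their base forms.
QUERY = re.compile(r"\b(identity's|identities'|identities|identity)(?!\w)",
                   re.IGNORECASE)

# One of the COCA examples quoted in the introduction.
SAMPLE = ("Egan has been highly praised for her searching and unconventional "
          "narratives about modern angst and identity.")

HITS = [match.group(1).lower() for match in QUERY.finditer(SAMPLE)]
print(HITS)
```

Listing the genitive forms before the base forms matters because regular-expression alternation takes the first alternative that matches.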
Data and method
A behavioral profile is a fine-grained analysis of the
distributional patterns (aka behavior) of a lexical item,
based on the assumption
that distributional similarity reflects, or is indicative of,
functional similarity; our understanding of functional
similarity is rather broad, i.e., encompassing any function
of a particular expression, ranging from syntactic over
semantic to discourse-pragmatic. (Berez & Gries 2009: 159)
Data and method
Three steps of behavioral profiling (BP):
1) Qualitative analysis
 Identification of senses
 ID tagging
2) Quantification of ID tags, using Gries (2010)
3) Evaluation of the quantitative data, using Gries (2010)
Towards a behavioral profile: senses and ID tags
Senses (will definitely have to be reworked!)
• DISTINCT PERSONALITY (DistPersonality)
• INDIVIDUAL CHARACTERISTICS (IndCharacteristics)
• IDENTITY OPERATOR (IdenOperator)
• GROUP MEMBERSHIP/ALIGNMENT (GroupMemb)
• HANDLE OR USERNAME (Handle)
• NAME AND BACKGROUND INFORMATION (NameBack)
• SELF-PERCEPTION (Self)
• GEOGRAPHICAL BELONGING (GeoBel)
• UNIQUENESS (Unique)
• STATE OF EXISTENCE (SoExist)
• PERCEIVED/ASSIGNED IDENTITY (Perce)
• DEFINING FEATURE (DefFeat)
• SOCIAL ROLE (SocRole)
• GENDER AND SEXUAL ORIENTATION (GenderSex)
• CULTURAL DEFINITION OF SOMEONE/SOMETHING (Culture)
Interestingly, no instances of the SAMENESS sense were found.
Towards a behavioral profile: senses and ID tags
ID tagging:
• Each instance of 'identity' was assigned ID tags after identification of its sense.
• Each ID tag is divided into levels, i.e. the specific categories that the ID tag can take.
• Some examples of ID tags and levels:
  • Number (plural, singular)
  • Determiner (definite article, indefinite article, indefinite pronoun, possessive pronoun, zero etc.)
  • Semantics of premodifier (ethnicity, ideology, gender, temporality etc.)
Towards a behavioral profile: senses and ID tags
ID tagging:
• Syntax:
  • Syntactic function
  • Determiner
  • Word class of premodifier
  • Postmodifier
  • Diathesis
• Morphology:
  • Number
  • Function in nominal compound
  • Lexeme in modifier in nominal compound
• Semantics:
  • Semantics of prepositional complement in postmodifying prepositional phrase
  • Semantics of premodifier
• Collocation:
  • Lexeme in head postmodified by 'identity'
  • Lexeme in main verb when 'identity' is subject
  • Lexeme in main verb when 'identity' is passive subject, object or similar syntactic function
• Discourse-pragmatics:
  • Textual function
  • Speech act
  • Domain
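One way to picture the outcome of the tagging step is as one record per corpus hit, with the sense and one level per ID tag. The sketch below is a hypothetical Python representation, not the annotation format actually used in the study. The class name, the field names and the selection of three ID tags are assumptions, and the levels assigned to the example are an illustrative reading of the sentence rather than the author's annotation.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TaggedInstance:
    """One corpus hit of 'identity' with its sense and a few ID tag levels."""
    text: str
    sense: str       # sense label, e.g. "NameBack"
    number: str      # morphology: "singular" or "plural"
    diathesis: str   # syntax: "active", "passive" or "NA"
    domain: str      # discourse-pragmatics: "academic", "fiction", ...


# Example sentence taken from the introduction (COCA 2011 NEWS AssocPress).
# The tag levels below are illustrative assumptions.
EXAMPLE = TaggedInstance(
    text=("The informant, whose identity was protected, said that he or she "
          "was involved with Lowery and another man …"),
    sense="NameBack",
    number="singular",
    diathesis="passive",
    domain="news",
)

print(asdict(EXAMPLE))
```

A flat record of this kind maps directly onto a spreadsheet row, which is the sort of input a quantification step needs: one row per instance, one column per ID tag.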
Quantification of ID tags
After being assigned to identified senses, the ID
tags are quantified.
 Quantification of ID tags is essentially the
identification of association patterns - “the
systematic ways in which linguistic features are
used in association with other linguistic and
non-linguistic features” (Biber et al. 1998: 5).
 The result is the behavioral profile.
 I used BehavioralProfiles 1.01 (Gries 2010).
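The quantification step can be pictured as turning tagged instances into within-sense relative frequencies of each ID tag level. The following is a minimal Python sketch of that idea, not the BehavioralProfiles script itself. The raw counts are back-calculated from the proportions reported for the domain ID tag later in the slides (for example, 53 of 59 Culture instances in academic texts), which is an inference rather than a figure stated by the author. The names COUNTS, ROWS and behavioral_profile are hypothetical.

```python
from collections import Counter, defaultdict

# Domain counts per sense, back-calculated from the reported proportions
# (an inference, not figures given on the slides).
COUNTS = {
    "Culture": {"academic": 53, "fiction": 2, "magazine": 1,
                "news": 1, "spoken": 2},
    "NameBack": {"academic": 40, "fiction": 51, "magazine": 38,
                 "news": 57, "spoken": 58},
}

# Expand the counts into one (sense, level) record per corpus hit,
# mimicking a spreadsheet with one annotated row per instance.
ROWS = [(sense, level)
        for sense, levels in COUNTS.items()
        for level, n in levels.items()
        for _ in range(n)]


def behavioral_profile(rows):
    """Return {sense: {level: relative frequency}} for one ID tag."""
    by_sense = defaultdict(Counter)
    for sense, level in rows:
        by_sense[sense][level] += 1
    return {
        sense: {level: n / sum(counter.values())
                for level, n in counter.items()}
        for sense, counter in by_sense.items()
    }


PROFILE = behavioral_profile(ROWS)
for sense, levels in PROFILE.items():
    print(sense, {level: round(share, 4) for level, share in levels.items()})
```

Because each sense is normalized by its own total, the shares within a sense sum to 1, so senses with very different raw frequencies can still be compared.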

Quantification of ID tags
Output:
Work in progress – not fully error-checked as you can see :-S
Behavioral profile
ID tag: discourse-pragmatics – domain
          Culture          DistPersonality  GenderSex        GeoBel           GroupMemb        NameBack
academic  0.8983050847     0.6373626374     0.7272727273     0.6517857143     0.8246268657     0.1639344262
fiction   0.0338983051     0.0659340659     0.0519480519     0.0446428571     0.0111940299     0.2090163934
magazine  0.0169491525     0.1318681319     0.0649350649     0.1607142857     0.0858208955     0.1557377049
news      0.0169491525     0.0695970696     0.025974026      0.1071428571     0.0634328358     0.2336065574
spoken    0.0338983051     0.0952380952     0.1298701299     0.0357142857     0.0149253731     0.237704918
Behavioral profile
ID tag: syntax – function
          Culture          DistPersonality  GenderSex        GeoBel           GroupMemb        Handle           NameBack
A-PC      0.0677966102     0.1135531136     0.1168831169     0.0714285714     0.1231343284     0                0.0983606557
APP       0                0.0036630037     0                0.0267857143     0                0                0.0040983607
Co        0                0.0073260073     0                0                0                0                0
Cs        0.0338983051     0.021978022      0.025974026      0.0267857143     0.0037313433     0                0.0081967213
INP       0.1186440678     0.043956044      0.1168831169     0.1964285714     0.0820895522     0.0833333333     0.0327868852
Od        0.3050847458     0.2783882784     0.2597402597     0.25             0.2276119403     0.8333333333     0.5204918033
Oi        0                0                0                0                0                0                0
PoM-PC    0.3728813559     0.3626373626     0.2857142857     0.3571428571     0.4141791045     0                0.1352459016
PreM      0.0508474576     0.0476190476     0.012987013      0.0089285714     0.0559701493     0                0.0327868852
S         0.0508474576     0.1208791209     0.1818181818     0.0625           0.0932835821     0.0833333333     0.1680327869

ID tag: syntax – diathesis
          Culture          DistPersonality  GenderSex        GeoBel           GroupMemb        Handle           NameBack
active    0.7796610169     0.8498168498     0.7402597403     0.7410714286     0.7649253731     0.9166666667     0.8442622951
passive   0.0677966102     0.0805860806     0.0909090909     0.0357142857     0.1044776119     0                0.1229508197
NA        0.1525423729     0.0695970696     0.1688311688     0.2232142857     0.1305970149     0.0833333333     0.0327868852
Behavioral profile
ID tag: collocate in head – 'sense'
Culture          0
DefFeat          0
DistPersonality  0.0036630037
GenderSex        0
GeoBel           0.0535714286
GroupMemb        0.026119403
Handle           0
IdenOperator     0
IndChar          0
NameBack         0
Perce            0
Self             0.2222222222
SocRole          0
SoExist          0
Unique           0
Clustering the senses of 'identity'
Behavioral profiles offer a fine-grained view
of the distributional behavior of the senses
of polysemic lexemes.
 This offers several possibilities of gaining
insights into the actual use of the lexeme in
question and, of course, its senses.
 Behavioral profiling may also give us an
idea of the structure of the category network
that relates the senses of the lexeme.

Clustering the senses of 'identity'
• The behavioral profile itself technically also constitutes a usage-based network of senses, as it associates senses with distributional aspects (represented by ID tag levels) – thus offering a complex overview of entrenchment patterns.
• However, because of the complexity and detail, it is difficult to get an overview of how the senses relate to each other in such a network on the basis of a fully fledged behavioral profile.
• Fortunately, there are statistical methods of identifying category structures on the basis of calculations of similarity – such as cluster analysis.
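To make the idea concrete, here is a small Python sketch of agglomerative cluster analysis over behavioral profile vectors. It uses only the six senses and the single ID tag (domain) whose proportions are reported in the behavioral profile slides, rounded to four decimals, so it illustrates the principle and does not reproduce the study's clustering. Euclidean distance and average linkage are assumptions chosen for brevity. The slides do not state which distance measure or amalgamation rule was used, and the names PROFILES, average_linkage and cluster are hypothetical.

```python
from itertools import combinations
from math import dist

# Relative frequencies per domain (academic, fiction, magazine, news, spoken),
# taken from the domain table and rounded to four decimals.
PROFILES = {
    "Culture":         [0.8983, 0.0339, 0.0169, 0.0169, 0.0339],
    "DistPersonality": [0.6374, 0.0659, 0.1319, 0.0696, 0.0952],
    "GenderSex":       [0.7273, 0.0519, 0.0649, 0.0260, 0.1299],
    "GeoBel":          [0.6518, 0.0446, 0.1607, 0.1071, 0.0357],
    "GroupMemb":       [0.8246, 0.0112, 0.0858, 0.0634, 0.0149],
    "NameBack":        [0.1639, 0.2090, 0.1557, 0.2336, 0.2377],
}


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean Euclidean distance between all cross-cluster pairs of senses."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def cluster(profiles):
    """Agglomerative clustering; returns the merges in the order they occur."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters that are currently closest to each other.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], profiles))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b))
    return merges


MERGES = cluster(PROFILES)
for left, right in MERGES:
    print(" + ".join(left), "<->", " + ".join(right))
```

Read from top to bottom, the printed merges correspond to the branches of a dendrogram: senses that merge early have similar distributions across domains, and a sense that joins last stands apart from the rest.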
Concluding remarks
• 'Identity' is polysemous.
• Our (provisional) behavioral profile shows that:
  • IDENTITY is a very complex lexical concept – or set of lexical concepts – strongly associated with factors of distribution.
  • There is a sharp distinction between the NameBack sense and the other senses, with the latter being associated in particular with academic discourse.
  • The NameBack and Handle senses are more frequently direct objects than the other senses, suggesting perhaps a more concrete or entity-like conceptualization.
  • The Self sense collocates with 'sense' (in 'sense of') much more strongly than any other sense.
• Although far from complete and still not fully error-checked, this analysis also shows that behavioral profiling is an empirically powerful way to do lexicological analysis due to its multifactorial nature.
Bibliography
• Berez, Andrea L. & Stefan Th. Gries (2009). In defense of corpus-based methods: A behavioral profile analysis of polysemous get in English. In Steven Moran, Darren S. Tanner & Michael Scanlon (eds.), Proceedings of the 24th Northwest Linguistics Conference. University of Washington Working Papers in Linguistics Vol. 27. Seattle, WA: Department of Linguistics. 157-166.
• Biber, Douglas, Susan Conrad & Randi Reppen (1998). Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
• Davies, Mark (2013). Corpus of Contemporary American English (COCA). corpus.byu.edu/coca/
• Fearon, James D. (1999). What is identity (as we now use the word)? Unpublished manuscript, Department of Political Science, Stanford University.
• Geeraerts, Dirk (1997). Diachronic Prototype Semantics: A Contribution to Historical Lexicology. Oxford: Oxford University Press.
• Gries, Stefan Th. (2010). BehavioralProfiles 1.01: A program for R 2.7.1 and higher.
• Gries, Stefan Th. (2012). Behavioral profiles: A fine-grained and quantitative approach in corpus-based lexical semantics. In Gonia Jarema, Gary Libben & Chris Westbury (eds.), Methodological and Analytic Frontiers in Lexical Research. Amsterdam & Philadelphia: John Benjamins. 57-80.
• Gries, Stefan Th. (fc). Corpus and quantitative linguistics. In John Taylor & Jeanette Littlemore (eds.), Companion to Cognitive Linguistics. London & New York: Bloomsbury.