Lexical and Grammatical Convergence within Communities
Chris Schmader
EECS 470, Northwestern University
June 10, 2011

Linguistic Convergence
• Occurs when a community arrives at a common set of linguistic conventions
  – e.g., using "dog" to refer to a certain type of animal
  – e.g., using word order to encode who/what performed an action and who/what was acted upon
• Question: how does convergence occur, given that…
  – These conventions are arbitrary (i.e., there is no inherent connection between words and their meanings)
  – Communication within large communities typically occurs at the level of dyads (i.e., two people)

Agent-Based Modeling
• Some types of linguistic convergence are difficult to study using experimental methods
  – They occur over many interactions between large numbers of people
  – The costs of assembling large groups of participants are prohibitive
• ABM allows us to:
  1) Create a large community of artificial agents
  2) Observe thousands of attempts to communicate among agents within the community

Lexical Convergence
• Barr (2004) used ABM to show that large communities of agents were able to converge on a common lexicon in a bottom-up fashion
• Barr's (2004) findings: agents converged more quickly on a lexicon when…
  – The number of previous communicative outcomes agents could store (i.e., memory size) was small
  – The number of distinct agents within the community each agent communicated with (i.e., neighborhood size) was relatively small

Grammatical Convergence
• Current model: investigated whether communities of agents would converge on a common lexicon and grammar
• If so, will memory size and neighborhood size also affect grammatical convergence?
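The bottom-up, dyad-level convergence that Barr (2004) demonstrated can be illustrated with a minimal sketch. This is a simplified Python toy, not the NetLogo model described in these slides: the pairing is fully random, there is no memory or neighborhood structure, and the alignment rule (the listener always adopts the speaker's word by swapping two of its own mappings) is an illustrative assumption.

```python
import random

WORDS = ["A", "B", "C", "D"]
MEANINGS = ["circle", "square", "blue", "yellow"]

def make_agent():
    # Each agent starts with an arbitrary word-to-meaning mapping.
    meanings = MEANINGS[:]
    random.shuffle(meanings)
    return dict(zip(WORDS, meanings))

def communicate(speaker, listener, meaning):
    # Success iff both parties use the same word for the meaning.
    word_s = next(w for w, m in speaker.items() if m == meaning)
    word_l = next(w for w, m in listener.items() if m == meaning)
    if word_s == word_l:
        return True
    # On failure, the listener aligns with the speaker by swapping the
    # meanings attached to the two words involved (simplified rule).
    listener[word_l], listener[word_s] = listener[word_s], listener[word_l]
    return False

def converged(agents):
    return all(a == agents[0] for a in agents)

random.seed(1)
agents = [make_agent() for _ in range(10)]
ticks = 0
while not converged(agents) and ticks < 100_000:
    ticks += 1
    speaker, listener = random.sample(agents, 2)
    communicate(speaker, listener, random.choice(MEANINGS))
```

Even with purely dyadic interactions and no global coordination, the population settles on a single shared lexicon; the model in these slides layers memory, movement, and neighborhood structure on top of this basic dynamic.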
The Current Model
• Contains "people" and "objects"
• People communicate about objects that appear in the center of the model environment
• Objects vary along two dimensions: shape (circle or square) and color (blue or yellow)

The Current Model
• People have lists representing:
  – Lexicons of four words ("A," "B," "C," and "D") mapped to four meanings (circle, square, blue, and yellow)
  – Two-position grammars, in which the 1st sentence position encodes shape and the 2nd position encodes color, or vice versa
  – Memories for how well lexical mappings and grammars have performed
• Example:
  Lexicon: ["A" "yellow" false] ["B" "square" false] …
  Grammar: ["shape" "color"]
  Lex-memory: ["A" false true] ["B" true true] …
  Gram-memory: [false true false]

At Setup
• 100 people are created and randomly scattered throughout the environment
• Each person's lexicon and grammar are randomly initialized
  – e.g., "A" = circle, "A" = square, "A" = yellow, "A" = blue (different colors for agents indicate different lexicons)

At Each Tick
• Each person ("speaker") communicates with the closest other person ("interlocutor") who hasn't already spoken on that tick
• Training phase: people generate one-word sentences
  – Sentences refer to an object's shape or its color
  – If the speaker's and interlocutor's sentences mismatch, they switch the word's meaning based on the number of failures stored in memory
  – If a switch occurs, the word's meaning is exchanged with the meaning of the least successful other word
• Example: Person 0's sentence: "A"; Person 63's sentence: "D"

At Each Tick
• The training phase of a model run ends when people have converged on a common lexicon
• Next, people produce two-word sentences based on their grammars
  – If a grammar is "shape-first," the first word refers to shape; if "color-first," the first word refers to color
  – If at least one sentence position contains the same word across speaker and interlocutor, it's a success
  – If not, they record a failure and follow a switching algorithm similar to that used for the lexicon
• Example: Person 0's sentence: "C" "D"; Person 63's sentence: "C" "D"

Sliders
• Memory-size: varies the number of previous outcomes stored in memories for lexical mappings and grammar
  – Ranges from 2 to 10
• Neighborhood-size: varies the radius, in patches, within which a person can move
  – Ranges from 1 to 16
  – A smaller radius means people will interact with a smaller number of distinct others during a model run

Experiment
• Attempted to replicate Barr's (2004) findings on lexical convergence and extend them to grammatical convergence
• Conducted a BehaviorSpace experiment measuring the number of ticks to lexical convergence and to grammatical convergence
  – Varied memory-size from 2 to 10
  – Varied neighborhood-size, with settings of 2, 4, 6, 8, 10, 12, 14, and 16
  – Conducted 10 model runs at each combination of memory-size and neighborhood-size settings
• Predictions: lexical and grammatical convergence will occur more quickly at low settings of memory-size and middle settings of neighborhood-size

Results (Lexical Convergence)
[Figure: Mean Ticks to Converge on a Lexicon — ticks (0–6000) by memory size (2–10), with lines for neighborhood sizes 2, 10, and 16]
• Trend toward slower convergence at the smallest neighborhood size
• Trend toward quicker convergence at the smallest memory sizes
• Reliably slower convergence at neighborhood size 2 when memory size was 2 or 3

Results (Grammatical Convergence)
[Figure: Mean Ticks to Converge on a Grammar — ticks (0–1000) by memory size (2–10), with lines for neighborhood sizes 2, 10, and 16]
• Analyzed the ticks needed after lexical convergence for grammatical convergence to occur
• Small trend toward slower grammatical convergence at neighborhood size 2
• No indication of differences in convergence across different levels of memory size

Discussion
• Results replicated Barr's (2004) findings on the effect of memory size on lexical convergence
  – Smaller memory sizes led to quicker convergence
  – Likely due to the fact that storing too many failures leads people
to switch mappings too often
• Results did not replicate Barr's findings on neighborhood size
  – May be due to differences across studies in how the models operationalized neighborhood size

Discussion
• No indication that memory size or neighborhood size affects grammatical convergence
  – Likely due to the fact that the grammars in the current model were too simple for these effects to emerge
  – Future work will attempt to extend the model to more complex grammars
• Overall, results demonstrate that communities can converge on a lexicon and grammar in a bottom-up fashion, provided they establish lexical mappings first

References
• Barr, D. J. (2004). Establishing conventional communication systems: Is common knowledge necessary? Cognitive Science, 28, 937–962.
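As an appendix-style illustration, the two-word grammar phase described under "At Each Tick" can be sketched as follows. This is a simplified Python toy, not the NetLogo implementation: the shared lexicon is assumed (i.e., the training phase is already done), and the `Agent` class, `MEMORY_SIZE` constant, and the exact switching rule (flip word order when remembered failures outnumber remembered successes) are illustrative assumptions.

```python
import random
from collections import deque

# Shared post-training lexicon (assumption): word -> meaning.
LEXICON = {"A": "circle", "B": "square", "C": "blue", "D": "yellow"}
WORD_FOR = {m: w for w, m in LEXICON.items()}

MEMORY_SIZE = 4  # analogue of the memory-size slider

class Agent:
    def __init__(self):
        self.grammar = random.choice(["shape-first", "color-first"])
        self.memory = deque(maxlen=MEMORY_SIZE)  # True = success

    def sentence(self, shape, color):
        # Word order is determined by the agent's grammar.
        words = [WORD_FOR[shape], WORD_FOR[color]]
        return words if self.grammar == "shape-first" else words[::-1]

def interact(a, b, shape, color):
    sa, sb = a.sentence(shape, color), b.sentence(shape, color)
    # Success if at least one position holds the same word in both sentences.
    success = sa[0] == sb[0] or sa[1] == sb[1]
    for agent in (a, b):
        agent.memory.append(success)
        # Simplified switching rule: flip word order when remembered
        # failures outnumber remembered successes.
        if not success and agent.memory.count(False) > agent.memory.count(True):
            agent.grammar = ("color-first" if agent.grammar == "shape-first"
                             else "shape-first")
    return success
```

With a shared lexicon, two shape-first speakers always match, while a shape-first and a color-first speaker never do; repeated failures therefore push the community toward a single word-order convention.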