Computational Models of Discourse Analysis
Carolyn Penstein Rosé
Language Technologies Institute / Human-Computer Interaction Institute

Warm-Up
• Read the posts and be ready to discuss what you see as the takeaways for computationalization of discourse analysis from today’s readings…
• What are the computational implications of the debate between DA and CA?
• Note: These preparatory activities were rated as least beneficial to students, so…
• We will start the lecture/discussion at exactly 12:05pm. Please be on time and ready to discuss!

Early Course Evaluation
• Good news: everyone rated the lectures/discussions as valuable and engaging
• Things to improve:
  • Decrease preparation time
    • Change: only one discussion thread per week, but continue to use it throughout the week; include different options for response that require different amounts of time
    • Change: frontload readings for Monday; further divide readings into required, extra, and supplemental
  • More focus on fewer concepts for the remainder of the semester

This week: Sections 7.2-7.7 are most important
* Not required!!!

Chicken and Egg…
• Operationalization
• Main issue for this week: Exploring sequencing and linking between speech acts in conversation
• Computationalization
* Where do the ordering constraints come from? Is it the language? Or is it what is behind the language (e.g., intentions, task structure)? If the latter, how do we computationalize that?

Reminder from last time RE Constraint from Ordering
• Inform is the most common class (37.4%)
• Next most frequent is Assess (18.5%)
• With bigrams, if we look for conditional probabilities above 25%:
  • The only case where the most likely next class is not Inform is ElicitAssessment, which is followed by Assessment 36% of the time
  • It is followed by Inform 33% of the time
  • It only occurs about 1% of the time
• Trigrams might be better, but this makes ordering information look pretty useless

More on what was least valuable (student quotes)
• Nice job on the homeworks!!!
• I saw SO much improvement over the several posts and finally the assignment.
• The forum prompt mini-assignments seem unbalanced in proportion to the homework - by the time the "real" homework came along, I felt I had done ten times more work on my posts already.

The Homework Assignment 2 (not due until Feb 23)
• Look at the Maptask dataset and Negotiation coding that is provided
• Think about what distinguishes the codes at a linguistic level
• Do an error analysis on the dataset using a simple unigram baseline, and from that propose one or a few new types of features motivated by your linguistic understanding of the Negotiation framework
• Due on Week 7 lecture 2
  • Turn in data, your feature extractors (documented code), and a formal write-up of your experimentation
  • Have a 5-minute PowerPoint presentation ready for class on Week 7 lecture 2

Interesting Observation!
• Responses can address either illocutions or perlocutions
• Perlocutions are much less constrained
• Accounts for some of the difficulty in imposing ordering constraints
• Argues in favor of thinking about conversation as organized around intentions and tasks rather than linguistic categories
• Wednesday’s readings will argue just the opposite!!
• Are illocutions just the wrong categories??

Discourse Analysis vs Conversation Analysis (according to Levinson)
Discourse Analysis (DA)
• Rules, formulas, more typical of linguistics and philosophers
• Categories, contingencies, grammars
• Use of a small but strategic amount of data
• Accused of “premature” theory construction
• Martin & Rose, Levinson
Conversation Analysis (CA)
• More rigorously empirical and inductive
• Focus on what is found in data, not on what is expected to be found or would sound odd
• Hesitant to make generalizations / Accused of being atheoretical
• Questions about whether the rules “work” on real data
* Is it a question about the nature of language (is there a fundamental segmentation difference between utterances and acts?), or is it a question about research methodology? Are these linked?
• The nature of what we are modeling
• What we can know about it and how certain we can be
• How we learn what we know
  • Rules, like speech acts
  • Qualitative observations, anthropology style

… And now for Elijah’s SIDE presentation

Questions?
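
Backup: checking the ordering constraint with bigrams
The "Reminder from last time" slide looks at how often one speech act class follows another. The sketch below is one way such conditional probabilities could be estimated from sequences of class labels. It is only an illustration: the function names and the toy dialogues are made up for this example and are not the course data or the analysis behind the percentages on the slide. The 25% cutoff is the one mentioned on the slide.

```python
from collections import Counter, defaultdict


def bigram_conditional_probs(sequences):
    """Estimate P(next_label | current_label) from sequences of labels."""
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each label with the label that immediately follows it.
        for current, nxt in zip(seq, seq[1:]):
            pair_counts[current][nxt] += 1
    probs = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs


def transitions_above(probs, threshold=0.25):
    """List (current, next, probability) triples above the threshold."""
    return sorted(
        (current, nxt, p)
        for current, followers in probs.items()
        for nxt, p in followers.items()
        if p > threshold
    )


# Toy, made-up label sequences (hypothetical, NOT the course dataset).
toy_dialogues = [
    ["Inform", "Inform", "Assess", "Inform"],
    ["ElicitAssessment", "Assess", "Inform", "Inform"],
    ["Inform", "Assess", "Inform"],
]

probs = bigram_conditional_probs(toy_dialogues)
for current, nxt, p in transitions_above(probs):
    print(f"P({nxt} | {current}) = {p:.2f}")
```

If nearly every class is most likely to be followed by the single most frequent class, as the slide reports for Inform, the bigram table adds little beyond the class prior, which is the sense in which ordering information "looks pretty useless" there.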