Computational Linguistics Analysis of Charismatic Speech: Cross-Cultural and Political Perspectives Andrew Rosenberg NLP & Psychology 11/12/2015 Collaborators • original slides by Julia Hirschberg • Erica Cooper, Svetlana Stenchicova Stoyanchev, Wisam Dakka, Aron Wahl, Judd Sheinholtz, Fadi Biadsy, Rolf Carlson, Eva Strangert, Ian Kaplan, William (Yang) Wang Listening Experiment 1 2 3 4 5 Which of these speakers sound charismatic? 1. Vladimir Lenin 2. Franklin D. Roosevelt 4. Warren G. Harding 3. Mao Zedong 5. Adolf Hitler What is Charisma? What is Charisma? • The ability to attract, and retain followers by virtue of personal characteristics -- not traditional or political office (Weber ‘47) – E.g. Gandhi, Hitler, Castro, Martin Luther King Jr.,.. – Personalismo • What makes an individual charismatic? (Bird ’93, Boss ’76, Dowis ’00, Marcus ’67, Touati ’93, Tuppen ’74, Weber ‘47) – Their message? – Their personality? – Their speaking style? What is Charismatic Speech? • Circularly… – Speech that leads listeners to perceive the speaker as charismatic • What aspects of speech might contribute to the perception of a speaker as charismatic? – Content of the message? – Lexico-syntactic features? – Acoustic-prosodic features? Our Goals • Understand which aspects of speech are correlated with observer perceptions of charisma • To identify charismatic in political leaders – current and future • To create more charismatic text-to-speech synthesizers Our Approach • Collect tokens of charismatic and non-charismatic speech from a small set of speakers on a small set of topics • Ask listeners to rate the ‘The speaker is charismatic’ plus statements about a number of other attributes (e.g. The speaker is …boring, charming, persuasive,…) • Correlate listener ratings with lexico-syntactic and acoustic-prosodic features of the tokens to identify potential cues to perceptions of charisma Outline • Charisma in American English political speech • Comparison with text: what is said vs. how it is said • Charisma in Arabic speech • Cross-cultural perceptions of charisma • Charisma in recent American politics: – Democrats in 2008 – Republican today Outline • Charisma in American English political speech • Comparison with text: what is said vs. how it is said • Charisma in Arabic speech • Cross-cultural perceptions of charisma • Charisma in recent American politics: – Democrats in 2008 – Republican today American English Perception Study • Data: 45 2-30s speech segments, 5 each from 9 candidates for Democratic nomination for U.S. president in 2004 – Liberman, Kucinich, Clark, Gephardt, Dean, Moseley Braun, Sharpton, Kerry, Edwards – 2 ‘charismatic’, 2 ‘not charismatic’ – Topics: greeting, reasons for running, tax cuts, postwar Iraq, healthcare – 4 genres: stump speeches, debates, interviews, ads • 8 subjects rated each segment on a Likert scale (15) 26 questions in a web survey • Duration: avg. 1.5 hrs, min 45m, max ~3hrs Results: How Much Do Subjects Agree with Each Other? • Over all statements? – Using Cohen’s) kappa statistic with quadratic weighting, mean = 0.207 • On the charismatic statement? • = 0.232 (8th most agreed upon statement) • By token? – No significant differences across all tokens although Edwards and Liberman tokens show significantly more agreement across all statements • By statement? – Individual statements demonstrate significantly different agreements (most agreement: The speaker is accusatory, angry, passionate, intense; least agreement: The speaker is trustworthy, believable, reasonable, trustworthy) Results: What Do Subjects Mean by Charismatic? • Which other statements are most closely correlated with the charismatic statement? (determined by kappa): a functional definition The speaker is enthusiastic 0.620 The speaker is persuasive 0.577 The speaker is charming 0.575 The speaker is passionate 0.543 The speaker is boring -0.513 The speaker is convincing 0.499 Results: Does Whether a Subject Agrees with the Speaker or Finds the Speaker ‘Clear’ Affect Charisma Judgments • Whether a subject agrees with a token does not correlate highly with charisma judgments ( = 0.30) • Whether a subject finds the token clear does not correlate highly with charisma judgments ( = 0.26) Results: Does the Identity of the Speaker Affect Judgments of Charisma? • There is a significant difference between speakers (p=2.20e-2) • Most charismatic – Rep. John Edwards (mean 3.86) – Rev. Al Sharpton (3.56) – Gov. Howard Dean (3.40) • Least charismatic – Sen. Joseph Lieberman (2.42) – Rep. Dennis Kucinich (2.65) – Rep. Richard Gephardt (2.93) Results: Does Recognizing a Speaker Affect Judgments of Charisma? • Subjects asked to identify which, if any, speakers they recognized at the end of the study. • Mean number of speakers believed to have been recognized, 5.8 • Subjects rated ‘recognized’ speakers as significantly more charismatic than those they did not believe they had recognized (mean 3.39 vs. mean 3.30). Results: Does Genre or Topic Affect Judgments of Charisma? • Recall that tokens were taken from debates, interviews, stump speeches, and campaign ads – Genre does influence charisma ratings (p=.0004) • Stump speeches were the most charismatic (3.38) • Interviews were the least (2.96) • Topic does affect ratings of charisma significantly (p=.0517) – Healthcare > post-war Iraq > reasons for running neutral > taxes What makes Speech Charismatic? Features Examined • Duration (secs, words, syls) • Charismatic speech is personal: Pronoun density • Charismatic speech is contentful: Function/content word ratio • Charismatic speech is simple: Complexity: mean syllables/word (Dowis) • Disfluencies • Repeated words • Min, max, mean, stdev F0 (Boss, Tuppen) – Raw and normalized by speaker • Min, max, mean, stdev intensity • Speaking rate (syls/sec) • Intonational features: – Pitch accents – Phrasal tones – Contours Results: Lexico-Syntactic Correlates of Charisma • Length: Greater number of words positively correlates with charisma (r=.13; p=.002) • Personal pronouns: – Density of first person plural and third person singular pronouns positively correlates with charisma (r=.16, p=0; r=.16, p=0) – Third person plural pronoun density correlates negatively with charisma (r=-.19,p=0) • Content: Ratio of adjectives/all words negatively correlates with charisma (r=-.12,p=.008) • Complexity: Higher mean syllables/word positively correlates with charisma (p=.034) • Disfluency: greater % negatively correlates with charisma (r=-.18, p=0) • Repetition: Proportion of repeated words positively correlates with charisma (r=.12, p=.004) Results: Acoustic-Prosodic Correlates of Charisma • Pitch: – Higher F0 (mean, min, mean HiF0, over male speakers) positively correlates with charisma (r=.24,p=0;r=.14 p=0;r=.20,p=0) • Loudness: Mean rms and sdev of mean rms positively correlates with charisma (r=.21,p=0;r=.21,p=0) • Speaking Rate: – Faster overall rate (voice/unvoiced frames) positively correlates with charisma (r=.16,p=0) • Duration: Longer duration correlates positively with charisma (r=.09,p=.037) • Length of pause: sdev negatively correlates with charisma (r=-.09,p=.004) Summary for American English • In Standard American English, charismatic speakers tend to be those also highly rated for enthusiasm, charm, persuasiveness, passionateness and convincingness – they are not thought to be boring • Charismatic utterances tend to be longer than others, to contain a lower ratio of adjectives to all words, a higher density of first person plural and third person singular pronouns and fewer third person plurals, fewer disfluencies, a larger percentage of repeated words, and more complex words than non-charismatic utterances • Charismatic utterances are higher in pitch (mean, min) with more regularity in pause length, faster, louder with more variation in intensity Outline • Charisma in American English political speech • Comparison with text: what is said vs. how it is said • Charisma in Arabic speech • Cross-cultural perceptions of charisma • Charisma in recent American politics: – Democrats in 2008 – Republican today Replication of Perception Study from Text Alone • Lower statement agreement, much less on charismatic statement, different speakers most/least charismatic (Clark most, Sharpton least) • `Agreement with speaker’, genre and topic had stronger correlations with charisma • Lexico-syntactic features show weaker correlations – 1st person pronoun density negatively correlated and complexity not at all – Similar to speech experiment for duration, function/content, disfluencies, repeated words Outline • Charisma in American English political speech • Comparison with text: what is said vs. how it is said • Charisma in Arabic speech • Cross-cultural perceptions of charisma • Charisma in recent American politics: – Democrats in 2008 – Republican today Hypothesis: Perception of Charisma Differs by Culture • People of different languages and cultures perceive charisma differently • In particular, they perceive charismatic speech differently – Do Arabic listeners respond to American politicians the same way Americans do? – Do Americans respond to Arabic politicians the way Arabs do? Palestinian Arabic Perception Study • Same paradigm as for SAE • Materials: – 44 speech tokens from 22 male native-Palestinian Arabic speakers taken from Al-Jazeera TV talk shows – Two speech segments extracted for each speaker from the same topic (one we thought charismatic and one not) • Web form with statements to be rated translated into Arabic • Subjects: 12 native speakers of Palestinian Arabic How Does Charisma Differ in Arabic? • Subjects agree on judgments a bit more (κ=.225) than for English (κ=.207) but still low – Agree most on clarity of msg, enthusiasm, charisma, intensity – all differing from Americans – Agree least on desperation (as SAE), friendliness, ordinariness, spontaneity of speaker – Charisma statement correlates (positively) most strongly with speaker toughness, powerfulness, persuasiveness, charm, and enthusiasm and negatively with boringness • Role of speaker identity important in judgments of charisma in Arabic as in English – Most charismatic speakers: Ibrahim Hamami (4.75), Azmi Bishara (4.42), Mustafa Barghouti (4.33) – Least: Shafiq Al-Hoot (3.10), Mohammed Al-Tamini (3.42), Azzam Al-Ahmad (3.33) – Raters claimed to recognize only .55 (of 22) speakers on average, perhaps because the speakers were less well known than the Americans • Topic important in charisma ratings (r=0,p=.043) Israeli separation wall > assassination of Hamas leader > debates among Palestinian groups > the Palestinian Authority and calls for reform > the Intifada and resistance Lexical Cues to Charisma • Length in words positively correlates with charisma, as for Americans • Disfluency rate negatively correlates, as for Americans • Repeated words positively correlates with charisma, as for Americans • Presence of Arabic ‘dialect markers’ (words, pronunciations) negatively correlates with charisma • Density of third person plural pronouns positively correlates w/ charisma – differing from Americans Acoustic/Prosodic Cues to Charisma • Duration positively correlated with charisma, as for Americans • Speaking rate approaches negative correlation – opposite from American – But rate of the fastest intonational phrase in the token positively correlated for both languages – Sdev of rate across intonational phrases positively correlated for charisma in Arabic • Pauses – #pauses/words ratio positively correlated with charisma but not for Americans – Sdev of length of pause positively correlated in Arabic but negatively for Americans • Pitch: – Mean pitch positively correlates (as for Americans) but also F0 max and sdev – Min pitch negatively correlates (opposite from Americans) • Intensity: Sdev positively correlates w/ charisma Outline • Charisma in American English political speech • Comparison with text: what is said vs. how it is said • Charisma in Arabic speech • Cross-cultural perceptions of charisma • Charisma in recent American politics: – Democrats in 2008 – Republican today How Are Perceptions of Charisma Similar Across Palestinian and Arabic Cultures? • Level of subject agreement on statements • Role of speaker ID, topic in charisma judgments • Positive correlations with charisma – Mean pitch and range – Duration, repeated words – Speaking rate of fastest IP • Negative correlations with charisma – Disfluencies How Do Charisma Judgments Differ Across Cultures? • Statements most and least agreed upon • For Arabic vs. English: – Positive correlations with charisma • Sdev of speaking rate, pause/word ratio, sdev of pause length, F0 max and sdev, sdev intensity – Negative correlations with charisma • Dialect, density of third person plural pronouns • Speaking rate, min F0 Ratings Within and Across Cultures • Cross-cultural comparison of subjects ratings: – American raters/Arabic speech – Palestinian raters/English speech, – Swedish raters/English speech • Do native and non-native raters differ on mean scores per token? • Yes, Eng/Pal rating English, Swe/Pal rating English • Do mean scores correlate per token? • Yes, for all • Amer and Swe rating English: – paired t-test betw means per token: p-value = 0.03064 – cor between means of rater-normalized ratings: r = 0.60, p-value = 1.170e-05 • Amer and Pal rating English: – paired t-test betw means: p-value = 0.1048 – cor between means of rater-normalized ratings: r = 0.47, p-value = 0.0009849 • Amer and Pal rating Arabic: – paired t-test betw means: p-value = 0.00164 – cor between means of rater-normalized ratings: r = 0.72, p-value = 3.049e-08 • Swe and Pal rating English: – paired t-test betw means: p-value = 0.8479 – cor between means of rater-normalized ratings: r = 0.55, p-value = 9.467e-05 Charisma and Politics Can we predict the Winners? or The Losers? Goals and Research Questions • Investigate emotion and charisma in political discourse – Collect a corpus of political speech: debates – Determine lexical and acoustic correlates of political success: correlate with subsequent polls – Characterize similarities and differences between Democrats and Republicans Corpus Collection • Democratic primaries from 2008: – Biden, Clinton, Dodd, Edwards, Gravel, Kucinich, Obama, Richardson – New Hampshire MSNBC Debate, September 26, 2007 • Fall 2011 Republican primaries: – Bachmann, Cain, Gingrich, Huntsman, Johnson, Paul, Perry, Romney, Santorum – Fox News / Google Debate, September 22, 2011 • Interviews for all candidates Gallup Poll Results Democrats Poll Results Republicals Poll Results Clinton 47% Romney 20% Obama 26% Cain 18% Edwards 11% Perry 15% Richardson 4% Paul 8% Biden 2% Gingrich 7% Dodd 1% Bachmann 5% Kucinich 1% Santorum 3% Gravel 0.5% Huntsman 2% Johnson 0% Lexical Features • For Debate speech only – – – – – – – Word Count Laughter, applause, and interruptions Number of speaker turns Average number of words per turn Syllables per word Disfluencies per word Linguistic Inquiry and Word Count (LIWC) features Acoustic Features • Debates and Interviews: mean, min, max, and standard deviation of f0 • Difference for these values between debates and interviews Results: Corrlates with Post-Debate Poll Standing Features Democracts Republicans Word Count Laughs Applause Turns Interruptions Avg Words Per Turn Avg Syllables Per Word Disfluencies Per Word Mean-f0 Difference Min-f0 Difference 0.804 0.084 0.194 0.846 0.052 0.199 0.015 -0.805 0.059 0.002 0.582 0.780 0.350 0.381 0.247 0.406 -0.768 -0.130 0.479 0.105 LIWC Positive Correlates for Democrats Feature Examples Correlates Insight Think, know consider 0.74 Inclusive And, with, include 0.680 Cognitive Cause, know, ought 0.679 Words per sentece 0.661 Common verbs Walk, went, see 0.657 Positive emotion Love, nice, sweet 0.632 Pronouns I, them, itself 0.626 Nonfluencies Er, hmm, umm 0.611 LIWC Features: Negative Correlates for Democrats Features Examples Correlation Articles A, an, the -0.684 All Punctuation -0.653 Periods . -0.639 Dashes - -0.595 Numerals 12, 38, 156 -0.581 Questions marks ? -0.578 Ingestion Dish, eat, pizza-0.575 -0.575 Money Audit, cash, owe -0.548 2nd Person Pronouns You, your -0.506 LIWC Features: Correlates for Republicans Features Examples Correlation Leisure Cook, chat, movie 0.869 Past tense Went, ran, had 0.732 “Other” punctuation !”:%$ 0.665 Dictionary words 0.541 Personal pronouns I, them, her 0.525 6+ letters Abandon, abrupt, absolute -0.741 Home Apartment, kitchen, family -0.513 General Party Differences • • • • • Democrat audiences laughed more Republican audience applauded more Democrats got interrupted more Republicans had more words per turn Both had about the same number of syllables per word • Democrats' debate speech was more different from their interviews, but Republicans responded slightly more positively to these differences. Conclusions • Democrats: Talk more, use fewer disfluencies, and use more Insight words • Republicans: Make the audience laugh, use Leisure-related words, and avoid longer words Historical Political Debates • Televised presidential debates first held in 1960. • Then 1976-present. • Polling information is available from CBS polling from 1976-present Predicting debate winners • Try to predict winners of the current debate given all prior debates. – “fighting the last war” – Only those debates with one democrat and one republican – Both VP and Presidential Debates Debate features • Word Usage: 1) personal pronouns, 2) number of questions, 3) pleasantries, 4) contractions, 5) numbers, 6) references to the opponent, 7) references to running mate. • Topic Words: seed WordNet search with “god”, “health”, “religion, “tax”, and “war” – WordNet-defined attribute, synonym, antonym, derivationally-related form, synset entry, troponym, meronym, member meronym, domain category, of the seed term and first-degree words added from the seed term Debate features • Affective Content: average and sum of Dictionary of Affect in Language activation, imagery & pleasantness • Named Entities: number of times and rate of mentions of people, locations and organizations • Term Vector: tf*idf of the top 10k words (stemmed, omitting stop words) • Length-features: syllable, word, sentence length of turns • Complexity: Flesch-Kincaid grade readability. Debate winner prediction results • Train on all previous Unsplit (full debate) Debate (full debate) 0.64 (0.68) p=0.75 Split (predict each 20 turns) 0.61 (0.65) p=1.0 Debater (only one turn) 0.57 (0.5) p=0.23 0.51 (0.5) p=0.47 Unsplit (full debate) • Just 2008 Debate (full debate) 1.0 (0.68) p=0.21 Split (predict each 20 turns) 0.72 (0.65) p=0.35 Debater (only one turn) 0.75 (0.5) p=0.14 0.61 (0.5) p=0.12 Debate Results: Important Features • First person plural possessives (“our”) used by Democrats • “sat”, “estimates”, “plain”, “advocate”, “roughly” • “atomic” • Grade level readability of Democrats • Second person singular object pronouns used by Republicans • “disagreements”, “unilateral”, “soviets” Debate Results: Feature analysis • First person plural possessives (“our”) used by Democrats – Negatively correlates with Democratic wins • Grade level readability of Democrats – Lower complexity in Democratic wins • Second person singular object pronouns (“you”) used by Republicans – Negatively correlates with Republican wins • “atomic”, “unilateral”, “soviets” – important political topics • “achieve”, “disagreements” – political action Future Work • Investigate additional acoustic features (intensity, speaking rate, voice quality) • Higher level prosodic features (intonational contours, phrasing) • Lexical features from interviews and debates • More complicated / interesting lexical features – Cross-cultural similarities and differences – Entrainment • Interpersonal relationships Thank you! The Prediction of Charisma • Arabic English Arabic Prosodic Phenomena MSA vs. Dialect • A word is considered dialectal if: – It does not exist in the standard Arabic lexicon – It does not satisfy the MSA morphotactic constraints – Phonetically different (e.g., ya?kul vs. ywkil) • In corpus of tokens – 8% of the words are dialect. – 80% of the dialect words are accented. Arabic Prosody: Accentuation • 70% of words are accented • 60% of the de-accented words are function words or disfluent items – Based on automatic POS analysis (MADA) – 12% of content words are deaccented • Distribution of accent types: – – – – H* or !H* pitch accent, 73% L+H* or L+!H*, 20% L*, 5% H+!H*, 2% Arabic Prosody: Phrasing • Mean of 1.6 intermediate phrases per intonational phrase • Intermediate phrases contain 2.4 words on average • Distribution of phrase accent/boundary tone combinations – – – – – L-L% H-L% L-H% H-L% H-H% 59% 26% 8% 6% 1% Arabic Prosody – most common contours H* LH* HL+H* LH* H* LH* !H* LL* LL+H* !H* LH* H* HH* !H* !H* LL+H* H- 21.9 13.4 9.7 7.6 4.1 4.1 3 3 2.3 2.1 Arabic Prosody – Disfluency • In addition to standard disfluency: – Hesitations – filled pauses – self-repairs • In Arabic, speakers could produce a sequence of all of the above. (see praat: file: 1036 and 2016) • Disfluency may disconnect prepositions and conjunctions from the content word: – تأتي... يعني... لـ... ولتأتي => و – w- l- uh- yEny uh- t?ty instead of wlt?ty Functional Definition of Charisma • According to our 12 native American English (AE) speakers who rated AE speech, “charisma is” Analyzing Charismatic Speech Functional Definition of Charisma • According to our 12 native Palestinian Arabic (PA) speakers who rated PA speech, “charisma is” Examples of features that correlate with Charisma • Variation in length of pauses: negative correlate in English but positive in Arabic • Speaking Rate: positive correlate in English and approaching negative correlate in Arabic • Mean Pitch: positive correlate in both studies Examples of features that correlate with Charisma • High variation in loudness: positive correlate in Arabic • Disfluency: rate of disfluencies (repetitions, repairs, and filled pauses) negatively correlated with charisma in both • First person plural pronoun: positive correlate in English only • Ratio of adverbs and adjectives to number of words in utterance: negative correlate in English