Affective Computing: Machines with Emotional Intelligence
Hyung-il Ahn
MIT Media Laboratory

Skills of Emotional Intelligence (Salovey and Mayer 1990; Goleman 1995):
• Expressing emotions
• Recognizing emotions
• Handling another's emotions
• Regulating emotions (presupposes having emotion)
• Utilizing emotions (presupposes having emotion)

We have pioneered new technologies to recognize human affective information: sensors, pattern recognition, and common-sense reasoning to infer emotion from physiology, voice, face, posture and movement, and mouse pressure.

Mind-Read: recognizing complex cognitive-affective states from joint face and head movements.

Future: a "teacher for every learner."

Can we teach a chair to recognize behaviors indicative of interest and boredom? (Mota and Picard)
Postures sensed by the chair: sit upright, lean forward, slump back, side lean.
What can the sensor chair contribute toward inferring the student's state: bored vs. interested?

Results (on children not in the training data; Mota and Picard, 2003):
• 9-state posture recognition: 89-97% accurate
• High interest, low interest, taking a break: 69-83% accurate

Detecting, tracking, and recognizing facial expressions from video (IBM BlueEyes camera with MIT algorithms).

Affective-Cognitive Mental States (Baron-Cohen et al., Autism Research Centre, Cambridge)
Complex mental states (subset):
• Agreeing: assertive, committed, persuaded, sure
• Concentrating: absorbed, concentrating, vigilant
• Disagreeing: disapproving, discouraging, disinclined
• Interested: asking, curious, impressed, interested
• Thinking: brooding, choosing, thinking, thoughtful
• Unsure: baffled, confused, undecided, unsure

Technology that understands and responds to human experience like a caring, respectful person would. For example, it knows when a person/customer is:
• Concentrating, and does not interrupt unless very important
• Thinking, and can pause to let you think
• Unsure, and can offer to explain differently
• (Not) interested in what it says
• (Dis)agreeing, and can adjust its response respectfully

Technology with people sense will perceive cognitive-affective states before acting, e.g., before interrupting: "Hmm ... Roz looks busy. It's probably not a good time to bring this up."
Analysis of nonverbal cues -> inference and reasoning about mental states -> modify one's own actions, persuade others.

Inferring cognitive-affective state from facial + head movements (el Kaliouby, 2005)
Pipeline: feature point tracking and head pose estimation -> facial feature extraction -> head and facial action unit recognition -> head and facial display recognition -> mental state inference. (A toy sketch of this layered inference appears below, after the RoCo slides.)
Example output: "Hmm ... let me think about this" (Thinking). Other examples: Agree, Disagree.
[The talk also covered experimental evaluation and conclusions.]

Robotic Computer (RoCo): the world's first physically animated computer.
• 75% of workers sit in front of (static) computers
• Back pain/injury is the #2 cause of missed work
• Physical movement helps prevent/reduce back pain
Goals:
• Fostering healthy posture
• Building social rapport
• Improved task performance (affect-congruent behavior)
[Video: animated desktop monitor – RoCo, the Robotic Computer]
[Video: RoCo behavior]

When should RoCo move? (Future work, not the topic of this paper, but important to mention.)
• NOT when you're concentrating, interested, in the middle of an engaging task, or otherwise attentive/focused on the monitor's content
• Might make a micro-movement when you're looking away or blinking in the middle of a task
• Might make a larger movement to attract a new user, bow to welcome, or when the user shifts tasks and hasn't shifted posture (etc.)
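To make the layered inference idea above concrete, here is a minimal sketch. It is NOT el Kaliouby's actual dynamic Bayesian network: it scores mental states against a sliding window of detected head/facial displays with a naive Bayes update, and all display names and likelihood values are illustrative assumptions, not values from the thesis.

```python
# Toy sketch: infer a mental state from a window of head/facial display
# detections via naive Bayes. Likelihood numbers are assumed for illustration.
from collections import Counter

MENTAL_STATES = ["agreeing", "disagreeing", "thinking",
                 "unsure", "interested", "concentrating"]

# P(display observed in a frame | mental state) -- assumed values.
LIKELIHOOD = {
    "agreeing":      {"head_nod": 0.6, "smile": 0.3},
    "disagreeing":   {"head_shake": 0.6, "lip_pull": 0.2},
    "thinking":      {"head_tilt": 0.4, "gaze_away": 0.4},
    "unsure":        {"head_tilt": 0.3, "eyebrow_raise": 0.3, "gaze_away": 0.2},
    "interested":    {"eyebrow_raise": 0.5, "lean_forward": 0.3, "smile": 0.2},
    "concentrating": {"gaze_fixed": 0.6, "brow_lower": 0.3},
}
DEFAULT_P = 0.02  # probability of a display not listed for a state

def infer_mental_state(display_window):
    """Return a normalized posterior over mental states for one window."""
    counts = Counter(display_window)
    scores = {}
    for state in MENTAL_STATES:
        p = 1.0 / len(MENTAL_STATES)  # uniform prior over states
        for display, n in counts.items():
            p *= LIKELIHOOD[state].get(display, DEFAULT_P) ** n
        scores[state] = p
    total = sum(scores.values())
    return {s: p / total for s, p in scores.items()}

# A window dominated by nods and smiles should score "agreeing" highest;
# a "concentrating" posterior is what RoCo would check before moving.
posterior = infer_mental_state(["head_nod", "head_nod", "smile", "gaze_fixed"])
print(max(posterior, key=posterior.get))
```

The real system replaces these hand-set likelihoods with classifiers trained per level (action units, displays, mental states) and carries temporal dependencies across windows.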
RoCo's postures congruous to the user's affect
"Stoop to Conquer": posture and affect interact to influence computer users' comfort and persistence in problem-solving tasks. People tend to be more persistent and feel more comfortable when RoCo's posture is congruous to their affective state (N=17).

"Stoop to Conquer": posture congruent with emotion improves persistence (number of tracing attempts, two different experiments):

Human state                          Slumped  Neutral  Upright
Success ("you scored 8/10"), N=30      8.2      8.3     12.0
Failure ("you scored 3/10"), N=19      9.6      7.4      6.9

We are creating new computational models to measure human affective experience and to predict human decision-making and preference:
• A multi-modal set of affective-cognitive measures for product evaluation, with computational models for predicting customer decisions

Predicting customer product preferences by combining information about emotion and cognition (a toy sketch of such a combined model appears at the end of this section).

Background findings to inform the new research:
• The brain uses both emotion (affect) and cognition in decision making -> the model should combine both affect and cognition
• A person in an experiment is likely to cognitively bias their self-report of what they like -> the method should not rely on self-report alone
• When a person is cognitively loaded, they are more likely to use emotion in decision-making -> the method should slightly load the person cognitively

Background findings to inform the new method:
• Multiple measures of affect provide the most robust assessment -> the method can measure affective physiology (face, skin conductance) as well as behavior and self-report
• Sweeter beverages are preferred on the first sip; long-term accumulation of something mildly bad is required before it is "bad enough to notice" -> the method should require many sips of every beverage

More complete understanding of consumer desire: a multi-dimensional response combining
• Facial expression -> AFFECTIVE LIKING (emotions)
• Number of sips -> amount consumed (physical)
• Skin conductance -> ANTICIPATORY FEELING (arousal)
• Self-report -> COGNITIVE LIKING (purchase intent, liking, expectation)

Videos of Testing
• Here is a sneak preview of my project. Make sure to look for consumers' emotions that may not be captured in self-reported questions.
[Video]
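The following sketch illustrates the modeling idea just described: combine an affective channel (facial valence, skin-conductance arousal) with a cognitive channel (self-reported liking) to predict which product a person will choose. It is not the paper's actual model; the feature names, weights, and synthetic data are assumptions made for illustration.

```python
# Toy sketch: predict product choice from affective + cognitive features.
# Synthetic data stands in for real facial, GSR, and self-report measures.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200

# Per-trial features: [facial_valence, gsr_arousal, self_report_liking]
X = rng.normal(size=(n, 3))

# Synthetic ground truth: choice driven by both affect and cognition,
# echoing the finding that the brain uses both in decision making.
logit = 1.2 * X[:, 0] + 0.6 * X[:, 1] + 0.8 * X[:, 2]
y = (logit + rng.normal(scale=0.5, size=n)) > 0

model = LogisticRegression().fit(X, y)
print("fitted weights (valence, arousal, self-report):", model.coef_[0])
```

The point of the design is that self-report alone would miss the affective channels, which is exactly the bias the background findings warn about.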
Test Products
Products were chosen with clear performance differences:
• Stronger performer – Pepsi Vanilla: performed in the top 25% (green region) in the Directions HUT
• Weaker performer – Pepsi Summer Mix: performed in the lower 40% (lower yellow region) in the Directions HUT

Affective Computing
Two techniques were performed simultaneously:
– Facial imaging and head positioning: tracking facial muscle movements to interpret emotions
– Galvanic skin response (GSR): measures arousal, used as an intensity measure for the emotions

Affective-Cognitive Mental States
Facial expression + head position = interpretation:
• Concentrating
• Thinking
• Confused
• Interested
• Agreeing
• Disagreeing
GSR shows intensity.

Method: Choice Technique
• The respondent selected one of two vending machines
• The process was repeated 30 times
• Each machine dispensed one product with 70% probability and the other with 30% probability
• Eventually respondents realized that each machine favors a different product and selected the vending machine hoping to receive their favored product
(A simulation sketch of this procedure appears after the Discussion below.)

Method: General Set-Up
• Machine 1 and Machine 2 were displayed on the computer
• Two cups on each side of the computer: Pepsi Vanilla and Pepsi Summer Mix
• Use of straws avoided blocking facial reactions

Experimental set-up: machine selection -> sip of the resulting beverage -> answer questions.

Method, Step 1: randomly choose a vending machine; each vending machine directed the respondent to sip a beverage.
Method, Step 2: respondents sip the resulting beverage.
Method, Step 3: answer the questionnaire used in a standard CLT: overall liking (beverage and machine), purchase intent, comparison to expectation.
Method, Step 4: reselect a machine; 30 machine selections were made in total.

Data collection timeline
Data were collected throughout the experiment.
• Choice 1: 70% Vanilla / 30% Mix
• Choice 2: 70% Mix / 30% Vanilla
Trial loop: start -> select Vanilla or Mix (outcome) -> sip ("How much do you like the sip?") -> question -> evaluate -> start (next trial).
• Measuring ANTICIPATORY FEELING (hope/dread): skin conductance
• Measuring AFFECTIVE LIKING (initial reaction): facial expression, skin conductance
• Measuring COGNITIVE LIKING (post reaction): self-report

Videos of Testing
[Videos]

Discussion
• Analysis: our hypothesis is that joining quantitative and qualitative methodologies will help provide an understanding of consumers' real product evaluations.
[Multi-dimensional response framework repeated from the earlier slide: facial expression -> affective liking; number of sips -> amount consumed; skin conductance -> anticipatory feeling; self-report -> cognitive liking.]
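As promised above, here is a simulation sketch of the 70/30 choice procedure: two "vending machines", each favoring one beverage with probability 0.7, and a simple respondent who switches machines after a disliked sip. The liking values and switching rule are illustrative assumptions, not measured data or the study's analysis.

```python
# Toy simulation of the choice technique: 30 machine selections,
# each machine dispensing its favored product with probability 0.7.
import random

MACHINES = {1: ("vanilla", 0.7), 2: ("mix", 0.7)}  # (favored product, P(favored))
LIKING = {"vanilla": 7.5, "mix": 4.0}              # assumed mean sip ratings

def dispense(machine_id):
    """Sample which beverage a machine pours on this trial."""
    favored, p = MACHINES[machine_id]
    other = "mix" if favored == "vanilla" else "vanilla"
    return favored if random.random() < p else other

choice = random.choice([1, 2])
for trial in range(30):  # 30 machine selections, as in the method
    beverage = dispense(choice)
    rating = LIKING[beverage] + random.gauss(0, 1)  # noisy sip rating
    print(f"trial {trial + 1:2d}: machine {choice} -> {beverage:7s} "
          f"rating {rating:.1f}")
    if rating < 5.5:  # switch machines after a disliked sip
        choice = 2 if choice == 1 else 1
```

Because the mapping from machine to product is only probabilistic, the respondent keeps experiencing hope/dread between selection and outcome, which is the window where skin conductance captures anticipatory feeling.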