A Fuzzy System for Emotion Classification based on the MPEG-4 facial definition parameter set
Nicolas Tsapatsoulis, Kostas Karpouzis, George Stamou, Fred Piat and Stefanos Kollias
Image, Video and Multimedia Systems Laboratory
National Technical Univ. of Athens

Problem Statement
- Describe archetypal emotions using the FAPs of MPEG-4
- Approximate FAPs through a set of facial protuberant points
- Combine the emotion wheel of Whissel with a fuzzy inference system to extend to a broader variety of emotions

Emotion Analysis: Engineers and Psychological Researchers
- Engineers have concentrated mainly on the archetypal emotions: surprise, fear, joy, sadness, disgust, anger
- Psychological researchers have investigated a wider variety of emotions
  - Their results are not easily implemented
  - Some hints can be obtained
  - Whissel suggests that emotions are points in a two-dimensional space

Whissel's emotion wheel
- Axes: activation and evaluation
- Activation: degree of arousal
- Evaluation: degree of pleasantness

FAPs and Archetypal Expressions
Anger: squeeze_l_eyebrow (+), lower_t_midlip (-), raise_l_i_eyebrow (+), close_t_r_eyelid (-), close_b_r_eyelid (-), squeeze_r_eyebrow (+), raise_b_midlip (+), raise_r_i_eyebrow (+), close_t_l_eyelid (-), close_b_l_eyelid (-)
Sadness: raise_l_i_eyebrow (+), close_t_l_eyelid (+), raise_l_m_eyebrow (-), raise_l_o_eyebrow (-), close_b_l_eyelid (+), raise_r_i_eyebrow (+), close_t_r_eyelid (+), raise_r_m_eyebrow (-), raise_r_o_eyebrow (-), close_b_r_eyelid (+)
Surprise: raise_l_o_eyebrow (+), raise_l_i_eyebrow (+), raise_l_m_eyebrow (+), squeeze_l_eyebrow (-), open_jaw (+), raise_r_o_eyebrow (+), raise_r_i_eyebrow (+), raise_r_m_eyebrow (+), squeeze_r_eyebrow (-)

Facial animation in MPEG-4
- Motion is represented by FAPs (Facial Animation Parameters), e.g. raise_l_o_eyebrow, raise_r_i_eyebrow, open_jaw
- Normalized to standard distances of rigid areas in the face, e.g.
left eye to right eye (ES0) or nose to eye level (ENS0)

Synthetic faces in MPEG-4
- Defined through FDPs (Facial Definition Parameters)

Emotion Words due to Whissel
Emotion     Activation  Evaluation
Afraid      4.9         3.4
Bashful     2           2.7
Disgusted   5           3.2
Guilty      4           1.1
Patient     3.3         3.8
Surprised   6.5         5.2
Angry       4.2         2.7
Delighted   4.2         6.4
Eager       5           5.1
Joyful      5.4         6.1
Sad         3.8         2.4

Joy: close_t_l_eyelid (+), close_b_l_eyelid (+), stretch_l_cornerlip (+), raise_l_m_eyebrow (+), lift_l_cheek (+), lower_t_midlip (-) OR open_jaw (+), close_t_r_eyelid (+), close_b_r_eyelid (+), stretch_r_cornerlip (+), raise_r_m_eyebrow (+), lift_r_cheek (+), raise_b_midlip (-)
Disgust: close_t_l_eyelid (+), close_t_r_eyelid (+), lower_t_midlip (-), close_b_l_eyelid (+), close_b_r_eyelid (+), open_jaw (+), squeeze_l_cornerlip (+) AND/OR squeeze_r_cornerlip (+)
Fear: raise_l_o_eyebrow (+), raise_l_m_eyebrow (+), raise_l_i_eyebrow (+), squeeze_l_eyebrow (+), open_jaw (+) OR close_t_l_eyelid (-) OR lower_t_midlip (+), raise_r_o_eyebrow (+), raise_r_m_eyebrow (+), raise_r_i_eyebrow (+), squeeze_r_eyebrow (+), close_t_r_eyelid (-), lower_t_midlip (-)

Detection of Facial Protuberant Points
- Automatic detection in images where the face segments are large; semiautomatic procedure otherwise
- Detection of the eyes guides the detection of the other points
[Figure: face with protuberant points numbered 1 to 19; distances between points are annotated with the normalization units ENSo and ESo]
"Hierarchical facial features localisation using a morphological approach," Raphael Villedieu, Technical Report NTUA, June 2000

[Figure: flowchart of the facial feature detection procedure. Box labels: Original Image (face extracted); Contours; Blobs; Eyes Detection (Symmetry: position, area); Vertical Edges Detection; Erosion / Dilation; Eyes Refining (Box + Feature Detection); Feature-specific Detection; Find Boxes (relative to eyes); Next frame, refined; Features located; Boxes]
Filter (keep darkest pixels)
Select points (extrema)

Features and Linguistic terms
- Table 4 is used to determine how many and which linguistic terms should be assigned to a particular feature
- Example: the linguistic terms medium and high are sufficient for the description of feature F11
[Figure: membership functions for feature F4; horizontal axis from -600 to 1200, membership degree from 0 to 1]

Fuzzification of the input vector
- The universe of discourse for each feature is estimated from statistics
- Example: a reasonable range of variation for F5 is $[\,m_{A5} - 3\sigma_{A5},\; m_{Su5} + 3\sigma_{Su5}\,]$, where $m_{A5}$, $\sigma_{A5}$ and $m_{Su5}$, $\sigma_{Su5}$ are the mean values and standard deviations of feature F5 for the expressions anger and surprise respectively (see Table 4)
- For unidirectional features such as F11, either the lower or the upper limit is fixed to zero

Fuzzy Inference System
- Input: a 15-tuple feature vector, corresponding to the FAPs depicted in Table 3
- Output: an n-tuple, where n is the number of modeled emotions; for the archetypal emotions, the output values express the degree of belief that the emotion is anger, sadness, joy, disgust, fear and/or surprise
- Fuzzification: uses Table 4 (estimation of the FAP range intervals)
- Fuzzy Rule Base: obtained from psychological studies; uses Whissel's activation parameter; expresses the a priori knowledge of the system. If-then rules are heuristically constructed from Tables 2 and 4
[Figure: block diagram. Feature Vector -> FUZZIFICATION -> FUZZY INFERENCE -> DEFUZZIFICATION -> Emotions; supporting blocks: FAPs Range Intervals, Fuzzy Rule Base, User Defined Rules]

Recognition of a broader variety of emotions
- Estimate which features participate in the emotions
- Modify the membership functions of the features to correspond to the new emotions
- Define six general categories corresponding to the archetypal emotions
- Example: the category fear also contains worry and terror; model these by appropriately translating the positions of the linguistic terms associated with the particular features along the universe of discourse axis
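
The fuzzification step described on the slides above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the slides: the membership functions are triangular and evenly spaced, the function names (`universe_of_discourse`, `triangular`, `fuzzify`) are hypothetical, and the means and standard deviations are made-up numbers rather than values from Table 4.

```python
from typing import Dict, Optional, Sequence, Tuple


def universe_of_discourse(low_stats: Tuple[float, float],
                          high_stats: Tuple[float, float],
                          fix_zero: Optional[str] = None) -> Tuple[float, float]:
    """Range [m_low - 3*sigma_low, m_high + 3*sigma_high] for one feature.

    low_stats / high_stats are (mean, std) pairs for the expressions that
    bound the feature from below and above (e.g. anger and surprise for F5).
    fix_zero = "lower" or "upper" pins that limit to zero, as the slides
    describe for unidirectional features.
    """
    lo = low_stats[0] - 3.0 * low_stats[1]
    hi = high_stats[0] + 3.0 * high_stats[1]
    if fix_zero == "lower":
        lo = 0.0
    elif fix_zero == "upper":
        hi = 0.0
    return lo, hi


def triangular(x: float, left: float, centre: float, right: float) -> float:
    """Triangular membership function (an assumed shape, not from the slides)."""
    if x <= left or x >= right:
        return 0.0
    if x <= centre:
        return (x - left) / (centre - left)
    return (right - x) / (right - centre)


def fuzzify(x: float, lo: float, hi: float,
            terms: Sequence[str] = ("low", "medium", "high")) -> Dict[str, float]:
    """Degree of membership of x in each linguistic term over [lo, hi]."""
    step = (hi - lo) / (len(terms) - 1)   # spacing between term centres
    x = min(max(x, lo), hi)               # clamp to the universe of discourse
    return {
        term: triangular(x, lo + (i - 1) * step, lo + i * step, lo + (i + 1) * step)
        for i, term in enumerate(terms)
    }


# Hypothetical statistics: (mean, std) for the lower and upper bounding expressions.
lo, hi = universe_of_discourse((-100.0, 50.0), (400.0, 100.0))
memberships = fuzzify(462.5, lo, hi)
```

With these made-up statistics the universe of discourse is [-250, 700], and a feature value halfway between the centres of medium and high belongs to both terms to the same degree.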
Modifying the membership functions using the activation parameters
- Let $a_Y$ and $a_X$ be the activation values corresponding to emotions Y and X
- Rule 1: Emotions of the same category involve the same features Fi
- Rule 2: Let $\mu_{XZi}$ and $\mu_{YZi}$ be the membership functions for the linguistic term Z corresponding to Fi and associated with emotions X and Y respectively. If $\mu_{XZi}$ is centered at value $m_{XZi}$ of the universe of discourse, then $\mu_{YZi}$ should be centered at
  $m_{YZi} = \frac{a_Y}{a_X}\, m_{XZi}$
- Rule 3: $a_Y$ and $a_X$ are known values obtained from Whissel's study

Experimental Results
[Figure: bar chart with series Static Set (%) and PHYSTA (%), vertical axis from 0 to 100, for Fear, Disgust, Joy, Sadness, Surprise, Anger]

Experimental Results
[Figure: bar chart of recognition rate (%), vertical axis from 0 to 80, for Disdain, Disgust, Repulsion, Delighted, Eager, Joy]
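
Rule 2 above amounts to rescaling the centre of every linguistic term by the ratio of the two activation values. The following Python sketch shows that arithmetic only. The function name `shift_centres` and the centre positions are hypothetical; the activation values for Joyful (5.4) and Delighted (4.2) are the ones listed in the emotion-word table of these slides, and treating the two as members of the same category is an assumption made for illustration.

```python
from typing import Dict


def shift_centres(centres_x: Dict[str, float], a_x: float, a_y: float) -> Dict[str, float]:
    """Apply m_Y = (a_Y / a_X) * m_X to the centre of each linguistic term."""
    if a_x == 0:
        raise ValueError("activation of the reference emotion must be non-zero")
    ratio = a_y / a_x
    return {term: ratio * centre for term, centre in centres_x.items()}


# Activation values as listed in the emotion-word table of these slides.
A_JOYFUL, A_DELIGHTED = 5.4, 4.2

# Hypothetical centres of the terms of one feature for the emotion "joyful".
centres_joyful = {"medium": 270.0, "high": 540.0}

# Centres of the same terms for "delighted", obtained by Rule 2.
centres_delighted = shift_centres(centres_joyful, A_JOYFUL, A_DELIGHTED)
```

Because Delighted has a lower activation than Joyful, every centre moves toward zero, which corresponds to a less intense version of the same expression.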