Introducing the RECOLA Multimodal Corpus of Remote Collaborative and Affective Interactions
F. Ringeval, A. Sonderegger, J. Sauer, D. Lalanne
Department of Informatics – Psychology
Université de Fribourg – Universität Freiburg, Switzerland
2nd International Workshop on Emotion Representation, Analysis and Synthesis in Continuous Time and Space
emoSPACE 2013, April 26th, 2013
IM2 GA meeting, October 18th, 2011

Corpus Design
• Why create a new corpus of emotion?
– The idea originally comes from the EmotiBoard project (enhancing emotional awareness for remote collaborative interactions)
– The context of remote collaboration has not been studied so far
– No existing corpus offers both audio-visual and physiological data, and none features French speakers
Figure: Emotional feedback overlaid on audiovisual data of the SEMAINE database; publication submitted to ACII 2013
• Objective of the corpus
– Provide rich and consistently annotated multimodal data of natural human behaviour in a context of remote dyadic collaboration

emoSPACE 2013, Shanghai, China | Fabien Ringeval

Corpus Design
• Videoconference situation (2 persons working together)
• 2 x 2 between-subjects design
• Independent variables
– Emotion feedback (yes/no): study the impact of EmotiBoard
– Emotion manipulation (positive/negative): increase the difference in emotional valence between the participants of a team
• Participants
– 46 students (58.7% female)
– Mean age: 22 years ± 3 (min: 18, max: 32)
– French speakers with different origins: 33 French, 4 Germans, 8 Italians and 1 Portuguese

Corpus Design
• EmotiBoard: emotional feedback generation
– Vertical interactive surface on which multiple users can interact using different devices
– Java client/server library to transmit and display Wizard-of-Oz ratings of the user's emotion (arousal & valence)

Corpus Design
• Collaborative task
– As simple as possible, while ensuring that people would be both motivated and sufficiently involved with regard to their emotions
– Winter survival exercise: 15 items have to be ranked according to their significance for survival in a deserted and hostile area (plane crash)

Corpus Design
• Procedure
– 1st self-report: emotion questionnaire (SAM)
– Individual ranking of the items of the survival task; 10 min.
– Display of a film clip for emotion induction; 5 min.
– 2nd self-report: emotion questionnaires (SAM & PANAS)
– Discussion to agree on the final ranking of the 15 items; 20 min.
– 3rd self-report: emotion questionnaires (SAM & PANAS), subjective workload, team collaboration and team satisfaction
Figures: SAM's manikins for valence; SAM's manikins for arousal

Corpus Design
• Participant's location
– Separate rooms in a semi-basement with thick closed curtains and neon ceiling lighting; conditions kept constant across all sessions

Multimodal Recordings
• Audio sensor
– HQ unidirectional headset microphone + LQ omnidirectional microphones (built into the webcams)
– External sound cards used for: (1) phantom power for the microphone, (2) the Skype videoconference and (3) biosignal synchronisation
– Recording with Audacity software; 44.1 kHz, 16 bits
Figures: AKG 520L microphone; Lexicon Omega Studio external sound card; Audacity audio recording software

Multimodal Recordings
• Video sensor
– HD 720p webcam; Logitech C270, 1080x720p, 25 Hz
– 2 webcams per participant: one for Skype and one for video recording
– LQ audio signal captured for post-synchronisation of the HQ audio with the video data
– Recording with the webcam's software; gain and contrast fixed once and auto-adjustment turned off
Figures: Logitech webcam's recording software; Logitech C270 webcam

Multimodal Recordings
• Physiological sensors
– ECG: palm of the right hand, right and left inner ankles
– EDA: tips of the index and middle fingers
– Biopac MP36 unit and Biopac Student Lab software (BSL Pro); 1 kHz
– Synchronisation pulses are emitted every second to the external sound card once recording begins (DB9 output → mono jack)
Figures: EDA sensors; back of the BIOPAC MP36 unit; BSL Pro recording software, from top to bottom: EDA, ECG and RR biosignals

Multimodal Recordings
• Data Synchronisation
– Video and HQ audio signal: localisation of a sync event in both the HQ and LQ audio signals + cross-correlation maximisation (20 ms); precision of 1 ms
– Biosignals and HQ audio signal: synchronisation pulses (right channel) make synchronisation trivial; precision of 1 ms
Figures: Cross-correlation signal between HQ and LQ audio data; left (audio) and right (sync pulses) channels of the HQ signal

Data Annotation
• ANNEMO: ANNotating EMOtions
– Web-based annotation interface; Google Chrome web browser
– Emotional behaviours: arousal and valence (continuous time & values)
– Social behaviours: agreement, dominance, engagement, performance and rapport (discrete time & values)

Data Annotation
• Annotation Data Collection
– 6 French-speaking annotators (3M + 3F) annotated the whole corpus
– Oral instructions (4-page document) + practice on 4 sequences
– Automatic check of the annotation data by a dedicated algorithm, e.g., blanks, missing sequences, wrong order of annotation, etc.
– Only the first 5 minutes of interaction were annotated

Data Annotation
• Post-processing and analysis
– Piece-wise cubic interpolation and binning into 40 ms frames
– Local normalisations: zero-mean and synchronisation
– Good inter-annotator agreement for the affective dimensions, and fairly good agreement for the social dimensions

Conclusion
• Conclusion:
– RECOLA: a new corpus of REmote COLlaborative and Affective interactions in French
– 3 well-synchronised HQ signals: audio, visual and ECG+EDA
– Rich and consistent annotations of socio-affective behaviours; internal (self-reporting) and external (3M+3F)
– From 27 subjects (5.5 h of multimodal data) to 34 subjects (7 h of audiovisual data), counting only participants who returned a positive consent form
– ANNEMO: a new web-based annotation tool of emotion
ALL WILL BE MADE PUBLICLY AVAILABLE SOON!
Stay informed on: http://diuf.unifr.ch/diva/recola
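
The Data Synchronisation slide aligns the HQ headset audio with the LQ webcam audio by maximising their correlation. The slides do not give the implementation, so the following is only a minimal sketch of that idea on synthetic signals. It assumes that the "20 ms" figure is the half-width of the lag search window; the function name `estimate_lag`, the test signal and the 300-sample delay are all hypothetical.

```python
import random
from statistics import fmean

SAMPLE_RATE = 44_100                    # Hz, as stated for the Audacity recordings
MAX_LAG = round(0.020 * SAMPLE_RATE)    # assumed meaning of "20 ms": +/- 882 samples


def estimate_lag(ref, other, max_lag):
    """Return the delay (in samples) of `other` relative to `ref`.

    The delay is the lag within +/- max_lag that maximises the
    cross-correlation of the two mean-removed signals.
    """
    ref_mean, other_mean = fmean(ref), fmean(other)
    ref = [x - ref_mean for x in ref]
    other = [x - other_mean for x in other]

    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            pairs = zip(ref, other[lag:])      # compares ref[i] with other[i + lag]
        else:
            pairs = zip(ref[-lag:], other)     # compares ref[i - lag] with other[i]
        score = sum(a * b for a, b in pairs)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical data: a short noise burst stands in for the sync event.
random.seed(0)
N = 3000
TRUE_DELAY = 300                               # samples, chosen for illustration
hq = [random.gauss(0.0, 1.0) for _ in range(N)]
# The LQ channel is modelled as a delayed, attenuated, noisier copy of the HQ one.
lq = [0.0] * TRUE_DELAY + hq[: N - TRUE_DELAY]
lq = [0.5 * x + random.gauss(0.0, 0.2) for x in lq]

estimated_lag = estimate_lag(hq, lq, MAX_LAG)
print(f"estimated delay: {estimated_lag} samples "
      f"({1000 * estimated_lag / SAMPLE_RATE:.2f} ms)")
```

Once the delay is known, shifting the video's LQ track by that many samples places the video on the HQ audio timeline. On real recordings a vectorised or FFT-based correlation would be preferable to this pure-Python loop.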
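
The post-processing slide mentions piece-wise cubic interpolation, binning into 40 ms frames and zero-mean normalisation of the continuous ratings. A rough sketch of such a pipeline is given below, under several assumptions that the slides do not confirm: the interpolant is a cubic Hermite spline with finite-difference slopes, "binning" is read as sampling one value every 40 ms (the 25 Hz video rate), and the normalisation subtracts the mean of each trace. All function names and the example trace are hypothetical.

```python
from bisect import bisect_right

FRAME_S = 0.040   # 40 ms frames, matching the 25 Hz video rate


def tangents(t, y):
    """Finite-difference slope at each knot (one-sided at both ends)."""
    last = len(t) - 1
    slopes = []
    for i in range(len(t)):
        lo, hi = max(i - 1, 0), min(i + 1, last)
        slopes.append((y[hi] - y[lo]) / (t[hi] - t[lo]))
    return slopes


def hermite(t, y, m, x):
    """Evaluate the piecewise cubic Hermite interpolant at time x."""
    i = min(max(bisect_right(t, x) - 1, 0), len(t) - 2)   # segment index
    h = t[i + 1] - t[i]
    s = (x - t[i]) / h                                    # position in segment, 0..1
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]


def resample_to_frames(t, y, frame_s=FRAME_S):
    """Interpolate irregular samples (t, y) onto a regular 40 ms grid."""
    m = tangents(t, y)
    n_frames = int((t[-1] - t[0]) / frame_s + 1e-9) + 1
    times = [t[0] + k * frame_s for k in range(n_frames)]
    return times, [hermite(t, y, m, x) for x in times]


def zero_mean(values):
    """Remove the mean of one trace (one possible 'local normalisation')."""
    mu = sum(values) / len(values)
    return [v - mu for v in values]


# Hypothetical rating trace with irregular timestamps (seconds). It is linear
# on purpose: the interpolant reproduces a straight line, which makes the
# result easy to check by hand.
raw_t = [0.0, 0.13, 0.31, 0.52, 0.70, 1.0]
raw_y = [2.0 * x + 0.5 for x in raw_t]

frame_t, frame_y = resample_to_frames(raw_t, raw_y)
centred = zero_mean(frame_y)
print(len(frame_t), "frames; first values:", [round(v, 3) for v in frame_y[:3]])
```

If "binning" instead means averaging all interpolated values that fall inside each 40 ms window, `resample_to_frames` would evaluate the interpolant on a finer grid and average per frame; the rest of the sketch would stay the same.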