Application to join HUMAINE network of excellence

1. Name of Legal entity

GET (Groupe des Ecoles des Télécommunications)/Ecole Nationale Supérieure des Télécommunications (ENST) - Télécom Paris
46 rue Barrault
F-75634 Paris Cedex 13
Phone: +33 1 45 81 77 77
Fax: +33 1 45 89 79 06

2. Names of researchers

Researchers

Ioana VASILESCU
Dr. Ioana VASILESCU obtained a Ph.D. in linguistics in 2001 in the area of speech perception (Université Lyon 2 and UC Berkeley). After a year at LIMSI (Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur), she joined GET-Télécom Paris (ENST) as a CNRS research scientist in October 2002. Her current research interests are in speech production (spontaneous speech phenomena, e.g. disfluencies) and perception (language identification by humans). Her work in the field of emotions concerns emotion analysis and detection in speech, and perceptual approaches to building corpus-independent annotation methodologies.

Gaël RICHARD
Dr. Gaël RICHARD obtained a Ph.D. in 1994 in the area of speech synthesis. He then spent two years at the CAIP Center (Rutgers University, USA) in the speech processing group of Prof. James Flanagan, where he explored innovative approaches to speech production. Between 1997 and 2001, he worked successively for Matra Nortel Communications and for Philips Consumer Communications. In particular, he was the project manager of several large-scale European projects in the field of multimodal verification and speech processing. He joined GET-Télécom Paris (ENST) as Assistant Professor in the field of audio and multimedia signal processing. Co-author of over 30 papers and inventor on a number of patents, he is also one of the experts of the European Commission in the field of man-machine interfaces.

Gerard CHOLLET
Gerard CHOLLET obtained a PhD in Linguistics and Speech Processing from the University of California, Santa Barbara. He joined CNRS in 1978 and has been affiliated with GET-ENST since 1983.
His current research interests are in audio-visual speech processing, coding, recognition, synthesis, biometry and man-machine interfaces. 33 PhD and 8 postdoctoral students have worked under his supervision. He has initiated, managed and/or contributed to many European, international and national projects. He is the author or co-author of more than 200 book chapters, journal articles and conference papers.

Mutsuko TOMOKIYO

Guido AVERSANO

Students

Chloé CLAVEL
Chloé CLAVEL has been a PhD student at THALES RT and GET-ENST since December 2003. Her work focuses on the “analysis and detection of acoustic manifestations of emotional states related to Fear”. She previously completed a Master's degree in signal processing after engineering studies at TELECOM Paris.

Walid Karam

3. Description of Unit:

The Groupe des Ecoles des Télécommunications (GET) is made up of six major French graduate schools in the field of Information Technology: Télécom Paris (ENST) - Ecole Nationale Supérieure des Télécommunications, ENST – Bretagne, INT – Evry, ENIC - in partnership with the University of Lille, EURECOM in partnership with EPFL, and IAAI - Institute of Advanced Applications of Internet, in collaboration with the Université d’Aix-Marseille. GET brings together around 50 laboratories addressing all issues related to Information Society Technologies. GET is a federation of 4 to 10 research departments in each GET school, working in the following areas:

Technologies and components: optics, microwave, radio, electromagnetism, electronics, microelectronics, VLSI, MEMs, design and architecture.
Communication, Signal and Image processing: information theory, coding, modulation, detection, compression, classification, speech recognition and synthesis, vision, biometry.
Computer science, software: architecture, operating systems, compilation, agents, objects, software engineering, cognitive sciences, databases, data mining, natural languages, mobile computing.
Protocols and Networks: queuing theory, distributed systems, protocol design, specification, validation, routing, multicast, QoS, administration, planning, intelligent and active networks, security.
Social sciences, law, economy of ICT: micro- and macroeconomic models, competition, industrial strategy, digital economy, tariffs, investment, innovation, information systems, regulation, user behavioural models.

GET resources are on a par with this broad array of activities. GET has 470 full-time professors, 1000 part-time lecturers, and 3000 students, with 1000 graduates per year and 500 PhD students. A total of 50 nationalities are represented within GET, which has 30 international partner universities. GET has established framework agreements with major industrial players in telecommunications and beyond in order to enable smooth technology transfer (e.g. Alcatel, France Telecom, CDC). GET currently holds a portfolio of nearly 50 patents (e.g. MPEG4 and MPEG7 for compression of multimedia objects; Turbocodes for mobile, satellite and radio communications) and has developed a start-up programme for technology transfer. In 2003, revenues from industrial partnerships were circa 9 million euros and revenues from patent licensing 1 million.

GET is the scientific coordinator of two Networks of Excellence in FP6 (EURONGI, Next Generation Internet, and BIOSECURE, Biometrics for secure authentication), is a major partner of the NoEs NEWCOM (Wireless Communications) and SATNEX (Satellite Communications), and participates in several other networks, IPs, STREPs, etc., notably in the fields of broadband communications, security, networked audiovisual systems, home platforms, eLearning, etc.

Telecom Paris (ENST)

The “Ecole Nationale Supérieure des Télécommunications” (Télécom Paris – ENST) is one of the 6 GET graduate schools.
Telecom Paris is a state-of-the-art institution in the fields of technology and innovation and has four major missions, which reflect its pedagogical excellence and its international dimension. These missions are research, engineering studies, research-oriented study, and life-long learning. Among these missions, research, whether methodological/fundamental or applied, is developed in close partnership with national and international research centers and industry.

Signal and image processing and computer science are at the heart of GET research. This research encompasses information theory, coding, modulation, detection, compression, classification, speech recognition and synthesis, audio indexing, vision, biometry, software engineering, cognitive sciences, databases, data mining, and natural languages. GET conducts methodological and technological research in various fields of multimedia signal processing (3D geometric modeling, stochastic modeling, multiple data compression, selective compression, MPEG-4-compliant 3D mesh coding, MPEG-7 indexing, …). GET work on audio is also quite extensive: audio object manipulation, sound scene indexing and analysis, automatic music transcription, and communication and cognition, which concerns modeling language and its cognitive links with images.

The Signal and Image Processing department (TSI) has also developed close links with the Computer Science department of Telecom Paris, which is heavily involved in graphical user interfaces, information visualization and hypermedia, language processing, the interactive Web, information sharing, and database and knowledge base research, thus leading to work on document structure and indexing and on multi-modal human-machine interaction.

Nb: link to emotions…?

Publications

Publications of Ioana VASILESCU

[IV1] Clavel, C., Vasilescu, I., Devillers, L., Ehrette, T. (2004), Fiction database for emotion detection in abnormal situations, ICSLP 2004, Jeju Island, South Korea.
[IV2] Devillers, L., Vasilescu, I. (2004), Anger versus Fear detection in recorded conversations, Speech Prosody 2004, Nara, Japan, March 2004.
[IV3] Devillers, L., Vasilescu, I., Lamel, L. (2003), Emotion detection in a task-oriented dialog corpus, IEEE International Conference on Multimedia and Expo ICME 2003, Baltimore, July 2003.
[IV4] Maddieson, I., Vasilescu, I. (2002), Factors in human language identification, 7th International Conference on Spoken Language Processing, Denver, Colorado, September 2002.
[IV5] Vasilescu, I., Pellegrino, F., Hombert, J-M. (2000), Perceptual features for the identification of Romance Languages, 6th International Conference on Spoken Language Processing, Beijing, China, October 2000.

Publications of Gaël RICHARD

[GR1] R. Badeau, G. Richard and B. David, "Sliding window adaptive SVD algorithms", IEEE Transactions on Signal Processing, vol. 52, no. 1, January 2004.
[GR2] S. Essid, G. Richard, B. David, "Efficient features for musical instrument recognition on solo performances", Proc. of AES 25th International Conference, June 17-19, 2004.
[GR3] O. Gillet and G. Richard, "Automatic transcription of drum loops", International Conference on Acoustics, Speech, and Signal Processing ICASSP'04, Montréal, Québec, May 17-21, 2004.
[GR4] Van den Heuvel H., Boves L., Moreno A., Omologo M., Richard G., Sanders E., "Annotation in the SpeechDat projects", International Journal of Speech Technology, 4, pp. 127-143, 2001.
[GR5] Richard G., d'Alessandro C., "Analysis-synthesis of the Aperiodic Component of Speech", Speech Communication, Sept 1996, Vol 9.

Your research group and HUMAINE

The area of expertise of the ENST researchers in the field of emotion-oriented systems concerns (1) emotion analysis and detection in speech, (2) semantic representation of emotions (for SSMT and for language-independent vocal information encoding for interactive voice servers), and (3) emotion generation (talking faces).
(1) Researchers interested in emotion analysis and detection in speech (I. Vasilescu, G. Richard, C. Clavel and G. Aversano) can make an active contribution to the Humaine network in the processing of the physical signals through which the voice conveys emotion manifestations in vocal interactions, and of their descriptors, with the aim of building emotion-oriented and task-dependent systems. In addition, members of the group are developing a productive partnership with other researchers who are members of Humaine (Laurence Devillers, LIMSI) and with academic (?) and industrial partners (THALES RT).

More precisely, given the thematic areas identified as salient by the Humaine experts for a multidisciplinary study of emotions, our activity concerns, and can contribute to, the analysis and description of the relationship between signal and sign in emotion manifestations in vocal interactions, according to different contexts of emergence. Indeed, it is now generally accepted that emotions are strongly dependent on the corpus employed. Our work is conducted on real-life (call center recordings) and realistic (TV fiction) corpora and focuses on the detection of task-dependent emotionally significant features. As a result, the methodology we have adopted and developed needs to be robust and to take into account the complexity and variability of speech material in which emotionality is not prototypical but mixed with moods, attitudes, personality traits, etc. Consequently, the expertise we can bring is both multidisciplinary (linguistics, speech processing, analysis and synthesis) and supported by experience in dealing with ecological interactions in which the relationship between emotional states and physical signs is highly determined by the situation. The final aim is to develop models of emotional behaviour for use in emotion-oriented systems.

I.
Vasilescu has long experience in analysing and isolating emotionally significant features in speech and in developing a robust and corpus-independent annotation methodology. This experience is the result of a fruitful collaboration with L. Devillers (LIMSI) [IV1-3]. The work of I. Vasilescu and L. Devillers focuses on emotion annotation, analysis and detection in real-life agent-client dialogs recorded in call centers (stock exchange and bank services). It concerns the study of acoustic, linguistic and dialogic cues carrying salient information for emotion characterization in these different natural contexts. The search for emotionally significant features in the voice is supported by the development of an annotation paradigm intended to be applicable to different real-life corpora. The annotation methodology exploits findings from different theories of emotion and makes use of perceptual tests to validate emotion labels (which can be either named or described via abstract dimensions).

A collaborative project between ENST (I. Vasilescu, G. Richard, C. Clavel), LIMSI (L. Devillers) and Thales RT (C. Sedogbo, C. Clavel) extends this research to civil security applications [IV1]. This work focuses on verbal interactions occurring in abnormal situations in which human life could be threatened and which give rise to strong negative emotions such as fear or panic. For this project, a multimodal (audio/video) corpus based on realistic TV fiction is being constructed and annotated.