PTLC2005 Sophie de Abreu & Catherine Mathon Can you hear I'm angry?

Can you hear I’m angry? Perception of anger in a spontaneous French corpus by Portuguese learners of French as a foreign language

Sophie de Abreu, Catherine Mathon
Université Paris 7

1 Introduction

One difficulty that a learner of a foreign language (whose intention is to communicate) has to face is the need to perceive the emotions of his or her interlocutor in order to react appropriately to the situation. Little work has been done on the perception and production of emotions in a foreign language, even though the role of prosody in communication is well known (Chun, 2002). Communication is the link between this area of foreign language teaching and the prosodic characterization of emotions in speech (Galazzi & Guimbretière, 1994). The aim of this paper is to show that prosody can provide enough information to allow learners of French as a Foreign Language (FFL) to recognize the emotion of anger in a spontaneous French corpus. This hypothesis was tested on a group of Portuguese learners.

2 Experimental protocol

2.1 Corpus

As we wanted to work with a spontaneous French corpus containing “real” emotion (not “fake” or acted emotion), we used a corpus based on radio hoax calls (Fun Radio, online at http://www.funradio.fr). A radio host creates miscommunication in order to lead the victims of the hoaxes to express anger. He calls institutions (high schools, hospitals) or professionals (bakers, taxidermists, banks...) and, playing the role of a client, asks for something that does not fit the situation. We concentrated our work on the victims’ speech on account of its more spontaneous character. We collected 24 hoaxes (1 hour 4 minutes of speech) and transcribed 15 of the 24 dialogues using the program Transcriber 1.4. The dialogues were chosen for their highly expressive anger. The chosen dialogues represent about 40 minutes of speech.
A pretest was conducted in order to assign a degree to the emotion conveyed by the sentences of the corpus. We selected the final sentences from the answers given in this pretest: when a sentence was labelled “anger” by 80% of the listeners, it was kept for the final test. The final corpus contains 13 sentences judged as “anger”, 13 sentences judged as “no anger” and 5 training sentences.

2.2 Stimuli

To show the role of prosody in the recognition of anger by our group of Portuguese FFL learners, we had to isolate prosodic information from the other linguistic parameters. There are three main methods for doing so. As we did not intend to modify the spontaneous prosody of the sentences, we rejected re-synthesis, which does not preserve the authentic character of the document. We also decided not to use a low-pass filter. This method is rather good when intensity is not an important criterion. However, as intensity is an essential parameter of anger (Bänziger & Scherer, 2003; Scherer, 1995), we could not use a masking method that eliminates energy in the high frequencies. We finally chose to hide the linguistic content of the sentences with white noise (Miller and Licklider, 1950) in order to keep only the prosodic information.

For each sentence selected from the pretest, we added white noise to the original sentence using the software Soundforge 7.0. The white noise created had the same duration as the original sentence. We set the intensity of the white noise according to the intensity of the speaker’s voice in each sentence, in order to hide the segmental content of the sentences as much as possible. We then mixed sentences and white noise to obtain our stimuli. Finally, we note that the constructed stimuli were usually perceived by the listeners as sounds of poor quality.
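The mixing step described above can be illustrated with a short sketch. The stimuli in this study were built in Soundforge 7.0, so the Python below is not the procedure actually used. It assumes that the sentence is available as a list of float samples, that the white noise is Gaussian, and that "intensity" is measured as RMS level. The function names and the `noise_offset_db` parameter are hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_with_white_noise(speech, noise_offset_db=0.0, seed=0):
    """Add Gaussian white noise of the same length as `speech`.

    The noise RMS is set relative to the speech RMS. `noise_offset_db`
    is a hypothetical knob (0 dB = equal RMS), not a value from the paper.
    """
    rng = random.Random(seed)
    # One independent Gaussian sample per speech sample: same duration.
    noise = [rng.gauss(0.0, 1.0) for _ in speech]
    target_rms = rms(speech) * 10 ** (noise_offset_db / 20)
    gain = target_rms / rms(noise)
    # Sample-wise sum of the sentence and the scaled noise.
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-in for one sentence: 1 s of a 200 Hz tone at 16 kHz.
rate = 16000
speech = [0.5 * math.sin(2 * math.pi * 200 * t / rate) for t in range(rate)]
stimulus = mix_with_white_noise(speech, noise_offset_db=0.0)
```

Setting the noise level per sentence, relative to that sentence's own level, mirrors the idea of following the intensity of the speaker's voice. The offset that best masks segmental content while leaving prosody audible would have to be found by listening.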
The 26 chosen stimuli were doubled in order to verify the consistency of the answers, randomized, and preceded by 5 training sentences. The test contains 57 stimuli and is about 8 minutes long.

3 Perception test

3.1 Task and subjects

The listeners had to perform a double task:
- First, a decision task, to determine whether the stimulus conveys anger or not.
- If they decided that the stimulus conveyed anger, the subjects had to evaluate the degree of anger (evaluation task).
The subjects were warned beforehand of the poor quality of the sounds, in order to avoid too long an adaptation to the stimuli.

Two groups of listeners were invited to take the perception test. The first group is composed of 10 native speakers of French (6 women and 4 men) and is our control group. The Portuguese speakers, 7 women and 3 men, are our test group. These subjects are students in the same class at the Faculdade de Letras, Universidade de Lisboa. Their level corresponds to B2 according to the European portfolio of language (see reference list). It was important to have listeners with an intermediate level of French because we consider that beginners do not have enough knowledge to interpret the degrees of anger, while advanced students would have too much knowledge about prosody to show significant results. However, this hypothesis has to be treated with caution and remains to be tested.

3.2 Interface

Since we placed our work in a teaching perspective, we wanted to build an interface related to language teaching, and to emotions in particular. We were very careful to control the effects of the test interface. The perception test, based on the stimuli described above, was presented on a computer. The main problem we faced was deciding which language to use in the interface. Since the subjects are FFL learners, they do understand French, but it seemed wiser to give them the instructions in their native language, in order to be sure that there was no confusion about the task.
The decision was more difficult for the questions used during the test, since the sounds of the test are in French. It was important to avoid a cognitive overload due to constant code switching. We finally chose to use more visual instructions during the test, reusing images well known to internet users: emoticons. We used three images: one representing “anger”, another “something other than anger”, and a third one for playing the sound. The three icons were presented in the instructions for the task. We also used another technique to avoid cognitive overload: when a step was completed, the emoticon turned grey, indicating that it could not be redone. This interface was designed so that it can be used to test other languages and other types of emotion. It was created specifically for applications in foreign language teaching and learning.

Figure 1: Picture of the interface of the perceptual test: examples of icons presented to the subjects.

4 Results

To be sure that the answers of both groups were reliable, we doubled the stimuli and verified that the responses were the same. We computed the mean of the answers for each judgement and observed that the two means are close enough (delta = 0.6). In detail, we noted that the majority of answers in both groups were the same the first and the second time and that, when they differed, they most often varied by just one interval (Anger 1-Anger 2 or Anger 4-Anger 5, for example). This high level of consistency shows that the control group is reliable. We could therefore examine the answers of the Portuguese group and compare them with those of the French group.

First, the French listeners detected anger in 62% of cases, whereas the Portuguese listeners perceived it in only 50%. The graph below shows the details of the answers for each group.

Graph 1: Distribution of the answers for each group by judgement (categories: No Anger, Anger 1 to Anger 5; groups: French, Portuguese).

The first difference we can observe between the two groups concerns the choice No Anger. The Portuguese listeners chose No Anger more often than the French (50% for the Portuguese vs 37% for the French). The French speakers, however, detected weak anger (A1 and A2) more often than the Portuguese: for A1 and A2 there is a difference of 10% between the two groups. From these results, we can suppose that the French listeners perceive anger with more precision, while the Portuguese group more often classifies weak anger as No Anger. For strong anger (A4 and A5), we observe no significant difference between the French and the Portuguese. We can conclude that Portuguese learners of FFL find strong anger easier to judge than weak anger.

5 Discussion

In order to examine the influence of segmental information that could have been perceived despite the white noise, we ran an additional test with the French group in written conditions. The comparison of their answers in the two conditions showed that some lexical information had been perceived in sentences where intensity was too high: 3 sentences were recognized at 100% as “anger” in both conditions. For the same sentences, the percentage is always lower for the Portuguese group. This result points to a weakness of the white noise masking method: white noise must be lower than the average intensity of the original sound or the listeners will not hear any prosody at all, especially if the listeners are learners of FFL. Work is in progress to compare these results with another masking method.

Moreover, a comparison of the answers of the two groups showed that 6 of the 26 sentences produced inverted results. We looked at the acoustic measurements to explain these divergences.
One sentence was recognized as “Anger” by the Portuguese and as “No anger” by the French listeners. There is no acoustic evidence for this inverted result. We hypothesize that the confusion may be due to how Portuguese speakers express anger in their native language. For the 5 sentences recognized as “Anger” by the French and as “No anger” by the Portuguese, we noted some disfluencies (pauses, repetitions, fillers, performance errors). It seems that, unlike the French listeners, the Portuguese listeners do not consider a sentence to be said with anger when it contains disfluencies. Work is in progress to evaluate the importance of intensity, speech rate and pronunciation rate in these cases.

6 Conclusions

Following the results of the test in written conditions, we are planning to compare the results obtained with other masking methods. However, we showed that prosody carries enough information to allow subjects to recognize anger. Even Portuguese FFL learners were able to distinguish Anger from No Anger and to give a fairly appropriate evaluation. This study is part of a larger project whose aims are:
- Opening the perceptual test to other language groups and to other emotions. Work is in progress with a group of Czechs.
- Working on the reproduction of these angry sentences by the foreign language speakers, correlated with an analysis of the prosody of their native language.

7 References

Bänziger, T. and Scherer, K. R. (2003). Relations entre caractéristiques vocales perçues et émotions attribuées. Actes des Journées Prosodie 2001, Grenoble, France, 119-124.
Chun, D. M. (2002). Discourse Intonation in L2. Amsterdam: John Benjamins (Language Learning & Language Teaching, vol. 1).
Galazzi, E. and Guimbretière, E. (1994). Intonation et attitudes: une question de perception. Studi di Linguistica, Storia della lingua Filologia francesi, Edizioni dell’Orso, Milan.
Miller and Licklider (1950). The intelligibility of interrupted speech. J. Acoust. Soc. Am. 22: 167-173.
Scherer, K. R. (1995). How emotion is expressed in speech and singing. Proceedings of ICPhS 95, Stockholm, vol. 3, 90-96.
European portfolio of language: http://www.enpc.fr/fr/international/eleves_etrangers/portfolio.pdf