PTLC2005 Sophie de Abreu & Catherine Mathon Can you hear I'm angry?
Can you hear I’m angry?
Perception of anger in a spontaneous French corpus
by Portuguese learners of French as a foreign language.
Sophie de Abreu, Catherine Mathon
Université Paris 7
1 Introduction
One difficulty facing a learner of a foreign language (whose intention is to communicate)
is the need to perceive the emotions of her or his interlocutor in order to react
appropriately to the situation. Little work has been done on the perception and
production of emotions in a foreign language, even though the role of prosody in
communication is well known (Chun, 2002). Communication is the link between foreign
language teaching and the prosodic characterization of emotions in speech (Galazzi &
Guimbretière, 1994). The aim of this paper is to show that prosody provides enough
information to allow learners of French as a Foreign Language (FFL) to recognize the
emotion of anger in a spontaneous French corpus. This hypothesis was tested on a group
of Portuguese learners.
2 Experimental protocol
2.1 Corpus
As we wanted to work with a spontaneous French corpus containing “real” emotion (not
“fake” or acted emotion), we used a corpus based on radio hoaxes (online at Fun Radio,
http://www.funradio.fr). A radio host creates misunderstandings in order to lead the
victims of the hoaxes to express anger. He calls institutions (high schools, hospitals) or
professionals (bakers, taxidermists, banks...) and, playing the role of a client, asks for
something which does not fit the situation. We concentrated our work on the
victim’s speech on account of its more spontaneous character.
We collected 24 hoaxes (1 hour 4 minutes of speech) and transcribed 15 of the 24
dialogues using the program Transcriber 1.4. The dialogues were chosen because they
express anger strongly. The chosen dialogues represent about 40 minutes of speech.
A pretest was conducted in order to assign a degree to the emotion conveyed by the
sentences of the corpus. We selected the final sentences from the answers given in this
pretest: when a sentence was judged to be “anger” by 80% of the listeners, it was kept
for the final test. The final corpus contains 13 sentences judged as “anger”, 13
sentences judged as “no anger” and 5 training sentences.
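The selection rule above can be sketched as follows. This is only an illustration: the data layout, the function names and the reading of the 80% criterion as "at least 80% of listeners" are assumptions, and the toy answers are invented.

```python
from typing import Dict, List

ANGER_THRESHOLD = 0.80  # share of pretest listeners; ">=" is an assumption


def anger_share(labels: List[str]) -> float:
    """Fraction of pretest listeners who labelled a sentence as anger."""
    return sum(label == "anger" for label in labels) / len(labels)


def select_anger_sentences(pretest: Dict[str, List[str]]) -> List[str]:
    """Keep the sentences whose anger share reaches the threshold."""
    return [sid for sid, labels in pretest.items()
            if anger_share(labels) >= ANGER_THRESHOLD]


# Toy pretest answers (invented for illustration, not the study's data).
toy_pretest = {
    "s01": ["anger"] * 9 + ["no anger"],       # 9 of 10 listeners: kept
    "s02": ["anger"] * 5 + ["no anger"] * 5,   # 5 of 10 listeners: rejected
    "s03": ["anger"] * 8 + ["no anger"] * 2,   # 8 of 10 listeners: kept
}
selected = select_anger_sentences(toy_pretest)
```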
2.2 Stimuli
To show the role of prosody in the recognition of anger by our Portuguese group of FFL
learners, we had to isolate prosodic information from the other linguistic parameters.
There are three main methods for doing this. As we did not intend to modify the
spontaneous prosody of the sentences, we rejected re-synthesis, which does not preserve
the authentic character of the document. We also decided not to use a low-pass frequency
filter. This method works rather well when intensity is not an important criterion.
However, as intensity represents an
essential parameter of anger (Bänziger & Scherer, 2003; Scherer, 1995), we could not
use a masking method which eliminates energy at high frequencies. We finally chose to
mask the linguistic content of the sentences with white noise (Miller and Licklider, 1950)
in order to keep only the prosodic information.
For each sentence selected from the pretest, we added white noise to the original
sentence using the software Soundforge 7.0. The white noise had the same duration as
the original sentence. We set the intensity of the white noise according to the
intensity of the speaker’s voice in each sentence, in order to mask the segmental
content of the sentences as much as possible. We finally mixed the sentences and the
white noise to obtain our stimuli.
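The stimuli themselves were built in Soundforge 7.0. As a rough illustration of the same idea, the sketch below mixes a signal with Gaussian white noise of equal length, scaled relative to the RMS level of the speech. The function names, the `noise_ratio` parameter and the synthetic tone standing in for a sentence are all assumptions; the paper does not report the noise level actually used.

```python
import math
import random
from typing import List


def rms(samples: List[float]) -> float:
    """Root-mean-square level of a signal (a simple intensity measure)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def mix_with_white_noise(speech: List[float], noise_ratio: float = 1.0,
                         seed: int = 0) -> List[float]:
    """Add white noise of the same length as the speech signal.

    noise_ratio is the noise RMS divided by the speech RMS (hypothetical
    parameter, not a value taken from the study).
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in speech]   # same length as the speech
    gain = noise_ratio * rms(speech) / rms(noise)   # scale noise to the target level
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy "sentence": one second of a 200 Hz tone sampled at 8 kHz.
rate = 8000
speech = [0.5 * math.sin(2 * math.pi * 200 * t / rate) for t in range(rate)]
stimulus = mix_with_white_noise(speech, noise_ratio=1.0)
```

Scaling the noise per sentence, rather than using one fixed level, mirrors the choice described above of following the intensity of the speaker's voice in each sentence.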
Finally, we note that the constructed stimuli were usually perceived by the listeners as
sounds of poor quality. The 26 chosen stimuli were each presented twice in order to
check the consistency of the answers, randomized, and preceded by 5 training sentences.
The test contains 57 stimuli and is about 8 minutes long.
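The composition of the test sequence can be sketched as follows. The stimulus names, the function name and the fixed seed are hypothetical; the paper does not say how the randomization was carried out.

```python
import random
from typing import List


def build_test_sequence(test_items: List[str], training_items: List[str],
                        seed: int = 0) -> List[str]:
    """Present every test stimulus twice in random order, after the training items."""
    doubled = test_items * 2
    random.Random(seed).shuffle(doubled)
    return training_items + doubled


test_items = [f"stimulus_{i:02d}" for i in range(1, 27)]   # 26 stimuli
training_items = [f"training_{i}" for i in range(1, 6)]    # 5 training sentences
sequence = build_test_sequence(test_items, training_items)
print(len(sequence))  # 2 * 26 + 5 = 57
```

A plain shuffle may place the two presentations of a stimulus next to each other; whether the original test prevented this is not stated.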
3 Perception test
3.1 Task and subjects
The listeners had to perform a twofold task:
- First, a decision task: determine whether the stimulus conveys anger or not.
- Then, if they decided the stimulus conveyed anger, an evaluation task: rate the
degree of anger.
The subjects were warned beforehand of the poor quality of the sounds, in order to avoid
a lengthy adaptation to the stimuli. Two groups of listeners were invited to take the
perception test. The first group is composed of 10 native speakers of French (6 women
and 4 men) and is our control group.
The Portuguese speakers, 7 women and 3 men, are our test group. The subjects are
students in the same class at the Faculdade de Letras, Universidade de Lisboa. Their
level corresponds to B2 according to the European Language Portfolio (see
reference list). It was important to have listeners with an intermediate level of French
because we assume that beginners do not have enough knowledge to interpret the
degrees of anger, while advanced students would have too much knowledge of
prosody to show significant results. However, this hypothesis must be treated with
caution and remains to be tested.
3.2 Interface
Since our work is set in a teaching perspective, we wanted to build an interface suited
to language teaching, and to the teaching of emotions in particular. We took great care
to control the effects of the test interface.
The perception test, based on the stimuli described above, was presented on a
computer. The main problem we faced was deciding which language to use in the
interface. Since the subjects are learners of FFL, they do understand French, but it
seemed wiser to give them the instructions in their native language, in
order to be sure there was no confusion about the task. The decision was more difficult
for the questions used during the test, since the sounds of the test are in
French. It was important to avoid cognitive overload due to constant code switching.
We finally chose to use more visual instructions during the test, reusing images that
are well known to internet users: emoticons. We used three images: one representing
“anger”, another “something other than anger”, and a third one to play the
sound. The three icons were presented in the instructions for the task. We also used
another technique to avoid cognitive overload: once a step was completed,
the emoticon turned grey, indicating that the step could not be redone.
This interface was designed so that it can also be used to test other languages and
other types of emotion. It was created specifically for applications in foreign language
teaching and learning.
Figure 1: Picture of the interface of the perception test: examples of icons
presented to the subjects.
4 Results
To be sure the answers of both groups were reliable, we presented each stimulus twice
and checked that the responses were the same. We computed the mean of the answers for
each of the two judgments and observed that the two means are close (delta = 0.6). In
detail, we noted that for both groups the majority of answers were the same the first and
the second time and that, when they differed, they most often varied by only one
interval (Anger 1-Anger 2 or Anger 4-Anger 5, for example). This high level of
consistency shows that the control group is reliable. We could therefore examine the
answers of the Portuguese group in comparison with the French ones. First, the French
listeners detected anger in 62% of cases, whereas the Portuguese perceived it in only
50%. The graph below shows the details of the answers for each group.
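The consistency check on repeated stimuli can be sketched as follows. The numeric coding (0 for No Anger, 1 to 5 for Anger 1 to Anger 5), the function name and the toy answers are assumptions made for illustration; in particular, counting No Anger and Anger 1 as one interval apart is a simplification.

```python
from typing import List, Tuple


def consistency(first: List[int], second: List[int]) -> Tuple[float, float]:
    """Return (share of identical answers, share differing by at most one step)."""
    pairs = list(zip(first, second))
    same = sum(a == b for a, b in pairs) / len(pairs)
    within_one = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
    return same, within_one


# Toy answers of one listener to the same four stimuli, heard twice
# (0 = No Anger, 1..5 = Anger 1..Anger 5; invented values).
first_pass = [0, 2, 5, 4]
second_pass = [0, 1, 5, 2]
same, within_one = consistency(first_pass, second_pass)
```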
[Graph: vertical axis "cell mean" (0 to 300); horizontal axis: No Anger, Anger 1,
Anger 2, Anger 3, Anger 4, Anger 5; series: French, Portuguese]
Graph 1: Distribution of the answers for each group according to judgment.
The first difference we can observe between the two groups concerns the choice No
Anger. The Portuguese chose No Anger more often than the French (50% for the
Portuguese vs 37% for the French). Conversely, French speakers detected weak anger
(A1 and A2) more often than the Portuguese: for A1 and A2 there is a difference of 10%
between the two groups. From these results, we can suppose that the French perceive
anger with more precision, while the Portuguese group more often classifies weak anger
as No Anger. For strong anger (A4 and A5), we observe no significant difference between
the French and the Portuguese. We can conclude that it is easier for Portuguese learners
of FFL to judge strong anger than weak anger.
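Percentages of this kind can be derived from the raw answers as in the sketch below. The coding of the answers (0 for No Anger, 1 to 5 for the degrees of anger), the function name and the toy data are assumptions; the figures produced are not the study's results.

```python
from collections import Counter
from typing import Dict, List

CATEGORIES = ["No Anger", "Anger 1", "Anger 2", "Anger 3", "Anger 4", "Anger 5"]


def distribution(answers: List[int]) -> Dict[str, float]:
    """Percentage of answers in each category (0 = No Anger, 1..5 = Anger 1..5)."""
    counts = Counter(answers)
    return {name: 100.0 * counts[code] / len(answers)
            for code, name in enumerate(CATEGORIES)}


# Invented answers for illustration only, not the study's data.
toy_answers = [0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
shares = distribution(toy_answers)
anger_rate = 100.0 - shares["No Anger"]   # share of stimuli judged as some degree of anger
```

Running the same function on the answers of each group gives the two series compared in this section.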
5 Discussion
In order to examine the influence of the segmental information which could have been
perceived despite the white noise, we carried out an additional test in written form
with the French group. The comparison of their answers in the two conditions showed
that some lexical information had been perceived in sentences where intensity was too
high: 3 sentences were recognized as “anger” at 100% in both conditions. For the
same sentences, the percentage is always lower for the Portuguese group. This result
points out a weakness of the white noise masking method: the white noise must be lower
than the average intensity of the original sound, or the listeners will not hear any
prosody at all, especially if the listeners are learners of FFL. Work is in progress to
compare this information with another masking method.
Moreover, a comparison of the answers of the two groups showed that 6 of the 26
sentences have an inverted result. We looked at the acoustic measures to explain these
divergences. One sentence was recognized as “Anger” by the Portuguese and “No
anger” by the French listeners. There is no acoustic evidence for that inverted result.
We hypothesize that the confusion may be due to how the Portuguese express
anger in their native language. For the 5 sentences recognized as “Anger” by the French
and “No anger” by the Portuguese, we noted some disfluencies (pauses, repetitions,
fillers, performance errors). It seems that, unlike French listeners, the Portuguese do
not consider a sentence to be said with anger when it contains disfluencies. Work is in
progress to evaluate the importance of intensity, speech rate and articulation rate in
these cases.
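One way to locate such inverted results is sketched below. It assumes that a sentence counts as recognized as “Anger” by a group when more than half of the group's answers are a degree of anger; the paper does not state its criterion, and the names and toy answers are invented.

```python
from typing import Dict, List


def majority_is_anger(answers: List[int]) -> bool:
    """True when more than half of the answers are an anger degree (code > 0)."""
    return sum(a > 0 for a in answers) > len(answers) / 2


def inverted_sentences(french: Dict[str, List[int]],
                       portuguese: Dict[str, List[int]]) -> List[str]:
    """Sentences whose majority decision differs between the two groups."""
    return [sid for sid in french
            if majority_is_anger(french[sid]) != majority_is_anger(portuguese[sid])]


# Toy answers per sentence (0 = No Anger, 1..5 = Anger 1..Anger 5; invented values).
toy_french = {"s01": [3, 2, 0, 4], "s02": [0, 0, 1, 0]}
toy_portuguese = {"s01": [0, 0, 1, 0], "s02": [0, 0, 0, 2]}
inverted = inverted_sentences(toy_french, toy_portuguese)
```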
6 Conclusions
Following the results of the test in the written condition, we are planning to compare
the results obtained with other masking methods. However, we showed that prosodic
information is sufficient to allow subjects to recognize anger. Even
Portuguese FFL learners were able to distinguish Anger from No Anger and to give a
fairly appropriate evaluation.
This study is part of a larger project whose aims are:
- Opening the perceptual test to other language groups and to other emotions.
Work is in progress with a group of Czechs.
- Working on the reproduction of these angry sentences by the foreign
language speakers, correlated with an analysis of the prosody of their native
language.
7 References
Bänziger, T. and Scherer, K. R. (2003). Relations entre caractéristiques vocales
perçues et émotions attribuées. Actes des Journées Prosodie 2001, Grenoble, France,
119-124.
Chun, D. M. (2002). Discourse Intonation in L2. Amsterdam: John Benjamins (Language
Learning & Language Teaching, vol. 1).
Galazzi, E. and Guimbretière, E. (1994). Intonation et attitudes: une question de
perception. Studi di Linguistica, Storia della lingua Filologia francesi, Edizioni
dell’Orso, Milan.
Miller, G. A. and Licklider, J. C. R. (1950). The intelligibility of interrupted speech.
J. Acoust. Soc. Am. 22: 167-173.
Scherer, K. R. (1995). How emotion is expressed in speech and singing. Proceedings of
ICPhS 95, Stockholm, vol. 3, 90-96.
European Language Portfolio: http://www.enpc.fr/fr/international/eleves_etrangers/portfolio.pdf