Production and perception of stop consonants by a
hearing-impaired Brazilian subject
PaPI - 2005
Luisa Barzaghi, Sandra Madureira
Pontifícia Universidade Católica de São Paulo – Brasil
luisa@pucsp.br
INTRODUCTION
The relationship between hearing impairment and speech perception and production has
not yet been fully understood. Studies of the speech production of hearing-impaired
subjects have pointed out the following aspects: difficulties in implementing
the voicing contrast (Monsen, 1976; McGarr, Löfqvist, 1982, 1988); lack of transitional
cues, indicating a lower degree of coarticulation (Tye-Murray, 1987); differences
regarding F2 trajectories; longer duration of segments (Okalidou, Harris, 1999);
stereotyped displacements of the tongue body and excessive opening of the jaw
(Tye-Murray, 1991). On the other hand, studies on the auditory perception of stop
consonants and other speech segments by hearing-impaired subjects show that
perception is affected by many factors and that the degree of hearing loss alone does not
account for the variability in identifying and discriminating sounds, although, in
general, performance gets worse as the hearing loss increases (Tsui, Ciocca, 2000; Turner, Brus,
2001). Revoile et al. (1982, 1999) show that hearing-impaired subjects tend to rely more
on temporal cues and less on spectral cues to identify the voicing contrast in English. In
identifying the stop place of articulation contrast, hearing-impaired subjects are judged
to rely more on transition cues than on the burst information (Sammeth et al., 1999).
OBJECTIVE AND THEORETICAL BACKGROUND
The aim of this experimental study, which follows Madureira, Barzaghi and Mendes
(2002), is to investigate the production and perception of stop consonants of
Brazilian Portuguese (PB) by a hearing-impaired subject (HI) in an interrelated
way. A normal-hearing subject's (NH) speech productions are taken as reference
for the spectrographic analysis and also used as stimuli for the
discrimination and identification tasks.
The focus is on place and voicing contrasts. Spectrographic analysis and
identification and discrimination tests are carried out. The Acoustic Theory of
Speech Production (Fant, 1960) and Articulatory Phonology (Browman & Goldstein,
1986; 1990; 1992; Goldstein & Fowler, 2003) provide the theoretical background for
the interpretation of the findings in this work.
METHOD
The corpus and subjects were the same as those used in a previous study (Madureira, Barzaghi
& Mendes, 2002). All the subjects were native speakers of PB, from São Paulo, Brazil.
Production Task:
Two subjects participated in the production task: (1) a normal-hearing (NH) 35-year-old
woman with clear speech; (2) a hearing-impaired (HI) 16-year-old female with bilateral
sensorineural hearing loss, severe in the right ear and profound in the left, acquired at
18 months of age after meningitis. At the time of recording, she was attending a regular
school and had been engaged in an oral approach rehabilitation program at Derdic-PUC/SP
for 14 years. The corpus consisted of ten repetitions of two-syllable words beginning with
one of the six stops of PB (/p/, /b/, /t/, /d/, /k/, /g/), inserted in the carrier phrase
Diga ____ baixinho. The words are paroxytones, the most frequent
stress pattern in PB (Albano, 1995), and have a CVCV pattern (V=/a/; /t/ as second
consonant). The recordings were made in a studio, with ten repetitions of the six sentences,
randomly organized in lists and digitized at a 22 kHz sampling rate with
CSL, Kay Elemetrics. The acoustic analysis (Multispeech, Kay Elemetrics) included:
f0 and F1 frequency at the onset of the vowels following the stop consonants; relative
duration of the stop consonants and their adjacent segments; F1, F2 and F3 of the
vowels preceding and following the stop consonants; F1, F2 and F3 transitions from the
stop consonants into the following vowels. Statistical analysis: to compare
acoustic parameters related to voicing and place contrasts in the production of each
subject, a one-way ANOVA and Scheffé's post hoc comparison test (alpha set at 0.05)
were used.
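As an illustration of the comparison described above, a one-way ANOVA over six word groups with ten repetitions each has 5 and 54 degrees of freedom. The following is a minimal sketch in Python, not the authors' procedure: the f0 values are invented for illustration, the function name `one_way_anova` is hypothetical, and the Scheffé post hoc step is not included.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented f0 onset values (Hz): one centre value per word, ten repetitions each.
made_up_centres = {"pata": 240, "bata": 200, "tata": 245,
                   "data": 205, "cata": 242, "gata": 198}
offsets = [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
groups = [[centre + d for d in offsets] for centre in made_up_centres.values()]

f_stat, df_between, df_within = one_way_anova(groups)
print(f"F({df_between},{df_within}) = {f_stat:.3f}")
```

With six groups of ten values the sketch reports the F ratio with 5 and 54 degrees of freedom; a significant F would then be followed by pairwise post hoc comparisons between minimal pairs.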
Identification Task
One hundred and twenty normal-hearing undergraduate students acted as judges and
evaluated the oral productions of both participants. The 60 students who judged the HI
subject's productions were not familiar with the speech of hearing-impaired subjects. The
six sentences of the fifth recorded list were randomly presented to the judges through
headphones connected to a computer in a silent room. The intensity was adjusted to the
most comfortable level for each judge. The judges were told to write down what they
had heard even if it did not sound like a familiar word.
Discrimination Task
In order to analyze the perception of the stop consonants by the HI subject, a closed-set,
forced-choice discrimination test was developed (Boothroyd, 1985, 1996). The
hearing-impaired subject evaluated both the normal-hearing subject's productions and her own.
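Discrimination scores in this study are reported as a percentage of right answers above chance level, but the correction itself is not spelled out. One common correction for forced-choice tests rescales the raw proportion correct so that chance performance maps to 0 and perfect performance to 100. The sketch below assumes that correction; the function name, the number of alternatives and the trial counts are hypothetical and are not taken from the study.

```python
def percent_above_chance(proportion_correct, n_alternatives):
    """Rescale a raw proportion correct so that chance maps to 0 and perfect to 100."""
    chance = 1.0 / n_alternatives
    corrected = (proportion_correct - chance) / (1.0 - chance)
    # Scores at or below chance are reported as 0.
    return max(0.0, 100.0 * corrected)


# Hypothetical two-alternative example: 15 correct answers out of 20 trials.
print(percent_above_chance(15 / 20, n_alternatives=2))  # 50.0
```

Under this assumption a score of 0 means performance no better than guessing, which is how "no correct answers above chance level" can be read.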
RESULTS
The results of the acoustic analysis and of the identification and discrimination tests
indicated that the HI subject neither produced nor perceived the voicing contrasts: for the two
acoustic voicing correlates investigated in this study (f0 and F1 frequency at the onset of
the following vowel, Fig. 1) no significant differences between minimal pairs were found, in
accordance with the results of the analysis of VOT and consonant duration presented in
our previous study based on the same data. For the NH subject significant differences
were found regarding f0 (F(5,54)=30.511, p<0.001; Scheffé: /p x b/, /t x d/ and /k x g/,
p<0.001) and F1 onset measures (F(5,54)=119.970, p<0.001; Scheffé: /p x b/, p<0.05; /t x
d/ and /k x g/, p<0.001). In the identification task all the NH subject's productions were
correctly identified by the 60 judges.
[Figure 1: two panels, f0 at vowel onset (Hz) and F1 at vowel onset (Hz), each showing
values for [pa], [ba], [ta], [da], [ka] and [ga] for the Hearing Impaired Subject and the
Normal Hearing Subject.]
Figure 1 – f0 and F1 frequency values at the onset of the stressed following vowel in
words “pata, bata, tata, data, cata, gata” for the hearing-impaired (HI) and
normal-hearing (NH) subjects
In the identification task the voiced stop consonants produced by the HI subject were
correctly identified by the judges in only 6% of the presentations (11 of 180) (Table 1). In the
voicing discrimination task the HI subject showed no correct answers above chance
level, either when evaluating the NH subject's productions or her own (Table 2).
Table 1: Judges' (n=60) responses to the stop consonants as produced by the
hearing-impaired subject (HI). Rows are the judges' responses; columns are the words produced.

Response   pata  bata  tata  data  cata  gata  total
pata         51    46    10     5     7     3    122
bata          4     4     4     2     1     1     16
tata          0     2    31    34     0     1     68
data          0     1     5     7     1     0     14
cata          3     7     8     8    50    55    131
gata          0     0     0     0     1     0      1
others        2     0     2     4     0     0      8
total        60    60    60    60    60    60    360
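The totals quoted in the text for Table 1 (11 of 180 voiced stops correctly identified; 105, 77 and 106 of 120 for place) can be re-derived from the cell counts. The following is a minimal sketch, assuming the cell assignment that is consistent with the printed row totals and with 60 responses per word; the variable and function names are illustrative only.

```python
WORDS = ["pata", "bata", "tata", "data", "cata", "gata"]

# RESPONSES[r][i] = number of judges who wrote r when WORDS[i] was played.
# Cell placement is an assumption checked against the printed totals.
RESPONSES = {
    "pata":  [51, 46, 10, 5, 7, 3],
    "bata":  [4, 4, 4, 2, 1, 1],
    "tata":  [0, 2, 31, 34, 0, 1],
    "data":  [0, 1, 5, 7, 1, 0],
    "cata":  [3, 7, 8, 8, 50, 55],
    "gata":  [0, 0, 0, 0, 1, 0],
    "other": [2, 0, 2, 4, 0, 0],
}
PLACE = {"pata": "bilabial", "bata": "bilabial", "tata": "alveolar",
         "data": "alveolar", "cata": "velar", "gata": "velar"}
VOICED = ["bata", "data", "gata"]


def voiced_correct():
    """Count voiced words that were written down exactly as produced."""
    return sum(RESPONSES[w][WORDS.index(w)] for w in VOICED)


def place_correct(place):
    """Count responses whose initial stop shares the place of the word produced."""
    members = [w for w in WORDS if PLACE[w] == place]
    return sum(RESPONSES[r][WORDS.index(s)] for r in members for s in members)


print(voiced_correct(), "of", 60 * len(VOICED))
for place in ("bilabial", "alveolar", "velar"):
    n = place_correct(place)
    print(place, n, "of 120", f"{100 * n / 120:.1f}%")
```

Counting place this way ignores voicing errors, so a "bata" heard as "pata" still counts as a correct bilabial.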
Table 2: Discrimination task results: hearing-impaired subject's (HI) responses
(% of right answers above chance level) to the NH subject's and her own stop
consonant productions

CONTRAST                      NH subject's       HI subject's (own)
                              productions (%)    productions (%)
Voicing                       0.2                0.1
Place - Bilabial x Alveolar   35                 0
Place - Bilabial x Velar      95                 20
Place - Alveolar x Velar      60                 0
On the other hand, the place of articulation contrast was better produced and perceived
than the voicing contrast: judges correctly identified the place of the bilabial stop
consonants in 87% of the presentations (105 of 120), of the alveolar in 64% (77 of 120) and
of the velar in 88% (106 of 120) (Table 1). The place of articulation contrast was also better
discriminated by the HI subject, who presented a higher percentage of expected correct
answers in the discrimination test when the contrast involved the bilabial and velar places
of articulation (bilabial x velar: 95%; bilabial x alveolar: 35%; alveolar x velar: 60%). She
discriminated the NH subject's productions of plosives better than her own (bilabial x
alveolar and alveolar x velar: no correct answers above chance level; bilabial x velar: 20%)
(Table 2). These results were interpreted in relation to the characteristics of the oral
production of stops by the HI subject: the second formant transitions were flat and the first
and third formants undifferentiated (Fig. 2); the formant onset frequencies were also
undifferentiated across place, as shown in Fig. 3 (significant differences were found
for the normal-hearing subject's productions: for F2, F(5,54)=260.548, p<0.001);
segments in stressed position were found to be very long (Fig. 4); vowels in stressed
position, following all plosives, exhibited very high F1 frequency values, which was
interpreted as suggesting a greater magnitude of jaw opening (Fig. 5). Figures 4 and 5
present data related to bilabial plosives, but equivalent results were found for the
alveolar and velar places of articulation.
[Figure 2: two panels, NH and HI, showing the F1, F2 and F3 transitions (Hz, axis from 0 to
3200) for [pa], [ba], [ta], [da], [ca] and [ga].]
Figure 2 - Frequency values of the F1, F2 and F3 transitions between speech segments in
stressed position in the target words “pata, bata, tata, data, cata, gata” for the
normal-hearing (NH) and hearing-impaired (HI) subjects
[Figure 3: two panels, Hearing Impaired Subject and Normal Hearing Subject, plotting F3
onset (Hz) against F2 onset (Hz) for pa, ba, ta, da, ka and ga.]
Figure 3 - Frequency values at the F2 and F3 stressed vowel onsets (/pa/, /ba/, /ta/, /da/,
/ka/, /ga/) for the normal-hearing (NH) and hearing-impaired (HI) subjects
[Figure 4: two bar charts of relative duration (%) for the NH and HI subjects, one for
“a pata” and one for “a bata”, each bar divided into the segments a, C, a, t, a and
other segments.]
Figure 4 - Relative duration of the segments /a Cata/ (where C is /p/ or /b/) and other
phonetic units in the carrier phrase “diga Cata baixinho” as produced by the
normal-hearing (NH) and hearing-impaired (HI) subjects
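Relative duration, as plotted in Figure 4, expresses each segment as a share of the whole utterance, which makes speakers with different speech rates comparable. A minimal sketch of that calculation follows; the durations are invented for illustration and the function name is hypothetical.

```python
def relative_durations(durations_ms):
    """Express each segment duration as a percentage of the utterance total."""
    total = sum(durations_ms.values())
    return {label: 100.0 * d / total for label, d in durations_ms.items()}


# Invented segment durations (ms) for one utterance of "diga pata baixinho".
made_up_ms = {
    "a (preceding)": 60,
    "p": 120,
    "a (stressed)": 180,
    "t": 90,
    "a (post-stressed)": 50,
    "other segments": 500,
}

shares = relative_durations(made_up_ms)
for label, share in shares.items():
    print(f"{label}: {share:.0f}%")
```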
[Figure 5: two panels, Hearing Impaired Subject and Normal Hearing Subject, with data
points labelled by vowel position (preceding post-stressed vowel, stressed vowel,
post-stressed vowel) for [p] and [b]; axes in Hz.]
Figure 5 - F1 frequency values of vowels in three positions: post-stressed vowel in
the word preceding the target words /pata, bata/ and stressed and post-stressed
vowel in these target words, as produced by the normal-hearing (NH) and the
hearing-impaired (HI) subjects
FINAL CONSIDERATIONS
Perception and production results were found to be coherent. The analysis of
spectrographic data and of the results of the perceptual tasks indicates that voicing contrasts
were neither produced nor perceived by the HI subject. The velar and bilabial places of
articulation were better perceived by the HI subject, and also better produced,
considering the judges' evaluation. The analysis of formant transitions indicated a lack of
cues related to place of articulation and suggests that its identification by the judges was
mostly cued by the burst, and that the burst cue was less effective for the
identification of the alveolar place of articulation (Smits et al., 1996). The greater
difficulty experienced by the HI subject in perceiving the place contrasts in her own
productions suggests that she relied more on the transition cues provided by
the coarticulated speech of the NH subject.
Spectrographic analysis of the HI subject's stop productions revealed undifferentiated
transitions and extremely lengthened stressed vowels with high F1 values, suggesting a
greater magnitude of jaw opening in stressed syllable positions. The analysis of the data
in the light of Articulatory Phonology enabled us to consider the coordination of the
opening gesture of the glottis with a greater magnitude of the tongue body
gesture, represented in the gestural score by a wider box relative to the constriction degree
of the tongue body. The fact that the greater magnitude of the tongue body gesture can
be represented in the gestural score enhances the explanatory power in relation to the
dynamics of speech production implemented by the HI subject to produce stops
followed by a low central vowel in stressed syllable position. A vowel gesture
specified with a greater magnitude, when preceded by gestures involving complete
closure, as is the case for plosives, implies greater laryngeal tension, making the
vibration of the vocal cords more difficult. Consistent with this is the fact that the HI
subject systematically produced voiceless plosives in the stressed syllables of the target
words “pata, bata, tata, data, cata, gata” but often produced the voiced alveolar
plosive instead of its voiceless counterpart in the unstressed syllable onset. Specifying
scaling differences related to the magnitude of the articulatory gestures is helpful to
capture the interaction between prosody and segment as described here.
REFERENCES
Boothroyd, A. (1985). Evaluation of speech production of the hearing-impaired: some benefits
of forced-choice testing. J. Speech. Hear. Res., 28, 185-96.
Boothroyd, A., Hanin, L., & Eran, O. (1996). Speech perception and production in children with
hearing impairment. In: F.H. Bess, J.S. Gravel, & A.M. Tharpe (Eds.), Amplification for
children with auditory deficits (pp.55-74). Nashville, TN: Bill Wilkerson Center.
Browman, C. P. & Goldstein, L. (1990). Tiers in articulatory phonology, with some
implications for casual speech. In: J. Kingston, & M. E. Beckman (Eds.), Papers in Laboratory
Phonology I: Between the Grammar and Physics of Speech (pp. 341-376). Cambridge
University Press.
Browman, C. P. & Goldstein, L. (1986). Towards an articulatory phonology. Phonology
Yearbook, 3, 219-252.
Browman, C.P., & Goldstein, L.M. (1992). Articulatory phonology: an overview. Phonetica, 49,
155-80.
Fant, G. (1960). Acoustic theory of speech production. The Hague: Mouton.
Goldstein, L. & Fowler, C. (2003). Articulatory phonology: a phonology for public language
use. In: A.S. Meyer, & N. O. Schiller (Eds.), Phonetics and Phonology in Language
Comprehension and Production: Differences and Similarities (pp. 159-207). Berlin: Mouton de
Gruyter.
Madureira S., Barzaghi, L., & Mendes, B. (2002). Voicing contrasts and the deaf: production
and perception issues. In: F. Windsor, M.L. Kelly, & N. Hewlett (Eds.), Themes in clinical
phonetics and linguistics (pp. 417-428). London: Lawrence Erlbaum Associates.
McGarr N, Löfqvist A. Laryngeal kinematics in voiceless obstruents produced by
hearing-impaired speakers. J Speech Hear Res 1988; 31: 234-39.
McGarr N, Löfqvist A. Obstruent production by hearing-impaired speakers: interarticulator
timing and acoustics. J Acoust Soc Am 1982; 72(1): 34-42.
Monsen RB. The production of English stop consonants in the speech of deaf children. J
Phonetics 1976; 4: 29-41.
Speech Hear Res 1989; 32: 133-42.
Okalidou A, Harris KS. A comparison of intergestural patterns in deaf and hearing adult
speakers: implications from an acoustic analysis of disyllables. J Acoust Soc Am 1999;
106(1): 349-410.
Revoile SG. Hearing loss and the audibility of phoneme cues. In: Pickett, editor. The acoustics
of speech communication. Boston (US): Allyn and Bacon; 1999.
Revoile SG, Pickett JM, Holden LD. Acoustic cues to final stop voicing for impaired- and
normal-hearing listeners. J Acoust Soc Am 1982; 72(4): 1146-54.
Sammeth CA, Dorman MF, Stearns CG. The role of consonant-vowel amplitude ratio in the
recognition of voiceless stop consonants by listeners with hearing impairment. J Speech Hear
Res 1999; 42: 42-55.
Smits R, Bosch L, Collier R. Evaluation of various sets of acoustic cues for the perception of
prevocalic stop consonants: perception experiment I. J Acoust Soc Am 1996; 100(6): 3852-64.
Turner CW, Brus S. Providing low- and mid-frequency speech information to listeners with
sensorineural hearing loss. J Acoust Soc Am 2001; 109(6): 2999-3006.
Tye-Murray N. Effects of vowel context on the articulatory closure postures of deaf speakers. J
Speech Hear Res 1987; 30: 99-104.
Tye-Murray N. The establishment of open articulatory postures by deaf and hearing talkers. J
Speech Hear Res 1991; 34: 453-9.