KCH_Report4

Kathleen Currie Hall
24 October 2003
Field Report 4: The (psycho)acoustic vowel space of Greek
Minimal and near-minimal pairs:
Backness:
i/u
e/o
[] ‘gate’
[] ‘child’
Height:
i/e
u/o
[] ‘congeal’
[] ‘bird’
[] ‘bird’
[] ‘foot’
Low Central:
o/a
[] ‘baked’
i/e/a
[] ‘drink’
[] ‘play’
[] ‘many’
[] ‘this’
[] ‘wash’
[] ‘wide’
See file Contrasts.wav for sound files.
For Lab #4, I had trouble distinguishing vowels that I perceived as somehow “reduced” or
“centralized” from the standard five vowels. Specifically, I was hearing [], [], and [] for [],
[], and [] in certain words. Because these vowels are contrastive in English, however, and
because they seem to be in free variation in Greek (e.g., words like ‘come’ were pronounced as
both [] and [] by the consultant), I thought that it might just be that I was mapping
my own phonological vowel space onto the Greek vowels. To determine whether or not this was
the case, I measured F1 and F2 for all the vowels in all the tokens elicited so far, and plotted the
F2 vs. F1 vowel space. This is shown in Figure 1. Although there are a few outliers, it is clear
that the vowels are grouping nicely into the five expected phonological categories, [i], [e], [a],
[o], and [u].
The vowels were measured as follows: I isolated each word using Praat. I used the automatic
logging function to record the F1 and F2 values at the cursor, and placed the cursor in what
appeared to be the “middle” of each vowel; that is, halfway through the duration of the vowel
(determined by eye). For most of the vowels, I used an LPC order of 10, which gave values that
seemed to correlate well with visual inspection of the spectrogram. For some of the [o] and [u]
vowels, however, Praat seemed to be measuring the actual F3 values as the F2 values. For those
tokens only, I changed the LPC order to 15 and re-measured F1 and F2. The re-measured values
seemed to represent the formants in those vowels more accurately. Changing the LPC
order did not seem to noticeably affect the formant values in the other vowels, so the original
measurements for the other vowels were kept.
The vowels that seemed originally to be “centralized” (labelled “reduced” in Figure 1) do not
in fact seem to be significantly different from non-centralized tokens of the same vowels; there is
a large amount of overlap across all vowels of the same basic category.
In addition to the centralized versions of the vowels [i], [e], and [o], there seemed to be a set
of even more centralized vowels that I transcribed as schwa. These were never in free variation
with other, non-centralized vowels, and I was not able to determine which of the “main” vowels
they were closest to. These vowels, unlike the other centralized ones, only occurred in
unstressed syllables. In Figure 1, it is clear that these vowels are distributed across the [e] and [a]
vowel space; that is, they are some sort of central vowel, but they do not clearly fall into one of
the other categories. I think that these really are just centralized versions of other vowels that
surface in certain unstressed environments, though I have not yet determined an exact account
for when they appear as opposed to the simple vowel. I do not, however, think that the schwa
should be considered a separate vowel of Greek. This vowel occurs only in the following words:
Words with schwa:
Transcription   Gloss
[]              cloud
[]              die
[]              four
[]              full
[]              guts
[]              head
[]              sit
Having the consultant perform the vowel-perception experiment will probably help
determine which vowels are alternating with schwa.
[Plot "Greek Vowel F2 x F1": horizontal axis F2 (Hz), 2500 down to 0; vertical axis F1 (Hz), 0 to 1000. Legend: a unstressed, a stressed, e unstressed, e stressed, e reduced unstressed, e reduced stressed, i unstressed, i stressed, i reduced unstressed, o unstressed, o stressed, o reduced unstressed, schwa unstressed, u unstressed, u stressed.]
Figure 1: F1 and F2 plot of all measured Greek vowels
The average F1 and F2 values for each of the five main vowels are as follows:
Vowel   F1 (Hz)    F2 (Hz)    F1 (ERB)   F2 (ERB)
[i]     324.8541   2055.263   7.711119   21.15723
[e]     519.7883   1751.516   10.5497    19.828
[a]     719.6493   1332.989   12.80904   17.5843
[o]     547.5217   955.5391   10.89569   14.93929
[u]     349.6711   862.6989   8.119662   14.15618
To plot the basic acoustic vowel space, I chose one word that exemplified each of the five
main vowels and measured the F1 and F2 in those vowels. These values are shown below:
Specific words for vowel chart (chosen because they are close to the average values):
Vowel   Transcription   Gloss     F1 (Hz)   F2 (Hz)    F1 (ERB)   F2 (ERB)
[i]     []              claw      329.318   2066.426   7.785817   21.20233
[e]     []              breathe   518.386   1756.623   10.53188   19.85213
[a]     []              fish      731.299   1348.241   12.92602   17.67674
[o]     []              lie       546.084   938.205    10.87804   14.79788
[u]     []              sky       338.796   856.526    7.942636   14.10177
NB: ERB = 11.17 * ln(((f/1000)+0.312)/((f/1000)+14.675))+43.0
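The formula in the note above can be checked against the tabulated values. The following is a minimal sketch in Python, not necessarily how the conversions for this report were carried out; the function name hz_to_erb and the dictionary AVERAGES_HZ are illustrative assumptions, and the Hz values are copied from the table of averages.

```python
import math


def hz_to_erb(f_hz):
    """Convert a frequency in Hz to ERB using the formula given in the note."""
    f_khz = f_hz / 1000.0  # the formula works on frequency in kHz
    return 11.17 * math.log((f_khz + 0.312) / (f_khz + 14.675)) + 43.0


# Average (F1, F2) in Hz for the five main vowels, copied from the averages table.
AVERAGES_HZ = {
    "i": (324.8541, 2055.263),
    "e": (519.7883, 1751.516),
    "a": (719.6493, 1332.989),
    "o": (547.5217, 955.5391),
    "u": (349.6711, 862.6989),
}

for vowel, (f1, f2) in AVERAGES_HZ.items():
    print(f"[{vowel}]  F1 = {hz_to_erb(f1):.3f} ERB   F2 = {hz_to_erb(f2):.3f} ERB")
```

Running this should reproduce the ERB columns of the averages table to within rounding.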
The spectrograms for the target vowels in these words are shown in Figure 2. The token for
[u] is somewhat questionable because it is much more diphthongized than the others; the
measurements are from the center of the diphthong and, as expected given how the word was chosen, were very close to the average
F1 and F2 values for [u] for this speaker. These vowels (in isolation) can be heard in the sound
file “simple_vowels.wav.”
[Spectrograms: vertical axis 0 to 5000; horizontal axis Time (s), 0 to 1.40676; panels labelled [i], [e], [a], [o], [u].]
Figure 2: Spectrograms of the five simple vowels in Greek, 1 male speaker
Finally, I plotted these five vowels by themselves to show a basic acoustic map of the Greek
vowel space, at least for this speaker. I also converted the formant values from Hz to ERB to
show the psychoacoustic (perceived) vowel space.
[Plot: horizontal axis second formant (Hz), 2400 down to 800; vertical axis first formant (Hz), 300 to 700.]
Figure 3: Acoustic vowel space of the five simple vowels in Greek, 1 male speaker
[Plot: horizontal axis second formant (ERB), 22 down to 14; vertical axis first formant (ERB), 6 to 14.]
Figure 4: Psychoacoustic vowel space of the five simple vowels in Greek, 1 male speaker
The ERB chart more closely corresponds to my original perception of the vowels. The
acoustic chart, in Hz, compresses the space vertically so that it appears that [i] and [u], for
example, are much farther apart than [i] and [a] or [u] and [a]. The psychoacoustic chart more
accurately reflects human perception of the vowel space by “stretching” the vertical axis so that
the vowels [i], [a], and [u] seem to be roughly equally distant from each other. The human auditory
system resolves finer distinctions in the lower part of the Hertz scale
than it does at higher frequencies; this is why F1 (with lower frequency values) needs to be
visually “stretched” on the graph: the auditory system stretches it during perception. At higher
frequencies, where F2 occurs, the auditory system is less able to make distinctions, so that
sounds that are actually farther apart in Hertz are perceived as being closer together. The
psychoacoustic ERB chart reflects these differences by converting Hz into a non-linear,
perceptual scale.
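The claim about relative distances can be checked numerically from the average values reported above. This is a rough sketch only: it assumes that straight-line (Euclidean) distance in the F1 x F2 plane, with both formants weighted equally, is a reasonable stand-in for distance on the charts, and the names HZ, ERB, pairwise_distances and spread are illustrative, not part of the original analysis.

```python
import math
from itertools import combinations

# Average (F1, F2) for the three corner vowels, copied from the averages table.
HZ = {"i": (324.8541, 2055.263), "a": (719.6493, 1332.989), "u": (349.6711, 862.6989)}
ERB = {"i": (7.711119, 21.15723), "a": (12.80904, 17.5843), "u": (8.119662, 14.15618)}


def pairwise_distances(space):
    """Straight-line distance between every pair of vowels in a formant space."""
    return {
        (v1, v2): math.hypot(space[v1][0] - space[v2][0], space[v1][1] - space[v2][1])
        for v1, v2 in combinations(space, 2)
    }


def spread(distances):
    """Longest pairwise distance divided by the shortest (1.0 means equidistant)."""
    return max(distances.values()) / min(distances.values())


hz_dist = pairwise_distances(HZ)
erb_dist = pairwise_distances(ERB)

for pair in hz_dist:
    label = f"[{pair[0]}]-[{pair[1]}]"
    print(f"{label}: {hz_dist[pair]:8.1f} Hz   {erb_dist[pair]:5.2f} ERB")

print(f"longest/shortest ratio: {spread(hz_dist):.2f} (Hz), {spread(erb_dist):.2f} (ERB)")
```

Under these assumptions the longest-to-shortest ratio is roughly 2.0 in Hz and roughly 1.2 in ERB, so the three corner vowels are closer to equidistant on the ERB scale, though not exactly so.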