Kathleen Currie Hall
24 October 2003

Field Report 4: The (psycho)acoustic vowel space of Greek

Minimal and near-minimal pairs:

Backness: i/u e/o
[] ‘gate’ [] ‘child’
Height: i/e u/o
[] ‘congeal’ [] ‘bird’ [] ‘bird’ [] ‘foot’
Low Central: o/a [] ‘baked’ i/e/a [] ‘drink’ [] ‘play’ [] ‘many’ [] ‘this’ [] ‘wash’ [] ‘wide’

See file Contrasts.wav for sound files.

For Lab #4, I had trouble distinguishing vowels that I perceived as somehow “reduced” or “centralized” from the standard five vowels. Specifically, I was hearing [], [], and [] for [], [], and [] in certain words. Because these vowels are contrastive in English, however, and because they seem to be in free variation in Greek (e.g., words like ‘come’ were pronounced as both [] and [] by the consultant), I thought that I might simply be mapping my own phonological vowel space onto the Greek vowels.

To determine whether or not this was the case, I measured F1 and F2 for all the vowels in all the tokens elicited so far, and plotted the F2 vs. F1 vowel space. This is shown in Figure 1. Although there are a few outliers, it is clear that the vowels group nicely into the five expected phonological categories, [i], [e], [a], [o], and [u].

The vowels were measured as follows. I isolated each word using Praat. I placed the cursor at what appeared to be the “middle” of each vowel, that is, halfway through the duration of the vowel (determined by eye), and used the automatic logging function to record the F1 and F2 values at the cursor. For most of the vowels, I used an LPC order of 10, which gave values that seemed to correlate well with visual inspection of the spectrogram. For some of the [o] and [u] vowels, however, Praat seemed to be reporting the actual F3 values as the F2 values. For those tokens only, I changed the LPC order to 15 and re-measured F1 and F2. The re-measured values seemed to represent the formants in those vowels more accurately.
Changing the LPC order did not seem to noticeably affect the formant values in the other vowels, so the original measurements for the other vowels were kept.

The vowels that originally seemed to be “centralized” (labelled “reduced” in Figure 1) do not in fact seem to be significantly different from non-centralized tokens of the same vowels; there is a large amount of overlap across all vowels of the same basic category.

In addition to the centralized versions of the vowels [i], [e], and [o], there seemed to be a set of even more centralized vowels that I transcribed as schwa. These were never in free variation with other, non-centralized vowels, and I was not able to determine which of the “main” vowels they were closest to. These vowels, unlike the other centralized ones, occurred only in unstressed syllables. In Figure 1, it is clear that these vowels are distributed across the [e] and [a] vowel space; that is, they are some sort of central vowel, but they do not clearly fall into one of the other categories. I think that these really are just centralized versions of other vowels that surface in certain unstressed environments, though I have not yet worked out exactly when they appear in place of the simple vowel. I do not, however, think that the schwa should be considered a separate vowel of Greek. This vowel occurs only in the following words:

Words with schwa:

Transcription   Gloss
                cloud
                die
                four
                full
                guts
                head
                sit

Having the consultant perform the vowel-perception experiment will probably help determine which vowels are alternating with schwa.
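One possible way to ask, acoustically, which “main” vowel a given token lies closest to is to compare its formants with the category means. The sketch below is only an illustration and is not a method used in this report: it assumes Euclidean distance in the F1 x F2 plane (in ERB) is a reasonable closeness measure, and the names MEAN_ERB and nearest_vowels are hypothetical. The means are the per-vowel averages reported in the table of averages later in this report.

```python
import math

# Mean (F1, F2) in ERB for the five main vowels, copied from the
# table of averages in this report.
MEAN_ERB = {
    "i": (7.711119, 21.15723),
    "e": (10.5497, 19.828),
    "a": (12.80904, 17.5843),
    "o": (10.89569, 14.93929),
    "u": (8.119662, 14.15618),
}


def nearest_vowels(f1_erb, f2_erb):
    """Return (vowel, distance) pairs sorted from nearest to farthest.

    Distance is plain Euclidean distance in the F1 x F2 (ERB) plane,
    which is an assumption made for this sketch.
    """
    distances = {
        vowel: math.hypot(f1_erb - m1, f2_erb - m2)
        for vowel, (m1, m2) in MEAN_ERB.items()
    }
    return sorted(distances.items(), key=lambda pair: pair[1])


# Example with a token measured in this report: the word glossed 'claw'.
ranking = nearest_vowels(7.785817, 21.20233)
print(ranking[0][0])  # prints: i
```

For a schwa token, the first two entries of the returned ranking would show which two categories it falls between, which is the question raised above about [e] and [a].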
[Figure 1: F1 and F2 plot of all measured Greek vowels. Chart title: Greek Vowel F2 x F1. Axes: F2 (Hz), 2500 to 0; F1 (Hz), 0 to 1000. Legend: a unstressed, a stressed, e unstressed, e stressed, e reduced unstressed, e reduced stressed, i unstressed, i stressed, i reduced unstressed, o unstressed, o stressed, o reduced unstressed, schwa unstressed, u unstressed, u stressed.]

The average F1 and F2 values for each of the five main vowels are as follows:

Vowel   F1 (Hz)    F2 (Hz)    F1 (ERB)   F2 (ERB)
[i]     324.8541   2055.263   7.711119   21.15723
[e]     519.7883   1751.516   10.5497    19.828
[a]     719.6493   1332.989   12.80904   17.5843
[o]     547.5217   955.5391   10.89569   14.93929
[u]     349.6711   862.6989   8.119662   14.15618

To plot the basic acoustic vowel space, I chose one word that exemplified each of the five main vowels and measured the F1 and F2 in those vowels. These values are shown below:

Specific words for vowel chart (chosen because they are close to the average values):

Vowel   Transcription   Gloss     F1 (Hz)   F2 (Hz)    F1 (ERB)   F2 (ERB)
[i]                     claw      329.318   2066.426   7.785817   21.20233
[e]                     breathe   518.386   1756.623   10.53188   19.85213
[a]                     fish      731.299   1348.241   12.92602   17.67674
[o]                     lie       546.084   938.205    10.87804   14.79788
[u]                     sky       338.796   856.526    7.942636   14.10177

NB: ERB = 11.17 * ln(((f/1000) + 0.312) / ((f/1000) + 14.675)) + 43.0, where f is the frequency in Hz.

The spectrograms for the target vowels in these words are shown in Figure 2. The token for [u] is somewhat questionable because it is much more diphthongized than the others; the measurements are from the center of the diphthong and, like those of the other tokens, were very close to the average F1 and F2 values for [u] for this speaker. These vowels (in isolation) can be heard in the sound file “simple_vowels.wav.”

[Figure 2: Spectrograms of the five simple vowels in Greek, 1 male speaker. Panels: [i], [e], [a], [o], [u]. Horizontal axis: Time (s), 0 to 1.40676; vertical axis: 0 to 5000.]

Finally, I plotted these five vowels by themselves to show a basic acoustic map of the Greek vowel space, at least for this speaker.
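The Hz-to-ERB conversion given in the NB above can be sketched in Python as follows. This is a minimal transcription of that formula, not necessarily the tool used to produce the tables; the function name hz_to_erb is hypothetical.

```python
import math


def hz_to_erb(f_hz):
    """Convert a frequency in Hz to the ERB scale using the formula in the NB.

    ERB = 11.17 * ln((f + 0.312) / (f + 14.675)) + 43.0, with f in kHz.
    """
    f_khz = f_hz / 1000.0
    return 11.17 * math.log((f_khz + 0.312) / (f_khz + 14.675)) + 43.0


# F1 and F2 of the [i] exemplar ('claw'); compare the results with the
# ERB columns of the table of specific words.
print(hz_to_erb(329.318), hz_to_erb(2066.426))
```

Because the scale is logarithmic in form, equal steps in Hz give smaller ERB steps as frequency rises, which is what separates the psychoacoustic chart from the acoustic one.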
I also converted the formant values from Hz to ERB to show the psychoacoustic (perceived) vowel space.

[Figure 3: Acoustic vowel space of the five simple vowels in Greek, 1 male speaker. Axes: second formant (Hz), 2400 to 800; first formant (Hz), 300 to 700.]

[Figure 4: Psychoacoustic vowel space of the five simple vowels in Greek, 1 male speaker. Axes: second formant (ERB), 22 to 14; first formant (ERB), 6 to 14.]

The ERB chart corresponds more closely to my original perception of the vowels. The acoustic chart, in Hz, compresses the space vertically, so that [i] and [u], for example, appear to be much farther apart than [i] and [a] or [u] and [a]. The psychoacoustic chart reflects human perception of the vowel space more accurately by “stretching” the vertical axis, so that the vowels [i], [a], and [u] seem to be roughly equally distant from each other.

The human auditory system is able to resolve finer distinctions along the lower part of the Hertz scale than it can at higher frequencies. This is why F1 (with lower frequency values) needs to be visually “stretched” on the graph: the auditory system stretches it during perception. At higher frequencies, where F2 occurs, the auditory system is less able to make distinctions, so that sounds that are actually farther apart in Hertz are perceived as being closer together. The psychoacoustic ERB chart reflects these differences by converting Hz into a non-linear, perceptual scale.
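The claim about the relative spacing of [i], [a], and [u] can be checked numerically against the average values reported above. The sketch below assumes that Euclidean distance in the F1 x F2 plane is a fair stand-in for distance on the charts; the names HZ, ERB, pairwise_distances, and spread_ratio are hypothetical.

```python
import math
from itertools import combinations

# Average (F1, F2) for the corner vowels, copied from the table of averages.
HZ = {"i": (324.8541, 2055.263), "a": (719.6493, 1332.989), "u": (349.6711, 862.6989)}
ERB = {"i": (7.711119, 21.15723), "a": (12.80904, 17.5843), "u": (8.119662, 14.15618)}


def pairwise_distances(space):
    """Euclidean distance between every pair of vowels in the given space."""
    return {
        a + "-" + b: math.hypot(space[a][0] - space[b][0], space[a][1] - space[b][1])
        for a, b in combinations(space, 2)
    }


def spread_ratio(space):
    """Longest pairwise distance divided by the shortest (1.0 = equidistant)."""
    distances = list(pairwise_distances(space).values())
    return max(distances) / min(distances)


for name, space in (("Hz", HZ), ("ERB", ERB)):
    print(name, pairwise_distances(space), spread_ratio(space))
```

Under this assumption the longest side of the vowel triangle is about twice the shortest in Hz but only about 1.2 times the shortest in ERB, which is consistent with the three vowels looking more nearly equidistant on the psychoacoustic chart.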