Bridget Smith Response to and analysis of statistics for “Eth – forsake thigh name” Throughout the course of my research dealing with variability in the dental fricative, especially voicing, I have attempted a number of different ways to analyze and display my data. The earliest attempts were abysmal, but at the time, I had only one, then two speakers to look at. Going from studies such as Stevens, et al. (1992), and Pierello, et al. (1997), I looked at voicing, but also included measurements such as intensity and duration, and took into account such factors as phonetic environment, and to some extent stress. I tried to graph the environments very explicitly at first, such as whether the preceding sound was a liquid or a nasal, a palatal or a velar, in addition to whether or not it was voiced, hoping to remove all exceptions. With only two subjects, there was no statistical analysis. For my second year paper, I had 7 usable speakers analyzed. I continued to play around with varying measures of intensity, due to their importance in previous studies of fricatives, but finally dropped these because they were not remotely descriptive of voicing, which I had decided was the issue I wanted to focus on. Intensity ratios may again come into play later, when dealing with variation in sonority. I also realized that my interpretation of stress was lacking, and gave that up to focus on phonetic environment, specifically voiced and voiceless segments immediately adjacent the dental fricatives in question. I would like to point out that these meanderings have given me more than what I had hoped – in that not looking directly at what I had thought I might find – that is weirdness in the voicing of <th> - but rather looking at all the various possibilities surrounding <th>, I have stumbled upon wondrous kinds of variation that I have been able to explore from an historical perspective. What began as an exercise in phonetic analysis has expanded to include phonological and historical methodologies, which have become even more exciting than the purely phonetic analysis. The data presented in my second year paper included for both theta and eth and /f/ and /v/, duration of the fricative and duration of voicing within the fricative. (Scatterplots of intensity and duration were created, but dropped because they contained no useful information. It was simpler to state that intensity did not distinguish phonemes or voicing.) Two charts were created for both /f/ and /v/ and theta and eth for each speaker One showed the duration of the fricative and voicing sorted by phoneme, and the other sorted by phonetic environment. The graphs from only one speaker were used in the final paper, along with assurances that, although the other speakers were not as clear as the exemplar I’d chosen, they did pattern in the same basic way. No statistics were used. The figures show duration of the fricative in milliseconds on the horizontal axis, duration of voice bar on the vertical axis. In figure 1, the filled circles represent /f/ and the empty circles represent /v/. Those to the right have longer duration. Because the duration of the voice bar was measured within the period of frication, its total possible duration is limited to the total duration of the fricative. Those tokens which are completely voiced will therefore appear along a 45 degree angle, having a 1:1 correspondence, or slope. Those tokens for which 50% or less of the duration shows a voice bar appear along or below a 23 degree angle, having a slope of 1:2 or less. The selected speaker shows fairly good distribution of /f/ and /v/ according to phoneme, with a few exceptions. /f/s are generally longer in duration, clustered to the right, and have less voice bar, which is shown by their clustering below the 23 degree angle. Figure 1: <f/v> duration of frication (x axis) and voice bar (y axis), sorted by phoneme. Figure 2 shows the same tokens, sorted by phonetic environment. The filled circles are surrounded on both sides by voiced sounds. The empty circles represent an adjacent voiceless sound including a pause, in which the vocal folds do not vibrate. By comparing figures 1 and 2, I pointed out that many occurrences of phonemic /f/ coincide with the presence of a voiceless sound immediately preceding or following. Likewise, many instances of /v/ occur between voiced sounds. These environmental conditions may be similar to those responsible for originally conditioning the voicing status of the fricative, and have remained unchanged. This distribution resembles the variation in voicing described by Stevens, et al. (1992). There are, however, many tokens that do not seem to be affected by environment, and are better described phonemically. Figure 2: <f/v> duration of frication (x axis) and voice bar (y axis), sorted by phonetic environment. Figure 3 shows the dental fricative divided by phoneme. Theta tends to have generally less voice bar than eth; however there is a great deal of overlap. Duration also varies greatly. Some canonical thetas (filled circles) are voiced and many canonical eths (open circles) are voiceless. Figure 3: <th> duration of frication (x axis) and voice bar (y axis), sorted by phoneme. When the tokens are marked for voicing of surrounding segments, their voice bar is much better accounted for, as in figure 4. Here the filled circles represent an adjacent voiceless segment or pause. Empty circles represent surrounding voiced sounds. Unlike with /f/ and /v/, the distribution of voicing in dental fricatives is far better accounted for by phonetic environment than that which was achieved by phonemic sorting. Figure 4: <th> duration of frication (x axis) and voice bar (y axis), sorted by phonetic environment. The benefit of this display is that it shows duration of voicing compared to overall duration. This make it easy to find individual exceptions, and was effective in presentation format, as I could show spectrograms and play sound files of the exceptions in situ. Because duration did not pattern as expected, this supports the theory that eth and theta do not pattern as /f/ and /v/ do, that is phonemically. The downside includes difficulty explaining the significance of the 23 and 45 degree angles. It also limits my ability to demonstrate that the pattern occurs on a larger scale. When pooled together, the data from the 7 subjects do not present a compelling visual pattern because each individual had different patterns of variation. In my analysis for the statistical methods course, I presented only the voicing data in barplots, comparing the phonemic sorting with environmental sorting. These results were even more astounding. With all 7 subjects’ data combined, a pattern emerged that was not visible in the scatterplots. The boxplot shows percentages of voicing (as opposed to raw duration) with the main boxes containing values up to 2 standard deviations from the mean (95% of the data), which is shown by the black bar across each box. This particular boxplot is actually a composite of two separate boxplots, which highlights their differences. The grey boxes represent phonemes. Phonemic theta is shown to the left, and has longer “whiskers” (one additional standard deviation) and more circles representing outliers. (This can be seen more easily on the individual boxplot, but to conserve space, you’ll just have to take my word on it.) The mean is around 20% voiced. Phonemic eth is represented by the grey box on the right. Its “whiskers” extend all the way to zero, and its mean is around 50% voiced. Figure 5: <th> accounted for by phoneme in grey, and by phonetic environment in white. 7 speakers, 396 tokens total. QuickTime™ and a TIFF (Uncompressed) decompressor are needed to see this picture. The white boxes represent environmental voicing, so that the value of “1,” which is overlaid on theta, represents those tokens which had an immediately adjacent voiceless segment, including a pause. The white box to the right, with the value of “2,” indicates those tokens which were completely surrounded by voiced segments. (At this point, I must confess that I need a line of code to help me disentangle “th” from “1” and “eth” from “2” so that this is more clear.) According to this figure, theta is only slightly better accounted for by environment than phoneme. This may have to do with the fact that there were a disproportionately higher number of eths (286) than thetas (110), but it may also point toward greater stability of phonemic theta. This deserves looking into, and was not as apparent in the scatterplots, so is a benefit of using the boxplots. The value “2,” for a completely voiced environment has only 174 tokens, while “1” for any voiceless environment contains 222 values, but falls into a similar, though slightly smaller range as the voiceless theta tokens. This does point to better accounting for variance than with phoneme. Eth, however, shows drastic improvement by accounting for environment. True, the number of tokens are fewer, but the range is so greatly diminished, and raised to the voiced end of the spectrum, with a mean of 100% voiced, that we cannot account for this by number alone. The boxplots have the advantage of dovetailing nicely with a statistical measurement. For one thing, the results for all of the speakers are combined into one graph, as they should be combined for any kind of statistical measurement. Also, the comparison between phoneme and environment can be displayed in this one single graph. Voicing, or “voice bar” is represented by percentages, which are easier to understand than the 23 and 45 degree lines arbitrarily imposed by the raw duration. The downside is that individual tokens are not accounted for, and duration is no longer mentioned. (Because it is only interesting in comparison with /f/ and /v/.) For the statistical methods paper, I followed Mary’s lead by producing a “squared partial r” as described by Cohen and Cohen (1983). If I understand it correctly, the equation takes r^2 from the linear regression of y against x1 and x2, (or the proportion of variance accounted for by x1 and x2) then subtracts the r^2 from a linear regression of y against only x2 (proportion accounted for by x2), then divides the whole thing by 1 minus the proportion from x2. R2Y.12=summary(lm(dat$percVoice~dat$vless+dat$phonenumber))$r.squared # R2Y.12 yields a proportion of [1] 0.6432622 that is accounted for by both phoneme and environment r2Y2=summary(lm(dat$percVoice~dat$phonenumber))$r.squared # r2Y2 yields only [1] 0.1365537, which is how much variance is accounted for by phoneme alone # subtracting the one from the other R2Y.12-r2Y2 # yields us [1] 0.5067084 # then the denominator is 1 minus the proportion from phoneme, or x2, # so the total that is completely unaccounted for by phoneme /(1-r2Y2)) # and the total of the entire computation [1] 0.2019833 This is a proportion of how much x1 accounts for the variance that x2 does not, out of the total variance that it does not account for. Proportions (or percentages) are nice because they are easy to read and understand. So, for instance, phoneme accounts for only 20% of variance after phonetic environment is taken into account. The proportion accounted for by environment minus phoneme, however, is 0.5868442, almost 59%. Numbers feel like a more tangible way of describing the data, but they can be easily misused. Looking at these figures, I begin to wonder if this is the best way to present the data. There is more nuance in the scatterplots, and exceptions can be accounted for. If you are looking for the “big picture,” the numbers suggest that the idea of phonemic categorization of <th> is nearly non-existent. These numbers allow us to overlook the fine details that suggest a much more complex pattern than is suggested by these numbers. Looking at exceptions in the scatterplots leads to more detailed examination of the spectrograms, which yields interesting information about voicing, duration, and sonority, and possibly more. Even the barplot, with greater accuracy of theta than eth, leads to more questions about why this is the case, which may shed even more light on these patterns. It is interesting and significant that phonetic environment is a much better predictor of voicing than phoneme for <th>, but it doesn’t tell the whole story. It may have statistical power, but, to my mind, it loses its explanatory power. If I had not gone in search of some of the exceptions, I would not have discovered some of the most interesting patterns, such as gemination, resyllabification, devoicing of vowels and sonorants, pre-voicing, etc., that have lead me to search deeper, into the historical development of the entire language family, and wider, into place and manner (in addition to voicing), stress and prosody. I don’t think my original question (going way back to the beginning), “what’s up with <th>?” is answered by these statistics. That environment better predicts voicing than phonemic category is not the only answer to this open-ended question. Simple questions are great for focusing research into reasonable segments, but they must imply an hypothesis. You have to have a reasonable assumption about how it’s going to turn out. You could be wrong, but usually by the time you narrow the question enough to have a manageable study, you have a pretty good idea about the kinds of answers you’re going to get. I suppose that is great for getting grants and acceptance to conferences, and I can do that, too, but I much prefer the exploratory methods I’ve been using, I’m afraid that when I no longer have the excuse that I’m new and inexperienced, I won’t be able to get away with the exploratory scatterplots and the messing around with various statistical procedures because nobody wants to fund you unless you’re halfway to your answer already. That being said, each method has its benefits and downsides. Which one you choose depends on what question you hope to answer, and how you want to get to that answer. The scatterplots were not the best method for what became my contribution to the ICHL proceedings, and my second-year paper because the paper focused more on the answer obtained by the second analysis, that is strictly blowing up the myth that <th> can be reasonably considered two distinct phonemes. My case would have been stronger had I used the statistics above (or what follows in my late-breaking report) and the boxplot. However, because I made mention of a follow-up focusing on the variation expressed in the scatterplots, I hope to lure people into reading my GLAC paper, for which the scatterplots are perfect. The GLAC paper delves into the variation and compares it to the historical development of dental fricatives in the Germanic languages. The question in that paper deals more with historical methodology (with a modern twist) and tackles some assumptions about sound change. It is more expository in nature, perhaps best answering the question, “What does sound change look like?” There can be room for both kinds of questions, and the trick, I suppose, is knowing when, and in front of whom, to ask each type of question, and then being prepared to follow it up with the appropriate figures and statistics. For instance, as Mary has pointed out, I did not take a firm stand on the (not) phonemic results of theta and eth. Using the statistics and boxplots would have been a good way to stake out a strong position on that issue. In the meantime, I have come up with lots of ideas for smaller studies based on the variation I found in my exploration. The journey, though interesting, is probably not something suitable for publication. ***late breaking news*** I finally got SPSS to do partial correlation. As expected, the numbers did not match up to my R analysis. Here are the SPSS tables. Correlations Control Variables vless percVoice phonenumber percVoic Correlation e Significance (1-tailed) 1.000 .449 . .000 0 393 Correlation .449 1.000 Significance (1-tailed) .000 . df 393 0 df phonenumber Control Variables percVoice vless phonenumber percVoice Correlation vless 1.000 .766 Significance (1-tailed) . .000 df 0 393 Correlation .766 1.000 Significance (1-tailed) .000 . df 393 0 So, then I looked at various sources for formulas for partial correlation. Either I didn’t understand the formulas as they were written, or SPSS uses a different one, but I figured that what SPSS used was an actual correlation, so I used wikipedia’s verbal description of partial correlation, which is “The obvious way to compute a (sample) partial correlation is to solve the two associated linear regression problems, get the residuals, and calculate the correlation between the residuals.” Then I messed around in R, and came up with a very basic code: XZ<-(lm(dat$percVoice~dat$vless)) resXZ<-residuals(XZ) YZ<-(lm(dat$phonenumber~dat$vless)) resYZ<-residuals(YZ) part<-cor(resXZ,resYZ) part which yielded the correlation of percentage voicing (percVoice) on phoneme (phonenumber) partialling out environment (vless): [1] 0.4494255 and changing around the code to get correlation of percentage voicing on environment, partialling out phoneme: [1] 0.7660576 The best news is that I can compare this value with the squared partial r, to see if the name is true to the value. So, if I take the square root of the squared partial r, which I got from Mary’s code, I should get the partial correlation: > sqrt(0.5868442) [1] 0.7660576 > sqrt(0.2019833) [1] 0.4494255 And what do you know, that’s what they are. This would be the additional analysis I was looking for to confirm my results. As far as additional data, there are a lot of things that I would like to follow up on. Regarding the issue of phonemic voicing, I think the case is made pretty convincingly with what I have here; however, voicing will continue to be under investigation in the next few studies I do because I want to compare these results with studies of place and of manner (realizations from stop through approximant) which will also need to take into account such other factors as stress and prosodic hierarchy. I also want to amass a corpus of speakers to compare patterns of variation, the stuff that’s not accounted for by environmental voicing, and just how much is. I want to see if those patterns fit any kind of sociolinguistic grouping, such as age, gender, or dialect. I would also like to compare patterns between parents and children, to find out what kind of possible misinterpretation of the signal, if any, or analogizing from one pattern to another may be involved. I would like to pinpoint the effects of literacy, instead of just theorizing about them, using children and semi-literate populations, if possible. This study has answered one very limited question, but it has raised many more. References Cohen, Jacob and Cohen, Patricia (1983) Applied multiple regression/correlation analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: L. Erlbaum Associates. Stevens et al (1992) “Acoustic and perceptual characteristics of voicing in fricatives and fricative clusters,” J. Acoust. Soc. Am. 91, 2979-3000. New! Analysis of f/v QuickTime™ and a TIFF (Uncompressed) decompressor are needed to see this picture. partial correlation for phoneme: [1] 0.7197954 and for environment: [1] 0.4650595 almost the inverse of what I got for <th>.