Sharpe 1

Major Sharpe
25 Nov 2009

Title
Mandarin-English Bilinguals’ Production of English Voice Onset Time

Introduction: Research Area
The topic area is a quantitative study of the voice onset time (the interval between the release of a plosive and the onset of voicing in the following vowel) of Chinese speakers when they pronounce voiceless (both aspirated and unaspirated) and voiced plosives in English. Mandarin Chinese has only voiceless unaspirated and voiceless aspirated plosives, so when Mandarin speakers learn English, they must learn how to pronounce the voiced plosives properly in order to be intelligible to native speakers. A part of this process is correctly producing a plosive’s voice onset time (VOT), since it differs between voiced and voiceless plosives in English.

The results of this study have implications for both second/foreign language acquisition (SLA) and language pedagogy. If native Chinese learners are not explicitly taught the VOT of English plosives, yet produce the plosives correctly, then people learning a second language are able to “acquire” such fine phonetic detail. The implications for SLA are broadened by the fact that I will be comparing students who started learning English before the age of 8 to those who started after the age of 8. It is generally acknowledged that those who start earlier are more likely to have native-like English pronunciation.

On the pedagogical side, if subjects are found to pronounce English plosives correctly, then ESL teachers do not necessarily need to revise their teaching methods to include English VOT, because learners are able to acquire the correct pronunciation naturally. If the opposite proves to be true, then there is a clear need to teach Chinese students the VOT of English plosives explicitly in order to improve their intelligibility.

Aim/Justification
By doing this research, I hope to come to a conclusion about whether or not it is productive to teach VOT to nonnative speakers.
I hope to demonstrate whether implicitly teaching English VOT to ESL learners is effective, or whether teachers should take a more explicit approach.

Literature Review

Introduction
Individual differences in L2 pronunciation can come from a variety of sources, including age, feedback received from peers and instructors, motivation, satisfaction with one’s L2 level, and even the rate at which one speaks. In the past, researchers typically assumed that learners acquire an L2 in largely the same way if their environment is the same; however, more recent research shows the opposite to be true. In fact, there are even individual differences in the perception of sounds that are not from one’s L1. For example, an adult Japanese ESL learner may typically have difficulties with the r/l distinction in English, since the two sounds exist in free variation in his or her native language. Alternatively, he or she may have no difficulties at all, having retained the ability to differentiate between the two sounds into adulthood.

In the same vein, studies suggest that when bilinguals encounter sounds that are similar in their L1 and L2, they automatically assume that the two sounds are variants of the same phoneme. While this may be true from a purely phonological standpoint, it is not necessarily true phonetically: the French /b/ has a different VOT than the English /b/, resulting in quite a different pronunciation. Furthermore, the very fact that these sounds are so similar can cause trouble for the L2 learner. Where one would intuitively expect learners to acquire sounds that are similar in their L1 and L2 quickly and easily, this is often not the case. A French speaker’s pronunciation of the French /b/, for example, would be fossilized and would prevent him or her from producing the correct VOT for the English /b/. These last few points lend considerable weight to the present study.
If Mandarin-English bilinguals consider the Mandarin /p/ and the English /b/ to be variants of the same sound, does that mean they will have great difficulty acquiring the true VOT for the English voiced stops? And does the fact that the two sounds are so similar, especially in intervocalic position, mean even more difficulty for the L2 learner who wishes to overcome his or her Mandarin pronunciation?

Variations in Environment and L2 Attainment
According to Moyer (1999), previous research claims that L2 attainment does not change with variations in environment, motivation, or immersion. However, Moyer demonstrates that certain factors do contribute to learners’ ultimate attainment in an L2, including age of immersion, descriptive feedback on their language production, and motivation, all of which correlate with the learner’s satisfaction with his or her L2 production. This means that there are important factors that influence L2 learning, factors that have been largely ignored by researchers for the past several decades.

While Moyer’s study is largely qualitative, dealing with learners’ self-reported perceptions, Theodore et al. (2009) take on a similar subject but support their findings with more quantitative data. In their study, Theodore et al. (2009) found that the VOT of voiceless stops is determined in part by how quickly a person is speaking. This naturally leads to the hypothesis that different people have different VOTs because some speak faster than others. Indeed, Theodore et al.’s (2009) study confirms this, producing evidence that VOT is not constant across native speakers. Variations in VOT production may be surprising because the classical foundational literature on VOT claims the phenomenon is governed by language universals, but Golestani & Zatorre (2009) mention another interesting phenomenon: phoneme perception can also vary from learner to learner.
The authors use Hindi dental and retroflex stops as an example of a contrast that native Hindi speakers recognize but that a native English speaker would not necessarily be able to differentiate. The authors state, though, that some English speakers are able to hear the difference between these two sounds in adulthood, and that those who cannot perceive the difference can learn to do so with minimal training. This is reassuring for teacher and language learner alike, as it demonstrates that nonnative speakers can learn to hear phonemic differences in an L2 that do not exist in their L1, where they would otherwise consider the sounds to be the same.

Fowler et al. (2008) bring more good news, showing that nonnative VOTs are not set in stone. According to Fowler et al. (2008), an L2 speaker’s VOT can fluctuate according to how long he or she has stayed in another country. The authors give the example of a Brazilian woman whose VOT in Portuguese had changed because she had lived in the United States for an extended time, speaking English almost exclusively. The woman returned to Brazil for a short while, and during this time her VOT recovered most of its native qualities. We could assume that when the woman left the US, she sounded more American than Brazilian, and that when she came back to the US, she had regained most of the Brazilian accent she had lost, only to lose it again. Our VOTs, then, can change depending on the environment we happen to occupy at the time.

Furthermore, there are cases in which these VOT fluctuations are practically nonexistent simply because a bilingual has been exposed to his or her L2 for such an extended period of time. The most immediate example that comes to mind is early bilinguals.
Those who learn a language before the end of the critical period may sound more native in their pronunciation of their L2 (Moyer 1999).

A perhaps more fascinating phenomenon is that of overhearers, which further demonstrates the flexible nature of humans’ VOTs. Fowler et al. (2008) define an overhearer as a person who grew up surrounded by a foreign language (Spanish, for example) during early childhood without necessarily learning or speaking it. If this person decides to learn Spanish later in life, he or she will have a near-native Spanish accent, VOT included. Simply by having been exposed to the sound system of Spanish, he or she acquires an almost perfect accent. Fowler et al. use this information to claim that early bilinguals and nonnative speakers have very few differences between their language perception and production.

Bilinguals’ Perceptions of Sounds
Fowler et al. (2008) note that bilingual speakers make cognitive connections between stop consonants that are of the same class. These stop consonants, while belonging to the same class, can vary substantially in VOT, yet bilinguals still consider them to be variants of the same sound despite these differences (Garcia-Sierra et al. 2009). According to Flege’s Speech Learning Model, bilingual speakers might not even notice the difference between the VOT patterns of their L1 and L2, so they would have no chance to form new phonological processes when producing their L2 (Fowler et al. 2008).

Furthermore, there is evidence that if two sounds in a learner’s L1 and L2 are too similar, they will be more difficult for the learner to acquire. Wode (1983) mentions this phenomenon, noting that language transfer can occur in such a situation. If Mandarin intervocalic voiced stops have VOTs that differ markedly from those of English voiced stops, there could very well be transfer between the two languages by Chinese-English bilinguals.
Rate of acquisition can also be affected. Major & Kim (1996) have gone so far as to develop the Similarity Differential Rate Hypothesis, which states that dissimilar sounds are acquired more quickly than similar sounds. Considering that my subjects are early bilinguals, this last point may not carry any weight. It would depend on the amount of time it actually takes for L2 learners to acquire old and new sounds. This raises the question of how similar the Mandarin voiceless stops and the English voiced stops are in the bilingual’s mind, and the problem is further compounded by studies that claim the Mandarin voiceless stops may be more similar to the English voiced stops than previously thought.

Möbius (2004) briefly mentions that unaspirated Mandarin stops are more likely than aspirated stops to be voiced in certain phonological contexts. Duanmu (2000) corroborates this claim when he mentions that unvoiced Mandarin stops and fricatives can indeed be voiced in unstressed position (emphasis added). He uses the example of [tsʷəi pa] becoming [tsʷai ba] (p. 27). He then raises the question of whether it would be possible for a Mandarin speaker to replace unaspirated stops with voiced stops in all situations; unfortunately, Duanmu does not pursue this issue, so the question is left unanswered. The fact that Duanmu does not continue discussing this possibility, combined with the fact that his sources for Mandarin voicing are more than 50 years old, makes his claim dubious at best and warrants further research. After all, we should know whether the studies he cited have since been confirmed, and whether this voiceless/voiced alternation is the result of free variation or assimilation. The change of voicing would most likely be the latter.
Möbius (2004) goes on to discuss Mandarin phonology, noting that there are major phonotactic constraints in the language: stops can occur only in the syllable onset, and there are no consonant clusters. These constraints mean that Mandarin stops are always surrounded by voiced sounds, which supports the idea that an unaspirated stop could become voiced. This is reassuring for the Mandarin-English bilingual attempting to acquire proper English pronunciation, but his or her perceptions of the quality of those consonants can play a major part in their acquisition. Furthermore, as we have already discussed, similarity does not necessarily make acquisition any easier for the L2 learner. Whatever the case, this phenomenon must be kept in mind throughout the present study, as it is entirely possible that Mandarin-English bilinguals apply this voiceless-to-voiced rule to English, using unaspirated stops prevocalically and postvocalically and voiced stops intervocalically.

Conclusion
Fowler et al.’s (2008) finding that bilinguals consider similar sounds in their L1 and L2 to be variants raises an interesting question: what correspondences would Mandarin-English bilinguals make between their two languages? Would they consider Mandarin voiceless stops to be equivalent to English voiced stops, and Mandarin aspirated stops to be the same as English voiceless stops? At this point, we can only assume so, given the previously mentioned literature by Fowler et al. (2008), Wode (1983), and Major & Kim (1996), which describes the linguistic processes responsible for bilinguals’ production and perception of similar and dissimilar sounds, and the findings of Duanmu (2000) and Möbius (2004) that Mandarin and English stops may be very similar in some cases. Building upon these theories, we can hopefully come to a clearer answer to these questions and a better understanding of the phenomena behind them.

Research Questions/Hypotheses
Do Mandarin stops impede learners’ acquisition of English stops?
If there is an impediment, does my research support or oppose explicitly teaching English VOT to ESL learners? Lastly, do male and female bilinguals show evidence of varying degrees of VOT mastery in their L2?

I predict that Mandarin-English bilinguals’ acquisition of English stops is not impeded by the stops in their native language, that implicitly teaching English VOT is sufficient for ESL learners to acquire it correctly, and that males and females in fact show no differences in their VOT production.

Methodology
Twenty participants will be recruited from the Ohio Program of Intensive English (OPIE) on the Ohio University main campus. There should be 10 students of each sex, ideally 20 or 21 years old, and they should all come from the same class sequence, AE50, which is for intermediate learners of English. This choice of participants reflects the nature of the study, which will concentrate on any possibly overlooked variability in L2 VOT production caused by gender differences rather than on age differences. All participants should be native Mandarin speakers, and they should all be from Beijing in order to eliminate regional dialect differences.

The task will be explained to each participant in English by the researcher, though there will be a certain amount of deception in order to protect the integrity of the study. Students will be told that they are taking part in a normal session like those in the OPIE pronunciation lab, where native English speakers tutor OPIE students on their speech to help them become intelligible. Naturally, the researcher will obtain each student’s permission to record the session before any recording actually takes place. Native-speaker control data will come from data already collected and analyzed by Möbius (2004); however, if Möbius’s (2004) data prove to be unsatisfactory for this study, then native-speaker data will be analyzed from an online corpus.
The need for such data will be determined at a later date.

Appendix A of Fowler et al. (2008) contains a list of 16 English sentences that the researchers used to elicit the VOT of French-English bilinguals and English monolinguals. The sentences contain a wide variety of English stops, and for that reason they will also serve my study well. If a participant misreads any sentence, he or she will be corrected and asked to repeat it with the correct pronunciation or intonation. This may prove to be exceedingly difficult early on in the study; if so, I will have students first read the sentences out loud and then give their opinions about each sentence. This will yield short, spontaneous data that should ideally fill any gaps that chronic mispronunciation might leave open, and it may turn out to be a better procedure, because VOTs may differ between recited and spontaneous speech, the latter being more natural and less controlled.

While the students read the sentences out loud, they will be recorded on an iMac G5 using its internal microphone. The recording software I will use is Audacity. After a student finishes reading his or her sentences, the recording will be saved in a lossless format such as FLAC (OGG Vorbis, by contrast, is a lossy codec), so that there is no loss in sound fidelity. Keeping files of this type should allow the data to be used for different types of analysis, since I will have the original audio files on hand and not just the analyzed VOTs.

Analysis
The files will then be passed through the software described by Möbius (2004), “get_f0” from ESPS/xwaves. According to Möbius (2004), the software has been shown to be just as reliable in determining VOT values as a human researcher. I will, therefore, leave the VOT analysis up to this software, as manual analysis is very involved and complicated.
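To make concrete what the software is expected to report, the following is a minimal Python sketch of the VOT calculation itself, not of get_f0 or ESPS/xwaves. It assumes that the time of the stop release and the time at which voicing begins have already been located for each token; the class name, field names, and timestamps are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass
class StopToken:
    """One annotated stop consonant (hypothetical structure, not from any cited tool)."""
    label: str              # the stop, e.g. "p" or "b"
    burst_s: float          # time of the stop release burst, in seconds
    voicing_onset_s: float  # time at which vocal fold vibration begins, in seconds


def vot_ms(token: StopToken) -> float:
    """Return VOT in milliseconds: voicing onset minus release burst.

    Positive values mean voicing starts after the release (voicing lag);
    negative values mean voicing starts before the release (prevoicing).
    """
    return (token.voicing_onset_s - token.burst_s) * 1000.0


# Made-up timestamps for illustration only; these are not measured data.
tokens = [
    StopToken("p", burst_s=0.512, voicing_onset_s=0.580),
    StopToken("b", burst_s=1.204, voicing_onset_s=1.216),
    StopToken("b", burst_s=2.050, voicing_onset_s=1.990),
]

for token in tokens:
    print(token.label, round(vot_ms(token), 1), "ms")
```

Keeping the two timestamps rather than only the resulting VOT would make it possible to check a sample of the automatic measurements against the waveform by hand.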
Because my study also deals with the VOTs of speakers of different sexes, I will put these data through the statistical software GNU PSPP to see what trends emerge.

Anticipated Problems/Limitations of the Study
First, having students read sentences out loud may result in speech that is markedly different from natural, spontaneous speech. In order to assess this problem, I will have six students, three from each gender group, give me some spontaneous speech data so that I may see whether their speech truly differs depending on whether they are reciting a text or speaking naturally. Furthermore, it is entirely possible that Möbius (2004) overestimated the software he praised in his study. It may not be nearly as capable of analyzing VOT as he thought. In order to minimize this limitation, I may analyze the waveforms of a few sentences myself in order to assess the program’s capabilities. Lastly, the other studies in which researchers analyze VOT have a much broader scope than my study will. Fowler et al. (2008), for example, have twice as many participants as I will, which is a truly daunting amount of data to sift through, but a scientifically rigorous amount as well.

Expected Findings
I expect to find that there is little difference between the VOT of male and female participants. Fowler et al. (2008) already answered this question in part, though not intentionally. In their study, they found no discernible difference between the L2 VOT of men and women. However, they had an uneven ratio of females to males, so a more balanced study is worthwhile. I also expect to find that Chinese-English bilinguals do use voiceless stops for English voiced stops when they speak, simply because Mandarin lacks the English voiced phonemes. However, it is apparently true that a Mandarin unaspirated stop can be voiced when it occurs intervocalically in an unstressed position, presumably in free variation, so I will need to account for this in my findings.
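The comparison between male and female participants mentioned in the Analysis and Expected Findings sections could be sketched as follows. This is a Python illustration rather than PSPP syntax, and the choice of Welch’s t-test is an assumption on my part, since the specific test has not yet been fixed. The function name and the VOT values are hypothetical placeholders, not collected data.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and its approximate degrees of freedom.

    Welch's test compares two group means without assuming equal variances.
    Each group needs at least two values, and at least one group must vary.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Placeholder per-speaker mean VOTs in milliseconds; not real measurements.
male_vot = [62.0, 70.0, 66.0, 58.0, 64.0]
female_vot = [65.0, 71.0, 60.0, 68.0, 66.0]

t_stat, dof = welch_t(male_vot, female_vot)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A t value near zero would be consistent with the expectation that the two groups do not differ; the statistic and degrees of freedom would still need to be converted to a p value in a statistics package such as PSPP.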

Conclusion
Research discussed in the literature review suggests that while a bilingual’s native-language (NL) VOT can affect his or her L2 VOT production, it is never discussed as an impediment, even though it contributes to marked foreign accents. Indeed, our VOT production is elastic, shifting the longer we live in a foreign country and shifting back the longer we stay in our native one after returning. This is good news for the nonnative speaker wishing to minimize his or her accent, and for the teacher trying to help students do exactly that.

References
Duanmu, S. (2000). The Phonology of Standard Chinese. Oxford: Oxford University Press.
Fowler, C., Sramko, V., Ostry, D. J., Rowland, S. A., & Hallé, P. (2008). Cross language phonetic influences on the speech of French-English bilinguals. Journal of Phonetics, 36, 649–663.
Garcia-Sierra, A., Diehl, R. L., & Champlin, C. (2009). Testing the double phonemic boundary in bilinguals. Speech Communication, 51, 369–378.
Golestani, N., & Zatorre, R. J. (2009). Individual differences in the acquisition of second language phonology. Brain & Language, 109, 55–67.
Major, R. C., & Kim, E. (1996). The similarity-differential hypothesis. Language Learning, 49(2), 275–302.
Möbius, B. (2004). Corpus-Based Investigations on the Phonetics of Consonant Voicing. Folia Linguistica, 38, 5–26.
Moyer, A. (1999). Ultimate Attainment in L2 Phonology: The Critical Factors of Age, Motivation, and Instruction. Studies in Second Language Acquisition, 21, 81–108.
Theodore, R. M., Miller, J. L., & DeSteno, D. (2009). Individual talker differences in voice-onset-time: Contextual influences. Journal of the Acoustical Society of America, 125, 3974–3982.
Wode, H. (1983). Phonology in L2 acquisition. In H. Wode (Ed.), Papers on language acquisition, language learning, and language teaching (pp. 175–187). Heidelberg, Germany: Groos.

Appendix A
1. Ben bought some flowers and put them on his dining room table.
2. Seven hungry children crowded around the buffet.
3. Miranda’s job was boring, and she fell asleep at her desk.
4. At the store, Kate purchased a tape recorder and a new stereo.
5. As recently as two days ago, Lucy parked her car at the grocery store, and she forgot where she left it.
6. Fred wore a heavy parka and comfortable boots on the hike up Tabletop Mountain.
7. Driving along the turnpike, Kayla listened to polkas on the radio.
8. On his perch, the tiny bird called to his mate.
9. Braving the raging surf, Peter caught a towering wave and rode his surfboard to shore.
10. Over the holiday weekend, Marvin performed his magic tricks, keeping his brother Tommy amazed and amused.
11. Bonnie covered the stewed tomatoes and turned down the burner before starting to work on some pies for dessert.
12. Every time he sneaked down the stairs hoping to get himself a snack, Paul’s wife caught him and him a carrot and a piece of celery.
13. Depressed that the dentist had found three cavities, Tim pestered his mother to buy him some chocolate candy.
14. While waiting for his [sic] car to be fixed, Linda watched TV.
15. Looking through the telescope, the students saw Venus.
16. Colin browsed in the bookstore while his sister shopped for a new briefcase.