PTLC 2005 Steven Weinberger Web Accents 1
Web Accents
Steven H. Weinberger, George Mason University
1 Introduction
Every human who speaks a language has an accent, and every human who listens to
others talk perceives an accent. This is true for foreign accents and for regional accents
within the same language group.
L2 phonological studies, by definition, use accented speech as a major data source.
The conclusions of many of these studies suggest that foreign accented speech not only
contains valuable linguistic clues to a speaker’s internalized native phonology, but also
shows universal characteristics (Ioup and Weinberger, 1987; Leather and James, 1996).
In this paper I report on the construction of an archive that compiles and delivers
annotated accented speech signals. The Speech Accent Archive is structured to provide
uniform, searchable, and annotated data to anyone doing linguistic research in accented
speech. I discuss some of the methodological issues related to the Archive’s
construction, the way in which it is used as a classroom tool for applied phonetics
classes, and the research potential of this sort of database.
2 Methodology and description of the Archive
The archive is located at http://classweb.gmu.edu/accent. It currently contains more than
400 samples of English speech recorded from speakers from more than 125 different
native language backgrounds. The Speech Accent Archive presents audio samples from
each speaker, demographic information about each speaker, phonetic analyses of the
samples, and background material for each native language.
2.1 The audio samples
The recording equipment has evolved over the life of the project, from a Sony TC-D5M
stereo cassette recorder to a Sony MDR-70 minidisk recorder to an Olympus DM-10
CD-quality digital recorder. The samples are digitized at 44.1 kHz, 16-bit mono, and
then encoded and downsized 8x into QuickTime sound tracks at IMA 4:1, 22.05 kHz,
16-bit mono for web delivery.
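The 8x figure can be checked with simple arithmetic. The sketch below assumes that IMA 4:1 stores each 16-bit sample in roughly 4 bits and ignores container and block-header overhead; the function name is illustrative only.

```python
# Back-of-the-envelope check of the "8x" size reduction.
# Assumption: IMA 4:1 stores each 16-bit sample in about 4 bits;
# file-container and block-header overhead is ignored.

def bits_per_second(sample_rate_hz: float, bits_per_sample: float, channels: int = 1) -> float:
    """Raw data rate of an audio stream in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

# Digitized master: 44.1 kHz, 16-bit, mono.
master_rate = bits_per_second(44_100, 16)

# Web copy: 22.05 kHz, 16-bit samples compressed 4:1, mono.
web_rate = bits_per_second(22_050, 16 / 4)

reduction = master_rate / web_rate

print(f"master: {master_rate / 1000:.1f} kbit/s")
print(f"web:    {web_rate / 1000:.1f} kbit/s")
print(f"reduction: {reduction:.0f}x")
```

Halving the sample rate accounts for a factor of 2 and the 4:1 compression for a factor of 4, which together give the factor of 8.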
2.2 Speaker demographics
Subjects are asked a set of 7 demographic questions:
1. Where were you born?
2. What is your native language?
3. What other languages besides English and your native language do you know?
4. How old are you?
5. How old were you when you first began to study English?
6. How did you learn English? (academically or naturalistically)
7. How long have you lived in an English-speaking country? Which country?
The answers to each of these questions are posted for each speaker in the archive.
Limiting the demographic variables to these 7 distinct items allows us to triangulate on 3
major speaker parameters: language background (nos. 1, 2, and 3), age (nos. 4 and 5),
and residency (nos. 6 and 7). These are precisely the parameters that contemporary L2
acquisition theory finds to be most instrumental in determining speech proficiency.
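One way to picture this grouping is as a small record type. The following is a minimal sketch, not the archive's actual data model: the class name, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SpeakerDemographics:
    """Answers to the seven demographic questions (field names are hypothetical)."""
    birthplace: str                 # question 1
    native_language: str            # question 2
    other_languages: List[str]      # question 3
    age: float                      # question 4
    english_onset_age: float        # question 5
    learning_method: str            # question 6: "academic" or "naturalistic"
    residence_years: float          # question 7
    residence_country: str          # question 7

    def parameters(self) -> Dict[str, dict]:
        """Group the answers under the three parameters named in the text."""
        return {
            "language background": {
                "birthplace": self.birthplace,
                "native_language": self.native_language,
                "other_languages": self.other_languages,
            },
            "age": {
                "age": self.age,
                "english_onset_age": self.english_onset_age,
            },
            "residency": {
                "learning_method": self.learning_method,
                "residence_years": self.residence_years,
                "residence_country": self.residence_country,
            },
        }


# Placeholder values for illustration only; this is not a speaker from the archive.
example = SpeakerDemographics(
    birthplace="City A, Country B",
    native_language="Language X",
    other_languages=[],
    age=30,
    english_onset_age=12,
    learning_method="academic",
    residence_years=2.5,
    residence_country="Country C",
)

for name, answers in example.parameters().items():
    print(name, "->", answers)
```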
2.3 The elicitation paragraph
Each speaker reads the following paragraph:
Please call Stella. Ask her to bring these things with her from the store: Six spoons
of fresh snow peas, five thick slabs of blue cheese, and maybe a snack for her
brother Bob. We also need a small plastic snake and a big toy frog for the kids.
She can scoop these things into three red bags, and we will go meet her
Wednesday at the train station.
This paragraph was constructed to fulfill three requirements: 1) it must be short, to be
bandwidth-friendly (the paragraph contains 69 words and can typically be read within
50 seconds); 2) it must be composed of common English words known by most L2
speakers; and 3) it must contain most of the English consonants, vowels, and clusters.
The archive was originally designed to collect and deliver foreign-accented speech.
Therefore, the speech elicitation device was constructed to invoke particular second
language phonological behaviors. And while some naturalness may be compromised with a
scripted paragraph, it has been suggested that the differences between spontaneous
data and reading data are minimal (Munro and Derwing, 1994). The distribution of
sounds, with the number of occurrences in parentheses, is shown in Figure 1:
Figure 1: Distribution of sounds in elicitation paragraph
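The word count reported for the elicitation paragraph is easy to verify. The short sketch below counts alphabetic tokens in the paragraph and derives the reading rate implied by a 50-second reading; the variable names are illustrative.

```python
import re

# The elicitation paragraph, exactly as given in the text.
PARAGRAPH = (
    "Please call Stella. Ask her to bring these things with her from the store: "
    "Six spoons of fresh snow peas, five thick slabs of blue cheese, and maybe a "
    "snack for her brother Bob. We also need a small plastic snake and a big toy "
    "frog for the kids. She can scoop these things into three red bags, and we "
    "will go meet her Wednesday at the train station."
)

# A word is any unbroken run of letters; punctuation is ignored.
words = re.findall(r"[A-Za-z]+", PARAGRAPH)
word_count = len(words)

# Reading rate implied by the 50-second upper bound mentioned in the text.
words_per_minute = word_count / 50 * 60

print(word_count)                  # 69
print(f"{words_per_minute:.1f}")   # 82.8
```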
2.4 The phonetic transcriptions
Each speech sample is transcribed phonetically. Two to four trained transcribers create
the narrow transcriptions. The transcriptions concentrate on segmentals, and do not deal
with stress or tone. Even though most speakers produce continuous speech, we
arbitrarily leave spaces between each word to enhance readability. We also add extra
spaces to indicate pauses. An example of one of the transcriptions, from a native
Sicilian speaker, is given in Figure 2:
Figure 2: Sicilian1 phonetic transcription
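The spacing convention described in this section lends itself to simple machine processing. The sketch below assumes that a transcription is stored as a plain Unicode string in which a single space separates words and a run of two or more spaces marks a pause. The function name and the sample string are hypothetical; the sample is a made-up fragment, not a transcription from the archive.

```python
import re
from typing import List, Tuple


def split_transcription(text: str) -> Tuple[List[str], List[int]]:
    """Return (words, pauses) for a space-delimited transcription.

    `pauses` holds the index of every word that is followed by a pause,
    where a pause is assumed to be a run of two or more spaces.
    """
    words: List[str] = []
    pauses: List[int] = []
    # The capturing group makes re.split keep the runs of spaces in the result.
    for part in re.split(r"( +)", text.strip()):
        if not part:
            continue
        if part.isspace():
            if len(part) >= 2 and words:
                pauses.append(len(words) - 1)
        else:
            words.append(part)
    return words, pauses


# Made-up fragment for illustration: two spaces after the second word.
sample = "pliz kɔl  stɛlə"
words, pauses = split_transcription(sample)
print(words)   # ['pliz', 'kɔl', 'stɛlə']
print(pauses)  # [1]
```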
2.5 Phonological generalizations
The continuing analyses of our speech samples include a set of speaker-specific
phonological generalizations. Speech behaviors such as final devoicing, vowel
shortening, vowel insertion, etc. are catalogued and presented for each non-native
speaker. For the entire corpus, there are a total of 32 consonant, vowel, and syllable
generalizations. This is a surprising finding, given the range of non-native speakers!
2.6 Native phonetic inventories
The archive includes nearly 100 different phonetic inventories of the native languages
spoken by our subjects. They are all presented in a uniform manner. This allows users
to do limited contrastive analyses between the target language (English) and the native
language.
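A limited contrastive analysis of this kind amounts to comparing two sets of sounds. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the English set is only a small subset of the English consonants, and "Language X" is an invented inventory rather than one taken from the archive.

```python
from typing import Dict, Set


def contrast(target: Set[str], native: Set[str]) -> Dict[str, Set[str]]:
    """Compare a target-language inventory with a native-language inventory."""
    return {
        "shared": target & native,        # sounds found in both languages
        "target_only": target - native,   # target sounds absent from the native language
        "native_only": native - target,   # native sounds absent from the target language
    }


# Toy data: a partial list of English consonants and an invented Language X.
english_subset = {"p", "b", "t", "d", "θ", "ð", "s", "z"}
language_x = {"p", "b", "t", "d", "s", "x"}

result = contrast(english_subset, language_x)
for label, sounds in result.items():
    print(label, sorted(sounds))
```

Sounds listed under `target_only` are the ones a contrastive analysis would flag as candidates for substitution or other modification by speakers of the native language.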
2.7 Remote submissions
There is a web-based data submission page that allows researchers from anywhere in
the world to send in data samples. The submission page recapitulates the precise
protocol that our local graduate researchers must follow. We require that remote
researchers contact us both before and after their data collection.
3 The Archive as a Classroom Tool
The Speech Accent Archive is essentially built by the graduate students in the
Linguistics Program at George Mason University. For the last six years, we have been
using the archive to train students on field recording, narrow phonetic transcription,
collaborative research, and phonological analysis.
Our graduate linguistics program serves two types of students: those interested in
linguistic theory, and those with more applied interests. Both types of students populate
our phonetics and phonology courses, and the archive is a good vehicle to attend to their
intersecting interests. For example, in our English phonetics course, students work in
groups of three. They are given digital recorders and learn to do a controlled field
recording of a chosen non-native speaker. Recordings are done according to a strict
protocol. The students optionally learn how to digitize, encode, and compress the
sample. After some rigorous phonetic training, they work collaboratively to phonetically
transcribe their sample. The phonetic transcriptions are generally narrow ones, and this
is typically a difficult task (Shriberg and Lof, 1991). We make minimal assumptions about
the phonemic structure of the native languages. Nevertheless, it is assumed that
phonetic transcription does not always proceed in a theoretical vacuum (Laver, 1994, p.
3). There are instances when our phonetic judgments are affected by our knowledge of
contrastive analysis.
After the transcription is done, the students continue with the analysis and formulate a
set of phonological generalizations for their particular non-native speaker. Based upon
these specific generalizations, the more applied-minded students in the groups then
proceed to develop potential pedagogical interventions for the non-native speaker.
Students report that their work on the archive makes the study of phonetics more
interesting and relevant for them. The research appeals to the theoretical and applied
linguistics students. Indeed, many students from the class go on to apply for graduate
research positions in phonetics to work on our archive.
4 The Archive as a Research Tool
Data from the archive allow us to test and confirm various L2 acquisition principles such
as the Critical Period Hypothesis (Birdsong and Molis, 2001; Long, 1990; Scovel, 1988).
The database is extensive, and it allows us to construct research programs with a variety
of dependent and independent variables. Our own students have used the archive data
for their master's theses, and researchers at other institutions have productively used the
data.
5 The revised Speech Accent Archive
At the end of summer 2005, a completely new version of the archive will go online. The
revised version is completely database-driven, and will be comprehensively searchable,
allowing for more refined speaker and generalization searches. We are translating all
older IPA fonts in the archive to Unicode (Doulos SIL). The new URL will be
http://accent.gmu.edu.
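To illustrate what a combined speaker and generalization search might look like in a database-driven design, here is a minimal sketch using an in-memory SQLite database. The schema, table names, column names, function name, and rows are all assumptions made for this example and do not describe the archive's actual implementation.

```python
import sqlite3
from typing import List

# Hypothetical schema: one table of speakers, one table linking
# speakers to the phonological generalizations observed in their sample.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE speakers (
        id INTEGER PRIMARY KEY,
        label TEXT NOT NULL,
        native_language TEXT NOT NULL,
        onset_age REAL
    );
    CREATE TABLE generalizations (
        speaker_id INTEGER NOT NULL REFERENCES speakers(id),
        name TEXT NOT NULL
    );
    """
)

# Placeholder rows for illustration only; these are not archive data.
conn.executemany(
    "INSERT INTO speakers VALUES (?, ?, ?, ?)",
    [
        (1, "speaker_a", "Language X", 6.0),
        (2, "speaker_b", "Language X", 19.0),
        (3, "speaker_c", "Language Y", 12.0),
    ],
)
conn.executemany(
    "INSERT INTO generalizations VALUES (?, ?)",
    [
        (1, "vowel insertion"),
        (2, "final devoicing"),
        (2, "vowel insertion"),
        (3, "final devoicing"),
    ],
)


def find_speakers(db: sqlite3.Connection, generalization: str, min_onset_age: float) -> List[str]:
    """Labels of speakers who show a generalization and began English at or after an age."""
    rows = db.execute(
        """
        SELECT s.label
        FROM speakers AS s
        JOIN generalizations AS g ON g.speaker_id = s.id
        WHERE g.name = ? AND s.onset_age >= ?
        ORDER BY s.label
        """,
        (generalization, min_onset_age),
    )
    return [label for (label,) in rows]


matches = find_speakers(conn, "final devoicing", 10)
print(matches)  # ['speaker_b', 'speaker_c']
```

Keeping generalizations in their own table lets a single query combine a demographic filter (here, age of onset) with a phonological one.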
References
Birdsong, D., and Molis, M. (2001) On the Evidence for Maturational Constraints in
Second-language Acquisition. Journal of Memory and Language, vol. 44, pp.
235-49.
Ioup, G., and Weinberger, S. (Eds.) (1987) Interlanguage Phonology. Cambridge, MA:
Newbury House.
Laver, J. (1994) Principles of Phonetics. Cambridge: Cambridge University Press.
Leather, J., and James, A. (1996) Second Language Speech. In Ritchie, W., and Bhatia,
T. (Eds.), Handbook of Second Language Acquisition. San Diego: Academic
Press.
Long, M. (1990) Maturational Constraints on Language Development. Studies in
Second Language Acquisition, vol. 12, pp. 251-85.
Munro, M., and Derwing, T. (1994) Evaluations of Foreign Accent in Extemporaneous
and Read Material. Language Testing, vol. 11, pp. 253-266.
Scovel, T. (1988) A Time to Speak: A Psycholinguistic Inquiry into the Critical Period for
Human Speech. Rowley, MA: Newbury House.
Shriberg, L., and Lof, G. (1991) Reliability Studies in Broad and Narrow Phonetic
Transcription. Clinical Linguistics and Phonetics, vol. 5, pp. 225-279.