INTONATION IN DISCOURSE AND ITS EFFECTS ON INTELLIGIBILITY AND COMPREHENSIBILITY OF SPEECH

Patsita Boonyakanjanapaisan 1,*, Preedaporn Srisakorn 2,#, Kornwipa Poonpon 3,#

1 Master’s degree student in English, Thailand

2 Doctor of applied linguistics

3 Doctor of applied linguistics

*patsita.boonya@gmail.com, #presri@kku.ac.th, #korpul@kku.ac.th

Abstract

In international workplaces (e.g., AEC 2015), successful oral communication is necessary, yet intonation in discourse (in interaction) produced by Thai speakers of English appears to be a problem: it can cause native listeners to lose concentration and to misunderstand the speaker’s information or intent. Such intonation could affect the intelligibility and comprehensibility of speech and lead to communication failure. This study aims to examine intonation in discourse produced by Thai students and its effects on the intelligibility and comprehensibility of speech as perceived by native and non-native listeners, in order to find guidelines for improving discourse intonation. Twelve Thai undergraduate students provided speech samples by narrating a cartoon story. Discourse analysis was carried out acoustically within Brazil’s (1997) model of discourse intonation using the PRAAT computer program, focusing on tone choice and the intonational paragraph. Sixteen listeners, Thai (4), Chinese (4), Malaysian (4), and native English speakers (4), judged the speech samples for intelligibility, comprehensibility, and impression of speech. Regression analysis was also employed to find the relationship between comprehensibility scores and acoustic variables relating to the targeted intonational features. The overall results revealed that overuse of level tones, less use of rising tones, compression of the overall pitch range, pauses, and a lack of fall-rising tones may negatively affect comprehensibility, but not intelligibility. This suggests that intonation in discourse should be included in pronunciation teaching, using technology devices and an advanced model, for more effective oral communication.

Keywords: intonation in discourse, intelligibility, comprehensibility, Brazil, PRAAT

Introduction

Intonation has long been accepted as playing a key role in effective spoken language (Levis & Pickering, 2004, p. 505). Intonation in discourse means intonation in the context of interaction [1], that is, above the level of the word or the isolated sentence. In research on the intelligibility of L2 learners’ speech using discourse analysis, where intelligibility is the extent to which an utterance is understood [6], intonation is assumed to be one of the three biggest factors (stress, rhythm, and intonation) affecting intelligibility in American English.

However, non-native speakers of English are commonly reported to have limitations in these skills. Overall, the intonation patterns of Asian speakers of English in particular seemed to be a key factor causing native listeners to lose concentration [2], to misunderstand information, and to misinterpret the speakers’ intent [3]. In particular, tone choices (falling, rising, or level) produced by non-native speakers were normally ineffective in creating informational and social convergence with native listeners [2]. This made the speakers appear uninvolved and obstructed comprehensibility, the degree of difficulty in perceiving speech [6]. In addition, intonational paragraphs (that is, speech paragraphs) used by non-native speakers showed weaker control in presenting information structure than those of native speakers, negatively affecting comprehensibility for native hearers [8]. Although none of these studies included Thai students, it is of interest to find whether Thai students have these problems of intonation in discourse, especially problems affecting the intelligibility or comprehensibility of speech.

This is because, in the context of international communication, these two features are important both for successful communication and for pronunciation teaching and learning. The study therefore aimed to examine intonation in discourse produced by Thai students and its effects on the intelligibility and comprehensibility of speech as perceived by native and non-native listeners. Given this aim, three research purposes were derived: (1) to examine the use of intonation in discourse, namely tone choice and the intonational paragraph, by Thai students as compared with that of a native speaker; (2) to examine the extent of intelligibility and comprehensibility of English speech produced by Thai students, as judged by native and non-native listeners; and (3) to find whether the use of intonation in discourse by Thai students affects the intelligibility and comprehensibility of speech, as perceived by native and non-native listeners. These purposes were addressed mainly by discourse analysis, listening tasks, and regression analysis, respectively.

Methodology

Participants: the speaker participants were 12 third-year undergraduate students majoring in English (5 females and 7 males), selected by convenience sampling. They were enrolled in the first semester of the 2013 academic year at Khon Kaen University. For comparative analysis of intonation in discourse, a native English (British) speaker, a teacher, participated in the study to provide a standard native speaker performance. The listener participants were 16 native and non-native listeners from different backgrounds, also selected by convenience sampling. They represented four groups of listeners: 4 Thai, 4 Chinese, 4 Malaysian, and 4 native English listeners. These listeners, in these numbers, were assumed to represent “the target linguistic community” [6] that Thai students might encounter in future workplaces.

Research design: the present study used a concurrent triangulation approach that combined qualitative and quantitative data. The results obtained were compared, converged, and interpreted as the overall results. One set of quantitative data was derived from discourse analysis, carried out acoustically within Brazil’s (1997) model using both auditory and instrumental analysis (PRAAT); this speech analysis also provided qualitative data. The other quantitative data came from a general multiple regression analysis used to find the relationship between comprehensibility scores and acoustic variables. The qualitative data were obtained from the listeners’ comments on their comprehensibility ratings and their impression of speech.

Research instruments: the main tools used in the study were a speaking task for eliciting speech samples, a computer program for acoustic and discourse analysis, and a listener judgment package.

(1) Speaking task: narrative or story-telling (using a cartoon story). This narrative task, inspired by [6], was used to elicit speech samples from the speaker participants (both the students and the native speaker). The speaker participants were asked to narrate the cartoon story in English page by page after having read the whole story. The cartoon was used mainly to stimulate the speakers to produce intonation in discourse, so it contained many funny and surprising themes. As such material is rarely available, the cartoon was created by the researcher. The principles for developing it were adapted from a trial study conducted prior to the main study.

(2) Computer program for acoustic and discourse analysis: for acoustically analyzing intonation in discourse and for extracting shorter excerpts from the long sound files, the PRAAT computer program (version 5.3.59) was used. It is a widely known speech analysis program, developed by Boersma and Weenink (2004) at the Institute of Phonetic Sciences of the University of Amsterdam, and can be freely downloaded at http://www.praat.org. The main functions utilized were “Show pitch” and “Get pitch”. In addition, Audacity, free audio editing software distributed under the GNU General Public License (GPL), was used for recording the long sound files; it was downloaded at http://audacity.sourcefor.

(3) Listener judgment package: this tool was sent to the listeners for judging speech at their homes. The package contained a set of judging tasks (intelligibility, comprehensibility, and impression of speech) and a CD containing the speech samples.

For intelligibility, the study used a dictation (or transcription) task, adapted from [6] and [5]. It elicited intelligibility scores by asking the listeners, after listening to each speech sample, to transcribe it on paper in standard orthography, word for word, writing down exactly what was said.

For comprehensibility, a 9-point Likert scale (ranging from 1, extremely easy to understand, to 9, extremely difficult to understand), adapted from [6] and [5], was employed to elicit comprehensibility scores. The listeners were asked to rate each speech sample on this scale after listening to it. To improve the validity of the scale, the researcher additionally defined point 4 as a bit difficult and point 6 as quite difficult to understand, to guide the various listeners to rate in the same way. Moreover, an open-ended question asked the listeners to provide comments on their ratings immediately after rating the comprehensibility of each speech sample.

For impression of speech, an open-ended question, adapted from [2] and [3], was used to elicit the listeners’ impression. The listeners were asked to respond to the question ‘Do you understand the story and what is your impression to the speech sample if possible in terms of intonation’.

Data collection: this included three main parts: eliciting speech samples, preparing speech samples, and judging of the speech samples by the listeners. All data were collected in the first semester of the 2013 academic year.

For eliciting speech samples, adapted from [6] and [7], the Thai students’ speech was recorded in a classroom in one day using a netbook computer with the Audacity program and a microphone. In the classroom, the 12 Thai students were asked to respond to the speaking task by narrating the cartoon story after having read the whole story. Based on the trial study, all students equally had about thirty minutes to prepare before starting. The speaking order was randomly selected and alternated between male and female students. Each narrative was recorded and saved as a digital (.wav) file. The speech sample of the male NS teacher was elicited later in a situation as similar as possible to that of the students (a quiet lecture room).

For preparing speech samples, each narrative of the 12 Thai students, as a digital (.wav) file, was transferred to PRAAT, and three excerpts were extracted from it (still as .wav files). For the investigation of intelligibility, adapted from [6] and [5], an excerpt ranging in length from 4.5 to 10.5 seconds (M = 7.0 seconds) was extracted from each narrative. The selected phrase had to be of sufficient length, still transcribable by the listener after listening once, and had to end with a natural pause. For studying comprehensibility, inspired by [2], an excerpt with an average length of 1.0 minute was extracted from each narrative. For eliciting impression of speech, also inspired by [2], the extracts used for comprehensibility (average length 1.0 minute) were used. Based on [7] and [4], the utterances were selected from the opening of the sound files, which contained equivalent narratives of the same theme or story. Then, all excerpts (N = 36) were imported into Audacity and converted into mp3 files to serve as the speech samples. This file type (mp3) is small in size and can be saved on a CD.

The judging of the speech samples by the listeners, adapted from [2] and [3], was conducted at the listeners’ homes. Prior to this task, one listener from each group (except the Malaysian group) met the researcher face to face to receive a briefing on the task and the materials for the whole group. In the listening session at home, adapted from [6], the listeners were asked to hear each speech sample (mp3) once before judging, and they could change their evaluations. All speech samples were presented in random order to each listener. The listening task had three main parts. First, the listeners judged the intelligibility of each speech sample (M = 7 seconds) by transcribing it word for word on paper. Second, they rated each of the longer speech samples (M = 1 minute) for comprehensibility by checking a box on the 9-point Likert scale, and they were asked to provide comments on their ratings immediately after rating. Finally, they were asked to respond to each speech sample (M = 1 minute, the same samples as for the comprehensibility rating) by writing down their impression of the speech on paper. Even though the listening task was not carried out as a controlled experiment, the listeners were asked to perform the task in a quiet setting with a headset wherever possible.

Data analysis: the study used (1) discourse analysis, carried out acoustically within Brazil’s (1997) model of discourse intonation using PRAAT, in which tone choices were counted and intonational paragraphs were analyzed with a focus on key; (2) computation of intelligibility scores and comprehensibility scores; and (3) a two-phase comparison. First, the data from the listeners’ responses, the intelligibility and comprehensibility scores, and the data from discourse analysis were compared and converged. Second, multiple regression (following [2] and [3]) was used to find the relationship between the acoustic variables (relating to tone choice and intonational paragraph) and the comprehensibility scores. The data from these two phases were then compared and interpreted as the overall results.

For discourse analysis, Brazil’s (1997) model of discourse intonation was used. Central to Brazil’s (1997) theory is the idea that the communicative value of intonation relates to knowledge of the common ground, the area of world views, such as knowledge, experience, and attitude, that the speaker assumes to share with the hearer moment by moment in the interaction (p. 70). When a speaker assumes common ground shared with the hearer, he or she makes tone choices based on this assumption. Tone choice means the choice of tone, or pitch movement, on the tonic syllable of the tone unit [4]. In Brazil’s (1997) theory, there are five tones: fall, rise-fall, level, fall-rise, and rise. In research, these five tones are normally treated as three: falling (including rise-fall), level, and rising (including fall-rise). The combination of tone choices in any given discourse is called tonal composition [7].

Because tone choices rest on the speaker’s assumptions about what is shared with the hearer, they represent common ground [4]. Falling tone (including rise-fall final contours) manifests the speaker’s assumption that the content of the tone unit is not yet present in the common ground. That is, it is a new assertion that is world-changing for the hearer and cannot be recovered from the prior context [7]. Rising tone (including fall-rise final contours) indicates the speaker’s assumption that the matter of the tone unit is common ground, part of the shared understandings between participants. This information may be recovered from the immediate context or the area of common ground. Based on Brazil’s theory, [7] and [4] note that the speaker can make tone choices over an utterance or a longer stretch either to create a state of convergence with the hearer or to signal temporary withdrawal from the context of interaction. For the former, the speaker uses a combination of mainly falling and rising final tone choices to convey both informational and social meanings. Each tone choice is based on the speaker’s assumption about whether its content is a new assertion or common ground shared with the hearer, and the tone is projected directly to link to the hearer’s world. Using a combination of falling and rising tones is referred to as direct discourse. For the latter, the speaker exploits a combination of largely falling and level tones. The speaker uses this kind of tone choice when, for example, focusing on the language sample itself, reading out a text, or using language formulae. Level tone, or neutral tone, in particular is naturally used when thinking, hesitating, or using routine language. Its meaning is neither new nor shared with the hearer. This combination of largely falling and level tones therefore shows a lower level of involvement with the hearers in the interaction [4] and is called oblique discourse. Therefore, to create convergence with the hearers, direct discourse should be used.
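The collapsing of Brazil’s five tones into three categories, and the notion of tonal composition, can be illustrated with a minimal Python sketch. The function name, the dictionary, and the sample sequence of tone choices are hypothetical and are not taken from the study’s data; the sketch only counts tones and does not attempt to label a stretch of speech as direct or oblique discourse.

```python
from collections import Counter

# Mapping of Brazil's five tones onto the three categories used in research.
FIVE_TO_THREE = {
    "fall": "falling",
    "rise-fall": "falling",
    "level": "level",
    "fall-rise": "rising",
    "rise": "rising",
}


def tonal_composition(tones):
    """Return the share (in percent) of falling, level, and rising tones."""
    counts = Counter(FIVE_TO_THREE[t] for t in tones)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one tone choice is required")
    return {cat: 100.0 * counts[cat] / total
            for cat in ("falling", "level", "rising")}


# Hypothetical tone choices, one per tone unit (illustration only).
sample = ["fall", "level", "level", "rise", "fall-rise",
          "rise-fall", "level", "fall"]
composition = tonal_composition(sample)
print(composition)
```

A composition with a large share of level tones and few rising tones would correspond to the more oblique pattern described above.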

The intonational paragraph refers to “a unit above the level of the tone unit and equivalent to a paragraph in written discourse” (Lehiste, 1979, cited in [4]). In other words, it represents a speech paragraph. According to [8], analysts accept that the speech paragraph is created by the speaker and interpreted by the hearer using phonetic cues appearing at its boundaries. One cue is a high pitch onset, occurring on the first onset syllable within a speech paragraph. (The onset syllable is the first prominent syllable in a tone unit.) The other cue is a low pitch close, occurring on the last tonic syllable within a speech paragraph. (The tonic syllable is the last prominent syllable in a tone unit, which carries information.) Between these two boundaries, there is likely to be a gradual fall in pitch on prominent syllables from the first to the final tone unit within a speech paragraph, indicated by the overall F0 contour ([1]; [4]; Tench, 1996, cited in [8]). Speakers can exploit the speech or intonational paragraph in any type of text, both in speaking and in reading, to mark topic structure [4]. In speaking, a high pitch onset is used to initiate a new topic, a mid pitch onset for topic continuation, and a low pitch close for a topic-final boundary. When the speaker moves to the next topic (a topic shift), a high pitch onset is used once again. A high pitch onset is therefore the most prominent cue of a new paragraph unit. In Brazil’s (1997) theory, the intonational or speech paragraph is similar to his unit, the pitch sequence (Pickering, 2004), “a stretch of speech which ends with low termination and has no occurrences of low termination within it” ([1], p. 120). That is, this structural unit always has exactly one low termination. (Termination is the pitch level on the tonic syllable in a tone unit; for the place of key and termination in a tone unit, see Table 1 below.) A pitch sequence can begin with a high, mid, or low key choice (key is the pitch level on the first prominent syllable in a tone unit), while a speech paragraph notably begins only with a high key, or high pitch onset. This could mean that a pitch sequence is a smaller unit than a speech paragraph; in other words, a speech paragraph may contain several pitch sequences. To extend the pitch sequence so that it relates to a speech paragraph, Barr (1990, cited in [8]) proposes the sequence chain, defined as a string of pitch sequences in which only the first sequence begins with a high key. A sequence chain thus contains one or more pitch sequences and, like a speech paragraph, opens with a high key or high pitch onset. The sequence chain structure therefore indicates the speech paragraph, and a high key is the salient cue that signals the beginning of each chain. The present study employed this structure to identify speech paragraphs, focusing particularly on how the Thai students used high pitch onsets or high keys to signal topic beginnings or topic shifts.
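The relation between tone units, pitch sequences, and sequence chains described above can be sketched in Python as follows. This is only an illustration under stated assumptions: each tone unit is represented as a dictionary with hypothetical "key" and "termination" labels ("high", "mid", or "low"), the labelled units are invented, and the study itself identified these structures by auditory and instrumental analysis rather than by such a script.

```python
def pitch_sequences(tone_units):
    """Split tone units into pitch sequences, each ending with a low termination."""
    sequences, current = [], []
    for unit in tone_units:
        current.append(unit)
        if unit["termination"] == "low":
            sequences.append(current)
            current = []
    if current:  # trailing units that never reach a low termination
        sequences.append(current)
    return sequences


def sequence_chains(sequences):
    """Group pitch sequences into chains; a sequence opening with a high key starts a new chain."""
    chains = []
    for seq in sequences:
        if seq[0]["key"] == "high" or not chains:
            chains.append([seq])
        else:
            chains[-1].append(seq)
    return chains


# Hypothetical labelled tone units (illustration only, not study data).
units = [
    {"key": "high", "termination": "mid"},
    {"key": "mid", "termination": "low"},   # closes pitch sequence 1
    {"key": "mid", "termination": "low"},   # pitch sequence 2, same chain
    {"key": "high", "termination": "mid"},  # high key opens a new chain
    {"key": "mid", "termination": "low"},   # closes pitch sequence 3
]
sequences = pitch_sequences(units)
chains = sequence_chains(sequences)
print(len(sequences), [len(chain) for chain in chains])
```

In this sketch each chain corresponds to one speech paragraph, so the number of chains gives the number of paragraphs signalled by high keys.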

Table 1. The location of key and termination in a tone unit, adapted from Brazil (1997, p. 12).

A tone unit with two prominent syllables:
    onset syllable: key
    tonic syllable: termination

A tone unit with one prominent syllable:
    prominent syllable: key, termination

For computing intelligibility scores, the transcriptions were scored using a percent-correct score, the number of words correctly transcribed divided by the total number of words in each speech sample. Furthermore, based on [5], the types of transcription errors, where the listener’s transcription differed from the researcher’s transcription of the speech sample, were considered, for example omission of a word or morpheme and substitution. The comprehensibility scores, obtained on the 9-point Likert scale, were analyzed using the mean and standard deviation (SD).
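The two scores can be sketched in Python as below. The matching rule is an assumption: a reference word counts as correctly transcribed if it appears in the listener’s transcription, each word being matched at most once, with case ignored and punctuation not handled. The reference phrase is taken from the Fig 2 excerpt, while the listener transcription and the ratings are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def percent_correct(reference, transcription):
    """Percentage of reference words that also occur in the transcription."""
    ref_words = reference.lower().split()
    heard_words = transcription.lower().split()
    # Multiset intersection: each reference word is matched at most once.
    matched = sum((Counter(ref_words) & Counter(heard_words)).values())
    return 100.0 * matched / len(ref_words)


reference = "and then naphorn gave them bread"
# Hypothetical listener transcription: one omission ("them") and one
# substitution ("give" for "gave").
transcription = "and then naphorn give bread"
score = percent_correct(reference, transcription)

# Hypothetical comprehensibility ratings on the 9-point scale.
ratings = [2, 3, 1, 4]
print(round(score, 1), mean(ratings), round(stdev(ratings), 2))
```

Here `stdev` is the sample standard deviation; whether the study used the sample or the population formula is not stated.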

For the validity of the acoustic analysis, all acoustic measures and analyses were conducted twice by the researcher, the second time three weeks after the first. Inconsistent results were analyzed again before the final values were determined. For the reliability of the comprehensibility rating task, the standard deviation (SD) was considered.

Results

Discourse analysis showed that, for tone choices, the Thai students, particularly the male students, used level tones about twice as often as the native speaker. Rising tones were used somewhat less than by the native speaker, while the use of falling tones did not differ much. The overuse of level tones may make speech less expressive and more oblique and may reduce the degree of involvement with the hearers, which may obstruct comprehensibility. Fig 1 shows the results.

[Figure: bar chart of Rising Tone, Level Tone, and Falling Tone for three speaker groups (NS Male, Thai Female, Thai Male); vertical axis scaled from 0 to 60.]

Fig 1. Tone choices of Thai students compared with a male native speaker.

For the intonational paragraph, 4 of the 12 Thai students (male1, male2, female1, and female2) created paragraph structure similarly to and consistently with the native speaker. Fig 2 illustrates male1’s paragraph structure. He started the paragraph with a high pitch onset on the word AND (284 Hz), after the preceding paragraph had been closed by a low termination and a long pause (1.3 s). He then produced a potential fall in pitch on the words THEN (150 Hz), NAPHORN (250 Hz), and SO (279 Hz), ending with a low termination not shown in the figure. The first onset, AND, thus had the highest pitch of the paragraph. This made the speech more expressive, and the overall pitch range was wide (216 Hz), as shown by the steep intonation contour.

NEIGHBOR//// (1.3) //// AND THEN// NAPHORN// GAVE them BREAD/// /// SO// they SAID//…

Fig 2. Thai male1’s paragraph structure.
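The check that a paragraph opens with its highest onset can be sketched in Python using the four pitch values reported for male1. The function names are hypothetical. Note that the span over these four values (134 Hz) is smaller than the overall pitch range of 216 Hz reported in the text, presumably because that figure also covers the low termination, which is not among the listed values.

```python
def starts_with_highest_onset(onsets_hz):
    """True if the first onset has the highest pitch among the listed onsets."""
    return onsets_hz[0] == max(onsets_hz)


def pitch_span(values_hz):
    """Difference between the highest and lowest listed pitch values, in Hz."""
    return max(values_hz) - min(values_hz)


# Pitch values reported in the text for male1: AND, THEN, NAPHORN, SO.
male1_onsets = [284, 150, 250, 279]
opens_high = starts_with_highest_onset(male1_onsets)
span = pitch_span(male1_onsets)
print(opens_high, span)
```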

Of the remaining students (n = 8), 5 (male3, male4, male5, female3, and female4) showed average speech performance. That is, they did produce speech paragraphs, but the structure was not consistently smooth or complete. The main factors hindering paragraph structure appeared to be compression of the overall pitch range and pauses. Regarding compression of the overall pitch range, most of these students, especially the males (i.e., male3, male4, male5, and female3, but not female4), tended to produce key choices, or onset syllables, that were not adequately high, and the same held for the other onsets inside the paragraph. Fig 3 illustrates an example of male3’s compressed pitch range. The word toDAY was a mid onset with a very low pitch level (110 Hz), and the next paragraph began with a (low) onset, THAT, also at a low pitch level (88 Hz).

… toDAY AM// going to talk aBOUT// the STOry//// //// THAT /// NAME// the…

Fig 3. An example of the compressed pitch range of male3.

Regarding pauses, the speech of some of these students (i.e., male3, male4, and female4, but not male5 and female3) frequently contained long pauses (more than 0.8 s) inside the paragraph structure. These long pauses, or topic-length pauses, which imply a paragraph boundary, broke a normal paragraph into shorter ones containing only a small number of words. This made the speech sound fragmented rather than smooth, as in example 1:

…I split// (0.7) // OUT in THREE PART// PART ONE// (1.4) //// PART ONE IS//// (0.9) ////a COMing OF a STRANger//// (1.9)

For the last 3 students (male6, male7, and female5), these two problems, especially pauses, were severe. Almost their entire speech contained very long pauses, so that paragraph structure did not emerge; even a simple sentence was fragmented and could hardly be finished. Thus, the first research purpose was fulfilled by these data: the tone choices used by the Thai students differed from those of the native speaker in the overuse of level tones and the lesser use of rising tones (considered together with the compression of the overall pitch range). The intonational paragraph was very different and was frequently broken down by compression of the overall pitch range and pauses. This obstructed both informational and social meanings and thus reduced comprehensibility.
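The effect of long pauses on paragraph structure can be sketched in Python by splitting example 1 at every pause longer than 0.8 s. The segmentation of the example into tone units is one reading of the transcription, the value 0.0 stands for a boundary with no reported pause, and the function name is hypothetical.

```python
TOPIC_PAUSE_S = 0.8  # pauses longer than this are treated as paragraph boundaries


def split_at_long_pauses(tone_units, threshold=TOPIC_PAUSE_S):
    """Group (text, following_pause_seconds) pairs into chunks separated by long pauses."""
    chunks, current = [], []
    for text, pause in tone_units:
        current.append(text)
        if pause > threshold:
            chunks.append(current)
            current = []
    if current:  # units after the last long pause
        chunks.append(current)
    return chunks


# Example 1 from the text; 0.0 marks a boundary with no reported pause.
example_1 = [
    ("I split", 0.7),
    ("OUT in THREE PART", 0.0),
    ("PART ONE", 1.4),
    ("PART ONE IS", 0.9),
    ("a COMing OF a STRANger", 1.9),
]
chunks = split_at_long_pauses(example_1)
for chunk in chunks:
    print(" // ".join(chunk))
```

Under this reading, what is a single sentence in content is broken into three short chunks, which matches the fragmented impression described above.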

For the second research purpose, the mean intelligibility score of all students was 83.5%, with only trivial transcription errors. The mean comprehensibility score of all students was 2.6, close to 1 (extremely easy to understand). Overall, the speech samples used in this study were thus highly intelligible and comprehensible.

For the final research purpose, the low-rising tone contributed significantly to the comprehensibility scores, with a high conjoint effect (R² = 58.2% of the variance, p < 0.05) and a strong unique effect (partial correlation = –0.7). This negative relation meant that the more this tone was used, the lower the scores were, that is, the closer to 1 (extremely easy to understand). Although the speech samples were still intelligible and comprehensible, discourse analysis showed that the overuse of level tones, compression of the overall pitch range, and pauses were problematic, affecting both native and non-native listeners’ perception. The low-rising tones, which were related to the comprehensibility scores, were rarely used by the students. Hence, overall, the use of intonation in discourse by the Thai students had negative effects on comprehensibility, but not on intelligibility.
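The quantities reported above can be illustrated with a simplified Python sketch. It is not the study’s multiple regression: it uses a single hypothetical predictor (a count of low-rising tones per speaker), for which R² equals the squared Pearson correlation, and it shows the standard first-order partial correlation formula. All data values and function names are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: low-rising tone counts and mean comprehensibility
# scores (1 = extremely easy to understand) for five speakers.
low_rising = [0, 1, 2, 3, 4]
scores = [4.0, 3.5, 3.0, 2.0, 2.5]

r = pearson_r(low_rising, scores)
r_squared = r ** 2  # share of variance explained with a single predictor
print(round(r, 2), round(r_squared, 2))
```

A negative r, as in this invented data set, has the same reading as in the text: more low-rising tones go with lower scores, that is, with speech that is easier to understand.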

Discussion and Conclusion

According to [7], non-native speakers normally use level tones and fewer rising tones because of problems of linguistic coding, such as on-line verbal planning. Such problems also extend to the ineffective use of intonational paragraphs [8]. Hence, for the goal of comprehensibility in interaction, discourse intonation should be included in pronunciation teaching to overcome these obstacles. For tone choices, level tones and rising tones (including fall-rising) should be focused on to increase informational and social convergence with the hearers. For the intonational paragraph, key, overall pitch range, and pauses should be pointed out to create a more coherent organizational structure. Tasks must be carried out in the context of interaction (discourse), such as vocal warm-ups and narratives like the one used in this study, which can control many features, such as themes and vocabulary, while being fun and allowing individual speech performance. Technology devices could be used to facilitate pronunciation teaching and learning, such as video providing visual feedback on performance. PRAAT could be used to display the intonation contour visually, as it is easy enough for novice users. For understanding discourse intonation, the model of Brazil (1997) is not too hard to use. It is hoped that Thai students can improve their intonation in discourse and use it as an effective tool for presenting spoken discourse in future workplaces. However, the study had some limitations. First, the number of listeners was small and the rating conditions were not experimentally controlled, so the findings may not generalize to the population. Second, the study was limited to monologues by students at the intermediate proficiency level and to listeners from only four nations.

References

1. Brazil D. The communicative value of intonation in English. UK: Cambridge University Press. 1997.

2. Kang O. Relative salience of suprasegmental features on judgments of L2 comprehensibility and accentedness. System. 2010; 38: 301-315.

3. Kang O, Rubin D, Pickering L. Suprasegmental measures of accentedness and judgments of language learner proficiency in oral English. The Modern Language Journal. 2010; 94(4): 554-566.

4. Levis J, Pickering L. Teaching intonation in discourse using speech visualization technology. System. 2004; 32: 505-524.

5. Munro M, Derwing T. Foreign accent, comprehensibility, and intelligibility in the speech of second language learners. Language Learning. 1995; 45(1): 73-97.

6. Munro M, Derwing T, Morton S. The mutual intelligibility of L2 speech. Studies in Second Language Acquisition. 2006; 28: 111-131.

7. Pickering L. The role of tone choice in improving ITA communication in the classroom. TESOL Quarterly. 2001; 35(2): 233-254.

8. Pickering L. The structure and function of intonational paragraphs in native and nonnative speaker instructional discourse. English for Specific Purposes. 2004; 23: 19-43.
