Original Article
Developing second language speaking skills: Eliciting repeated speech to increase fluency
and accuracy
Colleen K Davy
(Department of Psychology), Carnegie Mellon University
Brian MacWhinney
(Department of Psychology), Carnegie Mellon University
Abstract
Repeated production of several-minute speeches on a given topic leads to increases in
fluency and complexity of those speeches (Bygate et al. 2001) as well as other speeches
given weeks later (de Jong & Perfetti, 2011). However, this type of task does not improve
accuracy, because the task demands do not allow the speaker to immediately correct
errors, which is necessary for greater accuracy. Highly constrained, sentence-level
rehearsal exercises (e.g., Yoshimura & MacWhinney, 2007) may be more conducive to
acquiring fluent and accurate production of new structures and vocabulary items. This
study investigates the use of a repeated sentence imitation exercise designed to improve
second language speech fluency in native English L2 Spanish learners. We find that, though
the exercise is not communicative in nature, it requires learners to process the meaning of the
sentence and reconstruct it, rather than simply echoing what they hear. Furthermore, this
task leads to increases in fluency and accuracy with each production. Finally, we show that
manipulating the sentences used in practice can adjust the fluency and accuracy of
production of similar sentences. We conclude that imitation can be useful both as a
pedagogical tool and as a method for studying processes in second language learning.
Keywords
Repeated imitation, fluency, second language acquisition, speaking
Introduction
Speaking is typically considered the hardest skill to acquire when learning a second
language. Language learners continue to struggle in the development of fluency in speech
production, often failing to achieve improvements in speaking fluency even after two years
of intensive and immersive language instruction (Derwing et al., 2007). One potential
reason for this is that, in situations where the learner does not speak the language much
outside the classroom, there are few opportunities for the development of speaking skills.
Furthermore, Rossiter, Derwing, Manimtim, and Thomson (2010) have shown that many
foreign language textbooks and teacher resource manuals do not provide opportunities for
speaking, and those that do often fail to focus on fluency. They note that many of the
speaking activities provided are free-production tasks, with little control over the
structures and vocabulary produced. Further, these activities tend not to provide
opportunities for rehearsal or repetition, as repetition and rehearsal in the language
classroom are considered non-realistic. In a recent paper, DeKeyser (2011) makes a case
for the importance of rehearsal, suggesting that teachers can develop tasks that provide
repetitive speaking activities while still being realistic. However, while naturalistic
activities can be more useful, designing them requires effort and considerable ingenuity.
We suggest that there may be a role in second language
acquisition (SLA) for non-naturalistic activities that provide repeated practice of speaking.
In this paper we will provide evidence that highly constrained rehearsal activities can be
useful for developing speaking skills, even when they are not entirely naturalistic.
Developing speaking skills through rehearsal
Current second language pedagogy tends to emphasize the importance of learning through
naturalistic, communicative speaking tasks sometimes complemented with structured
input (e.g., Krashen, 1982; Van Patten and Cadierno, 1993). This format provides little room
for repetitive practice. However, research on skill acquisition (e.g., Carlson, Sullivan, and
Schneider, 1989; Anderson, 1993) has long suggested that performing complex tasks
requires large amounts of repeated practice to achieve fluent execution. Responding to this
issue, DeKeyser (2011) has recently called for a return to a focus on rehearsal in the second
language classroom, suggesting that, while speaking tasks should be naturalistic (i.e., more
realistic than the traditional audio-lingual method of mere listen-and-repeat tasks), a bit of
realism can be sacrificed in favor of ensuring sufficient practice on a wider variety of
grammar and vocabulary than can typically be covered in truly naturalistic contexts.
Bygate (2001) found that having speakers perform the same speaking activity many
times over a ten-week period led to increases in fluency and complexity of that speech,
even with weeks in between each rehearsal. A common format for such rehearsals is the
4/3/2 task (Nation, 1989), in which speakers give a speech on a topic in four minutes, then
three minutes, then two. De Jong and Perfetti (2011) found that use of the 4/3/2 task led to
improvements in fluency not only on the rehearsed speeches, but also on other speaking
activities completed weeks after training. Moreover, these effects were only found if the
practice involved rehearsal on the same speech: the control condition which practiced
three different speeches did not show these effects, showing that it is not simple speaking
practice that leads to improvements, but the repeated nature of the practice.
The authors of these studies suggest two possible explanations for the increases in
performance. First, the first repetition activates the useful lexical and grammatical nodes
needed for performance, and these remain partially activated during the second repetition,
leading to easier selection in the short term, which lowers cognitive load and
frees more resources for producing speech fluently. Second, as shown by skill
acquisition models (e.g., Anderson’s ACT-R model, 1993), multiple repetitions of the same
task can lead to proceduralization of grammatical forms or faster retrieval of lexical items,
which then leads to lowered cognitive load in producing utterances, allowing for more
efficient planning of future speech.
However, given the short time period and low number of practice trials, it seems
unlikely that enough proceduralization is occurring to lead to the observed effect. We offer
a third explanation, which is simply that the repetition provides “fluency facilitation” by
allowing the speaker to focus on fluency rather than other aspects of the task. This is
similar to Skehan’s (2009) suggestion that when the speaker is given multiple
opportunities to give a speech, they will refocus their attention each time. Thus, rather
than necessarily leading to proceduralization, repeating a task allows the speaker to
refocus their attention on a different dimension; in the case of Bygate’s study, the relevant
dimensions are complexity and fluency rather than accuracy. This fluency facilitation effect is
consistent with the finding that, while fluency and, in some cases, complexity increased,
these studies did not find increases in accuracy. If these increases were due to
proceduralization, accuracy should increase as well, since proceduralization often begins
with the refinement of procedures to produce correct output. Bygate (2001) posits that
increases in accuracy cannot occur in longer monologic or dialogic tasks, even those that
allow for repeated practice, because the gap between the repetitions is too long. Moreover,
increases in accuracy require self-monitoring and immediate correction, which cannot
occur in such long tasks. In order to allow for increases in accuracy (and potentially
fluency), speakers must repeat the problem structure immediately, or at least very soon,
after the error.
So, while naturalistic speaking activities may fulfill the requirement of being more
connected to form-meaning mappings, they still have shortcomings, both in their ability to
lead to more accurate production and subsequent proceduralization of correct language
and their usefulness as a method of exploring the processes and mechanisms behind the
development of speaking skills. To fill in these pedagogical and theoretical gaps, we
suggest that even more constrained, but less naturalistic, tasks may be useful. For example,
Yoshimura & MacWhinney (2007) found that prompting repeated rehearsal of individual
sentences through overt reading led to increases in fluency and accuracy of production of
sentences containing new vocabulary items. They found these improvements even in as
non-naturalistic a context as a read-aloud task, suggesting that highly controlled practice
like this can still be useful as a method of practice in L2 learners who are still struggling
with particular vocabulary items and constructions. This is important because L2 learners
often avoid language that they are not yet comfortable with, which may delay acquisition of
new structures and vocabulary. Furthermore, controlling the exact input and output
learners use facilitates careful experimental comparisons.
Although overt reading does lead to improvements, it is very different from actual
speech production. A different task that does not rely on the use of reading and is more
reconstructive may provide a more realistic speaking process and thus produce greater
effects than a reading aloud task. We propose that sentence imitation represents a natural
and effective way to prompt repeated practice of new or problematic constructions.
The Repeated Imitation (RI) task
Sentence repetition is often used in self-practice materials, such as the Pimsleur method, as
well as informally in the classroom environment. However, the effects of those methods
have seldom been subjected to detailed experimental evaluation of the type we will present
here. It is important to remember that sentence imitation is a perfectly natural process.
Young children often spontaneously mimic adult speech, especially words and phrases they
are not yet ready to use in their own speech (for one such study on child imitation, see
Bloom, Hood, & Lightbown, 1974). Some cultures specifically encourage imitative speech;
for example, the Kwara'ae, a Melanesian people in the Solomon Islands, raise children with
an activity referred to as calling-out, where adults prompt children to relay information or
make requests of others by repeating after them. Subsequently, imitating adult speech
becomes a significant part of the child’s life during the language development stage (see
Watson-Gegeo and Gegeo, 1973).
One fundamental concern with using imitation as a training method is that since
learners are merely repeating what they hear, they may not be creating form-meaning
mappings during this task that can lead to morphosyntactic development and appropriate
speech usage. However, Erlam’s (2006) review of the use of the Elicited Imitation task used
in L2 assessment has suggested that the ability to repeat sentences correctly is tied closely
to the ability of the learner to comprehend the sentence, and that performance on this task
is highly correlated with other more widely-used measurements of L2 speaking ability.
We propose that imitation is useful for more than assessment. In this paper we explore the
use of imitation in the form of an exercise that we refer to as Repeated Imitation, which
consists of listening to a native speaker and immediately repeating what was said, then
repeating this process multiple times in a row for each sentence. We suggest that the
native speaker model and iterative nature of the task will allow the students to practice and
monitor their own speech, providing practice without requiring a native speaker or teacher
for feedback. The two studies in this paper investigate the use of this task as a method of
practice for proficient language learners. We predict the task will show the following
effects:
1. Through each repetition of the RI practice, participants will improve in accuracy.
2. Through each repetition of the RI practice, participants will improve in fluency.
3. Participants will be able to produce sentences they have rehearsed more quickly and
accurately than those they did not practice.
4. Since RI is reconstructive and not merely echoic, the speech participants are able to
produce will closely mirror the way in which they translate the sentence.
Study 1
Study 1 tested using RI as a method of rehearsal elicitation, focusing specifically on
whether this task would lead to improvements on producing the target sentences, and
whether participants processed the sentences they repeated or were merely echoing
phonological input.
Methods
Procedure. After obtaining informed consent, participants were seated in front of a
computer and were instructed that during this task they would hear a sentence, after which
they would repeat the sentence back as quickly and accurately as possible. They initiated a
trial by clicking a button, at which point the word “Listen” appeared on the screen, and the
sentence was presented through the computer speakers. After the sentence finished
playing, the words “Repeat Now” appeared on the screen and they repeated the sentence
they just heard, pressing the space bar to stop recording. After speaking, they translated
the sentence into English and rated their performance on the sentence on a scale of 1 to 7,
with 1 being the lowest and 7 being the highest. They repeated this process four times for
each sentence, translating the sentence and rating their speech in between each trial.
*** Did you really have them translating after each repetition? Why so much? Also there
seems to be a confusion between “repetition” and “trial”. Why use both terms? This
introduces confusion later ***
Participants returned one week after the Training session for a post-test, where they heard
the sentences and repeated each of them back one time, to measure the long-term effects of
the practice.
Stimuli. The stimuli were 40 sentences taken from the Foreign Service Institute’s Basic
Spanish program. The sentences had between 4 and 19 words, with an average of 8.42
words, and between 9 and 31 syllables, with an average of 15.84 syllables. For this first
study, we used a wide range of constructions, consisting of both statements and questions,
in the indicative and subjunctive mood, and in present, preterite, imperfect, and future
tense. These sentences were spoken by a number of different speakers, both male and
female.
Participants. The participants in this study were nine students at Carnegie Mellon
University currently enrolled in a third semester Spanish class.
Coding. The data in this study were transcribed using Praat speech analysis software
(Boersma & Weenink, 2007) and were coded for temporal data and grammatical errors. As
a measure of fluency, we calculated the mean duration of utterance (MDU) by subtracting
the onset of speech from the end of production. We then coded the utterances for
completeness and accuracy, first marking whether the speaker completely reproduced the
sentence, then coding for a number of different errors. We coded for verb-related errors
(such as tense, subject-verb agreement, and conjugation errors), lexical errors
(such as choosing the wrong word or using incorrect number or gender), and
pronunciation errors. We do not distinguish, however, between errors of omission
(missing a word) and commission (using the wrong word). For the current analysis, we
combined all errors into one measurement of Total Errors per sentence.
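The two measures described above can be sketched as follows. This is a minimal illustration, not the coding pipeline actually used: the class name, field names, error labels, and timing values are all hypothetical, and onset and offset times are assumed to have been read off the annotated recordings in seconds.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Utterance:
    """One coded production (hypothetical structure for illustration)."""
    speech_onset: float   # seconds, start of the participant's speech
    speech_offset: float  # seconds, end of production
    errors: Dict[str, int] = field(default_factory=dict)  # error type -> count

    @property
    def duration(self) -> float:
        # Duration of the utterance: end of production minus onset of speech.
        return self.speech_offset - self.speech_onset

    @property
    def total_errors(self) -> int:
        # All error types are collapsed into one Total Errors count.
        return sum(self.errors.values())


def mean_duration_of_utterance(utterances: List[Utterance]) -> float:
    """Average utterance duration over a set of productions."""
    return sum(u.duration for u in utterances) / len(utterances)


# Hypothetical example values, not data from the study.
first = Utterance(0.5, 3.0, {"tense": 1, "lexical": 2})
second = Utterance(1.0, 2.5)
print(first.duration, first.total_errors)
print(mean_duration_of_utterance([first, second]))
```

Collapsing the error types at the last step keeps the per-type counts available if a finer-grained analysis is wanted later.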
Finally, we coded the translations provided by the participant after each repetition
according to whether they matched the speech. If the speech matched the translation
exactly, it was coded as a Match (even if the speech and translation were incorrect);
otherwise, it was coded as Missing (containing less information than the speech), Extra
(containing more information than the speech), or Meaning (containing the same amount
of information but having incorrect information; for example, providing the translation in
the wrong tense or using the wrong verb translation). This coding system allows us to
judge whether participants had a general representation of the meaning of the speech they
produced, or whether they were producing speech without being capable of conveying the
meaning in their native language.
Results
Effect of practice on completeness. First, we investigated whether participants’
performance in repeating the sentences they heard improved with each repetition of that
specific sentence. A logistic generalized linear regression on the probability of completely
repeating the sentence at each repetition (1-5, with 5 being the delayed post-test) showed
a significant main effect of Repetition (p<0.001), with each subsequent repetition having
fewer incomplete sentences than the first repetition. This effect held even at the delayed post-test (p<0.01).
Effect of practice on accuracy. Next, we performed a generalized linear regression on the
number of errors produced at each repetition and each trial, to see whether participants
improved both on producing each individual sentence and whether they performed the
task better over time. Again, there was a significant effect of Repetition, with participants
making significantly fewer errors by the fourth training repetition (p<0.001); however,
there was no significant difference between performance at the first repetition during
Training and the test trial a week later. So, while participants were more likely to
completely repeat the sentence at the delayed test, they produced more errors. Finally,
there was a significant effect of Trial (p<0.0001), with participants making fewer mistakes
overall as the practice progressed.
Effect of practice on fluency. To measure the effects of the training on fluency, we
performed a generalized linear regression of the MDU by repetition and trial number. We
found a significant effect of Repetition, with participants producing the sentences more
quickly by the third and fourth repetitions (p<0.0001); the post-test production was also
significantly more fluent than the first and second repetition during training (p<0.01). We
also found a significant main effect of Trial, with participants producing sentences more
fluently later in the training than in the beginning (p<0.01).
*** I have no idea what “trial number” or the “Trial factor” means and how it differs from
“repetition” or “session” or what. Is this just a binary factor of Training vs Posttest? ***
Translatability and speech production. We next addressed the question of whether RI is
based on reconstruction, rather than repetition. If the task is reconstructive, participants’
ability to translate the sentence into English, which also requires a reconstruction of
meaning, should match their ability to repeat the sentence. If participants repeat the
sentence correctly but cannot give a correct translation, that indicates that they are able to
simply memorize sequences of sounds and produce them, without understanding the
sentence.
For these analyses, we counted only trials where the participant provided a
translation. Not providing a translation may be indicative of lack of understanding, but it
may also simply be due to difficulties with typing or simply boredom. However, if the
participant provided even one word in their translation, it was included in the analysis. As
a result, of 1716 total trials, we ended up with 1468 trials where the participant provided a
translation.
*** I can see how they would get bored if you are asking them to repeatedly give the same
translation four times. Throwing out data on that basis seems unnecessary. ***
The vast majority of translations matched the speech produced; of the 1468 valid
trials, 1082 matched, while only 386 did not. A chi-squared goodness-of-fit test showed
that this difference was significant (χ² = 329.984, p<0.001). Of the 386 non-matching
trials, a majority (246) were meaning-related mismatches, with relatively
small numbers of translations having missing or extra information (47 and 93,
respectively).
We then broke down the results according to whether the participant had correctly
produced the sentence or not. Table 1 presents descriptive information for each code, for
all translations, and for translations of correctly-produced and incorrectly-produced
sentences. We then ran chi-squared tests to see whether the differences between groups
were significant.
Translation match?          Incorrect   Correct   Total   χ2       Sig.
Matching                    606         476       1082    15.619   .000
Different: Meaning          173         73        246     40.650   .000
Different: Missing Trans.   46          1         47      43.085   .000
Different: Extra Trans.     75          18        93      34.935   .000
Total                       900         568       1468
Table 1: Descriptive statistics for translation/speech matches. Chi-squared tests all with
1 degree of freedom.
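As a check on the arithmetic, the sketch below recomputes the chi-squared statistics from the counts reported above. It assumes that each test compares two observed counts against an even split (expected frequencies of half the total each); the text does not state the expected frequencies, so this is an assumption, and the function name is illustrative.

```python
def chi_square_even_split(count_a: int, count_b: int) -> float:
    """Chi-squared statistic (1 df) for two counts against an even split."""
    expected = (count_a + count_b) / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Matching vs. non-matching trials, as reported in the text.
overall = chi_square_even_split(1082, 386)
print(f"Match vs. mismatch: {overall:.3f}")

# Incorrect vs. correct productions for each code, as reported in Table 1.
row_counts = {
    "Matching": (606, 476),
    "Meaning": (173, 73),
    "Missing": (46, 1),
    "Extra": (75, 18),
}
for label, (incorrect, correct) in row_counts.items():
    print(f"{label}: {chi_square_even_split(incorrect, correct):.3f}")
```

Under the even-split assumption, the computed values round to the statistics reported in the text and in Table 1 (for example, 329.984 for the 1082 versus 386 split).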
To further address the issue of the mismatched translations, we provide two specific
examples.
(1) Usted quiere decir que los profesores son exigentes, verdad?
you want-3S say-INF that the professor-PL are demanding-PL, correct?
‘You mean that the professors are demanding, right?’
The word exigente, or “demanding”, was unknown to many participants. No participant
correctly translated the word, though eight of the nine participants correctly repeated
it. Here it is clear that the participants are merely imitating. However, such imitation
may actually be beneficial to them, as it allows them to practice new vocabulary that they
otherwise would be unable to produce spontaneously.
The second example (2), however, is a little more problematic:
(2) ¿Por qué me trajeron esta carta?
why me bring-3P&PRET this letter?
‘Why did they bring me this letter?’
When producing this utterance, many were able to correctly repeat the sentence at least
once. However, while five participants were able to repeat this utterance, only three
correctly translated it. The other two, rather than using the third person plural, used the
second person singular “you” as the subject of the sentence. This illustrates a shortcoming
of this type of practice: participants can accurately repeat Spanish sentences without fully
understanding what they convey in terms of tense, mood, and person-number agreement.
Regardless of whether the sentence is produced correctly in speech, the translation
appears to match very closely. However, the fact that there are instances where the
speaker produced more speech than they were able to translate suggests that speakers can
rely somewhat on echoic memory to reproduce the sentences. As our two examples
showed, this can be a positive feature of the task or a negative one, depending on the type
of grammatical structure or vocabulary item being targeted by the training.
Discussion
We first asked whether using repetitive imitation could lead to improvements in students’
ability to repeat back aurally presented sentences. Looking at student performance across
multiple repetitions of sentences, we found that with each repetition, students improved
significantly, both in their fluency, as measured by initial pause and total duration, and
accuracy, as measured by the number of errors produced per utterance. This suggests that,
as in the Yoshimura and MacWhinney task, repeated practice can lead to marked
improvements in speech production.
Next, we asked whether this task prompted students to process the speech they
were hearing, which may lead to reconstructing the sentence during production, rather
than merely echoing what they heard. The fact that the participants’ ability to repeat the
sentences back appears to match quite closely their ability to translate the sentence into
English suggests that comprehension plays a strong role in the completion of this task.
Moreover, the target sentences are often too long to be stored just as a series of unlinked
words in working memory. These two facts suggest that accurate sentence repetition
requires a conceptual understanding of the sentence.
The next step is to investigate the implications of using this task as a method of
improving speaking skills in the long term. Since participants practiced a wide variety of
grammar structures and vocabulary through the training, it is impossible to track
improvement on any specific feature over time. Study 2 trained language learners on two
different structures, to investigate how performance on specific structures changes over
time as a result of repeated imitation.
Study 2
In Study 2 we adjusted the RI exercise to include the prompting of meaning through
pictures, in order to provide more contextual information and create a stronger mapping
between form and meaning. Additionally, using pictures also allows for prompting speech
without providing an immediate native speaker model: participants can describe the
pictures in the way modeled earlier during training and later produce speech on their own.
This study contained immediate and delayed post-tests consisting of sentences either seen
during the practice or of similar constructions to the trained sentences.
Additionally, we manipulated the grain size of training, prompting participants to
practice the sentences either in short phrases or as full sentences. In doing so, we will be
able to further investigate whether this task can adjust speakers’ accuracy and fluency
depending on the type of practice they receive. If speakers show different patterns of
accuracy and fluency, despite receiving the same amount of practice on the same sentences,
that will show that rehearsal affects the development of speaking skills in ways other than
simply providing multiple opportunities for proceduralization; different types of training
may differentially affect how participants allocate resources to accuracy and fluency.
Methods
Procedure. The study consisted of three parts: RI training, immediate post-test, and
delayed post-test. The task was identical to the one used in Study 1 with three changes:
first, participants practiced each sentence three times instead of four; second, the
participants saw pictures (as shown in Figure 1), during sentence presentation; and third,
participants were not prompted to produce a translation or give a self-rating of their
performance.
Figure 1: An example picture used in training and test trials. In this case, the sentence
elicited would be “Él sugiere que él cocine la cena.”
After the RI training phase they moved to the immediate post-test. In this phase, they saw
the pictures as in the training phase, but did not hear the sentence; instead, they were
immediately prompted to produce a sentence to describe the pictures they saw. Half the
sentences in this test had been practiced during the training phase; the other half were
novel, but containing the same vocabulary and sentence structure as the trained sentences.
One week later the participants came back for the delayed post-test, which was identical to
the immediate post-test but presented in a different order.
Stimuli. The stimuli in this study consisted of two types of sentences. The first sentence
type, illustrated in (3), used two coordinated clauses, each with a verb in either the
Preterite or the Imperfect. The first clause included a temporal adverb that signaled the
correct tense.
(3) Ayer tú limpiaste los platos y yo cociné la cena.
Yesterday you clean PRET the dishes and I cook PRET the dinner.
‘Yesterday you cleaned the dishes and I cooked dinner.’
The second sentence type, illustrated in (4), included a main verb in the Present tense and
a complement clause with a verb in the Subjunctive.
(4) Yo aconsejo que tú limpies los platos.
I suggest PRES that you wash SBJV the dishes.
‘I suggest that you wash the dishes.’
In this second sentence type, the speaker must attend to cues that trigger use of the
Subjunctive mood, and manipulate conjugation in two different verb moods. All of the
vocabulary and verb forms trained in this study were familiar to the participants, but the
use of the verb forms had not yet been mastered.
Conditions. In this study we contrasted two types of practice: practice in phrases versus
practice in full sentences. The sentence types used in this study differ in
length and complexity, and so may differ in the type of practice they require and in how
strongly they push accuracy or fluency. The preterite/imperfect sentences, while
they are not particularly complex, are very long, and thus put large demands on the
speaker’s memory.
*** you should give mean lengths of the sentences in the two different conditions ***
Practicing in full sentences, as opposed to individual phrases, increases what Robinson
(2005) refers to as resource-dispersing demands, or demands that require emphasis on
general performance, precluding a focus on grammaticality and accuracy. Meanwhile, the
subjunctive sentences are very complex while still being short in length, and consequently
are high in what Robinson refers to as resource-directing demands, which require a focus
on the conceptual and grammatical features of the speech, and consequently increase
accuracy.
All participants in this study received practice in both conditions. One group, the
PretPhrase/SubSent condition, practiced preterite/imperfect as phrases and subjunctive
sentences as full sentences. The other group, the PretSent/SubPhrase condition, practiced
preterite/imperfect sentences as full sentences and subjunctive sentences as phrases.
Training condition         Preterite/Imperfect   Subjunctive
PretPhrase/SubSent (SC)    Phrase                Sentence
PretSent/SubPhrase (LS)    Sentence              Phrase
Table 2: Training tasks used for each condition.
If the task’s effect on performance is related solely to the focus of attention on form and
potential proceduralization of that item achieved during practice, we would expect to see a
clear fluency/accuracy tradeoff by sentence type, with sentences trained in the phrase
condition showing greater accuracy and those trained in the sentence condition showing
greater fluency. However, if task performance is related to general factors of speaker
attention and biases, training in one condition should affect performance on both types of
sentences.
Analyses
Analyses were conducted much in the same manner as in Study 1. However, given that we
were comparing two constructions that varied greatly in length, using raw MDU as a
measure of fluency would make it harder to compare the two sentence types. Instead,
we used Duration Ratio, which was calculated by taking the MDU and dividing it by the
time it took the native speaker to produce that sentence. This measurement is meant to be
indicative of the native-likeness of production, independent of the length. We also
measured the initial pause, or the amount of time the participant took before they started
speaking, which also indicates the fluency of speech by measuring the amount of time
needed to think before speaking. Accuracy was coded using the same hierarchical coding
scheme as in Study 1, again combining all errors into one measure of total errors for these
analyses.
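The two fluency measures described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, argument names, and timing values are hypothetical, all times are in seconds, and the initial pause is assumed to be measured from the moment the participant is prompted to speak.

```python
def duration_ratio(speech_onset: float, speech_offset: float,
                   native_duration: float) -> float:
    """Learner's utterance duration divided by the native speaker's duration.

    A value of 1.0 means the learner took as long as the native model;
    larger values mean slower production.
    """
    if native_duration <= 0:
        raise ValueError("native_duration must be positive")
    return (speech_offset - speech_onset) / native_duration


def initial_pause(prompt_time: float, speech_onset: float) -> float:
    """Time between the prompt to speak and the onset of speech."""
    return speech_onset - prompt_time


# Hypothetical values, not data from the study: the learner is prompted at
# 0.0 s, starts speaking at 1.0 s, and finishes at 4.0 s, and the native
# speaker's recording of the same sentence lasts 2.0 s.
print(duration_ratio(1.0, 4.0, 2.0))
print(initial_pause(0.0, 1.0))
```

Dividing by the native speaker's duration is what allows the long preterite/imperfect sentences and the short subjunctive sentences to be compared on one scale.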
Results
Training. We performed two 2 (Condition) by 3 (Repetition: 1, 2, or 3) univariate ANOVAs,
one for errors and one for duration ratio. There was a significant difference across
repetitions in duration ratio (F = 15.95, p<0.001) and errors (F = 22.75, p<0.001),
particularly between the first and second repetitions. There were no significant interactions.
Testing: By condition. Next we performed 2 (Condition: LS and SC) by 2 (Test: immediate
and delayed) univariate ANOVAs for duration ratio, initial pause, and errors on items on
the immediate and delayed post-tests administered after the training. We found that for
duration ratio, there was no significant effect of condition at the immediate post-test, but
by the delayed post-test the PretSent/SubPhrase condition had a significantly shorter
duration ratio. For initial pauses, we found a significant main effect of Condition (F= 17.21,
p<0.001), with PretSent/SubPhrase having significantly shorter initial pauses at both
immediate and delayed test than the PretPhrase/SubSent condition. For errors we found a
significant main effect of Condition (F = 12.34, p<0.001), but with the opposite pattern of
results, with PretSent/SubPhrase having significantly more errors than the
PretPhrase/SubSent condition. There was also a significant interaction between Test and
Condition for errors, with the PretSent/SubPhrase condition improving significantly between
the immediate and delayed tests; by the delayed post-test, there was no significant difference
between conditions.
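To make the notion of a Test by Condition interaction concrete, the sketch below computes cell means for a 2 by 2 design and the difference-of-differences contrast that such an interaction reflects. It is an illustration only: the function names are hypothetical, the error counts are invented for the example and are not the study's data, and no significance test is performed.

```python
from collections import defaultdict
from statistics import mean


def cell_means(records):
    """Map each (condition, test) cell to the mean of its observations."""
    cells = defaultdict(list)
    for condition, test, value in records:
        cells[(condition, test)].append(value)
    return {cell: mean(values) for cell, values in cells.items()}


def interaction_contrast(means, cond_a, cond_b,
                         test_1="immediate", test_2="delayed"):
    """Difference of differences: the gap between conditions at test_1
    minus the gap between conditions at test_2."""
    gap_1 = means[(cond_a, test_1)] - means[(cond_b, test_1)]
    gap_2 = means[(cond_a, test_2)] - means[(cond_b, test_2)]
    return gap_1 - gap_2


# INVENTED error counts, for illustration only (not data from the study).
records = [
    ("PretSent/SubPhrase", "immediate", 4.0),
    ("PretSent/SubPhrase", "immediate", 6.0),
    ("PretPhrase/SubSent", "immediate", 2.0),
    ("PretPhrase/SubSent", "immediate", 2.0),
    ("PretSent/SubPhrase", "delayed", 3.0),
    ("PretPhrase/SubSent", "delayed", 3.0),
]
means = cell_means(records)
contrast = interaction_contrast(means, "PretSent/SubPhrase", "PretPhrase/SubSent")
print(means)
print(contrast)
```

A contrast near zero would indicate that the gap between conditions is the same at both tests. A nonzero contrast, as in this invented example, indicates that the gap changes from the immediate to the delayed test.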
Testing: By sentence type. Next, we examined whether performance at test differed by
sentence type. We performed separate 2 (Condition: LS and SC) by 2 (Test: immediate and
delayed) univariate ANOVAs for duration ratio, initial pauses, and errors for both
subjunctive and preterite/imperfect sentences. For preterite/imperfect sentences, the
longer, simpler sentences, we found a clear fluency/accuracy tradeoff: there was a
significant main effect of Condition for both duration ratio and initial pause (F = 6.009, p =
0.014 and F = 17.314, p<0.01, respectively), with the PretSent/SubPhrase condition (which
practiced the preterite/imperfect sentences as full sentences) having shorter duration
ratios and initial pauses at both immediate and delayed test. For duration ratio, we found a
significant interaction of Condition and Test (F = 5.931, p = 0.015). This interaction
involved no significant difference at the immediate post-test, but shorter duration
ratios for the PretSent/SubPhrase condition at the delayed post-test. Errors showed the
opposite pattern: we found a significant main effect of Condition (F = 6.620, p = 0.01), with
the PretSent/SubPhrase condition making significantly more errors than the
PretPhrase/SubSent condition.
For subjunctive sentences we do not see this tradeoff. For duration ratio, we found
a significant interaction between Test and Condition (F = 15.324, p<0.01), with the
PretSent/SubPhrase condition initially performing worse at immediate post-test, but better
by the delayed post-test (suggesting an additional learning period for learning to produce
long full sentences but overall leading to better learning of the construction). For initial
pauses, we found a similar interaction (F = 4.741, p = 0.03), with the PretSent/SubPhrase
condition having shorter initial pauses at the delayed post-test, though there was no
difference between conditions at the immediate post-test. We found a similar interaction for
errors, with the PretSent/SubPhrase condition performing much worse at the immediate
post-test compared to the PretPhrase/SubSent condition (F = 16.692, p<0.01), but having
numerically fewer, though not significantly fewer, errors than the PretPhrase/SubSent
condition by the delayed post-test.
Novel vs. trained sentences. Finally, we analyzed whether there was a difference between
sentences that students had practiced during the training phase and similar but novel
sentences. If performance improved significantly on trained sentences only, this would
indicate that improvement is due to proceduralization of individual sentences. A one-way
ANOVA comparing trained to novel sentences showed a significant difference for duration
ratio (p<0.001), with trained sentences having shorter duration ratios than untrained
sentences, but with no effect for initial pauses or errors produced. Furthermore,
comparing the two sentence types separately, it appeared that this effect was present only
in the preterite/imperfect sentences. Finally, we found no difference between training
conditions; neither condition led to more generalizable performance than the other.
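For readers less familiar with the statistic, a one-way ANOVA comparing trained and novel sentences reduces to the ratio of between-group to within-group variance. The sketch below computes that F ratio in plain Python. The function name is hypothetical, the duration ratios are invented for illustration, and the study's own analysis may have been structured differently (for example, in how observations were aggregated).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups,
    where each group is a list of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations

    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# INVENTED duration ratios, for illustration only (not data from the study).
trained = [1.20, 1.30, 1.10, 1.25]
novel = [1.50, 1.45, 1.60, 1.55]
f_stat, df_b, df_w = one_way_anova_f([trained, novel])
print(f_stat, df_b, df_w)
```

The p-value is then obtained by comparing F against the F distribution with the returned degrees of freedom, a step usually left to a statistics package.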
Summary
The goal of this second study was to identify a) whether repeated imitation affects the
production of speech outside the performance of the task itself and b) whether
manipulating the task can affect whether it enhances fluency, accuracy, or both. Similar to
Study 1, we found that, during training, participants improved through each repetition in
both accuracy and fluency of imitation. There was no effect of training type during this
phase.
The type of training did have an effect on performance on later sentence production
tasks, however. First, the type of sentence and the size of the trained unit affected the
fluency and accuracy of production. When sentences were practiced initially as phrases,
rather than sentences, it led to increased accuracy during later production. This indicates
that a part of the driving force of the fluency-accuracy tradeoff is that lowering task
constraints will allow for greater focus on accuracy, which may lead to more long-term
learning than practicing initially in sentences. Conversely, practicing in sentences leads to
more fluent speech overall, potentially because when the sentences are longer they require
effort simply to produce the entire sentence, thus taking attention away from accuracy and
instead encouraging fluent production.
The key finding is that, in addition to improving performance on specific sentence
constructions and vocabulary, providing practice that encourages fluency (i.e., practicing
long sentences) improves fluency on sentence performance overall. This was clearest in
the PretSent/SubPhrase condition, which produces preterite/imperfect sentences that are
long and provide the highest performance-based demands. This condition produces
increased fluency on all sentences produced, even on sentence constructions that were
rehearsed in phrases. This suggests that the degree to which repeated practice leads to
proceduralization of grammatical structures can be influenced by specific aspects of the
training materials and method.
General Discussion
The current paper examines the extent to which rehearsal in general, and the Repeated
Imitation task in particular, is a useful method of developing speaking skills in a second
language. In Study 1 we tested the use of Repeated Imitation, an exercise that involves
listening to a native speaker and repeating what you hear, to improve proficient Spanish
speakers’ fluency and accuracy in repeating sentences. Participants did improve in their
production, both in grammatical accuracy and fluency of repetition. We also confirmed
previous findings that imitation tasks like this one are reconstructive, not echoic, in nature,
by showing that students’ ability to repeat the sentence closely mirrored their ability to
translate the sentence into English.
Study 2 replicated the results of Study 1, showing that participants improved
throughout the training. It also showed that this training generalizes to other tasks:
post-tests administered both immediately and a week after training used a picture
description task that did not involve repeating sentences. We also compared the effects of
two types of training: practicing in sentences, which places an emphasis on fluency in order
to produce the entire sentence, and practicing in phrases, which allows for a greater emphasis
on accuracy. We found that, during training, both conditions led to improvements in
fluency and accuracy of production, just as found in Study 1. It is possible that this is due to
proceduralization of the sentence constructions and vocabulary items used in the trained
sentences, but it is also possible that this finding is more in line with Skehan’s (2009)
interpretation that repeating a task allows the speaker to adjust their focus on a different
dimension (fluency, accuracy, or complexity).
Furthermore, the type of training affected the fluency and accuracy of sentence
production during testing. A simple fluency-accuracy tradeoff might be more in line with
the proceduralization view of the training; we would expect practicing in phrases to lead to
greater proceduralization, because it allows the speaker to focus on form-meaning
mappings, while practicing in sentences would focus the speaker’s attention on
performance constraints (see Robinson’s Cognition Hypothesis) and lead to greater
fluency. However, while we did find this tradeoff for the PretPhrase/SubSent condition, we
saw an overall benefit for fluency in the PretSent/SubPhrase condition, which involved
practicing producing very long sentences. The participants in this condition showed more
accurate production on the complex subjunctive sentences, but also greater fluency,
compared to the participants who practiced subjunctive sentences as full sentences.
The major concern with this task is, of course, the fact that it is not naturalistic in this
particular social context. While we have shown that the training affects performance on
other related speaking tasks that do not involve imitation, more work needs to be done to
show how much this task leads to improvements in speech in other contexts. We believe
that this type of practice should not be used as a replacement for open-ended speaking
activities; rather, this task should be a supplementary activity, used to prompt focused
practice on problematic forms.
However, this task can be used as a viable method of prompting speaking practice,
with numerous benefits that open-ended tasks cannot offer. For teachers, this task can
allow practice that their students can perform on an individual basis, either as homework
or an in-class activity. The native-speaker model may allow students to monitor their own
speech, and they can also record their speech to receive teacher feedback later. It also
provides practice on constructions and vocabulary the students may opt not to use in more
open-ended tasks. In addition to pedagogic implications, the structured nature of this task
makes it useful as a way of investigating the development of speaking skills. The results of
Study 2 suggest that students perform this task much in the same way as open-ended tasks,
and that the task can be manipulated to change performance, allowing researchers to
investigate the development of speech during practice.
Acknowledgements
This work was supported in part by a Graduate Training Grant awarded to Carnegie Mellon
University by the Department of Education [grant number R305B040063], and the Pittsburgh
Science of Learning Center, which is funded by the National Science Foundation [grant number
SBE-0354420]. We also thank John Kowalski for his programming assistance, Maria Liliana Mariño
for her help in creating and recording stimuli, and David Plaut for reading and editing
previous drafts of this paper.
References
Anderson, J. (1993). The Architecture of Cognition. Mahwah, New Jersey: Lawrence
Erlbaum Associates, Inc.
Bloom, L., Hood, L., & Lightbown, P. (1974). Imitation in language development: If, when,
and why. Cognitive Psychology, 6, 380–420.
Boersma, P., & Weenink, D. (2007). PRAAT. Retrieved from http://www.praat.org
Bygate, M., Skehan, P., & Swain, M. (2001). Effects of task repetition on the structure and
control of oral language. In Researching pedagogic tasks: Second language learning,
teaching and testing (pp. 23–48). Pearson Longman.
de Jong, N., & Perfetti, C. (2011). Fluency training in the ESL classroom: An experimental
study of fluency development and proceduralization. Language Learning, 61(2), 533–
568.
DeKeyser, R. (2011). Practice for second language learning: Don't throw out the baby with
the bathwater. International Journal of English Studies, 10(1), 155–165.
Derwing, T., Munro, M., & Thomson, R. (2007). A longitudinal study of ESL learners' fluency
and comprehensibility development. Applied Linguistics, 29(3),
359–380.
Erlam, R. (2006). Elicited imitation as a measure of L2 implicit knowledge: an empirical
validation study. Applied Linguistics, 27(3), 464–491.
Krashen, S. D. (1985). The input hypothesis. London: Longman.
Nation, P. (1989). Improving speaking fluency. System, 17(3), 377–384.
Robinson, P. (2005). Cognitive complexity and task sequencing: studies in a componential
framework for second language task design. International Review of Applied Linguistics
in Language Teaching, 43(1), 1–32.
Robinson, P., Cadierno, T., & Shirai, Y. (2009). Time and motion: Measuring the effects of the
conceptual demands of tasks on second language speech production. Applied
Linguistics, 30(4), 533–554.
Rossiter, M. J., Derwing, T. M., Manimtim, L. G., & Thomson, R. I. (2010). Oral Fluency: The
Neglected Component in the Communicative Language Classroom. The Canadian
Modern Language Review, 66(4), 583–606.
Skehan, P. (2009). Modelling second language performance: Integrating complexity,
accuracy, fluency, and lexis. Applied Linguistics, 30(4), 510–532.
VanPatten, B., & Cadierno, T. (1993). Input processing and second language acquisition: A
role for instruction. The Modern Language Journal, 77(1), 45–57.
Watson-Gegeo, K. A., & Gegeo, D. W. (1986). Calling-out and repeating routines in Kwara'ae
children's language socialization. In B. B. Schieffelin & E. Ochs (Eds.), Language
Socialization Across Cultures. Cambridge University Press.
Yoshimura, Y., & MacWhinney, B. (2007). The effect of oral repetition on L2 speech fluency:
An experimental tool and language tutor. Presented at the Speech and Language
Technology in Education, The Summit Inn, Farmington, PA.