The Task of Utterance Interpretation

Step Aerobics
Keeping in Step:
Task Structure, Discourse Structure, and Utterance Interpretation
in the Step Aerobics Workout
Judy Delin
Department of English Studies
University of Stirling
Stirling
FK9 4LA
Scotland
j.l.delin@stir.ac.uk
January 10, 2000
To appear in Discourse Processes, Vol 31, Issue 1, Jan-Feb 2000. Please quote that version.
Abstract
This study examines the monologues given by instructors in Step Aerobics classes,
focusing on the way in which language both arises out of ongoing action and is
constitutive of it. Following Levinson (1992), I show how the structure of the activity
constrains the interpretations that are made of the utterances that arise throughout the
workout. Aerobics participants need specific pragmatic knowledge, a key part of which is
the ability to detect and interpret five distinct functions of utterance, defined according to
their timing and placement in the hierarchical structure of tasks that the class is
performing. I demonstrate that it is beat placement, rather than grammatical form or
sequential organisation, that is the most important cue for this interpretative task. Having
presented the utterance functions and the cues to their interpretation in detail, the paper
goes on to outline how participants achieve the correct assignment of pronoun reference
and ellipsis in the instructor’s monologue. This is explained by means of an approach to
discourse modelling first suggested by Grosz and Sidner (1986), showing how instructors
set up ephemeral actions as complex conceptually-salient discourse entities, making them
accessible for subsequent reference.
Keeping in Step: Task Structure, Discourse Structure, and Utterance
Interpretation in the Step Aerobics Workout
During the course of a Step Aerobics workout, participants have to perform a
highly complex set of tasks. To do this, they are ‘cued’, visually and verbally, by the
instructor, who produces a monologue to accompany the activity. This paper focuses on
how participants use their knowledge of both the task structure and language to interpret
this monologue, and how instructors position and construct utterances so that they serve
the goals of the workout.
The analysis reveals that less than half the instructor’s utterances actually have a
directive function. In fact, instructor utterances serve five main functions, and
participants must be able to differentiate between them in order to act correctly. This task
is not achieved through relying substantially on either utterance form or content, since
there is a great deal of overlap between the different utterance functions in both these
respects. Instead, participants rely on the placement of utterances in relation to the beats
of the music and the synchronised activity that is taking place in order to distinguish
reliably between utterance types. It is utterance timing, rather than sequential
organisation, that enables participants to act appropriately.
This research draws on the framework for the description of language and action
outlined by Levinson (1992). Levinson introduces the notion of activity type (p. 69),
which he defines as ‘any culturally recognized activity...whose focal members are goal-defined, socially constituted, bounded, events with constraints on participants, setting,
and so on, but above all on the kinds of allowable contributions.’ Levinson’s formulation
places action at the centre of descriptions of such events. The interpretation of any
language that occurs within them is closely linked to that action. He draws attention to
one ‘particularly important question’ (p. 75): ‘In what ways do the structural properties of
an activity constrain (especially the functions of) the verbal contributions that can be
made towards it?’ This paper seeks to answer this question for the genre of Aerobics
workouts through examining the constraints on the verbal contributions that can be made
by instructors, and looking at how these contributions are ‘parsed’ by participants into
relevant functions, based on the structure of the action at hand.
As Levinson (1992) has suggested, specific types of inference about what
utterances can ‘count as’ in a given context rely on a knowledge of activity type. The
task examined here even dictates what can count as the functionally-defined unit
‘utterance’. Because task knowledge and utterance interpretation are so interdependent,
different levels of expertise in the task can also give rise to predictable differences in
utterance interpretation. In other words, novices predictably ‘misparse’ certain utterance
types, leading to certain types of mistakes in action. The monologue therefore displays an
element of ‘double coding’, in that novices and experts often interpret it differently. It is
part of the growing expertise of each participant to use action knowledge to interpret
utterances more reliably, and in turn to use utterance interpretation more surely as a guide
to correct action.
A particular example of how task knowledge and timing are used in interpretation
can be seen in the successful resolution of ellipsis and anaphora. In the final part of the
paper, a model is suggested that outlines how instructors set up complex entities as
conceptually salient for the purposes of the activity, and make them accessible for
subsequent reference. The clear hierarchical structure of both the task and its discourse
make them particularly amenable to the discourse model suggested by Grosz & Sidner
(1986). This model uses hierarchical structure to capture conversational participants’
focus of attention. The model is very appropriate for an activity in which the hierarchical
structure of the action readily dictates how the discourse is to be interpreted, and can
successfully be used to capture how correct antecedents are identified in cases of ellipsis
and anaphora.
This paper has five major sections. The remainder of the current section provides
a background to the study and overview of what the monologue is like. It examines the
issues involved in understanding in this context, an overview of the workout itself and its
structure, and a discussion of the relative roles of verbal and visual information in the
interpretative task. In the second section, the data and methods of analysis for the study
are presented. In the third section, the paper argues that the workout monologue is
analysed by participants into five functionally distinct types of utterance. In that section, I
show how understanding is based on placement of the different utterance types in terms
of musical beats and the ongoing exercise. This is explained using a framework for
discourse analysis that predicts when and where in the workout task the different
utterance types are allowable, capturing accurately the knowledge that workout
participants are expected to acquire and employ in order to make the correct links
between cues and action. In the fourth section, I show how a theory of focus and
discourse structure such as that suggested by Grosz & Sidner (1986) can capture the
creation and maintenance of the shared focus of attention that enables anaphora and
ellipsis to be resolved correctly. In the final section, I discuss implications, conclusions,
and directions for further research.
Language and Action in the Step Aerobics Class
Step Aerobics, as defined by the originating company, Reebok, involves stepping
up and down onto a low platform, to music, while performing simultaneous
choreographed arm movements (Reebok 1990). Aerobics is in general a group activity,
performed in a gym or hall, with a single instructor visible on a platform or at the front of
the class. Instructors, and participants, are usually female. The instructor wears a headset
radio microphone for large classes, or uses her voice unamplified for smaller ones. Music
is chosen to accompany the different stages of the activity, and the instructor’s voice is
made audible above the music. Although usually at the front of the class, either facing or
with her back to them, the instructor may also move around giving brief comments on a
one-to-one basis for some exercises, although this did not happen in the data examined
here.
During the activity, the instructor is highly verbal. For example, approximately
1370 utterances, or nearly 4.5 thousand words, were produced in one of the hour-long
workouts analysed in the study. Approximately 1500 utterances, or over 7000 words,
were produced in the other. Language is clearly an important component of the class.
The constraints on verbal contributions are different for participants and
instructors in Aerobics classes. Aerobics workouts of the sort studied here are monologic,
because participants are not generally expected to respond verbally. They will only do so
in response to the most formulaic of questions (‘let me hear you say yeah!’ ‘yeah!’).
Participants who make mistakes are expected to get back in step as best they can: they do
not ask for verbal clarification. If a large number of people go wrong at once, the
instructor will see this and interpret any verbalisation from the class as underlining the
difficulty. In these cases, she will rectify this, usually by restarting the class from the next
appropriate point. We can therefore expect any breakdown in a face-to-face class to be
observable from the instructor’s monologue. It is therefore part of the rules of this
particular language game that participants are non-verbal, and that they accept this
position at the outset. The analysis presented here is therefore deliberately centred on the
instructor’s monologue and how it is interpreted.
Although language and authority appear to be on the instructor’s side,
we should not assume that the class consists of the instructor barking at the class like a
drill sergeant. She must attend to the ‘face wants’ of the participants (cf. Brown and
Levinson 1987) as well as she can, within the confines of the activity, to ensure their
enjoyment of and continued attendance at classes, as shown in detail in Delin (1998).
Ignoring elements of social and technical preamble before the music starts, this
paper concentrates on the activity-related language that arises during the structured
workout. An expert participant has some specific and complex skills in conceptualising
the workout task and in interpreting the discourse that goes with it. As novices will
testify, despite the best efforts of instructors, the workout monologue is not easy to
interpret until the necessary skills have been developed. Participants must learn to
interpret and use verbal cues often before visual support is available to them. Although
the instructor will eventually perform most of the workout actions herself, a strategy of
simply mimicking action will leave participants lagging behind. An ability to use the
verbal information that is supplied in advance, therefore, will enhance performance
significantly. This paper seeks to present what it is that experts know, and novices learn,
in the successful negotiation of language and action in the workout.
The Task of Utterance Interpretation
One might expect unusual or novel vocabulary to be the biggest problem for a
novice Aerobics participant. Field-specific vocabulary such as grapevine, side out, single
side step, and bow and arrow are used to refer to different actions.
More interesting, however, is how participants interpret the language of the
monologue so that it provides them with information they can use at critical points in the
activity, such as just before changes in action. As might be expected, a significant portion
of the workout monologue consists of utterances that tell participants what to do. These
utterances fall within the broad definition of directive offered by Searle (1976). In this
context, however, we will need to refine the notion of directive to mean actions to be
done at a specific point, since directives in the workout are issued in order to warn the
class that they must soon change to a different movement. As we shall see, the complete
interpretation of what movement the new one should be, and when it should occur, is
arrived at on the basis of a complex set of processes that includes anaphor and ellipsis
resolution. Participants must use task knowledge and interpretation of the utterance’s
position in relation to activity and to musical beats to achieve comprehension.
Some directives are imperative in form, such as squeeze up, tap and change,
shake those arms, and hold it there. As in many other registers of language, however,
there is no necessary match between linguistic form and utterance function. Directives in
the workout may be nominals, declaratives, prepositional phrases, and adverbials, as
illustrated respectively in (1) below:
(1) a. right leg
    b. leg that’s behind, goes to the side
    c. to the centre
    d. overhead
The monologue is not solely composed of utterances with directive function.
Important though they are, directives make up only 43% of the instructors’ total output in
the corpus analysed here. The remaining utterances serve other functions crucial to
correct participation, and participants must learn to distinguish these. It is common,
indeed normal, for instructors to give information to ‘foreshadow’ a change in action, for
example. This function may occur across several utterances. It is intended only that the
class should absorb the information while completing a previous task, rather than acting
on it immediately. For example, the four consecutive utterances in (2) jointly specify that
a single set of four repeated arm movements should be performed only after the final
utterance:
(2)
start with the arms in front
pull back
for four
here we go
The preamble to an action change may refer to the same action several times, making the
monologue appear repetitious. For example, in (3), the action of changing the leading
foot is referred to no less than four times: changing the legs is signalled by the first
utterance, then re-signalled by the utterance tap and change, then again by tap. Finally,
left leg up refers a fourth time to the same action, specifying which leg should now lead:
this is the result of the ‘tap’, a false step on the floor that allows the other foot to lead the
step. Despite four references to the same movement, participants understand that they
need to do the action once and once only within the segment. Furthermore, they know
which of the several utterances is the final one before the change: in this case, tap and
change.
(3)
we’re gonna change the legs
two more
one more
tap and change
tap
left leg up
Other utterances do not refer at all to a change in action, but are intended instead
to allow participants to check that they are performing the current action correctly. For
example, the utterances in (4) do not indicate new movements, but constraints on
performing the current one. The range of syntactic forms significantly overlaps that used
for directives indicating distinct actions, including the use of imperatives:
(4) a. nice straight backs
    b. hands going to your chin
    c. step together in the middle of the step
    d. roll back and down
These examples show that utterances that seem like directives, syntactically and in
content, may indicate constraints on current actions. Other utterances may simply be part
of a preamble to the relevant directive. The identity of an action may be constructed over
four or five consecutive utterances, each of which adds information to the description.
The class is expected to know which kind of utterance is which. In what follows, we will
see how the nature of the workout task itself forms a central part of the pragmatic
knowledge that allows appropriate interpretative inferences to be made.
The Workout
What, then, is the activity like, and how is it constructed? Instructors plan the
exercise activity in detail before the class, generally conforming to the four section
structure suggested by Reebok (1990, p. 9) of warm up and stretch, aerobic stepping,
isolation work using specific chosen muscle groups, and finally post-exercise stretch. In
each section, the music is crucial: it is selected according to its beat speed, expressed as
beats per minute. In musical terms, it must be 2/4, 4/4, or 8/8 to fit the rhythmic structure
of the actions that make up the exercises.
Breaking the workout structure into four sections already indicates something of
its hierarchical nature. In fact, the activity is further divided into sub-tasks down to a very
fine level of granularity. For example, the aerobic stepping section may build up
gradually, and also incorporate a cooling-down period before the stretching section. Each
of these sub-sections is itself made up of concatenated units of activity. A unit is made up
of sets of repetitions such as 8 of one action, 16 of another, and 8 of a third, and so on,
and that unit would be repeated twice, once in each direction. As we will see, the
hierarchical nature of the task is important both to the conceptualisation of the activity,
and for the production and interpretation of the accompanying language.
The basic building block of Step, from which all higher levels of organisation are
constructed, is the step pattern or floor pattern, a small number of actions performed with
the feet. Step patterns can be performed on the spot, or incorporate left or right, forward
or backward travel. It is up to the instructor to concatenate patterns and co-ordinate travel
so that each member of the class ends up back where she should be in relation to her step
platform and to the four walls of the room at the end of the sequence.
The simplest kind of step pattern takes four beats from the resting position to
perform. One four-beat pattern is the basic step: up onto the step with the right foot, up
with the left foot, down with the right foot, down with the left foot. This can be varied by
leading with the left foot, or by starting on top of the step and stepping down first.
Another group takes eight beats to perform: these are known as alternating steps. The
following example of one kind of alternating step, the ‘Tap Up’, is taken from the
Reebok trainers’ manual (1990):
Tap Up
Execution: Approaching from the front of the step: R foot up, L foot tap up, L foot down,
           R foot down, L foot up, R foot tap up, R foot down, L foot tap down.
Cycle:     8 counts
Cue:       up, tap, down, down, up, tap, down, down (p.24)
We can see here in ‘cue’ that the instructor is told what to say to accompany the exercise.
The hierarchical structure of the workout can therefore be described by beginning
with the fine detail of individual acts (such as the placing of one foot on the step),
composing these individual acts into floor patterns (a basic step, an alternating step),
adding floor patterns together into repetitions (such as 8 basic steps), and then composing
‘phrases’ of sets of repetitions (such as 8 basic steps, 16 tap ups, another 8 basic steps).
These phrases are themselves combined into ever more complex sequences of activity.
The penultimate level is the four-section structure of the workout as a whole described
above. At the top level, for our purposes, is the category of the workout itself, although
we could see this as again embedded in, say, an eight-week sequence of exercise classes.
This hierarchical structure is exemplified in Figure 1.
[Tree diagram. Node labels, from top to bottom: workout; warm-up section; 32-beat phrase, 16-beat phrase, 32-beat phrase; 8 basic steps, 16 side steps, 8 basic steps; basic step; right foot up, left foot up, right foot down, left foot down.]
Figure 1. Hierarchical structure of the workout.
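The composition of acts into floor patterns, sets of repetitions, and phrases can be made concrete by counting beats. The following is a minimal sketch only: the beat lengths come from the text (a basic step takes four beats, a Tap Up eight), while the names `PATTERN_BEATS`, `set_beats`, and `phrase_beats` are hypothetical and introduced purely for illustration.

```python
# Beat lengths of the two floor patterns described in the text.
# The dictionary and function names are illustrative assumptions.
PATTERN_BEATS = {"basic step": 4, "tap up": 8}


def set_beats(pattern, repetitions):
    """Beats taken by one set of repetitions of a floor pattern."""
    return PATTERN_BEATS[pattern] * repetitions


def phrase_beats(phrase):
    """Beats taken by a phrase, given as (pattern, repetitions) pairs."""
    return sum(set_beats(pattern, reps) for pattern, reps in phrase)


# The illustrative phrase from the text:
# 8 basic steps, 16 tap ups, another 8 basic steps.
example_phrase = [("basic step", 8), ("tap up", 16), ("basic step", 8)]

if __name__ == "__main__":
    for pattern, reps in example_phrase:
        print(f"{reps} x {pattern}: {set_beats(pattern, reps)} beats")
    print(f"whole phrase: {phrase_beats(example_phrase)} beats")
```

On this reckoning a set of 8 basic steps fills 32 beats, which is the sense in which a set of repetitions can be called a 32-beat phrase.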
The data for this study consisted of two one-hour Step Aerobics classes. Class 1
was collected in a University sports centre, at a class in which the author was a
participant (and was a regular participant in the series). Class 2 was taken from a
commercial instructional video recording distributed by Reebok (1992). The Reebok
class was taken to represent an institutionally-approved version of the Step workout (the
instructor on the video is in fact the inventor of the exercise). Obviously, workouts
recorded for video and those performed in a class situation will differ substantially in
many respects, and these factors are periodically discussed. However, the instructional
monologue produced is assumed to be similar whether talking to a class or to an
individual watching the video, particularly since the video is intended to replicate a
‘complete, heart-pumping session of Step Reebok’ (Reebok, 1992). A third University
class was recorded, transcribed, and consulted informally as a cross-check on
conclusions.
In addition to qualitative comment on the data, this study included quantitative
analysis. This was based on a close analysis of a total of 27 minutes and 57 seconds of
data taken from class 1 (henceforth ‘University’) and class 2 (henceforth ‘Reebok’). This
corpus consisted of excerpts taken from two separate sections of the University
class, and one section of the Reebok video. Data from the University class consisted of a
segment of the ‘Warmup’ section (15 minutes and 8 seconds, 386 utterances, 1175
words), and the ‘Aerobic’ section (5 minutes and 44 seconds, 191 utterances, 508 words).
The total duration of data selected for close analysis from the University class was
therefore 20 minutes and 52 seconds, consisting of 577 utterances and 1683 words. From
the Reebok video, data was taken from the ‘Warmup’ section alone: 300 utterances or 936
words, a duration of 7 minutes and 5 seconds. The corpus as a whole contained 877
utterances, and 2619 words.
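The corpus totals reported above follow from the three excerpts by simple addition. As a check on the arithmetic, the sketch below re-derives them from the per-excerpt figures given in the text; the `Segment` class and the helper names are hypothetical and are not part of the study's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One excerpt selected for close analysis (illustrative structure)."""
    source: str
    section: str
    seconds: int
    utterances: int
    words: int


# Figures as reported in the text.
SEGMENTS = [
    Segment("University", "Warmup", 15 * 60 + 8, 386, 1175),
    Segment("University", "Aerobic", 5 * 60 + 44, 191, 508),
    Segment("Reebok", "Warmup", 7 * 60 + 5, 300, 936),
]


def totals(segments):
    """Return (seconds, utterances, words) summed over the given segments."""
    return (
        sum(s.seconds for s in segments),
        sum(s.utterances for s in segments),
        sum(s.words for s in segments),
    )


def as_min_sec(seconds):
    """Split a duration in seconds into (minutes, seconds)."""
    return divmod(seconds, 60)


if __name__ == "__main__":
    university = [s for s in SEGMENTS if s.source == "University"]
    for label, group in (("University", university), ("Whole corpus", SEGMENTS)):
        secs, utts, words = totals(group)
        minutes, rest = as_min_sec(secs)
        print(f"{label}: {minutes} min {rest} s, {utts} utterances, {words} words")
```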
This data was coded utterance-by-utterance using the Workbench for Analysis
and Generation (WAG), a computer coding tool that facilitates the analysis of data within
the framework of Systemic Functional Grammar1. The current study was based on a
systemic network built expressly for this analysis, allowing the relevant features of the
workout data to be captured. The software allows for units of language (in this case,
utterances) to be coded according to the features of the network, and performs statistics
on the codings when completed. All syntactic and utterance-type coding described here
was performed using the coder.
Prior to coding, the data described in Figure 2 was also transcribed by hand onto a
grid representing the 4/4 rhythm of the music, locating each syllable as closely as
possible to the musical beat on which it occurred (to an accuracy of half-beats). This
method of analysis was used as the basis for the study of utterance function assignment in
relation to beat placement.
Utterance Functions in the Workout Monologue
We saw in the introductory section an informal description of the kinds of
function that utterances in the workout monologue can serve. It is crucial to appreciate
here the way in which the language relates to the activity type under way if we are to
understand how the instructor's utterances can be assigned their correct functional
interpretation. As an experienced participant observer in workout classes, and following
the approach outlined by Levinson (1992), I have constructed the set of utterance
function categories to be presented here on the basis of an understanding of how the
activity itself divides down into functional sub-parts, and what kinds of information are
needed by participants at specific points. The structure of the activity, and the language
that arises within it, jointly serve to support the hierarchical goal structure of the activity.
The goal structure of the workout places constraints on form, placement, and content of
utterances, and interpretation is constructed on the basis of both the linguistic and the
situational facts. The categories presented below, therefore, are not defined within the
domain of speech alone: they are constructed and interpreted through the activity type in
which they are embedded and of which they constitute a part. What are termed
‘utterances’ in this analysis are not independently verified in any systematic way on the
basis of linguistic evidence such as pause or breath groups, although some linguistic
differences are noted between them. Instead, utterance types are defined functionally, in
terms of their role in the workout activity and experienced participants' actions in
response to them. I have tried as far as possible, therefore, to develop a framework that
arises, like the language itself, directly from the activity type in question.
During the course of the workout, utterances with five main functions are
produced and interpreted. In this section, we will examine these utterance types in more
detail, and show what tasks interpreters are performing in differentiating them from one
another. The five utterance types can be summarised as follows:
Directives warn participants about what they will need to do at the next change.
They may be straightforward cues such as now upward row, overhead, or again,
or Directive Countdowns asking for a repeat of a previous action, such as four
more.
Descriptions2 specify what participants should be doing right now, often
functioning as a check or follow-up to a previous directive. These may be
utterances such as up up down down, describing at the time the ongoing action, or
Descriptive Countdowns which count number of actions or duration down to a
change (for four, three, two...).
Teaching Points are of two kinds. They may either give further detail on an
action currently under way, constraining the activity as in try not to push on that
leg, resulting in participants checking that their behaviour conforms, or they may
instruct participants in a new sequence of actions before they do it themselves, as
in now we're gonna do an A step.
Comments are utterances that are usually used to check and create interpersonal
relations between participants and instructor, such as have I warmed you up yet?
and this is very good.
Markers are used to warn participants that they must act on the next beat. They
may act as follow-up to a directive, and/or as a preface to a description. Examples
are Ready? and Here we go.
The number of utterances in the University and Reebok corpora were 577 and
300, respectively. The distribution of utterances in the University corpus was Directives
(35.4%), Descriptions (47.3%), Teaching Points (10.7%), Comments (3.6%), Markers
(1.4%) and Unintelligible (1.6%). The distribution in the Reebok corpus was Directives
(57.7%), Descriptions (19%), Teaching Points (9.7%), Comments (8%), Markers (5.3%)
and Unintelligible (0.3%). A chi-square test revealed that Directives were significantly
more prevalent in the Reebok corpus than the University corpus, χ²(1) = 23.54, p < 0.05,
whereas Descriptions were more prevalent in the University corpus than the Reebok
corpus, χ²(1) = 87.40, p < 0.05. These differences may be accounted for by the structure
of the workout: the Reebok workout is very inclusive in terms of the muscle groups it
addresses in its duration. There are more frequent action changes with each action being
performed for a shorter time. For changes in action, Directives are mandatory. The
University workout, by contrast, is part of a series with different exercise goals for each
class, and segments of it may already be quite familiar to regular participants. There are
fewer changes of action, and actions go on for longer. Descriptions are therefore used to
support continuing actions, with Directives appearing at the less frequent changes. It is
likely, too, that individual instructor styles play a role in how utterances are selected, and
further research is needed to ascertain what particular factors affect frequency of
utterance choice. It is sufficient for the purposes of this paper, however, to note that
utterances of all the major types were used by both instructors.
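The two per-corpus distributions can be combined into a single figure for the corpus as a whole by weighting each percentage by the size of its corpus. The sketch below does this using only the percentages and utterance counts reported above; the names `CORPUS_SIZE`, `DISTRIBUTION`, and `pooled_share` are hypothetical, and the calculation does not attempt to reproduce the chi-square tests.

```python
# Utterance counts and percentage distributions as reported in the text.
CORPUS_SIZE = {"University": 577, "Reebok": 300}

DISTRIBUTION = {
    "University": {
        "Directives": 35.4, "Descriptions": 47.3, "Teaching Points": 10.7,
        "Comments": 3.6, "Markers": 1.4, "Unintelligible": 1.6,
    },
    "Reebok": {
        "Directives": 57.7, "Descriptions": 19.0, "Teaching Points": 9.7,
        "Comments": 8.0, "Markers": 5.3, "Unintelligible": 0.3,
    },
}


def pooled_share(utterance_type):
    """Percentage of the whole corpus made up by one utterance type,
    weighting each corpus's percentage by its number of utterances."""
    weighted = sum(
        DISTRIBUTION[corpus][utterance_type] * size
        for corpus, size in CORPUS_SIZE.items()
    )
    return weighted / sum(CORPUS_SIZE.values())


if __name__ == "__main__":
    for utterance_type in DISTRIBUTION["University"]:
        print(f"{utterance_type}: {pooled_share(utterance_type):.1f}%")
```

Pooled in this way, Directives account for about 43% of the 877 utterances, which agrees with the overall proportion cited in the introductory section.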
The subsequent sections examine each utterance type separately, describing the features
that distinguish them both in form and function.
Directives
Directive utterances are those that tell participants about an imminent change of
action, and what is to be done in the next task segment. These are composed of a range of
syntactic forms. Further examples appear in (5):
(5) a. tap and change
    b. stepping up
    c. and front
    d. side for two
    e. now upward row
    f. to the centre
    g. single side step
An important sub-category of directive is the count directive, which cues
participants how many times a movement must be repeated, or for how long a duration in
terms of beat counts. 85 of the directives in the total corpus are count directives:
examples appear in (6):
(6) a. one more time
    b. last one
    c. another four
    d. four more
Together, the non-count and count directive define the next activity, and therefore
often appear one after the other. Example (7) shows a typical sequence. Directives are in
bold, count directives in italics:
(7)
arms overhead
four more
four, three
arms down
for four
four three two
arms up
four three two
arms down for two
up for two
down for two
up for two
Note that, in the case of arms up, a count directive does not appear, since it has been
established that the operational unit for the duration of the exercise is four repetitions.
This ‘carrying over’ of information from one task segment to another will be examined
further in the discussion of focusing.
The time-critical nature of directives is crucial to interpretation. Since it is part of
directive function to warn or cue participants, a defining characteristic of directives is
their temporal placement, as indicated by the Reebok advice on cueing quoted earlier. In
the workout context, temporal placement can be described in terms of utterance position
in relation to beats of the music and the ongoing activity. For example, if the current
activity consists of four repetitions of a four-beat movement (i.e., 16 beats), a directive
for what is to be done in the next task segment must be issued on the last four-beat
repetition: that is, between beats 13 and 16. Likewise, if the activity is an eight-beat
movement with four repetitions (32 beats), a new directive must appear on the last
eight-beat repetition: between beats 25 and 32. Directives, then, always appear on the last
action repetition. Two-beat actions, however, are an exception: here, the final repetition is
rather short to issue a coherent directive: in these cases, instructors will take the final two
repetitions (four beats) to issue the directive.
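The placement rule just described can be stated as a small calculation. The following sketch is one possible formalisation, not a procedure used in the study: the function name `directive_window` and its parameters are hypothetical, beats are numbered from 1 within the current activity segment, and the two-beat exception is encoded exactly as stated above.

```python
def directive_window(beats_per_action, repetitions):
    """Return (first_beat, last_beat) of the span in which a directive for the
    next task segment is expected, counting beats from 1 within the segment.

    Assumption drawn from the text: the directive falls on the final
    repetition, except for two-beat actions, where the final two repetitions
    (four beats) are used.
    """
    if beats_per_action < 1 or repetitions < 1:
        raise ValueError("beats_per_action and repetitions must be positive")
    total_beats = beats_per_action * repetitions
    # Two-beat actions borrow the last two repetitions, if that many exist.
    reps_used = min(2 if beats_per_action == 2 else 1, repetitions)
    first_beat = total_beats - beats_per_action * reps_used + 1
    return first_beat, total_beats


if __name__ == "__main__":
    for beats, reps in ((4, 4), (8, 4), (2, 4)):
        first, last = directive_window(beats, reps)
        print(f"{reps} x {beats}-beat action: directive on beats {first} to {last}")
```

For the two cases worked through in the text, the function gives beats 13 to 16 and beats 25 to 32 respectively; for four repetitions of a two-beat action it gives the last four beats of the segment.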
Two examples will serve to make these facts about directive placement somewhat
clearer, represented diagrammatically to show the relationship of the language to the
music and the activity. Figure 2 shows utterance placement in a four-beat activity (a basic
step). In this and the following figures, activity beats are represented by numbers below a
line, while each line is given a number above it to denote the number of repetitions of
that activity that have taken place. The accompanying language is placed below the
beat-marking numbers, and utterance functions are differentiated as noted in the key to each
figure. Boxes indicate multi-word utterances that are too long to place on a single line: in
these cases, the extent of the box, or of a bracket above the box, indicates the
number of beats that the utterance consumes. In Figure 2, it should first of all be noted
how the count directive, another four, is placed on the final repetition of the previous
activity segment. The next four four-beat bars, then, will be taken up with continuing
repetitions of the same activity. As predicted by the first count directive (another four),
participants know they must expect a new directive on the fourth and last repetition. The
predicted directive, now upward row, duly arrives just before the second beat of the final
four-beat bar. It could, in fact, have been later: as long as it is complete by the half-beat
after beat four, it will suffice as a cue. However, this instructor wishes to use a marker
(which we return to below) to mark the final beat before the change: squeeze is such a
marker. This means that the directive has been placed sufficiently early to allow room for
it. Finally, the beginning of the next set of repetitions is marked by the description up (see
below for a description of descriptive utterances), said on the beat at which the class is
stepping up onto the step.
[Figure 2 diagram not reproduced. Utterances shown: another four; for three; for two; last one; now upward row; squeeze; up. Key: directive, marker, description.]
Figure 2. Utterance placement in a four-beat activity.
A two-beat example appears in Figure 3. As it is rarely possible to generate a
useful directive utterance in the space of two beats, instructors often use the space of the
last two repetitions of the previous segment, instead of the final repetition, to issue the
directive. Accordingly, the onset of the directive push forward with the arm is placed
halfway through repetition three of the previous segment. It is followed by three
descriptive utterances (four, three, two) before the onset of the next directive, now
overhead, which again is positioned halfway through repetition three of the activity. Note
that the fourth and final beat of the current task segment is the possible site for a marker,
(indicated here by going). Finally, the new task segment is marked by descriptives that
further elaborate on the content of the directive now overhead, telling participants that
their arms should now be going up.
[Figure 3 diagram not reproduced. Utterances shown: four; push forward with the arm; three; two; now overhead; going; up; up. Key: directive, marker, description.]
Figure 3. Utterance placement in a two-beat activity.
We are now in a position to formulate a metric for the interpretation of directives,
based on timing and placement, in the workout monologue:
Metric for Directive Function: directive onset will occur as late as possible
during the last four beats of the task segment prior to the required change, or
during the final repetition of the previous task set, whichever duration is greater.
Maximum lateness of placement is dependent on the presence or absence of an
optional final-beat marker.
This, then, is a specific statement of the kind of pragmatic knowledge that participants
must develop and use in order to interpret the workout discourse appropriately.
Markers
Markers are utterances issued on the final beat or half-beat before a new task segment
begins. They signal clearly when a change in activity will take place. Markers are
optional. The archetypal marker may be familiar to readers who attended ballet class
in their youth: the pianist would play an introduction, turn to the class, and say ‘and’ to
tell them where to start. In the workout discourse, markers usually appear after directives,
but may appear on their own while a task segment is underway, reminding participants of
a change in move without describing the content of that change. In the data analysed for
this study, markers constituted 1.4% of the University monologue, and 5.3% of the
Reebok monologue. Given the large number of action changes that occur in the workout,
it is clear that final-beat marking using a marker is optional: late placement of the
directives, often with fall-rise intonation, is often sufficient warning of the specific beat
upon which a change is to take place. It is clear that intonation is an important signalling
device in the workout, not only of utterance function but of position in the hierarchical
task structure. A full analysis is beyond the scope of the current study, however.
A wide range of syntactic forms are used as markers. They may be as brief as and, or as
long as you’re gonna go. The University instructor favours -ing forms, such as going,
curling, pushing. Whatever their syntax, markers must, however, fit into at most a beat
and a half, since their function is to indicate that a new action must take place on the next
beat. An example of going as a marker appeared in Figure 3; further examples appear in
Figure 7 at the end of the exposition of the individual utterance types.
Unusually, markers can appear in pairs, if the previous directive has left enough space: an
instructor might use two in a row such as here you go you're gonna go, ending on the
half-beat before the relevant action change. In addition, markers can appear on their own
with no directive, if the action is ongoing and is therefore sufficiently salient. For
example, a new set of repetitions of the same action might simply be marked by an
utterance of and.
Descriptions
In the University workout in particular, a large number of utterances serve the function of
describing the current move. These descriptive utterances supply information to back up
the framework provided by directives: they narrate the workout activities as they happen,
providing a more fine-grained level of description than that given in directives.
Frequently, a sequence consisting of directive and a marker is followed by description, as
in (8):
(8)
Directive: now add the hop
Marker: you're gonna go
Description: up hop
In this case, up hop appears on the two beats where stepping up and hopping should
occur. Other examples appear in (9), below:
(9)
a  up tap and down tap
b  backwards
c  up lift
d  side, up, side, and down
e  squeeze
   curl
As it has to appear while the class is performing the relevant components of specific
actions, description is time-critical in its delivery. It is, as a result, terse. Syntactically
(apart from the descriptive countdowns described later in the section, which are, as might
be expected, mainly numbers), description includes adverbials (just under 43%)
and clauses (just over 16%); noun phrases are rare. As we noted above, the University
instructor uses description relatively prolifically, while the Reebok instructor, by contrast,
describes less frequently, relying more on directives issued prior to the action.
While syntactic information does not serve reliably to distinguish description and
directives, the two categories of utterance are markedly distinct in terms of the related
issues of beat placement and information status. Firstly, it is important to stress that a
description is simply a late-placed directive: an utterance that must function to describe a
current action rather than to direct a subsequent one. Description is placed too late to use
as a cue: participants attempting to use it as one will be at least half a beat behind the
class. Informationally, to an adept participant, description will be composed of
information that is already shared with the instructor, and is entailed by the content of the
preceding discourse. For example, having issued the directive and knee lifts, the
University instructor describes up lift on each iteration of the task; likewise, having
directed an action to the class as then we're going to come back for three (meaning three
steps backwards), she then says backwards on each of the three steps as they are
performed. This involves an understanding that the described actions are part of the
action described by the directive: that is, that there is a rhetorical relation between the two
kinds of information that might be captured variously as a part-whole or elaboration
relation. For expert participants, the descriptive information is inferrable on the basis of
the directive. Less expert class members, however, may not appreciate this
relationship and may therefore see the description as requiring different actions (the
interpretation of the workout as a flat structure consisting of sequences of unrelated
directives), resulting in confusion.
A more successful, but still non-expert, strategy lies in understanding the
relationship between directive and subsequent description, but still treating description as
the component that cues participants on when to act. If a participant has to use description
to tell her what she should be doing, she will be late in performing the action. This failure
to share what is clearly intended to be shared knowledge indicates to a participant that
she is not expert, and is excluded from the construction of `ideal participant' that the
description implies. It is more than likely that participants move between these levels of
descriptive interpretation during the workout, since even the most experienced
participants can lapse in concentration from time to time, and will take the opportunity to
get back in step that description offers them. The ability to interpret description as
description, rather than needing to use it as directive, is a sign of developing skill.
Description, then, is a category which is `double-coded' depending on level of expertise.
A second kind of descriptive utterance appears in the monologue in the form of count
descriptions. Count descriptions serve to enumerate repetitions or beat duration on the
relevant beats, and are exemplified in Figure 4. Count descriptions serve a different
function from count directives: the fact that the two kinds of utterance are concatenated in
the monologue reveals this. In Figure 4, repetitions of a four-beat action are cued by the
count directive four more, followed by the count descriptions for four, for three, for two,
for one, each placed on the first beat of each repetition. Informationally, count
descriptions serve to elaborate at a low level of detail: while the directive four more of
these, for example, defines the task segment, the descriptions for four, for three, for two,
for one serve to count off the component actions of that segment. Note that, while the slot
is not exploited, there is still, in principle, time for a marker at the end of the final
repetition.
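The placement of count descriptions described above, one on the first beat of each repetition, can be sketched as follows. The function name `count_descriptions` and the word list are illustrative assumptions, and beats are numbered from 1 within the task segment.

```python
from __future__ import annotations

# Hypothetical word list; counts beyond it fall back to digits.
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight"]


def count_descriptions(beats_per_repetition: int, repetitions: int) -> list[tuple[int, str]]:
    """Return (onset_beat, utterance) pairs for a countdown over a task segment."""
    placements = []
    for index in range(repetitions):
        remaining = repetitions - index
        word = NUMBER_WORDS[remaining] if remaining < len(NUMBER_WORDS) else str(remaining)
        # Each count description falls on the first beat of its repetition.
        onset_beat = index * beats_per_repetition + 1
        placements.append((onset_beat, f"for {word}"))
    return placements


for beat, utterance in count_descriptions(4, 4):
    print(beat, utterance)
```

For four repetitions of a four-beat action this places for four, for three, for two, and for one on beats 1, 5, 9, and 13, leaving the later beats of the final repetition free for a directive or marker.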
The role of description and directives, and counts of both kinds, in defining task segments
as salient discourse entities will be addressed in the discussion of focusing later in the
paper.
[Figure 4 diagram not reproduced. Utterances shown: four more; for three; for two; for one. Key: directive, description.]
Figure 4. Count directives and count descriptions.
Teaching Points
Teaching points are a heterogeneous category, but are grouped together here because they
are not time-critical and are interpreted by participants as advisory rather than directive.
Teaching points serve two particular major functions. Instructing teaching points function
to teach a new move that participants are shortly to undertake, and constraining teaching
points describe a current move in more detail, enabling participants to check that they are
performing it correctly. Teaching points as a general category occupy just over 10% of
the monologue in the total corpus examined here, with the proportion being very similar
for both instructors. Example (10) is a long instructing teaching point from a University
workout:
(10)
and we're gonna do three of these here
gonna go across to the front diagonally
then we're going to come back for three
watch while I do it
keep up tapping and down tapping here
and I'll show you
watch
here I go
I'm going to go
four
three
two
then I'm going over diagonally
one two three
and tap
backwards
backwards
backwards
then back over
one
two
four
ok
In this long embedded segment, the instructor performs a range of speech act functions:
in particular, she describes herself performing the action. These segments are therefore
capable of further analysis in a recursive manner, re-applying the framework within the
teaching segment. Many teaching points are shorter, as in the two examples in (11) below
from the University data:
(11)
a  we're gonna do a basic up tap down tap to the side
b  Now you're gonna watch for these arms, watch
These teaching points differ from directives in that they are not to be acted upon
immediately: instead, they foreshadow activities, and therefore other utterances, to come.
Because they are delivered at a time when participants are already established in a
different activity, they are not time-critical, and as a result tend to approximate more
closely to the rhythms of casual speech than do directive and descriptive utterances.
An example of the placement of an instructing teaching point, and its
`foreshadowing' of directives and description to come, appears in Figure 5. In this
example, the instructing teaching point then we're gonna add the hop, yes? comes six
four-beat bars before the directive it foreshadows (note 3), now add the hop. The teaching point
is not placed on a final repetition of an action: it takes place when a sequence of actions
(indicated by sequences of up, squeeze, over) is already underway. Most importantly, its
placement violates a beat segment boundary, differentiating it further from a potential
directive: as we have seen, directives typically inhabit a region as near as possible to the
end of a four-beat or single-task-segment structure. This teaching point, on the other
hand, straddles two bars. Since the we're gonna construction is also a common one for
directives, it is clear from examples like these that placement is central to disambiguating
utterance function.
[Figure 5 diagram not reproduced. Utterances shown: up squeeze; then we're gonna add the hop, yes?; over; up squeeze; squeeze; squeeze over; now add the hop; you're gonna go; up hop. Key: directive, marker, comment, teaching point, description.]
Figure 5: Teaching point foreshadowing later directive
The second category of teaching point is the constraining teaching point, which serves as
a check on an activity already under way. The function of constraints is to ensure that
participants are doing activities correctly and safely, and in a way that will enable them to
get the maximum benefit from the exercise. (12) gives examples from the Reebok
workout:
(12)
a
b
c
each time your hands come down and rest on your thigh
shoulders back
abdominals tight
push those crosses straight out in front
Like instructional teaching points, constraints are not placed in the segment-ending
positions that might cause them to be confused with directives: stretching those
hamstrings and pushing those crosses out in front are not exercises in themselves, but
refinements of current exercises. Placement, therefore, is `parenthetical' to the main
structure created by directives, description, and markers. An example of a constraining
teaching point in context appears in Figure 6, in which really work that back is inserted
between two instances of and down which mark, and then describe, the current activity.
Note, again, the cross-segment-boundary positioning of really work that back. It is this
positioning that enables participants to understand constraining teaching points as
elaborating at a lower level of detail on current actions.
[Figure 6 diagram not reproduced. Utterances recoverable from the diagram include: one; up; and down; two more; really work that back; last one; curling up; to the centre; curling; and you're marching. Key: directive, marker, teaching point, description.]
Figure 6: Constraining teaching point in context
Comments
Comments are utterances that serve to check and manipulate social relations in the
workout. They are the only class of utterance whose content is not directly determined by
the structure of the activity itself. They may, for example, serve a meta-linguistic
function, checking that everyone can hear and is in a good position to begin the workout.
More commonly, however, they serve to joke, check progress, and praise participants at
various points throughout the class. Examples from the Reebok workout appear in (13):
(13)
a  ready?
b  good
c  that's it
d  right good
e  that's good
In fact, the examples in (13) above exhaust the variety of comments in the 300 utterances
analysed closely in the Reebok data, perhaps because more contentful observations on
class behaviour are not possible in the context of a video recording for commercial
distribution. The University instructor, who has a live class in front of her, uses a wider
variety of comments. Examples appear in (14):
(14)
a  nice and easy
b  how're we feeling?
c  this is very good
d  keep it up
e  you can almost hear the steam coming out of your brains
f  why d'you all look so worried?
Comments are not time-critical in their placement: as long as they avoid the
positions defined as prime sites for directives, even imperatives such as keep it up should
not be confused with directives. It is likely that a combination of content (for example,
evaluative content, mental process verbs, non-field-specific vocabulary) and syntax
(questioning and clausal constructions) serves to differentiate comments adequately from
other utterance types.
Summary: Utterance Placement and Interpretation
Having described the five utterance types found in the workout monologue, we
can examine how the utterance types work in context. In particular, it is helpful to
examine some sequences of utterances from the workouts that further exemplify the
`grammar' of utterance placement at work.
As we have seen, three of the five functions of utterance are time-critical:
directives, descriptions, and markers. Interpretation of directives as directives relies very
closely on the structure of the task at hand. When workout participants reach the end of a
task segment, their expectation that the next utterance will be a directive is greatly
increased. In fact, because of the particular time-constraints of the task and the class
expectation that a change is approaching, they are desperate for such information.
Understanding of at least the current segment of the task structure is a crucial cue to
top-down interpretation of the incoming utterances: anything appearing at these crucial points
in the activity is likely to be interpreted as a directive.
The directive metric given earlier in this section also caters for the positioning of
markers, which, although optional, must, when used, be placed on the final beat or
half-beat before an action change. Finally, it is part of the definition of description that it is
placed on the beat of the relevant action, so that it performs its primary task of allowing
verification of the current action.
Teaching points and comments may be freely placed, as long as they avoid the
key beat placements reserved for directives in particular. Freedom from beat timing
means that utterances of both these categories are more likely to approximate the normal
rhythm of speech, and they are more likely to be syntactically complex than description
and directives. As we have seen, too, they may straddle beat structure boundaries freely.
Rhythm, syntax, and placement therefore combine to prevent confusion between the
different classes of utterance.
Figure 7 is an example that features all five types of utterance and illustrates the
points that I have been making about their placement. It includes and follows on from the
excerpt in Figure 5 above. The workout task is a sixteen-beat one: we join the extract at
the beginning of the third of four sixteen-beat repetitions. First, note the placement of the
time-critical utterances: directives, description, and markers. The major task directives
now add the hop and once more round (non-count and count respectively) are placed as
near as possible to the end of the sixteen-beat segment, as the metric for directive
interpretation given above predicts.
The reason that these directives are not right at the end is that a marker is present
in each case (you're gonna go and going), and these occupy the last possible position in
the segment. Descriptions are placed on the beats to which they refer: this accounts for
the utterances up squeeze, up squeeze, squeeze, over, up hop, hop over, one two three and
tap, up hop, up hop, up hop, over, and one two three and tap. Two further utterances,
then we're gonna add the hop, yes? and don't forget to go backwards are teaching points,
instructing and constraining respectively. As we saw above, the teaching points avoid the
critical directive position late in their respective task segments (in this 16-beat activity,
beats 13-16 inclusive). Finally, the encouraging comment, keep it up, similarly avoids the
critical placement, and is in fact delivered, although the transcript does not show it, with a
higher pitch than any of the descriptive utterances that are its only potential sources of
confusion.
[Figure 7 diagram not reproduced; the utterances it contains are listed in the preceding paragraph. Key: directive, marker, comment, teaching point, description.]
Figure 7: 5 types of utterance in context
Different types of utterance address different levels of conceptual hierarchy in the
workout task. Pragmatic knowledge of utterance placement and task structure leads
participants to infer correctly the ways in which this happens. While directives, count
directives, and instructional teaching points provide descriptions of entire task segments,
constraining teaching points and descriptive utterances operate at a finer level of detail,
reaching inside task segments to describe or constrain sub-activities. Markers act as
`punctuation', to warn exactly when a change is expected, while comments serve a
primarily social function, maintaining friendly proximity during the workout.
Do it Again: Focusing and Reference Resolution
In addition to inferring the distinct functions of utterances, workout participants
must also resolve a large number of anaphoric references, and successfully interpret a
high degree of ellipsis. In this section, we will look at the ways in which anaphoric
referring expressions and ellipsis come to be interpreted correctly in a situation where the
discourse referents – actions and sequences of actions – are both complex and ephemeral.
For example, the class is frequently given directives such as again, another four and once
more round. In a sequence of actions that stretches back as much as forty or fifty minutes,
it is clear that mechanisms are in place that enable participants to segment their previous
activity into a conceptual structure that provides correct antecedents for such expressions.
Similarly, during a sequence of activities, elliptical directives such as now to the front and
and singles are often encountered. Here, the class is clearly intended to preserve
particular elements of the current activity, and change only those which the new directive
is intended to update. This ability suggests that a working description of the current
activity, represented as an accessible aggregate of different variables, is available to
them. Furthermore, since all the expressions indicated are capable of different
interpretations at different points in the workout, it is clear that the knowledge structures
involved are dynamic, and capable of complete updating whenever a global change in
activity takes place.
The highly structured nature of the workout task, and the close mapping between
this and the accompanying monologue, suggests the appropriateness of computational
theories of focusing. These employ hierarchical and dynamically-updated discourse
models as a means of representing interpretative processes (for example, Grosz 1977;
Sidner 1979; Reichman 1981). In particular, the following brief account draws on the
theory of Grosz and Sidner (1986). They suggest that interpretative processes required for
keeping track of discourse can be captured in terms of a model consisting of three related
structures: linguistic structure (the actual discourse), intentional structure (a structure
composed of the various purposes of the discourse segments) and attentional structure (a
representation of the state of attention of participants in the discourse). Recognising how
discourse coheres into segments, and modelling the relationships between those
segments, is known to provide information vital for the understanding of semantically
underspecified linguistic expressions such as ellipsis and anaphora. It also allows
discourse participants to keep track of discourse topic, digressions, and interruptions.
While there is not the space here either to give a complete description of Grosz
and Sidner's theory or to exhaust its possibilities for describing the current data, an
examination of some particular cases of reference will exemplify some of the finer-grained
interpretative tasks that workout participants successfully carry out. Central to the
theory are so-called focus spaces which represent attentional state. In the model, each
segment of the discourse gives rise to a focus space which contains the entities that are
salient in that segment. In addition, each focus space contains information about the
purpose of its discourse segment and its relationship with other segments. The
relationship between focus spaces is represented as a stack, with new spaces being
pushed onto the top of the stack when the discourse segment purpose for a new segment
contributes to that of the segment immediately preceding it. Discourse segment purposes,
then, are seen as hierarchically organised, contributing to the global purpose of the
discourse as a whole. When a purpose is achieved, and the discourse associated with it
therefore comes to an end, the focus spaces dominated by that purpose are popped from
the stack. In terms of the accessibility of discourse referents, the stack as a whole is seen
as representing the salience of entities, with the topmost focal space containing the most
salient. However, information in lower spaces is accessible from higher ones, providing
that all are dominated by the same focal space.
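The stack of focus spaces just described can be illustrated with a minimal sketch. The class names, the entity keys, and the example task labels are hypothetical, and the lookup is a simplification: it searches from the topmost space downwards and does not model the dominance condition on accessibility.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FocusSpace:
    """One discourse segment: its purpose and the entities salient in it."""
    purpose: str
    entities: dict[str, str] = field(default_factory=dict)


class FocusStack:
    """A stack of focus spaces; the topmost space holds the most salient entities."""

    def __init__(self) -> None:
        self._spaces: list[FocusSpace] = []

    def push(self, space: FocusSpace) -> None:
        # A new segment whose purpose contributes to the preceding one.
        self._spaces.append(space)

    def pop(self) -> FocusSpace:
        # The segment's purpose has been achieved.
        return self._spaces.pop()

    def resolve(self, name: str) -> str | None:
        # Search from the most salient space downwards (simplified accessibility).
        for space in reversed(self._spaces):
            if name in space.entities:
                return space.entities[name]
        return None


stack = FocusStack()
stack.push(FocusSpace("perform the whole pattern", {"task": "whole pattern", "place": "the step"}))
stack.push(FocusSpace("perform one floor pattern", {"task": "floor pattern"}))
print(stack.resolve("task"))   # resolved in the topmost space
print(stack.resolve("place"))  # found in a lower space
stack.pop()
print(stack.resolve("task"))   # the dominating segment's task again
```

On this sketch, an expression that picks out "the current task" resolves differently before and after the subordinate space is popped, which is the behaviour the stack model is meant to capture.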
We can conceptualise the workout monologue in Grosz and Sidner's terms as
being dominated by a single, top-level, discourse purpose: we can gloss this as the
instructor's intention that the class intend to perform the workout. However, it is the
subordinate structures into which the workout and its monologue are arranged that
provide the interesting answers for reference resolution.
As we have seen in the discussion of the workout plan, there is a range of levels at
which the workout activity can be conceptualised and described: from the individual
actions that make up floor patterns, through sets of repetitions of different floor patterns,
right up to the four major sections of the workout. For economy, instructors need to be
able to refer quickly and easily to parts of the task, so that participants will be able to
repeat movements. In this section, we will look briefly at how the University instructor
sets up task segments of particular `grainsize' in order to refer to them subsequently.
The first step is to set up a floor pattern and establish it as salient to participants.
This is achieved either by a straightforward directive, or, if it is new or complex and
therefore needs to be taught, by a teaching point, as in (15):
(15)
we're going to do a basic up tap and down tap to the side
This introduces the floor pattern on the basis of which the next sequence of activity will
be built up. To establish the floor pattern, this instructor typically describes a few
repetitions of it. Not every step is described. Even with some steps omitted, however, the
spacing of descriptive utterances still serves to establish in participants' minds the
duration of the salient activity in beats (in this case, four beats; in example (16), brackets
indicate elapsed beats that are not marked linguistically). After we're gonna go, a
marker, the description begins:
(16)
we're gonna go up tap and down tap
up tap and down tap
(1.5) and down tap
(2) down tap
Then, to build up a more complex sequence of activity out of the now-established floor
pattern, the instructor will add several repetitions together to make a longer phrase. This
will be done using a teaching point. Note that, in (17), the anaphoric these is able to refer
to the `up tap and down tap' that has been established as the basic floor pattern, while the
final elliptical three refers to three such movements:
(17)
gonna do three of these here
gonna go across the front diagonally
then we're gonna come back for three
…
then back over
This teaching point serves as the first introduction of a 32-beat pattern of activity,
composed of three 4-beat `up tap and down taps', a 4-beat movement across the step,
three further 4-beat up tap and down taps moving gradually backwards to the foot of the
step, and a final 4-beat step over back to the starting position. The instructor then
describes this as before, as the class join in. The pattern of her utterances soon falls into
the normal one of directives and description as they accompany her: she moves from a
strategy of exclusively on-the-beat description to one of issuing some information early
enough to use as directive. For example, `over' and `and then back over' are directives. In
example (18), numbers in brackets again indicate elapsed `silent' beats:
(18)
up tap down tap
up tap (2)
(3) over
(4)
one two three and tap
up (3)
up (3)
and then back over
one two three and tap
The higher-order task of 32 beats has therefore been made conceptually salient by a
combination of teaching point, directive, description, and counting. The class should now
have a clear representation of the nature and duration of the task segment, based not just
on the original four-beat floor pattern but on the larger structure in which it participates.
Interestingly, keeping track of the beats that elapse between utterances has been as
important as the utterances themselves in establishing this larger task.
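The beat arithmetic behind the 32-beat task can be made explicit with a small sketch. The encoding below is only an illustration: the labels, the tuple layout, and the function names are my own assumptions, and beats are counted from zero purely for convenience.

```python
# Hypothetical encoding of the 32-beat task segment described in the text.
# Each entry is (label, beats per unit, repetitions); the labels are illustrative.
SEGMENT = [
    ("up tap and down tap", 4, 3),
    ("across the step", 4, 1),
    ("up tap and down tap, moving back", 4, 3),
    ("step over back to start", 4, 1),
]


def total_beats(segment):
    """Sum the beats contributed by every component of the segment."""
    return sum(beats * reps for _, beats, reps in segment)


def onsets(segment):
    """Return (starting beat, label) for each unit, counting beats from zero."""
    starts, beat = [], 0
    for label, beats, reps in segment:
        for _ in range(reps):
            starts.append((beat, label))
            beat += beats
    return starts
```

Under these assumptions, `total_beats(SEGMENT)` gives 3 x 4 + 4 + 3 x 4 + 4 = 32, and `onsets(SEGMENT)` lists the eight four-beat units against which an utterance's beat placement could be compared.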
We can view the whole process of establishing the 32-beat task as a single
discourse segment giving rise to a set of salient entities, including the 32-beat task itself,
which populate a single focus space. During subsequent activity, the instructor uses the
expressions `again', `once more', and `once more round' to refer to the whole 32-beat
structure.
In addition to referring to conceptual entities that reside at a relatively high level,
however, instructors refer locally to sub-segments within the 32-beat task structure. For
example, `again' is sometimes used to refer to single repetitions of the basic `up tap down
tap' floor pattern contained within the 32-beat structure. In order to account for this local
reference, we can model the local tasks from which the 32-beat activity is made up, and
their associated language, as subordinate discourse segments with focus spaces of their
own. These spaces would be `pushed' on the stack as the tasks are undertaken. The global
discourse purpose of the dominating segment can be described as `intend that the
participants intend to perform the 32-beat activity'; the purposes of the local segments are
to get the participants to perform the tasks that comprise the 32 beats. Within each focus
space lies the description of the relevant task, consisting of a small number of salient
variables which participants keep in mind: a sort of `working description' of the current
task. For each task, a participant will have in mind a current description consisting of four
components: the floor pattern, such as a basic step or a V-step, the number of repetitions
or duration of the exercise, the relevant arm movement (a bicep curl, for example), and
the relevant leg movement (e.g. hamstring curls, knee lifts).
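One way to picture this arrangement is as a stack of focus spaces, each holding a four-field working description. The sketch below is a loose illustration in the spirit of Grosz and Sidner (1986), not an implementation of their model: all class names, field names, and example values are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WorkingDescription:
    """The four salient variables of the current task (field names are illustrative)."""
    floor_pattern: str
    repetitions: int
    arm_movement: Optional[str] = None
    leg_movement: Optional[str] = None


@dataclass
class FocusSpace:
    """One discourse segment: its purpose plus the task description it makes salient."""
    purpose: str
    description: WorkingDescription


@dataclass
class FocusStack:
    """Stack of focus spaces; the most recently pushed space is the most salient."""
    spaces: List[FocusSpace] = field(default_factory=list)

    def push(self, space: FocusSpace) -> None:
        self.spaces.append(space)

    def pop(self) -> FocusSpace:
        return self.spaces.pop()

    def top(self) -> FocusSpace:
        return self.spaces[-1]


# Dominating segment: the whole 32-beat activity (values assumed for illustration).
stack = FocusStack()
stack.push(FocusSpace(
    purpose="perform the 32-beat activity",
    description=WorkingDescription("32-beat pattern", repetitions=1),
))
# Subordinate segment: one local task within the 32 beats.
stack.push(FocusSpace(
    purpose="perform three up tap and down taps",
    description=WorkingDescription("up tap and down tap", repetitions=3),
))
```

Here `stack.top()` returns the local task while it is under way; popping it when the task is complete leaves the dominating 32-beat segment as the most salient space.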
Modifications to each element of the activity are made without stopping, usually
by replacing one element with another (for example, arms overhead may replace elbows
in; two repetitions may replace four). The elliptical directive `side out no arms', to take a
particular case, updates the working description of leg and arm movements: `side out'
meaning that the current sequence of three knee lifts should be replaced on the next
repetition with a sequence of three side-swings of the leg, while the current arm
movement should be dropped completely. Note that there is no revision intended to the
number of repetitions, the basic floor pattern, or the global structure of the task.
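The effect of such a directive on the working description can be sketched as a partial update, in which only the fields named by the utterance change. The lexicon, the field names, and the starting values below (a basic step with a bicep curl) are hypothetical choices made for illustration; the data do not specify them for this example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class WorkingDescription:
    """Current task description; field names are illustrative assumptions."""
    floor_pattern: str
    repetitions: int
    arm_movement: Optional[str]
    leg_movement: Optional[str]


# Hypothetical lexicon: maps an elliptical phrase to the field(s) it revises.
LEXICON = {
    "side out": {"leg_movement": "side-swing"},
    "no arms": {"arm_movement": None},
}


def apply_directive(desc: WorkingDescription, directive: str) -> WorkingDescription:
    """Return a copy of desc with only the fields named in the directive updated."""
    updates = {}
    for phrase, change in LEXICON.items():
        if phrase in directive:
            updates.update(change)
    return replace(desc, **updates)


# Assumed state before the directive: three knee lifts with some arm movement.
current = WorkingDescription("basic step", 3, "bicep curl", "knee lift")
updated = apply_directive(current, "side out no arms")
```

After the update, the leg movement is the side-swing and the arm movement is empty, while the floor pattern and the number of repetitions carry over unchanged from the previous description.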
The constant updating of salient variables provides a set of values against which
ellipsis should be resolved. Given the basic framework set out above, elliptical utterances
(both descriptive and directive) are matched against the `fields' in the open focal space. In
practice, floor patterns are rarely revised in this way: movements are more usually added
on to existing floor patterns. Repetition number is frequently reduced, however, and the
most common alterations during major task segments are changes to arm and leg
movements. All the values would be expected to change at major task segment
boundaries, however, when a new floor pattern is defined, with new duration and
different numbers of repetitions making up the global task. Ellipsis and anaphora do not
therefore cross major task segment boundaries.
There is a residual problem, however, in the resolution of ambiguous anaphoric
directives such as `again'. How are participants to differentiate an `again' meaning a single
floor pattern repetition from one meaning the whole major task segment? On Grosz and
Sidner's model, the first referent would be chosen from the topmost focal space, which,
on the application of their theory that I have been outlining, would represent the local
task. However, in some situations, this would result in an incorrect interpretation: a
participant would choose the local interpretation when the global one was intended. The
solution lies once more in the placement of utterances: where the relevant expression
falls within the task is the key to distinguishing segment-internal reference from
whole-segment reference. Quite simply, a directive including `again'
which appears during the final activity of the 32-beat section will be interpreted as
referring to the entire section, while one appearing, say, just after the second repetition of
one of the component floor patterns (in this case, between beats 4 and 8 of the segment)
will be interpreted as referring to that more local activity. The local or global `granularity'
of the reference is therefore disambiguated by utterance positioning.
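The placement rule just described can be stated as a very small decision procedure. This is a deliberately crude sketch: it assumes beats are counted from zero within the segment, that the final activity occupies the last four beats, and that placement is the only cue consulted. The function and parameter names are my own.

```python
def resolve_again(beat: int, segment_beats: int = 32, unit_beats: int = 4) -> str:
    """Classify a directive containing 'again' by where it falls in the segment.

    Returns 'global' if the utterance falls within the final unit of the
    segment (so it refers to the whole segment), and 'local' otherwise.
    """
    position = beat % segment_beats
    if position >= segment_beats - unit_beats:
        return "global"
    return "local"
```

On these assumptions, an utterance at beat 6 (between beats 4 and 8) is resolved locally, while one at beat 30, during the final step over, is resolved to the entire 32-beat section.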
Although this section has done no more than sketch a framework for the
resolution of elliptical, anaphoric, and ambiguous utterances, I hope to have shown the
utility of the notion of discourse modelling as a set of entities that are currently in focus:
in particular, a conceptualisation of the major task segment, and a set of properties
describing the current task. Together, these resources account for the correct resolution of
both local and global anaphora and ellipsis. It is interesting to note that beat placement in
relation to position in the task helps to disambiguate anaphors that are ambiguous in
terms of the level of the hierarchy they refer to. The resolution of semantically-underspecified
expressions is therefore also performed successfully as a result of task
knowledge.
Conclusions and Further Research
Although the genre of the Step Aerobics monologue may seem an obscure corner
of linguistic usage, it presents a relatively clear-cut case of the way in which language
and action interrelate. As I hope to have shown, neither language nor action can be taken
in isolation in the analysis of how meaning is constructed among participants in the
activity type. The way in which participants interpret language in the workout is based on
specific knowledge about the task at hand, and expectations based on this about what
kinds of contributions can be made towards it. As Levinson (1992) suggests, the kinds of
inferences that are necessary for the interpretation of discourse draw on both the
structural properties of the talk itself and the structural organization of the activities in
which it arises (p. 75). The structural properties of the activity give rise to special
expectations about what the talk can be, and special constraints on what count as
`allowable contributions' from the class (in fact, there are almost no allowable verbal
contributions, and none at all in the data analysed here) and from the instructor (the five
kinds of allowable contributions I have outlined). Particularly interesting in this data is
the way in which the timing of utterances, rather than their sequential organisation,
determines their meaning. This is a particularly clear case of the application of pragmatic
knowledge that is very specific to the domain of activity in interpreting the discourse, and
shows how crucial it is to adopt an approach that respects the multimodal nature of the
construction of meaning.
In the discussion of ellipsis and anaphora, we have seen a specific way in which
constraints on interpretation arise out of the structure of the activity. The workout has a
clear hierarchical task structure, and the language that arises in the monologue
accompanying it is undiverted by any unpredictable contributions of other discourse
participants. The discourse, then, embodies one of the clearest available relationships
between task and language that make up an activity type. This makes the workout
discourse a very useful testbed for theories that attempt to predict the linguistic and
processing consequences of task and discourse segmentation. Resolution of anaphora and
ellipsis is one phenomenon where discourse segmentation approaches such as that of
Grosz and Sidner (1986) do predict the observed behaviour. Although there has been
space here only to sketch an application of their theory to this data, it seems clear that a
notion of focusing is a useful model of the way in which participants track discourse
referents. In addition, however, the paper has looked in more detail at how discourse
referents are constructed out of ephemeral sequences of actions, and what information
might reside in a focus space such that ellipsis can be resolved correctly. There is clear
scope for a more detailed study, tracing the mechanics of the focusing model more fully.
While the paper has not attempted an exhaustive description of the workout as a
genre, it does raise some interesting questions in relation to what features are ascribed to
the language of workouts. In particular, we have observed that participants with differing
levels of expertise may experience the workout differently: the interpretation of
descriptions and directives is a case in point. We may wish, therefore, to open the
question of how genres, registers, or discourse types are described. Features may well
vary depending on whose point of view the description is given from. While there has not
been the space to enter into a full discussion of this point, the paper has shown something
of the nature of expert knowledge in this domain: that is, in what expertise consists, and
how the `ideal' (expert) participant is differentiated from the novice.
Of course, as in any short study, much has been left undone. One element that
plays a clear role in utterance function signalling, for example, is intonation: the `tune' of
utterances that play particular roles. Any study of this using computer pitch tracking is
challenged by the presence of sung music accompanying the workout. However, the clear
segmentation in the workout discourse could provide further useful cues as to the role of
intonation in signalling discourse segmentation and topic structure.
While participation in the workout is a social activity, and creating meaning
jointly within it is a social act, there is much more to be said about the workout as a social
phenomenon. In particular, more remains to be said about how participants are
constructed as participants, in terms of the kinds of expectations of behaviour that are
accepted by them, and how social relations with the instructor and with the rest of the
class are maintained. A particularly useful source of data for this lies in the sometimes
lengthy preambles to the workout, particularly those provided on the Reebok video: these
preambles describe exactly what is expected of participants, and set the tone for the
authority relations that will persist throughout.
This research represents a small step towards an understanding of an interesting
and as yet unstudied area of discourse research, and contributes to the body of knowledge
on instructional discourse in general. In particular, though, it points to the relevance of an
understanding of discourse as arising in action, and discourse meaning as dependent on a
range of kinds of knowledge, some of which are very specific indeed. While not every
discourse is like the workout discourse in structure, content, or action, a useful
generalisation to derive from this research would be the value of an approach that looks
very broadly at how meanings are arrived at, and includes information from a wide range
of available sources in the account.
References
Delin, J. (1998). Facework and instructor goals in the Step Aerobics workout. In
S. Hunston (Ed.), Language at Work (pp. 52-71). Clevedon: British Association for Applied
Linguistics in association with Multilingual Matters.
Grosz, B. (1977). The Representation and Use of Focus in Dialogue
Understanding. PhD Thesis, University of California, Berkeley.
Grosz, B. & Sidner, C. (1986). Attention, Intentions, and the Structure of
Discourse. Computational Linguistics, 12 (3), 175-204.
Levinson, S. (1992). Activity Types and Language. In P. Drew & J. Heritage
(Eds.), Talk at Work (pp. 66-100). Cambridge: Cambridge University Press.
Reebok (1990). Step Reebok: The Manual. Reebok International Limited.
Reebok (1992). Step Reebok: The Video. Reebok International Limited.
Reichman, R. (1981). Plain Speaking: A Theory and Grammar of Spontaneous
Discourse. (Report No. 4681). Cambridge, MA: Bolt, Beranek and Newman, Inc.
Searle, J. (1976). A classification of illocutionary acts. Language in Society, 5, 1-23.
Sidner, C. L. (1979). Toward a Computational Theory of
Definite Anaphora Comprehension in English. (Technical Report AI-TR-537). Boston,
MA: Massachusetts Institute of Technology.
Author Note
Judy L. Delin, Department of English Studies, University of Stirling, FK9 4LA,
U.K. The author can be contacted via email at j.l.delin@stir.ac.uk.
I would like to thank many people who have helped this research on its way.
Jenny Worsthorne patiently allowed me to record her classes and gave insights into
workout planning. Patrick Allen provided useful suggestions for the analysis of beat
patterns. John Bateman, Bethan Benwell, Jean Carletta, Robert Dale, Michael Gregory,
Karen Sparck-Jones, Geoff Thompson, and members of seminar and conference
audiences in Cambridge, Edinburgh, Halle, Stirling, and Sussex provided useful
suggestions. Adam Bull helped me think about workout structure and gave me access to
training materials, and Susana Murcia-Bielsa shared her work on directives. I am grateful
to the anonymous reviewers for their helpful comments.
Footnotes
1
WAG is available by ftp from its originator, Mick O’Donnell, at the Department
of Artificial Intelligence, University of Edinburgh, 80 South Bridge, Edinburgh EH1
1HN, or via the Web at
http://www.dai.ed.ac.uk/daidb/staff/personal_pages/micko/wag.html
2
Descriptions are termed ‘Narrative’ moves in Delin 1998.
3
The activity is a 16-beat one, consisting of three repetitions of a four-beat
activity plus a single different four-beat activity.