Step Aerobics

Keeping in Step: Task Structure, Discourse Structure, and Utterance Interpretation in the Step Aerobics Workout

Judy Delin
Department of English Studies
University of Stirling
Stirling FK9 4LA
Scotland
j.l.delin@stir.ac.uk

January 10, 2000

To appear in Discourse Processes, Vol 31, Issue 1, Jan-Feb 2000. Please quote that version.

Abstract

This study examines the monologues given by instructors in Step Aerobics classes, focusing on the way in which language both arises out of ongoing action and is constitutive of it. Following Levinson (1992), I show how the structure of the activity constrains the interpretations that are made of the utterances that arise throughout the workout. Aerobics participants need specific pragmatic knowledge, a key part of which is the ability to detect and interpret five distinct functions of utterance, defined according to their timing and placement in the hierarchical structure of tasks that the class is performing. I demonstrate that it is beat placement, rather than grammatical form or sequential organisation, that is the most important cue for this interpretative task. Having presented the utterance functions and the cues to their interpretation in detail, the paper goes on to outline how participants achieve the correct assignment of pronoun reference and ellipsis in the instructor’s monologue. This is explained by means of an approach to discourse modelling first suggested by Grosz and Sidner (1986), showing how instructors set up ephemeral actions as complex conceptually-salient discourse entities, making them accessible for subsequent reference.

During the course of a Step Aerobics workout, participants have to perform a highly complex set of tasks.
To do this, they are ‘cued’, visually and verbally, by the instructor, who produces a monologue to accompany the activity. This paper focuses on how participants use their knowledge of both the task structure and language to interpret this monologue, and how instructors position and construct utterances so that they serve the goals of the workout. The analysis reveals that less than half the instructor’s utterances actually have a directive function. In fact, instructor utterances serve five main functions, and participants must be able to differentiate between them in order to act correctly. This task is not achieved through relying substantially on either utterance form or content, since there is a great deal of overlap between the different utterance functions in both these respects. Instead, participants rely on the placement of utterances in relation to the beats of the music and the synchronised activity that is taking place in order to distinguish reliably between utterance types. It is utterance timing, rather than sequential organisation, that enables participants to act appropriately.

This research draws on the framework for the description of language and action outlined by Levinson (1992). Levinson introduces the notion of activity type (p. 69), which he defines as ‘any culturally recognized activity...whose focal members are goal-defined, socially constituted, bounded, events with constraints on participants, setting, and so on, but above all on the kinds of allowable contributions.’ Levinson’s formulation places action at the centre of descriptions of such events. The interpretation of any language that occurs within them is closely linked to that action. He draws attention to one ‘particularly important question’ (p.
75): ‘In what ways do the structural properties of an activity constrain (especially the functions of) the verbal contributions that can be made towards it?’ This paper seeks to answer this question for the genre of Aerobics workouts through examining the constraints on the verbal contributions that can be made by instructors, and looking at how these contributions are ‘parsed’ by participants into relevant functions, based on the structure of the action at hand.

As Levinson (1992) has suggested, specific types of inference about what utterances can ‘count as’ in a given context rely on a knowledge of activity type. The task examined here even dictates what can count as the functionally-defined unit ‘utterance’. Because task knowledge and utterance interpretation are so interdependent, different levels of expertise in the task can also give rise to predictable differences in utterance interpretation. In other words, novices predictably ‘misparse’ certain utterance types, leading to certain types of mistakes in action. The monologue therefore displays an element of ‘double coding’, in that novices and experts often interpret it differently. It is part of the growing expertise of each participant to use action knowledge to interpret utterances more reliably, and in turn to use utterance interpretation more surely as a guide to correct action.

A particular example of how task knowledge and timing are used in interpretation can be seen in the successful resolution of ellipsis and anaphora. In the final part of the paper, a model is suggested that outlines how instructors set up complex entities as conceptually salient for the purposes of the activity, and make them accessible for subsequent reference. The clear hierarchical structure of both the task and its discourse makes them particularly amenable to the discourse model suggested by Grosz & Sidner (1986).
This model uses hierarchical structure to capture conversational participants’ focus of attention. The model is very appropriate for an activity in which the hierarchical structure of the action readily dictates how the discourse is to be interpreted, and can successfully be used to capture how correct antecedents are identified in cases of ellipsis and anaphora. This paper has five major sections. The remainder of the current section provides a background to the study and overview of what the monologue is like. It examines the issues involved in understanding in this context, an overview of the workout itself and its structure, and a discussion of the relative roles of verbal and visual information in the interpretative task. In the second section, the data and methods of analysis for the study are presented. In the third section, the paper argues that the workout monologue is analysed by participants into five functionally distinct types of utterance. In that section, I show how understanding is based on placement of the different utterance types in terms of musical beats and the ongoing exercise. This is explained using a framework for discourse analysis that predicts when and where in the workout task the different utterance types are allowable, capturing accurately the knowledge that workout participants are expected to acquire and employ in order to make the correct links between cues and action. In the fourth section, I show how a theory of focus and discourse structure such as that suggested by Grosz & Sidner (1986) can capture the creation and maintenance of the shared focus of attention that enables anaphora and ellipsis to be resolved correctly. In the final section, I discuss implications, conclusions, and directions for further research. 
Language and Action in the Step Aerobics Class

Step Aerobics, as defined by the originating company, Reebok, involves stepping up and down onto a low platform, to music, while performing simultaneous choreographed arm movements (Reebok 1990). Aerobics is in general a group activity, performed in a gym or hall, with a single instructor visible on a platform or at the front of the class. Instructors, and participants, are usually female. The instructor wears a headset radio microphone for large classes, or uses her voice unamplified for smaller ones. Music is chosen to accompany the different stages of the activity, and the instructor’s voice is made audible above the music. Although usually at the front of the class, either facing or with her back to them, the instructor may also move around giving brief comments on a one-to-one basis for some exercises, although this did not happen in the data examined here.

During the activity, the instructor is highly verbal. For example, approximately 1370 utterances, or nearly 4.5 thousand words, were produced in one of the hour-long workouts analysed in the study. Approximately 1500 utterances, or over 7000 words, were produced in the other. Language is clearly an important component of the class.

The constraints on verbal contributions are different for participants and instructors in Aerobics classes. Aerobics workouts of the sort studied here are monologic, because participants are not generally expected to respond verbally. They will only do so in response to the most formulaic of questions (‘let me hear you say yeah!’ ‘yeah!’). Participants who make mistakes are expected to get back in step as best they can: they do not ask for verbal clarification. If a large number of people go wrong at once, the instructor will see this and interpret any verbalisation from the class as underlining the difficulty.
In these cases, she will rectify this, usually by restarting the class from the next appropriate point. We can therefore expect any breakdown in a face-to-face class to be observable from the instructor’s monologue. It is therefore part of the rules of this particular language game that participants are non-verbal, and that they accept this position at the outset. The analysis presented here is therefore deliberately centred on the instructor’s monologue and how it is interpreted. Although language and authority appear to be on the instructor’s side, however, we should not assume that the class consists of the instructor barking at the class like a drill sergeant. She must attend to the ‘face wants’ of the participants (cf. Brown and Levinson 1987) as well as she can, within the confines of the activity, to ensure their enjoyment of and continued attendance at classes, as shown in detail in Delin (1998). Ignoring elements of social and technical preamble before the music starts, this paper concentrates on the activity-related language that arises during the structured workout. An expert participant has some specific and complex skills in conceptualising the workout task and in interpreting the discourse that goes with it. As novices will testify, despite the best efforts of instructors, the workout monologue is not easy to interpret until the necessary skills have been developed. Participants must learn to interpret and use verbal cues often before visual support is available to them. Although the instructor will eventually perform most of the workout actions herself, a strategy of simply mimicking action will leave participants lagging behind. An ability to use the verbal information that is supplied in advance, therefore, will enhance performance significantly. This paper seeks to present what it is that experts know, and novices learn, in the successful negotiation of language and action in the workout. 
The Task of Utterance Interpretation

One might expect unusual or novel vocabulary to be the biggest problem for a novice Aerobics participant. Field-specific vocabulary such as grapevine, side out, single side step, and bow and arrow are used to refer to different actions. More interesting, however, is how participants interpret the language of the monologue so that it provides them with information they can use at critical points in the activity, such as just before changes in action.

As might be expected, a significant portion of the workout monologue consists of utterances that tell participants what to do. These utterances fall within the broad definition of directive offered by Searle (1976). In this context, however, we will need to refine the notion of directive to mean actions to be done at a specific point, since directives in the workout are issued in order to warn the class that they must soon change to a different movement. As we shall see, the complete interpretation of what movement the new one should be, and when it should occur, is arrived at on the basis of a complex set of processes that includes anaphor and ellipsis resolution. Participants must use task knowledge and interpretation of the utterance’s position in relation to activity and to musical beats to achieve comprehension.

Some directives are imperative in form, such as squeeze up, tap and change, shake those arms, and hold it there. As in many other registers of language, however, there is no necessary match between linguistic form and utterance function. Directives in the workout may be nominals, declaratives, prepositional phrases, and adverbials, as illustrated respectively in (1) below:

(1) a. right leg
    b. leg that’s behind, goes to the side
    c. to the centre
    d. overhead

The monologue is not solely composed of utterances with directive function.
Important though they are, directives make up only 43% of the instructors’ total output in the corpus analysed here. The remaining utterances serve other functions crucial to correct participation, and participants must learn to distinguish these. It is common, indeed normal, for instructors to give information to ‘foreshadow’ a change in action, for example. This function may occur across several utterances. It is intended only that the class should absorb the information while completing a previous task, rather than acting on it immediately. For example, the four consecutive utterances in (2) jointly specify that a single set of four repeated arm movements should be performed only after the final utterance:

(2) start with the arms in front
    pull back
    for four
    here we go

The preamble to an action change may refer to the same action several times, making the monologue appear repetitious. For example, in (3), the action of changing the leading foot is referred to no less than four times: changing the legs is signalled by the first utterance, then re-signalled by the utterance tap and change, then again by tap. Finally, left leg up refers a fourth time to the same action, specifying which leg should now lead: this is the result of the ‘tap’, a false step on the floor that allows the other foot to lead the step. Despite four references to the same movement, participants understand that they need to do the action once and once only within the segment. Furthermore, they know which of the several utterances is the final one before the change: in this case, tap and change.

(3) we’re gonna change the legs
    two more
    one more
    tap and change
    tap
    left leg up

Other utterances do not refer at all to a change in action, but are intended instead to allow participants to check that they are performing the current action correctly.
For example, the utterances in (4) do not indicate new movements, but constraints on performing the current one. The range of syntactic forms significantly overlaps that used for directives indicating distinct actions, including the use of imperatives:

(4) a. nice straight backs
    b. hands going to your chin
    c. step together in the middle of the step
    d. roll back and down

These examples show that utterances that seem like directives, syntactically and in content, may indicate constraints on current actions. Other utterances may simply be part of a preamble to the relevant directive. The identity of an action may be constructed over four or five consecutive utterances, each of which adds information to the description. The class is expected to know which kind of utterance is which. In what follows, we will see how the nature of the workout task itself forms a central part of the pragmatic knowledge that allows appropriate interpretative inferences to be made.

The Workout

What, then, is the activity like, and how is it constructed? Instructors plan the exercise activity in detail before the class, generally conforming to the four-section structure suggested by Reebok (1990, p. 9) of warm up and stretch, aerobic stepping, isolation work using specific chosen muscle groups, and finally post-exercise stretch. In each section, the music is crucial: it is selected according to its beat speed, expressed as beats per minute. In musical terms, it must be 2/4, 4/4, or 8/8 to fit the rhythmic structure of the actions that make up the exercises.

Breaking the workout structure into four sections already indicates something of its hierarchical nature. In fact, the activity is further divided into sub-tasks down to a very fine level of granularity. For example, the aerobic stepping section may build up gradually, and also incorporate a cooling-down period before the stretching section. Each of these sub-sections is itself made up of concatenated units of activity.
A unit is made up of sets of repetitions such as 8 of one action, 16 of another, and 8 of a third, and so on, and that unit would be repeated twice, once in each direction. As we will see, the hierarchical nature of the task is important both to the conceptualisation of the activity, and for the production and interpretation of the accompanying language.

The basic building block of Step, from which all higher levels of organisation are constructed, is the step pattern or floor pattern, a small number of actions performed with the feet. Step patterns can be performed on the spot, or incorporate left or right, forward or backward travel. It is up to the instructor to concatenate patterns and co-ordinate travel so that each member of the class ends up back where she should be in relation to her step platform and to the four walls of the room at the end of the sequence.

The simplest kind of step pattern takes four beats from the resting position to perform. One four-beat pattern is the basic step: up onto the step with the right foot, up with the left foot, down with the right foot, down with the left foot. This can be varied by leading with the left foot, or by starting on top of the step and stepping down first. Another group takes eight beats to perform: these are known as alternating steps. The following example of one kind of alternating step, the ‘Tap Up’, is taken from the Reebok trainers’ manual (1990):

Tap Up
Execution: Approaching from the front of the step: R foot up, L foot tap up, L foot down, R foot down, L foot up, R foot tap up, R foot down, L foot tap down.
Cycle: 8 counts
Cue: up, tap, down, down, up, tap, down, down
(p.24)

We can see here in ‘cue’ that the instructor is told what to say to accompany the exercise.
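As a rough illustration of how step patterns combine into sets of repetitions, the basic step and the Tap Up described above can be written down as beat-by-beat sequences. The following Python sketch is not part of the original study: the names `StepPattern`, `BASIC_STEP`, `TAP_UP`, and `phrase_beats` are hypothetical, and it assumes that each individual act occupies exactly one musical beat.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class StepPattern:
    """A floor pattern: a named sequence of foot acts, one act per beat (assumption)."""
    name: str
    acts: Tuple[str, ...]

    @property
    def beats(self) -> int:
        # Under the one-act-per-beat assumption, the beat count is the number of acts.
        return len(self.acts)


# The four-beat basic step, as described in the text.
BASIC_STEP = StepPattern(
    "basic step",
    ("R foot up", "L foot up", "R foot down", "L foot down"),
)

# The eight-beat Tap Up, copied from the 'Execution' entry quoted above.
TAP_UP = StepPattern(
    "tap up",
    ("R foot up", "L foot tap up", "L foot down", "R foot down",
     "L foot up", "R foot tap up", "R foot down", "L foot tap down"),
)


def phrase_beats(sets: Sequence[Tuple[StepPattern, int]]) -> int:
    """Total beats in a phrase given (pattern, number of repetitions) pairs."""
    return sum(pattern.beats * repetitions for pattern, repetitions in sets)
```

Under these assumptions, `phrase_beats([(BASIC_STEP, 8)])` counts 8 repetitions of a four-beat pattern, that is, a 32-beat set.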
The hierarchical structure of the workout can therefore be described by beginning with the fine detail of individual acts (such as the placing of one foot on the step), composing these individual acts into floor patterns (a basic step, an alternating step), adding floor patterns together into repetitions (such as 8 basic steps), and then composing ‘phrases’ of sets of repetitions (such as 8 basic steps, 16 tap ups, another 8 basic steps). These phrases are themselves combined into ever more complex sequences of activity. The penultimate level is the four-section structure of the workout as a whole described above. At the top level, for our purposes, is the category of the workout itself, although we could see this as again embedded in, say, an eight-week sequence of exercise classes. This hierarchical structure is exemplified in Figure 1.

[Figure 1. Hierarchical structure of the workout. Node labels, from top to bottom: workout; warm-up section 1; 32-beat phrase 1, 16-beat phrase 1, 32-beat phrase; 8 basic steps, 16 side steps, 8 basic steps, 8 basic steps; basic step; right foot up, left foot up, right foot down, left foot down.]

The data for this study consisted of two one-hour Step Aerobics classes. Class 1 was collected in a University sports centre, at a class in which the author was a participant (and was a regular participant in the series). Class 2 was taken from a commercial instructional video recording distributed by Reebok (1992). The Reebok class was taken to represent an institutionally-approved version of the Step workout (the instructor on the video is in fact the inventor of the exercise). Obviously, workouts recorded for video and those performed in a class situation will differ substantially in many respects, and these factors are periodically discussed.
However, the instructional monologue produced is assumed to be similar whether talking to a class or to an individual watching the video, particularly since the video is intended to replicate a ‘complete, heart-pumping session of Step Reebok’ (Reebok, 1992). A third University class was recorded, transcribed, and consulted informally as a cross-check on conclusions.

In addition to qualitative comment on the data, this study included quantitative analysis. This was based on a close analysis of a total of 27 minutes and 57 seconds of data taken from class 1 (henceforth ‘University’) and class 2 (henceforth ‘Reebok’). This corpus consisted of excerpts taken from two separate sections of the University class, and one section of the Reebok video. Data from the University class consisted of a segment of the ‘Warmup’ section (15 minutes and 8 seconds, 386 utterances, 1175 words), and the ‘Aerobic’ section (5 minutes and 44 seconds, 191 utterances, 508 words). The total duration of data selected for close analysis from the University class was therefore 20 minutes and 52 seconds, consisting of 577 utterances and 1683 words. From the Reebok video, data was taken from the ‘Warmup’ section alone: 300 utterances or 936 words, a duration of 7 minutes and 5 seconds. The corpus as a whole contained 877 utterances, and 2619 words.

This data was coded utterance-by-utterance using the Workbench for Analysis and Generation (WAG), a computer coding tool that facilitates the analysis of data within the framework of Systemic Functional Grammar1. The current study was based on a systemic network built expressly for this analysis, allowing the relevant features of the workout data to be captured. The software allows for units of language (in this case, utterances) to be coded according to the features of the network, and performs statistics on the codings when completed.
All syntactic and utterance-type coding described here was performed using the coder. Prior to coding, the data described in Figure 2 was also transcribed by hand onto a grid representing the 4/4 rhythm of the music, locating each syllable as closely as possible to the musical beat on which it occurred (to an accuracy of half-beats). This method of analysis was used as the basis for the study of utterance function assignment in relation to beat placement.

Utterance Functions in the Workout Monologue

We saw in the introductory section an informal description of the kinds of function that utterances in the workout monologue can serve. It is crucial to appreciate here the way in which the language relates to the activity type under way if we are to understand how the instructor's utterances can be assigned their correct functional interpretation. As an experienced participant observer in workout classes, and following the approach outlined by Levinson (1992), I have constructed the set of utterance function categories to be presented here on the basis of an understanding of how the activity itself divides down into functional sub-parts, and what kinds of information are needed by participants at specific points. The structure of the activity, and the language that arises within it, jointly serve to support the hierarchical goal structure of the activity. The goal structure of the workout places constraints on form, placement, and content of utterances, and interpretation is constructed on the basis of both the linguistic and the situational facts. The categories presented below, therefore, are not defined within the domain of speech alone: they are constructed and interpreted through the activity type in which they are embedded and of which they constitute a part. What are termed
‘utterances’ in this analysis are not independently verified in any systematic way on the basis of linguistic evidence such as pause or breath groups, although some linguistic differences are noted between them. Instead, utterance types are defined functionally, in terms of their role in the workout activity and experienced participants' actions in response to them. I have tried as far as possible, therefore, to develop a framework that arises, like the language itself, directly from the activity type in question.

During the course of the workout, utterances with five main functions are produced and interpreted. In this section, we will examine these utterance types in more detail, and show what tasks interpreters are performing in differentiating them from one another. The five utterance types can be summarised as follows:

Directives warn participants about what they will need to do at the next change. They may be straightforward cues such as now upward row, overhead, or again, or Directive Countdowns asking for a repeat of a previous action, such as four more.

Descriptions2 specify what participants should be doing right now, often functioning as a check or follow-up to a previous directive. These may be utterances such as up up down down, describing at the time the ongoing action, or Descriptive Countdowns which count the number of actions or duration down to a change (for four, three, two...).

Teaching Points are of two kinds. They may either give further detail on an action currently under way, constraining the activity as in try not to push on that leg, resulting in participants checking that their behaviour conforms, or they may instruct participants in a new sequence of actions before they do it themselves, as in now we're gonna do an A step.

Comments are utterances that are usually used to check and create interpersonal relations between participants and instructor, such as have I warmed you up yet? and this is very good.
Markers are used to warn participants that they must act on the next beat. They may act as follow-up to a directive, and/or as a preface to a description. Examples are Ready? and Here we go.

The numbers of utterances in the University and Reebok corpora were 577 and 300, respectively. The distribution of utterances in the University corpus was Directives (35.4%), Descriptions (47.3%), Teaching Points (10.7%), Comments (3.6%), Markers (1.4%) and Unintelligible (1.6%). The distribution in the Reebok corpus was Directives (57.7%), Descriptions (19%), Teaching Points (9.7%), Comments (8%), Markers (5.3%) and Unintelligible (0.3%). A chi-square test revealed that Directives were significantly more prevalent in the Reebok corpus than the University corpus, χ²(1) = 23.54, p < 0.05, whereas Descriptions were more prevalent in the University corpus than the Reebok corpus, χ²(1) = 87.40, p < 0.05.

These differences may be accounted for by the structure of the workout: the Reebok workout is very inclusive in terms of the muscle groups it addresses in its duration. There are more frequent action changes with each action being performed for a shorter time. For changes in action, Directives are mandatory. The University workout, by contrast, is part of a series with different exercise goals for each class, and segments of it may already be quite familiar to regular participants. There are fewer changes of action, and actions go on for longer. Descriptions are therefore used to support continuing actions, with Directives appearing at the less frequent changes. It is likely, too, that individual instructor styles play a role in how utterances are selected, and further research is needed to ascertain what particular factors affect frequency of utterance choice.
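The per-corpus percentages above can be related to the overall figure of 43% directives cited earlier with a little arithmetic. The sketch below is an illustration only, not the original analysis: it re-estimates counts from the rounded percentages, so the counts are approximate, and the names `SIZES`, `SHARES`, and `overall_share` are hypothetical.

```python
# Corpus sizes (number of utterances) as reported in the text.
SIZES = {"University": 577, "Reebok": 300}

# Percentage of utterances per function, as reported (rounded) in the text.
SHARES = {
    "University": {
        "Directives": 35.4, "Descriptions": 47.3, "Teaching Points": 10.7,
        "Comments": 3.6, "Markers": 1.4, "Unintelligible": 1.6,
    },
    "Reebok": {
        "Directives": 57.7, "Descriptions": 19.0, "Teaching Points": 9.7,
        "Comments": 8.0, "Markers": 5.3, "Unintelligible": 0.3,
    },
}


def overall_share(category: str) -> float:
    """Size-weighted percentage of a category across both corpora.

    Counts are re-estimated from rounded percentages, so the result is approximate.
    """
    estimated_count = sum(
        SIZES[corpus] * SHARES[corpus][category] / 100 for corpus in SIZES
    )
    return 100 * estimated_count / sum(SIZES.values())
```

Weighting by corpus size, `overall_share("Directives")` comes to roughly 43%, which agrees with the proportion of directives reported for the corpus as a whole.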
It is sufficient for the purposes of this paper, however, to note that utterances of all the major types were used by both instructors. The subsequent sections examine each utterance type separately, describing the features that distinguish them both in form and function.

Directives

Directive utterances are those that tell participants about an imminent change of action, and what is to be done in the next task segment. These are composed of a range of syntactic forms. Further examples appear in (5):

(5) a b c d e f g tap and change stepping up and front side for two now upward row to the centre single side step

An important sub-category of directive is the count directive, which cues participants how many times a movement must be repeated, or for how long a duration in terms of beat counts. 85 of the directives in the total corpus are count directives: examples appear in (6):

(6) a. one more time
    b. last one
    c. another four
    d. four more

Together, the non-count and count directive define the next activity, and therefore often appear one after the other. Example (7) shows a typical sequence. Directives are in bold, count directives in italics:

(7) arms overhead four more four, three arms down for four four three two arms up four three two arms down for two up for two down for two up for two

Note that, in the case of arms up, a count directive does not appear, since it has been established that the operational unit for the duration of the exercise is four repetitions. This ‘carrying over’ of information from one task segment to another will be examined further in the discussion of focusing.

The time-critical nature of directives is crucial to interpretation. Since it is part of directive function to warn or cue participants, a defining characteristic of directives is their temporal placement, as indicated by the Reebok advice on cueing quoted earlier.
In the workout context, temporal placement can be described in terms of utterance position in relation to beats of the music and the ongoing activity. For example, if the current activity consists of four repetitions of a four-beat movement (i.e., 16 beats), a directive for what is to be done in the next task segment must be issued on the last four-beat repetition: that is, between beats 13 and 16. Likewise, if the activity is an eight-beat movement with four repetitions (32 beats), a new directive must appear on the last eight-beat repetition: between beats 25 and 32. Directives, then, always appear on the last action repetition. Two-beat actions, however, are an exception: here, the final repetition is rather short to issue a coherent directive, so in these cases instructors will take the final two repetitions (four beats) to issue the directive.

Two examples will serve to make these facts about directive placement somewhat clearer, represented diagrammatically to show the relationship of the language to the music and the activity. Figure 2 shows utterance placement in a four-beat activity (a basic step). In this and the following figures, activity beats are represented by numbers below a line, while each line is given a number above it to denote the number of repetitions that have taken place of that activity. The accompanying language is placed below the beat-marking numbers, and utterance functions are differentiated as noted in the key to each figure. Boxes indicate multi-word utterances that are too long to place on a single line: in these cases, the extent of the box, or the extent of a bracket above the box, indicates the number of beats that the utterance consumes.

In Figure 2, it should first of all be noted how the count directive, another four, is placed on the final repetition of the previous activity segment. The next four four-beat bars, then, will be taken up with continuing repetitions of the same activity.
As predicted by the first count directive (another four), participants know they must expect a new directive on the fourth and last repetition. The predicted directive, now upward row, duly arrives just before the second beat of the final four-beat bar. It could, in fact, have been later: as long as it is complete by the half-beat after beat four, it will suffice as a cue. However, this instructor wishes to use a marker (which we return to below) to mark the final beat before the change: squeeze is such a marker. This means that the directive has been placed sufficiently early to allow room for it. Finally, the beginning of the next set of repetitions is marked by the description up (see below for a description of descriptive utterances), said on the beat at which the class is stepping up onto the step.

[Figure 2. Utterance placement in a four-beat activity. Key: directive, marker, description.]

A two-beat example appears in Figure 3. As it is rarely possible to generate a useful directive utterance in the space of two beats, instructors often use the space of the last two repetitions of the previous segment, instead of the final repetition, to issue the directive. Accordingly, the onset of the directive push forward with the arm is placed halfway through repetition three of the previous segment. It is followed by three descriptive utterances (four, three, two) before the onset of the next directive, now overhead, which again is positioned halfway through repetition three of the activity. Note that the fourth and final beat of the current task segment is the possible site for a marker (indicated here by going).
Finally, the new task segment is marked by descriptives that further elaborate on the content of the directive now overhead, telling participants that their arms should now be going up.

[Figure 3. Utterance placement in a two-beat activity. Key: directive, marker, description.]

We are now in a position to formulate a metric for the interpretation of directives, based on timing and placement, in the workout monologue:

Metric for Directive Function: directive onset will occur as late as possible during the last four beats of the task segment prior to the required change, or during the final repetition of the previous task set, whichever duration is greater. Maximum lateness of placement is dependent on the presence or absence of an optional final-beat marker.

This, then, is a specific statement of the kind of pragmatic knowledge that participants must develop and use in order to interpret the workout discourse appropriately.

Markers

Markers are utterances issued on the final beat or half-beat before a new task segment begins. They signal clearly when a change in activity will take place. Markers are optional. The archetypal marker may be familiar to the readers who attended ballet class in their youth: the pianist would play an introduction, turn to the class, and say ‘and’ to tell them where to start. In the workout discourse, markers usually appear after directives, but may appear on their own while a task segment is underway, reminding participants of a change in move without describing the content of that change.
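The Metric for Directive Function can be illustrated with a short sketch. This is not part of the original analysis: the function name `directive_window` and its parameters are hypothetical, and the sketch assumes that beats are numbered from 1 within a task segment. It returns the span of beats in which a directive onset is expected, that is, the final repetition or the last four beats, whichever is longer. It does not model the optional final-beat marker.

```python
from __future__ import annotations


def directive_window(beats_per_repetition: int, repetitions: int) -> tuple[int, int]:
    """Return (first_beat, last_beat), 1-indexed and inclusive, of the span
    in which a directive onset is expected under the metric.

    Hypothetical helper: the names and the 1-indexed beat numbering are
    assumptions made for illustration only.
    """
    if beats_per_repetition < 1 or repetitions < 1:
        raise ValueError("beats_per_repetition and repetitions must be positive")
    total_beats = beats_per_repetition * repetitions
    # Final repetition or last four beats, whichever is the longer span.
    window = max(4, beats_per_repetition)
    # Assumption: a segment shorter than four beats is all window.
    window = min(window, total_beats)
    return total_beats - window + 1, total_beats


four_beat = directive_window(4, 4)   # four repetitions of a four-beat movement
eight_beat = directive_window(8, 4)  # four repetitions of an eight-beat movement
two_beat = directive_window(2, 4)    # two-beat action: last two repetitions
```

Under these assumptions the three calls reproduce the cases discussed in the text: beats 13 to 16, beats 25 to 32, and the final four beats of a two-beat activity.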
In the data analysed for this study, markers constituted 1.4% of the University monologue, and 5.3% of the Reebok monologue. Given the large number of action changes that occur in the workout, it is clear that final-beat marking using a marker is optional: late placement of the directives, often with fall-rise intonation, is often sufficient warning of the specific beat upon which a change is to take place. It is clear that intonation is an important signalling device in the workout, not only of utterance function but of position in the hierarchical task structure. A full analysis is beyond the scope of the current study, however. A wide range of syntactic forms are used as markers. They may be as brief as and, or as long as you’re gonna go. The University instructor favours -ing forms, such as going, curling, pushing. Whatever their syntax, markers must, however, fit into at most a beat and a half, since their function is to indicate that a new action must take place on the next beat. An example of going as a marker appeared in Figure 3; further examples appear in Figure 7 at the end of the exposition of the individual utterance types. Unusually, markers can appear in pairs, if the previous directive has left enough space: an instructor might use two in a row such as here you go you're gonna go, ending on the half-beat before the relevant action change. In addition, markers can appear on their own with no directive, if the action is ongoing and is therefore sufficiently salient. For example, a new set of repetitions of the same action might simply be marked by an utterance of and.

Descriptions

In the University workout in particular, a large number of utterances serve the function of describing the current move. These descriptive utterances supply information to back up the framework provided by directives: they narrate the workout activities as they happen, providing a more fine-grained level of description than that given in directives.
Frequently, a sequence consisting of a directive and a marker is followed by description, as in (8):

(8)
Directive: now add the hop
Marker: you're gonna go
Description: up hop

In this case, up hop appears on the two beats where stepping up and hopping should occur. Other examples appear in (9), below:

(9)
a. up tap and down tap
b. backwards
c. up lift
d. side, up, side, and down
e. squeeze curl

As it has to appear while the class is performing the relevant components of specific actions, description is time-critical in its delivery. It is, as a result, terse. Syntactically (apart from the descriptive countdowns described later in the section, which are as might be expected mainly numbers), description consists of adverbials (just under 43%) and clauses (just over 16%); noun phrases are rare. As we noted above, the University instructor uses description relatively prolifically, while the Reebok instructor, by contrast, describes less frequently, relying more on directives issued prior to the action. While syntactic information does not serve reliably to distinguish description and directives, the two categories of utterance are markedly distinct in terms of the related issues of beat placement and information status. Firstly, it is important to stress that a description is simply a late-placed directive: an utterance that must function to describe a current action rather than to direct a subsequent one. Description is placed too late to use as a cue: participants attempting to use it as a cue will be at least half a beat behind the class. Informationally, to an adept participant, description will be composed of information that is already shared with the instructor, and is entailed by the content of the preceding discourse.
For example, having issued the directive and knee lifts, the University instructor describes up lift on each iteration of the task; likewise, having directed an action to the class as then we're going to come back for three (meaning three steps backwards), she then says backwards on each of the three steps as they are performed. This involves an understanding that the described actions are part of the action described by the directive: that is, that there is a rhetorical relation between the two kinds of information that might be captured variously as a part-whole or elaboration relation. For expert participants, the descriptive information is inferrable on the basis of the directive. Less expert class members, however, may not appreciate this relationship and may therefore see the description as requiring different actions (the interpretation of the workout as a flat structure consisting of sequences of unrelated directives), resulting in confusion. A more successful, but still non-expert, strategy lies in understanding the relationship between directive and subsequent description, but still treating description as the component that cues participants on when to act. If a participant has to use description to tell her what she should be doing, she will be late in performing the action. This failure to share what is clearly intended to be shared knowledge indicates to a participant that she is not expert, and is excluded from the construction of ‘ideal participant’ that the description implies. It is more than likely that participants move between these levels of descriptive interpretation during the workout, since even the most experienced participants can lapse in concentration from time to time, and will take the opportunity to get back in step that description offers them. The ability to interpret description as description, rather than needing to use it as directive, is a sign of developing skill.
Description, then, is a category which is ‘double-coded’ depending on level of expertise. A second kind of descriptive utterance appears in the monologue in the form of count descriptions. Count descriptions serve to enumerate repetitions or beat duration on the relevant beats, and are exemplified in Figure 4. Count descriptions serve a different function from count directives: the fact that the two kinds of utterance are concatenated in the monologue reveals this. In Figure 4, repetitions of a four-beat action are cued by the count directive four more, followed by the count descriptions for four, for three, for two, for one, each placed on the first beat of each repetition. Informationally, count descriptions serve to elaborate at a low level of detail: while the directive four more of these, for example, defines the task segment, the descriptions for four, for three, for two, for one serve to count off the component actions of that segment. Note that, while the slot is not exploited, there is still, in principle, time for a marker at the end of the final repetition. The role of description and directives, and counts of both kinds, in defining task segments as salient discourse entities will be addressed in the discussion of focusing later in the paper.

[Figure 4. Count directives and count descriptions. Key: directive, description.]

Teaching Points

Teaching points are a heterogeneous category, but are grouped together here because they are not time-critical and are interpreted by participants as advisory rather than directive. Teaching points serve two particular major functions.
Instructing teaching points function to teach a new move that participants are shortly to undertake, and constraining teaching points describe a current move in more detail, enabling participants to check that they are performing it correctly. Teaching points as a general category occupy just over 10% of the monologue in the total corpus examined here, with the proportion being very similar for both instructors. Example (10) is a long instructing teaching point from a University workout:

(10) and we're gonna do three of these here gonna go across to the front diagonally then we're going to come back for three watch while I do it keep up tapping and down tapping here and I'll show you watch here I go I'm going to go four three two then I'm going over diagonally one two three and tap backwards backwards backwards then back over one two four ok

In this long embedded segment, the instructor performs a range of speech act functions: in particular, she describes herself performing the action. These segments are therefore capable of further analysis in a recursive manner, re-applying the framework within the teaching segment. Many teaching points are shorter, as in the two examples in (11) below from the University data:

(11)
a. we're gonna do a basic up tap down tap to the side
b. Now you're gonna watch for these arms, watch

These teaching points differ from directives in that they are not to be acted upon immediately: instead, they foreshadow activities, and therefore other utterances, to come. Because they are delivered at a time when participants are already established in a different activity, they are not time-critical, and as a result tend to approximate more closely to the rhythms of casual speech than do directive and descriptive utterances.
An example of the placement of an instructing teaching point, and its ‘foreshadowing’ of directives and description to come, appears in Figure 5. In this example, the instructing teaching point then we're gonna add the hop, yes? comes six four-beat bars before the directive it foreshadows³, now add the hop. The teaching point is not placed on a final repetition of an action: it takes place when a sequence of actions (indicated by sequences of up, squeeze, over) is already underway. Most importantly, its placement violates a beat segment boundary, differentiating it further from a potential directive: as we have seen, directives typically inhabit a region as near as possible to the end of a four-beat or single-task-segment structure. This teaching point, on the other hand, straddles two bars. Since the we're gonna construction is also a common one for directives, it is clear from examples like these that placement is central to disambiguating utterance function.

[Figure 5: Teaching point foreshadowing later directive. Key: directive, marker, comment, teaching point, description.]

The second category of teaching point is the constraining teaching point, which serves as a check on an activity already under way. The function of constraints is to ensure that participants are doing activities correctly and safely, and in a way that will enable them to get the maximum benefit from the exercise.
(12) gives examples from the Reebok workout:

(12)
a. each time your hands come down and rest on your thigh
b. shoulders back abdominals tight
c. push those crosses straight out in front

Like instructional teaching points, constraints are not placed in the segment-ending positions that might cause them to be confused with directives: stretching those hamstrings and pushing those crosses out in front are not exercises in themselves, but refinements of current exercises. Placement, therefore, is ‘parenthetical’ to the main structure created by directives, description, and markers. An example of a constraining teaching point in context appears in Figure 6, in which really work that back is inserted between two instances of and down which mark, and then describe, the current activity. Note, again, the cross-segment-boundary positioning of really work that back. It is this positioning that enables participants to understand constraining teaching points as elaborating at a lower level of detail on current actions.

[Figure 6: Constraining teaching point in context. Key: directive, marker, teaching point, description.]

Comments

Comments are utterances that serve to check and manipulate social relations in the workout. They are the only class of utterance whose content is not directly determined by the structure of the activity itself. They may, for example, serve a meta-linguistic function, checking that everyone can hear and is in a good position to begin the workout.
More commonly, however, they serve to joke, check progress, and praise participants at various points throughout the class. Examples from the Reebok workout appear in (13):

(13) a b c d e ready? good that's it right good that's good

In fact, the examples in (13) above exhaust the variety of comments in the 300 utterances analysed closely in the Reebok data, perhaps because more contentful observations on class behaviour are not possible in the context of a video recording for commercial distribution. The University instructor, who has a live class in front of her, uses a wider variety of comments. Examples appear in (14):

(14)
a. nice and easy
b. how're we feeling?
c. this is very good
d. keep it up
e. you can almost hear the steam coming out of your brains
f. why d'you all look so worried?

Comments are not time-critical in their placement: as long as they avoid the positions defined as prime sites for directives, even imperatives such as keep it up should not be confused with directives. It is likely that a combination of content (for example, evaluative content, mental process verbs, non-field-specific vocabulary) and syntax (questioning and clausal constructions) serves to differentiate comments adequately from other utterance types.

Summary: Utterance Placement and Interpretation

Having described the five utterance types found in the workout monologue, we can examine how the utterance types work in context. In particular, it is helpful to examine some sequences of utterances from the workouts that further exemplify the ‘grammar’ of utterance placement at work. As we have seen, three of the five functions of utterance are time-critical: directives, descriptions, and markers. Interpretation of directives as directives relies very closely on the structure of the task at hand. When workout participants reach the end of a task segment, their expectation that the next utterance will be a directive is greatly increased.
In fact, because of the particular time-constraints of the task and the class expectation that a change is approaching, they are desperate for such information. Understanding of at least the current segment of the task structure is a crucial cue to top-down interpretation of the incoming utterances: anything appearing at these crucial points in the activity is likely to be interpreted as a directive. The directive metric given earlier in this section also caters for the positioning of markers, which, although optional, must, when used, be placed on the final beat or half-beat before an action change. Finally, it is part of the definition of description that it is placed on the beat of the relevant action, so that it performs its primary task of allowing verification of the current action. Teaching points and comments may be freely placed, as long as they avoid the key beat placements reserved for directives in particular. Freedom from beat timing means that utterances of both these categories are more likely to approximate the normal rhythm of speech, and they are more likely to be syntactically complex than description and directives. As we have seen, too, they may straddle beat structure boundaries freely. Rhythm, syntax, and placement therefore combine to prevent confusion between the different classes of utterance. Figure 7 is an example that features all five types of utterance and illustrates the points that I have been making about their placement. It includes and follows on from the excerpt in Figure 5 above. The workout task is a sixteen-beat one: we join the extract at the beginning of the third of four sixteen-beat repetitions. First, note the placement of the time-critical utterances: directives, description, and markers.
The major task directives now add the hop and once more round (non-count and count respectively) are placed as near as possible to the end of the sixteen-beat segment, as the metric for directive interpretation given above predicts. The reason that these directives are not right at the end is that a marker is present in each case (you're gonna go and going), and these occupy the last possible position in the segment. Descriptions are placed on the beats to which they refer: this accounts for the utterances up squeeze, up squeeze, squeeze, over, up hop, hop over, one two three and tap, up hop, up hop, up hop, over, and one two three and tap. Two further utterances, then we're gonna add the hop, yes? and don't forget to go backwards are teaching points, instructing and constraining respectively. As we saw above, the teaching points avoid the critical directive position late in their respective task segments (in this 16-beat activity, beats 13-16 inclusive). Finally, the encouraging comment, keep it up, similarly avoids the critical placement, and is in fact delivered, although the transcript does not show it, with a higher pitch than any of the descriptive utterances that are its only potential sources of confusion.

[Figure 7: 5 types of utterance in context. Key: directive, marker, comment, teaching point, description.]
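The constraint that teaching points and comments avoid the critical directive position can be sketched as a simple check. The function name, its parameters, and the onset values used below are hypothetical and are not taken from the transcript; the sketch assumes 1-indexed beats within a 16-beat segment whose critical position is beats 13-16, with half-beats written as fractions.

```python
def in_directive_position(onset_beat: float,
                          segment_beats: int = 16,
                          window_beats: int = 4) -> bool:
    """Return True if an utterance onset falls in the late, segment-final
    region reserved for directives (and for any final-beat marker).

    Hypothetical helper for illustration: beats are 1-indexed, and the
    half-beat after the final beat still counts as part of that beat.
    """
    first_critical_beat = segment_beats - window_beats + 1
    return first_critical_beat <= onset_beat < segment_beats + 1


# Invented onsets, for illustration only.
early_teaching_point = in_directive_position(6.0)   # mid-segment placement
late_directive = in_directive_position(13.5)        # within beats 13-16
```

On this sketch, an utterance with a mid-segment onset is free to be read as a teaching point or comment, while one beginning in the final four beats is a candidate directive or marker. Placement is treated here as one cue among several, alongside rhythm and syntax.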
Different types of utterance address different levels of conceptual hierarchy in the workout task. Pragmatic knowledge of utterance placement and task structure leads participants to infer correctly the ways in which this happens. While directives, count directives, and instructional teaching points provide descriptions of entire task segments, constraining teaching points and descriptive utterances operate at a finer level of detail, reaching inside task segments to describe or constrain sub-activities. Markers act as ‘punctuation’, to warn exactly when a change is expected, while comments serve a primarily social function, maintaining friendly proximity during the workout.

Do it Again: Focusing and Reference Resolution

In addition to inferring the distinct functions of utterances, workout participants must also resolve a large number of anaphoric references, and successfully interpret a high degree of ellipsis. In this section, we will look at the ways in which anaphoric referring expressions and ellipsis come to be interpreted correctly in a situation where the discourse referents – actions and sequences of actions – are both complex and ephemeral. For example, the class is frequently given directives such as again, another four and once more round. In a sequence of actions that stretches back as much as forty or fifty minutes, it is clear that mechanisms are in place that enable participants to segment their previous activity into a conceptual structure that provides correct antecedents for such expressions. Similarly, during a sequence of activities, elliptical directives such as now to the front and and singles are often encountered. Here, the class is clearly intended to preserve particular elements of the current activity, and change only those which the new directive is intended to update.
This ability suggests that a working description of the current activity, represented as an accessible aggregate of different variables, is available to them. Furthermore, since all the expressions indicated are capable of different interpretations at different points in the workout, it is clear that the knowledge structures involved are dynamic, and capable of complete updating whenever a global change in activity takes place. The highly structured nature of the workout task, and the close mapping between this and the accompanying monologue, suggest the appropriateness of computational theories of focusing. These employ hierarchical and dynamically-updated discourse models as a means of representing interpretative processes (for example, Grosz 1977; Sidner 1979; Reichman 1981). In particular, the following brief account draws on the theory of Grosz and Sidner (1986). They suggest that interpretative processes required for keeping track of discourse can be captured in terms of a model consisting of three related structures: linguistic structure (the actual discourse), intentional structure (a structure composed of the various purposes of the discourse segments) and attentional structure (a representation of the state of attention of participants in the discourse). Recognising how discourse coheres into segments, and modelling the relationships between those segments, is known to provide information vital for the understanding of semantically underspecified linguistic expressions such as ellipsis and anaphora. It also allows discourse participants to keep track of discourse topic, digressions, and interruptions.
While there is not the space here either to give a complete description of Grosz and Sidner's theory or to exhaust its possibilities for describing the current data, an examination of some particular cases of reference will exemplify some of the finer-grained interpretative tasks that workout participants successfully carry out. Central to the theory are so-called focus spaces which represent attentional state. In the model, each segment of the discourse gives rise to a focus space which contains the entities that are salient in that segment. In addition, each focus space contains information about the purpose of its discourse segment and its relationship with other segments. The relationship between focus spaces is represented as a stack, with new spaces being pushed onto the top of the stack when the discourse segment purpose for a new segment contributes to that of the segment immediately preceding it. Discourse segment purposes, then, are seen as hierarchically organised, contributing to the global purpose of the discourse as a whole. When a purpose is achieved, and the discourse associated with it therefore comes to an end, the focus spaces dominated by that purpose are popped from the stack. In terms of the accessibility of discourse referents, the stack as a whole is seen as representing the salience of entities, with the topmost focal space containing the most salient. However, information in lower spaces is accessible from higher ones, providing that all are dominated by the same focal space. We can conceptualise the workout monologue in Grosz and Sidner's terms as being dominated by a single, top-level, discourse purpose: we can gloss this as the instructor's intention that the class intend to perform the workout. However, it is the subordinate structures into which the workout and its monologue are arranged that provide the interesting answers for reference resolution.
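The stack of focus spaces described above can be pictured with a minimal sketch. This is an illustration only, not Grosz and Sidner's own formalisation: the class and method names are hypothetical, the entity labels are invented, and the sketch simplifies the accessibility condition by assuming that every space on the stack is dominated by the same top-level purpose.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FocusSpace:
    """One discourse segment: its purpose plus the entities salient in it."""
    purpose: str
    entities: dict[str, str] = field(default_factory=dict)


class FocusStack:
    """Hypothetical stack of focus spaces; the topmost space is most salient."""

    def __init__(self) -> None:
        self._spaces: list[FocusSpace] = []

    def push(self, space: FocusSpace) -> None:
        # A new segment whose purpose contributes to the current one.
        self._spaces.append(space)

    def pop(self) -> FocusSpace:
        # The segment's purpose has been achieved, so its space is removed.
        return self._spaces.pop()

    def resolve(self, name: str) -> str | None:
        # Search from the most salient space downwards (simplifying assumption:
        # all lower spaces are accessible from the higher ones).
        for space in reversed(self._spaces):
            if name in space.entities:
                return space.entities[name]
        return None


stack = FocusStack()
stack.push(FocusSpace("perform the workout", {"floor pattern": "basic step"}))
stack.push(FocusSpace("perform the arm sequence", {"arm movement": "upward row"}))
```

With both spaces on the stack, a reference to the arm movement is resolved in the topmost space and a reference to the floor pattern in the space beneath it; once the upper space is popped, its entities are no longer available.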
As we have seen in the discussion of the workout plan, there is a range of levels at which the workout activity can be conceptualised and described: from the individual actions that make up floor patterns, through sets of repetitions of different floor patterns, right up to the four major sections of the workout. For economy, instructors need to be able to refer quickly and easily to parts of the task, so that participants will be able to repeat movements. In this section, we will look briefly at how the University instructor sets up task segments of particular ‘grainsize’ in order to refer to them subsequently. The first step is to set up a floor pattern and establish it as salient to participants. This is achieved either by a straightforward directive, or, if it is new or complex and therefore needs to be taught, by a teaching point, as in (15):

(15) we're going to do a basic up tap and down tap to the side

This introduces the floor pattern on the basis of which the next sequence of activity will be built up. To establish the floor pattern, this instructor typically describes a few repetitions of it. Not every step is described. Even with some steps omitted, however, the spacing of descriptive utterances still serves to establish in participants' minds the duration of the salient activity in beats (in this case, four beats; in example (16), brackets indicate elapsed beats that are not marked linguistically). After we're gonna go, a marker, the description begins:

(16) we're gonna go up tap and down tap up tap and down tap (1.5) and down tap (2) down tap

Then, to build up a more complex sequence of activity out of the now-established floor pattern, the instructor will add several repetitions together to make a longer phrase. This will be done using a teaching point.
Note that, in (17), the anaphoric these is able to refer to the ‘up tap and down tap’ that has been established as the basic floor pattern, while the final elliptical three refers to three such movements:

(17) gonna do three of these here gonna go across the front diagonally then we're gonna come back for three … then back over

This teaching point serves as the first introduction of a 32-beat pattern of activity, composed of three 4-beat ‘up tap and down taps’, a 4-beat movement across the step, three further 4-beat up tap and down taps moving gradually backwards to the foot of the step, and a final 4-beat step over back to the starting position. The instructor then describes this as before, as the class join in. The pattern of her utterances soon falls into the normal one of directives and description as they accompany her: she moves from a strategy of exclusively on-the-beat description to one of issuing some information early enough to use as directive. For example, over and and then back over are directives. In example (18), numbers in brackets again indicate elapsed ‘silent’ beats:

(18) up tap down tap up tap (2) (3) over (4) one two three and tap up (3) up (3) and then back over one two three and tap

The higher-order task of 32 beats has therefore been made conceptually salient by a combination of teaching point, directive, description, and counting. The class should now have a clear representation of the nature and duration of the task segment, based not just on the original four-beat floor pattern but on the larger structure in which it participates. Interestingly, keeping track of the beats that elapse between utterances has been as important as the utterances themselves in establishing this larger task.
We can view the whole process of establishing the 32-beat task as a single discourse segment giving rise to a set of salient entities, including the 32-beat task itself, which populate a single focus space. During subsequent activity, the instructor uses the expressions `again', `once more' and `once more round' to refer to the whole 32-beat structure. In addition to referring to conceptual entities that reside at a relatively high level, however, instructors refer locally to sub-segments within the 32-beat task structure. For example, `again' is sometimes used to refer to single repetitions of the basic `up tap down tap' floor pattern contained within the 32-beat structure. In order to account for this local reference, we can model the local tasks from which the 32-beat activity is made up, and their associated language, as subordinate discourse segments with focus spaces of their own. These spaces would be `pushed' on the stack as the tasks are undertaken. The global discourse purpose of the dominating segment can be described as `intend that the participants intend to perform the 32-beat activity'; the purposes of the local segments are to get the participants to perform the tasks that comprise the 32 beats. Within each focus space lies the description of the relevant task, consisting of a small number of salient variables which participants keep in mind: a sort of `working description' of the current task. For each task, a participant will have in mind a current description consisting of four components: the floor pattern, such as a basic step or a V-step, the number of repetitions or duration of the exercise, the relevant arm movement (a bicep curl, for example), and the relevant leg movement (e.g. hamstring curls, knee lifts). Modifications to each element of the activity are made without stopping, usually by replacing one element with another (for example, arms overhead may replace elbows in; two repetitions may replace four).
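One way to picture the working description and the stack of focus spaces is as a small data structure. The following Python sketch is an illustration under stated assumptions rather than an implementation of Grosz and Sidner's model: the names `WorkingDescription`, `FocusSpace`, and `FocusStack` are hypothetical, and the example values are drawn from the movements mentioned above.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class WorkingDescription:
    """The four salient variables of the current task."""
    floor_pattern: str
    repetitions: int
    arms: Optional[str] = None   # None means no arm movement
    legs: Optional[str] = None   # None means no leg movement


@dataclass(frozen=True)
class FocusSpace:
    """One discourse segment: a task label plus its working description."""
    task: str
    description: Optional[WorkingDescription] = None


class FocusStack:
    """Focus spaces are pushed as tasks begin; the top one is the local task."""

    def __init__(self) -> None:
        self._spaces: List[FocusSpace] = []

    def push(self, space: FocusSpace) -> None:
        self._spaces.append(space)

    def pop(self) -> FocusSpace:
        return self._spaces.pop()

    @property
    def top(self) -> FocusSpace:
        return self._spaces[-1]

    def update(self, **changes) -> WorkingDescription:
        """Replace the named fields in the topmost working description."""
        space = self.top
        if space.description is None:
            raise ValueError("topmost focus space has no working description")
        updated = replace(space.description, **changes)
        self._spaces[-1] = replace(space, description=updated)
        return updated


# Hypothetical walk-through: the dominating segment, then a local task.
stack = FocusStack()
stack.push(FocusSpace(task="32-beat activity"))
stack.push(FocusSpace(
    task="local floor pattern",
    description=WorkingDescription(
        "basic step", 4, arms="elbows in", legs="knee lifts"),
))
stack.update(arms="arms overhead")     # arms overhead replaces elbows in
current = stack.update(repetitions=2)  # two repetitions replace four
```

Each call to `update` replaces only the fields it names, so the floor pattern and leg movement carry over unchanged, which is the property that allows a brief utterance to be interpreted against the remaining values.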
The elliptical directive `side out no arms', to take a particular case, updates the working description of leg and arm movements: `side out' meaning that the current sequence of three knee lifts should be replaced on the next repetition with a sequence of three side-swings of the leg, while the current arm movement should be dropped completely. Note that there is no revision intended to the number of repetitions, the basic floor pattern, or the global structure of the task. The constant updating of salient variables provides a set of values against which ellipsis should be resolved. Given the basic framework set out above, elliptical utterances (both descriptive and directive) are matched against the `fields' in the open focal space. In practice, floor patterns are rarely revised in this way: movements are more usually added on to existing floor patterns. Repetition number is frequently reduced, however, and the most common alterations during major task segments are changes to arm and leg movements. All the values would be expected to change at major task segment boundaries, however, when a new floor pattern is defined, with new duration and different numbers of repetitions making up the global task. Ellipsis and anaphora do not therefore cross major task segment boundaries. There is a residual problem, however, in the resolution of ambiguous anaphoric directives such as `again'. How are participants to differentiate an `again' meaning a single floor pattern repetition from one meaning the whole major task segment? On Grosz and Sidner's model, the first referent would be chosen from the topmost focal space, which, on the application of their theory that I have been outlining, would represent the local task.
However, in some situations, this would result in an incorrect interpretation: a participant would choose the local interpretation when the global one was intended. The solution lies once more in the placement of utterances: where the relevant expression falls is the key to distinguishing segment-internal reference from whole-segment reference. Quite simply, a directive including `again' which appears during the final activity of the 32-beat section will be interpreted as referring to the entire section, while one appearing, say, just after the second repetition of one of the component floor patterns (in this case, between beats 4 and 8 of the segment) will be interpreted as referring to that more local activity. The local or global `granularity' of the reference is therefore disambiguated by utterance positioning. Although this section has done no more than sketch a framework for the resolution of elliptical, anaphoric, and ambiguous utterances, I hope to have shown the utility of the notion of discourse modelling as a set of entities that are currently in focus: in particular, a conceptualisation of the major task segment, and a set of properties describing the current task. Together, these resources account for the correct resolution of both local and global anaphora and ellipsis. It is interesting to note that beat placement in relation to position in the task helps to disambiguate anaphors that are ambiguous in terms of the level of the hierarchy they refer to. The resolution of semantically underspecified expressions is therefore also performed successfully as a result of task knowledge.

Conclusions and Further Research

Although the genre of the Step Aerobics monologue may seem an obscure corner of linguistic usage, it presents a relatively clear-cut case of the way in which language and action interrelate.
As I hope to have shown, neither language nor action can be taken in isolation in the analysis of how meaning is constructed among participants in the activity type. The way in which participants interpret language in the workout is based on specific knowledge about the task at hand, and expectations based on this about what kinds of contributions can be made towards it. As Levinson (1992) suggests, the kinds of inferences that are necessary for the interpretation of discourse draw on both the structural properties of the talk itself and the structural organization of the activities in which it arises (p. 75). The structural properties of the activity give rise to special expectations about what the talk can be, and special constraints on what count as `allowable contributions' from the class (in fact, there are almost no allowable verbal contributions, and none at all in the data analysed here) and from the instructor (the five kinds of allowable contributions I have outlined). Particularly interesting in this data is the way in which the timing of utterances, rather than their sequential organisation, determines their meaning. This is a particularly clear case of the application of pragmatic knowledge that is very specific to the domain of activity in interpreting the discourse, and shows how crucial it is to adopt an approach that respects the multimodal nature of the construction of meaning. In the discussion of ellipsis and anaphora, we have seen a specific way in which constraints on interpretation arise out of the structure of the activity. The workout has a clear hierarchical task structure, and the language that arises in the monologue accompanying it is undiverted by any unpredictable contributions of other discourse participants.
The discourse, then, embodies one of the clearest available relationships between task and language that make up an activity type. This makes the workout discourse a very useful testbed for theories that attempt to predict the linguistic and processing consequences of task and discourse segmentation. Resolution of anaphora and ellipsis is one phenomenon where discourse segmentation approaches such as that of Grosz and Sidner (1986) do predict the observed behaviour. Although there has been space here only to sketch an application of their theory to this data, it seems clear that a notion of focusing is a useful model of the way in which participants track discourse referents. In addition, however, the paper has looked in more detail at how discourse referents are constructed out of ephemeral sequences of actions, and what information might reside in a focus space such that ellipsis can be resolved correctly. There is clear scope for a more detailed study, tracing the mechanics of the focusing model more fully. While the paper has not attempted an exhaustive description of the workout as a genre, it does raise some interesting questions in relation to what features are ascribed to the language of workouts. In particular, we have observed that participants with differing levels of expertise may experience the workout differently: the interpretation of descriptions and directives is a case in point. We may wish, therefore, to open the question of how genres, registers, or discourse types are described. Features may well vary depending on from whose point of view the describing is done. While there has not been the space to enter into a full discussion of this point, the paper has shown something of the nature of expert knowledge in this domain: that is, in what expertise consists, and how the `ideal' (expert) participant is differentiated from the novice. Of course, as in any short study, much has been left undone.
One element that plays a clear role in utterance function signalling, for example, is intonation: the `tune' of utterances that play particular roles. Any study of this using computer pitch tracking is challenged by the presence of sung music accompanying the workout. However, the clear segmentation in the workout discourse could provide further useful cues as to the role of intonation in signalling discourse segmentation and topic structure. While participation in the workout is a social activity, and creating meaning jointly within it is a social act, there is much more to be said about the workout as a social phenomenon. In particular, more remains to be said about how participants are constructed as participants, in terms of the kinds of expectations of behaviour that are accepted by them, and how social relations with the instructor and with the rest of the class are maintained. A particularly useful source of data for this lies in the sometimes lengthy preambles to the workout, particularly those provided on the Reebok video: these preambles describe exactly what is expected of participants, and set the tone for the authority relations that will persist throughout. This research represents a small step towards an understanding of an interesting and as yet unstudied area of discourse research, and contributes to the body of knowledge on instructional discourse in general. In particular, though, it points to the relevance of an understanding of discourse as arising in action, and discourse meaning as dependent on a range of kinds of knowledge, some of which are very specific indeed.
While not every discourse is like the workout discourse in structure, content, or action, a useful generalisation to derive from this research would be the value of an approach that looks very broadly at how meanings are arrived at, and includes information from a wide range of available sources in the account.

References

Delin, J. (1998). Facework and instructor goals in the Step Aerobics workout. In S. Hunston (Ed.), Language at Work (pp. 52-71). Clevedon: British Association for Applied Linguistics in association with Multilingual Matters.
Grosz, B. (1977). The Representation and Use of Focus in Dialogue Understanding. PhD Thesis, University of California, Berkeley.
Grosz, B. & Sidner, C. (1986). Attention, Intentions, and the Structure of Discourse. Computational Linguistics, 12 (3), 175-204.
Levinson, S. (1992). Activity Types and Language. In P. Drew & J. Heritage (Eds.), Talk at Work (pp. 66-100). Cambridge: Cambridge University Press.
Reebok (1990). Step Reebok: The Manual. Reebok International Limited.
Reebok (1992). Step Reebok: The Video. Reebok International Limited.
Reichman, R. (1981). Plain Speaking: A Theory and Grammar of Spontaneous Discourse. (Report No. 4681). Cambridge, MA: Bolt, Beranek and Newman, Inc.
Searle, J. (1976). A classification of illocutionary acts. Language in Society, 5, 1-23.
Sidner, C. L. (1979). Toward a Computational Theory of Definite Anaphora Comprehension in English. (Technical Report AI-TR-537). Boston, MA: Massachusetts Institute of Technology.

Author Note

Judy L. Delin, Department of English Studies, University of Stirling, FK9 4LA, U.K. The author can be contacted via email at j.l.delin@stir.ac.uk. I would like to thank many people who have helped this research on its way.
Jenny Worsthorne patiently allowed me to record her classes and gave insights into workout planning. Patrick Allen provided useful suggestions for the analysis of beat patterns. John Bateman, Bethan Benwell, Jean Carletta, Robert Dale, Michael Gregory, Karen Sparck-Jones, Geoff Thompson, and members of seminar and conference audiences in Cambridge, Edinburgh, Halle, Stirling, and Sussex provided useful suggestions. Adam Bull helped me think about workout structure and gave me access to training materials, and Susana Murcia-Bielsa shared her work on directives. I am grateful to the anonymous reviewers for their helpful comments.

Footnotes

1 WAG is available by ftp from its originator, Mick O’Donnell, at the Department of Artificial Intelligence, University of Edinburgh, 80 South Bridge, Edinburgh EH1 1HN, or via the Web at http://www.dai.ed.ac.uk/daidb/staff/personal_pages/micko/wag.html
2 Descriptions are termed ‘Narrative’ moves in Delin 1998.
3 The activity is a 16-beat one, consisting of three repetitions of a four-beat activity plus a single different four-beat activity.