MARKING SALIENCE:

THE SIMILARITY OF TOPIC AND FOCUS1
JENNIFER E. ARNOLD
University of Pennsylvania
Past research on choice of reference form and other aspects of language processing has suggested
that referents differ in terms of their salience. But what makes referents salient? In this paper I
investigate how salience is influenced by topic and focus constructions. In Experiment 1, I show
that the referents of both the sentential topic and the focus of cleft-like constructions are more
salient than other referents. This result is consistent with past psycholinguistic research, but is
surprising in light of the traditional opposition between topic and focus in linguistics. A partial
explanation of this result is given with a corpus analysis, which shows that these constructions
share an additional characteristic: referents of subjects and foci are both more likely to be
referred to again in the following discourse than the other referents in their respective utterances.
In Experiments 2 and 3, I further investigate the role of the focus by comparing it with the
discourse topic. These experiments show that when a clearly defined discourse topic is present,
the focus is not as salient as when there is no clear discourse topic. This suggests that salience is
a competitive property, such that the salience of one referent is sensitive to the salience of other,
competing referents in the discourse. Implications for the nature of salience are discussed.
How do speakers decide when to use a pronoun to refer to something, and when to use a
fuller form, like a name or description? This question has been the subject of much research in
linguistics, psychology, philosophy, and computer science, in part because of its centrality to
understanding how human beings think, speak, and represent the world around them. Most
researchers explain the appropriateness of particular forms of reference by appealing to the
salience of the referent in the discourse model that is shared among discourse participants.2 For
the most part, pronouns are used when the speaker believes that the referent is already available
1
The results of Experiment 1 and the Corpus Analysis were presented in a paper at the 1997 meeting for the Linguistic
Society of America, under the title "What is Salience?: The role of topic and focus in processing reference".
I would like to thank Thomas Wasow for his guidance at every stage of this project. Thank you also to Amit Almor,
Mira Ariel, David Beaver, Eve Clark, Herbert Clark, Miriam Eckert, Adele Goldberg, Maryellen MacDonald, Ellen
Prince, Peter Sells, and John Trueswell for their comments on earlier drafts of this paper and/or discussions of these data.
Thank you to Carie Lemack for her help with designing and running Experiment 2. None of these people are responsible
for any shortcomings of this paper.
2
My use of the term "discourse participants" includes both the speaker and all listeners. I will often use the terms
"speaker" and "listener", assuming a prototypical situation of a dialogue. However, "speaker" is also meant to extend to
writers and other producers of language, and "listener" also represents readers and other comprehenders. In a dyadic
conversation, the listener is also the addressee. However, my data do not specifically distinguish between language
comprehension from the perspective of an addressee or an overhearer.
in the listener's consciousness (i.e., when the entity is "given") (Ariel, 1990; Chafe, 1976; Gundel
et al., 1993; among others). But there may be many things in the listener's consciousness, and the
speaker needs to know that the listener will choose the correct one. To account for this process,
past work on reference has suggested that the entities in the joint discourse model will differ in
terms of salience, prominence, activation, or accessibility. Although these terms are not
identical, for the purposes of this paper I will categorize them together, and use "salience" as a
cover term.3
But what makes things salient? In this paper, I will focus on how the salience of
discourse entities is influenced by two kinds of linguistic properties: "topic" and "focus". An
inspection of past research reveals an apparent discrepancy between linguistic analyses of
information structure and psycholinguistic studies on reference: while linguists have traditionally
treated "topic" and "focus" as opposing categories (see e.g. Gundel, 1988; Lambrecht, 1994;
Tomlin et al., 1997), different psycholinguistic experiments have revealed that both topic and
focus constructions can make their referents salient. The experiments in this study will
investigate this discrepancy, and propose an explanation for why certain topic and focus
constructions have similar effects on language processing.
In the first experiment, I will directly compare the grammatical subject of nonclefted
utterances (a topical position) with the focus of clefts (a focus construction). I will demonstrate
that both categories make their referents more salient, such that subsequent reference is more
natural as a pronoun. I will then use a corpus analysis to investigate the similarity between these
categories, and show that the referents of subjects and the focus of clefts share the property that
each is more likely than other entities to be mentioned in the following discourse.
If the foci of clefts are salient, what happens when they co-occur with a well-established
topic? Experiments 2 and 3 investigate this situation, and find that the salience of the focus
position is highly susceptible to competition from other discourse entities: if a competing
3
Later in the paper I will discuss salience more specifically in terms of the activation of referents in the discourse
participants' mental representations. This approach is detailed more fully in Arnold (1998).
discourse entity is highly salient, it appears to draw salience away from the focus. This finding
further suggests that salience is a dynamic, competitive phenomenon.
1. SALIENCE AND LANGUAGE PROCESSING. In this paper I am attempting to contribute to a
processing account of salience and reference form, which carries several important assumptions.
The major underlying issue at hand is how linguistic information leads some people, things,
or ideas to be more saliently represented in the mind of the listener. Language comprehension is
known to result in a non-linguistic mental model of the discourse (e.g., Johnson-Laird, 1983, 1989;
Miller, 1979a, 1979b; Morrow et al., 1987; van Dijk and Kintsch, 1983). It is also known that
language use depends on speakers and hearers knowing what is in “common ground” (e.g., Clark
1992, Part I), or the “joint discourse model”. However, it is not entirely known how the joint
discourse model is built.
I critically assume that speakers and listeners do not know a priori what is salient for others,
so they have to infer it. I am further assuming that people estimate the salience of an entity
probabilistically, in a graded way. For example, if Ann has good evidence that an object is salient
for Bob, then her representation of that object in the joint discourse model will be high on the
salience continuum. If Ann has only weak evidence of the object's salience, she will represent that
object with only middle-level salience. If there is no evidence for the object's salience, she will
represent it as not salient (or not represent it at all).4
So, what evidence do people use to infer the salience of an entity in the joint discourse model? I
will report on three studies that investigate how topic and focus constructions affect salience, and
reveal the competitive nature of salience.
4
I am assuming here that speakers refer only to the joint discourse model during production (and furthermore that they
rely on comprehenders to use the linguistic information successfully to build the same joint discourse model). However,
this simple assumption is likely to be only an approximation of actual discourse processes (e.g., Hanna et al., 1998;
Horton and Keysar, 1996; Arnold et al., 1999). Thus, it is likely that the model I am developing here will have to be
replaced by a more complex model, which accounts for the extent to which speakers use information from the joint
discourse model, and the extent to which other constraints influence their choices in reference form. However, the
studies in this paper investigate information that is available to all discourse participants, so these results would be
consistent with a more fully developed model.
2. SALIENCE AND REFERENCE FORM.
The tool I will use to measure salience is variation in
reference form. This method rests on an important assumption about salience and reference form.
Past research has demonstrated that the speaker's choices of forms of reference are greatly influenced
by the salience / accessibility of the referent (e.g., Ariel, 1990; Arnold, to appear; Du Bois, 1987;
Givón, 1983; Gundel et al., 1993). In addition, the listener's comprehension of these referring forms
is affected by the appropriateness of a linguistic form in a given situation (e.g., Almor, 1995; Gordon
et al., 1993; Hudson-D'Zmura and Tanenhaus, 1997). The tight association between salience and
reference form means that evidence about reference form can be taken as evidence about the
cognitive status of the referent. For example, if an experiment shows that pronouns tend to be used
more for the referents of grammatical subjects than for other referents, this can be taken as evidence
that the grammatical subject is associated with a more salient representation of that referent. This is
the logic employed by many studies on reference (e.g., Givón, 1983; Gordon et al., 1993), and which
I will also adopt for this paper.5
3. TOPIC AND FOCUS: SIMILAR OR DIFFERENT?
Topic and focus are informational categories
that have been used to explain various linguistic phenomena. Some studies about these phenomena
have looked at the syntactic constructions that can be used to encode topics or foci. Others have
asked what cognitive properties are associated with each category. In both of these cases, most work
has concentrated on how topical or focused referents relate to their status in the PRIOR discourse,
studying the conditions under which something can be put in a topic construction or focus position
(e.g., Prince, 1978, 1981, 1992; Ward and Birner, 1998), or the kinds of syntactic constructions used
for topics or foci (e.g., Aissen, 1992). In both cases, topics and foci are contrasted with each other
5
This assumption is reasonable, given that the factors affecting reference form can be categorized into two general
factors (Arnold, 1998): 1) those factors that affect the speaker’s assumption about the salience of referents in the joint
discourse model, and 2) the ambiguity of the situation. For example, in a context with only one inanimate referent, the
pronoun "it" will be sufficient to specify that referent, but in a context with several inanimate referents, a fuller form may
be needed. In this study I will hold the ambiguity of a situation constant, or cross-balance it against other factors, which
means that reference form preferences are a plausible indicator of the salience of a referent.
syntactically and informationally (for reviews see Gundel, 1988; Lambrecht, 1994; Tomlin et al.,
1997).
By contrast, I am interested in a complementary question: how do topic and focus
constructions influence the discourse status of entities in the FOLLOWING discourse? This
question is germane to determining the nature of salience.
Consider topics first. Theories of referential form often describe salience in terms of
topical attributes. There are several theories that describe the relationship between cognitive
accessibility and choices in referring forms, e.g. Givón's (1983) Topicality continuum, Gundel et
al.'s (1993) Givenness hierarchy, Ariel's (1990) Accessibility hierarchy, Prince's (1981)
Familiarity Scale, and Chafe's (1994) Activation states. All five theories encode similar
continua, but the same attributes that are termed “topical” by Givón are described by others in
terms of cognitive accessibility. The link between salience and topicality is also apparent where
topics are described in terms of high accessibility. For example, Gundel (1988) argues that an
entity has to have a certain level of "givenness" (or salience) to be felicitously used as a topic
(i.e., it has to be assumed to be familiar among the discourse participants).
Because "salience" and "topicality" are often described in terms of similar properties, one
might think that salience could be understood in terms of topicality. This conclusion would also
be supported by experimental research on the role of grammatical subjects. Sentential topics are
not always the same as the grammatical subject, but they have often been associated with this
role, especially when they are in sentence-initial position (Chafe, 1994; Lambrecht, 1994:131;
Levy, 1982, as cited in Ariel, 1990: 24; Reinhart, 1982). Research has demonstrated that
speakers and writers consistently use pronouns (or zero anaphora) more often than full names or
descriptions to refer to an entity that last appeared in subject position (Arnold, in press; Brennan,
1995; Stevenson et al., 1994; Sanford and Garrod, 1981:137-8).
In terms of comprehension, Gordon et al. (1993) and Hudson-D'Zmura and Tanenhaus
(1998) both found that the referents of grammatical subjects ("subject-referents") were more
salient than non-subject-referents: comprehension was facilitated if the next sentence used a
pronoun to refer to subject-referents, but it was inhibited if a repeated name was used. In
contrast, the "repeated-name penalty" was not observed for non-subject-referents. Both studies
concluded that their results reflected the higher salience of subjects over non-subjects. Since
grammatical subjects are considered topical in English, the association of subjects with salience
supports the connection between salience and topicality.
However, salience cannot be fully defined in terms of topicality. While the above studies
found repeated-name penalties for subject entities, Almor (in press) found a similar effect for the
entities denoted by the foci of clefts. In a self-paced reading experiment, he compared reading
times for reference to the focused entity with reading times for reference to the nonfocused entity.
For instance, in the following example, "the bird" is compared with "the fruit" under conditions
where each referent is focused.
(1) Sample stimuli from Almor (in press), Experiment 1
1a. It was the robin that ate the apple.
    The bird seemed very satisfied.        FASTER READING TIMES
1b. It was the robin that ate the apple.
    The fruit was already half rotten.
2a. What the robin ate was the apple.
    The bird seemed very satisfied.
2b. What the robin ate was the apple.
    The fruit was already half rotten.     FASTER READING TIMES
Almor's results showed that subjects read the underlined anaphor in the second sentence
faster if it referred to the focused referent than if it referred to the nonfocused referent. By
contrast, when repeated NPs were used (e.g., "the bird" referring to an entity last mentioned as
"the bird"), he found the opposite effect: a repeated-NP penalty for focused entities but not
unfocused ones. Under the assumption that reading is facilitated for accessible information,
these results suggest that the focus of a cleft is more saliently represented than the nonfocus, a
conclusion which is also supported by Carpenter and Just's (1977) study.
In sum, past research suggests that both subjects and the focus of clefts make their
referents more salient than other elements in the utterance. Since the subject is a topical position
but the focus of clefts is a focus position, this observation contrasts with the traditional
opposition between topics and foci in terms of their pragmatic status in the prior discourse
context.
This paper will investigate the apparent similarity between topics and the foci of clefts.6
The evidence for the subject/focus similarity, presented above, comes from different studies, with
slightly different methodologies. Because of this, and because topics and foci have so often been
contrasted with each other, it is necessary to confirm the similar roles that subjects and the foci of
clefts play with respect to salience. Experiment 1 addresses this question.
4. EXPERIMENT 1: SUBJECT AND FOCUS.
The goal of the first experiment was to test the
hypothesis that the topic and the focus of clefts are more saliently represented than other referents, by
explicitly comparing these two categories. More specifically, I wanted to test the hypothesis that the
supposed salient representation of topics and foci results in an increased preference for less-specified
forms of reference.
However, any study of categories like "topic" and "focus" is only beneficial to the extent
that these categories are well defined. For this reason, I will use the grammatical subject in this
study as an operationalization of sentential topic, following the studies discussed above. In the
discussion of this experiment, I will avoid any confusion about the category I am discussing by
using the term "grammatical subject". Similarly, the category "focus" is operationalized as the
focus of cleft-like constructions such as "the one she saw was Ann".7
6
An additional question concerns the extent to which the salience of the foci in Almor's study is generalizable to other
types of foci. Theories of information structure posit that all sentences contain a focus (e.g., Lambrecht, 1994: 206), even
if they aren't syntactically marked with a cleft. In this paper, I discuss salience with respect to only one type of focus: that
marked by clefts. It may be that other types of focus-marking phenomena, like prosody, might make referents salient in
the way clefts do. This question will be left to future research.
7
Although this is not strictly a wh-cleft, it has been referred to as a "cleft with a lexical head" (Prince 1978).
4.1 METHODS.
PROCEDURE. The method for this experiment was a rating questionnaire that elicited off-line
judgments about naturalness. Participants read three-sentence paragraphs and were asked to rate
them for naturalness on a 7-point scale. A sample stimulus set is shown in Figure 1.
1.  The guests were nervously standing around in the living room, trying to decide which
    person to talk to.
2a. Ann decided to say hi to Emily first.
2b. The one Ann decided to say hi to first was Emily.
3a. Emily/She looked like the friendliest person in the group.
3b. Ann/She hated to be in a room full of people where no-one was talking.
FIGURE 1: Sample stimuli for Experiment 1
The experiment followed a 2x2x2 design, investigating three factors: 1) focus
construction (did the second sentence include a focus construction ("the one...") or not?); 2) 3rd-sentence
continuation (was the subject of the third sentence the 1st or 2nd mentioned
character?); and 3) type of reference for 3rd-sentence subject (noun vs. pronoun). All items
appeared an equal number of times in each condition, but only once for a given participant. Half
the participants saw stimuli with "the one..."8 constructions, and half saw stimuli with no focus
construction.
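One way to satisfy the constraint that every item appears equally often in each condition, but only once per participant, is to rotate conditions across presentation lists. The sketch below is only an illustration: the Latin-square rotation, the helper name `build_list`, and the variable names are assumptions, since the text does not say how the lists were actually built. Focus construction is varied between participants, so only the two within-participant factors are crossed here.

```python
from itertools import product

# Within-participant factors named in the text. Focus construction is varied
# between participants, so it is handled by group assignment, not here.
CONTINUATIONS = ("1st mention", "2nd mention")
REFERENCE_FORMS = ("pronoun", "name")
CONDITIONS = list(product(CONTINUATIONS, REFERENCE_FORMS))  # 4 cells

N_ITEMS = 12  # number of experimental stimuli reported in the paper


def build_list(list_index, n_items=N_ITEMS):
    """Return {item: condition} for one presentation list (hypothetical helper).

    Rotating the condition by list_index is a Latin-square assumption of this
    sketch, not a procedure stated in the paper.
    """
    return {
        item: CONDITIONS[(item + list_index) % len(CONDITIONS)]
        for item in range(n_items)
    }


# One list per condition cell; each participant would receive exactly one list.
lists = [build_list(k) for k in range(len(CONDITIONS))]
```

With 12 items and 4 cells, each list contains 3 items per cell, and across the 4 lists every item occurs once in every cell.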
PARTICIPANTS.
144 subjects rated 12 stimuli and 10 fillers in exchange for one candy bar.
Time of participation was approximately 10 minutes. All subjects were native speakers of
American English. "Native speaker" was defined as having begun to learn English by age 5.
PREDICTIONS. Based on the findings of Gordon et al. (1993) and others, I expected that for the
non-focused stimuli, the pronominal versions would be rated higher when the referent was the
subject of the second sentence, but not when the referent was the non-subject. Based on the
findings of Almor (in press), I expected the opposite pattern for the focused stimuli, where I
expected the pronominal versions to be rated higher for the focus.
8
For variation, two of the stimuli used the construction "the person..." instead of "the one...".
RESULTS.
The ratings for each subject9 were first normalized to eliminate individual biases
for some subjects to rate the sentences higher overall than other subjects. The noise resulting
from this variation was reduced by taking advantage of the fact that all subjects saw the exact
same set of filler stimuli. Each individual's average score for all the filler stimuli was subtracted
from the average filler score for all the subjects, and this normalization score was then added to
each stimulus score for that individual. The normalized scores were then submitted to a three-way ANOVA. When a subject skipped an item for some reason, the average score for
that subject's stimuli items was substituted. The average ratings for each category are displayed
in Table 1.
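The normalization step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `normalize_ratings`, the dictionary layout, and the use of `None` for a skipped item are hypothetical, and the paper does not say whether skipped items were filled in before or after the offset was applied. Because the offset is additive, the two orders give the same result.

```python
from statistics import mean


def normalize_ratings(filler_scores, stimulus_scores):
    """Shift each participant's stimulus ratings by a filler-based offset.

    filler_scores:   {participant: [rating, ...]} for the shared filler items
    stimulus_scores: {participant: [rating or None, ...]}; None marks a skipped item
    Both argument names and the dict layout are assumptions of this sketch.
    """
    # Average filler rating per participant, then across participants.
    filler_means = {p: mean(scores) for p, scores in filler_scores.items()}
    grand_filler_mean = mean(filler_means.values())

    normalized = {}
    for p, scores in stimulus_scores.items():
        # Skipped items are replaced by the participant's own stimulus average.
        answered = [s for s in scores if s is not None]
        fill_value = mean(answered)
        filled = [fill_value if s is None else s for s in scores]

        # Offset = group filler average minus this participant's filler average.
        offset = grand_filler_mean - filler_means[p]
        normalized[p] = [s + offset for s in filled]
    return normalized
```

A participant who rates the fillers one point above the group average therefore has one point subtracted from every stimulus rating, and a participant who rates them one point below has one point added.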
The results showed that when the second sentence did not contain a focus construction,
higher ratings were given to stimuli with pronouns in the third sentence that referred to the 1st
mentioned entity (the topic), and to stimuli with names that referred to the 2nd mentioned entity.
In contrast, when the second sentence did contain an explicit focus construction the preferences
were reversed: item ratings showed that people preferred pronouns to refer to the 2nd mentioned
character, the focused referent, but they preferred names to refer to the nonfocused entity.
TABLE 1
Normalized average scores for each category in Experiment 1. The higher
rating in each pair is indicated by the inequality sign.

               Focus Construction: NO      Focus Construction: YES
               pronoun        name         pronoun        name
1st mention    5.14     >     5.00         4.40     <     4.68
2nd mention    5.19     <     5.35         4.80     >     4.71
9
One subject was excluded because he was not a native speaker of English, one subject was excluded because he was a
native speaker of South African English, not American English, 2 subjects were excluded because they rated all stimuli
items equally, and 4 random subjects were excluded in order to make the cells even. The total number of subjects
included in the analysis is 136.
The results of the ANOVA showed a main effect for focus construction (F1(1,134) = 5.469,
p < .03; F2(1,11) = 38.638, p < .001) and a main effect for 3rd-sentence continuation
(F1(1,134) = 8.996, p < .005; F2(1,11) = 4.736, p < .06). The crucial result, however, was the
three-way interaction between focus construction, pronominalization, and 3rd-sentence continuation,
which was significant by subjects and marginally significant by items (F1(1,134) = 6.690, p < .02;
F2(1,11) = 4.047, p < .07). No other factors were significant (p's > .1).
The results for this experiment showed that readers preferred stimuli that used pronouns
over stimuli that used names for referring to both the grammatical subject entity (a topic
construction) and the focus of "the one" constructions (a focus construction). By contrast, when
the third sentence contained a reference to the nonsubject or nonfocus, they preferred stimuli that
used names over stimuli that used pronouns. These results corroborated the findings of Almor
(in press) and Gordon et al. (1993). If these preferences for reference form are assumed to reflect
the mental representations of the referents, the results suggest that the referents of both subjects
and the focus of clefts are more salient than other arguments. This conclusion suggests that
despite the traditional opposition between topic and focus, the two categories may be more
similar to each other than previously thought.
At the same time, the results of this study raise the question of why it should be that these
two constructions share the role of designating their referents as more salient. It is this question
that the following corpus analysis aimed to answer.
5. CORPUS ANALYSIS: SUBJECT VS. FOCUS.
The results of the Subject/Focus questionnaire
suggested that the referents of both subjects and the focus of clefts are more saliently represented
than other referents in an utterance. What is it about these two constructions that gives rise to this
property? It cannot derive from topicality alone, and neither can it be explained in terms of focus
properties alone.
The hypothesis that I investigated with this corpus analysis was that both subjects and the
focus of clefts are good indicators that their referents will be mentioned again in the following
utterance. That is, both constructions may be pointers to the topic of the following utterance. In
a "normal", nonclefted utterance, the best bet for the topic of the following utterance is the topic
of the current utterance, since speakers usually talk about the same thing for extended periods of
time. On the other hand, a clefted utterance is a marked construction that the speaker may
employ to indicate that the topic will shift to the referent of the focus. This follows from Weil's
(1887/1844) proposal that the focus of one utterance is related to the topic of the next (as cited
in Tomlin et al., 1997). If this is the case, I expected that following nonclefted utterances, speakers
would refer to the referent of the subject more often than other elements in the utterance, but that
following clefted utterances, speakers would refer more often to the focus of the cleft.
To investigate this hypothesis I conducted a corpus analysis, using the Aligned-Hansard
Corpus from 1986. The Aligned-Hansard corpus is a collection of transcripts from the Canadian
parliament. It has two advantages. First, the discourse it represents is natural and
communicative. Second, it is a relatively formal genre of discourse, which is likely to raise the
occurrence of clefts, a relatively formal construction.10
Using the Aligned-Hansard corpus, I compared nonclefted utterances with object-clefted wh-clefts, following the method described below. My hypothesis, as mentioned above, was that
speakers would refer to the referents of subjects and focus of clefts more frequently than to other
referents in the preceding utterance.
5.1 METHOD. I analyzed the 1986 Aligned-Hansard corpus by extracting two types of utterances:
wh-clefts (also sometimes called pseudo-clefts) and nonclefted sentences. To find wh-clefts I
searched for "What" (with a capital W) that was not followed by a question-mark. I only considered
those utterances that clefted the object, because I was interested in the comparison between the
embedded subject and the focused NP. I also discarded cases where there was no following
utterance, or where the following utterance belonged to another speaker, yielding 146 total examples
10
The drawbacks of this corpus are that some speakers may have had prepared notes at their disposal, the transcripts are
doubtlessly edited to remove disfluencies, and some segments may be translated from French.
of wh-clefts. The comparison set of data was a random sample of non-clefted sentences, which I
assembled by opening each file from 1986, scrolling down three pages, and taking the main clause of
the utterance where my cursor landed. In this set I excluded questions, sentences with dummy
subjects or nonreferential subjects (like the generic "one"), and cases where there was no following
sentence by the same speaker. I also excluded utterances that did not have either a direct object or an
object-of-preposition, since I was interested in comparing objects with the foci of clefts.11 This
method resulted in 167 examples.
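The first step of the wh-cleft search, finding capitalized "What" in utterances that do not end in a question mark, might be approximated as below. This is a rough sketch and not the procedure actually used: the function name `candidate_wh_clefts` and the punctuation-based sentence splitter are assumptions, and the later steps (keeping only object clefts, requiring a following utterance by the same speaker) were manual judgments that the code does not attempt.

```python
import re

# Crude sentence splitter: a sentence ends at '.', '!' or '?' followed by
# whitespace. This is an assumption of the sketch; the paper does not say how
# utterances were delimited.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def candidate_wh_clefts(text):
    """Return sentences containing capitalized 'What' that are not questions."""
    candidates = []
    for sentence in SENTENCE_END.split(text):
        sentence = sentence.strip()
        # Case-sensitive match, so lowercase 'what' is ignored.
        if re.search(r"\bWhat\b", sentence) and not sentence.endswith("?"):
            candidates.append(sentence)
    return candidates
```

The splitter also breaks on abbreviations such as "Hon.", so every hit would still need to be checked by hand before being counted as a wh-cleft.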
For each sentence in my sample, S1, I identified the next "independent" utterance, S2,
which I defined as a clause that is finite, not a sentential complement of the matrix clause, and
not a relative clause. I then coded S2 for the first reference to an entity that had been referred to
in S1 (if any). The categories I found in this analysis are presented in Table 2.
11
Note that it is a conservative choice to exclude cases with no object or object of preposition, because many sentences
have more than one object or object of preposition. Therefore, if reference continuations were distributed randomly,
there would be a higher chance of the speaker referring to an object-referent than to the subject-referent. Therefore my
sample is slightly biased against my hypothesis that subsequent references will refer more to the subject-referent than
object-referents.
TABLE 2
Identity (in S1) of the Referent of the First Referring NP in S2

Nonclefted Sentences (S1)
1. Subject                    matrix clause subject
2. Part of Subject            part of matrix clause subject (e.g., "I" when previous subject was
                              "my Party")
3. Related to Subject         something related to the matrix clause subject, such as another
                              member of a set
4. Object/Obj of PP           matrix clause object or object of PP
5. Part of Object             part of matrix clause object
6. Whole                      the whole sentence, or an entire constituent within the sentence
                              (such as an entire PP, VP, or subordinate S).
9. None                       there is no referent in common with S2

Clefted Sentences (S1)
1. Embedded Subject           subject of embedded clause (e.g., What we need is...)
2. Related to Embedded Subj.  related to subject of embedded clause
3. Focused NP                 NP in the focus of the Wh-cleft
4. Part of Focused NP         part of the NP in the focus of the Wh-cleft
5. Related to Focused NP      related to the NP in the focus of the Wh-cleft
6. Embedded Object            object of embedded clause
7. Unclear
8. Whole                      the whole sentence, or an entire constituent within the sentence
                              (such as an entire PP, VP, or subordinate S).
9. None                       there is no referent in common with S2
As mentioned above, I looked at S2 to find the first NP that had the same referent as any of the
NPs in S1, and if so, which one. The hypothesis behind this methodology was that certain
categories (namely, subjects and focused NPs) may be indicators that their referents are likely to
be mentioned again in the following discourse. If more than one referent from S1 was mentioned
in S2, only the first one was analyzed. Once a referent was identified, I noted how it had been
referred to in the first utterance (S1) -- that is, what was the grammatical function of the
reference in S1. The following examples show how particular cases were coded.
TABLE 3
Examples from Corpus Analysis

first reference  example
subject          S1: The charter is giving individuals a new opportunity to seek redress when their
                     rights are infringed upon.
                 S2: It also imposes on all an obligation to be tolerant of the rights of others.
subject          S1: The hon . gentleman states that the Government has turned down options.
                 S2: He knows that is not correct . .
object of PP     S1: I have talked to other whites who have worked in the South African police
                     force
                 S2: and through them I have learned something about the enormously tragic
                     environment of the prison system in South Africa and the harshness of the
                     treatment to which many . . .
focus            S1: What the Government finally got was a deck hand whose name is Mr. Lander.
                 S2: He has been busying himself rearranging the deck chairs on the deck of the
                     Titanic.
focus            S1: What we have now is a breath of fresh air ,
                 S2: and the people of Canada welcome it .
The codings for each utterance were tabulated in terms of four categories: 1) Subject (or
embedded Subject), 2) Object / Cleft, 3) Other (references to other NPs or the whole utterance),
and 4) None. The first category (Subject / embedded Subject) included cases where the referent
was part of the subject or related to the subject, and the second category included cases where the
referent was part of or related to the object or cleft. Examples of this type of utterance are
provided in Table 4.
TABLE 4
Examples of "Part of" and "Related to" codings.
(1) Part of Subject
a. The unfortunate aspect of the dilemma facing our producers is that the problem is not
of their own doing. Our farmers are the innocent victims of the costly madness of
subsidies which has gripped the European Economic Community . . .
b. From time to time my Party speaks about the need to democratize these centres of
power and put them in the hands of ordinary Canadians. I suggest that those are some
constructive and positive means by which we could seek to do that.
(2) Part of Object
I would like to quote the words of the Hon. Member , the Liberal health critic. He
said: "Higher prices might be justified if it led to more drug research in Canada.
(3) Part of Cleft
What the Hon. Member has not answered with respect to Challenge '86, has not
refuted, is the allegation, the observation, that there has been a significant shifting of
funds from the non-profit service and municipal sector to private enterprise. I have no
problem with funding training opportunities in private enterprise.
(4) Related to Embedded Subject
What the Government is entering into now is a bilateral negotiation by appointing
trade envoys. The U.S. administration said to Canada that . . .
(5) Related to Cleft
What we are really talking about is a farm crisis. Just the other day the Committee on
Agriculture met with Dr. Lillian Walker and Dr. James Walker.
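The tabulation step described above can be sketched as a simple mapping from fine-grained codings onto the four reporting categories. This is only an illustration: the label strings, the `CATEGORY_OF` mapping, and the example codings are hypothetical and are not taken from the actual coding sheets.

```python
from collections import Counter

# Hypothetical fine-grained coding labels mapped onto the four reporting
# categories. "Part of" and "related to" codings are grouped with the
# argument they depend on, as described in the text.
CATEGORY_OF = {
    "subject": "subject",
    "embedded subject": "subject",
    "part of subject": "subject",
    "related to embedded subject": "subject",
    "object": "object/cleft",
    "focus of cleft": "object/cleft",
    "part of object": "object/cleft",
    "part of cleft": "object/cleft",
    "related to cleft": "object/cleft",
    "other NP": "other",
    "whole utterance": "other",
    "none": "none",
}


def tabulate(codings):
    """Count how many S2 continuations fall into each reporting category."""
    return Counter(CATEGORY_OF[code] for code in codings)


# Made-up codings for six utterance pairs (not data from the corpus).
example_codings = [
    "subject",
    "part of subject",
    "focus of cleft",
    "related to cleft",
    "whole utterance",
    "none",
]
counts = tabulate(example_codings)
print(counts)
```

Grouping the "part of" and "related to" codings at this stage keeps the later analysis to a single four-way classification per utterance pair.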
One might wonder about the rationale for categorizing cases where the speaker makes
reference to a part of an argument or something related to an argument. The idea behind this
choice is that this corpus analysis is measuring discourse patterns -- that is, how often do
speakers continue talking about the referents of subjects, and how often do speakers continue
talking about the referents of the focus of clefts? Whatever the speaker's choice for continuing a
discourse, the listener will interpret the following utterances in terms of what came before,
engaging in what Clark and Haviland (1977) call bridging inferences. If the speaker mentions an
entity that represents only a subpart of the referent of an NP in the previous sentence, or
something that relates to a referent in the first sentence, the listener will need to access the
representation of that referent in order to establish a connection. This idea is also reflected in
claims by Almor (in press) and Prince (1981:251), who suggest that reference to one member of
a set makes the other members of the set more accessible as well. Therefore, to understand how
discourses are frequently continued, it is necessary to include cases where an argument is the
topic of the discourse, even if it is not explicitly or fully mentioned.
5.2 RESULTS. The results showed that the most frequent type of continuation for nonclefted
utterances is with the referent of the subject, but for clefted utterances it is with the referent of the
focus.
The total number of continuations in each category is listed in Table (5). The results were
analyzed with a chi-squared analysis of distribution, which showed that these four categories
behaved differently with respect to nonclefts and clefts (χ²(3)=45, p<.001).
TABLE 5
The number in each category refers to the total number of cases where the first referent in
S2 co-referred with the NP in that position in S1. Data for the Subject and Object are for
nonclefted utterances, and data for the Embedded Subject and Focus are for the clefted
utterances.

Noncleft categories / Cleft categories
             subject /           object /
             embedded subject    focus of cleft    other    none
nonclefts    74                  46                25       22
clefts       25                  81                7        33
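As a rough check on the reported statistic, the chi-squared test of independence can be recomputed from the counts in the table. The sketch below uses the standard Pearson formula in plain Python; the function name `pearson_chi_squared` is an illustrative choice, and the result should land close to the reported value of 45 with 3 degrees of freedom.

```python
# Counts from the table: rows are nonclefts and clefts; columns are
# subject/embedded subject, object/focus of cleft, other, none.
observed = [
    [74, 46, 25, 22],
    [25, 81, 7, 33],
]


def pearson_chi_squared(table):
    """Return the Pearson chi-squared statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if construction type and category were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


chi2, dof = pearson_chi_squared(observed)
print(f"chi-squared({dof}) = {chi2:.1f}")
```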
The percentages of total references for Clefts and Nonclefts are presented graphically in Figure
(3). In this figure, as in Table (5), references to the subject of nonclefted utterances are grouped
with references to the embedded subject in clefted utterances, and the object of nonclefted
utterances is grouped with the focus of clefted utterances.
[Figure 3: bar chart of the percentage of total references by Grammatical Function of Referent
(subject/embedded subject, object/focus, other/whole, none). Nonclefts (n=167): 44%, 28%, 15%,
13%. Clefts (n=146): 17%, 55%, 5%, 23%.]
Figure 3. Percentage of total references in each category for both clefted and nonclefted
utterances.
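The percentages plotted in the figure follow directly from the continuation counts. A minimal sketch, assuming each count is divided by its row total and rounded to the nearest whole percent (the rounding rule is an assumption):

```python
# Continuation counts for each construction type, by category of the
# first referent in S2.
counts = {
    "nonclefts": {
        "subject/embedded subject": 74,
        "object/focus": 46,
        "other/whole": 25,
        "none": 22,
    },
    "clefts": {
        "subject/embedded subject": 25,
        "object/focus": 81,
        "other/whole": 7,
        "none": 33,
    },
}


def percentages(row):
    """Convert one row of counts into whole-number percentages of its total."""
    total = sum(row.values())
    return {category: round(100 * n / total) for category, n in row.items()}


for construction, row in counts.items():
    print(construction, sum(row.values()), percentages(row))
```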
5.3 DISCUSSION OF EXPERIMENT 1 AND CORPUS ANALYSIS.
The results of the experiment and
corpus analysis showed a remarkable parallel between subjects and the foci. In the Subject/Focus
study, readers gave higher ratings to stimuli that used pronouns to refer to the referents of both the
subject of nonclefted sentences and the focus of clefts. This pattern contrasted with higher ratings
for stimuli that used names to refer to the referents of objects and embedded subjects. This finding,
which is consistent with past results of on-line studies (Gordon et al., 1993; Almor, in press),
indicates that both subjects and the focus of clefts cause their referents to be represented more
saliently in the mind of the comprehender. The most striking result is that Experiment 1 found that
the focus-referent was similar in terms of salience to the referent of a prototypical topic position like
the subject. This shows that salience cannot be understood purely in terms of topicality.
The corpus analysis provides one potential reason for the similarity between subjects and
foci: both positions are associated with an increased tendency to appear in the following
discourse. That is, the probability of re-mention is higher for subject-referents than other
referents in non-clefted sentences, and the probability of re-mention is higher for focus-referents
than other referents in clefted sentences. Thus, there is a three-way connection between
preference for pronouns, probability of subsequent mention, and the positions of subject and
focus, as depicted in Figure 4.
[Figure 4: diagram connecting three elements: "Increased preference for pronouns over names",
"Subject-referents / Focus-referents", and "Increased probability of subsequent reference".]
Figure 4. The relation between pronoun use, probability of subsequent mention, and
grammatical positions, as shown by Experiment 1 and the Corpus Analysis.
The parallel findings of Experiment 1 and the corpus analysis support the following
hypothesis about why these two positions are salient. Recall that comprehenders need to identify
the topic of a sentence. Subjects are often topics, so they are probabilistically taken as topics.
The current topic is likely to be the topic of the next utterance, because speakers are usually
coherent, and topic continuation is more frequent overall than topic shift. By contrast, cleft
constructions signal a probable change in topic, to the focus of the cleft. The degree to which
something is expected to be the topic of the following utterance determines its degree of salience.
The more salient a referent is to the comprehender, the more likely it is that the speaker will use a
pronoun to refer to it.
This story about referent salience can also be described on a different level. The results
presented here show that subjects and the focus of clefts provide a similar resource for speakers,
in that speakers can use these positions to signal their interlocutors about the discourse status of
certain referents. Those referents that are placed in subject or focus position are relatively more
salient, and therefore are good candidates for pronouns that may occur in the subsequent
utterance. In this way, subjects and the focus of clefts are similar in that they provide similar
discourse functions.
This hypothesis makes sense if we think about what topichood might mean for language
comprehension. Topics are what the sentence is "about", in a way that is relevant to the current
context (Gundel, 1988; Lambrecht, 1994; Reinhart, 1982). Since speakers tend to stick with the
same topic for a while, the thing which a sentence is "about" is the thing which is likely to be
mentioned again in the following utterance, and is also the thing most salient to the speaker.
This means that if subject-referents tend to be topical, they should also receive multiple
reference. This prediction is borne out in the corpus analysis of nonclefted utterances. Focus-referents are also likely to be mentioned again, as shown by the corpus analysis of clefted
utterances, supporting the hypothesis that clefts signal a topic shift.
This account has an attractive property of being consistent with other work on language
processing, which has also shown that people are sensitive to the degree to which certain
information is likely to occur (e.g., Lieberman, 1963; Marslen-Wilson and Tyler, 1980), to the
extent of causing comprehension errors in certain ritualized contexts (Cushing, 1994).
5.4 FURTHER QUESTIONS.
The account I just presented explains salience in terms of the degree to
which a referent is a likely topic of the following utterance. One question this raises concerns the
relative weight of topics and the foci of clefts in marking a referent as a likely topic of the following
utterance. If topics and foci are similar in marking salience, as the preceding results suggest, then
what happens when an utterance contains a cleft-focus AND a well-established topic? Are both
referents equally salient, or do they compete with each other? If it is the latter, then which one "wins
out"? The following two experiments address this question.
6. EXPERIMENT 2: DISCOURSE-COMPLETION.
The stimuli in Experiment 1 were designed to
contain either a topic (the subject) or a focus (the focus of a cleft), but not both. In the condition
with a focus construction, the focus co-occurred with an embedded subject, which is not a highly
topical position. In the condition without a focus construction, the subject co-occurred with an
object or object of preposition, which are not specially marked focus positions. Therefore,
Experiment 1 did not bear on the question of which referents are more salient when the topic and
focus are competing against each other.
Experiment 2 was designed to pit topics and foci against each other. In order to compare
these categories, however, it is important to establish how they are operationalized. In the first
experiment, "topic" was defined as the grammatical subject, which is most naturally thought of as
a sentence-level topic. However, foci of clefts cannot co-occur with simple subjects of the type
investigated in Experiment 1. The entity expressed as the subject in a non-cleft, as in (2a), is
expressed as an embedded subject, as in (2b).
(2a) The burglar heard the neighbor's dog.
(b)  What the burglar heard was the neighbor's dog.
Thus, in order to directly compare the focus of a wh-cleft with a topic, the embedded subject
must be defined as the discourse topic. The discourse topic in the stimuli for this experiment was
established by introducing one character as the subject of the first sentence, and referring to this
character with a pronoun in the following two sentences. The referent of the discourse topic was
compared with a second character, which was introduced either as the focus of a cleft or as an
object / object of PP. The experiment aimed to discover which of these two referents would be
treated as more salient by the participants.
6.1. METHODS.
PROCEDURE. The experiment was conducted in the language laboratory at Stanford University.
Each participant sat in a cubicle outfitted with a tape recorder and headset with a microphone and
earphones. The stimuli were provided in written form on several sheets of paper, with several
spaces between each stimulus item in order to prevent subjects from reading the next stimulus
item while completing the current one.
Each stimulus item consisted of a three-sentence story, an example of which is given in
(3). The first sentence introduced one character and provided the context for the story. In the
second sentence, the same character was referred to again with a pronoun. The third sentence
consisted of either a clefted or nonclefted utterance. This sentence also introduced a new
character, as either the object of a nonclefted utterance or the focus of a cleft. The clefted
utterance was always an object cleft, such that both utterances contained two participants: a
subject/embedded subject and an object/ focus. All characters were introduced with proper
names.12
(3a) Ron was looking through his address book, trying to make up his mind. He had
     an extra ticket to the opera, but he didn't know which friend to invite. The one he
     decided on at last was Kysha / Fred.
(b)  Ron was looking through his address book, trying to make up his mind. He had
     an extra ticket to the opera, but he didn't know which friend to invite. At last he
     decided on Kysha / Fred.
Participants were asked to read each paragraph into the tape recorder. At the end of each
paragraph, their task was to add a natural continuation sentence to the paragraph. They were
asked to make the continuation as seamless and natural as possible. It was specified that the
continuation should be another sentence, and not an elaboration of the last sentence. However,
no strict definition of "a sentence" was provided, and some continuations had to be discarded
because they consisted of conjoined sentences (with "and", "but", or "because"), relative clauses,
or other elaborations of the stimulus sentence.
The experiment used a total of 24 items, which were rotated through the four conditions
and presented to subjects in four lists. These items were combined with 32 items belonging to
another experiment, such that each experiment served as a distracter for the other. The items
from the other experiment also had three sentences and used proper names, but followed a
different structure than the items from this experiment.
PARTICIPANTS.
16 native speakers of English from the Stanford community participated in
this and another experiment in exchange for $7. Native speakers were defined as having begun
to learn English by age 5. The approximate time to complete both experiments was forty-five
minutes.
12
The experiment also included an additional, orthogonal manipulation of ambiguity. Each stimulus appeared half the
time with two names that represented the same gender, and half the time with two names that represented different
genders. In all cases, names were chosen according to the author's estimation that they are commonly associated with
only one gender. In a few cases where the participants did not take the intended interpretation, their responses were
excluded. This variable was not the purpose of this study, and therefore will not be discussed in this paper.
6.2 RESULTS. The responses of each participant were recorded, and later transcribed.13 Each
response was coded according to the same system as the corpus analysis: the first time the
participant referred to one of the characters in the preceding utterance, that reference was coded
in terms of 1) who the referent was, and 2) how that reference was made, with a name or
pronoun. Thus, each sentence was given one of the following codings for who the referent was:
a) NP1 (subject or embedded subject), b) NP2 (object or focus), or c) other/ none. For those
cases that referred to either NP1 or NP2, a coding of either "pronoun" or "name" was also
recorded. Examples of participant completions are shown in (8).
(8)
STIMULUS: Leslie arrived at the party where she didn't know anyone. She
surveyed the guests in the living room, planning her first move.
The one she decided to say hi to was Bob/ Pam. / She decided to say hi to Bob/
Pam.
SAMPLE COMPLETIONS:
• Pam always stood next to the to the munchies so she thought it would be a good
place to go. (NP2)
• Leslie was very nervous about meeting people. (NP1)
• She walked over and sat down next to Bob and asked him how his day went.
(NP1)
• Bob was probably the only person that she recognized. (NP2)
• He was really good looking and kinda had looked over her looked over at her
with a smile. (NP2)
• Hi Pam, she said. (NP1)
• Bob and Leslie had grown up together in the back hills of the Appalachians.
(other)
The goal of this study was to find out when people would tend to use pronouns, and when
they would tend to use names. Based on the results of Experiment 1, I predicted that in a
nonclefted utterance, they would use pronouns more when referring to the subject, but in a
clefted utterance, they would use pronouns more when referring to the focus. The results, which
are presented in Table (5), did not support this prediction.
13
14 responses were excluded. Reasons for exclusion included continuing the last sentence rather than beginning a new
one, skipping the item, adding a nonsensical continuation, or interpreting the name of one of the characters with the
unintended gender. For example, the context of the sentence made it clear that one participant interpreted the name
"Lynn" as a male name (although it was intended as a female name), and one participant interpreted "Ned" as a female
name (although it was intended as a male name).
TABLE 5
Percentage of pronouns used, according to the position of the referent (NP1 vs. NP2)
and the type of utterance (cleft vs. noncleft). Only the initial references for each
completion are included.14

             NP1 referent    NP2 referent
Nonclefts    98% (n=40)      12% (n=142)
Clefts       97% (n=34)      12% (n=137)
The results demonstrated that the referent identity greatly influenced the form of reference
that participants used in their continuations. Pronouns were almost always used to refer to the
first-mentioned character, both when it was the subject and when it was the embedded subject.
In contrast, names were almost always used to refer to the second-mentioned character, both
when it was the object and when it was the focus. However, the type of utterance that preceded
the continuation did not affect the form of reference at all. These results were submitted to a
logistic regression analysis, using SPSS, which indicated that the referent identity (NP1 vs. NP2)
contributed significantly to the prediction of the dependent variable (p<.001), but the type of
utterance (cleft vs. noncleft) and the interaction (referent identity x utterance type) did not
(p's>.1).
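The size of the two effects can be illustrated on the log-odds scale, which is the scale a logistic regression works on. The sketch below is descriptive only: it uses the rounded percentages reported for initial references rather than the raw responses, so it is not a substitute for the regression itself, and the dictionary layout is an illustrative choice.

```python
import math

# Proportion of pronouns among initial references, as reported (rounded).
pronoun_rate = {
    ("noncleft", "NP1"): 0.98,
    ("noncleft", "NP2"): 0.12,
    ("cleft", "NP1"): 0.97,
    ("cleft", "NP2"): 0.12,
}


def logit(p):
    """Log-odds of a proportion."""
    return math.log(p / (1 - p))


# Effect of referent identity (NP1 vs. NP2) within each utterance type.
referent_effect = {
    utterance: logit(pronoun_rate[(utterance, "NP1")])
    - logit(pronoun_rate[(utterance, "NP2")])
    for utterance in ("noncleft", "cleft")
}

# Effect of utterance type (cleft vs. noncleft) within each referent.
utterance_effect = {
    referent: logit(pronoun_rate[("cleft", referent)])
    - logit(pronoun_rate[("noncleft", referent)])
    for referent in ("NP1", "NP2")
}

print("referent effect (log-odds):", referent_effect)
print("utterance effect (log-odds):", utterance_effect)
```

On this scale the difference between NP1 and NP2 is several log-odds units in both utterance types, while the difference between clefts and nonclefts stays well under one unit for either referent.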
6.3 DISCUSSION.
The results of Experiment 2 suggest that when the focus of a cleft co-occurs
with a strongly established discourse topic, the topic "wins out" as the more salient
referent of the two. This conclusion is supported by the fact that participants usually used names
(88%) to refer to the focus of the cleft, but they almost always used pronouns (97%) to refer to
the discourse topic.
14
An alternate method of coding includes all references to NP1 or NP2, where only the first reference was included if
the same character was mentioned more than once. This method sometimes resulted in two data points for a given
completion. The pattern of results with this method is similar, except that there is a lower percentage of parallel Subject-to-Subject references, which results in a lower percentage of pronouns referring to Subjects.
             NP1 referent    NP2 referent
Nonclefts    89% (n=83)      12% (n=163)
Clefts       84% (n=77)      11% (n=170)
Like the results reported in Table 5, only referent identity was selected as significant by the logistic regression analysis.
At the same time, it is somewhat surprising that these results showed absolutely no effect
of the focus construction, in that names were used equally often to refer to the focus in the
focus condition as to the object in the non-focus condition. The results of Experiment 1
indicated that cleft constructions make the focus more salient than other referents, but
Experiment 2 did not confirm these findings. So why did this difference occur?
One difference between the two experiments occurred in the stimulus design. The content
of the stimuli was identical, but the operationalization of "topic" was different in each case.
Experiment 1 compared the focus of a cleft with the grammatical subject, which represented the
sentence-level topic, while Experiment 2 compared it with a discourse-level topic,
operationalized by repeated mention of one character. It is possible that when a topic is
established through repeated reference to one character, other weaker constraints (such as the use
of a cleft construction) cannot influence the representation of new entities as much.
7. EXPERIMENT 3: TOPIC VS. FOCUS.
Experiment 3 was designed to investigate why
Experiment 1 found an effect for cleft constructions, but Experiment 2 did not. I hypothesized that
the difference in how the topic was operationalized in each experiment contributed to the difference
in results. In particular, I hypothesized that when one character was established as a discourse topic,
it became very highly salient, and far more predictable than other characters. This salience drew the
attention away from the other character, regardless of whether it was the focus or not. Following
research on syntactic processing (e.g., MacDonald et al., 1994), I hypothesized that discourse entities
would compete with each other for salience in the listener's representation. At a broader level, this
hypothesis is consistent with proposals that people have limited attentional resources (see Clark,
1996: 274-6), and cannot focus their attention on multiple referents at the same time. Thus,
increasing the salience of one character necessarily means decreasing it for another.
Experiment 3 investigated the salience of the focus when it co-occurred with a discourse
topic versus when it occurred with a sentence topic. The crucial question to be answered here is
whether the existence of a discourse topic can move the referent of the focus to a lower point on
the scale of salience. For that reason, this experiment was restricted to the comparison of
pronouns and names that referred to the focus of a cleft, under two conditions: with a discourse
topic and without one.
7.1 METHODS. This experiment used the rating questionnaire methodology of Experiment 1.
As in Experiment 1, participants read three-sentence paragraphs and were asked to rate
them for naturalness on a 7-point scale. A sample stimulus set is shown in Figure 5.
DISCOURSE TOPIC
Eleanor arrived at a party where no one knew anyone else. She sat in the living room, trying to
decide how to break the ice. The one she spoke to first was Emily. {Emily/She} looked like the
friendliest person in the room.
NO DISCOURSE TOPIC
Everybody arrived at a party where no one knew anyone else. They all sat in the living room,
trying to decide how to break the ice. The one Eleanor spoke to first was Emily. {Emily/She}
looked like the friendliest person in the room.
FIGURE 5: Sample stimuli for Experiment 3
Thus, the experiment investigated two factors: 1) whether the story contained a clear discourse
topic (yes vs. no); and 2) whether reference to the focus was made using a name or a pronoun.
The experiment used a total of 24 items. The stimuli were put into four lists, following
the same procedure as for Experiment 2. This experiment was also combined with 12 items
belonging to another experiment, plus 24 distracters. All distracters and additional stimuli had 3
sentences, but followed a different structure.
The hypothesis was that in the presence of a clear discourse topic, the non-topic character
(introduced as the focus of a cleft) would become relatively less salient, by virtue of the fact that
more attention would be allocated to the discourse topic, and less to the focus. Since I was only
measuring reference form preferences for the focus, I expected that the results would show
relatively higher scores for the pronoun condition when there was no discourse topic than when
there was a discourse topic.
PARTICIPANTS.
129 native speakers of English from the Stanford community participated
in this study.15 60 people were paid $3 plus one candy bar, and 2 people were paid $5 for their
participation in this rating questionnaire, which lasted approximately 15 minutes. 67 people
received course credit for their participation in this and another experiment, which totaled
approximately one hour.16
7.2 RESULTS. Following the same procedure as for Experiment 1, the ratings were normalized to
adjust for individual biases in rating scales. An inspection of the means in Table 6 shows that
when the stimuli contained a discourse topic, participants gave higher ratings for the versions
where the focus was referred to with a name. By contrast, when there was no discourse topic, the
preference for the name versions disappeared.
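The normalization procedure itself is described with Experiment 1 and is not repeated in this passage. Purely as an illustration of what adjusting for individual rating biases can look like, the sketch below re-centers each participant's ratings on the grand mean. This is an assumed procedure rather than necessarily the one used, and the participant labels and ratings are invented.

```python
from statistics import mean

# Invented raw ratings on a 7-point scale: P1 tends to rate high, P2 low.
raw_ratings = {
    "P1": [6, 7, 5, 7],
    "P2": [2, 3, 1, 3],
}


def recenter(ratings_by_participant):
    """Shift each participant's ratings so their mean equals the grand mean.

    Assumed normalization: it removes a participant's overall tendency to
    rate high or low while preserving differences between their own ratings.
    """
    all_ratings = [r for rs in ratings_by_participant.values() for r in rs]
    grand_mean = mean(all_ratings)
    return {
        participant: [r - mean(rs) + grand_mean for r in rs]
        for participant, rs in ratings_by_participant.items()
    }


normalized = recenter(raw_ratings)
print(normalized)
```

After re-centering, the two invented participants show the same pattern of relative preferences even though their raw ratings sit at opposite ends of the scale.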
These scores were submitted to two two-way analyses of variance, one with the subjects as a
random variable, and one with items as a random variable. The critical result from this experiment
was the two-way interaction between topicality and referring form (F1 (1, 111) = 5.101, p<.05; F2 (1,
23) = 7.451, p<.05). The ANOVA also revealed a main effect of topicality (F1 (1, 111) = 12.779,
p<.001; F2 (1, 23), p<.01), and a marginal main effect of referring form (F1 (1, 111) = 3.733, p<.06;
F2 (1, 23), p<.09).
TABLE 6
Normalized average scores for each category in Experiment 3.

                                                          pronoun         name
Reference to FOCUS in the Discourse Topic condition       4.52      <     4.73
Reference to FOCUS in the No Discourse Topic condition    4.43      =     4.42
15
The scores for 17 participants were excluded: five people had not begun learning English by age 5, one participant
skipped an entire page, and one provided answers that appeared too pattern-like (suggesting that the individual had not
actually paid attention to the stimuli). It was also necessary to exclude ten randomly selected participants to produce
even numbers of participants per cell. This left the scores of 112 participants in the analysis.
16
For these 67 participants, three of the filler items were slightly modified, so that they wouldn't create a conflict with
the other experiment. The modifications consisted of changing "ice cream truck" to "ice cream van", "When the sun went
over the hill" to "When it got dark", and "until the sun rose" to "until morning".
7.3 DISCUSSION.
The results of Experiment 3 demonstrated that changing the topicality of
the non-focused NP has a significant effect on the salience of the focus of a wh-cleft. When the
stimuli contained a discourse topic, names were preferred to refer to the new NP, which was
introduced as the focus of a cleft. When the stimuli did not contain a clear discourse topic, the
name preference disappeared, indicating that the focus was relatively more salient in the no-topic
condition than the topic condition.
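The interaction can be made concrete by computing the name preference (name rating minus pronoun rating) in each condition from the cell means reported for Experiment 3. This is only descriptive arithmetic on the published means, not a reanalysis; the variable names are illustrative.

```python
# Normalized mean ratings for reference to the focus in Experiment 3.
means = {
    ("discourse topic", "pronoun"): 4.52,
    ("discourse topic", "name"): 4.73,
    ("no discourse topic", "pronoun"): 4.43,
    ("no discourse topic", "name"): 4.42,
}


def name_preference(condition):
    """How much higher the name version was rated than the pronoun version."""
    return means[(condition, "name")] - means[(condition, "pronoun")]


# The interaction contrast is the change in name preference across conditions.
interaction = name_preference("discourse topic") - name_preference("no discourse topic")

for condition in ("discourse topic", "no discourse topic"):
    print(f"{condition}: name preference = {name_preference(condition):+.2f}")
print(f"interaction contrast = {interaction:+.2f}")
```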
The results of this experiment have two important implications. First, they explain the
different results of Experiments 1 and 2: In Experiment 1, where the focus-referent co-occurred
with the relatively non-salient referent of the embedded subject, the focus-referent appeared to be
highly salient. In Experiment 2, by contrast, the focus-referent co-occurred with a thrice-repeated
discourse topic, and appeared no more salient than the object of a non-cleft. Experiment 3
demonstrated that when a discourse topic has been established by repeating one referent, the
other referent becomes less salient, and we observe a shift towards a preference for names over
pronouns. This result suggests that the difference in the results of Experiments 1 and 2 reflected
a difference in the level of salience of the focused referent.
The second implication of these results is more generally that the degree of salience of
one referent can affect the degree of salience of the other. Recall that in Experiment 3, the
degree of topicality of the NON-FOCUSED character was manipulated, but this had an effect on the
degree of salience of the FOCUSED character. This result supports the idea that the language
comprehender has limited attentional resources, and cannot pay attention to all aspects of the
discourse equally. Thus, when one referent becomes more salient, other referents necessarily
lose salience.
This competitive view of salience is compatible with the view presented here, that
salience is a scalar property that can rise or fall. The results of Experiment 3 further support a
model in which salience is identified with the activation of mental representations, and where the
representations of different referents in a particular discourse compete for activation. This
approach is discussed further in section 8.1.
At the same time, this experiment produced a somewhat unexpected result, which is that
there was no pronoun preference for the focus of clefts, in either condition. In the topical subject
condition, there was a clear name preference for reference to the focus, but in the nontopical
subject condition, participants on average rated both pronoun and name conditions equally (4.43
and 4.42, respectively). There are several possible reasons for this. On one hand, there may be a
tendency for the first-mentioned NP to be fairly salient, regardless of whether it is the subject or
not (Gernsbacher and Hargreaves, 1988). If this is the case, it is still competing with the focus to
a certain degree, even when it is not clearly the discourse topic. A second possibility is that
participants may view other entities as the discourse topic, such as "Everybody" in Figure 5 above,
which would provide some competition for the focus. If this is the case, it may be that the focus
construction is a weaker indicator of salience than identity as discourse or sentence topics, and
more susceptible to competition from other entities. In contrast, the referent of the grammatical
subject may not be as susceptible to competition from the discourse topic. However, the data
from Experiment 3 do not bear on these speculations, since this experiment did not manipulate
whether the discourse contained a focus construction or not. For the same reason this result does
not contradict the finding from Experiment 1 that the focus construction makes the focus
relatively more salient than the grammatical object in comparable utterances.
8. GENERAL DISCUSSION.
The major findings from these experiments were that pronouns were
preferred over names to refer to the referents of the subject (a topic position), the focus of cleft-like
constructions, and the discourse topic. These results join a host of studies that show that
grammatical subjects make their referents more salient (e.g., Hudson-D'Zmura and Tanenhaus, 1998;
Gordon et al., 1993) and a handful of other studies that show that the focus is the most accessible
entity in constructions like clefts (e.g., Almor, in press; Carpenter and Just, 1977). The corpus
analysis revealed a second similarity between subjects and the foci of clefts: the entities denoted by
both are the things that are most likely to be mentioned in the following utterance. I argued that the
salience of subject-referents and focus-referents can be thought of in terms of the degree of
likelihood that the entity will be mentioned again.
Experiment 2 investigated the relative salience of topics and the foci of clefts, and found
that the salience associated with the focus is not a static feature. When there was a clearly
established discourse topic, it made no difference whether or not the other character was
introduced as the focus of a cleft, in that names were used predominately for both objects and the
focus of clefts. Thus, when the focus co-exists with a highly salient discourse topic, it is less
likely to be perceived as salient. The results of Experiment 3 suggested that the focus of a cleft is
more salient when it is not competing with a discourse topic (or a sentential topic, for that
matter): readers preferred names to refer to the focus of the cleft to a greater degree when the
discourse contained a well-established, highly salient topic than when it did not.
8.1 TOPICALITY, SALIENCE, AND LANGUAGE PROCESSING. In section 5.3 I presented an account in
which the degree of salience was linked to the degree to which comprehenders perceived an entity to
be likely to be the topic of the following utterance. Under this view, the current topic-referent is
salient because topics tend to persist over several utterances, and the focus-referent is salient because
clefts signal a change in topic. This view also fits well with the data from Experiments 2 and 3,
which suggest that discourse topics influence the comprehender's estimate of an entity's likelihood of
being the topic in the following utterance.
Experiments 2 and 3 also showed that the salience of the discourse topic is stronger than
that of the subject. This makes sense because, as noted by many authors, the correlation between
grammatical subjects and topics is not perfect. Therefore comprehenders can only make
probabilistic estimations of the topicality of a subject-referent. By contrast, if a referent is
mentioned repeatedly throughout a discourse, comprehenders have good evidence that it is highly
topical.17
Experiments 2 and 3 also revealed an important property of discourse salience: it is
competitive. If one referent is highly salient, as is the case when there is a discourse topic, other
referents become relatively less salient.
The account I have given so far (the "topicality account") is one where salience is viewed
as a graded property which is dependent on the probability that a referent will be the topic of the
following utterance, where this probability is sensitive to the competition between the existing
referents at any point in the discourse.
There is another way of saying the same thing, characterizing salience not in terms of
topicality, but rather in terms of probabilistic activation (the "activation account"):
Comprehenders compute the probability that a given entity will be mentioned in the following
utterance, on the basis of information like grammatical function, special constructions like clefts,
or the repetition of reference to one entity. (For a discussion of other relevant sources of
information, see Arnold, 1998.) This probability can be represented in terms of the activation of
the comprehender's mental representations for each referent. It is this probability of re-mention
that determines an entity's salience. When two sources of relevant information make two
different things likely to be mentioned again, the two referents compete.
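The activation account can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the cue weights, the baseline value, and the choice to normalize scores across referents are hypothetical and are not estimated from the experiments or the corpus analysis. The names `CUE_WEIGHTS`, `BASELINE`, and `remention_probabilities` are likewise invented for this sketch. It shows only the qualitative point that competition falls out of a relative measure: adding a cue to one referent lowers the share of the others.

```python
# Hypothetical cue weights: illustrative values only, not empirical estimates.
CUE_WEIGHTS = {
    "subject": 1.0,           # grammatical function
    "cleft_focus": 1.0,       # special construction
    "repeated_mention": 2.0,  # repeated reference (a discourse topic)
}
BASELINE = 0.5  # hypothetical score for any referent present in the discourse


def remention_probabilities(cues_by_referent):
    """Return a relative re-mention estimate for each referent.

    Scores are summed cue weights plus a baseline, then normalized so that
    they sum to 1. The normalization is a simplifying assumption that makes
    the competition between referents explicit.
    """
    scores = {
        referent: BASELINE + sum(CUE_WEIGHTS[cue] for cue in cues)
        for referent, cues in cues_by_referent.items()
    }
    total = sum(scores.values())
    return {referent: score / total for referent, score in scores.items()}


# No established discourse topic: subject and cleft focus are on a par.
no_topic = remention_probabilities({"Ann": ["subject"], "Emily": ["cleft_focus"]})

# Ann is also an established discourse topic: the focus's share drops.
with_topic = remention_probabilities(
    {"Ann": ["subject", "repeated_mention"], "Emily": ["cleft_focus"]}
)
```

Under these assumed weights, the estimate for the focus-referent falls from 0.5 to 0.3 once the other referent is also a discourse topic, which mirrors the direction of the competition effect in Experiments 2 and 3 but says nothing about its size.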
The topicality and activation accounts amount to the same thing, with subtle differences.
One difference is that the activation account does not assume that comprehenders identify topics
or foci in the current utterance, but only that they compute the likelihood of an entity being
mentioned again. However, these two things are highly correlated. Topical things are defined as
things that the sentence/discourse is "about", which means that coherent speakers should refer to
them repeatedly.
17 This idea is formalized in Centering Theory, where the backward-looking Center is similar to what I am calling the
"topic". Grammatical subjects are taken as the highest-ranked forward-looking center (Brennan, Friedman, and Pollard,
1987), but referents are not established as backward-looking centers until they have been mentioned in two consecutive
utterances.
Another difference is that the activation account does not assume, for example, that
comprehenders identify one referent as the expected topic for the next utterance. Rather, it only
says that some referents are RELATIVELY more likely to be mentioned again, and therefore are
relatively more salient than others.
An attractive property of activation is that it makes predictions about language
processing. If a referent is highly activated (i.e., salient), it is easier for the comprehender to
re-activate their representation of that referent when it is mentioned again. If re-activation is easier,
then comprehenders should not need as much information to achieve re-activation, and thus
less-specified forms of reference like pronouns should be sufficient.
In sum, the topicality account is useful for thinking of how this account relates to the
well-established categories of topic and focus, and to the more general perception that speakers
tend to be coherent. However, the activation account has attractive features for thinking about
language processing in a way which is consistent with much of the processing literature. An
advantage of the approach presented here, whether it is described in terms of topicality or
activation, is that it provides a way of thinking of "salience" not as a primitive, but as the result
of other discourse factors.
8.2. THE NATURE OF SALIENCE. The above interpretation of these results has additional
implications for the concept of salience. The view of salience presented here is more of a
"forward-looking" property than a "backward-looking" property, borrowing the terms from
Centering Theory. That is, it suggests that the speaker focuses on some referents as salient
because these referents are more central to what the speaker is saying. The listener takes cues
from the speaker as to which elements are salient for the speaker and therefore more likely to be
referred to in the upcoming discourse.
This view contrasts with past accounts of reference, many of which have focused on the
history of the discourse to determine the salience of any particular referent. For example, many
researchers have demonstrated an effect of recency, such that more recently mentioned items are
more salient than those that haven't been mentioned for a while (e.g., Birner, 1998; Clark and
Sengul, 1979; Givón, 1983; Ariel, 1988).18 Similarly, the heightened salience of subjects and the
focus of clefts could be seen as information about the history of the discourse.
However, two approaches to the study of reference have considered both "backward-looking"
and "forward-looking" information to determine the salience of an entity for the
purposes of choosing a referring form. For example, Givón (1983) and colleagues include three
measures of topicality: Recency, Ambiguity, and Persistence. Recency, as already mentioned, is
a purely backward-looking property. Persistence, on the other hand, measures the number of
clauses that contain subsequent reference to an entity, a clearly forward-looking characteristic. In
addition, Centering Theory explicitly posits both forward-looking and backward-looking
properties of anaphoric expressions (e.g., Grosz et al., 1995).
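The contrast between a backward-looking and a forward-looking measure can be made concrete with a small sketch. It follows the descriptions given above rather than Givón's (1983) exact coding procedure, and it assumes a simplified, hypothetical representation in which a discourse is a list of clauses and each clause is the set of referents it mentions. The function names are invented for illustration.

```python
def recency(clauses, index, referent):
    """Backward-looking: number of clauses back to the most recent earlier
    mention of the referent, or None if it has not been mentioned before."""
    for distance in range(1, index + 1):
        if referent in clauses[index - distance]:
            return distance
    return None


def persistence(clauses, index, referent):
    """Forward-looking: number of later clauses that mention the referent."""
    return sum(1 for clause in clauses[index + 1:] if referent in clause)


# A hypothetical four-clause discourse, each clause given as a set of referents.
clauses = [{"Ann", "Emily"}, {"Emily"}, {"Ann"}, {"Emily"}]

emily_recency = recency(clauses, 3, "Emily")          # looks back from clause 3
emily_persistence = persistence(clauses, 0, "Emily")  # looks ahead from clause 0
```

In this toy discourse, Emily's most recent earlier mention is two clauses before the final clause, and two of the three clauses after the first one mention her again.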
Under the view presented here, information about the past discourse is important, but the
reason that it is important is to provide information about the upcoming discourse. Information
about the upcoming discourse is relevant because it makes some parts of the discourse
predictable, and thus easier to process for the listener. As the corpus analysis demonstrated,
subjects and foci share the property that their referents are more predictable in the following
utterance. This similarity is especially striking, because in other respects these two categories
have been treated quite differently.
8.3. CONCLUSIONS. In sum, this study has tackled the slippery constructs of salience, topic, and
focus. The results point to a link between these concepts, where topic and focus constructions make
their referents more probable as topics for the following utterance, thus making them salient. This
can also be described in terms of activation, where the probability of a referent being mentioned
again determines its degree of activation. When a referent is highly activated or salient, less-specified
forms of reference are more natural for subsequent reference. This study uses the notion of
18 See Arnold, 1998, for a discussion of how recency can also be an indication of the importance of an entity to the
upcoming discourse.
probabilistic activation to ground the concept of salience in the mechanisms of language production
and comprehension.
REFERENCES
Aissen, J. 1992. Topic and focus in Mayan. Language 68.43-80.
Almor, A. In press. Definite Noun-Phrase Anaphora and Focus. Psychological Review.
Ariel, M. 1988. Referring and accessibility. Journal of Linguistics 24.65-87.
Ariel, M. 1990. Accessing Noun-Phrase Antecedents. London: Routledge.
Arnold, J. E. 1998. Reference Form and Discourse Patterns. Ph.D. Dissertation, Stanford
University.
Arnold, J. E., Trueswell, J. C. & Lawentmann, S. M. 1999. Using Common Ground to Resolve
Referential Ambiguity. Poster presented at the 40th annual meeting of the Psychonomic
Society, Los Angeles, CA.
Arnold, J. In press. Multiple Constraints on Reference Form: Null, Pronominal, and Full Reference
in Mapudungun. In J. W. Du Bois, L. E. Kumpf, and W. J. Ashby (Eds), Preferred Argument
Structure: Grammar as Architecture for Function. John Benjamins.
Birner, B. J. 1998. Recency Effects in English Inversion. Centering Theory in Discourse, ed. by M.
A. Walker, A. K. Joshi, & E. F. Prince, 309-323. Oxford: Clarendon Press.
Bower, G. H. & Morrow, D. G. 1990. Mental models in narrative comprehension. Science 247.44-48.
Brennan, S. E., Friedman, M. W. & Pollard, C. J. 1987. A Centering Approach to Pronouns.
Proceedings from the 25th Annual meeting of the Association for Computational Linguistics,
July 6-9, 1987.
Carpenter, P. A. & Just, M. A. 1977. Reading Comprehension as Eyes See It. Cognitive Processes in
Comprehension, ed. by M. A. Just & P. A. Carpenter, Hillsdale, NJ: Lawrence Erlbaum
Associates.
Chafe, W. 1994. Discourse, Consciousness, and Time. Chicago: Chicago University Press.
Clark, H. H. & Marshall, C. R. 1981. Definite reference and mutual knowledge. Elements of
Discourse Understanding, ed. by A. K. Joshi, B. L. Webber, & I. A. Sag, 10-63. Cambridge:
Cambridge University Press.
Clark, H. H. & Sengul, C. J. 1979. In search of referents for nouns and pronouns. Memory and
Cognition 7.35-41.
Clark, H. H. 1992. Arenas of Language Use. Chicago: University of Chicago Press.
Clark, H. H. 1996. Using Language. Cambridge: Cambridge University Press.
Cushing, Steven. 1994. Fatal words: communication clashes and aircraft crashes. Chicago:
University of Chicago Press.
DuBois, J. W. 1987. The Discourse Basis of Ergativity. Language 63(4).805-855.
Francik, E. P. 1985. Referential choice and focus of attention. Ph.D., Stanford University.
Gernsbacher, M. A. & Hargreaves, D. J. 1988. Accessing sentence participants: The advantage of
first mention. Journal of Memory and Language 27.699-717.
Givón, T. 1983. Topic Continuity in Discourse: An Introduction. Topic Continuity in Discourse: A
Quantitative Cross-Language Study, ed. by T. Givón, 1-42. Amsterdam: John Benjamins
Publishing.
Gordon, P. C., Grosz, B. J. & Gilliom, L. A. 1993. Pronouns, Names, and the Centering of Attention
in Discourse. Cognitive Science, 17.311-347.
Grice, H. P. 1975. Logic and conversation. Syntax and Semantics 3: Speech acts, ed. by P. Cole &
J. L. Morgan, New York: Academic Press.
Grosz, B. J., Joshi, A. K. & Weinstein, S. 1995. Centering: A Framework for Modelling the Local
Discourse. Computational Linguistics 21(2).
Gundel, Jeanette K. 1988. Universals of topic-comment structure. Studies in syntactic typology, ed.
by Michael Hammond, Edith A. Moravcsik, and Jessica R. Wirth. Amsterdam: John
Benjamins.
Gundel, J. K., Hedberg, N. & Zacharski, R. 1993. Cognitive Status and the form of referring
expressions. Language 69(2).274-307.
Hanna, J. E., Trueswell, J. C., Tanenhaus, M. K., and Novick, J. M. 1998. Consulting Common
Ground During Referential Interpretation. Paper presented at the CUNY conference on
human sentence processing.
Horton, W. S. & Keysar, B. 1996. When do speakers take into account common ground? Cognition
59.91-117.
Hudson-D’Zmura, S. & Tanenhaus, M. K. 1997. Assigning Antecedents to Ambiguous Pronouns:
The role of the center of attention as the default assignment. Centering in Discourse, ed. by
M. Walker, A. Joshi, & E. Prince, Oxford: Oxford University Press.
Johnson-Laird, P. 1983. Mental Models. Cambridge, MA: Harvard University Press.
Johnson-Laird, P. N. 1989. Mental Models. Foundations of Cognitive Science, ed. by M. Posner,
469-499. Cambridge, MA: MIT press.
Kameyama, M. 1996. Indefeasible Semantics and Defeasible Pragmatics. Quantifiers, Deduction,
and Context, ed. by M. Kanazawa, C. Piñón, & H. de Swart, 111-138. Stanford, CA: CSLI.
Lambrecht, Knud. 1994. Information structure and sentence form. Cambridge: Cambridge University
Press.
Levelt, W. J. M. 1989. Speaking. Cambridge: MIT Press.
Levy, E. 1982. Toward an objective definition of "Discourse Topic". Papers from the Eighteenth
Regional Meeting of the Chicago Linguistic Society, ed. by K. Tuite, R. Schneider, & R.
Chametzky, 259-304. Chicago: CLS.
Lieberman, P. 1963. Some effects of the semantic and grammatical context on the production and
perception of speech. Language and Speech 6.172-175.
MacDonald, M. C., Pearlmutter, N. J. & Seidenberg, M. S. 1994. The Lexical Nature of Syntactic
Ambiguity Resolution. Psychological Review 101(4).676-703.
Marslen-Wilson, W. D. & Tyler, L. K. 1980. The temporal structure of spoken language
understanding. Cognition 8(1).1-71.
Miller, G. 1979a. Construction and Selection in the Mental Representation of Text. Cahiers de
l'Institut de Linguistique de Louvain 5(1-2).185-197.
Miller, G. 1979b. Images and Models, Similes and Metaphors. In A. Ortony (ed.), Metaphor and
Thought. Cambridge: Cambridge University Press.
Morrow, D. G., Greenspan, S. L. & Bower, G. H. 1987. Accessibility and situation models in
narrative comprehension. Journal of Memory & Language 26(2).165-187.
Prince, E. 1978. A Comparison of Wh-Clefts and It-Clefts in Discourse. Language 54(4).883-906.
Prince, E. F. 1981. Topicalization, Focus-Movement, and Yiddish-Movement: A Pragmatic
Differentiation. Proceedings from the seventh annual meeting of the Berkeley Linguistics
Society.
Prince, E. F. 1992. The ZPG Letter: Subjects, Definiteness, and Information-Status. Discourse
Description: Diverse Linguistic Analyses of a Fund-Raising Text, ed. by W. C. Mann & S.
A. Thompson, 295-325. Amsterdam: John Benjamins.
Reinhart, T. 1982. Pragmatics and Linguistics: An analysis of Sentence Topics. Ms, distributed by
the Indiana University Linguistics Club.
Sanford, A. J. & Garrod, S. C. 1981. Understanding Written Language. New York: Wiley.
Stevenson, R. J., Crawley, R. A. & Kleinman, D. 1994. Thematic Roles, Focus and the
Representation of Events. Language and Cognitive Processes 9(4).473-592.
Tanenhaus, M. K. & Trueswell, J. C. 1995. Sentence Comprehension. The Handbook of Perception
and Cognition: Volume 11, ed. by J. Miller & P. Eimas, 217-262. Academic Press.
Tomlin, R. S., Forrest, L., Pu, M. M. & Kim, M. H. 1997. Knowledge Integration and Information
Management in Discourse. Discourse: A multidisciplinary introduction, ed. by T. van Dijk,
London: Sage.
Trueswell, J. C. & Tanenhaus, M. K. 1994. Towards a lexicalist framework of constraint-based
syntactic ambiguity resolution. Perspectives on Sentence Processing, ed. by J. C. Clifton, L.
Frazier, & K. Rayner, Hillsdale, NJ: Erlbaum.
van Dijk, T. A. & Kintsch, W. 1983. Strategies of discourse comprehension. New York: Academic
Press.
Ward, G. & Birner, B. J. 1995. Definiteness and the English existential. Language, 71.722-42.
Ward, G., & Birner, B. J. 1996. On the Discourse Function of Rightward Movement in English.
Conceptual Structure, Discourse and Language, ed. by A. Goldberg, 463-479. Stanford:
Center for the Study of Language and Information Publications.
Ward, G., & Birner, B. J. 1998. Information Status and Noncanonical Word Order in English.
Philadelphia: John Benjamins.
Weil, H. 1887 (1844). The order of words in the ancient languages compared with that of the modern
languages (C. Super, Trans.). (3rd ed.). Boston: Ginn & Co.
Appendix A: Stimuli for Experiment 1: Subject vs. Focus
1. The camp counselors took all the children and paired them up randomly for a game.
{Fortunately, Judy ended up with Sarah. / Fortunately, the one Judy ended up with was Sarah.}
{{Sarah/ She} was known as the best kid in camp at all sports and games. /
{Judy/ She} was glad to be with someone she knew.}
2. The faculty met the other day to discuss the candidates for the new job.
{Carol supported Eleanor. / The one Carol supported was Eleanor. }
{{Eleanor/ She} was the youngest candidate but the most impressive. /
{Carol/ She} tried earnestly to convince the other members of the faculty as well.}
3. By the time the project was turned in, everybody in the group hated everybody else.
{Stacy hated Diane the most. / The one Stacy hated the most was Diane.}
{{Diane/She} had done nothing but complain and boss other people around. /
{Stacy/ She} couldn't stand people who didn't pull their weight in a group project.}
4. The fencing coach told everyone in the class to find a partner.
{Tim asked Sean. / The one Tim asked was Sean.}
{{Sean/ He} was not only a good fencer but also a good teacher. /
{Tim/ He} always tried to fence with people who were better than he was.}
5. The coach told all the girls on the varsity team to help those on the JV team.
{Erica had to help Elaine. / The one Erica had to help was Elaine.}
{{Elaine/She} was hard to help because she wasn't very motivated. /
{Erica/She} hated having to spend time helping the younger players.}
6. The guests were nervously standing around the living room, trying to decide which person to talk to.
{Ann decided to say hi to Emily first. / The one Ann decided to say hi to first was Emily.}
{{Emily/She} looked like the friendliest person in the group. /
{Ann / She} hated to be in a room full of people where no-one was talking.}
7. The runners stood at the starting line, sizing each other up.
{Amy had to watch out for Cynthia. / The one Amy had to watch out for was Cynthia.}
{{Cynthia/ She} was known for tripping her opponents at the far end of the track. /
{Amy/She} was cautious of anyone who looked that confident.}
8. The Smith brothers were trying to decide which friend to invite skiing.
{Ron wanted to invite Fred. / The one Ron wanted to invite was Fred.}
{{Fred/He} had never been skiing before and would love the chance to try it. /
{Ron/He} felt guilty not being able to bring all his friends.}
9. The students found out that somebody had vandalized the classroom, but nobody knew who.
{John suspected Tom. / The one John suspected was Tom.}
{{Tom/He} was the most violent student in the class. /
{John/He} couldn't imagine anyone else doing such an awful thing.}
10. The students filed into the class on the first day of school and looked for seats.
{Alex sat next to Tony. / The person Alex sat next to was Tony.}
{{Tony/ He} was the new kid and looked like he needed a friend. /
{Alex/ He} was careful to sit next to someone he liked on the first day.}
11. The TAs for the class were trying to think of someone to invite as a guest lecturer.
{Immediately Jason suggested Andy. / The person Jason suggested immediately was Andy.}
{{Andy/ He} was a salient researcher as well as a good speaker. /
{Jason/ He} had been a TA before and knew who would be good.}
12. At the end of each year the high school basketball team votes for "Player of the Year".
{This year, Eric voted for Nathan. / This year, the one Eric voted for was Nathan.}
{{Eric/He} always voted for the best defense player rather than the best all-around player. /
{Nathan/He} was the best all-around player on the team.}