CHAPTER 3
STRUCTURAL PROPERTIES OF SENTENCES
One reason why we can process speech so rapidly is our ability to systematically make
use of structure in natural language. What do we mean by structure in language? We can
define the structure of language in terms of sets of rules that tell us how words strung
together can form a sentence and convey a meaning. When we speak of rules that give
structure to language, we do not mean that a speaker consciously follows these rules when
uttering a sentence. As Levelt (1989) has said, “A speaker doesn’t have to ponder the issue of
whether to make the recipient of give an indirect object (as in John gave Mary the book) or an
oblique object (as in John gave the book to Mary).” Nor, Levelt goes on to suggest, does the
retrieval of common words require much time or conscious effort (p. 22). These are
“automatic” processes over which we exert little conscious control. Yet, for communication
to occur, the speaker and the listener must share a common knowledge base, and each must
have access to the same knowledge sets and rules.
Think for a moment of a simple “sentence” in the abstract, a sentence following the
noun-verb-noun form. Think now of the same “sentence” but in the form of an action: The
first noun verbed the second noun. Finally, let us instantiate this sentence with specific
words:
The student read the book.
The teacher graded the test.
The teacher heard the student.
Although all three sentences take the form of the noun verbed the noun, the first two
sentences are not reversible. That is, while you can say, “The student read the book,” or “The
teacher graded the test,” you cannot say, “The book read the student,” or “The test graded the
teacher.” Only the third sentence is reversible. You can just as easily say, “The student heard
the teacher,” as “The teacher heard the student.” Some actions are possible, and some are not.
Real-world knowledge can supply constraints that operate as part of the structure of our
language.
These properties of language give rise to regularities in the language that make possible
a degree of statistical prediction whenever we listen to natural speech. To illustrate, let us
begin with the fact that the average college-educated adult may have a speaking vocabulary
of 75,000 to 100,000 words (Oldfield, 1963). Suppose someone was about to say a word to
you, and you had to guess what the word might be. If all words in the language were equally
probable, the odds of it being any particular word would be between .00001 and .000013.
Now, clearly, each word is not equally probable. Some words tend to be used much more
frequently than others. In writing, the most frequently used word is the, and in spoken
telephone conversations, it is I. In fact, the 50 most commonly used words in English make
up about 60% of all the words we speak, and about 45% of those we write. We can put this
another way: On average, we speak only about 10 to 15 words before we repeat a word
(Miller, 1951).
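The odds cited above follow directly from the vocabulary estimates. A minimal sketch of the arithmetic (the uniform-probability assumption is, as in the text, purely hypothetical):

```python
# If every word in a speaking vocabulary were equally probable, the odds
# of guessing any particular next word are simply 1 / vocabulary size.
# The 75,000-100,000 range is the text's estimate (Oldfield, 1963).
vocab_small, vocab_large = 75_000, 100_000

odds_large_vocab = 1 / vocab_large   # .00001
odds_small_vocab = 1 / vocab_small   # ~.000013

print(f"{odds_large_vocab:.6f} to {odds_small_vocab:.6f}")
# prints 0.000010 to 0.000013
```

Of course, as the text immediately notes, real word probabilities are anything but uniform, which is why listeners can do far better than these odds.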
Thus, some words are more predictable than others, even out of context. When words
are heard within a context, the effect is even further increased. Imagine someone started to
speak to you, but then stopped suddenly in midsentence. If you were asked at that point what
you thought the next word might be, you might have a good idea. You could at least say what
part of speech the next word might be, whether it would probably be a noun, a verb, an
adjective, and so forth. Indeed, you would stand a good chance of correctly guessing the word
itself. If someone said, “The train pulled into the . . . ,” you might say “station,” or you might
say “tunnel.” From your knowledge of language, you would, at the very least, have a high
expectation for either a noun or an adjective.
Statistical Approximations to English
We can capture this predictive quality of natural language by giving people a few words
of a sentence and asking them to guess what they think the next word might be. We then
show this set of words to another person and ask him to guess the next word, and so on. In
this way one can see what people’s linguistic intuitions look like with varying amounts of
preceding context.
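The chaining procedure can be simulated by replacing the human guessers with a table of observed word continuations. The sketch below is illustrative only: the corpus and function names are invented for this example and are not taken from Moray and Taylor's materials.

```python
import random
from collections import defaultdict

def build_ngram_table(corpus_words, order):
    """Map each (order-1)-word context to the words that followed it."""
    table = defaultdict(list)
    k = order - 1
    for i in range(len(corpus_words) - k):
        context = tuple(corpus_words[i:i + k])
        table[context].append(corpus_words[i + k])
    return table

def approximate_english(corpus_words, order, length, seed=0):
    """Generate an order-n approximation: slide a window of n-1 words,
    'guessing' each next word from what followed that context before."""
    rng = random.Random(seed)
    table = build_ngram_table(corpus_words, order)
    context = list(rng.choice(list(table)))
    output = list(context)
    for _ in range(length - len(context)):
        followers = table.get(tuple(context))
        if not followers:            # dead end: restart from a new context
            context = list(rng.choice(list(table)))
            output.extend(context)
            continue
        word = rng.choice(followers)
        output.append(word)
        context = context[1:] + [word]
    return " ".join(output[:length])

corpus = ("i have a few little facts here to test your memory "
          "i have a few little things to say about memory").split()
print(approximate_english(corpus, order=3, length=10, seed=1))
```

With human subjects each "guess" reflects genuine linguistic intuition rather than corpus counts, but the structure of the procedure is the same: each new word is conditioned only on the preceding n-1 words.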
For example, Moray and Taylor (1960) showed subjects the five words, “I have a few
little,” and asked the subjects to guess what they thought the next word of this sentence might
be. One subject said, “facts.” Moray and Taylor added the word facts, covered the first word,
I, and showed “. . . have a few little facts ______” to another subject. This subject said,
“here.” A third subject saw the last set of five words, “. . . a few little facts here ______,” and
was asked to guess the sixth word. This process was continued until an entire 150-word
passage was constructed. This example is called a sixth-order approximation to English,
because each word was generated based on a context of five preceding words. Here is an
extract from Moray and Taylor’s sample. As you read it, it seems as if our artificial speaker is
continually on the verge of saying something meaningful, but never quite does:
I have a few little facts here to test lots of time for studying and praying for guidance
in living according to common ideas as illustrated by the painting.
A second-order approximation, where subjects guess the most likely word of a sentence
based on seeing only one word of context, would be somewhat less English-like:
The camera shop and boyhood friend from fish and screamed loudly men only when
seen again and then it was jumping in the tree.
You might ask what would happen if one created approximations to English after giving
subjects a specific context, such as telling them that the words are from a political campaign
speech, a romantic novel, or a legal document. The following is a fourth-order approximation
to English (each word is based on three words of prior context only) when respondents were
told the words were taken from a mystery novel:
When I killed her I stabbed Paul between his powerful jaws clamped tightly together.
Screaming loudly despite fatal consequences in the struggle for life began ebbing as
he coughed hollowly spitting blood from his ears (Attneave, 1959,
p. 19).
We have long known that increasing the likelihood of words by increasing contextual
constraints, either with sentences or with statistical approximations to English, will make the
words easier to remember (Miller & Selfridge, 1950), more audible under poor listening
conditions (Rubenstein & Pollack, 1963), and more recognizable if they are presented
visually for brief durations (Tulving & Gold, 1963; Morton, 1964).
Where Do People Pause When They Speak?
Clearly, listeners know a great deal about the structure of their native language. The speech
we hear also has an intonation pattern and rhythm to it that can give the listener hints about
what is about to be heard. One of these hints can come from the periodic appearance of
pauses in spontaneous speech, whether they are “filled” with uhms and ohs, or by silence.
They occur as the speaker thinks of what to say and how to phrase it.
Some estimates suggest that as much as 40 to 50 percent of speaking time is occupied by
pauses that occur as we select the words we wish to utter. What happens to these natural
pauses when we reduce the planning demands on the speaker? Reading aloud from a script
does reduce the proportion of pausing, but it may be impossible for a speaker to speak
sensibly without pausing at least 20% of the time (Butterworth, 1989, p. 128).
Systematic studies verify that the pauses in connected speech tend to occur just before
words of low probability in the context, the “thoughtful” words that do not represent a run of
association. They suggest that in fluent speech we do not pause to take a breath. Rather, we
take the opportunity to breathe during natural pauses determined by the linguistic content of
what we are saying (Goldman-Eisler, 1968). In short, although speech that departs from an
expected pattern will be harder to predict, the nature of the speech act itself can signal the
listener that such an event is upcoming.
The lesson to be drawn from this discussion is that sentence perception is a surprisingly
active process, even though it is ordinarily accomplished rapidly and without conscious
effort. Sentence processing represents a continual analysis of the incoming speech stream to
detect the structure and meaning of speakers’ utterances as they are being heard. In order to
discover how sentence processing takes place, we must understand how the listener
accomplishes syntactic and semantic processing. As we shall see, some theorists have
claimed that we conduct syntactic structure and semantic analysis independently, and others
have claimed that we ordinarily process them at the same time in an interactive fashion.
SYNTACTIC PROCESSING
Resolution Is Necessary for Comprehension
Although the statistical properties of language say something about the consequences of the
speaker’s and listener’s knowledge of language structure, they do not themselves explain this
structure. During the 1960s some researchers attempted to use transformational grammar to
fulfill this goal. These attempts made two important points relevant to our discussion: the
difference between surface structure and deep structure, and the difference between
competence and performance.
Surface Structure versus Deep Structure
The first point was a distinction between the surface structure and the deep structure of a
sentence. The surface structure of a sentence is represented by the words you actually hear
spoken or read: the specific words we have chosen to convey the meaning of what it is we
wish to say. The listener must “decode” this surface structure to discover the meaning that
underlies the utterance—the “deep structure” of the sentence.
Some sets of sentences have different surface structures, but the same deep
structure. An example would be the pair of sentences, The boy threw the ball, and The ball
was thrown by the boy. The specific words used—the surface structures—are obviously
different. The first sentence is a simple active declarative, and the second is a passive. In spite
of this difference, both sentences focus on the fact that a boy threw a ball. The two sentences
have different surface structures, but they convey the same meaning. They have the same
deep structure.
By contrast, some sentences can have the same surface structure, but different deep
structures. A well-known example is the sentence, Flying planes can be dangerous. This
sentence could mean that it is dangerous to be a pilot, or it could mean that living near an
airport can be dangerous.
The distinction between deep structure and surface structure makes an important point for our
understanding of sentence processing. It tells us that sentence processing is conducted in two
steps in which the listener analyzes the surface structure and uses this information to detect
the deep structure. The latter step conveys the meaning of the sentence that is the primary
goal of the communicator (Fodor, Bever, & Garrett, 1974).
Competence Versus Performance
The second point is that the way people produce language is not equivalent to their
knowledge of language. Much of what we say consists of incomplete fragments that do not
even approach a grammatical sentence (Goldman-Eisler, 1968). This does not mean that we
lack the knowledge to produce a complete sentence, or that we do not know the difference
between an ungrammatical fragment and a grammatical sentence. The specification of these
rules is critical to an understanding of language competence—what the speaker knows about
the structure of the language (Chomsky, 1957, 1965). A theory of performance requires an
explanation of how we can understand speech, however incomplete and fragmentary it may
be. A complete theory of sentence processing must thus take into account both competence
and performance.
SENTENCE PARSING AND SYNTACTIC AMBIGUITY
In the previous section we saw how comprehenders extract syntactic structure from
sentences in the form of clausal units. Comprehenders also extract syntactic structure while
they are processing clauses word by word. Models of sentence parsing address how the
syntactic functions of individual words determine the overall syntactic structure of clauses
and sentences. Researchers have found that the way listeners and readers handle ambiguities
can offer valuable insights into general processing principles in language comprehension.
Local Ambiguity Versus Standing Ambiguity
Syntactic ambiguity refers to cases where a clause or sentence may have more than one
interpretation given the potential grammatical functions of the individual words. The
occurrence of such ambiguities and the fact that language comprehension runs along
smoothly in spite of these ambiguities have long been of interest in psycholinguistics. There
are two types of syntactic ambiguity of interest. The first is referred to as local ambiguity, and
the second is referred to as standing ambiguity.
Local ambiguity refers to cases where the syntactic function of a word, or how to parse
a sentence, remains temporarily ambiguous until it is later clarified as we hear more of the
sentence (Frazier & Rayner, 1989). For example, consider the sentence, When Fred passes
the ball, it always gets to its target. This sentence is temporarily ambiguous when we hear the
noun phrase, the ball, because it could be completed in two different ways, corresponding to
two possible syntactic structures. For instance, another completion might be, When Fred
passes, the ball always gets to its target. The ambiguity is referred to as local ambiguity
because our uncertainty about the structure of the sentence is only temporary. When the
reader or listener has encountered the phrase it always gets or always gets, the ambiguity is
resolved. If we are forced to remain uncertain for too long (if the disambiguating information
doesn’t arrive right away), we will find a sentence increasingly hard to understand. The sentence, The rat the cat the dog chased bit ate the cheese, is difficult because we must hold too
many incomplete substructures before the sentence is finally complete and the full structure
can be seen.
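The memory load in the center-embedded example can be made concrete with a toy counter: each subject noun phrase opens a clause that stays incomplete until its verb arrives. The heuristic and word lists below are hand-built for this one sentence; this is an illustration of the memory-load idea, not a general parser.

```python
def clause_memory_trace(sentence, verbs):
    """Toy heuristic for center-embedded subject-relative sentences:
    every noun phrase before the first verb opens a clause that must be
    held in memory; every verb closes the innermost open clause."""
    depth, trace, verb_seen = 0, [], False
    for word in sentence.lower().rstrip(".").split():
        if word == "the":
            continue                    # skip determiners
        if word in verbs:
            verb_seen = True
            depth -= 1                  # verb completes the innermost clause
        elif not verb_seen:
            depth += 1                  # another subject left incomplete
        trace.append((word, depth))
    return trace

sentence = "The rat the cat the dog chased bit ate the cheese."
trace = clause_memory_trace(sentence, verbs={"chased", "bit", "ate"})
print(max(d for _, d in trace))  # prints 3: three clauses open at the deepest point
```

Three simultaneously incomplete clauses is exactly the burden that makes this sentence so hard, even though a single embedding (The rat the cat chased ate the cheese) is manageable.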
Abney and Johnson (1991) clearly summarize the complexities of memory requirements and parsing strategies in the resolution of local ambiguities. A parser could adopt
a wait-and-see attitude, holding off making a decision until more information is available.
This, however, would tax memory. On the other hand, a parsing strategy that keeps memory
load to a minimum would increase the risk of making many preliminary parsing errors at
points of local syntactic ambiguity. Some theorists, such as Frazier (1979), have emphasized
the need to minimize memory requirements; others, such as Marcus (1980), have emphasized
the need to avoid local ambiguities, hence putting a greater burden on memory.
Standing ambiguity refers to cases where sentences remain syntactically ambiguous
even when all of the lexical information has been received. For example, the sentence, The
old books and magazines were on the bench, remains ambiguous even when the sentence is
finished. That is, it is not clear whether there should be a boundary after books (the books
were old, but the magazines may not have been), or whether a boundary should follow
magazines (making it clear that both the books and the magazines were old). Similarly, the
sentence, I saw the man with the binoculars, does not make clear who has the binoculars.
Sentences such as these can only be disambiguated by the broader context in which they are
encountered.
MODELS OF SENTENCE PARSING
To understand how theorists have used ambiguous sentences to understand syntactic
parsing, consider the sentence, The old man the boats. If you had trouble understanding this
sentence, it is probably because you read The old man and assumed it was a noun phrase.
When you reached the end of the sentence (the boats) and found no verb, you knew that
either the sentence made no sense at all or your initial understanding of the sentence was
wrong. If you went back and realized that The old is the noun phrase and man is the verb
(meaning “to operate”), the sentence made sense. Sentences that, like this one, are especially
misleading when you first encounter them are called “garden path” sentences.
Let us think for a moment about how you might process a sentence such as The old
man the boats. Most people’s intuition is that we initially hear only one meaning of the
sentence as we are listening to it. Because of this, when we reach the end of the sentence and
discover we have done something wrong, we must go back and attempt to reparse the
sentence in a different way. Alternatively, your intuition might tell you that as we listen to
sentences that contain syntactic ambiguities we process both possible meanings, even though
we are only consciously aware of one of them. In this case, when we get to the end of the
sentence and discover our interpretation was wrong, we could solve the problem by switching
our attention to the alternative interpretation that has already been generated, albeit at the
unconscious level.
Interestingly, versions of each of these two possibilities have been proposed in the
psycholinguistics literature. A theory similar to the first possibility is sometimes referred to as
the garden path model of sentence processing. A theory similar to the second possibility is
sometimes referred to as the constraint satisfaction model.
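The difference between the two intuitions can be sketched with a toy ambiguity enumerator for The old man the boats. The lexicon and single-pattern grammar below are invented for illustration: a serial (garden-path) parser would commit early to the frequent DET ADJ N reading and be forced to backtrack, whereas a parallel (constraint-satisfaction) parser would carry all readings and simply keep the one that survives.

```python
from itertools import product

# Toy lexicon: each word's possible parts of speech (hand-built for this
# example; "old" can be a noun, and "man" can be a verb, "to operate").
LEXICON = {
    "the": ["DET"],
    "old": ["ADJ", "N"],
    "man": ["N", "V"],
    "boats": ["N"],
}

def all_readings(words):
    """Every part-of-speech assignment the lexicon allows."""
    return list(product(*(LEXICON[w] for w in words)))

def is_grammatical(tags):
    """Toy grammar with a single sentence pattern, NP V NP:
    DET N V DET N."""
    return list(tags) == ["DET", "N", "V", "DET", "N"]

words = "the old man the boats".split()
readings = all_readings(words)
surviving = [t for t in readings if is_grammatical(t)]
print(len(readings))   # 4 locally possible tag sequences
print(surviving)       # only DET N V DET N survives globally
```

The sketch shows why the sentence is a garden path: four tag sequences are locally possible word by word, but only one yields a grammatical whole, and it is not the one most readers try first.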
THE ROLE OF PROSODY IN SENTENCE PROCESSING
Prosody is a general term for the variety of acoustic features—what we hear—that ordinarily accompany a spoken sentence. One prosodic feature is the intonation pattern of a
sentence. Intonation refers to pitch changes over time, as when a speaker’s voice rises in
pitch at the end of a question or drops at the end of a sentence. A second prosodic feature is
word stress, which is, in fact, a complex subjective variable based on loudness, pitch, and
timing. Two final prosodic features are the pauses that sometimes occur at the ends of
sentences or major clauses and the lengthening of final vowels in words immediately prior to
a clause boundary (Cooper & Sorensen, 1981; Ferreira, 1993; Streeter, 1978).
Prosody plays numerous important roles in language processing. Prosody can indicate
the mood of a speaker (happy, angry, sad, sarcastic), it can mark the semantic focus of a
sentence (Jackendoff, 1972), and it can be used to disambiguate the meaning of an otherwise
ambiguous sentence, such as I saw a man in the park with a telescope (Beach, 1991; Ferreira,
Henderson, Anes, Weeks, & McFarlane, 1996; Wales & Toner, 1979).
A more subtle effect of prosody is the way it can be used to mark major clauses of a
sentence. Consider the sentence, In order to do well, he studied very hard. If you say this
sentence aloud, you will notice how clearly the clause boundary (indicated here by the
comma) is marked by intonation, stress, and timing. Note especially how speakers
automatically lengthen the final vowel in the word just prior to the clause boundary (in this
case, the word well). Although Garrett and his colleagues used an ingenious splicing
technique to eliminate prosodic cues, this had the effect of underestimating their importance
when such cues were present. When studies analogous to the click studies are conducted, but
with the formal clause boundary and the prosodic marking for a clause boundary placed in
direct conflict, clicks just as often migrate to the point marked by prosody as to the formal
syntactic boundary (Wingfield & Klein, 1971).
Probably the experiment that cast the most dramatic doubt on whether or not the click
studies were tapping on-line perceptual segmentation rather than reflecting a post-perceptual
response bias was a study conducted by Reber and Anderson (1970). They found results
parallel to the original click studies even when subjects were falsely told that the sentences
they would hear contained “subliminal” clicks and asked to say where they thought these
clicks had occurred. Although no clicks were actually presented, subjects more often reported
having heard them at clause boundaries than within clauses.
It is certainly the case that clauses are important to the way people remember speech. In
one series of experiments, subjects heard a tape-recorded passage that was stopped without
warning at various spots in the passage. The moment the tape stopped, subjects were asked to
recall as large a segment as possible of what had just been heard. Generally, subjects’ recall
was bounded by full clauses, just as one would expect if major linguistic clauses do have
structural integrity (Jarvella, 1970, 1971). The importance of clause boundaries and other
syntactic constituents can also be demonstrated by giving subjects tape-recorded passages
and telling them to interrupt the tape whenever they want to immediately recall what they
have just heard. In such cases, subjects reliably press the tape recorder pause button to give
their recall at periodic intervals corresponding exactly with the ends of major clauses and
other important syntactic boundaries (Wingfield & Butterworth, 1984).
We should not dismiss all elements of an autonomy principle out of hand. Indeed, we
will later review evidence for some degree of autonomous processing in the form of
activation of word meaning independent of the sentence context in which the word is
embedded. Few writers today, however, espouse the early version of syntactic autonomy that
implies that analysis at the semantic level must await completion of a full clause or sentence
boundary in the speech stream.
We do not want to suggest that clauses are unimportant units in sentence processing.
Rather, our question is whether both syntactic and semantic analyses occur together and
continuously interact as we hear a sentence. Let us examine the principles of an interactive
view of sentence processing before returning to the arguments for processing autonomy still
current in the literature.