Language

Language
• Definition of language
• Ambiguities of language (what makes it
hard)
Language diversity
• There are thought to be 6,000-7,000
languages worldwide, many with several
dialects
– Languages: not mutually intelligible
– Dialects: are mutually intelligible, differ in
grammar & vocabulary (usually associated with
race, region, or social class)
– Accents: differences in pronunciation
Language diversity
• Languages are disappearing
• More than half are spoken by fewer than
10,000 people.
• Perhaps 90% will be gone within 100 years
• People abandon a language in order to assimilate and
to use languages of commerce.
Language universals
• Communicative (permits communication)
• Semanticity (symbols stand for something other than
themselves)
• Arbitrary (there is no inherent relation between a sound and
what it refers to)
• Structured (the pattern of symbols is not arbitrary)
• Generative (the basic units can be used to build a
limitless number of utterances)
• Dynamic (language is always evolving)
The problems
• How do we perceive speech sounds
(phonemes)?
• How do we perceive words?
• How do we perceive sentences?
• How do we perceive texts?
Phonemes (English)
Why is phoneme perception
hard?
• Phonemes are produced fast (50/sec)
• Different speakers produce them differently
http://classweb.gmu.edu/accent/
Please call Stella. Ask her to bring these things
with her from the store: Six spoons of fresh snow
peas, five thick slabs of blue cheese, and maybe a
snack for her brother Bob. We also need a small
plastic snake and a big toy frog for the kids. She
can scoop these things into three red bags, and
we will go meet her Wednesday at the train
station.
http://www.rhetorical.com/cgibin/demo.cgi
A single speaker produces them
differently, depending on the context of the
phoneme--this is coarticulation.
Coarticulation
“Vowel” vs “Vole”
You start to form the vowel (an “o” sound in “vole” and an
“aa” sound in “vowel”) before you start the buzzing noise
with your lips that produces the “v” sound.
Why is it hard to understand words?
Speech stream: no spaces between words.
Speech segmentation
Segmentation does sometimes go wrong, famously when trying to
understand song lyrics.
Misheard lyric | Actual lyric | Song and artist
Frighten her kazoo | Pride can hurt you too | Beatles, “She loves you”
Heated, heated | Beat it, beat it | Michael Jackson, “Beat it”
Should all the Quintons beef, or what? | Should old acquaintance be forgot | Traditional, “Auld Lang Syne”
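Why segmentation is hard can be sketched in a few lines. The toy function below takes a stream with the spaces removed and returns every way of carving it into words from a small vocabulary. The function name `segmentations` and the vocabulary are made up for illustration, and letters stand in for phonemes.

```python
def segmentations(stream, vocabulary):
    """Return every way to split `stream` into words from `vocabulary`."""
    if not stream:
        return [[]]  # exactly one way to segment nothing
    results = []
    for end in range(1, len(stream) + 1):
        word = stream[:end]
        if word in vocabulary:
            # Keep this word, then segment whatever is left of the stream.
            for rest in segmentations(stream[end:], vocabulary):
                results.append([word] + rest)
    return results


# Hypothetical mini-vocabulary chosen so that one stream has several readings.
vocabulary = {"no", "now", "here", "where", "nowhere"}

for reading in segmentations("nowhere", vocabulary):
    print(" ".join(reading))
```

Even with five words in the vocabulary, one short stream has three legal readings, so the listener needs more than the sound stream alone to pick the intended one.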
Why are sentences hard?
Obviously word order is crucial:
“Jayne kissed Jon”
“Jon kissed Jayne”
Even if the word order doesn’t change, more
than one meaning is possible.
“Time flies like an arrow”
What does this mean?
There are at least 5 meanings to this sentence.
“Time flies like an arrow”
1. Time moves quickly, as an arrow does.
2. Assess the pace of flies as you would assess the pace
of an arrow
3. Assess the pace of flies in the same way that an arrow
would assess the pace of flies.
4. Flies of a particular variety (time flies) adore arrows.
5. Assess the pace of flies, but only those flies that
resemble an arrow.
What makes understanding texts
hard?
A text is a collection of sentences forming a
paragraph or a collection of related
paragraphs.
Happy families are all alike; every unhappy family is unhappy in its own way.
Everything was in confusion in the Oblonskys' house. The wife had discovered
that the husband was carrying on an intrigue with a French girl, who had been a
governess in their family, and she had announced to her husband that she could
not go on living in the same house with him. This position of affairs had now
lasted three days, and not only the husband and wife themselves, but all the
members of their family and household, were painfully conscious of it. Every
person in the house felt that there was no sense in their living together, and that
the stray people brought together by chance in any inn had more in common
with one another than they, the members of the family and household of the
Oblonskys. The wife did not leave her own room, the husband had not been at
home for three days. The children ran wild all over the house; the English
governess quarreled with the housekeeper, and wrote to a friend asking her to
look out for a new situation for her; the man-cook had walked off the day
before just at dinner-time; the kitchen-maid, and the coachman had given
warning.
Anna Karenina, Ch. 1
Is Mrs. Oblonsky sad?
Is Mr. Oblonsky upset?
What time of year is it?
The fact is that you don’t know the answer to any
of these questions; you are ready to make
inferences confidently about the first two, and in
fact probably make inferences without realizing
it. You don’t make an inference about the third.
So these are the problems. . .
• Perception of phonemes
• Perception of words
• Perception of sentences
• Perception of texts
Perception of phonemes
Warren (1970)
The state governors met with their respective
leg*slatures convening in the capital city
It was found that the *eel was on the axle
It was found that the *eel was on the shoe
It was found that the *eel was on the orange
It was found that the *eel was on the table
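Here * marks a speech sound replaced by a cough. Listeners report hearing
the missing phoneme, and which one they hear depends on the context
(wheel, heel, peel, meal).
This is called the phoneme restoration effect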
Perception of Phonemes
McGurk effect
Vision indicates “ga”, soundtrack says “ba”
Most people hear “da” or “la”
http://www.media.uio.no/personer/arntm/McGurk_english.html
Perception of Phonemes
People don’t perceive slight differences in phonemes
[Figure: P(hearing “b”) plotted against voice onset time (axis marked 0, 40, 80). P runs from 1.0 (sounds like “ba”) to 0.0 (sounds like “pa”).]
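The shape of such an identification curve can be sketched with a steep sigmoid: nearly flat inside each category and changing sharply at the boundary. The boundary (25), the steepness (0.5), and the millisecond unit are illustrative assumptions, not values taken from the slide, and `p_hear_b` is a made-up name.

```python
import math


def p_hear_b(vot_ms, boundary_ms=25.0, steepness=0.5):
    """Toy identification function: probability of reporting "b" at a given voice onset time."""
    return 1.0 / (1.0 + math.exp(steepness * (vot_ms - boundary_ms)))


# Equal 10 ms steps in the signal give very unequal steps in perception.
for vot in (0, 10, 20, 30, 40, 80):
    print(f"VOT {vot:>2} ms -> P(hearing 'b') = {p_hear_b(vot):.3f}")
```

A 10 ms step inside a category (0 to 10) barely moves the probability, while the same step across the assumed boundary (20 to 30) flips the percept.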
Words--how perceived?
Most researchers think it’s a matching process between
input and the lexicon
Pronunciation: blæk
Spelling: black
Part of speech: adjective
Meaning pointer:  {this directs the system to
another location where the meaning is stored}
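A lexical entry like the one above can be written down as a small record, with lexical access as a matching step over the stored pronunciations. This is only a sketch: the class and function names are hypothetical, the “hat” entry is added for illustration, and exact matching is an assumption (the strictest possible version of the matching process).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    pronunciation: str
    spelling: str
    part_of_speech: str
    meaning_pointer: str  # key into a separate store where the meaning lives


LEXICON = [
    LexicalEntry("blæk", "black", "adjective", "meaning:black"),
    LexicalEntry("hæt", "hat", "noun", "meaning:hat"),  # illustrative extra entry
]


def lexical_access(heard):
    """Return the entry whose stored pronunciation exactly matches the input, or None."""
    for entry in LEXICON:
        if entry.pronunciation == heard:
            return entry
    return None


print(lexical_access("blæk"))  # finds the entry for "black"
print(lexical_access("hæk"))   # no exact match, so no access
```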
To test lexical access, you can do cross-modal priming.
Cross-modal priming
“At the turn of the century, it was
typical for gentlemen to wear
hats in the evening. . ..”
“At the turn of the century, it was
typical for gentlemen to wear
hacks in the evening. . ..”
Initial research indicated that the lexicon was
pretty picky about input: “hack” would not get
access to the lexicon.
Gaskell et al (1998) showed that mispronounced words do
get lexical access if they are mispronounced the way people
tend to mispronounce them.
Sentence type | Changed | Unchanged
Natural change | Pime bench | Pine bench
Unnatural change | Pime cupboard | Pine cupboard
[Figure: reaction time (y axis, 500 to 750) for changed and unchanged words, by type of change (natural, unnatural). FAST RTs indicate lexical access.]
The point: you get lexical access with
mispronunciations IF the mispronunciations are
the type that people make naturally.
Word perception--reading
[Diagram: Visual input → ? → Lexicon. A lexical entry holds: sound pattern, spelling, syntactic category, pointer to meaning.]
Word perception--reading
[Diagram: two routes from visual input to the lexicon. One goes straight from visual input to the lexicon; the other passes through letter-phoneme rules.]
Dyslexia evidence
[Diagram, shown once per pattern: visual input reaches the lexicon (sound pattern, spelling, syntactic category, pointer to meaning) directly or through letter-phoneme rules.]
Pattern 1 (what you would expect if only the letter-phoneme rules are working):
• Slint: okay
• Yacht: impaired
• Cake: okay
• Sale: might think it’s sail
Pattern 2 (what you would expect if only the direct route is working):
• Slint: impaired
• Yacht: okay
• Cake: okay
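One way to read the two patterns above is as a two-route reader with one route switched off. The sketch below is a toy under that assumption: the slide residue does not say which route is damaged in which pattern, the respellings (“yot”, “kayk”, “sayl”) are informal stand-ins for sound patterns, and the three letter-phoneme rules in `sound_out` are invented to cover just these example words.

```python
# Toy lexicon: spelling -> sound pattern (informal respellings, not real transcription).
LEXICON = {"yacht": "yot", "cake": "kayk", "sale": "sayl", "sail": "sayl"}


def sound_out(letters):
    """Apply a few invented letter-phoneme rules to a letter string."""
    # "Magic e": a final e is silent and lengthens a preceding a (cake, sale).
    if len(letters) >= 3 and letters.endswith("e") and letters[-3] == "a":
        letters = letters[:-3] + "ay" + letters[-2]
    return letters.replace("ai", "ay").replace("c", "k")


def read_aloud(word, direct_route=True, rule_route=True):
    """Return a sound pattern for a printed word, or None if no working route yields one."""
    if direct_route and word in LEXICON:
        return LEXICON[word]      # whole-word lookup by spelling
    if rule_route:
        return sound_out(word)    # assemble a pronunciation from the letters
    return None


def words_matching(sound):
    """Spellings in the lexicon that share this sound pattern."""
    return sorted(s for s, p in LEXICON.items() if p == sound)


for word in ("slint", "yacht", "cake", "sale"):
    print(word,
          "| rules only:", read_aloud(word, direct_route=False),
          "| direct only:", read_aloud(word, rule_route=False))
```

With only the rules, the nonword “slint” and the regular word “cake” come out fine, “yacht” is mispronounced, and “sale” yields a sound pattern shared with “sail”. With only the direct route, “yacht” and “cake” are fine but “slint” has no entry to look up.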
Sentence processing
To understand sentence processing, we need
to understand a little bit about grammar.
How are sentences parsed?
Grammar refers to a set of rules that describes the legal
sentences that can be constructed in a language.
Grammar is NOT what you find in a grammar book;
grammar refers to the set of rules people carry around in
their heads to produce sentences.
Word chain grammars-INCORRECT THEORY
Grammatical sentences are constructed word by word, by
selecting the next word in a sentence based on the associations
of the rest of the words in the sentence.
“The boy took his baseball bat and hit the _________”.
Probably “ball” but could be “window” or “umpire” or “squid”
Chomsky developed the famed sentence “Colorless green ideas
sleep furiously” to demonstrate that a sentence composed of
words that are very unlikely to follow one another can still be
grammatical.
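The word chain idea can be sketched as a table of which word has been seen after which. This is a deliberately crude stand-in (only the immediately preceding word counts, and a transition is either seen or unseen); the tiny corpus and the function names are made up for illustration.

```python
from collections import defaultdict


def build_chain(corpus):
    """Map each word to the set of words seen immediately after it."""
    followers = defaultdict(set)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current].add(nxt)
    return followers


def chain_accepts(followers, sentence):
    """Accept a sentence only if every adjacent word pair has been seen before."""
    words = sentence.lower().split()
    return all(nxt in followers.get(current, set())
               for current, nxt in zip(words, words[1:]))


corpus = [
    "the boy took his baseball bat and hit the ball",
    "the girl hit the umpire",
]
chain = build_chain(corpus)

print(chain_accepts(chain, "the boy took his baseball bat and hit the umpire"))
print(chain_accepts(chain, "colorless green ideas sleep furiously"))
```

The chain can recombine familiar transitions into a new sentence, but it rejects Chomsky's sentence outright because none of its word-to-word transitions is familiar, even though the sentence is grammatical.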
Word chain grammars
Perhaps just specify next part of speech, not specific
word.
“The boy took his baseball bat and hit the _________”
could be completed by a noun (ball) but the next word
could also be an adjective (smelly ball) or an adverb
(swiftly escaping boy).
Word chain grammars
The reasons that word chain grammars don’t work
are instructive.
1. language has dependencies, which can span many
words
2. dependencies can be embedded
Dependencies
Dependencies: e.g., verbs must agree, “either” implies “or”;
“at” implies a noun
“The little dogs, whose master was the nastiest, most foul-mouthed monster who had ever simultaneously threatened me
with litigation and tried to romance me, were quite loving to
me.”
Embeddedness
Dependencies can be embedded:
Take “Either Dan or Brian will go” and embed that clause
in another clause, forming “Either Dan or Brian will go,
or Karen and Jon will go.”
Because embedding opportunities are infinite, you’d need
an infinite word chain generator.
The solution--phrase structure
grammars
Phrase structure grammars use a hierarchical organization, not a linear one (as
word chains did).
A phrase structure grammar specifies a limited number of sentence parts and a
limited number of ways the parts can be combined.
Sentence = noun phrase + verb phrase
Verb phrase = verb + noun phrase
Noun phrase = noun
Noun phrase = adjective + noun
Noun phrase = article + noun
Verb = auxiliary + verb
Note that “noun phrase” appears as part of a sentence and as part of the
verb phrase. Word chain would have needed to duplicate that definition.
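The rules above are small enough to write down as data and check mechanically. The sketch below stores them in a dictionary and tests whether a sequence of part-of-speech tags forms a legal sentence. It adds one assumption the slide leaves implicit, that a bare verb counts as a Verb; the names `GRAMMAR`, `ends`, and `is_sentence` are made up for illustration.

```python
# Each category maps to its alternative expansions. Lowercase names are
# part-of-speech tags; capitalized names are defined by other rules.
GRAMMAR = {
    "Sentence": [["NounPhrase", "VerbPhrase"]],
    "VerbPhrase": [["Verb", "NounPhrase"]],
    "NounPhrase": [["noun"], ["adjective", "noun"], ["article", "noun"]],
    "Verb": [["verb"], ["auxiliary", "Verb"]],  # bare "verb" is an added assumption
}


def ends(symbol, tags, start):
    """Return every position where `symbol` could end if it begins at `start`."""
    if symbol not in GRAMMAR:  # a part-of-speech tag: must match the next tag
        return {start + 1} if start < len(tags) and tags[start] == symbol else set()
    found = set()
    for expansion in GRAMMAR[symbol]:
        positions = {start}
        for part in expansion:
            positions = {e for p in positions for e in ends(part, tags, p)}
        found |= positions
    return found


def is_sentence(tags):
    """True if the whole tag sequence can be derived from Sentence."""
    return len(tags) in ends("Sentence", tags, 0)


# "The spy saw the cop"
print(is_sentence(["article", "noun", "verb", "article", "noun"]))
# "Dogs will chase smelly balls"
print(is_sentence(["noun", "auxiliary", "verb", "adjective", "noun"]))
# "saw the the": not derivable from these rules
print(is_sentence(["verb", "article", "article"]))
```

Note that `NounPhrase` is defined once and reused in both the subject position and inside the verb phrase, which is the economy the slide points out.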
Phrase structure
How do we get embeddedness?
Embedding is accounted for because definitions can be
recursive, meaning a definition has that definition embedded in
it.
Sentence = noun phrase + verb phrase
Sentence = “Either” sentence “or” sentence
Sentence = sentence “and” sentence
Sentence = “if” sentence “then” sentence
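A recursive rule translates directly into a recursive function. The sketch below applies only the “Either” sentence “or” sentence rule, a chosen number of times; the function name `embed` and the filler sentences are made up for illustration.

```python
def embed(depth):
    """Apply the rule Sentence = "either" Sentence "or" Sentence `depth` times."""
    if depth == 0:
        return "Dan will go"  # a plain noun phrase + verb phrase sentence
    # The first slot is filled with a sentence built by the same rule.
    return f"either {embed(depth - 1)} or Karen will go"


for depth in range(3):
    print(embed(depth))
```

One finite rule yields a sentence for every depth, and each “either” still has its matching “or” no matter how many words come between them. The deeper outputs are legal by the rule even though they quickly become hard for a person to follow.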
Key question:
What cues does the parser use to decide which
phrase structures are which?
• key words
• word order
• principle of minimal attachment
Key words
“a” indicates that a noun phrase follows
“who,” “which” and “that” indicate a relative clause
Fodor and Garrett (1967)
The car that the man whom the dog bit drove crashed
The car the man the dog bit drove crashed
(harder to parse: the key words are missing)
Word order
Parser assumes that sentences will be active
(noun, then verb, then direct object)
Principle of minimal attachment
If a new word can be attached to an existing
node in a phrase structure, go with that
interpretation.
Minimal attachment
The spy saw the cop with binoculars but the cop didn’t see him.
The spy saw the cop with a revolver but the cop didn’t see him.
[Diagram: two phrase structure trees built from sentence, noun phrase, verb phrase, article, noun, verb, and prepositional phrase nodes. Left: “The spy saw the cop with binoculars.” Right: “The spy saw the cop with a revolver.”]
Note that in the sentence on the left, “binoculars” is part of the verb
phrase started by “saw” whereas in the sentence on the right,
“revolver” requires that a new node be generated to represent the
noun phrase. It takes longer to read the sentence on the right.
Phrase structures--ambiguity
Phrase structures can account for (some) ambiguities of language
Some sentences are ambiguous: “They are frying chickens”
(they are cooking chickens, or those are chickens meant for frying)
Phrase structure ambiguity
• Two cars were reported stolen by Groveton
police yesterday
• The license fee for altered dogs with a
certificate will be $3 and for pets owned by
senior citizens who have not been altered
the fee will be $1.50.
• For sale: Mixing bowl set designed to
please a cook with round bottom for
efficient beating
When do we assign roles?
• On-line, NOT by waiting until the end of
the sentence
• Another heuristic that normally--but not
always--works well
Garden path sentences
• The horse raced past the barn fell.
• The man who hunts ducks out on weekends.
• The cotton clothing is usually made of
grows in Mississippi.
• The raft floated down the river sank.
• The first words lead the listener down the
garden path to an incorrect analysis
Called garden path sentences because
the parser is assigning each word to a
phrase structure, but it later becomes
clear that one of the assignments must
have been wrong.
Pragmatics
• Language as it is really used
• Not the crisp, clean sentences we’ve been
discussing!
• Common ground is essential.
Haldeman: That the way to handle this now is for us to have Walters call Pat Gray
and just say, "Stay the hell out of this...this is ah, business here we don't want you to
go any further on it." That's not an unusual development,...
Nixon: Um huh.
Haldeman: ...and, uh, that would take care of it.
Nixon: What about Pat Gray, ah, you mean he doesn't want to?
Haldeman: Pat does want to. He doesn't know how to, and he doesn't have, he
doesn't have any basis for doing it. Given this, he will then have the basis. He'll call
Mark Felt in, and the two of them ...and Mark Felt wants to cooperate because...
Nixon: Yeah.
Haldeman: he's ambitious...
Nixon: Yeah.
Haldeman: Ah, he'll call him in and say, "We've got the signal from across the
river to, to put the hold on this." And that will fit rather well because the FBI agents
who are working the case, at this point, feel that's what it is. This is CIA.
- “Smoking gun” tape, 6-23-72
Common ground
• Woman: I’m leaving you.
• Man: Who is he?
Pragmatics
• Speakers should be informative, truthful,
relevant, clear, unambiguous, brief, and
orderly
• But they can violate these rules for a particular
purpose:
– Is Professor Willingham a good dancer?
– Well, he wears nice shoes.