Language

Language
• Definition of language
• Ambiguities of language (what makes it hard)

Language diversity
• There are thought to be 6,000-7,000 languages worldwide, many with several dialects
– Languages: not mutually intelligible
– Dialects: mutually intelligible, but differ in grammar & vocabulary (usually associated with race, region, or social class)
– Accents: differences in pronunciation

Language diversity
• Languages are disappearing
• More than half are spoken by fewer than 10,000 people
• Perhaps 90% will be gone within 100 years
• People drop a language to assimilate, and to use languages of commerce

Language universals
• Communicative (permits communication)
• Semanticity (symbols stand for something other than themselves)
• Arbitrary (no inherent relation between a sound and what it refers to)
• Structured (the pattern of symbols is not arbitrary)
• Generative (the basic units can be used to build a limitless number of utterances)
• Dynamic (language is always evolving)

The problems
• How do we perceive speech sounds (phonemes)?
• How do we perceive words?
• How do we perceive sentences?
• How do we perceive texts?

Phonemes (English)

Why is phoneme perception hard?
• Phonemes are produced fast (50/sec)
• Different speakers produce them differently
http://classweb.gmu.edu/accent/
Please call Stella. Ask her to bring these things with her from the store: Six spoons of fresh snow peas, five thick slabs of blue cheese, and maybe a snack for her brother Bob. We also need a small plastic snake and a big toy frog for the kids. She can scoop these things into three red bags, and we will go meet her Wednesday at the train station.
http://www.rhetorical.com/cgibin/demo.cgi
• A single speaker produces them differently, depending on the context of the phoneme--this is coarticulation

Coarticulation
“Vowel” vs “Vole”
You start to form the vowel (an o sound in “vole” and an aa sound in “vowel”) before you start the buzzing noise with your lips that produces the v sound.
Why is it hard to understand words?
Speech stream: there are no spaces between words, so the listener must do speech segmentation.
Segmentation does sometimes go wrong--famously when trying to understand song lyrics.

Misheard lyric | Actual lyric | Song and artist
Frighten her kazoo | Pride can hurt you too | Beatles, “She loves you”
Heated, heated | Beat it, beat it | Michael Jackson, “Beat it”
Should all the Quintons beef, or what? | Should old acquaintance be forgot | Traditional, “Auld Lang Syne”

Why are sentences hard?
Obviously word order is crucial:
“Jayne kissed Jon”
“Jon kissed Jayne”
Even if the word order doesn’t change, more than one meaning is possible.
“Time flies like an arrow”
What does this mean? There are at least 5 meanings to this sentence.

“Time flies like an arrow”
1. Time moves quickly, as an arrow does.
2. Assess the pace of flies as you would assess the pace of an arrow.
3. Assess the pace of flies in the same way that an arrow would assess the pace of flies.
4. A particular variety of flies (time flies) adore arrows.
5. Assess the pace of flies, but only those flies that resemble an arrow.

What makes understanding texts hard?
A text is a collection of sentences forming a paragraph or a collection of related paragraphs.

Happy families are all alike; every unhappy family is unhappy in its own way. Everything was in confusion in the Oblonskys' house. The wife had discovered that the husband was carrying on an intrigue with a French girl, who had been a governess in their family, and she had announced to her husband that she could not go on living in the same house with him. This position of affairs had now lasted three days, and not only the husband and wife themselves, but all the members of their family and household, were painfully conscious of it. Every person in the house felt that there was no sense in their living together, and that the stray people brought together by chance in any inn had more in common with one another than they, the members of the family and household of the Oblonskys.
The wife did not leave her own room, the husband had not been at home for three days. The children ran wild all over the house; the English governess quarreled with the housekeeper, and wrote to a friend asking her to look out for a new situation for her; the man-cook had walked off the day before just at dinner-time; the kitchen-maid, and the coachman had given warning.
Anna Karenina, Ch. 1

Is Mrs. Oblonsky sad?
Is Mr. Oblonsky upset?
What time of year is it?
The fact is that you don’t know the answer to any of these questions. You are ready to make inferences confidently about the first two, and in fact you probably make them without realizing it. You don’t make an inference about the third.

So these are the problems. . .
• Perception of phonemes
• Perception of words
• Perception of sentences
• Perception of texts

Perception of phonemes
Warren (1970)
The state governors met with their respective leg*slatures convening in the capital city
It was found that the *eel was on the axle
It was found that the *eel was on the shoe
It was found that the *eel was on the orange
It was found that the *eel was on the table
Listeners report hearing the missing sound (marked *), and what they hear depends on the later context: wheel, heel, peel, meal.
This is called the phoneme restoration effect.

Perception of Phonemes
McGurk effect
Vision indicates “ga”, soundtrack says “ba”
Most people hear “da” or “la”
http://www.media.uio.no/personer/arntm/McGurk_english.html

Perception of Phonemes
People don’t perceive slight differences in phonemes
[Figure: P(hearing “b”), from 1.0 (sounds like “ba”) to 0.0 (sounds like “pa”), plotted against voice onset time (0 to 80)]

Words--how perceived?
Most researchers think it’s a matching process between input and the lexicon
Pronunciation: blæk
Spelling: black
Part of speech: adjective
Meaning pointer: {this directs the system to another location where the meaning is stored}
To test lexical access, you can do cross-modal priming.

Cross-modal priming
“At the turn of the century, it was typical for gentlemen to wear hats in the evening. .
..”
“At the turn of the century, it was typical for gentlemen to wear hacks in the evening. . ..”
Initial research indicated that the lexicon was pretty picky about input: “hack” would not get access to the lexicon.
Gaskell et al (1998) showed that mispronounced words do get lexical access if they are mispronounced the way people tend to mispronounce them.

Type of change | Changed | Unchanged
Natural | Pime bench | Pine bench
Unnatural | Pime cupboard | Pine cupboard

[Figure: reaction time (500 to 750) for changed vs. unchanged words, by type of change (natural, unnatural). FAST RTs indicate lexical access.]
The point: you get lexical access with mispronunciations IF the mispronunciations are the type that people make naturally.

Word perception--reading
[Diagram: Visual input, then “?”, then Lexicon (sound pattern, spelling, syntactic category, pointer to meaning)]

Word perception--reading
[Diagram: visual input reaches the lexicon (sound pattern, spelling, syntactic category, pointer to meaning) either directly or by way of letter-phoneme rules]

Dyslexia evidence
[Diagram: the same visual input, lexicon, and letter-phoneme rules, shown for two patterns of impairment]
First pattern:
• Slint: okay
• Yacht: impaired
• Cake: okay
• Sale: might think it’s “sail”
Second pattern:
• Slint: impaired
• Yacht: okay
• Cake: okay

Sentence processing
To understand sentence processing, we need to understand a little bit about grammar.
How are sentences parsed?
Grammar refers to a set of rules that describes the legal sentences that can be constructed in a language.
Grammar is NOT what you find in a grammar book; grammar refers to the set of rules people carry around in their heads to produce sentences.

Word chain grammars--INCORRECT THEORY
Grammatical sentences are constructed word by word, by selecting the next word in a sentence based on its associations with the rest of the words in the sentence.
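As a rough illustration of the word-chain idea, the sketch below picks the next word purely from which words have followed the current word in a tiny made-up corpus. It is a simplification: it looks only at the immediately preceding word rather than at the rest of the sentence, and the corpus and the function names (`build_word_chain`, `most_likely_next`) are hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Toy corpus, invented for illustration only.
CORPUS = [
    "the boy hit the ball",
    "the girl hit the ball",
    "the umpire hit the window",
]


def build_word_chain(corpus):
    """Count, for each word, which words immediately follow it."""
    chain = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            chain[current][following] += 1
    return chain


def most_likely_next(chain, word):
    """Return the most frequent follower of `word`, or None if unseen.

    Ties are broken alphabetically so the result is deterministic.
    """
    followers = chain.get(word.lower())
    if not followers:
        return None
    return max(sorted(followers), key=followers.get)


chain = build_word_chain(CORPUS)
print(most_likely_next(chain, "the"))    # ball
print(most_likely_next(chain, "squid"))  # None (never seen in the corpus)
```

The chain records only local word-to-word associations; it stores nothing about the structure of the sentence as a whole.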
“The boy took his baseball bat and hit the _________”.
Probably “ball” but could be “window” or “umpire” or “squid”
Chomsky developed the famed sentence “Colorless green ideas sleep furiously” to demonstrate that a sentence composed of words that are very unlikely to follow one another can still be grammatical.

Word chain grammars
Perhaps just specify the next part of speech, not the specific word.
“The boy took his baseball bat and hit the ” could be completed by a noun (ball), but the next word could also be an adjective (smelly ball) or an adverb (swiftly escaping boy).

Word chain grammars
The reasons that word chain grammars don’t work are instructive.
1. Language has dependencies, which can span many words
2. Dependencies can be embedded

Dependencies
Dependencies: e.g., verbs must agree, “either” implies “or”; “at” implies a noun
“The little dogs, whose master was the nastiest, most foulmouthed monster who had ever simultaneously threatened me with litigation and tried to romance me, were quite loving to me.”

Embeddedness
Dependencies can be embedded: take “Either Dan or Brian will go” and embed that clause in another clause, forming “Either Dan or Brian will go, or Karen and Jon will go.”
Because embedding opportunities are infinite, you’d need an infinite word chain generator.

The solution--phrase structure grammars
Phrase structure grammars use a hierarchical organization, not a linear one (as word chains did).
A phrase structure grammar specifies a limited number of sentence parts and a limited number of ways the parts can be combined.
Sentence = noun phrase + verb phrase
Verb phrase = verb + noun phrase
Noun phrase = noun
Noun phrase = adjective + noun
Noun phrase = article + noun
Verb = auxiliary + verb
Note that “noun phrase” appears as part of a sentence and as part of the verb phrase. A word chain would have needed to duplicate that definition.

Phrase structure
How do we get embeddedness?
Embedding is accounted for because definitions can be recursive, meaning a definition has that definition embedded in it.
Sentence = noun phrase + verb phrase
Sentence = “Either” sentence “or” sentence
Sentence = sentence “and” sentence
Sentence = “if” sentence “then” sentence

Key question: What cues does the parser use to decide which phrase structures are which?
• key words
• word order
• principle of minimal attachment

Key words
“a” indicates that a noun phrase follows
“who,” “which” and “that” indicate a relative clause
Fodor and Garrett (1967)
The car that the man whom the dog bit drove crashed
The car the man the dog bit drove crashed

Word order
Parser assumes that sentences will be active (noun, then verb, then direct object)

Principle of minimal attachment
If a new word can be attached to an existing node in a phrase structure, go with that interpretation.

Minimal attachment
The spy saw the cop with binoculars but the cop didn’t see him.
The spy saw the cop with a revolver but the cop didn’t see him.
[Figure: phrase structure trees for “The spy saw the cop with binoculars” (left) and “The spy saw the cop with a revolver” (right)]
Note that in the sentence on the left, “binoculars” is part of the verb phrase started by “saw” whereas in the sentence on the right, “revolver” requires that a new node be generated to represent the noun phrase. It takes longer to read the sentence on the right.

Phrase structures--ambiguity
Phrase structures can account for (some) ambiguities of language
Some sentences are ambiguous: “They are frying chickens”

Phrase structure ambiguity
• Two cars were reported stolen by Groveton police yesterday
• The license fee for altered dogs with a certificate will be $3 and for pets owned by senior citizens who have not been altered the fee will be $1.50.
• For sale: Mixing bowl set designed to please a cook with round bottom for efficient beating

When do we assign roles?
• On-line, NOT by waiting until the end of the sentence
• Another heuristic that normally--but not always--works well

Garden path sentences
• The horse raced past the barn fell.
• The man who hunts ducks out on weekends.
• The cotton clothing is usually made of grows in Mississippi.
• The raft floated down the river sank.
• The first words lead listener down the garden path to an incorrect analysis
Called garden path sentences because the parser is assigning each word to a phrase structure, but it later becomes clear that one of the assignments must have been wrong.

Pragmatics
• Language as it is really used
• Not the crisp, clean sentences we’ve been discussing!
• Common ground is essential.

Haldeman: That the way to handle this now is for us to have Walters call Pat Gray and just say, "Stay the hell out of this...this is ah, business here we don't want you to go any further on it." That's not an unusual development,...
Nixon: Um huh.
Haldeman: ...and, uh, that would take care of it.
Nixon: What about Pat Gray, ah, you mean he doesn't want to?
Haldeman: Pat does want to. He doesn't know how to, and he doesn't have, he doesn't have any basis for doing it. Given this, he will then have the basis. He'll call Mark Felt in, and the two of them ...and Mark Felt wants to cooperate because...
Nixon: Yeah.
Haldeman: he's ambitious...
Nixon: Yeah.
Haldeman: Ah, he'll call him in and say, "We've got the signal from across the river to, to put the hold on this." And that will fit rather well because the FBI agents who are working the case, at this point, feel that's what it is. This is CIA.
- “Smoking gun” tape, 6-23-72

Common ground
• Woman: I’m leaving you.
• Man: Who is he?
Pragmatics
• Speakers should be informative, truthful, relevant, clear, unambiguous, brief, and orderly
• But they can violate these expectations for a particular purpose:
– Is Professor Willingham a good dancer?
– Well, he wears nice shoes.