Natural Language Processing
Lecture 1: Syntax

Outline for today's lecture
– Motivation
– Paradigms for studying language
– Levels of NL analysis
– Syntax and parsing: top-down, bottom-up, chart parsing

Motivation
– A "natural" interface for humans
– Programming language interfaces are difficult to learn
– WIMP interfaces (windows, icons, menus, pointers) can be inefficient or impractical; language can flatten out the search space
– Ubiquitous computing
– Economics: the cost of maintaining a phone bank vs. the cost of automated voice transactions
– The Turing Test: language makes us human (?)
– Example: the problems with expert system interfaces
– Large text databases: question answering, text summarization

Why can't we do it yet?
– Speech recognition technology is getting better, but we may be pushing up against what is possible with signal processing alone
– The real problem: AMBIGUITY!

Paradigms for studying language
– Linguistic: How do words form sentences and phrases? What constrains the possible meanings of a sentence?
– Psycholinguistic: How do people identify the structure of sentences? How are word meanings identified?
– Philosophic: What is meaning, and how do words and sentences acquire it? How do words identify objects in the world?
– Computational linguistic: How is the structure of sentences identified? How can knowledge and reasoning be modeled? How can language be used to accomplish specific tasks?

Levels of understanding
– Phonetic: How are words related to the sounds that make them up? /puh-lee-z/ = "please". Important for speech recognition systems.
– Morphological: How are words constructed from more basic components? un-friend-ly. Gives information about the function of words.
– Syntactic: How are words combined to form correct sentences? What role does each word play? The best understood level, since it is well studied for formal languages.
– Semantic: What do words mean?
  How do these meanings combine in sentences?
– Pragmatic: How are sentences used in different situations? How does the situation affect the interpretation of a sentence?
– Discourse level: How does the surrounding language context affect the interpretation of a sentence? Pronoun resolution, temporal references.
– World knowledge: general knowledge about the world necessary to communicate, including knowledge about the goals and intentions of other speakers.

Ambiguity in language
Language can be ambiguous on many levels:
– Phonological: too, two, to
– Lexical: cook, set, bug
– Syntactic: The man saw the boy with the telescope.
– Semantic (quantifier scope): Every boy loves a dog.
– Semantically anomalous: Green ideas have large noses.
– Pragmatic: Can you pass the salt?

Syntax
The syntactic structure of a sentence indicates the way that the words in the sentence are related to each other. The structure can indicate relationships between words and phrases, and can store information that is used later in processing.

Example
– The boy saw the cat.
– The cat saw the boy.
– The girl saw the man in the store. (Was the girl in the store?)

Syntactic processing
Main goals:
– Determining whether a sequence of symbols constitutes a legal sentence
– Assigning a phrase/constituent structure to legal sentences for later processing

Grammars and parsing techniques
– We need a grammar in order to parse: a grammar is a formal specification of the structures allowed in a language.
– Given a grammar, we also need a parsing technique, a method of analyzing a sentence to determine its structure according to the grammar.
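A grammar in this sense can be written down directly as data. Below is a minimal sketch (my own illustration, not from the lecture) of a toy CFG in Python, mapping each nonterminal to its possible right-hand sides; the rules match the simple grammar used in the parsing examples that follow.

```python
# A toy CFG as a Python dictionary (illustrative sketch): each
# nonterminal maps to a list of possible right-hand sides.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "VP":   [["V", "NP"]],
    "NP":   [["NAME"], ["ART", "N"]],
    "NAME": [["John"]],
    "V":    [["ate"]],
    "ART":  [["the"]],
    "N":    [["cat"]],
}

def expansions(symbol):
    """Right-hand sides a symbol can expand to; terminals (plain words,
    which have no rules) expand to nothing."""
    return GRAMMAR.get(symbol, [])
```

Any parsing technique then becomes a strategy for searching this rule table.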
Statistical vs. deterministic parsing
– Deterministic: provably correct, but brittle
– Statistical: always gives an answer, but with no guarantees
– We probably want to split the difference

NL and CFGs
– Context-free grammars (CFGs) are a good choice: powerful enough to describe most NL structure, yet restricted enough to allow efficient parsing
– A CFG has rules with a single nonterminal symbol on the left-hand side

A simple top-down parser (example in handouts)

A simple, silly grammar:
S -> NP VP
VP -> V NP
NP -> NAME
NP -> ART N
NAME -> John
V -> ate
ART -> the
N -> cat

A parse tree for "John ate the cat":
(S (NP (NAME John))
   (VP (V ate)
       (NP (ART the) (N cat))))

Simple top-down parse: start from S and repeatedly expand the leftmost nonterminal.
S
NP VP
NAME VP
John VP
John V NP
John ate NP
John ate ART N
John ate the N
John ate the cat

Simple bottom-up parse: start from the words and repeatedly replace a rule's right-hand side with its left-hand side.
John ate the cat
NAME ate the cat
NAME V the cat
NAME V ART cat
NAME V ART N
NP V ART N
NP V NP
NP VP
S

Parsing as search
Parsing can be viewed as a special case of the search problem. What are the similarities?

Chart parsing
Maintains information about partial parses, so constituents do not have to be recomputed.

Top-down chart parsing
Algorithm: do until no input is left and the agenda is empty:
– If the agenda is empty, look up the interpretations of the next word and add them to the agenda
– Select a constituent C from the agenda
– Combine C with every active arc on the chart; add newly formed constituents to the agenda
– For newly created active arcs, add them to the chart using the arc introduction algorithm

Arcs keep track of completed constituents, or potential constituents.

Arc introduction algorithm: to add an arc S -> C1 . . . ° Ci . . . Cn ending at position j, do the following: for each rule in the grammar of the form Ci -> X1 … Xk, recursively add the new arc Ci -> ° X1 … Xk.
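The chart-parsing loop described above can be sketched with the closely related Earley algorithm, which uses the same dotted-rule ("arc") idea: prediction introduces new arcs, scanning consumes words, and completion extends arcs whose dot sits before a finished constituent. This is a minimal illustration under my own naming, not the exact algorithm from the handouts.

```python
# Earley-style chart parser (illustrative sketch). A chart entry is
# (lhs, rhs, dot, origin): an arc for lhs -> rhs with the dot before
# rhs[dot], started at input position `origin`.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "VP":   [["V", "NP"]],
    "NP":   [["NAME"], ["ART", "N"]],
    "NAME": [["John"]],
    "V":    [["ate"]],
    "ART":  [["the"]],
    "N":    [["cat"]],
}

def earley(words, grammar, start="S"):
    chart = [set() for _ in range(len(words) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, tuple(rhs), 0, 0))
    for i in range(len(words) + 1):
        changed = True
        while changed:                          # iterate to a fixed point
            changed = False
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in grammar:
                    # arc introduction: predict arcs for the nonterminal
                    for prod in grammar[rhs[dot]]:
                        arc = (rhs[dot], tuple(prod), 0, i)
                        if arc not in chart[i]:
                            chart[i].add(arc); changed = True
                elif dot < len(rhs):
                    # scan: the dot is before a word; consume it
                    if i < len(words) and words[i] == rhs[dot]:
                        chart[i + 1].add((lhs, rhs, dot + 1, origin))
                else:
                    # complete: extend arcs waiting for this constituent
                    for l2, r2, d2, o2 in list(chart[origin]):
                        if d2 < len(r2) and r2[d2] == lhs:
                            arc = (l2, r2, d2 + 1, o2)
                            if arc not in chart[i]:
                                chart[i].add(arc); changed = True
    return any((start, tuple(rhs), len(rhs), 0) in chart[len(words)]
               for rhs in grammar[start])
```

Here `earley(["John", "ate", "the", "cat"], GRAMMAR)` returns True, while an ungrammatical string such as `["ate", "John"]` returns False; because arcs live in the chart, no constituent is ever rebuilt.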
[Worked example on slides: a top-down chart parse of "John ate the cat" over positions 0 John 1 ate 2 the 3 cat 4, against the grammar above. Each step shows the current agenda and the arcs on the chart: the parse starts with S -> ° NP VP, NP -> ° ART N, and NP -> ° NAME; "John" completes NP -> NAME ° and extends S -> NP ° VP; "ate" introduces VP -> V ° NP; "the" introduces NP -> ART ° N; and "cat" completes NP -> ART N °, after which the NP, the VP, and finally S are completed over the whole input.]
A bigger example
The slides show the chart for "the large can can hold the water" (positions 1 the 2 large 3 can 4 can 5 hold 6 the 7 water 8), where "can" is ambiguous between a noun and an auxiliary; the chart holds multiple NP, VP, and S constituents, including two complete S readings.

Complexity
For a sentence of length n:
– Pure search: O(C^n), where C depends on the algorithm
– Chart-based: O(K n^3), where K depends on the algorithm

Next time
– Semantics
– Maybe some Prolog

Other ideas
– Augmented transition networks
– Features
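The exponential (C^n) cost of pure search comes from backtracking: a naive top-down parser may re-derive the same constituents many times. As a point of contrast with the chart approach, here is a minimal backtracking recursive-descent sketch (my own illustration) over the toy grammar used earlier.

```python
# Naive backtracking top-down parser (illustrative sketch): expand the
# leftmost symbol, trying every rule; exponential in the worst case
# because partial results are never cached.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "VP":   [["V", "NP"]],
    "NP":   [["NAME"], ["ART", "N"]],
    "NAME": [["John"]],
    "V":    [["ate"]],
    "ART":  [["the"]],
    "N":    [["cat"]],
}

def parse(symbols, words):
    """True iff `symbols` can derive exactly the word sequence `words`."""
    if not symbols:
        return not words                     # succeed only if input used up
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                     # nonterminal: try each rule
        return any(parse(rhs + rest, words) for rhs in GRAMMAR[first])
    # terminal: must match the next input word
    return bool(words) and words[0] == first and parse(rest, words[1:])
```

`parse(["S"], ["John", "ate", "the", "cat"])` succeeds, but unlike the chart parser nothing is memoized, so repeated subproblems are re-solved from scratch.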