G22.2590 - Natural Language Processing - Spring 2001
Lecture 2 Outline
Prof. Grishman
January 24, 2001
Role of Syntax Analysis
determining and regularizing structure -- relations between words
captures generalizations about language
Basic Syntactic Structures of English (Allen, chapter 2)
parts of speech
phrases: classifying them by part of speech of main word or by syntactic role
subject and predicate; noun phrase and verb phrase
verb complements and modifiers
types of complements ... noun phrases, adjective phrases, prepositional phrases, particles
clauses; clausal complements
verb forms built with auxiliaries: progressive, perfect, passive
noun phrase structure
relative clauses; reduced relative clauses
coordinating and subordinating conjunctions
Comparison with other Languages
word segmentation
inflectional and derivational morphology
fixed vs. free word order
Phrase-structure languages (Allen 3.1)
productions; rewrite operation; derivation
Chomsky hierarchy (regular grammars, context-free grammars, context-sensitive grammars)
A small context-free English grammar
sentence := np vp;
np := n | art n | art adj n;
vp := v | v np;
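The terms productions, rewrite operation, and derivation can be made concrete with the small grammar above. The following is a minimal sketch, not part of the lecture materials: the names GRAMMAR, rewrite, and leftmost_derivation are hypothetical, and parts of speech are treated as the terminal symbols.

```python
# Productions of the small grammar above: nonterminal -> list of alternatives.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["n"], ["art", "n"], ["art", "adj", "n"]],
    "vp": [["v"], ["v", "np"]],
}


def rewrite(form, position, alternative):
    """Apply one production: replace the nonterminal at `position` in the
    sentential form with the chosen right-hand side."""
    symbol = form[position]
    rhs = GRAMMAR[symbol][alternative]
    return form[:position] + rhs + form[position + 1:]


def leftmost_derivation(choices, start="sentence"):
    """Rewrite the leftmost nonterminal at each step, using the alternative
    numbers in `choices`; return every sentential form produced."""
    form = [start]
    steps = [form]
    for alternative in choices:
        position = next(i for i, s in enumerate(form) if s in GRAMMAR)
        form = rewrite(form, position, alternative)
        steps.append(form)
    return steps


# sentence => np vp => art n vp => art n v np => art n v art n
derivation = leftmost_derivation([0, 1, 1, 1])
for form in derivation:
    print(" ".join(form))
```

Each printed line is one sentential form; the last one contains only parts of speech, so the derivation is complete.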
Including auxiliaries
vp := v | v np | v vp;
Including PPs
sentence := np vp;
np := ngroup | ngroup pp;
ngroup := n | art n | art adj n;
vp := v | v np | v vp | v np pp;
pp := p np;
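The rule notation used here (left-hand side, ":=", alternatives separated by "|", terminated by ";") is simple enough to read into a dictionary. The loader below is only an illustrative sketch; load_grammar and PP_GRAMMAR_TEXT are hypothetical names, and the code makes no claim about how any particular parser toolkit reads its grammar files.

```python
def load_grammar(text):
    """Parse rules of the form 'lhs := alt | alt;' into a dictionary that
    maps each left-hand side to a list of right-hand sides."""
    grammar = {}
    for rule in text.split(";"):
        if not rule.strip():
            continue  # skip the empty piece after the final semicolon
        lhs, rhs = rule.split(":=")
        alternatives = [alt.split() for alt in rhs.split("|")]
        grammar.setdefault(lhs.strip(), []).extend(alternatives)
    return grammar


PP_GRAMMAR_TEXT = """
sentence := np vp;
np := ngroup | ngroup pp;
ngroup := n | art n | art adj n;
vp := v | v np | v vp | v np pp;
pp := p np;
"""

grammar = load_grammar(PP_GRAMMAR_TEXT)

# Symbols that never appear on a left-hand side are the terminals
# (here, the parts of speech).
nonterminals = set(grammar)
terminals = {s for alts in grammar.values() for alt in alts for s in alt} - nonterminals
print(sorted(terminals))  # prints ['adj', 'art', 'n', 'p', 'v']
```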
Parsers
Top-down recognizer / parser (Allen 3.3)
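A top-down recognizer starts from the sentence symbol and expands the leftmost symbol, backtracking over alternatives, until the parts of speech match the input words. The sketch below is one simple backtracking version under stated assumptions, not the algorithm as presented in Allen or implemented in Jet: LEXICON is a made-up toy dictionary, and expand and recognize are hypothetical names. It relies on the grammar having no left-recursive rules; with left recursion this simple strategy would not terminate.

```python
# The grammar with PPs from the outline, as nonterminal -> alternatives.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["ngroup"], ["ngroup", "pp"]],
    "ngroup": [["n"], ["art", "n"], ["art", "adj", "n"]],
    "vp": [["v"], ["v", "np"], ["v", "vp"], ["v", "np", "pp"]],
    "pp": [["p", "np"]],
}

# Hypothetical toy dictionary: word -> set of parts of speech.
LEXICON = {
    "the": {"art"}, "a": {"art"}, "old": {"adj"},
    "dog": {"n"}, "cat": {"n"}, "park": {"n"},
    "saw": {"v", "n"}, "can": {"v"}, "bark": {"v"},
    "in": {"p"},
}


def expand(symbols, words, pos):
    """Yield every position reachable by matching `symbols` in order,
    starting at words[pos]."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Nonterminal: predict each alternative, leftmost symbol first.
        for alternative in GRAMMAR[first]:
            yield from expand(alternative + rest, words, pos)
    elif pos < len(words) and first in LEXICON.get(words[pos], ()):
        # Part of speech: match it against the next word.
        yield from expand(rest, words, pos + 1)


def recognize(sentence):
    """Return True if some expansion of 'sentence' consumes every word."""
    words = sentence.lower().split()
    return any(end == len(words) for end in expand(["sentence"], words, 0))


print(recognize("the dog saw a cat in the park"))
print(recognize("dog the saw"))
```

A recognizer only answers yes or no; turning it into a parser means recording which alternative was chosen at each expansion.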
Bottom-up (immediate-constituent) parser (Grishman 2.4.2)
Uses tree nodes with components
root (a non-terminal grammar symbol),
start and end (token numbers), and
constituents (a vector of parse tree nodes)
For i = 1, …, number of words in the sentence
Create a node with root = part of speech of word i, start = i, end = i+1
(if the word has several parts of speech, create one node for each P.O.S.)
Put this node on list todo
While todo is not empty,
Remove node n from todo
For each production A → a1 a2 … aj such that
root(n) = aj
and each sequence of nodes n1 … nj-1 such that, taking nj = n,
root(nk) = ak and end(nk) = start(nk+1) (k = 1, …, j-1),
create a new node with root = A, start = start(n1), end = end(n), and
constituents = n1 … nj-1, n, and add it to todo.
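The bottom-up procedure above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the code from Grishman 2.4.2 or from Jet: LEXICON is a made-up toy dictionary, and Node, sequences, parse, and bracket are hypothetical names. One ordering choice is an assumption made here: todo is emptied after each word is added, so that every node ending at start(n) already exists by the time n is taken off todo.

```python
from dataclasses import dataclass

GRAMMAR = [  # (left-hand side, right-hand side) pairs of the small grammar
    ("sentence", ("np", "vp")),
    ("np", ("n",)), ("np", ("art", "n")), ("np", ("art", "adj", "n")),
    ("vp", ("v",)), ("vp", ("v", "np")),
]

# Hypothetical toy dictionary: word -> parts of speech.
LEXICON = {"the": ["art"], "a": ["art"], "old": ["adj"],
           "dog": ["n"], "cat": ["n"], "saw": ["v", "n"]}


@dataclass(frozen=True)
class Node:
    root: str                 # grammar symbol
    start: int                # token number of the first word covered
    end: int                  # token number just past the last word covered
    constituents: tuple = ()  # child nodes (empty for a word node)


def sequences(rhs, end, nodes):
    """Yield tuples of existing nodes that match `rhs`, are adjacent to one
    another, and whose last node ends at `end`."""
    if not rhs:
        yield ()
        return
    for node in nodes:
        if node.root == rhs[-1] and node.end == end:
            for left in sequences(rhs[:-1], node.start, nodes):
                yield left + (node,)


def parse(words):
    nodes = []  # every node created so far
    for i, word in enumerate(words, start=1):
        # One node per part of speech of word i (none if the word is unknown).
        todo = [Node(pos, i, i + 1) for pos in LEXICON.get(word, [])]
        nodes.extend(todo)
        while todo:
            n = todo.pop()
            for lhs, rhs in GRAMMAR:
                if rhs[-1] != n.root:
                    continue  # n must be the last symbol of the production
                for left in list(sequences(rhs[:-1], n.start, nodes)):
                    start = left[0].start if left else n.start
                    new = Node(lhs, start, n.end, left + (n,))
                    nodes.append(new)
                    todo.append(new)
    return [n for n in nodes
            if n.root == "sentence" and n.start == 1 and n.end == len(words) + 1]


def bracket(node):
    """Render a parse tree as a bracketed string."""
    if not node.constituents:
        return node.root
    return "(" + node.root + " " + " ".join(bracket(c) for c in node.constituents) + ")"


parses = parse("the dog saw a cat".split())
for tree in parses:
    print(bracket(tree))  # prints (sentence (np art n) (vp v (np art n)))
```

Because "saw" is listed as both a verb and a noun, the parser also builds a noun phrase over it, but that node never becomes part of a complete sentence, so only one parse is returned.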
Assignment #2
Due February 7th.
Allen Chapter 2 exercises 2, 3, and 8.
Using Jet, add a verb and a noun to the given dictionary and parse two sentences
with the top-down parser;
submit the parses produced (copy and paste from the console log).