EECS 595 / LING 541 / SI 661
Natural Language Processing
Fall 2004
Lecture Notes #1
Introduction
Course logistics
• Instructor: Prof. Dragomir Radev (radev@umich.edu)
Ph.D., Computer Science, Columbia University
Formerly at IBM TJ Watson Research Center
• Times: Tuesdays 1:10-3:55 PM, in 412, West Hall
• Office hours: TBA, 3080 West Hall Connector
Course home page:
http://www.si.umich.edu/~radev/NLP-fall2004
Example (from a famous movie)
Dave Bowman: Open the pod bay doors, HAL.
HAL: I’m sorry, Dave. I’m afraid I can’t do that.
Example
I saw her fall
• How many different interpretations does the
above sentence have?
What is Natural Language
Processing
• Natural Language Processing (NLP) is the
study of the computational treatment of
natural language.
• NLP draws on research in Linguistics,
Theoretical Computer Science,
Mathematics and Statistics, Artificial
Intelligence, Psychology, etc.
Linguistics
• Knowledge about language:
– Phonetics and phonology - the study of sounds
– Morphology - the study of word components
– Syntax - the study of sentence and phrase structure
– Lexical semantics - the study of the meanings of words
– Compositional semantics - how to combine words
– Pragmatics - how to accomplish goals
– Discourse conventions - how to deal with units larger than utterances
Theoretical Computer Science
• Automata
– Deterministic and non-deterministic finite-state automata
– Push-down automata
• Grammars
– Regular grammars
– Context-free grammars
– Context-sensitive grammars
• Complexity
• Algorithms
– Dynamic programming
Mathematics and Statistics
• Probabilities
• Statistical models
• Hypothesis testing
• Linear algebra
• Optimization
• Numerical methods
Artificial Intelligence
• Logic
– First-order logic
– Predicate calculus
• Agents
– Speech acts
• Planning
• Constraint satisfaction
• Machine learning
Ambiguity
I saw her fall.
• The categories of knowledge of language can be
thought of as ambiguity-resolving components
• How many different interpretations does the above
sentence have?
• How can each ambiguous piece be resolved?
• Does speech input make the sentence even more
ambiguous?
Time flies like an arrow.
http://edition.cnn.com/2004/WEATHER/09/03/hurricane.frances/index.html
Frances churns toward Florida
Hurricane center: Storm 'relentlessly lashing Bahamas'
Friday, September 3, 2004 Posted: 2024 GMT (0424 HKT)
MIAMI, Florida (CNN) -- Hurricane Frances moved slowly toward Florida on Friday, and the National Hurricane Center said it could gain
intensity before making landfall, possibly late Saturday.
At 2 p.m. ET, the Category 3 storm was centered near the southern tip of Great Abaco in the Bahamas, 200 miles (321 kilometers) east-southeast of Florida's lower east coast, according to the National Hurricane Center.
The storm was moving toward the west-northwest at about 9 mph (15 kph).
Its maximum sustained winds had dropped to 115 mph (185 kph), but forecasters said it still is "a dangerous hurricane."
Hurricanes are classified as categories 1 to 5 on the Saffir-Simpson hurricane scale. A Category 3 storm has sustained winds between 111
and 130 mph (178 and 209 kph).
The advisory said Frances was likely to make landfall in Florida in about 36 hours.
Hurricane-force winds extend 85 miles (140 kilometers) from the center of the storm, and winds of tropical storm strength (39-73 mph)
extend outward up to 185 miles (295 kilometers).
Because Frances is the size of Texas -- more than twice as large as Hurricane Charley three weeks ago -- its major winds and heavy rain
are expected to batter a large part of Florida well before landfall.
By Friday afternoon, parts of Florida were experiencing wind gusts as high as 39 mph -- the lower end of tropical-storm intensity.
Hurricane warnings are in effect for much of Florida's eastern coastline. A hurricane warning means hurricane conditions are expected in
the warning area within 24 hours.
Storm surge flooding of six to 14 feet above normal has been reported in the storm's path, and the hurricane center warned "rainfall
amounts of seven to 12 inches -- locally as high as 20 inches -- are possible in association with Frances."
The hurricane center bulletin said Frances was "relentlessly lashing the central and western Bahamas."
A hurricane center official told CNN the storm could spend two days moving across the Florida Peninsula.
Frances has weakened slightly in the past few days, but the hurricane center advisory warned that as it moves across the warm waters of
the Gulf Stream, "this could easily lead to re-intensification."
However, current forecasts predict "a 100-knot hurricane at landfall" -- meaning wind speeds of about 115 mph.
Because steering currents are expected to weaken further, Frances "will likely slow down on its way to Florida. This could delay the
landfall a few more hours," the advisory said. "Numerical guidance continues to bring the hurricane over Florida during the next two to
three days."
Florida Gov. Jeb Bush said Friday that the state was taking all necessary steps to prepare for the storm.
"We are staging across -- some outside the state and some inside the state -- a massive response for this storm, and we're going to need it,"
Bush said in a news conference. "There's going to be a lot of work necessary to make sure that the response is massive and immediate to
help people once this storm comes."
He said he has asked the governors of 17 states to waive size and weight restrictions on trucks carrying relief supplies.
His brother, President Bush, also offered support at a campaign rally Friday morning in Pennsylvania.
"Before I begin, I do know you'll join me in offering our prayers and best wishes to those in the path of Hurricane Frances," the president
said.
A hurricane the size of Texas
Florida ordered mandatory evacuations in parts of 16 counties and voluntary evacuations in five other counties.
"If you are on a barrier island or a low-lying area, and you haven't left, now is the time to do so," Governor Bush said.
Florida officials said the evacuation order covers 2.5 million people.
Most of them "are staying in their own community, which is exactly what they should be doing," said Bush, noting that low-lying areas
were most at risk. "They've made plans to be with a loved one or a friend and they're not on the roads."
People looking to flee the region clogged highways Thursday, but officials said Friday that traffic had died down. "Overall we're very, very
pleased with evacuation procedures yesterday and continuing through today," said Col. Chris Knight, director of the Florida Highway
Patrol. "We have no problems this morning."
The Red Cross opened 82 shelters in Florida on Thursday and about 21,000 people were in them by nightfall, spokeswoman Carol Miller
told CNN. The group also set up eight reception centers along the highway to help people who needed information, directions, water and
maps, she said.
Miller said the Red Cross was launching its largest-ever response effort to a domestic natural disaster.
Airlines have canceled flights in and out of some of the major airports in Florida and the Caribbean, and are expected to adjust schedules
as weather patterns change throughout the weekend.
Military preparations
Military officials are preparing to evacuate three commands as Frances approaches.
At MacDill Air Force Base in Tampa, on Florida's Gulf Coast, a military team is preparing to set up alternative headquarters facilities for
the U.S. Central Command and Special Operations Command at the stadium used by the Tampa Bay Buccaneers football team.
Central Command is responsible for running the wars in Afghanistan and Iraq, while Special Operations Command oversees 50,000
special operations forces.
Patrick Air Force Base, on the eastern coast of Florida near Melbourne, was evacuated Thursday, and the commander of a fighter wing
near Miami ordered aircraft moved out of the hurricane's path.
The naval air station at Jacksonville also moved aircraft out of the area.
In Miami, the headquarters of the Southern Command has closed. Command-and-control operations are being performed, but they could
be moved to Davis-Monthan Air Force Base in Arizona.
The alphabet soup
(NLP vs. CL vs. SP vs. HLT vs. NLE)
• NLP (Natural Language Processing)
• CL (Computational Linguistics)
• SP (Speech Processing)
• HLT (Human Language Technology)
• NLE (Natural Language Engineering)
• Other areas of research: Speech and Text Generation, Speech and Text Understanding, Information Extraction, Information Retrieval, Dialogue Processing, Inference
• Related areas: Spelling Correction, Grammar Correction, Text Summarization
Sample applications
• Speech Understanding
• Question Answering
• Machine Translation
• Text-to-speech Generation
• Text Summarization
• Dialogue Systems
Some demos
• AT&T Labs Text-To-Speech (http://www.research.att.com/projects/tts/demo.html)
• Babelfish (babelfish.altavista.com)
• OneAcross (www.oneacross.com)
• AskJeeves (www.ask.com)
• IONaut (http://www.ionaut.com:8400)
• NSIR (http://tangra.si.umich.edu/clair/NSIR/html/nsir.cgi)
• AnswerBus (www.answerbus.com)
• NewsInEssence (www.newsinessence.com)
The Turing Test
• Alan Turing: the Turing test (language as test for intelligence)
• Three participants: a computer and two humans (one is an
interrogator)
• Interrogator’s goal: to tell the machine and human apart
• Machine’s goal: to fool the interrogator into believing that a
person is responding
• Other human’s goal: to help the interrogator reach his goal
Q: Please write me a sonnet on the topic of the Forth Bridge.
A: Count me out on this one. I never could write poetry.
Q: Add 34957 to 70764.
A: 105621 (after a pause)
Some brief history
• Foundational insights (40’s and 50’s): automaton (Turing),
probabilities, information theory (Shannon), formal
languages (Backus and Naur), noisy channel and decoding
(Shannon), first systems (Davis et al., Bell Labs)
• Two camps (57-70): symbolic and stochastic.
Transformation grammar (Harris, Chomsky), artificial
intelligence (Minsky, McCarthy, Shannon, Rochester),
automated theorem proving and problem solving (Newell
and Simon)
Bayesian reasoning (Mosteller and Wallace)
Corpus work (Kučera and Francis)
Some brief history
• Four paradigms (70-83): stochastic (IBM), logic-based (Colmerauer, Pereira and Warren, Kay, Bresnan), natural language understanding (Winograd, Schank, Fillmore), discourse modelling (Grosz and Sidner)
• Empiricism and finite-state models redux (83-93):
Kaplan and Kay (phonology and morphology),
Church (syntax)
• Late years (94-03): strong integration of different
techniques, different areas (including speech and
IR), probabilistic models, machine learning
The state of the art and the near-term future
• World-Wide Web (WWW)
• Sample scenarios:
– generate weather reports in two languages
– teaching deaf people to speak
– translate Web pages into different languages
– speak to your appliances
– find restaurants
– answer questions
– grade essays (?)
– closed-captioning in many languages
– automatic description of a soccer game
Structure of the course
• Three major parts:
– Linguistic, mathematical, and computational background
– Computational models of morphology, syntax, semantics, discourse,
pragmatics
– Applications: text generation, machine translation, information extraction,
etc.
• Three major goals:
– Learn the basic principles and theoretical issues underlying natural
language processing
– Learn techniques and tools used to develop practical, robust systems that
can communicate with users in one or more languages
– Gain insight into many open research problems in natural language processing
Readings
• Speech and Language Processing (Daniel Jurafsky and James Martin), Prentice-Hall, 2000, ISBN: 0-13-095069-6
• Handouts given in class
• 1-2 chapters per week
Optional readings:
Natural Language Understanding by Allen
Foundations of Statistical Natural Language Processing by Manning and Schütze.
Grading
• Four homework assignments (40%)
• Midterm (15%)
• Final project (20%)
• Final exam (25%)
• Additional requirements for SI761
Assignments
• (subject to change)
– Finite-state modeling, part of speech tagging, and
information extraction
• Fsmtools/lextools/JMX (Bell Labs, Penn)
– Tagging and parsing
• Brill tagger/Charniak parser (JHU, Brown)
– Machine translation
• GIZA++/Rewrite decoder (Aachen, JHU, ISI)
– Text generation
• FUF/Surge (Columbia)
Syllabus
Wk  Date   Topic
1   9/7    Introduction (JM1); Linguistic Fundamentals
2   9/14   Regular Expressions and Automata (JM2)
3   9/21   Morphology and Finite-State Transducers (JM3); Word Classes and Part of Speech Tagging (JM8)
4   9/28   Context-Free Grammars for English (JM9); Parsing with Context-Free Grammars (JM10)
5   10/5   Features and Unification (JM11); Lexicalized and Probabilistic Parsing (JM12)
6   10/12  Natural Language Generation (JM20); Machine Translation (JM21 + handout)
–   10/19  NO CLASS
HW assigned during these weeks: #1, #2, #3; HW due: #1, #2
Syllabus
Wk  Date       Topic
7   10/26      Midterm
8   11/2       Natural Language Generation (JM20) (Cont’d); The Functional Unification Formalism (Handout)
9   11/9       Language and Complexity (JM13)
10  11/16      Representing Meaning (JM14)
11  11/23      Semantic Analysis (JM15); Discourse (JM18)
12  11/30      Rhetorical Analysis (Handout); Dialogue and Conversational Agents (JM19)
13  12/7 & 14  Project Presentations
HW assigned during these weeks: #4; HW due: #3, #4; Project due
Other meetings
• CLAIR meeting (TBA)
• Artificial Intelligence Seminar (Tuesdays 4-5:30)
• STIET (Thursdays 4-5:30)
Projects
Each student will be responsible for designing and completing a research project that
demonstrates the ability to use concepts from the class in addressing a practical
problem. A significant part of the final grade will depend on the project assignment.
Students can elect to do a project on an assigned topic, or to select a topic of their own.
The final version of the project will be put on the World Wide Web, and will be
defended in front of the class at the end of the semester (procedure TBA).
In some cases (and only with instructor’s approval), students may be allowed to work
in pairs when the project’s scope is significant.
Sample projects
• Noun phrase parser
• Paraphrase identification
• Question answering
• NL access to databases
• Named entity tagging
• Rhetorical parsing
• Anaphora resolution, entity cross-reference
• Document and sentence alignment
• Using bioinformatics methods
• Encyclopedia
• Information extraction
• Speech processing
• Sentence normalization
• Text summarization
• Sentence compression
• Definition extraction
• Crossword puzzle generation
• Prepositional phrase attachment
• Machine translation
• Generation
• Semi-structured document parsing
• Semantic analysis of short queries
• User-friendly summarization
• Number classification
• Domain-specific PP attachment
• Time-dependent fact extraction
Main research forums and other
pointers
• Conferences: ACL/NAACL, SIGIR, AAAI/IJCAI, ANLP, Coling,
HLT, EACL/NAACL, AMTA/MT Summit, ICSLP/Eurospeech
• Journals: Computational Linguistics, Natural Language Engineering,
Information Retrieval, Information Processing and Management, ACM
Transactions on Information Systems, ACM TALIP, ACM TSLP
• University centers: Columbia, CMU, JHU, Brown, UMass, MIT,
UPenn, USC/ISI, NMSU, Michigan, Maryland, Edinburgh,
Cambridge, Saarland, Sheffield, and many others
• Industrial research sites: IBM, SRI, BBN, MITRE, MSR, (AT&T, Bell
Labs, PARC)
• Startups: Language Weaver, Ask.com, LCC
• The Anthology: http://www.aclweb.org/anthology
What this course is NOT
• EECS 597 / LING 792 / SI 661 “Language and Information”, last taught in Fall of 2002, essentially an introduction to corpus-based and statistical NLP.
  – Topics covered: introduction to computational linguistics, information theory, data compression and coding, N-gram models, clustering, lexicography, collocations, text summarization, information extraction, question answering, word sense disambiguation, analysis of style, and other topics.
• SI 760 “Information Retrieval”, last taught Winter 2003.
  – Topics covered: information need, IR models, documents, queries, query languages, relevance, retrieval evaluation, reference collections, query expansion and relevance feedback, indexing and searching, XML retrieval, language modeling approaches, crawling the Web, hyperlink analysis, measuring the Web, similarity and clustering, social network analysis for IR, hubs and authorities, PageRank and HITS, focused crawling, relevance transfer, question answering
• An undergraduate Linguistics course such as Ling 212 “Intro to the Symbolic Analysis of Language” or Ling 320 “Programming for Linguistics and Language Studies”
Linguistic Fundamentals
Syntactic categories
• Substitution test:
Nathalie likes { black / Persian / tabby / small } cats.
• Open (lexical) and closed (functional) categories:
  – Open: no-fly-zone, yadda yadda yadda
  – Closed: the, in
Morphology
The dog chased the yellow bird.
• Parts of speech: eight (or so) general types
• Inflection (number, person, tense…)
• Derivation (adjective-adverb, noun-verb)
• Compounding (separate words or single word)
• Part-of-speech tagging
• Morphological analysis (prefix, root, suffix, ending)
Part of speech tags
From Church (1991) - 79 tags
NN   /* singular noun */
IN   /* preposition */
AT   /* article */
NP   /* proper noun */
JJ   /* adjective */
,    /* comma */
NNS  /* plural noun */
CC   /* conjunction */
RB   /* adverb */
VB   /* un-inflected verb */
VBN  /* verb +en (taken, looked (passive, perfect)) */
VBD  /* verb +ed (took, looked (past tense)) */
CS   /* subordinating conjunction */
Jabberwocky (Lewis Carroll)
’Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"
Nouns
• Nouns: dog, tree, computer, idea
• Nouns vary in number (singular, plural),
gender (masculine, feminine, neuter), case
(nominative, genitive, accusative, dative)
• Latin: filius (m), filia (f), filium (object)
German: Mädchen
• Clitics (‘s)
Pronouns
• Pronouns: she, ourselves, mine
• Pronouns vary in person, gender, number, case (in
English: nominative, accusative, possessive, 2nd
possessive, reflexive)
Mary saw her in the mirror.
Mary saw herself in the mirror.
• Anaphors: herself, each other
Determiners and adjectives
• Articles: the, a
• Demonstratives: this, that
• Adjectives: describe properties
• Attributive and predicative adjectives
• Agreement: in gender, number
• Comparative and superlative (derivative and periphrastic)
• Positive form
Verbs
• Actions, activities, and states (throw, walk, have)
• English: four verb forms
• tenses: present, past, future
• other inflection: number, person
• gerunds and infinitive
• aspect: progressive, perfective
• voice: active, passive
• participles, auxiliaries
• irregular verbs
• French and Finnish: many more inflections than English
Other parts of speech
• Adverbs, prepositions, particles
• phrasal verbs (the plane took off, take it off)
• particles vs. prepositions (she ran up a
bill/hill)
• Coordinating conjunctions: and, or, but
• Subordinating conjunctions: if, because,
that, although
• Interjections: Ouch!
Phrase structure
• Constraints on word order
• Constituents: NP, PP, VP, AP
• Phrase structure grammars
[S [NP [PN Spot]] [VP [V chased] [NP [Det a] [N bird]]]]
Phrase structure
• Paradigmatic relationships (e.g., constituency)
• Syntagmatic relationships (e.g., collocations)
[S [NP That man] [VP [VBD caught] [NP the butterfly] [PP [IN with] [NP a net]]]]
Phrase-structure grammars
Peter gave Mary a book.
Mary gave Peter a book.
• Constituent order (SVO, SOV)
• imperative forms
• sentences with auxiliary verbs
• interrogative sentences
• declarative sentences
• start symbol and rewrite rules
• context-free view of language
Sample phrase-structure grammar
S   → NP VP
NP  → AT NNS
NP  → AT NN
NP  → NP PP
VP  → VP PP
VP  → VBD
VP  → VBD NP
PP  → IN NP

AT  → the
NNS → children
NNS → students
NNS → mountains
VBD → slept
VBD → ate
VBD → saw
IN  → in
IN  → of
NN  → cake
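
To make the rewrite-rule view concrete, here is a minimal Python sketch (illustrative only, not part of the course materials) that encodes the grammar above as a dictionary of rewrite rules and generates a random sentence from the start symbol S:

import random

# Illustrative only: the sample grammar above, encoded as rewrite rules.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["AT", "NNS"], ["AT", "NN"], ["NP", "PP"]],
    "VP":  [["VP", "PP"], ["VBD"], ["VBD", "NP"]],
    "PP":  [["IN", "NP"]],
    "AT":  [["the"]],
    "NNS": [["children"], ["students"], ["mountains"]],
    "VBD": [["slept"], ["ate"], ["saw"]],
    "IN":  [["in"], ["of"]],
    "NN":  [["cake"]],
}

def generate(symbol="S"):
    """Expand a symbol with a randomly chosen rewrite rule until only words remain."""
    if symbol not in GRAMMAR:            # terminal: an actual word
        return [symbol]
    expansion = random.choice(GRAMMAR[symbol])
    return [word for part in expansion for word in generate(part)]

print(" ".join(generate()))              # e.g. "the children saw the cake"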
Phrase structure grammars
• Local dependencies
• Non-local dependencies
• Subject-verb agreement
The women who found the wallet were given a reward.
• wh-extraction
Should Peter buy a book?
Which book should Peter buy?
• Empty nodes
Dependency: arguments and adjuncts
Sue watched the man at the next table.
• Event + dependents (verb arguments are usually
NPs)
• agent, patient, instrument, goal - semantic roles
• subject, direct object, indirect object
• transitive, intransitive, and ditransitive verbs
• active and passive voice
Subcategorization
• Arguments: subject + complements
• adjuncts vs. complements
• adjuncts are optional and describe time,
place, manner…
• subordinate clauses
• subcategorization frames
Subcategorization
Subject: The children eat candy.
Object: The children eat candy.
Prepositional phrase: She put the book on the table.
Predicative adjective: We made the man angry.
Bare infinitive: She helped me walk.
To-infinitive: She likes to walk.
Participial phrase: She stopped singing that tune at the
end.
That-clause: She thinks that it will rain tomorrow.
Question-form clauses: She asked me what book I was
reading.
Subcategorization frames
• Intransitive verbs: The woman walked
• Transitive verbs: John loves Mary
• Ditransitive verbs: Mary gave Peter flowers
• Intransitive with PP: I rent in Paddington
• Transitive with PP: She put the book on the table
• Sentential complement: I know that she likes you
• Transitive with sentential complement: She told me that Gary is coming on Tuesday
Selectional restrictions and
preferences
• Subcategorization frames capture syntactic
regularities about complements
• Selectional restrictions and preferences
capture semantic regularities: bark, eat
Phrase structure ambiguity
• Grammars are used for generating and parsing
sentences
• Parses
• Syntactic ambiguity
• Attachment ambiguity: Our company is training
workers.
• The children ate the cake with a spoon.
• High vs. low attachment
• Garden path sentences: The horse raced past the
barn fell. Is the book on the table red?
Ungrammaticality vs. semantic
abnormality
* Slept children the.
# Colorless green ideas sleep furiously.
# The cat barked.
Semantics and pragmatics
• Lexical semantics and compositional semantics
• Hypernyms, hyponyms, antonyms, meronyms and
holonyms (part-whole relationship, tire is a
meronym of car), synonyms, homonyms
• Senses of words, polysemous words
• Homophony (bass).
• Collocations: white hair, white wine
• Idioms: to kick the bucket
Discourse analysis
• Anaphoric relations:
1. Mary helped Peter get out of the car. He thanked her.
2. Mary helped the other passenger out of the car.
The man had asked her for help because of his foot
injury.
• Information extraction problems (entity cross-referencing)
Hurricane Hugo destroyed 20,000 Florida homes.
At an estimated cost of one billion dollars, the disaster
has been the most costly in the state’s history.
Pragmatics
• The study of how knowledge about the
world and language conventions interact
with literal meaning.
• Speech acts
• Research issues: resolution of anaphoric
relations, modeling of speech acts in
dialogues
Other areas of NLP
• Linguistics is traditionally divided into phonetics,
phonology, morphology, syntax, semantics, and
pragmatics.
• Sociolinguistics: interactions of social
organization and language.
• Historical linguistics: change over time.
• Linguistic typology
• Language acquisition
• Psycholinguistics: real-time production and
perception of language
Other sites
• Johns Hopkins University (Jason Eisner)
http://www.cs.jhu.edu/~jason/465/
• Cornell University (Lillian Lee)
http://courses.cs.cornell.edu/cs674/2002SP/
• Simon Fraser University (Anoop Sarkar)
http://www.sfu.ca/~anoop/courses/CMPT-825-Fall-2003/index.html
• Stanford University (Chris Manning)
http://www.stanford.edu/class/cs224n/
• JHU Summer workshop
http://www.clsp.jhu.edu/ws2003/calendar/preliminary.shtml
Word classes and
part-of-speech tagging
Part of speech tagging
• Problems: transport, object, discount, address
• More problems: content
• French: est, président, fils
• “Book that flight” – what is the part of speech associated with “book”?
• POS tagging: assigning parts of speech to words in a text.
• Three main techniques: rule-based tagging, stochastic tagging, transformation-based tagging
Rule-based POS tagging
• Use dictionary or FST to find all possible
parts of speech
• Use disambiguation rules (e.g., ART+V)
• Typically hundreds of constraints can be
designed manually
Example in French
<S>        ^                     beginning of sentence
La         rf b nms u            article
teneur     nfs nms               noun feminine singular
moyenne    jfs nfs v1s v2s v3s   adjective feminine singular
en         p a b                 preposition
uranium    nms                   noun masculine singular
des        p r                   preposition
rivières   nfp                   noun feminine plural
,          x                     punctuation
bien_que   cs                    subordinating conjunction
délicate   jfs                   adjective feminine singular
à          p                     preposition
calculer   v                     verb
Sample rules
BS3 BI1: A BS3 (3rd person subject personal pronoun) cannot be followed by a BI1 (1st person indirect personal pronoun). In the example “il nous faut” (we need), “il” has the tag BS3MS and “nous” has the tags [BD1P BI1P BJ1P BR1P BS1P]. The negative constraint “BS3 BI1” rules out “BI1P”, and thus leaves only 4 alternatives for the word “nous”.

N K: The tag N (noun) cannot be followed by a tag K (interrogative pronoun); an example in the test corpus would be: “... fleuve qui ...” (... river, that ...). Since “qui” can be tagged both as an “E” (relative pronoun) and a “K” (interrogative pronoun), the “E” will be chosen by the tagger since an interrogative pronoun cannot follow a noun (“N”).

R V: A word tagged with R (article) cannot be followed by a word tagged with V (verb): for example “l’appelle” (calls him/her). The word “appelle” can only be a verb, but “l’” can be either an article or a personal pronoun. Thus, the rule will eliminate the article tag, giving preference to the pronoun.
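
The rules above can be read as negative tag-bigram constraints that prune a word's candidate tags. The following is a minimal illustrative sketch of that pruning step, assuming a toy two-word lexicon and simplified tag names (R = article, S = personal pronoun, V = verb); it is a sketch, not the actual tagger described in the example:

# Illustrative sketch: candidate tags from a toy lexicon, pruned by negative
# tag-bigram constraints (tag names R = article, S = pronoun, V = verb are assumed).
LEXICON = {
    "l'":      {"R", "S"},      # article or personal pronoun
    "appelle": {"V"},           # verb only
}
NEGATIVE_CONSTRAINTS = {("R", "V")}     # an article cannot precede a verb

def disambiguate(words):
    """Return each word's remaining candidate tags after applying the constraints."""
    candidates = [set(LEXICON[w]) for w in words]
    for i in range(len(words) - 1):
        for left in list(candidates[i]):
            # drop a tag if every candidate for the next word is forbidden after it
            if all((left, right) in NEGATIVE_CONSTRAINTS for right in candidates[i + 1]):
                candidates[i].discard(left)
    return candidates

print(disambiguate(["l'", "appelle"]))   # [{'S'}, {'V'}] – the article reading is ruled out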
Stochastic POS tagging
• HMM tagger
• Pick the most likely tag for this word
• P(word|tag) * P(tag|previous n tags) – find tag sequence that maximizes the probability formula
• A bigram-based HMM tagger chooses the tag t_i for word w_i that is most probable given the previous tag t_{i-1} and the current word w_i:
• t_i = argmax_j P(t_j | t_{i-1}, w_i)
• t_i = argmax_j P(t_j | t_{i-1}) P(w_i | t_j)  (HMM equation for a single tag)
Example
• Secretariat/NNP is/VBZ expected/VBN
to/TO race/VB tomorrow/ADV
• People/NNS continue/VBP to/TO
inquire/VB the/DT reason/NN for/IN
the/DT race/NN for/IN outer/JJ space/NN
• P(VB|TO)P(race|VB)
• P(NN|TO)P(race|NN)
• TO: to+VB (to sleep), to+NN (to school)
Example (cont’d)
• P(NN|TO) = .021
• P(VB|TO) = .34
• P(race|NN) = .00041
• P(race|VB) = .00003
• P(VB|TO)P(race|VB) = .00001
• P(NN|TO)P(race|NN) = .000007
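
As a check, the single-tag decision rule t_i = argmax_j P(t_j | t_{i-1}) P(w_i | t_j) can be evaluated directly on the numbers above. The Python sketch below is illustrative only, with the tables restricted to the two candidate tags of the example:

# Illustrative sketch of t_i = argmax_j P(t_j | t_{i-1}) P(w_i | t_j) on the numbers above.
P_TAG  = {("TO", "VB"): 0.34, ("TO", "NN"): 0.021}             # P(tag | previous tag)
P_WORD = {("race", "VB"): 0.00003, ("race", "NN"): 0.00041}    # P(word | tag)

def best_tag(word, prev_tag, tagset=("VB", "NN")):
    """Pick the tag maximizing P(tag | prev_tag) * P(word | tag)."""
    return max(tagset,
               key=lambda t: P_TAG.get((prev_tag, t), 0.0) * P_WORD.get((word, t), 0.0))

print(best_tag("race", "TO"))   # VB, since 0.34 * 0.00003 > 0.021 * 0.00041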
HMM Tagging
• T = argmax_T P(T|W), where T = t_1 t_2 … t_n
• By Bayes’ rule: P(T|W) = P(T)P(W|T)/P(W)
• Thus we are attempting to choose the sequence of tags that maximizes the right-hand side of the equation
• P(W) can be ignored
• P(T)P(W|T) = ∏_{i=1..n} P(w_i | w_1 t_1 … w_{i-1} t_{i-1} t_i) P(t_i | w_1 t_1 … w_{i-1} t_{i-1})
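
In practice the sequence-level argmax is computed with the Viterbi algorithm under the usual bigram simplifications P(T) ≈ ∏ P(t_i | t_{i-1}) and P(W|T) ≈ ∏ P(w_i | t_i). The sketch below is a minimal illustration of that computation; its probability tables are hypothetical toy values, not corpus estimates:

from collections import defaultdict

# Illustrative Viterbi sketch for a bigram HMM tagger; the tables are toy values.
trans = defaultdict(float, {("<s>", "TO"): 0.5, ("TO", "VB"): 0.34, ("TO", "NN"): 0.021})
emit  = defaultdict(float, {("TO", "to"): 0.9, ("VB", "race"): 0.00003, ("NN", "race"): 0.00041})
tags  = ["TO", "VB", "NN"]

def viterbi(words):
    """Return the tag sequence maximizing the product of transition and emission probabilities."""
    # best[t] = (probability, tag sequence) of the best path ending in tag t
    best = {t: (trans[("<s>", t)] * emit[(t, words[0])], [t]) for t in tags}
    for w in words[1:]:
        new_best = {}
        for t in tags:
            prob, path = max(
                (best[prev][0] * trans[(prev, t)] * emit[(t, w)], best[prev][1] + [t])
                for prev in tags)
            new_best[t] = (prob, path)
        best = new_best
    return max(best.values())[1]

print(viterbi(["to", "race"]))   # ['TO', 'VB'] with these toy numbers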
Transformation-based learning
• P(NN|race) = .98
• P(VB|race) = .02
• Change NN to VB when the previous tag is TO
• Types of rules:
  – The preceding (following) word is tagged z
  – The word two before (after) is tagged z
  – One of the two preceding (following) words is tagged z
  – One of the three preceding (following) words is tagged z
  – The preceding word is tagged z and the following word is tagged w
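
As a minimal illustration (a sketch, not the Brill tagger itself), the contextual rule “change NN to VB when the previous tag is TO” can be run over an initially tagged sentence from the earlier example:

# Illustrative sketch of applying one transformation rule to an initial tagging.
def apply_rule(tagged, from_tag="NN", to_tag="VB", prev_tag="TO"):
    """Rewrite from_tag to to_tag wherever the preceding token carries prev_tag."""
    out = list(tagged)
    for i in range(1, len(out)):
        word, tag = out[i]
        if tag == from_tag and out[i - 1][1] == prev_tag:
            out[i] = (word, to_tag)
    return out

# Initial tags use the most likely tag per word (race -> NN, since P(NN|race) = .98);
# the contextual rule then repairs the error after "to".
sentence = [("expected", "VBN"), ("to", "TO"), ("race", "NN"), ("tomorrow", "ADV")]
print(apply_rule(sentence))   # [..., ('to', 'TO'), ('race', 'VB'), ('tomorrow', 'ADV')]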
Confusion matrix
        IN    JJ    NN    NNP   RB    VBD   VBN
IN      -     .2                .7
JJ      .2    -     3.3   2.1   1.7   .2    2.7
NN            8.7   -     .2
NNP     .2    3.3   4.1   -     .2
RB      2.0   .5                -
VBD           .3    .5                -     4.4
VBN           2.8   2.2               2.6   -
Most confusing: NN vs. NNP vs. JJ, VBD vs. VBN vs. JJ
Readings
• J&M Chapters 1, 2, 3, 8
• “What is Computational Linguistics” by
Hans Uszkoreit
http://www.coli.uni-sb.de/~hansu/what_is_cl.html
• Lecture notes #1