Parsing & Parsing Speech
Benjamin Lambert
10/23/09
“MLSP” Group
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing—example
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars, and PHOENIX
• Open research areas & additional topics
What is parsing?
“Parsing is the process of structuring a linear
representation in accordance with a given
grammar….
…“The linear representation may be a sentence, a
computer program, a knitting pattern, a sequence
of geological strata, a piece of music, actions in
ritual behaviors, in short any linear sequence in
which the preceding elements in some way
restrict the next element.”
(From “Parsing Techniques”, Grune and Jacobs, 1990)
“Structure a linear representation”
• “The dog, that I saw, was fast”
• The dog, that I saw, was fast
Why parse speech?
To add constraints to the model
• Specifically, parsing can help us identify:
– Grammatical errors:
• The dog, that I saw, was fast
• *The dogs, that I saw, was fast
– Semantic errors:
• The pulp will be made into newsprint
• *The Pope will be made into newsprint
• (These are all long-distance dependencies, too distant for an n-gram language model)
Why parse speech?
Extract information from the speech
(e.g. dialog system)
Linear representation:
“I’d like to fly from Boston to, um, Pittsburgh on Saturday on
US Airways”
Structured representation:
FLIGHT FRAME
  Origin: Boston
  Destination: Pittsburgh
  Date: Saturday
  Airline: US Airways
  …
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing—example
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars, and PHOENIX
• Open research areas & additional topics
What is a (formal) language?
“A language is a ‘set’ of sentences,
and each sentence is a ‘sequence’ of ‘symbols’…
that is all there is: no meaning, no structure,
either a sentence belongs to a language or it does
not.” (“Parsing Techniques” Grune)
• A linguist would disagree
• We’ll use this definition for formal language
theory only
Example formal languages:
• Binary numbers:
– 0, 1, 10, 11, 100…
• Binary numbers with an odd number of ones:
– 1, 111, 1000, 1011, …
– * 11, 101, 1111,…
• n zeros followed by n ones (0^n 1^n)
– 01, 0011, 000111, 00001111…
– *0, 1, 100, …
• Grammatically correct English:
– “The pope will be made into newsprint”, …
– *“The pope will are made into newsprint”, …
• Semantically correct English (Semantic validity determined by some world
model)
– “The pulp will be made into newsprint”, ….
– *“The Pope will be made into newsprint”, …
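As a concrete illustration of the second example above, here is a minimal sketch of a two-state DFA (tracking the parity of the 1s seen so far) that recognizes binary strings with an odd number of ones. The function name and structure are illustrative, not from the slides.

```python
# Minimal DFA sketch for one of the example languages above: binary strings
# with an odd number of ones. Two states suffice: the parity of 1s seen so far.
def odd_number_of_ones(s):
    state = "even"                    # start state
    for ch in s:
        if ch == "1":
            state = "odd" if state == "even" else "even"
        elif ch != "0":
            return False              # reject anything outside the alphabet {0, 1}
    return state == "odd"             # "odd" is the sole accepting state

assert odd_number_of_ones("1") and odd_number_of_ones("1011")
assert not odd_number_of_ones("11") and not odd_number_of_ones("101")
```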
DFAs
All diagrams of DFAs, NFAs, and PDAs from the Sipser book.
NFAs (and equivalent DFAs)
PDA for: {0^n 1^n | n ≥ 0}
CFG for: {0^n 1^n | n ≥ 0}
• S → A
• A → 0 A 1
• A → ε
• Can convert any CFG to Chomsky Normal Form:
  A → B C
  A → a
A Non-deterministic PDA for {a^i b^j c^k | i, j, k ≥ 0 and i = j or i = k}
Review: Formal Language Theory
(Language class | Written formalism | “Machine” | Chomsky class | Example)
• Regular | Regular expressions | DFA/NFA | Type 3 | 0 1*
• Context-free | Context-free grammars | (Non-deterministic) push-down automaton | Type 2 | 0^n 1^n; programming languages; (natural languages?)
• Context-sensitive | Context-sensitive grammars | Linear-bounded automaton | Type 1 | 0^n 1^n 2^n; (natural languages, mildly)
• Unrestricted? | (Perl?) | Turing machine? | Type 0 |
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing—example
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars, and PHOENIX
• Open research areas & additional topics
History (see chapter summaries in Hopcroft et al.)
1931 | Kurt Gödel | Incompleteness theorem (see Nagel 2001)
1936 | Alan Turing | Turing machines & undecidability of unrestricted languages
1936 | Church, Kleene, Post | Computability
1955/56 | Huffman, Mealy, Moore | DFAs
1956 | Shannon and McCarthy | NFAs
1956 | S.C. Kleene | Regular expressions
1956 | Chomsky | Context-free & context-sensitive grammars (Chomsky, 1957)
1959/60 | Backus and Naur | CFGs for Fortran & Algol (respectively)
1961/63 | Oettinger / Schützenberger | Push-down automata
1963 | P.C. Fischer | Deterministic PDA
1965 | D.E. Knuth | LR(k) grammars
1967 | D.H. Younger (& Cocke & Kasami?) | CYK parsing algorithm
1967 | Charles Fillmore | “Case” theory of linguistics (Fillmore, 1967)
1971/72 | S.A. Cook & R.M. Karp | NP-completeness
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing—example
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars, and PHOENIX
• Open research areas & additional topics
Parsing Context-Free Grammars
• Want to retrieve the structure, not just
accept/reject
• Parsing
– CYK parsing algorithm (not PDA-based) (1967)
• Example
– LR parsing (Knuth, 1965): most like a PDA
– GLR parsing: LR parsing that allows ambiguity by simulating a non-deterministic PDA (Tomita, 1984)
Parsing CFGs, Goal:
Given a grammar:
S → NP VP
NP → (Det) N
VP → V (NP)
N → pope
V → ran
Det → the
And, an input string:
“The pope ran.”
Find the structure that the grammar describes:
[Parse tree diagram for “The pope ran.”]
How to parse?
• CYK (Cocke-Younger-Kasami ~1965):
– Convert to CNF (Chomsky Normal form)
– Really simple algorithm: O(n^3) (a minimal sketch follows after this slide)
– Bottom-up, so we use “substitution” rules in reverse
• In practice:
– We don’t want to convert to CNF
– Can do much faster on the average case
– Other algorithms (Earley, chart-parsing, LR parsing)
are similar, but tedious to go through
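A minimal CYK recognizer sketch, assuming the grammar is already in Chomsky Normal Form. The rule lists and the tiny toy grammar below are illustrative, not from the slides; the chart maps each span to the set of nonterminals that derive it.

```python
# Minimal CYK recognizer sketch (assumes the grammar is already in CNF).
from collections import defaultdict

def cyk_parse(words, unary_rules, binary_rules):
    """unary_rules:  list of (A, terminal) pairs, i.e. A -> a
       binary_rules: list of (A, B, C) triples, i.e. A -> B C
       Returns the chart: chart[(i, j)] = nonterminals deriving words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    # Length-1 spans: apply A -> a rules.
    for i, w in enumerate(words):
        for lhs, terminal in unary_rules:
            if terminal == w:
                chart[(i, i + 1)].add(lhs)
    # Longer spans, bottom-up: apply A -> B C rules over every split point k.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for lhs, b, c in binary_rules:
                    if b in chart[(i, k)] and c in chart[(k, j)]:
                        chart[(i, j)].add(lhs)
    return chart

# Toy usage on the earlier slide's grammar, flattened into CNF-style rules
# (the optional constituents become extra lexical entries for NP and VP):
unary = [("Det", "the"), ("N", "pope"), ("V", "ran"), ("NP", "pope"), ("VP", "ran")]
binary = [("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP")]
chart = cyk_parse(["the", "pope", "ran"], unary, binary)
print("S" in chart[(0, 3)])  # True: the whole string parses as S
```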
CYK algorithm
• Bottom-up
• Chart-based
CYK algorithm
From Alon Lavie’s lecture slides on CYK parsing from CMU’s 11-711 course
CYK Algorithm Example
[CYK chart for the input “b a a b a”: rows indexed by substring length (1 to 5), columns by start position (1 to 5)]
Running time
• CYK: O(n^3)
• Most reasonable algorithms, worst case: O(n^3)
– (CYK isn’t “reasonable”)
• Current fastest: O(n^2.376), by reduction to matrix multiplication
• (Some grammars much faster with other
algorithms, e.g. LR-parsing in linear time).
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing—example
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars, and PHOENIX
• Open research areas & additional topics
LR Parsing
• Shift-reduce parsing
• Simulates a push-down automaton
• Developed by Knuth in the mid-1960s for programming-language compilers
• Only works for (mostly) unambiguous
grammars (i.e. LR(k) grammars)
• Very fast—linear time parsing
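To make the shift/reduce idea concrete, here is a toy shift-reduce loop over the earlier noun-phrase grammar. It greedily reduces whenever some rule’s right-hand side sits on top of the stack, which is not how a real table-driven LR(k) parser resolves shift/reduce decisions; it only illustrates the two moves. The grammar and lexicon are illustrative.

```python
# Toy shift-reduce loop illustrating the two LR moves (shift and reduce).
# This greedy version is NOT a table-driven LR(k) parser: it reduces whenever
# some rule's right-hand side appears on top of the stack, which only works
# when that never causes a misparse (rule order matters here).
GRAMMAR = [
    ("NP", ("Det", "N")),   # tried before the unary rules below
    ("VP", ("V", "NP")),
    ("S",  ("NP", "VP")),
    ("VP", ("V",)),
    ("NP", ("N",)),
]
LEXICON = {"the": "Det", "pope": "N", "ran": "V"}

def shift_reduce(words):
    stack, buffer = [], [LEXICON[w] for w in words]
    while buffer or len(stack) > 1:
        for lhs, rhs in GRAMMAR:
            if len(stack) >= len(rhs) and tuple(stack[-len(rhs):]) == rhs:
                stack[-len(rhs):] = [lhs]       # reduce
                break
        else:
            if not buffer:
                return False                    # stuck: cannot shift or reduce
            stack.append(buffer.pop(0))         # shift
    return stack == ["S"]

print(shift_reduce(["the", "pope", "ran"]))     # True
```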
A Simple Arithmetic Grammar
From Grune and Jacobs
Non-deterministic LR Parsing Table
From Grune and Jacobs
Deterministic LR Parsing Table
From Grune and Jacobs
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing—example
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars, and PHOENIX
• Open research areas & additional topics
GLR Parsing
• Just like LR parsing, but allows ambiguity, and
simulates non-determinism when there is
ambiguity
• Masaru Tomita (1984)
– Co-founder of LTI
GLR Parsing
From Tomita’s 1985 IJCAI paper
GLR Parse forest
From Tomita’s 1985 IJCAI paper
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing—example
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars, and PHOENIX
• Open research areas & additional topics
Challenges in Parsing Speech
• Disfluencies:
– Uh, um
– Interruptions/restarts (“I want to drive… I want to fly to Pittsburgh”; “Boston, no I mean, Maui”)
• Incomplete sentences: “from Boston to Pittsburgh”
• Non-sentences: “Hello.”
• Ellipsis: “I want to fly from Boston <silence> Pittsburgh”
• Segments with poor acoustics/LM: “I want to fly from Boston purple monkey dishwasher Pittsburgh”
• No <s> </s> markers: “I want to fly to Pittsburgh she wants to fly to Maui”
• Two examples:
– GLR*
– PHOENIX
GLR*
• Alon Lavie (1993-1996)
– His PhD thesis at CMU, under Tomita
• Conceptually the same as GLR, plus:
– Can parse multiple simultaneous trees, i.e. more
than one complete “S” tree
– Can “skip” words in the input
• These create lots of additional ambiguity, so GLR* adds heuristics
How to skip filler words?
From Grune and Jacobs
GLR* parsing of speech
• Allows skips, allows multiple S-nodes
GLR* (1993)
• Multiple S’s and skip edges blow up the search
space, so do a beam search with heuristics:
– Number of words skipped
– Fragmentation of the parse (number of S-nodes)
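A rough sketch of the beam-pruning idea (the data structures and field names below are hypothetical, not Lavie’s actual implementation): rank competing parse hypotheses by the two heuristics above, fewest skipped words first, then least fragmentation.

```python
# Hypothetical ranking of GLR*-style parse hypotheses by the two heuristics
# above: number of input words skipped, then fragmentation (number of S nodes).
from dataclasses import dataclass, field

@dataclass
class ParseHypothesis:
    s_trees: list = field(default_factory=list)  # one entry per top-level S node
    skipped_words: int = 0                       # input words not covered by any tree

def prune_beam(hypotheses, beam_size=10):
    # Lexicographic preference: fewer skipped words, then fewer S fragments.
    ordered = sorted(hypotheses, key=lambda h: (h.skipped_words, len(h.s_trees)))
    return ordered[:beam_size]
```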
How can we use GLR* in ASR?
• Add parsability constraints
• Search through the n-best list looking for hypotheses that parse:
1. THOUGH THE GAME HAD AROUND THE SCIENCE AND THE NINETEEN
FIFTIES IT NEVER REGAINED THE POPULARITY OF ITS GOLD MANAGED
[Sphinx: -29494370 ] [WER: .4211]
2. THOUGH THE GAME HAD AROUND THE SCIENCE AND THE NINETEEN
FIFTIES IT NEVER REGAINED THE POPULARITY OF ITS GOAL NATURE
[Sphinx: -29557043 ] [WER: .4211]
3. THOUGH THE GAME HAD EVER RENAISSANCE AND THE NINETEEN
FIFTIES IT NEVER REGAINED THE POPULARITY OF ITS GOLD MANAGED
[Sphinx: -29571010] [WER: .2105]
…
N?. THOUGH THE GAME HAD A RENAISSANCE IN THE NINETEEN FIFTIES IT
NEVER REGAINED THE POPULARITY OF ITS GOLDEN AGE
Examples from WSJ dataset
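One simple way to apply the parsability constraint, sketched below: walk the recognizer’s n-best list in score order and return the first hypothesis that parses, falling back to the top ASR hypothesis if none does. The `is_parsable` callable is a stand-in for a GLR*-style accept/reject check, not an actual GLR* interface.

```python
# Sketch of parsability-based n-best rescoring. `is_parsable` stands in for a
# GLR*-style accept/reject check; it is not an actual GLR* API.
def rescore_nbest(nbest, is_parsable):
    """nbest: list of (hypothesis_text, asr_score) pairs, best-first by score.
       Returns the best-scoring hypothesis that the parser accepts."""
    for text, _score in nbest:
        if is_parsable(text):
            return text
    return nbest[0][0]   # no hypothesis parsed: keep the top ASR hypothesis
```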
GLR*: pros and cons
• Pros:
– *General* open domain
• Cons:
– Still imposes pretty strict grammatical constraints
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars and PHOENIX
• Alternatives/extensions
• How can we use these?
Parsing in restricted domain ASR
• Two options:
– Make the grammar extremely specific (a
“semantic grammar”)
– Or, forget about trying to do a “complete” parse,
and just look for the information that we know we
need (information extraction) (PHOENIX)
• Most (all?) dialog systems use one or the
other or both (See McTear, 2004)
Semantic grammars
• In a restricted domain, we can (potentially) greatly simplify the
parsing
• Instead of abstract grammatical categories
S → NP VP
NP → Adj* N
• Specific, meaningful, actionable categories:
AddToSpreadsheetCommand → SelectCommand SpreadsheetLocation Text
SelectCommand → “select” | “highlight”
SpreadsheetLocation → …
Semantic Grammar
• Just another grammar, use GLR* or something else to parse
• SOUP parser (Marsal Gavaldà, CMU, ~2000)
– Same GLR*, but with a “semantic grammar”?
– On ATIS-style data (semantic classes: location, time, etc.)
– Uses equivalent Probabilistic Recursive Transition Networks
– “Inspired by PHOENIX”
– Used in the JANUS Speech-to-Speech MT system
• Semantic grammars are not portable/reusable
– A lot of work to create
PHOENIX- Case frame
“I’d like to fly from Boston to, um, Pittsburgh on Saturday on US
Airways”
FLIGHT FRAME
  Origin: Boston
  Destination: Pittsburgh
  Date: Saturday
  Airline: US Airways
  …
PHOENIX (Ward ~1990)
• Before GLR*, etc. but perhaps the most influential in ASR,
ASU, Dialogue systems
– “Set the bar for the ATIS task” (Alex Acero, paraphrased)
• Still used in “Ravenclaw-Olympus” (a CMU dialog system framework; Bohus and Rudnicky)
– Thus used in “Let’s Go”, CMU Communicator, etc.
• Doesn’t attempt to perform a full parse, just to extract the
important bits of information
– Each field is recognized by a “mini” CFG parser (strict, no skips)
– Allows any amount of skipping/noise in between informative
fields
Numerous frames; one grammar per slot
• ATIS 1994 system has 70 grammars. For example:
• Grammar #1:
ORIGIN_CITY → [from | beginning in] [Atlanta | Pittsburgh | Boston | …]
• Grammar #2:
DEPARTURE_TIME → [leaving at | on] TIME_EXPRESSION
TIME_EXPRESSION → [DAY_OF_WEEK]
TIME_EXPRESSION → [DAY_OF_WEEK] [TIME_OF_DAY]
PHOENIX
• “… As slot fillers (semantic phrases) are recognized,
they are added to frames to which they apply. The
algorithm is basically a dynamic programming beam
search on frames. Many different frames, and several
different versions of a frame, are pursued
simultaneously. The score for each frame hypothesis is
the number of words that it accounts for. A file of
words not to be counted in the score is included. At
the end of an utterance the parser picks the best
scoring frame as a result… The output of the parser is
the frame name and the parse trees for the filled slots.”
(Ward and Issar, 1994)
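A minimal sketch of the frame scoring described in the quote above: a frame hypothesis scores the number of input words its filled slots account for, ignoring words on a “do not count” stop list. The function and field names are illustrative, not the actual PHOENIX code.

```python
# Minimal sketch of PHOENIX-style frame scoring as described above
# (Ward and Issar, 1994). Names are illustrative.
def score_frame(filled_slots, stoplist=frozenset()):
    """filled_slots: dict mapping slot name -> list of words matched by that
       slot's mini-CFG. Returns the number of counted words it accounts for."""
    return sum(
        1
        for words in filled_slots.values()
        for w in words
        if w.lower() not in stoplist
    )

# Toy usage for the flight example on the earlier slide:
hypothesis = {
    "Origin": ["from", "Boston"],
    "Destination": ["to", "Pittsburgh"],
    "Date": ["on", "Saturday"],
    "Airline": ["on", "US", "Airways"],
}
print(score_frame(hypothesis, stoplist={"from", "to", "on"}))  # 5
```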
Applications
• Dialogue systems
– Frames are used to represent the actions the system can take
– E.g. robot voice commands (TeamTalk)
MOVE FRAME
  DIRECTION: <N, E, S, W>
  DISTANCE
– Most (?) dialogue systems require very specific/constrained semantic
grammars to achieve acceptable performance (?)
– Semantic grammars are not portable
• Speech-to-speech Machine Translation
– Frame (?) as target in Interlingua (a shallow semantic representation)
(Levin et al. 2000)
Outline
• What is parsing? Why parse?
• Review: Formal Language Theory
– History
• Parsing
– CYK parsing—example
– LR parsing (concept)
– GLR parsing (concept)
• Parsing speech
– Open domain: GLR*
– Restricted domain: Semantic grammars, and PHOENIX
• Open research areas & additional topics
Open research areas
• Using non-domain-specific frame-like
constraints for open-domain ASR
– E.g. “turn X into Y” frame
• X=pulp, Y=newsprint
• X=Pope, Y=newsprint
• “Reusable” semantic grammars?
– “Re-use” the grammar
– But substitute new semantics for a new
application
Additional topics in parsing
• Alternative algorithms:
– Top-down parsing (faster, but probably not good for
speech)
• Probabilistic CFGs
• Mildly context-sensitive grammars?
• Dependency parsing, “chunking”
• Feature unification grammars (e.g. LFG)
– Attaches additional constraints to each grammar rule
• Other constraints (Scone)
References
Dick Grune and Ceriel J.H. Jacobs, Parsing Techniques – A Practical Guide, Ellis Horwood, Chichester, England, 1990.
Michael Sipser, Introduction to the Theory of Computation,
Course Technology, 2005.
John Hopcroft, Rajeev Motwani, and Jeffrey Ullman, Introduction to Automata Theory, Languages, and Computation, 2nd ed., Addison Wesley, 2000.
Ernest Nagel and James Newman, Gödel’s Proof, NYU Press, 2001.
Charles Fillmore, “The Case for Case,” in April 1967 Texas
Symposium On Linguistic Universals.
Noam Chomsky, Syntactic Structures, 1957.
References (2)
Masaru Tomita, “An Efficient Context-free Parsing Algorithm for Natural Languages,” in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 1985, pp. 756-764.
Alon Lavie and Masaru Tomita, “GLR* - An Efficient Noise-skipping Parsing Algorithm for Context-Free Grammars,” in Proceedings of the Third International Workshop on Parsing Technologies, 1993.
Marsal Gavaldà, “SOUP: A Parser for Real-world Spontaneous Speech,” in New Developments in Parsing Technology, Kluwer Academic Publishers, Norwell, MA, 2004.
Michael McTear, Spoken Dialogue Technology: Toward the
Conversational User Interface, Springer, 2004.
References (3)
Wayne Ward, “Understanding spontaneous speech: the Phoenix
system.” ICASSP, 1991.
Wayne Ward, Sunil Issar, “Recent Improvements in the CMU
Spoken Language Understanding System” In HLT, 1994.
Lori Levin, Alon Lavie, Monika Woszczyna, Donna Gates, Marsal Gavaldà, Detlef Koll, and Alex Waibel, “The Janus-III Translation System: Speech-to-Speech Translation in Multiple Domains,” Machine Translation, Volume 15, Issue 1/2, June 2000.
End of slides.
Extra slides…
Tree-Adjoining Grammar
Non-robust parsers
From Lavie and Rose’s LCFLEX paper (?)
“Robust” parsers
From Lavie and Rose’s LCFLEX paper (?)
More speech parsing:
• LCFLEX (Rose and Lavie, ~2000): GLR* but faster
– Uses a “left-corner” algorithm, a combination of top-down and bottom-up parsing
• Other case-frame parsers? (80’s - 90’s)
– MINDS
– DYPAR
• For additional references, see my thesis
proposal (forthcoming)
Turing machine for {0^(n²)}