Introduction

Verónica Dahl
Professor, Comp. Sci. Dept
Simon Fraser University, Canada
Marie Curie Chair of Excellence, Universidad de Tarragona, Spain
My scientific dream
Bridge the gap between formal and empirical approaches
Pluridisciplinarity: linguistics, logic, computing sciences, AI, cognitive sciences, internet, molecular biology
--> Cognitive theories and methodologies.
March 4: Intro to Natural Language Processing
“Understanding” language
Levels of Language Processing
Recognition, Analysis, Synthesis
Main problems: ambiguity, long-distance dependencies, elided constituents, contextual and world knowledge, intentions, presuppositions.
Processing language through Prolog and DCGs. Examples.
Levels of Language Processing
Phonetic
Lexical
Syntactic
Semantic
Pragmatic
Main Types of Language Processing
Recognition
Analysis (parsing): by far the most studied
Synthesis (generation)
Single sentence vs. discourse
Main problems
Ambiguity:
Lexical (part of speech):
The overflown bank
Teacher strikes idle kids
Syntactic (structural):
Time flies like an arrow
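To make the structural ambiguity concrete, here is a minimal sketch in Prolog's DCG notation (the notation is introduced in the DCG slides of this deck). The grammar, its category names and the tree terms are illustrative assumptions, not taken from the original slides. It covers only two readings: "time" as subject of the verb "flies", and "time flies" as a noun compound that is the subject of the verb "like".

```prolog
% Toy grammar (hypothetical): every nonterminal carries a parse tree.
s(s(NP,VP))    --> np(NP), vp(VP).

np(np(N))      --> n(N).              % "time"
np(np(N1,N2))  --> n(N1), n(N2).      % noun compound: "time flies"
np(np(D,N))    --> d(D), n(N).        % "an arrow"

vp(vp(V,PP))   --> v(V), pp(PP).      % "flies like an arrow"
vp(vp(V,NP))   --> v(V), np(NP).      % "like an arrow"

pp(pp(P,NP))   --> p(P), np(NP).

% Lexicon: "flies" and "like" each belong to two categories.
d(d(an))       --> [an].
n(n(time))     --> [time].
n(n(flies))    --> [flies].
n(n(arrow))    --> [arrow].
v(v(flies))    --> [flies].
v(v(like))     --> [like].
p(p(like))     --> [like].

% ?- phrase(s(T), [time,flies,like,an,arrow]).
% T = s(np(n(time)), vp(v(flies), pp(p(like), np(d(an), n(arrow))))) ;
% T = s(np(n(time), n(flies)), vp(v(like), np(d(an), n(arrow)))).
```

The same word list yields two trees, which is exactly what makes the sentence syntactically ambiguous; the lexical ambiguity of "flies" and "like" is what opens the door to the second tree.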
Main problems: long-distance dependencies
How to relate constituents that can be arbitrarily far apart, e.g. in topicalization (__ marks the position the topic is understood to fill):
Logic, we love __.
Logic, he thinks we love __.
Logic, they pretended they did not know we love __.
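One common way to handle such dependencies in a DCG is gap threading: each nonterminal carries a pair of extra arguments recording whether a displaced constituent is still waiting for its position. The sketch below is an illustrative assumption, not the author's grammar; the nonterminal names (topicalized, s, vp, obj, np_subj, tv, sv) and the gap(_)/nogap terms are hypothetical, and agreement is ignored.

```prolog
% topicalized: a fronted NP, a comma, then a sentence missing that NP.
topicalized --> np(Topic), [','], s(gap(Topic), nogap).

% s(G0, G): G0 is the gap state before the sentence, G the state after it.
s(G0, G)  --> np_subj, vp(G0, G).

vp(G0, G) --> tv, obj(G0, G).      % transitive verb plus object
vp(G0, G) --> sv, s(G0, G).        % sentence-embedding verb: passes the gap down

obj(gap(_), nogap) --> [].         % the object is the displaced topic
obj(G, G)          --> np(_).      % ordinary overt object

% Lexicon (toy)
np(logic) --> [logic].
np_subj   --> [we].
np_subj   --> [he].
np_subj   --> [they].
tv        --> [love].
sv        --> [thinks].
sv        --> [pretended].
sv        --> [did, not, know].

% ?- phrase(topicalized, [logic, ',', we, love]).                 % succeeds
% ?- phrase(topicalized, [logic, ',', he, thinks, we, love]).     % succeeds
% ?- phrase(topicalized, [logic, ',', they, pretended, they, did, not, know, we, love]).  % succeeds
% ?- phrase(topicalized, [logic, ',', we, love, logic]).          % fails: no gap left for the topic
```

Because the gap is simply handed from each clause to the one embedded in it, the distance between "Logic" and the object position of "love" can grow without any change to the rules.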
Main problems: long-distance dependencies
Anaphora (e.g. relating a pronoun to its antecedent or postcedent)
Mona Lisa smiled. Leonardo painted her smile.
Ann smiled. Lissa frowned. Tom photographed her.
Near her, Alice saw a Bread-and-Butterfly.
Main problems: elided constituents
- How to guess material left implicit
The workshop was interesting and the talks [were] inspiring.
Flexibility needed. Bottom-up approaches with some kind of constraint reasoning have helped (e.g. Datalog with constraints upon word boundaries).
Main problems: idioms, contextual and world knowledge, intentions
He kicked the bucket.
It’s too warm in here.
Je m’appelle Monsieur Leblanc. Et vous? (My name is Mr. Leblanc. And you?)
Nous aussi. (Us too.)
Main problems: presuppositions
The king of France is bald
How many students learnt Mirandes last year?
Basic Prolog tools for NLP: DCGs
An example: two word sentences
We’d like context-free (CF) style grammar rules, e.g.:
sentence --> [Word1],[Word2].
We can have a similar plain Prolog rule:
sentence(Input,Output) :- find_in(Input,W1,Rest), find_in(Rest,W2,Output).
find_in([Word|Words],Word,Words).
N.B. find_in/3 is a primitive in classic Prolog systems, where it is called 'C'/3.
Sample query for recognizing a sentence:
?- sentence([it,rains],[]).
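The plain Prolog version can be loaded and tried as is. The sketch below only renames the unused word variables with a leading underscore, to avoid singleton warnings, and adds a few sample queries; the extra queries are illustrations, not part of the original slide.

```prolog
% find_in(List, Word, Rest): Word is the first element of List,
% Rest is what remains after it.
find_in([Word|Words], Word, Words).

% A sentence is any two words; Output is whatever follows them.
sentence(Input, Output) :-
    find_in(Input, _Word1, Rest),
    find_in(Rest, _Word2, Output).

% ?- sentence([it,rains], []).          % succeeds: exactly two words
% ?- sentence([it], []).                % fails: only one word
% ?- sentence([it,rains,today], Rest).  % Rest = [today]
```

The pair of list arguments acts as a cursor over the input: each call consumes a prefix and hands the remainder to the next call, which is the mechanism that DCG notation hides.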
DCGs compile into Prolog:
E.g.
s --> np, vp.
compiles into:
s(Input,Output) :- np(Input,Rest), vp(Rest,Output).
Where words are involved, we get calls to 'C'/3:
E.g.
noun --> [moon].
compiles into:
noun(Input,Output):- 'C'(Input,moon,Output).
So calls to DCG nonterminals still require the two added arguments:
?- s([the,moon,shines],[]).
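Most Prolog systems also provide phrase/2 and phrase/3, which supply the two list arguments for you. The tiny grammar below is a hypothetical stand-in chosen so that the sample sentence parses; its rules are assumptions made for illustration.

```prolog
% Hypothetical mini grammar, just large enough for "the moon shines".
s    --> np, vp.
np   --> [the], noun.
noun --> [moon].
vp   --> [shines].

% Calling the translated predicate directly, with both list arguments:
% ?- s([the,moon,shines], []).                        % succeeds
%
% The same query through phrase/2, which adds the arguments itself:
% ?- phrase(s, [the,moon,shines]).                    % succeeds
%
% phrase/3 exposes the unconsumed remainder of the input:
% ?- phrase(s, [the,moon,shines,brightly], Rest).     % Rest = [brightly]
```

Using phrase/2 keeps queries independent of how a particular system translates terminals, which matters because not every system compiles them into calls to 'C'/3.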
A first CF grammar in DCG
% Syntax
s --> np, vp.
np --> d, n.
np --> name.
vp --> iv.
vp --> tv, np.
vp --> bv, np, pp.
pp --> p, np.
CF grammar in DCG (cont.)
% Lexicon
d --> [the].
n --> [sun].
n --> [moon].
n --> [world].
name --> [gaia].
name --> [helios].
iv --> [shines].
tv --> [reflects].
tv --> [illuminates].
bv --> [reflects].
p --> [upon].
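Putting the syntax and lexicon rules from these two slides into one file gives a grammar that can be queried directly. The rules below are the ones from the slides; the sample queries and the use of phrase/2 are additions for illustration.

```prolog
% Syntax
s  --> np, vp.
np --> d, n.
np --> name.
vp --> iv.
vp --> tv, np.
vp --> bv, np, pp.
pp --> p, np.

% Lexicon
d    --> [the].
n    --> [sun].
n    --> [moon].
n    --> [world].
name --> [gaia].
name --> [helios].
iv   --> [shines].
tv   --> [reflects].
tv   --> [illuminates].
bv   --> [reflects].
p    --> [upon].

% Recognition:
% ?- phrase(s, [the,moon,reflects,the,sun]).            % succeeds (tv rule)
% ?- phrase(s, [gaia,reflects,the,sun,upon,the,moon]).  % succeeds (bv rule)
% ?- phrase(s, [the,shines]).                           % fails: "the" needs a noun
%
% Synthesis: leave the word list unbound and let Prolog enumerate sentences.
% ?- phrase(s, Sentence).
% Sentence = [the,sun,shines] ;
% ...
```

Since no rule is recursive, the grammar describes a finite set of sentences, so the generation query terminates after enumerating all of them on backtracking.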