Natural Language Processing
  • what it does
  • what is involved
  • why is it difficult
  • brief history

  sentence → structured representations of meaning

  "how old is my help3.doc file?"
    Lisp: (query (file-detail 'date "C:/help3.doc"))

  "the large cat chased the rat"
    Logic: (1s1 large(s1) feline(s1)) (1s2 rodent(s2)) chased(s1, s2)

  "the young boy ate a bad apple"
    CD graph ...see next page...

CD graph
  "the young boy ate a bad apple"
  [figure: CD graph]

what is involved
  symbolic computation
    ie: symbols manipulated by symbol processors
  search & inference
  knowledge representation techniques

why is it difficult
  prejudice, politics, etc
  ambiguity...
  • syntactic
  • semantic
  • pragmatic

example sentences
  • the old man the boats
  • my car drinks petrol
  • I saw the Eiffel Tower flying to Paris
  • he opened the door with the key
    he opened the door with the squeaking hinge
  • the boy kicked the ball under the tree
    the boy kicked the wall under the tree
  • put the bottles in the box on the shelf by the door

(brief) history of language processing
  1950s   Russian → English translation
  1956    Chomsky
  1960s   pattern matching (matching: Sir, Student, Eliza)
  1970s   parsing & some knowledge representation
  1980s   knowledge & inference
  1990s   big dreams, small results
  2000+   quietly promising

a modern approach
  input sentence
    → morphological processing
    → syntax analysis (parsing)
    → semantic analysis
    → pragmatic analysis
    → target representation
  uses: lexicon, grammar, semantic rules, contextual information

step 1 - morphological processing
  objective: strip words into roots & modifiers
  issues
  • inflection   (cat pl → cat-s)
  • derivation   (happy adj → happiness noun)
  • compounding  (toothpaste)

morphological processing - notes
  • all(?)
    spoken languages exhibit morphology
  • easier to handle in written languages if not iconic
  • some morphology describes information beyond syntax, eg:
      proximity (Tamil, Setswana, etc)
      case
      speaker / listener peer relationship

morphology examples

  Noun + Suffix         Syntactic case        Meaning
  Chennai-ukku          dative: destination   To Madras
  Chennai-ukku-irundu   dative: source        From Madras
  Chennai-le            containment           In Madras
  Chennai-ai            object (formal)       Madras
  Fig. 2.2. Suffix Attachment for Noun Cases (Tamil, author's spelling)

  Proximity   Time                         Things (inanimate)
  Near        i-ppa (this time: now)       i-ndtha (this thing: this)
  Far         a-ppa (that time: then)      a-ndtha (that thing: that)
  Question    e-ppa (what time: when)      e-ndtha (what thing: which)
  Fig. 2.3. Proximity Information as Prefix Tags (Tamil)

  Proximity       Cow                     Student
  Near Speaker    kgomo e (this cow)      mo-ithuti yo (this student)
  Near Listener   kgomo e-o (that cow)    mo-ithuti yo-o (that student)
  Far             kgomo e-le (that cow)   mo-ithuti yo-le (that student)
  Fig. 2.4. Proximity by Demonstrative Pronoun Inflection (Setswana)

step 2 - syntax analysis
  objectives:
    1 check for correctness
    2 produce phrase structure
  uses
  • parser    a rule-based search engine
  • grammar   context-free production rules
  • lexicon   dictionary of words & their categories

syntax rules
  parts of speech
  rules of combination
  consider
  • the cat chases the mouse
  • all large black dogs chase cats

example 1 - using Lkit
  (build-lexicon
    '((a      determiner)
      (cat    noun)
      (dog    noun)
      (the    determiner)
      (chased verb)))

  (build-grammar
    '((s1 (sentence    -> noun-phrase verb-phrase))
      (np (noun-phrase -> determiner noun))
      (vp (verb-phrase -> verb noun-phrase))))

example 1 - output
  (parse 'sentence '(the dog chased a cat))
  complete-edge 0 5 s1 sentence (the dog ...) nil
  s1 sentence -> (noun-phrase verb-phrase)
  Syntax
    (sentence (noun-phrase (determiner the) (noun dog))
              (verb-phrase (verb chased)
                           (noun-phrase (determiner a) (noun cat))))
  Semantics
    (sentence)

so what?
  we want meaning
  Remember: "the young boy ate a bad apple"
  how can semantics be encoded as symbols?
    the boy / an apple?
    young/old, happy/sad, good/bad?
  how can semantics be generated?
  what can be inferred from semantics?

symbolic representation of semantics
  Reminder: "the young boy ate a bad apple"
  (actor  (root boy) (id boy#732)
          (tags animate human male)
          (qual (age (val 5) (approx 3)))
          (quant specific))
  (action (primitive INGEST))
  (object (root apple) (id nil)
          (tags physob veg fruit food)
          (qual (phy-state -4))
          (quant non-specific))

semantics in lexicon
  a simple example
  (build-lexicon
    '((a      det  any)
      (cat    noun feline)
      (chased verb hunts)
      (dog    noun canine)
      (the    det  specific)))

semantics in grammar rules
  (s1 (sentence -> noun-phrase verb-phrase)
      (actor  . noun-phrase)
      (action . verb-phrase.action)
      (object . verb-phrase.object))
  (np (noun-phrase -> det noun)
      (det . noun))
  (vp (verb-phrase -> verb noun-phrase)
      (action . verb)
      (object . noun-phrase))

semantics - results
  (parse 'sentence '(the dog chased a cat))
  complete-edge 0 5 s1 sentence (the dog...) nil
  s1 sentence -> (noun-phrase verb-phrase)
  Syntax
    (sentence (noun-phrase (det the) (noun dog))
              (verb-phrase (verb chased)
                           (noun-phrase (det a) (noun cat))))
  Semantics
    (sentence (actor (specific canine))
              (action hunts)
              (object (any feline)))

semantics in lexicon - checks 1
  (a      det  (sems . any))
  (all    det  (sems . every))
  (cat    noun (sems . feline)   (num . sing))
  (cats   noun (sems . feline)   (num . plur))
  (chase  verb (sems . hunts)    (num . plur))
  (chases verb (sems . hunts)    (num . sing))
  (dog    noun (sems . canine)   (num . sing))
  (dogs   noun (sems . canine)   (num . plur))
  (the    det  (sems . specific))

semantics in grammar - checks 1
  (s1 (sentence -> noun-phrase verb-phrase)
      (actor  . noun-phrase.sems)
      (action . verb-phrase.action)
      (object .
              verb-phrase.object)
      ; check number of noun-phrase & verb-phrase
      (if (noun-phrase.number = verb-phrase.number)
          numeric-agreement-ok
          numeric-agreement-bad))

semantics - results
  (parse 'sentence '(the dog chases a cat))
  complete-edge 0 5 s1 sentence (the dog...) nil
  s1 sentence -> (noun-phrase verb-phrase)
  Syntax
    (sentence (noun-phrase (det the) (noun dog))
              (verb-phrase (verb chases)
                           (noun-phrase (det a) (noun cat))))
  Semantics
    (sentence (actor specific canine)
              (action . hunts)
              (object any feline)
              numeric-agreement-ok)

semantics - results
  (parse 'sentence '(the dogs chases a cat))
  complete-edge 0 5 s1 sentence (the dogs...) nil
  s1 sentence -> (noun-phrase verb-phrase)
  Syntax
    (sentence (noun-phrase (det the) (noun dogs))
              (verb-phrase (verb chases)
                           (noun-phrase (det a) (noun cat))))
  Semantics
    (sentence (actor specific canine)
              (action . hunts)
              (object any feline)
              numeric-agreement-bad)

semantics in grammar - checks 2
  (s1 (sentence -> noun-phrase verb-phrase)
      (fail if noun-phrase.number /= verb-phrase.number)
      (actor  . noun-phrase.sems)
      (action . verb-phrase.action)
      (object . verb-phrase.object))

semantics - results
  (parse 'sentence '(the dog chases a cat))
  Semantics
    (sentence (actor specific canine)
              (action . hunts)
              (object any feline))

  (parse 'sentence '(the dogs chases a cat))
  .... failed ....

semantics in grammar - checks 3
  (s1 (sentence -> noun-phrase verb-phrase)
      (glitch numeric-agreement
              if not noun-phrase.number = verb-phrase.number)
      (actor  . noun-phrase.sems)
      (action . verb-phrase.action)
      (object . verb-phrase.object))

semantics - results
  (parse 'sentence '(the dogs chases a cat))
  complete-edge 0 5 s1 sentence (the dogs...) nil
  Glitches: (numeric-agreement)
  s1 sentence -> (noun-phrase verb-phrase)
  Syntax
    (sentence (noun-phrase (det the) (noun dogs))
              (verb-phrase (verb chases)
                           (noun-phrase (det a) (noun cat))))
  Semantics
    (sentence (actor specific canine)
              (action .
                      hunts)
              (object any feline))

example 2 - lexicon
  (a     det  any)
  (cat   noun feline)
  (chase verb hunts)
  (dog   noun canine)
  (the   det  specific)
  (black adj  (color black))
  (large adj  (size 7/10))
  (small adj  (size 3/10))

example 2 - grammar
  (build-grammar
    '((np (noun-phrase -> ?det *adj noun)
          (if det
              (quantification . det)
              (quantification undefined))
          (qualifiers . *.adj)
          (object . noun))))

example 2 - results
  (parse 'noun-phrase '(small black dog))
  complete-edge 0 3 np noun-phrase (small...) nil
  np noun-phrase -> (?det *adj noun)
  Syntax
    (noun-phrase (adj small) (adj black) (noun dog))
  Semantics
    (noun-phrase (quantification undefined)
                 (qualifiers ((size . 3/10)) ((color . black)))
                 (object canine))

example 2 - results
  small dogs chase the small cats and large dogs chase the large cats
  (sentence conjunction
    ((actor  (quant undefined)
             (qual (size . 3/10))
             (object . canine))
     (action . hunts)
     (object (quant . specific)
             (qual (size . 3/10))
             (object . feline)))
    ((actor  (quant undefined)
             (qual (size . 7/10))
             (object . canine))
     (action . hunts)
     (object (quant . specific)
             (qual (size . 7/10))
             (object . feline))))

semantic processing (one approach)
  • semantic rules in grammar → 1st stage case frame
  • verb form → primitive action case frame
  • disambiguate & fill additional case frame slots
  • check references with world and/or dialog
  • do statement level inference
  • integrate with dialog
  • do event sequence dialog

step-1: produce raw case frame
  • verb cases
      the cat chased the rat in the kitchen
      the cat chased the rat into the kitchen
  • common cases
      source, destination, location
      start-time, end-time, duration
      instrument, beneficiary

the ambiguity problem
  eg: the boy kicked the ball under the tree
  grammar rules
    S  -> S PP
    S  -> NP VP
    NP -> ?det *adj noun
    NP -> NP PP

example frame #1
  actor   (quant specific)
          (tags animate male human)
          (qual (age (range 3 13)))
          (root boy)
  action  (root kick)
  object  (root ball)
          (tags manip)
          (posn-relative (locator beneath)
                         (object (root tree) ...etc...
          )

example frame #2
  actor   (quant specific)
          (tags animate male human)
          (qual (age (range 3 13)))
          (root boy)
  action  (root kick)
  object  (root ball)
          (tags manip)
  dest    (posn-relative (locator beneath)
                         (object (root tree) ...etc... )

example verb form #1
  primitive    strike
  prohibited   object (tags manip)
  slots        instrument (part-of $actor foot)
  legal        start-time, end-time, duration,
               instrument, beneficiary, location
  illegal      source, dest

example verb form #2
  primitive    push
  required     object (tags manip)
  slots        instrument (part-of $actor foot)
  legal        source, dest, start-time, end-time,
               instrument, beneficiary, location, duration

semantic processing (one approach)
  × semantic rules in grammar → 1st stage case frame
  × verb form → primitive action case frame
  × disambiguate & fill additional case frame slots
    check references with world and/or dialog
    do statement level inference
  • integrate with dialog
  • do event sequence dialog

integration with dialog
  dialogs have...
  • players (actors)
  • props (objects)
  • locations (from case frames)
  • themes (derived)
  • event sequences (from themes)
  • plans (from themes and/or derived)

event sequence
  set of...
  • players (actors)
  • props (objects)
  series of...
  • semantically encoded activities (matched)
  • escapes, exceptions & alternatives

reading – grammars, etc
  A good source of links & references...
    "Computational Analysis of Prepositions"
    http://knol.google.com/k/abdul-baqisharaf/computational-analysis-of-prepositions/3hc3uny2z7r41/4#

  if you only plan to read one article...
    Baldwin, T., Kordoni, V. and Villavicencio, A. 2009. Prepositions in Applications: A Survey and Introduction to the Special Issue. Computational Linguistics 35 (2): 119–149.

  also...
    Litkowski, Kenneth C. and Orin Hargraves. 2007. SemEval-2007 task 06: Word-sense disambiguation of prepositions. In Proceedings of the 4th International Workshop on Semantic Evaluations, pages 24–29, Prague.

    Disambiguation of Preposition Sense Using Linguistically Motivated Features, Stephen Tratz and Dirk Hovy.
    Proceedings of the NAACL HLT Student Research Workshop and Doctoral Consortium, pages 96–100, Boulder, Colorado, June 2009. © 2009 Association for Computational Linguistics

reading – grammars, etc
  the NLP dictionary:
    www.cse.unsw.edu.au/~billw/nlpdict.html

  for practical help with building grammars check the following (it is about 10 years old but then so is the English language :o)
    A Grammar Writer's Cookbook. Miriam Butt, Tracy Holloway King, María-Eugenia Niño and Frédérique Segond

  also (for writing larger grammars) it is useful to find a book on grammar for tutors and/or students of English as a second language.

  for a broad (if a little formal) take on semantics try dipping into...
    Semantics-Oriented Natural Language Processing: Mathematical Models and Algorithms. Vladimir A. Fomichov. 2010

reading – kn rep for NLP
  logic and knowledge representation – a guide
    http://dspace.dsto.defence.gov.au/dspace/bitstream/1947/9996/1/DSTO-TR2324%20PR.pdf

  representing events for NLP
    http://www.google.co.uk/url?sa=t&rct=j&q=knowledge%20representation%20%22representing%20events%22&source=web&cd=6&sqi=2&ved=0CEgQFjAF&url=http%3A%2F%2Fwww.aaai.org%2Focs%2Findex.php%2FFSS%2FFSS10%2Fpaper%2Fdownload%2F2183%2F2819&ei=f6oWT_e7DeKC4gTMpaijBA&usg=AFQjCNFYmurwJR9oqfCRBimVprWRK45kew&cad=rja

  semantic networks & frames (2005)
    http://www.cs.bham.ac.uk/~jxb/IAI/w6.pdf

  VERL: An Ontology Framework for Video Events (2005)
    http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1524892
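
appendix sketch - lexicon + grammar → phrase structure
  A minimal sketch of the step 2 idea from example 1 (a lexicon plus context-free production rules, searched to produce a phrase structure). It is written in plain Python, not Lkit, and makes no claim about how Lkit works internally. The names LEXICON, GRAMMAR, parse_category, parse_sequence and parse are hypothetical, and the simple top-down search is an assumption chosen for brevity. Only the words and rules are taken from example 1.

```python
# Illustrative stand-in only: NOT Lkit. All names here are hypothetical.

# Dictionary of words and their categories (from example 1).
LEXICON = {
    "a": "determiner",
    "cat": "noun",
    "dog": "noun",
    "the": "determiner",
    "chased": "verb",
}

# Context-free production rules: category -> alternative right-hand sides.
GRAMMAR = {
    "sentence": [["noun-phrase", "verb-phrase"]],
    "noun-phrase": [["determiner", "noun"]],
    "verb-phrase": [["verb", "noun-phrase"]],
}


def parse_category(category, words, start):
    """Yield (tree, next_position) for each way category matches words[start:]."""
    if category in GRAMMAR:
        # Phrase category: try every production rule for it.
        for rhs in GRAMMAR[category]:
            for children, end in parse_sequence(rhs, words, start):
                yield [category, *children], end
    elif start < len(words) and LEXICON.get(words[start]) == category:
        # Word category: consume one word if the lexicon agrees.
        yield [category, words[start]], start + 1


def parse_sequence(categories, words, start):
    """Yield (trees, next_position) for matching all categories in order."""
    if not categories:
        yield [], start
        return
    first, rest = categories[0], categories[1:]
    for tree, middle in parse_category(first, words, start):
        for trees, end in parse_sequence(rest, words, middle):
            yield [tree, *trees], end


def parse(category, words):
    """Return the first tree that covers the whole input, or None."""
    for tree, end in parse_category(category, words, 0):
        if end == len(words):
            return tree
    return None


if __name__ == "__main__":
    print(parse("sentence", "the dog chased a cat".split()))
```

  Called on "the dog chased a cat", the sketch returns the same bracketing as the Syntax output of example 1, as nested lists. A left-recursive rule such as NP -> NP PP would make this naive top-down search loop forever, so the ambiguity examples would need a different strategy (for instance a chart parser).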