Contextual Vocabulary Acquisition as Computational Philosophy and as Philosophical Computation

William J. Rapaport
Department of Computer Science & Engineering, Department of Philosophy, and Center for Cognitive Science
rapaport@cse.buffalo.edu
http://www.cse.buffalo.edu/~rapaport

Computation & Philosophy
• Computational philosophy =
  – Application of computational (i.e., algorithmic) solutions to philosophical problems
    • Use of SNePS KRR & belief-revision system to solve problems in representation of fictional entities
      – Rapaport 1991; Rapaport & Shapiro 1995, 1999
    • CVA
• Philosophical computation =
  – Application of philosophy to CS problems
    • Use of Castañeda’s theory of quasi-indexicals to solve problems in knowledge representation
      – Maida & Shapiro 1982; Rapaport 1986; Rapaport, Shapiro, & Wiebe 1997
    • CVA
    • (more later)

Contextual Vocabulary Acquisition
• CVA = active, deliberate acquisition of a meaning for a word in a text by reasoning from “context”
• CVA = what you do when:
  – You’re reading
  – You come to an unfamiliar word
  – It’s important for understanding the passage
  – No one’s around to ask
  – Dictionary doesn’t help
    • No dictionary
    • Too lazy to look it up :-)
    • Word not in dictionary
    • Definition of no use
      – Too hard
      – Inappropriate
• So, you “figure out” a meaning for the word “from context”
  – “figure out” = compute (infer) a hypothesis about what the word might mean in that text
  – “context” = ??

What Does ‘Brachet’ Mean?
(From Malory’s Morte D’Arthur [page # in brackets])
1. There came a white hart running into the hall with a white brachet next to him, and thirty couples of black hounds came running after them. [66]
2. As the hart went by the sideboard, the white brachet bit him. [66]
3. The knight arose, took up the brachet and rode away with the brachet. [66]
4. A lady came in and cried aloud to King Arthur, “Sire, the brachet is mine”. [66]
10. There was the white brachet which bayed at him fast. [72]
18. The hart lay dead; a brachet was biting on his throat, and other hounds came behind. [86]

CVA as Computational Philosophy
• Origin of project:
  – Rapaport, “How to Make the World Fit Our Language” (1981)
    • Neo-Meinongian theory of a word’s meaning for a person as the set of contexts in which person has heard or seen word.
    • Could that notion be made precise?
    • Semantic-network theory offered a computational tool
  – Later, learned that computational linguists, reading educators, L2 educators, psychologists,… were all interested in this
    • A really interdisciplinary cognitive-science problem
  – Developed into Karen Ehrlich’s CS Ph.D. dissertation (1995)

What Is the “Context” for CVA?
• “context” ≠ textual context
  – surrounding words; “co-text” of word
• “context” = wide context =
  – “internalized” co-text …
    • ≈ reader’s interpretive mental model of textual “co-text”
  – … “integrated” via belief revision …
    • infer new beliefs from internalized co-text + prior knowledge
    • remove inconsistent beliefs
  – … with reader’s prior knowledge:
    • “world” knowledge
    • language knowledge
    • previous hypotheses about word’s meaning
  – but not including external sources (dictionary, humans)
• “Context” for CVA is in reader’s mind, not in the text

Meaning of “Meaning”
• “a meaning for a word” vs. “the meaning of a word”
  – “the”: single, correct meaning
  – “of”: meaning belongs to word
  – “a”: many possible meanings
    • depending on textual context, reader’s prior knowledge, etc.
  – “for”: reader hypothesizes meaning from “context”, & gives it to word

[Figure: animated build of the reader’s belief-revised (B-R) integrated KB. Prior knowledge (PK1–PK4) is combined, via internalization, with I(T1), I(T2), I(T3) from text passages T1–T3; inference adds new beliefs P5, P6, P7.]
Note: All “contextual” reasoning is done in this “context”: the B-R integrated KB (the reader’s mind)

Overview of CVA Project
• Background:
  – People do “incidental” CVA
    • Possibly best explanation of how we learn vocabulary
      – Given # of words high-school grad knows (~45K), & # of years to learn them (~18) = ~2.5K words/year
      – But only taught ~10% in 12 school years
  – Students are taught “deliberate” CVA in order to improve their vocabulary
• CVA project: From Algorithm to Curriculum
  1. Implemented computational theory of how to figure out (compute) a meaning for an unfamiliar word from “wide context”
  2. Convert algorithms to an improved, teachable curriculum

Computational CVA
• Implemented in SNePS (Shapiro 1979; Shapiro & Rapaport 1992)
  – Intensional, propositional semantic-network knowledge-representation, reasoning, & acting system
    • Indexed by node: From any node, can describe rest of network
  – Serves as model of the reader (“Cassie”)
    • KB: SNePS representation of reader’s prior knowledge
    • I/P: SNePS representation of word in its co-text
    • Processing (“simulates”/“models”/is?!
      reading):
      – Uses logical inference, generalized inheritance, belief revision to reason about text integrated with reader’s prior knowledge
      – N & V definition algorithms deductively search this “belief-revised, integrated” KB (the context) for slot fillers for definition frame…
    • O/P: Definition frame
      – slots (features): classes, structure, actions, properties, etc.
      – fillers (values): info gleaned from context (= integrated KB)

Cassie learns what “brachet” means:
Background info about: harts, animals, King Arthur, etc.
No info about: brachets
Input: formal-language (SNePS) version of simplified English

A hart runs into King Arthur’s hall.
  • In the story, B12 is a hart.
  • In the story, B13 is a hall.
  • In the story, B13 is King Arthur’s.
  • In the story, B12 runs into B13.
A white brachet is next to the hart.
  • In the story, B14 is a brachet.
  • In the story, B14 has the property “white”.
  • Therefore, brachets are physical objects.
    (deduced while reading; PK: Cassie believes that only physical objects have color)

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: phys obj,
  Possible Properties: white,
  Possibly Similar Items: animal, mammal, deer, horse, pony, dog,
I.e., a brachet is a physical object that can be white and that might be like an animal, mammal, deer, horse, pony, or dog

A hart runs into King Arthur’s hall.
A white brachet is next to the hart.
The brachet bites the hart’s buttock.
[PK: Only animals bite]

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: animal,
  Possible Actions: bite buttock,
  Possible Properties: white,
  Possibly Similar Items: mammal, pony,

A hart runs into King Arthur’s hall.
A white brachet is next to the hart.
The brachet bites the hart’s buttock.
The knight picks up the brachet.
The knight carries the brachet.
[PK: Only small things can be picked up/carried]

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: animal,
  Possible Actions: bite buttock,
  Possible Properties: small, white,
  Possibly Similar Items: mammal, pony,

A hart runs into King Arthur’s hall.
A white brachet is next to the hart.
The brachet bites the hart’s buttock.
The knight picks up the brachet.
The knight carries the brachet.
The lady says that she wants the brachet.
[PK: Only valuable things are wanted]

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: animal,
  Possible Actions: bite buttock,
  Possible Properties: valuable, small, white,
  Possibly Similar Items: mammal, pony,

A hart runs into King Arthur’s hall.
A white brachet is next to the hart.
The brachet bites the hart’s buttock.
The knight picks up the brachet.
The knight carries the brachet.
The lady says that she wants the brachet.
The brachet bays at Sir Tor.
[PK: Only hunting dogs bay]

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: hound, dog,
  Possible Actions: bite buttock, bay, hunt,
  Possible Properties: valuable, small, white,
I.e., a brachet is a hound (a kind of dog) that can bite, bay, and hunt, and that may be valuable, small, and white.
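
The way the definition frame tightens as each sentence is read can be illustrated with a small Python sketch. This is not SNePS and not Cassie’s actual algorithm: the rule table, the toy class hierarchy, and the names RULES, SUPERCLASS, Frame, read, and define are all hypothetical, chosen only to mirror the bracketed PK notes above. It also only adds beliefs and never retracts any, so it stands in for the refinement of the frame rather than for real belief revision, and it does not model the “Possibly Similar Items” slot.

```python
from dataclasses import dataclass, field

# Hypothetical prior knowledge (PK), paraphrasing the bracketed PK notes:
# each observed fact about the unknown noun licenses further facts.
RULES = {
    ("property", "white"): [("class", "phys obj")],      # only physical objects have color
    ("action", "bite"): [("class", "animal")],           # only animals bite
    ("patient_of", "pick up"): [("property", "small")],  # only small things are picked up
    ("patient_of", "carry"): [("property", "small")],    # ... or carried
    ("patient_of", "want"): [("property", "valuable")],  # only valuable things are wanted
    ("action", "bay"): [("class", "hound"), ("action", "hunt")],  # only hunting dogs bay
}

# Toy class hierarchy (an assumption for this sketch): child -> parent.
SUPERCLASS = {"hound": "dog", "dog": "animal", "animal": "phys obj"}


def ancestors(cls):
    """Return all superclasses of cls in the toy hierarchy."""
    out = []
    while cls in SUPERCLASS:
        cls = SUPERCLASS[cls]
        out.append(cls)
    return out


@dataclass
class Frame:
    """Definition frame for one unknown noun: slots and their fillers."""
    classes: set = field(default_factory=set)
    actions: list = field(default_factory=list)
    properties: list = field(default_factory=list)

    def read(self, kind, value):
        """Integrate one fact from the text, then apply any matching PK rule."""
        if kind == "class":
            self.classes.add(value)
        elif kind == "action" and value not in self.actions:
            self.actions.append(value)
        elif kind == "property" and value not in self.properties:
            self.properties.append(value)
        # "patient_of" facts fill no slot themselves; they only trigger rules.
        for consequence in RULES.get((kind, value), []):
            self.read(*consequence)

    def define(self):
        """Report the frame, keeping only the most specific classes."""
        covered = {a for c in self.classes for a in ancestors(c)}
        return {
            "class": sorted(self.classes - covered),
            "actions": list(self.actions),
            "properties": list(self.properties),
        }


brachet = Frame()
story = [("property", "white"), ("action", "bite"), ("patient_of", "pick up"),
         ("patient_of", "carry"), ("patient_of", "want"), ("action", "bay")]
for fact in story:
    brachet.read(*fact)
    print(brachet.define())
```

After the first fact the only class is “phys obj”; after “bite” it narrows to “animal”; after “bay” it narrows to “hound”. Unlike Cassie’s final frame, which lists both hound and dog, this sketch reports only the most specific class.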
General Comments
• Cassie’s behavior ≈ human protocols
• Cassie’s definition ≈ OED’s definition:
  – A brachet is “a kind of hound which hunts by scent”

Noun Algorithm
• Generate initial hypothesis by “syntactic manipulation”
  – Algebra: Solve an equation for unknown value X
  – Syntax: “Solve” a sentence for unknown word X
  – “A white brachet (X) is next to the hart” → X (a brachet) is something that is next to the hart and that can be white
  I.e., “define” node X in terms of immediately connected nodes
• Then find or infer from wide context:
  – Basic-level class memberships (e.g., “dog”, rather than “animal”)
    • else most-specific-level class memberships
    • else names of individuals
  – Properties of Xs (else, of individual Xs) (e.g., size, color, …)
  – Structure of Xs (else …) (part-whole, physical structure…)
  – Acts that Xs perform (else …) or that can be done to/with Xs
  – Agents that do things to/with Xs
  – … or to whom things can be done with Xs
  – … or that own Xs
  – Possible synonyms, antonyms
  I.e., “define” word X in terms of some (but not all) other connected nodes

Verb Algorithm
• Generate initial hypothesis by syntactic/algebraic manipulation
• Then find or infer from wide context:
  – Class membership (e.g., Conceptual Dependency)
    • What kind of act is X-ing (e.g., walking is a kind of moving)
    • What kinds of acts are X-ings (e.g., sauntering is a kind of walking)
  – Properties/manners of X-ing (e.g., moving by foot, slow walking)
  – Transitivity/subcategorization information
    • Return class membership of agent, object, indirect object, instrument
  – Possible synonyms, antonyms
  – Causes & effects
• [Also: preliminary work on adjective algorithm]

A Computational Theory of CVA
1. A word does not have a unique meaning.
2. A word does not have a “correct” meaning.
   a) Author’s intended meaning for word doesn’t need to be known by reader in order for reader to understand word in context
   b) Even familiar/well-known words can acquire new meanings in new contexts.
   c) Neologisms are usually learned only from context
3. Every co-text can give some clue to a meaning for a word.
   • Generate initial hypothesis via syntactic/algebraic manipulation
4. But co-text must be integrated with reader’s prior knowledge
   a) Large co-text + large PK ⇒ more clues
   b) Lots of occurrences of word allow asymptotic approach to stable meaning hypothesis
5. CVA is computable
   a) CVA is “open-ended” hypothesis generation.
      • CVA ≠ guess missing word (“cloze”); CVA ≠ word-sense disambiguation
   b) Some words are easier to compute meanings for than others (N < V < Adj/Adv)
6. CVA can improve general reading comprehension (through active reasoning)
7. CVA can & should be taught in schools

From Algorithm to Curriculum
• State of the art in vocabulary learning from context:
  – Mauser 1984: “context” = definition!
  – Clarke & Nation 1980: a “strategy” (algorithm?):
    1. Determine part of speech of word
    2. Look at grammatical context
       • Who does what to whom?
    3. Look at surrounding textual context
       • Search for clues (as we do)
    4. Guess the word; check your guess

CVA: From Algorithm to Curriculum
• “guess the word” = “then a miracle occurs”
• Surely, computer scientists can “be more explicit”!
• And so should teachers!

From Algorithm to Curriculum (cont’d)
• We have explicit, GOF (symbolic) AI theory of how to do CVA ⇒ Teachable!
• Goal:
  – Not: teach people to “think like computers”
  – But: explicate computable & teachable methods to hypothesize word meanings from context
• AI as computational psychology:
  – Devise computer programs that faithfully simulate (human) cognition
  – Can tell us something about (human) mind
• Joint work with Michael Kibby (UB Reading Clinic)
  – We are teaching a machine, to see if what we learn in teaching it can help us teach students better

CVA as Computational Philosophy & Philosophical Computation
1. CVA & holistic semantic theories:
  – Semantic networks:
    • “Meaning” of a node is its location in the entire network
  – Holism:
    • Meaning of a word is its relationships to all other words in the language
  – Problems (Fodor & Lepore):
    • No 2 people ever share a belief
    • No 2 people ever mean the same thing
    • No 1 person ever means the same thing at different times
    • No one can ever change his/her mind
    • Nothing can be contradicted
    • Nothing can be translated
  – CVA offers principled way to restrict “entire network” to a useful subnetwork
    • That subnetwork can be shared across people, individuals, languages,…
    • Can also account for language/concept change
      – Via “dynamic”/“incremental” semantics

CVA as Computational Philosophy & Philosophical Computation (cont’d)
2. CVA and the Chinese Room
  – How would Searle-in-the-Room figure out the meaning of an unknown squiggle?
    • By CVA techniques!
  – Searle’s CR argument from semantics:
    1. Computer programs are purely syntactic
    2. Cognition is semantic
    3. Syntax alone does not suffice for semantics
    Conclusion: No purely syntactic computer program can exhibit semantic cognition
  – “Syntactic Semantics” (Rapaport 1985ff)
    • Syntax does suffice for the kind of semantics needed for NLU in the CR
      – All input (linguistic, perceptual, etc.) is encoded in a single network (or: in a single, real neural network: the brain!)
      – Relations, including semantic ones, among nodes of such a network are manipulated syntactically
        » Hence computationally (CVA helps make this precise)

Summary
• Contextual Vocabulary Acquisition project is:
  – Computational philosophy
    • And computational psychology!
  – Philosophical computation
  – With applications to:
    • Computational linguistics
    • Reading education
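
The contrast in point 1 between a node’s location in the entire network and a restricted subnetwork can be sketched in Python. The triples, and the function names local_meaning and holistic_meaning, are hypothetical and only illustrative. Taking the immediate neighborhood as the subnetwork is an assumption, borrowed from the noun algorithm’s first step (“define” node X in terms of immediately connected nodes); the slides do not say that this is exactly how the restriction is made.

```python
# Toy semantic network as (subject, relation, object) triples.
# The contents are illustrative, loosely following the brachet story.
TRIPLES = [
    ("brachet", "next to", "hart"),
    ("brachet", "has property", "white"),
    ("hart", "runs into", "hall"),
    ("hall", "owned by", "King Arthur"),
    ("knight", "picks up", "brachet"),
]


def local_meaning(node, triples):
    """Describe node only by the edges that touch it (its immediate neighbors)."""
    description = []
    for subj, rel, obj in triples:
        if subj == node:
            description.append((rel, obj))
        elif obj == node:
            description.append(("inverse of " + rel, subj))
    return description


def holistic_meaning(node, triples):
    """Under full holism, every triple in the network bears on the node's meaning."""
    return list(triples)


print(local_meaning("brachet", TRIPLES))
print(len(holistic_meaning("brachet", TRIPLES)))
```

Two readers whose networks differ elsewhere (say, in what they believe about halls) would still get the same local description of “brachet”, which is the sense in which a restricted subnetwork can be shared.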