Slide 1: Biologically Inspired Cognitive Architecture: Why BICA is Necessary for AGI
Alexei Samsonovich (George Mason University)

Slide 2: Questions and answers
- Why is BICA necessary for achieving AGI? Because we need a humanlike universal learner.
- What kind of BICA? One that describes human cognition and learning at a higher symbolic level.
- What are the minimal starting requirements, i.e., the "critical mass"? The "critical mass" includes human-like mental states that can act on each other.

Slide 3: A few words about GMU-BICA

Slide 4: Mental states in GMU-BICA
A mental state in GMU-BICA includes:
- contents of awareness, represented by schemas;
- a token representing an instance of the Self who is aware (labeled I-Now, I-Next, etc.).
[Diagram: episodic memory holds frozen mental states of the Self (I-Past-1 ... I-Past-4); working memory holds active mental states of the Self (I-Goal, I-Meta, I-Past, I-Imagine, I-Previous, I-Now, I-Next) whose contents include scenarios, past experience, analysis, intermediate goal situations, prospective memories, stimulus satisfaction, ideas, visual input, intent, scheduled actions, and expectations.]

Slide 5: Mental state dynamics in working memory of GMU-BICA: an example
[Diagram: mental states I-Now, I-Next, He-Now and He-Next in working memory exchange schema instances (S) representing "me" and "he" with input/output and with semantic memory (schemas S, R, P, Q).]

Slide 6: Examples of types of mental states in GMU-BICA (a possible snapshot of working memory)
I-Goal, I-Meta-1, I-Alt-Goal, I-Imagined-2, I-Meta-2, I-Imagined-1, I-Imagined-3, I-Subgoal, I-Next, I-Previous, I-Now, I-Next-Next, I-False-Belief, I-Past, I-Past-Revised, She-Past-Prev, I-Detail-1, She-Past, I-Detail-2, He-Now, I-Feel, He-Now-I-Now

Slide 7: Models that we need to integrate

Slide 8: Self-regulated learning (SRL) model of problem solving
"... there is a need to build a unified model of metacognition and self-regulated learning that incorporates key aspects of existing models, assumptions, processes, mechanisms, and phases" (Azevedo and Witherspoon, AAAI BICA-2008).
[Diagram: model of meta-cognition (Cox & Raja, 2007): a ground level (perception, doing, action selection), an object level (reasoning), and a meta-level (metareasoning), linked by monitoring and control.]
SRL in problem solving (based on Zimmerman & Kitsantas, 2006):
- Forethought. Task analysis: identifying data, goals and barriers; selecting strategies and planning. Goal orientation: assessing expected self-efficacy; predicting outcomes and confidence; understanding values of outcomes; setting attitudes toward (sub)goals.
- Performance. Self-control: self-instruction and attention control; self-imagery and self-experimentation. Self-observation: introspective self-monitoring; metacognitive self-analysis.
- Self-reflection. Self-judgment: standard-based self-evaluation; causal self-attribution of outcomes. Self-reaction: self-adaptation and conflict resolution; self-rewarding and updating the self-image.

Slide 9: Result: a mental-state model of SRL
Homework problem: solve ax + b = c for x.
The SRL phases are carried out by interacting mental states (I-Meta, I-Meta-Next, I-Now, I-Next, I-Next-Next, I-Goal, I-Detail-1, I-Detail-2):
- Forethought (task analysis, self-beliefs): identify the goal; select strategic steps (a plan); self-efficacy, goal orientation, intrinsic interest.
- Performance (self-control, self-observation): enact the selected steps to solve the problem; self-recording using a worksheet.
- Self-reflection (self-evaluation, self-reaction): compare the result to the standard (a template); if the standard is met, the skill is mastered and self-reward follows (exit); if not, attribute the failure to ineffective strategy selection (loop re-entry).
Worked example:
Problem: ax + b = c. Goal: solve for x, i.e., obtain a formula x = ...
Isolate x using the subtraction and division properties: ax + b = c; subtract b: ax = c - b; divide by a: x = (c - b)/a.
Result validation: compare x = (c - b)/a to the template x = ... (no x on the right-hand side); there is a match.
(Samsonovich, De Jong & Kitsantas, to appear in International Journal of Machine Consciousness, 1, June 2009)
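To make the cycle of slide 9 concrete, here is a minimal, hypothetical Python sketch of the forethought-performance-self-reflection loop applied to the homework problem ax + b = c. The function names, the worksheet emulation and the retry logic are illustrative assumptions, not part of the GMU-BICA implementation.

```python
# A minimal, hypothetical sketch of the forethought -> performance -> self-reflection
# cycle from the mental-state SRL model above, applied to solving a*x + b = c.
# Names and structure are illustrative only, not the GMU-BICA implementation.

def forethought():
    """Task analysis: identify the goal and select strategic steps (a plan)."""
    goal = "solve for x, i.e., obtain a formula x = ..."
    plan = ["subtract b from both sides", "divide both sides by a"]
    return goal, plan

def performance(a, b, c):
    """Enact the selected steps; self-recording is emulated with a worksheet."""
    worksheet = [f"{a}*x + {b} = {c}"]
    rhs = c - b                      # subtraction property
    worksheet.append(f"{a}*x = {rhs}")
    x = rhs / a                      # division property
    worksheet.append(f"x = {x}")
    return x, worksheet

def self_reflection(a, b, c, x):
    """Self-judgment: compare the result to the standard (the original equation)."""
    return abs(a * x + b - c) < 1e-9

def solve(a, b, c, max_attempts=3):
    for _ in range(max_attempts):                 # loop re-entry on failure
        goal, plan = forethought()
        print("Goal:", goal, "| Plan:", "; ".join(plan))
        x, worksheet = performance(a, b, c)
        if self_reflection(a, b, c, x):
            print("\n".join(worksheet))
            print("Standard met: skill mastered, self-reward (exit).")
            return x
        # Otherwise: attribute failure to ineffective strategy selection and retry.
    return None

solve(a=2.0, b=3.0, c=11.0)                       # worked example: returns x = 4.0
```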
Slide 10: Final take-home message

Slide 11:
- How to build a universal learner? We need to bootstrap from a "critical mass".
- How to build a "critical mass" (supposing we know what it is)? There are at least three approaches:
  1. Incremental bottom-up engineering: without a good stimulus, this will take forever.
  2. A brittle rapid prototype/demo: a useless toy (BICA Phase I).
  3. An SRL assistant (finessing the lower levels by relying on students!): a feasible and practically useful stepping stone.
Watch for the AAAI 2009 Fall Symposia (BICA, SRL-metacognition).
Thank you.

Slide 12: End of Talk 1 / Beginning of Talk 2

Slide 13: A Cognitive Map of Natural Language
Alexei Samsonovich (George Mason University)

Slide 14: Theory

Slide 15: Introducing two notions of a semantic cognitive map (SCM)
- A "strong" SCM with a dissimilarity metric: if A is closer to B than to C on the map, then A is more similar to B than to C.
- A "weak" SCM that captures both synonym and antonym relations: A and B are synonyms, A and C are antonyms; unrelated words are not constrained.

Slide 16: Background: method of building an SCM
1. Represent symbols (words, documents, etc.) as vectors in R^n.
2. Optimize the vector coordinates to minimize an energy function H.
3. Do a truncated SVD of the resultant distribution.
Here Q is the set of word vectors in R^n, S is the set of synonym pairs, and A is the set of antonym pairs. Four variants of the energy function were considered:
(a) a dot-product form, H = Σ_{(x,y)∈A} x·y − Σ_{(x,y)∈S} x·y + Σ_{x∈Q} |x|^4;
(b), (c) forms based on the squared distance |x − y|^2 between paired vectors;
(d) a form based on a Gaussian kernel exp(−|x − y|^2 / 2σ^2).
In each case, minimizing H pulls synonyms together, pushes antonyms apart, and the quartic term keeps the vector norms bounded. (A numerical sketch of this procedure appears after slide 22 below.)
(Samsonovich & Ascoli, Proceedings of AGI-2007)

Slide 17: Example: a color map
- Sample N = 10,000 points on a sphere (A).
- Declare some pairs of points "synonyms" (some of those that are close to each other) and some other pairs "antonyms" (some of those that are separated far apart).
- Assign random coordinates to the points in 10-dimensional space (B).
- Apply the optimization procedure to the set of 10,000 random vectors in order to minimize the energy function H = Σ_{(x,y)∈A} x·y − Σ_{(x,y)∈S} x·y + Σ_{x∈Q} |x|^4.
- The result is the reconstructed spatial distribution of colors (C).

Slide 18: Geometric properties of the reconstructed color map are robust with respect to variation of the model parameters.

Slide 19: Results for synonym-antonym dictionaries

Slide 20: Optimization results

Slide 21: Sorted words and antonym pairs (MS Word English)
- PC #1 (valence), sorted from positive to negative. Individual words: clear, well, accept, praise, support, good, right, respect, increase, improve, decline, poor, stop, uncertain, fail, reject, sad, deny, vague, bad. Words from antonym pairs: accept, clear, good, support, praise, well, respect, continue, happy, right, decline, lose, poor, neglect, criticize, badly, deny, stop, sad, wrong.
- PC #2 (arousal), sorted from "exciting, tough" to "calming, easy". Individual words: stiff, tough, hard, heavy, serious, extreme, deep, loud, tense, intense, calm, soft, relaxed, mild, easy, gentle, modest, quiet, calmly, easygoing. Words from antonym pairs: stiff, hard, fierce, tough, serious, tense, severe, strict, loud, heavy, relaxed, soft, calm, easy, mild, relax, gentle, easygoing, quiet, insignificant.
- PC #3 (dominance), sorted from "close, dominate" to "open, free". Individual words: close, final, detain, restraint, confine, swallow, restrain, local, wait, compact, release, go, fire, free, freedom, independent, new, expose, far, brief. Words from antonym pairs: restrain, close, stay, restraint, restricted, cushion, take somebody on, experienced, hold back, final, release, open, go, freedom, free, expose, fire, new, let go, first.

Slide 22: Geometric characteristics of the SCM (MS Word English)
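The following is a minimal numerical sketch of the map-building procedure from slides 16-17 (energy minimization over random vectors followed by truncated SVD), assuming the dot-product form (a) of the energy function. The toy synonym/antonym pair lists, the learning rate and the iteration count are illustrative assumptions, not the original implementation.

```python
import numpy as np

# Toy reconstruction in the spirit of slides 16-17: random vectors in R^10,
# synonym pairs attract, antonym pairs repel, the |x|^4 term bounds the norms,
# then truncated SVD extracts the principal semantic dimensions.
# All parameter values below are illustrative assumptions.

rng = np.random.default_rng(0)
n_items, dim = 200, 10
X = rng.normal(scale=0.1, size=(n_items, dim))          # random initial coordinates

# Hypothetical synonym (S) and antonym (A) pair lists (index pairs).
S = [(i, i + 1) for i in range(0, n_items, 2)]
A = [(i, i + n_items // 2) for i in range(0, n_items // 2, 5)]

def energy(X):
    """Dot-product form: H = sum_A x.y - sum_S x.y + sum_Q |x|^4."""
    h = sum(X[i] @ X[j] for i, j in A) - sum(X[i] @ X[j] for i, j in S)
    return h + np.sum(np.sum(X**2, axis=1) ** 2)

def grad(X):
    g = 4.0 * X * np.sum(X**2, axis=1, keepdims=True)   # gradient of the |x|^4 term
    for i, j in S:                                       # synonyms attract
        g[i] -= X[j]; g[j] -= X[i]
    for i, j in A:                                       # antonyms repel
        g[i] += X[j]; g[j] += X[i]
    return g

for step in range(2000):                                 # plain gradient descent
    X -= 0.01 * grad(X)

# Truncated SVD of the optimized distribution: the leading right singular vectors
# play the role of the principal semantic components (PC#1, PC#2, ...).
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
coords = Xc @ Vt[:3].T                                   # map coordinates on PC1-PC3
print("energy:", round(energy(X), 2), " top singular values:", np.round(s[:3], 2))
```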
Slide 23: Semantic characteristics of the SCM
- Synonym pairs and antonym pairs, if mixed together, can be separated with 99% accuracy based on the angle between the word vectors: an acute angle indicates synonyms, an obtuse angle indicates antonyms.
- The semantics of the first three dimensions are more general than any individual words, yet clearly identifiable:
  PC#1: success, positive, clear, "makes good sense";
  PC#2: exciting, "does not go easy";
  PC#3: beginning, source, origin, release, liberation, exposure.

Slide 24: PC-by-PC correlation across languages and datasets
(Correlations with the MS Word English PCs; the last column is the canonical correlation coefficient.)

                     MS English
                PC1       PC2       PC3       PC4      Canonical Corr. Coef.
WN English PC1  0.73      0.20     -0.06     -0.031    0.78
           PC2 -0.23      0.64      0.18      0.22     0.72
           PC3  0.12     -0.13      0.57      0.13     0.63
           PC4  0.029    -0.022     0.001     0.30     0.52
MS French  PC1  0.74      0.0057    0.0004    0.034    0.75
           PC2 -0.01      0.41      0.24      0.14     0.54
           PC3 -0.034    -0.33      0.37      0.0097   0.49
           PC4  0.056     0.066    -0.0058    0.021    0.27
MS German  PC1  0.73      0.037     0.025     0.056    0.78
           PC2 -0.081     0.21      0.16      0.097    0.57
           PC3  0.049    -0.16      0.26      0.029    0.46
           PC4 -0.089     0.007     0.014     0.026    0.24
MS Spanish PC1  0.67      0.037    -0.046    -0.014    0.71
           PC2 -0.20      0.45     -0.13      0.14     0.62
           PC3  0.17     -0.056     0.46      0.066    0.60
           PC4  0.0014    0.33      0.19      0.18     0.45
ANEW       D1   0.80     -0.19      0.20      0.21     0.83
           D2   0.052     0.39      0.26      0.22     0.55
           D3   0.0085    0.22      0.094    -0.22     0.37

Slide 25: Clustering of words in the first SCM dimension: WordNet and ANEW vs. MS Word

Slide 26: Applications

Slide 27: Examples of "semantic twisting"

Slide 28: Sentiment analysis: 7 utterances automatically allocated on the SCM
1. Please, chill out and be quiet. I am bored and want you to relax. Sit back and listen to me.
2. Excuse me, sorry, but I cannot follow you and am falling asleep. Can we pause? I've got tired and need a break.
3. I hate you, stupid idiot! You irritate me! Get disappeared, or I will hit you!
4. What you are telling me is terrible. I am very upset and curious: what's next?
5. Wow, this is really exciting! You are very smart and brilliant, aren't you?
6. I like very much every word that you say. Please, please, continue. I feel like I am falling in love with you.
7. We have finally found the solution. It looks easy after we found it. I feel completely satisfied and free to go home.
(Samsonovich & Ascoli, in Proc. of AAAI 2008 Workshop on Preference Handling)

Slide 29: Sentiment analysis: mapping movie reviews as "bags of words"
- Acquired 40+ reviews for each of three movies (Iron Man, Superhero and Prom Night) from the site www.mrqe.com.
- For each review, computed the average map coordinate of all identified indexed words and phrases (a toy sketch of this computation is given after slide 30 below).
- RESULT: statistics for PC#1 are consistent with the grades given to the movies in the reviews. Iron Man: (1.95, 0.52); Superhero: (1.49, 0.36); Prom Night: (1.17, 0.42). All differences are significant except PC#2 of Superhero vs. Prom Night.

Slide 30: Conclusions
- The weak SCM is low-dimensional, yet it distinguishes almost all synonym-antonym pairs.
- Therefore, the SCM can be used as a metric system for semantics (at least for the most general part of semantics).
- SCM dimensions have clearly identifiable semantics that make sense in virtually all domains of knowledge.
- The SCM can be used to guide the process of thinking in symbolic cognitive architectures.
- The map semantics and geometric characteristics are consistent across corpora and across languages.
- Other potential applications include sentiment analysis, semantic twisting, document search, and validation of translation.
Credits: Giorgio A. Ascoli, Rebecca F. Goldin, Thomas T. Sheehan.
Thank you.
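As an appendix, the sketch below illustrates two of the operations described above: the bag-of-words mapping of a review (slide 29) and the angle-based synonym/antonym test (slide 23). The 3-D word coordinates and the sample review are invented for illustration; real coordinates would come from the optimized SCM (e.g., the MS Word English map).

```python
import numpy as np

# Illustrative sketch of two uses of SCM coordinates described in the slides.
# The coordinates below are invented for demonstration purposes only.
word_coords = {
    "good":     np.array([ 1.8,  0.2,  0.1]),
    "exciting": np.array([ 0.9,  1.5,  0.2]),
    "boring":   np.array([-1.2, -0.8,  0.0]),
    "bad":      np.array([-1.7,  0.1, -0.1]),
    "free":     np.array([ 0.4,  0.0,  1.3]),
}

def review_coordinate(review_text):
    """Slide 29: average the map coordinates of all indexed words in a review."""
    vecs = [word_coords[w] for w in review_text.lower().split() if w in word_coords]
    return np.mean(vecs, axis=0) if vecs else None

def likely_antonyms(w1, w2):
    """Slide 23: an obtuse angle between word vectors suggests an antonym pair."""
    a, b = word_coords[w1], word_coords[w2]
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b)) < 0.0

print(review_coordinate("exciting and good but the ending was boring"))
print(likely_antonyms("good", "bad"))        # True: obtuse angle
print(likely_antonyms("good", "exciting"))   # False: acute angle
```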