Dynamical Cognition 2010: New Approach to Some Tough Old Problems
Simon D. Levy
Washington & Lee University, Lexington, Virginia, USA

Inspiration, 1985-1995

Inspiration, 1995-present

"[I]t turns out that we don't think the way we think we think! ... The scientific evidence coming in all around us is clear: Symbolic conscious reasoning, which is extracted through protocol analysis from serial verbal introspection, is a myth."
- J. Pollack (2005)

What is Mind?
"[W]hat kinds of things suggested by the architecture of the brain, if we modeled them mathematically, could give some properties that we associate with mind?"
- P. Kanerva (2009)

"... a fresh coat of paint on old rotting theories."
- B. MacLennan (1991)

The Need for New Representational Principles
• Ecological affordances (Gibson 1979); exploiting the environment (Clark 1998)
• Distributed / connectionist representations (PDP 1986)
• Holographic representations (Gabor 1971; Plate 2003)
• Fractals / attractors / dynamical systems (Tabor 2000; Levy & Pollack 2001)

Pitfalls to Avoid 1: The "Short Circuit" (Localist Connectionist) Approach
i) Traditional models of phenomenon X (language) use entities A, B, C, ... (Noun Phrase, Phoneme, ...).
ii) We wish to model X in a more biologically realistic way.
iii) Therefore our model of X will have a neuron (pool) for A, one for B, one for C, etc.
a.k.a. the Reese's Peanut Butter Cup Model
E.g., the Neural Blackboard Model (van der Velde & de Kamps 2006)

Benefits of Localism (Page 2000)
• Transparent (one node, one concept)
• Supports lateral inhibition / winner-take-all (WTA)
[Figure: WTA network in which units A, B, C in layer L1 compete via lateral inhibition in layer L2]

Problems with Localism
• Philosophical problem: "a fresh coat of paint on old rotting theories" (MacLennan 1991); what new insights does "neuro-X" provide?
• Engineering problem: the need to recruit new hardware for each new concept or combination leads to a combinatorial explosion (Stewart & Eliasmith 2008)

The Appeal of Distributed Representations (Rumelhart & McClelland 1986)
[Figure: past-tense network mapping WALK -> WALKED, ROAR -> ROARED, SPEAK -> SPOKE, GO -> WENT]
Mary won't give John the time of day. -> ignores(mary, john)

Challenges (Jackendoff 2002)
I. The Binding Problem
II. The Problem of Two
III. The Problem of Variables: ignores(X, Y) <-> X won't give Y the time of day.

Vector Symbolic Architectures (Plate 1991; Kanerva 1994; Gayler 1998)
• Binding (*) of role and filler vectors; cf. tensor product binding (Smolensky 1990)
• Bundling (+): superposition of bound pairs
• Unbinding (querying) is lossy, so answers are restored by a cleanup memory (Hebbian / Hopfield / attractor net)
• Reduction keeps vectors at a fixed dimensionality: holographic (Plate 2003) or binary (Kanerva 1994; Gayler 1998)
• Composition / recursion
• Variables: an individual (john) and a variable (X) are the same kind of object

Scaling Up
With many (> 10K) dimensions, we get:
• an astronomically large number of nearly orthogonal vectors (symbols)
• surprising robustness to noise
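To make these operations concrete, here is a minimal sketch of a binary (bipolar) VSA in the style of Kanerva (1994) and Gayler (1998), assuming elementwise multiplication for binding and an elementwise majority vote for bundling; the attractor-net cleanup described above is replaced by a brute-force nearest-neighbor lookup for simplicity, and the role and filler names (which anticipate the kiss(mary, john) example later in the talk) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000   # with this many dimensions, random vectors are nearly orthogonal

def randvec():
    """A fresh atomic symbol: a random bipolar (+1/-1) vector."""
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    """Binding: elementwise multiplication (self-inverse, since a*a = all 1's)."""
    return a * b

def bundle(*vs):
    """Bundling: elementwise majority vote (sign of sum); a lossy superposition."""
    return np.sign(np.sum(vs, axis=0))

def cleanup(noisy, memory):
    """Cleanup memory: name of the stored item most similar to the noisy query."""
    return max(memory, key=lambda name: np.dot(memory[name], noisy))

roles   = {r: randvec() for r in ('agent', 'action', 'patient')}
fillers = {f: randvec() for f in ('mary', 'kiss', 'john')}

# kiss(mary, john) as a single fixed-width vector: a bundle of role*filler bindings
record = bundle(bind(roles['agent'],   fillers['mary']),
                bind(roles['action'],  fillers['kiss']),
                bind(roles['patient'], fillers['john']))

# Unbinding the agent role yields a noisy version of MARY; cleanup recovers it
print(cleanup(bind(record, roles['agent']), fillers))   # -> mary
```

At D = 10,000 the unbound query's similarity to the correct filler is far above that of the distractors even though the record superposes three bindings, illustrating the robustness and capacity claims above.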
Pitfalls to Avoid 2: The Homunculus Problem, a.k.a. the Ghost in the Machine (Ryle 1949)
In cognitive modeling, the homunculus is the researcher, who supervises learning, hand-builds representations, etc.

Banishing the Homunculus, Step I: Automatic Variable Substitution
• If A is a vector over {+1, -1}, then A*A is a vector of 1's (the multiplicative identity).
• This supports substitution of anything for anything: everything (names, individuals, structures, propositions) can be a variable!
• "What is the Dollar of Mexico?" (Kanerva 2009)

Let X = <country> and Y = <currency> be roles, and let A and B be the records for the USA (A = X*U + Y*D) and Mexico (B = X*M + Y*P), where U, D, M, P are the fillers <USA>, <Dollar>, <Mexico>, <Peso>. Then:

D*A*B = D*(X*U + Y*D) * (X*M + Y*P)
      = (D*X*U + D*Y*D) * (X*M + Y*P)
      = (D*X*U + Y) * (X*M + Y*P)        [since D*D = 1]
      = D*X*U*X*M + D*X*U*Y*P + Y*X*M + Y*Y*P
      = P + noise                        [since Y*Y = 1; the other terms are noise]

(A runnable version appears in the first sketch at the end of these slides.)

Learning Grammatical Constructions from a Single Example (Levy 2010)
• Given
  - Meaning: kiss(mary, john)
  - Form: Mary kissed John
  - Lexicon: kiss/kiss, mary/Mary, ...
• What is the form for hit(bill, fred)?

(ACTION*KISS + AGENT*MARY + PATIENT*JOHN)
  * (P1*Mary + P2*kissed + P3*John)
  * (KISS*kissed + MARY*Mary + JOHN*John + BILL*Bill + FRED*Fred + HIT*hit)
  * (ACTION*HIT + AGENT*BILL + PATIENT*FRED)
= ...
= (P1*Bill + P2*hit + P3*Fred) + noise

Banishing the Homunculus, Step II: Distributed "Lateral Inhibition"
• Analogical mapping as holistic graph isomorphism (Gayler & Levy 2009); cf. Pelillo (1999)
[Figure: two isomorphic graphs, one with nodes A, B, C, D and one with nodes P, Q, R, S]

Possibilities x: A*P + A*Q + A*R + A*S + ... + D*S
Evidence w: A*B*P*Q + A*B*P*R + ... + B*C*Q*R + ... + C*D*R*S
x*w = A*Q + B*R + ... + A*P + ... + D*S

What kind of "program" could work with these "data structures" to yield a single consistent mapping?

Replicator Equations
Starting at some initial state (typically x_i = 1/N, corresponding to all x_i being equally supported as part of the solution), x can be obtained through iterative application of the following equation:

x_i(t+1) = x_i(t) π_i(t) / Σ_j x_j(t) π_j(t)

where π_i(t) = Σ_j w_ij x_j(t), and w is a linear function of the adjacency matrix of the association graph (the "evidence matrix"). (A runnable version appears in the second sketch at the end of these slides.)

Replicator Equations
• Origins in evolutionary game theory (Maynard Smith 1982)
• x_i is a strategy (belief in a strategy)
• π_i is the overall payoff from that strategy
• w_ij is the utility of playing strategy i against strategy j
• Can be interpreted as a continuous inference equation whose discrete-time version has a formal similarity to Bayesian inference (Harper 2009)

Localist Implementation Results (Pelillo 1999)

VSA "Lateral Inhibition" Circuit (Levy & Gayler 2009)
[Circuit diagram relating x_t, the evidence w, the payoff π_t, a cleanup step, and x_t+1]

VSA Implementation Results
tinyurl.com/gidemo

Conclusions
• Vector Symbolic Architectures: a new kind of distributed representation for cognitive computing
  - robust to noise
  - rapid (one-shot) learning
  - "everything is a variable"
  - solves complicated problems in parallel
• Replicator equations: a dynamical system from evolutionary game theory, adapted to solve graph problems (analogies); can be made more plausible by using VSA instead of localist representation

Current / Future Work
• Subgraph mapping [Figure: mapping a small graph with nodes P, Q, R, S onto part of a larger graph with nodes A, B, C, D, E]
• Using Map-Seeking Circuits (Arathorn 2002) to isolate sub-parts
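First sketch: the "What is the Dollar of Mexico?" query, under the same assumptions as the earlier sketch (bipolar vectors, multiplicative binding). Dol stands in for the slides' D, which is used here for the vector dimensionality; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 10_000
vec = lambda: rng.choice([-1, 1], size=N)

X, Y = vec(), vec()                         # roles: <country>, <currency>
U, Dol, M, P = vec(), vec(), vec(), vec()   # fillers: USA, Dollar, Mexico, Peso

A = X * U + Y * Dol    # record for the USA
B = X * M + Y * P      # record for Mexico

answer = Dol * A * B   # "What is the Dollar of Mexico?" -> P + noise

# The answer correlates strongly with Peso and only weakly with everything else
for name, v in [('USA', U), ('Dollar', Dol), ('Mexico', M), ('Peso', P)]:
    print(f'{name:7s} {np.dot(v, answer):8.0f}')
```

Peso's similarity score is on the order of N, while the noise terms contribute only on the order of sqrt(N), which is why no cleanup is even needed to see the winner here.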
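Second sketch: the discrete replicator dynamics applied to a toy graph-matching problem, with the association-graph ("evidence matrix") construction in the spirit of Pelillo (1999). This is the localist version; per the conclusions, the VSA circuit replaces the localist x and w with their superposed vector encodings while keeping the same dynamics. The two 4-node path graphs and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two copies of a 4-node path graph: A-B-C-D and P-Q-R-S
G1 = np.array([[0, 1, 0, 0],
               [1, 0, 1, 0],
               [0, 1, 0, 1],
               [0, 0, 1, 0]])
G2 = G1.copy()
n = G1.shape[0]
N = n * n            # one possibility x_i per candidate pairing (i, j)

# Evidence matrix w (association graph): pairings (i,j) and (k,l) support
# each other when the i-k relation in G1 matches the j-l relation in G2
w = np.zeros((N, N))
for i in range(n):
    for j in range(n):
        for k in range(n):
            for l in range(n):
                if i != k and j != l and G1[i, k] == G2[j, l]:
                    w[i * n + j, k * n + l] = 1.0

# Start with all pairings (almost) equally supported; the tiny perturbation
# breaks the tie between the path's two equally valid matchings
x = np.full(N, 1.0 / N) + rng.uniform(0.0, 1e-6, N)
x /= x.sum()

for _ in range(200):
    pi = w @ x                 # payoff pi_i = sum_j w_ij x_j
    x = x * pi / (x @ pi)      # discrete replicator update

# Read off the winning pairing for each node of G1; a path can be matched
# forwards or backwards, so this typically prints [0 1 2 3] or [3 2 1 0]
print(x.reshape(n, n).argmax(axis=1))
```

The dynamics climb the payoff x'wx on the simplex, so the mass of x concentrates on a mutually consistent set of pairings: a single coherent mapping, obtained without any homunculus picking winners by hand.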