Vector Symbolic Architectures: A New Building Material for AGI

Dynamical Cognition 2010: New Approach to Some Tough Old Problems
Simon D. Levy
Washington & Lee University
Lexington, Virginia, USA
Inspiration, 1985-1995
Inspiration, 1995-present
[I]t turns out that we don’t think the way we think we think! ... The scientific evidence coming in all around us is clear: Symbolic conscious reasoning, which is extracted through protocol analysis from serial verbal introspection, is a myth.
− J. Pollack (2005)
[W]hat kinds of things suggested by the architecture of the brain, if we modeled them mathematically, could give some properties that we associate with mind?
− P. Kanerva (2009)

“... a fresh coat of paint on old rotting theories.”
− B. MacLennan (1991)
What is Mind?
The Need for New Representational Principles
• Ecological affordances (Gibson 1979); exploiting the environment (Clark 1998)
• Distributed/Connectionist Representations (PDP 1986)
• Holographic Representations (Gabor 1971; Plate 2003)
• Fractals / Attractors / Dynamical Systems (Tabor 2000; Levy & Pollack 2001)
Pitfalls to Avoid
1. The “Short Circuit” (Localist Connectionist) Approach
i) Traditional models of phenomenon X (language) use entities A, B, C, ... (Noun Phrase, Phoneme, ...)
• We wish to model X in a more biologically realistic way.
• Therefore our model of X will have a neuron (pool) for A, one for B, one for C, etc.
a.k.a. The Reese’s Peanut Butter Cup Model
E.g. Neural Blackboard Model (van der Velde & de Kamps 2006)
Benefits of Localism (Page 2000)
• Transparent (one node, one concept)
• Supports lateral inhibition / winner-takes-all

Lateral Inhibition (WTA)
[Figure: lateral-inhibition network with layers L1 and L2 and competing nodes A, B, C]
Problems with Localism
• Philosophical problem: “fresh coat of paint on old rotting theories” (MacLennan 1991): what new insights does “neuro-X” provide?
• Engineering problem: recruiting new hardware for each new concept/combination leads to combinatorial explosion (Stewart & Eliasmith 2008)
The Appeal of Distributed Representations (Rumelhart & McClelland 1986)
[Figure: past-tense mappings: WALK → WALKED, ROAR → ROARED, SPEAK → SPOKE, GO → WENT]
Mary won’t give John the time of day. → ignores(mary, john)
Challenges (Jackendoff 2002)
I. The Binding Problem
II. The Problem of Two
III. The Problem of Variables
    ignores(X, Y)
    X won’t give Y the time of day.
Vector Symbolic Architectures (Plate 1991; Kanerva 1994; Gayler 1998)

Tensor Product Binding (Smolensky 1990)

• Binding
• Bundling
• Unbinding (query): lossy
• Cleanup: Hebbian / Hopfield / attractor net
• Reduction (Holographic; Plate 2003)
• Reduction (Binary; Kanerva 1994, Gayler 1998)
• Composition / Recursion
• Variables (john, X)
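A minimal sketch of these operations, assuming random ±1 vectors with elementwise multiplication for binding and squashed addition for bundling (in the general style of binary/MAP coding); the dimensionality, symbol names, and nearest-neighbor cleanup are illustrative choices rather than the exact encoding of any one cited paper:

```python
# Minimal sketch of VSA operations with random ±1 vectors.
# Binding = elementwise multiplication, bundling = addition (then sign),
# unbinding = rebinding with the same vector (self-inverse), cleanup =
# nearest neighbor in an item memory. Names and dimensionality are illustrative.
import numpy as np

N = 10000                                   # dimensionality
rng = np.random.default_rng(0)

def randvec():
    return rng.choice([-1, 1], size=N)      # random ±1 "symbol"

def bind(a, b):
    return a * b                            # elementwise multiply

def bundle(*vs):
    return np.sign(np.sum(vs, axis=0))      # superposition, squashed to ±1

def cleanup(v, memory):
    # return the stored item most similar to the noisy query vector
    return max(memory, key=lambda name: np.dot(memory[name], v))

# item memory (codebook) of atomic symbols
memory = {name: randvec() for name in ["agent", "patient", "john", "mary"]}

# compose a structure: agent*john + patient*mary
s = bundle(bind(memory["agent"], memory["john"]),
           bind(memory["patient"], memory["mary"]))

# query: who is the agent?  unbind with "agent", then clean up the noisy result
noisy = bind(s, memory["agent"])
print(cleanup(noisy, memory))               # -> 'john'
```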
Scaling Up
• With many (> 10K) dimensions, get
  • astronomically large # of mutually orthogonal vectors (symbols)
  • surprising robustness to noise
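Both points are easy to check numerically. The sketch below assumes random ±1 vectors; the dimensionality, the number of stored symbols, and the 30% bit-flip noise level are arbitrary illustrative choices:

```python
# Sketch: why high dimensionality matters. Random ±1 vectors of length 10,000
# are nearly orthogonal (normalized dot product ~ 0 ± 0.01), and a stored
# vector is still recoverable after a large fraction of its bits are flipped.
import numpy as np

N = 10000
rng = np.random.default_rng(1)
symbols = rng.choice([-1, 1], size=(1000, N))      # 1000 random "symbols"

# similarity of two distinct random symbols is close to zero
print(np.dot(symbols[0], symbols[1]) / N)          # ~ 0.00 +/- 0.01

# flip 30% of the bits of symbol 0 and see if cleanup still finds it
noisy = symbols[0].copy()
flip = rng.choice(N, size=int(0.3 * N), replace=False)
noisy[flip] *= -1
best = np.argmax(symbols @ noisy)                  # nearest stored symbol
print(best)                                        # -> 0
```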
Pitfalls to Avoid
2. The Homunculus Problem, a.k.a. the Ghost in the Machine (Ryle 1949)
In cognitive modeling, the homunculus is the researcher: supervises learning, hand-builds representations, etc.
Banishing the Homunculus
Step I: Automatic Variable Substitution
• If A is a vector over {+1, -1}, then A*A = vector of 1’s (multiplicative identity)
• Supports substitution of anything for anything: everything (names, individuals, structures, propositions) can be a variable!
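A tiny check of the self-inverse property, assuming the same random ±1 vectors as in the earlier sketches; the names A and X are illustrative:

```python
# For a ±1 vector A, A*A is the all-ones vector, so binding with A both
# encodes and decodes: any vector can play the role of a variable.
import numpy as np

N = 10000
rng = np.random.default_rng(2)
A = rng.choice([-1, 1], size=N)
X = rng.choice([-1, 1], size=N)

print(np.array_equal(A * A, np.ones(N)))    # True: A is its own inverse

# substitute: bind X under A, then recover it by rebinding with A
bound = A * X                               # "X stored under A"
print(np.array_equal(A * bound, X))         # True: exact recovery here
```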
“What is the Dollar of Mexico?” (Kanerva 2009)

Let X = <country>, Y = <currency>, A = <USA>, B = <Mexico>
• Then A = X*U + Y*D and B = X*M + Y*P
  (U, D fill the country-name and currency roles for the USA; M, P fill them for Mexico)
• D*A*B =
  D*(X*U + Y*D) * (X*M + Y*P) =
  (D*X*U + D*Y*D) * (X*M + Y*P) =
  (D*X*U + Y) * (X*M + Y*P) =
  D*X*U*X*M + D*X*U*Y*P + Y*X*M + Y*Y*P =
  P + noise
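The derivation can be checked numerically. In the sketch below the roles and fillers are random ±1 vectors, bundling is plain addition (so the algebra goes through exactly as above), and the variable names mirror the slide; the dimensionality is arbitrary:

```python
# Numeric check of the "Dollar of Mexico" query: X, Y are role vectors
# (country name, currency); U, D fill those roles for the USA record A,
# and M, P fill them for the Mexico record B.
import numpy as np

N = 10000
rng = np.random.default_rng(3)
X, Y, U, D, M, P = (rng.choice([-1, 1], size=N) for _ in range(6))

A = X * U + Y * D                    # USA record:    country*U + currency*D
B = X * M + Y * P                    # Mexico record: country*M + currency*P

answer = D * A * B                   # "What is the dollar of Mexico?"

# compare the noisy answer against the candidate fillers
for name, v in [("U", U), ("D", D), ("M", M), ("P", P)]:
    print(name, np.dot(answer, v) / N)
# P should have by far the largest similarity: the answer is peso + noise
```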
Learning Grammatical Constructions from a Single Example (Levy 2010)
• Given
  • Meaning: kiss(mary, john)
  • Form: Mary kissed John
  • Lexicon: kiss/kiss, mary/Mary, ...
• What is the form for hit(bill, fred)?
(ACTION*KISS + AGENT*MARY + PATIENT*JOHN) *
(P1*Mary + P2*kissed + P3*John) *
(KISS*kissed + MARY*Mary + JOHN*John + BILL*Bill + FRED*Fred + HIT*hit) *
(ACTION*HIT + AGENT*BILL + PATIENT*FRED) =
....
= (P1*Bill + P2*hit + P3*Fred) + noise
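A sketch of this computation with random ±1 vectors, following the expression above; it is an illustrative reconstruction, not the exact procedure of Levy (2010). Decoding each position of the resulting vector recovers the meaning symbols BILL, HIT, and FRED; one further pass through the lexicon maps each onto its word form (Bill, hit, Fred):

```python
# One-shot construction learning, sketched with random ±1 vectors and the
# MAP-style operations from the earlier sketches. The example meaning, the
# example form, the lexicon, and the new meaning are all bound together;
# decoding position P1, P2, P3 from the result reveals what fills each slot
# of the new form. Illustrative reconstruction, not the exact Levy (2010) code.
import numpy as np

N = 10000
rng = np.random.default_rng(4)

names = ["ACTION", "AGENT", "PATIENT", "P1", "P2", "P3",
         "KISS", "MARY", "JOHN", "BILL", "FRED", "HIT",       # meanings
         "kissed", "Mary", "John", "Bill", "Fred", "hit"]     # word forms
V = {n: rng.choice([-1, 1], size=N) for n in names}

meaning1 = V["ACTION"]*V["KISS"] + V["AGENT"]*V["MARY"] + V["PATIENT"]*V["JOHN"]
form1    = V["P1"]*V["Mary"] + V["P2"]*V["kissed"] + V["P3"]*V["John"]
lexicon  = (V["KISS"]*V["kissed"] + V["MARY"]*V["Mary"] + V["JOHN"]*V["John"]
            + V["BILL"]*V["Bill"] + V["FRED"]*V["Fred"] + V["HIT"]*V["hit"])
meaning2 = V["ACTION"]*V["HIT"] + V["AGENT"]*V["BILL"] + V["PATIENT"]*V["FRED"]

product = meaning1 * form1 * lexicon * meaning2   # everything bound together

# decode each position of the new form and clean up against all known symbols
for pos in ["P1", "P2", "P3"]:
    decoded = product * V[pos]
    best = max(names, key=lambda n: np.dot(V[n], decoded))
    print(pos, "->", best)        # P1 -> BILL, P2 -> HIT, P3 -> FRED
    # (one more pass through the lexicon would map BILL -> Bill, etc.)
```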
Step II: Distributed “Lateral Inhibition”
• Analogical mapping as holistic graph isomorphism (Gayler & Levy 2009); cf. Pelillo (1999)

[Figure: two graphs to be mapped, one with vertices A, B, C, D and one with vertices P, Q, R, S]

Possibilities x:  A*P + A*Q + A*R + A*S + ... + D*S
Evidence w:  A*B*P*Q + A*B*P*R + ... + B*C*Q*R + ... + C*D*R*S
x*w = A*Q + B*R + ... + A*P + ... + D*S
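A sketch of these “data structures” built from the same ±1 vectors and operations as in the earlier sketches: x bundles every candidate vertex pairing, and here w bundles one bound term per pair of edges (one edge from each graph), which is one way to encode the evidence shown above. The edge sets used for the two example graphs are illustrative assumptions:

```python
# Each candidate vertex pairing (A with P, A with Q, ...) is the binding of
# two random ±1 vectors; x bundles every candidate pairing, and w bundles,
# for each pair of edges (one per graph), the binding of the four vertices.
import numpy as np
from itertools import product

N = 10000
rng = np.random.default_rng(5)
V = {v: rng.choice([-1, 1], size=N) for v in "ABCDPQRS"}

# example edge sets for the two graphs to be mapped (illustrative)
edges1 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
edges2 = [("P", "Q"), ("Q", "R"), ("R", "S"), ("S", "P")]

# x = A*P + A*Q + ... + D*S : every possible vertex pairing, equally weighted
x = sum(V[u] * V[p] for u, p in product("ABCD", "PQRS"))

# w = A*B*P*Q + ... : every pair of edges, one from each graph
w = sum(V[a] * V[b] * V[p] * V[q]
        for (a, b), (p, q) in product(edges1, edges2))

# binding x with w spreads support from each pairing to the pairings implied
# by the edge evidence (cf. the x*w expression above); with the uniform x,
# every pairing ends up with roughly equal support, which is why an iterative
# competitive process is needed to settle on a single consistent mapping
support = x * w
for u, p in product("ABCD", "PQRS"):
    print(u, p, round(np.dot(support, V[u] * V[p]) / N, 2))
```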
What kind of “program” could work with these “data structures” to yield a single consistent mapping?
Replicator Equations

Starting at some initial state (typically just x_i = 1/N, corresponding to all x_i being equally supported as part of the solution), x can be obtained through iterative application of the following equation:

    x_i(t+1) = x_i(t) π_i(t) / Σ_j x_j(t) π_j(t)

where

    π_i(t) = Σ_j w_ij x_j(t)

and w is a linear function of the adjacency matrix of the association graph (“evidence matrix”).
Replicator Equations
• Origins in Evolutionary Game Theory (Maynard Smith 1982)
• x_i is a strategy (belief in a strategy)
• π_i is the overall payoff from that strategy
• w_ij is the utility of playing strategy i against strategy j
• Can be interpreted as a continuous inference equation whose discrete-time version has a formal similarity to Bayesian inference (Harper 2009)
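A minimal localist sketch of this update, applied to a Pelillo-style association graph for mapping one four-vertex cycle onto another. The example graphs, the 0/1 evidence matrix, and the tiny symmetry-breaking perturbation of the initial state are illustrative assumptions:

```python
# Discrete-time replicator update:  pi = W x,  x_i <- x_i * pi_i / sum_j x_j pi_j,
# with W the "evidence matrix" of the association graph for two small cycles.
import numpy as np
from itertools import product

rng = np.random.default_rng(6)

edges1 = {frozenset(e) for e in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]}
edges2 = {frozenset(e) for e in [("P", "Q"), ("Q", "R"), ("R", "S"), ("S", "P")]}
pairs = list(product("ABCD", "PQRS"))       # candidate vertex pairings x_i

# w_ij = 1 if pairings i and j are mutually consistent: distinct vertices,
# and an edge maps onto an edge (or a non-edge onto a non-edge)
W = np.zeros((len(pairs), len(pairs)))
for i, (a, p) in enumerate(pairs):
    for j, (b, q) in enumerate(pairs):
        if (a != b and p != q and
                (frozenset((a, b)) in edges1) == (frozenset((p, q)) in edges2)):
            W[i, j] = 1.0

x = np.ones(len(pairs)) / len(pairs)        # all pairings equally supported
x += rng.uniform(0, 1e-3, len(pairs))       # tiny perturbation breaks symmetry
x /= x.sum()

for _ in range(500):
    payoff = W @ x                          # pi_i = sum_j w_ij x_j
    x = x * payoff / np.dot(x, payoff)      # replicator update

# should print the four pairings of one consistent mapping (weights ~ 0.25)
for (a, p), xi in zip(pairs, x):
    if xi > 0.05:
        print(f"{a} -> {p}   {xi:.2f}")
```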
Localist Implementation Results (Pelillo 1999)

VSA “Lateral Inhibition” Circuit (Levy & Gayler 2009)
[Figure: circuit diagram relating the state x_t, the evidence w, the payoff π_t, a cleanup memory, and normalization to produce x_t+1]
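A simplified sketch of the idea behind this circuit, not the published Levy & Gayler (2009) implementation: the state x_t and the evidence w are single high-dimensional vectors, binding them yields a noisy payoff vector π_t, and a cleanup memory of candidate pairings reads the support for each pairing back out so that the replicator update can form x_t+1. Here the evidence bundles one bound term per compatible pair of pairings, mirroring the association graph of the localist sketch above; the graphs, dimensionality, and read-out scheme are illustrative assumptions:

```python
# Distributed version of the replicator sketch: the state and the evidence are
# holistic ±1-based vectors; support for each candidate pairing is read out of
# the noisy payoff vector via a cleanup memory, then updated multiplicatively.
import numpy as np
from itertools import product, combinations

N, rng = 10000, np.random.default_rng(7)
V = {v: rng.choice([-1, 1], size=N) for v in "ABCDPQRS"}
edges1 = {frozenset(e) for e in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]}
edges2 = {frozenset(e) for e in [("P", "Q"), ("Q", "R"), ("R", "S"), ("S", "P")]}

pairs = list(product("ABCD", "PQRS"))
codebook = {(a, p): V[a] * V[p] for a, p in pairs}        # cleanup memory

def compatible(i, j):
    (a, p), (b, q) = i, j
    return (a != b and p != q and
            (frozenset((a, b)) in edges1) == (frozenset((p, q)) in edges2))

# evidence vector: one bound 4-tuple per compatible pair of pairings
w = sum(codebook[i] * codebook[j]
        for i, j in combinations(pairs, 2) if compatible(i, j))

coeff = {p: 1.0 / len(pairs) + rng.uniform(0, 1e-3) for p in pairs}

for _ in range(200):
    x = sum(c * codebook[p] for p, c in coeff.items())     # holistic state x_t
    pi = x * w                                             # bind with evidence
    # read each pairing's payoff out of the noisy pi vector via the codebook
    # (clipping guards against small negative read-out noise)
    payoff = {p: max(np.dot(pi, codebook[p]) / N, 0.0) for p in pairs}
    total = sum(coeff[p] * payoff[p] for p in pairs)
    coeff = {p: coeff[p] * payoff[p] / total for p in pairs}

# should settle on the four pairings of one consistent mapping
for (a, p), c in sorted(coeff.items(), key=lambda kv: -kv[1])[:4]:
    print(f"{a} -> {p}   {c:.2f}")
```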
VSA Implementation Results
tinyurl.com/gidemo
Conclusions
• Vector Symbolic Architectures: a new kind of distributed representation for cognitive computing
  • robust to noise
  • rapid (one-shot) learning
  • “everything is a variable”
  • solves complicated problems in parallel
• Replicator equations: dynamical system from evolutionary game theory, adapted to solve graph problems (analogies); can be made more plausible by using VSA instead of localist representation
Current / Future Work
• Subgraph mapping
  [Figure: subgraph mapping between a graph with vertices A–E and a graph with vertices P–S]
• Using Map-Seeking Circuits (Arathorn 2002) to isolate sub-parts