
Ubiquitous Cognitive Computing:
A Vector Symbolic Approach
BLERIM EMRULI
EISLAB, Luleå University of Technology
Outline
- Context and motivation
- Aims
- Background (concepts and methods)
- Summary of appended papers
- Conclusions and future work
Conventional computing
- 1 + 2/3 = 1.666…7
- 1010 XOR 1000 = 0010
- 1–64-bit variables
Cognitive computing
- Concepts, relations, sequences, actions, perceptions, learning …
- Some concepts:
  - man ≅ woman
  - man ≇ lake
Cognitive computing (cont'd)
- Bridging of dissimilar concepts
  - man – fisherman – fish – lake
  - man – plumber – water – lake
- Relations between concepts and sequences
  - 5 : 10 : 15 : 20
  - 5 : 10 : 15 : 30
“…invisible, everywhere computing that does not live on a personal device of any sort, but is in the woodwork everywhere” (Weiser, 1994).
– Mark Weiser, widely considered to be the father of ubiquitous computing
Ubiquitous Cognitive Computing: A Vector Symbolic Approach

Ubiquitous cognitive computing is cognitive computing for ubiquitous systems, i.e., systems that in principle can appear “everywhere and anywhere” as part of the physical infrastructure that surrounds us.
Intuition
[Figure: high-level “symbol-like” representations and high-level processing built on low-level processing (sensory integration)]
Aims
- Investigate mathematical concepts and develop computational principles with cognitive qualities, which can enable digital systems to function more like brains in terms of:
  - learning/adaptation
  - generalization
  - association
  - prediction
  - …
Other desirable properties
- computationally lightweight
- suitable for distributed and parallel computation
- robust, degrading gracefully
Related approaches
- service-oriented architecture (SOA)
- traditional artificial intelligence techniques
- cognitive approach (Giaffreda, 2013; Wu et al., 2014)
Geometric approach to cognition
- What can we do with words of 1 kilobyte or more?
[Figure: a 10,000-bit binary word, with components indexed 1 to 10000]

- Pentti Kanerva started to explore this idea in the 80s
- Engineering perspective, with inspiration from biological neural circuits and human long-term memory
- Since the 90s, similar ideas have also been developed by Peter Gärdenfors, Professor at Lund University
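A minimal sketch (not from the thesis; only the 10,000-bit dimensionality and the 30% corruption figure come from the slides) of why such large words are useful: random high-dimensional binary vectors are almost always nearly orthogonal, while even a heavily corrupted copy of a vector stays recognizably close to its original.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # dimensionality, as on the slide

# Two random binary "hypervectors" of 10,000 bits each.
x = rng.integers(0, 2, D)
y = rng.integers(0, 2, D)

# Normalized Hamming distance is ~0.5 for unrelated vectors:
# random points in {0,1}^10000 are nearly orthogonal.
print(np.mean(x != y))  # approx. 0.5

# A noisy copy of x with 30% of its bits flipped is still far
# closer to x than any unrelated vector, which is why such codes
# tolerate heavy corruption.
noisy = x.copy()
flip = rng.choice(D, size=3_000, replace=False)
noisy[flip] ^= 1
print(np.mean(x != noisy))  # approx. 0.3
```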
Sparse Distributed Memory (SDM)
- inspired by circuits in the brain
- model of human long-term memory
- associative memory
- KEY IDEA: similar or related concepts in memory correspond to nearby points in a high-dimensional space (Kanerva, 1988, 1993)
[Figure: SDM interpreted as computer memory]
[Figure: SDM interpreted as a feedforward neural network]
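A toy sketch of the textbook SDM (Kanerva, 1988), not the thesis implementation; all parameters are illustrative. Writes update counters at every hard location within a Hamming radius of the address; reads sum those counters and threshold at zero.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_LOC, RADIUS = 256, 2000, 112   # illustrative parameters

# Fixed random hard locations; counters accumulate stored data.
addresses = rng.integers(0, 2, (N_LOC, DIM))
counters = np.zeros((N_LOC, DIM), dtype=int)

def active(addr):
    """Hard locations within Hamming distance RADIUS of the address."""
    return np.sum(addresses != addr, axis=1) <= RADIUS

def write(addr, data):
    """Increment counters for 1-bits, decrement for 0-bits."""
    counters[active(addr)] += 2 * data - 1

def read(addr):
    """Sum counters over active locations and threshold at zero."""
    return (counters[active(addr)].sum(axis=0) > 0).astype(int)

# Autoassociative storage: write a pattern at its own address,
# then read it back from a noisy version of itself.
pattern = rng.integers(0, 2, DIM)
write(pattern, pattern)
noisy = pattern.copy()
noisy[rng.choice(DIM, 15, replace=False)] ^= 1
print(np.mean(read(noisy) != pattern))   # ~0: the pattern is cleaned up
```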
Vector symbolic architectures (VSAs)
- Concepts and their interrelationships correspond to points in a high-dimensional space
- Able to represent concepts, relations, sequences…, to learn, generalize, associate…, and to perform analogy-making, using vector representations based on sound mathematical concepts and principles (Plate, 1994)
Vector symbolic architectures (VSAs) (cont'd)
- VSAs were developed to address some early criticisms of neural networks (Fodor and Pylyshyn, 1988) while retaining useful properties such as learning, generalization, pattern recognition, robustness and noise immunity (roughly 30% corruption is tolerable)
- The VSA framework provides mathematical operators for constructing, manipulating and querying compositional structures
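As an illustration of such operators, a sketch using the binary spatter codes of Paper B: binding is elementwise XOR and bundling is a bitwise majority vote. The record and its role names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 10_000

def hv():                      # random binary hypervector
    return rng.integers(0, 2, D)

def bind(a, b):                # binding: elementwise XOR (self-inverse)
    return a ^ b

def bundle(*vs):               # bundling: bitwise majority vote
    s = np.sum(vs, axis=0)
    out = (s * 2 > len(vs)).astype(int)
    ties = (s * 2 == len(vs))
    out[ties] = hv()[ties]     # break ties randomly
    return out

# Encode the record {name: Alice, age: 33} as a single vector.
NAME, AGE, ALICE, AGE33 = hv(), hv(), hv(), hv()
record = bundle(bind(NAME, ALICE), bind(AGE, AGE33))

# Query: unbind the role NAME and find the closest known item.
probe = bind(record, NAME)
items = {"ALICE": ALICE, "AGE33": AGE33, "NAME": NAME, "AGE": AGE}
best = min(items, key=lambda k: np.mean(items[k] != probe))
print(best)  # -> ALICE
```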
Analogy-making
- Analogy-making is a central element of cognition that enables animals to identify and manage new information by generalizing past experiences, possibly from a few learned examples
- Present theories of analogy-making usually divide this process into three or four stages (Eliasmith and Thagard, 2001)
- My work focuses mainly on the challenging mapping stage
Analogical mapping
- Analogical mapping is the process of mapping relations and concepts from one situation (a source), x, to another (a target), y: M : x → y
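The slides do not spell out how M is represented; as a hedged sketch, with XOR binding as in the binary spatter codes of Paper B, a single source-target pair already yields a usable mapping vector.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 10_000
x = rng.integers(0, 2, D)   # source representation
y = rng.integers(0, 2, D)   # target representation

# For a single example, the XOR of source and target acts as a
# mapping vector: binding it with the source yields the target.
M = x ^ y
assert np.array_equal(M ^ x, y)   # M : x -> y

# Bundling (majority vote) over several such example mappings is
# what lets a mapping generalize to novel inputs; that step is
# sketched in the Paper A and Paper B sections below.
```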
Analogical mapping (cont'd)
- The process of mapping relations and concepts that describe one situation (a source) to another (a target)
[Figures: “circle is above the square”; “square is below the circle”; novel “above–below” relations]
Generalization via analogical mapping (Neumann, 2001)
[Figure sequence: generalization via analogical mapping]
A difficult computational problem
- Considered as a graph-matching problem, analogical mapping is computationally challenging
- VSAs use compressed representations, not graphs
- The ability to encode symbol-like approximate representations makes VSAs computationally feasible and psychologically plausible (Gentner and Forbus, 2011; Eliasmith, 2013)
Sum-up
- I have adopted a vector-based geometric approach to cognitive computation because it appears to be sufficiently potent and suitable for implementation in resource-constrained devices
- A central part of the work deals with analogy-making and learning as key mechanisms enabling interoperability between heterogeneous systems, much like ontologies play a central role in service-oriented architecture and the semantic web
  - Raad and Evermann (2014): Is Ontology Alignment like Analogy?
Thesis – Appended papers
A. Emruli, B. and Sandin, F. (2014): Analogical Mapping with Sparse Distributed Memory: A Simple Model that Learns to Generalize from Examples
B. Emruli, B., Gayler, R. W. and Sandin, F. (2013): Analogical Mapping and Inference with Binary Spatter Codes and Sparse Distributed Memory
C. Emruli, B., Sandin, F. and Delsing, J. (2014): Vector Space Architecture for Emergent Interoperability of Systems by Learning from Demonstration
D. Sandin, F., Emruli, B. and Sahlgren, M. (2014): Random Indexing of Multi-dimensional Data
Paper A
Emruli, B. and Sandin, F., Cognitive Computation 6(1):74–88, 2014
Q1: Is it possible to extend the sparse distributed memory model so that it can store multiple mapping examples of compositional structures and make correct analogies from novel inputs?
[Figure: the analogical mapping unit (AMU), built on the SDM]
[Figure: results, size of the memory and generalization, with a minimum in the probability of error]
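As a hedged reading of the AMU idea (the summary slide states only that the AMU integrates mapping vectors with SDM; the exact model is in the paper), a toy version stores the mapping vector x XOR y in an SDM addressed by the source x and then queries with a novel input. All parameters are illustrative, and with these choices the sets of activated locations typically overlap enough for exact retrieval.

```python
import numpy as np

rng = np.random.default_rng(3)
DIM, N_LOC, RADIUS = 256, 2000, 112

addresses = rng.integers(0, 2, (N_LOC, DIM))
counters = np.zeros((N_LOC, DIM), dtype=int)

def active(addr):
    return np.sum(addresses != addr, axis=1) <= RADIUS

# Store a mapping example: the mapping vector x XOR y, addressed by x.
x, y = rng.integers(0, 2, DIM), rng.integers(0, 2, DIM)
counters[active(x)] += 2 * (x ^ y) - 1

# Query with a novel (noisy) source, retrieve an approximate mapping
# vector, and apply it to predict the corresponding target.
x_novel = x.copy()
x_novel[rng.choice(DIM, 15, replace=False)] ^= 1
M = (counters[active(x_novel)].sum(axis=0) > 0).astype(int)
y_pred = M ^ x_novel
print(np.mean(y_pred != y))   # small error rate on the novel input
```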
Paper B
Emruli, B., Gayler, R. W. and Sandin, F., IJCNN 2013, Dallas, TX, Aug. 4–9, 2013
Q2: If such an extended sparse distributed memory model is developed, can it learn and infer novel patterns in sequences, such as those encountered in widely used intelligence tests like Raven's Progressive Matrices?
[Figures: bidirectionality of mapping vectors; the bidirectionality problem]
[Figure: Raven's Progressive Matrices, after Rasmussen, R. and Eliasmith, C., Topics in Cognitive Science 3(1), 2011]
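A hedged sketch of why bidirectionality arises: XOR binding is commutative and self-inverse, so a raw mapping vector maps in both directions even when only x → y has been demonstrated. (Per the summary slide, storing mapping vectors in the SDM is what breaks this symmetry.)

```python
import numpy as np

rng = np.random.default_rng(4)
x, y = rng.integers(0, 2, 10_000), rng.integers(0, 2, 10_000)
M = x ^ y

# The same mapping vector maps the source to the target and the
# target back to the source.
assert np.array_equal(M ^ x, y)   # forward:  M : x -> y
assert np.array_equal(M ^ y, x)   # backward: M : y -> x
```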
[Figure sequence: learning mapping vectors with the SDM]
[Figure: prediction with the SDM]
[Figure: results]
Paper C
Emruli, B., Sandin, F. and Delsing, J., Biologically Inspired Cognitive Architectures 9:33–45, 2014
Q3: Could extended sparse distributed memory and vector-symbolic methodologies such as those considered in Q1 and Q2 be used to address the problem of designing an architecture that enables heterogeneous IoT devices and systems to interoperate autonomously and adapt to instructions in dynamic environments?
[Figure: communication architecture, no shared operational semantics (Sheth, 1999; Obrst, 2003; Baresi et al., 2013)]
[Figure: automation system]
Learning by demonstration
- Alice and Bob interact with the four systems to achieve a particular goal
- The instructions of Alice and Bob are the same
[Figure: results, one instruction per day by Alice and Bob]
Paper D
Sandin, F., Emruli, B. and Sahlgren, M., submitted to Knowledge and Information Systems
Q4: Is it possible to extend the traditional method of random indexing to handle matrices and higher-order arrays in the form of N-way random indexing, so that more complex data streams and semantic relationships can be analyzed? What are the other implications of this extension?
Random indexing (RI)
- Random indexing is (traditionally) an approximative method for dimension reduction and semantic analysis of pairwise relationships
- Main properties (illustrated in the sketch below):
  - concepts and their interrelationships correspond to random points in a high-dimensional space
  - incremental coding/learning
  - lightweight, suitable for processing of streaming data
  - accuracy comparable to standard methods for dimension reduction
- Applications:
  - natural language processing
  - search engines
  - pattern recognition (e.g., event detection in blogs)
  - graph searching (e.g., social network analysis)
  - other machine learning applications
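A minimal sketch of traditional (one-way) RI illustrating the properties above; the toy corpus and all parameters are invented for the example. Each word gets a fixed sparse ternary index vector, and a context vector is accumulated incrementally from co-occurring words.

```python
import numpy as np

rng = np.random.default_rng(5)
D, NNZ = 2_000, 20   # reduced dimensionality, nonzeros per index vector

def index_vector():
    """Sparse ternary random index vector: NNZ/2 +1s and NNZ/2 -1s."""
    v = np.zeros(D)
    pos = rng.choice(D, NNZ, replace=False)
    v[pos[:NNZ // 2]], v[pos[NNZ // 2:]] = 1.0, -1.0
    return v

# A toy corpus; "cats" and "felines" appear in the same contexts.
docs = [["cats", "purr"], ["cats", "meow"],
        ["felines", "purr"], ["felines", "meow"],
        ["dogs", "bark"]]

vocab = sorted({w for doc in docs for w in doc})
index = {w: index_vector() for w in vocab}
context = {w: np.zeros(D) for w in vocab}

# One incremental pass: add the index vectors of co-occurring words.
for doc in docs:
    for w in doc:
        for c in doc:
            if c != w:
                context[w] += index[c]

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

print(cos(context["cats"], context["felines"]))   # high: shared contexts
print(cos(context["cats"], context["dogs"]))      # near zero
```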
[Figure: results, one-way versus two-way random indexing (RI)]
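To convey what the two-way extension adds, here is a hedged sketch of the general idea, a simplification rather than the paper's exact formulation: entries of a large matrix are accumulated into a smaller fixed-size array via outer products of sparse ternary index vectors for the rows and columns, and decoded by correlating with the same vectors.

```python
import numpy as np

rng = np.random.default_rng(6)
BIG, SMALL, NNZ = 1_000, 300, 8   # original size, reduced size, sparsity

def index_vectors(n):
    """One sparse ternary index vector per row/column of the big matrix."""
    V = np.zeros((n, SMALL))
    for i in range(n):
        pos = rng.choice(SMALL, NNZ, replace=False)
        V[i, pos[:NNZ // 2]], V[i, pos[NNZ // 2:]] = 1.0, -1.0
    return V

rows, cols = index_vectors(BIG), index_vectors(BIG)
B = np.zeros((SMALL, SMALL))      # fixed-size representation

# Encode a stream of (i, j, value) triples from a 1000 x 1000 matrix.
triples = [(3, 7, 5.0), (42, 99, -2.0), (500, 1, 7.5)]
for i, j, a in triples:
    B += a * np.outer(rows[i], cols[j])

# Decode entries by correlating with the same index vectors.
for i, j, a in triples:
    est = rows[i] @ B @ cols[j] / NNZ**2
    print(f"stored {a:5.1f}  decoded {est:6.2f}")   # ~ stored, plus noise
```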
Anecdote
“As an engineer, this can feel like a deal with the devil, as you have to accept error and uncertainty in your results. But the alternative is no results at all!”
– Pete Warden, data scientist and former Apple engineer
[Figure: results, two-way RI versus PCA]
Gavagai AB: Opinion mining (Melodifestivalen 2012)

Contestant      | Viewer votes | Gavagai forecast
Danny Saucedo   | 30 %         | 33 %
Thorsten Flinck | 22 %         | 8 %
Loreen          | 12 %         | 22 %
Summary
- The proposed AMU integrates the idea of mapping vectors with sparse distributed memory
  - Demonstration of transparent learning and application of multiple analogical mappings
  - The AMU solves a particular type of Raven's matrix
- The SDM breaks the commutative (bidirectionality) property of the binary mapping vectors
Summary (cont'd)
- Outline of a communication architecture that enables system interoperability by learning, without reference to a shared operational semantics
  - Presents a novel approach to a challenging problem
- Extension of random indexing (RI) to multiple dimensions in an approximately fixed-size representation
  - Comparison of two-way RI with traditional (one-way) RI and PCA
Limitations
- Hand-coding of the representations
- The examples addressed in Paper C are relatively simple; more complex examples and symbolic representation schemes are needed to further test the architecture
- An attention mechanism needs to be developed
- Extension to higher-order Markov chains
- In Paper D only one- and two-way RI are investigated; the problems considered are relatively small in scale and are not demonstrated on streaming data
Future work
- Apply the architecture outlined in Paper C in a “Living Lab” equipped with technology similar to that described in the hypothetical automation scenario
- Improve and further investigate, both empirically and theoretically, the implications of the N-way random indexing (NRI) extension
- Is the mathematical framework sufficiently general?
“A beloved child has many names.”
- Holographic Reduced Representation (HRR), 1994
- Context-Dependent Thinning (CDT), 2001
- Vector Symbolic Architecture (VSA), 2003
- Hyperdimensional Computing (HC), 2009
- Analogical Mapping Unit (AMU), 2013
- Semantic Pointer Architecture (SPA), 2013
- Matrix Binding of Additive Terms (MBAT), 2014
Key readings
- Sparse Distributed Memory (Kanerva, 1988)
- Conceptual Spaces (Gärdenfors, 2000)
- Holographic Reduced Representation (Plate, 2003)
- Geometry and Meaning (Widdows, 2004)
- How to Build a Brain (Eliasmith, 2013)
- The Geometry of Meaning (Gärdenfors, 2014)
Credits
Supervisors: JERKER DELSING, FREDRIK SANDIN, LENNART GUSTAFSSON
Coauthors: ROSS GAYLER, MAGNUS SAHLGREN
Discussions and inspiration: ASAD KHAN, PENTTI KANERVA, BRUNO OLSHAUSEN, CHRIS ELIASMITH
Financial support: STINT, the ARROWHEAD project, NORDEAS NORRLANDSSTIFTELSE, and the WALLENBERG FOUNDATION
COLLEAGUES, FAMILY AND FRIENDS
THE END
… or perhaps the beginning