
Hierarchical Temporal Memory as a
Means for Image Recognition
by Wesley Bruning
CHEM/CSE 597D
Final Project Presentation
December 10, 2008
The Grand Scheme


A free, on-line resource that allows anyone to find
information about symbols; a compendium of
symbols, their names, meanings, and histories.
Symbols? Yes, symbols!



The Star of David (hexagram), the Greek symbol
Sigma, the Masonic compass, the Wheel of
Dharma, the bass clef, company logos, et cetera
It would fill a niche: one that is relatively easy to
fill, and one that should be filled eventually.
Not for profit.
The Neat Feature

Users can search for symbols by drawing or
uploading pictures.



“What does this mean?”
The server(s) will house a program that receives the
image and determines which symbol in the database
the user desires.
Image recognition.
Computer Vision



Visual pattern recognition, like understanding
language and physically manipulating objects, is
difficult for computers.
There are no viable algorithms for performing these
functions on a computer.1
For humans, these are easy.
1 J. Hawkins and D. George, “Hierarchical Temporal Memory – Concepts, Theory, and Terminology,”
Whitepaper, Numenta Inc.
A Couple of Existing Models

“Classic” artificial neural networks


At least 3 layers of nodes
Bayesian networks

Directed acyclic graph
Hierarchical Temporal Memory

Abbreviated HTM.

A novel machine learning paradigm.


Can be considered a type of artificial neural network,
but the founding principles differ.
(I will only discuss the higher-level concepts of
HTM—not its learning algorithms and the like)
Why HTM?



It's rather new/untested.
Has already shown promising results in the area of
visual pattern recognition.
Biologically inspired.
The Biological Inspiration




HTM is based on a hierarchical theory of the human
brain's neocortex and thalamus; it seeks to replicate
their biological functions.
A top-down solution that models the brain as a
“device that computes by performing sophisticated
pattern matching and sequence prediction.”1
A hierarchy of uniform processing elements.
HTM implements invariant pattern recognition, as
seen in the visual cortex.
1 K. L. Rice, et al. “A Preliminary Investigation of a Neocortex Model Implementation on the Cray XD1.”
Assumptions in the Basic Theory



The neocortex is an efficient pattern matching
device, not a computing engine.1
The brain learns by storing patterns. It recognizes by
matching sensory data to learned patterns.2
The structure of the world is hierarchical: temporal
as well as spatial. E.g. “A speaker expresses an idea
over time by combining consonants and vowels to
make syllables, syllables to make words, etc.”3
1 J. Hawkins. "Learn Like a Human."
2 K. L. Rice, et al. “A Preliminary Investigation of a Neocortex Model Implementation on the Cray XD1.”
3 http://www.numenta.com/for-developers/education/htm-summary.php
How Does HTM Work?


It's (not) a black box!
It's a hierarchy of connected
nodes.
An HTM Network




Multiple levels of nodes.
Sensory data is input to
the lowest level, and a
belief is generated at the
top level.
Information is exchanged
from parent to child and
vice versa.
Each node performs the
same learning algorithm.
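
The structure described on this slide can be sketched as a tree of identical nodes. This is only an illustration of the layout, not Numenta's implementation: the names `Node`, `build_network`, `nodes_per_level`, and the parameter `fan_in` are hypothetical, and the sketch assumes every parent has the same number of children.

```python
class Node:
    """One node in the hierarchy. Every node is the same kind of object."""

    def __init__(self, level, parent=None):
        self.level = level        # 0 is the bottom (sensory) level
        self.parent = parent      # link used for child-to-parent messages
        self.children = []        # links used for parent-to-child messages


def build_network(levels, fan_in):
    """Build a tree with `levels` levels where each parent has `fan_in`
    children (an assumed, uniform layout). Returns the single top node."""
    top = Node(level=levels - 1)
    frontier = [top]
    for level in range(levels - 2, -1, -1):
        next_frontier = []
        for parent in frontier:
            for _ in range(fan_in):
                child = Node(level=level, parent=parent)
                parent.children.append(child)
                next_frontier.append(child)
        frontier = next_frontier
    return top


def nodes_per_level(top):
    """Count the nodes at each level, listed from the bottom level up."""
    counts = {}
    stack = [top]
    while stack:
        node = stack.pop()
        counts[node.level] = counts.get(node.level, 0) + 1
        stack.extend(node.children)
    return [counts[level] for level in sorted(counts)]


network = build_network(levels=3, fan_in=4)
print(nodes_per_level(network))  # [16, 4, 1]
```

Because each node holds links to both its parent and its children, information can travel in either direction, as the slide states.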
This Looks Similar to Some Types of
Artificial Neural Networks


HTM can be considered a type of ANN, as well as a
type of Bayesian network.
Big Difference: The majority of these networks try
to emulate individual neurons, not the overall
structure of the neocortex.

Temporal data is typically not handled (well).

Different learning algorithms are used.
So How Does it Work?



Each node looks at its input and learns the “cause”
of its input. A “cause” is whatever causes the input
pattern to occur.
The outputs of the nodes in one level become the
inputs of the nodes in the next level.
So! The nodes at the lower levels discover simple
causes, such as edges and corners, while the nodes at
the higher levels discover complex causes, such as
faces. Intermediate nodes find causes of intermediate
complexity.
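
A toy sketch of how outputs at one level become inputs at the next. It assumes each node simply memorizes the exact input tuples it has seen and reports an index for each one; that memorization is a stand-in for learning a "cause" and is not HTM's learning algorithm. `ToyNode` and `run_hierarchy` are hypothetical names.

```python
class ToyNode:
    """Memorizes each distinct input tuple and reports its index."""

    def __init__(self):
        self.patterns = {}

    def process(self, inputs):
        key = tuple(inputs)
        if key not in self.patterns:
            # First time this input is seen: store it as a new pattern.
            self.patterns[key] = len(self.patterns)
        return self.patterns[key]


def run_hierarchy(levels, sensory_input):
    """`levels` is a list of lists of ToyNode, bottom level first.
    Each node reads an equal-sized slice of the signal from the level below."""
    signal = list(sensory_input)
    for nodes in levels:
        width = len(signal) // len(nodes)
        signal = [node.process(signal[i * width:(i + 1) * width])
                  for i, node in enumerate(nodes)]
    return signal[0]  # output of the single top-level node


levels = [[ToyNode() for _ in range(4)],
          [ToyNode() for _ in range(2)],
          [ToyNode()]]
a = run_hierarchy(levels, [0, 1, 1, 0, 0, 0, 1, 1])
b = run_hierarchy(levels, [1, 1, 1, 0, 0, 0, 1, 1])
c = run_hierarchy(levels, [0, 1, 1, 0, 0, 0, 1, 1])
print(a, b, c)  # 0 1 0
```

The bottom nodes only ever see two-value patches, while the top node sees combinations of combinations, which is the sense in which higher levels deal with more complex causes.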
Beliefs
How Do Nodes Generate Beliefs?


1. Node looks at input and assigns a probability that
the input matches a spatial pattern.
2. The node takes this probability distribution and
combines it with previous state information to assign
a probability that the current input matches a
temporal sequence.
3. The distribution over the set of sequences is the
output of the node and is passed up the hierarchy.
Finally, if the node is still learning, it might modify
the set of stored spatial and temporal patterns to
reflect the new input.1
1 J. Hawkins and D. George, “Hierarchical Temporal Memory – Concepts, Theory, and Terminology.”
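
The three steps above can be sketched for a single node as follows. This is a minimal illustration under stated assumptions, not the algorithm from the whitepaper: similarity is assumed to fall off exponentially with Hamming distance, a sequence is assumed to be a list of stored pattern indices, and the `stickiness` blending of the previous belief with a uniform prior is invented for the example. The learning step (modifying stored patterns) is left out.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def spatial_belief(inp, patterns):
    """Step 1: probability that the input matches each stored spatial pattern.
    Assumption: similarity decays exponentially with Hamming distance."""
    scores = [math.exp(-sum(a != b for a, b in zip(inp, p))) for p in patterns]
    return normalize(scores)


def sequence_belief(spatial, sequences, previous, stickiness=0.5):
    """Step 2: combine the spatial distribution with the previous state.
    Evidence for a sequence is the total spatial probability of its member
    patterns; the prior blends the previous belief with a uniform one."""
    uniform = 1.0 / len(sequences)
    beliefs = []
    for seq, prev in zip(sequences, previous):
        evidence = sum(spatial[i] for i in seq)
        prior = stickiness * prev + (1.0 - stickiness) * uniform
        beliefs.append(evidence * prior)
    # Step 3: the normalized distribution is what gets passed up the hierarchy.
    return normalize(beliefs)


patterns = [(1, 0, 0), (1, 1, 0), (0, 0, 1), (0, 1, 1)]  # stored spatial patterns
sequences = [[0, 1], [2, 3]]                              # stored temporal sequences
belief = [0.5, 0.5]                                       # no prior information
for frame in [(1, 0, 0), (1, 1, 0)]:
    belief = sequence_belief(spatial_belief(frame, patterns), sequences, belief)
print(belief)
```

After the two frames, the first sequence carries most of the probability, since both frames match its member patterns exactly.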
In Pictures

Discovering spatial patterns

Discovering temporal patterns (sequences)
Past Trial


“Using Numenta’s hierarchical temporal memory to
recognize CAPTCHAs”1

HTM performed well, but performance could have
been improved with more time

Concluded HTMs are designed well to recognize
CAPTCHAs
1 Y. J. Hall and R. E. Poplin, “Using Numenta’s hierarchical temporal memory to recognize
CAPTCHAs”
Past Trial

“Content-Based Image Retrieval Using Hierarchical
Temporal Memory”1

HTM was robust to spatial noise, blurring, and
other distortions despite having been trained on
only clean, undistorted images

Concluded HTMs are flexible enough to provide
efficient and accurate indexing of line drawings
1 B. A. Bobier and M. Wirth, “Content-Based Image Retrieval Using Hierarchical Temporal Memory”
My Own Firsthand First Impression

Testing HTM's image recognition capabilities.
Testing HTM's Image Recognition
Capabilities
Results

With simple black and white tests, it was very
successful.

Handled noisy data well.

Not so good with rotated images.

Reason? Predominantly training.

Training is essential.
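
If training is the main reason rotated images fare poorly, one possible remedy is to add rotated copies of each training image. The sketch below is a hypothetical preprocessing step that was not part of these tests; it covers only right-angle rotations and represents images as nested lists of pixel values. `rotate90` and `with_rotations` are illustrative names.

```python
def rotate90(image):
    """Rotate a 2D list of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def with_rotations(training_set):
    """Return (image, label) pairs covering 0, 90, 180 and 270 degrees."""
    augmented = []
    for image, label in training_set:
        current = [list(row) for row in image]
        for _ in range(4):
            augmented.append((current, label))
            current = rotate90(current)
    return augmented


arrow = [[0, 1, 0],
         [1, 1, 1],
         [0, 0, 0]]
augmented = with_rotations([(arrow, "arrow")])
print(len(augmented))  # 4
```

Whether this would actually improve recognition of rotated symbols would need to be measured; arbitrary angles would also require interpolation, which this sketch does not attempt.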
Is HTM a Viable Option?



Yes. It has already proven it is a good candidate for
simple image processing.
I still need to conduct more experiments to find its
boundaries, e.g. color images, more complex
images, and a larger database of trained images.
Once these boundaries are found, I must decide whether
it is worth finding solutions within HTM technology.

I may need to implement additional processing.

Numenta is a business; this is their product.