Telluride Workshop

Implementing HMAX with an
Integrate-&-Fire Array Transceiver
Ralph Etienne-Cummings, Fope Folowosele, R. Jacob Vogelstein, Gert Cauwenberghs*
The Johns Hopkins University
*UC – San Diego
Outline
 Introduction
 Neural Arrays
› Our Integrate-&-Fire Array Transceivers
 Visual Object Recognition Pathways
› Models – HMAX
 HMAX with IFAT
 Conclusion

Introduction
 Object detection, recognition and tracking are
computationally difficult tasks
 Primates excel at these tasks
 Engineered systems are unable to match their
level of proficiency, flexibility and speed
 Robots and other artificial systems are limited
in their ability to interact with the
environment
Big Picture

Our overall goal is to work towards developing a real-time autonomous intelligent system that can detect, recognize and track objects under various viewing conditions
› Detect: sense the presence of an object (cross-correlation)
› Recognize: identify and categorize the object (spiking HMAX)
› Track: monitor object movement (neural Kalman filter)
The Approach
 Emulate cortical functions of primates to
design more intelligent artificial systems
› Mimic the information processing of the primate visual system
› Model computationally-intensive algorithms in
neural hardware
Potential Applications
Population Surveillance
and Visual Search Engines
Visual Prosthesis and
Ocular Implants
Research Tool for Neuroscientists
Techarena 2009; Future Predictions 2008; R. Friedman, Biomedical Computation Review 2009
Project Plan
 Develop a spike-based processing platform on
which we can demonstrate object detection,
recognition and tracking
› Design the next generation neural array transceiver
› Realize silicon facsimiles of cortical simple cells, complex cells, composite feature cells and the MAX operation
› Implement spike-based classification
› Implement neural algorithms analogous to cross-correlation and Kalman filtering for object detection and tracking, respectively
Outline
 Introduction
 Neural Arrays
› Our Integrate-&-Fire Array Transceivers
 Visual Object Recognition Pathways
› Models – HMAX
 HMAX with IFAT
 Conclusion

Software vs. Hardware Models
 Software models run slower than real time and are unable to interact with the environment
 Silicon designs take a few months to be fabricated, after which they are constrained by limited flexibility
IBM 2004; Tenore 2008
Solution: Reconfigurable Models
 Neural array transceivers are reconfigurable systems consisting of large arrays of silicon neurons
 Useful for studying real-time operations of cortical, large-scale neural networks
› Able to leverage known fundamental building blocks, such as the operation of neurons and synapses
› Flexible enough for testing out unknowns
[Taxonomy: digital, application-specific, and general-purpose neural array transceivers]
Application-Specific Neural Array
Transceivers
 Specific to particular neural processes such as
› Spatial frequency and orientation (Choi et al.
2005)
› Acoustic localization (Horiuchi & Hynna 2001)
› Retinotopic self-organization (Taba & Boahen
2006)
› Learning and Memory (Arthur & Boahen 2004,
2006)
Digital Neural Array Transceivers
 Utilize digital logic as an alternative approach to
analog VLSI designs
› FPGA conductance-based neuron model (Graas et al. 2004)
› FPGA leaky integrate-and-fire neuron model (Pearson et al. 2005)
› DSP and FPGA populations of cortical cells for retinotopic maps (Shi et al. 2006)
› FPGA spike response neuron model (Ros et al. 2006)
› FPGA Izhikevich neural models (Cassidy & Andreou 2008)
General Purpose Neural Array Transceivers
 More easily amenable to multiple tasks
› Integrate-and-fire cooperative-competitive ring of
neurons (Chicca & Indiveri 2006)
› Integrate-and-fire with stop learning neural array
(Mitra & Indiveri 2008)
› Hodgkin-Huxley type neural array (Zou et al. 2006)
› Integrate-and-fire array transceiver (Goldberg et
al. 2001; Vogelstein et al. 2004, Folowosele et al.
2008)
Why Integrate-and-Fire Array
Transceiver?
 Flexible
› No local or hardwired connectivity
 Reprogrammable
› Virtual synaptic connections with programmable weight and equilibrium potential allow arbitrary connection topologies (a synapse-table sketch follows below)
 Expandable
› Multiple chips can be connected together
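To make the reprogrammability concrete, here is a minimal Python sketch of the idea (names and values are illustrative, not the actual IFAT firmware): connectivity lives in a RAM-resident table mapping each presynaptic address to (target, weight, equilibrium potential) entries, so changing the network means rewriting table entries rather than rewiring silicon.

```python
from collections import defaultdict

class SynapseTable:
    """RAM-resident routing table: all connectivity is data, not wiring."""
    def __init__(self):
        # presynaptic address -> list of (post address, weight, equilibrium potential)
        self.table = defaultdict(list)

    def connect(self, pre, post, weight, e_rev):
        self.table[pre].append((post, weight, e_rev))

    def targets(self, pre):
        return self.table[pre]

# Example: one neuron fans out through an excitatory and an inhibitory synapse
syn = SynapseTable()
syn.connect(pre=7, post=12, weight=0.3, e_rev=1.5)   # E above threshold: excitatory
syn.connect(pre=7, post=13, weight=0.3, e_rev=-0.5)  # E below rest: inhibitory
```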
Outline
 Introduction
 Neural Arrays
Our Integrate-&-Fire Array Transceivers
 Visual Object Recognition Pathways
 Models – HMAX
 HMAX with IFAT
 Conclusion

Integrate-and-Fire Array Transceiver
(IFAT)
 One of the earliest designs was by D.H. Goldberg et al. in 2001
 The chip was designed in a 0.5-micron process on
a 1.5mm x 1.5mm die
› 1024 integrate-and-fire neurons
› 128 probabilistic synapses with two
sets of fixed parameters
D.H. Goldberg, Neural Networks, 2001
2nd Generation Integrate-and-Fire
Array Transceiver (IFAT)
 Each neuron implements a discrete-time model of a single-compartment neuron using a switched-capacitor architecture (membrane update sketched below)
 Synapses have two internal parameters
› Synaptic weight
› Equilibrium potential
 2400 neurons per chip
 4,194,304 synapses
R.J. Vogelstein et al., IEEE Trans. Neural Networks 2007a
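A minimal sketch of the event-driven membrane update implied by this slide, simplified from the conductance-based model in Vogelstein et al. 2007a; the constants below are illustrative, not chip values.

```python
# Switched-capacitor, event-driven membrane update (simplified sketch).
V_RESET = 0.0      # membrane potential after a spike (arbitrary units)
V_THRESH = 1.0     # firing threshold

def synaptic_event(v_m, weight, e_rev):
    """One presynaptic event: charge sharing steps v_m toward the synapse's
    equilibrium potential e_rev by a fraction set by the programmable weight."""
    v_m = v_m + weight * (e_rev - v_m)
    if v_m >= V_THRESH:          # threshold crossing emits an address event
        return V_RESET, True
    return v_m, False

# Example: three excitatory events (e_rev above threshold) drive a spike
v, spiked = 0.0, False
for _ in range(3):
    v, spiked = synaptic_event(v, weight=0.4, e_rev=1.5)
    print(f"v_m = {v:.3f}, spiked = {spiked}")
```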
IFAT Operation
 Incoming and outgoing address events are communicated through the digital I/O port (DIO)
 The MCU looks up the synaptic parameters (conductance and driving potential) and the target neuron address in RAM
 It then provides the parameters (the driving potential via the DAC) to the appropriate neuron on the I&F chip (event loop sketched below)
R.J. Vogelstein et al., IEEE Trans. Neural Networks 2007a
IFAT Operation
R.J. Vogelstein et al., IEEE Trans. Neural Networks 2007a
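A hedged sketch of the event loop just described (not the real MCU firmware); it reuses the SynapseTable from the earlier sketch as the RAM routing table.

```python
from collections import deque

class IFChip:
    """Toy stand-in for the I&F neuron array: one membrane value per address."""
    def __init__(self, n_neurons, v_thresh=1.0):
        self.v = [0.0] * n_neurons
        self.v_thresh = v_thresh

    def apply_event(self, post, weight, e_rev):
        # Same switched-capacitor step as the neuron sketch above
        self.v[post] += weight * (e_rev - self.v[post])
        if self.v[post] >= self.v_thresh:
            self.v[post] = 0.0
            return True               # neuron fired: emit an address event
        return False

def run_event_loop(initial_events, synapse_table, chip, max_events=100000):
    """Route events: incoming address -> RAM lookup -> parameters to neuron."""
    queue = deque(initial_events)
    handled = 0
    while queue and handled < max_events:
        pre = queue.popleft()                                   # event arrives via DIO
        for post, weight, e_rev in synapse_table.targets(pre):  # RAM lookup
            if chip.apply_event(post, weight, e_rev):           # E delivered via DAC
                queue.append(post)           # outgoing event recirculates
        handled += 1

chip = IFChip(n_neurons=2400)
run_event_loop([7, 7, 7], syn, chip)   # syn: SynapseTable from the earlier sketch
```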
Spike-Based CMOS Cameras:
Octopus
[Figure: Octopus pixel schematic (photocurrent Ic integrates until threshold, generating an event and a reset), imaging concept, and sample image; the rate encoding is sketched below]
Other Approaches:
- W. Yang, "Oscillator in a Pixel," 1994
- J. Harris, "Time to First Spike," 2002
- A. Bermak, "Arbitrated Time to First Spike," 2007
Culurciello, Etienne-Cummings & Boahen, 2001, 2003
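The integrate-to-threshold encoding can be illustrated with a few lines of Python (a toy model, not the actual circuit): brighter pixels reach threshold sooner and therefore emit address events at a higher rate.

```python
def pixel_events(intensity, threshold=1.0, steps=100):
    """Return event times for one pixel; photocurrent is taken ~ intensity."""
    v, events = 0.0, []
    for t in range(steps):
        v += intensity            # integrate photocurrent on the capacitor
        if v >= threshold:        # fire an address event and reset
            events.append(t)
            v = 0.0
    return events

print(len(pixel_events(0.30)), "events for a bright pixel")
print(len(pixel_events(0.05)), "events for a dim pixel")
```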
IFAT Results
R.J. Vogelstein et al., NIPS, 2005
IFAT 3G: 3D Design in 150 nm CMOS
• Tier A: Address Event Representation (AER) communication circuits – receiver and transmitter
• Tier B: Synapse, bursting circuit, control circuit
• Tier C: Neuron, spike-generating circuit
In collaboration with the Sensory Communication and Microsystems Lab
Outline
 Introduction
 Neural Arrays
› Our Integrate-&-Fire Array Transceivers
 Visual Object Recognition Pathways
› Models – HMAX
 HMAX with IFAT
 Conclusion
Visual Pathways
 Primary visual cortex (V1) transmits information to two primary pathways
› Dorsal stream
› Ventral stream
 The dorsal pathway is associated with motion
 The ventral pathway mediates the visual identification of objects
T. Poggio, NIPS, 2007
Wikipedia, The Free Encyclopedia
Object Recognition for Computer
Vision
T. Poggio, NIPS 2007
Neurobiological Software Models
 VisNet (Wallis & Rolls 1997)
› Homogeneous architecture for invariance and specificity
 HMAX (Riesenhuber & Poggio 1999)
› Feature complexity and invariance are alternately increased in successive layers of a processing hierarchy
› Utilizes different computational mechanisms to attain invariance and specificity
VisNet
 VisNet is a four-layer feedforward network
 A series of hierarchical competitive networks with local graded inhibition
 Convergent connections to each neuron from a topologically corresponding region of the preceding layer
 Synaptic plasticity based on a modified Hebbian learning rule with a temporal trace of each cell's previous activity (the trace rule is sketched below)
E. Rolls & T. Milward, Neural Computation 2000
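For concreteness, the trace rule takes the following form (after Rolls & Milward 2000); the learning rate alpha and trace constant eta below are arbitrary illustrative values.

```python
import numpy as np

def trace_rule_update(w, x, y, y_trace, eta=0.8, alpha=0.1):
    """One timestep of the trace rule: blend the current output y into a
    running trace, then make a Hebbian update gated by that trace."""
    y_trace = (1.0 - eta) * y + eta * y_trace   # temporal trace of activity
    w = w + alpha * y_trace * x                 # trace-modulated Hebbian step
    return w, y_trace

# Example: presenting two transforms (views) of the same object in sequence
# lets the trace bind both views to the same output cell.
w, y_trace = np.zeros(4), 0.0
for x in [np.array([1.0, 0, 0, 0]), np.array([0, 1.0, 0, 0])]:
    y = float(w @ x) + 1.0          # toy output, clamped active
    w, y_trace = trace_rule_update(w, x, y, y_trace)
print(w)
```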
HMAX
 Summarizes and integrates a large amount of data from different levels of understanding (from biophysics to physiology to behavior)
 Two main operations occur in the model (both written out below)
› Gaussian-like tuning operation in the S layers
› Nonlinear MAX-like operation in the C layers
M. Riesenhuber & T. Poggio, Nature Neuroscience 1999
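Written out directly, the two operations are one line each (a sketch; sigma is a free tuning-width parameter, not a value from the paper).

```python
import numpy as np

def s_unit(x, prototype, sigma=1.0):
    """Gaussian-like tuning: response peaks when the input matches the prototype."""
    return np.exp(-np.sum((x - prototype) ** 2) / (2.0 * sigma ** 2))

def c_unit(afferents):
    """MAX-like pooling: response follows the strongest afferent S unit."""
    return np.max(afferents)
```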
An Implementation
Serre et al. 2007
System Layers
 S1
› Corresponds to the classical simple cells of Hubel and Wiesel found in V1
› Gaussian-like tuning to one of four possible orientations with different filter sizes
 C1
› Corresponds to the complex cells of Hubel and Wiesel
› MAX pooling operation over S1 cells with the same orientation and scale band
 S2
› Pools over C1 units from a local spatial neighborhood
› Behaves like radial basis function units – Gaussian-like dependence on the Euclidean distance between a new input and a stored prototype
 C2
› Global maximum over all scales and positions for each S2 type over the entire S2 lattice (all four stages are sketched below)
Serre et al. 2007
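A hedged end-to-end sketch of the four layers in Python/NumPy. The filter size, pooling range, and sigma are illustrative stand-ins for the multi-scale parameter tables in Serre et al. 2007; only the four orientations come from the slides.

```python
import numpy as np
from scipy.ndimage import correlate, maximum_filter

def gabor(theta, size=7, sigma=2.0, lam=4.0):
    """One oriented Gabor filter, the standard choice for S1 tuning."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)
    return g - g.mean()

def hmax(image, prototypes, sigma=1.0):
    thetas = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]        # four orientations
    s1 = [np.abs(correlate(image, gabor(t))) for t in thetas]
    c1 = [maximum_filter(r, size=5)[::5, ::5] for r in s1]   # local MAX + subsample
    c1 = np.stack(c1)                                        # orientations x H x W
    # S2: Gaussian RBF between each C1 patch and each stored prototype
    ph, pw = prototypes.shape[-2:]
    s2 = []
    for p in prototypes:
        d = [np.sum((c1[:, i:i + ph, j:j + pw] - p) ** 2)
             for i in range(c1.shape[1] - ph + 1)
             for j in range(c1.shape[2] - pw + 1)]
        s2.append(np.exp(-np.array(d) / (2 * sigma**2)))
    # C2: global maximum per prototype -> one shift/scale-tolerant feature each
    return np.array([r.max() for r in s2])

# Toy usage: 3 prototypes, each a 4-orientation x 4x4 C1 patch
feats = hmax(np.random.rand(64, 64), prototypes=np.random.rand(3, 4, 4, 4))
print(feats)
```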
Learning and Classification Stages
 Learning
› During training, extract prototypes at the C1 level from the target image across all orientations
 Classification
› At runtime, extract C1 and C2 standard model features (SMFs) and pass them to a simple linear classifier (a sketch follows below)
Serre et al. 2007
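A sketch of the classification stage on C2 feature vectors. Serre et al. report results with boosting and an SVM; a dependency-free ridge-regression linear classifier is substituted here.

```python
import numpy as np

def train_linear(features, labels, ridge=1e-3):
    """features: (n_samples, n_c2) C2 vectors; labels: +/-1 per sample."""
    X = np.hstack([features, np.ones((len(features), 1))])  # absorb bias term
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ labels)
    return w[:-1], w[-1]

def predict(features, w, b):
    return np.sign(features @ w + b)

# Toy usage with random "C2 features"
rng = np.random.default_rng(0)
X, y = rng.normal(size=(20, 10)), np.sign(rng.normal(size=20))
w, b = train_linear(X, y)
print("training accuracy:", (predict(X, w, b) == y).mean())
```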
Scene Understanding System
Serre et al. 2007
Object Recognition in Clutter
 C2 responses are computed over a new input image and passed to a linear classifier
 Superior to previous approaches on the MIT-CBCL data sets
 Comparable to previous approaches on the CalTech5 data sets
Data set       Benchmark   C2 + Boost   C2 + SVM
CalTech5:
Leaves         84.0        97.0         95.9
Cars           84.8        99.7         99.8
Faces          96.4        98.2         98.1
Airplanes      94.0        96.7         94.9
Motorcycles    95.0        98.0         97.4
MIT-CBCL:
Faces          90.4        95.9         95.3
Cars           75.4        95.1         93.3
Serre et al. 2007
Summary
 Benefits to using the fine information from low-level SMFs
› C1 SMFs are superior for shape-based object recognition
 Benefits to using the more invariant high-level SMFs
› C2 SMFs are suitable for semi-supervised recognition of objects in clutter
› C2 SMFs excel at recognition of texture-based objects that lack geometric structure
 Too slow for real-time applications
Outline
 Introduction
 Neural Arrays
› Our Integrate-&-Fire Array Transceivers
 Visual Object Recognition Pathways
› Models – HMAX
 HMAX with IFAT
 Conclusion
HMAX on IFAT
 The system receives its inputs from silicon retinas
 Each simple cell receives inputs from four consecutive retinal cells (the synapse setup is sketched below)
› Two with excitatory connections
› Two with inhibitory connections
 Excitatory and inhibitory synaptic weights are balanced so that the simple cells do not respond to uniform light
R.J. Vogelstein et al., NIPS 2007
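Using the SynapseTable sketch from the earlier IFAT slide, the S1 wiring described above might be programmed as follows (addresses, weights, and potentials are illustrative, not the published values).

```python
# Program one IFAT S1 edge detector: four consecutive retinal pixels project
# to one simple cell, two excitatory and two inhibitory with balanced weights.
W = 0.2                     # shared magnitude: balance means uniform light cancels
E_EXC, E_INH = 1.5, -0.5    # equilibrium potentials above threshold / below rest

def wire_s1_cell(syn_table, retina_addrs, s1_addr):
    """retina_addrs: four consecutive pixel addresses feeding one S1 cell."""
    a, b, c, d = retina_addrs
    for pre in (a, b):                       # light-side pixels excite
        syn_table.connect(pre, s1_addr, W, E_EXC)
    for pre in (c, d):                       # dark-side pixels inhibit
        syn_table.connect(pre, s1_addr, W, E_INH)
```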
C1, S2 and beyond
 Implement the C1, S2 and possibly C2 stages of the HMAX model
 The HMAX model specifies a generic, high-level computational function in quantitative form
T. Serre, Dissertation 2006
Preliminary Results: S1 and C1 Stages
 S1 neurons are oriented spatial filters that detect local changes in contrast
 C1 neurons take the MAX of similarly-oriented simple cells over a region of space
 Each S1 cell integrates inputs from a 4x1 retinal receptive field
 Each C1 cell integrates inputs from an array of 5x5 similarly-oriented S1 cells
F. Folowosele et al., BioCAS 2008
Canonical Models
 Biologically plausible neural circuits exist for implementing both Gaussian-like and MAX-like operations (one canonical form is sketched below)
Kouh 2007
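One canonical form (after Kouh 2007) is a ratio of weighted input powers: depending on the exponents it behaves like a normalized dot product (Gaussian-like tuning) or like a softmax that approaches the MAX. The parameter choices below are illustrative.

```python
import numpy as np

def canonical(x, w, p, q, r, k=1e-6):
    """Canonical circuit: weighted input powers divided by a normalization pool."""
    return np.dot(w, x**p) / (k + np.sum(x**q) ** r)

x = np.array([0.2, 0.9, 0.4])
w = np.ones_like(x)
# Softmax-like regime: output approaches max(x) as the exponents grow
print(canonical(x, w, p=3, q=2, r=1.0))
# Normalized-dot-product regime: tuning-like selectivity to the input pattern
print(canonical(x, w / len(x), p=1, q=2, r=0.5))
```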
MAX Operation
 Nonlinear saturating pooling function on a set of inputs, such that the output codes the amplitude of the largest input regardless of the strength and number of the other inputs
 A set of input neurons {X} causes the output Z to generate spikes at a rate proportional to the input with the fastest firing rate – the softmax-like regime of the canonical circuit sketched above approximates this behavior
R.J. Vogelstein et al., NIPS 2007
Test 1: Test Images and Resulting Simple Cells
 (A1-4) Generated test images
 (B1-4) Horizontally-oriented simple cells that respond to light-to-dark transitions
 (C1-4) Vertically-oriented simple cells that respond to dark-to-light transitions
F. Folowosele et al., ISCAS 2007
Test 1: MAX Network Computation Results

The ratio k obtained is approximately constant
among all the simple cells, with a mean of 0.068 and
a standard deviation of 0.0006
F. Folowosele et al., ISCAS 2007
Test 2: Test Images and Resulting Simple and Complex Cells
 Checkerboard test image
 The cells within each square of the overlaid checkerboard pattern represent the 5x5 array of simple cells that are pooled to form a complex cell
 2400 simple cells
 80 complex cells
 MAX ratio: 0.1085 ± 0.02
 After outliers are removed, MAX ratio: 0.1179 ± 0.01
F. Folowosele et al., BioCAS 2008
Future: Attention Modulated HMAX
Riesenhuber, 2004
Conclusion
 General Purpose I&F Array Transceivers
› Allow implementation of spike-based algorithms
› Digital implementations may end up being more effective than the mixed-signal version
 Object Recognition
› HMAX provides a biologically plausible hierarchical model of V1 – PFC
› Can be shown to outperform some benchmarks
 Implementation with IFAT
› Preliminary results on the early layers
› Future work must also include attention
Acknowledgments
 Telluride Neuromorphic Engineering Workshop
 UNCF-Merck Fellowship
 National Science Foundation
References
 R.R. Murphy and E. Rogers, “Cooperative assistance for remote robot supervision,” Presence: Teleoperators and Virtual Environments, vol. 5, no. 2, pp. 224-240, 1996.
 T. Serre, M. Kouh, C. Cadieu, U. Knoblich, G. Kreiman, and T. Poggio, “A theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex,” AI Memo, MIT, Cambridge, 2005.
 M. Riesenhuber and T. Poggio, “Computational models of object recognition in cortex: a review,” Technical Report, Artificial Intelligence Laboratory and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 2000b.
 R.J. Vogelstein, U. Mallik, E. Culurciello, G. Cauwenberghs, and R. Etienne-Cummings, “A multichip neuromorphic system for spike-based visual information processing,” Neural Computation, vol. 19, pp. 2281-2300, 2007a.
 D.H. Goldberg, G. Cauwenberghs, and A.G. Andreou, “Probabilistic synaptic weighting in a reconfigurable network of VLSI integrate-and-fire neurons,” Neural Networks, vol. 14, pp. 781-793, 2001.
 T.Y.W. Choi, P.A. Merolla, J.V. Arthur, K.A. Boahen, and B.E. Shi, “Neuromorphic implementation of orientation hypercolumns,” IEEE ISCAS, 2005.
 R.J. Vogelstein, U. Mallik, J.T. Vogelstein, and G. Cauwenberghs, “Dynamically reconfigurable silicon array of spiking neurons with conductance-based synapses,” IEEE Transactions on Neural Networks, 2007b.
 A. Cassidy, S. Denham, P. Kanold, and A.G. Andreou, “FPGA-based silicon spiking neural array,” IEEE BioCAS, 2007.
 B.E. Shi, E.K.C. Tsang, S.Y.M. Lam, and Y. Meng, “Expandable hardware for computing cortical maps,” IEEE ISCAS, 2006.
 D.H. Hubel and T.N. Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat's visual cortex,” Journal of Physiology, vol. 160, no. 1, 1962.
 L.G. Ungerleider and J.V. Haxby, “What and where in the human brain,” Curr. Opin. Neurobiol., pp. 157-165, 1994.
 E. Rolls and T. Milward, “A model of invariant object recognition in the visual system: Learning rules, activation functions, lateral inhibition, and information-based performance measures,” Neural Computation, vol. 12, pp. 2547-2572, 2000.
 P. Merolla and K. Boahen, “A recurrent model of orientation maps with simple and complex cells,” Advances in Neural Information Processing Systems (NIPS) 16, S. Thrun and L. Saul, Eds., MIT Press, pp. 995-1002, 2004.
 R.P.N. Rao, “Robust Kalman filters for prediction, recognition, and learning,” Technical Report 645, Computer Science Department, University of Rochester, 1996.
 J. Licklider, “A duplex theory of pitch perception,” Cellular and Molecular Life Sciences (CMLS), vol. 7, no. 4, pp. 128-134, 1951.
 J. Tapson, “Autocorrelation properties of single neurons,” Proceedings of the 1998 South African Symposium on Communication and Signal Processing, 1998.
 J. Tapson, C. Jin, A. van Schaik, and R. Etienne-Cummings, “A First-Order Nonhomogeneous Markov Model for the Response of Spiking Neurons Stimulated by Small Phase-Continuous Signals,” Neural Computation, vol. 21, no. 6, pp. 1554-1588, June 2009.
 T. Lacey, “Tutorial: The Kalman filter,” Lecture Notes, Department of Computer Science, Georgia Institute of Technology, 1998.
 R. Linsker, “Neural network learning of optimal Kalman prediction and control,” Neural Networks, vol. 21, no. 9, pp. 1328-1343, 2008.
 R.E. Kalman, “A new approach to linear filtering and prediction problems,” Transactions of the ASME – Journal of Basic Engineering (Series D), pp. 35-45, 1960.
 S. Mihalas and E. Niebur, “A generalized linear integrate-and-fire neural model produces diverse spiking behaviors,” Neural Computation, 2008, in press.
 C. Cadieu, M. Kouh, A. Pasupathy, C.E. Connor, M. Riesenhuber, and T. Poggio, “A model of V4 shape selectivity and invariance,” J. Neurophysiol., 2007.