A rank encoder: Adaptive analog to digital conversion
exploiting time domain spike signal processing ∗
Philipp Häfliger and Elin Jørgensen Aasebø
Department of Informatics
University of Oslo

∗ This work was supported by the ASICs for MEMS project of the Norsk Forskningsrådet.
Abstract. An electronic circuit is presented that encodes an array of analog input
signals into a digital number. The digital output is a ‘rank order code’ that reflects
the relative strength of the inputs, but is independent of the absolute input intensities. In that sense, the circuit performs an adaptive analog to digital conversion,
adapting to the average intensity of the inputs (i.e. effectively normalizing) and
adapting the quantization levels to the spread of the inputs. Thus, it can convey
essential information with a minimal amount of output bits over a huge range of
input signals.
As a first processing step the analog inputs are projected into the time domain,
i.e. into voltage spikes. The latency of those spikes encodes the strength of the input.
This conversion enables the circuit to conduct further analog processing steps by
asynchronous logic.
The circuit was implemented as a prototype on a VLSI chip, fabricated in the
AMS 0.6µm process. The implementation takes 31 analog inputs that are delivered
as frequency encoded spike trains by a 5-bit AER (address event representation) bus.
The output is 128 bits wide and is read serially in 16-bit packages. Based on tests
with one particular set of inputs with an entropy of 112.66 bits, it is estimated that
the chip is able to convey 40.81 bits of information about that input set.
Possible applications can be found in speech recognition and image processing.
Keywords: analog to digital conversion (ADC), normalization, rank order code,
time domain signal representation, temporal code
1. Introduction
1.1. Rank order encoding
The principle of rank order encoding (Thorpe et al., 1996) is inspired
by observations in neurophysiology. Neurons (brain cells) transmit signals in the form of short voltage pulses, so-called action potentials or
spikes. How the nervous system codes information with these spike
signals is an open research area. Pioneers in neurophysiology found
correlations between stimuli and the firing frequency of neurons, e.g.
between optical stimuli and neurons in the visual cortex (Hubel and
Wiesel, 1962). Thus, early neural models considered the information
transmitted between neurons to be encoded in the average spike rate.
More recent observations, however, indicate that these rate models are
not sufficient. A strong argument for this view comes from experiments
described in (Thorpe et al., 1996). In these experiments, monkeys and
human subjects were presented with pictures. They had to decide if
they saw an animal in the picture or not and they had to press one of
two buttons accordingly. The subjects needed about 200-500ms to solve
this task. The experimenters argued that along the neural processing
path from the eye to the motor reaction each neuron involved only had
time to fire once or twice within this time limit. Therefore the necessary
information could not have been encoded in firing frequency but was
somehow conveyed with only one or two spikes per processing stage.
There are several models of how this can be achieved (Thorpe et al.,
1996; Maass and Natschläger, 1997; Maass, 1999). Some of them consider the relative timing of individual spikes among a group of neurons
to be important. The principle of rank order encoding (Thorpe et al.,
1996) is one of them. In this model the relevant information within a
group of neurons is the order in which they fire their first spike after a
stimulus presentation. A rank order code is then a permutation of the
neurons of this group. This order of firing corresponds to the order of
the strengths of the stimulation that the individual neurons are exposed
to: Neurons that are stimulated more strongly tend to react by producing a spike earlier. If one thinks of these neurons as being stimulated
by pixels in a picture then the neurons associated with the brightest
pixels will fire first. A lot of the essential information of the picture
is preserved in the rank order code. Some information, however, is
discarded. For example, the absolute intensities of the inputs cannot be
reconstructed from the code: A rank order code will be independent of
the illumination level. This is an advantage for many image processing
tasks. Object recognition, for instance, should not depend on whether
the picture of an object has been taken in a dim room or in bright
daylight.
1.2. Address event representation
Address event representation (AER) is a communication protocol that
provides virtual point-to-point connections for constant length pulse
signals. It was originally thought of as a means to connect spiking
artificial neurons (Mahowald, 1994; Mortara and Vittoz, 1994). In our
system we use it to convey an array of analog values that are encoded
in pulse frequency.
In AER, a fast asynchronous digital n-bit bus (a 5-bit bus in the case
of the input to our system as depicted in figure 1) is used to provide
2^n virtual pulse signal connections between multiple computational elements on
one or on several chips.

Figure 1. The block diagram of the system including the input and output circuitry.

A sender of such pulse streams could
be a sensor array chip of some sort. Whenever an element of that array
transmits a pulse, that pulse appears as a digital number on the bus.
That number is the sending element’s ‘address’, i.e. its chip coordinate.
On the receiver side, this address is demultiplexed and a fixed length
pulse is reconstructed and delivered to the receiving element.
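To make the protocol concrete, here is a minimal behavioral sketch in Python of the receiver side. It is an illustration only, not the chip's implementation; all names and values are assumed: events arrive as (time, address) pairs on the shared bus and are demultiplexed into fixed-length pulses on virtual point-to-point channels.

```python
# Minimal sketch of an AER receiver: each bus event (time, address) is
# demultiplexed into one fixed-length pulse on the addressed channel.
# All names and values are illustrative.
from typing import NamedTuple

class Event(NamedTuple):
    t: float          # time at which the address appears on the bus
    address: int      # sending element's 'address', i.e. its coordinate

def decode(events, n_channels, pulse_width=1e-6):
    """Rebuild per-channel pulse trains as (start, end) intervals."""
    pulses = {ch: [] for ch in range(n_channels)}
    for ev in events:
        pulses[ev.address].append((ev.t, ev.t + pulse_width))
    return pulses

bus = [Event(0.0e-3, 3), Event(0.4e-3, 3), Event(0.5e-3, 17)]
print(decode(bus, n_channels=31)[3])   # two reconstructed pulses on channel 3
```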
2. Methods
The system described in the block diagram in figure 1 has been implemented as a prototype on a 3 x 3 mm VLSI chip fabricated in the
0.6µm AMS process. The prototype receives 31 analog inputs and issues
a 128-bit output code.
The core circuit of the system (‘rank encoder’ in figure 1) translates
an array of 31 analog current inputs first into spike/time-domain signals
and then into a rank order code. The reset signal serves as a trigger
that starts a new computation as it goes low. The array of time varying
analog inputs to the rank encoder could for example be pixel values
from an image sensor or an acoustic signal processed by a filter array.
This encoder could be the output stage of such a sensor chip. The rank
encoder is described in more detail in subsection 2.2.
For this test chip we simulated the sensor inputs. Due to pin limitations, we could not provide all 31 analog inputs directly from off-chip.
We added an input stage which receives AER signals via a 5-bit bus¹.
The AER sequences were computed on a PC to represent 31 Poisson
distributed pulse streams. Then they were downloaded to a digital pattern generator and sent onto the AER bus. An ‘AER receiver’ (figure 1)
reconstructs the 31 pulse streams, which encode the momentary input
intensity in pulse frequency. Those pulse frequencies are transformed
into analog input currents for the rank encoder by the ‘integrator array’
(figure 1). The whole input block is further described in subsection 2.1.
The 128-bit output code of the rank encoder is read out serially in
packages of 16 bits, starting from the most significant bits (MSBs).
Further explanations of the output block are in subsection 2.3.
In the following descriptions of the individual circuit blocks some
transistor width to length ratios are labeled in the circuit diagrams
next to the transistors. All other transistors are of minimal dimensions
(W/L=1.4/0.6). The biases used in the experiments are written in
brackets under the bias names in the schematics.
2.1. The input block
The input block consists of an AER receiver and an array of leaky
integrators.
The AER receiver block consists of a demultiplexer and some control
circuitry (not shown). The latter handles the handshake protocol of the
bus communication and ensures that the decoded pulses are of fixed
length, independent of the timing of the handshake.
The analog input currents are reconstructed by an array of leaky
integrator circuits (figure 2). A fixed-width voltage pulse coming from
the AER receiver (inpulse) switches on I_1 and thus decrements V_C
by an amount set by the bias timeconst. The charge that has been
withdrawn is replaced more slowly by the current I_2 through the diode-connected pfet transistor connected to the supply voltage V_gain−. Note
that since the same amount of charge is replaced by I_2 as has been
withdrawn by I_1, the averages of the two currents, Î_1 and Î_2, are the
same, and this average is the input pulse frequency ν multiplied by
the current I_M1 flowing through transistor M1 when it is in saturation,
i.e. during a pulse, multiplied by the pulse length T.
¹ One of the 32 input addresses is not used: the rank encoder was designed
to receive only 31 inputs because this results in a more convenient number of
output bits.
Figure 2. The 'integrator' circuit that computes an analog current from the
frequency of the incoming voltage pulses.
\hat{I}_1 = \hat{I}_2 = I_{M1} T \nu \qquad (1)
I_2 is a sort of low-pass filtered or averaged version of I_1. If I_2 flowed
through a resistor instead of a diode, this averaging would occur
with a fixed time constant. But in this circuit, the rate of decay of
I_2 decreases with the voltage across the diode (V_gain− − V_C). Thus,
smaller decrements of V_C per inpulse increase the time constant of
decay of I_2. I_2 is mirrored into the output current I_out. By adjusting
V_gain− and V_gain+, positive and negative gain can be added to this
stage. The following formula for the gain G is valid for subthreshold
currents, under the assumption that the slope factor is only marginally
different for the two different source voltages. U_T is the thermal voltage.
\frac{I_{out}}{I_2} \approx \frac{e^{V_{gain+}/U_T}}{e^{V_{gain-}/U_T}} =: G \qquad (2)
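The following Python sketch illustrates equations (1) and (2) behaviorally (it is not a transistor-level model; the time constant, pulse parameters and gain are assumed values): a Poisson pulse train is low-pass filtered and scaled by the mirror gain, and the mean output settles near G·I_M1·T·ν.

```python
# Behavioral sketch of the integrator of figure 2: each input pulse
# withdraws a fixed charge that is replenished slowly, so the smoothed
# current I2 settles at I_M1 * T * nu (eq. 1), scaled by the gain G (eq. 2).
# The first-order leak is a simplification of the diode's decay; all
# constants are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

def integrator_output(nu, I_M1=1e-9, T=1e-6, G=2.0, tau=50e-3,
                      duration=2.0, dt=1e-5):
    n = int(duration / dt)
    spikes = rng.random(n) < nu * dt          # Poisson input pulse train
    I2, out = 0.0, np.empty(n)
    for k in range(n):
        if spikes[k]:
            I2 += I_M1 * T / tau              # charge replenished per pulse
        I2 -= I2 * dt / tau                   # slow decay of I2
        out[k] = G * I2                       # mirrored, gained output
    return out

out = integrator_output(nu=3100.0)
print(np.mean(out), 2.0 * 1e-9 * 1e-6 * 3100.0)   # both ~ G * I_M1 * T * nu
```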
Figure 3. The schematics of a 7 input rank encoder (I&F neurons, timing WTA
columns, and encoders driving out_{1..16}).
2.2. The rank encoder
The main block in figure 1, the rank encoder, computes the permutation
of inputs according to the order of strength of the analog input currents.
The circuit for 7 analog current inputs is illustrated in figure 3. The
function and signal flow are illustrated in a 4-input example in figure 4.
The input currents (I_{in,1..7}) are first latency encoded by so-called
integrate and fire (I&F) neurons on the left of the block diagram in
figure 3, i.e. the neuron that receives the biggest input current will fire
first, followed by the one that receives the second strongest input, etc.
(example in figure 4). The stronger the input, the smaller the latency.
This is a general property of I&F neurons.

Figure 4. An example of a signal flow through a 4-input rank encoder.

They integrate an input
current until a threshold is reached. They then emit an output pulse.
Thus, the latency l_i of that pulse spike_i is proportional to the inverse
of the input current I_{in,i} with proportionality constant C, as given in
equation (3) and illustrated in figure 5.
Figure 5. The principle of latency encoding.
Figure 6. The integrate and fire neuron that projects a current input into the time
domain.
Figure 7. One timing winner-take-all (WTA) cell.
l_i = C \, \frac{1}{I_{in,i}} \qquad (3)
The implementation of our I&F neuron (figure 6) is based on C.
Mead's 'self-resetting neuron' (Mead, 1989). Our adaptation is no longer
self-resetting. The output goes high as the voltage on the capacitor that
integrates the input current reaches the switching threshold. Positive
feedback pulls the voltage on the capacitor up to Vdd too, stabilizing
that state. The circuit needs to be reset explicitly to restart the whole
process. This is done by the global reset signal.
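In behavioral terms the latency encoding of equation (3) is a one-liner; in the sketch below the capacitance matches the 150fF of figure 6, while the switching threshold is an assumed value, so the absolute latencies are illustrative only.

```python
# Latency encoding by an I&F neuron (eq. 3): a constant input current
# charges the capacitor to the switching threshold, so the latency is
# l = C / I_in with C = C_cap * V_threshold. V_threshold is assumed.
def spike_latency(I_in, C_cap=150e-15, V_threshold=2.5):
    return C_cap * V_threshold / I_in

for I in (1e-9, 2e-9, 4e-9):                  # stronger input, shorter latency
    print(f"{I:.0e} A -> {spike_latency(I) * 1e6:.1f} us")
```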
The actual rank order encoding is then performed by a cascade
of ‘timing winner-take-all’ circuits (columns in figure 3). The latency
encoded signals are passed on from the I&F neurons to the first timing
winner-take-all (WTA) column that determines a winner (the earliest
spike/the strongest input) and re-routes the spike signals to the next
column, removing that winner from the further competition (circles in
figure 4). Thus, each subsequent column consists of one less WTA cell.
The column results are transmitted by encoders (out_{1..16}). By default
the encoders issue a zero as long as none of their inputs are active.
Thus, all non-zero outputs from the encoders indicate that a spike
has been registered and that the timing WTA column has reached a
decision. The global reset signal resets all WTA cells.
A single timing WTA cell is shown in figure 7. Its task, to determine
if its input is the winner among the latency encoded analog signals, has
become a rather simple problem that can be solved by asynchronous
logic. The first of the identical WTA cells within a column that receives
an input spike (inspike_i) declares itself the winner (its flip-flop is set)
and disables its neighbours: the active flip-flop output pulls ENS low,
which is global within a column and disables the flip-flop inputs. The
active flip-flop also forces the signal choice_i low and this will force all
lower choice_j (j ≤ i) signals low. The signal choice_{i+1} decides if either
the direct input (inspike_i) to a WTA cell or the input to the cell above
(inspike_{i+1}) is routed to the neighbouring cell in the next column using
a one-bit bus multiplexer. All cells below the winning cell pass on their
direct input (choice_{i+1} is low). All other cells pass on the input of the
above neighbour (choice_{i+1} is high). Thus, the winning signal itself is
not passed on. Note that this implementation also conveniently resolves
conflicts: if two spike signals are close enough together in time to cause
two flip-flops to be set within one column, the topmost will be declared
the winner. Only this cell will have a high choice_{i+1} signal and a low
choice_i signal. Thus, only for this cell will the signal encode_i be high.
encode_i is the input to the encoder that encodes the winner of that
column. The spike signal (outspike_i) that is routed to the next column
is gated by the signal en_out to avoid glitches. en_out is simply the
inverted choice_i of the bottommost WTA cell in the column.
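Under ideal conditions (no jitter, no gate-delay mismatch) the cascade's behavior reduces to the following Python sketch: each column declares the earliest remaining spike the winner, encodes its position among the inputs still competing, and removes it, so later columns have fewer possible winners to name.

```python
# Behavioral sketch of the WTA cascade of figure 3 (idealized: exact
# latencies, no mismatch). Code 0 is reserved for 'no decision yet', as
# with the encoders described above.
def rank_order_code(latencies):
    """Per-column winner positions among the still-competing inputs."""
    remaining = list(range(len(latencies)))
    code = []
    while len(remaining) > 1:                 # the last survivor is implicit
        winner = min(remaining, key=lambda i: latencies[i])
        code.append(remaining.index(winner) + 1)   # 1-based position
        remaining.remove(winner)
    return code

latencies = [5.0, 1.0, 3.0, 2.0]              # e.g. l_i = C / I_in,i
print(rank_order_code(latencies))             # [2, 3, 2]
```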
The removal of the column winners reduces the number of output
bits: fewer bits are needed to encode the winner in smaller columns (see
figure 3). For instance, our prototype chip with 31 analog inputs would
require log_2 31! = 112.66 output bits for an optimal encoding of all
possible 31! rank permutations. The encoding proposed here requires
128 bits which means that merely 12% of them are redundant. By
comparison, a solution without reducing the column height would result
in a 155-bit output which would mean 27% redundancy.
For an arbitrary number s of analog input channels the number of
output bits for different encoding strategies is given by equation (4).
N = \lceil \log_2(s+1) \rceil
n_{obv} = N\,s
n_{red} = N(s+1) - (2^N - 1) - 1
n_{opt} = \lceil \log_2(s!) \rceil \qquad (4)
n_obv is the number of output bits needed for the 'obvious' solution
without reducing the size of the columns. n_red is the output size for
the solution used by us that reduces the size of the WTA columns. It
requires 19% fewer bits on average for 2 to 1023 analog inputs. n_opt is
the optimal encoding of permutations of s elements. n_red is on average
7% bigger than this optimal encoding for 2 to 1023 analog inputs.
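Equation (4) can be checked in a few lines of Python; for the prototype (s = 31) it reproduces the 155-, 128- and 113-bit figures discussed above.

```python
# Output sizes of the three encoding strategies in equation (4).
from math import ceil, log2, factorial

def output_bits(s):
    N = ceil(log2(s + 1))                  # bits per full-size column
    n_obv = N * s                          # 'obvious' solution
    n_red = N * (s + 1) - (2**N - 1) - 1   # reduced columns (this chip)
    n_opt = ceil(log2(factorial(s)))       # optimal permutation encoding
    return n_obv, n_red, n_opt

print(output_bits(31))                     # (155, 128, 113)
```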
Instead of using time domain encoding, a design that operates completely in the voltage or current domain could have been developed. Our
original motivation to employ time domain processing was to explore
the possibilities of a kind of encoding suggested by observations of
the nervous system. One advantage that we discovered in the design
process was the easy elimination of winners of previous columns by
simple digital bus multiplexers. In contrast, the multiplexing of two
analog signals is neither trivial nor noise free. More generally, by using
spikes in the time domain the design goes from analog circuitry to
asynchronous digital logic, which is in many respects easier to handle.
Time domain signals also offer an interesting method to increase
the signal to noise ratio. A standard way of achieving this is to increase
the range over which a signal is represented. If the noise does not
increase proportionally, the signal to noise ratio is improved.
In an analog circuit one can increase the supply voltage and represent
voltage signals for example between 0V and 5V instead of between
0V and 3.5V. The thermal noise level remains approximately constant
and thus, the signal quality improves. Or in a current mode analog
circuit signals can be represented by bigger currents to achieve that
improvement. For example the output of a current mode WTA array
as described in (Mead, 1989) will be less affected by thermal noise if the
input currents and the bias current are increased proportionally. The
price to pay is an increase in power consumption. In a time domain
circuit, representing signals not in a bigger voltage or current range
but on a bigger time interval has that same effect. Thus, if the input
pulses to the timing WTA are spread over a bigger time interval, the
output is less likely to be affected by jitter in the delay of these pulses.
The cost of this improvement is not in power but in the speed of the
computation. It is remarkable though that in a time domain circuit
one also reduces the effect of device mismatches that way. Since the
computational elements in a time domain circuit are asynchronous logic
gates, their mismatch is best expressed in difference of gate delay. These
mismatches, however, remain constant with the increase of the signal
ranges. For instance, a WTA cell that receives an input pulse at 60ns
delay might win unjustly over another that receives a 55ns delay input,
because it switches 10ns faster. That unfair advantage will not matter
anymore if those two inputs are represented with 4 times bigger delays:
240ns and 220ns. The same trick, however, does not work for the current-mode
WTA. If a 100nA input wins unfairly over a 110nA input due to
mismatch of the transistor properties in the central differential-pair-like
structure, it will also win unfairly if the input currents are increased
to 400nA and 440nA respectively. That is because the mismatches of
transistor drain currents are proportional to those currents.
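The argument can be restated as a toy comparison in Python, assuming a fixed 10ns switching-speed mismatch between two cells:

```python
# Toy model of the mismatch argument: cell 'a' switches 10ns faster than
# cell 'b' (a fixed, signal-independent offset), and the earliest effective
# spike wins the column.
def wta_winner(l_a, l_b, advantage_a=10e-9):
    return 'a' if l_a - advantage_a < l_b else 'b'

print(wta_winner(60e-9, 55e-9))      # 'a' wins unjustly over the earlier spike
print(wta_winner(240e-9, 220e-9))    # at 4x the time scale 'b' wins correctly
```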
2.3. The output block
The output block in figure 1 (circuit schematics not shown) is an asynchronous logic block that has the task of sending the resulting 128-bit
code out in 16-bit packages by a handshake protocol. An arbiter circuit
mediates between the external receiver and the rank encoder. The 16
bits encoding the higher ranks are sent first. With 31 analog inputs the
first 16 WTA columns will have 5-bit outputs. Thus, the first 16 bits
will consist of the 3 x 5 bits from the first 3 WTA columns plus the
most significant bit from the fourth column. As soon as these first 16
bits are ready, as indicated by the signal en out from the fourth WTA
column, the arbiter sends them out. As soon as the first 16-bit package
has been transmitted and the next 16 bits are ready, they are sent out,
etc. The arbiter makes sure that the packages are sent in the correct
order.
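As a sketch of how those 128 bits divide over the shrinking columns (consistent with equation (4)): a column with k inputs still competing needs ⌈log_2(k+1)⌉ bits, code 0 again meaning 'no decision yet'.

```python
# Per-column output widths for the 31-input prototype (k = 31..2 inputs
# still competing); the widths sum to the 128-bit output code.
from math import ceil, log2

widths = [ceil(log2(k + 1)) for k in range(31, 1, -1)]
print(sum(widths))    # 128
print(widths[:4])     # [5, 5, 5, 5]: the first 16-bit package holds three
                      # full 5-bit columns plus the fourth column's MSB
```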
3. Results
3.1. Test setup
In a final test run the circuit was tested with three different input
distributions that were permutations of the same set of input currents.
The 31 input currents were frequency encoded and conveyed as Poisson
distributed spike trains by the AER input bus. The 31 average frequencies ν_i (i ∈ [1, 31]) of those spike trains ranged from 100Hz to 3100Hz.
The three distributions were defined as follows:
input distribution 1: \frac{1}{\nu_i} = \frac{1}{100} - \frac{31-i}{30}\left(\frac{1}{100} - \frac{1}{3100}\right)

input distribution 2: \frac{1}{\nu_i} = \frac{1}{100} - \frac{i-1}{30}\left(\frac{1}{100} - \frac{1}{3100}\right)

input distribution 3: \frac{1}{\nu_i} = \begin{cases} \frac{1}{100} - \frac{32-2i}{30}\left(\frac{1}{100} - \frac{1}{3100}\right) & i \in [1, 16] \\ \frac{1}{100} - \frac{2i-33}{30}\left(\frac{1}{100} - \frac{1}{3100}\right) & i \in [17, 31] \end{cases} \qquad (5)
The rank orders of these input distributions are represented by the
solid lines in figures 8, 9, and 10.
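The three distributions can be generated as in the sketch below. The second branch of distribution 3 is written here with (2i − 33)/30, the reading under which all three distributions are, as stated above, permutations of the same 31 frequencies between 100Hz and 3100Hz.

```python
# The three test distributions of equation (5). Each uses the same set of
# 31 inter-spike intervals, 1/nu = (31 - k)/3100 for k = 0..30, in a
# different order.
def frequencies(distribution):
    step = (1 / 100 - 1 / 3100) / 30          # = 1/3100
    nus = []
    for i in range(1, 32):
        if distribution == 1:
            k = 31 - i
        elif distribution == 2:
            k = i - 1
        else:                                  # distribution 3 (V-shaped)
            k = 32 - 2 * i if i <= 16 else 2 * i - 33
        nus.append(1.0 / (1 / 100 - k * step))
    return nus

for d in (1, 2, 3):
    nu = frequencies(d)
    print(d, round(min(nu)), round(max(nu)))   # each spans 100..3100 Hz
```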
The difference from one input intensity to the next strongest in those
three distributions theoretically results in evenly-spaced firing times in
the integrate and fire neurons, since according to equations (1) and
(2) the input currents to the rank encoder I_{in,i} are proportional to the
input frequencies ν_i, and the latency l_i is proportional to the inverse
of that input current according to (3). Thus, the following equation is
true for the theoretically constant interval ∆l from the firing of any
one neuron to the next firing among the other neurons.
\Delta l = \frac{C}{I_{M1} T G} \cdot \frac{1}{30}\left(\frac{1}{100} - \frac{1}{3100}\right) \qquad (6)
That should make the distinction between two subsequent input
intensities by the rank encoder equally difficult for low and high intensities.
The maximal input intensity of 3100 spike events per second was
chosen as high as possible, without coming too close to overloading
the input AER bus: The digital pattern generator was set to produce
address events at maximally 200 kHz. Stimulating all 31 inputs with
3100 spikes per second would require an average event rate of 96.1 kHz,
i.e. about half of the maximal capacity.
Figure 8. The gray-scale histogram of the chip test with the first frequency input
pattern.

In the tests the chip was reset with a 20Hz clock. One of the factors
that determined this frequency is that it sets a lower limit for the input
intensities in the time domain: if the reset frequency had been
somewhat higher, then the weakest input intensity of 100 spikes per
second would not have caused a spike at all between two reset pulses.
Thus, the output code would always have been incomplete, without the
lower ranks assigned to an input channel. (Instead of using a clock for
reset, the circuit could have been set to reset itself after the completion
of the output rank order code. The input intensity would in that case
determine the number of outputs per time unit.)
3.2. Measurements
The output rank order is shown as gray-scale encoded histograms (figures 8, 9, and 10) over 455 computations. A perfect output would follow
the solid lines. Error bars indicate the standard errors in ranking
the individual inputs.
The measured output is a good approximation of the perfect result.
The standard error computed over all input channels and test runs
measures 4.4 ranks. Two different kinds of deviations from the correct
rank can be observed: random deviations and systematic deviations.
The random deviations can be blamed almost entirely on the way
the input was conveyed to the rank encoder.

Figure 9. The gray-scale histogram of the chip test run with the second input
frequency pattern.

Figure 10. The gray-scale histogram of another chip test run with the third input
frequency pattern.

Figure 11. Output currents from 3 elements of the integrator array according to
circuit simulation.

We could therefore expect
those deviations to be vastly reduced if the rank encoder were integrated on the same chip as the sensors, thereby eliminating the input
block of our prototype. The random deviations in this prototype system
occur mainly due to imperfect integration of the Poisson distributed
AER input pulses: some of the randomness of those Poisson signals
is not averaged out. There are two steps of integration: the integrator
array (figure 2) and the I&F neurons (figure 6). Let us look at the
integrator circuit first. Figure 11 shows the output current (according to
a circuit level simulation) of an integrator circuit that receives 3100Hz
(dashed line, i = 1 in distribution 1 in equation (5)), 194Hz (solid
line, i = 16 in distribution 1 in equation (5)), and 100Hz (dotted
line, i = 31 in distribution 1 in equation (5)) Poisson distributed spike
input. The time-constant of the integration is not sufficiently long to
make these currents smooth. Figure 11 shows that the situation is worse
for the lower input intensities.
As explained in the integrator circuit description in section 2.1, the
time-constant of the integration is adjustable within limits by the bias
voltage timeconst of the integrator circuit. We sought to increase
the time-constant by choosing that bias to be low, and the gain of the
integrator was increased by lowering V_gain− to make up for the small
currents. Unfortunately, we could not set the timeconst bias below a
certain limit, where thermal noise started to dominate the actual input.
To further increase the time constant, a bigger capacitance would have
been necessary. The problem is worse for the lower intensity inputs,
since the average intervals for those Poisson inputs are the longest and
the most prone to exceed the time constant of the integration.
A second stage of integration is located in the I&F neurons. By
having them integrate the currents from the integrator circuits longer,
the result will improve. Note that the time of integration in this stage
is adjustable within a wide range by adjusting the current gain G
in the integrator circuit. A lower gain on that current will lead to
longer integration times before the firing threshold of the I&F neuron
is reached. But this would be a trade-off, since it would increase the
time necessary to complete one measurement. We chose to reset our
prototype chip by a 20 Hz clock and set the gain as low as possible,
such that the last neuron was firing just in time to complete the rank
order code. From integrating the simulated currents from the integrator
array we would expect a standard error of the ranking of 4.8², which
is in good accordance with the total standard error of 4.4 we measured
from the chip. We conclude from this that these effects from the input
block are the main source of random deviations. Almost none of them
are caused in the rank encoder itself. In fact, the measurements from
the chip are even slightly better than expected, but that is because we
optimized the biases for the chip and the simulation device-parameters
will be somewhat different from the real ones and thus, the same biases
will cause slightly different behaviour.
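The trade-off described in footnote 2 can be sketched as follows (with an assumed noise model, not the simulated chip currents): integrating a noisy current up to a larger threshold charge shrinks the relative latency spread but lengthens each measurement.

```python
# Sketch of the second integration stage: time for the accumulated charge
# of a noisy input current to reach the threshold charge Q. A larger Q
# averages out more input noise (smaller relative spread) at the cost of a
# slower measurement. Noise model and constants are assumed.
import numpy as np

rng = np.random.default_rng(1)

def latency(Q, I_mean=2e-6, dt=1e-5, rel_noise=0.2, t_max=0.5):
    I = I_mean * (1 + rel_noise * rng.standard_normal(int(t_max / dt)))
    q = np.cumsum(I) * dt
    return (int(np.argmax(q >= Q)) + 1) * dt

for Q in (8e-8, 3.2e-7):                       # threshold charges from fn. 2
    ls = [latency(Q) for _ in range(100)]
    print(Q, np.mean(ls), np.std(ls) / np.mean(ls))   # spread shrinks with Q
```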
All of the mentioned causes for random deviations affect weaker
inputs more severely. Thus, the theoretical independence of rank order
encoding from absolute input intensity is somewhat compromised: in practice the code's precision becomes worse if the average signal intensity is
lower. This is one reason why the errors are bigger for the lower ranks
in the measurements in figures 8, 9, and 10. The standard error over
² We used sequences of 2 seconds of simulated integrator array current output
for all 31 pulse frequencies of the Poisson inputs (examples in figure 11). We
integrated them in Matlab in an assumed I&F neuron where the product of the
firing threshold and the capacitance, i.e. the charge that is necessary to fire the
neuron, was chosen to be 8 × 10⁻⁸ C. Thus the weakest input current would elicit
a spike in just under 50ms. That allowed us to reset the capacitance at 20Hz,
thus starting a new computation every 50ms. Doubling the threshold-charge to
1.6 × 10⁻⁷ C (which is equivalent to reducing the current gain by half on the chip)
and reducing the reset frequency to 10Hz leads to a standard error of 4.1. Furthermore, the following threshold-charge/reset frequency/standard error triplets were
computed: 3.2 × 10⁻⁷ C / 5Hz / 3.3; 6.4 × 10⁻⁷ C / 2.5Hz / 2.6; 1.28 × 10⁻⁶ C /
1.25Hz / 2.2; 2 × 10⁻⁶ C / 1Hz / 1.6.
those plots is 5.0 ranks measured over ranks 16 to 31 and only 3.7 over
ranks 1 to 16.
Systematic deviations occur due to mismatches in the electronic
elements on the chip. In the analog part of the circuit (the spike
integrator and the integrate and fire neuron) the relevant mismatches
are in parameters that affect the absolute magnitude of the analog voltages and currents, namely capacitance, threshold voltage, etc., whereas
in the time domain part of the circuit (the timing WTA columns), it
is mismatches in parameters that affect the timing of the signals, i.e.
gate delays. Due to those systematic deviations some input channels
show a clear tendency to be ranked too high (e.g. input numbers 15 and 6),
whereas others are consistently ranked too low (e.g. input number 8).
To estimate the total noise in the system as a whole we applied
information theory. The noise in a system can be quantified by estimating the information about the input contained in the output. Thus,
we estimated the ‘mutual information’ of the system input and its
output. A necessary requirement for this computation is the knowledge
about the input statistics. In our test setup we had full control over the
inputs and chose a particular set of 31 input intensities. Considering all
permutations of this set there is a total of 31! possible input patterns.
(For the measurements we only used three of those permutations.) To
compute the ‘entropy’ of these patterns (the total information gained
by knowing which pattern actually occurs), we assumed that any one
of them may occur with uniform probability. If the input can be clearly
identified by looking at the output, then it contains complete information, i.e. an equivalent of log_2 31! = 112.66 bits. Due to the uncertainty
of the deviations in the encoder this number is reduced. Based on our
measurements we estimated that the mutual information between the
system's input and the system's output was 40.81 bits for this particular
experimental setup. The complete derivation of this result is described
in appendix A.
4. Conclusion
An electronic circuit is presented that, as a first processing step, transforms an array of analog input signals into time domain spike signals.
Thus, these analog signals can be processed further by asynchronous
logic rather than analog devices. This type of temporal signal encoding
bears many possibilities, some of which are discussed in this paper and
others that need yet to be explored.
The particular circuit presented in this paper computes a digital
encoding of an array of analog inputs. The encoding is a so-called
rank order code (Thorpe et al., 1996). This particular analog to digital
conversion (ADC) is independent of the absolute input level in theory
(but it is expected and observed that in practice the reliability of the
output code degrades slightly for lower input levels). It can be thought
of as an ADC that adapts to the average input level and the spread of
the inputs. Thus, it can encode essential information with a minimal
number of bits over a huge range and variety of inputs.
Our prototype chip, for example, takes 31 analog inputs and encodes
them in 128 bits. If one were to use classical non-adaptive ADCs limited
to a total output size of 128 bits, this would allow for 4-bit ADCs per
input channel (plus 4 unused extra bits). The observation range for
these ADCs would be fixed, such that they could for instance be set to
encode input voltages from 0V to 5V. That interval would be divided
into chunks of 0.3125V and encoded as numbers between 0 and 15. If
for some reason the input signals only lie between 2.5V and 3.75V, all
ADCs would issue one of only four numbers (8 to 11) and one would
gain little useful information about the inputs beyond the fact that all
of them are somewhere between 2.5V and 3.75V. On the other hand,
rank encoding would adapt its resolution to the limited spread. Thus,
in a rank encoder’s output there is all the information concerning the
relative order of strength of the inputs regardless of whether they lie
between 0V and 5V or 2.5V and 3.75V. Even if half of them are between
0V and 1V and the other half between 4V and 5V, rank encoding still
conveys differences within both clusters. Admittedly, the absolute input
strengths cannot be reconstructed accurately. But this is not a large
price to pay in many applications that actually employ a normalization
step after the ADC in order to get rid of the ‘spatial DC component’,
i.e. the average intensity level. More severe perhaps is the loss of the
information about the magnitude of the relative differences. Thus, in
the example with two distinct clusters of input strengths, one between
0V and 1V, and one between 4V and 5V, the information about the
presence of these distinct clusters is also lost. But also this limitation
is not crucial for many applications (see subsection 4.1 for possible
examples).
The circuit has been tested in a VLSI chip implementation, fabricated in the 0.6µm AMS process. As mentioned previously, the prototype has 31 analog inputs (received as frequency encoded pulse streams
on an AER bus) and 128 output bits. Due to the clever output encoding,
only 12% of these bits are redundant for encoding all of the possible 31! outputs.
By contrast, a more obvious solution that had been considered earlier
would have resulted in 27% redundancy.
In testing the chip, 3 input patterns out of a particular set of 31!
have been used. Each input pattern in this set should evoke one particular output code. Assuming that all inputs appear with the same
probability, the information theoretical entropy of the input (complete
information about the input) is 112.66 bits. In a noiseless model, the
output code would contain all of that information, whereby the input
could be completely reconstructed. On the physical device, however, it
has been estimated from the tests that a total of 40.81 bits of mutual
information are contained in the output. Simulation data suggests that
most of the information loss is caused by the analog input stage that
converts the pulse streams into analog currents. With direct current
input to the rank encoder (if enough pins could be spared or if the
sensors would be located on the same chip as the rank encoder), this
information loss would be considerably smaller. Even so, the amount
of information actually conveyed is still huge. The number suggests
that the system could reliably transmit an alphabet of 2^40.81 ≈ 10^12.29
letters encoded as a subset of those 31! permutations of Poisson spike
trains. Depending on the application, this information may well be sufficient for further data reduction steps to work on. Promising candidate
applications could be speech or image processing.
4.1. Possible applications
While constructing this system, we envisioned its use in the early
processing stages of a phoneme recognition system. With an analog
bandpass filter array (such as a silicon cochlea (Lazzaro and Mead,
1989; Sarpeshkar et al., 1998)) providing its input, it could replace
some of the computationally costly early processing steps in a more
conventional speech recognition system: digital to analog conversion,
Fourier transformation, intensity normalization and an initial vector
quantization. Digital signal processing would then take over an already
comfortably reduced data set. The only question is whether this reduced data
will still contain all the information necessary to recognize phonemes.
In spoken English there are about 44 different phonemes. We have
estimated that the rank encoder in its present implementation can
already reliably convey an alphabet that is about eleven orders of
magnitude larger. This is, however, dependent on how this alphabet
is encoded in the input to the rank encoder. Thus, in the suggested
input stage of a phoneme recognition system, the noisy and less advantageous encoding of the spoken phonemes has to be considered, as
well as the noise in the filter array. But because of the huge capacity
when using our tailored encoding, we think it very likely that phoneme
power spectra are different enough in terms of their rank order code
that those phonemes can be extracted from the rank encoder output.
However, this would still need to be verified experimentally. We hope
to attain a functioning silicon cochlea with AER output soon to put
this assumption to the test.
Also, other kinds of processing of arrays of sensory information could
make use of a rank encoder, especially if they apply intensity normalization as one of the first steps, which is rendered superfluous by rank order
encoding. Examples can be found in visual processing. The authors who
introduced rank order encoding originally (Thorpe et al., 1996) used
it to encode pictures as input to a face detection system. Other object
recognition tasks are also good candidates. Hardware implementations
of rank order encoders, such as the one described in this paper, open
up possibilities to use this encoding in real time applications.
Figure 12. Gray-scale histogram showing the deviations of the winners of each WTA
column to their real rank among the inputs still remaining in the competition. It
visualizes the values h_{i,j} in equation (14), where the y-axis refers to j+1 and the
x-axis to i.
Appendix
A. Derivation of the output’s information content
To estimate how much the maximal information content of the output
of 112.66 bits is reduced by noise, the information theoretical 'mutual
information' between the system's input and the system's output is
estimated here. We describe the input as the random variable X that
assumes with equal probability 31! different states, i.e. the permutations
of the 31 inputs. The output is the random variable Y which is the
combination of the 31 timing WTA column outputs Y = (Y_{31}, .., Y_1).
In information theory, the entropy H(X) of a discrete random variable
X is a measure of the total information content of that variable, i.e. the
uncertainty about the value that this variable may adopt. For discrete
random variables where P(X = x_i) = p_i it is defined by (7).
H(X) = \sum_i -p_i \log_2 p_i \qquad (7)
Mutual information of two random variables can be expressed with
conditional entropy as in (8).
I(X, Y) = H(Y) - H(Y|X) \qquad (8)
Or it can be expressed with the entropy of the combination of the
two variables as in (9).
I(X, Y) = H(X) + H(Y) - H(X, Y) \qquad (9)
If we solve (9) for H(X, Y) and substitute X with Y_{31}|X and Y
with (Y_{30}, .., Y_1)|X we get an expression for H(Y_{31}, (Y_{30}, .., Y_1)|X) =
H(Y|X). If we substitute H(Y|X) in (8) with this expression we get
(10).

I(X, Y) = H(Y) - H(Y_{31}|X) - H(Y_{30}, .., Y_1|X) + I(Y_{31}, (Y_{30}, .., Y_1)|X) \qquad (10)
Applying (8) on I(Y_{31}, (Y_{30}, .., Y_1)|X) we get (11).

I(X, Y) = H(Y) - H(Y_{31}|X) - H(Y_{30}, .., Y_1|X, Y_{31}) \qquad (11)
Repeating this same procedure on the rightmost summand 29 more
times we finally get (12).

I(X, Y) = H(Y) - H(Y_{31}|X) - .. - H(Y_1|X, Y_{31}, .., Y_2) \qquad (12)
The conditional entropies in (12) can be approximated (actually
over-estimated) by the conditional entropies that only depend on the
input.

H(Y_i|X, Y_{31}, .., Y_{i+1}) = H(Y_i|X) - I(Y_i, (Y_{31}, .., Y_{i+1})|X) \le H(Y_i|X) \qquad (13)
And these can be estimated from the experimental results, since we
can estimate the probability distribution of the associated conditional
random variables. These conditional random variables are exactly the
outputs of the timing WTA columns of the circuit. Each of those column
outputs can be viewed as the result of a random experiment, where the
probability distribution depends on the input and on the results of
the preceding columns. We compute the difference δ_i of each column's
(index i) output (i.e. the winning input) to the real rank of this input
among the inputs that have not yet been removed by a previous column. We model the probability distributions p_{i,j} = P(δ_i = j) of these
differences from the histogram h_{i,j} (see figure 12) of the differences
resulting from the experiments.
p_{i,j} \approx \frac{h_{i,j}}{\sum_j h_{i,j}} \qquad (14)
Note that since we are computing a conditional entropy dependent
on input X, we would actually need to compute the entropies for each
column given every possible input pattern and then average over the
results from all input patterns. We do, however, assume (somewhat
unjustly) that the input channel properties are uniform. Thus, the
probability distribution of the WTA columns' outputs for any particular
input pattern would only be a permutation of the distribution for another
permutation of the input and therefore have the same entropy. By
expressing a column’s output by the winner’s real rank among the
competitors, the probability distribution does not even have to be
permuted, but would be exactly the same for every permutation of the
input. That is why we can add the histograms for the three different
input permutations used in the experiments to derive the probability
distribution pi,j .
The assumption that all input channels have uniform properties lets
us also determine H(Y). For any permutation of a particular input rank
pattern, the rank order code output probability distribution will also
just be the same permutation applied to the probability distribution of the
output of this particular input. Thus, if those probability distributions
are averaged for all inputs and all inputs occur with equal probability,
then all possible outputs will also occur with uniform probability. Thus,
we approximate by (15).
H(Y) \lessapprox \log_2 31! = 112.66 \qquad (15)
Finally, we can compute the mutual information I(X, Y) between
the system's input and its output by combining the experimental histogram h_{i,j}, (14), (7), (15), and (13) into (12) to get the final result
(16).

I(X, Y) \approx 40.81 \qquad (16)
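The computation condenses to a few lines of Python. The histogram below is a hypothetical stand-in for the measured h_{i,j} of figure 12; the chip measurements yielded the 40.81 bits of (16).

```python
# Sketch of the appendix estimate: I(X,Y) ~ log2(31!) - sum_i H(Y_i|X),
# where each H(Y_i|X) comes from the normalized deviation histogram h[i]
# of WTA column i (equations (7) and (12)-(15)). h_toy is hypothetical.
import numpy as np
from math import lgamma, log

def entropy_bits(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(h):
    H_Y = lgamma(32) / log(2)                 # log2(31!) = 112.66, eq. (15)
    H_cond = sum(entropy_bits(np.asarray(row, float) / sum(row)) for row in h)
    return H_Y - H_cond

h_toy = [[1, 2, 1]] * 31     # every winner within +/-1 of its true rank
print(mutual_information(h_toy))              # ~66.2 bits for this toy case
```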
Let us just briefly look at the effects of the assumptions that were
made on the way to this result.
1. Assuming uniform input channel properties will by (15) tend to
make the estimated mutual information too big. The same assumption will, however, also lead to an overestimation of the WTA
column outputs’ entropy by equations (14) and (7), and therefore
to an underestimation of I(X, Y ).
2. Substituting H(Y_i|X, Y_{31}, .., Y_{i+1}) with H(Y_i|X) according to (13)
will contribute to underestimating I(X, Y).
3. Approximating the WTA columns’ probability distribution by (14)
will underestimate the real distribution’s entropy. That is because
the real distribution is more even and would be better approximated by including many more experiments. This will contribute
to overestimating I(X, Y ).
References
Hubel, D. and T. Wiesel: 1962, 'Receptive Fields, Binocular Interaction and Functional Architecture in the Cat's Visual Cortex'. Journal of Physiology 160,
106–154.
Lazzaro, J. and C. A. Mead: 1989, ‘Circuit Models of Sensory Transduction in the
Cochlea’. In: C. A. Mead and M. Ismail (eds.): Analog VLSI Implementations
of Neural Networks. Kluwer Academic Press, pp. 85–101.
Maass, W.: 1999, ‘Computing with spiking neurons’. In: Pulsed Neural Networks.
The MIT Press, pp. 55–85.
Maass, W. and T. Natschläger: 1997, 'Networks of spiking neurons can emulate
arbitrary Hopfield nets in temporal coding'. Network: Comp. in Neural Sys. 8,
355–372.
Mahowald, M.: 1994, An Analog VLSI System for Stereoscopic Vision. Kluwer.
Mead, C.: 1989, Analog VLSI and Neural Systems. Addison Wesley.
Mortara, A. and E. A. Vittoz: 1994, ‘A communication architecture tailored for
analog VLSI artificial neural networks: intrinsic performance and limitations’.
IEEE Trans. on Neural Networks 5, 459–466.
Sarpeshkar, R., R. F. Lyon, and C. Mead: 1998, 'A Low-Power Wide-Dynamic-Range Analog VLSI Cochlea'. In: T. S. Lande (ed.): Neuromorphic Systems
Engineering. Boston: Kluwer Academic Publishers, Chapt. 3, pp. 49–103.
Thorpe, S., D. Fize, and C. Marlot: 1996, 'Speed of processing in the human visual
system’. Nature 381, 520–522.
Authors’ Vitae
Philipp Häfliger
Philipp Häfliger was born in Zürich, Switzerland
in 1970. He studied Computer Science at the Federal Institute of Technology (ETH) in Zürich, with Astronomy as a second subject, from 1990
to 1995. A semester project in High Energy Astrophysics resulted in his
first scientific publication as a co-author. He obtained his PhD in 2000
from the Institute for Neuroinformatics at ETH. At present, he works
as a researcher in the Microelectronics Group at the Department of
Informatics at the University of Oslo in Norway. He teaches a lecture in
Neuromorphic Electronics and is the local coordinator for the European
5th Framework IST project CAVIAR, with partners in Spain and in
Switzerland. He has lately enrolled as a member in several IEEE CAS
society technical committees. When time permits, he loves to explore
the magnificent nature in Norway and elsewhere.
Elin Jørgensen Aasebø
Elin Jørgensen Aasebø was born in Oslo, Norway,
in 1978. She enrolled in the University of Oslo straight out of school.
After courses in both psychology and philosophy, she ended up with
informatics as the main area of study. In October 2002 she obtained a
Master of Science degree. The Master project was in the field of neuromorphic engineering and organized under the microelectronics group
at the Department of Informatics. She now works with FPGA design at
Kongsberg Defence Communications in Billingstad, Norway.