On-Chip Evolution Using a Soft Processor Core Applied to Image...

advertisement
On-Chip Evolution Using a Soft Processor Core Applied to Image Recognition
Kyrre Glette and Jim Torresen
University of Oslo
Department of Informatics
P.O. Box 1080 Blindern, 0316 Oslo, Norway
{kyrrehg,jimtoer}@ifi.uio.no
Abstract
To increase the flexibility of single-chip evolvable hardware systems, we explore possibilities of systems with the
evolutionary algorithm implemented in software on an onchip processor. This gives higher flexibility compared to implementing an evolutionary algorithm directly in hardware,
since the parameters and behaviour of the algorithm can
easily be changed, and complex operators are more feasible
to implement. In this paper a Xilinx MicroBlaze soft core
processor is used, and the system is implemented in a Xilinx
FPGA. A suitable hardware architecture for image recognition has been proposed, and it is applied to a face recognition task. Data buses and higher level functions have been
utilized in order to reduce the search space for the evolutionary algorithm. Experiments have been performed on
the physical device, with software running in parallel with
fitness computation in digital logic. Results show that the
MicroBlaze system evolves at half the speed of a Pentium
M system running at 17 times the FPGA clock frequency.
The distinction of a certain face from others is performed
at 94.9% accuracy. In addition, the possibilities for evolutionary adaptation over time are explored by introducing
changes in the training set. The system shows ability to
adapt to these changes.
1
Introduction
Evolvable hardware (EHW) seems useful for systems
submitted to unpredictable, time-varying environments [20,
14]. Such systems will also often be part of embedded applications, and therefore compact, on-chip solutions will be
preferred.
There have been undertaken some implementations of
single-chip evolution earlier. Kajitani et al have introduced
several LSI (Large-Scale Integrated Circuits) devices with
evolution performed in hardware [5, 6]. The benefit of such
Moritoshi Yasunaga and Yoshiki Yamaguchi
University of Tsukuba
Graduate School of
Systems and Information Engineering
1-1-1 Ten-ou-dai, Tsukuba, Ibaraki, Japan
{yasunaga,yoshiki}@cs.tsukuba.ac.jp
an approach is the evolution speed but the problem is lack
of flexibility. This would be important since there are often
many degrees of freedom when evolving hardware systems.
Implementing complete evolution in an FPGA has been
proposed by Tufte and Haddow in [16]. The evolving design is here implemented in the same device as the evolutionary algorithm. A similar approach is proposed by
Perkins et al in [9]. Significant speedup is achieved for nonlinear filtering compared to conventional processing. Several custom accelerators in FPGA for solving a protein folding problem have been introduced by Shackleford et al [12].
On-chip evolution using a prototype of the VLSI (Very
Large-Scale Integration) POEtic chip has also been reported
[10]. A robot controller and logic functions were evolved.
The architecture, specialized for the implementation of bioinspired mechanisms, contains an on-chip custom processor, and a bio-inspired array of building blocks.
Running complete evolution of image filters within an
FGA has been reported by Martinek and Sekanina [7]. In
this work the evolution (mutation only) is implemented in
reconfigurable logic. Image filters were evolved in a few
seconds from corrupted and original pictures. This design
employs data buses, earlier proposed in [11] and [13], and
function-level building blocks, first proposed in [8].
The authors have earlier demonstrated System-On-Chip
evolution using an embedded hard processor core in an
FPGA [1]. This is accomplished by integrating an evolutionary algorithm running as software on a hard processor
IP core with the target EHW implemented in reconfigurable
logic. In this paper an alternative approach is explored by
using a soft processor core. This allows for portability to a
greater range of FPGAs, including cheaper devices.
In our system, all parts of the evolution (except the evaluation of individuals which is implemented in digital logic)
are undertaken in software, providing a flexible system for
later modifications. This is slower than implementing the
evolution in dedicated hardware, but it is expected that the
fitness evaluation time will still be the most time consum-
ing part. This balanced software-hardware approach will allow for a low implementation effort while still being able to
have a single-chip design, suitable for embedded real-world
applications.
The EHW system is applied to a face image recognition
task. Experiments on image recognition by EHW were first
reported by Iwata et al in [4]. An FPLA device was utlized
for recognition of black and white images. An EHW face
image classifier system has been developed by Yasunaga et
al in [21], in which the classifier function is directly coded
in large AND gates. Evolution is performed offline and the
final system is synthesized. This approach gives rapid classification in a compact circuit, but lacks the run-time reconfigurability. Another classifying system has been proposed
in [15], also employing AND gates, in combination with
OR gates. These systems have in common the selection of
one category from a set of candidate categories.
The image recognition task proposed in this paper will
be restricted to recognizing one face out of a set of candidates. That is, the system reports an input as either a positive match to the trained face category, or a negative match.
Such an application could be imagined for an image-based
lock for mobile devices, or cameras trained to recognize one
particular person.
The next section introduces the architecture and the implementation of the on-chip EHW system, including both
hardware and software aspects. Results from the implementation are given and discussed in Section 3. Finally,
Section 4 concludes the paper.
FPGA
BRAM
64KB
LMB
TARGET
EHW
MicroBlaze
processor
PERIPHERAL
OPB
UART
PERIPHERAL
SDRAM
CONTROLLER
Figure 1. Example hardware architecture including our target EHW.
This section details the hardware and software architecture of the evolvable hardware system. The evolvable hardware system is entirely implemented on one FPGA chip.
The core modules are the processor, the RAM and the target
EHW module. The evolutionary algorithm, stored in RAM,
runs on the embedded processor, and the target EHW module is used for fitness evaluation. This module is built as a
reconfigurable hardware module which accepts a bit string
as configuration.
The bus system is a part of IBM’s CoreConnect architecture [19]. The On-chip Peripheral Bus (OPB), with a
data width of 32 bits, is used to connect the processor and
the peripherals. The Processor is connected to dual-port
SRAM, called Block RAM (BRAM), using a dedicated Local Memory Bus (LMB). This bus features separate 32-bit
wide channels for program instructions and program data,
using the dual-port feature of the BRAM. The LMB makes
single-cycle access of BRAM possible.
The target evolvable hardware is at the moment connected to the OPB. This will be detailed in section 2.2. Various on-chip peripherals are also connected to the OPB, including a UART for RS-232 serial communications. An
SDRAM controller can also be connected to the OPB
should more memory be needed.
The on-chip system is built using the Xilinx Embedded
Development Kit (EDK) [17]. EDK is a collection of Intellectual Property (IP) cores and tools for building embedded
systems on FPGAs. The hardware and software parts of the
system can be specified parametrically through various configuration files, and net lists and libraries are automatically
generated.
2.1
2.2
2
The On-Chip Evolution System
System Overview
The architecture consists of a set of modules interconnected with buses, as seen in Fig. 1. The MicroBlaze soft
processor core provided by Xilinx is central in the system.
It employs a 32-bit, 3-pipeline stage RISC architecture and
is optimized for implementation in the Xilinx FPGAs. It
is user configurable, allowing for use of various interfaces
and functionality according to system needs and constraints.
Amongst others, hardware division and multiplication, and
a floating point unit (FPU), are supported [18].
Target Evolvable Hardware
The target EHW is implemented as an OPB slave peripheral module – see Fig. 2. Interfacing with the OPB bus
has been simplified by the use of a Xilinx IP Interface core
(IPIF). This provides a simpler interface standard, the Xilinx IPIC, for the user module.
Control and configuration of this module are undertaken
through register write operations. Genome values are written to registers which are again connected to the configuration inputs of the functional unit array. Registers are also
OPB
FUNCTIONAL UNIT ARRAY
TARGET EHW MODULE
OPB
IPIF
IPIC
interface
FUNCTIONAL
UNIT ARRAY
Input
Config
Output
fij
fij
fij
fij
fij
fij
.
.
.
.
.
.
ffijij
ffijij
...
.
.
.
ffijij
Input
Output
CONTROL LOGIC
Config
Figure 2. The architecture of the target EHW
system.
provided for feeding the EHW with inputs and for storing
the outputs.
2.2.1
Functional unit array
The functional unit array (FUA), see Fig. 3, is a general
structure used for EHW. It is based on the principle that the
configuration of the FPGA itself is not changed, but a virtual
circuit which is implemented on top of it can be reconfigured. Hence the names ”Virtual FPGA” [3] or ”Virtual Reconfigurable Circuit” [11] have been proposed for circuits
building on the same principles. The behavior of the FUA
is achieved by writing configuration data to registers which
in turn control the functionality of each unit, fi,j in the array. A fixed set of functions and connections to other units’
outputs are available in each functional unit. The configuration lines control multiplexers which select which inputs
and functions to use. The system bears resemblance to the
VRC in [11].
Our FUA consists of a fixed-size array of functional
units. The array consists of C columns of R units from
input to output. Each unit has I inputs, each of which can
be connected to any output in the previous column. The
unit’s output is a result of any of F functions. The function
of each unit and its inputs are configurable. They are determined by evolution, in the way that each individual’s binary
genome is sent to the FUA and mapped to the configuration
lines. Fitness is then calculated by feeding a number of input vectors on the inputs of the first column, and reading
the results from the outputs of the last column. The array
is constructed in a pipelined fashion, that is, registers are
connected to the outputs of each layer. Currently, this is not
exploited for fitness evaluation. Only one input vector is
evaluated at a time.
Figure 3. The architecture of the functional
unit array subsystem.
#
0
1
2
3
4
5
6
7
Description
Saturated Add
High Threshold
Range
Greater
Bitwise AND
Bitwise OR
Average
Half
Function
O = A + B, F F if (A + B) > F F
O = F F if A > C2 , else 0
O = F F if C1 < A < C2 , else 0
O = F F if A > B, else 0
O = A AND B
O = A OR B
O = (A + B) >> 1
O = A >> 1
Table 1. Functions used by units in the image
recognition task. Inputs are A and B, ouput is
O. C1 and C2 are constants available to each
unit.
2.2.2
Image recognition application
The FUA has been applied to an image recognition application. For this, 8-bit wide databuses are used as inputs and
outputs of the units, which is also the data width of one pixel
from the input image. The array has R = 8 rows and C = 5
columns. There are I = 2 inputs and there are F = 8 functions available for each unit. The functions are summarized
in table 1. The specific functions are chosen because they
are believed to be useful for an image recognition task. The
threshold and range functions give a possibility to discriminate pixels based on their intensity value. Combined with
the saturated adder, threshold element-like functionality, as
seen in artificial neural networks, can be achieved. In addition, two global constants, C1 and C2 , are available to each
unit. These are coded in the genome for each individual as
8-bit values.
The input images have a resolution of 8x8 pixels, 64 in
total, while each column in the FUA consists of 8 units,
1
2
3
4
5
6
7
inputs of FUA
0
input image
selector
column
Figure 4. The genome selects one pixel from
each row of the 64-pixel picture, giving 8 pixels for the first column of the FUA.
capable of selecting from 8 inputs. To save genome size and
circuit space, as well as simplifying the design, a ”selector
column” is introduced. Basically this imposes a restriction
of only letting the unit in one row select a pixel input from
the corresponding row of 8 pixels in the image. Thus, only
3 bits are needed code for a pixel from each row, which
gives a total of 24 bits of the genome for the entire selector
column. The selected pixels are then passed on to the inputs
of the regular FUA. See Fig. 4.
This functionality could have been implemented in hardware by having the first column of the FUA be populated
with special selector units, or hard-coded ”Add A, 0” units
(using already defined functionality in the standard units),
each of them connected to a corresponding row of image
pixels. However, in the current implementation, this selection is done in software on the MicroBlaze, as this lets us
transfer only 8 instead of 64 pixels over the data bus for
each vector. Depending on the source of the data vectors
this can be moved to hardware in future versions.
The last column of the array gives 8 8-bit outputs. However, only the 8-bit output of the topmost functional unit is
used for the classification of the image.
The encoding of each functional unit in the genome
string is as follows:
Function (3 bit)
Input 1 (3 bit)
Input 2 (3 bit)
This gives a total of U = log2 F + I × log2 R = 9 bits for
each unit. The entire genome is encoded as follows:
C1 ,C2 (16b)
Sel. col.(24b)
f0,0 (9b)
...
f4,7 (9b)
The total amount of bits in the genome is then 2 × 8 +
R × log2 R + C × R × U = 400.
Figure 5. The prototype board with the VirtexII Pro FPGA. Source: Xilinx
2.3
The Hardware Platform
The design is synthesized for a Xilinx Virtex-II Pro
(XC2VP30) – see Fig. 5. This device contains 30,816 logic
cells, 2,448 Kbit BRAM, and two PowerPC 405 (PPC) embedded processors. The FPGA is situated on a Xilinx XUP
Virtex-II Pro development board, which also contains a configuration PROM and various useful interfaces.
2.4
The Genetic Algorithm Implementation on
the MicroBlaze
A Genetic Algorithm (GA) was implemented to run on
the MicroBlaze processor. 64KB of BRAM was used as a
combined instruction and data memory. The program was
written in C and compiled and linked using the MicroBlaze version of the GNU GCC compiler tools. However, the
limited amount of RAM makes it necessary to omit the use
of most standard C library functions. The code was developed with verification and simulation on a PC workstation
in mind. It is therefore possible to compile the program
both for the MicroBlaze using GCC and for a PC workstation using Microsoft Visual C, with code differences only
for low-level functionality.
As the MicroBlaze can be configured with a FPU, some
of the code uses floating point. However, if FPGA resource
usage is critical, the FPU should be removed. This implies software emulation of floating point, which should be
avoided in order to reduce code size and execution speed.
Conversion to fixed point arithmetic could then be considered. Dynamic memory management is not supported for
the MicroBlaze. Allocation of memory for data structures
such as population and individuals is handled manually.
The GA implemented for this experiment follows the
Simple GA style, given by Goldberg [2]. Fitness scaling
has been implemented, including linear scaling. A fitnessproportionate selection scheme is implemented through the
use of a roulette wheel mechanism. The individuals are
sorted with the qsort algorithm. For mutation, instead of
having one probability of mutation for every bit in the
genome, a quicker solution has been adopted. The number of mutations, n, for the whole genome is calculated by
a random lookup in a 10-position array. Then, n random
places are mutated (bit-flipped) in the genome. This is more
efficient than performing a check for every bit if a mutation
should occur or not.
2.5
Fitness Function and GA Parameters
The fitness of the face recognition system is based on the
system’s ability to recognize one face from a range of different faces. The images are taken from the ”Olivetti Research
Laboratory Database of Faces”1 . The original resolutions
of the images were 92x112 and there were 400 faces divided over 40 people, giving 10 face images per person. In
our experiment the images were downsampled to 8x8 pixels
and 10 categories were used, giving a total of 100 64-pixel
vectors. These are stored in BRAM.
90 of the vectors were used for training of the system,
while the remaining 10 were used for verification after the
evolution run. Each configuration of the hardware is fed
with the 90 training vectors (vec), and fitness is based on
the system’s ability to give a positive output for the 9 image
vectors belonging to the right person, and a negative output
for the rest. Output values from the system are compared
to target values d, which are positive (d = 1) for vectors
belonging to the right person, and negative for all other vectors (d = 0). An output value of F F (hexadecimal) from
the topmost functional unit in the last column is considered
a positive output (o = 1), while anything else is considered
a negative output (o = 0).
The fitness function can be expressed in the following
way:

 0 if o 6= d
X
1 if o = d = 0
F =
x
where x =
(1)

vec
4 if o = d = 1
For each input image vector the computed output o is compared to the target value d. If these equal and the value is
1 http://www.cl.cam.ac.uk/Research/DTG/attarchive/facedatabase.html
Resource
Slices
Slice Flip Flops
4 input LUTs
Used
2990
1037
5488
Available
13696
27392
27392
Percent
21
3
20
Table 2. Post-synthesis device utilization
for the EHW module implemented on the
XC2VP30.
negative (0) then 1 is added to the fitness function. On the
other hand, if they equal and the value is equal to one, 4
is added. In this way, an emphasize is given to the outputs
being positive. This has shown to be important for getting
faster evolution of well performing circuits. The function
sums these values for the training image vectors (vec).
For the evolution, a population size of 30 is used. Elitism
is used, thus, the best individuals from each generation
are carried over to the next generation. The (single point)
crossover rate is 0.9, thus the cloning rate is 0.1. A roulette
wheel selection scheme is applied, and linear scaling is
used. The mutation rate is expressed as a probability for
a certain number, n, of mutations on each genome. The
probabilities are as follows:
n
p(n)
3
0
0
1
2
3
7
10
2
10
1
10
Results
This section presents and discusses the results of our implementation and experiments.
3.1
Device utilization and clock speed
The total resource usage for the system, including the
target EHW, the MicroBlaze, bus structure and peripherals,
is 5113 slices, equalling 38% of the XC2VP30. The system was implemented to run at 100MHz. It is possible that
higher frequencies are attainable, up to a limit of around
150MHz. The MicroBlaze has a maximum frequency of
150MHz on the Virtex-II Pro, and the target EHW has
a post-synthesis maximum estimate of 148MHz. Table 2
shows the amount of logic used for the target EHW module. A maximum of 21% of the FPGA’s total resources are
used by this module. The post-synthesis resource usage of
the MicroBlaze processor is 1731 slices, of which 33% is
used for the FPU.
3.2
Evolution speed
The speed of an evolution run of 1000 generations was
measured for the on-chip EHW system and a Pentium M PC
Clock speed(MHz)
Evolution speed(ms)
EHW
100
10914
PC
1700
4993
PC
EHW
17
0.46
Table 3. Evolution speeds on the on-chip
EHW and PC systems. The last column indicates the ratio between the first and second
columns.
12000
Fitness calculation
10000
Individual evaluation
GA w/o fitness
Time(ms)
8000
6000
4000
2000
0
Pentium M 1.7GHz
MicroBlaze EHW 100MHz
Figure 6. Evolution speeds.
for speed comparison. As can be seen in Table 3, the evolution runs faster on the PC. However, although the evolution runs 2.2 times faster, the processor runs at 17 times the
clock frequency of the MicroBlaze system. This is mainly
due to the fact that the evaluation of individuals is carried
out in hardware on the on-chip EHW system whereas it is
simulated in the PC system. In order to better analyze the
speed of different parts of the evolution process, three different measures were made. The first measure indicates the
time used by the GA without any form of fitness evaluation, ie. all the fitness values were set directly to 0 in the
program. The second measure indicates the time spent on
evaluation of the individuals. That is, the time used for calculating outputs from inputs to the FUA. The third measure
is the time used for fitness calculation. This consists of the
calculation of a fitness value based on the training vectors
and outputs of the FUA, as well as pixel selection and transfer to the FUA. The results are summarized in Fig. 6. It is
clear that the sum of the operations related to fitness are the
most time-consuming parts for both of the systems. The
bottleneck in the PC program is clearly the evaluation of
individuals, which is simulated in software. The execution
time for this part is greatly reduced in the on-chip system. If
the target EHW module increases in size or complexity, this
difference will become even more significant. The hardware system’s bottleneck is the overhead associated to the
fitness determination of each individual. This is because
pixel selection and transfer over the system bus, as well as
the readback of the FUA output and comparison with the
target value, is done for each training vector. Thus, the low
program execution speed and bus transfer rate contribute to
making the fitness overhead much slower than the hardware
fitness evaluation.
3.3
Image Recognition Performance
Individuals with maximal fitness values were evolved after an average of 1133 generations over 20 evolution runs.
A solution with maximal fitness value was found in every
run. The average evolution time is then 12.4 seconds. For
verification, the last vector in each category was used. This
means that there were 10 vectors to test the accuracy, where
the evolved system would need to produce a ’1’ for one of
these, and ’0’ for the others. Due to the lack of more verification vectors, the position of the vector within the category
was changed, and the remaining vectors were used for training vectors for a new evolution run. This was repeated 10
times, until all of the vectors in a category had served as a
verification vector. The accuracy over all the outputs from
all the evolution runs is 94.9% correct outputs. Also, 7 of
these 10 evolution runs produced individuals which gave
correct outputs for all vectors.
3.4
Real-time adaptation
To explore possibilities for using the system with applications where the environment is changing over time, an
adaptation experiment was carried out. In a real-world application one would imagine the training set changes, by
introducing new face images of either the target face category or the other category. It could also be imagined that
some training vectors would have to be removed, based on
timestamps for example.
We simulated such a change in the training set by removing one vector and adding another vector from the target face category every 500 generations. However, as there
were only 9 images available for training in each category,
8 images were used for the training set and 1 image cycled
as the not used image. To introduce some more change, all
the pixels in the added vector were multiplied by four.
The experiment was run for 5000 generations. The results can be seen in Fig. 7. By observing the fitness value
of the best individual of each population, one can get an
impression of the system’s adaptability. Some of the training set changes are clearly observable by a drop in fitness,
while other changes do not affect the fitness value. The recover time from the fitness value drops seems to vary from
around 50 to 500 generations, which is equivalent to 0.5 to
5.5 seconds of run time on the MicroBlaze system.
105
100
95
90
85
Best fitness
Avg. fitness
80
0
500
1000
1500
2000
2500
3000
3500
4000
Figure 7. Population fitness over generations, showing the fitness of the best indiviudal and the
population average fitness. Temporary drops in fitness can be observed when the training set is
modified. Notice that the y-axis only shows fitness values from 80 to 105. 104 is the maximum
possible fitness value.
3.5
Discussion
The FPGA resource usage of the system is not a problem with the FPGA used in our experiments. But for lowcost applications, it would be desirable to use a smaller, less
expensive FPGA device. As [11] points out, the disadvantage of a virtual circuit approach is the high implementation
cost. Much of this cost comes from interconnect between
the units and multiplexers for selecting the inputs to each
unit. Other connection schemes should be explored. At the
expense of a slight speed loss, a possibility could be to use
a kind of addressing or broadcasting scheme for sending
the data to the next layer of units. The MicroBlaze processor can be configured to take up less resources by removing some functionality. A removal of the FPU would give
the most significant decrease in resource usage. This does
not necessarily have to imply a performance penalty if the
program code is written to avoid the use of floating point
instructions.
Using 12.4 seconds on average to evolve a system with
maximum fitness, the speed of the on-chip system is currently acceptable. It should also be noted that systems with
relatively high fitness values are available earlier in the evolution runs, which are usable until a better solution has been
found. However, as future tasks may be more complex to
evolve, a speed increase would be desirable. As pointed
out in section 3.2, there is a large overhead associated with
fitness calculation in software. If feeding of the training
vectors to the FUA and fitness value calculation would be
moved to hardware, a significant speed increase would be
possible. The FUA would achieve a much higher throughput, and the pipelined nature of the design could be exploited. In this case the total execution time for the on-chip
evolution would tend towards the execution time used by
only the GA. In such a case the on-chip system could perform at twice the speed, or better, than the PC system. However, an increased degree of hardware specialization comes
at the prize of a higher implementation effort and increased
resource usage. In general, performance needs for the application should be considered up against implementation
costs. A speed performance comparison with the system
described in [1] should be the subject of future work.
A face recognition performance of 94.9% is acceptable,
but the accuracy measure might not reflect real-world performance. Since only one category of faces is to be recognized, more verification vectors for this category are desirable. This would give a clearer picture of the accuracy of the
system. In future work, it could also be explored how this
system would perform when expanded to a multi-category
classification system, like the ones seen in [21] and [15].
In that case, the current FUA would be duplicated to the
amount of categories desired, and the amount of high output values from the last column in each FUA could be used
as input to a max detector for determination of a category.
Real-time adaptation to a changing environment is a
goal for our on-chip evolutionary systems. The initial experiments of introducing changes to the training set seem
promising, as the system seems to regain a high fitness in
short time. But in order to explore adaptation further, a
larger training set is desirable for the face recognition application. In a real-world application, one target EHW module
configured with the best individual could be used as an operational circuit, while a second target EHW module is used
for fitness evaluation for the evolving population. One example of an adapting application could be a camera with an
integrated chip, assigned to detect a certain person. Over
time the system could receive new images of the person, or
images of other persons which should not be detected, to be
added to the training set.
4
Conclusions
We have presented a system-on-chip EHW system using
a soft IP core processor for running the evolutionary algorithm. This has shown reasonable performance combined
with flexibility for experimentation. The configurability of
the MicroBlaze core makes such a combination interesting
for applications in low-cost systems. The EHW architecture proposed utilizes data buses and higher level functions
in order to reduce the search space. The inputs of the FUA
and the functions available to the units are adapted to the
image recognition task. These measures make it possible
to use evolution for a relatively complex task. The experimental results indicate that the system is suitable for further
experiments and development of real-time on-chip adaptation.
Acknowledgments
The research is funded by the Research Council of
Norway through the project Biological-Inspired Design of
Systems for Complex Real-World Applications (project no
160308/V30).
References
[1] K. Glette and J. Torresen. A flexible on-chip evolution
system implemented on a Xilinx Virtex-II Pro device. In
Evolvable Systems: From Biology to Hardware. Sixth International Conference, ICES 2005, volume 3637 of Lecture
Notes in Computer Science, pages 66–75. Springer-Verlag,
2005.
[2] D. Goldberg. Genetic Algorithms in search, optimization,
and machine learning. Addison–Wesley, 1989.
[3] P. Haddow and G. Tufte. Bridging the genotype-phenotype
mapping for digital fpga. In Proc. of the Second NASA/DoD
Workshop on Evolvable Hardware, 2001.
[4] M. Iwata, I. Kajitani, H. Yamada, H. Iba, and T. Higuchi.
A pattern recognition system using evolvable hardware. In
Proc. of Parallel Problem Solving from Nature IV (PPSN
IV), volume 1141 of Lecture Notes in Computer Science,
pages 761–770. Springer-Verlag, September 1996.
[5] I. Kajitani et al. A myoelectric controlled prosthetic hand
with an evolvable hardware lsi chip. Technology and Disability, 15(2):129–143, 2003.
[6] I. Kajitani, T. Hoshino, N. Kajihara, M. Iwata, and
T. Higuchi. An evolvable hardware chip and its application
as a multi-function prosthetic hand controller. In Proc. of
16th National Conference on Artificial Intelligence (AAAI99), pages 182–187, 1999.
[7] T. Martinek and L. Sekanina. An evolvable image filter: Experimental evaluation of a complete hardware implementation in fpga. Lecture Notes in Computer Science,
2005(3637):76–85, 2005.
[8] M. Murakawa, S. Yoshizawa, I. Kajitani, T. Furuya,
M. Iwata, and T. Higuchi. Hardware evolution at function
level. In Proc. of Parallel Problem Solving from Nature IV
(PPSN IV), volume 1141 of Lecture Notes in Computer Science, pages 62–71. Springer-Verlag, September 1996.
[9] S. Perkins, P. Porter, and N. Harvey. Self-contained
spatially-structured genetic algorithm for signal processing.
In J. Miller et al., editors, Evolvable Systems: From Biology
to Hardware. Third International Conference, ICES 2000,
volume 1801 of Lecture Notes in Computer Science, pages
165–174. Springer-Verlag, 2000.
[10] D. Roggen, Y. Thoma, and E. Sanchez. An evolving and developing cellular electronic circuit. In J. Pollack, M. Bedau,
P. Husbands, T. Ikegami, and R. A. Watson, editors, ALife9:
Proceedings of the Ninth International Conference on Artificial Life, pages 33–38, Boston, MA, 2004. MIT Press.
[11] L. Sekanina. Virtual reconfigurable circuits for real-world
applications of evolvable hardware. Lecture Notes in Computer Science, 2003(2606):186–197, 2003.
[12] B. Shackleford et al. A high-performance, pipelined, FPGAbased genetic algorithm machine. Journal of Genetic Programming and Evolvable Machines, 2(1):33–60, 2001.
[13] J. Torresen. Exploring knowledge schemes for efficient evolution of hardware. In Proc. of the 2004 NASA/DoD Conference on Evolvable Hardware.
[14] J. Torresen. Possibilities and limitations of applying evolvable hardware to real-world application. In R. Hartenstein et al., editors, Field-Programmable Logic and Applications: 10th International Conference on Field Programmable Logic and Applications (FPL-2000), volume 1896
of Lecture Notes in Computer Science, pages 230–239.
Springer-Verlag, 2000.
[15] J. Torresen. Two-step incremental evolution of a digital logic
gate based prosthetic hand controller. In Evolvable Systems:
From Biology to Hardware. Fourth International Conference, (ICES’01), volume 2210 of Lecture Notes in Computer
Science, pages 1–13. Springer-Verlag, 2001.
[16] G. Tufte and P. C. Haddow. Prototyping a ga pipeline for
complete hardware evolution. In 1st NASA / DoD Workshop
on Evolvable Hardware (EH ’99), pages 18–25, 1999.
[17] Xilinx Inc. Embedded System Tools Reference Manual, February 2005.
[18] Xilinx Inc. MicroBlaze Processor Reference Guide, May
2005.
[19] Xilinx Inc. Processor IP Reference Guide, February 2005.
[20] X. Yao and T. Higuchi. Promises and challenges of evolvable hardware. In T. Higuchi et al., editors, Evolvable Systems: From Biology to Hardware. First International Conference, ICES 96, volume 1259 of Lecture Notes in Computer Science, pages 55–78. Springer-Verlag, 1997.
[21] M. Yasunaga, T. Nakamura, I. Yoshihara, and J. Kim.
Genetic algorithm-based design methodology for pattern
recognition hardware. In J. Miller et al., editors, Evolvable
Systems: From Biology to Hardware. Third International
Conference, ICES 2000, volume 1801 of Lecture Notes in
Computer Science, pages 264–273. Springer-Verlag, 2000.
Download