Barresi_evolutionary_circuit_repair_hw6

STEVENS INSTITUTE OF TECHNOLOGY
Homework 6
Design 6
4/14/2014
Section 1: Division of Work
Table 1 below shows each group member's contribution to this project. Each member
contributed an equal share of the effort on this homework. First, the group members
collaborated to make the black box, the transparent box, and the function-means tree.
Next, the group members each wrote up part of the project functionality, the component
functionality, the component interface, and the performance metric.
Table 1: Group contribution distribution
Group Member        Percentage of Effort Towards Assignment
Kevin Barresi       25%
Nishant Panchal     25%
Giancarlo Rico      25%
Bryan Bonnet        25%
Section 2.1: Project Functionality
Digital circuit degradation is an important problem: as devices become more complex and
expensive, the cost of circuit failure rises with them. A single stuck-on fault can render an
entire complex system inoperable. The purpose of our design is to create a framework for fault
identification and repair in consumer-oriented markets using a hardware evolutionary approach.
The design would lead to highly robust and reliable digital circuitry. The target platform for
the proof of concept (PoC) demonstration is a Field Programmable Gate Array (FPGA) device.
The overall functionality of the design can be broken down into five properties:
self-healing, redundancy, genetic evolution, autonomous test vectors, and minimal user
interaction. The system will have a self-healing mechanism for its circuitry. It will not need
an external connection to a computer to achieve reconfiguration, so all operations occur
independently within the chip. When the framework attempts to fix a faulty circuit, the chip
incurs no downtime: it continues to operate while the fault is being repaired, and the
operational switch is seamless. The genetically evolving nature of the design ensures that the
chip learns from prior repairs, which reduces repeated work and minimizes the effort needed for
later repairs. Repair operations should not significantly affect power and temperature
requirements. Test vectors are generated autonomously before operation, so no user-license
software needs to be accessed or obtained. Fault detection occurs regularly and does not affect
power requirements. Testing operations shall not affect normal operation of the chip. Testing
is invisible to the user and should not require any manual user interaction.
Section 2.2: Component Functionality
Operation Module: This component holds the functionality of the desired circuit. This
functionality maps the Operational Module inputs to the desired output based on the
configuration. The Operational Module is capable of being reconfigured on-demand by the
Configuration Module.
Configuration Module: This module manages several Operational Modules simultaneously. The
Configuration Module schedules updates to the Operational Modules for reconfiguration and
ensures redundancy and constant operation of the overall system.
Fault Detection: This module detects faults occurring in Operational Modules. This function is
performed by injecting test vectors from the Fault Detection module into the target Operational
Module. The module then compares the actual output with the expected output of the test vector
to determine if a fault is present within the Operational Module.
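The detection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-bit full adder, the net names (`x1`, `a1`, `a2`), and the helper names `make_test_vectors`, `expected_outputs`, and `detect_fault` are hypothetical and do not come from the design itself. The module under test is modeled as a function whose internal nets can be forced to a stuck value.

```python
from itertools import product


def full_adder(a, b, cin, stuck=None):
    """1-bit full adder (illustrative stand-in for an Operational Module).

    `stuck` is an optional (net_name, value) pair that forces one net to 0 or 1,
    modeling a stuck-off / stuck-on fault.
    """
    def net(name, value):
        # Apply the injected stuck-at fault if it targets this net.
        if stuck is not None and stuck[0] == name:
            return stuck[1]
        return value

    x1 = net("x1", a ^ b)
    a1 = net("a1", a & b)
    a2 = net("a2", x1 & cin)
    s = net("sum", x1 ^ cin)
    cout = net("cout", a1 | a2)
    return s, cout


def make_test_vectors(n_inputs):
    """Exhaustive test vectors; practical only for small input counts."""
    return list(product((0, 1), repeat=n_inputs))


def expected_outputs(vectors):
    """Golden (sum, carry) pairs computed arithmetically, not from the gates."""
    return [((a + b + c) & 1, (a + b + c) >> 1) for a, b, c in vectors]


def detect_fault(module, vectors, expected):
    """Return the test vectors whose actual output differs from the expected one."""
    return [v for v, e in zip(vectors, expected) if module(*v) != e]


vectors = make_test_vectors(3)
golden = expected_outputs(vectors)

healthy = detect_fault(full_adder, vectors, golden)
faulty = detect_fault(
    lambda a, b, c: full_adder(a, b, c, stuck=("a1", 1)), vectors, golden
)
print(healthy)  # [] : no mismatches on the fault-free module
print(faulty)   # vectors on which the stuck-on net corrupts the carry output
```

An empty result means no fault was observed on the applied vectors; any non-empty result would be the trigger for the Fault Localization module.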
Fault Localization: Once a fault is detected by the Fault Detection module, it activates the Fault
Localization module. This module is capable of injecting further test vectors and comparing
the expected and actual outputs. The result localizes where a “stuck on/off” fault is located
on the Operational Module.
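One common way to localize a fault from output comparisons is a fault dictionary: simulate every candidate single stuck-at fault and keep the ones whose response matches what was observed. The sketch below assumes that approach, which the report does not specify; the full-adder model, the net names, and the functions `signature` and `localize` are hypothetical.

```python
from itertools import product

NETS = ("x1", "a1", "a2", "sum", "cout")  # hypothetical internal net names
VECTORS = list(product((0, 1), repeat=3))


def full_adder(a, b, cin, stuck=None):
    """1-bit full adder; `stuck` = (net_name, value) forces one net to 0 or 1."""
    def net(name, value):
        return stuck[1] if stuck is not None and stuck[0] == name else value

    x1 = net("x1", a ^ b)
    a1 = net("a1", a & b)
    a2 = net("a2", x1 & cin)
    return net("sum", x1 ^ cin), net("cout", a1 | a2)


def signature(stuck=None):
    """Response of the module to every test vector under a given fault."""
    return tuple(full_adder(*v, stuck=stuck) for v in VECTORS)


def localize(observed):
    """Return every single stuck-at fault whose simulated response matches."""
    candidates = [(n, v) for n in NETS for v in (0, 1)]
    return [f for f in candidates if signature(f) == observed]


unique = localize(signature(("x1", 0)))
ambiguous = localize(signature(("a2", 1)))
print(unique)     # a single suspect
print(ambiguous)  # several suspects that look identical at the outputs
```

In this model a stuck-off fault on `x1` is pinned down exactly, while a stuck-on fault on `a2` cannot be told apart from stuck-on faults on `a1` or `cout` using the outputs alone. A repair that avoids every suspect in the returned set covers either case.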
Configuration Generation: This module holds the logic for creating new configurations
for Operational Modules which contain faults. It evolves new configurations by combining
previous configurations with the new faults reported by the Fault Localization module. This
module gives the system its self-healing properties: it runs genetic algorithms that create
new configurations which work around known faults in the Operational Module.
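As a rough sketch of how such a genetic search could work, the Python below uses a deliberately simplified model that is an assumption of this example, not the report's design: a configuration is a tuple assigning each logic gate to a distinct physical cell, and fitness counts the gates placed on fault-free cells. Routing and timing are ignored, and all names (`fitness`, `mutate`, `crossover`, `evolve`) are hypothetical.

```python
import random


def fitness(config, faulty_cells):
    """Number of gates mapped onto fault-free physical cells."""
    return sum(cell not in faulty_cells for cell in config)


def mutate(config, n_cells, rng):
    """Move one randomly chosen gate to a currently unused cell."""
    unused = [c for c in range(n_cells) if c not in config]
    child = list(config)
    child[rng.randrange(len(child))] = rng.choice(unused)
    return tuple(child)


def crossover(p1, p2, rng):
    """Keep a prefix of p1, then fill with p2's cells that are not yet used."""
    cut = rng.randrange(1, len(p1))
    child = list(p1[:cut])
    child += [g for g in p2 if g not in child][: len(p1) - cut]
    return tuple(child)


def evolve(previous_configs, faulty_cells, n_cells,
           generations=200, pop_size=20, seed=0):
    """Evolve a configuration that avoids the known faulty cells."""
    rng = random.Random(seed)
    population = list(previous_configs)  # seed with earlier configurations
    n_gates = len(population[0])
    best = max(population, key=lambda c: fitness(c, faulty_cells))
    for _ in range(generations):
        if fitness(best, faulty_cells) == n_gates:
            break  # every gate sits on a healthy cell
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.choice(population), rng.choice(population)
            children.append(mutate(crossover(p1, p2, rng), n_cells, rng))
        # Elitist selection: parents compete with children, best survive.
        ranked = sorted(population + children,
                        key=lambda c: fitness(c, faulty_cells), reverse=True)
        population = ranked[:pop_size]
        best = population[0]
    return best


previous = [(0, 1, 2, 3, 4, 5), (5, 4, 3, 2, 1, 0)]  # earlier configurations
faulty = {2, 5}                                       # from fault localization
best = evolve(previous, faulty, n_cells=12)
print(best, fitness(best, faulty))
```

Seeding the population with earlier configurations is what lets later repairs build on prior ones. Keeping parents in the selection pool means the best configuration found so far is never lost.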
Reconfiguration: When a new configuration is generated and ready for programming, the
Reconfiguration Module will perform the necessary reconfiguration on the Operational Module
based on the map sent from the Configuration Generation module. When this operation is
complete, it signals to the rest of the system that the Operational Module is now
ready for full operation.
Section 2.3: Component Interface
The device as a whole relies on a highly modular structure of interconnected submodules.
As seen in Figure 1, a single self-repairing module consists of a minimum of three pieces:
two Operation Modules and a single Configuration Module. Operation Modules are
independent circuits that can each complete the desired operation of the device alone. Thus, the
Operation Modules are each connected to the device inputs/outputs. For a low-level
implementation, such as a simple adder circuit, each input of the adder would be routed to both
Operation Modules, and the outputs of both connected to the global output of the adder.
Each of the Operation Modules is in turn connected to the single Configuration Module.
There are four distinct connection lines: a test line that connects to the input/output of the
module, a reconfigure line that allows the Operation Module’s configuration to be updated, a
fault detection line that aims to localize and characterize any detected faults, and finally an
enable/disable line that can completely disable or enable the Operation Module. Together, these
control lines give the Configuration Module the information it needs to correctly
diagnose faults and the ability to repair them, if possible.
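The four control lines can be pictured as a small software interface. The Python sketch below is a behavioral model only, with assumed names (`OperationModule`, `ConfigurationModule`, `service`, `output`) and an assumed software reference function standing in for stored expected outputs. It shows one Operation Module being taken offline and repaired while the redundant one keeps serving the device output.

```python
from itertools import product


class OperationModule:
    """Behavioral stand-in for one reconfigurable Operation Module."""

    def __init__(self, logic):
        self.logic = logic
        self.enabled = True

    def test(self, vector):          # test line
        return self.logic(*vector)

    def reconfigure(self, logic):    # reconfigure line
        self.logic = logic

    def set_enabled(self, enabled):  # enable/disable line
        self.enabled = enabled


class ConfigurationModule:
    """Drives the control lines of the Operation Modules it manages."""

    def __init__(self, modules, vectors, reference):
        self.modules = modules
        self.vectors = vectors
        self.reference = reference   # assumed golden model of the function

    def failing_vectors(self, module):  # fault detection line
        return [v for v in self.vectors
                if module.test(v) != self.reference(*v)]

    def service(self, candidate_logic):
        """Test each module; disable, reconfigure, and re-test faulty ones."""
        for module in self.modules:
            if self.failing_vectors(module):
                module.set_enabled(False)
                module.reconfigure(candidate_logic)
                # Re-enable only if the new configuration passes every vector.
                module.set_enabled(not self.failing_vectors(module))

    def output(self, *inputs):
        """Device output comes from the first enabled module."""
        for module in self.modules:
            if module.enabled:
                return module.logic(*inputs)
        raise RuntimeError("no enabled Operation Module")


def adder(a, b):
    return a + b


def stuck_lsb(a, b):
    return (a + b) | 1  # models an output bit stuck on


vectors = list(product(range(4), repeat=2))
primary, backup = OperationModule(stuck_lsb), OperationModule(adder)
config = ConfigurationModule([primary, backup], vectors, adder)

config.service(stuck_lsb)  # a candidate configuration that does not fix the fault
after_failed_repair = (primary.enabled, config.output(1, 1))

config.service(adder)      # a candidate configuration that works
after_good_repair = (primary.enabled, config.output(1, 1))
print(after_failed_repair, after_good_repair)
```

After the failed repair the primary module stays disabled and the backup supplies the correct sum, so the device output stays correct. After the successful repair the primary module is re-enabled.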
In a multi-level architecture, many instances of the aforementioned modules can be
connected together to provide higher levels of redundancy. Ideally, this would provide fault
resistance in devices whose complexity would otherwise require extremely long runtimes for the
genetic algorithm used to derive re-route paths. It would also result in lower device prices, as
no redundancy in large, complex circuit pieces would be required.
Figure 1: Architecture overview
Section 2.4: Performance Metric
The solution we propose is an algorithm that sits inside the hardware of an electronic
device and reconfigures its circuitry to maintain the best achievable performance when damage
occurs. Because our solution is software based, its success is determined by several key
metrics: optimal design, design readability, computational efficiency, bounds of code, and
safety margin.
Optimal Design: Our solution should first of all work better than previously proposed solutions,
in order to motivate companies to adopt our algorithm. The mathematics which the algorithm
utilizes should be proven to be the best known solution. This would allow us to patent our idea
and create demand amongst circuit designers to utilize our solution.
Design Readability: One of the biggest issues faced in software-based design is the readability of
code. Code written in an organized manner makes updates and bug fixes much easier. Readable
code would also allow the group to update the algorithm when a better mathematical method of
solving the presented problem is found.
Computation Efficiency: The accuracy and precision with which the circuit is optimized is
extremely important to the performance of our design. The fault detected in the circuit must be
accurately solved with precision each time a problem arises without hurting the output
performance of the device itself.
Bounds of Code: The Big-O, Big-Theta, and Big-Omega bounds of the algorithm must also be
analyzed in order to make sure that the time it takes to optimize the circuit doesn’t increase too
quickly as more faults are detected. The algorithm must be implemented in such a manner that
the time associated with circuit optimization is kept low.
Safety Margin: Our solution is only as strong as the encryption protecting it. The algorithm must
be stored in part of the circuitry that is most fault resistant and is secure. The algorithm should
also make sure that in no case should an attempt to fix the faulty circuitry consume too much
power or harm the actual device.
Black Box
[Figure: black-box diagram. Labels: Power; Vectors of making the circuitry function as created by designer; Repairs broken circuit without interrupting user process and evolves the (label truncated in source); Fixed the faulty circuitry by remapping the original designated process vectors; Heat]
Transparent Box
[Figure: transparent-box diagram. Labels: Power; Convert power to Voltage; Heat; Vectors representing the circuitry design constructed by designer; Fault Detection; Fault Localization; Configuration Generation; Reconfiguration; Configuration Module; Operation Module; Vector output that fixes the circuitry and optimizes performance]
Function-Means Tree
[Figure: function-means tree. Node labels, in extraction order: Fix faulty circuitry; Chipwise Synthesis; Fault Masking; Multiple redundant circuit blocks; Extra circuit blocks; Overhead Maintenance; Selection mechanism of only one block; Manage power usage; Chip type; Develop chip for specific device; Maintain system latency; Triple-Modular Redundancy; Nanotechnology; Add code to implement efficiently; Use nanotech to make chip proper size; Parallelism; Add three modules; Voting Mechanism; Parallelism works based on selection mechanism; Choose one of the three modules; Our Solution; Operation Module; Brain of the circuit device; Control Module; Decided by circuit designer; Detects faults; Fixes faults and improves design]
References
R. Salvador, A. Otero, J. Mora, E. de la Torre, L. Sekanina, and T. Riesgo. Fault
tolerance analysis and self-healing strategy of autonomous, evolvable hardware
systems. In Reconfigurable Computing and FPGAs (ReConFig), 2011 International
Conference on, pages 164-169, 2011.
J. Lohn, G. Larchev, and R. Demara. Evolutionary fault recovery in a virtex fpga using a
representation that incorporates routing. In Parallel and Distributed Processing
Symposium, 2003. Proceedings. International, pages 8 pp.-, 2003.
J. Heiner, B. Sellers, M. Wirthlin, and J. Kalb. Fpga partial reconfiguration via
configuration scrubbing. In Field Programmable Logic and Applications, 2009. FPL
2009. International Conference on, pages 99-104, 2009.
S. Harding, J.F. Miller, and W. Banzhaf. Self modifying cartesian genetic programming:
Parity. In Evolutionary Computation, 2009. CEC '09. IEEE Congress on, pages 285-292, 2009.
L. L. Goh. Novel fault localization approach for atpg/scan-fault failures in complex
sub-nano fpga/asic debugging. In Physical and Failure Analysis of Integrated Circuits
(IPFA), 2010 17th IEEE International Symposium on the, pages 1-4, July 2010.