STEVENS INSTITUTE OF TECHNOLOGY
Homework 6
Design 6
4/14/2014

Section 1: Division of Work

Table 1 below shows each group member's contribution to this project. Each member contributed an equal share of effort to this homework. First, the group members collaborated to create the black box, the transparent box, and the function-means tree. Next, each group member contributed to the project functionality, component functionality, component interface, and performance metric sections.

Table 1: Group contribution distribution

Percentage of Effort Towards Assignment
Kevin Barresi    Nishant Panchal    Giancarlo Rico    Bryan Bonnet
25%              25%                25%               25%

Section 2.1: Project Functionality

Digital circuit degradation is an important issue that needs attention: as devices become more complex and expensive, the cost of circuit failure rises. A single stuck-on fault can render an entire complex system inoperable. Our design provides a framework for fault identification and repair in consumer-oriented markets using a hardware evolutionary approach, leading to highly robust and reliable digital circuitry. The target platform for the proof-of-concept (PoC) demonstration is a Field Programmable Gate Array (FPGA) device. The overall functionality of the design can be broken down into five properties: self-healing, redundancy, genetic evolution, autonomous test vectors, and minimal user interaction.

The system will have a self-healing mechanism for the circuitry. It will not need an external connection to a computer to achieve reconfiguration, so all operations occur independently within the chip. While the framework repairs a faulty circuit, the chip incurs no downtime: it continues to operate during the repair, and the operational switch is seamless.

The genetically evolving nature of the design would ensure that the chip learns from prior repairs.
This would reduce redundant repair work, minimizing the effort needed to repair the chip. Repair operations would not significantly affect the power and temperature requirements.

Test vectors would be generated autonomously before operation, without requiring any user-licensed software to be accessed or obtained. Fault detection would occur regularly without affecting power requirements, and testing operations shall not affect normal operation of the chip. Testing will be invisible to the user and should not require any manual user interaction.

Section 2.2: Component Functionality

Operation Module: This component holds the functionality of the desired circuit. It maps the Operation Module inputs to the desired output based on the current configuration. The Operation Module can be reconfigured on demand by the Configuration Module.

Configuration Module: This module manages several Operation Modules simultaneously. It schedules reconfiguration updates to the Operation Modules and ensures redundancy and constant operation of the overall system.

Fault Detection: This module detects faults occurring in Operation Modules. It injects test vectors into the target Operation Module and compares the actual output with the expected output of each test vector to determine whether a fault is present.

Fault Localization: Once the Fault Detection module detects a fault, it activates the Fault Localization module. This module injects further test vectors and compares expected and actual output. The result pinpoints where a specific “stuck on/off” fault is located on the Operation Module.

Configuration Generation: This module holds the logic for creating new configurations for Operation Modules that contain faults.
It evolves by combining previous configurations with new faults reported by the Fault Localization module. This module is where the system gains its unique self-healing properties: it runs genetically evolving algorithms that create new configurations which work around known faults in the Operation Module.

Reconfiguration: When a new configuration is generated and ready for programming, the Reconfiguration Module reconfigures the Operation Module based on the map sent from the Configuration Generation module. When this operation is complete, it signals to the rest of the system that the Operation Module is ready for full operation.

Section 2.3: Component Interface

The device as a whole relies on a highly modular structure of interconnected submodules. As seen in Figure 1, a single self-repairing module consists of a minimum of three pieces: two Operation Modules and a single Configuration Module. Operation Modules are independent circuits, each of which can complete the desired operation of the device alone. Thus, each Operation Module is connected to the device inputs and outputs. For a low-level implementation, such as a simple adder circuit, each input of the adder would be routed to both Operation Modules, and both outputs would be connected to the global output of the adder.

Each of the Operation Modules is in turn connected to the single Configuration Module. There are four distinct connection lines: a test line that connects to the input/output of the module, a reconfigure line that allows the Operation Module’s configuration to be updated, a fault detection line that aims to localize and characterize any detected faults, and an enable/disable line that can completely disable or enable the Operation Module. Together, these control lines give the Configuration Module the information it needs to correctly diagnose faults and the ability to repair them, if possible.
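
The interface described above can be sketched in software. The following is a minimal illustration, not the FPGA implementation: it assumes each Operation Module can be modeled as a lookup table, that a stuck on/off fault pins one table cell to a fixed value, and that the test vectors are simply all input combinations. The class and method names (OperationModule, ConfigurationModule, faulty_vectors, service, output) are hypothetical.

```python
from itertools import product


class OperationModule:
    """Hypothetical model: a lookup table with optional stuck-at cells."""

    def __init__(self, truth_table):
        self.config = list(truth_table)  # written via the reconfigure line
        self.stuck = {}                  # cell index -> stuck value (injected fault)
        self.enabled = True              # driven by the enable/disable line

    def evaluate(self, bits):
        # Interpret the input bits as a binary index, first bit most significant.
        index = int("".join(str(b) for b in bits), 2)
        return self.stuck.get(index, self.config[index])


class ConfigurationModule:
    """Hypothetical controller for a set of redundant Operation Modules."""

    def __init__(self, modules, expected):
        self.modules = modules
        self.expected = list(expected)
        self.n_inputs = (len(self.expected) - 1).bit_length()

    def faulty_vectors(self, module):
        # Test and fault detection lines: inject every vector, compare outputs.
        vectors = product((0, 1), repeat=self.n_inputs)
        return [v for i, v in enumerate(vectors)
                if module.evaluate(v) != self.expected[i]]

    def service(self):
        # Disable any module that fails a test vector; return the fault report.
        report = {}
        for idx, module in enumerate(self.modules):
            bad = self.faulty_vectors(module)
            module.enabled = not bad
            report[idx] = bad
        return report

    def output(self, bits):
        # The global output comes from the first enabled (healthy) module.
        for module in self.modules:
            if module.enabled:
                return module.evaluate(bits)
        raise RuntimeError("no healthy Operation Module available")


# Sum bit of a one-bit adder (XOR), duplicated across two Operation Modules.
XOR = [0, 1, 1, 0]
mod_a, mod_b = OperationModule(XOR), OperationModule(XOR)
mod_a.stuck[3] = 1  # stuck-on fault for input (1, 1) in module A
controller = ConfigurationModule([mod_a, mod_b], XOR)
print(controller.service())       # {0: [(1, 1)], 1: []}
print(controller.output((1, 1)))  # 0, served by the healthy module B
```

In this sketch the failing vectors double as a coarse localization result, and the device keeps producing correct output from the second module while the first is disabled, which is the redundancy behavior the two-module arrangement is meant to provide.
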
In a multi-level architecture, many instances of the aforementioned modules can be connected together to provide higher levels of redundancy. Ideally, this would result in fault resistance in devices whose complexity would otherwise require extremely long runtimes for the genetic algorithm used to derive re-route paths. It would also result in lower device prices, as no redundancy in large, complex circuit pieces would be required.

Figure 1: Architecture overview

Section 2.4: Performance Metric

The solution we propose is an algorithm that sits inside the hardware of any electronic device and reconfigures its circuitry to maintain the best achievable performance after damage. Because our solution is software based, its success will be determined by several key metrics: optimal design, readability of code, computational efficiency, bounds of code, and safety margin.

Optimal Design: Our solution should first of all work better than previously proposed solutions, in order to motivate companies to adopt our algorithm. The mathematics that the algorithm uses should be proven to be the best known solution. This would allow us to patent our idea and create demand among circuit designers for our solution.

Design Readability: One of the biggest issues in software-based design is the readability of code. Code written in an organized manner makes updates and bug fixes much easier. Readable code would also allow the group to optimize the code when a better mathematical method for solving the problem is found.

Computation Efficiency: The accuracy and precision with which the circuit is optimized are extremely important to the performance of our design. Each detected fault must be repaired accurately and consistently, without degrading the output performance of the device itself.
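
One way to make the accuracy requirement measurable is to score a candidate configuration by the fraction of test vectors it answers correctly, and to let a simple evolutionary search improve that score. The sketch below is only an illustration under stated assumptions: a configuration is modeled as a routing table (test vector index to physical cell) plus cell contents, a fault pins one physical cell to a fixed value, and the search is a basic (1+1) evolutionary strategy rather than the group's actual algorithm. The names accuracy, mutate, and repair are hypothetical.

```python
import random


def accuracy(config, stuck, expected):
    """Fraction of test vectors whose output matches the expected value."""
    routing, cells = config
    hits = 0
    for vector, want in enumerate(expected):
        cell = routing[vector]
        got = stuck.get(cell, cells[cell])  # a stuck cell ignores its contents
        hits += (got == want)
    return hits / len(expected)


def mutate(config, rng):
    """Return a copy with one routing entry re-routed or one cell bit flipped."""
    routing, cells = list(config[0]), list(config[1])
    if rng.random() < 0.5:
        routing[rng.randrange(len(routing))] = rng.randrange(len(cells))
    else:
        cells[rng.randrange(len(cells))] ^= 1
    return routing, cells


def repair(config, stuck, expected, generations=500, seed=0):
    """(1+1) evolutionary search: keep a child only if it is at least as accurate."""
    rng = random.Random(seed)
    best, best_score = config, accuracy(config, stuck, expected)
    history = [best_score]
    for _ in range(generations):
        if best_score == 1.0:
            break
        child = mutate(best, rng)
        score = accuracy(child, stuck, expected)
        if score >= best_score:
            best, best_score = child, score
        history.append(best_score)
    return best, best_score, history


# XOR mapped onto six physical cells (two spares); cell 3 is stuck at 1.
expected = [0, 1, 1, 0]
start = ([0, 1, 2, 3], [0, 1, 1, 0, 0, 0])
stuck = {3: 1}
print(accuracy(start, stuck, expected))  # 0.75
best, score, history = repair(start, stuck, expected)
print(score, len(history) - 1)  # final accuracy and generations used
```

Because a child is accepted only when it scores at least as well as its parent, the accuracy never decreases during a repair, and the number of generations used gives a rough runtime figure that can be tracked as faults accumulate.
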
Bounds of Code: The Big-O, Big-Theta, and Big-Omega bounds of the algorithm must also be analyzed to ensure that the time needed to optimize the circuit does not grow too quickly as more faults are detected. The algorithm must be implemented so that the time associated with circuit optimization is kept low.

Safety Margin: Our solution is only as strong as the encryption protecting it. The algorithm must be stored in the part of the circuitry that is most fault resistant and secure. The algorithm should also ensure that an attempt to fix the faulty circuitry never consumes too much power or harms the actual device.

Black Box (diagram labels)
Power
Vectors of making the circuitry function as created by designer
Repairs broken circuit without interrupting user process and evolves the vectors
Heat
Fixed the faulty circuitry by remapping the original designated process vectors

Transparent Box (diagram labels)
Power
Convert power to Voltage
Heat
Heat power
Vectors representing the circuitry design constructed by designer
Fault Detection
Configuration Module
Operation Module
Fault Localization
Configuration Generation
Reconfiguration
Vector output that fixes the circuitry and optimizes performance

Function-Means Tree (diagram labels, in reading order)
Fix faulty circuitry
Chipwise Synthesis
Fault Masking
Multiple redundant circuit blocks
Extra circuit blocks
Overhead Maintenance
Selection mechanism of only one block
Manage power usage
Chip type
Develop chip for specific device
Maintain system latency
Triple-Modular Redundancy
Nanotechnology
Add code to implement efficiently
Use nanotech to make chip proper size
Parallelism
Add three modules
Voting Mechanism
Parallelism works based on selection mechanism
Choose one of the three modules
Our Solution
Operation Module
Brain of the circuit device
Control Module
Decided by circuit designer
Detects faults
Fixes faults and improves design

References

R. Salvador, A. Otero, J. Mora, E. de la Torre, L. Sekanina, and T. Riesgo.
Fault tolerance analysis and self-healing strategy of autonomous, evolvable hardware systems. In Reconfigurable Computing and FPGAs (ReConFig), 2011 International Conference on, pages 164-169, 2011.

J. Lohn, G. Larchev, and R. Demara. Evolutionary fault recovery in a Virtex FPGA using a representation that incorporates routing. In Parallel and Distributed Processing Symposium, 2003. Proceedings. International, 8 pp., 2003.

J. Heiner, B. Sellers, M. Wirthlin, and J. Kalb. FPGA partial reconfiguration via configuration scrubbing. In Field Programmable Logic and Applications, 2009. FPL 2009. International Conference on, pages 99-104, 2009.

S. Harding, J.F. Miller, and W. Banzhaf. Self modifying cartesian genetic programming: Parity. In Evolutionary Computation, 2009. CEC '09. IEEE Congress on, pages 285-292, 2009.

L. L. Goh. Novel fault localization approach for ATPG/scan-fault failures in complex subnano FPGA/ASIC debugging. In Physical and Failure Analysis of Integrated Circuits (IPFA), 2010 17th IEEE International Symposium on the, pages 1-4, July 2010.