Fault and Energy Aware Communication Mapping with Guaranteed Latency for Applications Implemented on NoC
Sorin Manolache, Petru Eles, Zebo Peng
{sorma, petel, zebpe}@ida.liu.se

Outline
- Motivation
- System model
- Approach: outline, three sub-problems
- Experimental results
- Conclusions

Motivation
- Reduced feature sizes lead to a non-negligible rate of transient faults on the network links
  - Specified message arrival probabilities have to be guaranteed
- Application latency
  - Application latency has to be reduced and has to be predictable even in the case of faults
- Energy consumption of communication links accounts for a significant fraction of the total energy consumed by the chip
  - Communication energy has to be reduced

System Model
- Application characteristics
  - Task period, WCET, deadline, dependencies, priorities
  - Message lengths and priorities
  - Task mapping
- Platform characteristics
  - Link bandwidth, energy per bit, average failure rate, average recovery time
  - Packet size

Problem Formulation
- Input: system model
- Output: mapping of messages to links
- Constraints:
  - Task deadlines are met
  - The probability that an inter-task message reaches its destination is higher than a specified bound
  - Communication energy is reduced

Approach Outline
- Link failures are tolerated by means of transmission of redundant packet copies
  - How many? How to map the copies?
  - How should we generate the mapping alternatives such that they are energy-efficient?
- Explore the space of alternatives, select those that minimize latency
  - Drive the exploration by task response time analysis
  - How to model the application in order to account for packetisation and redundancy?
- Exploit the maximised time slack for energy minimisation by means of voltage scaling [Andrei et al., ICCAD 2004]

Communication Supports
[Figure slides: Communication Supports]

Response Time Analysis
[Figure slide: Response Time Analysis]

Latency vs. Imposed MAP
[Plot: latency vs. imposed MAP]

Latency Reduction vs. Link Load
[Plot: latency reduction vs. link load]

Energy vs. Number of Tasks
[Plot: energy vs. number of tasks]

Conclusions
- An approach to communication mapping for NoC
  - Guaranteed latency
  - Guaranteed message arrival probability in the presence of transient link failures
  - Energy minimised by slack maximisation and subsequent voltage reduction

Latency vs. Number of Tasks
[Plot: latency vs. number of tasks]

Communication Supports
- Use communication supports (CS) of minimal general redundancy degree for the given spatial redundancy degree (SRD)
- CS that are only temporally redundant have no capacity for concurrent transmission of redundant copies
  - Use CS of spatial redundancy degree higher than 1
- CS of SRD > 2, composed of shortest paths only, are vulnerable on the initial two links
- If longer paths are used in order to tolerate double faults, more energy is consumed and longer latency is obtained
  - Use CS of spatial redundancy degree at most 2

CS Candidate Space Exploration
[Figure: matrix of messages vs. CS candidates, entries marked with X]
- Tabu Search
  - Restrict the neighbourhood to candidates that would move the communication to links with lower load of higher priority messages
  - Attempt to re-map the messages with high jitters
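
Backup sketch: how many redundant copies?
The "How many?" question from the approach outline can be illustrated with a small calculation. This is only a sketch under a simplifying assumption that is not taken from the talk: each packet copy is lost independently with the same probability p_loss. The names arrival_probability and min_copies, and the max_copies cap, are hypothetical.

```python
def arrival_probability(p_loss: float, copies: int) -> float:
    """Probability that at least one of `copies` copies arrives.

    Assumes (for this sketch only) independent, identical loss per copy.
    """
    return 1.0 - p_loss ** copies


def min_copies(p_loss: float, bound: float, max_copies: int = 64) -> int:
    """Smallest number of copies whose arrival probability is higher than `bound`."""
    if not 0.0 <= p_loss < 1.0:
        raise ValueError("p_loss must be in [0, 1)")
    if not 0.0 <= bound < 1.0:
        raise ValueError("bound must be in [0, 1)")
    for copies in range(1, max_copies + 1):
        if arrival_probability(p_loss, copies) > bound:
            return copies
    raise ValueError("bound not reachable within max_copies")


if __name__ == "__main__":
    # Hypothetical numbers: 10% loss per copy, required arrival probability 0.995.
    print(min_copies(0.1, 0.995))
```

Under this independence assumption the count depends only on the per-copy loss probability. It says nothing about how the copies are mapped to links, which is the separate question raised on the Communication Supports slides.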
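
Backup sketch: tabu search over CS candidates
The exploration slide above names tabu search but gives no details, so the following is a generic, minimal tabu search and not the authors' algorithm. Assumptions made for illustration: each message has a list of candidate supports, each given as a list of link names; the cost is the maximum link load, a stand-in for the response time analysis that drives the exploration in the talk; the neighbourhood is unrestricted. All names (max_link_load, tabu_search, tenure) are hypothetical.

```python
from collections import deque


def max_link_load(assignment, candidates, sizes):
    """Proxy cost: the largest total message size routed over any single link."""
    load = {}
    for msg, idx in assignment.items():
        for link in candidates[msg][idx]:
            load[link] = load.get(link, 0) + sizes[msg]
    return max(load.values(), default=0)


def tabu_search(candidates, sizes, iterations=50, tenure=3):
    """Pick one candidate per message, minimising the proxy cost."""
    current = {msg: 0 for msg in candidates}   # start from each first candidate
    best = dict(current)
    best_cost = max_link_load(best, candidates, sizes)
    tabu = deque(maxlen=tenure)                # recently abandoned (message, candidate) pairs
    for _ in range(iterations):
        moves = []
        for msg, options in candidates.items():
            for idx in range(len(options)):
                if idx == current[msg] or (msg, idx) in tabu:
                    continue
                trial = dict(current)
                trial[msg] = idx
                moves.append((max_link_load(trial, candidates, sizes), msg, idx))
        if not moves:
            break                              # every neighbour is tabu
        cost, msg, idx = min(moves)            # best non-tabu neighbour, even if worse
        tabu.append((msg, current[msg]))       # forbid moving straight back
        current[msg] = idx
        if cost < best_cost:
            best, best_cost = dict(current), cost
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical example: two messages that initially share links "a" and "b".
    candidates = {"m1": [["a", "b"], ["c", "d"]], "m2": [["a", "b"], ["e"]]}
    sizes = {"m1": 4, "m2": 2}
    print(tabu_search(candidates, sizes))
```

The neighbourhood restriction and the preference for re-mapping high-jitter messages described on the slide would replace the exhaustive inner loops here; they are left out because the slide does not specify them precisely enough to reproduce.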