by Anders Holmberg, Product Manager, IAR Systems

Safety-certified tools make the difference

In November 2013, Renesas Electronics announced the development of an IEC 61508 compliant functional safety package for the RX631 and RX63N families of devices. At the same time, Renesas Electronics and IAR Systems announced the availability of a safety-certified development and build toolchain for the RX family of microcontrollers. The combination of hardware with a known history, application software, diagnostic software, and safety-certified development tools is a very powerful enabler. It allows developers of safety-critical applications to hit the ground running and get their end product certified as quickly as possible, saving both money and time-to-market. In this article we examine what benefits accrue to users of a certified development and build chain.

1. History repeats

Over the last decades there has been a tremendous increase in the number of embedded projects that somehow have to cope with functional safety requirements. Products for measurement and control, medical devices, automotive applications and so on are increasingly required to meet certain reliability thresholds and to behave safely in the event of failure. The avionics field, on the other hand, has been obsessed with safety and reliability for a long time. Unfortunately, every niche has had its own standards and concepts related to safety, although the underlying ideas are very similar. However, a strong trend since the turn of the century is a drive to unify standards and concepts so that, for example, methods and concepts from one standard can also be applied in another application area if justifiable. One driver for parts of this unification is the work on revising the IEC 61508 standard, Functional safety of electrical/electronic/programmable electronic safety-related systems, into its second edition.
It was released in 2010 and now serves as a reference standard for all kinds of programmable electronic devices, although sector-specific standards in, for example, railway and machinery are very important for codifying the safety-related knowledge that is fundamental to the specific application area. The IEC 62061 standard is especially interesting, since it is a direct application of IEC 61508 in a machinery-specific context. The automotive standard ISO 26262 is also an application of IEC 61508, but takes a slightly different view on safety integrity levels and related matters.

The trend towards more formal requirements on safety-related functionality is partly driven by the market: end users and product integrators higher up in the value chain demand high reliability and need some form of independent judge of what is safe and what is not. This opens the door to product certifications of different kinds, where IEC 61508 compliance is probably the most important one in terms of impact on our industry. The second driver is international or national regulations that require compliance with named standards. The end result is that more and more products are now under pressure to comply with functional safety requirements.

This trend is not likely to change in the short run. On the contrary, the number of devices with some sort of incorporated safety functionality will continue to grow. The complexity will also go up as the Internet-of-Things spreads and everything from toasters to emergency functionality in your car is connected to the outside world and the internet. This evolution also blurs the line between functional safety and device security, at least in software development, since the concepts partly overlap. A device that is, for example, susceptible to break-in attempts due to buffer overrun errors might just as well crash due to malformed data erroneously produced by a legitimate peripheral, or it might be forced to turn off its safety functionality by a malicious intruder.
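To make the overlap between safety and security concrete, here is a minimal sketch of a receive routine that treats a length field as untrusted, whether it comes from an attacker or from a misbehaving peripheral. The frame layout (one length byte followed by the payload), the names `parse_frame`, `frame_status` and `PAYLOAD_MAX`, and the buffer size are all assumptions made for illustration; they are not taken from any Renesas or IAR Systems library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_MAX 16u /* hypothetical capacity of the destination buffer */

typedef enum { FRAME_OK, FRAME_TOO_SHORT, FRAME_BAD_LENGTH } frame_status;

/* Copy the payload of a frame laid out as [length byte][payload...]
 * into out, which must hold PAYLOAD_MAX bytes.
 * The length byte is treated as untrusted input. */
frame_status parse_frame(const uint8_t *frame, size_t frame_len,
                         uint8_t *out, size_t *out_len)
{
    assert(frame != NULL && out != NULL && out_len != NULL);

    if (frame_len < 1u) {
        return FRAME_TOO_SHORT; /* not even a length byte was received */
    }

    size_t claimed = frame[0];

    /* Reject a length that exceeds the destination buffer or the
     * number of payload bytes that were actually received. */
    if (claimed > PAYLOAD_MAX || claimed > frame_len - 1u) {
        return FRAME_BAD_LENGTH;
    }

    memcpy(out, &frame[1], claimed);
    *out_len = claimed;
    return FRAME_OK;
}
```

The same check protects against both failure scenarios: a frame that claims more data than the buffer can hold is rejected regardless of whether the cause is malice or a hardware fault.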
A paper by Stephen Checkoway and colleagues at University of California, San Diego, "Comprehensive Experimental Analyses of Automotive Attack Surfaces", describes several attack channels in a modern car, where for example the in-car entertainment system can be compromised to allow command access to one or more CAN buses in the car. Another recent example that made the headlines is an online fridge that was recruited as part of a botnet to send spam emails.

2. Brace for impact

In the hardware domain, designing a functional safety system includes answering some very basic questions: How do I ensure the reliability and integrity of the selected components? How can I make sure they function as they should when reality strikes? For some components, failure rates and failure modes are fairly well understood. Such components in critical pathways can either relatively easily be dimensioned to cope with the required failure rates, or be doubled or tripled to further reduce the possibility of malfunction. In the latter case you can even opt for different suppliers to lessen the risk of common-mode failures.

In principle, the same kind of analysis must be carried out for the microcontroller you consider using, but a typical MCU is such a complex beast that other measures must be considered. One thing to consider is resilience to effects such as radiation that can cause random bit errors on buses or in memory. Another is wear and tear, where non-volatile storage in particular can be a serious threat as it approaches its limits on read and write cycles. A third source of issues is malfunctioning software, where things like writing beyond the stack can have disastrous effects.

With their safety package for RX, Renesas Electronics provides a solid foundation for your safety-related considerations by supplying certified information and software that can be slotted into your project.
One part of the package is a safety manual that includes, among other things, a description of the available safety mechanisms and failure rates. A second part is a comprehensive self-test diagnostic library for the CPU core, RAM and flash ROM. This diagnostic library was tested with very high test coverage, for example by injecting and simulating faults at the gate level. A third part is slightly more intangible, but still very important: the devices covered by the package are proven in the market and their performance in terms of possible failure modes is known, which is a huge advantage compared to selecting a device that is brand new on the market.

Renesas Electronics is really ahead of the game with their safety package, and since certified software development tools and RTOS implementations are also available, a large part of the necessary groundwork is already done. This kind of offering will become even more important as the market for safety-related development grows.

3. Standards apply?

If you are about to start a software project with safety-critical functionality or functional safety requirements, you are probably already aware that the tools you use must somehow be qualified as suitable for safety-related development. The exact requirements for how to qualify development tools depend to some extent on how critical a malfunction of the safety function or product would be. They also depend on the nature of the tool: a compiler that produces code that goes into your product is trickier to qualify than a source code metrics tool, which in turn is trickier to qualify than a version control system or a requirements management system. Clause 7.4.4 and its subclauses in IEC 61508, part 3, detail how support tools should be qualified.
However, the standard is not very detailed on exactly how a C compiler should be qualified. For example, clause 7.4.4.10 states among other things that the selected programming language shall “have a translator which has been assessed for fitness for purpose including, where appropriate, assessment against the international or national standards”. The notes to that clause then try to explain some of the available mechanisms for a qualification effort. Taken together, this and other statements in the standard indicate that qualifying a build toolchain for use in your project can result in a lot of work and associated document production, especially for higher Safety Integrity Levels. Further, if the assessment is too focused on the current project, it can be difficult to directly reuse the results in other projects.

Moreover, there is a common misconception that if your project uses the same uncertified tool as another project that did achieve a safety-related certification, your tool magically becomes qualified for development of safety-critical systems. This is not the case, because you are still required to prove that your project is similar enough to the other project, in the sense that you use the same functionality in the same manner. Generally, you end up having to justify all over again that your tool is qualified.

4. Jumping through the hoops

As mentioned, performing an in-house qualification of your selected build tools is often very time-consuming, and the skills needed have more in common with compiler writing and testing than with the typical skills required for safety-critical development close to the hardware. Further, how do you really validate that the build chain is compliant with the relevant language standards? Can you get hold of a safety manual for the toolchain? What about carrying out a HAZOP analysis for the toolchain?
To eliminate most of these questions and the associated work, IAR Systems has carried out a very comprehensive certification effort for our IAR Embedded Workbench for Renesas RX together with Renesas Electronics and TÜV SÜD, specialists in functional safety and associated assessments. The assessment of our build chain covers the following areas:

• Our development processes and our ability to develop high-quality software in a repeatable fashion, including how we work with the specific requirements put forth by different functional safety standards.
• Test and quality measures, including validation of compliance with different language standards.
• Our processes for dealing with issues reported from the field and how users get updated about potential issues.
• The safety information in the safety manual and all other documentation.
• Softer issues, such as how many active users we have on our build chains for different MCU targets and how we make sure that the right product reaches a customer.

The assessment covers both IEC 61508 and the sector-specific automotive standard ISO 26262. The latter standard is partly derived from IEC 61508 and has taken a position on tool qualification that is similar in intent but slightly different in action from IEC 61508. The outcome of the assessment is that the build tools incorporated in IAR Embedded Workbench for Renesas RX version 2.42.2 fulfill the requirements applicable to software development tools as given by both IEC 61508 and ISO 26262. The coverage of the automotive standard can be considered a bonus that follows from how we work, but it is also a good example of the fact that different standards can have very similar requirements.

5. How on earth?

If you have already been through one or more safety-related projects with formal certification requirements, you know the value and importance of a streamlined process.
It can sometimes be tempting to invent complex processes that involve lots and lots of paperwork and the production of artifacts that in the end are justified only by the process itself and not by the real issues the processes are intended to address. In fact, one of the biggest challenges in adapting existing development processes to the requirements in IEC 61508 is to avoid over-engineering the processes while at the same time fulfilling the requirements on safety goals, traceability, decision justification, safety planning, testing, validation and verification, feedback loops and so on, as described by the standard and the V model. Of course you don’t want to cut corners or take them on two wheels, but your processes should help you reach your goals, not hinder you.

Another challenge is, as always, the need to balance factors like:

• The bill of materials and associated production costs, especially for high-volume products.
• Time-to-market and the perceived market window.
• Whether to buy, develop yourself or outsource the various parts needed for the project. This includes not only pure development activities, but also things like specification work, functional safety management, validation and verification, and qualification of hardware, tools and third-party components.

As suppliers of microcontrollers and development tools it’s not really surprising that we advocate our own solutions, but we are convinced that existing solutions that deliver on the promise to cut down on the red tape will in the end be worth every cent.

6. Software galore

In developing safety-critical software, the relevant standards like IEC 61508 put stringent and rather heavy requirements on how the software shall be developed. This includes things like using the V model for organizing the overall project, how you select a programming language, and what features you can use in the selected language.
There is usually also strong advice on how to test and verify the functionality of your device, especially the parts that are relevant for functional safety. It can be quite tricky to balance the need for various safety precautions, which can for example drive up total memory usage, against the requirements from production and market to keep the price down while retaining margins. On the other hand, a lot of the recommendations in, for example, IEC 61508 also make sense for general development of embedded systems. For example, the certified MCU self-test library from Renesas Electronics is useful for any product where functional integrity is highly valued.

Here is another challenge: using optimizations and language extensions is not generally encouraged by safety standards. It is nevertheless our firm belief that both language extensions and optimization can have their merit in safety-related development as well, when weighed against alternatives that require you to, for example, implement complex functionality in assembly language or increase the clock frequency to meet real-time deadlines. As long as you have solid justification for your decision, backed up with matching validation and verification activities, you’re on solid ground. The safety manual and other documentation give you the information you need to make an informed decision if the need arises.

But how can, for example, the use of language extensions be justified, and why would I need them for development? The main reason for using language extensions is that they let me access special features of the underlying hardware in a type-safe way, without the need to write assembly language glue code or rely on strange preprocessor magic. Such extensions can be very general, like the “@” operator used by IAR Systems compilers to indicate that a certain object shall be placed at a certain absolute address.
This can be used to safely create symbolic names for memory-mapped peripherals such as timers and I/O ports, which in turn enables the linker to make sure that mapped objects do not overlap. More specific extensions are the various intrinsic functions for accessing special features of the CPU core, like __disable_interrupt() or the RX-specific __RMPA_B(), which inserts a special instruction into the instruction stream in a safe way.

Optimization is another interesting area. Considerable effort goes into making modern compilers do their very best both on typical application code and on specific benchmarks; a modern, highly optimizing compiler can perform some really amazing tricks with your code. At the same time, a buyer often spends quite a lot of time evaluating the performance of a compiler and its associated tools. When the chosen compiler is then used in a project with safety requirements, the optimizations are most often simply turned off. This might look slightly odd from a distance, but there are some highly relevant drivers for it:

• Safety standards commonly advise against using optimizations. And if your project aims for certification, it’s always your assessor or notified body that has the final say on what’s acceptable.
• Use of very aggressive optimizations on a whole application can severely degrade the traceability from source code to assembly language. After function inlining, loop unrolling, instruction scheduling, common sub-expression elimination and a bunch of other transformations have been applied to a program, it can be very difficult and error-prone to manually map a specific piece of source code to the resulting assembly code, let alone prove that the code implements the right piece of functionality. Traceability requirements can be an efficient blocker of wholesale optimization.
• Concern about faulty implementations.
Unfortunately, C compilers historically have a rather sad track record when it comes to implementing complex optimizations, as well as a habit of not taking the language standards too seriously. The situation is dramatically different today. But combining several very aggressive optimizations and unleashing them on large code bases definitely increases the probability of encountering problems.
• Perceived quality issues with optimizations. This is a tricky one. A quite common scenario for us is to get a bug report on some optimization that is believed to be wrong, where it turns out that the code that broke really relies on behavior that is undefined by the language standard, and the compiler exploited the fact that the source code is, in a sense, broken. What’s difficult with this kind of problem is that the code might have worked as expected for a long time and then suddenly breaks after a modification that triggers different optimization behavior. Further, it might be a chain of triggering optimizations that leads up to the problem, which can make it hard to find the spot in the source that’s causing it.

So, given that there are valid arguments for going easy on the optimizations, what can we do when we really need them? Let’s start from the bottom of the list.

• The best way to avoid relying on undefined behavior is to adhere as strictly as possible to a coding standard like MISRA C or another standard with similar intent. This is already close to mandatory if you are working against IEC 61508, so it is really no additional work that you do just for the benefit of correct optimizations. However, strict adherence to such coding standards requires tool support in the form of static analysis tools to be viable. But the benefits of keeping the code MISRA clean are so many that it’s really worth considering using MISRA as a basis for your work even if you’re currently not under safety requirements.
Of course any standard that enforces a similar subset of the language can be used, but MISRA C is the most widespread standardized subset around.
• So, what about traceability? First, an assessment should be made where potential performance bottlenecks are identified. In this scenario, the code size of one or more modules can also be viewed as a performance bottleneck if it forces the inclusion of more code memory in the hardware design, which in turn can force a reevaluation of safety requirements. If the performance issues can be isolated to just a few places, various techniques can be used to keep traceability at workable levels. This leads over to the last issue, since the techniques are very similar.
• No matter whether you want to increase traceability to, and understandability of, object code or reduce the risk of running into unexpected optimization issues, there are a few techniques that can be used:
o Most compilers have a notion of optimization levels. Three levels, divided into low, medium and high optimizations, are very common. Another commonality is that the trickiest optimizations are almost always reserved for the highest level. Depending on your exact traceability and verification requirements, you can ask yourself the following questions. Can I increase the optimization level on only the modules that are perceived as slow or big? How are performance and traceability affected by this? Maybe an even higher level works, but this must always be assessed thoroughly against all other requirements. It is also often possible to turn off some specific optimization.
o If needed, break out functions and put them in separate modules to isolate optimizations even further.
o When optimizations are used, no matter whether the level is low or high, and whether they are applied only to certain modules or across the full application, you should consider how you test the application.
As soon as optimizations are used you should run as many of your tests as possible with both optimized and non-optimized builds. This way you ensure that the code behaves the same in both versions. Further, as a complement to what we talked about above, this way of testing is an excellent way to further reduce accidental dependencies on undefined behavior. This is especially so if you can complement your test configurations with a build where all optimizations are turned on.

A good thing with optimizations, as seen from a quality perspective, is that they reduce the amount of object code that might potentially have to be verified and cross-referenced. How much verification is done on this level depends on your safety goals, but for certain projects this can be a heavy burden. Usually, letting the optimizer work on the low level removes a lot of code that is just translation artifacts but retains a very good coupling with the original source code, thus actually simplifying visual inspection.

7. Say what?

So, what do you get if you select the functional safety version of IAR Embedded Workbench for Renesas RX? The high-level benefits can be summarized as follows:

• A complete build chain and development environment that is certified by TÜV SÜD to comply with the requirements for tool selection in IEC 61508 and ISO 26262.
• A report to accompany the certificate, stating under what circumstances the certificate is valid.
• A safety manual and general documentation that give detailed knowledge of how the tools should be used and of their functional boundaries.
• A test report on how the tool set is tested.
• A compiler that accepts the C89, C99 and C++ languages. Exceptions and RTTI are not supported for C++, which in the case of exceptions is just as well, since their usage is not recommended for safety-related development.
• A Functional Safety Support and Update Agreement that includes support and prequalified bug fix updates to the certified version for as long as there are customers under contract for that version.
• Regular updates on newly discovered problems in the toolchain.

The combination of Renesas Electronics’ functional safety package for the selected RX families and IAR Embedded Workbench for Renesas RX gives you a head start in developing safety-critical products and applications, and removes much of the drudgery that isn’t directly related to developing your application. Additionally, it gives you a compiler with outstanding optimization performance and best-in-class language conformance. On top of that, it provides various language extensions that can optionally be turned on to simplify programming close to the hardware. These include intrinsic functions for accessing special hardware features as well as extensions that simplify access to memory-mapped peripheral devices.

www.iar.com
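As a rough illustration of the memory-mapped peripheral extension discussed in this article, the sketch below gives a symbolic name to a register using the IAR Systems “@” placement operator together with the `__no_init` keyword. The register name `LED_PORT`, the address 0x0008C000, the bit mask and the helper functions are placeholders invented for this example and are not taken from any device manual. The host fallback branch is likewise our own assumption, added only so that the logic can be compiled and checked with an ordinary C compiler.

```c
#include <assert.h> /* only needed for host-side unit checks */
#include <stdint.h>

#if defined(__IAR_SYSTEMS_ICC__)
/* IAR build: place the object at an absolute address with "@".
 * The address is a placeholder, not a real RX register location. */
__no_init volatile uint8_t LED_PORT @ 0x0008C000;
#else
/* Host build: an ordinary variable stands in for the register
 * so that the access logic can be unit tested off target. */
volatile uint8_t LED_PORT;
#endif

#define LED_MASK 0x01u /* hypothetical bit that drives the LED */

/* Set or clear the LED bit without disturbing the other bits. */
void led_set(int on)
{
    if (on) {
        LED_PORT = (uint8_t)(LED_PORT | LED_MASK);
    } else {
        LED_PORT = (uint8_t)(LED_PORT & (uint8_t)~LED_MASK);
    }
}

/* Return nonzero if the LED bit is currently set. */
int led_is_on(void)
{
    return (LED_PORT & LED_MASK) != 0u;
}
```

Because the register is declared as a typed C object rather than a cast integer constant, the compiler can type check every access, and on target the linker knows which address the object occupies.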