The Rare Glitch Project: Verifying Bus Protocols for Embedded Systems
Edmund Clarke, Daniel Kroening
Carnegie Mellon University

Motivation: TTP/C
- Shorthand for Time-Triggered Protocol for SAE Class C Applications [SAE93]
- Real-time communication protocol for fault-tolerant real-time systems
- Defined by draft standard TTP/C version 0.5 from TTTech AG [TTPC99]
- Designed for X-by-wire applications: steer-by-wire, brake-by-wire, throttle-by-wire, ...
- E.g., replace the steering wheel by a joystick
- Safety critical!

Carnegie Mellon: The Rare Glitch Project, E. Clarke and D. Kroening

Introduction: Drive-by-Wire
- First used for military aircraft (fly-by-wire)
- Steer-by-Wire: replace the steering wheel by a joystick
- Brake-by-Wire: replace the hydraulic brake system
- Throttle-by-Wire: replace the mechanical throttle pedal

[Figure: Drive-by-Wire]

Introduction: Drive-by-Wire Advantages
- More safety through stabilizing algorithms
- Passive safety: no steering column
- Reduced weight
- Reduced maintenance cost

Introduction: Implementing Drive-by-Wire
- Components are connected using a redundant bus

Introduction: A TTP/C Bus
[Figure: a TTP/C bus]

Introduction: A TTP/C Bus Node
- Also the smallest replaceable unit (SRU)
- Host Processor
- Protocol Processor
- Bus Guardian
- Line Interfaces

Introduction: TTP = Time Triggered Protocol
- TTP/C uses a cyclic time-division multiple access (TDMA) scheme
- Time slots are assigned statically
[Figure: slot sequence A B C A B C A ... along the time axis, with one TDMA round marked]

Introduction: Why verify?
- DaimlerChrysler and BMW tested TTP/C and considered it too inflexible
- They developed FlexRay, which provides more flexibility
- The developers of TTP/C claim that FlexRay sacrifices safety for flexibility
- GM has not yet decided which protocol to use

Introduction: Why is verification hard?
- Large state space per node (message area)
- Many features besides message transmission (membership service, global time base, mode changes, reconfiguration, download)
- Protocol provides clock synchronization
- Must have a large number of nodes
- Verifying with only 2 or 3 nodes is dangerous: the protocol requires a minimum of 4, and 20-30 nodes are realistic

Formalizing TTP/C: Formalizing a Protocol Standard
- The TTP/C standard is plain, informal English text
- In a drive-by-wire system, different implementations from different vendors are used
- We do not verify a particular implementation but the requirements for all implementations
- Use non-determinism to cover all implementation details

Formalizing TTP/C: Formalizing a Protocol Standard
1. Define the set of states
2. Define the set of valid initial states
3. Define the transition relation
Verification: prove properties on paths
[Figure: state-transition graph with states numbered 1 to 10]

Formalizing TTP/C: Level of Abstraction
- Abstraction...
  - permits concise specification of protocol properties
  - allows for automated, computer-aided verification
- Abstraction on time: only consider specific points in time
- E.g., end of TDMA round, end of message, etc.

Formalizing TTP/C: Abstraction Hierarchy
[Figure: hierarchy of levels: TDMA round, MSG slot, macroticks, microticks. Annotations: includes MFM; macro-tick synchronization; DPRAM access timing; each SRU has own time base]

Formalizing TTP/C: Abstraction Hierarchy: Formalization
- Each level is modeled by a mathematical machine
- The machines share the same configuration set
- The set of reachable states of a lower level is a refinement of the reachable states of the level above
[Figure: Msg Slot Level states 4, 11, 12; Macro Tick Level states 4, 5, 6, 7, 11, 8, 9, 12]

Formalizing TTP/C: Abstraction Hierarchy: Formalization
- Let r_x denote the transition relation for level x
- Let a, b denote levels and let b < a hold
- c r_a d holds iff there is a sequence of states c_1, ..., c_n with c_i r_b c_(i+1) for i = 1 to n-1, and c_1 = c and c_n = d
- n can be fixed depending on the level and on c_1

Verifying Protocol Properties: Properties of Interest
- Service guarantee
  - Verify that the protocol stack can transmit messages within a finite amount of time after enabling the controller
  - Verify a guarantee for hot standby nodes to become members in case of a failure
- Membership service
  - Informs all nodes about the operational state of each node within one TDMA round
  - An SRU is operational if the host sends a life sign and the controller is operational and synchronized
  - Claim: the membership bit matches the real status after one TDMA round

Verifying Protocol Properties: Fault Model
- Described in the standard
- System must tolerate any single hardware fault
- System must tolerate malicious host software
- ... assuming that all SRUs are implemented according to the standard

Verifying Protocol Properties: Membership Service
- Uses an implicit acknowledgement scheme
- Encoded in the CRC that protects the frames
- A node that sends no data or false data loses membership
- After sending a frame, a node watches the following frames to determine whether it is still considered a member of the cluster

Project Status: Done
- Verification done using PVS
- Abstraction hierarchy
- Initial predicate
- Transition relation
  - for the message slot abstraction level and the abstraction levels above
  - for the MFM code level
  - includes membership service, without mode changes, download, and reconfiguration
- Parts of the verification of the membership service

Project Status: Future Work
- More properties
- Analysis of problems of the membership service
- More abstraction levels (e.g., clock synchronization)
- FlexRay (requires NDA)
- Prove the abstraction hierarchy using a theorem prover, model-check the individual levels of the hierarchy
- Common framework: SyMP
- Probabilistic model checking (J. Wing)

Outline
- Introduction
- Project Goals
- Formalizing TTP/C
- Verifying Protocol Properties
- Project Status

Verifying Protocol Properties: Problems with Membership Service
- No data is accepted from a node without consistent membership information
- The membership service is therefore safety critical
- Problem: correctly working nodes may lose membership
- One may be better off without the membership service

Verifying Protocol Properties: Example
- Nodes: A, D, E, ... from Vendor 1; B, C from Vendor 2
- A transmits a message, correctly received by D, E, ... but not by B, C
- A loses membership; can continue with next predecessor of B
[Figure: nodes A, B, C, D, E, F]

Project Goals
- Formalization of the requirements of TTP/C and FlexRay
- Formalization of service requirements of higher levels
- Formalization of a fault model
- Formal proof that the protocols satisfy the service requirements
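
The abstraction hierarchy formalization from the "Formalizing TTP/C" slides (c r_a d holds iff a chain of r_b steps links c to d) can be tried out on a toy model. The Python sketch below is illustrative only and is not taken from the PVS formalization. The node list, TICKS_PER_SLOT, and all function names are hypothetical. It assumes a fixed number of lower-level ticks per message slot and checks the condition only for configurations at slot boundaries.

```python
from itertools import product

NODES = ["A", "B", "C"]   # static slot owners, as in the round "A B C"
TICKS_PER_SLOT = 4        # hypothetical number of low-level steps per slot

# Shared configuration set: (slot index within the round, tick within the slot).
STATES = set(product(range(len(NODES)), range(TICKS_PER_SLOT)))


def step_low(c):
    """Lower level r_b: advance one tick, wrapping into the next slot."""
    slot, tick = c
    if tick + 1 < TICKS_PER_SLOT:
        return {(slot, tick + 1)}
    return {((slot + 1) % len(NODES), 0)}


def step_high(c):
    """Upper level r_a: jump from one slot boundary to the next one."""
    slot, tick = c
    if tick != 0:
        return set()  # assumption: r_a is only defined at slot boundaries
    return {((slot + 1) % len(NODES), 0)}


def chain(c, n_states):
    """All d with c = c_1 r_b c_2 ... r_b c_n = d (n states, n-1 steps)."""
    frontier = {c}
    for _ in range(n_states - 1):
        frontier = set().union(*(step_low(s) for s in frontier))
    return frontier


def refines(n_states):
    """Check c r_a d iff a chain of n_states low-level states links c to d."""
    boundaries = [c for c in STATES if c[1] == 0]
    return all(step_high(c) == chain(c, n_states) for c in boundaries)


def reachable(init, step):
    """Reachable configurations from init under the successor function step."""
    seen, todo = set(init), list(init)
    while todo:
        for d in step(todo.pop()):
            if d not in seen:
                seen.add(d)
                todo.append(d)
    return seen


def slot_owner(c):
    """Node that owns the slot of configuration c (static TDMA assignment)."""
    return NODES[c[0]]


if __name__ == "__main__":
    init = {(0, 0)}
    print("refinement holds:", refines(TICKS_PER_SLOT + 1))
    print("slot-level reachable:", sorted(reachable(init, step_high)))
    print("tick-level reachable:", len(reachable(init, step_low)), "states")
```

In this toy model, refines(TICKS_PER_SLOT + 1) holds while refines(TICKS_PER_SLOT) does not, which mirrors the remark that n is fixed per level. Every configuration reachable at the slot level is also reachable at the tick level.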