CSCI1600: Embedded and Real Time Software
Lecture 31: Verification IV
Steven Reiss, Fall 2015

Working with Formulas
  Assume we have an LTL/CTL formula for the property
  Assume we have a finite state program
  Objective:
    Map the program to a Boolean formula
    Create a combined formula for property + program
    Prove the combined formula
  Last time we showed how to create this formula
    F_f(s’) is true if f holds in state s with bit representation s’

Making This Practical
  The formulas can be generated, but
    They are going to be very large
    Need to both generate and use them efficiently
  We need an efficient representation of a formula
    Easy to operate on (AND/OR/NOT/IMPLIES)
    Space efficient
    Easy to manipulate by computer

Decision Trees
  A formula can be written as a decision tree T
    Each node is labeled with a Boolean variable
    Each node has a true branch and a false branch
    Leaves are labeled either TRUE or FALSE
  Example: A and (B or C)
  Try d AND (c OR (b XOR a))
    What is the size of the tree?
    Does it have to be this size?
  [Figure: decision tree for A and (B or C)]

Simplifying Decision Trees
  Most of the decision trees look the same
    Extra variables just yield identical subtrees
  Trees can be simplified
    Simplify the tree into a DAG
    There are only two real leaves (merge these)
    If two nodes have the same label and the same children, they can be merged
    If both branches of a node go to the same node, the original node can be eliminated
    This can be iterated
  Worst case size is still exponential; actual size is generally much smaller
  The result is a BDD (Binary Decision Diagram)

Ordered BDDs
  Place an a priori order on the variables used as labels
    Note that the order affects the size
    Finding the optimum ordering is NP-hard
  Logical operations are easy to do on ordered BDDs
    Complement: flip the TRUE and FALSE leaves
    AND, OR: merge the two trees level by level
      Ordering makes this easy
      Set each result to the AND/OR of the merged results
    EG and EU
      Defined as iterative procedures on BDDs
      Stop at the appropriate fixed point (no changes to the BDD on the next pass)

Model Checking BDDs
  Compute the BDD for R(s,t) for the program
  Inductively compute F(s) for the formula f
  F(s) is true iff f is true in the program
    F(s) is true iff it is the singleton node TRUE
    Otherwise there is a path that shows false
    Can use the BDD path to FALSE to find a counterexample
  BDDs have been used to verify programs with 2^80 states

Handling Real Time
  The checking so far used relative time
    Sometime in the future, eventually
    We can’t say after k ms or before k ms
  Real time checking
    We’ve already seen some of this in terms of scheduling
    Scheduling theory tells us when we can validly schedule periodic tasks
      Provides necessary and sufficient conditions
      Can accommodate non-periodic and sporadic tasks as well
  How do we merge the two?

Real-Time Model Checking
  Augment temporal logic with real time operators
  Discrete case: count the number of steps
    Do assertions in terms of the number of steps
    This assumes the step size (time increment) is constant
      Hardware has real clocks
      Each instruction takes a fixed number of cycles
      If we can specify the number of cycles per instruction …
  Alternatively, start with timed FSAs to deal with continuous real time

RTCTL: Real-time CTL
  Add a new operator f U[a,b] g
    Called “bounded Until”
    f and g are arbitrary formulas
    [a,b] is a time interval (expressed in steps)
  f U[a,b] g is true on a path P = s0 s1 … iff
    g holds at some future state s on the path
    f is true in all states between s0 and s
    The distance from s0 to s is within the interval [a,b]

RTCTL
  Add a new operator G[a,b] f
    Bounded Globally
    G[a,b] f is true on a path P = s0 s1 … iff
      f holds in every state si where a <= i <= b
  Label arcs with a time distance (default 1)
    This represents the number of cycles consumed here
    Add self arcs where appropriate
    Assume a transition at each time step
  Note that this is not any more powerful than CTL
    Can use the X operator accordingly

Using BDDs for Timing
  Use BDDs to compute timing properties
    Minimum and maximum delay
    Compute the shortest (longest) path from a starting point to an ending point
      Start when condition X holds
      End when condition Y holds
  Define a function to represent the set of states at distance k
    Start with all states satisfying X
    Compute from nodes at distance k-1 using R
    Then check all these to see if Y holds

Continuous Real Time
  This is a more difficult problem
    But it is still approachable
  Recall Timed Automata
    Finite automata with a finite set of real-valued clocks
  Notion of time
    Transitions are instantaneous
    Time elapses while the automaton is in a state
    A transition may reset some of the clocks to zero
    A clock has a value that is the time elapsed since its last reset
    Time passes at the same rate for all clocks
    Only a finite number of transitions can occur in a finite amount of time

Timed Automata
  Clock constraints
    Associate a guard with each transition
    Associate an invariant with each state
      Time can elapse in a state only while its invariant is true
  [Figure: example timed automaton; labels: Y >= 3, Y := 0; Y <= 10; X <= 8; Y <= 5; Y >= 4 && X >= 6; X := 0]

Proving Properties
  Map the timed automaton into an infinite state transition graph
    A state in the graph is a state in the TA + a clock assignment
    Transitions
      Delay transition: just increment the clocks
      Action: corresponds to an actual transition
  Problem: clocks are real time, not discrete

Clock Regions
  Make all clock-related values integer
    Assume all are rational to start with
    Multiply by the LCM of the denominators
  Identify clock regions representing sets of assignments
    If two states corresponding to the same state of the timed automaton agree on the integral parts of all clock values and also on the ordering of the fractional parts of all clocks, then the states will behave the same
  This lets you create a finite representation
    Can also be defined in terms of inequalities
    This lets you solve reachability (defining valid transitions)
    And you can do model checking on the finite representation

Fault Tolerance
  We’ve talked about proving properties of systems
    This can’t always be done
    And even so, it is an approximation
      Assumes the program is correct
      Assumes the real world model is correct
      Assumes the computer and other hardware work correctly
    And you can’t prove everything
  Embedded systems need to be fault tolerant
    No other recourse after failure

Fault Modeling
  What can fail
  How things can fail (failure modes)
  Why things might fail (random or combined)
  Faults might be permanent, intermittent, or transient
  Faults might show up in different ways

Responding to Faults
  Fault confinement
    Limit the effect of a fault to a local subsystem
    Defensive programming
  Fault detection and location
    Self tests
    If transient, get enough information for later analysis
  Fault masking
    Hide the effect of faults
  Retry
    Transient faults might go away
    Disk and memory problems may be transient

Fault Avoidance
  Don’t write buggy code
    Serious testing, defensive programming
    Verification
  Avoid running devices at their limits

Fault Detection and Recovery
  Extreme defensive programming
  Error checking codes (checksums) for messages
  Self-checking and fail-safe logic
  Watchdog timers and time-outs
  Consistency and capability checks
  Duplication (redundancy)
  This is important in critical, unsupervised systems

Redundancy
  Triple modular redundancy
    Hardware is replicated three times
    Outcome of each module (high-level routine) is a vote
    If 2 agree on the answer, it is chosen
  This fails when
    All 3 disagree (fail-stop)
    Two modules fail together (byzantine)
    The voting mechanism fails
  Can handle k failures by having a higher number of modules

Space Shuttle Redundancy
  Five computers implement the system
  Four make up the primary system
    Normally in command; simultaneously execute identical code
    Synchronize on I/O; actuation is a physical vote
    Priority-based OS
  The fifth system is the backup
    Completely independent implementation
    Normally operates in listen mode; requires a manual switch-over
    OS is time-sliced, not priority scheduled
  Still have problems
    Same language, same compiler

Fault Recovery
  What to do if something goes wrong
    Keep the system running
    Put the system into a stable state
  Watchdog timer
    Monitor routine of sorts
    Task code periodically sets a flag
    Watchdog (high priority) checks and resets the flag
    If the flag is unset, restart the system
    Care needed to not make things worse

Fault Recovery
  Self-checking software
    With checkpoint and rollback
    Correct data defects in memory and continue
  Adaptive software
  Safe backup state
    Failure isn’t just CPU failure
    Might not want to reboot from stored state
  Reset-reboot switches

Fault-Based Disk Partitioning
  SightPath CDN nodes (cache for media files)
  Four disk partitions
    Boot partition that is never touched
    Two OS partitions
      Usually mounted read-only
      Upgrades are written to the spare partition
      Upgrade partition is then remounted read-only and marked clean
      Boot partition will not boot from an unclean partition
    One data partition
      All data is soft-state
  Similar scheme used in flash-based devices

Next Time
  Project status presentations
    Voluntary
    If you aren’t going to be here or aren’t going to give one, provide a project status handin.
  Following Wednesday: Guest Speaker

Understanding Faults

Next Time
  Voluntary Project Status Updates and Thanksgiving
  Following Wednesday: GUEST LECTURE on SECURITY
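
Worked sketch for "Simplifying Decision Trees" and "Ordered BDDs": the two merge rules from those slides (drop a node whose branches agree, share nodes with the same label and children) can be tried out on the exercise formula d AND (c OR (b XOR a)). This is a minimal illustration, not how a production BDD package works: the class name `BDD` and the methods `mk`, `build`, and `evaluate` are made up here, and `build` enumerates every assignment, so construction itself is exponential even when the resulting diagram is small.

```python
from itertools import product


class BDD:
    """Reduced ordered BDD built by Shannon expansion over a fixed variable order."""

    def __init__(self, order):
        self.order = list(order)   # the a priori variable order
        self.unique = {}           # (var, low, high) -> node id
        self.nodes = {}            # node id -> (var, low, high)

    def mk(self, var, low, high):
        # Rule 1: if both branches go to the same node, eliminate this node.
        if low == high:
            return low
        # Rule 2: nodes with the same label and the same children are merged.
        key = (var, low, high)
        if key not in self.unique:
            node_id = len(self.unique) + 2   # ids 0 and 1 are the FALSE/TRUE leaves
            self.unique[key] = node_id
            self.nodes[node_id] = key
        return self.unique[key]

    def build(self, fn, env=None, level=0):
        # Illustration only: visits all 2^n assignments of the variables.
        env = env or {}
        if level == len(self.order):
            return 1 if fn(**env) else 0
        var = self.order[level]
        low = self.build(fn, {**env, var: False}, level + 1)
        high = self.build(fn, {**env, var: True}, level + 1)
        return self.mk(var, low, high)

    def evaluate(self, root, env):
        # Follow one path from the root to a leaf.
        node = root
        while node > 1:
            var, low, high = self.nodes[node]
            node = high if env[var] else low
        return node == 1


def formula(a, b, c, d):
    return d and (c or (b != a))   # b != a is XOR on booleans


bdd = BDD("dcba")
root = bdd.build(formula)
print(len(bdd.nodes))   # 5 internal nodes, versus 15 in the full decision tree
```

With the order d, c, b, a the diagram needs one node each for d, c, and b and two nodes for a, which is the answer to "does it have to be this size?" for this formula.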
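
Worked sketch for "Using BDDs for Timing": the minimum-delay computation on that slide (start from the states satisfying X, repeatedly apply R to get the states at distance k, stop when one satisfies Y) can be written down with plain Python sets standing in for BDDs. The function name `min_delay` and the dictionary encoding of R are assumptions made for this example; a symbolic checker would represent each frontier as a BDD instead.

```python
def min_delay(transitions, start, goal):
    """Smallest k such that a state in `goal` is k steps from some state in `start`.

    transitions: dict mapping a state to an iterable of successor states (the relation R)
    start:       set of states where condition X holds
    goal:        set of states where condition Y holds
    Returns None if no goal state is reachable.
    """
    frontier = set(start)      # states at distance 0
    seen = set(frontier)
    k = 0
    while frontier:
        if frontier & goal:
            return k
        # Image of the distance-k states under R, minus states reached earlier.
        frontier = {t for s in frontier for t in transitions.get(s, ())} - seen
        seen |= frontier
        k += 1
    return None


# Hypothetical four-state program: 0 -> 1 -> 2 -> 3, with a back edge 1 -> 0.
R = {0: [1], 1: [2, 0], 2: [3], 3: [3]}
print(min_delay(R, {0}, {3}))   # 3
```

Removing already-seen states is what makes this a shortest-path computation; the maximum delay needs a different fixed point and is not covered by this sketch.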
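
Worked sketch for "RTCTL: Real-time CTL": the bounded operators can be checked directly on a single finite path prefix, counting steps as the slides describe. This only evaluates one path; a model checker quantifies over all paths of the program. The function names and the representation of f and g as Python predicates are assumptions for illustration, and "all states between s0 and s" is read here as the states strictly before s.

```python
def bounded_until(path, f, g, a, b):
    """f U[a,b] g on one path: g holds at some s_i with a <= i <= b,
    and f holds at every state before s_i."""
    for i, state in enumerate(path[: b + 1]):
        if i >= a and g(state):
            return True
        if not f(state):
            return False    # f failed before any acceptable g state
    return False


def bounded_globally(path, f, a, b):
    """G[a,b] f on one path: f holds at every s_i with a <= i <= b.
    Assumes the path prefix has at least b + 1 states."""
    return all(f(state) for state in path[a : b + 1])


# Hypothetical path whose states are just step counters.
path = [0, 1, 2, 3, 4]
print(bounded_until(path, lambda s: s < 3, lambda s: s == 3, 2, 4))   # True
print(bounded_globally(path, lambda s: s < 3, 0, 2))                  # True
```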
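
Worked sketch for the watchdog timer in "Fault Recovery": the flag protocol (task sets a flag, watchdog checks and resets it, an unset flag triggers a restart) is simulated below without real timers or threads. The class name `Watchdog`, its method names, and the `restart` callback are hypothetical; on real hardware the check would run from a high-priority timer interrupt or a hardware watchdog peripheral.

```python
class Watchdog:
    def __init__(self, restart):
        self.flag = False
        self.restart = restart   # hypothetical recovery action, e.g. a system reset

    def kick(self):
        """Called periodically by healthy task code."""
        self.flag = True

    def check(self):
        """Called by the high-priority watchdog once per timeout period."""
        if self.flag:
            self.flag = False    # reset; the task must set it again before the next check
            return True
        self.restart()           # the task did not report in during this period
        return False


events = []
wd = Watchdog(restart=lambda: events.append("restart"))
wd.kick()
wd.check()    # task reported in: no restart
wd.check()    # no kick since the last check: restart is requested
print(events)   # ['restart']
```

The check period has to be longer than the task's worst-case interval between kicks, otherwise a healthy system restarts itself, which is one way to "make things worse".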
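
Worked sketch for "Redundancy": a majority voter for triple modular redundancy, including the two failure cases listed on that slide. The function name `vote` is made up for this example, and module outputs are assumed to be hashable values that can be compared for exact equality.

```python
from collections import Counter


def vote(outputs):
    """Return the value a strict majority of modules agree on, or None if there is none."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count * 2 > len(outputs) else None


print(vote([7, 7, 9]))   # 7: one faulty module is outvoted
print(vote([1, 2, 3]))   # None: all three disagree, so the voter has to stop
print(vote([9, 9, 7]))   # 9: two modules failing the same way outvote the correct one
```

The last call shows why common-mode failures defeat the scheme: the voter cannot tell two matching wrong answers from two matching right ones.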