Program Analysis and Verification - Lecture 2: Program Semantics
Lecturer: Noam Rinetzky
Summary by: Gal Rotem and Michal Faktor
This summary is based on the 24.2.2014 lecture, the lesson presentation, and the online book.

Motivation
- Verifying the absence of bugs. For example, Apple recently had a bug in its SSL key-exchange verification: the line 'goto fail' was repeated twice, so the jump was always executed. The error left a code segment unreachable, which could have been detected ahead of time by verification.
- Proving functional correctness.

Program analysis & verification
Given a program with an assertion assert(p) at program point pc, we would like to prove that the assertion holds, i.e., that p is always true when the program reaches pc. Unfortunately, this problem is undecidable in general.
An assertion can be:
- Always satisfied (true for all inputs)
- Never satisfied (false for all inputs)
- Sometimes satisfied (true for some inputs and false for others)
An assertion p holds for program P at program point pc if, whenever the program reaches pc, p is satisfied.
An analysis is sound if it only reports that assertions that hold indeed do. In particular, it never reports an assertion that may not be satisfied as one that holds.
The main idea for verifying that assertions hold is over-approximation: we have an exact set of configurations/behaviors/states and we want to prove that certain properties are true with regard to this set. As mentioned, this problem is undecidable. However, if we can prove that a larger set, i.e., one that contains the original set, has these properties, then we can be sure that the exact set we started from has them as well. For example, we would like to compute the properties of all reachable states of a program P starting from the set of all possible initial states, and to prove that P never reaches a 'bad' state.
When performing an over-approximation of the set of reachable states, i.e., considering a larger set than the reachable one, we still would like to show that this set does not overlap with the set of bad states. If we can prove this, then we know that the program never reaches a bad state. However, if we fail to do so, the program might still never reach a bad state, but we have failed to detect this due to the over-approximation. In other words, over-approximation enables us to circumvent the undecidability problem by allowing the analysis to be imprecise, i.e., the analysis might fail to detect that certain true properties of a program indeed hold. The challenge is to develop analyses that are both sound and useful, i.e., not too imprecise.

Syntax vs. Semantics
Syntax is the form in which a program is written, and semantics is the meaning of the program. We denote the meaning of a program P (= the semantics of P) using "semantic braces": ⟦P⟧.

"Standard" semantics – "state transformer" semantics
We can describe the meaning of a program using a state machine where each operation is translated into a state transition. We start from some initial state and follow the state machine until reaching a final state. In slide 19, we can see a run that reaches an accepting state.
We would like to find the properties of all reachable states. So, instead of using a state-transformer, we will use a set-of-states transformer. In slide 25, we track the set of all possible states at every program point. We can see that, as the code in the else clause is unreachable, its set of states is empty: {}. We reach a final set-of-states which is an accepting one; in every reachable final state, y equals 42.
The sets of states used here may be infinite, and thus it might not be possible to represent them in the machine's memory.
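One way to bound the representation is to track only a finite abstraction of each value, as in the abstract semantics described next. Below is a minimal sketch, assuming a parity domain where each value is abstracted to E (even), O (odd), or T ("top", unknown); the function names `alpha`, `add_abs`, and `mul_abs` are illustrative, not part of the lecture:

```python
# Hypothetical sketch of an even/odd abstract domain:
# each value is abstracted to 'E' (even), 'O' (odd), or 'T' (unknown).

def alpha(n):
    """Abstraction: map a concrete integer to its parity."""
    return 'E' if n % 2 == 0 else 'O'

def add_abs(p, q):
    """Abstract addition: E+E=E, O+O=E, E+O=O; T absorbs everything."""
    if 'T' in (p, q):
        return 'T'
    return 'E' if p == q else 'O'

def mul_abs(p, q):
    """Abstract multiplication: anything times an even value is even."""
    if p == 'E' or q == 'E':
        return 'E'
    if p == 'O' and q == 'O':
        return 'O'
    return 'T'

# The domain is finite, so any set of states has a bounded description.
assert add_abs(alpha(4), alpha(38)) == 'E'   # 42 is even
assert mul_abs('E', 'T') == 'E'              # even * unknown is still even
assert add_abs('E', 'T') == 'T'              # even + unknown is unknown
```

Note how imprecision enters: `add_abs('E', 'T')` loses all information, which is the price paid for a bounded representation.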
Hence, we would like to describe the sets in a bounded way:

"Abstract-state transformer" semantics
An abstract semantics is more compact, but less informative, than the set-of-states semantics. Each abstract state represents a set of concrete states, i.e., the actual states used in the state-transformer semantics. For example, we can map each variable either to E (if it is even), O (if it is odd), or T (which stands for "top", meaning we don't know whether it is even or odd). For instance, the abstract state in which all variables are mapped to E represents all the program states in which every variable has an even value.
We can see in slide 30 that our final abstract state indicates that both x and y are even. We obtained an imprecise abstraction: a set which is indeed a superset of the set of reachable states of the original problem, but which doesn't guarantee that the assertion holds. For example, while under the previous semantics we could assert that at some point of the program y==42, now we cannot; however, we can still assert that y is even, which means that assert(y%2==0) holds and that assert(y==33) (or y equal to any other odd number) does not.
Note that this semantics is sound: the abstract states we explore cover the set of reachable states. But it may be imprecise and, as a result, might overlap with the set of bad states even though the program never reaches a bad state (thus the analysis might raise "false alarms").

Programming Languages
Syntax is the way we write the program.
- The syntax of the language can be described using BNF notation
- Different parsing techniques can be used
Semantics is the meaning of the program - what the program does.

Program Semantics
There are several types of program semantics, and the choice among them depends on the problem we are addressing. Examples:
- Operational: state-transformer.
  Variations: set-of-states transformer, trace transformer (where a trace describes the run of the program)
- Axiomatic: predicate-transformer
- Denotational: representing the meanings of programs as mathematical functions. For example, the semantics of a program that computes a factorial is the factorial function.

What semantics do we want?
- We want the semantics to capture the aspects of computation we care about, i.e., it should be able to represent the properties we are interested in.
- We want the semantics to hide irrelevant details and use the most abstract level we can. For example, the specific way the program counter works might not be of interest to us.
  o The highest level of abstraction is "fully abstract". When using a fully abstract semantics, if two programs evaluate to the same output for every input, then they have the same "semantics". I.e., if we have two different programs that compute the factorial function, e.g., one using a loop and one using recursion, their meaning in a fully abstract denotational semantics would be the factorial function.
- Compositional: a compositional semantics defines the semantics in a modular way; the program's meaning is defined by the meaning of its (syntactical) parts.

Formal semantics
"Formal semantics is concerned with rigorously specifying the meaning, or behavior, of programs, pieces of hardware, etc."
"This theory allows a program to be manipulated like a formula - that is to say, its properties can be calculated."
Why formal semantics?
- Allows us to define the meaning of a program in an implementation-independent way
- Allows us to automatically generate interpreters (and hopefully, in the future, full-fledged compilers)
- Makes verification and debugging more precise: if you don't know what the program does, how do you know it is incorrect?
Levels of abstractions and applications

Static Analysis (abstract semantics)
    ↑
(Concrete) program semantics
    ↑
Assembly-level semantics (micro-step)

Going up an arrow = moving to a higher abstraction level.

Semantic description methods
- Operational semantics: describes how the program operates on every state in the system, for instance by using a state transformer. We will learn two kinds of operational semantics:
  o Natural semantics (big step) [G. Kahn]
  o Structural semantics (small step) [G. Plotkin]
- Denotational semantics [D. Scott, C. Strachey]: the program's meaning is described by mathematical objects, such as functions or relations between input and output.
- Axiomatic semantics [C. A. R. Hoare, R. Floyd]: the program's meaning is defined by what we can prove about the program. This semantics is used as a logical tool to prove properties of the program by its effect on assertions.

Operational Semantics
In the following, we consider a simple programming language called "While". First, we define the programming language's abstract syntax. (Here we use the term abstract syntax in the same way as in the compilation course.)

a ::= n | x | a1 + a2 | a1 * a2 | a1 - a2
b ::= true | false | a1 = a2 | a1 ≤ a2 | ¬b | b1 ∧ b2
S ::= x := a | skip | S1 ; S2 | if b then S1 else S2 | while b do S

We can use the abstract syntax to represent programs as "parse trees", i.e., their ASTs. The abstract syntax is ambiguous: in slide 55 we can see two ASTs that match the same statement. The concrete syntax must provide sufficient information so that unique trees are constructed. The ambiguity can be resolved rather straightforwardly by adding brackets and by defining the precedence of the operators, and hence we will ignore it.
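For concreteness, the abstract syntax above can be encoded directly as a datatype. A hypothetical sketch using nested Python tuples (one tag per production; the encoding is an assumption of this summary, not part of the lecture):

```python
# Each AST node is a tuple (constructor, children...).

# a ::= n | x | a1 + a2 | ...   -- the expression x + 1:
x_plus_1 = ('+', ('var', 'x'), ('num', 1))

# S ::= x := a | skip | S1 ; S2 | if b then S1 else S2 | while b do S
# The statement "while ¬(x = 1) do x := x - 1" becomes:
W = ('while', ('not', ('=', ('var', 'x'), ('num', 1))),
     (':=', 'x', ('-', ('var', 'x'), ('num', 1))))

# The ambiguity of the textual syntax disappears at the AST level: the tree
# itself fixes the grouping, so (x + 1) + 1 and x + (1 + 1) are different trees.
t1 = ('+', ('+', ('var', 'x'), ('num', 1)), ('num', 1))
t2 = ('+', ('var', 'x'), ('+', ('num', 1), ('num', 1)))
assert t1 != t2
```

This tuple encoding is reused in the interpreter sketches later in the summary.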
Example: the AST for the While program "while (y≥z) do z:=z+1" is shown in the slides.

Syntactic categories
The program's text (syntax) is comprised of the following syntactic categories:

n ∈ Num    Numerals (the textual representation of numbers)
x ∈ Var    Program variables
a ∈ Aexp   Arithmetic expressions
b ∈ Bexp   Boolean expressions
S ∈ Stm    Statements

Using the syntactic categories, we can define the semantic domains (or categories):

Z      Integers {0, 1, -1, 2, -2, ...}
T      Truth values {ff, tt}
State  Var → Z

The semantic domains give a meaning to each syntactic category. Note that there is a difference between integers and numerals: numerals are syntactic entities and are used to write numbers as text in the program, while integers are semantic entities and are used to represent actual integer numbers in Z. Formally, we should use different notations to distinguish between the two; however, for brevity, we use the same (decimal) notation for both and distinguish between them by context. A similar distinction exists for Booleans.

The state of the program records the values of its variables. Formally, it is a mapping between (the names of) program variables and their (integer) values. For example:
s = [x ↦ 5, y ↦ 7, z ↦ 0]
is a state where the values of variables x, y, and z are 5, 7, and 0, respectively.
We can perform lookup and update on the state:
Lookup: s x = 5 returns the value of x in state s.
Update: s[x ↦ 6] = [x ↦ 6, y ↦ 7, z ↦ 0] is the same state as s except that x is mapped to 6.
Note that we are performing a destructive update: when updating the value of a variable, we forget its previous mapping.
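A state can be sketched in Python as a dictionary from variable names to integers; the helper names `lookup` and `update` are illustrative, and the update here returns a fresh mapping so that the original state object is left intact:

```python
# A state maps variable names to integers; a Python dict is one natural encoding.

def lookup(s, x):
    """The lookup s x."""
    return s[x]            # raises KeyError when x is not in the state (undefined)

def update(s, x, v):
    """The update s[x ↦ v]."""
    s2 = dict(s)           # fresh mapping: the original state object is unchanged
    s2[x] = v              # destructive w.r.t. x: the old value of x is forgotten
    return s2

s = {'x': 1, 'y': 7, 'z': 16}
assert lookup(s, 'y') == 7
assert update(s, 'x', 5) == {'x': 5, 'y': 7, 'z': 16}
assert update(s, 'x', 5)['y'] == 7     # other variables keep their values
assert s['x'] == 1                     # the original state is untouched
```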
Additional state manipulation examples:
- [x ↦ 1, y ↦ 7, z ↦ 16] y = 7
- [x ↦ 1, y ↦ 7, z ↦ 16] t = undefined
- [x ↦ 1, y ↦ 7, z ↦ 16][x ↦ 5] = [x ↦ 5, y ↦ 7, z ↦ 16]
- [x ↦ 1, y ↦ 7, z ↦ 16][x ↦ 5] x = 5
- [x ↦ 1, y ↦ 7, z ↦ 16][x ↦ 5] y = 7

Semantics of arithmetic expressions
Arithmetic expressions are side-effect free: they are evaluated in a state but do not change it. We define the meaning of arithmetic expressions as a total function A that takes two arguments: a syntactic expression and a state. The semantic function A⟦Aexp⟧ : State → Z is defined by induction on the syntax tree:

A⟦n⟧s = n        (Note: there are two different n's here: on the left, the representation of the number in the programming language; on the right, the actual integer.)
A⟦x⟧s = s x      (performing a lookup)
A⟦a1 + a2⟧s = A⟦a1⟧s + A⟦a2⟧s
A⟦a1 - a2⟧s = A⟦a1⟧s - A⟦a2⟧s
A⟦a1 * a2⟧s = A⟦a1⟧s × A⟦a2⟧s    (* is the syntactic multiplication operator, and × denotes the actual semantic multiplication of integers)
A⟦(a1)⟧s = A⟦a1⟧s    (This definition is not really needed, as in our simple language we effectively use the program's AST.)
A⟦-a1⟧s = 0 - A⟦a1⟧s

This definition is compositional: the semantics is defined in a modular way, where the meaning of an expression is defined by the meaning of its parts. Properties can therefore be proved by structural induction.

Arithmetic expression exercises:
1. Suppose s x = 3. Then: A⟦x+1⟧s = A⟦x⟧s + A⟦1⟧s = 3 + 1 = 4
2. The meanings of A⟦x+1+1⟧s and A⟦x+2⟧s are identical:
   A⟦x+1+1⟧s = A⟦x⟧s + A⟦1+1⟧s = s x + 2
   A⟦x+2⟧s = A⟦x⟧s + A⟦2⟧s = s x + 2

Semantics of Boolean expressions
Boolean expressions are side-effect free as well.
The semantic function B⟦Bexp⟧ : State → T is defined by induction on the syntax tree:

B⟦true⟧s = tt
B⟦false⟧s = ff
B⟦a1 = a2⟧s = tt if A⟦a1⟧s = A⟦a2⟧s, and ff otherwise
B⟦a1 ≤ a2⟧s = tt if A⟦a1⟧s ≤ A⟦a2⟧s, and ff otherwise
B⟦¬b⟧s = tt if B⟦b⟧s = ff, and ff if B⟦b⟧s = tt
B⟦b1 ∧ b2⟧s = tt if B⟦b1⟧s = tt and B⟦b2⟧s = tt, and ff otherwise

Operational semantics is concerned with how to execute programs, i.e., how statements modify the state, and with defining a transition relation between configurations. We are interested in how the states are modified during the execution of the statement. There are two different approaches to operational semantics:
- Natural semantics: describes how the overall results of executions are obtained. It is also called "big-step" semantics. Intuitively, it is defined as a relation between an input state and an output state.
- Structural operational semantics: describes how the individual steps of the computation take place. It is also called "small-step" semantics.

Natural operational semantics (NS)
- This semantics was developed by Gilles Kahn and is also known as "big/large-step semantics".
- It is defined as a relation between configurations, written ⟨S, s⟩ → s', where the arrow represents the execution of all the steps used to compute statement S starting from state s and ending in state s'.
- There are two kinds of configurations:
  o ⟨S, s⟩ denotes that a statement S is about to execute on state s
  o s denotes a terminal (final) state
- Transitions: ⟨S, s⟩ → s' means that the execution of S from state s terminates in (result) state s'.
- The semantics is capable of describing only executions that terminate. Thus, using this semantics we cannot describe infinite (non-terminating) computations.
We define → using rules of the form:

    ⟨S1, s1⟩ → s1', ..., ⟨Sn, sn⟩ → sn'      (premises)
    ───────────────────────────────────      if ...  (side condition)
    ⟨S, s⟩ → s'                              (conclusion)

where S1, ..., Sn are immediate constituents of S or are statements constructed from the immediate constituents of S. The side condition specifies when we can apply the rule: only if the side condition holds may the rule be applied.

Natural semantics for While

[ass_ns]       ⟨x := a, s⟩ → s[x ↦ A⟦a⟧s]        (axiom)
[skip_ns]      ⟨skip, s⟩ → s                     (axiom)

[comp_ns]      ⟨S1, s⟩ → s',  ⟨S2, s'⟩ → s''
               ──────────────────────────────
               ⟨S1; S2, s⟩ → s''

[if_ns^tt]     ⟨S1, s⟩ → s'
               ──────────────────────────────    if B⟦b⟧s = tt
               ⟨if b then S1 else S2, s⟩ → s'

[if_ns^ff]     ⟨S2, s⟩ → s'
               ──────────────────────────────    if B⟦b⟧s = ff
               ⟨if b then S1 else S2, s⟩ → s'

[while_ns^ff]  ⟨while b do S, s⟩ → s             if B⟦b⟧s = ff   (axiom)

[while_ns^tt]  ⟨S, s⟩ → s',  ⟨while b do S, s'⟩ → s''
               ──────────────────────────────────────    if B⟦b⟧s = tt
               ⟨while b do S, s⟩ → s''

We note that the while rule, in case b doesn't hold, indicates that the state remains the same. In case b holds, the while rule is defined recursively, and not by using an immediate constituent; in other words, the derivation rule for while is non-compositional.

Example: Let s0 be the state that assigns zero to all program variables. We show that the execution of the statement if x = 0 then x := x + 1 else skip ends in state s0[x ↦ 1].

Using [ass_ns] and the definition of A:   ⟨x := x + 1, s0⟩ → s0[x ↦ 1]   (1)
Using [skip_ns]:   ⟨skip, s0⟩ → s0   (2)
Using [comp_ns], (1) and (2):
    ⟨skip, s0⟩ → s0,  ⟨x := x + 1, s0⟩ → s0[x ↦ 1]
    ──────────────────────────────────────────────
    ⟨skip; x := x + 1, s0⟩ → s0[x ↦ 1]
Using the definition of s0, [if_ns^tt] and (1):
    ⟨x := x + 1, s0⟩ → s0[x ↦ 1]
    ──────────────────────────────────────────────────    since B⟦x = 0⟧s0 = tt
    ⟨if x = 0 then x := x + 1 else skip, s0⟩ → s0[x ↦ 1]

Derivation trees
When we use the axioms and rules to derive a transition ⟨S, s⟩ → s', we obtain a derivation tree. The root of the derivation tree is ⟨S, s⟩ → s' and the leaves are instances of axioms. The internal nodes are conclusions of instantiated rules and have the corresponding premises as their immediate children.
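The natural-semantics rules above translate almost line by line into a recursive evaluator. A sketch under the tuple encoding of ASTs used earlier in this summary (the helper names `ns`, `A`, and `B` are illustrative):

```python
# Statements: ('skip',), (':=', x, a), (';', S1, S2),
#             ('if', b, S1, S2), ('while', b, S).

def A(a, s):
    """A⟦a⟧s: compositional evaluation of arithmetic expressions."""
    if a[0] == 'num': return a[1]
    if a[0] == 'var': return s[a[1]]
    op, a1, a2 = a
    return {'+': lambda u, v: u + v,
            '-': lambda u, v: u - v,
            '*': lambda u, v: u * v}[op](A(a1, s), A(a2, s))

def B(b, s):
    """B⟦b⟧s: compositional evaluation of Boolean expressions."""
    if b[0] == 'true':  return True
    if b[0] == 'false': return False
    if b[0] == 'not':   return not B(b[1], s)
    if b[0] == 'and':   return B(b[1], s) and B(b[2], s)
    if b[0] == '=':     return A(b[1], s) == A(b[2], s)
    if b[0] == '<=':    return A(b[1], s) <= A(b[2], s)

def ns(S, s):
    """⟨S, s⟩ → s': one branch per natural-semantics rule."""
    if S[0] == 'skip':                         # [skip_ns]
        return s
    if S[0] == ':=':                           # [ass_ns]
        s2 = dict(s); s2[S[1]] = A(S[2], s); return s2
    if S[0] == ';':                            # [comp_ns]
        return ns(S[2], ns(S[1], s))
    if S[0] == 'if':                           # [if_ns^tt] / [if_ns^ff]
        return ns(S[2] if B(S[1], s) else S[3], s)
    if S[0] == 'while':                        # [while_ns]: recursive, non-compositional
        return ns(S, ns(S[2], s)) if B(S[1], s) else s

# The example above: if x = 0 then x := x + 1 else skip, from the all-zero state.
s0 = {'x': 0}
S = ('if', ('=', ('var', 'x'), ('num', 0)),
     (':=', 'x', ('+', ('var', 'x'), ('num', 1))),
     ('skip',))
assert ns(S, s0) == {'x': 1}
```

Note that, like the natural semantics itself, `ns` says nothing useful about non-terminating programs: on `while true do skip` it simply never returns.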
We write the derivation tree with the root at the bottom; the children are above their parents (and the premises are above the conclusions, as we originally defined). We build the tree from the root (bottom) upwards.

Example: Assume
s0 = [x ↦ 5, y ↦ 7, z ↦ 0]
s1 = [x ↦ 5, y ↦ 7, z ↦ 5]
s2 = [x ↦ 7, y ↦ 7, z ↦ 5]
s3 = [x ↦ 7, y ↦ 5, z ↦ 5]

We derive ⟨(z := x; x := y); y := z, s0⟩ → s3, building from the root upwards:
Steps 3 and 4: applications of the [ass_ns] axiom, so these are leaves:
    ⟨z := x, s0⟩ → s1        ⟨x := y, s1⟩ → s2
Step 2: applying the [comp_ns] rule:
    ⟨(z := x; x := y), s0⟩ → s2
Step 5: an application of the [ass_ns] axiom, so this is a leaf:
    ⟨y := z, s2⟩ → s3
Step 1: applying the [comp_ns] rule (the root):
    ⟨(z := x; x := y); y := z, s0⟩ → s3

We note that in a deterministic semantics there should be only one derivation possibility at each step. If the semantics isn't deterministic, we can have several possibilities, and every one that succeeds is a possible meaning of the program. In the While language, as defined so far, both the derivation tree and the output state are unique.

Top-down evaluation example
We now consider a program, denoted W, that computes the factorial of x, assuming that x = 2:
W = y := 1; while ¬(x = 1) do (y := y * x; x := x - 1)
First, we use only the derivation rules and build the tree from the root to the leaves, without computing the intermediate states (shown in red in the slides). Then we compute the states in the specified order, and we can see that they are computed from the leaves (top) down.
Writing s for the initial state (with s x = 2), Sb for the loop body (y := y * x; x := x - 1), and L for the loop while ¬(x = 1) do Sb, the derivation tree consists of the following transitions:

(1) ⟨y := 1, s⟩ → s[y ↦ 1]                                   [ass_ns]
(2) ⟨y := y * x, s[y ↦ 1]⟩ → s[y ↦ 2]                        [ass_ns]
(3) ⟨x := x - 1, s[y ↦ 2]⟩ → s[y ↦ 2][x ↦ 1]                 [ass_ns]
(4) ⟨Sb, s[y ↦ 1]⟩ → s[y ↦ 2][x ↦ 1]                         [comp_ns] from (2), (3)
(5) ⟨L, s[y ↦ 2][x ↦ 1]⟩ → s[y ↦ 2][x ↦ 1]                   [while_ns^ff], since B⟦¬(x = 1)⟧(s[y ↦ 2][x ↦ 1]) = ff
(6) ⟨L, s[y ↦ 1]⟩ → s[y ↦ 2][x ↦ 1]                          [while_ns^tt] from (4), (5), since B⟦¬(x = 1)⟧(s[y ↦ 1]) = tt
(7) ⟨W, s⟩ → s[y ↦ 2][x ↦ 1]                                 [comp_ns] from (1), (6)

This example shows how the semantics can be used to compute the derivation tree, and not only to assert that a given derivation tree is constructed according to the rules.

Program termination
Given a statement S and input s:
- S terminates on s if there exists a state s' such that ⟨S, s⟩ → s'
- S loops on s if there is no state s' such that ⟨S, s⟩ → s'
Given a statement S:
- S always terminates if for every input state s, S terminates on s
- S always loops if for every input state s, S loops on s

Semantic equivalence
S1 and S2 are semantically equivalent if for all s and s': ⟨S1, s⟩ → s' iff ⟨S2, s⟩ → s'.
Intuitively, if S1 and S2 are semantically equivalent and we can't reach s' from s using S1, we won't be able to do so using S2 either. Note that we require that both programs terminate on exactly the same input states.
For example: the statement while b do S is semantically equivalent to if b then (S; while b do S) else skip.
Proof: Lemma 2.5 in the book, pages 26-27.

Properties of natural semantics
Equivalence of program constructs:
- skip; skip is semantically equivalent to skip
- ((S1; S2); S3) is semantically equivalent to (S1; (S2; S3))
The proof is in slide 80. The main idea is to show that, for every state s, if with one statement we reach state s', then we reach the same state s' using the other statement (and vice versa). In each direction we construct two derivation trees.
One tree is constructed under the assumption that from state s we reach state s'; this gives us a set of rule applications, which we then use in the construction of the other tree.
Example: (x := 5; y := x * 8) is semantically equivalent to (x := 5; y := 40).

The semantics of While is deterministic
Theorem: for all statements S and states s1, s2: if ⟨S, s⟩ → s1 and ⟨S, s⟩ → s2 then s1 = s2.
Proof: in the book, pages 29-30. The proof uses induction on the shape of derivation trees. First, prove that the property holds for all simple derivation trees by showing that it holds for the axioms (the tree has a single node in these cases). Then prove that the property holds for all composite trees: for each rule, assume that the property holds for its premises (the induction hypothesis) and prove that it holds for the conclusion of the rule.

The semantic function S_ns
Using the natural (operational) semantics, we can assign a meaning to any statement S as a partial function from State to State:
S_ns : Stm → (State ↪ State)
(A different way to define the semantic function would be S_ns : (Stm × State) ↪ State. The two ways are equivalent, because of currying.)
S_ns⟦S⟧s = s'         if ⟨S, s⟩ → s'
S_ns⟦S⟧s = undefined  otherwise
In the natural semantics for While, if ⟨S, s⟩ terminates then there is a state s' such that ⟨S, s⟩ → s', and this state is unique. Otherwise, S_ns⟦S⟧s is undefined; this happens when we cannot construct a derivation tree ⟨S, s⟩ → s' for any s'.
Examples:
S_ns⟦skip⟧s = s
S_ns⟦x := 1⟧s = s[x ↦ 1]
S_ns⟦while true do skip⟧s = undefined. The execution of this statement doesn't terminate, and so its meaning is undefined in the natural semantics.

Structural operational semantics (SOS)
- This semantics was developed by Gordon Plotkin and is also called "small-step semantics".
- The semantics is defined as a transition relation between configurations, ⟨S, s⟩ ⇒ ⟨S', s'⟩.
  However, here, the (double) arrow represents the execution of the first step only.
- There are two kinds of configurations:
  o ⟨S, s⟩ denotes a statement S which is about to execute on state s
  o s denotes a terminal (final) state
- Transitions are written ⟨S, s⟩ ⇒ γ, where:
  o γ = ⟨S', s'⟩ if the execution of S from s is not completed and the remaining computation proceeds from the intermediate configuration γ
  o γ = s' if the execution of S from s has terminated and the final state is s'
- ⟨S, s⟩ is stuck if there is no γ such that ⟨S, s⟩ ⇒ γ. A configuration can be stuck if there is no rule in the semantics that tells us how to proceed from ⟨S, s⟩. For example: if we had an 'if' without an 'else', and the semantics did not define what happens when the condition does not hold, then the execution would get stuck whenever the 'if' condition doesn't hold.
- If ⟨S, s⟩ ⇒ ⟨S', s'⟩ then s and s' might be equal, i.e., we can stay in the same state after the partial execution. Note, however, that the statement changes. This can happen, e.g., when evaluating the condition of an if statement.

Structural semantics for While

Axioms (atomic/primitive operations):
[ass_sos]    ⟨x := a, s⟩ ⇒ s[x ↦ A⟦a⟧s]
[skip_sos]   ⟨skip, s⟩ ⇒ s

[comp1_sos]  ⟨S1, s⟩ ⇒ ⟨S1', s'⟩
             ──────────────────────────
             ⟨S1; S2, s⟩ ⇒ ⟨S1'; S2, s'⟩
This rule applies when the execution of S1 from s is not completed, and thus we proceed to S1'.

[comp2_sos]  ⟨S1, s⟩ ⇒ s'
             ─────────────────────
             ⟨S1; S2, s⟩ ⇒ ⟨S2, s'⟩
This rule applies when the execution of S1 from s has terminated, and thus we proceed to S2. This means that S1 was an atomic operation.

[if_sos^tt]  ⟨if b then S1 else S2, s⟩ ⇒ ⟨S1, s⟩    if B⟦b⟧s = tt
[if_sos^ff]  ⟨if b then S1 else S2, s⟩ ⇒ ⟨S2, s⟩    if B⟦b⟧s = ff
The if rule applies some sort of rewriting to the program.

[while_sos]  ⟨while b do S, s⟩ ⇒ ⟨if b then (S; while b do S) else skip, s⟩
When we execute a while statement, we unfold one level at a time, by replacing the loop with an 'if'.
We stop unfolding when the condition b becomes false; then we remain with the state we reached when b became false (since at that point we perform skip).

Derivation sequences
A derivation sequence of a statement S starting in state s is either:
- A finite sequence γ0, γ1, γ2, ..., γk such that:
  1. γ0 = ⟨S, s⟩
  2. γi ⇒ γi+1
  3. γk is either a stuck configuration or a final state
- An infinite sequence γ0, γ1, γ2, ... such that:
  1. γ0 = ⟨S, s⟩
  2. γi ⇒ γi+1

Notations:
- γ0 ⇒^k γk : γ0 derives γk in k steps (⇒^k is the composition of ⇒ with itself k times)
- γ0 ⇒* γ : γ0 derives γ in a finite (possibly zero) number of steps (⇒* is the reflexive transitive closure of ⇒)

For each step there is a corresponding derivation tree. Note that SOS can define executions that don't terminate.
We can see a derivation sequence example in slide 88. Each step in the sequence should be computed (or, alternatively, formally proved to be constructed according to the rules of the semantics) using a derivation tree. The slide contains a derivation tree for the first step; the rules used there, from the bottom up, are [comp1_sos], [comp2_sos], and [ass_sos].

Evaluation via derivation sequences
- For any While statement S and state s it is always possible to find at least one derivation sequence from ⟨S, s⟩: apply axioms and rules forever, or until a terminal or stuck configuration is reached.
- Proposition: there are no stuck configurations in While.
In slide 90 we can see a derivation sequence example for factorial (n!).
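Such derivation sequences can be generated mechanically: a `step` function implementing the small-step rules, iterated until a final state is reached. A sketch under the same hypothetical tuple encoding as before (the names `step` and `run` are illustrative):

```python
# step(S, s) returns either an intermediate configuration (S', s')
# or a final state s' (a plain dict).

def A(a, s):
    if a[0] == 'num': return a[1]
    if a[0] == 'var': return s[a[1]]
    return {'+': lambda u, v: u + v, '-': lambda u, v: u - v,
            '*': lambda u, v: u * v}[a[0]](A(a[1], s), A(a[2], s))

def B(b, s):
    if b[0] == 'true': return True
    if b[0] == 'not':  return not B(b[1], s)
    if b[0] == '=':    return A(b[1], s) == A(b[2], s)
    if b[0] == '<=':   return A(b[1], s) <= A(b[2], s)

def step(S, s):
    if S[0] == 'skip':                            # [skip_sos]
        return s
    if S[0] == ':=':                              # [ass_sos]
        s2 = dict(s); s2[S[1]] = A(S[2], s); return s2
    if S[0] == ';':
        r = step(S[1], s)
        if isinstance(r, dict):                   # [comp2_sos]: S1 terminated
            return (S[2], r)
        return ((';', r[0], S[2]), r[1])          # [comp1_sos]: S1 continues as S1'
    if S[0] == 'if':                              # [if_sos^tt] / [if_sos^ff]
        return (S[2] if B(S[1], s) else S[3], s)
    if S[0] == 'while':                           # [while_sos]: unfold one level
        return (('if', S[1], (';', S[2], S), ('skip',)), s)

def run(S, s):
    """⇒*: iterate ⇒ until a final state is reached (loops forever otherwise)."""
    cfg = (S, s)
    while not isinstance(cfg, dict):
        cfg = step(*cfg)
    return cfg

# The factorial program from slide 90, started with x = 2:
W = (';', (':=', 'y', ('num', 1)),
     ('while', ('not', ('=', ('var', 'x'), ('num', 1))),
      (';', (':=', 'y', ('*', ('var', 'y'), ('var', 'x'))),
       (':=', 'x', ('-', ('var', 'x'), ('num', 1))))))
assert run(W, {'x': 2}) == {'x': 1, 'y': 2}
```

Unlike the big-step `ns`, each call to `step` performs exactly one transition, so the intermediate configurations of the derivation sequence are directly observable.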
Note that each step should be computed using a derivation tree (as one step is proved in slide 88).

Program termination
Given a statement S and input s:
- S terminates on s if there exists a finite derivation sequence starting at ⟨S, s⟩ (this finite sequence ends with either a stuck configuration or a final state)
- S terminates successfully on s if there exists a finite derivation sequence starting at ⟨S, s⟩ leading to a final state
- S loops on s if there exists an infinite derivation sequence starting at ⟨S, s⟩

Properties of structural operational semantics
- S1 and S2 are semantically equivalent if:
  o For all s and γ, where γ is either final or stuck: ⟨S1, s⟩ ⇒* γ iff ⟨S2, s⟩ ⇒* γ
    (we use the ⇒* notation here, as the lengths of the two derivation sequences may differ)
  o For all s, there is an infinite derivation sequence starting at ⟨S1, s⟩ iff there is an infinite derivation sequence starting at ⟨S2, s⟩ (S1 loops on s iff S2 loops on s)
- Theorem: While is deterministic: if ⟨S, s⟩ ⇒* s1 and ⟨S, s⟩ ⇒* s2 then s1 = s2.

Sequential composition
Lemma: if ⟨S1; S2, s⟩ ⇒^k s'' then there exist a state s' and numbers m, n with k = m + n such that ⟨S1, s⟩ ⇒^m s' and ⟨S2, s'⟩ ⇒^n s''.
The proof (pages 37-38, Lemma 2.19) uses induction on the length of the derivation sequence:
- If k = 0, the result holds vacuously.
- For all other derivation sequences, assume that the lemma holds for all lengths k ≤ k0 (the induction hypothesis) and show that it holds for sequences of length k0 + 1.

The semantic function S_sos
The meaning of a statement S is defined as a partial function from State to State:
S_sos : Stm → (State ↪ State)
S_sos⟦S⟧s = s'         if ⟨S, s⟩ ⇒* s'
S_sos⟦S⟧s = undefined  otherwise
Note that when defining the semantic function we are interested only in terminating executions, i.e., ones which produce a final state. There is no meaning to the "output of an infinite execution".
We note that the semantics of While (as defined so far) is deterministic, and hence S_sos is well-defined.
Examples:
S_sos⟦skip⟧s = s
S_sos⟦x := 1⟧s = s[x ↦ 1]
S_sos⟦while true do skip⟧s = undefined

An equivalence result
We have seen two definitions of the semantics of While: a natural semantics and a structural operational semantics.
Theorem: For every statement S of While, S_ns⟦S⟧ = S_sos⟦S⟧.
Proof in the book, pages 40-43. The proof consists of two lemmas:
- For every statement S of While and states s and s', ⟨S, s⟩ → s' implies ⟨S, s⟩ ⇒* s' (Lemma 2.27). The proof is by induction on the shape of the derivation tree for ⟨S, s⟩ → s'. We get that if S_ns⟦S⟧s = s' then S_sos⟦S⟧s = s'. (1)
- For every statement S of While, states s and s', and natural number k, ⟨S, s⟩ ⇒^k s' implies ⟨S, s⟩ → s' (Lemma 2.28). The proof is by induction on the length of the derivation sequence. We get that if S_sos⟦S⟧s = s' then S_ns⟦S⟧s = s'. (2)
From (1) and (2) we get that S_ns⟦S⟧ = S_sos⟦S⟧. In particular, if one semantic function is defined on a state s then so is the other, and therefore, if one is not defined on a state s then neither is the other.
Note: the claim is true even though the structural semantics allows defining infinite computations.