An Algebra for Program Designs
Tony Hoare
Moscow, July 2011

With ideas from
• Ian Wehrman
• John Wickerson
• Stephan van Staden
• Peter O’Hearn
• Bernhard Moeller
• Georg Struth
• Rasmus Petersen
• …and others

Summary
[Diagram: denotational models, algebraic laws, deduction rules, operational rules]

Part 1 Algebra and Hoare logic
• Some familiar algebraic laws
• their application to program designs
• derivation of Hoare logic from them

Part 1 Algebra and Hoare logic
[Diagram: algebraic laws, deduction rules]

Subject matter: designs
• variables (p, q, r) stand for programs, designs, specifications,…
• they all describe what happens inside/around a computer that is executing a program.
• The program itself is the most precise.
• The specification is the most abstract.
• Designs come in between.

Binary relation: p ⊑ q
• Everything described by p is also described by q , e.g.,
  – spec p implies spec q
  – prog p satisfies spec q
  – prog p more determinate than prog q
• stepwise development is
  – spec ⊒ design ⊒ program
• stepwise analysis is the reverse
  – program ⊑ design ⊑ spec

p ⊑ q
  p                        q
  • below                  • above
  • lesser                 • greater
  • stronger               • weaker
  • lower bound            • upper bound
  • more precise           • more abstract
  • …deterministic         • ...non-deterministic
  • included in            • containing (sets)
  • antecedent =>          • consequent (pred)

⊑ is a partial order
• ⊑ transitive
• p ⊑ r if p ⊑ q and q ⊑ r
• needed for stepwise development/analysis
• ⊑ antisymmetric and reflexive
• p = r iff p ⊑ r and r ⊑ p
• needed for abstraction

Binary operator: p ; q
• sequential composition of p and q
• an execution of p;q consists of
  – all events x from an execution of p
  – and all events y from an execution of q
• subject to an ordering constraint….

Three ordering constraints
• strong sequence: x must precede y
• weak sequence: y must not precede x
• no constraint
• all our algebraic laws will apply to all three alternatives

Hoare triple: {p} q {r}
• defined as p;q ⊑ r
  – starting in the final state of an execution of p, q ends in the final state of some execution of r
  – p and r may be arbitrary designs.
• example: {..x+1 ≤ n} x := x + 1 {..x ≤ n}
• where ..b (finally b) describes all executions that end in a state satisfying a single-state predicate b .

monotonicity
• Law ( ; is monotonic with respect to ⊑):
  – p;q ⊑ p’;q’ if p ⊑ p’ and q ⊑ q’
  – like addition of numbers
• monotonicity justifies modular evolution
  – p’ and q’ are developed independently
• Theorem (rule of consequence):
  – p’ ⊑ p & {p} q {r} & r ⊑ r’ implies {p’} q {r’}
• Law is also provable from the theorem

associativity
• Law ( ; is associative):
  – (p;q);q’ = p;(q;q’)
• Theorem (sequential composition):
  – {p} q {s} & {s} q’ {r} implies {p} q;q’ {r}
• half the law is provable from the theorem

Conditional correctness
• disregards unending executions
• ..b is re-interpreted as including them all:
  – ‘if the execution terminates, it will end in a state satisfying b’.
• definition of triple stays the same
• all laws apply to conditional correctness logic as well as to total correctness logic.

Unit (skip)
• a program that does nothing
• Law (skip is the unit of ;):
  – p ; skip = p = skip ; p
• Theorem (nullity)
  – {p} skip {p}
• a quarter of the law is provable from the theorem

concurrent composition: p | q
• execution of (p|q) consists of
  – all events x of an execution of p,
  – and all events y of an execution of q
• same laws apply to both:
  – interleaving: x precedes or follows y
  – true concurrency: x neither precedes nor follows y.
• Laws: | is associative, commutative and monotonic

Separation Logic
• Law (locality):
  – (s|p) ; q ⊑ s | (p;q) (left locality)
  – p ; (q|s) ⊑ (p;q) | s (right locality)
  – a weak version of associativity
  – a weak version of distribution
• Theorem (frame rule):
  – {p} q {r} implies {p|s} q {r|s}
  – in Hoare logic, & replaces | , with the side condition that q does not make s false
• Left locality is provable from the theorem!

Concurrency law
• Law ( ; exchanges with | )
  – (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’)
  – a weak kind of mutual distribution
• Theorem ( | compositional)
  – {p} q {r} & {p’} q’ {r’} implies {p|p’} q|q’ {r|r’}
• the law is provable from the theorem

[Diagram: (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’)]

Regular language model
• p, q, r,… are sets of strings (languages).
• p ⊑ q is inclusion of languages
• p;q is (lifted) concatenation of strings
• p|q is (lifted) interleaving of strings

Left locality
• Theorem: (s|p) ; q ⊑ s | (p;q)
  – in lhs: s interleaves with just p , and all of q comes at the end.
  – in rhs: s interleaves with all of p;q
  – so lhs is a special case of rhs
• right locality is similar

Exchange
• Theorem: (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’)
  – in lhs: all of p and q comes before all of p’ and q’ .
  – in rhs: p may interleave with q’ and p’ with q
  – the lhs is a special case of the rhs.

Conclusion
• regular expressions satisfy all our laws for ⊑ , ; , and |
• and other operators introduced later

Part 2. more operators and laws
• Complete lattices
• Iteration, recursion, fixed points
• Subroutines, abstraction
• Basic commands

Subject matter
• variables (p, q, r) stand for programs, designs, specifications,…
• they are all descriptions of what happens inside and around a computer that is executing a program.
• the differences between programs and specs are often defined from their syntax.
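
The regular language model from Part 1 can be tried out on small finite languages. The following is a minimal sketch, not part of the original slides: the function names seq, par, interleavings and refines, and the one-letter sample languages, are illustrative assumptions. It checks the locality and exchange laws on one instance; it does not prove them in general.

```python
def seq(p, q):
    """Lifted concatenation: each string of p followed by each string of q."""
    return {x + y for x in p for y in q}


def interleavings(x, y):
    """All interleavings (shuffles) of the two strings x and y."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({x[0] + w for w in interleavings(x[1:], y)}
            | {y[0] + w for w in interleavings(x, y[1:])})


def par(p, q):
    """Lifted interleaving: every shuffle of a string of p with a string of q."""
    return {w for x in p for y in q for w in interleavings(x, y)}


def refines(p, q):
    """p ⊑ q is inclusion of languages."""
    return p <= q


# Hypothetical one-string languages, chosen only for illustration.
p, q, p2, q2, s = {"a"}, {"b"}, {"c"}, {"d"}, {"s"}

left_locality = refines(seq(par(s, p), q), par(s, seq(p, q)))
right_locality = refines(seq(p, par(q, s)), par(seq(p, q), s))
exchange = refines(seq(par(p, q), par(p2, q2)),
                   par(seq(p, p2), seq(q, q2)))

print(left_locality, right_locality, exchange)
```

In this instance the inclusions are strict: for example, the right-hand side of the exchange law also contains interleavings in which q’ runs before p has finished, which is why the laws are stated with ⊑ rather than =.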

Specification syntax includes
• disjunction (or) to express abstraction, or to keep options open
  – ‘it may be painted green or blue’
• conjunction (and) to combine requirements
  – it must be cheaper than x and faster than y
• negation (not) for safety and security
  – it must not explode
• implication to define contracts
  – if the user observes the protocol, so will the system

Program syntax excludes
• disjunction
  – non-deterministic programs are difficult to test
• conjunction
  – inefficient to find a computation satisfying both
• negation
  – incomputable
• implication
  – there is no point in executing it

programs include
• sequential composition (;)
• concurrent composition (|)
• iteration
• recursion
• interfaces
• transactions
• assignments, inputs, outputs, jumps,…
• So let’s include these in our specifications/designs

Bottom ⊥
• A specification that has no implementation, like the false predicate
• A program that has no execution, e.g., because of some syntactic error
• Define ⊥ as the least solution of _ ⊑ q
  – r ⊑ q implies ⊥ ⊑ r
• Law (⊥ is the zero of ;):
  – ⊥ ; p = ⊥ = p ; ⊥
• Theorem:
  – {p} ⊥ {q}

Top ⊤
• a program with a run-time error
  – for which the programmer is responsible
  – e.g., subscript error, division by zero, divergence,…
• defined as the greatest solution of q ⊑ _
• Law: it is a zero of ;
  – ⊤ ; p = ⊤ = p ; ⊤ if p ≠ ⊥
• Theorem: none

Non-determinism (or): p ⊔ q
• describes all executions that either satisfy p or satisfy q .
• The choice is not (yet) determined.
• It may be determined later
  – in development of the design
  – or in writing the program
  – or by the compiler
  – or even at run time

lub (join): ⊔
• Define p⊔q as least solution of p ⊑ _ & q ⊑ _
• Theorem
  – p ⊑ r & q ⊑ r iff p⊔q ⊑ r
• Theorem
  – ⊔ is associative, commutative, monotonic, idempotent and increasing
  – it has unit ⊥ and zero ⊤

glb (meet): ⊓
• Define p⊓q as greatest solution of _ ⊑ p & _ ⊑ q

Distribution
• Law ( ; distributive through ⊔ )
  – p ; (q⊔q’) = p;q ⊔ p;q’
  – (q⊔q’) ; p = q;p ⊔ q’;p
• Theorem (non-determinism)
  – {p} q {r} & {p} q’ {r} implies {p} q⊔q’ {r}
  – i.e., to prove something of q⊔q’ prove the same thing of both q and q’
• a quarter of the law is provable from the theorem

Conditional: p if b else p’
• Define p ⊰b⊱ p’ as b.. ⊓ p ⊔ not(b).. ⊓ p’
  – where b.. describes all executions that begin in a state satisfying b .
• Theorem. p ⊰b⊱ p’ is associative, idempotent, distributive, and
  – p ⊰b⊱ q = q ⊰not(b)⊱ p (symm)
  – (p ⊰b⊱ p’) ⊰c⊱ (q ⊰b⊱ q’) = (p ⊰c⊱ q) ⊰b⊱ (p’ ⊰c⊱ q’) (exchange)

Transaction
• Defined as (p ⊓ ..b) ⊔ (q ⊓ ..c)
  – where ..b describes all executions that end satisfying single-state predicate b .
• Implementation:
  – execute p first
  – test the condition b afterwards
  – terminate if b is true
  – backtrack on failure of b
  – and try an alternative q with condition c.

Transaction (realistic)
• Let r describe the non-failing executions of a transaction t .
  – r is known when execution of t is complete.
  – any successful execution of t is committed
  – a single failed execution of t is undone,
  – and q is done instead.
• Define: (t if r else q)
    = t              if t ⊑ r
    = (t ⊓ r) ⊔ q    otherwise

Least upper bound
• Let S be an arbitrary set of designs
• Define ⊔S as least solution of ∀s∊S . s ⊑ _
  – (∀s∊S . s ⊑ r) ⇒ ⊔S ⊑ r (all r)
• everything is an upper bound of { } , so ⊔{ } = ⊥
  – a case where ⊔S ∉ S
• similarly ⊓S is greatest lower bound of S
• ⊓{ } = ⊤

Iteration (Kleene *)
• q* is least solution of
  – (ɛ ⊔ (q; _)) ⊑ _
• q* =def ⊓{s | ɛ ⊔ q;s ⊑ s}
  – ɛ ⊔ q;q* ⊑ q*
  – ɛ ⊔ q;q’ ⊑ q’ implies q* ⊑ q’
  – q* = ⊔{qⁿ | n ∊ Nat} (continuity)
• Theorem (invariance):
  – {p} q* {p} if {p} q {p}

Infinite replication
• !p is the greatest solution of _ ⊑ p|_
  – as in the pi calculus
• all executions of !p are infinite
  – or possibly empty

Recursion
• Let F(_) be a monotonic function between programs.
• Theorem: all functions defined by monotonic operators are monotonic.
• μF is strongest solution of F(_) ⊑ _
• νF is weakest solution of _ ⊑ F(_)
• Theorem (Knaster-Tarski): These solutions exist.

Interfaces
• Let q be the body of a subroutine
• Let s be its specification
• Let (q .. s) assert that q meets s
• Programmer error (⊤) if incorrect
• Caller of subroutine may assume that s describes all its executions
• Implementation may execute q

Subroutine with interface: q .. s
• Define (q .. s) as glb of the set q ⊑ _ & _ ⊑ s
• Theorem: (q .. s)
    = q    if q ⊑ s
    = ⊤    otherwise

Basic statements/assertions
• skip
• bottom: ⊥
• top: ⊤
• assignment: x := e(x)
• assertion: assert b
• assumption: assume b
• finally: ..b
• initially: b..

more
• assign thru pointer: [a] := e
• output: c!e
• input: c?x
• points to: a |-> e
  – a |-> _ =def exists v . a |-> v
• throw
• catch

Laws (examples)
• assume b =def b.. ⊓ skip
• assert b =def b.. ⊓ skip ⊔ not(b)..
• x := e(x) ; x := f(x) = x := f(e(x))
  – in languages without interleaving

more
• p |-> _ ; [p] := e ⊑ p |-> e
  – in separation logic
• c!e | c?x = x := e
  – in CSP but not in CCS or Pi
• throw x ; (catch x; p) = p

Part 3 Unifying Semantic Theories
• Six familiar semantic definition styles.
• Their derivation from the algebra
• and vice versa.
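
The definition of iteration in Part 2 and its invariance theorem can be exercised in the binary relations model (Tarski). This is a minimal sketch under stated assumptions: designs are modelled as finite sets of (before, after) state pairs, skip is the identity relation, and the names compose, star, hoare and the sample relations are hypothetical, chosen only for illustration.

```python
def compose(p, q):
    """Relational composition p ; q over sets of (before, after) pairs."""
    return {(a, c) for (a, b) in p for (b2, c) in q if b == b2}


def star(q, states):
    """Least solution x of (skip ⊔ q;x) ⊑ x, reached by iterating from skip.

    On a finite state space the loop stops once no new pairs appear;
    the result is the union of all finite powers of q.
    """
    result = {(s, s) for s in states}          # skip: the identity relation
    while True:
        bigger = result | compose(q, result)
        if bigger == result:
            return result
        result = bigger


def hoare(p, q, r):
    """{p} q {r}, defined as p;q ⊑ r."""
    return compose(p, q) <= r


STATES = range(6)
# Assumed example body: x := x + 2, restricted so that x stays below 6.
q = {(s, s + 2) for s in range(4)}
# Assumed invariant "finally even": every pair that ends in an even state.
p = {(a, b) for a in STATES for b in STATES if b % 2 == 0}

q_star = star(q, STATES)
body_preserves_p = hoare(p, q, p)        # premise   {p} q {p}
loop_preserves_p = hoare(p, q_star, p)   # conclusion {p} q* {p}

print(body_preserves_p, loop_preserves_p)
```

Iterating upward from skip is one way to reach the least solution when the state space is finite; it matches the continuity property q* = ⊔{qⁿ | n ∊ Nat} rather than the lattice-theoretic definition directly.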
[Diagram for Part 3: algebraic laws, deduction rules, operational rules]

Hoare Triple
• a method for program verification
• {p} q {r} ≝ p;q ⊑ r
  – one way of achieving r is by first doing p and then doing q
• Theorem:
  – {p} q {s} & {s} q’ {r} implies {p} q;q’ {r}
  – proved by associativity

Plotkin reduction
• a method for program execution
• <p , q> -> r =def p ; q ⊒ r
  – if p describes state before execution of q then r describes a possible final state, e.g.
  – <..(x2 = 18) , x := x+1> -> ..(x = 37)
• Theorem:
  – <p, q> -> s & <s, q’> -> r implies <p, q;q’> -> r

Milner transition
• method of execution of concurrent processes
• p –q-> r ≝ p ⊒ q;r
  – one of the ways of executing p is by first executing q and then executing r .
  – e.g., (x := x+3) –(x:=x+1)-> (x:=x+2)
• Theorem:
  – p –q-> s & s –q’-> r => p –(q;q’)-> r (big-step rule for ; )

test generation
• method of test case generation
• p [q] r =def p ⊑ q;r
  – if r describes erroneous states resulting from execution of q , then p describes some initial states in which a test-run of q will certainly reveal the error.
• Theorem:
  – p [q] s & s [q’] r implies p [q;q’] r

Summary
• {p} q {r}    =def p;q ⊑ r    – Hoare triple
• <p,q> -> r   =def p;q ⊒ r    – Plotkin reduction
• p –q-> r     =def p ⊒ q;r    – Milner transition
• p [q] r      =def p ⊑ q;r    – test generation

Sequential composition
• Law: ; is associative
• Theorem: sequence rule is valid for all four triples.
• the Law is provable from the conjunction of all of them

Skip
• Law: p ; skip = p = skip ; p
• Theorems:
  – {p} skip {p}
  – p [skip] p
  – p –skip-> p
  – <p, skip> -> p
• Law follows from conjunction of all four theorems

Left distribution ; through ⊔
• Law: p;(q ⊔ q’) = p;q ⊔ p;q’
• Theorems:
  – {p} (q⊔q’) {r} if {p} q {r} and {p} q’ {r}
  – <p, q⊔q’> -> r if <p, q> -> r or <p, q’> -> r
  – p [q⊔q’] r if p [q] r or p [q’] r
  – p –(q⊔q’)-> r if p –q-> r and p –q’-> r (not used in CCS)
• law provable from either ‘and’ rule together with either ‘or’ rule.
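
The claim that the sequence rule and the skip rule hold for all four triples can be checked exhaustively on a small family of languages. The following is a sketch, not the author's method: it assumes the regular language model (designs as finite sets of strings, ; as lifted concatenation, skip as the language containing only the empty string), and the names TRIPLES, LANGS and sequence_rule_holds are hypothetical.

```python
from itertools import combinations, product


def seq(p, q):
    """Lifted concatenation of two finite languages."""
    return frozenset(x + y for x in p for y in q)


# The four triples, each defined from ; and inclusion as in the Summary slide.
TRIPLES = {
    "hoare":   lambda p, q, r: seq(p, q) <= r,   # {p} q {r}   = p;q ⊑ r
    "plotkin": lambda p, q, r: seq(p, q) >= r,   # <p,q> -> r  = p;q ⊒ r
    "milner":  lambda p, q, r: p >= seq(q, r),   # p -q-> r    = p ⊒ q;r
    "testgen": lambda p, q, r: p <= seq(q, r),   # p [q] r     = p ⊑ q;r
}

# Assumed test universe: all 8 languages over the strings "", "a", "aa".
STRINGS = ["", "a", "aa"]
LANGS = [frozenset(c)
         for n in range(len(STRINGS) + 1)
         for c in combinations(STRINGS, n)]


def sequence_rule_holds(triple):
    """triple(p,q,s) & triple(s,q2,r) implies triple(p, q;q2, r), for all choices."""
    return all(triple(p, seq(q, q2), r)
               for p, q, s, q2, r in product(LANGS, repeat=5)
               if triple(p, q, s) and triple(s, q2, r))


results = {name: sequence_rule_holds(t) for name, t in TRIPLES.items()}

SKIP = frozenset({""})
skip_rule = all(t(p, SKIP, p) for t in TRIPLES.values() for p in LANGS)

print(results, skip_rule)
```

An exhaustive check over eight languages is evidence for this model only; the general result follows from associativity and monotonicity of ; as stated on the slides.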

locality and frame
• left locality: (s|p) ; q ⊑ s | (p;q)
• Hoare frame: {p} q {r} ⇒ {s|p} q {s|r}
• right locality: p ; (q|s) ⊑ (p;q) | s
• Milner frame: p –q-> r ⇒ (p|s) –q-> (r|s)
• Full locality requires both frame rules

Separation logic
• Exchange law:
  – (p | p’) ; (q | q’) ⊑ (p ; q) | (p’ ; q’)
• Theorems
  – {p} q {r} & {p’} q’ {r’} ⇒ {p|p’} q|q’ {r|r’}
  – p –q-> r & p’ –q’-> r’ => p|p’ –q|q’-> r|r’
• the law is provable from either theorem
• For the other two triples, the rules are equivalent to the converse exchange law.

usual restrictions on triples
• in {p} q {r} , p and r are of form ..b, ..c
• in p [q] r , p and r are of form b.., c..
• in <p,q> -> r , p and r are of form ..b, ..c
• in p –q-> r , p and r are programs
• in p –q-> r (small step), q is atomic
• (in all cases, q is a program)
• all laws are valid without these restrictions

Weakest precondition (-;)
Specification statement (;-)
• (q -; r) =def the weakest solution of ( _ ; q ⊆ r)
  – the same as Dijkstra’s wp(q, r)
  – for backward development of programs
• (p ;- r) =def the weakest solution of ( p ; _ ⊆ r)
  – Back/Morgan’s specification statement
  – same as p⇝r in RGSep
  – for stepwise refinement of designs

Weakest precondition (-;)
• Law (-; adjoint to ;)
  – p ⊑ q -; r iff p;q ⊑ r (galois)
• Theorem
  – (q -; r) ; q ⊑ r
  – p ⊑ q -; (p ; q)
• Law provable from the theorems
  – cf. (r div q) × q ≤ r
  – r ≤ (r × q) div q

Theorems
• q’ ⊑ q & r ⊑ r’ => q-;r ⊑ q’-;r’ (law of consequence)
• (q;q’)-;r ⊑ q-;(q’-;r)
• q-;r ⊑ (q;s) -; (r;s) (frame laws)

Part 4 Denotational Models
A model is a mathematical structure that satisfies the axioms of an algebra, and realistically describes a useful application, for example, program execution.
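
The two theorems about the weakest precondition follow from the Galois law by instantiating it so that one side holds by reflexivity of ⊑. A short derivation sketch in LaTeX; typesetting -; as a binary operator is only a notational assumption:

```latex
% q -; r is typeset as q \mathbin{-\!;} r (a notational assumption).
\begin{align*}
\text{Law (galois):}\quad
  & p \sqsubseteq q \mathbin{-\!;} r \iff p ; q \sqsubseteq r \\[4pt]
\text{Take } p := q \mathbin{-\!;} r:\quad
  & q \mathbin{-\!;} r \sqsubseteq q \mathbin{-\!;} r
    \;\Longrightarrow\; (q \mathbin{-\!;} r) ; q \sqsubseteq r \\[4pt]
\text{Take } r := p ; q:\quad
  & p ; q \sqsubseteq p ; q
    \;\Longrightarrow\; p \sqsubseteq q \mathbin{-\!;} (p ; q)
\end{align*}
```

The first instantiation uses the law from left to right and the second from right to left, which mirrors the arithmetic analogy (r div q) × q ≤ r and r ≤ (r × q) div q.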

Models
[Diagram: denotational models, algebraic laws]

Some Standard Models:
• Boolean algebra
  – ({0,1}, ≤, ∨, ∧, not(_))
• predicate algebra (Frege, Heyting)
  – (ℙS, ├, ∨, ∧, not(_), =>, ∃, ∀)
• regular expressions (Kleene):
  – (ℙA*, ⊆, ∪, ; , ɛ , {<a>} , | )
• binary relations (Tarski):
  – (ℙ(S×S), ⊆, ∪, ∩, ; , Id , not(_), converse(_))
• algebra of designs is a superset of these

Model: (EV, EX, PR)
• EV is an underlying set of events (x, y, ..) that can occur in any execution of any program
• EX are executions (e, f,…), modelled as sets of events
• PR are designs (p, q, r,…), modelled as sets of executions.

Set concepts
• ⊑ is ⊆ (set inclusion)
• ⊔ is ∪ (set union)
• ⊓ is ∩ (intersection of sets)
• ⊥ is { } (the empty set)
• ⊤ is EV (the universal set)

With (|)
• p|q = {e ∪ f | e ∊ p & f ∊ q & e∩f = { } }
  – each execution of p|q is the disjoint union of an execution of p and an execution of q
  – p|q contains all such disjoint unions
• | generalises many binary operators

Introducing time
• TIM is a set of times for events
  – partially ordered by ≤
• Let when : EV -> TIM
  – map each event to its time of occurrence.

Definition of <
• x < y =def not(when(y) ≤ when(x))
  – x < y & y < x means that x and y occur ‘in true concurrency’.
• e < f =def ∀x,y . x∊e & y∊f => x < y
  – no event of f occurs before an event of e
  – hence e < f implies e∩f = { }
• If ≤ is a total order,
  – there is no concurrency,
  – executions are time-ordered strings

Sequential composition (then)
• p ; q = {e∪f | e∊p & f∊q & e<f}
• special case: if ≤ is a total order,
  – e < f means that e∪f is concatenation (e⋅f) of strings
  – ; is the composition of regular expressions

Theorems
• These definitions of ; and | satisfy the locality and exchange laws.
• (s|p) ; q ⊑ s | (p;q)
• (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’)
  – Proof: the lhs describes fewer interleavings than the rhs.
• regular expressions satisfy all our laws for ⊑ , ⊔ , ; , and |

Disjoint concurrency (||)
• p||q =def (p ; q) ∩ (q ; p)
  – all events of p concurrent with all of q .
  – no interaction is possible between them.
• Theorems:
  – (p||q) ; r ⊇ p || (q ; r)
  – (p||q) ; (p’||q’) ⊇ (p;p’) || (q;q’)
  – Proof: the rhs has more disjointness constraints than the lhs .
  – the wrong way round!
• So make the programmer responsible for disjointness, using interfaces!

Interfaces
• Let q be the body of a subroutine
• Let s be its specification
• Let (q .. s) assert that q is correct
• Caller may assume s
• Implementer may execute q

Solution
• p*q =def (p|q => p||q)
    = p|q    if p|q ⊑ p||q
    = ⊤      otherwise
  – programmer is responsible for absence of interaction between p and q .
• Theorem: ; and * satisfy locality and exchange.
  – Proof: in cases where lhs ≠ rhs, rhs = ⊤

Problem
• ; is almost useless in the presence of arbitrary interleaving (interference).
• It is hard to prove disjointness of p||q
• We need a more complex model
  – which constrains the places at which a program may make changes.

Separation
• PL is the set of places at which an event can occur
• each place is ‘owned’ by one thread,
  – no other thread can act there.
• Let where : EV -> PL map each event to its place of occurrence.
• where(e) =def {where(x) | x ∊ e }

Separation principle
• events at different places are concurrent
• events at the same place are totally ordered in time
• ∀x,y ∊ EV . where(x) = where(y) iff x≤y or y≤x

Picture
[Figure: axes labelled space and time]

Theorem
• p || q = {e∪f | e ∊ p & f ∊ q & where(e) ∩ where(f) = { } }
• proved from separation principle

Convexity Principle
• Each execution contains every event that occurs between any of its events.
• ∀e ∊ EX , y ∊ EV . ∀x, z ∊ e . when(x) ≤ when(y) ≤ when(z) => y ∊ e
  – no event from elsewhere can interfere between any two events of an execution

A convex execution of p;q
[Figure: p and q drawn against space and time axes]

A non-convex ‘execution’ of p;q
[Figure: p and q drawn against space and time axes]

Conclusion: in Praise of Algebra
• Reusable
• Modular
• Incremental
• Unifying
• Beautiful!
• Discriminative
• Computational
• Comprehensible
• Abstract

Algebra likes pairs
• Algebra chooses as primitives
  – operators with two operands: +,
  – predicates with two places: =,
  – laws with two operators: &v, +
  – algebras with two components: rings

Tuples
• Tuples are defined in terms of pairs.
  – Hoare triples
  – Plotkin triples
  – Jones quintuples
  – seventeentuples …

Semantic Links
[Diagram: denotations, algebra, deductions, transitions]

Increments
[Diagram: algebra]

Filling the gaps
[Diagram: algebra]