Tracelets: a Model for the Laws of Concurrent Programming Tony Hoare Oxford Feb 2012 Our Universe • E is a set of events that can occur in or around a computer during execution of a program – drawn as boxes. • D is a set of dependencies between the events – drawn as arrows between boxes – e --> f means f depends on e • source, target: D --> E • source(d) --> target(d) Labels • P are sets of properties of D or E – e.g., the type of command executed – the objects involved in the event – the value of the data transmitted on the arrow • labels:E + D --> P – labels(e) are drawn in the box, – labels(d) on top of the arrow A tracelet is a subset of E , denoted by p, q, r . For example: • a trace of a single execution of a program • or of a single execution of a command • or of a single object used in the execution • I = { }, the tracelet that contains no event An object • is used by a program to store or transmit data – e.g., a variable, a semaphore, a channel, … • Its behaviour is modelled by a tracelet – containing some or all events in which it has engaged • A trace of a complete program execution – is the union of all the tracelets for every resource that it has used Pictorially ν δ ν labels the allocation of an object. δ labels its disposal. All other events of its tracelet lie in between A variable ν :=2 :=4 =4 δ :=3 =2 =3 =2 :=4 labels an assignment of value 4 =4 labels a fetch of value 4 Object names νx x:=4 x= 4 x:=2 x:=3 x=2 x=3 x=2 may be added to the labels δx A variable ν :=2 :=4 =4 δ :=3 =2 =3 =2 The arrows from each fetch to the next assignment ensures prompt overwriting Weak memory ν :=2 :=4 =4 δ :=3 =2 =3 =2 … which does not occur in modern weak memories. A Semaphore ν V P V P δ • P is an acquisition of the semaphore • V is a release (by the same owner) A buffered channel ν !2 !4 ?4 !4 ?4 !=3 ?2 labels an output of value 4 labels an input of value 4 δ ?3 A single-buffered channel ν !2 !4 ?4 δ !3 ?2 ?3 Each output depends on prior input of the previous message A complete program trace is the union of the tracelets for every command that it contains. • The tracelet of a command can be analysed into sub-tracelets for each of its immediate sub-commands. • The analysis determines whether the trace is a valid trace for the program. Concurrent Composition • p|q = p q, provided that p q = { } – otherwise the analysis is invalid, because no event is an execution of two distinct commands. • Theorem: | is associative and commutative with unit Definitions • p --> q = e є p, f є q . e --> f • p => q = p = q or p is undefined. Sequential Composition • p;q = p|q provided not q --> p – otherwise the analysis is invalid, because no event in execution of the first command can depend on any event in the execution of the second. • Theorem: ; is associative with unit Example If x is a shared variable x := 3 ; x:=4 = x := 3 x := 4 Or, if blue arrows are equal, x := 3 ; x:=4 = x := 3 x := 4 Theorems • p;q => p|q – Proof: they are equal whenever not q --> p – otherwise, lhs is undefined • (p|p’);q => p|(p’;q) – they are equal when they are both defined – if rhs is undefined, then q --> p’ – which implies that q --> (p’|q), – therefore the lhs is also undefined. Exchange laws • p;(q|q’) => (p;q)|q’ – proof similar • (p|p’);(q|q’) => (p;q)|(p’;q’) – proof similar • All exchange laws are derivable from the last, – by substituting for p’, or q’, – or for both q and q’ Separating concurrency? • r = p||q = r = p;q & r = q;p – there is no arrow between p and q • BUT – this would prohibit shared variables – p||q < p;q – p ; (q||q’) < (p;q)||q’ – the wrong way round! • Let’s postpone this problem etc. A command • is modelled by the set of tracelets of all its possible executions in all its possible environments. • = {{}} • x := 3 = { p | ‘x :=3’ є labels(p)} • x := y = { p | n. {‘x:= n’, ‘y = n’} labels(p)} Let P, Q, R, be commands • • • • P | Q = { (p|q)| p є P & q є Q } P ; Q = { (p;q) | p є P & q є Q } P \/ Q = { r | r є P or r є Q } P Q = r . r є P => r є Q All our theorems p => q also hold for P Q, – because every variable appears exactly once on each side of the inequation. Separation • Let L be the set of red arrows – they must not cross thread boundaries • Blue arrows must cross thread boundaries • Black arrows may cross either boundaries. • Definitions of ; and | must be changed to ensure these rules Dependency ordering • Let e < f mean that there is a path of arrows from e to f . • e <L f means the path consists of red arrows • p <L q means e --> f, for some e є p , f є q Interfaces of a tracelet p • arrows(p) • ins(p) • outs(p) = = = s(p) u t(p) t(p) - s(p) s(p) - t(p) – where s(p) = {d| source(d) є p} t(p) = {d| target(d) є p} In pictures… ins outs Separating ; • p;q = p|q provided that not q < p & outs(p) L = ins(q) L – outs(p;q) L = outs(q) L – ins(p;q) L = ins(p) L • Theorem: ; is associative and has unit Ok is a set of tracelets that are always preferred – e.g., no overflow, no races, no divergence, etc. • p => q • є ok • p;q є ok means = p=q or p is not defined or not q є ok p є ok & q є ok Separating concurrency • e<f = there is a path of red dependencies from e to f • p*q є ok = p є ok & q є ok & not p < q & not q < p • p*q = p|q if p*q є ok – because that’s the way it will be implemented! Lift to sets • P => Q means p є P. p є ok => p є Q otherwise q є Q . not q є ok & ins(q) = ins(p)