Team Extremely Awesome
Nels Beckman
Project Presentation
17-654: Analysis of Software Artifacts
Analysis of Software Artifacts -
Spring 2006
1
A Goal-Based Literature Search
• This semester we explored many fundamental styles of software analysis.
• How might each one be applied to the same goal?
• (Finding race conditions)
• Purpose:
• Analyze strengths of different analysis styles normalized to one defect type.
• See how you might decide amongst different techniques on a real project.
What is a Race Condition?
• One Definition:
• “A race occurs when two threads can access (read or write) a data variable simultaneously and at least one of the two accesses is a write.” (Henzinger 04)
• Note:
• Locks not specifically mentioned.
Why Race Conditions?
• Race conditions are insidious bugs:
• Can corrupt memory.
• Often not detected until later in execution.
• Appearance is non-deterministic.
• Difficult to reason about the interaction of multiple threads.
• My intuition?
• It should be relatively easy to ensure that I am at least locking properly.
But First: Locking Discipline
• Mutual Exclusion Locking Discipline (MLD)
• A programming discipline that will ensure an absence of race conditions.
• Requires a lock be held on every access to a shared variable.
• Not the only way to achieve freedom from races!
• See example, next slide.
• Some tools check MLD, not race safety.
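Example (Yu '05):
Thread t: t:Fork(u) t:Lock(a) t:Write(x) t:Unlock(a) t:Join(u) t:Write(x) t:Fork(v) t:Lock(a) t:Write(x) t:Unlock(a) t:Join(v)
Thread u: u:Lock(a) u:Write(x) u:Unlock(a)
Thread v: v:Lock(a) v:Write(x) v:Unlock(a)
• The t:Write(x) between t:Join(u) and t:Fork(v) holds no lock, yet it cannot race: u has already been joined and v has not yet been forked.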
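The trace can be replayed as a small program. This is only a sketch using Python's standard threading module; the names locked_write and run_trace are made up for illustration, and each Write(x) is assumed to be an increment so that the result is easy to check.

```python
import threading

a = threading.Lock()   # the lock "a" from the trace
x = 0                  # the shared variable "x"


def locked_write():
    """Lock(a); Write(x); Unlock(a), as done by t, u and v."""
    global x
    with a:
        x += 1


def run_trace():
    """Replay thread t's side of the trace and return the final x."""
    global x
    x = 0
    u = threading.Thread(target=locked_write)
    u.start()            # t:Fork(u)
    locked_write()       # t:Lock(a) t:Write(x) t:Unlock(a)
    u.join()             # t:Join(u)
    x += 1               # t:Write(x) with no lock: u is done, v does not exist yet
    v = threading.Thread(target=locked_write)
    v.start()            # t:Fork(v)
    locked_write()       # t:Lock(a) t:Write(x) t:Unlock(a)
    v.join()             # t:Join(v)
    return x
```

Every run performs five writes in total, and the unlocked one is ordered by the join before it and the fork after it, so the program breaks the locking discipline without containing a race.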
Four Broad Analysis Types
• Type-Based Race Prevention
• Languages that cannot express “racy” programs.
• Dynamic Race Detectors
• Using instrumented code to detect races.
• Model-Checkers
• Searching for reachable race states.
• Flow-Based Race Detectors
• Of the style seen in this course.
Dimensions of Comparison
• Ease of Use
• Annotations
• What is the associated burden with annotating the code?
• Expression
• Does the tool restrict my ability to say what I want?
• Scalability
• Could this tool legitimately claim to work on a large code base?
• Soundness
• What level of assurance is provided?
• Precision
• Can I have confidence in the results?
Type-Based Race Prevention
• Goal:
• To prevent race conditions using the language itself.
• Method:
• Encode locking discipline into language.
• Relate shared state to the locks that protect it.
• Use typing annotations.
• Recall ownership types; this will seem familiar.
Example: Race-Free Cyclone
• To give a better feel, let's look at Cyclone.
• Other type-based systems are very similar.
Example: Race-Free Cyclone
• Things we want to express:
• “This lock protects this variable.”
int*l p1 = new 42;
int*loc p2 = new 43;
• The first line declares a variable of type “an integer protected by the lock named l.”
• loc is a special lock name. It means this variable is never shared.
Example: Race-Free Cyclone
• Things we want to express:
• “This is a new lock.”
let lk<l> = newlock();
• lk is the variable name.
• l is the lock type name.
Example: Race-Free Cyclone
• Things we want to express:
• “This function should only be called when in possession of this lock.”
void inc<l:LU>(int*l p;{l}) {
    // blah blah
}
• <l:LU>: this can be ignored for now...
• int*l p: when passed an int whose protection lock is l...
• ;{l}: the caller must already possess lock l...
Example: Race-Free Cyclone
void inc<l:LU>(int*l p;{l}) {
    *p = *p + 1;
}
void inc2<l:LU>(lock_t<l> plk, int*l p;{}) {
    sync(plk) { inc(p); }
}
void f(;{}) {
    let lk<l> = newlock();
    int*l p1 = new 42;
    int*loc p2 = new 43;
    spawn(g);
    inc2(lk, p1);
    inc2(nonlock, p2);
}
• It would be a type error to call inc without possessing the lock for the first argument.
• Imagine if the effects clause of inc were empty:
void inc<l:LU>(int*l p;{}) {
    *p = *p + 1;
}
• A dereference would also signal a compiler error, since it is unprotected.
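For readers without a Cyclone compiler, the same rule can be approximated at run time. The sketch below is not how Cyclone works: Cyclone rejects the bad program at compile time, whereas this Python version raises an error only when the bad access actually executes. OwnedLock and Guarded are hypothetical helpers, and passing None stands in for loc and nonlock.

```python
import threading


class OwnedLock:
    """A mutex that remembers which thread holds it (hypothetical helper)."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._owner = None

    def __enter__(self):
        self._mutex.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._owner = None
        self._mutex.release()
        return False

    def held_by_me(self):
        return self._owner == threading.get_ident()


class Guarded:
    """An int tied to the lock that protects it, loosely like int*l."""

    def __init__(self, value, lock=None):
        self._value = value
        self._lock = lock      # None plays the role of loc: never shared

    def _check(self):
        if self._lock is not None and not self._lock.held_by_me():
            raise RuntimeError("access without holding the protecting lock")

    def get(self):
        self._check()
        return self._value

    def set(self, value):
        self._check()
        self._value = value


def inc(p):
    # Like inc above: the caller must already hold p's lock.
    p.set(p.get() + 1)


def inc2(lock, p):
    # Like inc2 above: take the lock (if there is one), then call inc.
    if lock is None:
        inc(p)
    else:
        with lock:
            inc(p)
```

Calling inc2(lk, p1) succeeds, while calling inc(p1) directly without holding lk raises RuntimeError. The type system turns that same mistake into a compile-time error, which is why no execution is needed to find it.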
Type-Based Race Prevention
• Positives:
• Soundness
• Programs are race-free by construction.
• Familiarity
• Languages are usually based on well-known languages.
• Locking discipline is a very common paradigm.
• Relatively Expressive
• These type systems have been integrated with polymorphism, object migration.
• Classes can be parameterized by different locks
• Types Can Often be Inferred
• Intra-procedural (thanks to effects clauses)
Type-Based Race Prevention
• Negatives:
• Restrictive:
• Not all race-free programs are legal.
• e.g. Object initialization, other forms of synchronization (fork/join, etc.).
• Annotation Burden:
• Lots of annotations to write, even for nonshared data.
• Especially to make more complicated features, like polymorphism, work.
• Another Language
Type-Based Race Prevention
• Open Research Questions:
• Reduce Restrictions as Much as Possible
• Initialization phase
• Subclassing without run-time checks in OO
• Encoding of thread starts and stops
• Remove annotations for non-threaded code
Type-Based Race Prevention
• Open Research Questions:
• Personally, I am sceptical that inference can improve things a whole lot.
• Programmer intent still must be specified somehow in locking discipline.
• But escape analysis could infer thread-locals.
Dynamic Race Detectors
• Find race conditions by:
• Instrumenting the source code.
• Running lockset and happens-before analyses.
• Lockset has no false negatives (for the executions it observes).
• Happens-before has no false positives.
• In the following slides, we will play the part of the instrumented code.
• We see all (inside the program)!
Lockset Analysis
• Imagine we’re watching the program execute…
...
marbury = 5; madison = 5; makeStuffHappen();
...
Lockset Analysis
• Whenever a lock is acquired, add that to the set of “held locks.”
...
roe = 5; wade = 5; synchronize(my_object) {
...
Held Locks: my_object (0x34EFF0)
Lockset Analysis
• Likewise, remove locks when they are released.
...
brown = 43; board = “yes”;
} // end synch
...
Held Locks: (none)
Lockset Analysis
• The first time a variable is accessed, set its “candidate set” to be the set of held locks.
...
rob_frost = false;
...
Held Locks: lock1 (0xFFFF01), lock2 (0xFFFF08)
Candidate Set for rob_frost: { 0xFFFF01, 0xFFFF08 }
Lockset Analysis
• The next time that variable is accessed, take the intersection of the candidate set and the set of currently held locks…
...
if(!rob_frost) {
...
Candidate Set for rob_frost: { 0xFFFF01, 0xFFFF08 } ∩ Held Locks: { lock1 (0xABFF44) }
• If the intersection is empty, flag a potential race condition!
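The bookkeeping on these slides fits in a few lines. The sketch below is a minimal version under a few assumptions: the class and method names are made up, held locks are tracked per thread (the slides show a single set), and the emptiness check happens only after an intersection, as on the slides. Real tools add further states to cut down warnings for initialization and read-only data.

```python
class LocksetDetector:
    """Minimal lockset bookkeeping; all names here are illustrative."""

    def __init__(self):
        self.held = {}        # thread -> set of locks it currently holds
        self.candidates = {}  # variable -> candidate set of protecting locks
        self.warnings = []    # (variable, thread) pairs flagged as potential races

    def acquire(self, thread, lock):
        self.held.setdefault(thread, set()).add(lock)

    def release(self, thread, lock):
        self.held.setdefault(thread, set()).discard(lock)

    def access(self, thread, var):
        held = self.held.get(thread, set())
        if var not in self.candidates:
            # First access: the candidate set is whatever is held right now.
            self.candidates[var] = set(held)
            return
        # Later accesses: keep only the locks held every time so far.
        self.candidates[var] &= held
        if not self.candidates[var]:
            self.warnings.append((var, thread))
```

Replaying the slides: a thread holding 0xFFFF01 and 0xFFFF08 accesses rob_frost, so the candidate set becomes both locks. A later access made while holding only 0xABFF44 empties the intersection and is flagged.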
Happens-Before Analysis
• More complicated.
• Intuition:
• Certain operations define an ordering between operations of threads.
• Establish thread counters to create a partial ordering.
• When a variable access occurs that can’t establish itself as being ‘after’ the previous one, we have detected an actual race.
Happens-Before on our Example
Thread t (clock values 1, 2): t:Fork(u) t:Lock(a) t:Write(x) t:Unlock(a) t:Join(u) t:Write(x) t:Fork(v)
Thread u (clock value 1): u:Lock(a) u:Write(x) u:Unlock(a)
• Each thread carries a clock value.
• Each variable stores the thread clock value for the most recent access of each thread.
x: u-1 t-2
• Also, threads learn about and store the clock values of other threads through synchronization activities.
t: self-2 u-1
• If u were to go off, incrementing its count (1 … 32) and accessing variables, t would find out after the join.
t: self-2 u-32
x: u-32 t-2
• When an access does occur, it is a requirement that, for each previous thread access of x:
x's knowledge of that thread's time ≤ t's knowledge of that thread's time
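The clock bookkeeping can be sketched with ordinary vector clocks. This is a generic textbook-style version, not necessarily the exact scheme of the tool the slides are based on: here a lock release also advances the releasing thread's clock, so the numbers it computes need not match the ones drawn above. Every access is treated as a write, and all class and method names are made up.

```python
class HappensBeforeDetector:
    """Vector-clock sketch of happens-before race detection (illustrative)."""

    def __init__(self):
        self.clocks = {}       # thread -> {thread: latest clock value it knows of}
        self.lock_clocks = {}  # lock -> clock knowledge left behind at last release
        self.accesses = {}     # variable -> {thread: clock of its latest access}
        self.races = []        # (variable, earlier thread, later thread)

    def _clock(self, thread):
        return self.clocks.setdefault(thread, {thread: 1})

    @staticmethod
    def _learn(mine, other):
        for thread, value in other.items():
            mine[thread] = max(mine.get(thread, 0), value)

    def fork(self, parent, child):
        self._learn(self._clock(child), self._clock(parent))
        self._clock(parent)[parent] += 1   # later parent work is unordered with child

    def join(self, parent, child):
        self._learn(self._clock(parent), self._clock(child))

    def acquire(self, thread, lock):
        self._learn(self._clock(thread), self.lock_clocks.get(lock, {}))

    def release(self, thread, lock):
        clock = self._clock(thread)
        self.lock_clocks[lock] = dict(clock)
        clock[thread] += 1

    def access(self, thread, var):
        clock = self._clock(thread)
        previous = self.accesses.setdefault(var, {})
        for other, when in previous.items():
            # Requirement: the variable's record for `other` must not be
            # newer than what the accessing thread knows about `other`.
            if other != thread and clock.get(other, 0) < when:
                self.races.append((var, other, thread))
        previous[thread] = clock[thread]
```

On the example trace, every write is ordered after the previous ones through the lock, the join, or the fork, so nothing is reported. If u writes x with no lock and is never joined before t writes x, the check fails and the pair is reported.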
So, combining the two…
• Modern dynamic race detectors use both techniques.
• Lockset analysis will detect any violation of locking discipline.
• This means we will get plenty of false positives when strict locking discipline is not followed.
• Simple: requires less memory and fewer cycles.
So, combining the two…
• Modern dynamic race detectors use both techniques.
• Happens-Before will report actual race conditions that were detected.
• Extremely path sensitive.
• No false positives!
• False negatives can be a problem.
• High memory and CPU overhead.
• As we have seen, happens-before does not merely enforce locking discipline.
• Works when threads are ‘ordered.’
So, combining the two…
• Performance-wise:
• Use lockset, then switch to happens-before for variables where a race is detected.
• Of course this is dynamic! No guarantee of reoccurrence!
• Similarly, modify detection granularity at runtime.
Future Research
• Use static tools to limit search space
• We can soundly approximate every location where a race might occur.
• Performance improvements
• Could be used for in-field monitoring.
• Improve chances of HB hitting?
Model-Checking for Race Conditions
• The Art of Model Checking
• Develop a model of your software system that can be completely explored to find reachable error states
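As a toy illustration of "completely explored", the sketch below models each thread as a fixed list of lock, unlock, read and write steps and visits every reachable state. The model, the function name, and the definition of an error state (two threads both about to access the same variable, at least one of them writing) are assumptions made for this example; real model checkers work on far richer models.

```python
def find_race_states(threads):
    """threads maps a thread name to a list of (op, arg) steps, where op is
    'lock', 'unlock', 'read' or 'write'. Returns the set of
    (variable, thread, thread) triples found in some reachable state."""
    names = sorted(threads)
    start = (tuple(0 for _ in names), frozenset())  # program counters, (lock, owner) pairs
    seen = {start}
    stack = [start]
    races = set()
    while stack:
        pcs, held = stack.pop()
        owners = dict(held)  # lock -> thread holding it
        pending = {n: threads[n][pc] for n, pc in zip(names, pcs)
                   if pc < len(threads[n])}
        # Error state: two threads poised to touch the same variable.
        accesses = [(n, op, arg) for n, (op, arg) in sorted(pending.items())
                    if op in ("read", "write")]
        for i, (n1, op1, v1) in enumerate(accesses):
            for n2, op2, v2 in accesses[i + 1:]:
                if v1 == v2 and "write" in (op1, op2):
                    races.add((v1, n1, n2))
        # Successor states: let each enabled thread take one step.
        for i, n in enumerate(names):
            if n not in pending:
                continue
            op, arg = pending[n]
            if op == "lock":
                if arg in owners:
                    continue  # blocked: someone else holds the lock
                new_held = held | {(arg, n)}
            elif op == "unlock":
                new_held = held - {(arg, n)}
            else:
                new_held = held
            nxt = (pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:], new_held)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return races
```

Two threads that each lock a, write x and unlock a yield no error state, while replacing one of them with a bare write to x yields one. The search terminates here only because the model is tiny and finite.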
Model-Checking for Race Conditions
• Normally, scope of model determines whether or not model checking is feasible.
• Detailed model – Model checking takes longer.
• Simple model – Must be detailed enough to capture principles of interest.
Model-Checking for Race Conditions
• Model-checking concurrent programs is quite a challenge
• Take a large state space
• Add all possible thread interleavings
• Result – Very large state space
• Details of specific models would be too much to go into
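To get a feel for how fast interleavings pile up, the number of ways to interleave independent threads can be computed directly. This is a small sketch; the function name is made up, and it assumes every thread runs a fixed number of atomic steps with no blocking.

```python
from math import comb


def count_interleavings(steps_per_thread):
    """Number of ways to interleave threads with the given step counts
    while keeping each thread's own steps in program order."""
    placed = 0
    count = 1
    for steps in steps_per_thread:
        placed += steps
        # Choose which of the slots used so far belong to this thread.
        count *= comb(placed, steps)
    return count
```

Two threads of two steps each have 6 interleavings, two threads of ten steps each have 184,756, and three threads of three steps each have 1,680.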
Model-Checking for Race Conditions
• Strategies:
• Persistent Sets
• Eliminate pointless thread interleavings
• Sometimes known as partial order reduction
• Contexts
• Represent every other thread with one abstract state machine.
• Like CEGAR, only refine as much as needed.
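The idea behind eliminating pointless interleavings can be shown by counting. In the sketch below, two steps are assumed to be dependent when they belong to the same thread or touch the same variable. Interleavings that order every dependent pair the same way are treated as equivalent, so one representative per class is enough. The sketch only counts the classes by brute force: actual persistent-set algorithms avoid generating the redundant interleavings in the first place. All names and the step encoding are made up.

```python
from itertools import combinations


def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's order."""
    if not a:
        yield tuple(b)
    elif not b:
        yield tuple(a)
    else:
        for rest in interleavings(a[1:], b):
            yield (a[0],) + rest
        for rest in interleavings(a, b[1:]):
            yield (b[0],) + rest


def dependent(p, q):
    """Steps are (thread, variable, index) tuples."""
    return p[0] == q[0] or p[1] == q[1]


def count_classes(a, b):
    """Return (number of interleavings, number of equivalence classes)."""
    total = 0
    classes = set()
    for run in interleavings(a, b):
        total += 1
        # The order of dependent pairs identifies the class of this run.
        classes.add(frozenset((p, q) for p, q in combinations(run, 2)
                              if dependent(p, q)))
    return total, len(classes)
```

When the two threads touch disjoint variables, all 6 interleavings of two 2-step threads collapse into a single class. When both threads touch x once, there are 2 classes, one for each order of the two accesses to x.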
Model-Checking for Race Conditions
• Ease of use?
• Annotations
• None
• Expression
• Some tools use model-checking to implement lockset which does not allow much expression.
• Others allow us to find actual race conditions!
• Scalability
• A Question Mark: Is the state space small enough?
• Previous tools using partial order reduction have been used on large software, but not for races.
Model-Checking for Race Conditions
• Soundness?
• Yes, model-checking in this manner is sound, as long as it terminates.
• Precision?
• Depends on how your model is used.
• In one model lockset analysis is used. Tends to be imprecise.
• Another model directly searches for “racy” states, which makes it very precise, but it doesn't yet work in the presence of aliasing.
Good ol' Flow-Based Analysis
• Has been approached in a few ways
• Engineering Approach
• Sacrifice Soundness
• Increase Precision as Much as Possible
• Rank Results
• Use Heuristics and Good Judgement
• Think of PREfix or Coverity
• Rely on Alias Analysis
• Rely on Programmer Annotations
Good ol' Flow-Based Analysis
• Engineering Approach:
• Start with interprocedural lockset analysis
• Make simple improvements:
• “use statistical analysis to compute the probability that s ... similar to known locks.”
• “realize that the first, last or only shared data in a critical section are special.”
• “if the number of distinct entry locksets in a function exceeds a fixed limit we skip the function”
• (Engler ’03)
Many Benefits
• Ease of Use?
• Annotations
• None or a constant number that give immediate precision improvements.
• Expression
• Non-lock based idioms are 'hard-coded' by heuristics.
• Scalability
• More than any other.
• Linux, FreeBSD, Commercial OS
• 1.8MLOC in 2-14 minutes
Many Benefits
• Soundness?
• Not sound in a few specific ways.
• Ability to detect some false negatives.
• Precision?
• Fewer false positives than traditional lockset tools.
• ~6 when run on Linux 2.5.
• 10s, 100s, 1000s in other static tools on smaller applications.
Other Flow-Based Tools
• Some Rely on Alias Analysis
• Limited by Current State-of-the-Art
• Still Many False Positives
• May not Scale
• Some Rely on Programmer Annotations to distinguish all the hard cases
• May impose programmer burden
So, Let’s Do a Final Comparison…
Annotations
• Type-Based Systems
• Annotations are a major limiting factor. They can be inferred, but they must be understood by the programmer.
• Dynamic Tools
• Unnecessary
• Model-Checking
• Unnecessary
• Flow-Based Analysis
• Necessary in some form or another
Expression
• Type-Based Systems
• Limited to strict locking discipline.
• Dynamic Tools
• Thanks to the combination of lockset and happens-before, relative freedom.
• Model-Checking
• Can allow great expression (Depends on technology).
• Flow-Based Analysis
• Expression can be traded for soundness or annotations.
Scalability
• Type-Based Systems
• Scalability Limited by Annotations
• Dynamic Tools
• Getting better, but performance is still a major issue (1-3x memory usage, 1.5x CPU usage).
• Model-Checking
• Not extremely scalable. Depends highly on number of processes.
• Flow-Based Analysis
• Has shown the best scalability.
Soundness
• Type-Based Systems
• Sound
• Dynamic Tools
• Fundamentally unsound; but lockset will catch most possible races in execution.
• Model-Checking
• Also sound. May not terminate.
• Flow-Based Analysis
• Different techniques trade soundness for precision.
Precision
• Type-Based Systems
• Low precision. Strict MLD.
• Dynamic Tools
• Better precision.
• Model-Checking
• Can be very high. Not complete (undecidability of reachability).
• Flow-Based Analysis
• High precision using an engineering approach.
Questions