Survey of Race Condition Analysis Techniques

Team Extremely Awesome

Nels Beckman

Project Presentation

17-654: Analysis of Software Artifacts

Analysis of Software Artifacts - Spring 2006

A Goal-Based Literature Search

• This semester we explored many fundamental styles of software analysis.

• How might each one be applied to the same goal?

• (Finding race conditions)

• Purpose:

• Analyze strengths of different analysis styles normalized to one defect type.

• See how you might decide amongst different techniques on a real project.


What is a Race Condition?

• One Definition:

• “A race occurs when two threads can access (read or write) a data variable simultaneously and at least one of the two accesses is a write.”

(Henzinger 04)

• Note:

• Locks not specifically mentioned.
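To make the definition concrete, here is a minimal Python sketch (not from the original slides) that replays one unlucky interleaving of two threads incrementing a shared counter. The function name `run_schedule` and the explicit step-by-step schedules are illustrative assumptions; real threads would interleave non-deterministically.

```python
# Deterministic replay of a racy interleaving: two "threads" each perform
# counter = counter + 1 as separate read and write steps, with no lock.

def run_schedule(schedule):
    """Replay read/write steps in the given order and return the final counter."""
    shared = {"counter": 0}
    local = {}  # per-thread register holding the value that thread last read
    for thread, step in schedule:
        if step == "read":
            local[thread] = shared["counter"]
        elif step == "write":
            shared["counter"] = local[thread] + 1
    return shared["counter"]

# Serialized: each thread finishes its increment before the other starts.
serial = [("t", "read"), ("t", "write"), ("u", "read"), ("u", "write")]

# Racy: both threads read before either writes, so one update is lost.
racy = [("t", "read"), ("u", "read"), ("t", "write"), ("u", "write")]

print(run_schedule(serial))  # 2
print(run_schedule(racy))    # 1
```

Both schedules contain the same four steps; only their order differs, which is why the bug appears non-deterministically in practice.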


Why Race Conditions?

• Race conditions are insidious bugs:

• Can corrupt memory.

• Often not detected until later in execution.

• Appearance is non-deterministic.

• Difficult to reason about the interaction of multiple threads.

• My intuition?

• It should be relatively easy to ensure that I am at least locking properly.


But First: Locking Discipline

• Mutual Exclusion Locking Discipline

• A programming discipline that will ensure an absence of race conditions.

• Requires a lock be held on every access to a shared variable.

• Not the only way to achieve freedom from races!

• See example, next slide.

• Some tools check adherence to this discipline (MLD), not race safety itself.


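Example: (Yu '05)

Thread t: t:Fork(u), t:Lock(a), t:Write(x), t:Unlock(a), t:Join(u), t:Write(x), t:Fork(v), t:Lock(a), t:Write(x), t:Unlock(a), t:Join(v)

Thread u: u:Lock(a), u:Write(x), u:Unlock(a)

Thread v: v:Lock(a), v:Write(x), v:Unlock(a)

• The unlocked t:Write(x) between t:Join(u) and t:Fork(v) breaks the locking discipline but cannot race: no other thread is running at that point.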
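The same sequence of operations can be replayed with real threads. This is a sketch in Python using the standard `threading` module; mapping Fork/Join to `start()`/`join()` and modelling each Write(x) as an increment are assumptions made for illustration.

```python
import threading

x = 0
a = threading.Lock()

def locked_write():
    """Body of threads u and v: write x while holding lock a."""
    global x
    with a:
        x += 1

def thread_t():
    """Replays thread t's operations from the example."""
    global x
    u = threading.Thread(target=locked_write)
    u.start()        # t:Fork(u)
    with a:          # t:Lock(a) ... t:Unlock(a)
        x += 1       # t:Write(x)
    u.join()         # t:Join(u)
    x += 1           # t:Write(x) with no lock: u has ended, v has not started
    v = threading.Thread(target=locked_write)
    v.start()        # t:Fork(v)
    with a:          # t:Lock(a) ... t:Unlock(a)
        x += 1       # t:Write(x)
    v.join()         # t:Join(v)
    return x

result = thread_t()
print(result)  # 5
```

Every write that can overlap with another thread is protected by `a`, and the one unprotected write is ordered by the join and the fork, so the result is the same on every run.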


Four Broad Analysis Types

• Type-Based Race Prevention

• Languages that cannot express “racy” programs.

• Dynamic Race Detectors

• Using instrumented code to detect races.

• Model-Checkers

• Searching for reachable race states.

• Flow-Based Race Detectors

• Of the style seen in this course.


Dimensions of Comparison

• Ease of Use

• Annotations

• What is the associated burden with annotating the code?

• Expression

• Does the tool restrict my ability to say what I want?

• Scalability

• Could this tool legitimately claim to work on a large code base?

• Soundness

• What level of assurance is provided?

• Precision

• Can I have confidence in the results?


Type-Based Race Prevention

• Goal:

• To prevent race conditions using the language itself.

• Method:

• Encode locking discipline into language.

• Relate shared state to the locks that protect it.

• Use typing annotations.

• Recall ownership types; this will seem familiar.


Example: Race-Free Cyclone

• To give a better feel, let's look at Cyclone.

• Other type-based systems are very similar.


Example: Race-Free Cyclone

• Things we want to express:

• “This lock protects this variable.”

    int*l p1 = new 42;
    int*loc p2 = new 43;

• The first line declares a variable of type “an integer protected by the lock named l.”

• loc is a special lock name. It means this variable is never shared.

Example: Race-Free Cyclone

• Things we want to express:

• “This is a new lock.”

    let lk<l> = newlock();

• lk: variable name

• l: lock type name

Example: Race-Free Cyclone

• Things we want to express:

• “This function should only be called when in possession of this lock.”

    void inc<l:LU>(int*l p;{l}) {
      // blah blah
    }

• <l:LU>: This can be ignored for now...

• int*l p: When passed an int whose protection lock is l...

• ;{l}: The caller must already possess lock l...

Example: Race-Free Cyclone

    void inc<l:LU>(int*l p;{l}) {
      *p = *p + 1;
    }

    void inc2<l:LU>(lock_t<l> plk, int*l p;{}) {
      sync(plk) { inc(p); }
    }

    void f(;{}) {
      let lk<l> = newlock();
      int*l p1 = new 42;
      int*loc p2 = new 43;
      spawn(g);
      inc2(lk, p1);
      inc2(nonlock, p2);
    }

• It would be a type error to call inc without possessing the lock for the first argument.

• Imagine if the effects clause of inc were empty...

    void inc<l:LU>(int*l p;{}) {
      *p = *p + 1;
    }

• A dereference would also signal a compiler error, since it is unprotected.
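Cyclone rejects these errors at compile time. Python has no such type system, so the sketch below only approximates the idea with a run-time check: a value tied to a lock refuses any access unless the current thread holds that lock. The classes `OwnedLock` and `Guarded` are hypothetical helpers invented for this illustration; they are not part of Cyclone or of any library.

```python
import threading

class OwnedLock:
    """A lock that remembers which thread holds it (hypothetical helper)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def __enter__(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._owner = None
        self._lock.release()
        return False

    def held_by_current_thread(self):
        return self._owner == threading.get_ident()

class Guarded:
    """A value that may only be touched while its lock is held, like int*l."""
    def __init__(self, value, lock):
        self._value = value
        self._lock = lock

    def _check(self):
        if not self._lock.held_by_current_thread():
            raise RuntimeError("access without holding the protecting lock")

    def get(self):
        self._check()
        return self._value

    def set(self, value):
        self._check()
        self._value = value

def inc(p):
    """Counterpart of inc: assumes the caller already holds p's lock."""
    p.set(p.get() + 1)

def inc2(lock, p):
    """Counterpart of inc2: acquires the lock, then calls inc."""
    with lock:
        inc(p)

lk = OwnedLock()
p1 = Guarded(42, lk)
inc2(lk, p1)            # fine: the lock is held inside inc2

try:
    inc(p1)             # rejected: the lock is not held here
    rejected = False
except RuntimeError:
    rejected = True

with lk:
    final = p1.get()
print(final, rejected)  # 43 True
```

The important difference is when the error shows up: here only if the bad call actually executes, in Cyclone before the program ever runs.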

Type-Based Race Prevention

• Positives:

• Soundness

• Programs are race-free by construction.

• Familiarity

• Languages are usually based on well-known languages.

• Locking discipline is a very common paradigm.

• Relatively Expressive

• These type systems have been integrated with polymorphism, object migration.

• Classes can be parameterized by different locks

• Types Can Often be Inferred

• Intra-procedural (thanks to effects clauses)


Type-Based Race Prevention

• Negatives:

• Restrictive:

• Not all race-free programs are legal.

• e.g. Object initialization, other forms of synchronization (fork/join, etc.).

• Annotation Burden:

• Lots of annotations to write, even for nonshared data.

• Especially to make more complicated features, like polymorphism, work.

• Another Language


Type-Based Race Prevention

• Open Research Questions:

• Reduce Restrictions as Much as Possible

• Initialization phase

• Subclassing without run-time checks in OO

• Encoding of thread starts and stops

• Remove annotations for non-threaded code


Type-Based Race Prevention

• Open Research Questions:

• Personally, sceptical that inference can improve a whole lot.

• Programmer intent still must be specified somehow in locking discipline.

• But escape analysis could infer thread-locals.


Dynamic Race Detectors

• Find race conditions by:

• Instrumenting the source code.

• Running lockset and happens-before analyses.

• Lockset has no false negatives (for the accesses that actually execute).

• Happens-before has no false positives.

• In the slides that follow, we play the role of the instrumented code.

• We see all (inside the program)!


Lockset Analysis

• Imagine we’re watching the program execute…

...

marbury = 5; madison = 5; makeStuffHappen();

...


Lockset Analysis

• Whenever a lock is acquired, add that to the set of “held locks.”

...

roe = 5; wade = 5; synchronized(my_object) {

...

Held Locks: my_object (0x34EFF0)

Lockset Analysis

• Likewise, remove locks when they are released.

...

brown = 43; board = “yes”;

} // end synch

...

Held Locks: (none)

Lockset Analysis

• The first time a variable is accessed, set its “candidate set” to be the set of held locks.

...

rob_frost = false;

...

Candidate Set for rob_frost: (0xFFFF01), (0xFFFF08)

Held Locks: lock1 (0xFFFF01), lock2 (0xFFFF08)

Lockset Analysis

• The next time that variable is accessed, take the intersection of the candidate set and the set of currently held locks…

...

if(!rob_frost) {

...

Candidate Set for rob_frost: (0xFFFF01), (0xFFFF08)

Held Locks: lock1 (0xABFF44)

Lockset Analysis

• If the intersection is empty, flag a potential race condition!

...

if(!rob_frost) {

...

Candidate Set for rob_frost: (0xFFFF01), (0xFFFF08)

Held Locks: lock1 (0xABFF44)
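The lockset walkthrough can be condensed into a short sketch. The version below is a simplified Python illustration that works over a pre-recorded trace of events rather than instrumented code; the trace format, the function name `lockset_analysis`, and the extra variable `counter` are assumptions made for the example. The lock addresses are the ones shown on the slides.

```python
def lockset_analysis(trace):
    """Return variables whose candidate set became empty, in order of detection.

    trace is a list of (thread, op, target) tuples, where op is one of
    "lock", "unlock" or "access".
    """
    held = {}        # thread -> set of locks currently held
    candidates = {}  # variable -> candidate set of locks
    flagged = []     # variables reported as potential races
    for thread, op, target in trace:
        locks = held.setdefault(thread, set())
        if op == "lock":
            locks.add(target)
        elif op == "unlock":
            locks.discard(target)
        elif op == "access":
            if target not in candidates:
                # First access: the candidate set is the set of held locks.
                candidates[target] = set(locks)
            else:
                # Later accesses: intersect with the currently held locks.
                candidates[target] &= locks
                if not candidates[target] and target not in flagged:
                    flagged.append(target)  # empty intersection: potential race
    return flagged

trace = [
    ("t1", "lock", "0xFFFF01"),
    ("t1", "lock", "0xFFFF08"),
    ("t1", "access", "rob_frost"),  # candidate set = {0xFFFF01, 0xFFFF08}
    ("t1", "access", "counter"),    # candidate set = {0xFFFF01, 0xFFFF08}
    ("t1", "unlock", "0xFFFF08"),
    ("t1", "unlock", "0xFFFF01"),
    ("t2", "lock", "0xABFF44"),
    ("t2", "access", "rob_frost"),  # intersection with {0xABFF44} is empty
    ("t2", "unlock", "0xABFF44"),
    ("t2", "lock", "0xFFFF01"),
    ("t2", "access", "counter"),    # intersection is {0xFFFF01}
    ("t2", "unlock", "0xFFFF01"),
]
print(lockset_analysis(trace))  # ['rob_frost']
```

Note that the check never asks whether the two accesses could actually overlap in time. It reports any variable that is not consistently protected by one lock.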

Happens-Before Analysis

• More complicated.

• Intuition:

• Certain operations define an ordering between operations of threads.

• Establish thread counters to create a partial ordering.

• When a variable access occurs that can’t establish itself as being ‘after’ the previous one, we have detected an actual race.


Happens-Before on our Example

Thread t (clock 1, then 2): t:Fork(u), t:Lock(a), t:Write(x), t:Unlock(a), t:Join(u), t:Write(x), t:Fork(v)

Thread u (clock 1): u:Lock(a), u:Write(x), u:Unlock(a)

• Each thread has a clock value.

• Each variable stores the thread clock value for the most recent access of each thread.

x: u-1 t-2

• Also, threads learn about and store the clock values of other threads through synchronization activities.

t: self-2 u-1

• If u were to go off, incrementing its count (say, from 1 to 32) and accessing variables, t would find out after the join.

t: self-2 u-32

x: u-32 t-2

• When an access by t does occur, it is a requirement that, for each previous thread access of x:

x’s knowledge of that thread’s time ≤ t’s knowledge of that thread’s time

• Otherwise t cannot have learned of that access, so the two accesses are unordered.
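The clock bookkeeping can be sketched with textbook vector clocks. This Python sketch is an assumption-laden illustration, not the algorithm of any particular tool: it treats every access as a write, orders lock releases before later acquires of the same lock, and increments a thread's own clock at fork and unlock, so the exact numbers differ from those on the slides. The names `join_clocks` and `happens_before_races` are made up for the example.

```python
def join_clocks(a, b):
    """Pointwise maximum of two vector clocks stored as dicts."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}

def happens_before_races(trace):
    """Return (variable, earlier thread, later thread) for each unordered pair."""
    clocks = {}       # thread -> vector clock (thread -> time)
    lock_clocks = {}  # lock -> vector clock left behind by the last release
    last_access = {}  # variable -> {thread: that thread's clock at its latest access}
    races = []

    def clock_of(thread):
        return clocks.setdefault(thread, {thread: 1})

    for thread, op, target in trace:
        vc = clock_of(thread)
        if op == "fork":
            # The child starts out knowing everything the parent knows.
            clocks[target] = join_clocks(clock_of(target), vc)
            vc[thread] += 1
        elif op == "join":
            # The parent learns everything the finished child knew.
            clocks[thread] = join_clocks(vc, clock_of(target))
        elif op == "lock":
            clocks[thread] = join_clocks(vc, lock_clocks.get(target, {}))
        elif op == "unlock":
            lock_clocks[target] = dict(vc)
            vc[thread] += 1
        elif op == "write":
            seen = last_access.setdefault(target, {})
            for other, time in seen.items():
                # Requirement: x's recorded time <= the writer's knowledge.
                if other != thread and time > vc.get(other, 0):
                    races.append((target, other, thread))
            seen[thread] = vc[thread]
    return races

example = [
    ("t", "fork", "u"),
    ("t", "lock", "a"), ("t", "write", "x"), ("t", "unlock", "a"),
    ("u", "lock", "a"), ("u", "write", "x"), ("u", "unlock", "a"),
    ("t", "join", "u"),
    ("t", "write", "x"),  # no lock, but ordered after u by the join
    ("t", "fork", "v"),
    ("t", "lock", "a"), ("t", "write", "x"), ("t", "unlock", "a"),
    ("v", "lock", "a"), ("v", "write", "x"), ("v", "unlock", "a"),
    ("t", "join", "v"),
]
unordered = [("t", "fork", "u"), ("t", "write", "x"), ("u", "write", "x")]

print(happens_before_races(example))    # []
print(happens_before_races(unordered))  # [('x', 't', 'u')]
```

The first trace is the running example and produces no report, even though one write holds no lock. The second trace drops all synchronization between t and u, and the check fires.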

So, combining the two…

• Modern dynamic race detectors use both techniques.

• Lockset analysis will detect any violation of locking discipline.

• This means we will get plenty of false positives when strict locking discipline is not followed.

• Simple: requires less memory and fewer cycles.


So, combining the two…

• Modern dynamic race detectors use both techniques.

• Happens-Before will report actual race conditions that were detected.

• Extremely path sensitive.

• No false positives!

• False negatives can be a problem.

• High memory and CPU overhead.

• As we have seen, happens-before does not merely enforce locking discipline.

• Works when threads are ‘ordered.’


So, combining the two…

• Performance-wise:

• Use lockset, then switch to happens-before for variables where a race is detected.

• Of course this is dynamic! No guarantee of reoccurrence!

• Similarly, modify detection granularity at runtime.
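A rough Python sketch of the two-stage idea follows: a cheap lockset pass picks suspect variables, and the more expensive happens-before check is then applied only to those. For simplicity it replays one recorded trace twice, which sidesteps the reoccurrence problem; a real detector would have to switch modes during a single run. All names, the trace format, and the treatment of every access as a write are assumptions of this sketch.

```python
def merge(a, b):
    """Pointwise maximum of two vector clocks (dicts of thread -> time)."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}

def lockset_suspects(trace):
    """Cheap pass: variables whose candidate lock set becomes empty."""
    held, candidates, suspects = {}, {}, set()
    for thread, op, target in trace:
        locks = held.setdefault(thread, set())
        if op == "lock":
            locks.add(target)
        elif op == "unlock":
            locks.discard(target)
        elif op == "write":
            if target in candidates:
                candidates[target] &= locks
            else:
                candidates[target] = set(locks)
            if not candidates[target]:
                suspects.add(target)
    return suspects

def confirmed_races(trace, watched):
    """Expensive pass: happens-before check restricted to watched variables."""
    clocks, lock_clocks, last, races = {}, {}, {}, []
    for thread, op, target in trace:
        vc = clocks.setdefault(thread, {thread: 1})
        if op == "fork":
            clocks[target] = merge(clocks.get(target, {target: 1}), vc)
            vc[thread] += 1
        elif op == "join":
            clocks[thread] = merge(vc, clocks.get(target, {target: 1}))
        elif op == "lock":
            clocks[thread] = merge(vc, lock_clocks.get(target, {}))
        elif op == "unlock":
            lock_clocks[target] = dict(vc)
            vc[thread] += 1
        elif op == "write" and target in watched:
            seen = last.setdefault(target, {})
            for other, time in seen.items():
                if other != thread and time > vc.get(other, 0):
                    races.append((target, other, thread))
            seen[thread] = vc[thread]
    return races

trace = [
    ("t", "fork", "u"),
    ("t", "lock", "a"), ("t", "write", "x"), ("t", "unlock", "a"),
    ("t", "write", "y"),   # no lock held
    ("u", "lock", "a"), ("u", "write", "x"), ("u", "unlock", "a"),
    ("u", "write", "y"),   # no lock held, concurrent with t's write
    ("t", "join", "u"),
    ("t", "write", "x"),   # no lock held, but ordered by the join
]
suspects = lockset_suspects(trace)
races = confirmed_races(trace, suspects)
print(sorted(suspects))  # ['x', 'y']
print(races)             # [('y', 't', 'u')]
```

Lockset flags both variables; happens-before keeps only `y`, since the unlocked write to `x` is ordered by the join.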


Future Research

• Use static tools to limit search space

• We can soundly approximate every location where a race might occur.

• Performance improvements

• Could be used for in-field monitoring.

• Improve the chances of happens-before (HB) catching a race?


Model-Checking for Race Conditions

• The Art of Model Checking

• Develop a model of your software system that can be completely explored to find reachable error states
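As a toy illustration of "completely explored", the Python sketch below models each thread as a straight-line list of lock, unlock, read and write steps, enumerates every reachable state, and reports a state in which two threads are both about to access the same variable and at least one access is a write. The model format and the function name `find_race` are assumptions for this example; real model checkers work on far richer models.

```python
def find_race(threads):
    """Explore every interleaving; return a racy state's program counters or None.

    threads maps a thread name to a list of (op, target) steps, where op is
    "lock", "unlock", "read" or "write".
    """
    names = sorted(threads)
    start = (tuple(0 for _ in names), frozenset())  # (pcs, held (lock, owner) pairs)
    seen, stack = {start}, [start]
    while stack:
        pcs, held = stack.pop()
        # Next step of every thread that has not finished.
        pending = {n: threads[n][pc] for n, pc in zip(names, pcs)
                   if pc < len(threads[n])}
        # Error state: two threads are about to touch the same variable,
        # and at least one of them writes.
        accesses = [(n, s) for n, s in pending.items() if s[0] in ("read", "write")]
        for i, (_, (op1, var1)) in enumerate(accesses):
            for _, (op2, var2) in accesses[i + 1:]:
                if var1 == var2 and "write" in (op1, op2):
                    return dict(zip(names, pcs))
        # Successor states: advance one enabled thread by one step.
        locked = {lock for lock, _ in held}
        for idx, n in enumerate(names):
            if n not in pending:
                continue
            op, target = pending[n]
            if op == "lock":
                if target in locked:
                    continue  # blocked: the lock is held
                new_held = held | {(target, n)}
            elif op == "unlock":
                new_held = held - {(target, n)}
            else:
                new_held = held
            new_pcs = pcs[:idx] + (pcs[idx] + 1,) + pcs[idx + 1:]
            state = (new_pcs, new_held)
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return None

safe = {
    "t": [("lock", "a"), ("write", "x"), ("unlock", "a")],
    "u": [("lock", "a"), ("write", "x"), ("unlock", "a")],
}
racy = {
    "t": [("lock", "a"), ("write", "x"), ("unlock", "a")],
    "u": [("write", "x")],
}
print(find_race(safe))  # None
print(find_race(racy))  # {'t': 1, 'u': 0}
```

Because the search visits every reachable state of the model, a `None` result is a guarantee for the model, not just for one execution.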


Model-Checking for Race Conditions

• Normally, scope of model determines whether or not model checking is feasible.

• Detailed model – Model checking takes longer.

• Simple model – Must be detailed enough to capture principles of interest.


Model-Checking for Race Conditions

• Model-checking concurrent programs is quite a challenge

• Take a large state space

• Add all possible thread interleavings

• Result – Very large state space

• Details of specific models would be too much to go into here.
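To get a feel for how quickly interleavings pile up, the following Python sketch counts the orderings of independent threads with a given number of steps each, using the multinomial coefficient (total steps)! / (steps_1! × steps_2! × ...). It ignores blocking on locks, so it is an upper bound; the function name is an assumption for this example.

```python
from math import factorial

def interleavings(steps_per_thread):
    """Number of ways to interleave threads with the given step counts."""
    total = factorial(sum(steps_per_thread))
    for steps in steps_per_thread:
        total //= factorial(steps)
    return total

print(interleavings([3, 3]))        # 20
print(interleavings([10, 10]))      # 184756
print(interleavings([10, 10, 10]))  # 5550996791340
```

Three threads of only ten steps each already yield trillions of orderings, before any program data is taken into account.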


Model-Checking for Race Conditions

• Strategies:

• Persistent Sets

• Eliminate pointless thread interleavings

• Sometimes known as partial order reduction

• Contexts

• Represent every other thread with one abstract state machine.

• Like CEGAR, only refine as much as needed.


Model-Checking for Race Conditions

• Ease of use?

• Annotations

• None

• Expression

• Some tools use model-checking to implement lockset, which does not allow much expression.

• Others allow us to find actual race conditions!

• Scalability

• A Question Mark: Is the state space small enough?

• Previous tools using partial order reduction have been used on large software, but not for race detection.


Model-Checking for Race Conditions

• Soundness?

• Yes, model-checking in this manner is sound, as long as it terminates.

• Precision?

• Depends on how your model is used.

• In one model lockset analysis is used. Tends to be imprecise.

• Another model directly searches for “racy” states, which makes it very precise, but it doesn't yet work in the presence of aliasing.


Good ol' Flow-Based Analysis

• Has been approached in a few ways

• Engineering Approach

• Sacrifice Soundness

• Increase Precision as Much as Possible

• Rank Results

• Use Heuristics and Good Judgement

• Think of PREfix or Coverity

• Rely on Alias Analysis

• Rely on Programmer Annotations


Good ol' Flow-Based Analysis

• Engineering Approach:

• Start with interprocedural lockset analysis

• Make simple improvements:

• “use statistical analysis to compute the probability that s ... similar to known locks.”

• “realize that the first, last or only shared data in a critical section are special.”

• “if the number of distinct entry locksets in a function exceeds a fixed limit we skip the function”

• (Engler ’03)
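To show the flavor of a static, interprocedural lockset analysis with a cutoff heuristic, here is a toy Python sketch. It is not the algorithm from the cited paper: the straight-line program representation, the function name `analyze`, and the parameter `max_entry_locksets` are all assumptions made for illustration, and branches, aliasing and ranking are left out entirely.

```python
def analyze(program, entry, max_entry_locksets=4):
    """Return (unprotected variables, skipped functions) for a toy program.

    program maps a function name to a list of (op, target) statements with
    op in {"lock", "unlock", "access", "call"}.
    """
    accesses = {}      # variable -> list of locksets held at each access
    summaries = {}     # (function, entry lockset) -> exit lockset
    per_function = {}  # function -> number of distinct entry locksets analyzed
    skipped = set()

    def visit(func, held):
        key = (func, held)
        if key in summaries:
            return summaries[key]      # reuse the cached summary
        count = per_function.get(func, 0)
        if count >= max_entry_locksets:
            skipped.add(func)          # heuristic cutoff: give up on this function
            return held
        per_function[func] = count + 1
        summaries[key] = held          # provisional value guards against recursion
        current = held
        for op, target in program[func]:
            if op == "lock":
                current = current | {target}
            elif op == "unlock":
                current = current - {target}
            elif op == "access":
                accesses.setdefault(target, []).append(current)
            elif op == "call":
                current = visit(target, current)
        summaries[key] = current
        return current

    visit(entry, frozenset())
    unprotected = sorted(
        var for var, sets in accesses.items()
        if not frozenset.intersection(*sets)
    )
    return unprotected, sorted(skipped)

program = {
    "main": [("call", "worker"), ("lock", "m"), ("call", "worker"), ("unlock", "m")],
    "worker": [("access", "count"), ("lock", "m2"), ("access", "total"), ("unlock", "m2")],
}
print(analyze(program, "main"))                        # (['count'], [])
print(analyze(program, "main", max_entry_locksets=1))  # (['count'], ['worker'])
```

With the limit set to 1, the second call to `worker` is skipped instead of analyzed, which trades coverage for analysis time in the spirit of the quoted heuristic.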


Many Benefits

• Ease of Use?

• Annotations

• None, or a constant number that gives immediate precision improvements.

• Expression

• Non-lock based idioms are 'hard-coded' by heuristics.

• Scalability

• More than any other.

• Linux, FreeBSD, Commercial OS

• 1.8MLOC in 2-14 minutes


Many Benefits

• Soundness?

• Not sound in a few specific ways.

• Ability to detect some false negatives.

• Precision?

• Fewer false positives than traditional lockset tools.

• ~6 when run on Linux 2.5.

• 10s, 100s, 1000s in other static tools on smaller applications.


Other Flow-Based Tools

• Some Rely on Alias Analysis

• Limited by Current State-of-the-Art

• Still Many False Positives

• May not Scale

• Some Rely on Programmer Annotations to distinguish all the hard cases

• May impose programmer burden


So, Let’s Do a Final Comparison…


Annotations

• Type-Based Systems

• Annotations are a major limiting factor. They can be inferred, but they must be understood by the programmer.

• Dynamic Tools

• Unnecessary

• Model-Checking

• Unnecessary

• Flow-Based Analysis

• Necessary in some form or another


Expression

• Type-Based Systems

• Limited to strict locking discipline.

• Dynamic Tools

• Thanks to the combination of lockset and happens-before, relative freedom.

• Model-Checking

• Can allow great expression (Depends on technology).

• Flow-Based Analysis

• Expression can be traded for soundness or annotations.


Scalability

• Type-Based Systems

• Scalability Limited by Annotations

• Dynamic Tools

• Getting better, but performance is still a major issue (1-3x memory usage, 1.5x CPU usage).

• Model-Checking

• Not extremely scalable. Depends highly on number of processes.

• Flow-Based Analysis

• Has shown the best scalability.


Soundness

• Type-Based Systems

• Sound

• Dynamic Tools

• Fundamentally unsound; but lockset will catch most possible races in execution.

• Model-Checking

• Also sound. May not terminate.

• Flow-Based Analysis

• Different techniques trade soundness for precision.


Precision

• Type-Based Systems

• Low precision. Strict MLD.

• Dynamic Tools

• Better precision.

• Model-Checking

• Can be very high. Not complete (undecidability of reachability).

• Flow-Based Analysis

• High precision using an engineering approach.


Questions
