David Evans http://www.cs.virginia.edu/evans
These slides: http://www.cs.virginia.edu/evans/talks/cs340-s04.ppt
1 March 2004 Static Analysis 2
val·i·date
1. To declare or make legally valid.
2. To mark with an indication of official sanction.
3. To establish the soundness of; corroborate.
Can we do any of these with software?
1 March 2004 Static Analysis 3
5. LIMITATION OF LIABILITY.
TO THE EXTENT
NOT PROHIBITED BY LAW, IN NO EVENT WILL
SUN OR ITS LICENSORS BE LIABLE FOR ANY
LOST REVENUE, PROFIT OR DATA, OR FOR
SPECIAL, INDIRECT, CONSEQUENTIAL,
INCIDENTAL OR PUNITIVE DAMAGES,
HOWEVER CAUSED REGARDLESS OF THE
THEORY OF LIABILITY, ARISING OUT OF OR
RELATED TO THE USE OF OR INABILITY TO USE
SOFTWARE, EVEN IF SUN HAS BEEN ADVISED
OF THE POSSIBILITY OF SUCH DAMAGES.
…
1 March 2004 Static Analysis 4
2. RESTRICTIONS.
… Unless enforcement is prohibited by applicable law, you may not modify, decompile, or reverse engineer
Software.
You acknowledge that Software is not designed, licensed or intended for use in the design, construction, operation or maintenance of any nuclear facility.
Sun disclaims any express or implied warranty of fitness for such uses.
1 March 2004 Static Analysis 5
• Process designed to increase our confidence that a program works as intended
• For complex programs, cannot often make guarantees
• This is why typical software licenses don’t make any claims about their program working
1 March 2004 Static Analysis 6
• Testing
– Run the program on set of inputs and check the results
• Verification
– Argue formally or informally that the program always works as intended
• Analysis
– Poor programmer’s verification: examine the source code to increase confidence that it works as intended
1 March 2004 Static Analysis 7
• If all the test cases produce the correct results, you know that a particular execution of the program on each of the test cases produced the correct result
• Concluding that this means the program is correct is like concluding there are no fish in the river because you didn’t catch one!
• What makes a good test case?
1 March 2004 Static Analysis 8
• Make claims about all possible paths by examining the program code directly, not executing it
• Use formal semantics of programming language to know what things mean
• Use formal specifications of procedures to know that they do
1 March 2004 Static Analysis 9
• Does what the customer wants
• Does what the programmer intends
• Doesn’t do anything dangerous
• Always eventually halts
• Never dereferences null
• Always opens a file before writing to it
• Never prints “3”
1 March 2004 Static Analysis 10
It is impossible to correctly decide if any interesting property is true for an arbitrary program!
11 1 March 2004 Static Analysis
• Can we write a program that takes any program as input and returns true if that program always halts, and returns false if it sometimes doesn’t halt.
1 March 2004 bool alwaysHalts (Program p) {
… // returns true iff p will halt
}
Static Analysis 12
• Suppose we could write alwaysHalts
.
• Proof by contradiction: bool contradictHalts () { if (alwaysHalts (contradictHalts)) { while (true) ; // loop forever
} else { return false;
}
}
What is alwaysHalts (contradictHalts)
?
1 March 2004 Static Analysis 13
• But this means, we can’t write a program that decides any other interesting property either: bool dereferencesNull (Program p)
// EFFECTS: Returns true if p ever dereferences null,
// false otherwise.
bool alwaysHalts (Program p) { return (derefencesNull (new Program (“p (); *NULL;”)));
}
1 March 2004 Static Analysis 14
1 March 2004 Static Analysis 15
• Only work for some programs
• Accept unsoundness and incompleteness
• False positives : sometimes an analysis tool will report warnings for a program, when the program is actually okay (incompleteness – can’t prove a property that is true)
• False negatives : sometimes an analysis tool will report no warnings for a program, even when the program violates properties it checks
(unsoundness – proves a property that is not true)
1 March 2004 Static Analysis 16
• Generic Properties
– Dangerous Code
• C: memory leaks, dereferencing null, type mismatches, undefined behavior, etc.
• Concurrency: race conditions, deadlocks
– Don’t need a specification (but it may help across procedure boundaries)
• Application-Specific Properties
– Need some way of describing the properties we want
1 March 2004 Static Analysis 17
Annotation-assisted lightweight analysis tool for C
1 March 2004 Static Analysis 18
all
A Gross Oversimplification
Formal Verifiers
Compilers none
Low
1 March 2004
Effort Required
Static Analysis
Unfathomable
19
• Easy to Understand
• Easy to Use
• Quickly Detect Many Programming
Errors
• Useful Documentation
• …even though they are lots of work!
– 1/4 of text of typical C program is for types
1 March 2004 Static Analysis 20
Type of reference never changes
Language defines checking rules
One type per reference
State changes along program paths
System or programmer defines checking rules
Many attributes per reference
1 March 2004 Static Analysis 21
Type of reference never changes
Language defines checking rules
One type per reference
State changes along program paths
System or programmer defines checking rules
Many attributes per reference
1 March 2004 Static Analysis 22
• Programmers add annotations (formal specifications)
– Simple and precise
– Describe programmers intent:
• Types, memory management, data hiding, aliasing, modification, null-ity, buffer sizes, security, etc.
• Splint detects inconsistencies between annotations and code
– Simple (fast!) dataflow analyses
1 March 2004 Static Analysis 23
only extern only char *gptr; extern only out null void *malloc (int);
• Reference (return value) owns storage
• No other persistent (non-local) references to it
• Implies obligation to transfer ownership
• Transfer ownership by:
– Assigning it to an external only reference
– Return it as an only result
– Pass it as an only parameter: e.g., extern void free ( only void *);
1 March 2004 Static Analysis 24
extern only null void *malloc (int); in library
1 int dummy (void) {
2 int *ip= (int *) malloc (sizeof (int));
3 *ip = 3;
4 return *ip;
5 }
Splint output: dummy.c:3:4: Dereference of possibly null pointer ip: *ip dummy.c:2:13: Storage ip may become null dummy.c:4:14: Fresh storage ip not released before return dummy.c:2:43: Fresh storage ip allocated
1 March 2004 Static Analysis 25
Other
16%
Buffer
Overflows
19%
Format
Bugs
6%
M alformed
Input
16%
Resource
Leaks
6%
Access
Pathnames
10%
16%
Symbolic 190 Vulnerabilities
Links
11%
Reported flaws in Common Vulnerabilities and
Exposures Database, Jan-Sep 2001.
[Evans & Larochelle, IEEE Software, Jan 2002.]
Only 4 having to do with crypto
108 of them could have been detected with simple static analyses!
1 March 2004 Static Analysis 26
• Most commonly exploited security vulnerability
– 1988 Internet Worm
– Still the most common attack
• Code Red exploited buffer overflow in IIS
• >50% of CERT advisories, 23% of CVE entries in 2001
• Attributes describe sizes of allocated buffers
• Heuristics for analyzing loops
• Found several known and unknown buffer overflow vulnerabilities in wu-ftpd
1 March 2004 Static Analysis 27
• ESC/Java (DEC/Compaq/HP SRC)
– Annotations describe invariants
– Detects possible RunTime execptions
• Coverity SWAT (Coverity, Inc., was Stanford research project)
– Statistical analysis of large code bases to find anomalies
– Has found thousands of flaws in Linux
• PREfix (Microsoft)
– No user annotations, but models library functions
– Used on Windows, Office, etc. code base
• Heuristics to prioritize thousands of warnings
1 March 2004 Static Analysis 28
• Redundancy is good for dependability
• Static analysis tools can check redundant information is consistent
• Any useful property is impossible to decide soundly and completely (but, lots of useful checking can still be done)
• For more on Splint: www.splint.org
http://www.cs.virginia.edu/evans/talks/cs340-s04.ppt
Static Analysis 29 1 March 2004