Static Analysis David Evans CS551/651: Dependable Computing University of Virginia

Static Analysis
CS551/651: Dependable Computing
University of Virginia
Computer Science
David Evans
http://www.cs.virginia.edu/evans
Menu
• Validation
• Why Static Analysis is Impossible
• Why we do it anyway
• Static Analysis Tools
25 September 2003
Dependable Computing Fall 2003
2
How do you decide if a system
is dependable?
Validation
Dictionary Definition
val·i·date
1. To declare or make legally valid.
2. To mark with an indication of official
sanction.
3. To establish the soundness of;
corroborate.
Can we do any of these with software?
Sun’s Java License
5. LIMITATION OF LIABILITY. TO THE EXTENT
NOT PROHIBITED BY LAW, IN NO EVENT WILL
SUN OR ITS LICENSORS BE LIABLE FOR ANY
LOST REVENUE, PROFIT OR DATA, OR FOR
SPECIAL, INDIRECT, CONSEQUENTIAL,
INCIDENTAL OR PUNITIVE DAMAGES,
HOWEVER CAUSED REGARDLESS OF THE
THEORY OF LIABILITY, ARISING OUT OF OR
RELATED TO THE USE OF OR INABILITY TO USE
SOFTWARE, EVEN IF SUN HAS BEEN ADVISED
OF THE POSSIBILITY OF SUCH DAMAGES. …
Java’s License
2. RESTRICTIONS. … Unless enforcement is
prohibited by applicable law, you may not
modify, decompile, or reverse engineer
Software. You acknowledge that Software
is not designed, licensed or intended for
use in the design, construction, operation
or maintenance of any nuclear
facility. Sun disclaims any express or
implied warranty of fitness for such uses.
Software Validation
• Process designed to increase our
confidence that a program works as
intended
• For complex programs, we often cannot make
guarantees
• This is why typical software licenses don’t
make any claims about their program
working
Increasing Confidence
• Testing
– Run the program on a set of inputs and check
the results
• Verification
– Argue formally or informally that the program
always works as intended
• Analysis
– Poor programmer’s verification: examine the
source code to increase confidence that it
works as intended
Testing
• If all the test cases produce the correct
results, you know that a particular
execution of the program on each of the
test cases produced the correct result
• Concluding that this means the program is
correct is like concluding there are no fish
in the river because you didn’t catch one!
• What makes a good test case?
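A minimal sketch of the fishing analogy in C. The function is_leap_naive and the chosen years are hypothetical illustrations, not part of the lecture: every test in run_tests passes, yet the function is still wrong for an input the tests never tried.

```c
#include <assert.h>

/* Hypothetical example: a leap-year check that ignores the century rule. */
int is_leap_naive(int year) {
    return year % 4 == 0;
}

/* A plausible-looking test suite; every assertion passes. */
void run_tests(void) {
    assert(is_leap_naive(2004) == 1);
    assert(is_leap_naive(2003) == 0);
    assert(is_leap_naive(2000) == 1);
}
```

The year 1900 was not a leap year, but is_leap_naive(1900) returns 1. The passing tests only show that three particular executions were correct.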
Analysis
• Make claims about all possible paths by
examining the program code directly, not
executing it
• Use formal semantics of programming
language to know what things mean
• Use formal specifications of procedures to
know what they do
Example Software Properties
• Does what the customer wants
• Does what the programmer intends
• Doesn’t do anything dangerous
• Always eventually halts
• Never dereferences null
• Always opens a file before writing to it
• Never prints “3”
Hopelessness of Analysis
It is impossible to correctly decide if any
interesting property is true for an arbitrary
program!
Halting Problem
• Can we write a program that takes any
program as input and returns true if that
program always halts, and returns false if
it sometimes doesn’t halt?
bool alwaysHalts (Program p) {
… // returns true iff p will halt
}
Informal Proof
• Suppose we could write alwaysHalts.
• Proof by contradiction:
bool contradictHalts () {
if (alwaysHalts (contradictHalts)) {
while (true) ; // loop forever
} else {
return false;
}
}
What is alwaysHalts (contradictHalts) ?
Hopelessness of Analysis
• But this means, we can’t write a program
that decides any other interesting property
either:
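bool dereferencesNull (Program p)
// EFFECTS: Returns true if p ever dereferences null,
//          false otherwise.
bool alwaysHalts (Program p) {
  // Run p, then dereference null: the dereference is reached only if p halts
  // (assuming p does not itself dereference null).
  return (dereferencesNull (new Program ("p (); *NULL;")));
}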
Give Up?
Compromises
• Only work for some programs
• Accept unsoundness and incompleteness
• False positives: sometimes an analysis tool will
report warnings for a program, when the
program is actually okay (incompleteness – can’t
prove a property that is true)
• False negatives: sometimes an analysis tool
will report no warnings for a program, even
when the program violates properties it checks
(unsoundness – proves a property that is not
true)
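A small C sketch of how a false positive can arise. The function pick and its values are hypothetical. The code is safe, because p is dereferenced only when use_x is nonzero and in that case p was assigned. A checker that merges the two paths after the first if, and does not track the link between use_x and p, may still report a possible null dereference.

```c
#include <stddef.h>

/* Hypothetical example: safe code that a path-insensitive checker may flag. */
int pick(int use_x) {
    int x = 7;
    int *p = NULL;

    if (use_x) {
        p = &x;          /* p is non-null on this path only */
    }

    /* After the join, a simple analysis knows only "p may be null". */

    if (use_x) {
        return *p;       /* safe: use_x nonzero implies p == &x */
    }
    return -1;
}
```

Proving this code safe requires correlating the two tests of use_x, which costs more analysis effort. Reporting a warning here is the incomplete but cheap choice.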
Properties to Analyze
• Generic Properties
– Dangerous Code
• C: memory leaks, dereferencing null, type
mismatches, undefined behavior, etc.
• Concurrency: race conditions, deadlocks
– Don’t need a specification (but it may help
across procedure boundaries)
• Application-Specific Properties
– Need some way of describing the properties
we want
Splint
Annotation-assisted lightweight analysis tool for C
A Gross Oversimplification
[Figure: plot of Bugs Detected (none to all) against Effort Required (Low to Unfathomable), with Compilers and Formal Verifiers marked.]
(Almost) Everyone Likes Types
• Easy to Understand
• Easy to Use
• Quickly Detect Many Programming
Errors
• Useful Documentation
• …even though they are lots of work!
– 1/4 of text of typical C program is for types
Limitations of Standard Types
Standard types                     Attributes
Type of reference never changes    State changes along program paths
Language defines checking rules    System or programmer defines checking rules
One type per reference             Many attributes per reference
Approach
• Programmers add annotations (formal
specifications)
– Simple and precise
– Describe the programmer’s intent:
• Types, memory management, data hiding,
aliasing, modification, null-ity, buffer sizes,
security, etc.
• Splint detects inconsistencies between
annotations and code
– Simple (fast!) dataflow analyses
Sample Annotation: only
extern only char *gptr;
extern only out null void *malloc (int);
• Reference (return value) owns storage
• No other persistent (non-local) references to it
• Implies obligation to transfer ownership
• Transfer ownership by:
– Assigning it to an external only reference
– Returning it as an only result
– Passing it as an only parameter: e.g.,
extern void free (only void *);
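A minimal sketch of ownership transfer in C. The functions copy_string and discard are hypothetical. The annotations use Splint's stylized-comment form (/*@only@*/, /*@null@*/), so an ordinary C compiler ignores them. This shows how such code might be annotated and is not checked Splint output.

```c
#include <stdlib.h>
#include <string.h>

/* Returns fresh storage owned by the caller; the result may be NULL. */
/*@only@*/ /*@null@*/ char *copy_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;          /* ownership transfers to the caller */
}

/* Takes ownership of name and meets the obligation by releasing it. */
void discard(/*@only@*/ /*@null@*/ char *name) {
    free(name);           /* ownership transfers to free */
}
```

A caller that receives the result of copy_string must eventually pass it to discard (or to free), or hand it to another only reference.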
Example
extern only null void *malloc (int);   /* in library */
1 int dummy (void) {
2   int *ip = (int *) malloc (sizeof (int));
3   *ip = 3;
4   return *ip;
5 }
Splint output:
dummy.c:3:4: Dereference of possibly null pointer ip: *ip
dummy.c:2:13: Storage ip may become null
dummy.c:4:14: Fresh storage ip not released before return
dummy.c:2:43: Fresh storage ip allocated
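One way to address both warnings, sketched in C. The name dummy_fixed and the -1 error value are assumptions for illustration: the pointer is tested before use, and the storage is released before returning.

```c
#include <stdlib.h>

/* Hypothetical repaired version of dummy. */
int dummy_fixed(void) {
    int *ip = (int *) malloc(sizeof(int));
    int result;

    if (ip == NULL) {
        return -1;        /* assumed error value: allocation failed */
    }
    *ip = 3;              /* safe: ip is known to be non-null here */
    result = *ip;
    free(ip);             /* ownership obligation met before return */
    return result;
}
```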
Security Flaws
• Other: 16%
• Malformed Input: 16%
• Buffer Overflows: 19%
• Format Bugs: 6%
• Resource Leaks: 6%
• Pathnames: 10%
• Access: 16%
• Symbolic Links: 11%
190 Vulnerabilities
Only 4 having to do with crypto
108 of them could have been detected with simple static analyses!
Reported flaws in Common Vulnerabilities and Exposures Database, Jan-Sep 2001.
[Evans & Larochelle, IEEE Software, Jan 2002.]
Example: Buffer Overflows
David Larochelle
• Most commonly exploited security
vulnerability
– 1988 Internet Worm
– Still the most common attack
• Code Red exploited buffer overflow in IIS
• >50% of CERT advisories, 23% of CVE entries in 2001
• Attributes describe sizes of allocated buffers
• Heuristics for analyzing loops
• Found several known and unknown buffer
overflow vulnerabilities in wu-ftpd
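To make the idea of a buffer-size attribute concrete, here is a hedged C sketch. The function copy_bounded is hypothetical. It carries the buffer size as an explicit run-time parameter, which is the same information a static checker would track as an attribute of dst.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst, which has room for dst_size bytes including the
   terminator. Returns 0 on success, or -1 if src does not fit, in which
   case dst is left as an empty string when dst_size > 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);

    if (dst_size == 0) {
        return -1;
    }
    if (len >= dst_size) {
        dst[0] = '\0';
        return -1;        /* an unchecked strcpy would overflow here */
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

An unchecked strcpy (dst, src) makes the same copy without the test on len. The analysis aims to report exactly that missing check.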
Defining Properties to Check
• Many properties can be described in terms
of state attributes
– A file is open or closed
• fopen: returns an open file
• fclose: open → closed
• fgets, etc. require open files
– Reading/writing – must reset between certain
operations
Defining Openness
attribute openness
  context reference FILE *
  oneof closed, open
  annotations
    open ==> open closed ==> closed
  transfers
    open as closed ==> error
    closed as open ==> error
  merge
    open + closed ==> error
  losereference
    open ==> error "file not closed"
  defaults
    reference ==> open
end
• merge: an object cannot be open on one path and closed on another
• losereference: cannot abandon a FILE in the open state
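The merge and losereference rules above can be read as small functions over abstract states. The C sketch below is an illustration only, not Splint's implementation. The enum values and function names are hypothetical, and the rule that equal states merge to themselves is an assumption.

```c
/* Abstract states for the openness attribute, plus an error marker. */
typedef enum { ST_CLOSED, ST_OPEN, ST_ERROR } openness;

/* merge open + closed ==> error; equal states are assumed unchanged. */
openness merge_openness(openness a, openness b) {
    if (a == ST_ERROR || b == ST_ERROR) {
        return ST_ERROR;
    }
    return (a == b) ? a : ST_ERROR;
}

/* losereference: returns nonzero when abandoning a reference in state s
   should be reported ("file not closed"). */
int lose_reference_is_error(openness s) {
    return s == ST_OPEN;
}
```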
Specifying I/O Functions
/*@open@*/ FILE *fopen
(const char *filename,
const char *mode);
int fclose (/*@open@*/ FILE *stream)
/*@ensures closed stream@*/ ;
char *fgets (char *s, int n,
/*@open@*/ FILE *stream);
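A hedged C sketch of client code that respects these specifications. The function count_lines is hypothetical. The stream is passed to fgets only while open, and it is closed on the one path that opened it.

```c
#include <stdio.h>
#include <string.h>

/* Returns the number of newline-terminated lines in the file at path,
   or -1 if the file cannot be opened. */
long count_lines(const char *path) {
    char buf[256];
    long lines = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL) {
        return -1;        /* nothing was opened, so nothing to close */
    }
    while (fgets(buf, (int) sizeof buf, f) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            lines++;      /* count only chunks that end a line */
        }
    }
    fclose(f);            /* open -> closed before the reference is lost */
    return lines;
}
```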
Reading, ‘Riting, ‘Rithmetic
attribute rwness
context reference FILE *
oneof rwnone, rwread, rwwrite, rweither
annotations
read ==> rwread write ==> rwwrite
rweither ==> rweither rwnone ==> rwnone
merge
rwread + rwwrite ==> rwnone rwnone + * ==> rwnone
rweither + rwread ==> rwread rweither + rwwrite ==> rwwrite
transfers
rwread as rwwrite ==> error "Must reset file between read and write."
rwwrite as rwread ==> error "Must reset file between write and read."
rwnone as rwread ==> error "File in unreadable state."
rwnone as rwwrite ==> error "File in unwritable state."
rweither as rwwrite ==> rwwrite
rweither as rwread ==> rwread
defaults
reference ==> rweither
end
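The merge table above can be written as a C function over four abstract states. This is a sketch for illustration, not Splint's implementation. The enum and function names are hypothetical, and it assumes that merging is symmetric and that equal states merge to themselves.

```c
/* Abstract states for the rwness attribute. */
typedef enum { RW_NONE, RW_READ, RW_WRITE, RW_EITHER } rwness;

/* Combines the states reaching a join point from two paths. */
rwness merge_rwness(rwness a, rwness b) {
    if (a == b) {
        return a;         /* assumption: equal states are unchanged */
    }
    if (a == RW_NONE || b == RW_NONE) {
        return RW_NONE;   /* rwnone + * ==> rwnone */
    }
    if (a == RW_EITHER) {
        return b;         /* rweither + rwread/rwwrite ==> the other state */
    }
    if (b == RW_EITHER) {
        return a;
    }
    return RW_NONE;       /* remaining case: rwread + rwwrite ==> rwnone */
}
```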
Reading, ‘Righting
/*@rweither@*/ FILE *fopen
(const char *filename, const char *mode) ;
int fgetc (/*@read@*/ FILE *f) ;
int fputc (int, /*@write@*/ FILE *f) ;
/* fseek resets the rw state of a stream */
int fseek (/*@rweither@*/ FILE *stream,
long int offset, int whence)
/*@ensures rweither stream@*/ ;
Checking
• Simple dataflow analysis
• Intraprocedural – except uses annotations
to alter state around procedure calls
• Integrates with other Splint analyses (e.g.,
nullness, aliases, ownership, etc.)
Example
FILE *f = fopen (fname, "rw");   /* f:openness = open, f:rwness = rweither */
int i = fgetc (f);               /* Possibly null reference f passed where non-null expected */
if (i != EOF) {                  /* f:openness = open, f:rwness = rwread */
  fputc (i, f);                  /* Attribute mismatch – passed read where write FILE * expected. */
  fclose (f);                    /* f:openness = closed, f:rwness = rwnone */
}
/* Branches join in incompatible states: f is closed on true branch, open on false branch */
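A hedged sketch of how the example might be repaired so that none of the three problems remain. The function name echo_first_char and the use of mode "r+" are assumptions, not the lecture's code. The sketch tests f for null, repositions the stream between the read and the write, and closes it on every path.

```c
#include <stdio.h>

/* Reads the first byte of the file and writes a copy of it as the
   second byte. Returns 0 on success, -1 on any failure. */
int echo_first_char(const char *fname) {
    int i;
    int status = -1;
    FILE *f = fopen(fname, "r+");   /* assumed mode: existing file, update */

    if (f == NULL) {
        return -1;                  /* null case handled before any use */
    }
    i = fgetc(f);
    if (i != EOF) {
        /* reset the stream between the read and the write */
        if (fseek(f, 0L, SEEK_CUR) == 0 && fputc(i, f) != EOF) {
            status = 0;
        }
    }
    if (fclose(f) != 0) {           /* closed on both branches */
        status = -1;
    }
    return status;
}
```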
Other Static Analysis Tools
• PREfix (Microsoft)
– C/C++ defect detection, no user annotations
(models of library functions)
– Runs on Windows, Office, etc. code base
• Thousands of warnings, prioritize those most likely
to be interesting
• ESC/Java (Compaq SRC)
– Annotations describe invariants
– Warnings where Java programs could raise
runtime exceptions, concurrency issues
Summary
• Redundancy is good for dependability
• Static analysis tools can check redundant
information is consistent
• Any useful property is impossible to decide
soundly and completely (but, lots of useful
checking can still be done)
• For more on Splint: www.splint.org