Software Model Checking for Confidentiality Rajeev Alur Joint work with Pavol Cerny

Software Model Checking
for Confidentiality
Rajeev Alur
University of Pennsylvania
Joint work with Pavol Cerny
Confidentiality
“Data Leaks Abound And No One Is Safe” (Feb 9th)
“Indian Foreign Ministry hit by spyware” (Feb 15th)
“Cell Phones a Much Bigger Privacy Risk Than Facebook” (Feb 20th)
[Figure labels: download programs, online store, banking, health records, email]
Confidentiality

• How do data leaks happen?
“Unauthorized application use: … the use of unauthorized programs resulted in as many as half of their companies' data loss incidents.” (“Data leakage worldwide, …”, Cisco, 2008)
• Focus of our case study: J2ME midlets for mobile devices
• Spyware can be bought (flexispy.com,..)
“A malicious signed application could read all the PIM data and send it to an attacker using the variety of transport mechanisms outlined in this document.” (Symantec, 2007)
J2ME midlets
EventSharingMidlet:

    void sendEvent(…) {
        …
        // Accesses the phone's native data
        contactList = (ContactList)
            PIM.getInstance().openPIMList(
                PIM.CONTACT_LIST, PIM.READ_ONLY,
                listname);
        …
        // Sends something
        conn.send(message);
        …
    }

How do we know that information does not leak?
How can information be leaked?
    public void sendEvent(…) {
        …
        doUsefulWork();
        low = 0;
        ...
        if (phoneBook.contains("555-55")) {
            conn.send(secret_message);
            low = 1;
        }
        conn.send(low);
    }

Model: The attacker
a) knows the program
b) observes all external communication

Information leaked due to malicious (or buggy) code.
Confidentiality is not a property of a single trace.
Checking Confidentiality
createEvent Midlet:

    // get the phone number
    number = phoneBook.elementAt(selected);
    // test if the number is valid
    if ((number == null) || number.equals("")) {
        // output error
    } else {
        String message = inputMessage();
        // send a message to the receiver
        sendMessage(number, message);
    }

• Taint analysis: too strict
• Language-based approaches would require annotations for downgrading
Software Model Checking
[Diagram: Program P (source code) and Specification φ → Abstraction → Software model checker → Yes / No (counterexample)]
Example specifications:
• Is every acquired lock eventually released?
• Is the system deadlock free?
Successful and widely used, e.g. SLAM → SDV.

Not applicable to specifying and verifying confidentiality:
1. Confidentiality is not a property of a single execution (thus not specifiable in LTL, and in fact not specifiable in μ-calculus).
2. Both over- and under-approximation are needed.
3. Main strength of software model checking: finding bugs in control-oriented programs.
Goal
What we need:
• Specification framework
• Analysis method
[Diagram: Specification and program → Confidentiality analysis tool → Yes / No]
Reachability
                           Reachability
Temporal specifications    LTL, CTL, μ-calculus
Finite-state systems       NL-complete
Programs (Java methods)    Undecidable. Over-approximation for sound analysis (of unreachability)
Talk Overview
                           Reachability                                      “Confidentiality” ??
Temporal specifications    LTL, CTL, μ-calculus                              ??
Finite-state systems       NL-complete                                       ??
Programs (Java methods)    Undecidable. Over-approximation for sound         ??
                           analysis (of unreachability)
Defining Confidentiality

• Secret: Property to be kept confidential; typically a predicate over state variables
• Observation h of an execution:
  – What can the attacker observe?
  – Two executions with the same observation are equivalent
  – Examples: outputs; sequence of messages sent
  – More generally, each state is labeled with observable propositions, and the observation of an execution is the sequence of observable propositions of its states
• Executions of interest are specified by a condition cond:
  – Terminating executions
  – Executions where the input satisfies some constraint
Conditional Confidentiality
Given a notion of observation, a property secret, and a
condition cond of interesting executions, a program P satisfies
conditional confidentiality iff
For every execution r satisfying cond, there exists an execution
r’ such that
1. r and r’ have the same observation
2. r and r’ differ on the value of secret
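The definition above can also be written as a single formula. This is only a sketch: the names Exec(P) (the set of executions of P) and obs (the observation function) are notation assumed here for illustration and do not appear on the slide.

\[
\forall r \in \mathit{Exec}(P):\ \mathit{cond}(r) \Rightarrow \exists r' \in \mathit{Exec}(P):\ \mathit{obs}(r) = \mathit{obs}(r') \,\wedge\, \mathit{secret}(r) \neq \mathit{secret}(r')
\]

Note that, as in the definition, the witness r’ is not itself required to satisfy cond.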
Temporal Logics for Confidentiality

• Motivation: In multi-agent systems and for protocols, how to specify requirements concerning the order in which secrets are revealed
• Classical model of systems/programs: Trees
• Existing branching-time logics are not adequate
  – Thm: Confidentiality cannot be expressed in μ-calculus
  – Cannot capture “equivalence” of executions
Labeled Trees
[Figure: tree whose nodes are labeled with valuations of the propositions p and q]
Agent a observes proposition p, b observes q
Labeled Trees with Equivalence Edges
[Figure: tree whose nodes are labeled with valuations of p and q, with equivalence edges labeled a or b between nodes]
Agent a observes proposition p, b observes q
a-labeled edge between nodes: a considers them equivalent
The logic CTL≈
Syntax:
    f ::= p | ¬f | f1 or f2 | EX f | f1 EU f2 | EG f | EIa f
EIa f: f holds in some world considered plausible by a
[Figure: EX f and EIa g illustrated on a tree with a-labeled equivalence edges]
• Confidentiality: AG (EIa α and EIa ¬α)
• Agent a does not reveal x before agent b reveals y:
    A (EIa x and EIa ¬x) U (AIb y or AIb ¬y)
Analogous extension of μ-calculus: μ≈
Model Checking
Does a finite-state system satisfy a temporal logic formula?
• Nesting-free fragments
  – CTL≈: PSPACE-complete
  – μ≈-calculus: EXPTIME-complete
• In general: nonelementary (resp. undecidable)
Good news: Typical confidentiality properties are captured in the nesting-free fragments
Talk Overview
                           Reachability                                      Conditional Confidentiality
Temporal logics            CTL, μ-calculus                                   CTL≈, μ≈-calculus
Finite-state systems       NL-complete                                       PSPACE-complete
Programs (Java methods)    Undecidable. Over-approximation for sound         ??
                           analysis (of unreachability)
Confidentiality for programs
• secret: Does A contain 7?
• Observer sees the value of res
• cond: key is not 7

    res = -1;
    i = 0;
    while (i < n) {
        if (A[i] == key) {
            res = A[i];
        }
        i++;
    }
    send res;

For all observations h, if h is valid (consistent with the condition cond), then h leads to a state where secret holds, and h leads to a state where the secret does not hold.

Example: suppose the observer sees 3 (that is, res = 3):
• There exists a state: A = [7,3]; key = 3 (observation valid)
• There exists a state: A = [7,3]; key = 3 (secret holds)
• There exists a state: A = [1,3]; key = 3 (secret does not hold)
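The definition can be tried out by brute force on a tiny instance. The following is a minimal sketch, not the analysis used in the talk: it assumes arrays of length 2 over the hypothetical value domain {1, 3, 7}, and the class and method names are introduced here for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class ConfidentialityBruteForce {
    // Hypothetical small value domain used for both array elements and keys.
    static final int[] DOMAIN = {1, 3, 7};

    // The program from the slide; the returned value is what the observer sees.
    static int find(int[] a, int key) {
        int res = -1;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) {
                res = a[i];
            }
        }
        return res;
    }

    // secret: does A contain 7?
    static boolean secret(int[] a) {
        for (int x : a) {
            if (x == 7) {
                return true;
            }
        }
        return false;
    }

    // All arrays of length 2 over DOMAIN.
    static List<int[]> allArrays() {
        List<int[]> out = new ArrayList<>();
        for (int x : DOMAIN) {
            for (int y : DOMAIN) {
                out.add(new int[] {x, y});
            }
        }
        return out;
    }

    // Checks: every observation h produced by an execution satisfying cond
    // is also produced by some state where the secret holds and by some
    // state where it does not. With requireKeyNot7 == false, cond is "true".
    static boolean confidential(boolean requireKeyNot7) {
        List<int[]> arrays = allArrays();
        for (int[] a : arrays) {
            for (int key : DOMAIN) {
                if (requireKeyNot7 && key == 7) {
                    continue; // execution does not satisfy cond
                }
                int h = find(a, key);
                boolean sawSecret = false;
                boolean sawNoSecret = false;
                for (int[] b : arrays) {
                    for (int k : DOMAIN) {
                        if (find(b, k) == h) {
                            if (secret(b)) {
                                sawSecret = true;
                            } else {
                                sawNoSecret = true;
                            }
                        }
                    }
                }
                if (!(sawSecret && sawNoSecret)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(confidential(true));
        System.out.println(confidential(false));
    }
}
```

On this instance the check succeeds with cond (key is not 7) and fails without it: once key may be 7, the observation res = 7 can only come from an array that contains 7, so the observer learns the secret.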
Confidentiality for programs
    res = -1;
    i = 0;
    while (i < n) {
        if (A[i] == key) {
            res = A[i];
        }
        i++;
    }
    send res;

• secret: Does A contain 7?
• Observer sees the value of res.
• cond: key is not 7.

R: the set of reachable states

Confidentiality:
For all possible observations h,
  if h is valid (consistent with the condition cond),
      i.e. if there exists s: s in R and cond(s) and s[res]=h
  then h leads to a state where secret holds,
      i.e. then there exists s: s in R and secret(s) and s[res]=h
  and h leads to a state where the secret does not hold,
      i.e. and there exists s: s in R and ¬secret(s) and s[res]=h
Over- / under- approximation
Computing reachable states exactly is impractical.
Approximation: R+ (an over-approximation, R ⊆ R+), R- (an under-approximation, R- ⊆ R)
[Figure: nested sets R- ⊆ R ⊆ R+]

Confidentiality:
For all possible observations h,
  if h is valid (consistent with the condition cond),
      i.e. if there exists s: s in R+ and cond(s) and s[res]=h
  then h leads to a state where secret holds,
      i.e. then there exists s: s in R- and secret(s) and s[res]=h
  and h leads to a state where the secret does not hold,
      i.e. and there exists s: s in R- and ¬secret(s) and s[res]=h

Lemma: The approximate formula implies confidentiality.
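A short argument for the lemma, sketched here using only the two inclusions R ⊆ R+ and R- ⊆ R: fix an observation h that is valid with respect to the exact set R. Then

\[
\begin{aligned}
&\exists s \in R:\ cond(s) \wedge s[res] = h \\
\Rightarrow\ &\exists s \in R^{+}:\ cond(s) \wedge s[res] = h && (R \subseteq R^{+}) \\
\Rightarrow\ &\bigl(\exists s \in R^{-}:\ secret(s) \wedge s[res] = h\bigr) \wedge \bigl(\exists s \in R^{-}:\ \neg secret(s) \wedge s[res] = h\bigr) && \text{(approximate formula)} \\
\Rightarrow\ &\bigl(\exists s \in R:\ secret(s) \wedge s[res] = h\bigr) \wedge \bigl(\exists s \in R:\ \neg secret(s) \wedge s[res] = h\bigr) && (R^{-} \subseteq R)
\end{aligned}
\]

Over-approximating the premise makes more observations count as valid, and under-approximating the conclusion makes witnesses harder to find, so both approximations err on the safe side.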
Over- / under- approximation
Computing the over-approximation R+: invariants (user-supplied or computed).
Example:

    res = -1;
    i = 0;
    while (i < n) {
        if (A[i] == key) {
            res = A[i];
        }
        i++;
    }
    send res;

Invariant: (res == key) or (res == -1)
Over- / under- approximation
Computing the under-approximation R-: loop unrolling, bounding the data structure size.

Original program:

    res = -1;
    i = 0;
    while (i < n) {
        if (A[i] == key) {
            res = A[i];
        }
        i++;
    }
    send res;

Program with the loop unrolled twice:

    res = -1;
    i = 0;
    if (i < n) {
        if (A[i] == key) {
            res = A[i];
        }
        i++;
    }
    if (i < n) {
        if (A[i] == key) {
            res = A[i];
        }
        i++;
    }
    assume(i >= n);
    send res;
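The effect of unrolling can be sketched in plain Java. This is an illustration under assumptions, not the tool's implementation: the class and method names are hypothetical, and assume(i >= n) is modeled by returning an empty OptionalInt for executions that the unrolled program discards.

```java
import java.util.OptionalInt;

final class UnrollDemo {
    // Original program: scans the whole array.
    static int find(int[] a, int key) {
        int n = a.length;
        int res = -1;
        int i = 0;
        while (i < n) {
            if (a[i] == key) {
                res = a[i];
            }
            i++;
        }
        return res;
    }

    // Loop unrolled twice. An execution that would need a third iteration
    // violates assume(i >= n) and is dropped (empty result).
    static OptionalInt findUnrolledTwice(int[] a, int key) {
        int n = a.length;
        int res = -1;
        int i = 0;
        if (i < n) {
            if (a[i] == key) {
                res = a[i];
            }
            i++;
        }
        if (i < n) {
            if (a[i] == key) {
                res = a[i];
            }
            i++;
        }
        if (i < n) {
            return OptionalInt.empty(); // assume(i >= n) fails
        }
        return OptionalInt.of(res);
    }

    public static void main(String[] args) {
        int[] shortArray = {7, 3};
        int[] longArray = {1, 2, 3};
        System.out.println(findUnrolledTwice(shortArray, 3));
        System.out.println(findUnrolledTwice(longArray, 3));
        System.out.println(find(longArray, 3));
    }
}
```

Every result the unrolled version produces is also a result of the original program, which is what makes it an under-approximation; executions on arrays longer than the unrolling bound are simply not covered.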
Confidentiality as a logical formula
For all h:
  if there exist pv: inv(pv) and cond(pv) and res=h
  then there exist pv: WP(P’, (secret and res=h))
  and there exist pv: WP(P’, (¬secret and res=h))

where pv are the program variables, inv is the invariant, WP is the weakest precondition, and P’ is the program with unrolled loops.

Confidentiality holds if:

\[
\forall h:\ \bigl(\exists pv:\ inv(pv) \wedge cond(pv) \wedge hist = h\bigr) \Rightarrow
\bigl(\exists pv:\ WP(P', secret(pv) \wedge hist = h)\bigr) \wedge
\bigl(\exists pv:\ WP(P', \neg secret(pv) \wedge hist = h)\bigr)
\]
Deciding validity of confidentiality formula
Problem: Quantifier alternation. The complexity of decision procedures (QBF, Presburger arithmetic) is high, and the tools are not well engineered.
Question: Could we use SMT solvers?
Idea: Restrict the expression language to contain only equality (order).
Rationale: Many programs do not perform arithmetic on the data, only tasks like searching, inserting, deleting, (sorting).

    res = -1;
    i = 0;
    while (i < n) {
        if (A[i] == key) {
            res = A[i];
        }
        i++;
    }
    send res;
Deciding validity of confidentiality formula
Result: If the universal quantifier is over a domain with only equality, we can replace it by checking the formula at a fixed number of specific values.

\[
\forall h:\ \bigl(\exists res, key:\ ((res = -1) \vee (res = key)) \wedge (key \neq 7) \wedge res = h\bigr) \Rightarrow
(\exists pv:\ \varphi_1) \wedge (\exists pv:\ \varphi_2)
\]

    res = -1;
    i = 0;
    while (i < n) {
        if (A[i] == key) {
            res = A[i];
        }
        i++;
    }
    send res;

Values 7, -1, and one other (e.g. 1) need to be checked.
Thus, an SMT solver can be used (checking three formulas per constant).
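The instantiation step can be illustrated with a small sketch. It assumes that the representative values are the constants of the formula plus one value different from all of them, as in the example above, and it only evaluates the premise of the formula. The class and method names are hypothetical, and the actual check in the talk is delegated to an SMT solver.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class RepresentativeValues {
    // Constants occurring in the formula, plus one fresh value that differs
    // from all of them (here the smallest unused non-negative integer).
    static Set<Integer> representatives(Set<Integer> constants) {
        Set<Integer> reps = new LinkedHashSet<>(constants);
        int fresh = 0;
        while (reps.contains(fresh)) {
            fresh++;
        }
        reps.add(fresh);
        return reps;
    }

    // Premise for a fixed h:
    //   exists res, key: (res == -1 || res == key) && key != 7 && res == h
    // res is forced to h; key is searched among the candidate values.
    static boolean premise(int h, Set<Integer> keyCandidates) {
        int res = h;
        for (int key : keyCandidates) {
            if ((res == -1 || res == key) && key != 7) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<Integer> constants = new LinkedHashSet<>();
        constants.add(7);
        constants.add(-1);
        Set<Integer> reps = representatives(constants);
        for (int h : reps) {
            System.out.println("h = " + h + ", premise holds: " + premise(h, reps));
        }
    }
}
```

For h = 7 the premise is false, so the implication holds vacuously; for h = -1 and for the fresh value the premise holds, and the two existential conjuncts are what remains to be checked for those values.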
ConAn (CONfidentiality ANalysis)
Pipeline: Java bytecode → WALA → ConAn → Yices → Valid / Unsat
• WALA: processes bytecode to produce an intermediate representation of SSA instructions organized in a control-flow graph.
• ConAn inputs: Secret, Cond, Invariant, Nunroll, Narray
• Yices: performs SMT solving.
Applications

• Case study: J2ME Java methods
  – third-party programs accessing PIM information (managing contacts, calendars, to-do lists) and sending messages
• Other Java methods:
  – methods from other PIM managing programs (chat clients, calendars..)
  – data structure accessing methods from the Java standard library
Experimental results
#  Project/Class         Method name            # of lines  Unroll  Running time (s)  Result
1  Java.lang/Vector      elementAt              6           1       0.18              valid
2  EventSharing          sendEvent              122         2       1.83              valid
3  EventSharing          sendEvent (bug)        126         2       1.80              unsat
4  -                     find                   9           1       0.31              unsat
5  -                     find                   9           2       0.34              valid
6  Funambol/Contact      getContact             13          2       0.32              valid
7  Blackchat/ICQContact  getContactByReference  23          2       0.24              valid
8  -                     password check         9           2       0.22              valid
Conclusions
Algorithmic, specification-driven analysis is
an effective way of establishing that
programs do not leak confidential
information.