ppt - Microsoft Research

Component-based Synthesis
Sumit Gulwani
(MSR Redmond)
Joint work with:
Susmit Jha and Sanjit Seshia
(UC-Berkeley)
Ashish Tiwari
(SRI)
Ramarathnam Venkatesan
(MSR Bangalore/Redmond)
Component based Synthesis
Problem Definition
Given:
• A library of components where each component
comes with its functional specification
• Functional Specification of desired behavior
Obtain: Appropriate composition of components to
obtain desired behavior.
Applications
Bit-vector Algorithm Synthesis, Deobfuscation
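The problem definition above can be illustrated with a deliberately naive sketch. It assumes a hypothetical two-component library (`dec` and `and`), an 8-bit word, and plain exhaustive enumeration of orderings and wirings; this is a stand-in for illustration only, not the constraint-based search the talk describes.

```python
from itertools import permutations, product

WIDTH = 8                      # assumed bit-vector width
MASK = (1 << WIDTH) - 1

# Hypothetical component library: (name, arity, function).
LIBRARY = [
    ("dec", 1, lambda a: (a - 1) & MASK),
    ("and", 2, lambda a, b: a & b),
]

def spec(x, out):
    """Desired behavior: out equals x with its rightmost 1-bit cleared."""
    for p in range(WIDTH):
        if (x >> p) & 1:
            return out == (x ^ (1 << p))
    return True                # x == 0 has no 1-bit, so the spec is vacuous

def run(program, x):
    """Evaluate a straight-line program; each line may read any earlier value."""
    values = [x]               # value 0 is the program input
    for _name, fn, args in program:
        values.append(fn(*(values[i] for i in args)))
    return values[-1]

def synthesize():
    """Try every ordering and wiring of the components, each used once."""
    for order in permutations(LIBRARY):
        wirings = [list(product(range(k + 1), repeat=arity))
                   for k, (_n, arity, _f) in enumerate(order)]
        for wiring in product(*wirings):
            program = [(n, f, args) for (n, _a, f), args in zip(order, wiring)]
            if all(spec(x, run(program, x)) for x in range(1 << WIDTH)):
                return [(n, args) for n, _f, args in program]
    return None

print(synthesize())
```

The result lists each component with the indices of the values it reads; here the search settles on `dec` applied to the input followed by `and` of the input and the decremented value.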
Application 1: Bit-vector Algorithms
• Straight-line programs that use
– Arithmetic Operators: +,-,*,/
– Logical Operators: Bitwise and/or/not, Shift left/right
• Challenge: Combination of arithmetic + logical operators
leads to unintuitive algorithms
• Application: Provides most-efficient way to accomplish
a given task on a given architecture
Examples of Bitvector Algorithms
Turn off rightmost 1-bit: Y & (Y-1)
  10101100 -> 10101000
  10101100    Y
  10101011    Y-1
  10101000    Y & (Y-1)
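A minimal check of the example above, written in Python for convenience (the function name is illustrative, not from the talk):

```python
def turn_off_rightmost_one(y: int) -> int:
    """Clear the lowest set bit: subtracting 1 flips that bit and all bits below it."""
    return y & (y - 1)

print(format(turn_off_rightmost_one(0b10101100), "08b"))
```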
Examples of Bitvector Algorithms
Turn off rightmost contiguous sequence of 1-bits: Y & (1 + (Y | (Y-1)))
  10101100 -> 10100000
Ceiling of the average of two integers without overflow: (X|Y) - ((X ⊕ Y) >> 1)
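Both formulas can be checked exhaustively on a small word size. The sketch below assumes 8-bit unsigned values and uses Python's `^` for ⊕; the function names are illustrative.

```python
WIDTH = 8                      # assumed word size for the check
MASK = (1 << WIDTH) - 1

def turn_off_rightmost_run(y: int) -> int:
    """Y & (1 + (Y | (Y-1))), truncated to WIDTH bits."""
    return y & (1 + (y | (y - 1))) & MASK

def ceil_average(x: int, y: int) -> int:
    """(X|Y) - ((X xor Y) >> 1); no intermediate value exceeds max(x, y)."""
    return (x | y) - ((x ^ y) >> 1)

print(format(turn_off_rightmost_run(0b10101100), "08b"))
print(ceil_average(255, 254))
```

The second formula avoids overflow because it never forms the sum X+Y; it relies on the identity X + Y = 2(X|Y) - (X ⊕ Y).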
Examples of Bitvector Algorithms
P24: Round up to next highest power of 2
o1 := sub(x,1);
o2 := shr(o1,1);
o3 := or(o1,o2);
o4 := shr(o3,2);
o5 := or(o3,o4);
o6 := shr(o5,4);
o7 := or(o5,o6);
o8 := shr(o7,8);
o9 := or(o7,o8);
o10 := shr(o9,16);
o11 := or(o9,o10);
res := add(o11,1);
P25: Higher order half of product of x and y
o1 := and(x,0xFFFF);
o2 := shr(x,16);
o3 := and(y,0xFFFF);
o4 := shr(y,16);
o5 := mul(o1,o3);
o6 := mul(o2,o3);
o7 := mul(o1,o4);
o8 := mul(o2,o4);
o9 := shr(o5,16);
o10 := add(o6,o9);
o11 := and(o10,0xFFFF);
o12 := shr(o10,16);
o13 := add(o7,o11);
o14 := shr(o13,16);
o15 := add(o14,o12);
res := add(o15,o8);
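The two straight-line programs above can be transliterated to Python to check them against a direct definition. This sketch assumes 32-bit unsigned words and treats `shr` as a logical shift; the function names are illustrative. In the P24 version the final addition is applied to the fully smeared value (the result of the last `or`).

```python
M32 = 0xFFFFFFFF               # assumed 32-bit unsigned word

def p24_round_up_pow2(x: int) -> int:
    """Smear the highest set bit of x-1 into all lower positions, then add 1."""
    o = (x - 1) & M32
    for shift in (1, 2, 4, 8, 16):
        o |= o >> shift
    return (o + 1) & M32

def p25_mulhi(x: int, y: int) -> int:
    """High 32 bits of the 64-bit product, using only 16x16-bit multiplies."""
    x0, x1 = x & 0xFFFF, x >> 16
    y0, y1 = y & 0xFFFF, y >> 16
    t = (x1 * y0 + ((x0 * y0) >> 16)) & M32      # o10 in the slide
    w1 = (x0 * y1 + (t & 0xFFFF)) & M32          # o13
    return (x1 * y1 + (t >> 16) + (w1 >> 16)) & M32

print(p24_round_up_pow2(1000))
print(hex(p25_mulhi(0xFFFFFFFF, 0xFFFFFFFF)))
```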
Application 2: Deobfuscation
• Transform given code into simpler representation
(using components from the given code).
• Important for identifying malware/viruses
Deobfuscation Example: Multiply by 45
int multiply45Obs(int y)
  a=1; b=0; z=1; c=0;
  while(1) {
    if (a==0) {
      if (b==0) { y=z+y; a=¬a; b=¬b; c=¬c; if (¬c) break; }
      else      { z=z+y; a=¬a; b=¬b; c=¬c; if (¬c) break; }
    } else {
      if (b==0) { z=y<<2; a=¬a; }
      else      { z=y<<3; a=¬a; b=¬b; }
    }
  }
  return y;

int multiply45(int y)
  z=y<<2;
  y=z+y;
  z=y<<3;
  y=z+y;
  return y;
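To see that the two versions agree, both can be transliterated to Python and compared on a range of inputs. This is a sketch: the flags `a`, `b`, `c` are modelled as 0/1 integers with `1 - a` standing for negation, and Python's unbounded integers are used in place of a fixed-width `int`.

```python
def multiply45_obs(y: int) -> int:
    """Obfuscated version: the flags a, b, c encode a small state machine."""
    a, b, z, c = 1, 0, 1, 0
    while True:
        if a == 0:
            if b == 0:
                y = z + y
                a, b, c = 1 - a, 1 - b, 1 - c
                if not c:
                    break
            else:
                z = z + y
                a, b, c = 1 - a, 1 - b, 1 - c
                if not c:
                    break
        else:
            if b == 0:
                z = y << 2
                a = 1 - a
            else:
                z = y << 3
                a, b = 1 - a, 1 - b
    return y

def multiply45(y: int) -> int:
    """Deobfuscated version: y*5, then (y*5)*9."""
    z = y << 2
    y = z + y
    z = y << 3
    y = z + y
    return y

print(multiply45_obs(7), multiply45(7))
```

The control flow of the obfuscated loop never depends on `y`, so it always executes the same four steps: `z=y<<2`, `y=z+y`, `z=y<<3`, `y=z+y`.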
Deobfuscation Example: Interchange src/dest
InterchangeObs(Ipaddr *s, *d)
  *s = *s ⊕ *d;
  if (*s == *s ⊕ *d) {
    *s = *s ⊕ *d;
    if (*s == *s ⊕ *d) {
      *d = *s ⊕ *d;
      if (*d == *s ⊕ *d)
        *s = *d ⊕ *s;
      else {
        *s = *s ⊕ *d;
        *d = *s ⊕ *d;
      }
      return;
    }
    else *s = *s ⊕ *d;
  }
  *d = *s ⊕ *d; *s = *s ⊕ *d;

Interchange(Ipaddr *s, *d)
  *d = *s ⊕ *d;
  *s = *s ⊕ *d;
  *d = *s ⊕ *d;
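A Python sketch of both routines, returning the pair instead of writing through pointers. The block nesting of the obfuscated version is one reading of the flattened slide (chosen because it makes the routine a correct swap), so treat it as an assumption rather than the authors' exact code.

```python
def interchange(s: int, d: int):
    """Three-XOR swap, as in the deobfuscated version."""
    d = s ^ d
    s = s ^ d
    d = s ^ d
    return s, d

def interchange_obs(s: int, d: int):
    """Obfuscated swap; the guarded branch is only taken when d == 0."""
    s = s ^ d
    if s == (s ^ d):
        s = s ^ d
        if s == (s ^ d):
            d = s ^ d
            if d == (s ^ d):
                s = d ^ s
            else:
                s = s ^ d
                d = s ^ d
            return s, d
        else:
            s = s ^ d
    d = s ^ d
    s = s ^ d
    return s, d

print(interchange(0xC0A80001, 0x0A000001))
```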
Dimensions in Program Synthesis
• Functional Specification
– Pre/Post-conditions, Input-output examples,
Inefficient/Related programs
– Interaction in face of over/under specification
• Search Space
– Imperative/Functional Programs
• Operators
• Control-flow
– Restricted Models of Computation
• Search Technique
– Constraint Generation
• Invariant-based, Path-based, Input-based
• Precise/Abstract/Approximate Operator Encoding
– Constraint Solving
Dimensions in Program Synthesis
• Functional Specification
– Pre/Post-conditions, Input-output examples,
Inefficient/Related programs
– Interaction in face of over/under specification
• Search Space
– Imperative/Functional Programs
• Operators (Arithmetic/Logical)
• Control-flow (Straight-Line)
– Restricted Models of Computation
• Search Technique
– Constraint Generation
• Invariant-based, Path-based, Input-based
• Precise/Abstract/Approximate Operator Encoding
– Constraint Solving
Functional Specification
• Choice 1: Logical relation between inputs and outputs
• Choice 2: Input-Output Examples
Functional Specification: Logical Relations
Problem: Turn off rightmost 1-bit
Functional Spec of components Subtract, Bitwise-And
Subtract(I1,I2,J)    := J = (I1-I2)
Bitwise-And(I1,I2,J) := J = (I1 & I2)
Functional Specification of desired behavior
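\[
\bigwedge_{p=1}^{b} \Big[ \Big( I[p]=1 \;\wedge \bigwedge_{j=p+1}^{b} (I[j]=0) \Big) \Rightarrow \Big( J[p]=0 \;\wedge \bigwedge_{j \neq p} (J[j]=I[j]) \Big) \Big]
\]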
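The logical specification can be read as an executable predicate. The sketch below assumes b = 8 and that bit index 1 is the leftmost bit, so that indices p+1..b are the bits to the right of position p; it then checks that composing the Subtract and Bitwise-And components satisfies the specification on every input. Function names are illustrative.

```python
B = 8                          # assumed bit-vector width b
MASK = (1 << B) - 1

def subtract(i1: int, i2: int) -> int:
    return (i1 - i2) & MASK    # J = (I1 - I2), wrapped to B bits

def bitwise_and(i1: int, i2: int) -> int:
    return i1 & i2             # J = (I1 & I2)

def bit(v: int, k: int) -> int:
    """Bit k of v, with k = 1 the leftmost and k = B the rightmost bit."""
    return (v >> (B - k)) & 1

def spec_holds(i: int, j: int) -> bool:
    """For every p: if p is the rightmost 1 of I, then J clears p and keeps the rest."""
    for p in range(1, B + 1):
        premise = bit(i, p) == 1 and all(bit(i, q) == 0 for q in range(p + 1, B + 1))
        conclusion = bit(j, p) == 0 and all(
            bit(j, q) == bit(i, q) for q in range(1, B + 1) if q != p)
        if premise and not conclusion:
            return False
    return True

def candidate(x: int) -> int:
    return bitwise_and(x, subtract(x, 1))

print(all(spec_holds(x, candidate(x)) for x in range(1 << B)))
```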
Experiments: Comparison with Exhaustive Search
Program          Brahma           AHA
Name   lines     iters   time     time
P1     2         2       3        0.1
P2     2         3       3        0.1
P3     2         3       1        0.1
P4     2         2       3        0.1
P5     2         3       2        0.1
P6     2         2       2        0.1
P7     3         2       1        2
P8     3         2       1        1
P9     3         2       6        7
P10    3         14      76       10
P11    3         7       57       9
P12    3         9       67       10
P13    4         4       6        X
P14    4         4       60       X
P15    4         8       119      X
P16    4         5       62       X
P17    4         6       78       109
P18    6         5       46       X
P19    6         5       35       X
P20    7         6       108      X
P21    8         5       28       X
P22    8         8       279      X
P23    10        8       1668     X
P24    12        9       224      X
P25    16        11      2779     X
Functional Specification
Problem: Turn off rightmost contiguous string of 1-bits
• Logical Relations
– A bit complicated
• Input-Output Relations
– Key challenge is to resolve ambiguity
– Our solution: Interaction with user
Dialog: Interactive Synthesis
Problem: Turn-off rightmost contiguous string of 1’s
User: I want a design that maps 01011 -> 01000
Oracle: I can think of two designs
Design 1: (x+1) & (x-1)
Design 2: (x+1) & x
which differ on 00000 (Distinguishing Input)
What should 00000 be mapped to?
User: 00000 -> 00000
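The notion of a distinguishing input in the dialog above can be made concrete with a small sketch. It assumes 5-bit values (matching the examples) and finds distinguishing inputs by exhaustive scan, which is only a stand-in; the talk does not say how the oracle constructs them.

```python
WIDTH = 5                      # assumed width, matching the 5-bit examples
MASK = (1 << WIDTH) - 1

def design1(x: int) -> int:
    return (x + 1) & (x - 1) & MASK

def design2(x: int) -> int:
    return (x + 1) & x & MASK

def distinguishing_inputs(f, g):
    """All inputs on which the two designs disagree (brute force)."""
    return [x for x in range(1 << WIDTH) if f(x) != g(x)]

# Both designs agree with the user's example 01011 -> 01000 ...
print(format(design1(0b01011), "05b"), format(design2(0b01011), "05b"))
# ... but they disagree on 00000, so that input is worth asking about.
print(0 in distinguishing_inputs(design1, design2))
```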


Dialog: Interactive Synthesis
Problem: Turn-off rightmost contiguous string of 1’s
User: 01011 -> 01000
Oracle: 00000 ?
User: 00000
Oracle: 01111 ?
User: 00000
Oracle: 00110 ?
User: 00000
Oracle: 01100 ?
User: 00000
Oracle: 01010 ?
User: 01000
Oracle: Your design is x & (1 + ((x-1)|x))
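The dialog can be sketched as a loop that keeps the candidate designs consistent with the examples so far and asks the user about an input on which the survivors disagree. Everything here is an assumption made for illustration: the fixed pool of four candidates, the 5-bit width, the brute-force choice of query, and the `intended` function that plays the role of the user. The queries it produces therefore need not match the ones in the slide.

```python
WIDTH = 5
MASK = (1 << WIDTH) - 1

def intended(x: int) -> int:
    """Plays the user: clear the rightmost contiguous run of 1-bits."""
    p = 0
    while p < WIDTH and not (x >> p) & 1:
        p += 1
    while p < WIDTH and (x >> p) & 1:
        x &= ~(1 << p)
        p += 1
    return x

# Hypothetical candidate pool; a real synthesizer would search a program space.
CANDIDATES = {
    "(x+1) & (x-1)": lambda x: (x + 1) & (x - 1) & MASK,
    "(x+1) & x": lambda x: (x + 1) & x & MASK,
    "x & (x-1)": lambda x: x & (x - 1) & MASK,
    "x & (1 + ((x-1)|x))": lambda x: x & (1 + ((x - 1) | x)) & MASK,
}

def interactive_synthesis(candidates, oracle, first_input):
    examples = {first_input: oracle(first_input)}
    while True:
        # Keep only designs that agree with every example seen so far.
        alive = {name: f for name, f in candidates.items()
                 if all(f(x) == y for x, y in examples.items())}
        # A distinguishing input is one where the surviving designs disagree.
        query = next((x for x in range(1 << WIDTH)
                      if len({f(x) for f in alive.values()}) > 1), None)
        if query is None:
            return sorted(alive), examples
        examples[query] = oracle(query)

designs, examples = interactive_synthesis(CANDIDATES, intended, 0b01011)
print(designs)
```

Each answer eliminates at least one surviving candidate, so the loop terminates after at most as many queries as there are candidates.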
Synthesizing Inputs for Dialog with User
• Distinguishing Input construction is a bit expensive.
• We tried two optimizations
– Interleave with random inputs.
• Overall end-to-end performance even worse.
– Interleave with biased random inputs.
• Performs best.
Biased Random Input Selection
• Theorem: If a circuit uses only add/subtract/and/or/not
operators, then the i-th bit of an output depends only on the
i-th bit of the inputs and the bits to its right.
• Biased Random Strategy:
– Choose a random input whose rightmost bits are different
from the ones that have already been queried for.
– For example, if 3 inputs of following form have been queried
r1 0 0
r2 0 1
r3 1 0
Then, choose the 4th input to be of the form
r4 1 1
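The strategy above can be sketched as follows. The choice of the shortest suffix length that still has an unused bit pattern is an assumption of this sketch, as are the function names; the slide only states that the rightmost bits should differ from those already queried. A second helper checks the theorem's consequence on a given function: the low k bits of the output are determined by the low k bits of the input.

```python
import random

def biased_random_input(queried, width, rng):
    """Random input whose rightmost bits differ from those of all queried inputs."""
    for k in range(1, width + 1):
        mask = (1 << k) - 1
        seen = {q & mask for q in queried}
        unused = [s for s in range(1 << k) if s not in seen]
        if unused:
            low = rng.choice(unused)                       # fresh k-bit suffix
            high = rng.getrandbits(width - k) if width > k else 0
            return (high << k) | low
    raise ValueError("every input of this width has already been queried")

def low_bits_closed(f, width):
    """True if the low k bits of f(x) depend only on the low k bits of x."""
    for k in range(1, width + 1):
        mask = (1 << k) - 1
        for x in range(1 << width):
            if (f(x) & mask) != (f(x & mask) & mask):
                return False
    return True

rng = random.Random(0)
# Suffixes 00, 01 and 10 are taken, so the next input must end in 11.
x = biased_random_input([0b10100, 0b01001, 0b11110], 5, rng)
print(format(x, "05b"))
print(low_bits_closed(lambda v: v & (1 + ((v - 1) | v)), 5))
```

A right shift is an example of an operator outside the theorem's scope: bit i of `v >> 1` depends on bit i+1 of `v`, and `low_bits_closed` reports this.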
Experiments: Random vs Biased-Random
Prog.    Random           Biased-Random
         Time    Iters    Time    Iters
P1       1       5        1       3
P2       7       11       5       7
P3       2       8        1       4
P4       2       11       1       6
P5       4       8        2       6
P6       6       23       2       4
P7       1       5        1       5
P8       2       11       1       6
P9       5       10       5       6
P10      14      14       3       9
P11      24      16       14      9
P12      279     24       46      10
P13      33      9        7       3
P14      14      25       4       7
P15      168     7        14      4
P16      67      10       19      6
P17      217     17       21      6
P18      229     19       26      4
P19      164     13       65      5
P20      214     17       63      6
P21      1074    15       272     6
P22      X       X        186     9
P23      24      9        12      5
P24      12      4        3       2
P25      X       X        1       9
Conclusion: Component based Synthesis
Problem Definition
Given:
• A library of components where each component
comes with its functional specification
• Functional Specification of desired behavior
Obtain: Appropriate composition of components to
obtain desired behavior.
Inspiration
• Standard process of knowledge discovery
• Modular development
• Can it help with modular synthesis?