Programming by Sketching

Two techniques for programming
by sketching
(Stanford, November 2004)
Rastislav Bodik, David Mandelin, Armando Solar-Lezama, Lin Xu (UC Berkeley)
Rodric Rabbah (MIT)
Kemal Ebcioglu, Doug Kimelman, Vivek Sarkar (IBM)
Synthesis
• Program synthesis
– given a specification, synthesize a program meeting
this spec
– synthesis inverse to verification
– most work in reactive systems (Pnueli, Kupferman, …)
• Synthesis vs. compilation
– synthesis involves a search for the desired program
• Benefits
– “less coding, more correctness”
Programming by sketching
• Our approach
– apply synthesis to software
– “sketching”: specification is partial (underspecified)

sketch
program = completed sketch
Two sketching techniques
Sketch:
– partial implementation, provided by programmer
Sketch resolution:
– completing the sketch into a full implementation
– which one? (sketch completes into many
implementations!)
1. StreamBit:
– behavioral spec + sketch → full implementation
2. Prospector:
– sketch → several full implementations
– user selects implementation with desired behavior
StreamBit: Sketching high-performance
implementations of bitstream programs
Project lead: Armando Solar-Lezama
Bitstream Programs
• Bitstream programs: a growing domain
– crypto: DES, Serpent, Rijndael, …
– coding in general, NSA/BitTwiddle
• Bitstream programs operate under strict
constraints
– performance is very important
• up to 95% of server cycles spent in security-related
processing
– correctness is crucial
• subtle bug in Blowfish implementation allowed over half
the keys to be cracked in less than 10 minutes
Example
• “Drop every third bit in the bit stream.”
• exhibits many features of complicated permutations
– exponentially many choices
– greedy choice is suboptimal
• fast implementation can be sketched
[Figure: the SLOW O(w) functionality plus a sketch gives the FAST O(log w) implementation.]
Full sketch (13 lines of code)
WSIZE=16;
subsequence = Unroll[WSIZE](subsequence);
subsequence = PermutFactor[
  [shift(1:2 by 0), shift(17:18 by 0), shift(33:34 by 0)],
  [shift(1:16 by ?), shift(17:32 by ?), shift(33:48 by ?)]
] ( subsequence );
subsequence.subsequence_1=DiagSplit[WSIZE](subsequence);
...
for(i=0; i<3; ++i) {
  subsequence.subsequence_1.filter(i) =
    PermutFactor[ [shift(1:16 by 0 || 1)],
                  [shift(1:16 by 0 || 2)],
                  [shift(1:16 by 0 || 4)]
    ]( subsequence.subsequence_1.filter(i) );
}
Size: 13 lines
Compare with 100+ lines of such FORTRAN code (from BitTwiddle):
DATA MASKB2 /Z'FFC003FF000FFC00',
Z'3FF000FFC003FF00',
Z'0FFC003FF000FFC0',
Z'03FF000FFC003FC0',
...
c Compress 5-bit groups together
TB = IAND(TB + ISHFT(TB, SKIPBC), MASKB2(J))
TC = IAND(TC + ISHFT(TC, SKIPBC), MASKC2(J))
...
What you gain
• DropThird benchmark:
– Speedups over naïve code with a 14 line sketch:
• 32 bit on a Pentium IV: 83.8%
• 64 bit on an Itanium II: 233%
• DES benchmark:
– 32 bit on a Pentium IV with 30 line sketch:
• 634% speedup over naïve
• within 11% of hand optimized libDES
– 64 bit IA64 and IBM SP2
• we beat libDES by 8%
What is sketching
• Key idea: separation of concerns
– specify behavior without concern for performance
– create implementation without concern for bugs
• domain expert:
– writes a behavioral specification of her crypto algorithm
– as clean as possible, no optimizations
• performance expert:
– describes an efficient implementation of the clean algorithm
– neither reimplements nor describes in full
– he only sketches an outline of the implementation; compiler fills in
details
– if sketch is wrong, compiler complains → no bugs can be
introduced
Compilation strategy
A sketch overrides a naïve compiler:
– naïve compiler translates the clean algorithm into
target code,
• with a simple sequence of semantics-preserving
transformations:
(1) make all filters word-size (unroll and split)
(2) decompose word-size filters into machine instructions
– sketch “inserts” a step into the naïve sequence
• Ex.: sketch decomposes a filter into a pipeline of filters
• after sketch is applied, naïve compiler continues
The behavioral spec (StreamIt)
• StreamIt
– synchronous dataflow language
– filters represented internally as matrices
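$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$
The 2×3 matrix consumes a 3-bit chunk of input and produces a 2-bit chunk of output.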
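The matrix view of a filter can be tried out directly. The following is a minimal sketch, not StreamIt or StreamBit code: the helper names `apply_filter` and `run_stream` are hypothetical, and arithmetic is done mod 2, which for a selection matrix like this one (a single 1 per row) simply copies the selected input bit.

```python
def apply_filter(matrix, bits):
    """Apply a boolean filter matrix to one input chunk (arithmetic mod 2)."""
    return [sum(m * b for m, b in zip(row, bits)) % 2 for row in matrix]


def run_stream(matrix, stream):
    """Run the filter over a bit stream, one full input chunk at a time."""
    n_in = len(matrix[0])  # number of columns = bits consumed per firing
    out = []
    for start in range(0, len(stream) - n_in + 1, n_in):
        out.extend(apply_filter(matrix, stream[start:start + n_in]))
    return out


# Drop-third filter from the slide: rows 100 and 010.
DROP_THIRD = [[1, 0, 0],
              [0, 1, 0]]

result = run_stream(DROP_THIRD, [1, 0, 1, 0, 1, 1])
```

Here `result` holds the first two bits of each 3-bit chunk, so the six input bits shrink to four.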
Naïve compilation
• Example: Drop Third Bit
(word size W = 4 bits)
– Unroll filter
– decompose into filters operating on W=4 bits of input.
– decompose into filters producing W=4 bits of output
[Figure: the 2×3 Drop Third matrix unrolled into an 8×12 matrix, then decomposed into word-size (W=4) matrices connected by rrobin 4,4,4 and duplicate splitters and combined with cat and or.]
Naïve compilation (cont.)
• Make each filter correspond to one basic operation
available in the hardware
[Figure: a word-size filter, drawn as duplicate and or over single-operation matrices, becomes the instruction sequence below.]
in
t1 = in AND 1100
t2 = in SHIFTL 1
t3 = t2 AND 0010
out = t1 OR t3
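The instruction sequence above can be checked exhaustively on 4-bit words. This is a small sketch under one assumption that the slide does not state explicitly: the leftmost bit of a word is the first bit of the stream, which is consistent with the mask 1100 keeping the first two bits. The function names are hypothetical.

```python
W = 4
WORD_MASK = (1 << W) - 1


def drop_third_word(word):
    """Word-level sequence from the slide, applied to one 4-bit word."""
    t1 = word & 0b1100
    t2 = (word << 1) & WORD_MASK  # SHIFTL 1, truncated to the word size
    t3 = t2 & 0b0010
    return t1 | t3


def drop_third_reference(word):
    """Bit-by-bit reference: keep bits 1, 2 and 4 (from the left), pad with 0."""
    bits = [(word >> (W - 1 - i)) & 1 for i in range(W)]
    kept = [bits[0], bits[1], bits[3], 0]
    return int("".join(str(b) for b in kept), 2)


all_words_agree = all(
    drop_third_word(w) == drop_third_reference(w) for w in range(1 << W)
)
```

Because the word is only 4 bits wide, comparing all 16 inputs is enough to confirm that the three operations implement the filter.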
The Full Picture
[Figure: task description and implementations arranged by level of abstraction (high → low).]
Decomposition without sketching
Specify the FAST bit-shifting algorithm without sketching:
[Figure: F decomposed into the pipeline F.F_1, F.F_2, F.F_3.]
• User provides high level decomposition of F into F.F_i
• System takes care of compiling F.F_i
• Correctness is guaranteed as long as
[F.F_3] · [F.F_2] · [F.F_1] = F
• Avoid spelling out the decomposition: Sketch It!
[some properties] · [some properties] · [some properties] = F
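Since filters are matrices, the correctness condition is a matrix identity that can be checked mechanically. The following is an illustrative sketch only: the three stages are a hypothetical decomposition of drop-third on 6 bits, chosen for this example, and `matmul` and `selector` are helper names introduced here.

```python
def matmul(a, b):
    """Boolean matrix product (mod 2): filter b runs first, then filter a."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) % 2
             for j in range(len(b[0]))]
            for i in range(len(a))]


def selector(sources, n_in):
    """Matrix whose output bit r copies input bit sources[r] (None gives 0)."""
    return [[1 if src == j else 0 for j in range(n_in)] for src in sources]


F = selector([0, 1, 3, 4], 6)                # drop every third bit of 6 bits
F_1 = selector([0, 1, None, 3, 4, None], 6)  # stage 1: clear the dropped bits
F_2 = selector([0, 1, 3, 4, None, None], 6)  # stage 2: close the gap
F_3 = selector([0, 1, 2, 3], 6)              # stage 3: keep the first four bits

decomposition_ok = matmul(F_3, matmul(F_2, F_1)) == F

# A stage that forgets to move the bits breaks the identity.
F_2_bad = selector([0, 1, 2, 3, None, None], 6)
bad_decomposition_ok = matmul(F_3, matmul(F_2_bad, F_1)) == F
```

A wrong stage makes the product differ from F, which is the situation in which the compiler would reject the decomposition.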
[Figure: the matrix for F shown as the product of the matrices for F.F_1, F.F_2, and F.F_3.]
Sketching: another example
A permutation from DES cipher (64 bits → 64 bits)
[Figure: the 64-bit word drawn as two 32-bit halves.]
Sketch:
shift(1:64 by 0 || 33 || -33),
shift(1:2:31 by -33),
shift(34:2:64 by 33),
[] // unspecified; filled in by compiler
Problem: when implemented as a table lookup, the table is very large
Idea: decompose into a pipeline of two permutations:
1. provided by the programmer:
an inexpensive permutation
2. automatically derived from the sketch:
two identical permutations (to be implemented as one smaller table)
Sketching: How it works
• Start with a sketch
• Define x_{i,j} as the amount bit i will move on step j
• Semantic equivalence imposes linear constraints on the x_{i,j}
• Many of the constraints in the sketch also impose linear constraints on the x_{i,j}
• Solving the linear constraints produces a space of possible solutions
• Map the nonlinear constraints to this solution space
• Search
SketchDecomp[
[shift(1:32 by 0 || 1)],
[shift(1:32 by 0 || 2)],
[shift(1:32 by 0 || 4)],
[shift(1:32 by 0 || 8)]
]( Filter );
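To make the x_{i,j} formulation concrete, here is a hand-worked sketch for drop-third on a 16-bit word with steps "0 || 1", "0 || 2", "0 || 4". It is not StreamBit's solver: it assumes power-of-two steps, so the linear constraint sum_j x_{i,j} = (total move of bit i) is solved by binary expansion, and it treats "no two bits share a position" as the nonlinear constraint. Bit indices are 0-based and all function names are hypothetical.

```python
def solve_shifts(total_moves, steps):
    """Pick x[i][j] in {0, steps[j]} with sum_j x[i][j] == total_moves[i]."""
    x = {}
    for bit, move in total_moves.items():
        # For power-of-two steps this is the binary expansion of the move.
        x[bit] = [step if move & step else 0 for step in steps]
        if sum(x[bit]) != move:
            raise ValueError(f"bit {bit}: move {move} not expressible")
    return x


def run_pipeline(x, steps):
    """Track each bit's position step by step and reject collisions."""
    pos = {bit: bit for bit in x}
    for j in range(len(steps)):
        pos = {bit: p - x[bit][j] for bit, p in pos.items()}
        if len(set(pos.values())) != len(pos):
            raise ValueError(f"two bits collide after step {j}")
    return pos


W = 16
kept = [i for i in range(W) if i % 3 != 2]  # bits that survive
total = {i: i // 3 for i in kept}           # each skips over i // 3 dropped bits
steps = [1, 2, 4]                           # shift by 0 || 1, 0 || 2, 0 || 4

x = solve_shifts(total, steps)
final = run_pipeline(x, steps)
```

After the three steps the surviving bits occupy consecutive positions, which is why the number of steps grows with log w rather than w.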
User Study (time to first solution)
[Figure: First Solution Performance. x-axis: Hours (0 to 6); y-axis: Words per microsecond (0 to 5); series: C1 to C5 (C) and SBit 1 to SBit 4 (StreamBit).]
User Study (developing a good
implementation)
[Figure: Performance over Time. x-axis: Hours (0 to 10); y-axis: Words per microsecond (0 to 12); series: C1 to C5, SBit 1 to SBit 4, SBit w/ sketching, Cref.]
Implementing the fastest DES
• How fast can we match the fastest DES
implementation?
– 6 different implementations in 4 hours
– includes all but one trick used in libDES
– so fast partly because sketching avoids bugs
[Figure: bar chart comparing the fulltable, notable, and sometable implementations on PIV 2.5GHz, PIII 700 MHz, PIII 496.82, IA64, Solaris, and IBMSP; vertical axis from 0 to 1.2.]
Concluding Remarks
• StreamBit allows for
– Task specification oblivious to performance
– Implementation specification without bugs
• Same idea may apply in other domains
– If people currently resort to very low level coding
– If some algebraic structure can be imposed on the
task
– It may be amenable to implementation sketching.
Mining Jungloids: Helping to
Navigate the API Jungle
Project lead: David Mandelin
A software reuse problem
• big components reusable [Lampson’99]
– OS, DBMS, browser
• small components challenging
– flexibility: functionality cut finely, for fine control
– size: in J2SE, 21,000 methods in 1000s of classes
• cost to understand and use
– one of three obstacles to reuse [Lampson’99]
• searching for information
– nearly ¼ of developer time [metallect.com]
→ often give up reuse and reimplement
Example
programming task: parse a Java file into an AST
IFile file = …
ICompilationUnit cu = JavaCore.createCompilationUnitFrom(file);
ASTNode node = AST.parseCompilationUnit(cu, false);
Why so hard to find? (productivity: 2 LOC/hour)
1. class member browsers? two unknown classes used
2. follow expected design? two levels of file handlers
3. grep? method returns a subclass
The moral?
• type signatures
– not very useful in finding desired code
– but once found, can be used to verify
• so why not search existing code base?
– somebody must have written these two lines before!
– yes, but not in same method
• for software engineering reasons
– or even same program
• e.g.: parse an editor buffer, not a file
• still, sample code useful, as we will see …
Our goal
• We want a programmer’s “search engine” that
– doesn’t merely find example code
– instead, it synthesizes the desired code
– from two favorite sources:
• type signatures
• existing code examples
More precisely
• mining input:
– the API (type signatures from class definitions)
– corpus of API client code
• search input:
– a query specifying programmer’s intent
• output:
– synthesized code
– ready for insertion into user program
– give several candidates (user selects one)
Formulating the code search problem
We must decide on the structure of:
– input query (coding intent)
• easy to express for the user
• yet specific enough for the search engine
– output code (synthesized code)
• easy to understand and validate (by reading docs)
• code should complete the program under construction
The query: from ‘have’ to ‘want’
• 1st observation
– Reuse problems can usually be described with a
have-one-want-one query q=(h,w):
“What code will transform
a (single) object of (static) type h into
a (single) object of (static) type w?”
• Our parsing example: q = (IFile, ASTNode)
IFile file = …
ICompilationUnit cu = JavaCore.createCompilationUnitFrom(file);
ASTNode node = AST.parseCompilationUnit(cu, false);
Output code: jungloid
• 2nd observation:
– most queries can be answered with a jungloid
• jungloid:
– a unary expression composed of unary expressions:
• field access
• call to an instance method with 0 arguments
• call to a static method or constructor with 1 argument
• conversion to supertype
• (multi-argument methods decomposed into unary ones)
IFile file = …
ICompilationUnit cu = JavaCore.createCompilationUnitFrom(file);
ASTNode node = AST.parseCompilationUnit(cu, false);
Coverage
An informal experiment:
– using 16 coding headaches, collected by us
• Can the query express interesting problems?
– yes, for 12 out of 16 coding problems
• Can queries be answered with a jungloid?
– yes, all 12 queries answered with jungloids
• 9 of them are simple jungloids
• 3 of them use some multi-argument methods
Prospector: our prototype
• Eclipse plugin
– integrated with “code completion assist”
• ordinary completion: var.[CTRL+SPACE] lists field, foo(), bar(int len, Object key)
– the “want” type w comes from the declaration being completed:
WantType x = [CTRL+SPACE]
– a set H of “has” types obtained from context
• local variables, arguments, class fields, globals
– issue queries (h,w) for each h ∈ H
Type signature graph
Any path from h to w is a (h,w)-jungloid
[Figure: type signature graph. Nodes: IJavaElement, IResource, IContainer, IClassFile, IFile, ICompilationUnit, CompilationUnit, ASTNode. Edge labels: getResource(), getParent(), supertype, AST.parseCompilationUnit().]
• 3rd observation:
– desired jungloid typically among k shortest paths (k=5)
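The path search can be sketched as a breadth-first enumeration, which yields shorter paths first. This is an illustration, not Prospector's implementation: `GRAPH` is a small hypothetical fragment of a signature graph built from the types in the parsing example, and `k_shortest_jungloids` is a name introduced here.

```python
from collections import deque

# Hypothetical fragment of a signature graph: type -> [(edge label, result type)].
GRAPH = {
    "IFile": [("JavaCore.createCompilationUnitFrom", "ICompilationUnit"),
              ("supertype", "IResource")],
    "ICompilationUnit": [("AST.parseCompilationUnit", "CompilationUnit")],
    "CompilationUnit": [("supertype", "ASTNode")],
    "IResource": [("getParent", "IContainer")],
}


def k_shortest_jungloids(graph, have, want, k=5):
    """Return up to k acyclic label paths from have to want, shortest first."""
    found = []
    queue = deque([(have, [], frozenset([have]))])
    while queue and len(found) < k:
        node, labels, seen = queue.popleft()
        if node == want:
            found.append(labels)
            continue
        for label, nxt in graph.get(node, []):
            if nxt not in seen:  # keep each path free of repeated types
                queue.append((nxt, labels + [label], seen | {nxt}))
    return found


paths = k_shortest_jungloids(GRAPH, "IFile", "ASTNode")
```

For the query (IFile, ASTNode), the only path in this fragment corresponds to the two-line solution shown earlier, followed by a conversion to the supertype.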
Jungloids with downcasts
IDebugView debugger = ...
Viewer viewer = debugger.getViewer();
IStructuredSelection sel = (IStructuredSelection) viewer.getSelection();
JavaInspectExpression expr = (JavaInspectExpression) sel.getFirstElement();
[Figure: IDebugView --getViewer()--> Viewer --getSelection()--> ISelection --downcast--> IStructuredSelection --> Object --downcast--> JavaInspectExpression]
Our solution
• Besides downcasts, this problem appears in
– method arguments of type Object (only accept a
JavaBean)
– String objects (strings are highly polymorphic)
• Potential solutions
– parametric type inference, alias analysis
• Our solution
– mine a corpus of API uses for legal downcasts
Mining jungloids with downcasts
• Ideally, only correct jungloids are synthesized
– correct = it must be possible to write client code in
which the jungloid’s downcast succeeds for at least
one input
• This ideal can be approximated (overview):
– use a corpus of API client code
– extract jungloids with downcasts
– use them to extend the signature graph
• In the limit, we meet the ideal
– limit = infinitely large, bug-free corpus
• bug-free corpus
– weak requirement: each jungloid in the corpus needs to succeed
for at least one input
Mining jungloids with downcasts
(example)
protected IJavaObject getObjectContext() {
  IWorkbenchPage page = …
  IWorkbenchPart part = page.getActivePart();
  IDebugView view = (IDebugView) part.getAdapter();
  ISelection s = view.getViewer().getSelection();
  IStructuredSelection sel = (IStructuredSelection) s;
  Object selection = sel.getFirstElement();
  JavaInspectExpression exp = (JavaInspectExpression) selection;
  ...
}
[Figure: the signature graph path IDebugView --getViewer()--> Viewer --getSelection()--> ISelection --downcast--> IStructuredSelection --getFirstElement()--> Object --downcast--> JavaInspectExpression, extended with the mined copies Viewer’, ISelection’, IStructuredSelection’, Object’; annotations shown: Viewer<IStructuredSelection<JavaInspectExpression>>, IStructuredSelection<JavaInspectExpression>.]
The jungloid mining algorithm (key
idea)
When extracting jungloids, how to determine the
necessary downcast context (i.e., jungloid
suffix)?
x.a.(T)
w.x.a.(T)
s.y.a.(S)
y.a.(S)
What if the context is too short?
– unsound: a query may synthesize a jungloid that will
throw an exception in any client code
What if the context is too long?
– incomplete: a query may fail to synthesize the
jungloid even though the corpus contains the
Experiment 1 (ranking test)
• hypothesis:
– to find the desired code, the user needs to examine
only top 5 candidate jungloids.
• result:
– desired code in “top 5” 17 out of 20 times (in “top 1”
10 out of 20 times)
– remaining three fixable
• methodology:
– used 20 real-world coding tasks
– collected from FAQs, newsgroups, our practice, emails
to us
Experiment 2 (user study)
• hypothesis:
– Prospector-equipped programmers are better at
solving API programming problems than other
programmers
• methodology:
– 6 problems, each user did 3 with Prospector and 3
without
– problems formulated not to reveal the query
– sample problem:
“The new Java channel IO system represents files as channels.
How do I get a channel that represents a String filename?”
Experiment 2 (user study). Results.
• Prospector shortens development time
– some problems solved only by Prospector users
– when both groups succeeded, Prospector users 30%
faster
• Prospector may help enable reuse
– non-Prospector users sometimes reimplemented
• Prospector may help avoid making mistakes
– e.g., mistakes made when adapting code found on the internet
to one’s own code
• We expect even stronger results on a more
robust infrastructure.
Future work
• Coding task we currently can’t handle:
– print an AST as Java source
• The limitation:
– task is expressible as a (have,want) query
– but result is not a jungloid (as defined in this talk)
ASTNode ast = ...
ASTFlattener visitor = new ASTFlattener();
ast.accept(visitor);
String result = visitor.getResult();

ASTNode ast = ...
ASTFlattener visitor = new ASTFlattener();
ASTFlattener visitor2 = ast.accept(visitor);
String result = visitor2.getResult();
Try it!
• Web demo
– snobol.cs.berkeley.edu
• Eclipse plugin
– coming soon
– want to alpha test it?
Conclusion
Sketch:
– partial implementation, provided by programmer
1. StreamBit:
– behavioral spec + sketch → full implementation
– goal: total correctness and performance
2. Prospector:
– sketch → several full implementations
– user selects implementation with desired behavior
– goal: software reuse
Backup slides
Programming with jungloids
NodeItem node = (NodeItem) getModel();
GraphNodeFigure f = (GraphNodeFigure) getFigure();
f.getLabel().setName(node.getNodeName());
Rectangle r = new Rectangle(node.x, node.y, -1, -1);
GraphicalEditPart parent = (GraphicalEditPart) getParent();
parent.setLayoutConstraint(this, f, r);