>> Tom Ball: All right. Hi. I'm Tom Ball, from the Research in Software Engineering group,
and we're pleased to have Professor Steven Reiss from Brown University. I've known Steve for
many years. I think we met in '93 at a debugging workshop back in Sweden. I think that's when
we met first. And Steve has done work over many decades on really software engineering
systems. I think there was Garden and -- wasn't it something like that? Field, Garden, many,
many different ways to bring together programming environments before there were really sort
of things called IDEs and really looking at the architecture and innovative ways to help
programmers. And today, he's going to, I think, tell us some interesting things that build on all
the work that he's been doing, about creating user interfaces. So welcome back to Microsoft.
>> Steven Reiss: It's not really creating a user interface, but we'll get to that. So feel free to
interrupt and ask questions, the two of you who are here. I assume the others can't.
>> Tom Ball: You know, they actually can. I have to monitor the website. They can ask
questions.
>> Steven Reiss: Okay. Okay, so we start in -- you have code repositories out there. Ohloh
claims to have 30 billion lines of code. I'm told GitHub is more. That's a lot of code, and these
things keep growing exponentially. Last year, GitHub was around 18 billion lines. Now it's 30.
Anything that you want to write these days, at some level, has already been written by somebody
else. You just have to find it. Or if not exactly, something close to it has been, and the
hypothesis is stealing code should be easier than writing it in the first place. However, given
that, code search, which is how you access all this data, is not really used today. There are a
variety of reasons for this. It doesn't return the right results. It's hard to determine whether what
you get back is acceptable or useful or not. It's hard to determine what you get back, what it
does, whether it does what you want it to or not. It doesn't fit your existing framework. You're
trying to put code and find a function that will fit in your program. This code doesn't quite fit. It
doesn't fit your style. If you've ever had to work with code in another style, it's really hard to do,
and it's just too much work. There are a few exceptions on the smaller end. We'll get to that.
What I want to do, though, is use this, so let's go and do an example here. So let's put a few
keywords in here. And now what we want to do is search. Now, I can explain what's going on
here. So here, I have a sketch of a user interface, which I've grabbed off the Internet and just
duplicated here, and I want to find code that implements this.
>> Tom Ball: So when you say a sketch, so --
>> Steven Reiss: I really Googled a user interface sketch, or I Binged --
>> Tom Ball: Is that a JPEG, or what is it?
>> Steven Reiss: Well, it was a JPEG. As I said, I converted it. It's now an SVG or it was a
PNG or something. I just took a picture and converted it. And I tell it to search and find the user
interface, and it's found user interfaces that look something like it.
>>: That look something like it.
>> Steven Reiss: That look something like it, and actually, because it gave a keyword address
book, they probably are something like it. Okay, so we got that. Sorry about the delay. I'll go
back to the slideshow. Here we go. There's a lot of magic going on there.
>>: Did it find the code?
>> Steven Reiss: It found code. It ran the code, and it ran the code to generate all those
interfaces.
>>: And then it did some fuzzy image conversion?
>> Steven Reiss: And then it did the comparison, yes. A lot of magic. Getting the code from
the repositories that does this, finding the user interfaces in that code, getting the code to compile
and run. Have you ever tried to get code from the repositories to compile and run? It's not easy.
Matching the user interface that results to the picture I gave it, to the sketch, validating the user
interfaces? We haven't gotten to that, yet, but the user can go through and validate them, and
then making the results useful, generating code that you actually want to use as a user interface
code. There are a lot of things, and I want to show you how we do each of these. Getting the
code should be simple. We have all this code out there, there are existing search engines. Just
go and use them. It's not as simple as that, though. The code search, if you remember using
code search about code -- not code, using web search about 1990, it wasn't very fun. You
learned techniques that sort of worked. Well, code search engines are about at that level of
primitivity. The best one I've found so far is Google web search addressing GitHub. But the
Google API interface is not as good as the web interface. It returns different results, and they
don't let you use the web interface from within a program. They turn that down. And Bing
hasn't indexed GitHub yet, so I can't try that. Keywords are a lousy interface to code. Code
doesn't have keywords in it, particularly. You have to manufacture them from comments, from
variable names, and they're pretty random, the way things go, and what's returned are files, not
what you're looking for, so the results are not particularly easy to use. So you have to get
through all that. So we can show what's going on here. I have another demo, if I can find it.
This one's all set up. This is looking for ordinal numbers. I want to find a routine that, given the
number 17, returns seventeenth, or given three, returns third, except programmers probably don't
know what ordinal numbers are, so they probably commented it incorrectly or called it
incorrectly in their code, and you never know what's going to happen. So here's code, like search
for ordinal numbers, and trying to figure out what this does. This is actually GitHub here, access
and you're trying to find out what this code does.
>>: You might want to choose your words a little more carefully.
>> Steven Reiss: It doesn't look like much here is going to do what I want, but it's hard to say.
>>: But this is just sort of not a surprise, because you're just matching up with strings.
>> Steven Reiss: Right, and Ohloh is no better. So in order to do this type of search we want to
do, we have to be able to do better search. So let's see what we get here. Browsing source
repositories, go to the next slide. Okay. So we went over this. So you have lots of problems. It
returns files, it's hard to browse and understand, it's hard to determine whether it's relevant,
whether you want to use it or not. Who knows whether it's going to compile or not, and trying to
figure out what it does is a lot of work. So what we did is we decided we had to do better than
that, especially if we're going to use search all the time. So bring this up, and hopefully this -- I
don't have a circle going. So what we're going to do is we're going to do search here, and what
we did is used Code Bubbles, essentially, as a front end for code search. So we go and -- I think
I pushed the search button.
>>: I'm not sure. Oh, yes, you got stuff back.
>> Steven Reiss: So it goes out, and in this case it's going out to Ohloh and GitHub and using
that and returning a search. Now, I can use this in various ways. I can now look at the code and
see what this does. Ordinal, I don't know whether I want to use that. Doesn't look all that
relevant, but you can essentially look at things the way you normally look at code. So I can look
at expand ordinal, see if that's at all relevant, and I can go and see what this routine does. And I
can try figuring out what's going on here. Let's take a look. I'll get rid of this, this. I'll go to the
next one. Format, format double, that looks like it might write -- format double, format long.
Zero, zero, that looks like it might be working or close to what I want.
>>: [Indiscernible].
>> Steven Reiss: What?
>>: It's got a nice comment in there with an empty then branch.
>>: Using JDK 1.2?
>>: I'm suspicious.
>> Steven Reiss: But you can, again, browse it. I can test if it's going to compile fairly quickly,
say build it, and you see it, and most of it compiles. There's some stuff down here that doesn't
compile, because it's using a consumer. But the write routines I probably don't need, but I could
do this type of investigation. If I want to get more, let's look at one more here. Here's ordinal,
get long, toOrdinal of int. That looks like a good routine. This uses that. Let's bring that up.
>>: Are we really your guinea pigs here?
>> Steven Reiss: For the talk?
>> Tom Ball: Is this the first time you're looking at this code? Are you carefully selecting
these?
>> Steven Reiss: Not particularly carefully.
>> Tom Ball: Not that I'm accusing you of anything, Steve.
>>: You're not a very good magician's assistant here.
>> Tom Ball: Well, it started out looking pretty bad. This one's looking better.
>> Steven Reiss: So this one looks like it works, except it uses a routine Cardinal, and if you
want that, that's probably also in the class, so what we can do here is go and tell it to expand this
result, which will get the other packages in the class. And it will go out and do a search for that
and add them in.
>> Tom Ball: So what happens if you really just want this one feature and you've got all this
other code? You want to just extract -- in the end, you want to extract the stuff that you really -- you don't want to take a dependence on all this other code.
>> Steven Reiss: If I find a single routine I want, I can just go here and accept that particular
result, and now I can go up here and -- this one, right? Save, which exports the accepted results
and just exports the code that I chose there.
>> Tom Ball: Okay.
>> Steven Reiss: And it also does other things not all that well, but you'll notice there's a suggest
button down here, and as I go and accept results and as I go and delete results, it will recompute
a set of words to do a new search with, so it's doing word analysis. It's sort of working. I haven't
rejected anything, but if I do a suggest here, it goes out, computes things, comes back maybe.
Maybe not. I haven't tried that. I don't know what's running at the back end. Okay, so we can
go back to the talk. We can now do search, and we actually use this to find -- get the user
interface code to work. So talk, talk. No, that's not talk. Where's the talk here? There's the talk.
Okay, so we had that. What we did here, it turned out we did this in a week or two weeks,
something like that, because we had Code Bubbles, and Code Bubbles is a very plug-in interface,
so you have Mint running, which is the message bus, and you have Code Bubbles there, and the
message bus talks to Eclipse at the back end with a little plug-in in Eclipse, and then you have
various active things, and then Code Bubbles has all these different plug-ins for code editing,
text editing, debugging and so forth. Now, we started with that for Rebus, and what we did is we
just removed all of the stuff that was Eclipse specific, including things like debugging, which
you're not going to do, and search and educational extensions in the programmer's notebook.
And then we added a back end, Rebase, to replace Eclipse, and that's it. That made it work.
Now, Rebase, we have to work on this back end, and it's fairly simple, as well. It talks to the
message bus. It has a command interface, which has various search commands. We have plug-ins for the different search engines, so you can add a new search back end in a matter of an
hour or so. It was an hour for the [indiscernible] code, which was the last one we added. It has
this code to handle word bags that deal with suggest. It has an editing interface to let you edit
the files, so I can actually edit the code there and then try compiling it and seeing what happens
as I edit to make it compile. Project interface has file, and then there's a plug-in for Java back
end, so you can plug in with whatever language semantics you want to make it work -- search
over particular languages, because it has to be able to compile and understand the code. So that
was the first problem that we had to deal with in user interface search, is getting the code for the
user interface. So assuming we can do that, we still have to find the user interfaces in the code,
get the code to compile and so on and so forth. Now, all of that we were convinced we could do,
and the way we were convinced of that is we had another search tool we did before. This is here.
And that's here. And this basically is what we're searching for is ordinal number, say third, just
to make it a little more specific. And we're told that we have a signature, which is a string of
ord, and a function returning a string, and we're giving it a test case. In this case, we say 17
should return seventeenth, and that's all we're telling it. And we want this to find code that
actually does that. Now, this is going off, and it's going to in this case GitHub and finding code
that matches ordinal number in third, taking every function in there, thinking that might be a
particular solution, and seeing if that solution works or not, making it compile first, and then
testing it against your test cases and whatever comes back, it'll -- if it passes the test cases, it's
going to accept it and show it. Now, there's a lot going on to make it all do that.
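As a rough illustration of the specification described here (a signature plus the test case that 17 should return seventeenth), the acceptance check applied to each candidate might be sketched in Java as follows. This is only a sketch under stated assumptions: the class name, the helper names, and the two stand-in candidates are hypothetical, and the candidates are plain lambdas here rather than code that was retrieved from a repository and compiled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

class OrdinalSpecSketch {
    /** One user-supplied test case: an input and the required output. */
    static final class TestCase {
        final int input;
        final String expected;
        TestCase(int input, String expected) {
            this.input = input;
            this.expected = expected;
        }
    }

    /** Returns the names of the candidates that pass every test case. */
    static List<String> passing(Map<String, IntFunction<String>> candidates,
                                List<TestCase> tests) {
        List<String> accepted = new ArrayList<>();
        for (Map.Entry<String, IntFunction<String>> c : candidates.entrySet()) {
            boolean ok = true;
            for (TestCase t : tests) {
                String got;
                try {
                    got = c.getValue().apply(t.input);
                } catch (RuntimeException e) {
                    got = null; // a crashing candidate simply fails the test
                }
                if (!t.expected.equals(got)) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                accepted.add(c.getKey());
            }
        }
        return accepted;
    }

    /** Stand-in for a routine found by search; handles only 0..19 to stay short. */
    static String smallOrdinal(int n) {
        String[] words = {
            "zeroth", "first", "second", "third", "fourth", "fifth", "sixth",
            "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth",
            "thirteenth", "fourteenth", "fifteenth", "sixteenth", "seventeenth",
            "eighteenth", "nineteenth"
        };
        return words[n];
    }

    /** Two hypothetical candidates, as if every method in the returned files were tried. */
    static Map<String, IntFunction<String>> demoCandidates() {
        Map<String, IntFunction<String>> m = new LinkedHashMap<>();
        m.put("suffixOnly", n -> n + "th"); // produces "17th", which is not the word form
        m.put("smallOrdinal", OrdinalSpecSketch::smallOrdinal);
        return m;
    }

    public static void main(String[] args) {
        List<TestCase> tests = List.of(
            new TestCase(17, "seventeenth"),
            new TestCase(3, "third"));
        System.out.println(passing(demoCandidates(), tests));
    }
}
```

Only candidates whose output matches every test case survive, which is the sense in which the test cases act as the real specification and the keywords merely narrow the search.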
>>: You're using VMs to protect yourself from malicious code, I hope.
>> Steven Reiss: Well, no. It's run as a separate process.
>>: Okay. It can still go to the file system.
>> Steven Reiss: And, two, it's running with the Java security measures.
>>: Okay, so that's an applet sort of?
>> Steven Reiss: Sort of thing. I could set whatever level of security I want, so I could set file
security to allow whatever I want to allow, and the default is don't let it write files.
>>: A question, how does the third help the search?
>> Steven Reiss: How does the?
>>: You search for ordinal number third, right? Are you going to do --
>> Steven Reiss: Well, I'm thinking that if it's going to convert things like 17 to seventeenth, it
has to convert three to third, and third's a word that's not going to appear that many other times,
so I'm trying to restrict the search in some way.
>>: And you expect the string third to appear in the code somewhere?
>> Steven Reiss: Well, if it's going to output to third, I expect -- that's going to be a hard thing
for the code to have, for the code to generate without having the string there.
>>: The special cases. So you're saying, I know some special cases.
>> Steven Reiss: I could do it without, and actually without, in GitHub, it works. Without in
Ohloh, it doesn't work. It just takes longer, because it has to try a lot more things, and I wanted
this to finish. So here, you see you get -- here's one piece of code that does it. And it's sort of
reasonable. Here's another piece of code.
>>: That looks like generated.
>> Steven Reiss: I think it's just typed in. It just reformatted everything. I told it to reformat. If
you want to use it, it basically will reformat into whatever style you want, and I told it to
reformat it in my style, so that's getting the output. Here's another case. This is actually similar
to the previous case, same code, probably minor change. And this is different code. But the
code exists. It's out there. You should be able to use it, and I can just cut and paste this and stick
it in my routine, and have some sense it'll work. I can do additional test cases if I want. It's
now in my format, and it's useful.
>>: Can you go find the licenses and pop those up, too, so you know what you're getting?
>> Steven Reiss: Okay. The license should be here. Why doesn't it have it?
>>: Okay. Just trying to make trouble.
>> Steven Reiss: But, seriously, it does have the license.
>>: Okay, somewhere.
>> Steven Reiss: If it had one, it should have a button over here to show up the licenses. It does
extract the license from the code, sticks it in a database. Now, maybe, as you said, the
distributed system depends on lots of machines. The database server might not be working at
this point. So it depends on the database server to look up the license base and the license ID.
Actually, since the system that's running here has been running for three or four weeks, it's
probable that the database server has at some point died and it hasn't necessarily recovered from
that. I'll try fixing that, though. But, yes, it should have a license, and we do extract the license.
Okay. So we had to get -- what we found in doing this, making that magic work, is you had to
get the code from the repository, find candidate methods, so find particular methods that might
convert a string to an int, an int to a string, get the code to compile and run, again, checking the
resultant code using users' test cases, validating the code, making sure that it works, and making
the results useful. Now, this is a lot like what we want to do to make user interface search work,
basically the same thing. And what we're trying to do here is take some set of specifications,
whether they're test cases or whether they're user interface sketch and essentially generate a
program from that using a black-box automatic programmer. In code search, you're doing
something similar. You're taking specifications, which happen to be keywords, usually, going
into a black-box search engine and you're getting code back. Well, actually, what you do is you
get that code back and then you do a lot of program analysis on your own and program editing
and try making it do what you want to do, figure out which code might work the way you want it
to and take that code and eventually you get a program. What we tried doing in S6 is take the
solutions you get from the code tool, code search engine, and do the type of transforms on those
that you as a programmer would do yourself to make it compile, to make it run and figure out
what actually would be useful. We can generate lots more solutions, we again do
transformations on those solutions. We eventually validate it and generate the program, and
that's basically what S6 does, so we think about taking specifications, which in this case are the
signature or the test cases and the keywords, feeding it into S6 and generating a program from
that. Now, the way this works is it first uses keyword search to go out to the search engine. That
generates a set of initial solutions. When we're looking for a method, we say each method in any
of the files that's returned is a candidate solution, and we generate all the candidate solutions.
Now we apply transformations. Transformations come in a whole bunch of varieties. There are
some simple ones like change the name of this method to be the name the user specifies. Very
simple transformation, except you have to change any recursive calls and things like that. Some
are a little more complex. It has an extra parameter here. Let's remove that parameter and insert
an assignment statement, so a Boolean whether to return upper or lowercase. We'll try true and
false and see whether each one of those works. And then there are even more complex solutions,
so find a line in the code that computes a value of the return type, do a backwards slice, until the
only free variables are values of the input types and extract that code from the function. This is
useful, because you often find code that does multiple things at once, and you only want one of
those things done, so we're essentially splitting the function up automatically. But we have about
40 different transformations we're doing, and each time you do one, you're generating a new
solution, and you just keep doing this until you can't apply any more transformations to generate
new solutions.
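The loop described here, applying transformations until no new solutions appear, can be sketched in Java as a simple worklist. This is a toy under loud assumptions: solutions are plain strings, the two transformations work by text replacement, and all names are hypothetical. A system like S6 would operate on parsed code, not on strings.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class TransformSearchSketch {
    /** Applies every transformation to every solution until nothing new is produced. */
    static Set<String> closure(Collection<String> initial,
                               List<Function<String, List<String>>> transforms) {
        Set<String> seen = new LinkedHashSet<>(initial);
        Deque<String> work = new ArrayDeque<>(seen);
        while (!work.isEmpty()) {
            String solution = work.removeFirst();
            for (Function<String, List<String>> t : transforms) {
                for (String next : t.apply(solution)) {
                    if (seen.add(next)) { // only unseen solutions are queued
                        work.addLast(next);
                    }
                }
            }
        }
        return seen;
    }

    /** Toy rename: every call site, including recursive calls, gets the new name. */
    static List<String> rename(String code, String oldName, String newName) {
        if (!code.contains(oldName + "(")) {
            return List.of();
        }
        return List.of(code.replace(oldName + "(", newName + "("));
    }

    /** Toy parameter removal: drop the boolean flag and try both constant values. */
    static List<String> dropUpperFlag(String code) {
        String param = ", boolean upper) {";
        if (!code.contains(param)) {
            return List.of();
        }
        return List.of(
            code.replace(param, ") { boolean upper = true;"),
            code.replace(param, ") { boolean upper = false;"));
    }

    static Set<String> demo() {
        String found = "String toOrdinal(int n, boolean upper) { return fmt(n, upper); }";
        List<Function<String, List<String>>> transforms = List.of(
            code -> rename(code, "toOrdinal", "ord"),
            TransformSearchSketch::dropUpperFlag);
        return closure(List.of(found), transforms);
    }

    public static void main(String[] args) {
        for (String s : demo()) {
            System.out.println(s);
        }
    }
}
```

The `seen` set is what makes the process stop: a solution reached by two different orders of transformations is kept once, so the search ends when no transformation yields anything new.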
>>: So how often does this really work? It seems awfully complicated, and do you have people
using it?
>> Steven Reiss: I use it.
>>: Yes.
>> Steven Reiss: It's on the web. People tried it.
>>: Have you done a study or something to see?
>> Steven Reiss: In the original paper, I had a number of test cases, 10, 15 test cases.
>>: So what sort of things were these?
>> Steven Reiss: I'd have to look at the original paper. They ranged the gamut from little simple
string manipulation functions to robots.txt, finding something that tells you whether there's a
robots.txt on this webpage. I did English stemming recently this way -- English stemming,
something that would stem English words or remove plurals, at least. But it depends -- there are
two things it depends on. One is the keywords, which given that we have the Rebus stuff there,
so you can do a better exploration and figure out the right set of keywords to add makes that
easier. And the second is the set of transformations, and we keep expanding that set of
transformations.
>>: I don't quite understand the transformations. So you said like slicing, for example?
>> Steven Reiss: We basically find a line that computes a value of the return type and do a
backwards slice until the only free variables are the input types, essentially identifying code
within a function that does what you want it to do, potentially.
>>: But you only specify keywords, so you have to specify some sort of interface to know that.
>> Steven Reiss: There were three things to the specification, keywords, the signature and the
test case.
>>: So the signature is going to help you a lot, but you might have functions that have all sorts
of genericity, have extra out parameters, in parameters.
>> Steven Reiss: We have transformations that remove extra parameters. We have
transformations that reorder parameters.
>>: I see. This is getting back to my original question, is how do you extract? So the
transforms are about extracting just the code that you need.
>> Steven Reiss: And making it compile.
>>: And then the tests are your validation.
>> Steven Reiss: Right. It works a surprising amount of the time.
>>: Okay, so we can just try it?
>> Steven Reiss: Yes.
>>: Okay, all right.
>> Steven Reiss: We have to do dependency analysis, because you saw in the ordinal number
case, the function is one thing, but it used some static variables, it used helper functions. And
that's what this does. It finds all the other things that are needed from that other than the original
solution. We generate test solutions. We actually run JUnit tests on all of these things. That's
what takes the most time. We actually analyze the failure reason for these solutions, and we can
do additional transformations, so that if it returns Seventeen, where Seventeen -- or the whole
thing is in caps, or we want to go all lowercase, we have transformations that can deal with that.
We have transformations that can deal with off-by-K errors, things like that. Generates new
solutions, they get tested. Eventually, you get some that pass, hopefully. We can do formatting
and then return it back to the user. So this magic can be used for user interface search. We
looked at user interface search as a problem along the lines of what we were doing here with S6.
You're not old enough to know what S6 is a play on words of, right?
>>: No, I don't understand it.
>> Steven Reiss: Okay. Bell Labs had a system called L6.
>>: L6?
>> Steven Reiss: Bell Labs Low-Level Linked List Language. It was the first -- it essentially
generated videos from the language. It was great. It was sort of early Lisp. It's a good name.
The issues here, we have to go get the initial files. Again, we're going to do -- we have things for
that. We have to get the candidate solutions. Solutions differ. We don't have methods, we have
user interfaces instead of that. We have some problems with files. We have to find appropriate
transformations, because the transformations used in S6 may not be appropriate for user
interfaces, or there might be additional transformations which are appropriate for user interfaces.
We have to find a way of testing the solutions. We don't have test cases anymore. We just have
this picture, and we have to present the solutions to the user, and presenting code isn't exactly
what you want to do here. So we had S6. What we did here is we removed parts of S6. We no
longer needed dependency analysis, and we didn't want to look at failed certain transformations
and failed solutions. Then we just added something to show the user to validate. So other than
that, it's the same architecture. It's actually using S6 in the back end. So finding user interfaces
in the code, this is what we need to do. So initial files, use keywords as before. This is going to
work, right? We're going to return the keyword, files. How many people write user interfaces to
fit in one file? Not that many. There are a few out there, and we found them, but if you want to
have a broader selection of user interfaces, people are going to write their auxiliary files. They're
going to write their own implementation of the list widget that they want some additional
features for. They're not going to necessarily nest it. They're going to have a user interface class
they're going to use or user interface package they're going to use for the rest of their system, and
you have to deal with all this. So interfaces tend to span multiple files, depends how they were
written. So what we did is we extended S6 to handle package and system search, so instead of
looking for files, you can say, look for everything in this package, and it will essentially find the
file that comes back from the keyword search and then expand that to include everything in the
package, just like we did in Rebus, to expand it. You can expand it a second time and get all the
related packages in there. You can do that in Rebus, too. It takes a little longer. This generates
a single source, and then we just do transformations. It has multiple classes in it. Probably is not
legal Java to compile, but it's close, and we can eventually generate real classes from that and
actually compile it. And we added a lot of new transformations, or a few new transformations, to
split classes up, to extract inner classes into outer classes, make classes static where need be.
>>: There's going to be -- are you reliant on any like Model-View-Controller split? Or are you
going to actually bring data of the application over? I mean, if you want to show the user
interface, then you have to generally populate it with something.
>> Steven Reiss: Yes. We'll get to that. That's transformations. Yes?
>>: So just a quick technical question. So one thing that can happen with UIs, due to things like
localizing text and things like that, is sometimes things can get externalized outside the code.
>> Steven Reiss: Yes, we deal with that. So we first have to identify -- that's in the Swing
transformations here. So we do transformations. That's what S6 does. So we added
transformations. Well, first we had to identify potential solutions and identify potential
interfaces. So any class that implements a component, we say the constructor for this class can
be a potential user interface. Any method, non-private method in a class that returns something
of type component, we said this could be a potential user interface. Now we have potential
solutions. We have to set it up so we can actually use it, so we set it up to have a separate class,
which effectively calls the constructor or calls the constructor and then calls the method and then
returns that user interface, so we have a transformation that does that. Now, we have to handle
all these weird cases of Swing and internationalization and everything else that goes along with
that, and we added these Swing transformations, and they effectively do that. They look at all
the calls to the Swing code, and if anything is undefined in those calls, or if they are calling the
Resource Manager at any point, replace that with -- don't remove the call, but replace it with
something with a fixed string or a new string each time, S6_Temp_somethingsomething. Some
unique string. We replace all Resource Manager calls. We have our own Resource Manager we
can substitute in. You have problems in that they'll use outside models, tree models or list
models, for your particular application. That's usually not in the same class. You may or may
not want that, to copy their whole model. So we'll replace it with our own model. So we have a
transformation which removes their model and substitutes our model. We have transformations
that if they're using some class -- we don't have a library, but it looks the same as a Swing class,
we'll replace it with a Swing class. But there are lots of different transformations that try to
make it so that it will compile and still generate a reasonable user interface. We have
transformations to simplify things. We are taking this code which is not just the user interface.
We want just the user interface back, so we want to throw away everything that isn't related to
the user interface, so we'll go and do a dependency analysis, and effectively any code that's not
used or that can't be reached directly we'll throw away, and then we'll throw away some other
code, too, if it's private methods and things like that. We'll make sure it compiles. We'll make
sure we remove anything that's other than Swing calls that looks undefined. And eventually,
we'll come up with code which is just the user interface -- and we also do all the standard ones -- and has a good chance of compiling. Now we need to test it. The code has to run, and what we
do is we actually run the code, we actually build the user interface from the code. That generates
a widget. We do an internal -- we basically crawl that widget, looking at the hierarchy imposed
by that widget and compare what the widget is to what the user's diagram looks like. And then
we order the results on how we match. Now, we have to do this. We have to understand the user
sketch and we have to match, develop a matching algorithm. The sketches, well, we didn't want
to start with arbitrary sketches, although I'm told, in another two weeks, we can start with
arbitrary sketches, because we have something which translates an arbitrary sketch into an SVG
diagram suitable for our use. So we're working on that. But then, we essentially take an SVG
diagram, which is structured graphics. You can do almost anything there, but it's great because
there are lots of tools that generate SVGs or let you do that, either web tools, web interfaces or
Inkscape and things like that. And then we have classifiers, which we run over the SVG, to try
figuring out what each component of the SVG is. Some of them group multiple items. They
assign properties to the items. So here's a simple example. It's a login panel, login dialog. We
first identify all the SVG elements here, so you have one for that text element there, text element
here, a rectangle here, a rectangle here with some text, a text element in it, a little box there,
remember me, text there and a circle around it, and then the whole thing is an element. Okay.
And now we basically identify what these things are, so we say that's a text field, and we say it's
probably going to be a text field, because it's a box and it's shaded. That's how the user would
typically shade it, or big outline around it. That's how you typically represent it. This is a text
field, as well, but because it has stars there, we're going to guess that it's a password field. These
are labels. That's fairly easy. They're not inside a box or anything. They're just text. These two
go together, because they're adjacent to one another, and this is sort of a box of some sort next to
a piece of text. This is probably a radio or a toggle button of some sort, and you're not going to
be able to differentiate them from a sketch. The user's going to make them look the same way.
This is, again, text inside a box, and it's either a text field, or in this case, because the box is
round, it's probably not a text field. It's probably a button. So we'll do that type of analysis on
that, and these are the properties we're looking for. We're looking for is it an input area or not, is
it a button or not, could they represent a table, a box with lines, vertical and horizontal lines, a
list, a choice thing where you have a box that has a down arrow on the right, so you know you
can choose multiple things there, plaintext, whether it's a symbol, some little icon the user tried
to draw, whether it's a line, whether it's something used only for grouping, it doesn't have any
other effect, options, multiline text versus single-line text, scrollbars we detect. We have a
classifier that looks for those. Numeric fields, if you only have numbers, we can detect that,
sliders, icons. We look for all these different types of things. So for example, components only
containing text, combo boxes. As I said, there's a box with a symbol to the right. Those are
examples.
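The login-dialog walkthrough above can be sketched as a small set of hand-written rules. This is only an illustration, not the classifiers used in the talk: the `Element` type, the `classify` function, the property names and the 20-pixel "little box" threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """Hypothetical, simplified stand-in for one SVG element."""
    kind: str                 # "rect", "text", or "circle"
    w: float = 0
    h: float = 0
    text: str = ""
    rounded: bool = False     # rectangle drawn with rounded corners
    children: List["Element"] = field(default_factory=list)


def classify(el: Element) -> str:
    """Guess a widget category for one element, following the talk's examples."""
    if el.kind == "text":
        return "label"                    # bare text, not inside a box
    if el.kind == "circle":
        return "toggle"                   # radio or checkbox; a sketch can't tell
    if el.kind == "rect":
        inner = [c for c in el.children if c.kind == "text"]
        inner_text = inner[0].text if inner else ""
        if el.rounded and inner_text:
            return "button"               # text inside a round box
        if el.w <= 20 and el.h <= 20:     # assumed threshold for a "little box"
            return "toggle"
        if inner_text and set(inner_text) == {"*"}:
            return "password_field"       # a box containing only stars
        return "text_field"
    return "unknown"
```

For instance, `classify(Element("rect", w=150, h=24, children=[Element("text", text="****")]))` gives `"password_field"` under these rules, while the same box without the stars is classified as a plain text field.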
>>: So you're not -- so you're assuming somebody's done the conversion of a picture to SVG.
>> Steven Reiss: Yes, right.
>>: SVG is telling you this thing's a box.
>> Steven Reiss: Right. And that's what I have a student working on.
>>: Okay, so you have this decomposition of SVG, the problem of taking just an arbitrary set of
pixels, array of pixels, and mapping it.
>> Steven Reiss: To SVG. I've separated that.
>>: Right, but once you have the SVG, you still have quite a bit of noise, so what are your
classifiers? How are your classifiers dealing with just the different ways a scrollbar may appear
and all of that?
>> Steven Reiss: They're basically hand-coded and trying to look at all different possibilities.
So for a scrollbar, I think it looks for a rectangular region, which is either long and narrow or tall
and narrow, one of those. And it has something at the top and bottom and maybe something in
the middle, so a little symbol. You typically will put an up arrow or down arrow.
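A rule of that shape might look like the sketch below. It is a guess at the structure, not the actual classifier: the function name `looks_like_scrollbar`, the representation of symbols as fractional positions, and the thresholds `min_aspect` and `end_fraction` are all assumptions. Using ratios rather than pixel counts is one simple way to keep such a rule independent of scale and position.

```python
def looks_like_scrollbar(w, h, marks, min_aspect=4.0, end_fraction=0.2):
    """Return True if a rectangle plausibly represents a scrollbar.

    w, h  : width and height of the rectangle
    marks : positions of small symbols (arrows, thumb) along the long
            axis, as fractions between 0.0 and 1.0
    """
    long_side, short_side = max(w, h), min(w, h)
    # Must be long and narrow (horizontal) or tall and narrow (vertical).
    if short_side <= 0 or long_side / short_side < min_aspect:
        return False
    # Something near each end, e.g. an up arrow and a down arrow.
    has_start = any(m <= end_fraction for m in marks)
    has_end = any(m >= 1.0 - end_fraction for m in marks)
    return has_start and has_end
```

A symbol in the middle is allowed but not required here, matching the "maybe something in the middle" wording.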
>>: But these things are robust to scale, translation. The classifiers are robust to many
transformations that might be done.
>> Steven Reiss: Right. I've tried to make them that way.
>>: Do you have data on how good they are?
>> Steven Reiss: I've done -- I'll show you the test cases I've done. And, yes, we have lots of
different ones. Go there. We now get a hierarchical set of items. You have nesting here. Each
item has a position and has a size. Each has properties, which are these things we can assign to
it. We now map each of these items into a candidate by saying this item can be one of these
types of widgets. This is the only place we're Swing specific. If you wanted to use Android or
some other toolkit, you'd just use a different mapping here, so we'll say this can be a text field or
a text region, whatever it can be. We also look at and determine relative positioning between
these components, so if this is close enough, we'll say this is above this one, this is to the left of
this one, this is to the right of this one. If it's nested, we can say it's at the top of its nesting thing
or at the bottom of its nest, of its parent. And we're generating a hierarchical component
specification which looks something like this for the example we had, so you had the thing.
Here's the component. That's the top. It says that it's a panel. Then you have a component
which is the label, which is the username there. You have a component which is a text thing.
That's this thing here, which is left of the label, which the label is on the left of, and so on. So
this is what we're trying to match against, effectively, and that's the output of our prescan. Now,
we have to compare the hierarchy. We have to compare the actual widget to this hierarchy. So
we get the widget hierarchy through introspection, we find all possible mappings, or in theory we
will find all possible mappings between the actual components in the widget and the user -- the
components in the hierarchical specification. It turns out we are going to stop at some point,
because there can be an exponential number of these. If you have 20 text fields, you have two to
the 20th different matchings. You don't want to look at all of them, or if there are 32 text fields,
you're never going to finish. But we'll find some set of matchings there, and then the matching
has to obey the hierarchy, so it says this is nested in. We have to be nested. The type constraints
are obeyed. Most of the abstract widgets are mapped, and there are not too many extra
components, and we're a little fuzzy here, and there are reasons for being fuzzy, and I'll get to
them. Then we score the mappings, so we'll look, and you have a diagram here. We'll look at
the width and height of the actual widget, and if it's within 100, we'll give you points for that.
We'll give you credit for the width and credit for the height. If it's strings, we're going to do a
string difference on the text, so if the strings are the same, we're going to give you quite a few
points for that. You have relative positioning. We'll look at the relative positioning between the
two, and if you're within 50 pixels, we'll give you points for that. If you have extra components,
we're going to take off points, or if you have missing components, we're going to take off points.
So we're going to penalize and we're going to add, and we essentially come up with a mapping.
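The scoring just described can be sketched roughly as follows. Only the 100 and 50 tolerances come from the talk; the point values, the penalty weights, the dictionary layout and the function name `score_mapping` are invented for illustration, and comparing offsets between consecutive pairs is just one possible reading of "relative positioning".

```python
from difflib import SequenceMatcher


def score_mapping(pairs, n_missing, n_extra):
    """Score one candidate mapping between sketch items and real widgets.

    pairs     : list of (sketch_item, widget) dicts with x, y, w, h, text
    n_missing : sketch items with no matching widget
    n_extra   : widgets with no matching sketch item
    """
    score = 0.0
    for item, widget in pairs:
        if abs(item["w"] - widget["w"]) <= 100:   # width close enough
            score += 1
        if abs(item["h"] - widget["h"]) <= 100:   # height close enough
            score += 1
        if item.get("text") and widget.get("text"):
            # String difference on the text: identical strings earn the most.
            same = SequenceMatcher(None, item["text"].lower(),
                                   widget["text"].lower()).ratio()
            score += 3 * same                     # assumed weight
    # Relative positioning: neighbouring items should sit in a similar
    # arrangement in the sketch and in the widget.
    for (s1, w1), (s2, w2) in zip(pairs, pairs[1:]):
        dx = (s2["x"] - s1["x"]) - (w2["x"] - w1["x"])
        dy = (s2["y"] - s1["y"]) - (w2["y"] - w1["y"])
        if abs(dx) <= 50 and abs(dy) <= 50:
            score += 1
    # Penalties (assumed weights).
    score -= 2 * n_extra + 4 * n_missing
    return score
```

The result is only meaningful for ranking candidates against each other: the same pairs with extra or missing components score lower than a clean match.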
>>: Sorry. So the scores --
>> Steven Reiss: It's for ranking things.
>>: Right, right, exactly. So I would think a good property you'd want the scoring system to
have would be something like you'd want the differences in scores to be sort of proportional to
the amount of engineering effort it would take to overcome the gap. So, for example, things
being kind of in the wrong pixel order, like something's supposed to be above but it's really
below, that's a super-quick fix, right? That's no big deal for you to code. But, for example, if
something is supposed to have a password box and it doesn't have it, obviously, it takes a big
engineering effort to do that.
>> Steven Reiss: There's a larger penalty for extra components and missing components.
>>: I don't know, I guess my question is --
>> Steven Reiss: Missing is the largest.
>>: How did you come up with the scoring?
>> Steven Reiss: Heuristically. Experimentation. There's scoring, but a good match is going to
be at the top, and you don't care about the things towards the bottom, anyway. To validate the
results, I'm not going to do the validation. We're going to show the results to the user, and you
saw that original screen. And we have the ability to turn on the code, so we can go back to
where we were, maybe. I'm going to want that later on. This one. Yes, there are our results.
What we can do here is we can click on any of those results, and we actually get the -- it lets us
run that immediately. We can see what it looks like. We can play with it, and we can see all the
events that are generated. We can see what happens, if it lets us -- this one doesn't let you
change the size, but if it did let you change the size, we could play with that. Here's another one.
I don't know why it doesn't let you change the size. But you can interact with it and see if it's
acceptable. The ones you like, you can accept here, or you can accept them here, or reject them.
And these two are the same.
>>: Can you show the original picture? Can you view these side by side?
>> Steven Reiss: Yes. I'll just have to find it. Because I didn't get rid of it. There it is.
>>: There it is. So it sort of had a big text area on the left, although sort of [indiscernible]
except at the bottom, and then it's got some fields on the right with text labels.
>> Steven Reiss: No. This is missing in all of them. Some labels at the top are missing in all of
them, but they're reasonably close. And something like this, it has name, address, whatnot. But
once you accept whatever you want, you can now say get the code, and it returns the code for
your accepted things. What we haven't done is make the code something you'd want to use yet,
because most of the code that's out there that generates user interfaces you don't want to touch
with a 10-foot pole, but that's another set of transformations we have to work on. We are. But
you can actually get the code that implements that interface. You could do a few other things.
We let you do some minor editing of the user interface, as well. So I made it -- but I can change
the labels to say what I want and rearrange, make things visible or not visible as I go. And if the
code is interactive, you can actually interact with it, so if it has buttons, tabs, to show you
different fields, the interaction will show you that.
>>: Question. So user interfaces seem like a particularly hard example, right? Because there's
stitched together multiple levels of things. You have to understand the user's intent in terms of
just the image, the parsing aspect of it and so on. And then down all the way to the code
comprehension stuff. Do we have code search working well for non-user-interface-related tasks?
Can you search effectively for a --
>> Steven Reiss: S6 will do it fine if you have test cases. If you don't have test cases, I don't
know --
>>: Yes, with test cases.
>> Steven Reiss: With test cases, yes, S6 tends to work, if you can find the right keyword.
>>: Have people been using that?
>> Steven Reiss: I have. I haven't made it that public, but I have been using it. Whenever I start
writing code where I think -- I don't feel like writing this routine, whether it's doing string
manipulation, whether it's doing English stemming, robots.txt or even simple things, command-line parsing, for example.
>>: So the test cases you provide are usually textual. Can I describe I want a red-black tree
implementation that's got some tweak on it? How would you specify something like that?
>> Steven Reiss: Well, it's hard to do a test for a red-black tree, because any tree
implementation should pass all of your tests. Now, you could do keyword search with red-black
and then see if you have a tree implementation in there, and that might give you what you want,
or you could give it a very complex case and see which ones have the proper timing.
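The point about functional tests can be illustrated with a small sketch. Both classes below are made up for the example and neither is a red-black tree, yet both pass the same input/output test. Only a structural or timing check on a hard input (here, keys inserted in sorted order) separates them from a balanced tree, whose height is known to be at most 2*log2(n+1) for a red-black tree.

```python
import bisect
import math


class SortedListSet:
    """Ordered set backed by a sorted list; not a tree at all."""
    def __init__(self):
        self._items = []

    def insert(self, key):
        i = bisect.bisect_left(self._items, key)
        if i == len(self._items) or self._items[i] != key:
            self._items.insert(i, key)

    def in_order(self):
        return list(self._items)


class UnbalancedBST:
    """Plain binary search tree; each node is a list [key, left, right]."""
    def __init__(self):
        self._root = None

    def insert(self, key):
        if self._root is None:
            self._root = [key, None, None]
            return
        node = self._root
        while key != node[0]:
            side = 1 if key < node[0] else 2
            if node[side] is None:
                node[side] = [key, None, None]
                return
            node = node[side]

    def in_order(self):
        out, stack, node = [], [], self._root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node[1]
            node = stack.pop()
            out.append(node[0])
            node = node[2]
        return out

    def height(self):
        level = [self._root] if self._root is not None else []
        levels = 0
        while level:
            levels += 1
            level = [c for n in level for c in (n[1], n[2]) if c is not None]
        return levels


def functional_test(tree_cls):
    """A black-box test that any correct ordered-set implementation passes."""
    tree = tree_cls()
    for key in [5, 1, 4, 1, 9, 2]:
        tree.insert(key)
    return tree.in_order() == [1, 2, 4, 5, 9]


def exceeds_red_black_bound(height, n):
    """True if the height is larger than a red-black tree with n keys allows."""
    return height > 2 * math.log2(n + 1)
```

Inserting 200 keys in increasing order into `UnbalancedBST` produces a chain of height 200, far above the red-black bound, even though `functional_test` passes for both classes.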
>>: It's hard to do a test case from looking at it just functionally from the outside. But if you are
able to say I want a tree data structure, I want it to look like this.
>> Steven Reiss: S6 will handle that. It's not easy to use -- it has a notion of user context. If
you actually run it from within Code Bubbles, you can write your own test cases, and there are
transformations where it'll use your classes and it'll take code that has some other class there and
map that class to use your class. So if you have your own node, whatever you want to
implement for that, it'll try transformations which do that mapping for you.
>>: So do you feel like the basic code search functionality in S6 is good enough and you feel
like that's almost a solved problem and that's why you're exploring this one more level on top of
it which has its own --
>> Steven Reiss: I think it's a solved problem, given you probably need another dozen
transformations to handle everything, to do some of the mappings there, especially on the context
stuff. The context mappings we have are sort of primitive. Two, you need a lot more CPU
power. I'm right now running this on my desktop machine. Now, my desktop machine has 16
cores and 64 gig, so it's not a trivial machine, but it could actually run in a cluster very easily.
And you need to have the right keywords to begin with. Keyword search is really bad. And you
need to be able to do test cases, and that only works, as you noted, in a smaller set of things. But
I think for those cases, yes, we've basically shown that if you can find the code, we can do
the transformations and make it useful, if the code exists in the repository. If you tried doing
something really complex, the code might not exist in the repository. In that case, there's not
much we can do. This is why we aim for guided code search, where you do it piece by piece,
and the pieces are there. But user interfaces are one of the things -- trying to go beyond test case,
what can you specify, and this is one. Another one I want to work on this summer is
architectures, where I can specify with a UML diagram and then do a search for something that
implements that architecture. Okay. So this time I'll start from current slide. So we did that.
We did that. So to validate this, we did a web search. I don't know if we used Bing or Google,
one of those, for user interface sketch, and we basically found all these things out there. We
converted them to SVG just by taking the picture and making it an SVG diagram that looked
basically the same. And then we had to figure out keywords, and the first time through, without
the Rebus there, half of them failed because we didn't have the right keywords. After we could
go and find the right keywords, we could make all of these things so that we could actually find
the corresponding interfaces. And you can see logins should be fairly easy, and you have lots of
solutions that it had to look at, and it actually found 45. But all of these, it found them. The stars
here, there were other solutions that it didn't try, that it stored up, and basically if it finds some -- it only looks at a limited number at a time, so there are potentially more solutions there. And we
didn't look at more than about 5,000 solutions on any of these. There are test cases when we're
doing -- looking at things at not the file level, but we expand it to package or system, where we
look at 50,000 solutions or something along that ilk, and it takes hours rather than minutes to do.
But we did find solutions for all of those problems.
>>: I just have a philosophical question. I mean, there's been a serious amount invested in
designers for GUIs, where people draw the things they want and then you get the code. So if you
go this route, you get a piece of code that sort of is close to what you want, but are you really
better off in the end this way than starting with what Visual Studio provides and doing the
design? I'm very suspicious, right? Because you're getting something that's not really what you
want, and it's got a bunch of underlying code.
>> Steven Reiss: Yes. You may or may not want to use it, there might be easier ways. But if
you try using any of those toolkits, you find that it's a lot of work to make it so that you have the
right hierarchy there, the window grows correctly, you have the right interactions between all the
things.
>>: So you're saying when I get the code --
>> Steven Reiss: The code potentially includes all that.
>>: So somebody else has gone through the hard part of really making it work well, so I can
drag and drop the widgets, but then I still have a lot more work to do.
>> Steven Reiss: You still have a lot more work, and it may or may not scale and do the right
thing when you resize the window.
>>: Yes. Okay.
>> Steven Reiss: That's one argument I can give. The second argument I can give is that I want
to do UI development. Here's my initial sketch. This is -- I started with the initial sketch for the
Pizza program. I relaxed it a little, and I got all these different sketches, which are different user
interfaces for restaurant ordering, basically. So I can use it for exploration. So I can take my
initial sketch, and most of these sketches aren't your really final interface. Even the address
book, I didn't separate name and surname. Seeing the examples there, maybe I should be doing
that. Maybe there are other fields I'm missing.
>>: If I've never designed a UI before, this is valuable just to see what people have done.
>> Steven Reiss: That's one of the reasons we make the scoring fairly lax, not only because we
want to find something, but also because you want to have the ability to find alternative
interfaces. That's a valid use in and of itself. So here's a nice one. It doesn't look necessarily
like the original, but it has most of the capability and it actually does move the total there.
There's interactive code for that. And there's another use. We're able to generate an image, so
going back to our example here, this one. Let's remove everything. That's what the big red
button does, and do login, J checkbox, J password, on both and do a search. I should remove all
the windows here. Now we can take something here or take this, and say show the user
interfaces. So not only can we search the code, we can now tell it to find the user interfaces in
this code. It'll go essentially using the technology that we developed here to show the user
interfaces that can be generated from that code. You can see the different transformations to get
different labels, but you can do this for everything to see what the code looks like. Okay. So I'll
go back here, play from current slide. So we can use that, current status, S6, as I said, is a
functional website. Rebus is actually distributed as part of Code Bubbles. I don't tell anybody,
but if you just run Code Bubbles minus Rebus, you get that. The user interface stuff is under
development. The code is actually out there, not necessarily the current code, but it's there, and
we have a number of things to do yet, which I've mentioned. We're working on going to direct
manipulation interfaces, rather than just static ones. I don't know how to do that. There are not
that many Java ones out there that do direct manipulation, and these don't do it cleanly. And
then we're looking at, as I said, frameworks. Our eventual goal is to take this little toy here and
write a simulator for it with the user interface and the physics engine and everything else without
writing any code ourselves, doing it all via code search. So we want to start with specs and
essentially do program creation from code search. We're also looking at bugfixing via code
search. We do a lot of other things. I didn't want to take too much time.
>> Tom Ball: Thank you, Steve. Simulation of more applause.