XSnippet: Mining For Sample Code

Naiyana Tansalarak and Kajal Claypool
Presented by: Shan Li
CISC864
Topics

Research Overview
Purposes
Contributions
Approaches
Approaches in Detail
Research Overview

Purpose
To provide sample code so that new developers can learn a technology quickly

Approach
Mining sample code from existing software systems
Research Overview cont.

Steps in the approach

Range of queries: generalized to specialized
Ranking heuristics: context-sensitive and context-independent
For example: any constructor call versus a constructor call for a DOM type

Mining algorithms
BFSMINE algorithm: restricted to the scope of a single method
Extensions to the BFSMINE algorithm
Approaches: The Snippet Mining Process
Figure 1: A high-level view of the snippet mining process
Approaches cont.

The goal of snippet mining is to mine, from a given code sample repository, all code snippets that satisfy a given user query Q.
The SelectionAgent pre-selects a set of code model instances cm_i using a B+ tree index defined on all types declared or referred to in the code sample repository.
The MiningAgent invokes the BFSMINE algorithm for every pre-selected code model instance cm_i ∈ C.
Approaches cont.

The BFSMINE algorithm traverses a code model instance and produces as output a set of paths P that represent the final code snippets returned to the user.
On completion of the BFSMINE phase, the MiningAgent passes the collection of paths P to the PruningAgent.
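
The hand-off between the three agents can be sketched roughly as follows. This is only an illustration under several assumptions: a plain dictionary stands in for the B+ tree index, `mine` is a placeholder for BFSMINE that reads precomputed paths, and the pruning step is assumed to remove duplicate paths (the slides do not say what the PruningAgent does). All function names and the sample repository are hypothetical.

```python
from collections import defaultdict

def build_type_index(repository):
    """Map each type name to the code model instances that mention it.

    A dict stands in for the B+ tree index (assumption made for brevity).
    """
    index = defaultdict(list)
    for cm in repository:
        for type_name in cm["types"]:
            index[type_name].append(cm)
    return index

def select(index, query_type):
    """SelectionAgent step: pre-select instances that mention the query type."""
    return index.get(query_type, [])

def mine(cm, query_type):
    """MiningAgent step: placeholder for BFSMINE.

    Here it only returns stored paths that end at the query type.
    """
    return [p for p in cm["paths"] if p[-1] == query_type]

def prune(paths):
    """PruningAgent step (assumed behaviour): drop duplicates, keep order."""
    seen, unique = set(), []
    for p in paths:
        if tuple(p) not in seen:
            seen.add(tuple(p))
            unique.append(p)
    return unique

def run_query(repository, query_type):
    """Run selection, mining, and pruning in sequence for one query type."""
    index = build_type_index(repository)
    paths = []
    for cm in select(index, query_type):
        paths.extend(mine(cm, query_type))
    return prune(paths)

# Hypothetical repository: two instances yield the same path, one is unrelated.
repository = [
    {"types": {"Document", "DocumentBuilder"},
     "paths": [["DocumentBuilder", "Document"]]},
    {"types": {"Document", "DocumentBuilder"},
     "paths": [["DocumentBuilder", "Document"]]},
    {"types": {"String"}, "paths": [["String"]]},
]
snippets = run_query(repository, "Document")
```

Pre-selecting by type keeps the mining step from touching instances that never mention the query type.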
Approaches cont.

Queries

The query returns all snippets s containing code that instantiates a type tq:
(1) all code that instantiates tq;
(2) code in which the instantiation of tq depends on the code context, e.g. via a static method.
[Figure: example of the query]
Approaches cont.

Queries

Type-based instantiation query

tq is instantiated from any type in the type context CT(m).
T(s) denotes the lexically visible types in the code snippet s, and CT(m) denotes the type context of the method m.
CT(m): the set of all inherited types, all types visible in the scope of the method, and all types of local fields.
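
A small sketch of how CT(m) could be assembled and used as a filter. The matching rule used here (a snippet qualifies when T(s) shares at least one type with CT(m)) is an assumption made for illustration; the slides do not state the exact condition. The function names and the sample type names are hypothetical.

```python
def type_context(inherited, visible_in_scope, local_field_types):
    """CT(m): union of inherited types, types visible in the method's
    scope, and types of local fields."""
    return set(inherited) | set(visible_in_scope) | set(local_field_types)

def type_based_match(snippet_types, method_context):
    """Assumed rule: T(s) and CT(m) have at least one type in common."""
    return bool(set(snippet_types) & set(method_context))

# Hypothetical context for a method m.
ct_m = type_context({"EditorPart"}, {"IFile"}, {"ISelection"})

matches = type_based_match({"IFile", "ICompilationUnit"}, ct_m)
no_match = type_based_match({"String"}, ct_m)
```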
Approaches cont.

Queries

Parent-based instantiation query

s denotes a snippet, CP(s) the parent context of the snippet, and CP(m) the parent context of the method m.
CP(m): the parent context of a method m is the set containing the superclass extended by its containing source class C, as well as all interfaces implemented by C.
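
The definition of CP(m) translates directly into a set computation. The sketch below assumes a minimal, hypothetical `SourceClass` record that holds only a superclass name and the names of implemented interfaces; the class and type names in the example are illustrative.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SourceClass:
    """Hypothetical record for the class C that contains method m."""
    name: str
    superclass: Optional[str] = None
    interfaces: set = field(default_factory=set)

def parent_context(containing_class):
    """CP(m): the superclass of C plus every interface C implements."""
    parents = set(containing_class.interfaces)
    if containing_class.superclass is not None:
        parents.add(containing_class.superclass)
    return parents

# Example: class MyEditor extends EditorPart implements ISelectionListener
c = SourceClass("MyEditor", "EditorPart", {"ISelectionListener"})
cp_m = parent_context(c)
```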
Approaches cont.

Source Code Model

A graph representation of the structure of source code.
Nodes: type nodes, object nodes, and method nodes.
Edges: inheritance, implementation, composition, method, assignment, and parameter edges.
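
One possible in-memory encoding of such a graph is sketched below. The node and edge kinds come from the slide; the class names, the edge direction (from subclass to parent), and the sample declaration are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum

class NodeKind(Enum):
    TYPE = "type"
    OBJECT = "object"
    METHOD = "method"

class EdgeKind(Enum):
    INHERITANCE = "inheritance"
    IMPLEMENT = "implement"
    COMPOSITE = "composite"
    METHOD = "method"
    ASSIGNMENT = "assignment"
    PARAMETER = "parameter"

@dataclass(frozen=True)
class Node:
    kind: NodeKind
    name: str

@dataclass
class CodeModel:
    """Edge list of (source, kind, target) triples."""
    edges: list = field(default_factory=list)

    def add_edge(self, source, kind, target):
        self.edges.append((source, kind, target))

    def outgoing(self, node, kind=None):
        """Targets reachable from `node`, optionally filtered by edge kind."""
        return [t for s, k, t in self.edges
                if s == node and (kind is None or k == kind)]

# Example: class Foo extends Bar implements Runnable
foo = Node(NodeKind.TYPE, "Foo")
bar = Node(NodeKind.TYPE, "Bar")
runnable = Node(NodeKind.TYPE, "Runnable")

model = CodeModel()
model.add_edge(foo, EdgeKind.INHERITANCE, bar)
model.add_edge(foo, EdgeKind.IMPLEMENT, runnable)
```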
Approaches cont.

BFSMINE Algorithm

Given a user query, the goal of the BFSMINE algorithm is to determine, for all node instances nq with Domain(nq) = {tq}, the types and eventually the code segments that instantiate the node nq and hence the query type tq.
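
To give a feel for the breadth-first idea, the sketch below walks backwards from a query node and collects every acyclic chain of nodes that leads to it. It is not the authors' BFSMINE: the `producers` dictionary is a hypothetical, heavily simplified stand-in for the code model, and the example graph is illustrative only.

```python
from collections import deque

def bfs_paths(producers, target):
    """Return every acyclic path ending at `target`, found breadth-first.

    `producers[n]` lists the nodes from which node `n` can be obtained
    (hypothetical encoding, not the paper's code model).
    """
    paths = []
    queue = deque([[target]])
    while queue:
        path = queue.popleft()
        head = path[0]
        # Skip predecessors already on the path so cycles cannot loop forever.
        preds = [p for p in producers.get(head, []) if p not in path]
        if not preds:
            # Nothing left to expand: the path starts at a source node.
            paths.append(path)
            continue
        for p in preds:
            queue.append([p] + path)
    return paths

# Illustrative graph: two ways to obtain a Document.
producers = {
    "Document": ["DocumentBuilder", "DOMImplementation"],
    "DocumentBuilder": ["DocumentBuilderFactory"],
}
paths = bfs_paths(producers, "Document")
```

Because the queue is processed in first-in, first-out order, shorter chains are reported before longer ones.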
Approaches cont.

Extensions to BFSMINE
Personal Comments

Strengths

User-defined queries
Results range from context-independent retrieval to various degrees of context-sensitive retrieval
The BFSMINE algorithm works on a graph that represents the source code model, which allows mining across method boundaries
Ranking heuristics (length, frequency, context) provide the best-fit code snippets

When one query returns multiple code samples:
context-independent retrieval ranks by length / frequency
context-sensitive retrieval ranks by context
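
A rough sketch of how the three heuristics might order results. The directions (shorter first, more frequent first, larger overlap with the method's type context first) and the way length and frequency are combined are assumptions made for illustration; the slides only name the heuristics. The function names and sample data are hypothetical.

```python
def rank_context_independent(snippets):
    """Assumed ordering: fewer lines first, then higher frequency first."""
    return sorted(snippets, key=lambda s: (len(s["lines"]), -s["frequency"]))

def rank_context_sensitive(snippets, method_context):
    """Assumed ordering: more types shared with the method context first."""
    return sorted(snippets, key=lambda s: -len(s["types"] & method_context))

# Hypothetical results for one query.
results = [
    {"id": "a", "lines": ["l1", "l2", "l3"], "frequency": 5,
     "types": {"IFile"}},
    {"id": "b", "lines": ["l1"], "frequency": 1,
     "types": {"String"}},
    {"id": "c", "lines": ["l1"], "frequency": 4,
     "types": {"IFile", "IEditorPart"}},
]

by_length_frequency = [s["id"] for s in rank_context_independent(results)]
by_context = [s["id"] for s in
              rank_context_sensitive(results, {"IFile", "IEditorPart"})]
```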
Personal Comments

Potential weaknesses

Results
Is it possible to provide semantic ranking?
Why? The returned code snippets probably have no logical relationship to one another; each is just a chunk of code.

Validation approach
To show that code snippets help developers, the authors ran a group test: two groups worked under the same conditions, except that one group used the code snippets and the other did not.
Is this validation too limited?