XSnippet: Mining For Sample Code
Naiyana Tansalarak and Kajal Claypool
Presented by: Shan Li
CISC864

Topics
  Overview of the Research
    Purposes
    Contributions
    Approaches
  Details of the Approaches

Overview of the Research
  Purpose: to provide sample code so that new developers can learn a technology quickly
  Approach: mining sample code from existing software systems

Overview of the Research cont.
  Steps in the Approach
    Range of queries: generalized or specialized?
    Ranking heuristics for context-sensitive and context-independent retrieval
      Such as: constructor function / constructor function of DOM
    Mining algorithms
      BFSMINE algorithm, restricted to the scope of a single method
      Extensions to the BFSMINE algorithm

Approaches: the Snippet Mining Process
  Figure 1: A high-level view of the snippet mining process

Approaches cont.
  The goal of snippet mining is to mine, from a given code sample repository, all code snippets that satisfy a given user query Q.
  The SelectionAgent pre-selects a set of code model instances cm_i using a B+ tree index defined on all types declared or referred to in the code sample repository.
  The MiningAgent invokes the BFSMINE algorithm for every pre-selected code model instance cm_i.
  The BFSMINE algorithm traverses a code model instance and produces as output a set of paths P that represent the final code snippets returned to the user.
  On completion of the BFSMINE phase, the MiningAgent passes the collection of paths P to the PruningAgent.

Queries
  The query returns all snippets s containing code that instantiates a type tq:
    (1) all code that instantiates tq;
    (2) code in which the instantiation of tq depends on the code context, i.e. via a static method.
  The following example: [example not preserved in this text]
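The three-stage mining process described above (selection, BFSMINE traversal, pruning) can be pictured with a small sketch. This is not the authors' implementation: the function names `select`, `bfs_paths`, `prune`, and `mine` are hypothetical, each code model instance is assumed to be a plain dictionary mapping a node to the nodes it depends on, a linear scan stands in for the B+ tree index, and de-duplication stands in for pruning, whose actual rules the slides do not state.

```python
from collections import deque

# Assumption: a code model instance is a dict {node: [nodes it depends on]}.
# Real code models distinguish type, object, and method nodes; this does not.

def select(repository, query_type):
    """SelectionAgent stand-in: keep the models that mention the query type.
    A linear scan replaces the B+ tree index named in the slides."""
    return [model for model in repository if query_type in model]

def bfs_paths(model, query_type):
    """MiningAgent stand-in: breadth-first walk starting at the query node.
    A path is emitted once its last node has no unvisited dependencies
    (an assumed stopping rule, chosen only for illustration)."""
    paths = []
    queue = deque([[query_type]])
    while queue:
        path = queue.popleft()
        successors = [n for n in model.get(path[-1], []) if n not in path]
        if not successors:
            paths.append(path)
            continue
        for nxt in successors:
            queue.append(path + [nxt])
    return paths

def prune(paths):
    """PruningAgent stand-in: drop duplicate paths, keeping first-seen order."""
    seen = set()
    unique = []
    for path in paths:
        key = tuple(path)
        if key not in seen:
            seen.add(key)
            unique.append(path)
    return unique

def mine(repository, query_type):
    """Run selection, traversal, and pruning in sequence."""
    collected = []
    for model in select(repository, query_type):
        collected.extend(bfs_paths(model, query_type))
    return prune(collected)
```

Each returned path plays the role of one candidate snippet; a real system would map the path back to source statements before showing it to the user.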
Queries
  Type-based instantiation query
    tq is instantiated from any type in the context CT(m).
    T(s) denotes the lexically visible types in the code snippet s, and CT(m) denotes the type context of the method m.
    CT(m): the set of all inherited types, all types visible in the scope of the method, and all types of local fields.

Queries
  Parent-based instantiation query
    s denotes a snippet, CP(s) the parent context of the snippet, and CP(m) the parent context of the method m.
    CP(m): the parent context of a method m is the set containing the superclass extended by its containing source class C, as well as all interfaces implemented by C.

Source Code Model
  A graph representation of the structure of source code.
  Nodes: type node, object node, method node
  Edges: inheritance, implement, composite, method, assignment, or parameter edge

BFSMINE Algorithm
  Given a user query, the goal of the BFSMINE algorithm is to determine, for all such instances nq, the types and eventually the code segments that instantiate the node nq and hence the query type tq.
  Domain(nq) = {tq}

Extension-BFSMINE

Personal Comments
  Strengths
    User-defined queries
      Results range from context-independent retrieval to various degrees of context-sensitive retrieval
    BFSMINE algorithm
      Based on a graph that represents a source code model
      Allows mining across method boundaries
    Ranking heuristics (length, frequency, context) for providing best-fit code snippets
      Multiple sample code snippets for the same query
      Context-independent retrieval (length / frequency)
      Context-sensitive retrieval (context)

Personal Comments
  Potential weaknesses
    Results
      Is it possible to provide semantic ranking? Why?
      Probably, the returned code snippets have no logic connecting them; each is just a chunk of code.
    Validation approach
      To show that code snippets are helpful to developers, the authors use a group test: two groups work under the same conditions, except that one group uses code snippets and the other does not.
      Limited?
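To make the ranking discussion more concrete, the three heuristics named under Strengths (length, frequency, context) could be combined roughly as below. This is a sketch under stated assumptions, not the ranking used by XSnippet: the function `rank_snippets` and its input shape are hypothetical, and the scoring directions (more shared types first, then more frequent, then shorter) are illustrative choices that the slides do not specify.

```python
from collections import Counter

def rank_snippets(snippets, developer_context):
    """Order distinct snippets by context overlap, then frequency, then length.

    snippets: list of (lines, types) pairs; lines is a tuple of code lines,
              types is the set of type names the snippet uses.
    developer_context: set of type names visible where the developer works.
    All scoring rules here are assumptions made for illustration.
    """
    # Frequency heuristic: how often the same snippet text was mined.
    frequency = Counter(lines for lines, _ in snippets)
    # Collapse duplicates so each distinct snippet is ranked once.
    unique = {lines: types for lines, types in snippets}

    def sort_key(item):
        lines, types = item
        # Context heuristic: count of snippet types already in scope.
        context_score = len(types & developer_context)
        # Negated scores sort descending; length sorts ascending.
        return (-context_score, -frequency[lines], len(lines))

    return [lines for lines, _ in sorted(unique.items(), key=sort_key)]
```

Passing an empty `developer_context` reduces this to the context-independent case, where only frequency and length decide the order. None of these scores looks at what a snippet actually does, which is the gap the semantic-ranking question points at.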