Automated Round-trip Software Engineering in Aspect Weaving Systems

Mikhail Chalabine, Christoph Kessler, Peter Bunus
Programming Environments Laboratory
Dept. of Computer and Information Science
Linköping University, Linköping, Sweden.
{mikch, chrke, petbu}@ida.liu.se

Abstract

We suggest an approach to Automated Round-trip Software Engineering in source-level aspect weaving systems that allows for transparent mapping of manual edits in the woven program back to the appropriate source of origin, which is either the application core or the aspect space.

1 Introduction

Automated Round-trip Software Engineering (ARE) is a demanding problem in a wide range of modeling and programming systems. The goal of ARE is to automatically assure consistency between different interrelated system artifacts, such as views, models, or code, and, in particular, to propagate updates made in the derived artifacts back to their sources of origin. The generic approach to ARE builds around the classical mathematical definition of a bijective mapping and its inverse, which, albeit working beautifully in theory, is difficult to realize in practice, as automatic inversion of arbitrary refactoring functions on arbitrary code is hard [3]. The problem, therefore, calls for domain-specific solutions.

In this paper we sketch a solution to ARE applicable to a specific class of source-level code refactoring systems where refactoring actions, such as preprocessing, aspect application, semi-automatic transformations, or manual editing, are encoded as aspects statically woven into a core. A series of weaving steps transforms an initial, possibly empty, core into a final version of the program [8]. (Note that the model also covers refactoring programs in Invasive Software Composition [3].) The approach is of particular importance to Invasive Interactive Parallelization (IIP) [5], where an interactive system must allow for consistency-preserving manipulations in the composed (parallelized, woven) code.
We show that, provided we distinguish between the original core code and aspect code, the inverse mapping can be implemented in three steps. First, derive the origin of a lexeme being edited, i.e., decide whether it stems from an aspect or the core and find its (row, column) position relative to the aspect or the core. Then propagate the update to the origin and, finally, recompose the system, i.e., re-apply all refactoring actions that causally followed the updated source.

2 Motivational example

The following simple example illustrates how manual edits in the (transformed) code can be mapped back to either the application core or to the appropriate element in the aspect space. Let us consider a banking application in which a BankAccount class is defined as follows:

    public class BankAccount {
        private float balance;
        private float safetyDeposit;

        public BankAccount(float p_balance, float p_sdeposit) {
            balance = p_balance;
            safetyDeposit = p_sdeposit;
        }

        // Withdraws the given amount from the current balance.
        public void credit(float amount) {
            balance = balance - amount;
        }

        public float getBalance() { return balance; }
        public float getSDeposit() { return safetyDeposit; }
    }

21st IEEE International Conference on Automated Software Engineering (ASE'06) 0-7695-2579-2/06 $20.00 © 2006 Authorized licensed use limited to: Linkoping Universitetsbibliotek. Downloaded on January 6, 2010 at 11:59 from IEEE Xplore. Restrictions apply.

A very simple banking transaction can be illustrated by the following BankTransaction class:

    public class BankTransaction {
        public static void main(String[] args) {
            BankAccount ac = new BankAccount(500, 400);
            ac.credit(200);
            ac.credit(100);
        }
    }

in which two withdrawals are performed on a newly created account. Now let us impose that the owner of an account needs to have at least 250 in the current deposit before withdrawing money, and that after each withdrawal the balance should be positive.
We will introduce these concerns as an AspectJ aspect:

    public aspect secureTrans {
        pointcut checkings(BankAccount bk):
            execution(* BankAccount.credit(float)) && target(bk);

        before(BankAccount bk) : checkings(bk) {
            assert bk.getBalance() > 250;
        }

        after(BankAccount bk) returning: checkings(bk) {
            assert bk.getBalance() > 0;
        }
    }

The aspect secureTrans contains two constraints (assertions) that should be inserted (woven) before and after the execution of the credit method. Now let us suppose that our system has a means to display the woven code to the programmer. In this case the BankTransaction class will look like:

    public class BankTransaction {
        public static void main(String[] args) {
            BankAccount ac = new BankAccount(500, 400);
            assert ac.getBalance() > 250;
            ac.credit(200);
            assert ac.getBalance() > 0;
            assert ac.getBalance() > 250;
            ac.credit(100);
            assert ac.getBalance() > 0;
        }
    }

Now let us consider a manual edit at the woven code level that attempts to relax the first assertion by specifying that, before each withdrawal, the amount of money in the current deposit plus the amount of money in the safety account should exceed 250. This edit will transform the first assertion into:

    assert ac.getBalance() + ac.getSDeposit() > 250;

At this point we might consider that the manual edit should be applied only at this program point, leaving the other assertions unmodified. On the other hand, since the original code stemmed from an aspect advice, we might also consider modifying the advice in the originating aspect, rewinding all other applications of the aspect, and reapplying the modified aspect to the original core. In this case the assertion before the second withdrawal attempt will be modified as well.

3 Background and notation

In the following, we assume that a syntactically correct source program is given and is parsed according to a context-free grammar.
A concrete syntax tree (CST) or parse tree [1] is a tree representing the syntactic structure of a source program. Its inner nodes correspond to nonterminals and its leaves to terminals (tokens) of the underlying context-free grammar of the source language, as resulting from the parsing process. Whitespace (such as blank spaces, tabs, and line breaks) and comments in the source program can be represented properly in the CST by adding artificial “fill” terminals before every terminal occurrence on the right-hand side of every production rule.

A syntactic program point, or program point for short, is a triple p = (s, r, c) that consists of a source file name s and the row index r and column index c in the source file s where the lexeme of a token begins.

A tree pattern is a generic CST with at least one tree variable symbol. Tree patterns can be formulated in a superset of the base language. The actual code for performing the necessary matching functionality (testing for equality of symbols etc.) can then be generated automatically from tree pattern descriptions. Generic tree pattern matching with similar pattern description syntax is widely used in generic tree transformation systems such as TXL [4], puma [6], or TRAFOLA [7], as well as in retargetable code generation.

Similar to COMPOST [3], we explicitly declare the pattern type in the pattern specification. This is the type of the associated nonterminal of the pattern root together with variable symbols. This eases parsing the pattern specification and pattern matching, and it makes the composition interface explicit. In COMPOST, trees for program fragments are included in containers, so-called fragment boxes, which are typed explicitly with predefined fragment box types, such as class boxes, method boxes, etc. In practice, these fragment box types always correspond to the contained root nonterminal.
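To make the notion of a tree pattern with tree variables concrete, the following minimal Java sketch matches a pattern against a CST fragment and collects the variable bindings. It is only an illustration under stated assumptions: the names TreePatternSketch, Node, and match, and the convention that symbols starting with "$" denote tree variables, are hypothetical and do not come from TXL, puma, TRAFOLA, or COMPOST. The root symbol of the pattern stands in for the explicitly declared pattern type.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class TreePatternSketch {
    /** A CST node: a grammar symbol plus ordered children (leaves have none). */
    record Node(String symbol, List<Node> children) {
        static Node of(String symbol, Node... children) {
            return new Node(symbol, List.of(children));
        }

        /** Assumed convention: a symbol starting with '$' is a tree variable. */
        boolean isVariable() {
            return symbol.startsWith("$");
        }
    }

    /** Matches pattern against tree; returns the variable bindings on success. */
    static Optional<Map<String, Node>> match(Node pattern, Node tree) {
        Map<String, Node> bindings = new HashMap<>();
        return matchInto(pattern, tree, bindings) ? Optional.of(bindings) : Optional.empty();
    }

    private static boolean matchInto(Node pattern, Node tree, Map<String, Node> bindings) {
        if (pattern.isVariable()) {
            // A variable matches any subtree, but repeated uses must bind equal subtrees.
            Node previous = bindings.putIfAbsent(pattern.symbol(), tree);
            return previous == null || previous.equals(tree);
        }
        // Non-variable nodes must agree in symbol and in number of children.
        if (!pattern.symbol().equals(tree.symbol())
                || pattern.children().size() != tree.children().size()) {
            return false;
        }
        for (int i = 0; i < pattern.children().size(); i++) {
            if (!matchInto(pattern.children().get(i), tree.children().get(i), bindings)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Pattern for a call "$recv.credit($arg)"; the root symbol "call" acts as the pattern type.
        Node pattern = Node.of("call", Node.of("$recv"), Node.of("credit"), Node.of("$arg"));
        Node tree = Node.of("call", Node.of("ac"), Node.of("credit"), Node.of("200"));
        match(pattern, tree).ifPresent(b -> System.out.println(
                "$recv = " + b.get("$recv").symbol() + ", $arg = " + b.get("$arg").symbol()));
        // prints "$recv = ac, $arg = 200"
    }
}
```

A real system would generate such matching code from pattern descriptions rather than interpret patterns at run time, and would also carry the "fill" terminals and program points on the leaves; both are omitted here for brevity.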
We view an aspect as a set of pattern-matching-based rewrite rules applicable to program source code that together manifest a certain concern. Each rewrite rule has a left-hand side that consists of a tree pattern, and a right-hand side (also called advice) that corresponds to a subtree substitution that is applicable when the pattern matches a node in the program tree.

A weaving transformation $A$ is either an application of a (matching) aspect rule to a CST $T$, or a subtree substitution stemming from a manual editing step. We denote a weaving transformation $A$ applied to $T$ resulting in a tree $T'$ by

$$T \xrightarrow{A} T'$$

A hook [3], also known as a join point in AspectJ terminology and as a redex in graph transformation theory, is a program point where the left-hand side of an aspect rule matches. Hooks can be implicit, given in terms of program elements that match the pattern stated in the aspect rule, or explicit, given by a manual marking of a specific program element (or sequence of elements) by the programmer such that an aspect can address it.

Given a CST $T_0$ of a core program and a sequence of weaving transformations (aspect applications or edits) $A_1, \ldots, A_n$, the weaving sequence

$$T_0 \xrightarrow{A_1} T_1 \xrightarrow{A_2} T_2 \cdots \xrightarrow{A_n} T_n \qquad (1)$$

pulls the core through a sequence of intermediate states $T_1, \ldots, T_n$, where $T_n$ holds the final version of the woven code.

Assmann [2] defines inverse transformations and automatic round-trip engineering (ARE) as follows: Let $A$, $B$ be two domains, and $f : A \to B$ a transformation function from function space $F$. If there is a functional $i : F \to F$ which calculates for $f$ its inverse $f^{-1} \in F$, then $R = (A, B, f, i)$ is an automatic round-trip engineering system (ARE). The problem with this definition is that it is nonconstructive.
For certain transformations such as code removal, the inverse may not exist at all, or the system may not be able to derive it automatically from stored information. We, therefore, choose an approach that does not need to construct inverse transformations at all. All that is needed is a means to trace back the origin of code, and to store the weaving sequence (in forward order).

4 Problem Solution

Assume some core program is given by its CST $T_0$ and $T_n$ is the final transformation state as in (1). Consider a manual edit of the woven code in the final state $T_n$. In order to back-propagate the edit we must first find the origin of the code being edited, i.e., a proper syntactic point $(s, r, c)$.

4.1 Program point location

In order to track the location information we extend the nodes of the CST-based high-level intermediate program representation (IR) by some ancillary data and functionality. In a straightforward implementation, every augmented node stores its current coordinates in terms of a syntactic point $(s_w, r_w, c_w)$. It further keeps a history of aspects being applied to it, which is encoded as a list of syntactic points, see e.g., Figure 1(a). Elements of this list hold a pointer to a location in every refactoring aspect rule applied to the node.

Given such an augmented IR, the source of origin for a program point is derived in the following way. When the user clicks the woven code, see e.g., the bottom interface frame shown in Figure 1(b), the system searches the IR of the woven code for a leaf node (token) with the coordinate values coinciding with the (row, column) coordinates of the click. When the node is hit, the Get history() method returns its position in the source of origin, see Figure 1(a). A more efficient implementation of this base principle is being developed in ongoing work.

We shall now describe a mechanism of propagating updates in the woven code to the proper source.
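Before turning to propagation, the location step of Section 4.1 can be illustrated with a minimal Java sketch. It is a sketch under stated assumptions rather than the implementation referred to in the text: the class and method names (OriginLookupSketch, Token, tokenAt, shiftRows, origin), the file names, and all coordinates are hypothetical. Two further assumptions are made: the first entry of the history list is taken to be the point where the lexeme was introduced, and a click is considered a hit if it falls anywhere inside the lexeme, which slightly relaxes the exact coordinate match described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class OriginLookupSketch {
    /** Syntactic program point (s, r, c): source name, row, column. */
    record ProgramPoint(String source, int row, int column) {}

    /** Leaf (token) of the woven-code IR, augmented with location data. */
    static final class Token {
        final String lexeme;
        ProgramPoint woven;                                   // current (s_w, r_w, c_w)
        final List<ProgramPoint> history = new ArrayList<>(); // points in core or aspect sources

        Token(String lexeme, ProgramPoint woven, ProgramPoint origin) {
            this.lexeme = lexeme;
            this.woven = woven;
            this.history.add(origin);
        }

        /** Read-only view of the recorded history. */
        List<ProgramPoint> getHistory() {
            return List.copyOf(history);
        }

        /** Assumption: the first history entry is where the lexeme was introduced. */
        ProgramPoint origin() {
            return history.get(0);
        }

        /** True if the click (row, column) falls inside this lexeme in the woven code. */
        boolean covers(int row, int column) {
            return woven.row() == row
                    && column >= woven.column()
                    && column < woven.column() + lexeme.length();
        }
    }

    /** Searches the woven-code leaves for the token under the click. */
    static Optional<Token> tokenAt(List<Token> tokens, int row, int column) {
        return tokens.stream().filter(t -> t.covers(row, column)).findFirst();
    }

    /** After weaving inserts lines, shifts every token at or below fromRow by delta rows. */
    static void shiftRows(List<Token> tokens, int fromRow, int delta) {
        for (Token t : tokens) {
            if (t.woven.row() >= fromRow) {
                t.woven = new ProgramPoint(t.woven.source(), t.woven.row() + delta, t.woven.column());
            }
        }
    }

    public static void main(String[] args) {
        List<Token> tokens = new ArrayList<>();
        // Core token "ac" of "ac.credit(200);" (hypothetical coordinates).
        tokens.add(new Token("ac",
                new ProgramPoint("woven", 4, 5),
                new ProgramPoint("BankTransaction.java", 4, 5)));
        // Weaving the before-advice inserts one line at row 4 and pushes the core line down.
        shiftRows(tokens, 4, 1);
        tokens.add(new Token("assert",
                new ProgramPoint("woven", 4, 5),
                new ProgramPoint("secureTrans.aj", 6, 9)));

        // A click inside "assert" resolves to the aspect source.
        tokenAt(tokens, 4, 7).ifPresent(t -> System.out.println(
                t.origin().source() + ":" + t.origin().row() + ":" + t.origin().column()));
    }
}
```

The linear scan in tokenAt mirrors the straightforward implementation; an index keyed by row would be one possible way to speed up the lookup.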
For clarity of the presentation we split the propagation problem into two subproblems.

4.2 Woven-code–to–core propagation

In woven-code–to–core propagation an update is born to all the states $T_i$, $i \in (1, \ldots, k-1, k+1, \ldots, n)$. It is either applied as just another weaving transformation (that is, $T_n \xrightarrow{A_{n+1}} T_{n+1}$), leaving the preceding history intact, or is directly committed to the original core $T_0$, which is subsequently subjected to re-application of the whole weaving series (replay). Observe that, due to the change in $T_0$, re-application is likely to result in a different configuration (weaving sequence), as the back-propagation can introduce new hooks and eliminate existing ones.

4.3 Woven-code–to–aspect propagation

In woven-code–to–aspect propagation an update is born to the aspect source, which, in turn, requires consistent propagation to all or only a subset of hooks affected by the aspect. Similar to the propagation to a core, an update can be encoded as just another local weaving transformation ($T_n \xrightarrow{A_{n+1}} T_{n+1}$) or be directly committed to the aspect source. The user can be asked, e.g., through a clickable dialog, which of the two alternatives she prefers: apply a local patch, or commit the update to the aspect source and replay.

(a) A fragment of the woven-code CST. Aspect secureTrans inserts a new subtree shown in the dashed rectangle. Each node carries position information, shown for nodes with calls to bk.getBalance() and ac.credit() only. Following each weaving step, the $(s_w, r_w, c_w)$ coordinates at every node located after the weaving point are adjusted for the transformation length.
(b) A manual edit in the woven code, shown bold in the bottom frame, is propagated to the proper source of origin, which in this case is the secureTrans aspect shown in the top-right frame.

Figure 1. A simple solution attaches position and the transformation history to nodes in the CST.

5 Conclusion and future work

We sketched how to track the origin of lexemes in the woven code and how to use this mechanism for implementing ARE in aspect-weaving systems. The suggested solution makes it possible to keep the woven code and the sources consistent when the woven code is manually updated. We plan to improve the performance of the suggested scheme and to formalize it in terms of generic tree transformations. Work is also ongoing on an ARE-enabled implementation of an interactive weaving system, Aspect-Diesel, for a simple Pascal-like base language.

Acknowledgments

This research was supported by the SSF project RISE (Research on Integrated Software Engineering) and the CUGS graduate school.

References

[1] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, MA, 1986.
[2] U. Aßmann. Automatic Roundtrip Engineering. Electronic Notes in Theoretical Computer Science, 82(5), 2003.
[3] U. Aßmann. Invasive Software Composition. Springer, 2003.
[4] I. A. Carmichael and J. R. Cordy. The TXL Programming Language Syntax and Informal Semantics Version 7. Dept. of Computing and Information Science, Queen’s University at Kingston, Canada, June 1993.
[5] M. Chalabine and C. Kessler. Crosscutting concerns in parallelization by invasive software composition and aspect weaving. In Proceedings of the 39th Hawaii International Conference on System Science. IEEE, 2006.
[6] J. Grosch. Transformation of Attributed Trees using Pattern Matching. In U. Kastens and P. Pfahler, editors, Fourth Int. Conf. on Compiler Construction (CC’92), Springer LNCS vol. 641, pages 1–15, Oct. 1992.
[7] R. Heckmann and G. Sander.
TrafoLa-H Reference Manual. In Program Development by Specification and Transformation: The PROSPECTRA Methodology, Language Family, and System, pages 275–313. Springer LNCS Vol. 680, 1993.
[8] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. V. Lopes, J.-M. Loingtier, and J. Irwin. Aspect-oriented programming. In Proceedings of the 11th European Conference on Object-Oriented Programming, volume 1241, pages 220–242. Springer LNCS, 1997.