Reengineering Large-Scale Polylingual Systems

The University of Texas at Austin
Reengineering of Large-Scale
Polylingual Systems
Mark Grechanik, Dewayne E. Perry,
and Don Batory
The Center for Advanced Research In Software Engineering (ARISE)
Polylingual Systems

Polylingual systems consist of interoperating
programs (or COTS components) that are written
in two or more languages or are run on two or
more platforms


The native type system is the type system of the host
language in which a program is written
A program written in a host language interoperates with
a program based on a Foreign Type System (FTS)
Pn Pk
Pn
Pk
Pn Pk
Examples of Polylingual Systems

A C++ program and an EJB interoperate
PC++ PJava
PC++
PJava
PC++ PJava

A C# program and a Python program interoperate
PC# PPython
PC#
PPython
PC# PPython
Large-Scale Polylingual Systems


Polylingual systems can be represented as graphs of
interoperating programs
  - Circles mean programs
  - Arrows mean interoperating APIs
For a clique with n programs, the complexity of APIs
used to interoperate programs is O(n^2)
We need a scalable approach for designing, implementing, and
maintaining large-scale polylingual systems!
[Figure: clique of interoperating programs P1, P2, P3, P4, …, Pn]
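The quadratic growth can be checked with a short calculation. The sketch below assumes one API per unordered pair of interoperating programs; the function name `pairwise_apis` is hypothetical and does not come from the slides.

```python
from itertools import combinations


def pairwise_apis(n: int) -> int:
    """Number of distinct program pairs in a clique of n programs: n(n-1)/2."""
    return n * (n - 1) // 2


# Cross-check the closed form by enumerating the pairs explicitly.
programs = [f"P{i}" for i in range(1, 6)]
pairs = list(combinations(programs, 2))

print(len(pairs), pairwise_apis(5))
```

If each arrow in the graph is counted per direction, the count doubles to n(n-1), which is still O(n^2).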
Assumptions

Reflection is available for all platforms
The cost of reflection is insignificant
Hardware is powerful and cheap
The cost of network communications outweighs the
cost of reflection by an order of magnitude
Polylingual systems are based on recursive
type systems
Core Abstraction
[Figure: object graph with nodes CEO, CFO, CTO, Test, and Geeks and attributes Name, Bonus, and Salary]
int n = R["CEO"]["CTO"]["Geeks"];
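The path expression on this slide can be imitated with chained indexing over a nested structure. This is a minimal sketch, not the ROOF implementation: the class `ReificationOperator` here is a hypothetical Python stand-in, and the graph shape and values are invented for illustration.

```python
class ReificationOperator:
    """Hypothetical stand-in for R: wraps a nested dict acting as an object graph."""

    def __init__(self, node):
        self._node = node

    def __getitem__(self, name):
        child = self._node[name]  # raises KeyError for an unknown path element
        # Inner nodes stay wrapped so indexing can be chained; leaves come back as values.
        return ReificationOperator(child) if isinstance(child, dict) else child


# Illustrative graph; the shape and the values are assumptions, not taken from the slide.
company = {"CEO": {"Name": "Ann", "CTO": {"Name": "Bob", "Geeks": 12}}}

R = ReificationOperator(company)
n = R["CEO"]["CTO"]["Geeks"]
print(n)
```

Each index step returns another wrapper until a leaf is reached, which is what lets a single expression walk an arbitrarily deep path.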
Operations On Reification Operators
Copy
Creates a copy of an element or attribute and adds it to its new
location. All properties of an element or an attribute are cloned
including all nested elements
Move
It is identical to the copy operation except for the automatic removal of
the original element or attribute upon completion of copying
Add
It appends elements and attributes under a given path
Remove
It removes elements and attributes from the given path. If a removed
element contains nested elements then the entire branch of the graph
under the removed element is deleted
Relational
Compares graphs and their elements with constants, variables, or
other graphs
Logic set
Computes various logic set operations such as intersection, union,
cartesian product, complement, and difference
Composition
Composes two reification operators
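Four of the operations listed above (Add, Remove, Copy, Move) can be sketched on a nested dictionary that plays the role of the object graph. The function names and the list-of-keys path convention are assumptions made for this example only.

```python
import copy


def _resolve(graph, path):
    """Follow a list of keys from the root and return the node it names."""
    node = graph
    for key in path:
        node = node[key]
    return node


def add(graph, path, name, value):
    """Append an element or attribute under the given path."""
    _resolve(graph, path)[name] = value


def remove(graph, path, name):
    """Delete an element; any nested branch under it goes with it."""
    del _resolve(graph, path)[name]


def copy_element(graph, src_path, name, dst_path):
    """Clone an element, including all nested elements, to a new location."""
    _resolve(graph, dst_path)[name] = copy.deepcopy(_resolve(graph, src_path)[name])


def move_element(graph, src_path, name, dst_path):
    """Copy, then remove the original once copying has completed."""
    copy_element(graph, src_path, name, dst_path)
    remove(graph, src_path, name)


g = {"CEO": {"CTO": {"Salary": 1.0}, "CFO": {}}}
add(g, ["CEO", "CFO"], "Bonus", 2.0)
copy_element(g, ["CEO"], "CTO", ["CEO", "CFO"])
move_element(g, ["CEO", "CFO"], "Bonus", ["CEO", "CTO"])
remove(g, ["CEO", "CFO"], "CTO")
print(g)
```

The deep copy is what makes Copy clone nested elements rather than share them, so later changes to the original do not leak into the copy.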
Our Solution: Reification
Object-Oriented Framework (ROOF)

Basic idea: each component in a polylingual system is
represented as a graph of objects and a uniform set of APIs is
provided to navigate and manipulate these objects

We use the generality of graphs to develop a language- and
platform-independent solution for polylingual systems

Reification Object-Oriented Framework
  - Reify objects from an FTS to the host language
  - Remote objects become first-class objects
  - Reification is based on reflection
  - ROOF hides all the complexity that programmers have to deal
    with today
Bird's-Eye View of the ROOF
Foreign Object Reification Language (FOREL)
Reification Object-Oriented Framework (ROOF)
CORBA   .Net   XML   HTML   DBMS
Reification Mechanism
HTML Parser
  <H2>
    <B>
      <FONT size="2">
        Hello World!
      </FONT>
    </B>
  </H2>
C++ Program
  …
  String s;
  s = R["H2"]["B"]["FONT"];
  …
[Figure: the reification operator R, from HTML to C++, maps the nodes H2, B, and FONT of the parsed HTML onto the path used in the C++ program, and the result is assigned to s]
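The HTML example can be approximated with the standard-library parser. This is a sketch under stated assumptions: `Node`, `GraphBuilder`, and `reify` are hypothetical names, only the first child with a given tag is kept, and void elements such as `<br>` are not handled.

```python
from html.parser import HTMLParser


class Node:
    """One element of the reified HTML graph."""

    def __init__(self, tag):
        self.tag = tag
        self.children = {}  # child tag name -> Node (first occurrence only, for brevity)
        self.text = ""

    def __getitem__(self, tag):
        # html.parser reports tag names in lower case, so look up case-insensitively.
        return self.children[tag.lower()]

    def __str__(self):
        return self.text.strip()


class GraphBuilder(HTMLParser):
    """Builds a tree of Node objects while the standard parser walks the markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self._stack[-1].children.setdefault(tag, node)
        self._stack.append(node)

    def handle_endtag(self, tag):
        if len(self._stack) > 1:
            self._stack.pop()

    def handle_data(self, data):
        self._stack[-1].text += data


def reify(html_text):
    """Parse the markup and return the root of the resulting object graph."""
    builder = GraphBuilder()
    builder.feed(html_text)
    builder.close()
    return builder.root


R = reify('<H2><B><FONT size="2">Hello World!</FONT></B></H2>')
s = str(R["H2"]["B"]["FONT"])
print(s)
```

The host program never touches the parser API directly; it only navigates the graph, which is the point of the mechanism shown on the slide.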
Reification Mechanism
Java Virtual Machine
  class JCls
  {
    String GetString()
    {
      return( new String("Hello World!"));
    }
  }
C# Program
  …
  String s;
  s = R["JCls"]["GetString"];
  …
[Figure: the reification operator R, from Java to C#, maps JCls and GetString in the Java Virtual Machine onto the path used in the C# program, and the result "Hello World!" is assigned to s]
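Since reification is based on reflection, the lookup by name can be sketched with Python's `getattr`. No JVM is involved here: `JCls` below is a Python stand-in for the Java class, and `Reified` and `fts` are hypothetical names introduced for this example.

```python
from types import SimpleNamespace


class JCls:
    """Python stand-in for the Java class on the slide (no JVM is involved here)."""

    def GetString(self):
        return "Hello World!"


class Reified:
    """Hypothetical wrapper that resolves path elements through reflection."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, name):
        member = getattr(self._target, name)  # reflective lookup by name
        if isinstance(member, type):
            return Reified(member())          # a class: instantiate and keep navigating
        if callable(member):
            return member()                   # a method: invoke it and return the result
        return member                         # a plain attribute


# Namespace standing in for the foreign type system.
fts = SimpleNamespace(JCls=JCls)

R = Reified(fts)
s = R["JCls"]["GetString"]
print(s)
```

A real bridge would also have to marshal arguments and results across the language boundary, which this sketch leaves out.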
Properties of the ROOF

Our solution does not introduce
  - Additional type systems
  - Hard-to-learn API
  - Special constraints that affect programmer’s
    decisions to share objects
ROOF allows programmers to
  - Avoid using any naming mechanisms
  - Type check foreign objects at compile time
  - Other reasons
FORTRESS


We exploit properties of FOREL-based code
to recover the high-level design of polylingual
systems with a high degree of automation
Our solution is the FOReign Types Reverse
Engineering Semantic System (FORTRESS)
  - Normalize code to conform to the FOREL grammar
  - Analyze FOREL-based code using program
    analysis techniques (control flow analysis, CFA, and data flow analysis, DFA)
  - Infer schemas that describe FTS models and
    operations executed against them
FORTRESS Process
[Figure: FORTRESS process with stages Normalized code, Compiler Front end, Program Analysis, Schema Inference, Visualization Engine, and GUI]
FTS RE Algorithm
1) Parse the source code and build an AST
2) Build a control flow graph
3) Build a data flow graph
4) For each branch in the control flow graph do
   a) Detect reachability of statements accessing and
      manipulating reified types
   b) Create schema definitions from reified types
   c) Translate operations on reified type instances to
      operations of the schema definition elements
   d) Output the schema and operations on its instances
5) End For
[Side labels on the slide: Program Analysis, Schema Inference, Output Generation]
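The first and fourth steps can be illustrated on a toy scale. In this sketch Python source stands in for FOREL-based code, the standard `ast` module plays the compiler front end, and no control flow or data flow graph is built, so it only finds statements that access reified types. It assumes Python 3.9 or later, where a subscript index is stored directly in `node.slice`; `path_of`, `PathCollector`, and `reified_paths` are hypothetical names.

```python
import ast


def path_of(node):
    """Return ['CEO', 'CTO', ...] for an expression R['CEO']['CTO']..., else None."""
    keys = []
    while isinstance(node, ast.Subscript) and isinstance(node.slice, ast.Constant):
        keys.append(node.slice.value)
        node = node.value
    if isinstance(node, ast.Name) and node.id == "R" and keys:
        return list(reversed(keys))
    return None


class PathCollector(ast.NodeVisitor):
    """Collects the longest R[...] chains found in the AST."""

    def __init__(self):
        self.paths = []

    def visit_Subscript(self, node):
        path = path_of(node)
        if path is not None:
            self.paths.append(path)  # do not descend: inner subscripts are only prefixes
        else:
            self.generic_visit(node)


def reified_paths(source):
    """Parse the source into an AST and list every access to a reified type."""
    collector = PathCollector()
    collector.visit(ast.parse(source))
    return collector.paths


source = '''
n = R["CEO"]["CTO"]["Geeks"]
R["CEO"]["CFO"]["Bonus"] = 10
x = other["CEO"]
'''
paths = reified_paths(source)
for p in paths:
    print(" -> ".join(p))
```

A fuller version would run this per branch of a control flow graph, as the algorithm describes, instead of over the whole module at once.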
Schema Inference

SELECT u.Name, c.Course FROM User u,
Courses c WHERE u.ID = c.ID;
  - Two tables: User and Courses
  - Attributes Name and ID in the User table
  - Attributes Course and ID in the Courses table
  - The declaration of attribute ID in both tables is the
    same or compatible
Schema Inference
[Figure: inferred schema]
User: Name, ID
Courses: Course, ID
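The inference from the SELECT statement can be sketched with two regular expressions. This is not a SQL parser: it assumes the simple shape of the query on the slide (a FROM clause of `Table alias` pairs followed by WHERE), and the function name `infer_tables` is hypothetical.

```python
import re


def infer_tables(sql):
    """Infer table names and their attributes from alias.attribute references."""
    # Text between FROM and WHERE, e.g. "User u, Courses c".
    from_clause = re.search(
        r"FROM\s+(.*?)\s+WHERE", sql, re.IGNORECASE | re.DOTALL
    ).group(1)

    alias_to_table = {}
    for item in from_clause.split(","):
        table, alias = item.split()
        alias_to_table[alias] = table

    schema = {table: set() for table in alias_to_table.values()}
    # Every "alias.attribute" occurrence contributes one attribute to its table.
    for alias, attribute in re.findall(r"\b(\w+)\.(\w+)\b", sql):
        schema[alias_to_table[alias]].add(attribute)
    return schema


sql = "SELECT u.Name, c.Course FROM User u, Courses c WHERE u.ID = c.ID;"
schema = infer_tables(sql)
print(schema)
```

The equality `u.ID = c.ID` is what suggests that the two ID declarations are the same or compatible; the sketch records the attributes but does not model that constraint.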
Schema Inference in FTSs
ReificationOperator R;
float var = 100000.0;
R["CEO"]["CTO"]("Salary") = var;

What can we infer from this statement?
The structure of a branch of the data flow
  - Composite type CEO of some FTS
  - Attribute Salary of type CTO
  - The type of this attribute and a value that it is
    assigned in this branch
Schema Inference in FORTS
R["CEO"]["CTO"]("Salary") = var;
[Figure: the statement mapped onto the schema path CEO, CTO, Salary]
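The inference for this statement can be sketched as folding one observed assignment into a schema. The representation below (a dict with `attributes` and `children` per composite type) and the function name `record` are assumptions made for illustration, not the FORTRESS data model.

```python
def record(schema, path, attribute, value):
    """Fold one observed assignment into the schema.

    path      -- non-empty list of composite types, e.g. ["CEO", "CTO"]
    attribute -- attribute name, e.g. "Salary"
    value     -- the value assigned in this branch of the data flow
    """
    children = schema
    for type_name in path:
        entry = children.setdefault(type_name, {"attributes": {}, "children": {}})
        children = entry["children"]
    # The attribute belongs to the last composite type on the path.
    entry["attributes"][attribute] = type(value).__name__
    return schema


var = 100000.0
schema = record({}, ["CEO", "CTO"], "Salary", var)
print(schema)
```

Recording further statements against the same schema merges their paths, so the schema grows as more branches are analyzed.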
Synergy

Program analysis and the schema inference engine are a
powerful combination
  - Create the schemas that reflect the semistructured data
    operated on by the code
  - Relate different FTSs by analyzing a single FTS program
  - Create a high-level design by relating actions to schemas
    rather than to variables and functions
Output Generation

Outputs
  - Schemas describing FTSs
  - Instructions in readable format that manipulate
    instances of schemas
Visualization Tool
  - Presents a single high-level view of FTSs
  - Models program execution and visualizes its
    aspects
FORTRESS Architecture
[Figure: FORTRESS architecture with components FOREL code, Compiler Front end, AST, Control Flow Analyzer, Data Flow Analyzer, Schema Inference Engine, Visualization Driver, and GUI]
Conclusion

We show how the ROOF serves as the underlying mechanism
enabling the verification of large-scale polylingual systems
  - Reduce the complexity from O(n^2) to 1
  - Provide a uniform API for graph navigation and manipulation with
    precise semantics assigned to operations
Enable an effective reverse engineering process
Remove the pain associated with understanding
legacy software
No existing solution addresses this problem