2001 Systems Engineering Capstone Conference • University of Virginia
DEVELOPMENT OF AN INTEGRATION METHODOLOGY FOR LEGACY
DATA SYSTEMS
Student Team: Rebecca E. Gonsoulin, Wendy Lee, Donté Parks
Faculty Advisor: Michel King, Department of Systems Engineering
Client Advisor: Stephen Osborne
Lockheed Martin Undersea Systems
Manassas, Virginia.
stephen.osborne@lmco.com
KEYWORDS: legacy data integration,
methodology, schema.
ABSTRACT
One of the critical problems faced by any large
enterprise with significant investments in computer
technology is data integration, the aggregation of
information from dissimilar sources to provide
increased capabilities. Data integration tends to be a
very resource-intensive process, in terms of both time
and money. Despite the variety of integration options, there is no established set of best practices that can be applied to a given situation. We
researched methodologies of software development
and current techniques of data integration and created
a methodology for integrating legacy systems for
Lockheed Martin Undersea Systems. The Legacy
Integration Framework (LIF) consists of five steps:
Understand, Develop, Test, Implement, and
Maintain. LIF can act as a guiding strategy to ensure
that critical integration-specific details remain in the
forefront over the course of a data integration project.
INTRODUCTION
One of the critical problems faced by any large
enterprise with significant investments in computer
technology is data integration, the aggregation of
information from dissimilar sources to provide
increased capabilities. The approach currently used
by Lockheed Martin Undersea Systems (LMUSS) for
the US Navy involves development of a federation.
In this approach, the systems undergoing integration
are left largely autonomous, with only a minimum of
changes being made to allow for the increased
functionality. While effective, federation is limited,
and new computer systems demand a higher degree
of integration in order to exploit their full value.
Unfortunately, there is no established set of best
practices that can be applied universally to data
integration projects, leaving technical staff with no
choice but to develop a new approach for every situation.
LMUSS asked our Capstone team to create a
methodology for use in integrating legacy systems by
researching methodologies for software development and
current techniques of data integration.
DATA INTEGRATION
Early computing offered the opportunity to store and
process large numbers of transactions on mainframes.
Because computing resources were few, integration was not necessary. The limited demand for
applications meant that they were often written
specifically to fit a particular need or requirement
(Chuah). With the shift from mainframes to desktops and
distributed networks, corporations have come to realize
the value of integration, as the leveraging of enterprise
knowledge can only occur if the widely available sources
of data can be modified to work in a cooperative fashion.
Early integration concentrated on overcoming the
differences between systems by translating data into a
common schema, the schema being the set of rules that
governs the behavior of a database. While schema
integration was an effective approach, the process
involved was very time-consuming, prone to error, and
limiting in terms of flexibility.
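As a minimal sketch of schema translation, consider renaming each source's fields into a single common schema; the source names and fields below are hypothetical, and real schema integration must also reconcile types, units, and semantics:

```python
# Sketch: translating records from two dissimilar source schemas into a
# common schema. Source names and field names are hypothetical.

# Mapping from each source's field names to the common schema's fields.
FIELD_MAPS = {
    "personnel_db": {"emp_no": "id", "full_name": "name", "dept": "unit"},
    "payroll_db":   {"employee_id": "id", "surname": "name", "division": "unit"},
}

def to_common_schema(source, record):
    """Rename a source record's fields according to the common schema."""
    mapping = FIELD_MAPS[source]
    return {mapping[field]: value
            for field, value in record.items() if field in mapping}

print(to_common_schema("personnel_db",
                       {"emp_no": 42, "full_name": "Lee", "dept": "IT"}))
# {'id': 42, 'name': 'Lee', 'unit': 'IT'}
```

Writing and maintaining one such mapping per source is exactly the time-consuming, error-prone work the text describes.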
These difficulties led to the popularity of federation.
Federation involves the development of a global schema,
which acts as a common language between databases in
the federation (Bouguettaya, 1998). Databases use this
global schema to interact with other databases. Instead of
attempting full integration, federation allows for different
degrees of interoperability. Federation saves a lot of
effort, but is still limited in terms of its capabilities,
especially in terms of scalability. Newer approaches have
been developed, although many of these also involve
manipulation of the schema at some level (Bouguettaya,
1998).
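The idea of partial participation can be sketched as follows: each member database exports only a chosen subset of its data in global-schema form while the rest stays private. All names and data here are hypothetical:

```python
# Sketch: a federation in which each member database exports only a
# subset of its fields to a shared global schema, keeping the rest
# autonomous. Member names, fields, and records are hypothetical.

class FederationMember:
    def __init__(self, name, records, exported_fields):
        self.name = name
        self._records = records          # full local data stays private
        self.exported = exported_fields  # degree of participation

    def export(self):
        """Publish only the exported fields, in global-schema form."""
        return [{f: r[f] for f in self.exported} for r in self._records]

sonar = FederationMember("sonar",
                         [{"id": 1, "depth": 120, "raw": b"\x00\x01"}],
                         ["id", "depth"])
nav = FederationMember("nav",
                       [{"id": 1, "lat": 38.1, "lon": -77.5}],
                       ["id", "lat", "lon"])

# A federated query sees only what each member chooses to share.
global_view = {m.name: m.export() for m in (sonar, nav)}
print(global_view["sonar"])  # [{'id': 1, 'depth': 120}]
```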
One of the larger hurdles in integration projects is the
determination of how communication is going to occur
between the different systems. Different technologies
have been developed to fill this void. Extensible Markup
Language (XML) addresses the messaging between
systems. It allows users to develop their own tags to
describe their data. The large benefit of XML is that
in separating the representation and semantics of
data, it allows for an extreme amount of flexibility.
In addition, its widespread support within the
commercial industry has prompted the development
of a variety of tools to facilitate XML use.
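As a small illustration of this separation of representation and semantics, a sender can describe its data with its own tags and a receiver can parse the message with standard tooling; the tag names here are hypothetical:

```python
# Sketch: user-defined XML tags describing data, parsed with the
# Python standard library. Tag names are hypothetical.
import xml.etree.ElementTree as ET

message = """
<contact>
  <name>S. Osborne</name>
  <site>Manassas</site>
</contact>
"""

root = ET.fromstring(message)
record = {child.tag: child.text for child in root}
print(record)  # {'name': 'S. Osborne', 'site': 'Manassas'}
```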
Another class of software easing the messaging
between different systems is middleware.
Middleware is used to "glue together" separate,
preexisting systems. This software rests on a variety
of different platforms, and is typically combined with
its own messaging system. In this way, there is a
defined standard for the way that the various systems
can interact. By hiding the translation and details
from the user, these programs increase
understandability through the reduction of
complexity.
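The "glue" role can be sketched as a message bus that hides format translation behind a common send interface, so neither system needs to know the other's representation. Everything below, from the class name to the formats, is a hypothetical illustration rather than any particular middleware product:

```python
# Sketch: middleware "gluing together" two systems by hiding message
# translation behind a single send interface. All names are hypothetical.

class Middleware:
    def __init__(self):
        self._translators = {}  # (source_fmt, target_fmt) -> function
        self._queues = {}       # target system name -> inbox

    def register(self, source_fmt, target_fmt, translate):
        self._translators[(source_fmt, target_fmt)] = translate

    def send(self, message, source_fmt, target, target_fmt):
        # The caller never sees the translation step.
        translate = self._translators[(source_fmt, target_fmt)]
        self._queues.setdefault(target, []).append(translate(message))

bus = Middleware()
bus.register("csv", "dict",
             lambda line: dict(zip(("id", "name"), line.split(","))))
bus.send("7,sonar-array", "csv", "tracking_system", "dict")
print(bus._queues["tracking_system"])  # [{'id': '7', 'name': 'sonar-array'}]
```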
Enterprise Application Integration (EAI) is a
method for integrating legacy applications and
databases while adding or migrating to a new set of
e-commerce applications. This architecture provides
a fundamental structural organization for software
systems by providing a set of predefined subsystems,
by specifying their relationships and by including the
rules and guidelines for those relationships (Lutz,
2000). It is an approach based on determining how
existing applications fit into new ones and defining
ways to efficiently reuse what already exists.
Platform integration provides some connectivity
among heterogeneous hardware, operating systems,
and application platforms. It uses technologies such
as Object Request Brokers (ORBs) and Remote
Procedure Calls (RPCs) (Gold, 1999).
CORBA, the Common Object Request Broker
Architecture, is a middleware architecture and
specification for creating, distributing, and managing
objects in a network. It enables applications to use
standards for locating and communicating with each
other.
The key element of CORBA, the ORB, acts as a
“software bus” managing access to and from objects
in an application, linking them to other objects,
monitoring their function, tracking their location, and
managing communications with other ORBs. Using
Interface Definition Language (IDL), the ORB can
interact with applications written in different
languages. IDL provides objects with well defined
interfaces. Objects can be written in any common
language including Java, C, C++, and Cobol
(“CORBA,” 2000). The main advantages of CORBA
are platform, language, and location independence.
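The broker idea itself can be sketched in a few lines. This is an illustrative toy, not the real CORBA API: objects register under well-defined names, and callers invoke methods through the broker without knowing the object's location or implementation language:

```python
# Illustrative sketch (not the real CORBA API): an ORB-like broker that
# registers servant objects by name and routes method calls to them.

class MiniBroker:
    def __init__(self):
        self._objects = {}

    def register(self, name, obj):
        self._objects[name] = obj

    def invoke(self, name, method, *args):
        # The caller needs only the object's name and interface,
        # not its location or implementation.
        return getattr(self._objects[name], method)(*args)

class DepthSensor:  # hypothetical servant object
    def reading(self):
        return 120

orb = MiniBroker()
orb.register("sensors/depth", DepthSensor())
print(orb.invoke("sensors/depth", "reading"))  # 120
```

A real ORB adds what this sketch omits: IDL-defined interfaces, network transport, and interoperation with other ORBs.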
Data integration tends to be a very expensive process,
in terms of both time and money. Its difficulties stem
from the fact that many of the systems undergoing this
integration were never designed for such changes. As
computing grew in popularity and availability, it became
increasingly easy to develop systems for specific purposes
on an as-needed basis. With this sporadic approach to
systems development, attention was not placed on
potential needs for the future. When the need arose for
greater functionality, it was therefore difficult to
implement, especially in the area of integration. Schema
integration, federation, and middleware all serve to fill
this need, and each has been explored to determine its underlying rationale, as well as its strengths and weaknesses.
METHODOLOGIES
In the early days of computing, there was no formal design or analysis of systems. Today, organizing, defining, and managing the development of projects is a widely discussed topic in product development. Organizations face conflicting needs: a well-defined, well-managed process on the one hand, and quick delivery of the product on the other. Existing methods attempt to provide an orderly, systematic way of developing systems.
Current legacy system integration methods rely on the
tailoring of software integration methodologies.
Lockheed Martin Undersea Systems has such corporate
methodologies, but these are proprietary and were
unavailable for use in this project. This section therefore describes general approaches used for integration projects. Software
developers commonly use the spiral and waterfall
methods, while the Gibson method is a problem solving
method taught in Systems Engineering.
The Waterfall Method
While steps in the waterfall method often vary, the
general breakdown is as follows: requirements analysis,
specifications, design, implementation, test, maintenance,
and iterate. The graphic representation of these phases,
shown in Figure 1, resembles a waterfall. The first step,
requirements analysis, defines the requirements of the
project as stated by the customer. In the specification
stage, analysts develop the specifications (i.e. processor
speed, disk space needed) for the system. Next, the
developers outline and design the system, including the
requirements and the specifications from the previous
phases. Implementation is the next step, followed by
testing and maintenance of the system. Designers can
perform iterations of the process in order to make certain
they included all requirements and specifications. The
waterfall method emphasizes completing each phase
of development before proceeding to the next phase
[Sorenson, 1995].
Figure 1: The Waterfall Method (Requirements, Specifications, Design, Implementation, Test, Maintenance, Iterate). This method emphasizes the completion of one phase before proceeding to the next [Sorenson, 1995].
The Spiral Method
The spiral method contains four major stages:
planning, risk analysis, engineering, and customer
evaluation [Spiral, 2001]. Figure 2 shows a diagram
of the spiral method. Each cycle begins with a
planning period, which consists of determining the
objectives, and the alternatives and constraints of the
project. The planning stage also includes defining
the requirements important to the customer. The risk
analysis stage analyzes the alternatives and attempts
to identify and resolve any risks. The engineering
stage consists of prototyping, developing and testing
the product. The customer evaluation stage is an
assessment by the customer of the products of the
engineering stage [Spiral Method, 2001].
The Gibson Method
The Gibson Method, created by John Gibson, is a six-phase approach to systems analysis that provides an
organized way to approach a problem objectively and to
justify the rationale behind a decision. Figure 3 shows a
table of the six steps included in this method.
Determining the goals is critical to the success of a
project. If the goals and objectives of the project are not
clear, then the outcome of the project will not meet
expectations of the customer. The next step, establishing
criteria for ranking alternative solutions involves
developing indices of performance, or IPs, which judge
the performance of a design option. Possible indices of
performance include the impact the solution will have on
existing systems, cost, and whether the solution is harmful
to the environment. Developing alternative solutions is
also an important step in the Gibson method. This allows
a system designer to include all possible options for an
unbiased selection. Ranking the alternative solutions
involves using IPs to choose the best possible solution to
use in the project. Iterating through the method allows
for a more careful problem analysis. Taking action is the
final step in the Gibson method. This step includes
producing the solution and testing the final product. This
step should include a formal procedure for achieving the
goals stated in the first step [Gibson, 1991].
Figure 3: The Gibson Method for Systems Analysis (Determine Goals of System; Establish Criteria for Ranking Alternative Candidates; Develop Alternative Solutions; Rank Alternative Candidates; Iterate; Take Action). This method for systems analysis is a robust strategy for approaching complex problems [Gibson, 1991].
Figure 2: The Spiral Method. Every cycle goes through each stage of the method, creating several iterations of each stage [Lutz, 2001].
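The ranking of alternatives by indices of performance, steps two through four of the Gibson method, can be sketched as a weighted scoring exercise. The IPs, weights, alternatives, and scores below are hypothetical:

```python
# Sketch: ranking alternative solutions with weighted indices of
# performance (IPs). All weights and scores are hypothetical.

ips = {  # IP name -> relative weight (weights sum to 1.0)
    "impact_on_existing_systems": 0.5,
    "cost": 0.3,
    "environmental_harm": 0.2,
}

alternatives = {  # option -> score on each IP (higher is better)
    "federation":       {"impact_on_existing_systems": 8, "cost": 6,
                         "environmental_harm": 9},
    "full_integration": {"impact_on_existing_systems": 4, "cost": 3,
                         "environmental_harm": 9},
}

def score(option):
    """Weighted sum of an option's IP scores."""
    return sum(weight * alternatives[option][ip]
               for ip, weight in ips.items())

ranked = sorted(alternatives, key=score, reverse=True)
print(ranked)  # ['federation', 'full_integration']
```

Iterating through the method would revisit both the weights and the scores as the problem becomes better understood.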
Any legacy system integration methodology needs to
fit into current practices. While current methodologies
view legacy system integration through a software
engineering lens, an integration methodology cannot be
blind to the fact that much of legacy system integration
involves software engineering, in addition to traditional
engineering. An integration methodology, rather than
recommending a specific technology, forms a framework
for approaching integration projects.
An integration methodology should be both flexible and extensible, so that it can accommodate different types of integration projects. One of the reasons that
legacy system integration is difficult is because many
integration projects require different integration
technologies. Any methodology should leave room
for improvement as well. There will undoubtedly be
innovation in the field of legacy system integration.
A methodology should allow for incorporation of
new ideas.
THE LEGACY INTEGRATION FRAMEWORK
The Legacy Integration Framework (LIF –
pronounced LĪF) is an amalgam of the Waterfall,
Spiral, and Gibson methods, in addition to
incorporating information from client interviews.
LIF is a guiding strategy to ensure that critical
integration-specific details remain in the forefront
over the course of a data integration project. It is
also meant to ensure that legacy integration projects
are seen in the context of the lifetime of a system,
rather than as a solution to an immediate need.
The LĪF Methodology:
1. Understand
2. Develop
3. Test
4. Implement
5. Maintain
Each step of the LIF strategy incorporates ideas
from pre-existing methodologies. It is linear in many
respects like the Waterfall Method, but incorporates
the idea of iteration found in the Spiral Method. The
understanding of problem context is largely based on
the Gibson Method. The Gibson method, a
cornerstone of the University of Virginia’s Systems
Engineering curriculum, was developed primarily for
decision analysis. Gibson systems analysis is largely
based on an understanding of the client’s needs. The
understanding phase incorporates the risk analysis
approach of the Spiral Method. This step should also
allow for revisiting earlier steps, another key feature
of the Spiral Method, since client needs and risks
should repeatedly be considered.
Understand
Prior to any development on a legacy system, it is
important to understand the role that it is intended to play
in the final system. This requires understanding the
purpose of the system, and how that purpose has changed
over its lifetime. A complete understanding of the role the legacy system will play in the future is needed as well. This understanding should
include both technical details and information regarding
the users and others affected by any potential changes.
The goal of this step is to understand the context of the
legacy system, and to prepare the developers for future
change.
A key portion of this step is to ensure that the
contextual information does not stay with the Systems
Engineer Work Group (SEWG) that handles the early
requirements analysis and design work. The software
engineers need some understanding of the system as well,
so they can make informed decisions, and it is imperative
that this information be passed. During client interviews,
LMUSS employee Steve Mitchell described a project
where a member of the SEWG team made presentations
at the start of each development phase, giving the
software engineers the bigger picture surrounding their
work. While this particular solution may not be the only
one that will work, knowledge of the legacy, its users, and
the role of the integrated system should be held by all
members of the integration team. This has the added
benefit of providing buy-in for those involved, as each
team member will better recognize the importance of their
work.
Risk Analysis: A key portion of the Understand phase
is risk analysis. The goal of risk analysis is to assess
overall project risk by identifying and managing specific
risks. Risk analysis addresses and eliminates risk items
before they become a threat to the success of the project.
It determines the extent of the possible risks, how they
relate to each other, and which risks are most important
(Buttigieg).
Risk analysis benefits projects in several ways.
Developers and analysts can eliminate risks before they
become problematic, which allows the development of the
product to proceed smoothly. Projects that use risk
analysis techniques have more predictable schedules. By identifying and eliminating risks before they become a problem, teams encounter fewer surprises. Because risk
analysis prevents schedule delays, the project cost is often
lower than expected.
Although this method introduces risk analysis as part
of the first phase, analysts should be aware of potential
risks associated with each phase of the project. The
process of risk analysis should begin early, but should
continue to affect decision-making throughout the project.
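A minimal sketch of such risk prioritization follows; the risk names, probabilities, and impact scores are hypothetical, and in practice these estimates would be revisited in every phase:

```python
# Sketch: prioritizing project risks by exposure (probability x impact)
# so the most serious items are addressed first. All values hypothetical.

risks = [
    {"name": "undocumented legacy schema", "probability": 0.6, "impact": 9},
    {"name": "vendor tool incompatibility", "probability": 0.3, "impact": 7},
    {"name": "schedule slip in testing",    "probability": 0.5, "impact": 4},
]

for risk in risks:
    risk["exposure"] = risk["probability"] * risk["impact"]

# Address the highest-exposure risks before they threaten the project.
for risk in sorted(risks, key=lambda r: r["exposure"], reverse=True):
    print(f"{risk['exposure']:.1f}  {risk['name']}")
```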
Defining the Problem: Defining the problem accurately from the beginning is one of the most important parts of the Understand phase. If the analysts
do not adequately define the problem, the project will
most likely fail. Correct phrasing of the problem and
placing it properly in context allow both the customer
and the analyst to agree on the problem.
Requirements Analysis: Requirements elicitation is an important aspect of determining the requirements and needs of the customer. It allows the stakeholders to reveal and understand the needs of the system. Identifying the relevant sources of
information includes not only talking to the client,
but involves the users, the support staff and the
suppliers (Mehalik, 1999). Each group has
something unique to add to the system; the
requirements that the managers give to the developers
could actually be very different from those of the
actual users of the system.
There are several obvious outcomes from a good
elicitation process. The most beneficial outcome is
an informed client. The client is able to distinguish
the wants of the project from the needs. The client
understands the procedure for creating the product,
including the structure, the constraints, the risks, and
the function of the finished product. The project is
feasible in terms of time, necessary equipment,
money, and available personnel (Mehalik, 1999).
This combination leads to a high likelihood of
success for the project.
Develop

Once the role of the integration system has been defined, it can be developed. This process is largely mature, as technical difficulty was not mentioned in interviews as a hindrance in integration projects. Rather, the difficulties came in requirements definition, which should be addressed in the previous step.

For development, there must be an assessment of project risks. In this step these risks should be officially addressed, although this should also be a continual process throughout the project. Technical pitfalls should be identified and remedied. In addition, changing requirements should be freely communicated to all members of the team who could be affected.

Architecture Analysis: Perhaps the most unique part of this method is Architecture Analysis. The goal of this phase is for the integrator to fully understand the current state and components of the integration system. It is the technical counterpart to the Understand phase. The components of the system are the application platform, operating system, database, and application programming interface (API) and adapters. For each project, the integrator should build an integration architecture that ties all systems together, including application and technical components. Unlike common methods, LIF includes an information analysis step in which the integrator defines the data format, structure, and contents of the events that will pass between applications (WebMethods, 2000). Finally, the integrator should identify the characteristics that the integration architecture must have in order to meet its goals (Concept Five, 2000). An advantage of the LĪF strategy is that it can help in determining which products, adapters, connectors, vendors, and technologies best fit the integration architecture.

Test

Prototyping can be considered a form of testing. In the process of risk mitigation, prototypes should have been built to ensure project feasibility, essentially testing the capabilities of the system. Yet tests should also be conducted to ensure that the different modules are operating as expected. This will not be possible in all cases, especially in projects with parallel development, but each phase of development should include testing of the integrated system. These tests should not be left until actual system delivery. Handling these parallel development situations adequately may require development of a different approach to testing.

Implement

After development has been completed, the system can be implemented in its intended environment. At this point it is critical to ensure that the implemented solution operates as expected. This step should be simple if, in the previous steps, engineers have kept the client's needs and objectives in mind.
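The information analysis described in the Architecture Analysis step, defining the format, structure, and contents of events passed between applications, can be sketched as a declared event schema that messages are checked against. The event type and fields below are hypothetical:

```python
# Sketch: declaring the format and contents of inter-application events,
# then validating messages against the declaration. Fields hypothetical.

EVENT_SCHEMA = {
    "contact_detected": {"contact_id": int, "bearing": float,
                         "timestamp": str},
}

def validate(event_type, payload):
    """Check a message's fields and types against the declared schema."""
    expected = EVENT_SCHEMA[event_type]
    if set(payload) != set(expected):
        raise ValueError(f"wrong fields for {event_type}: {sorted(payload)}")
    for field, field_type in expected.items():
        if not isinstance(payload[field], field_type):
            raise TypeError(f"{field} must be {field_type.__name__}")
    return True

print(validate("contact_detected",
               {"contact_id": 7, "bearing": 42.5,
                "timestamp": "2001-03-14T09:26Z"}))  # True
```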
Maintain
The completion of one integration project probably
will not mark the end of the changes to a system. The
system must then be maintained through later integration
with other systems. Design decisions should be made
with consideration for future changes.
CONCLUSIONS
Legacy system integration is a pervasive problem in
both the public and private sectors. It is made all the
more difficult because there is no organized
collection of best practices that can be applied to
integration situations. This project collected these
best practices and developed a strategy called the
Legacy Integration Framework that can be used in
conjunction with current methods of development to
improve the capabilities of the client, Lockheed
Martin Undersea Systems.
Any legacy system integration methodology needs
to fit into current practices. The Legacy Integration
Framework, rather than representing an entirely new
strategy, forms more of a framework for approaching
integration projects. While not detailing every aspect
of the legacy integration process, LIF can act as a
guiding strategy to ensure that critical integration-specific details remain in the forefront over the course of the project.
REFERENCES

Bouguettaya, A., et al. Interconnecting Heterogeneous Information Systems. Boston: Kluwer Academic Publishers, 1998.

Chuah, M., Juarez, O., Kolojejchick, J., and Roth, S. “Searching Data-Graphics by Content.” http://www.cs.cmu.edu/Web/Groups/sage/Papers/SageBook/SageBook.html.

Concept Five Technologies. Strategic Business Applications Through Application Integration. 2000. http://www.concept5.com/docs/wp02_appint.pdf.

“Corba or XML; Conflict or Cooperation?” 15 October 2000. http://www.omg.org/news/whitepapers/index.htm.

Gibson, John E. How to Do a Systems Analysis. Ivy, VA: P.S. Publishing, July 1991.

Glossbrenner, A. and E. Glossbrenner. Search Engines for the WWW. Berkeley: Peachpit Press, 1998.

Gold-Bernstein, Beth. “EAI Market Segmentation.” EAI Journal, July/August 1999.

King, Nelson. “The New Integration Imperative.” Intelligent Enterprise, 5 October 1999: 24.

Lutz, J. “EAI Architecture Patterns.” EAI Journal, March 2000.

Sorenson, Reed. “Comparison of Software Development Methodologies.” January 1995. 15 February 2001. http://stsc.hill.af.mil/crosstalk/1995/jan/comparis.asp.

“Spiral Method.” 16 February 2001. www.cstp.umkc.edu/personal/cjweber/spiral.html.

“Spiral Model.” 15 February 2001. http://www.isds.jpl.nasa.gov/cwo/cwo_23/handbook/spiral.htm.

WebMethods, Inc. Overview of the Application Integration Methodology (AIM) Process. August 2000.

BIOGRAPHIES
Rebecca Gonsoulin is a fourth-year Systems
Engineering major from Williamsburg, VA, concentrating
in Communications. Her focus for the project was in the
research of search and retrieval techniques and in
software development methodologies. She has accepted
a position with Applied Signal Technology, Inc. as a
Systems Engineer in Annapolis Junction, MD. She’d tell
you more about her job, but then she’d have to kill you.
Wendy Lee is a fourth-year Systems Engineering
major concentrating in Management Information Systems
from Gaithersburg, MD. Her focus for the project was the
research in integration methodologies and development of
the checklists for the evaluation of products and
middleware. She will be working for
PricewaterhouseCoopers as an IT Consultant in the
Washington Consulting Practice Division after
graduation. We tried to add something funny to this bio,
but she wouldn’t let us put it in.
Donté “The Man” Parks is a fourth-year Systems
Engineering major from Hampton, VA, completing a self-designed concentration in Internet Information Management. His principal project contribution was
research on legacy integration techniques in addition to
providing uplifting bits of wit and sarcasm. He has
accepted a position with Microsoft, and will begin work
there after spending a few (more) months sitting around
not doing much besides watching Cartoon Network and
playing his Playstation 2.