1999 Systems Engineering Capstone Conference • University of Virginia
INTEGRATION OF A FINANCIAL DATA SERVICE AND THE INTRODUCTION TO
DISTRIBUTED COMPUTING SYSTEMS
Student team: Andrew Brix, Andrea Brotto, John DeGuenther, Ryan Lentell
Faculty Advisor: James W. Lark, III
Department of Systems Engineering
Client Advisor: Kevin Cromer
Shadwell Capital LC
Charlottesville, VA
E-mail: kcromer@shadwell.com
KEYWORDS: Database Management System,
Distributed Computing Network (DCN), CORBA,
Microsoft SQL Server 7.0
ABSTRACT
Shadwell Capital LC is a hedge fund located in
Charlottesville, VA. This project involved helping them
optimize their computer resources to meet their needs. This
included two major tasks: integrating their new financial
data service and developing a distributed computing
network.
The integration of the financial data provider entailed
linking the Bridge Stock Data service to a database to store
stock prices. A comparison between Microsoft SQL Server
7.0 and Oracle 8.0 was performed in which it was
determined SQL Server 7.0 better fit Shadwell Capital’s
needs.
A second aspect of the project involved the creation of
a distributed computing network for Shadwell. Distributed
computing involves running parallel processes across a
network of computers, and is implemented based on a
client/server model.
The system that has been implemented was built as a
framework for the future development of a fully functional
system. It provides Shadwell a good understanding of the
possibilities available with distributed computing, and will
help them design a more functional system in the future.
INTRODUCTION
Every instant of the day stock prices are changing.
Companies trading stocks need to be able to monitor what
is happening as precisely as possible. To keep track of
stock prices and to forecast future changes in stock value,
companies need to store information on stock prices over
specific time intervals. These time intervals need to be as
small as possible so the stored data will show all trends
over time. A company can then analyze the data to, for
example, calibrate existing forecasting models.
Modern and more sophisticated methods are needed to
accomplish this job more effectively. The integration of a
financial data service is one way in which companies are
accomplishing this task. This capstone project addresses
performing this task for Shadwell Capital LC.
To analyze the data collected through the use of the
financial data service, Shadwell will use neural network
models in the future. These models are complex and
burdensome to a single processor. To alleviate this
burden, Shadwell wishes to distribute the workload across
their Windows NT network. Distributed applications are
developed using object-oriented technology, whereby an
application is broken into self-contained modules that are
located on different processors. At run-time, the modules
can interact with each other to produce the desired output.
By distributing an application, the modules can run
simultaneously, as opposed to sequentially, which will
reduce processing time. This capstone project addresses
installing a distributed computing network at Shadwell
Capital LC and provides information on distributed
application development.
INTEGRATION OF A FINANCIAL DATA
SERVICE
Overview of System. Data enters Shadwell Partners
through the Bridge Stock Data System. Our task was to
store this data on the Shadwell Server for use by Shadwell
employees. The data is downloaded from the Bridge
System into Microsoft Excel through a dynamic data link.
The data is then saved on the Shadwell server in
Excel format, from which it can be uploaded to Microsoft SQL
Server using the Data Transformation Services provided.
Finally, Shadwell employees can access the information
through an ODBC link (see footnote 1) to their models. See
Figure 1 below for a visual representation of the system.

Figure 1. Diagram of Data Flow through Shadwell
Computing Network
Data Enters via Bridge -> Imported to Excel -> Saved on
Shadwell Server -> Imported to SQL Server -> Imported to
Models via ODBC

Figure 2. Life Expectancy of Shadwell Database
Years of Use of Shadwell Server Database while
Tracking 2000 Companies

Time Interval Between          Oracle 8.0    Microsoft SQL
Storing Information (min.)                   Server 7.0
78                             31.87         58.76
39                             17.62         33.16
26                             12.18         23.10
20                              9.31         17.72
16                              7.53         14.38
13                              6.32         12.09
11                              5.45         10.43
10                              4.79          9.18
9                               4.27          8.19
8                               3.85          7.39
7                               3.22          6.19
6                               2.77          5.33
5                               2.29          4.40
4                               1.95          3.75
3                               1.30          2.51
2                               0.98          1.89
1                               0.65          1.26
Comparison of Databases. Our analysis showed that
Microsoft SQL Server 7.0 is the optimal database for
Shadwell Capital LC’s needs. The main basis for this
recommendation was a test of the life expectancy of the
database on the two architectures. As can be seen in
Figure 2, Microsoft SQL Server 7.0 provides nearly twice
the life expectancy of Oracle 8.0. Additionally,
it provides superior tools for database creation and
management. Price is not a differentiating factor, as both
packages cost nearly $1,400. One major drawback of
Microsoft SQL Server 7.0 is its incompatibility with any
operating system other than a Microsoft product. This
should not be a problem in the Shadwell Capital LC
environment because it is a completely Microsoft-based
workplace and, along with SNL Securities, has
no intention to change. For these reasons, we believe
Microsoft SQL Server 7.0 is the optimal choice for
Shadwell Capital LC.
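
The "nearly twice" claim can be checked directly against the Figure 2 values. The short sketch below only divides the two published columns row by row; the function and variable names are ours, chosen for illustration. By our arithmetic the ratio stays between roughly 1.84 and 1.94 for every interval.

```python
# Years of use from Figure 2, keyed by storage interval in minutes:
# interval -> (Oracle 8.0, Microsoft SQL Server 7.0)
LIFE_EXPECTANCY_YEARS = {
    78: (31.87, 58.76),
    39: (17.62, 33.16),
    26: (12.18, 23.10),
    20: (9.31, 17.72),
    16: (7.53, 14.38),
    13: (6.32, 12.09),
    11: (5.45, 10.43),
    10: (4.79, 9.18),
    9: (4.27, 8.19),
    8: (3.85, 7.39),
    7: (3.22, 6.19),
    6: (2.77, 5.33),
    5: (2.29, 4.40),
    4: (1.95, 3.75),
    3: (1.30, 2.51),
    2: (0.98, 1.89),
    1: (0.65, 1.26),
}


def life_expectancy_ratios(table):
    """Return {interval: SQL Server years / Oracle years} for each row."""
    return {interval: sql / oracle for interval, (oracle, sql) in table.items()}


if __name__ == "__main__":
    for interval, ratio in life_expectancy_ratios(LIFE_EXPECTANCY_YEARS).items():
        print(f"{interval:>2} min: {ratio:.2f}x")
```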
Database Design. Shadwell’s staff required the
database to perform the following functions (see footnote 2):
• Import data nightly from Bridge
• Migrate data from old database into new database
• Back up data
• Use stored data to run existing models
• Query databases for specific periods or companies
• Add/delete a company to/from the list
• Edit company attributes
• Adjust for stock splits
The class diagram, which describes the classes and their
relationships within the database, lists the classes needed
for the database as: Company, Stock Price At Time t,
Stock Daily Statistics, and Stock Quarterly Statistics.
Stock prices are constantly changing. For this reason,
every company has many Stock Prices, one for each
time at which the price is measured. In addition to storing
intraday stock prices, daily summary statistics will be
stored in the Stock Daily Statistics table, and
quarterly data will be stored in the Stock Quarterly
Statistics table.
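
As a rough illustration of the four classes and of the "query for specific periods or companies" function, the sketch below builds an equivalent schema in SQLite, which stands in here for SQL Server 7.0. The assignment of attributes to tables, the summary columns, the key choices, and the sample rows are all assumptions made for the example; they are not taken from the implemented database.

```python
import sqlite3

# Hypothetical layout: one table per class named in the class diagram.
SCHEMA = """
CREATE TABLE company (
    ticker_symbol      TEXT PRIMARY KEY,
    company_name       TEXT NOT NULL,
    exchange           TEXT,
    shares_outstanding REAL
);
CREATE TABLE stock_price (
    ticker_symbol TEXT NOT NULL REFERENCES company(ticker_symbol),
    day_and_time  TEXT NOT NULL,   -- ISO 8601 timestamp
    price         REAL NOT NULL,
    volume        REAL,
    PRIMARY KEY (ticker_symbol, day_and_time)
);
CREATE TABLE stock_daily_statistics (
    ticker_symbol TEXT NOT NULL REFERENCES company(ticker_symbol),
    day           TEXT NOT NULL,
    close_price   REAL,            -- assumed summary column
    total_volume  REAL,            -- assumed summary column
    PRIMARY KEY (ticker_symbol, day)
);
CREATE TABLE stock_quarterly_statistics (
    ticker_symbol  TEXT NOT NULL REFERENCES company(ticker_symbol),
    quarter_number INTEGER NOT NULL,
    key_funding    TEXT,
    PRIMARY KEY (ticker_symbol, quarter_number)
);
"""


def prices_between(conn, ticker, start, end):
    """Return (day_and_time, price) rows for one company within [start, end]."""
    return conn.execute(
        "SELECT day_and_time, price FROM stock_price "
        "WHERE ticker_symbol = ? AND day_and_time BETWEEN ? AND ? "
        "ORDER BY day_and_time",
        (ticker, start, end),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample data: one company with many prices, one per measurement time.
conn.execute("INSERT INTO company VALUES ('XYZ', 'Example Corp', 'N', 1000000.0)")
conn.executemany(
    "INSERT INTO stock_price VALUES (?, ?, ?, ?)",
    [
        ("XYZ", "1999-03-01 09:30:00", 10.0, 500.0),
        ("XYZ", "1999-03-01 09:40:00", 10.5, 300.0),
        ("XYZ", "1999-03-02 09:30:00", 11.0, 400.0),
    ],
)
rows = prices_between(conn, "XYZ", "1999-03-01 00:00:00", "1999-03-01 23:59:59")
```

The composite key on ticker and timestamp reflects the one-company-to-many-prices relationship described above.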
Database Implementation. The SQL Server 7.0
tables were created manually. Data types were chosen
to minimize storage space. The data types for each
attribute are shown in Figure 3.
Footnote 1. ODBC Driver (Open Database Connectivity Driver): a
cross-platform Application Programming Interface (API) that can be
used to access any DBMS or DBMS server that has an ODBC
driver. This enables a software developer to build and
distribute a client/server application without targeting a
specific DBMS.
Footnote 2. The last four use cases are operations enabled from an
appropriate and functional graphical user interface, which is not in
place in the present system.
Figure 3. Attributes and Data Types

Attribute             MS SQL Server 7.0 Data Type
Quarter Number        smallint
Volume                real
Key Funding           char(7)
DayAndTime            datetime
Price                 real
Company Name          char(50)
Ticker Symbol         char(6)
Exchange              char(1)
Shares Outstanding    real
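
To make the storage argument concrete, the sketch below adds up the data bytes of a row from the Figure 3 types, using the standard SQL Server storage sizes (smallint 2 bytes, real 4 bytes, datetime 8 bytes, char(n) n bytes). Which attributes make up an intraday price row, and the number of samples per day, are assumptions for illustration; row overhead and indexes are ignored, so this is not the calculation behind Figure 2.

```python
# Storage size in bytes of each SQL Server data type used in Figure 3.
TYPE_SIZE_BYTES = {
    "smallint": 2,
    "real": 4,
    "datetime": 8,
    "char(1)": 1,
    "char(6)": 6,
    "char(7)": 7,
    "char(50)": 50,
}

# Attribute -> data type, as listed in Figure 3.
ATTRIBUTE_TYPES = {
    "Quarter Number": "smallint",
    "Volume": "real",
    "Key Funding": "char(7)",
    "DayAndTime": "datetime",
    "Price": "real",
    "Company Name": "char(50)",
    "Ticker Symbol": "char(6)",
    "Exchange": "char(1)",
    "Shares Outstanding": "real",
}

# Assumption: an intraday price row holds these four attributes.
PRICE_ROW = ["Ticker Symbol", "DayAndTime", "Price", "Volume"]


def row_data_bytes(attributes):
    """Sum the fixed data sizes of the given attributes (no row overhead)."""
    return sum(TYPE_SIZE_BYTES[ATTRIBUTE_TYPES[name]] for name in attributes)


def daily_data_bytes(companies, samples_per_day, attributes=PRICE_ROW):
    """Data bytes added per trading day for the assumed price row."""
    return companies * samples_per_day * row_data_bytes(attributes)
```

For example, `daily_data_bytes(2000, 10)` estimates the growth for 2000 companies sampled ten times a day (a hypothetical rate). The same arithmetic shows why a 4-byte real is preferable to a wider numeric type for every stored price.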
Further, we needed to take into consideration the
necessity of backing up the data regularly
and storing the backup files in a different place from the
one where the system is located. This task will be
accomplished through the use of SQL Server 7.0’s
Database Maintenance Plan Wizard. The wizard is very
similar to the one used to import/export files into/from
SQL Server 7.0. Every night at midnight the plan backs
up the data as appropriate. Finally, every time a plan is
executed, e-mail alerts are sent to Shadwell employees
to keep them updated about data entry and any
errors that may have occurred.
Database Integration. Shadwell decided in October
1998 to acquire Bridge as their financial data provider. A
Bridge tool is added to Excel, allowing the user to operate
some Bridge functions integrated into the Excel Tools
menu (see footnote 3).
A Dynamic Data Link (DDL) between Telerate and
Excel allowed us to import data into Excel. We then
needed to integrate the transfer of data from Telerate into
SQL Server 7.0 through Excel.
To perform this integration, we completed
two tasks:
• We created a task to transfer the data from
Telerate into the Excel table.
• We developed, using the Data Transformation
Services in SQL Server, a plan that grabs the desired data
from the Excel file at appropriate times and stores it in
the appropriate fields of the database.
These two tasks and
the export of data from SQL Server into other
applications are described in the following three sections.
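
The second task, moving rows from a spreadsheet file into database fields, can be sketched as follows. This is not the Data Transformation Services plan itself: CSV text stands in for the Excel file, SQLite stands in for SQL Server 7.0, and the column names and sample rows are made up for illustration.

```python
import csv
import io
import sqlite3


def load_snapshot(conn, csv_text):
    """Insert every row of a spreadsheet export into stock_price; return the row count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (r["ticker"], r["day_and_time"], float(r["price"]), float(r["volume"]))
        for r in reader
    ]
    with conn:  # commit all rows together, or roll back if any insert fails
        conn.executemany(
            "INSERT INTO stock_price (ticker_symbol, day_and_time, price, volume) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock_price ("
    "ticker_symbol TEXT, day_and_time TEXT, price REAL, volume REAL)"
)

# Made-up sample standing in for the Excel sheet filled from the data feed.
SNAPSHOT = """ticker,day_and_time,price,volume
XYZ,1999-03-01 09:30:00,10.0,500
ABC,1999-03-01 09:30:00,42.25,1200
"""
inserted = load_snapshot(conn, SNAPSHOT)
```

Loading a whole snapshot in one transaction means a failed import leaves the table unchanged, which keeps a scheduled job safe to rerun.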
Footnote 3. Telerate and Excel are said to be linked through a Dynamic
Data Link.
DISTRIBUTED COMPUTING SYSTEMS
Since the development of the computer earlier in the
twentieth century, computed processes have become more
complex. Initially, the computer was an elaborate tool that
was used for relatively simple calculations and operations,
yet modern computers perform tasks that are far more
complicated than those of the past. Even with the
increase in computing power, these tasks are often so
extremely intensive that they require great periods of time
to be executed. There are ways to improve the
performance of these systems: some are costly, requiring
users to purchase expensive new equipment, while others
can be implemented without any
additional equipment. Distributed computing is a
cost-effective means to improve the efficiency of
time-consuming computed processes, because it has evolved as
a way to leverage the power of existing computer
networks.
The distributed computing architecture is similar to
client/server computing. Each component of the working
system, called an object, can be called upon by another
program to perform certain tasks (Lewis, 1997). The
application called upon is the server, while the process
performing the request is called the client. Upon
completion, server objects send results back to the client
program. The client program then compiles the
results and proceeds with its process. This method is
more efficient than previous computing methods
because it allows multiple components of the
system to run simultaneously across the network, while
the results of each can still be compiled and viewed on a
single computer. One advantage of distributed computing
is that each object can reside on a different computer,
which allows the computing process to spread the
workload over a network of computers (Lewis, 1997).
Distributed computing allows several networked
computers to work together as if they were one machine.
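
The request-and-compile pattern described above can be sketched in a few lines. Here a thread pool on one machine stands in for server objects on networked computers, and `score_model` is a hypothetical stand-in for one independent model computation; a real deployment would send each request across the network instead, and Python threads by themselves would not speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def score_model(parameter):
    """Hypothetical stand-in for one expensive, independent computation."""
    return sum(i * parameter for i in range(1000))


def run_sequential(parameters):
    """Single-processor baseline: one request after another."""
    return [score_model(p) for p in parameters]


def run_distributed(parameters, workers=4):
    """Client role: hand each request to a worker, then compile the results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_model, parameters))
```

Both functions return the same list; only the scheduling differs, which is why independent computations are good candidates for distribution.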
There are several technologies currently supporting the
distributed architecture that our project team intends to
design, yet the specifications for each are rather different.
Microsoft and the Object Management Group, a
consortium of over 800 companies, have developed
competing architecture standards that allow for the remote
request of computed processes; these software
components must meet strict guidelines regarding their
design and interfacing. Each of these components, called
objects, implements a set of functions and encapsulates
data, yet the requirements for each standard architecture
are quite different (Foody, 1996, 43-45). Because of this,
a large portion of the project was devoted to the
comparison of Microsoft’s Distributed Component Object
Model (DCOM) and the Object Management Group’s
Common Object Request Broker Architecture (CORBA).
These two architectures will be discussed later in more
detail.
Shadwell wanted to find the most cost-effective
solution to its problem. For this reason we proposed the
development of a distributed computing network, which
allows us to solve Shadwell’s problems, yet do so by
using existing hardware.
CORBA vs. DCOM. Based on what was discussed and
read, our capstone team chose to work with CORBA.
Aside from major differences between DCOM and
CORBA, DCOM has a variety of weaknesses that were
relevant to this project:
• Complexity - DCOM and its interfaces are complex,
perhaps unnecessarily so (Rock-Evans, 1998, 271).
Learning all the aspects of developing the code for a
DCOM application is much more involved than
learning the same aspects in CORBA.
• Legacy system integration difficult - Microsoft is
having a lot of difficulty in integrating “legacy”
technology. Integration in DCOM has been
achieved via a very complex set of drivers,
translators, proxy objects, and gateways
(Rock-Evans, 1998, 272).
• New and possibly unstable - Microsoft’s middleware
products have taken some time to come to final
realization and are only now beginning to take shape
(Rock-Evans, 1998, 273).
• Other platform support is weak - Microsoft’s
middleware strategy and services are entirely
dependent on the user having Windows NT as their
strategic platform (Rock-Evans, 1998, 274).
Our team has a variety of reasons for choosing
CORBA over DCOM, which are related to both the
weaknesses of DCOM and the differences between
CORBA and DCOM. These reasons are as follows:
• Minimize complexity - CORBA was most similar to
what we had learned in a course on object-oriented
programming with C++ through its use of interface
inheritance. Also, CORBA’s use of inheritance from
CORBA::Object reduces the number of complex
details about the inner workings of the Distributed
Computing Network (DCN) that we would have to
learn. Minimizing complexity was important because
we were learning a new technology with little
guidance and had to work within a limited time
frame.
• Interference with the legacy system at SNL - The
application would be installed on the network at
SNL/Shadwell, where they currently have a legacy
system. We did not want any of our work to interfere
with this legacy system, and CORBA does not
interfere with such systems, whereas DCOM may
pose a problem.
• Possible system change - If SNL/Shadwell ever chose
to change to a different operating system such as
Unix, integration with the DCOM application would
be difficult. CORBA is independent of the operating
system, so a system change would have minimal effect.
What was developed for Shadwell. Once CORBA had
been chosen, we were able to set up a CORBA ORB on the
Shadwell network.
What is an ORB? An Object Request Broker, or
ORB, is a software component that acts as middleware for
other software components. ORBs allow objects to
“talk” to each other as if they were part of the same
program; in essence, ORBs permit software components
to work together as a larger program (Katiyar, 1995).
Software components written in one language can
communicate with components written in different
languages via ORBs; they can even communicate with
components running on other platforms. For example,
one component may be implemented in a Windows 95
environment using the C++ language and another
component may be running on a Unix system and
developed in COBOL, yet by using an ORB these two
programs can communicate with each other.
Another advantage of ORBs is that they
allow software to be distributed across a network. In the
past, software developers designed programs that could
only run on a single machine, yet often these programs
required tremendous computing capabilities. Running such
a program would require a large mainframe or
supercomputer, which may be quite expensive. Developers
recognized that, by using an ORB, they could exploit the
vast computing power of already existing networks, such
as a company’s Local Area Network (LAN). ORBs have
the ability to bridge the gap between networked
computers, thus allowing separate pieces of a single
software application to communicate as if they were
physically located on the same machine (Orfali, 1998).
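
The broker idea can be illustrated with a toy, single-process sketch. It is not CORBA and not the Orbix API: `ToyBroker`, `Proxy`, and `PriceStore` are hypothetical names, and a real ORB would also handle locating objects on other machines, converting data between languages, and network transport.

```python
class ToyBroker:
    """In-process stand-in for an ORB: maps object names to implementations."""

    def __init__(self):
        self._objects = {}

    def register(self, name, obj):
        self._objects[name] = obj

    def invoke(self, name, operation, *args):
        """Forward a request to the named object and return its reply."""
        target = self._objects[name]
        return getattr(target, operation)(*args)


class Proxy:
    """Client-side stub: looks like the target object, but routes calls via the broker."""

    def __init__(self, broker, name):
        self._broker = broker
        self._name = name

    def __getattr__(self, operation):
        # Only reached for names not set in __init__, i.e. the remote operations.
        def call(*args):
            return self._broker.invoke(self._name, operation, *args)
        return call


class PriceStore:
    """Hypothetical server object."""

    def __init__(self):
        self._prices = {}

    def put(self, ticker, price):
        self._prices[ticker] = price

    def get(self, ticker):
        return self._prices[ticker]


broker = ToyBroker()
broker.register("prices", PriceStore())
prices = Proxy(broker, "prices")   # the client never touches PriceStore directly
prices.put("XYZ", 10.5)
```

The client code calls `prices.put(...)` as if the object were local, which is the "as if they were part of the same program" property described above.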
The ORB technology provides the ability to create the
distributed computing system for Shadwell Capital.
Using an ORB, the system can link Shadwell’s existing
software components, allowing separate components to
communicate via SNL’s LAN and act as one program.
The specific ORB we chose was the Iona Technologies
Orbix ORB, supplied on the supplementary CD
for the book CORBA for Dummies (O’Hara and Schettion, 1998). For the ORB to be
used across the Windows NT network at Shadwell, the
ORB had to be installed on any computer that would be
used for distributing applications. For the purposes of our
project, we installed the ORB on two specific computers
to test the ORB and generate a simple distributed
application. The actual distributed computing network is
just the series of computers on which the ORB is
installed. The ORB enables applications to be distributed
among those computers.
With the ORB installed on several computers, we
needed to understand the basics of distributed application
development using the given ORB. The most efficient
way to gain such an understanding was
through the development of a sample application. The
application we chose to develop is called “Messenger.”
The main idea behind Messenger was to build an
application that would be able to send messages from a
client on one workstation to a server on another
workstation that would then display the message. The
basic structure of the application is client/server, where the
server stores the message and the client issues commands
that act on the message (phrase). We determined
that the following functionality was needed for the
program:
• Ability to store a phrase at the server.
• Ability of client to set a new phrase.
• Ability of client to print the phrase at the server
workstation.
• Ability of client to check the current stored phrase.
The completed application encompasses all of the
desired abilities. The server stores a phrase, which is
initialized to a default value when the server is started. A
client can then set the stored phrase on the server to
whatever a user enters at the client workstation when
prompted to do so. The client can then check the stored
phrase or print it on the screen of the server workstation.
Another client can also access the same server and
perform operations on the stored phrase. The server
remains running as long as at least one client is in contact
with the server.
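
The four abilities listed above can be sketched in a single process as follows. This is not the CORBA implementation that was built in C++: plain Python objects stand in for the server object and the client, no ORB or network is involved, and the class names, method names, and default phrase are assumptions for illustration.

```python
class MessengerServer:
    """Server object: stores one phrase and can display it at the server."""

    DEFAULT_PHRASE = "Hello from the server"  # assumed default value

    def __init__(self, display=print):
        self._phrase = self.DEFAULT_PHRASE
        self._display = display  # where "print at the server" sends its output

    def set_phrase(self, phrase):
        self._phrase = phrase

    def get_phrase(self):
        return self._phrase

    def print_phrase(self):
        self._display(self._phrase)


class MessengerClient:
    """Client: issues commands against a reference to the server object."""

    def __init__(self, server):
        self._server = server

    def set(self, phrase):
        self._server.set_phrase(phrase)

    def check(self):
        return self._server.get_phrase()

    def print_at_server(self):
        self._server.print_phrase()


shown = []                                  # captures what the server displays
server = MessengerServer(display=shown.append)
first = MessengerClient(server)
second = MessengerClient(server)            # another client, same server

first.set("Markets open at 9:30")
second.print_at_server()
```

Because both clients hold a reference to the same server object, a phrase set by one is visible to the other, matching the multi-client behavior described above.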
An account of the development of the Messenger
application was recorded as the application was produced.
The description allows a reader to grasp how to
develop distributed applications. Although “Messenger”
is a simple application, it allows one to understand the
basic underlying principles of the complex technology of
distributed application development. This account was
written for the purpose of educating our client.
Ultimately, Shadwell will be using distributed technology
to develop distributed applications, and the account will
help in this process. The first application to be developed,
partially with our aid, will allow data to be passed from
the Bridge workstation to another workstation.
Future Development. The system implemented at
Shadwell does not contain the full functionality that was
originally proposed, as the project’s scope was scaled
back to make it more reasonable. The system is a simple
distributed client/server program that allows a client to
start a server application, and then allows the client and
server to communicate across the network. The
implemented design provides the elementary framework
for a fully functional system. Shadwell currently has
several applications, such as financial models, that could
be enhanced by incorporating them into the distributed
system.
Shadwell employees have stated that in the near future
they will be developing new financial models and other
processes that would be better executed if they could be
run across the SNL network. One particular process that
is under consideration is the transfer of data from a server
program to an Excel spreadsheet. Processes such as these
could be incorporated into the distributed system in the
future as well.
Because it is expected that the project will undergo
further development in the future, it is important that
Shadwell employees understand the specifics of the
implementation. If Shadwell determines that it wishes to
further the development of the distributed system, then at
least one of its employees must have a working
knowledge of both CORBA and C++. For this reason,
significant effort will be devoted to explaining
CORBA and the specifics of the developed application.
Educating Shadwell’s employees will reduce the costs
associated with future development.
CONCLUSIONS
The results of this project are extremely beneficial to
Shadwell. The working database will allow employees at
Shadwell to have permanent access to valuable data. This
is beneficial because it allows them to manipulate data,
such as in complex financial models, helping them make
important investment decisions.
The introduction of distributed computing may also aid
Shadwell’s employees, as it will make their computed
processes more efficient. Distributed computing should
permit Shadwell to create complex financial models that
will help them make important investment decisions.
REFERENCES
Foody, Michael A. “OLE and COM vs. CORBA.” Unix
Review. April 1996: 43-45.
Katiyar, Dinesh. “Notes on OLE/CORBA.” 1995. Online.
Internet. Available:
http:\\cui.unige.ch\OSG\people\jvitek\Resources\Languages
\Year95\msg00106.html
Lewis, Bob. “The Distinction Between Distributed
Computing and Client Server is Essential.” InfoWorld.
July 21, 1997: 80.
O’Hara, Liz and John Schettion. CORBA for Dummies.
New York, NY: IDG Books Worldwide, Inc., 1998.
Orfali, Robert and Dan Harkey. Client Server
Programming with JAVA and CORBA, Second Edition.
Wiley Computer Publishing, 1998.
Rock-Evans, Rosemary. DCOM Explained. Boston,
MA: Digital Press, 1998.
BIOGRAPHIES
Andrea Brotto – Mr. Brotto is a fourth-year Systems
Engineering major from Carimate (Italy). He is the UVA
Golf Men's Team Captain and the scholar athlete of the
year for 1998 (Ralph Sampson Scholarship Award).
Andrew Brix – Mr. Brix is a fourth-year Systems
Engineering major with a minor in Economics from North
Caldwell, NJ. Next year he plans on joining the
Management Consulting team at Ernst & Young LLP.
John DeGuenther – Mr. DeGuenther is a fourth-year
Systems Engineering major from Atlanta, GA. This past
summer he worked as a Systems Developer and
Programmer for two University of Virginia professors on
a joint VDOT/UVA project. Mr. DeGuenther will be
joining American Management Systems, Inc. next year.
Ryan Lentell – Mr. Lentell comes to the University from
Cape Cod, MA. For the past two summers he interned
with PaineWebber working on building their new trading
floors. He is minoring in economics at the University of
Virginia along with his major in Systems Engineering.