Service Oriented Databases:
Abhishek Khanolkar
Abstract: This paper introduces the concept of service oriented databases and compares the different
proposed architectures. We begin by differentiating central databases from service oriented
databases. To date, several architectures for service oriented databases have been proposed, and some have
even been implemented in commercial DBMSs. This paper evaluates a few of them.
Introduction: Service Oriented Architecture (SOA) is an emerging architecture, although it is not necessarily a
new one [1]; companies have been implementing it for some time now. To
understand SOA we first need to understand the concept of a ‘service’. A service is defined as “a software
component that encapsulates a function, has a well defined interface that includes a set of messages that the
service receives and sends, and a set of named operations” [1]. Real world examples of services are
abundant: hairdressing, cooking and cleaning are good examples. Now that we understand
‘services’, it is easier to discuss SOA. An SOA can be divided into three parts: 1) services, 2) applications
that discover and use services and 3) an infrastructure, protocol or medium that connects applications to
services [2]. Web services can also be used to build and deploy an SOA. Web services are defined as follows: “A
Web service is a software system designed to support interoperable machine-to-machine interaction over a
network. It has an interface described in a machine-processable format (specifically WSDL)” [4]. SOA can
also be defined in the context of Web services: “Service-Oriented Architecture is a collection of
distributed, self contained Web services that communicate with each other independently of the context or
state of other services” [3].
SOA and databases: Databases play a vital role in any SOA. Any service, however small, requires data
to access and add value to. It is important to note that every service operates to add value to the task it
operates on. A service could be used to insert data into the database or retrieve data from it.
Database systems are integrated with related applications and can be accessed only through the interfaces
provided by the Application Server (AS) or middleware. There are different kinds of middleware: RPC
(Remote Procedure Call), CORBA (Common Object Request Broker Architecture), J2EE, now Java EE (Java
Enterprise Edition), IBM MQ, TIBCO (The Information Bus Company), webMethods, MOM (Message
Oriented Middleware) and JMS (Java Message Service). All of these technologies are either design patterns or
commercial products used to create an SOA. Databases are also divided into two kinds: central databases
and distributed databases. Service oriented databases concern both kinds,
but mostly distributed databases.
In this paper, when we talk about Service Oriented Databases we are also talking about database
middleware.
Service oriented database or Database Middleware Systems: Database middleware can be defined as
“systems used to integrate collections of data sources over computer network” [5]. Database middleware
systems use the concept of ‘data integration servers’ [5]. The data integration servers provide a uniform view
of the data to the applications. There are two ways of deploying an ‘integration server’: 1) Database
Gateway and 2) Database Mediator [5]. In the Database Gateway approach, the commercial database is
configured to access a remote database through a ‘gateway’. The gateway is responsible for providing
access methods to the remote data [5]. In the second approach, the integration server uses a mediator server
for distributed query processing. The mediator uses ‘wrappers’ to access and modify
the data. In both approaches, the user-defined types and query operators are defined globally
and contained in libraries. These user-defined libraries must be linked to the clients in the system. This
approach has two deficiencies: 1) the inability to deploy application-specific functionality and 2) the inability to
efficiently process user-defined types. The MOCHA architecture tries to address these problems.
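The mediator approach can be illustrated with a small sketch. It is only an illustration under stated assumptions, not code from [5]: the class names CsvWrapper, DictWrapper and Mediator, the field names and the sample data are all hypothetical. Each wrapper translates one source's native format into a common row shape, and the mediator presents the wrapped sources to the application as a single collection.

```python
class CsvWrapper:
    """Hypothetical wrapper for a source that stores rows as 'name,price' text."""

    def __init__(self, lines):
        self.lines = lines

    def rows(self):
        for line in self.lines:
            name, price = line.split(",")
            yield {"name": name, "price": float(price)}


class DictWrapper:
    """Hypothetical wrapper for a source that uses different field names."""

    def __init__(self, records):
        self.records = records

    def rows(self):
        for rec in self.records:
            yield {"name": rec["title"], "price": rec["cost"]}


class Mediator:
    """Presents all wrapped sources to the application as one collection."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, predicate):
        # Pull uniform rows from every wrapper and filter them centrally.
        return [row for w in self.wrappers for row in w.rows() if predicate(row)]


mediator = Mediator([
    CsvWrapper(["pen,1.5", "desk,120.0"]),
    DictWrapper([{"title": "lamp", "cost": 25.0}]),
])
cheap = mediator.query(lambda row: row["price"] < 50)
print(cheap)
```

Note that the predicate is evaluated in the mediator after all rows have been fetched, which is the kind of central processing that the rest of the paper contrasts with code shipping.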
MOCHA: MOCHA stands for Middleware based On a Code SHipping Architecture. MOCHA is a
database middleware system designed to interconnect hundreds of data sources. MOCHA is a self-extensible
middleware system for distributed data sources [5].
A self-extensible middleware: “A self-extensible middleware system is one in which new application-specific
functionality needed for query processing is deployed to remote sites in automatic fashion by the
middleware system itself” [5]. MOCHA achieves this by shipping Java code that provides new capabilities to
remote sites. The shipped Java code can then be used to manipulate data there. This differs from
current database middleware systems, in which the administrator has to
manually install all the code. MOCHA deploys code automatically to provide efficient query processing.
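The code-shipping idea can be illustrated with a minimal sketch. This is not MOCHA's implementation: MOCHA ships Java code, whereas the sketch uses Python source text as a stand-in, and the names RemoteSite, deploy, apply and total_area are hypothetical. It shows a site that starts without an operator, receives the operator's code at run time, and then uses it on local data.

```python
class RemoteSite:
    """A data source that starts with no application-specific operators."""

    def __init__(self, rows):
        self.rows = rows
        self.operators = {}  # operator name -> callable, filled by deploy()

    def deploy(self, name, source):
        """Install shipped code. Assumption: the source defines a function
        called `name`. A real system would verify and sandbox the code."""
        namespace = {}
        exec(source, namespace)
        self.operators[name] = namespace[name]

    def apply(self, name):
        """Run a deployed operator over the local rows."""
        if name not in self.operators:
            raise LookupError(f"operator {name!r} has not been deployed")
        return self.operators[name](self.rows)


# Code held by the middleware, shipped as text instead of being
# installed by hand at each site.
TOTAL_AREA_SOURCE = '''
def total_area(rows):
    return sum(width * height for width, height in rows)
'''

site = RemoteSite(rows=[(2, 3), (4, 5)])
site.deploy("total_area", TOTAL_AREA_SOURCE)
result = site.apply("total_area")
print(result)
```

Only the small result leaves the site, and the site did not need to be restarted or configured by an administrator to gain the new operator.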
MOCHA works efficiently with two kinds of data operators: data-reducing and data-inflating. Data-reducing
operators include aggregates, predicates and so on; for example, the AVG function in Oracle can be considered a
data-reducing operator. MOCHA ships data-reducing operators to the data sources and evaluates them
there. Data-inflating operators, which increase the size of the original data, are
‘evaluated near’ the client. Since in most cases the code is smaller than the actual data, this optimizes query
processing because less data needs to be transferred.
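The placement rule described above can be sketched as follows. This is a simplified illustration, not the optimizer from [5]: the Operator fields, the byte counts and the function names are assumptions made for the example, and a real optimizer would have to estimate these sizes rather than know them.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    input_bytes: int   # size of the data the operator reads
    output_bytes: int  # size of the result it produces
    code_bytes: int    # size of the operator's code if it must be shipped


def choose_site(op: Operator) -> str:
    """Data-reducing operators run at the data source; others run near the client."""
    return "data_source" if op.output_bytes < op.input_bytes else "client"


def bytes_on_network(op: Operator) -> int:
    """Bytes sent over the network under the placement from choose_site."""
    if choose_site(op) == "data_source":
        # Ship the code to the source, then send back only the small result.
        return op.code_bytes + op.output_bytes
    # Send the raw input towards the client and expand it there.
    return op.input_bytes


avg = Operator("avg", input_bytes=1_000_000, output_bytes=8, code_bytes=2_000)
expand = Operator("expand", input_bytes=1_000, output_bytes=50_000, code_bytes=2_000)

print(choose_site(avg), bytes_on_network(avg))
print(choose_site(expand), bytes_on_network(expand))
```

With these made-up sizes, the aggregate moves about 2 KB instead of 1 MB, and the inflating operator moves its 1 KB input instead of its 50 KB output.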
Comparison: The MOCHA approach is very different from current database middleware, in which the data is
processed in the integration server, or the data source evaluates only those operators present in its environment
[5], and no code shipping takes place.
MOCHA was motivated by the Internet and the variety of data sets used across it. Database
middleware services will be effective only if they are scalable and offer efficient query processing, and
MOCHA offers this opportunity. If currently used database middleware were to deploy code, all the
libraries would also need to be configured, and this deployment effort would escalate as the number of
applications across the network increases.
Developers might also need to add extra functionality to the applications to make them work.
MOCHA, on the other hand, provides application-specific functionality to interested sites in automatic
fashion. MOCHA also lets query operators be evaluated at the remote site. The MOCHA architecture is based on
three components: 1) Client Applications, 2) the Query Processing Coordinator (QPC) and 3) the Data-Access
Provider (DAP). The client could be anything: an applet, a servlet or a stand-alone Java application. A client
receives a request from the browser and passes it to the QPC. The QPC's job is to process and optimize the query.
The DAP helps the QPC with query processing and code deployment; the DAP also works with the remote
databases. Query operators used in MOCHA are of two types: 1) projections and predicates and 2)
aggregates. Aggregates are implemented in Java. The operators are applied according to the plans created by the QPC.
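The division of work among the three components can be sketched as below. This is a toy model under stated assumptions, not MOCHA's API: the class and method names, the plan layout and the sample rows are hypothetical, and no optimization or code shipping is performed. The client submits a request, the QPC turns it into a plan, and the DAP applies the predicate, projection and aggregate next to the data.

```python
class DataAccessProvider:
    """DAP: sits next to one data source and runs plan steps on its rows."""

    def __init__(self, rows):
        self.rows = rows

    def execute(self, plan):
        selected = [r for r in self.rows if plan["predicate"](r)]  # predicate
        values = [r[plan["column"]] for r in selected]             # projection
        return plan["aggregate"](values)                           # aggregate


class QueryProcessingCoordinator:
    """QPC: turns a client request into a plan and hands it to a DAP."""

    def __init__(self, dap):
        self.dap = dap

    def run(self, column, predicate, aggregate):
        plan = {"column": column, "predicate": predicate, "aggregate": aggregate}
        return self.dap.execute(plan)


def client(qpc):
    """Client application: submits a request and receives the answer."""
    return qpc.run(
        column="temp",
        predicate=lambda row: row["city"] == "Dallas",
        aggregate=lambda values: sum(values) / len(values),
    )


dap = DataAccessProvider([
    {"city": "Dallas", "temp": 30.0},
    {"city": "Dallas", "temp": 34.0},
    {"city": "Austin", "temp": 36.0},
])
answer = client(QueryProcessingCoordinator(dap))
print(answer)
```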
Important Note: In MOCHA the code deployment phase occurs online as an automatic process, without
human intervention, and there is no need to restart any process to use the functionality received in code
deployment.
Some technical issues with MOCHA: 1) MOCHA implemented an aggressive policy of object allocation
and reuse, which improved memory management. This was lacking in some of the existing JDBC drivers
used in database middleware, where objects were created and used just once.
2) Originally the communication between the QPC and the DAP was done using RMI, which was found to be slow.
The MOCHA authors therefore created a communication platform using Java sockets.
3) Security was provided to prevent dangerous code from being executed on the host machine. The
Java security infrastructure was used for this purpose.
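The object reuse policy in item 1 can be sketched with a small buffer pool. This is a generic illustration of the technique, not MOCHA's memory manager: the BufferPool class, the buffer size and the sample rows are assumptions. Instead of allocating a fresh object for every row and discarding it after one use, the pool hands the same buffer out again.

```python
class BufferPool:
    """Hands out reusable bytearray buffers instead of allocating new ones."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.free = []       # buffers waiting to be reused
        self.allocated = 0   # how many buffers were ever created

    def acquire(self):
        if self.free:
            return self.free.pop()
        self.allocated += 1
        return bytearray(self.buffer_size)

    def release(self, buf):
        buf[:] = bytes(self.buffer_size)  # clear old contents before reuse
        self.free.append(buf)


pool = BufferPool(buffer_size=16)
for row in (b"alpha", b"beta", b"gamma"):
    buf = pool.acquire()
    buf[:len(row)] = row   # fill the buffer with the current row
    pool.release(buf)

print(pool.allocated)
```

Three rows are processed, but only one buffer is ever created.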
We now discuss a different architecture, which is also implemented in a commercial product.
SODA: SODA stands for Service Oriented Database Architecture [6]; it was developed for the Microsoft SQL
Server DBMS. The SODA architecture was developed with one thing in mind: the need for loosely coupled
databases for loosely coupled applications. It was found that the core of many loosely coupled SOA systems
was a monolithic, clustered database. The SQL Server development team decided to add functionality to
the system to make it more service oriented. The result of adding these features was SODA.
The following features are added in the upcoming SQL Server.
1) SQLCLR: SQL (Structured Query Language) and CLR (Common Language Runtime).
The CLR is embedded in the core database engine.
2) Database Change Notification: a representation of complex queries so that database change
notifications can be provided without the need to poll the database.
3) Native Web Service Access: the administrator can publish a procedure as a web method, which allows
SOAP-compliant clients to access services directly from the database engine.
4) Service Broker: it is service centric rather than message centric, which is a significant benefit.
SODA, what a database SOA must include:
1) It must support and act as a service host.
2) It may be able to directly process and transform service requests.
3) It must provide efficient and easily programmed logic.
SODA, why SOA in the database?
1) Need to maintain messages in queues persistently.
2) Ability to scale services up and down.
3) Easy support for grid computing.
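The first reason, keeping messages in queues persistently, can be illustrated with a queue stored in a database table. The sketch uses Python's built-in sqlite3 module as a stand-in and is not SQL Server's Service Broker; the table name, the function names and the messages are hypothetical. Because the queue lives in the database, messages survive the service being stopped and restarted.

```python
import os
import sqlite3
import tempfile


def open_queue(path):
    """Open (or create) a queue table stored in a database file."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def enqueue(conn, body):
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO queue (body) VALUES (?)", (body,))


def dequeue(conn):
    """Remove and return the oldest message, or None if the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, body FROM queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
        return row[1]


path = os.path.join(tempfile.mkdtemp(), "queue.db")
conn = open_queue(path)
enqueue(conn, "order-1")
enqueue(conn, "order-2")
conn.close()  # simulate the service stopping

conn = open_queue(path)  # messages are still there after reopening
first = dequeue(conn)
second = dequeue(conn)
third = dequeue(conn)
conn.close()
print(first, second, third)
```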
Conclusion: If a comparison is made between these two architectures, I would prefer SODA to MOCHA.
MOCHA is an excellent system for service oriented architectures, with features such as memory
management and the automatic extension and deployment of code. SODA, however, takes care of all those SOA
oriented demands while also taking care of the typical client-server demands. Moreover, if a current
architecture already uses SQL Server or another such commercial database, it is easier to adopt the
additional features provided by such systems than to move to an entirely new system or architecture like
MOCHA. This is from the business perspective; from the technical perspective both systems offer
great opportunities. In the future, SQL Server and other such commercial systems may be expected to include more
features such as those offered by MOCHA. It is also important to note that MOCHA was proposed at the turn
of the century, about 7 years ago, whereas SODA is quite current. Although
good architectures are proposed, it takes ample time to implement them in commercial systems.
References:
[1] Gennaro (Jerry) Cuomo, IBM SOA “on the Edge”, SIGMOD 2005.
[2] Mira Kajko-Mattsson, Grace A. Lewis, Dennis B. Smith, A Framework for Roles for Development,
Evolution and Maintenance of SOA-Based Systems, International Workshop on Systems Development in
SOA Environments (SDSOA'07), 2007.
[3] Dov Dori, SODA: Not Just a Drink! From an Object-Centered to a Balanced Object-Process Model-Based
Enterprise Systems Development, Proceedings of the Fourth Workshop on Model-Based
Development of Computer-Based Systems and Third International Workshop on Model-Based
Methodologies for Pervasive and Embedded Software, 2006.
[4] Web Services Architecture, http://www.w3.org/TR/2004/NOTE-ws-arch-20040211/, 2004.
[5] Manuel Rodríguez-Martínez, Nick Roussopoulos, MOCHA: A Self-Extensible Database
Middleware System for Distributed Data Sources, SIGMOD 2000, Dallas, TX, USA.
[6] David Campbell, Service Oriented Database Architecture: App Server-Lite?, SIGMOD 2005.