A Layered Architecture based on Java for
Internet and Intranet Information Systems
Fidel CACHEDA, Alberto PAN, Lucía ARDAO, Ángel VIÑA
Departamento de Electrónica y Sistemas
Facultad de Informática, Universidad de A Coruña
Campus de Elviña s/n, 15071 A CORUÑA
Telephone: +34-981-167000 Ext. 1323 Fax: +34-981-167160
Email: {fidel, alberto, lucia, avc}@gris.des.fi.udc.es
Abstract. In this paper we present an architecture for building Information Systems
that can be adapted to several runtime environments. Our architecture is structured
in three layers: client, services and data. The layers are independent of one another,
which makes the system more flexible and scalable. The core of the IS is implemented in
Java to achieve platform and database independence. BIWE (http://www.biwe.es), one of
the most popular Spanish Internet Search Directories, implements this
architecture.
1. Introduction
Since the World Wide Web started at CERN in 1991 its growth has been remarkable, owing
to the large amount of heterogeneous information (e.g. personal Web pages, research
publications, etc.) that it can support. One of the main factors that has contributed to the
growth of the Internet has been the Information Systems (IS’s).
An IS is an organized combination of people, hardware, software, communication
networks and data resources that collects, transforms and delivers information [1]. The
importance of handling the information has forced many enterprises and organizations to
use these systems to manage their Intranet and Internet services.
IS’s solve the problem of continuous information updates by generating content
dynamically, so that users are not presented with obsolete information.
Another important characteristic of current IS’s is the ability to adapt to
several runtime environments. This characteristic lets the same IS be valid for the Internet
and for Intranets, and even for very dynamic environments where the number of users and the
amount and type of information are continuously changing.
Here we present a multi-platform architecture for IS’s that allows system adaptations,
creating a flexible and scalable IS. We also present an example implementation of the
architecture in a Spanish Internet Search Directory called BIWE.
In the next section we describe the objectives we want to achieve with this architecture. In
Section 3 we explain the different layers of the proposed architecture, and in Section
4 we describe the implementation. In Section 5 we describe the industrial benefits of our
architecture and in Section 6 we show several configurations for the BIWE implementation.
Finally, we present the conclusions drawn from the implementation of our architecture.
2. Objectives
We believe that an IS suitable for different and changing environments should address the
following basic requirements:
• Platform independence: The system should support different Operating Systems (OS) in
order to obtain an IS adaptable to any kind of environment. This would allow, for
instance, a low-load service to run on an ordinary PC running Windows, while a
high-load service could run on a high-performance UNIX server.
• Database independence: This is directly related to the previous requirement. To achieve
total independence, the IS should not impose any restriction on the database management
system used in each environment. Both requirements are useful when the system is purchased
by third-party entities, since the system will run seamlessly in the environment of the
buyer (where environment means operating system and database
management system).
• Communication protocol independence: An IS should be designed to use any
communication protocol with clients and any language as user interface. Therefore, an
IS could choose the most suitable user interface for each environment.
• Extensibility: The system architecture should make it easy to add new
services and features. Also, the IS should allow its users to develop their own services
(apart from the existing services) without knowledge of the internal workings of the
system, just using the open frameworks on which all the services are based.
• Scalability: The IS should provide a layered architecture that can be distributed in
different ways depending on the system requirements and the network configuration of
its environment. This will let the system perform better in different
environments and be prepared for changes in the system requirements (on the
Internet it is common for a successful service to increase its number of users by
a factor of 10 or even more).
3. Description of the architecture
Nowadays the most commonly used architecture for the development of IS’s on the Internet is a
three-layer architecture [4]. The proposed architecture is built on a data layer, a services
layer and a client layer. The layers can be connected through a network, so they do not
have to reside on the same machine. Figure 1 shows a diagram of
the architecture with its different layers and interconnections.
Figure 1: The high-level architecture diagram
The data layer includes all the elements of the system that store data on a physical
device. The data layer consists of a database management system, although any other
storage system that can exchange information with the next layer according to the
specified interfaces could be used. The tasks of this layer are to store, update and retrieve
data, creating a central point for any data access. The connection with the services layer is a
key issue, because achieving total independence between both layers is
very important. Independence between these layers means that the IS will be able to
change the database management system (the core of the data layer) without any changes in
the services layer. There are many different types of interfaces between databases and
services (ODBC, JDBC, PL/SQL, etc.); we deal with this subject in the next
section.
The main task of the services layer is to provide all the services requested by the client
layer using, if necessary, the data layer to access stored information. The services
layer is the most complex and important part of the architecture from the point of view of
its independence from the other layers. The elements of this layer are a Web server, a set
of services, a Data Access Framework and a Graphic Interface Framework (see Figure 2).
Figure 2: The services layer diagram
The services are provided through a Web server (the most suitable way to offer services over
the Internet or Intranets), which is the point of interaction with the client layer.
The Data Access Framework is the only element of this layer that interacts with the
data layer, so it is the central point for any access to information stored in the database.
Furthermore, this element isolates the rest of the services layer from logical changes in
the database, making the database transparent to the services.
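As a minimal sketch of this idea (not the BIWE code itself), the Data Access Framework can be
pictured as a Java interface that services program against, while a concrete class is the only
place that knows how the data is stored. All names below (CategoryStore, InMemoryCategoryStore,
ListingService) are hypothetical, the example is written in present-day Java, and an in-memory
map stands in for the database so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical data access interface: the only view of the data layer
// that services are allowed to see.
interface CategoryStore {
    void addSite(String category, String url);
    List<String> sitesIn(String category);
}

// Stand-in implementation backed by a map. A JDBC-backed class could
// implement the same interface without any change to the services.
class InMemoryCategoryStore implements CategoryStore {
    private final Map<String, List<String>> sites = new HashMap<>();

    @Override
    public void addSite(String category, String url) {
        sites.computeIfAbsent(category, k -> new ArrayList<>()).add(url);
    }

    @Override
    public List<String> sitesIn(String category) {
        // Return a copy so callers cannot modify the stored data.
        List<String> copy = new ArrayList<>();
        List<String> found = sites.get(category);
        if (found != null) {
            copy.addAll(found);
        }
        return copy;
    }
}

// A service depends on the interface only, never on the storage technology.
class ListingService {
    private final CategoryStore store;

    ListingService(CategoryStore store) {
        this.store = store;
    }

    int countSites(String category) {
        return store.sitesIn(category).size();
    }
}

class DataAccessDemo {
    public static void main(String[] args) {
        CategoryStore store = new InMemoryCategoryStore();
        store.addSite("universities", "http://example.org/udc");
        ListingService service = new ListingService(store);
        System.out.println(service.countSites("universities"));
    }
}
```

Replacing InMemoryCategoryStore with a class that talks to a real database would leave
ListingService untouched, which is the isolation property described above.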
The Graphic Interface Framework generates the HTML code that is sent to the client
layer over the Internet. For this purpose, this element is divided into two components: a basic
component that generates the HTML code, and a more complex component that takes data
structures from the data layer and uses the basic component to generate a graphic interface
to be used by all services. Thanks to this division, the framework, which for an Internet IS
uses HTML to generate the graphic interface, could generate any other type of
communication language suitable for another kind of system (e.g., PDF), and the rest of the
services layer would not need any changes. Of course, the client layer should be able to
interact using this new standard of communication.
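The two-component split can be sketched as follows. This is an illustrative assumption about
how such a framework might be organized, not the actual BIWE classes: MarkupWriter plays the
role of the basic component, ResultPage the role of the more complex component, and
PlainTextWriter is a hypothetical alternative output language used here in place of PDF so
that the example stays self-contained.

```java
import java.util.Arrays;
import java.util.List;

// Basic component (hypothetical): knows how to emit one output language.
interface MarkupWriter {
    String heading(String text);
    String line(String text);
}

// HTML version of the basic component.
class HtmlWriter implements MarkupWriter {
    @Override
    public String heading(String text) {
        return "<h1>" + escape(text) + "</h1>\n";
    }

    @Override
    public String line(String text) {
        return "<p>" + escape(text) + "</p>\n";
    }

    // Minimal escaping so that data values cannot break the generated markup.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

// Alternative basic component: plain text instead of HTML.
class PlainTextWriter implements MarkupWriter {
    @Override
    public String heading(String text) {
        return "# " + text + "\n";
    }

    @Override
    public String line(String text) {
        return text + "\n";
    }
}

// More complex component: turns a data structure into a complete view
// using only the operations of the basic component.
class ResultPage {
    private final MarkupWriter writer;

    ResultPage(MarkupWriter writer) {
        this.writer = writer;
    }

    String render(String title, List<String> results) {
        StringBuilder out = new StringBuilder(writer.heading(title));
        for (String result : results) {
            out.append(writer.line(result));
        }
        return out.toString();
    }
}

class InterfaceDemo {
    public static void main(String[] args) {
        List<String> results = Arrays.asList("first site", "second site");
        System.out.print(new ResultPage(new HtmlWriter()).render("Results", results));
        System.out.print(new ResultPage(new PlainTextWriter()).render("Results", results));
    }
}
```

Services would call only ResultPage, so switching the output language means supplying a
different MarkupWriter and nothing else.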
Finally, the client layer interacts with the services layer and is the interface between
users and services. In our system, this layer consists basically of a Web browser
that allows users to see the results of their requests to the different services. This layer
is present for all users, but at the same time it is the simplest layer. However,
this layer becomes more useful if it is able to execute services that come directly from
the services layer over the network. This means that the client layer will have a part of the
services layer running on the same machine. Not all the services will move to this layer, only
specific services that can be executed there. These services will not need any interaction
with the Graphic Interface Framework, since they are just applications running on the client
side, but they may need to interact with the Data Access Framework. This interaction can
take place in two different ways: moving the Data Access Framework to the client layer (so
that the Data Access Framework accesses the data layer through the Internet) or interacting
with the Data Access Framework directly through the Internet [3]. Figure 3 shows the first
configuration for the client layer.
Figure 3: The client layer diagram
From a generic point of view, the main advantage of this architecture is the flexibility
provided by the independence between the different layers. This means that the core of the
IS can be distributed over several computers (we will see an example of this configuration in
Section 6). Moreover, this flexibility also improves the robustness of the system, because
services and/or data can be replicated on different computers, obtaining a fault-tolerant
system.
More specifically, this architecture forces the services to access the data layer through
the data access interface, which makes it possible to perform many changes on the database
without having to change the services, only the Data Access Framework. In the same way,
as the services use the Graphic Interface Framework to generate the graphic interface, the
client layer could use any other type of language instead of HTML, and only this framework
would have to change.
Another advantage of this architecture is the fact that the data access and the graphic
interface components are both frameworks, so developers can design new services without
having to know anything about the data and the client layers. Moreover, external users
could develop their own services using these frameworks, distributing the services layer over
many different servers.
4. Implementation
At this point, we summarize the main implementation decisions. Our decisions aim at
achieving OS and database management system independence.
In the data layer a relational database has been used, and independent
database access has been achieved using the JDBC 1.0 API. For the implementation of the
Internet Search Directory BIWE, we have used the Oracle 7.3.3 database management system
and the Weblogic JDBC 1.0 driver for Oracle.
The services layer has been written in JAVA. The services have been mainly
implemented as JAVA servlets instead of traditional CGIs. All the system administration is
centralized in a JAVA applet and some administration services have been implemented as
standalone JAVA applications. Below we detail the reasons for these decisions.
The JAVA language was chosen mainly because of its platform independence. Other
JAVA features that were considered useful for our implementation were the following:
• JAVA is an Object-Oriented language. This allows an easy and natural implementation
of our object-oriented architecture.
• Multi-threading support in JAVA is very flexible, easy to program and efficient
enough for our purposes.
• JAVA is quickly becoming the "de facto" standard for building distributed applications
on the Internet. There is already a significant number of completed and ongoing efforts
to improve the development of distributed applications in JAVA.
JAVA servlets were chosen as the main way to implement front-end services because of
their advantages over traditional CGIs:
• Servlets are a standard way of executing server-side JAVA programs. With them we
achieve both platform independence and web server independence (since the main
web servers on the market already provide servlet support).
• Servlets allow multi-threading. They are loaded in memory only once and launch a
different thread for each request, which is more efficient than the execution of a CGI.
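The execution model described in the second point can be illustrated without the Servlet API
itself. In the sketch below, SearchHandler is a hypothetical stand-in for a servlet: one
instance is created once and is then called from several pool threads, one call per request,
whereas a CGI would start a new process for every request. The class names and the pool size
are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a servlet: created once, shared by all requests.
class SearchHandler {
    // The instance is shared between threads, so its state must be thread-safe.
    private final AtomicInteger served = new AtomicInteger();

    String handle(String query) {
        served.incrementAndGet();
        return "results for " + query;
    }

    int served() {
        return served.get();
    }
}

class ThreadPerRequestDemo {
    // Dispatches every query to the single handler instance on a pool thread
    // and returns the responses in the order of the queries.
    static List<String> serveAll(SearchHandler handler, List<String> queries) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String query : queries) {
                Callable<String> task = () -> handler.handle(query);
                pending.add(pool.submit(task));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> future : pending) {
                responses.add(future.get());
            }
            return responses;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("request failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        SearchHandler handler = new SearchHandler();
        List<String> out = serveAll(handler, Arrays.asList("java", "jdbc", "servlet"));
        System.out.println(out);
        System.out.println("requests served by one instance: " + handler.served());
    }
}
```

The price of sharing one instance is that any state kept in the handler must be safe for
concurrent use, which is why the counter is an AtomicInteger rather than a plain int.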
We chose a relational database system because it provides reliable, efficient and robust
access to data, along with facilities for creating secure infrastructures.
To access the relational database, JDBC has been used to achieve platform and
database independence. Nevertheless, it is important to point out that, like other
database-independent methods for accessing data (such as ODBC), JDBC can be slower than
native database-dependent methods. So, for services where efficiency could be critical we
recommend drawing up a test plan to find out whether the difference is actually relevant
in that specific case. In our system, the tests showed that the
difference was irrelevant, which also leads us to presume that it would make a difference
in very few systems.
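A test plan of this kind can start from a very small timing harness. The sketch below is an
assumption about how such a measurement might be set up, not the procedure used for BIWE:
AccessTimer is a hypothetical helper that runs a task a few times untimed and then reports the
mean duration of the timed runs. The same query issued through JDBC and through a native
interface would each be wrapped in a Runnable and measured with identical parameters.

```java
// Hypothetical timing helper for comparing two data access paths.
class AccessTimer {
    // Runs the task `warmup` times without timing it, then returns the mean
    // duration in nanoseconds of `runs` timed executions.
    static long averageNanos(Runnable task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }
}

class AccessTimerDemo {
    public static void main(String[] args) {
        // Placeholder workload; a real test would execute a database query here.
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++) {
                sb.append(i);
            }
        };
        long mean = AccessTimer.averageNanos(workload, 10, 100);
        System.out.println("mean time per call (ns): " + mean);
    }
}
```

Measurements of this kind are only meaningful when taken against the real database and under
a realistic load, since network and query costs usually dominate the driver overhead.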
In any case, our object-oriented design would let us seamlessly create child classes of
the classes currently used for accessing the database, so that those services that
need to use native interfaces to the database could do so without changing the
programming interface.
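A minimal sketch of this design choice is given below, under stated assumptions: SiteFinder
and NativeSiteFinder are hypothetical names, and a list of URLs stands in for the database
table, so neither class contains real JDBC or native calls. What the example shows is only
the mechanism: the public method used by services is fixed, and a child class replaces the
protected access step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical access class. In a real system fetchUrls would issue a
// portable JDBC query; here a list stands in for the database table.
class SiteFinder {
    protected final List<String> table;

    SiteFinder(List<String> table) {
        this.table = table;
    }

    // Programming interface used by services; child classes cannot change it.
    public final List<String> find(String keyword) {
        return fetchUrls(keyword.toLowerCase(Locale.ROOT));
    }

    // Portable access path: a simple scan standing in for a generic query.
    protected List<String> fetchUrls(String keyword) {
        List<String> hits = new ArrayList<>();
        for (String url : table) {
            if (url.toLowerCase(Locale.ROOT).contains(keyword)) {
                hits.add(url);
            }
        }
        return hits;
    }
}

// Child class standing in for a database-specific access path.
class NativeSiteFinder extends SiteFinder {
    int nativeCalls = 0;

    NativeSiteFinder(List<String> table) {
        super(table);
    }

    @Override
    protected List<String> fetchUrls(String keyword) {
        // A call to a native database interface would go here; this stand-in
        // only counts the call and reuses the portable scan.
        nativeCalls++;
        return super.fetchUrls(keyword);
    }
}

class NativeAccessDemo {
    public static void main(String[] args) {
        List<String> table = Arrays.asList("http://example.org/java", "http://example.org/news");
        SiteFinder finder = new NativeSiteFinder(table);
        // The calling code is identical for both classes.
        System.out.println(finder.find("Java"));
    }
}
```

A service holding a SiteFinder reference does not need to know which of the two classes it
was given, so a critical service can be moved to the native path without being rewritten.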
5. Industrial benefits
An IS based on this architecture can be implemented on different Operating Systems and
can also use different database management systems without any changes to the system.
This means that the system can be installed in any enterprise using the hardware and
software available. At the same time, the system could start with a reduced investment and
later, thanks to its distributed nature, increase its performance with a modest
upgrade of the equipment.
Another important benefit of this architecture is that users (or customers) can develop
their own services using the open frameworks of the IS. This implies that third-party
entities can easily adapt the IS to their new requirements themselves.
All these benefits mean that the IS’s developed using this architecture are immediately
exportable to third-party entities (enterprises, institutions, etc.) because of their easy
adaptation to any environment and to new requirements.
6. Architecture Configurations
Two main features of our Internet Search Directory are:
• It can work with any database management system with an available JDBC driver.
• It can work with any Web server with JAVA servlets support.
These features, along with the capability of distributing the architecture layers over one or
several computers, make many valid configurations possible. We now detail two of them
that we have implemented and tested on our Search Directory.
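One way to switch between such configurations without touching the services is to keep the
JDBC driver class and the database URL in a properties file. The sketch below assumes this
approach; it is not taken from BIWE. The class DataLayerConfig, the keys db.driver and db.url,
and the example values are all hypothetical placeholders, while Properties, Class.forName and
DriverManager.getConnection are standard Java APIs.

```java
import java.io.IOException;
import java.io.StringReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical holder for the two settings that differ between deployments.
class DataLayerConfig {
    final String driverClass;
    final String url;

    DataLayerConfig(String driverClass, String url) {
        this.driverClass = driverClass;
        this.url = url;
    }

    // Parses text in java.util.Properties format; both keys are required.
    static DataLayerConfig parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("unreadable configuration", e);
        }
        String driver = props.getProperty("db.driver");
        String url = props.getProperty("db.url");
        if (driver == null || url == null) {
            throw new IllegalArgumentException("db.driver and db.url are required");
        }
        return new DataLayerConfig(driver, url);
    }

    // Opens a connection with whatever driver the deployment names. Loading
    // the class explicitly is needed for older drivers and harmless for newer ones.
    Connection open(String user, String password) throws ClassNotFoundException, SQLException {
        Class.forName(driverClass);
        return DriverManager.getConnection(url, user, password);
    }
}

class ConfigDemo {
    public static void main(String[] args) {
        // Placeholder values: same machine in one deployment, remote host in another.
        String sameMachine = "db.driver=com.example.Driver\ndb.url=jdbc:example://localhost/biwe\n";
        String remoteHost = "db.driver=com.example.Driver\ndb.url=jdbc:example://dbhost/biwe\n";
        System.out.println(DataLayerConfig.parse(sameMachine).url);
        System.out.println(DataLayerConfig.parse(remoteHost).url);
    }
}
```

With this arrangement, moving the data layer to another computer or to another database
product changes two lines of configuration and no service code.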
6.1 Configuration I: 2 layers - 1 computer
This is the simplest configuration, where the data layer and the services layer run on the
same machine. We are using a Sun Microsystems Ultra Sparc 10 with one processor
running at 300 MHz and 256 Mbytes of RAM. From the software point of view, the OS is
Solaris 2.5.1, the database management system is Oracle 7.3.3 and the Web server is SUN
Java Web Server 1.2.
Figure 4: The configuration I diagram
This is the configuration currently used in the production system. Although the
system answers more than a million queries per month and runs on low-cost
hardware, it maintains fast response times, even under high load.
The main advantage of this configuration is:
• The reduced cost and the high performance obtained.
The disadvantages of this configuration are:
• The reduced ability of the system to grow once the maximum capacities of the
computer are reached.
• The Internet Search Directory is poorly protected, since all the core layers of the
information system are exposed to attacks through the Internet.
6.2 Configuration II: 1 layer - 1 computer
This configuration tries to make use of the characteristics of the proposed architecture. The
main aspect of this configuration is that the data and the services layers are on different
computers, which also run different operating systems.
In this configuration the data layer is placed on a Sun Microsystems Ultra Enterprise
3000 with two processors at 167 MHz and 256 Mbytes of RAM, running
Solaris 2.5.1. From the software point of view, this layer uses the same database
management system as the previous configuration, Oracle 7.3.3.
The services layer is placed on a less powerful computer, a Pentium with one processor
at 166 MHz and 64 Mbytes of RAM, with Linux as the operating system. In this case,
the software requirements are different from the previous case: we use the Apache
Web server and the JRun Servlet Engine to support all the services developed as servlets.
Both computers are on the same LAN.
Figure 5: The configuration II diagram
The purpose of this configuration is to divide the workload of the Search Directory. This
is the reason why the database management system is located on a powerful computer: the
main workload of the system lies in database access, as a result of the many simultaneous
information requests made by users. The services layer, on the other hand, only
processes the results obtained from the data layer, so its requirements are lower.
The advantages of this configuration are:
• The performance obtained by the Internet Search Directory is much higher.
• The system is more secure, because a firewall can be installed so that only the
computer hosting the services layer is exposed, keeping the data layer in a secure environment.
The main disadvantage of this configuration is:
• The system is not as fault tolerant as in the previous configuration, because the correct
functioning of the Search Directory depends on two different computers and the
connection between them.
7. Conclusions
We have presented an architecture for building IS’s adaptable to different environments.
Our architecture is structured in three layers and aims to provide scalability,
extensibility, efficiency and platform independence (both OS and database independence).
Our first working system following this architecture is BIWE, one of the most popular
Spanish Internet Search Directories. BIWE has been implemented in JAVA, using JDBC
1.0 to access an Oracle 7.3.3 database. JAVA servlets have been used for building
services on the services layer.
Using this architecture and implementation we have built two different configurations for
BIWE. The first one is simple and low-cost, yet offers very high
performance. The second one uses the advantages of the architecture to build a more
distributed and efficient system with a considerable performance improvement, showing how
easily the performance of the system can be increased.
Future research on the proposed architecture will try to distribute the different
layers of our IS. We will start by distributing the data layer, then the services layer, and
finally we will distribute both the data and services layers.
References
[1] J. A. O’Brien, "Introduction to Information Systems. An Internetworked Enterprise Perspective", Second
Alternate Edition, Ed. Irwin/McGraw-Hill, ISBN: 0-256-25196-7.
[2] V.N. Gudivada et al., "Information Retrieval on the World Wide Web", IEEE Internet Computing, Vol. 1,
No 5, Sept. 1997, pp 58-68.
[3] A. Puliafito, O. Tomarchio, L. Vita and K.S. Trivedi, "Increasing application accessibility through Java",
IEEE Internet Computing, Vol. 2, No 4, July 1998, pp 70-87.
[4] G. Wiederhold, "Mediators in the Architecture of Future Information Systems.", IEEE Computer, March
1992, pp 38-49.