Selecting a Database Management System (DBMS) – A Practical and Quasi-Technical Analysis of Relational Database Management Systems (RDBMS) and
Object Database Management Systems (ODBMS)
Christine Weiss
University of Colorado at Colorado Springs
1. Introduction
The stronghold or dominance of one
product in a particular line of business can
lead to the selection of a product based not
on appropriateness and fit for the need but
rather on reputation and word of mouth.
This would seem to be an easy pitfall for
consumers who are uneducated about and
unfamiliar with the product’s domain;
however, it is also prevalent in areas where
the consumer has limited to moderate
experience with and knowledge of the
domain in question.
Relational database management systems
(RDBMS) currently dominate the database
market for commercial business
applications, and with this trend there arises
an assumption that the relational model and
the RDBMS are best suited for most, if not
all, business-driven applications. This
document explores the validity of this
dominance and makes a case for the need
to select a database management system
(DBMS) for an application based on the
application’s purpose, desired use and
functionality, and application environment,
independent of product reputation or
prevalence in the market. This article
discusses two types of database
management systems currently available on
the market: relational database management
systems (RDBMS) and object-oriented
database management systems
(OODBMS). This document is not intended
to present or identify one DBMS as superior
or ideal; rather, the goal is to encourage
the selection of a DBMS based on analysis
and evaluation of the database’s purpose
and desired competencies.
1.1. Background
There is a tendency in business, particularly
in the technology field, to depend upon the
knowledge and expertise of individuals with
strong technical backgrounds when a project
is forced to make ‘technical decisions’.
Although this does tend to work out
favorably for the end product and the
project team, there remains a need or,
rather, a desire for individuals in
decision-making positions and/or positions
of influence who have a limited to moderate
technical background to weigh in and
provide useful input to technical decisions.
This article is aimed at this latter group
and presents ideas that have a technical
basis mixed with real-world application and
understandability.
The author of this article typically fills
that role on a project team, possessing a
basic to moderately strong technical
background and understanding. When
faced with making a technical decision
alone on what type of database
management system to use for a master’s
project, there emerged a desire to research
the different database management
systems available and present the
findings for others falling into this
quasi-technical role.
Therefore, the purpose of this research
paper is to provide sound reasoning and
explanatory information so that a reader
without extensive technical knowledge can
make, or provide valuable input to,
decisions regarding the selection of a
database management system for a
particular project. The following questions
were proposed and served as guidance prior
to and during research. The article should
answer:
● What are some of the different types of
DBMS available on the market?
● What effects or influence does the
domain of the project or the purpose of
the application have on selecting a
DBMS?
● What is the relationship between the
front-end implementation and the
DBMS? How does this relationship
affect the selection process?
● What are the advantages and
disadvantages of each DBMS?
2. Types of DBMS
Two different types of DBMS were selected
for evaluation and analysis: the relational
database management system and the
object-oriented database management
system. Although other DBMS do exist,
these systems were selected based on the
large number of commercial products
available in the market for each system and
their contrasting data models.
A DBMS is a database software program
used to catalog, store, maintain, and
retrieve data in a database. The key
characteristics of a database management
system are:
● Use of a data model for designing and
outlining the database schema;
● A standard language for querying the
database which enables the user to
retrieve, modify, and specify storage of
data;
● Data structures; and
● A method for performing transactions
aimed at ensuring the four key features
of database transactions: atomicity,
consistency, isolation, and durability
(ACID).
2.1. Relational Database Management
System (RDBMS)
A relational database management system
uses the relational model as its basis. The
relational database model was created by
Edgar F. Codd at IBM during the early
1970s and is based on set theory,
structuring data in terms of rows and
columns. Codd defined 12 rules, together
with a foundation rule (often numbered
Rule 0), by which a truly relational
database is defined: foundation rule;
information rule; guaranteed access rule;
systematic treatment of null values;
dynamic on-line catalog based on the
relational model; comprehensive data
sublanguage rule; view updating rule;
high-level insert, update, and delete;
physical data independence; logical data
independence; integrity independence;
distribution independence; and the
nonsubversion rule. Interestingly, no
commercial RDBMS to date has ever been
able to conform to all 12 of Codd’s rules.
Because of this, the 12 rules have evolved
into guiding principles or goals of database
design.
The RDBMS structures data into relations
(tables), which form a two-dimensional
representation of the data in rows and
columns. A relation contains tuples (rows),
and each tuple represents a distinct record
in the table. A tuple consists of a set of
unordered attributes (columns) providing
detail for the record. Rows are assigned a
unique identifier, also known as a primary
key, by which the record can be accessed,
manipulated, and referenced by other tables
or applications. Columns store the
attributes of a record, more commonly
known as fields, and each attribute is
assigned a data type.
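The terms above can be made concrete with a
small sketch. It uses Python's built-in
sqlite3 module purely as a convenient
stand-in for an RDBMS; the employee table
and its columns are hypothetical and do not
come from any product discussed in this
article.

```python
import sqlite3


def build_employee_relation() -> sqlite3.Connection:
    """Create an in-memory database holding one relation (table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE employee (
            emp_id    INTEGER PRIMARY KEY,  -- unique identifier of each tuple
            name      TEXT NOT NULL,        -- attribute with a text data type
            hire_year INTEGER NOT NULL      -- attribute with an integer data type
        )
        """
    )
    # Each inserted row is one tuple of the relation.
    conn.executemany(
        "INSERT INTO employee (emp_id, name, hire_year) VALUES (?, ?, ?)",
        [(1, "Ada", 2004), (2, "Grace", 2006)],
    )
    return conn


conn = build_employee_relation()

# Access a single record through its primary key.
row = conn.execute(
    "SELECT name, hire_year FROM employee WHERE emp_id = ?", (2,)
).fetchone()
print(row)
```

The primary key is what lets the last query
address exactly one record; the other two
columns are ordinary typed attributes.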
During the mid 1980s, Structured Query
Language (SQL) was identified and
accepted as the standard query language
and transaction mechanism for RDBMS.
SQL queries can be used to access and
return data from tables, to define records
and their attributes, and to view data from
multiple tables through operations such as
a join.
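As an illustration of viewing data from
multiple tables, the following sketch joins
two hypothetical tables (department and
employee) using Python's built-in sqlite3
module. The schema is an assumption made
for the example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Engineering'), (20, 'Finance');
    INSERT INTO employee VALUES (1, 'Ada', 10), (2, 'Grace', 10), (3, 'Edgar', 20);
    """
)


def employees_with_departments(connection):
    """Combine rows from both tables by matching their dept_id values."""
    query = """
        SELECT e.name, d.dept_name
        FROM employee AS e
        JOIN department AS d ON d.dept_id = e.dept_id
        ORDER BY e.emp_id
    """
    return connection.execute(query).fetchall()


rows = employees_with_departments(conn)
for name, dept_name in rows:
    print(name, dept_name)
```

The query states what is wanted (names
paired with department names) and leaves
the matching of rows to the database.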
Two of the most popular examples of
RDBMS currently on the market are Oracle
and Microsoft Access.
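The role of SQL as a transaction mechanism,
and the atomicity property in particular,
can be sketched as follows. The example
uses Python's sqlite3 module rather than
either product named above, and the account
table and transfer function are
hypothetical.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move funds between two accounts as a single atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (source,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds and is committed
failed = transfer(conn, "alice", "bob", 500)  # both updates are rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

The second transfer leaves no trace: either
both updates are applied or neither is,
which is the behavior atomicity requires.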
2.2. Object-Oriented Database
Management System (OODBMS)
As implied by the name, object-oriented
database management systems use the
object-oriented model (OOM) in much the
same way that object-oriented programming
languages (OOPL) adhere to the OOM.
Research on OODBMS dates back to the late
1970s and early 1980s; however, the term
‘object-oriented’ wasn’t coined for
databases of this type until the mid 1980s.
The first commercial product did not appear
on the market until the late 1980s. More
recently, there has been a resurgence in
OODBMS as open source object databases
emerged that were more affordable and
user-friendly due to the use of OO
languages such as Java and C#.
The relationship between an OODBMS and an
OOPL establishes a strong correlation
between the application data model and the
database data model. This relationship, or
marriage, of the OODBMS and OOPL
serves as a focal point for analysis and
evaluation of OODBMS.
An OODBMS is the combination of
object-oriented programming methodologies
(i.e., encapsulation, inheritance,
abstraction) and basic database management
principles that help to ensure ACID
properties are met.
One of the primary features of an OODBMS
is that “accessing objects in the database is
done in a transparent manner such that
interaction with persistent objects is no
different from interacting with in-memory
objects” (Obasanjo, 1). In addition, the
OODBMS employs the same mechanisms
for retrieving and modifying stored object
data as the OOPL would utilize to perform
the same actions on an object in the
application’s cache. One way to picture how
an OODBMS operates is to consider the
process of saving data from an application
developed using an OOPL to a flat file. The
system saves specific instances of an object
or multiple objects to a file using the object
identifier (OID) as its key, and these objects
are recreated from the saved data upon
opening.
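The flat-file analogy can be sketched in a
few lines. This is only an illustration of
the idea, not how any particular OODBMS is
implemented: the Part class, the JSON file
format, and the use of a random UUID as a
stand-in for the OID are all assumptions.

```python
import json
import os
import tempfile
import uuid


class Part:
    """An application object that carries its own object identifier."""

    def __init__(self, name, weight_kg, oid=None):
        # Hypothetical stand-in for a system-assigned OID.
        self.oid = oid or uuid.uuid4().hex
        self.name = name
        self.weight_kg = weight_kg


def save_objects(path, objects):
    """Write each object's state to a file, keyed by its OID."""
    records = {
        obj.oid: {"name": obj.name, "weight_kg": obj.weight_kg}
        for obj in objects
    }
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle)


def load_objects(path):
    """Recreate the objects from the saved data upon opening."""
    with open(path, "r", encoding="utf-8") as handle:
        records = json.load(handle)
    return {oid: Part(oid=oid, **fields) for oid, fields in records.items()}


wheel = Part("wheel", 12.5)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "parts.json")
    save_objects(path, [wheel])
    restored = load_objects(path)

print(restored[wheel.oid].name, restored[wheel.oid].weight_kg)
```

An OODBMS hides the explicit save and load
steps shown here, which is what makes
persistent objects appear no different from
in-memory ones.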
The Object Data Management Group
(ODMG) was a group aimed at developing
standards for OODBMS and
object-relational database management
systems (ORDBMS). They released three
versions of a document, also referred to as
the ODMG, which recorded and
communicated agreed-upon standards for
OODBMS. The group has since disbanded.
One standard the ODMG did identify was the
selection of the Object Query Language
(OQL) as the standard query language for
OODBMS. OQL uses syntax similar to SQL
but is rarely used, since the basic
functionality of queries is intrinsic to
object-oriented programming languages.
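To illustrate why a separate query language
is often unnecessary, the sketch below
filters a collection of objects using
nothing but the host language. It is
written in Python with a hypothetical
Employee class; no OQL is involved.

```python
class Employee:
    def __init__(self, name, salary, department):
        self.name = name
        self.salary = salary
        self.department = department


employees = [
    Employee("Ada", 95000, "Engineering"),
    Employee("Grace", 105000, "Engineering"),
    Employee("Edgar", 70000, "Finance"),
]


def names_earning_more_than(staff, threshold):
    """Select and sort with ordinary language constructs, not a query language."""
    return sorted(person.name for person in staff if person.salary > threshold)


well_paid = names_earning_more_than(employees, 90000)
print(well_paid)
```

The selection and ordering that a query
language would express are handled here by
an ordinary expression over objects.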
3. The Effects and Influence of the
Application’s Domain
There does seem to be a consensus among
scholars and researchers that the purpose
or business use of an application should be
considered when selecting a DBMS. This is
particularly apparent when the domain is
well understood and defined upfront, at the
time the selection process occurs.
Relational databases continue to dominate
and outperform object-oriented databases
for applications meeting traditional
business objectives. The performance of
these databases is still considered the ideal
and therefore any transaction-dependent
business application would probably
continue to benefit from their use.
Object-oriented databases are becoming
increasingly popular in Computer Aided
Design (CAD), Computer Aided
Manufacturing (CAM), and Computer Aided
Software Engineering (CASE) type
applications. A characteristic that is
apparent in these and other similar
applications using object-oriented
databases is their use of real-world objects
that are easily converted to database
objects using the object-oriented data
model. Systems in which objects can
be rolled up into a hierarchical structure or
decomposed into smaller pieces seem to be
ripe candidates for object-oriented
databases. Additional examples include
software for assembling airplanes or cars,
warehouse management, and fields of
science such as molecular biology and
high-energy physics.
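The roll-up and decomposition described
above can be sketched with a small composite
structure. The Component class, the part
names, and the weights are hypothetical and
chosen only to show the shape of such a
domain.

```python
class Component:
    """A part that may itself be assembled from smaller parts."""

    def __init__(self, name, weight_kg=0.0, parts=None):
        self.name = name
        self.weight_kg = weight_kg
        self.parts = list(parts or [])

    def total_weight(self):
        """Roll up: own weight plus the weight of every nested part."""
        return self.weight_kg + sum(part.total_weight() for part in self.parts)

    def flatten(self):
        """Decompose: list this component and all parts beneath it."""
        names = [self.name]
        for part in self.parts:
            names.extend(part.flatten())
        return names


engine = Component(
    "engine",
    150.0,
    parts=[Component("piston", 2.0), Component("crankshaft", 20.0)],
)
car = Component("car", parts=[Component("chassis", 300.0), engine])

print(car.total_weight())
print(car.flatten())
```

Each object holds direct references to its
parts, so the hierarchy in the application
is the same hierarchy that an
object-oriented database would store.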
In some instances, such as those stated
above, the domain of the application has a
clear and obvious influence on the type of
database selected. In other applications,
the impact seems minimal, if any exists at
all. This factor must therefore be weighed
on a case-by-case basis.
4. Relationship of the DBMS and
the Application’s Front-end
Implementation
On an ideal project, the decision on which
programming language to use for an
application’s front-end and the selection of a
particular database management system
would occur hand-in-hand or virtually
simultaneously. Unfortunately, this is not
the typical process. In most cases, the
programming language is decided by the
customer/client and documented in the
system requirements, leaving the
database decision to occur as an
afterthought during design activities.
With the object-oriented data model as the
basis of both OODBMS and OOPL,
consideration of a DBMS other than an
OODBMS for an application developed in
an OOPL seems almost absurd. For
applications using an OO language for the
implementation of the front-end, the
database and the communication with it
become intertwined with, and almost
indistinguishable from, the application
code, alluding to a complementary
relationship, or marriage, between the two.
Relational database management systems
do not have a similar relationship with any
one programming language used for
front-end implementation. The one element
that does stand out and would probably
benefit from consideration is the RDBMS
dependency on SQL for communication
between an application and the database.
Thus, taking SQL into consideration when
selecting a language for front-end
implementation and evaluating languages
that exhibit similar fundamentals and
properties would appear to be desirable.
This may require evaluation and trial of a
declarative programming language.
Additionally, procedural languages seem to
blend with the principles of the relational
database and SQL. In procedural
programming, the objective is to segment
the solution into a collection of data
structures and routines. This seems
reminiscent of the relational model, which
also aims to break down the solution into
sets of similar data that have a common set
of activities or functions that can be
performed on them.
The relationship to consider or highlight, if
any, when considering the language in
which the front-end will be implemented and
a DBMS is to identify languages that have a
basis in complementary data models. This is
one of the primary reasons that an OOPL
and an OODBMS correlate and are often
identified together.
5. Advantages and Disadvantages
As with most dueling technologies, most
current research tends to imply a
relationship of ‘one’s disadvantage is the
other’s advantage’ between RDBMS and
OODBMS. In terms of selecting a database
management system, the advantages and
disadvantages of each DBMS should be
measured against the goals and objectives
of the application. Selecting a database
should not occur in a void, as selection is
predicated on the software’s purpose and
environment in comparison to the
advantages and disadvantages of any
DBMS.
It is important to note that extensive
research in academia has been performed
on the advantages and disadvantages of
OODBMS and RDBMS. The items listed in
the subsequent sections are not an
exhaustive list. The list is specific to
meeting the objectives outlined in this
article and should only provide a basis
for comparison.
5.1. Advantages
5.1.1. OODBMS
● Promotion of Reuse
OODBMS are ripe for reuse – a common
goal for most software applications.
Inheritance and polymorphism are two key
features of the object-oriented model that
provide the user with the ability to reuse
objects. Those familiar with OOPL, which
demonstrate the same capabilities,
recognize the advantage of this feature that
allows them to reuse existing data
structures and operations as a foundation
for adding new objects exhibiting similar
features. This element is not only useful for
expanding the current number of stored
objects in a single database but is also
applicable from one OODBMS to another.
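A brief sketch of the reuse described above,
using inheritance and polymorphism in
Python. The Shape, Rectangle, and Square
classes are hypothetical and stand in for
whatever objects an application might
store.

```python
class Shape:
    """Base definition reused by every more specific shape."""

    def __init__(self, label):
        self.label = label

    def area(self):
        raise NotImplementedError

    def describe(self):
        # Polymorphism: the same call works for every subclass.
        return f"{self.label}: area {self.area():.1f}"


class Rectangle(Shape):
    def __init__(self, label, width, height):
        super().__init__(label)
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    """A new kind of object built on existing structure and operations."""

    def __init__(self, label, side):
        super().__init__(label, side, side)


descriptions = [shape.describe() for shape in (Rectangle("plate", 2, 3), Square("tile", 2))]
print(descriptions)
```

Square adds a new object type without
redefining area or describe; it inherits
both from the existing definitions.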
● Management of Application Code by
Database Facilities
The use of an object-oriented database can
greatly decrease the volume of code used
by the application. As is characteristic of
the object-oriented model, an object
represents an entity consisting of
attributes, behaviors, states, and
relationships. Therefore, this data does not
need to be defined in the application code,
as it would be stored with the object in the
database. Application code is also reduced
by not requiring a query language to access
and store data. The advantage of storing
the majority of the application data in a
database is that it can then be managed by
database facilities that provide
data integrity features such as recovery,
versioning, and persistence.
● Relationships are Represented Explicitly
A key feature of the OODBMS is the use of
pointers, which enable the system to
access an object directly without requiring a
search. Research has shown that this makes
the OODBMS preferable to an RDBMS for
many tasks, namely those that can be
performed using navigational interfaces
instead of the declarative interface of the
SQL standard. Relationships are typically
represented using an explicit mechanism
such as pointers. Pointers and navigational
interfaces give an OODBMS an advantage
by telling the database ‘how’ to reach the
data instead of making it search for a
‘what’.
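The difference between following a pointer
and searching for a match can be sketched
as follows. The classes and rows are
hypothetical, and the linear search is a
deliberate simplification: a real RDBMS
would normally use an index rather than
scanning every row.

```python
class Customer:
    def __init__(self, name):
        self.name = name


class Order:
    def __init__(self, number, customer):
        self.number = number
        self.customer = customer  # direct reference (pointer) to another object


# Navigational access: follow the reference from the order to its customer.
order = Order(100, Customer("Globex"))
name_by_navigation = order.customer.name

# Relational-style data: the relationship exists only as a matching key value.
customer_rows = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
order_row = {"order_id": 100, "customer_id": 2}


def customer_name_by_search(order_record, customers):
    """Find the related row by comparing key values, one row at a time."""
    for row in customers:
        if row["customer_id"] == order_record["customer_id"]:
            return row["name"]
    return None


name_by_search = customer_name_by_search(order_row, customer_rows)
print(name_by_navigation, name_by_search)
```

Both paths reach the same customer; the
first already knows where the related
object is, while the second has to locate
it.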
5.1.2. RDBMS
● Strong Performance and Expandability
Relational databases exhibit rapid access to
stored data and large storage capacity, and
are considered flexible or expandable. It is
not necessary to understand every detail of
how a current RDBMS is designed to be able
to expand the database to include additional
relations. In addition, RDBMS relations
do not typically demonstrate direct
relationships or dependencies on other
relations and therefore removing a table
can carry substantially lower risk than
removing an object definition in an
OODBMS.
● Wide-spread Comprehension of the
Relational Data Model
Relational databases are easy to
understand and can be viewed in many
different ways. Tables and the
representation of data in tabular form are
concepts familiar to most individuals,
whether they are technical or non-technical.
With little knowledge of the domain or of
relational databases, an individual will
typically be able to use their previous
exposure to tables to quickly decipher the
data contained within a relational database.
● Mature System with Strong
Mathematical Foundation
RDBMS have been a business staple in
supporting applications for over 20 years.
The sheer number of RDBMS currently in
use has tested the technology and proven it
successful in meeting a multitude of
business needs. Many researchers also state
that the RDBMS basis in set theory and the
mathematical concept of the relation also
contribute to the RDBMS success and
dominance. For many, the mathematical
aspect is an attraction of RDBMS because
it provides an accepted logic and rigor.
5.2. Disadvantages
5.2.1. OODBMS
● Unintuitive Data Model
The object-oriented model is not initially
intuitive to the average individual. Although
some researchers would contradict this
statement, the model is difficult to follow
without previous exposure to or an
explanation of its basics. To properly
design an effective OODBMS, the database
designer must have a solid understanding of
the object-oriented model and how to
efficiently implement it.
In addition, only some data domains have
explicit objects and clearly defined
relationships among objects. Most business
applications do not contain obvious or
intuitive objects, which makes application
of the model more difficult.
● Existing Dominance of the RDBMS
The RDBMS continues to have a stronghold
on business systems. This is a major
disadvantage for OODBMS because as new
databases are added or legacy systems are
modified, it is easier to insert another
relational database into the environment
than to add an OODBMS. The addition of
an OODBMS in this environment may
require modifications to the existing
databases in order to enable
communication and data access among the
existing databases and the new OODBMS.
Unfortunately, this will be a major obstacle
for any OODBMS and may end up being a
primary decision point for a company when
selecting a DBMS.
● Reputation for Poor Performance
OODBMS performance has historically
been a major downfall and limitation on its
use. As with any product, previous
reputation can work against a product even
though the issue may no longer be of any
relevance. Many articles and research
projects from the late 1980s through the late
1990s repeatedly identified this as being a
major limitation to the wide-spread
popularity and use of the OODBMS. It does
appear that in the current market, OODBMS
can perform at speeds comparable to
RDBMS, assuming the data domain is an
ideal domain for an object-oriented data
model.
5.2.2. RDBMS
● Lack of Support for Data-Intensive,
Complex Applications
Relational databases lack the ability to
handle complex interrelationships of data.
The RDBMS is poorly suited to storing
complex data such as images and digital
audio/video data types. With the
prevalence of these types of data rapidly
increasing, this presents a major limitation
on the use of RDBMS.
● Language Restrictions
A relational database cannot communicate
or operate with any language other than
SQL. Some researchers identified this as a
benefit, claiming advantages such as easier
access to data in multiple databases and
straightforward migration of two or more
databases using a single sub-language.
Although these are valid points, this actually
requires the use of at least two
programming languages to implement any
application using an RDBMS: one for
implementing the application and one for
querying the database. This requires
extensive knowledge on the part of the
development team in order to ensure proper
implementation and increases the volume of
code that requires maintenance. It will
ultimately increase project scope and
complexity.
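The two-language arrangement can be seen in
even a very small program. In the sketch
below, Python is the application language
and SQL appears as embedded strings; the
Book class and book table are hypothetical,
and sqlite3 is used only because it ships
with Python.

```python
import sqlite3


class Book:
    """Application-side object, written in the application language."""

    def __init__(self, isbn, title):
        self.isbn = isbn
        self.title = title


def load_books(connection):
    """Run a SQL query, then map each returned row back into an object."""
    rows = connection.execute("SELECT isbn, title FROM book ORDER BY title")
    return [Book(isbn, title) for isbn, title in rows]


conn = sqlite3.connect(":memory:")
# Second language: SQL, carried inside strings in the application code.
conn.execute("CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO book (isbn, title) VALUES (?, ?)",
    [("111", "Set Theory"), ("222", "Object Models")],
)

titles = [book.title for book in load_books(conn)]
print(titles)
```

Both the SQL text and the row-to-object
mapping in load_books are code the team
must write and maintain in addition to the
application logic itself.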
● Assignment of Unique Identifiers
The assignment of object identifiers (OID) is
virtually a transparent process to the user in
an OODBMS. This is not a built-in process
of RDBMS and requires code and
maintenance to ensure tuples are uniquely
identified. If this function is performed
incorrectly, the integrity of the data comes
into question.
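The kind of code this refers to can be
sketched as follows, again using Python's
sqlite3 module with a hypothetical customer
table. The sketch deliberately assigns keys
in application code to match the scenario
described; many relational products also
offer generated keys, which are not used
here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def next_customer_id(connection):
    """Application-maintained rule for choosing the next unique identifier."""
    (current_max,) = connection.execute(
        "SELECT COALESCE(MAX(customer_id), 0) FROM customer"
    ).fetchone()
    return current_max + 1


def add_customer(connection, name):
    new_id = next_customer_id(connection)
    connection.execute(
        "INSERT INTO customer (customer_id, name) VALUES (?, ?)", (new_id, name)
    )
    return new_id


def insert_with_id(connection, customer_id, name):
    """Return False when the declared key constraint rejects a duplicate."""
    try:
        connection.execute(
            "INSERT INTO customer (customer_id, name) VALUES (?, ?)",
            (customer_id, name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_customer(conn, "Acme")
second = add_customer(conn, "Globex")
duplicate_accepted = insert_with_id(conn, first, "Initech")
print(first, second, duplicate_accepted)
```

The key has to be declared in the schema
and chosen by the application; if either
step is done incorrectly, duplicate or
missing identifiers put the integrity of
the data at risk.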
6. Conclusions
Relational databases have a stronghold on
the current database market due to their
maturity, reliability, the majority of existing
applications using the relational model, and
some unknown factors associated with the
immaturity of object-oriented databases. As
object-oriented programming languages
continue to emerge as the favored or
dominant programming languages for
building new applications, the OODBMS will
surpass the RDBMS as the most popular
and dominant database management
system in the business market. Research
trends in academia tend to support this
observation, as shown by the number of
projects currently concluding or underway
on converting an existing relational model
to an object-oriented data model. Similarly,
there has been extensive research on
object-relational database management
systems (ORDBMS). The main objective of
these database management systems is to
merge the benefits of both the relational
and object-oriented models. Not
surprisingly, many RDBMS vendors on the
market today are releasing first-generation
ORDBMS products. If these products are
able to fulfill the main objective of ORDBMS
as well as enable companies to convert
existing relational models to partial or full
object-oriented models, the demise of
purely relational databases is almost
imminent. Such products would enable
current vendors to cash in on this
migration and preserve some semblance of
their market base.
References
[1] Barry & Associates. OODBMS Facts. April
2001. http://www.odbmsfacts.com.
[2] Chaterjee, Jagadish. Introduction to
RDBMS, OODBMS and ORDBMS. January
3, 2005.
http://www.aspfree.com/c/a/Database/Introduction-to-RDBMS-OODBMS-and-ORDBMS/
[3] Cigler, James B., Orooji, Ali. ORR:
Object-Relational Rapprochement. COMPSAC '99.
Proceedings. The Twenty-Third Annual
International. October 27-29, 1999.
Page(s):42 - 48.
[4] Devarakonda, Ramakanth S.
Object-Relational Database Systems: The Road
Ahead. Crossroads, March 2001. Volume 7
Issue 3.
[5] Fong, Joseph. Converting Relational to
Object-Oriented Database. d. March 1997.
Volume 26 Issue 1.
[6] Kim, Won. Object-Oriented Database
Systems: Strengths And Weaknesses.
Journal of Object-Oriented Programming
Focus On ODBMS. 1992.
[7] Kisworo, M.W.; Rajagopalan, P.
Implementation of an Object-Oriented
Front-End to a Relational Database System. IEEE
Region 10 Conference on Computer and
Communication Systems. 24-27 Sept. 1990
Page(s):811 – 815.
[8] McClure, Steve. Object Database vs.
Object-Relational Databases. IDC Bulletin
#14821E - August 1997.
[9] McFarland, Gregory, Rudmik, Andres, and
Lange, David - Modus Operandi, Inc. Jan
31, 1999. Object-Oriented Database
Management Systems Revisited: An
Updated DACS State-of-the-Art Report.
https://www.dacs.dtic.mil/techs/oodbms2/oodbms-toc.shtml.
[10] Obasanjo, Dare 2001. An Exploration of
Object-Oriented Database Management
Systems.
http://www.25hoursaday.com/WhyArentYouUsingAnOODBMS.html.
[11] Rahayu, W.; Chang, E.; Dillon, T.S.
Implementation of Object-Oriented
Association Relationships in Relational
Databases. Database Engineering and
Applications Symposium, July 1998.
Page(s):254 – 263.
[12] Smith, Karen E., Zdonik, Stanley B.
Intermedia: A Case Study of the Differences
Between Relational and Object-Oriented
Database Systems. OOPSLA ’87
Proceedings. October 4-8, 1987.
[13] Sujithan, K. R. An Object Model of Data,
Based on the ODMG Industry Standard for
Database Applications. The Institution of
Electrical Engineers. 1995.
[14] Zand, Mansour, Collins, Va, Caviness, Dale.
A Survey of Current Object-Oriented
Databases. Data Base Advances, February
1995. Volume 26, No. 1.