OODB – CS42002
Object Relational Database Comparison
Presentation Report
By Neil Black, Steven Bruce and Marisa Di Filippo

Contents
Relational Databases
Object Orientated Databases
Object-Relational Databases
Database Comparison
References

Relational Databases

In order to compare the different databases, we first need to understand the main types of database on offer. The first, and by far the most common, is the relational database. A relational database stores information as related records held in tables. For example, all the records for the employees of a company would be stored in one table, the EMPLOYEE table. A table is easily visualised as a tabular arrangement of data, not unlike a spreadsheet, consisting of vertical columns and horizontal rows. This is shown in the table below.

EMPLOYEE
EMP_ID  FIRST_NAME  LAST_NAME   DEPT  SALARY
1       Neil        Black       1     750000
2       Marisa      Di Filippo  1     100
78      Steven      Bruce       2     35789
79      Hannah      Bains       3     10
200     Fraggle     Eater       3     1

A table consists of a number of records. The field names of each record in the table are the same, although the field values may differ. Every employee record has a salary field, called SALARY, and the value in the SALARY field can be different for each employee. Each field occupies one column and each record occupies one row. Each column of the table holds a specific category of information about the employees, such as their ID number, first name or department. Each row in the table holds the information relating to one specific employee, together as one record. Each record is a unique entry and is independent of any other record in the table.

Every table in the relational database has a field or a combination of fields that uniquely identifies each record in the table.
This unique identifier is called the primary key. The primary key provides the means to distinguish one record from all the others in a table. It allows the user and the database system to identify, locate and refer to one particular record in the table. The database designers determine the best candidate field for the primary key. The employee's first and last names together could be a primary key, that is, until a new employee with the same name is hired. Then the key would no longer be unique. Sometimes the designers have to define a new ID number or code field just so that a table has a primary key. For the EMPLOYEE table, the primary key would be the employee ID number. No two employees can have the same ID number.

When a field in one table matches the primary key of another table, the field is referred to as a foreign key. A foreign key is a field or a group of fields in one table whose values match those of the primary key of another table. You can think of a foreign key as the primary key of a foreign table. Placing the EMPLOYEE table alongside a DEPARTMENT table shows how two tables in a relational database relate.

EMPLOYEE
EMP_ID  FIRST_NAME  LAST_NAME   DEPT  SALARY
1       Neil        Black       1     750000
2       Marisa      Di Filippo  2     100
78      Steven      Bruce       3     35789
79      Hannah      Bains       2     10
200     Fraggle     Eater       3     1

DEPARTMENT
DEPT  DESCRIPTION          LOCATION
1     Executive Decisions  Glasgow
2     Girly Chats          Edinburgh
3     Toilet Inspectors    Glasgow

Here the DEPT field of the EMPLOYEE table is a foreign key: each of its values matches the DEPT primary key of one record in the DEPARTMENT table.

Object Orientated Databases

Over the past decade or so, a relatively new technology has been emerging in the database market: Object Orientated Databases (OODB). We can think of an OODB as the combination of two things, the advantages of object technology and the advantages of database technology. First of all, let us remind ourselves of the basics of object technology.
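As a rough preview of the concepts reviewed next (classes and instances, encapsulation, inheritance, messages and methods, polymorphism), the short Python sketch below shows them working together. The class names `Employee` and `Manager` and their methods are hypothetical, chosen only to echo the EMPLOYEE table; the sketch is an illustration under those assumptions, not part of any particular OODB product.

```python
class Employee:
    """A class defines attributes and behaviour; each employee is an instance."""

    def __init__(self, first_name, last_name, salary):
        self.first_name = first_name
        self.last_name = last_name
        self._salary = salary  # state reached only through methods (encapsulation)

    def raise_salary(self, amount):
        # The rule for changing the data lives with the data itself.
        if amount < 0:
            raise ValueError("a raise cannot be negative")
        self._salary += amount

    def salary(self):
        return self._salary

    def describe(self):
        return f"{self.first_name} {self.last_name}, employee"


class Manager(Employee):
    """Hypothetical subclass: inherits attributes and behaviour from Employee."""

    def describe(self):
        # Same message, different result (polymorphism).
        return f"{self.first_name} {self.last_name}, manager"


# Two instances, using names and salaries from the EMPLOYEE table.
staff = [Employee("Steven", "Bruce", 35789), Manager("Neil", "Black", 750000)]

# Sending the same "describe" message to each object invokes its own method.
descriptions = [person.describe() for person in staff]

# The inherited raise_salary method updates the hidden state.
staff[0].raise_salary(211)
```

Each object answers the `describe` message in its own way, while `Manager` reuses everything else from `Employee` unchanged.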
Objects
As the name may suggest, objects are a fundamental part of object orientation. An object is an entity that is uniquely identifiable. Objects can be characterised by their behaviour, have an updateable state that can be accessible or hidden, and can have relations to other objects.

Abstraction
Object orientation (OO) is not restricted to programming, as we are showing here by using it within the context of a database system. We use abstraction to find out how complex an issue is, and then we use it to solve the problem.

Encapsulation
This means that the object contains not only the data but also the instructions for processing that data. This means objects can be reused.

Classes and Instances
Objects are defined by their classes. Attributes, behaviours and relationships to other objects are all defined in a class. An instance is simply an individual occurrence of a class.

Inheritance
New objects can inherit characteristics and behaviour from previously created objects.

Messages and Methods
Methods are used to access data. An object is sent a message to invoke a method. It is as simple as that.

Polymorphism
Polymorphism means that a message can produce different results depending on which object it is sent to.

We all know that the purpose of a database is to provide persistent data that can be shared across many users concurrently. Data integrity, system recovery, reduced data redundancy, the ability to run ad hoc queries and a secure environment with a common interface to the data are also a must in most modern database systems. OODBs not only contain all of these features but also offer some impressive improvements on top. OODBs are best suited to storing small amounts of complex data. They also tend to access that data faster than relational systems.
One reason for this is that the storage and memory models are the same (both are based on the concept of objects). Data does not need to be converted from the flat architecture of the storage model into the objects of the application language.

Today there are two types of OODB in use: one that is modelled after an OO language and one that is independent of any particular OO language. Modelling an OODB on an existing OO language has one major advantage: no mapping is needed between the database and the language. Of course, this means that the database is limited to the language model. With language-independent OODBs, developers have access to a richer database model, but they must map the database to the language model.

Some of the OODB-specific features are as follows.

Object Identity
In the object database model, object IDs (OIDs) are used to identify data and maintain uniqueness, which means there is no need for primary and foreign keys. Navigation between objects occurs by moving across (traversing) OIDs. Data integrity is also maintained through the use of OIDs. For this reason, no OID is ever reused, which avoids data corruption. OIDs can be either physical or logical. Physical object IDs might include a page ID and a container ID. One issue that arises with physical IDs is that data may become corrupted if an ID is reused, yet if object IDs are never reused, fragmentation is unavoidable. Logical IDs, by contrast, are independent of a disk address and a node address. These are often called links or handles.

Locking
Another feature of an OODB is its three levels of locking: container locking, page locking and object locking. With container locking, the entire container must be locked to get at an object. Similarly, with page locking an entire page must be locked in order to access a particular object. The lowest level is the locking of the object itself.
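The object identity scheme described above can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not the API of any real OODB: the names `ObjectStore`, `StoredObject` and `traverse` are hypothetical, logical OIDs are modelled as integers from a counter that never goes backwards, and the sample data comes from the EMPLOYEE and DEPARTMENT tables.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict


@dataclass
class StoredObject:
    oid: int                    # logical OID, independent of any disk address
    data: dict                  # the object's own state
    refs: Dict[str, int] = field(default_factory=dict)  # relationship name -> target OID


class ObjectStore:
    """Hypothetical toy store: hands out logical OIDs and never reuses them."""

    def __init__(self) -> None:
        self._next_oid = count(1)   # monotonically increasing OID source
        self._objects: Dict[int, StoredObject] = {}

    def add(self, data: dict, **refs: int) -> int:
        oid = next(self._next_oid)
        self._objects[oid] = StoredObject(oid, dict(data), dict(refs))
        return oid

    def delete(self, oid: int) -> None:
        # The OID is retired with the object; the counter is never rewound.
        del self._objects[oid]

    def get(self, oid: int) -> StoredObject:
        return self._objects[oid]

    def traverse(self, oid: int, *path: str) -> StoredObject:
        """Navigate from one object to another by following stored OIDs."""
        obj = self.get(oid)
        for name in path:
            obj = self.get(obj.refs[name])
        return obj


store = ObjectStore()
executive = store.add({"description": "Executive Decisions", "location": "Glasgow"})
neil = store.add({"first_name": "Neil", "last_name": "Black"}, department=executive)

# Navigation: follow the reference directly, with no key matching or join.
location = store.traverse(neil, "department").data["location"]

# A deleted object's OID is never handed out again.
temporary = store.add({"note": "short-lived"})
store.delete(temporary)
replacement = store.add({"note": "created later"})
```

The employee object holds the department's OID directly, so reaching the department is a lookup by identity rather than a comparison of key values. Because the counter only moves forward, a stale reference to a deleted object can never silently resolve to a different, newer object.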
Object Access
In an object database, an object server "serves" objects to the client. The objects can be served page by page, which is very fast if all the desired objects are on the same page. Groups of objects called clusters can also be accessed. This is a way of optimising access to objects that are used by the same application. Clusters can be logical or physical.

Dynamic Space Compaction
The object database architecture can become fragmented because of the physical movement of objects, the use of physical OIDs and the clustering of objects together, as mentioned above. Pages of objects are also often loaded again and again, which over time causes fragmentation. Some object databases can dynamically reorganize space, which over time offers increased performance.

Navigation
The object model in memory is the object model on disk, and this greatly increases the access speed of the database, so accessing data is faster than with relational access methods. Navigating to an object for the first time is faster than a relational join because there is no join processing. Navigating back to a previously visited object is also faster because the object cache is held in the client's virtual memory. There is no need to access the server, so the transaction runs at memory access speed.

Object-Relational Databases

The object-relational model is an amalgamation of the relational and object models that tries to unify aspects of both. There is no official definition of what an object-relational database management system (ORDBMS) is. The system tries to add object-orientated functions to tables. The data is still stored in table structures, but some of the tabular entries have richer data structures, known as abstract data types.
This approach provides more structure for modelling more complex data and functions, but it still lacks the object requirement of encapsulating operations with data. It also has limited support for relationships, identity, inheritance, composition, polymorphism, user-defined persistent classes and integration with host languages such as Java.

An ORDBMS supports an extended form of SQL, SQL-3. The extensions allow support of the object model. Typical extensions include queries with nested objects, the inclusion of methods and functions in search predicates, and queries with abstract data types.

The complexity of data that can be processed using an ORDBMS is much less than can be achieved using an OODBMS. An ORDBMS is unable to navigate by following references, through many-to-many relationships, or across relationships to form composite objects. The complexity of the relationships between the data being represented can be as much of a problem as the complexity of the data itself.

The ORDBMS is in fact still relational because the data is stored in tables, with the extensions used as the language for data definition, manipulation and query, although the introduction of SQL4 is expected to turn this into a collection of objects. The expert knowledge that has been gained from using an RDBMS can be applied to the ORDBMS once the SQL extensions have been mastered, and this allows users to migrate from RDBMS towards ORDBMS.

There are various ORDBMS products available, such as those from UniSQL, Informix and IBM, and the level of performance and operability varies according to the vendor. The product to use has to be decided upon according to the application, with trade-offs depending on the product chosen. ORDBMSs are gaining in popularity and are expected to eventually outsell relational databases.
Object-relational databases allow organizations to continue using their existing systems without having to make major changes, and allow them to start using object-oriented systems in parallel.

Database Comparison

Relational                                        Object Oriented
Table/record based                                Object based
Relations expressed with keys                     Relationships built in to objects and containers
Limited complexity of relationships               Complex relationships
Defined types                                     Variety of data types
Flexible querying (SQL)                           Object SQL
Language independence                             Often connected to a language
Transaction management (multi-user concurrency)   Fault tolerance and high throughput
Limited support for distribution                  Easily distributed
Difficult to represent containment                Abstraction
Difficult to process recursive relationships      Fast retrieval

Comparison of Database Management Systems

Defining standard
  RDBMS:  SQL2 (ANSI X3H2)
  ORDBMS: SQL-3
  ODBMS:  ODMG-V2.0

Support for object-oriented programming
  RDBMS:  Poor; 25% of coding time is spent mapping the program object to the database
  ORDBMS: Limited mostly to new data types
  ODBMS:  Direct and extensive

Simplicity of use
  RDBMS:  Table structures easy to understand; many end-user tools available
  ORDBMS: Same as RDBMS, with some confusing extensions
  ODBMS:  OK for programmers; some SQL access for end users

Simplicity of development
  RDBMS:  Provides independence of data from application; good for simple relationships
  ORDBMS: Provides independence of data from application; good for simple relationships
  ODBMS:  Objects are a natural way to model; can accommodate a wide variety of types and relationships

Extensibility and content
  RDBMS:  None
  ORDBMS: Limited mostly to new data types
  ODBMS:  Can handle arbitrary complexity; users can write methods on any structure

Performance versus interoperability
  RDBMS:  Level of safety varies with vendor, must be traded off; achieving both requires extensive testing
  ORDBMS: Level of safety varies with vendor, must be traded off; achieving both requires extensive testing
  ODBMS:  Level of safety varies with vendor; most ODBMSs allow programmers to extend DBMS functionality by defining new classes

Distribution, replication, and federated databases
  RDBMS:  Extensive
  ORDBMS: Extensive
  ODBMS:  Varies with vendor; a few provide extensive support

Product maturity
  RDBMS:  Very mature; well established
  ORDBMS: Immature; extensions are new, are still being defined, and are relatively unproven
  ODBMS:  Relatively mature

Experienced people and the popularity of SQL
  RDBMS:  Extensive supply of tools and trained developers
  ORDBMS: Can take advantage of RDBMS tools and developers
  ODBMS:  SQL accommodated, but intended for object-oriented programmers

Software systems
  RDBMS:  Provided by major RDBMS companies
  ORDBMS: Provided by major RDBMS companies
  ODBMS:  ODBMS vendors beginning to emulate RDBMS vendors, but none offers large markets to other ISVs

Vendor viability
  RDBMS:  Expected for the major established RDBMS vendors
  ORDBMS: Expected for the major RDBMS vendors; UniSQL
  ODBMS:  Less of an issue than it was; some shakeout still expected

References

Databases: From Relational to Object Orientated Systems. Claude Delobel, Christopher Lecluse, Philippe Richard. Thomson Publishing.
Relational Database Principles. Colin Ritchie. Continuum.
Object Orientated Database Management. Kemper, Moerkotte. Prentice Hall.
An Object Orientated Database System. K. R. Dittrich, U. Dayal, A. P. Buchanan. Springer-Verlag.
Object-Relational DBMSs: The Next Great Wave. Michael Stonebraker. Morgan Kaufmann.