Object Oriented Databases: The Essentials

Gürkan Nişancı
Bilkent University, Department of Computer Engineering, nisanci@bilkent.edu.tr

Abstract

An Object-Oriented Database Management System is a database management system that implements objects directly, in contrast to relational, network or hierarchical database management systems. Object-oriented databases are databases that support objects and classes. They differ from the more traditional relational databases in that they allow structured sub-objects, give each object its own identity, or object-id (as opposed to a purely value-oriented approach), and support methods and inheritance. This paper attempts to summarize the features characterizing object-oriented databases (OODBS or ODMS). It also attempts to invalidate the commonly held view that the lack of a standard object model is a disadvantage for object-oriented database systems. After an introduction I clarify the buzzword "object-orientation" and then go into the details of the database and object-oriented features of an OODBS. Finally I look at the present and future of object-oriented database management systems.

1. Introduction

Today, object-oriented principles are used in many technologies, such as object-oriented programming languages, object-oriented database systems, object-oriented user interfaces, etc. What is an object? As written in "Object-Oriented Database Systems: Promises, Reality and Future" [Kim-93], "The term 'object' means a combination of 'data' and 'program' that represents some real-world entity", and "object-oriented" is described as the combination of object encapsulation and inheritance. Encapsulation means that users cannot see the inside of the object capsule, but can use the object by calling its program part. Inheritance can be thought of as "reuse": it is the creation of a new object by extending an existing object. The power of object-oriented concepts is delivered when encapsulation and inheritance work together.

An OODBMS is a database management system that implements objects directly, in contrast to relational, network or hierarchical database management systems. An object-oriented database management system supports the modeling and creation of data as objects. It contains not only data structures but also procedures that act upon those structures. Some relational database management systems do implement some procedures in the database via rules and triggers, but an object database contains methods: complete, perhaps very complex, procedures along with data structures. In a pure OODBMS, the only way to access database data structures is through methods. Users can support new media types with OO databases simply by creating new objects.

With OO databases, the application and the database use exactly the same object model. This is not the case with relational databases, where users must utilize an object model for the application and a relational data model for the database, and must therefore develop mapping procedures between the object and relational models.

2. Promises of Object Oriented Databases

The main objective of an OODBMS is to provide consistent, data-independent, secure, controlled and extensible data management services to support the object-oriented modeling paradigm. Today's OODBMS provide most of these capabilities. Many of these products are second-generation OODBMS that have incorporated the lessons learned from the first-generation products. Given the high degree of interest in object-oriented technologies, there is a substantial market pull to put OODBMS products on a fast track where features and capabilities will continue to advance at a rapid rate.

Currently, object-oriented database management systems are receiving a lot of attention from the computing community.
In "The Object-Oriented Database System Manifesto" [Atkinson-89], the reason for this situation is stated: "Three points characterize the field at this stage: (i) the lack of a common data model, (ii) the lack of formal foundations and (iii) strong experimental activity."

The main difference between a traditional DBMS and an OODBMS lies in the passive or active behavior of the underlying system and the way it is implemented. A traditional database, such as a relational database, is passive: it embodies a collection of structured data. When data is to be processed, it is accessed by the application program via the DBMS, stored in program data structures and processed to produce the required action, and the results of that action are written back to the passive database.

In an object-oriented environment, a database management system must be able to store objects (messages, methods, attributes and instance variables), their relationships (subclass-of or superclass-of) and the class hierarchy, and it must be able to retrieve objects, relationships and hierarchy at a later time. Object-oriented databases have the ability to model all three of these components directly within the database, supporting a complete problem/solution modeling capability. In an object-oriented database, once a user request is transmitted to the object base, the objects respond to the request in a way consistent with their behavior, take the necessary action, alter data, and invoke other objects, if necessary, to complete the request. Objects talk to each other by messages, and a request is performed entirely within the objects, unlike with a traditional DBMS.

Relational database technology has failed to handle the needs of complex information systems. The problem with relational database systems is that they require the application developer to force an information model into tables, where relationships between entities are defined by values. Relational database design is a process of trying to figure out how to represent real-world objects within the confines of tables in such a way that good performance results and data integrity can be preserved. Object database design is quite different. For the most part, it is a fundamental part of the overall application design process. The object classes used by the programming language are the classes used by the ODBMS. Because their models are consistent, there is no need to transform the program's object model into something unique for the database manager.

A major strength of OODBMS technology is its ability to represent complex behaviors directly. By incorporating behaviors into the database, one substantially reduces the complexity of the applications that use it. In the ideal scenario, most of the application code deals with data entry and data display, and all the functionality associated with data integrity and data management is defined within the basic object model. The advantages of this approach are that all operations are defined once and reused by all applications, and that changes to an operation affect all applications, which simplifies database maintenance (although most databases require the applications to be recompiled).

The benefits of object-oriented database application development are an increase in productivity, resulting from the high degree of code reuse, and an ability to cope with greater complexity, resulting from the incremental refinement of problems. One also gets increased design flexibility due to polymorphism and dynamic binding. Finally, both developers and users benefit from the naturalness and simplicity of representing data as objects.

These strengths need to be weighed against the organizational changes introduced by this new and different way of engineering solutions. The engineering considerations that contribute to performance and reliability differ from those for a relational DBMS, and projects need to be managed differently. Clearly, one needs to approach this new technology with eyes open, recognizing that the benefits will be realized after a considerable investment has been made in learning how to use it effectively.

3. Limitations of Relational Systems

Usually, database programming is accomplished by embedding SQL queries into the user interface program to access or update the database. The programming language (C, Visual Basic, Visual C++) has its own data model, independent from that of the database system, and the database system likewise has its own. The two data models do not resemble each other very closely. This disparity also perpetuates a communication barrier between the two systems, which partly explains why the programming side is referred to as the "front-end", while the database storage side, which is clearly isolated from the actual user interface program, is sometimes called the "back-end". Communication between a program and its underlying database usually takes three basic steps:

1. Move data from the database into the program work area.
2. Perform operations on the retrieved data.
3. Move the manipulated data back to the database.

This inherently leads to difficulties in translating between a programming language and its target query language, as well as to a loss of efficiency. For example, most database languages represent data in rows composed of attributes (each of which can be of a different data type), whereas in programming languages variables are usually of a single type unless expressed in a class or structure. In order to refer to individual database rows, they must be bound to individual program variables or objects (and these objects must in turn directly reflect the underlying database structure). A great deal of computational effort can thus be expended in converting one data model's variables to the other's so that operations on the data are performed correctly, in addition to the necessary retrieval and rewriting of the database information. This intrinsic difference in the way data is represented by databases and by programming languages is called an impedance mismatch.

Furthermore, the majority of relational database systems only include standard fixed data types such as integers, reals and character strings, with a few special types such as date and money. There is no current facility within relational systems that allows the user to define new types with their own unique operations. Another shortcoming of relational databases is their poor modeling power and structure. The relational system relies on the row structure for its data, and this in turn explains why data transactions and manipulations are carried out through querying. With this approach there is no way to distinguish between two records/objects that are equal (having the same values) and two records/objects that are identical (being the same object).

Generally, the use of OODBs is most advantageous in one of the following scenarios: a large number of different data types, a large number of relationships between the objects, or objects with complex behaviors. Application areas where this kind of complexity among data types, object relationships and object behaviors exists include engineering, manufacturing, simulations, office automation and large information systems. These are not the only areas in which OODBs could be used, since they can provide the same database functionality as their relational precursors, but they are the fields that can fully realize the strength with which OODBs are able to model complex real-world situations.

4. Features of an Object Oriented Database System

An object-oriented database system must satisfy two criteria: it should be a DBMS, and it should be an object-oriented system. To be a DBMS, it must have five features: persistence, secondary storage management, concurrency, recovery and an ad hoc query facility. To satisfy the second criterion, it must have eight features: complex objects, object identity, encapsulation, types or classes, inheritance, overriding combined with late binding, extensibility and computational completeness. The last of these, the ability to express any computable function with a data manipulation language, is yet another instance of the OODB approach attempting to eliminate the disparity between languages and to develop a unifying language that is powerful enough to handle complex data operations.

4.1 Complex Objects

One of the major features, in my opinion, is the support of complex objects (also referred to as composite objects): simpler objects (at the lowest level, e.g., integer and character objects) can be put together to build objects that are more similar to real-world objects. These composite objects can be viewed by the database user at whatever level of detail he requires. Complex object constructors, e.g. tuples, sets, bags, lists and arrays, must be orthogonal, i.e. they should apply to any object. Furthermore, there must be functionality for transitive operations (e.g. deletion) on complex objects.
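The idea of complex objects in Section 4.1 can be pictured with a small sketch. Python is used here only for illustration; the Part class, the helper functions and the dictionary that stands in for an object store are hypothetical and are not taken from any OODBMS. The sketch shows a composite built from atomic values and a list constructor, a view limited to a chosen level of detail, and a transitive deletion.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Part:
    """A complex object: atomic values (str, float) plus a list of sub-objects."""
    name: str
    weight_kg: float
    subparts: List["Part"] = field(default_factory=list)


def walk(part: Part) -> Iterator[Part]:
    """Yield the part itself and, transitively, all of its components."""
    yield part
    for sub in part.subparts:
        yield from walk(sub)


def outline(part: Part, depth: int) -> List[str]:
    """View a composite object down to the requested level of detail only."""
    lines = [part.name]
    if depth > 0:
        for sub in part.subparts:
            lines += ["  " + line for line in outline(sub, depth - 1)]
    return lines


def delete_transitively(store: Dict[str, Part], part: Part) -> None:
    """Transitive operation: deleting a composite also deletes its components."""
    for p in list(walk(part)):
        store.pop(p.name, None)


piston = Part("piston", 0.4)
engine = Part("engine", 120.0, [piston])
wheel = Part("wheel", 9.5)
car = Part("car", 900.0, [engine, wheel])

# The dictionary is only a stand-in for a real object store.
store = {p.name: p for p in walk(car)}

overview = outline(car, depth=1)      # the car and its direct components only

delete_transitively(store, engine)    # removes the engine and its piston
car.subparts.remove(engine)           # drop the reference held by the parent
remaining = sorted(store)
```

The list constructor is applied to Part objects in the same way it would be applied to integers or strings, which is the orthogonality the text asks for.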
4.2 Inheritance

Inheritance is the ability to define descendant classes that share common data or methods, which can be incrementally altered to form another object class. This property, derived from the object-oriented methodology, gives subclasses the ability to participate in the transmission and manipulation of data and access methods within the scope of their hierarchy. With inheritance, complex problems can be represented more faithfully and intuitively, in a way that promotes the reuse of code and shared application specifications.

4.3 Encapsulation

In the most general sense, encapsulation is an object-oriented technique that hides the data and the implementation of operations, making the data accessible only through those operations or methods. This, in turn, provides data independence: implementations of classes can be altered without the need to alter other methods. The only way to access and manipulate an object's attributes is through the provided methods, which establishes a single cohesive model for data and its methods, along with information hiding.

4.5 Polymorphism - Overloading, Overriding and Dynamic Binding

These OODB principles have been borrowed from object-oriented programming. Polymorphism comprises overloading and overriding, which together provide "one interface, multiple methods." Overloading allows a single method to be implemented multiple times in multiple ways, usually differentiated by the parameters each implementation takes. Overriding allows a subclass to redefine the implementation of a method it inherits. To support these first two principles, dynamic binding must be employed to determine at run time which operation is to be executed, depending on the object on which it is requested. Together, these three features allow code to be representative of its problem domain (with such things as multiple constructors) without being narrowly limited to a specific object.
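Inheritance, encapsulation, overriding, overloading and dynamic binding can be seen working together in a short sketch. It is written in Python purely for illustration, and the Shape, Circle and Rectangle classes are hypothetical. Two assumptions should be kept in mind: Python enforces encapsulation only by convention (underscore-prefixed attributes), and it has no signature-based overloading, so overloading is approximated here with functools.singledispatchmethod from the standard library (Python 3.8 or later).

```python
import math
from functools import singledispatchmethod


class Shape:
    """Base class. State lives in underscore-prefixed attributes and is
    reached only through methods (encapsulation by convention)."""

    def __init__(self, label):
        self._label = label

    def label(self):
        return self._label

    def area(self):
        raise NotImplementedError("subclasses provide their own area()")

    def describe(self):
        # self.area() is resolved at run time from the actual class of self
        # (dynamic binding), so this one method serves every subclass.
        return f"{self.label()}: area {self.area():.2f}"


class Circle(Shape):                       # inheritance: reuses label/describe
    def __init__(self, radius):
        super().__init__("circle")
        self._radius = radius

    def area(self):                        # overriding
        return math.pi * self._radius ** 2


class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self._width = width
        self._height = height

    def area(self):                        # overriding
        return self._width * self._height

    # Overloading, approximated by dispatching on the argument type.
    @singledispatchmethod
    def scale(self, factor):
        raise TypeError(f"cannot scale by {type(factor).__name__}")

    @scale.register(float)
    def _(self, factor):
        self._width *= factor
        self._height *= factor

    @scale.register(tuple)
    def _(self, factors):
        fx, fy = factors
        self._width *= fx
        self._height *= fy


shapes = [Circle(1.0), Rectangle(2.0, 3.0)]
descriptions = [s.describe() for s in shapes]   # same call, different behavior

rect = Rectangle(2.0, 3.0)
rect.scale(2.0)          # uniform scaling
rect.scale((1.0, 0.5))   # per-axis scaling
```

The caller of describe() never inspects the concrete class, and the internal representation of a Rectangle could change without affecting any code outside the class.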
4.4 Completeness

The type of completeness required of OODBs is computational: it covers computations both simple and complex. In the relational world, SQL does provide many basic computational functions through its operators (+, -, /, *), yet it is far from complete. Computational completeness calls for the ability to express any computable function in the data manipulation language of the database system.

4.6 Persistence

Although not part of the object model as applied to object-oriented databases, persistence is one of the fundamental ideas behind them. From the database perspective this idea is self-evident; from the programming perspective it is rather alien. Essentially, persistence allows data to survive past the execution of the creating process, to be reused later in another process. This approach to programming was invented by Joachim Schmidt in Germany, and Malcolm Atkinson and Ron Morrison in Scotland, and has now been applied extensively in many prototypes and commercial database products.

To alleviate the problems that arise in relational systems from the impedance mismatch and the communication barrier, persistence allows the programming language and the database to share the same model, with the program able to manipulate transient data and persistent data in the same way. In doing so, the problem of isolating the application from the database is eliminated, as direct, immediate interaction with the database is made possible through persistence. In the paper "Those Persistent Objects", Bancilhon lists three basic rules for this interaction:

1. The database model and the programming language type system are the same. This corrects the impedance mismatch by doing away with data conversions between two separate data models and replacing them with one common model.

2. Data is partitioned into persistent data and transient data. The persistence model permits programs to manipulate both: persistent data is updated by means of transactions, which are final once committed, whereas transient data exists only for the lifetime of the program. Furthermore, the persistence model can be dynamic, meaning that an object's persistence status can be modified, or static, meaning that the object's status remains unchanged throughout its lifetime. Persistence can, as a result, be established by two approaches:

   Persistence by creation: objects are created with a persistent status.

   Persistence by reachability: each object or data value that is within the scope of an object which is itself persistent is, as a result, persistent as well.

3. The program can operate in the same way on persistent and transient data. Regardless of which data is the target of an application's operation, the application operates on it in the same way. This is a natural extension of rule two, and ensures that there is virtually no communication barrier between the application and the database.

Ultimately, the persistent model is one that unifies the application and its corresponding database by insisting on global operations acting indifferently on all data, a common language interface, and a recognizable division between transient and persistent data.

4.7 Object Identity

Objects have an identity separate from their actual state. Although this idea is not new from an object-oriented standpoint, it is new when considering databases. Object identity establishes relations among objects, in addition to providing a means of navigation within the database structure. It follows that two objects can be identical or merely equal, which yields two different notions of object equivalence. This has two implications, object sharing and object updating, as described in "The Object-Oriented Database System Manifesto".

Object sharing involves the idea that two objects can share a component or data element. As an example, loosely based on one found in the "Manifesto," consider an object Person composed of a name, an age and a set of children. Suppose two persons, Fred and Wilma, have a mutual child Pebbles. Each would be represented as follows:

   (Fred, 41, {(Pebbles, 4, -)})
   (Wilma, 30, {(Pebbles, 4, -)})

With this representation it is not clear whether Pebbles is the child of both Fred and Wilma or whether two similar children named Pebbles are involved, although we all know the former to be true. Object sharing makes this explicit, as the Person Pebbles would have an OID such as #123, yielding:

   (Fred, 41, {123})
   (Wilma, 30, {123})

which makes the parental lineage clear.

Object updates ripple through to all objects that share components or data elements. If Pebbles' age were to change, all that is required is to change the object with the ID #123. In a relational system, by contrast, this might involve a whole series of updates to each object containing that sub-object (e.g. without OIDs, all of the Persons Fred, Wilma and Pebbles would have to be changed). Herein one can easily see the maintenance advantages of object identity.

Other object identity operations that could be supported include object equivalence tests (identical vs. equal), object comparisons, object cloning and object assignment.

Navigation vs. Queries

Navigation is searching by following pointers through a network of objects. Querying is searching by evaluating queries (boolean expressions) on the records in a set (a class in this case). Queries spring from traditional database technology, whereas programming languages traditionally search their (non-persistent) data structures by navigation. Therefore, persistent-language-based OODBs naturally perform navigational search (owing to their programming language heritage), while query-language-based OODBs naturally perform query search.

4.8 Ad Hoc Query Facility and Recovery

Users need a simple ad hoc query facility. This feature is well known from relational systems and must therefore be declarative, efficient and application independent. The database should provide a high-level, efficient, application-independent query facility, but not necessarily a query language. As written in "The Object-Oriented Database System Manifesto", a query facility should satisfy three criteria:

i) It should be high level.
ii) It should be efficient.
iii) It should be application independent.

Since the possibility of a system failure (hardware or software) cannot be excluded, the OODBS should have a suitable recovery mechanism that always ensures a consistent state of the data.

5. Limitations of OODBMS

As with relational systems, object-oriented database systems have their own limitations. These typically stem from the fact that OODBs are a relatively new branch of information systems.

First of all, there is no defined standard. Unlike traditional relational systems, which have a rigorous mathematical background, OODBs are still in a dizzying state of flux, with many competing views and prototypes. The ODMG (Object Database Management Group) has given an optimistic window of 2 years to have a complete set of standards developed within the CPSC community. Whether this can be accomplished, and be recognized as THE standard, only time will tell.

Moreover, there is no prevailing language. As a direct result of the limitation above, no single conventional language to model, develop and implement OODBs has yet emerged, as SQL has for relational systems. The object model clearly designates the type of language to be used (e.g. C++, Java), yet many of the prototypical OODB systems are implemented in languages such as OPAL (for the GemStone system), Prolog and CQL++, which are essentially hybrids of object-oriented languages, data manipulation languages (DMLs) and data definition languages (DDLs).

Methods defined within an OODB system for designated classes establish a logical constraint on how the actual data can be accessed and manipulated (especially if it is truly encapsulated), making ad hoc queries difficult to execute from within the OODB. To remedy this, "The Object-Oriented Database System Manifesto" suggests a graphical browser that lets the user construct simple queries.

6. Conclusion

It is obvious that relational systems are nowadays extremely widespread because of the early standards concerning the data model and the bindings to programming languages. The SQL standard satisfies ad hoc query requirements and is supported by most systems. Because users tend to be conservative in the systems they use, relational systems will, in my opinion, keep their advantage past the year 2000. But I am sure that object-oriented systems will come up, because they are considered to satisfy new needs better than traditional systems, and their interaction with other object-oriented environments (e.g. GUIs) seems to be more efficient. A standard, however, would speed up the expansion of OODBSs in the market.

The Object Query Language (OQL) is a declarative (nonprocedural) language that allows querying and updating database objects. Its syntax is SQL-like, but it is not fully compatible with the still evolving SQL standard, in order to avoid limitations on its clarity and power. Queries can invoke methods, and methods in supported programming languages can include queries. OQL provides high-level primitives to deal with sets, structures, lists and other object collections.

Relational joins may disappear because of the use of an object data model, or may be replaced with data references. Traversing a data reference is a lot faster than building a join.
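The difference between traversing a reference and building a join can be sketched with the Fred, Wilma and Pebbles example from Section 4.7. This is only an illustration under stated assumptions: the Person class, the table names person and parent_of and the key values are hypothetical, Python's sqlite3 module stands in for a relational DBMS, and plain in-memory objects stand in for an object database. The sketch shows the two access paths and the effect of object identity; it does not measure performance.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    name: str
    age: int
    children: List["Person"] = field(default_factory=list)


# Object model: Fred and Wilma hold references to the same Pebbles object.
pebbles = Person("Pebbles", 4)
fred = Person("Fred", 41, [pebbles])
wilma = Person("Wilma", 30, [pebbles])
lookalike = Person("Pebbles", 4)       # same values, different object

same_object = fred.children[0] is wilma.children[0]               # identical
equal_only = lookalike == pebbles and lookalike is not pebbles    # merely equal

pebbles.age = 5                        # one update ...
ages_seen = (fred.children[0].age, wilma.children[0].age)  # ... seen via both parents

# Navigation: follow the reference held by the parent object.
navigated = [(c.name, c.age) for c in fred.children]

# Relational model: the same question is answered by joining on key values.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
con.execute("CREATE TABLE parent_of (parent_id INTEGER, child_id INTEGER)")
con.executemany("INSERT INTO person VALUES (?, ?, ?)",
                [(1, "Fred", 41), (2, "Wilma", 30), (123, "Pebbles", 5)])
con.executemany("INSERT INTO parent_of VALUES (?, ?)", [(1, 123), (2, 123)])
joined = con.execute(
    "SELECT c.name, c.age FROM person AS p "
    "JOIN parent_of AS po ON po.parent_id = p.id "
    "JOIN person AS c ON c.id = po.child_id "
    "WHERE p.name = ?", ("Fred",)).fetchall()
con.close()
```

Both paths return the same child, but the object version reaches it through a stored reference, while the relational version has to match key values across tables at query time.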
Object databases can often be made more efficient than relational systems, but only when the data model has been properly constructed and the strengths and weaknesses of each DBMS model and implementation have been carefully considered. OODBMSs will probably not replace relational technology but will handle complex data processing situations that require graphics, hypertext or multimedia capabilities. Such problems commonly appear in artificial intelligence, CAD/CAM, office information systems, and other applications in which information is interrelated and diverse in structure and complexity.

It also remains an open question how transaction management will be done. Because OODB methods often involve complex data processing, a dilemma arises over how to avoid losing large amounts of work to rollbacks, as well as long delays caused by conventional locking.

Currently, the methods, modeling tools and languages are primarily for standalone operation, which makes it difficult to integrate the new technology into an information system's production environment. As more tools become available and the techniques and methodologies become better known, however, integration problems with mainstream information system applications will likely be resolved and new application areas will be developed.

7. Bibliography

M. Atkinson et al.: The Object-Oriented Database System Manifesto, Proc. DOOD 89, Kyoto/Japan, 1989.
Won Kim: Object-Oriented Database Systems: Promises, Reality and Future, Proc. of the 19th VLDB Conference, Dublin (Ireland), 1993.
Hugh Darwen and C.J. Date: The Third Manifesto, SIGMOD Record 24:1, p. 39-49, March 1995.
V. Amirbekian: Favorite OODB-related sites (links), University of Mining and Metallurgy, Cracow, May 7, 1997. http://galaxy.uci.agh.edu.pl/~vahe/oodb.htm
Kazimierz Subieta: Object Database Systems, Polish-Japanese Institute of Computer Techniques, 1999.
Jonathan Robie, Dirk Bartels: A comparison between relational and object oriented databases for object oriented application development, White paper from POET Software Corporation.
Robin Vasan: Integrating Objects And Relational Databases, Object Magazine, July-August 1992, p. 59.
R.G. Cattell: Object Data Management: object-oriented and extended relational database systems, Addison-Wesley Publishing Co., Reading, Massachusetts, 1991.