The Institute of Computer Science
The Polish Academy of Sciences

Andrzej Jodłowski

Dynamic Object Roles in Conceptual Modeling and Databases

Ph.D. Thesis

Advisor: Doc. Dr. Hab. Kazimierz Subieta

Warsaw, November 2002

Table of Contents

1. Introduction
2. Choosing Notions in Conceptual Modeling
  2.1. Conceptual Modeling
  2.2. Notions of Object-Oriented Conceptual Modeling
3. Inconveniences of Multiple Inheritance
  3.1. The Concepts of Class and Inheritance
  3.2. The Concept of Multiple Inheritance
  3.3. Multiple Classification
  3.4. Types and Substitutability
    3.4.1. The Concept of Type
    3.4.2. Subtypes and Substitutability
  3.5. Well-Known Programming Languages and Multiple Inheritance
    3.5.1. Smalltalk
    3.5.2. Modula-3
    3.5.3. Eiffel
    3.5.4. C++
    3.5.5. O2
    3.5.6. Java
    3.5.7. General Observations
  3.6. Multiple Inheritance in the Object-Oriented Methodologies and Notations
  3.7. Multiple-Inheritance Problems
    3.7.1. Semantical Context and Multiple Inheritance
    3.7.2. Human Factor
    3.7.3. The Name Conflict
    3.7.4. Typological Problems
    3.7.5. The Multiple Inheritance and Object Migration
4. Well-Known Approaches to Dynamic Object Roles
  4.1. The Concept of Roles by Bachman and Daya
  4.2. Aspects
  4.3. Fibonacci Approach
  4.4. The Concept of Roles by Kristensen
  4.5. Prototypes
  4.6. Subtables in SQL99
  4.7. Roles Realized by Design Patterns
5. Dynamic Object Roles in Stack-Based Approach
  5.1. The Concept of Dynamic Object Role
  5.2. Object Store Model with Dynamic Roles
    5.2.1. Links among Objects and Roles
    5.2.2. Dynamic Roles - a Formal Model of an Object Store
  5.3. Dynamic Roles vs. Classical Object-Oriented Models
    5.3.1. Multiple Inheritance
    5.3.2. Repeating Inheritance
    5.3.3. Multiple-Aspect Inheritance
    5.3.4. Temporal and Historical Properties
    5.3.5. Variants (Unions)
    5.3.6. Object Migration
    5.3.7. Referential Consistency
    5.3.8. Overriding
    5.3.9. Binding
    5.3.10. Typing
    5.3.11. Subtyping
    5.3.12. Substitutability
    5.3.13. Dynamic Inheritance
    5.3.14. Aspects of Objects and Heterogeneous Collections
    5.3.15. Aspect-Oriented Programming and Separation of Concerns
    5.3.16. Meta-Data Support
  5.4. Specification of Dynamic Roles in Database Schemata
    5.4.1. Concepts for Building Database Schemata
    5.4.2. Naming Issues
    5.4.3. A Sample Construction of an Object Schema with Dynamic Roles
    5.4.4. Declarations of Data Structures
    5.4.5. Metadata Management
  5.5. Query Language for the Object Model with Dynamic Roles
    5.5.1. The Environment Stack
    5.5.2. Opening a New Scope on the Environment Stack
    5.5.3. Thin and Thick Sections
    5.5.4. Private, Protected, and Public Properties
    5.5.5. Binding
    5.5.6. Polymorphism and Overriding
    5.5.7. Creating and Deleting Roles
      5.5.7.1. Create Operator
      5.5.7.2. Delete Operator
    5.5.8. Role-Specific Operators
      5.5.8.1. Casting Operator
      5.5.8.2. Hasrole Operator
      5.5.8.3. Roles Operator
    5.5.9. Query Optimization
6. Extending UML with Dynamic Object Roles
  6.1. Dynamic Classification vs. Dynamic Object Roles
  6.2. Composition vs. Dynamic Object Roles
  6.3. RoleOf Relationship
  6.4. Combining Class and Role Hierarchies
  6.5. Multiple-Aspect Inheritance
  6.6. Classical Inheritance vs. RoleOf Relationship
  6.7. Notation for RoleOf Instances
7. Implementation
8. Conclusions
Appendix A. The Prototype - General Description
  A.1. The Physical Level
  A.2. The Logical Level
    A.2.1. The Definition of an Object
    A.2.2. The Definition of Atomic Value
    A.2.3. Creating Objects
    A.2.4. Information Retrieval Functions
    A.2.5. Updating Functions
  A.3. The Conceptual Level
    A.3.1. Class
      A.3.1.1. Attribute
      A.3.1.2. Method
      A.3.1.3. Relationship
    A.3.2. Role Class
    A.3.3. Class Instance
    A.3.4. Role Class Instance
    A.3.5. Other Functions
  A.4. Metabase, ODL, SBQL with Dynamic Roles
    A.4.1. The Grammar of ODL
    A.4.2. ODL Parser
    A.4.3. SBQL Parser
    A.4.4. The Grammar of SBQL
  A.5. Query Evaluation Module
    A.5.1. The Organization of Environment Stack
    A.5.2. The Organization of Query Result Stack
    A.5.3. Query Evaluation
  A.6. Types and Reserved Names
Appendix B. The Stack-Based Approach
  B.1. Introduction
  B.2. Objects, Classes and Abstract Object-Oriented Store Model
  B.3. Example Database
  B.4. Stacks
  B.5. Binding
  B.6. Query Language
    B.6.1. SBQL Syntax
    B.6.2. Results of SBQL Queries
    B.6.3. SBQL Semantics
      B.6.3.1. Algebraic Operators
      B.6.3.2. Non-Algebraic Operators
  B.7. An Example of Query Evaluation
Appendix C. List of Figures
Bibliography

1.
Introduction

For several years, dynamic object roles have had the reputation of a notion on the brink of acceptance. Many articles advocate the concept [ABGO93, BD77, BG95, Fowler97, GSR96, Kristensen95, KØ96, KRR00, LL98, LW99, Papazoglou91, Pernici90, RS91, WJ89, WL95, Wong99], but many researchers do not consider its applications broad enough to justify the extra complexity it adds to conceptual modeling facilities. Furthermore, the concept is neglected on the implementation side: as far as we know, no popular object-oriented programming language or database system supports it explicitly. Some authors accept a tradeoff in which the role concept is realized by special design patterns [BRSW00, Fowler97, GHJV95, RG98], applied both in conceptual modeling and in implementation. The notion has already been adopted by the SQL:1999 standard [ANSI99], although under a different name, with specific semantics and with some limitations.

Fig. 1.1 Roles played by a person (a Person object with the roles Speaker, Programme Committee Member, Reviewer, Participant and Author; Speaker and Author each appear three times)

The idea of dynamic object roles assumes that a real or abstract entity can acquire and get rid of roles during its lifetime without changing its own identity. Roles appear during the life of a given object; they can exist simultaneously and can disappear at any moment. For instance, a person can be a reviewer, an author, a speaker and a participant of a conference at the same time, as depicted in Fig. 1.1. Similarly, a building can be an office, a house, a magazine headquarters, and so on.

When developing a class model in UML and then implementing it in a programming language, one can face three main difficulties:
(i) Some objects have many specializations at the same time, which leads to multiple or multiple-aspect inheritance;
(ii) Some objects have many specializations of the same type (repeating inheritance).
For instance, a person can be a member of many clubs at the same time;
(iii) Some objects have specializations that depend on time. For instance, a person who was a student a year ago can currently be an employee of a company. Furthermore, a person can be an employee several times, at different times and in different companies.

Similar problems, mostly related to recording historical information, occur with other entities such as institutions, companies and documents. One should conclude that the classical inheritance concept, as presented in UML for instance, is not fully adequate for data environments that depend on historical data. Another disadvantage of such a design is its complexity, especially after mapping it to a relational DBMS: very complex relational structures imply very complex SQL queries. We have concluded that such cases are poorly dealt with in UML and cause difficulties during implementation. The only radical cure is to introduce dynamic object roles both at the level of UML class diagrams and at the level of data structures implemented in an object-oriented or object-relational DBMS.

In this dissertation, we show that dynamic object roles are useful both for conceptual modeling and for implementation. We argue that the concept could considerably improve modeling tools such as UML [UML01] and could be an important paradigm for object databases built in the spirit of ODMG [ODMG00].

Our approach to dynamic roles in a query language [JHPS02] is based on the stack-based approach (SBA) [PK00, Płodzień00, SKL95]. A version [SMSRW93] was implemented in the prototype system Loqis [Subieta91]. SBA includes a data definition language (DDL) and a query language, SBQL (Stack-Based Query Language), which are similar to ODMG ODL and OQL respectively and have clean and precise semantics. SBA constitutes a uniform conceptual platform for integrated query and programming languages.
One of its central concepts is the naming, scoping and binding principle, which enables us to deal with naming issues effectively. In our opinion, SBA is a universal, simple and powerful object data model, and is the only formalism able to accommodate the concept of dynamic object roles naturally.

We introduce an extended SBA object model that defines roles as composite objects with a special structure, semantics and generic operations. We describe the structure formally and sketch a data definition language and a query/programming language that support generic operations for defining and processing such structures.

The rest of the dissertation is organized as follows. Chapter 2 presents a short discussion of notions in conceptual modeling. Chapter 3 analyzes the inconveniences of multiple inheritance. Chapter 4 presents the state of the art concerning dynamic object roles. Chapters 5 and 6 are the main parts of the thesis. Chapter 5 introduces our object model with roles and discusses how it differs from the traditional object models of programming languages and database management systems. Chapter 6 describes changes to the UML notation that allow dynamic roles to be defined in UML class diagrams. Chapter 7 briefly presents the implementation of the prototype. Chapter 8 reports our conclusions and future research plans.

The objectives of this dissertation are:
- The formal introduction of an extended SBA object model with roles.
- The development of a prototype of a data definition language that makes it possible to define class schemata with roles, and the development of a prototype of a query/programming language supporting generic operations that act on objects with dynamic roles.
- The introduction of an extension of the UML class diagram that encompasses the modeling of dynamic roles.

2.
Choosing Notions in Conceptual Modeling

Because there are different viewpoints on modeling real-world phenomena and on how computer systems should perform this process, it is necessary to introduce and use various notions and languages. The object-oriented paradigm aims to unify and limit this variety. However, complete unification cannot be accomplished, given the different natures of the descriptions needed and the various requirements imposed by the recipients who use a computer system.

People involved in constructing information systems wrestle with many problems, especially with:
- the choice of the level of generality of the world to be described and of its model,
- decisions about the meaning and place of the conceptual schema within the global data schema, and
- difficulties related to the great complexity of the conceptual modeling process.
These problems are briefly presented and discussed in this chapter.

2.1. Conceptual Modeling

Attempts to represent fragments of the real world have disclosed the need to introduce several basic notions. These concepts most often characterize a descriptive kind of knowledge, so they relate to certain sets of states of the real world as well as to state changes, which describe dynamic phenomena. Conceptual modeling makes it possible to build mental images with the help of given notions in order to ensure good comprehension of the essential properties of the beings characterized.

The image of programmers arduously coding given algorithms in some programming language is a rather common stereotype of work on software. However, this stereotype illustrates the essence of the creative process rather poorly. It does not encompass the whole complexity of the images and mental processes that occur in the mind of an analyst, designer or programmer before and during programming work.
Before starting to code, a designer or a programmer should understand the problem and the method of its solution very precisely. According to common opinion, it is impossible to develop software correctly if the plan of “what to do and how to do it” is incomplete or imprecise. It follows that the fundamental processes of software development occur in the human mind and need not be connected with any programming language.

The notions of conceptual modeling and conceptual model relate to all the informal mental processes that accompany work on software. Conceptual modeling takes place at different phases of the system life cycle. It is supported by proper semiformal means reinforced by human memory and imagination. As a rule, those means are based on a graphical representation of mental images related to the reality and characterized by the data, or related to the data structures and processes needed for developing an information system. Tools such as class diagrams, functional diagrams and use case diagrams make communication within and among project teams possible. The same means also allow communication between a designer or analyst and a customer, and make it possible to document the results of the analysis, design and implementation phases.

In any information project, it is possible to distinguish three general forms of a conceptual model [Subieta98]:
- A mental model of the real world, which defines the subject of the information system. This model, representing domain knowledge, is seldom formalized. It should include all the knowledge about a given institution, organization or company, and about the scopes and aspects of its activities, that is necessary to comprehend the functions fulfilled by the information system;
- An abstract conceptual model, defined by means of a proper formal or semiformal notation (e.g. class diagrams). It usually represents only some part of the domain knowledge;
- A conceptual model of the data structures that form the basis of the information system (e.g.
a relational schema written in SQL).

Each of these models plays a certain role in software development. The mental model of the real world is necessary to understand what the data are for and what significance their processing has. The aspects of domain knowledge that are important for the information system under development are described by the abstract conceptual model. Finally, during programming work, a schema of the data structures is necessary in order to understand correctly their structure, organization, manner of processing, etc.

The following consistencies relate to the conceptual modeling of a given programming venture:
- Consistency between the mental model of the real world and the abstract conceptual model of this world, described with the aid of a given formal language;
- Consistency between the conceptual model and the schema of data structures, expressed by the type (or class) system of a programming language and/or by the schema description language (data description language) supported by a given database management system.

These consistencies mean that the differences between thinking about the real world, comprehending the abstract conceptual model and understanding the stored data structures should be as small as possible. Minimizing these differences is crucial for many phases of the software life cycle and is conducive to high quality and modifiability of the software.

The aspiration to achieve these consistencies is the driving force behind the development of semantic data models and the introduction of some elements of these models (including object-oriented ones) as features of programming languages and DBMSs. Since neither the mental model of the real world nor its abstract conceptual model includes elements related to the computer environment, the aspiration to obtain these consistencies leads to a higher level of abstraction and to greater independence between data and programs.
As a result of this process, designers, programmers and users are gradually freed from having to take some elements of the system environment into consideration.

On the other hand, one can regard this tendency as a factor that slows down the development and application of conceptual modeling tools. Given the constraints on data structures and other features of a particular programming language and DBMS, the many differences between conceptual models and data structure schemata make it harder to understand the semantics of the data as well as the goals and manner of processing them. Thus, one may question the point of introducing into the conceptual schema those tools that cannot be directly mapped onto properties of data structures within a given implementation base. The tendency to enrich conceptual modeling tools with more and more new concepts encounters a barely perceptible barrier related to the excessive complexity of the representation and to the differences between the conceptual schema and the implemented data structures.

It is worth noting that the relational model cannot represent some features of the conceptual model directly (e.g. inheritance hierarchies, multi-valued or compound attributes, and many-to-many relationships). For many projects this can in effect destroy the direct connection between the abstract conceptual model and the data structures. Both the greater flexibility and arbitrariness of the choices made by designers and programmers during the realization of a system, and the greater difficulty of understanding the resulting data structures, are detrimental to software quality.

Increasing the consistency between the abstract model of the domain knowledge of a system and its concrete realization in the form of implemented data structures and programs is a fundamental goal of object-orientedness.
It has important consequences for many aspects of developing information systems, including the speed and cost of their construction as well as their quality, reliability, openness, modifiability, modularity, portability and potential for reuse.

2.2. Notions of Object-Oriented Conceptual Modeling

The growing complexity of information system applications has resulted in new demands on the conceptual modeling of business domains, databases and application programs. Conceptual modeling is supported by various formal or semiformal notations such as class, functional, use case, dynamic and other diagrams. Mapping a piece of reality into a conceptual model requires notions that follow natural ways of human thinking and understanding and that, at the implementation level, can be easily mapped into data and programming abstractions. Object-oriented conceptual modeling notions are supported in programming languages by proper data structures and behavioral properties defined at the algorithmic level of semantic precision.

A long-term tendency in the development of programming languages and database management systems is that (usually semiformal) conceptual modeling notions become, after some time, data and programming abstractions. Object databases illustrate this thesis. Considering the entity-relationship model as a tool for the conceptual modeling of relational databases, we observe that many of its notions (such as entities/classes, relationships, inheritance and others) are later materialized as constructs of database structures, query languages and programming languages.

Despite the large collection of conceptual modeling facilities, it is still difficult to model some typical situations in business reality directly and precisely. An example is the concept of multiple inheritance, which supports conceptual modeling but leads to various semantic anomalies.
Inaccurate modeling causes communication difficulties between project members, increases the probability of errors, consumes additional resources during system construction, and has a negative impact on the code length, documentation, transparency, maintainability and reliability of software. Thus conceptual modeling facilities should contain all the notions necessary to allow the analyst and designer to express their design vision as precisely as possible.

On the other hand, an excessive extension of conceptual modeling notions may make them difficult for project members to learn and to use properly. Thus there is an opposite tendency to minimize the number of notions and to express new notions in terms of known ones. For instance, some methodologies do not deal with aggregation, considering it a special case of association; others do not involve inheritance, assuming it can be expressed otherwise. Another disadvantage of a large number of conceptual modeling notions is their inherently semiformal semantics (they enhance human thinking rather than computer operation), which can make it difficult to recognize the proper usage of the notions and the semantics of their particular combinations.

The tendency to extend conceptual modeling notions is also restrained by implementation environments. If a conceptual notion has no direct counterpart on the implementation side, it has to be mapped into other implementation notions; thus, the original idea of the analyst or designer is distorted. In consequence, there is little motivation to use notions that have no direct counterparts in implementation. For example, because relational databases do not support inheritance directly, the corresponding analysis and design methodologies (with a few exceptions) avoid this concept.

In our opinion, dynamic object roles are useful for both conceptual modeling and implementation.
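The notion can be made concrete with a small sketch. The following Python fragment is only an illustration under our own assumptions, not the object model defined later in this dissertation: the names Entity, Role, add_role, drop_role, has_role and roles_of are hypothetical and were chosen for this example. It shows an object that acquires and drops roles during its lifetime, may hold several roles of the same kind at once, and keeps its identity throughout.

```python
import itertools

_next_oid = itertools.count(1)  # hypothetical source of object identifiers


class Role:
    """One role instance, e.g. a single club membership (illustrative only)."""

    def __init__(self, kind, **attributes):
        self.kind = kind              # role name such as "Author" or "Reviewer"
        self.attributes = attributes  # role-specific data
        self.owner = None             # set when the role is attached to an entity


class Entity:
    """An object whose identity is fixed while its set of roles changes."""

    def __init__(self, name):
        self.oid = next(_next_oid)    # assigned once, never changed afterwards
        self.name = name
        self.roles = []

    def add_role(self, role):
        """Attach a role; several roles of the same kind may coexist."""
        role.owner = self
        self.roles.append(role)
        return role

    def drop_role(self, role):
        """Detach a role without affecting the entity or its other roles."""
        self.roles.remove(role)
        role.owner = None

    def has_role(self, kind):
        return any(r.kind == kind for r in self.roles)

    def roles_of(self, kind):
        return [r for r in self.roles if r.kind == kind]


if __name__ == "__main__":
    person = Entity("Smith")
    person.add_role(Role("Author", paper="P1"))
    review = person.add_role(Role("Reviewer"))
    person.add_role(Role("ClubMember", club="Chess"))
    person.add_role(Role("ClubMember", club="Tennis"))  # same kind, second instance
    person.drop_role(review)                             # the person stays the same object
    print(person.oid, [r.kind for r in person.roles])
```

Keeping roles as separate instances attached to the entity, rather than as subclasses of its class, is what allows the same kind of role to occur more than once and allows a role to disappear while the entity and its identity remain.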
The low popularity of the notion is caused by the already established object-oriented principles, especially in programming languages. The basic assumption is that objects conform to the substitutability principle (LSP), which seems very natural but leads to anomalies that are evident in the case of multiple, multiple-aspect and/or repeated inheritance. Another assumption that impedes the popularity of dynamic roles is strong static (polymorphic) typing, which in the case of roles must be considerably relaxed or redesigned. We will try to convince the reader that both of these impediments to the wide usage of roles can be avoided.

3. Inconveniences of Multiple Inheritance

Multiple inheritance is one of the conceptual modeling mechanisms of object-orientedness and has been applied with varying success. This section introduces the discussion of multiple inheritance and presents the advantages and disadvantages of the concept. For clarity, the presentation is informal: it omits needless theoretical and technical details in favor of general intuitions. The approach is meant to convey the essence of the difficulties related to the formulation and application of multiple inheritance in conceptual modeling and in widely known object-oriented programming languages. A careful look at the mechanism of multiple inheritance is essential for comprehension of the role concept.

3.1. The Concepts of Class and Inheritance

The concept of class represents a certain abstraction in thinking and programming. This notion is supposed to capture static properties of objects (their structures) and their behavioral specifications, including operations that can be performed on a given group of objects. In this study, a class is treated as a place storing invariants, shared properties, and those properties that are connected with the whole population of objects.
Among others, one can include the following properties: attributes, methods, class names, interfaces, common and default values, and relationships with objects of other classes (associations).

The concept of inheritance is one of the fundamental mainstays of object-orientedness. This concept describes a relationship between classes, in which:

- Properties1 of a certain class can be accessed within a given class, which is related to that class by the inheritance relation;
- Properties that are available through inheritance are called inherited properties.

1 It may concern only some properties of a given class; for instance, most often a class name is not inherited.

[Fig. 3.1: An example of inheritance. Class Person (name, birth_year, age()) and its subclass Student (semester, album_no, insert_mark(mark), avg_mark()).]

A class from which properties are inherited is called a superclass (a more general class, a parent class), and a class that is extended by inherited properties is called a subclass (a more specific class, a child class). The inheritance relation is a partial order that organizes classes into a hierarchy (see Fig. 3.1). A class hierarchy is most often a tree with one common root. Classes that are nodes of the hierarchy tree possess (inherit) properties from all more general classes within this tree, up to its root. In the case of multiple inheritance, which is discussed further on, the hierarchy structure is more general and becomes a lattice.

[Fig. 3.2: Inheritance and availability. Diagram labels: Person class, Employee class, Person instance; generalization, inheritance, membership, availability.]

In object-oriented studies a distinction is drawn between structural inheritance and behavioral inheritance. In structural inheritance2, subclasses inherit the whole structure of their superclass, i.e. attribute and method specifications, as well as their type (and signature) specifications. The structure of such a subclass should not be visible to other classes.
Structural inheritance does not support the substitutability principle (discussed in Chapter 3.4).

2 It is also called implementation inheritance or private inheritance.

Behavioral inheritance, however, involves a semantic use of generalization. It is important and meaningful in a conceptual model. The presented distinction does not concern other aspects of inheritance or further features that can be subject to inheritance (e.g. class extension).

[Fig. 3.3: An example of class diagram with objects. Class Person (name, birth_date, tel_no., age()) with subclasses Student (university, album_no., scholarship, chgScholarship(newRate)) and Employee (salary, chgSalary(newSalary), avgSalary(year)). Instance data shown: Smith (12/03/1956, 6498237; Warsaw University, 3236457, 850), McDuck (02/16/1975, 2357673; Jagiellonian University, 3234434, 0), Karowska (01/01/1963; 2460).]

Availability of properties, another relationship, is determined by the class membership of objects. This relationship, defined between a class and its member (an instance), makes the properties of a given class (perhaps not all of them) available within its instances. The concept of availability, which sometimes appears erroneously in the literature as inheritance, should be clearly distinguished from inheritance; see Fig. 3.2. Further in this work, availability will sometimes be considered one of the features of inheritance, but the two concepts will not be identified with each other.

Applying inheritance and availability means that objects which are instances of a given subclass possess both the properties of this subclass and the properties of its superclass (or superclasses), e.g. as attributes or methods. Fig. 3.3 presents a simple class hierarchy and exemplary instances of classes.
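The difference between inheritance (a relationship between classes) and availability (a relationship between a class and its instances) can be illustrated with a minimal Python sketch. Python is used here only for illustration and is not one of the languages discussed in this work. The snake_case method names, the reading of the date in Fig. 3.3 as month/day/year, and the reference date passed to age() are assumptions made for the example; the class layout and the attribute values follow Fig. 3.3.

```python
from datetime import date


class Person:
    def __init__(self, name, birth_date, tel_no=None):
        self.name = name
        self.birth_date = birth_date
        self.tel_no = tel_no

    def age(self, today):
        """Age in full years on the given day."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month, self.birth_date.day)
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


class Student(Person):
    def __init__(self, name, birth_date, tel_no,
                 university, album_no, scholarship):
        super().__init__(name, birth_date, tel_no)
        self.university = university
        self.album_no = album_no
        self.scholarship = scholarship

    def chg_scholarship(self, new_rate):
        self.scholarship = new_rate


# Values taken from Fig. 3.3 (date read as month/day/year: an assumption).
smith = Student("Smith", date(1956, 12, 3), "6498237",
                "Warsaw University", "3236457", 850)

# Inheritance: a class-to-class relationship. Student defines no age() of
# its own; a lookup on the class finds the method stored in Person.
inherited = "age" not in vars(Student) and Student.age is Person.age

# Availability: a class-to-instance relationship. The object stores only
# attribute values, yet the methods kept in its classes can be called on it.
stored_in_object = sorted(vars(smith))
age_in_2002 = smith.age(date(2002, 11, 30))
```

In this sketch the method age() is stored once, in Person; it is inherited by Student and merely available to the object smith, which itself holds only attribute values.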
Attributes and methods that are inherited by the Student and Employee classes, or that are available within instances of these classes, are enclosed by a dashed-line ellipse, as distinct from the own attributes and methods of the Student or Employee classes.

In conceptual modeling and in object-oriented programming languages, one can use the mechanism of inheritance:

- For defining a new, more specific class on the basis of another class;
- For creating a more general class (often an abstract one) for several specific classes, enclosing their common properties (“before parenthesis”).

3.2. The Concept of Multiple Inheritance

During the conceptual modeling of even a small system, it may turn out that some groups of objects should have properties of different natures. Generally, one can distinguish two cases. In the first one, two (or more) natures of an object usually manifest themselves at the same time, whereas in the second one the ambivalence of objects is strictly related to a specific period. An object at some moment manifests properties of a given nature, but later it can manifest other properties, which are properties of a different nature.

[Fig. 3.4: An example of class diagram with multiple inheritance (in UML3). Class Person (first name, last name, date_of_birth, address, insurance_no., email, getMessages()) with subclasses Student (index_no., email, scholarship, getMessages()) and Employee (date_of_employment, salary, email, getMessages()); class Working Student (school_hours) inherits from both Student and Employee.]

A simple example is depicted in Fig. 3.4. The Working Student class inherits both the properties of the Student class and the properties of the Employee class. Multiple inheritance is thus defined between the Working Student class and the Student and Employee classes. The Student and Employee classes are not related by the inheritance relation; therefore they represent object categories with different behavior.
It seems that the multiple inheritance relation extends the concept of (single) inheritance in a simple and natural way4. A class hierarchy with multiple inheritance loses its simple tree structure and becomes a lattice (an acyclic graph) (Fig. 3.5). The inheritance relation has to be extended. In a class hierarchy with single inheritance, a class inherits properties from the classes that lie on the branch between this class and the root class of the hierarchy. In a class hierarchy with multiple inheritance, however, a class inherits properties from the classes that lie on any of the branches between this class and its most general (parental) class.

3 Unified Modeling Language

Both intuitively and from the formal (mathematical) point of view, the concept of multiple inheritance is simple and very attractive. It makes it possible to model a wide spectrum of matters in a way that is closer to natural language than solutions based on the inflexible principle of a tree class hierarchy. It requires fewer transformations of requirements into the conceptual model and yields models that are more coherent with the represented part of reality.

[Fig. 3.5: The inheritance in a tree and lattice hierarchy.]

The Working Student class is created by accumulating the properties of the Student and Employee classes. This seems easy and comprehensible in terms of the conceptual schema. In fact, however, this solution generates many additional difficulties related to the semantic details of the given combination. Some of them will be discussed through several examples:

- If one assumes that all attributes are subject to inheritance, then Working Student objects inherit three attributes with the same name email. Which one should be used by the getMessages method? The one defined in the Student class, in the Employee class, or maybe in the Person class?
4 In the concept of single inheritance a class can inherit indirectly from several classes, but directly from only one (parent) class. The concept of multiple inheritance extends the inheritance relation so that a given class can directly inherit properties from more than one (parent) class.

- If there are no supplementary mechanisms ensuring proper security (information hiding), a programmer or a user who can access the properties of a Working Student object may also access information that is not intended for all users, and in particular not for him. For example, employees of the dean’s office should not be able to learn what salary a given student receives for working in an independent company. Similarly, a manager of the company where this student works should not know the student’s exam grades or the amount of his scholarship.
- In the absence of additional means, it is not known which of the methods named getMessages (the one defined in the Student class or the one in the Employee class) should be called when evaluating the query expression work_stud.getMessages(), where work_stud is an identifier of a Working Student instance.

[Fig. 3.6: The ambiguity of reference copying (deep, shallow copying). Objects i1 and i2; operation: copy contents from i1 to i2.]

Let us assume that a given student starts working after some time at the studies (and becomes a working student). A question arises: how should the changes be arranged so as to include the information about his new job in the student’s data? An answer depends on the migration of the considered instance of the Student class to an instance of the Working Student class. This solution is only seemingly easy, because it requires additional information related to data structures and data processing. One such mechanism is copying of object content. The example depicted in Fig. 3.6 presents some of the difficulties that occur. It is assumed that the object identified by i1 contains a referential attribute with value i1.
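The situation of Fig. 3.6 can be reproduced with a minimal Python sketch. Python is used only for illustration; the class Obj and its attribute names are hypothetical. An object whose referential attribute points to itself is copied in two ways, and each way gives a different, equally defensible answer to the question of where the copied reference should point.

```python
import copy


class Obj:
    """A hypothetical object with one referential attribute."""

    def __init__(self, label):
        self.label = label
        self.ref = None


i1 = Obj("i1")
i1.ref = i1          # the referential attribute of i1 has the value i1

# Shallow copy: attribute values are copied as they are, so the
# reference in the copy still points to the original object i1.
i2_shallow = copy.copy(i1)

# Deep copy: the referenced structure is copied as well, so the
# reference in the copy points to the copy itself.
i2_deep = copy.deepcopy(i1)

shallow_points_to_original = i2_shallow.ref is i1
deep_points_to_copy = i2_deep.ref is i2_deep
```

Neither result is wrong in itself; which one is intended depends on semantic information that the object content alone does not carry.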
There is a need to copy the content of the object identified by i1 as the new content of the object identified by i2. Without supplementary semantic information it is not clear how to copy that referential attribute: should the reference in the copy point to the object identified by i1 or to the one identified by i2? This problem is an example of the difficult object-oriented issues related to notions such as compound object content, object migration and schema evolution. Those concepts will be discussed in the next sections of this work.

3.3. Multiple Classification

In many object-oriented languages, an object has one direct class. However, this restriction is not a necessity; we typically look at real-world objects from many viewpoints simultaneously. Multiple classification is a special kind of generalization in which an object may belong directly to more than one class. An example of a class diagram with multiple classification, so-called multiple-aspect inheritance, is depicted in Fig. 3.7. “Although multiple classification matches logic and everyday discourse well, it complicates implementation of a programming language and is not supported by the popular programming languages” ([UML01], p. 345).

[Fig. 3.7: An example of multiple-aspect inheritance in UML. Class Person is specialized along two aspects: Sex (Female, Male) and Job (Manager, Engineer, Salesman).]

3.4. Types and Substitutability

3.4.1. The Concept of Type

A type is a linguistic expression or a certain semantic structure that can be assigned to a variable, an expression or another programming entity (value, object, function, procedure, operation, method, method parameter, module, ADT5, exception, event). It specifies either the kind of values that this entity can assume or its “external” properties (interface) (e.g. for a procedure, a function, an operation or a method).
On the one hand, a type is a formal constraint imposed on the structure of a programming entity (or on its parameters and its result). On the other hand, it is a formal restriction of the context in which the entity can be called or accessed while a program is executed.

5 Abstract Data Type

Types are useful in conceptual modeling, because the type name of a given datum carries information about its meaning (semantics). For example, the date type or the Person type brings connotations from the external world; because of that, types improve the readability of programs.

The type concept is, in fact, orthogonal to object-orientedness; e.g. the well-known object language Smalltalk does not possess types and has only a limited ability of type checking (characterized as dynamic checking). However, many studies and languages (Java, C++, and Eiffel) treat the notion of type as a basic system feature, closely connected with the notion of class. Sometimes the type and the class are incorrectly considered to mean the same. This partly results from the fact that in many well-known object-oriented languages, e.g. in C++, a class name is simultaneously used as a type name. However, treating the notions of class and type as identical seems to arise from oversimplification or misunderstanding of the problem. An object type can be one of the invariants stored in the class (though not necessarily), but the class can store many other invariants and features (such as constraints and exceptions) that are irrelevant to the type concept.

Other studies consider a class to be a definition of objects (an object group), while types are definitions of values. This kind of distinction seems controversial too. In those studies, classes and types are often distinguished according to the principle: “a class is an implementation of a type”. Thus, a type is responsible for the external specification of a class, while a class holds all implementation details, including method bodies.
This definition is accepted as correct in many object-oriented conceptions and languages. Unfortunately, the object-oriented models used in most object database management systems (e.g. the ODMG standard) violate the above-mentioned connection between types and classes. For example, in the case of multiple inheritance, an object type is composed of two or more class types. What is its implementation in this case? Types can be defined recursively, but this kind of definition is unhelpful for classes. Also, models which assume that one object has many types at the same time, according to the inheritance hierarchy (e.g. the ODMG standard), are at odds with treating classes as type implementations.

Because of the various ways of understanding types and classes, the misunderstandings about these concepts, and the different expectations regarding their significance in languages and in whole systems, it is very difficult to provide a coherent model that allows a simple explanation of these notions and their mutual relation.

3.4.2. Subtypes and Substitutability

There exist two somewhat different definitions of subtype. The first one is based on the concept of set inclusion: type T1 is a subtype of type T2 if the set of values defined by T1 is included in the set of values defined by T2. For instance, the natural number type is a subtype of the integer type. The second, more important definition is based on another dependence: type T1 is a subtype of type T2 if T1 contains more attributes (operations, methods, etc.) than T2, i.e. all those of T2 and additional ones. For example, the type

    typedef EmployeeType = struct { string name, int year_of_birth, int salary }

is a subtype of the type

    typedef PersonType = struct { string name, int year_of_birth }

The definition of a type is supposed to be consistent with the class hierarchy: EmployeeType can concern Employee objects, while PersonType can concern Person objects. The Employee class is a subclass of the Person class. Defining the subtype relation leads towards the concept of substitutability.
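The second definition can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the two struct types are modeled as independent dataclasses (with string mapped to str), the helper is_record_subtype is hypothetical, and the check also accepts a type as a subtype of itself, which is slightly weaker than the "more attributes" wording of the definition.

```python
from dataclasses import dataclass, fields


@dataclass
class PersonType:
    name: str
    year_of_birth: int


@dataclass
class EmployeeType:          # deliberately not declared as a subclass
    name: str
    year_of_birth: int
    salary: int


def is_record_subtype(t1, t2):
    """True if t1 has every attribute of t2, with the same declared type."""
    attrs1 = {f.name: f.type for f in fields(t1)}
    attrs2 = {f.name: f.type for f in fields(t2)}
    return all(name in attrs1 and attrs1[name] == typ
               for name, typ in attrs2.items())


employee_is_subtype = is_record_subtype(EmployeeType, PersonType)
person_is_subtype = is_record_subtype(PersonType, EmployeeType)
```

Here the relation is derived from the structure of the two types alone, without any declared connection between them.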
Substitutability means that in every place of a program (context) where an entity of type T can be used, an entity of a type that is a subtype of T can be used too. For instance, wherever an integer can be used, a natural number can be used too; wherever a Person object can be used, an Employee object can be used too. Since a subtype has more attributes than its supertype, substitutability means that every attribute that “protrudes” beyond the expected type is ignored. For example, a program can contain:

    PersonType p, p1; EmployeeType e;   // declarations of variables p, p1, e
    p := (name: "White", year_of_birth: 1960);
    e := (name: "Brown", year_of_birth: 1950, salary: 3000);
    p1 := p;
    p1 := e;   // use of substitutability

Subtyping, substitutability, covariance and contravariance [Subieta98] lead to certain anomalies connected with the substitution process. In this work, we do not discuss this issue in detail; however, the role concept presented further on makes it possible to avoid substitutability and the anomalies related to it.

3.5. Well-Known Programming Languages and Multiple Inheritance

At the design stage, one has to make several important decisions related to the system under construction. Among other things, one chooses a database management system (DBMS) and data definition and manipulation languages (DDL, DML). In the development of databases, ongoing since the mid-sixties, many programming languages have been created; most have been used in the academic environment, but some have been used commercially too. They represent various language philosophies, which differ in their attitude to multiple inheritance. Many languages deliberately avoid supporting multiple inheritance. Some languages introduce mechanisms of multiple inheritance, but to a different extent.
In this section, several well-known programming languages will be briefly presented in the context of multiple inheritance.

3.5.1. Smalltalk

This object-oriented programming language was developed in the years 1976-83 at the Xerox Palo Alto Research Center in California. It introduces such notions as class, subclass, dynamic binding of methods, message passing, and metaclasses. Its main principle is that everything is an object (e.g. classes, literals, and constant values). In Smalltalk a class cannot be specified as a subclass of more than one class. The lack of multiple inheritance, simplicity, and the possibility of quick dynamic modifications have made Smalltalk a tool often used for rapid development of prototypes.

3.5.2. Modula-3

Modula-3 is an object-oriented successor of Modula-2 that does not include problematic features of other object-oriented programming languages, such as multiple inheritance.

3.5.3. Eiffel

Eiffel is an object-oriented programming language introduced and developed by Bertrand Meyer. It includes classes, abstract classes (deferred classes), parametric classes, class clusters, and multiple inheritance. Objects can be typed statically or dynamically.
In Eiffel, name conflicts occur when two or more properties with the same name are inherited from different classes:

    class CLASS_A
    feature
        data: REAL;
    end;

    class CLASS_B
    feature
        data: INTEGER;
    end;

    class CLASS_C
    inherit
        CLASS_A;
        CLASS_B   -- Error
        end;
    end

To avoid name conflicts, one can specify (by the selection mechanism) which properties are inherited from a given class:

    class CLASS_C1
    inherit
        A select data;
        B;
        end;
    end;

    class CLASS_C2
    inherit
        A;
        B select data;
        end;
    end;

When it is necessary to inherit more than one property with the same name from different classes, one can use the rename mechanism for some of the inherited properties:

    class Working_Student
    inherit
        Student rename no_tel as no_tel_university
        Employee;
        end;
    feature
        hours_schol: INTEGER;
        ...
    end;

3.5.4. C++

C++ was constructed as an extension of the C language with classes, subclasses, abstract and parametric classes, and virtual methods. It combines the functionality of a high-level language with elements of an assembly language (pointer arithmetic). C++ supports multiple inheritance. When properties with the same name are inherited from different classes, then, to avoid name conflicts, these properties must be qualified by the name of the class in which they are defined.

3.5.5. O2

An object-oriented language and a DBMS of the same name were developed by O2 Technology in France. O2 supports multiple inheritance. An inherited attribute name can be locally renamed in a given class.

3.5.6. Java

This language was developed by Sun Microsystems. It combines features of the C++, Smalltalk, and Objective-C languages. Java is a high-level object-oriented language that was intended to replace C++, which was considered too extensive and lacking clear rules. Among other things, Java does not support the controversial pointer arithmetic, and it does not support multiple inheritance in the sense of multiple inheritance of implementation, because a class in Java can have at most one superclass.
However, in Java it is possible to specify interfaces for a given class, and these are in turn inherited by the interfaces defined for subclasses of this class. Some researchers consider this to be multiple inheritance of interfaces. An interface can specify abstract methods and constants, which are visible when a given class is accessed via this interface. Many interfaces may be defined for a given class. Since interfaces are subject to inheritance, class specializations inherit all interfaces defined for their more general classes.

3.5.7. General Observations

The designers of languages that do not support multiple inheritance hold that only single inheritance has a simple and elegant form that is soundly specified and commonly understood. They treat languages with multiple inheritance as inelegant [Joyner96]. This position has been expressed for BETA: “Beta does not have multiple inheritance, due to the lack of a profound theoretical understanding, and also because the current proposals seem technically very complicated” [MMN93]. Similarly, Modula-3, Ada95, and Java are languages without multiple inheritance.

Other researchers see multiple inheritance as a notion that helps to solve some problems in conceptual modeling; thus, they argue that the features of this notion need to be specified, even if this requires very extensive further research. Although the list of difficulties that arise with multiple inheritance seems to be incomplete to this day, some classes of identified problems can be solved in an automatic manner.

3.6. Multiple Inheritance in the Object-Oriented Methodologies and Notations

Most object-oriented analysis and design methodologies and object-oriented notations for conceptual models introduce multiple inheritance as a basic feature, e.g. OMT6, UML.
Their authors argue that the absence of multiple inheritance in programming tools must cause a disadvantageous gap between the conceptual model and the implementation model.

6 Object Modeling Technique

Unfortunately, the semantics of multiple inheritance is not fully specified in these methodologies and notations, which only recommend (or require) avoiding name conflicts through the use of unique method names in the respective classes, or suggest evading multiple inheritance by means of class permutations (or by means of the delegation mechanism, which is semantically unclear too7) [UML97]. Fig. 3.4 presents an example of a class hierarchy with multiple inheritance and with name conflicts, in UML notation.

3.7. Multiple-Inheritance Problems

Information systems cannot be constructed by ad hoc design. Before the implementation tasks can start, a project must pass through, among others, the requirement specification and analysis phases. The work in every phase is supported by various tools, and its results are written down in different notations. Therefore, transferring results from one phase to another involves compromise decisions, which are often incoherent and deform some elements of the initial project. Multiple inheritance is one of these difficult issues.

Although the concept of multiple inheritance is addressed by everyone interested in object-orientedness, and especially in object programming languages, a uniform approach has not been developed so far. Studies that attempt to discuss this subject are usually concerned only with particular solutions in a given programming language. The concept of multiple inheritance is readily understandable at the stage of mental representation of the real world, but it causes much controversy during further work on the information system. Some of these controversies are caused by the poor suitability of a given programming language for defining and manipulating class hierarchies that support multiple inheritance.
Others are caused by the insufficient competence of designers and programmers, who easily fall into the semantic traps set by the implementation of multiple inheritance in various programming languages.

3.7.1. Semantical Context and Multiple Inheritance

Let us assume that an information system on students is created, which also includes information about their employment if they work while they study. Fig. 3.4 presents a class diagram with multiple inheritance, where the Working Student class is a specialization of the Student and Employee classes. The Working Student class defines school_hours, which specifies the weekly number of hours for which an employee is released from work to continue his studies. It is assumed that the value of the school_hours attribute is related to the kind of studies as well as to the amount of working time (full time, part time).

7 Delegation (in UML) represents a situation in which an object, after receiving a message, sends it to another object and thereby delegates the interpretation of the message to that object.

The Working Student class inherits all attributes from the Student class and from the Employee class. Only this information is expressed by the notation (in this case, UML). The notation fixes that all inherited attributes are inherited in the same way: they are merged in the common environment of the Working Student class. The notation does not specify more of the semantics of multiple inheritance. As a result, attributes of different classes coexist side by side. They are mixed in a new common semantic space, partially losing the information about their initial environments. From the conceptual modeling point of view, the degeneration of abstraction levels, the sharing of a common scope, and the weakening of bonds among entities that were strongly related to each other are negative phenomena.
A clear and precisely defined semantics of system components has grown in importance, especially given the need for continuous growth of the system, among other things with the use of reverse engineering methods (e.g., through reuse). The local environments determined by the Student and Employee classes are mixed in objects of the Working Student class. However, the execution of an inherited method or the retrieval of values of inherited attributes is delegated to the classes that define these properties.

The lack of an operational semantics of multiple inheritance at the stage of conceptual modeling leads to a gap between the system design (created at this stage) and its implementation in a specific programming language. Programmers understand and apply the mechanisms of multiple inheritance differently, which leaves a lot of freedom in the choice of solutions.

The combination of various semantic properties can sometimes lead to mistaken conclusions. Fig. 3.4 can serve as an example. The getMessages methods retrieve new messages. The semantics of these methods seems clear and natural in all classes except the Working Student class, which multiply inherits from both the Student and Employee classes. Working students possess two (or three) email accounts. The first is a school account, the second is an account at the workplace (and the third is a personal account). It is hard to find a situation in which a given person should be treated as a student, as an employee, and as a regular person at the same time. Thus, in most cases there is no doubt that the ambiguity of representation of a given person need not be considered, especially when specifying his most relevant email address. Without questioning the legitimacy of multiple inheritance, it should be stated that the problems mentioned may lead to changes or at least to semantic obscurities. This concerns particularly those cases in which one must stress the ambiguity (e.g.
duality) of the semantics of a given object group through multiple inheritance. The different semantics of such objects never, or almost never, manifest themselves at the same time, because they are mutually exclusive. A class hierarchy with multiple inheritance (a lattice) seems to be too rigid a construction for defining objects of a changeable nature. This is all the more important in databases, considering the usually long periods over which data are stored and the requirement of preserving high flexibility, among others because of the rather quickly varying needs of users.

3.7.2. Human Factor

To this day, the issue of multiple inheritance has not reached any coherent and general solution. Therefore, the understanding and application of this notion can cause many problems, especially in languages in which multiple inheritance is only one of the supplementary features. The problems that arise may be of different natures, e.g.:

- Insufficient general acquaintance of the authors of an information system with the concept of multiple inheritance;
- Insufficient skill in correctly identifying the cases whose solutions can take advantage of the multiple inheritance mechanism;
- The choice of a proper programming language and the consistent translation of conceptual schema components into adequate logical schema constructions, including the choice of a programming language without multiple inheritance;
- How to arrange the work so that it can be taken over or continued by persons who have not so far been involved in the project or in a part of it? How should a dedicated (specialized) company be instructed in connection with constructing some functions?
- How to ensure proper understanding of the semantics of other languages, which is essential during the creation or further life of the information system8;

8 This concerns especially work on extending an existing system with new features by using reengineering methods and new rapid application development (RAD) systems.
- What are the necessary modifications of the system, related to changes of some vital system requirements?

Each of the given examples requires a systematic approach to defining methods that can be applied to remedy the difficulties that arise. To this end, many people with different profiles and experience need to share a coherent and consistent approach to the issues related to multiple inheritance. The lack of rules and recipes in this area leads to the use of divergent methods, which cause errors and incoherencies in the system. Usually, errors of this kind are very laborious to correct. If they are made at an initial stage of the system’s life (e.g. in the analysis stage), they completely determine the further form of the project, because their correction or improvement is too expensive. This often leads to strange constructions within the system, which are difficult to understand and can cause further mistakes.

3.7.3. The Name Conflict

A substantial problem of multiple inheritance is the inheritance of properties with the same name from different classes. An example is presented in Fig. 3.4. The Working Student class multiply inherits from the Student class and from the Employee class. A doubt arises: in what way will the email attribute be inherited from the Student class and the attribute with the same name from the Employee class? It is also unclear which of these attributes should be used by the getMessages method. This situation is known as a name conflict. The arbitration methods are often seen by opponents of multiple inheritance as an example of the pathologies that occur in understanding this concept.

In typical programming languages (structural languages), the execution of a program block in which a name conflict appears either ends by signaling an error or proceeds by using binding priority rules for multiple inheritance9, which are fixed in a given language.
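A language with fixed priority rules resolves such a conflict silently. The following Python sketch mirrors the hierarchy of Fig. 3.4; Python is not among the languages discussed in this work, and the email addresses and returned strings are invented for the example. In this hierarchy, Python's method resolution order gives precedence to the superclass that is listed first in the class definition.

```python
class Person:
    email = "smith@home.example"

    def get_messages(self):
        return f"private messages for {self.email}"


class Student(Person):
    email = "smith@university.example"

    def get_messages(self):
        return f"school messages for {self.email}"


class Employee(Person):
    email = "smith@company.example"

    def get_messages(self):
        return f"work messages for {self.email}"


class WorkingStudent(Student, Employee):            # Student is listed first
    school_hours = 0


class WorkingStudentReversed(Employee, Student):    # only the order differs
    school_hours = 0


ws = WorkingStudent()
resolution_order = [c.__name__ for c in WorkingStudent.__mro__]

default_call = ws.get_messages()                    # Student's method is used
reversed_call = WorkingStudentReversed().get_messages()

# Qualifying the call with a class selects the method, but the attribute
# read inside it is still looked up by the same priority rule.
qualified_call = Employee.get_messages(ws)
```

The last call illustrates the kind of semantic trap discussed in this section: the method defined in Employee silently reads the email attribute defined in Student, and merely swapping the order of the superclasses changes the behavior of every call.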
Priority rules are usually defined through the syntactic ordering of the superclasses used in a program block, or by assigning an inheritance priority to the superclasses of a given class. Although both solutions seem sufficient for most occurrences of name conflicts, they make the program much harder to read, and they are also inflexible with respect to small semantic changes of this mechanism (e.g. reversing the priority order). This may cause errors that are unforeseen and difficult to identify. For example, different visual code generators can produce program code in which the superclasses appear in different orders.

⁹ This may be possible, e.g., by using dynamic binding.

Orion [BCGKWB87] and CLOS¹⁰ are examples of systems that arbitrate name conflicts on the basis of the superclass sequence in the program code. If methods with the same name m are defined in classes A and B, and class C inherits from both A and B (listed in the code in that order), then a call of m means, for all objects that are members of class C, the use of the method m defined in class A.

Usually, languages supporting multiple inheritance do not possess built-in priority rules for it. Then the only option is to avoid name conflicts. This may be supported by additional mechanisms in the language during the construction of the code (e.g. in C++) or by a proper construction of classes (e.g. in Eiffel). The detailed methods for solving name conflicts are:
- Local renaming (change) of an inherited property in a given subclass (O2, Eiffel). It is an effective solution, even if it leads to some inconsistency within the inheritance concept.
The advantages of renaming are the possibility of inheriting all properties that share a common name, and the fact that it exhibits some features of another mechanism, known as import of properties with renaming;
- A limitation of scope for inherited properties: the selection mechanism for inheritance in Eiffel. Selection points out the properties that should be primarily inherited. The selection mechanism concerns only those properties that are explicitly pointed out; all other (potential) properties are rejected during inheritance.
- A specification (in case of a name conflict) of a class qualifier in all (or some) places where the inherited properties are used (C++). This method is much more laborious than, for example, the local renaming applied in Eiffel, because it forces the use of the class qualifier (with the scope operator “::”) for every occurrence of a given name in the program code. It significantly decreases the flexibility of changes in the program (particularly in class definitions) and increases the probability that programmers make mistakes when reusing a class definition to specify another class with similar properties.

Renaming properties in one or more superclasses can obviously be applied in all languages. Though this solution seems trivial, it may be very inconvenient or even impossible. Unfortunately, the source code containing the class definitions may often be unavailable. It is worth noting that automatic or semiautomatic renaming is a solution that costs many programmers much time spent searching for errors in an incorrectly working program.

¹⁰ Common Lisp Object System

[Fig. 3.8 Different inheritance paths (classes B, C, D, E)]

Another general dilemma is related to the choice of the inheritance path (Fig. 3.8). It is particularly important for the inheritance of methods that are defined and called in a specific environment.
Let us assume that a given method is defined in class B and called from an instance of class E. A question arises whether the execution of the method depends on the chosen inheritance path, BCE or BDE. Generally, authors argue that in both cases the method should be executed in the same way. They are not sure, however, whether this is the case in all well-known languages.

The original sin of all the presented methods is their strong connection with a specific programming language. The lack of more uniform rules increases the reluctance of a wider group of users to apply them. The semantics of the constructions for arranging priorities and avoiding name conflicts are often too complicated. Therefore, their application should be recognized as relatively difficult. Moreover, research into program portability and reuse requires much work, more advanced knowledge, and experience.

3.7.4. Typological Problems

Substantial problems arise when multiple inheritance is considered from the point of view of type theory. It is not easy to create a type system that simultaneously meets the criteria of simplicity (for the user), conceptual cohesion, and universality. Existing ideas in this field are unsatisfactory or very complicated. Solutions such as the one in C++ are the subject of common criticism.

Object representation can be seen as another potential problem that arises in multiple inheritance. In many systems, to save memory, an attribute value has a fixed offset relative to the beginning of the object representation. If these relative attribute positions are fixed for both the Student and Employee classes, then they cannot possibly be reconciled with the corresponding positions of the Working Student attributes. This makes it necessary to use more complex solutions (e.g. in C++). It complicates the semantics of the language from the programmer’s point of view. It can also cause errors in the software and can increase execution time.

3.7.5. The Multiple Inheritance and Object Migration

In typical general-purpose programming languages, the class structure usually remains unchanged throughout, or for a long part of, the system’s life. In many languages, classes do not exist during program execution¹¹ (e.g. in C++). Most often, a modification of the class structure during program execution is not possible, even if classes are first-category citizenship beings¹².

Generally, in object-oriented systems, and particularly in object-oriented database management systems, one assumes that classes (and types) are first-category citizenship beings. This means that a class definition exists not only at the program analysis stage, but is also a programming object. It has its own location in the computer address space, and it is (theoretically or practically) subject to manipulation during execution. Such manipulations (especially those often performed) include adding a new method or attribute to a given class, modifying the security rules for instances of this class, and modifying the constraints on these instances.

The citizenship of a given programming concept can often be a little fuzzy. This happens in a language or system when a certain concept is postulated as second-category, but it exists at execution time and/or some (usually limited) operations can be performed on it. For example, some languages make it possible to use a type_of operator. This operator returns, for given programming beings (e.g. for objects), their actual type in the form of a string or a reference to a certain structure. This means that the type information exists and is available at run time. This kind of fuzzy citizenship of classes is, e.g., a standard feature of ODMG and Objective-C¹³. It is often criticized by experts in type theory, who emphasize that in this way the effectiveness of strong type checking is severely restricted. Moreover, as in the case of promoting concepts to first-category citizenship, fuzzy citizenship can also result in worse execution time.

The Java language is an example of another situation. Classes in Java are static beings. However, their dynamic representations exist at program run time in the form of special objects representing classes. Some operations can be executed on an object representing a certain class. In certain cases, this makes classes in Java behave as if they were first-category citizenship beings¹⁴.

¹¹ A second-category citizenship being is a programming language concept that exists only in the program code (used at the static analysis stage). Such beings are unavailable during program execution, e.g., a variable name, a parameter name, a type, a class, a procedure signature.
¹² A first-category citizenship being is a programming language concept (e.g. a type, a class, a module, a variable value, and a variable name) that occurs and can be manipulated (changed, deleted, and created) during program execution. For instance, an attribute value is a first-category citizenship being, but an attribute name is usually a second-category citizenship being.
¹³ For example, the objc_getClass(STR) construction in Objective-C.
¹⁴ For example, the Class.forName(String) construction.

4. Well-Known Approaches to Dynamic Object Roles

The idea of dynamically changing object roles was proposed for the network database model by Bachman and Daya in 1977 [BD77], far earlier than object-oriented methodologies and tools were popularized. During the era of the relational model and relational systems the concept was not considered in the context of databases because it did not fit well with the relational ideology. Thus for many years the idea was forgotten.
Interest in dynamic object roles increased after computer professionals realized the importance of conceptual modeling in software construction. As a result, object-orientedness has been popularized in various domains of information technology, and together with object-orientedness, the role concept is considered more and more frequently. The dynamic role concept often appears in papers devoted to object design, programming languages and databases, sometimes under other names, in different contexts and with various semantics.

The classical object-oriented model assumes that each object is associated with its most specific class. A deviation from this rule can be treated as a certain variant of the role concept. In particular, the Iris system [Fishman87] supports many types for a single object. A similar proposal can be found in [BG95]. The dynamic role notion appears in different contexts, in particular in papers related to modeling office information systems [Pernici90], computer aided manufacturing [WL94, WL95], workflow management [KRR00], multimedia [WCL96b, Wong99], semantic modeling [RS91, Sciore89, Stein87], and object modeling [Papazoglou91]. These papers propose to take advantage of dynamic roles for various dynamic properties, such as object migration, schema evolution, conceptual object clustering, creation of several views for one object, and others.

Recently new proposals to deal with roles have appeared. They can be subdivided into two groups. The first group assumes that dynamic object roles are represented informally through the already existing notions of conceptual modeling by using design patterns [KØ96, BRSW00, Fowler97, GHJV95, RG98], which have been popularized in the context of object-orientedness. The design pattern decorator [GHJV95] is considered in [Fowler97] as a good mapping of dynamic roles. The decorator pattern allows one to add functionality to a class without subclassing.
This approach corresponds (to some extent) to Aspect-Oriented Programming [KLMM+97] and the separation of concerns principle [Dijkstra76]. Also the motivation for Subject-Oriented Programming (SOP) presented in [HO93] shows that dynamic roles and SOP have conceptual similarities.

The second approach introduces the role concept as an explicit notion of conceptual modeling and a database feature orthogonal to other features. Such an approach has been implemented in the Aspects [RS91] and Fibonacci [ABGO93, AGO95, AAG00] prototypes. A feature called “subtable”, having some correspondence to roles, is introduced in the emerging standards SQL3 and SQL1999 [ANSI99] for object-relational systems.

In Fibonacci it is assumed that a role does not have its own behavior. Support for contextual object behavior was first introduced in the Clovers system [SZ89], in the proposal presented in [WJ89], in Multiple View in the Smalltalk context [GSR96], in the ORM model [SN88], and in Aspects [RS91]. In these approaches, however, roles do not possess their own classes and inheritance among roles is not supported. There is also no possibility to move a role from one object to another (which could be essential for some applications, e.g. a project manager role can be moved to another person). The proposals do not define operators that help the programmer to switch dynamically from one object role to another.

The proposals can be essentially distinguished by their attitude to strong static typing and by whether the introduced classes have first-class or second-class citizenship. In [Sciore89] classes are first-class citizens (so-called prototypes). ORM [SN88] is a similar proposal, but classes have second-class citizenship. On the one hand, for performance reasons, in commercial programming languages it is usually assumed that classes have second-class citizenship, which enables static binding of program entities.
On the other hand, the first-class citizenship of classes supports flexibility and robustness. For this reason, in database systems many program entities have first-class citizenship, in particular database procedures, views and triggers. Some systems, e.g. Oracle-8, also assume classes and methods to be first-class citizens stored on the side of the database server rather than on the side of the client application.

An essential aspect of roles concerns the method lookup strategy applied after an object receives a message. In Fibonacci, DOOR [WCL96a, WCL96b] and ADOME [LL98] two strategies are assumed. The first one, called upward lookup, consists in lookup within the direct role class and then within its ancestors. The second one, called double lookup, consists in lookup within the direct role class, then within the classes of sub-roles, and finally within its ancestors. The rationale for the different strategies is, however, not clear, since the given examples do not show explicitly the essence of the problem (considering that, eventually, everything is in the hands of designers and programmers of applications). It seems that the research presented in the papers [LCW97, LL94, LL98, LW99, WL94, WL95, WCL96a, WCL96b, WCL97, Wong99] is currently the most advanced.

Some papers (e.g. [WJ89]) pay attention to the fact that in the case of dynamic roles a unique object identifier becomes problematic. Indeed, for consistent referential semantics (e.g. for implementing relationships among objects and roles) each object role should possess its own unique identifier. This means some distinction between object identity and object identifier: while having a unique identity, an object can have many identifiers. This is an essential semantic novelty.

Multiple interfaces, which are a feature of Java and Microsoft’s COM/DCOM, have some associations with dynamic object roles. Indeed, the programmer is able to define interfaces in such a way that a single interface represents a single object role.
However, multiple interfaces do not support all features of dynamic roles; in particular, they do not deal with the inherent property of dynamic roles that they can be inserted into (and removed from) an object at run time.

The following is a list of features of dynamic object roles that occur in the literature on roles (taken from [Steimann2000]). It is worth noting that some features conflict with others.
1. A role comes with its own properties and behavior.
2. Roles depend on relationships [BO98, CZ97, EWH85, Guarino92, Sowa88].
3. An object may play different roles simultaneously [Kristensen95, MD93, Pernici90, RS91, WJS95, WCL97].
4. An object may play the same roles several times, simultaneously [Kristensen95, Pernici90, RS91, WJS95, WCL97]. (Not in [ABGO93, BD77, Reimer85].)
5. An object may acquire and abandon roles dynamically [ABGO93, GSR96, Kristensen95, LL98, Papazoglou91, RS91]. (Object migration or dynamic classification also in [MMW94, Su91]; dynamic classification vs. role playing in [WJS95].)
6. The sequence in which roles may be acquired and relinquished can be subject to restrictions [Pernici90, Su91, WJS95].
7. Objects of unrelated types can play the same role [BD77].
8. Roles can play roles [CZ97, Kristensen95, RH95, WJS95, WCL97].
9. Roles can be transferred from one object to another [Kristensen95, WCL97].
10. The state of an object can be role-specific [Kristensen95, Pernici90, WJS95].
11. Features of an object can be role-specific [ABGO93, GSR96, LL98, Pernici90, RS91].
12. Roles restrict access [GSR96, Kristensen95, Pernici90, WCL97].
13. Different roles may share structure and behavior [CM99, GSR96, Kristensen95, RH95].
14. An object and its roles share identity [Booch94, GSR96, Kristensen95, RS91].
15. An object and its roles have different identities [WJS95].

4.1. The Concept of Roles by Bachman and Daya

Historically, the first approach to extending a data model with roles was proposed by Bachman and Daya [BD77].
The authors introduce the concept of an integrated database, assuming that every single record should represent all aspects of a given real-world being. They introduce a concept of role, which describes the prototype of a class of roles with similar properties. They also introduce a concept of entity type, which defines the prototype of a class of entities that are capable of playing the same roles. Some role types can be played by more than one entity type; they are then declared as shareable (e.g., an employee role can be played by the entities person, supplier, and customer). Some entity types should always be characterized by a given role type. In such situations, one can specify essential role types, which obligatorily characterize certain entity types (e.g., an employer role is essential for a company entity).

Although the details of the proposed role mechanism are based on the network data model, the authors emphasize that the idea of roles is independent of the applied data model and may significantly enrich the semantics of other models (DIAM¹⁵, relational, and role models). The role concept can increase the cohesion of the conceptual model and of the logical model through a higher level of abstraction. In the authors’ opinion, data independence can also be reinforced in applications supporting this concept.

¹⁵ Data Independent Access Model

4.2. Aspects

Difficulties with representing the features of complex real-world objects that change over time, and problems (especially typological ones) related to multiple inheritance, became the main motivation for extending the object model with new static and dynamic properties through object aspects. The proposal of aspects was introduced by Richardson and Schwartz [RS91]. The mechanism proposed by these authors is capable of modeling a system in which objects can be dynamically changed to some extent, but without a change of their identity and with strong type checking preserved.
The concept of an object interface¹⁶ has been introduced into the object model as a set of method signatures, together with the concept of an object implementation, which specifies the object representation and defines the code of its methods. The object model supports the substitutability principle for types. The role mechanism is defined by combining type conformity (through substitutability) with the extension of objects by so-called aspects. Defining an aspect of a given object consists in adding a new implementation (for this object) whose abstract data type extends the abstract data type of the base (main) object. Objects with aspects are like clusters of soap bubbles, with smaller bubbles (aspects) attached to the large ones.

¹⁶ The authors use the notion “abstract data type” more frequently than “object interface”.

In the example below (taken from [RS91]), an interface and an implementation are defined for objects representing Person, together with the Employee aspect, which extends the properties of persons with information about their employment.

    /* Person interface - a definition of Person abstract type */
    type Person {
        string name();
        int age();
        string phone();
    };

    /* Person implementation */
    implementation personImpl {
        string myName;
        int myAge;
        string myPhone;
      public:
        string name() { return myName; };
        int age() { return myAge; };
        string phone() { return myPhone; };
        /* a constructor */
        personImpl( string n, int a, string p ) {
            myName = n; myAge = a; myPhone = p;
        };
    };

    /* a definition of Employee aspect - extending implementation of Person */
    implementation empImpl
        with type Employee    /* a name of interface of aspect */
        extends Person {
        int myEid;
        string myDept;
        string myPhone;
      public:
        int eid() { return myEid; };
        string dept() { return myDept; };
        string phone() {
            if ( myPhone != nil ) return myPhone;
            else return base.phone();
        };
        /* a constructor */
        empImpl( int e, string d, string p ) {
            myEid = e; myDept = d; myPhone = p;
        };
        /* operations exported from Person */
        Person::name;
        Person::phone as homePhone;
    };

An exemplary usage:

    Person BrownP = personImpl( "Brown", 56, "7872345" );
    Employee BrownE = empImpl( 123, "secretariat", "5652634" ) extends BrownP;
    bool b1 = ( BrownE @= BrownP );       /* TRUE */
    bool b2 = ( BrownE == BrownP );       /* FALSE */
    stdout.putstring( BrownP.name() );    /* Brown */
    stdout.putstring( BrownE.name() );    /* Brown */
    Person p = BrownE;    /* wrong: Employee does not conform to Person */

An object implementation can be extended by aspects that define new object behaviors through operations, and that can specify which elements of the interface of the base object are available within the aspect. Operations defined in aspects can directly access the base object through the base reference. Many different aspects can be created for a given object. They are created and dropped at system run time. Aspects need not be defined on a base object only; one can specify new aspects based on the definitions of other aspects. The authors have introduced a dropping rule for objects and aspects. According to this rule, when an object (or an aspect) is deleted, all aspects that extend this object (or this aspect) are recursively deleted too.
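The behavior described above can be approximated in a mainstream language. The following Python sketch is only an emulation under stated assumptions, not the mechanism of [RS91]: it has no static type checking, and the names Extensible, root, drop, alive and same_object are hypothetical helpers introduced for illustration. It shows the base reference, aspects built on other aspects, the recursive dropping rule, and the two reference comparisons used in the usage example (b1 and b2).

```python
class Extensible:
    """A base object or an aspect; both can be extended by further aspects."""

    def __init__(self, base=None):
        self.base = base          # what this aspect extends (None for a base object)
        self.aspects = []         # aspects currently extending this node
        self.alive = True
        if base is not None:
            base.aspects.append(self)

    def root(self):
        """Follow base references down to the underlying base object."""
        node = self
        while node.base is not None:
            node = node.base
        return node

    def drop(self):
        """Dropping rule: deleting a node recursively deletes its aspects."""
        for aspect in list(self.aspects):
            aspect.drop()
        if self.base is not None:
            self.base.aspects.remove(self)
        self.alive = False


class Person(Extensible):
    def __init__(self, name, age, phone):
        super().__init__()
        self._name, self._age, self._phone = name, age, phone

    def name(self):
        return self._name

    def age(self):
        return self._age

    def phone(self):
        return self._phone


class Employee(Extensible):
    """Aspect exporting name() and phone() (as home_phone) from its base."""

    def __init__(self, base, eid, dept, phone=None):
        super().__init__(base)
        self._eid, self._dept, self._phone = eid, dept, phone

    def name(self):
        return self.base.name()

    def home_phone(self):
        return self.base.phone()

    def phone(self):
        # Fall back to the base object when no office phone is set.
        return self._phone if self._phone is not None else self.base.phone()


def same_object(a, b):
    """Emulates the comparison written as '@=' in the usage example."""
    return a.root() is b.root()


brown_p = Person("Brown", 56, "7872345")
brown_e = Employee(brown_p, 123, "secretariat", "5652634")
print(same_object(brown_e, brown_p))   # same underlying object
print(brown_e is brown_p)              # different references
print(brown_e.name(), brown_e.phone(), brown_e.home_phone())
brown_p.drop()                         # also drops the Employee aspect
print(brown_e.alive)
```

Unlike the original proposal, nothing here prevents a program from calling an operation that the aspect does not export; that guarantee comes from the strong type checking of the aspects model and is outside the scope of this sketch.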
Since aspects do not have their own identity, two different operators for comparing references are introduced:
- The == operator checks whether two references are identical. When references to different aspects (or objects) are compared using this operator, logical false is obtained.
- The @= operator checks whether two references refer to the same object. When references to aspects of the same base object are compared, logical true is obtained. Comparing references to different objects (or to aspects of different base objects) returns false.

Put simply, the aspect mechanism extends the type system by introducing the concept of objects that can be extended with new components. The authors consider the inheritance concept to be orthogonal to type conformity (substitutability), and they do not use inheritance in their model [CHRSS90] to define aspects. They claim that in some cases an object hierarchy is not sufficiently flexible and is not fully capable of modeling objects that are irregular or change dynamically. The authors point to the problems of modeling complex real-world beings. A formalization of such beings requires creating and manipulating many objects with various properties, having e.g. different names and identifiers. A model with aspects ensures one common identity for a given object, but it also supports many independent views of this object.

4.3. Fibonacci Approach

Fibonacci is an object-oriented database programming language [ABGO93, AGO95]. It supports strong type checking with additional reinforcements:
- A distinction among object interface, type, and implementation; an implementation can be changed with no influence on the rest of the system, which is aware of the interface only;
- The possibility to define different implementations for one object type;
- A type hierarchy with multiple inheritance.

The presented mechanism of roles refers to the proposal of aspects [RS91].
It consists in extending an object with roles in such a manner that a given object can simultaneously acquire many roles, but may be accessed through only one of its roles at a time. The basic assumptions of the proposal of roles in Fibonacci are as follows:
- During the modeling of a system it is assumed that the real world consists of entities characterized by many structural and behavioral properties, which change over time.
- Entities can belong to different conceptual categories during their life. This means that they can play different roles, even more than one at the same time. However, some roles cannot exist simultaneously.
- The behavior of entities depends on the roles they play; therefore, the behavioral specifications of entities vary along with changes in the set of current roles.
- Objects are the computer representation of entities. An object is made up of a set of roles. A base object is a role too.
- An object can be accessed through one of its roles only, because messages are delivered to roles. Roles are first-category citizenship beings.
- The identity of an object does not vary while the roles of this object change.
- A subtyping relation, analogous to the inheritance relation, can be specified between types of roles. This concept defines subroles and superroles, as well as multiple inheritance, overriding, and inclusion polymorphism.
- The execution of a message depends on the role to which the message is addressed. A proper method is searched for among the methods defined within this role or its superroles, in accordance with the hierarchy of roles and with an additionally specified passing strategy;
- It is possible to execute a query that returns all current roles of a given object (role inspection), and to change dynamically the role that determines the way in which an object is accessed (role casting).
- Roles occurring on the same hierarchy level are independent of each other.
Access to roles is limited by the visibility scope to one chosen role (and its superroles) only.
- Deleting a given role entails the removal of all its subroles; however, this process has no influence on the superroles of the given role.

4.4. The Concept of Roles by Kristensen

The necessity of introducing the concept of a role in conceptual design, and a discussion of role features, are presented in B.B. Kristensen’s papers [Kristensen95, K96]. The concept extends objects with the possibility of attaching to them beings without their own identity, whose features are similar to those of regular objects. The following are recognized as the most important features of roles:
- A limited visibility scope and limited access to methods: the methods defined within a given role and in its main object are accessible, while the methods defined within other roles of this object are not;
- Dependency of a role on the existence of a main object; a role cannot exist without its main object. Methods defined within the main object are accessible to the methods defined within each of its roles. However, the definitions of the methods of an object cannot rely on the methods of its roles;
- Since an object with roles possesses only one identity, it can be processed as a single entity;
- Roles of a given object can be dynamically added and deleted with no change of its identity;
- At the same time, an object can have more than one instance of the same role;
- Roles can be classified and organized in a hierarchy by means of generalization and aggregation mechanisms;
- An extension of the association mechanism: associations can connect not only objects, but also objects with their roles, or roles with roles.

The features mentioned above have been discussed in the well-known terms of conceptual modeling. A role simulation has been proposed, which uses the generalization/specialization, aggregation, and association concepts. The possibility of specifying roles is not limited to a main object only.
Roles can be specified as subroles of other existing roles. They can be defined by the use of one of the abstraction mechanisms: generalization/specialization, or aggregation with extended semantics for role methods. The authors also introduce the notion of a local role, which is visible and can be manipulated within the scope of its main object (or its superrole).

The graphical notation proposed for object-oriented analysis and design is capable of representing static as well as dynamic features of roles, including the possibility of defining the multiplicity of roles and implicit conditions of role occurrence (which determine or impose constraints on the existence of certain roles depending on the existence of other roles).

4.5. Prototypes

The previous sections discuss inheritance and membership relationships, which are related to different kinds of availability or accessibility of properties that can be accessed in some beings but are defined in others. This section presents another mechanism of accessibility among objects, called prototyping. That a given object serves as a prototype means that its properties can be available to certain other objects. In the object-oriented paradigm with prototypes, any object can become a prototype. This notion can be understood both in the sense that an object represents a pattern for a collection of objects, and in the sense that information from a prototype is available to some objects. Prototypes may be used to construct new objects by cloning, i.e., to create (and further modify) an object filled with default values. In other situations, attribute values are not copied from prototypes into new objects, but are rather imported (made available). In this way, since only objects and special references (links) from objects to their prototypes are required, uniformity and minimality of means are achieved. Links between objects determine a transitive relation.
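The difference between cloning and importing can be made concrete with a small Python sketch. It is an illustration under assumptions rather than a description of any of the cited systems: the class ProtoObject, its clone method and the tutor object are hypothetical, and the person, student and employee values are illustrative.

```python
class ProtoObject:
    """An object with own slots and an optional link to a prototype."""

    def __init__(self, prototype=None, **slots):
        self.prototype = prototype
        self.__dict__.update(slots)

    def __getattr__(self, name):
        # Called only when the object has no own slot of that name:
        # the property is imported from the prototype, not copied.
        prototype = self.__dict__.get("prototype")
        if prototype is None:
            raise AttributeError(name)
        return getattr(prototype, name)

    def clone(self, **overrides):
        """Create a copy filled with the current (default) values."""
        slots = {k: v for k, v in vars(self).items() if k != "prototype"}
        slots.update(overrides)
        return ProtoObject(self.prototype, **slots)


person = ProtoObject(name="Smith", date_of_birth="1966.02.03")
student = ProtoObject(person, university="UW", scholarship=850)
employee = ProtoObject(person, salary=2745)
tutor = ProtoObject(student, course="Databases")   # hypothetical extra level

print(student.name)          # imported from person
print(tutor.name)            # links are transitive: tutor -> student -> person

person.name = "Smith-Jones"  # imported values follow the prototype
print(employee.name)

copy = student.clone(scholarship=900)   # cloned values are independent copies
print(copy.university, copy.scholarship, student.scholarship)
```

Because student and employee both link to the same person object, they behave much like two roles of one person, which is the connection to roles drawn in this section.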
Objects can be related to each other by a hierarchy, which organizes the availability of information from higher levels of the hierarchy (similarly to a class hierarchy). Generally, no constraints are assumed on links between objects, although prototypes can obviously be restricted to objects that are predetermined in advance by a designer or a programmer.

The concept of prototypes is universal and allows modeling various situations occurring within an object-oriented data structure, in particular class, inheritance, multiple inheritance, and roles. The proposal of viewers [SMRSW93] goes further in this direction by assuming that references leading to object prototypes can additionally be supplied with filters, which establish what should be imported. The concept of viewers also assumes that imported properties are indistinguishable from own properties in programming constructions (and in a query language). Therefore, many situations occurring in object data structures, e.g. views or object versions, can be modeled simply.

[Fig. 4.1 An example of prototyping: a PERSON object (Name: Smith, Date_of_birth: 1966.02.03, Tel. No.: 966498237, Age(): ...) with STUDENT (University: UW, Album_No.: 3236457, Scholarship: 850) and EMPLOYEE (Salary: 2745) objects]

On the other hand, the lack of classes, which are a specialized means, may lead to difficulties in conceptual modeling and programming. There is no need for the model to support both prototypes and classes, considering their conceptual differences. Prototypes will be mentioned again when the new approach to dynamic roles is introduced.

4.6. Subtables in SQL99

A new standard of the SQL¹⁷ language, called SQL99 [ANSI99], introduces the concept of subtables, which shows some convergence with the concept of dynamic roles. The mechanism of subtables is orthogonal to the concepts of ADTs, subtypes and inheritance. A subtable possesses (inherits) all columns of a given table and can define its own columns.
A maximal table (supertable) together with all its subtables forms a subtable family. Each subtable row (record) must be related to exactly one supertable row, but each supertable row can be related to at most one row of a given subtable. The operators insert, delete and update are defined coherently and can be performed on a whole subtable family. Example declarations of a table and its subtables are as follows:

    CREATE TABLE persons ( name CHAR(30), birth_year DATE );
    CREATE TABLE employees UNDER persons ( salary FLOAT );
    CREATE TABLE students UNDER persons ( student_no INTEGER );

Footnote 17: Structured Query Language

Fig. 4.2 A relation between a table (persons) and a subtable (employees)

The first statement creates the persons table with the columns (attributes) name and birth_year. The next one creates the employees subtable, which possesses all the attributes mentioned above and additionally the salary column. The third statement creates the students subtable, which additionally has the student_no column. Each subtable row shares a certain part with one of the table rows, but a subtable usually has fewer rows than the table. The relationship between a table and its subtable is illustrated in Fig. 4.2.

It is worth noting that every table row can be treated as an object, whereas the subtable values that extend beyond the table can be treated as a role of this object. As in the concept of roles, each such row is identified by the table name together with the names of the subtables that possess the row. These names may be used in select statements. For example in Fig. 13, each row can be identified by person name, and some of them can be identified by employee and customer names too. Obviously, a person can be both an employee and a customer at the same time, i.e. the representation of a given person can simultaneously participate in two or more subtables. Analogously, a single table as well as each of its subtables can be used independently in SQL queries.
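The row-level constraints of a subtable family can be tried out with the following sketch. It is an emulation, not SQL99: SQLite (used here only because it ships with Python) does not implement the UNDER clause, so each subtable is represented by an ordinary table whose primary key is also a foreign key to persons. The id column and the simplified column types are assumptions added for the emulation; the sample rows are borrowed from Fig. 5.2.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE
con.executescript("""
CREATE TABLE persons (
    id INTEGER PRIMARY KEY,
    name TEXT,
    birth_year INTEGER
);
-- The primary key doubles as the foreign key: a person has at most one
-- employees row, and every employees row belongs to exactly one person.
CREATE TABLE employees (
    id INTEGER PRIMARY KEY REFERENCES persons(id) ON DELETE CASCADE,
    salary REAL
);
CREATE TABLE students (
    id INTEGER PRIMARY KEY REFERENCES persons(id) ON DELETE CASCADE,
    student_no INTEGER
);
""")
con.executemany("INSERT INTO persons VALUES (?, ?, ?)",
                [(1, "Doe", 1948), (2, "Brown", 1975), (3, "Smith", 1951)])
con.executemany("INSERT INTO employees VALUES (?, ?)", [(2, 2500), (3, 1500)])
con.execute("INSERT INTO students VALUES (?, ?)", (3, 223344))

# A query on the "subtable" sees the inherited columns through a join.
employees = con.execute(
    "SELECT p.name, e.salary FROM employees AS e "
    "JOIN persons AS p ON p.id = e.id ORDER BY p.name"
).fetchall()

# One person can participate in two subtables at the same time.
both = con.execute(
    "SELECT p.name FROM persons AS p "
    "JOIN employees AS e ON e.id = p.id "
    "JOIN students AS s ON s.id = p.id"
).fetchall()

# Deleting the supertable row removes the related subtable rows as well.
con.execute("DELETE FROM persons WHERE id = 3")
remaining = (
    con.execute("SELECT COUNT(*) FROM employees").fetchone()[0],
    con.execute("SELECT COUNT(*) FROM students").fetchone()[0],
)
```

In a system that implements SQL99 subtables, the joins would be unnecessary, because a subtable row already carries the inherited columns.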
The independent use of tables and subtables in queries agrees largely with the treatment of roles in object-oriented query languages, which will be presented in later sections of this work.

4.7. Roles Realized by Design Patterns

In the early nineties, the high complexity of information system design, the variety and particularity of solutions, and the needs and benefits of reuse led to the emergence of design patterns. Design patterns [GHJV94, K96, BRSW97, Fowler97] systematize design methods through, among others:
- introduction of a common vocabulary for the design domain;
- reduction of system complexity, based on correct identification and application of standard constructions occurring in the design process;
- accurate classification of methods and constructions related to reuse.

Design patterns are defined in object-oriented terms at a level of abstraction that ensures independence from implementation details and a wide range of applications. A design pattern mainly consists of a context, i.e. a description of the cases in which it can be applied, a discussion of the problem, and the goal of its application. A solution is presented in a well-defined notation, using widely known design principles (drawn from the vocabulary of the design domain).

Roles are the subject of several design patterns, e.g. decorator [K96, BRSW97, Fowler97] and extension [GHJV94], where objects with roles are represented by certain complex data structures and operations. These structures are realized by patterned fragments of program code, mostly written in C++ and Java. The main advantage of this approach to dynamic roles is that existing methodologies, notations and programming languages (e.g. OMT, UML, C++) can be used. On the other hand, representing roles by design patterns is neither a simple nor a universal solution.
For instance, [Fowler97] discusses several design patterns for representing roles:

Pattern name: Single Role Type
General characteristic: Properties of all roles are defined within just one class.
Application type: When roles are very similar.

Pattern name: Separate Role Type
General characteristic: Each role is defined within a separate class.
Application type: When roles have few common properties or none at all.

Pattern name: Role Subtype
General characteristic: Common properties of roles are defined within a superclass, while specific properties of certain roles are defined within their proper subclasses.
Application type: When roles have some common properties, but are essentially different from each other. Potential combinations and migration of roles are strictly specified. Changes or new roles seldom occur.

Pattern name: Role Objects
General characteristic: An object with common properties is connected with many separate objects, which contain the specific properties of each role.
Application type: When many roles exist that can change often, or when new roles need to be defined very often.

Pattern name: Role Relationship
General characteristic: A relationship between a given object and another object is considered a role.
Application type: When many instances of the same role can appear, or when the history of role changes for a given object during a certain period has to be stored.

Fig. 4.3 A class diagram for the Role Object pattern (taken from [BRSW97]). Classes shown: Component (operation(), addRole(Spec), hasRole(Spec), removeRole(Spec), getRole(Spec)); ComponentCore (state, roles); ComponentRole (core; operation = core->operation(), hasRole = core->hasRole()); ConcreteRoleA (addedStateA, addedBehaviorA()); ConcreteRoleB (addedStateB, addedBehaviorB()).

The choice of a particular design pattern depends on the specifics of a given system project. The choice is often ambiguous and depends on the designers' experience and intuition. Patterns with a simple architecture (Single Role Type, Separate Role Type) offer rather little flexibility.
High flexibility can be achieved by applying design patterns whose constructions are much more complicated (e.g. Role Object) and which are therefore more difficult to implement. Fig. 4.3 presents the class diagram for the Role Object pattern (footnote 18), discussed in [BRSW97]. The Component class includes an interface specification of common role properties (operation()) and also specifies the operations available for a given object with roles (addRole(Spec), hasRole(Spec), removeRole(Spec), getRole(Spec)). The implementation of this interface is placed in the ComponentCore class, whose purpose is to create ConcreteRole instances and manipulate them. The ComponentRole class implements the interface of the Component class by passing messages to the ComponentCore class. Properties specific to a given role are implemented within the ConcreteRoleA and ConcreteRoleB classes.

Footnote 18: This pattern belongs to a (more general) design pattern category, named decorator.

The abstract pattern framework briefly presented above has to be significantly extended at the implementation stage. It should hide the internal pattern structure and provide users with mechanisms that are semantically consistent with the role concept. The essence of roles should be a simplification of conceptual modeling in many situations, supported by some simple mechanisms in query and programming languages. This, however, is a weak point of the idea. Although design patterns describe data structures precisely enough to implement roles, these structures do not define manipulation tools and do not support programming with the use of design patterns. The behavioral specification can be implemented through a set of methods defined in the classes of a design pattern, but this set is usually rather large. The set of methods can be limited for a concrete application, but then the pattern is no longer fully generic and universal.
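The division of labour among the classes of Fig. 4.3 can be illustrated by the following minimal sketch. It is an assumption-laden reading of the diagram, not the code of [BRSW97]: method names are rendered in Python style (add_role for addRole, and so on), a role specification Spec is reduced to a plain string, and the abstract Component interface is omitted for brevity.

```python
class ComponentCore:
    """Holds the common state and manages the role objects."""

    def __init__(self, state):
        self.state = state
        self._roles = {}  # spec (here: a role name) -> role instance

    def operation(self):
        return f"core({self.state})"

    def add_role(self, spec, role_cls, **added_state):
        # The core creates the concrete role and keeps track of it.
        self._roles[spec] = role_cls(self, **added_state)
        return self._roles[spec]

    def has_role(self, spec):
        return spec in self._roles

    def get_role(self, spec):
        return self._roles[spec]

    def remove_role(self, spec):
        del self._roles[spec]


class ComponentRole:
    """Base class of roles: forwards the common interface to the core."""

    def __init__(self, core):
        self.core = core

    def operation(self):
        return self.core.operation()

    def has_role(self, spec):
        return self.core.has_role(spec)


class ConcreteRoleA(ComponentRole):
    """A role with its own added state and behavior."""

    def __init__(self, core, added_state_a):
        super().__init__(core)
        self.added_state_a = added_state_a

    def added_behavior_a(self):
        return f"A:{self.added_state_a}"


core = ComponentCore("Smith")
role = core.add_role("A", ConcreteRoleA, added_state_a=1500)
forwarded = role.operation()        # answered by the core
specific = role.added_behavior_a()  # answered by the role itself
core.remove_role("A")
```

Even in this reduced form the client has to know about add_role, get_role and the forwarding structure, which illustrates why the pattern does not by itself provide role-aware query or programming constructs.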
These limitations lead to the view that design patterns are not a satisfactory solution, because they simplify conceptual modeling only indirectly, through the construction of proper data structures. In the opinion of many researchers (and of the author of this study), design patterns support conceptual modeling only partially, mainly at the design and implementation stages, but not during analysis. This can cause inconsistency in the project and difficulties in carrying results from one stage to another, for instance from analysis to design.

5. Dynamic Object Roles in Stack-Based Approach

The chapter is organized as follows. Chapter 5.1 presents the concept of dynamic object role. Chapter 5.2 introduces an object model with roles. Chapters 5.3 and 5.4 discuss the differences between our model and the traditional object models introduced in programming languages and database management systems. Chapter 5.5 presents the assumptions of a query language in the spirit of ODMG OQL dealing with roles.

5.1. The Concept of Dynamic Object Role

The idea of dynamic object roles is simple and natural. It assumes that every real or abstract entity can acquire and lose many roles during its life without changing its identity. Roles appear during the life of a given object, they can exist simultaneously, and they can disappear at any moment. For example, a certain person can be a student, a worker, a patient and a club member at the same time; see Fig. 5.1. Similarly, a building can be an office, a house, a warehouse, etc.

Fig. 5.1 Roles played by a person. The figure shows a Person object with the roles Employee, Patient, Student, Club-member, Student, Tax-payer and Dog-owner.

Present object models can express static properties, e.g., the fact that a student is a person. However, it is more precise to say that a person becomes a student for some time and later terminates the student role. Moreover, a person can be a student two or more times at the same time.
Similarly, a person may become an employee, a patient, etc. only for some time.

The concept of dynamic object roles assumes that an object is associated with other objects (subobjects), which model its roles. Object-roles cannot exist without their parent object (in Fig. 5.1, without the Person object). Deleting an object causes deletion of all its roles. Roles can exist simultaneously and independently. A role can have its own additional attributes and methods. It is normal for two roles to contain attributes and methods with the same names, and this does not lead to any conflict. This is a fundamental difference in comparison to the concept of multiple inheritance.

Relationships (associations) can connect not only objects with objects, but also objects with roles and roles with roles. For example, a relationship works_in connects an Employee role with a Company object. This makes the referential semantics clean in comparison to the traditional object models. Roles can be further specialized as subroles, sub-subroles, etc. For example, a specialization of the role Club_Member can be the role Club_President.

The role concept requires introducing composite objects with a special structure and semantics. The structure should be supported by proper generic operations. In this paper we describe the structure formally and present the assumptions of a query/programming language supporting generic operations to process such structures. Our idea for dealing with dynamic roles in a query language is based on the stack-based approach (see e.g. [SBMS94, SKL95]), probably the only current formalism that can naturally accommodate the concept. A version of roles was implemented in the prototype system Loqis [SMSRW93].

5.2. Object Store Model with Dynamic Roles

In our approach, we assume that an object can conceptually contain many subobjects called roles. These subobjects can be inserted and removed at run time, as in [RS91, ABGO93].
Roles cannot exist without their parent objects. Deletion of an object causes deletion of all its roles. Roles can possess different types and can exist simultaneously and independently. A role can possess its own attributes and behavior (i.e. methods, rules, events, etc.). Two roles stored within the same object may have attributes or methods with the same name. Identical names in two or more roles of different types do not imply any semantic dependency between the corresponding properties. For example, a person can simultaneously play the role of an employee of a research institute with the attribute Salary, and the role of an employee of a service company, also with the attribute Salary. These two attributes exist at the same time, but apart from the name no other feature is shared, including types, semantics and business ontologies. A role dynamically "imports" attributes (values) and behavior from its super-roles, in particular from its parent object.

Fig. 5.2 Objects with their roles and classes. The figure shows three classes: PersonClass (Name, BirthYear, Age()), EmployeeClass (Salary, Job, NetSalary(), ChangeSalary(..)) and StudentClass (Semester, StudentNo, NewScore(...), AvgScore()); and four Person objects with roles: Doe (BirthYear 1948), Brown (BirthYear 1975; role Employee with Salary 2500, Job analyst), Smith (BirthYear 1951; role Employee with Salary 1500, Job clerk, and role Student with Semester 7, StudentNo 223344) and Jones (BirthYear 1940; role Student with Semester 4, StudentNo 556677). The links works_in, studies_at and is_a_customer_of connect them with the objects Company (Name Bank) and School (Name NYA, Name MLI).

In Fig. 5.2 we present an example showing the basic features of the store model with dynamic roles. The following features are presented: An object (shown as a grey rectangle with round corners) has one main role (Person) and any number of specializing roles (Employee and Student). Each role has its own name, which can be used to bind the role from a program or a query.
The presented objects can be bound through the name Person (each of them), through the name Employee (the second and third) and through the name Student (the third and fourth). Each binding returns the identifier of the proper role (or the identifiers of the proper roles in the case of multi-valued bindings). Each role is encapsulated, i.e. its properties are not seen from other roles unless this is explicitly stated by a special link (shown as a double line with a black diamond end). In particular, the role Employee imports all properties of its parent role Person. For example, if the second object is bound by the name Person, then the properties {Name Brown, BirthYear 1975} are available; however, if the same object is bound by the name Employee, then the properties {Salary 2500, Job analyst, Name Brown, BirthYear 1975} are available. Each role is connected to its own class. The connection is shown as a thick grey continuous arrow. Classes contain invariant properties of the corresponding roles, in particular names (first section), attributes and their types (second section; attribute types are not shown) and methods (third section).

5.2.1. Links among Objects and Roles

Links can join not only objects with objects, but also objects with roles and roles with roles. For example, a link works_in joins an object Company with a role Employee; see Fig. 5.2. A similar link studies_at joins a role Student with an object School. If such a link leads to Employee, it indirectly leads to Person, because the role Employee imports the properties of its parent object Person. However, after accessing the object via such a link, the properties of the role Student remain invisible. It follows that a role identifier must be different from the identifier of the corresponding object. Fig. 5.2 also shows a link is_a_customer_of between the objects Person and Company. Accessing the object Person through this link implies that every role of the object Person remains invisible.
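The binding and visibility rules described for Fig. 5.2 can be made concrete with a small sketch. It is only an illustration: the class Role and the functions all_roles and bind are hypothetical names, and a link is modeled simply as a Python reference to the role it leads to. The data are those of Fig. 5.2.

```python
class Role:
    def __init__(self, name, parent=None, **props):
        self.name = name
        self.parent = parent      # super-role whose properties are imported
        self.props = dict(props)  # properties owned by this role
        self.subroles = []
        if parent is not None:
            parent.subroles.append(self)

    def visible(self):
        """Own properties plus everything imported from the super-roles."""
        imported = self.parent.visible() if self.parent else {}
        return {**imported, **self.props}


def all_roles(main_roles):
    """Yield every role of every object, starting from the main roles."""
    stack = list(main_roles)
    while stack:
        role = stack.pop()
        yield role
        stack.extend(role.subroles)


def bind(main_roles, name):
    """Binding a name returns roles (not whole objects) carrying that name."""
    return [r for r in all_roles(main_roles) if r.name == name]


doe = Role("Person", Name="Doe", BirthYear=1948)
brown = Role("Person", Name="Brown", BirthYear=1975)
smith = Role("Person", Name="Smith", BirthYear=1951)
jones = Role("Person", Name="Jones", BirthYear=1940)
brown_emp = Role("Employee", brown, Salary=2500, Job="analyst")
smith_emp = Role("Employee", smith, Salary=1500, Job="clerk")
smith_stud = Role("Student", smith, Semester=7, StudentNo=223344)
jones_stud = Role("Student", jones, Semester=4, StudentNo=556677)
people = [doe, brown, smith, jones]

# A link that leads to the Employee role of Smith: Person is visible, Student is not.
via_employee_link = smith_emp.visible()
# A link such as is_a_customer_of leads to the main role: no other role is visible.
via_person_link = smith.visible()
```

Because a link stores a reference to a particular role, the set of visible properties depends on which role the link leads to, which is the reason why role identifiers must differ from the identifier of the object.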
The possibility of creating links between roles is a new quality for analysis and design methodologies and notations (such as UML). Links must lead to parts of objects, not to entire objects. To model this situation the methodologies suggest using aggregation/composition. Such an approach implicitly assumes that e.g. an Employee is a part of a Person on the same principle as an Engine is a part of a Car. Although the approach achieves the goal (e.g. we can connect the relationship works_in directly to the Employee sub-object of Person), it obviously misuses the concept of aggregation, which is normally intended for modeling "whole-part" situations.

Design patterns [KØ96, BRSW00, Fowler97, GHJV95, RG98], which are proposed to deal with dynamic roles, offer a limited solution. Design patterns map conceptual structures with roles onto structures without roles, but the resulting data structures must be processed by classical programming constructs, which need not be (and usually are not) structured in a way reflecting the initial design concept. Thus, the initial idea of the designer is lost. Moreover, the reverse mapping, from the code into conceptual structures with roles, is problematic in the majority of cases. In our opinion, the only radical cure for these drawbacks is to introduce dynamic roles explicitly within design methodologies and implementation environments.

5.2.2. Dynamic Roles - a Formal Model of an Object Store

In the following, we present formal models of an object-oriented data store without and with roles and then compare them. For the sake of simplicity and to keep the semantics clean, we assume object relativism (i.e. each property of an object is an object too) and consider all properties of the store (including classes) first-class citizens. The store models a program/database state and thus does not involve types, which we consider a checking utility rather than a "materialized" property of the state.
Formally, an object is a triple <i, n, v>, where i is a unique internal object identifier, n is an external object name, and v is an object value. The value can be atomic (e.g. "Doe"), can be a reference to another object, or can be a set of objects. Classes and methods are objects too. The classical object store model (i.e. a formal approximation of the object models of popular object-oriented systems) is presented as a 5-tuple <O, C, R, CC, OC>, where:
- O is a collection of (nested) objects,
- C is a collection of classes,
- R is a collection of root identifiers (of objects being entries of the store),
- CC is a binary relation determining the inheritance relationship,
- OC is a binary relation determining the membership of objects within classes.

In Fig. 5.3 we present a simple example of an object store built according to this definition. Although the model seems natural and formally simple, several of its features make it inconvenient, especially for defining a query language. The model results in anomalies with multiple, multiple-aspect and repeating inheritance. In the case of name conflicts it inevitably violates substitutability.

Moreover, the model implies some problems with binding. For instance, assume that a query contains a name Person, which has to be bound to the stored objects. According to substitutability, it should be bound not only to i1, but also to i4 and i9. However, these objects have the names Employee and StudentEmployee, hence the binding is not straightforward. The binding rules must take into account the information that PersonClass defines member objects named Person, that PersonClass is specialized by EmployeeClass, StudentClass and StudentEmployeeClass, and that objects of these classes, independently of their current name, should be bound to the name Person. This information is not present in the given store model and must be introduced by some additional features.
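The binding problem can be reproduced with the identifiers of Fig. 5.3. The sketch below is illustrative only: the dictionaries are a simplified encoding of the relations OC and CC, and member_name is exactly the kind of additional information that the 5-tuple does not contain (it is an assumption added here, including the member name Student for StudentClass).

```python
# Names of the stored root objects and their class membership (relation OC).
object_names = {"i1": "Person", "i4": "Employee", "i9": "StudentEmployee"}
OC = {"i1": "i40", "i4": "i50", "i9": "i70"}

# Inheritance between classes (relation CC): subclass -> direct superclasses.
CC = {"i50": ["i40"], "i60": ["i40"], "i70": ["i50", "i60"]}

# Extra information that is NOT part of the 5-tuple: the object name
# that each class defines for its members.
member_name = {
    "i40": "Person",
    "i50": "Employee",
    "i60": "Student",
    "i70": "StudentEmployee",
}


def bind_by_name(name):
    """Naive binding: compare the query name with the stored object names."""
    return sorted(i for i, n in object_names.items() if n == name)


def superclasses(cls):
    """The class itself together with all its direct and indirect superclasses."""
    result = {cls}
    for sup in CC.get(cls, []):
        result |= superclasses(sup)
    return result


def bind_substitutable(name):
    """Binding that respects substitutability; it needs OC, CC and member_name."""
    return sorted(
        i for i, cls in OC.items()
        if any(member_name[c] == name for c in superclasses(cls))
    )


naive = bind_by_name("Person")
expected = bind_substitutable("Person")
```

The naive binding finds only i1, whereas the binding required by substitutability returns i1, i4 and i9, and it can be computed only with the help of the extra mapping.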
One such additional feature appears in the ODMG standard, which implicitly assumes that the name of a class interface becomes the name of the corresponding objects. Still, this issue is very unclear in the standard and defined inconsistently [PK00]. The issue becomes even more problematic in the case of weakly typed systems and/or dynamic bindings. A similar problem concerns the semantics of references (in general, of properties specific to a given subclass). For instance, if one accesses the objects Employee through the name Person, then the references works_in should not be accessible. Again, this may be a problem in weakly typed systems.

O - Objects:
    < i1, Person, { < i2, Name, "Doe" >, < i3, BirthYear, 1948 > } >
    < i4, Employee, { < i5, Name, "Brown" >, < i6, BirthYear, 1975 >, < i7, Salary, 2500 >, < i8, works_in, i127 > } >
    < i9, StudentEmployee, { < i10, Name, "Smith" >, < i11, BirthYear, 1951 >, < i12, StudentNo, 223344 >, < i13, Faculty, "Physics" >, < i14, Salary, 1500 >, < i15, works_in, i128 > } >
    .....
C - Classes:
    < i40, PersonClass, { < i41, Age, (...code of the method Age...) >, ...other properties of the class PersonClass... } >
    < i50, EmployeeClass, { < i51, ChangeSalary, (...code of the method ChangeSalary...) >, < i52, NetSalary, (...code of the method NetSalary...) >, ...other properties of the class EmployeeClass... } >
    < i60, StudentClass, { < i61, AvgScore, (...code of the method AvgScore...) >, ...other properties of the class StudentClass... } >
    < i70, StudentEmployeeClass, { } >
    .....
R - Root identifiers: i1, i4, i9, ...
CC - Inheritance relationships between classes: < i50, i40 >, < i60, i40 >, < i70, i50 >, < i70, i60 >, ...
OC - Membership of objects in classes: < i1, i40 >, < i4, i50 >, < i9, i70 >, ...

Fig. 5.3 An example state of an object store in the classical object store model

To remove these disadvantages we propose another store model, with explicitly defined roles.
It is defined as a 6-tuple <O, C, R, CC, OC, OO>, where:
- O, C, R, CC, OC are defined as before, but O is a set of roles rather than a set of objects;
- OO is a binary relation determining the inheritance relationship between roles.

In our terminology, we must now distinguish objects and roles. An object may consist of many roles, and a role belongs to a single object. An object has exactly one main role. The name of this main role should reflect the semantics of the entire object. Deleting this role implies deleting the entire object. Any role can inherit dynamically from another role within the same object; inheritance among roles in different objects is forbidden. The inheritance is based on the same rule as inheritance of class properties. This kind of inheritance is known from prototype-based languages, for instance from Self. A role that inherits is sometimes called a sub-role; an inherited role is sometimes called a super-role.

The relation OO defines two functional aspects. On the one hand, it determines which roles are inherited by other roles. On the other hand, it fixes the semantics of manipulating objects with roles. In particular, copying an object implies "isomorphic" copying of all its roles, and deleting an object implies deleting all its roles. Deleting a role implies recursive deletion of all its sub-roles, but of course not of its super-role. The relation OO is a pure hierarchy (each role has at most one super-role; no cycles). In the figures, OO is shown as a double line with a black diamond end.

In Fig. 5.4 we present an example of the object store from Fig. 5.3, built according to the new definition. In Fig. 5.5 we present the same object store graphically. The component CC (the relation determining inheritance between classes) is empty in this case. Each role directly inherits from its class.
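The role of the relation OO in binding and deletion can be sketched with the data of Fig. 5.4. This is a simplified illustration, not the formal model itself: role values are flattened to dictionaries, the works_in references and the class components C and OC are left out, and the function names bind, visible and delete_role are hypothetical.

```python
# O: roles as identifier -> (name, value); every identifier is also a root in R.
O = {
    "i1": ("Person", {"Name": "Doe", "BirthYear": 1948}),
    "i4": ("Person", {"Name": "Brown", "BirthYear": 1975}),
    "i7": ("Person", {"Name": "Smith", "BirthYear": 1951}),
    "i13": ("Employee", {"Salary": 2500}),
    "i16": ("Employee", {"Salary": 1500}),
    "i19": ("Student", {"StudentNo": 223344, "Faculty": "Physics"}),
}

# OO: sub-role -> super-role (a pure hierarchy within one object).
OO = {"i13": "i4", "i16": "i7", "i19": "i7"}


def bind(name):
    """Bind a name to the identifiers of all roles carrying that name."""
    return sorted(i for i, (n, _) in O.items() if n == name)


def visible(i):
    """Properties of role i plus those inherited along OO from its super-roles."""
    props = dict(visible(OO[i])) if i in OO else {}
    props.update(O[i][1])  # properties of the sub-role take precedence
    return props


def delete_role(i):
    """Delete a role and, recursively, all its sub-roles (never its super-role)."""
    for sub in [s for s, sup in OO.items() if sup == i]:
        delete_role(sub)
    OO.pop(i, None)
    del O[i]


persons = bind("Person")
smith_as_employee = visible("i16")

delete_role("i13")  # Brown stops being an employee; the Person role i4 stays
delete_role("i7")   # deleting the main role of Smith removes the whole object
```

After the two deletions, only the Person roles i1 and i4 remain in the store, which mirrors the rule that deleting a main role deletes the entire object while deleting a sub-role leaves its super-role intact.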
A role inherits the properties of its super-roles, hence it indirectly inherits the properties of the classes that these super-roles belong to. The model does not imply problems with multiple inheritance. Because each role is an independent encapsulated object, no name conflict is possible. The model also clearly shows the reason for multiple inheritance anomalies in classical object models: they are caused by the fact that properties of different (perhaps incompatible) classes are not encapsulated, but mixed up in a single environment.

The model does not imply the above-mentioned problems with binding either. Note that the identifier of each role belongs to the root identifiers R, which are the starting points for binding objects. Hence the name Person is bound to i1, i4, i7. The binding concerns exactly the Person roles; other roles are invisible. Similarly, the name Employee is bound to i13 and i16, but after this binding the corresponding roles Person (i4 for i13 and i7 for i16) become visible, according to the OO relation. Thus, for instance, Employee.Name and Employee.Age are correct expressions; the same holds when the name Student is bound.

O - Objects (roles):
    < i1, Person, { < i2, Name, "Doe" >, < i3, BirthYear, 1948 > } >
    < i4, Person, { < i5, Name, "Brown" >, < i6, BirthYear, 1975 > } >
    < i7, Person, { < i8, Name, "Smith" >, < i9, BirthYear, 1951 > } >
    < i13, Employee, { < i14, Salary, 2500 >, < i15, works_in, i127 > } >
    < i16, Employee, { < i17, Salary, 1500 >, < i18, works_in, i128 > } >
    < i19, Student, { < i20, StudentNo, 223344 >, < i21, Faculty, "Physics" > } >
    .....
C - Classes:
    < i40, PersonClass, { < i41, Age, (...code of the method Age...) >, ...other properties of the class PersonClass... } >
    < i50, EmployeeClass, { < i51, ChangeSalary, (...code of the method ChangeSalary...) >, < i52, NetSalary, (...code of the method NetSalary...) >, ...other properties of the class EmployeeClass... } >
    < i60, StudentClass, { < i61, AvgScore, (...code of the method AvgScore...) >, ...other properties of the class StudentClass... } >
    .....
R - Root identifiers: i1, i4, i7, i13, i16, i19, ...
CC - Inheritance relationships between classes: Empty.
OC - Membership of roles in classes: < i1, i40 >, < i4, i40 >, < i7, i40 >, < i13, i50 >, < i16, i50 >, < i19, i60 >, ...
OO - Inheritance between roles: < i13, i4 >, < i16, i7 >, < i19, i7 >, ...

Fig. 5.4 An example state of the object store in the object model with roles

The model is also consistent concerning references. For example, when the name Person is bound, the references works_in are unavailable, because the roles Employee are invisible. This property holds regardless of whether the system is strongly typed or untyped.

Fig. 5.5 Graphical representation of the object store from Fig. 5.4 (the classes i40, i50, i60, the roles i1, i4, i7, i13, i16, i19 and the referenced objects i127, i128, with the same content as listed in Fig. 5.4)

5.3. Dynamic Roles vs. Classical Object-Oriented Models

In this section we discuss several points that make the concept of dynamic roles different from the classical object-oriented models. Dynamic object roles have the potential to create new powerful features, which are difficult or impossible to achieve in the classical object model.

5.3.1. Multiple Inheritance

Because roles are encapsulated, there is no name conflict even if the superclasses have different properties with the same name; see the third object in Fig. 5.2. Notice that there is no class EmployeeStudentClass inheriting from both EmployeeClass and StudentClass.
In our opinion (see Chapter 3.7), the introduced model with dynamic roles is able to cover most situations in which multiple inheritance occurs.

5.3.2. Repeating Inheritance

It is normal for an object to have two or more roles with the same name; for instance, Brown can be an employee in two companies, with different Salary and Job. Such a feature cannot be expressed by the traditional inheritance or multiple inheritance concepts.

5.3.3. Multiple-Aspect Inheritance

A class can be specialized according to many aspects. For example, a vehicle can be specialized (1) according to its environment (ground, water, air) and (2) according to its drive (horse, motor, jet, etc.). Modeling tools, such as UML, cover this feature, but it is rather neglected in object-oriented programming and database models. One-aspect inheritance causes problems in conceptual modeling and as a rule leads to multiple inheritance. Roles are a viable concept for avoiding problems with this feature. An intuitive approach to replacing multiple-aspect inheritance with dynamic roles is presented in Chapter 6.5.

5.3.4. Temporal and Historical Properties

As shown in Chapter 6.3, dynamic object roles are very useful for temporal databases, since roles can represent any past facts concerning objects, e.g., the employment history through many Employee roles within one Person object (see Fig. 5.7). Without roles, historical objects entail difficult design problems, for instance if one wants to avoid redundancy, preserve the reuse of unchanged properties through standard inheritance, and avoid changing object identifiers.

5.3.5. Variants (Unions)

This feature, introduced, e.g., in C++, CORBA and ODMG object models, leads to many semantic and implementation problems. Some professionals argue that it is unnecessary, as it could be substituted by specialized classes.
However, if a given class can possess many properties with variants, then modeling this situation by specialized classes leads to a combinatorial explosion of classes (e.g. 5 properties with binary variants require 2^5 = 32 specialized classes). Dynamic object roles naturally avoid this problem. Each branch of a variant can be considered a role of an object. Thus in an object model with dynamic roles the concept of variants (unions) indeed becomes unnecessary.

5.3.6. Object Migration

Roles may appear and disappear at run time without changing the identifiers of other roles. In terms of classical object models this means that an object can change its classes without changing its identity. This feature can hardly be available in classical object models, especially in models where the binding of objects is static.

5.3.7. Referential Consistency

In the presented model relationships are connected to roles, not to entire objects; thus, e.g., it is impossible to refer to the Salary and Job of Smith when one navigates to his object from the object School. In classical object-oriented models this consistency is enforced by strong typing, but it may be problematic in untyped or weakly typed systems.

5.3.8. Overriding

Properties of super-roles can be overridden by identically named properties of sub-roles. We will show that the possibilities of overriding are extended in comparison to the classical object models. The precise overriding rules will be explained in Chapter 5.5.6.

5.3.9. Binding

An object can be bound by the name of any of its roles, but the binding returns the identifier of a role rather than the identifier of the object. By definition, the binding is dynamic, because in the general case it is impossible to decide during compilation whether a particular object has a role with a given name.

5.3.10. Typing

A role must be associated with a name, because this is the only feature allowing the programmer to distinguish one role from another. Hence, the name of a role must be
Hence, the name of a role must be 62 determined by its type (unlike classical programming languages, where a type usually does not determine the name of a corresponding object/variable). Because an object is seen through the names of its roles, it has as many types as it has different names for roles. In this work, we do not deal in details with typing issue; it requires further research. 5.3.11. Subtyping It can be defined as usual; for instance, the Employee type is defined with the use of the Person type. However, it makes little sense to introduce the StudentEmployee type (see the third object in Fig. 5.2). Due to encapsulated roles, there is no possibility to mix up properties of a Student object and properties of an Employee object within a single structure. 5.3.12. Substitutability It becomes problematic. On the one hand, subtyping seems to be defined as in the traditional model. On the other hand, since object names of roles are determined by types, it makes little sense to say, e.g. that the Employee type can be used in all places, where the Person type can be used. The Person type implies that an expected object has the name Person, not Employee.19 Thus, the substitutability principle has to be reformatted, at least. 5.3.13. Dynamic Inheritance The EmployeeClass do not inherit statically from the PersonClass. Instead, an Employee role inherits dynamically all the properties of its Person super-role, thus indirectly inherits properties of the PersonClass. Thus in Fig. 5.2 we have shown the inheritance between EmployeeClass and PersonClass as a dashed arrow, because we consider it a comment rather than a structural property. 19 At this stage, we do not attempt to discuss further consequences of this novelty for classical issues that the strong typing community deals with, i.e. polymorphic typing, generic programming, covariance/contravariance dilemma, etc. 63 5.3.14. 
Aspects of Objects and Heterogeneous Collections

A challenging problem with classical database object models such as ODMG [ODMG00] is that an object belongs to at most one collection. This contradicts both multiple inheritance and substitutability. For instance, we can include a StudentEmployee object in the extent of Students, but we cannot include it at the same time in the extent of Employees (and vice versa). This violates substitutability and leads to inconsistent processing. Dynamic roles have a natural ability to model heterogeneous collections: an object is automatically included in as many collections as it has roles. In the classical object-oriented model this would require introducing a superclass over such classes, with all the necessary attributes and methods.

5.3.15. Aspect-Oriented Programming and Separation of Concerns

AOP [KLMM+97] makes it possible to encapsulate cross-cutting concerns, such as history of changes, security and privacy rules, visualization, synchronization, and so on, within separate modules. Since such concerns can be represented as roles of objects, dynamic object roles have conceptual similarities with AOP or can be considered a technical facility supporting AOP.

5.3.16. Meta-Data Support

Meta-data are a particular case of cross-cutting concerns. Meta-data of the kind considered in Dublin Core [DCMI] or W3C RDF [W3CRDF], e.g. authorship, validity, legal status, ownership, or coding, can be implemented as dynamic roles of information objects.

5.4. Specification of Dynamic Roles in Database Schemata

A standard for object-oriented databases has been proposed by ODMG [ODMG00]. It is a very important contribution to the field, but (as with any pioneering development) it is not yet sufficiently consistent, precise and complete (see e.g. [Alag97, Subieta97]). We consider it a good starting point for further research. In this paper, we adopt the syntax of ODMG ODL (based on IDL CORBA [OMG95]); however, we will be more precise concerning the defined concepts and their semantics.

5.4.1.
Concepts for Building Database Schemata

With respect to database schemata, object-oriented models and the corresponding object definition languages deal with the following concepts: classes, types, interfaces, abstract data types, and declarations of stored structures. However, there is a high degree of confusion about what these concepts actually mean. Below we briefly present our view on these concepts, which could be the basis for developing the concrete syntax and semantics of a data definition language for the object model with roles.

Classes: They are implementation units storing invariant properties of their members (objects). The invariant properties are usually reduced to names and types of the member's attributes and to methods that can be executed on the member. Other kinds of invariant properties include: a name assigned to a class member, specification of events/exceptions, implementation of reactions to events/exceptions, implementation of integrity constraints concerning the member, specification of exported properties of objects (public properties), specification of imported properties (active and passive side effects), etc. Classes, in contrast to types and interfaces, contain the entire implementation of objects and hence can be subjects of marketing activities (ownership, copyrights, selling/buying, etc.).

Types: They are constraints on the structure of objects and specifications of input/output properties of procedures, functions and methods. The constraints restrict the context of use of the corresponding entities within queries, expressions or programs. Another function of types is determining the computer representation of values. A type should not be confused with a class. Types do not determine implementation, hence they have no market value. A type could be an invariant of a class (it can constrain member objects of the class), but in general untyped or weakly typed classes are also possible.
An important role of type names is conceptual modeling: the name of a type frequently bears informal data semantics.

Interfaces: In general, interfaces allow the programmer to treat classes/objects as black boxes, and thus should bear all the information that is necessary to deal with objects, in particular:
- Names of objects to identify them in queries or programs;
- Exported properties (names and types of public attributes and methods);
- Constraints on objects, on parameters of methods, on output of methods, etc.;
- Events/exceptions that can be raised during execution of methods;
- Events/exceptions that can be captured by objects (which trigger some actions on objects);
- External properties which can be interrogated by methods (specification of imported elements of the program, database or computer environment);
- External properties that can be changed by methods (side effects of methods).

Interfaces defined in well-known languages and standards (CORBA, COM/DCOM, ODMG and Java) have only some of these functions. Interfaces should not be confused with types, although typing information is the main component of interfaces. If a particular system defines the class concept, then multiple interfaces to a single object are possible (COM/DCOM, Java). Otherwise, if the class concept is undefined, then the model does not contain information on whether two interfaces concern the same or different objects (CORBA). In the case of a model with roles, each role of an object presents its particular interface; an object has as many interfaces as role names. Additionally, we can assume there are no other interfaces: each defined interface determines a role of an object.

Abstract data types: There is no agreement on what this concept actually means. In the popular explanation, it encapsulates a data structure by hiding its interior, which is accessible through defined operations only. In various sources the concept varies between ordinary types, interfaces and classes.
We consider this concept a synonym of an interface to class members; hence in our terminology we will not use this term.

Declarations of stored data structures: This is the major goal of a database schema. In programming languages, the declarations of stored data structures (e.g. variables) are usually separated from the declarations of types, classes or interfaces. One can declare many variables of the same type. In contrast, in SQL the declaration of a table (i.e. the SQL “create table” statement) is joined with the declaration of its type; there is no possibility to declare the type of a table and then to declare any number of tables of this type. This philosophy is adopted by the ODMG standard, which assumes declarations of extents associated with declarations of interfaces. On the contrary, SQL3 and SQL1999 have abandoned this earlier SQL property, taking the approach in which declarations of types and declarations of the corresponding data structures can exist independently.

Modules: Traditionally (for instance, in Modula-2) a module is an encapsulated trading unit (to be exchanged, replaced, bought, sold, etc.) and a unit of compilation. In a slightly different sense, modules are encapsulated containers storing objects, classes, procedures, types, etc. Modules should possess a strongly defined interface, which (in Modula-2) contains export and import lists. If one assumes dynamic linking and the possibility of dynamically inserting new properties into modules, then the trading function of modules is the only essential one. In this sense, modules are regular objects, only with an additional flag stating that they are consistent encapsulated trading units. This flag has no influence on the general semantic properties of such an object. For this reason, we do not use this concept in the following.

The popular object models usually avoid an extensive number of concepts by merging several of their roles into one concept.
In particular, CORBA does not introduce classes and declarations of stored data structures. In ODMG, declarations of stored data structures are determined within ODMG classes[20] as “extent” clauses. Both CORBA and ODMG make no difference between types and interfaces. In some proposals, classes are treated as synonyms of types. In C++, classes have explicit qualification of privacy of attributes and methods, thus the C++ class concept has some functions of an interface. The same holds for Eiffel, whose classes explicitly introduce export lists.

[20] The term “class” is specifically understood by ODMG; it does not correspond to the definition presented above.

5.4.2. Naming Issues

A further problem with the above notions concerns naming conventions. In programming languages and in IDL CORBA it is usually assumed that classes, types or interfaces do not determine the names of the corresponding objects. The names are chosen by the programmer and have second-class citizenship. In SQL-based systems, names of relations, names of attributes, names of views, etc. are determined in a database schema and have first-class citizenship. ODMG assumes that the name of an interface becomes the name of the corresponding objects; this implicitly follows from some examples. The standard explicitly introduces the name of an extent (understood as a regular collection) within an interface. Unfortunately, this is inconsistent [PK00], especially in the context of inheritance among interfaces and the possibility of defining collections having objects of different types.

In general, the naming of the above-mentioned entities and the binding of names occurring in queries/programs have not received sufficient attention in existing proposals. In particular, it is necessary to distinguish the name of a class from the name of its member objects, the name of a type from the name of an object of this type, and so on. The issue is important, in particular, for metadata management (discussed in the following).
The distinction can be accomplished by some naming rule (e.g. Class as a suffix, for instance EmployeeClass) or by special statements or operators dealing with particular kinds of entities (e.g. delete class Employee vs. delete object Employee).

5.4.3. A Sample Construction of an Object Schema with Dynamic Roles

We assume the basic syntax of IDL CORBA and ODMG ODL. Other features of our definitions are different. We will follow the top-down approach of object-orientation, in which classes are reusable units that can be further specialized by subclasses. This is sometimes called the open-closed principle: classes are open for specializations or extensions, and closed for modifications. In terms of dynamic roles, a class after development can be closed for modifications, but it can be further extended by new, ad hoc defined dynamic roles. In the classical object-oriented terminology, roles correspond to subclasses of a given class.

A class is an implementation unit registered in the object store (on a server) by a system or database administrator. Each class has a unique name (AbstractPersonClass, PersonClass, StudentClass, EmployeeClass and PersonCollectionClass). The registration of a class means the following actions:
- The full code of the class is introduced as a single object to the object store. Methods and other properties of the class are introduced as subobjects of this object;
- Meta-information concerning the class (class name, class location, ownership, comments, date of input, last modification, status, etc.) is introduced to the corresponding structure of the catalog;
- The entire interface to the class is introduced to the catalogs. The name of the interface is identical to the name of the class. The interface specifies all public properties of the class. The interface is used in the following to build specialized interfaces (views) to the class properties.
The interface can be introduced manually by a person registering the class, or automatically, by a special registration utility. In the second case, the utility can use special keywords and language constructs in the class code, such as keyword “public”, signatures of methods, and others.

    interface AbstractPersonInterface to AbstractPersonClass {
        attribute string Name;
        attribute date BirthYear;
        method unsigned integer Age( ) rises event BirthYearIsInvalid;
    }

    interface PersonInterface to PersonClass {
        inherits from AbstractPersonInterface;
        object name Person;
    }

    interface EmployeeInterface to EmployeeClass {
        mandatory role of PersonInterface;
        object name Employee;
        attribute real Salary;
        attribute set of string Job;
        relationship CompanyInterface works_in inverse employs;
        method void ChangeSalary( in real NewSalary ) rises event NewSalaryIsInvalid;
        method real NetSalary( );
    }

    interface StudentInterface to StudentClass {
        multiple optional role of PersonInterface;
        object name Student;
        attribute integer Semester;
        attribute string StudentNo;
        method integer AvgScore( );
    }

    interface CompanyInterface to CompanyClass {
        object name Company;
        attribute string CompanyName;
        relationship set of EmployeeInterface employs inverse works_in;
        relationship EmployeeInterface manager;
    }

    interface PersonCollectionInterface to PersonCollectionClass {
        attribute set of PersonInterface;
        method integer NbrOfElements( );
        method void CreateNewPerson( in string Name, in date BirthYear );
        method void DeletePerson( inout PersonInterface PersonToBeDeleted );
    }

Fig. 5.6. Interfaces to database objects with roles

Interfaces are defined and identified independently of classes, but they refer to the names of registered classes (or more precisely, to the names of their entire interfaces). The approach makes it possible to define several interfaces to a single class. Classes store invariant (imported) properties of their member objects.
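The role declarations of Fig. 5.6 can be made more tangible with a small run-time sketch. The Python code below is not part of the presented model or its schema language; the names Role, get, drop and bind are illustrative assumptions. It only shows, under these assumptions, how a Person main role can carry one Employee role and several Student roles, how a role dynamically inherits properties of its super-role, and how a role can be removed without affecting the other roles.

```python
class Role:
    """A role instance: its own attributes plus a link to its super-role.

    Hypothetical helper for illustration; the main role (e.g. Person)
    is simply a role without a super-role.
    """

    def __init__(self, name, super_role=None, **attributes):
        self.name = name
        self.super_role = super_role
        self.attributes = dict(attributes)
        self.sub_roles = []
        if super_role is not None:
            super_role.sub_roles.append(self)

    def get(self, attribute):
        """Dynamic inheritance: look in this role first, then in its super-roles."""
        role = self
        while role is not None:
            if attribute in role.attributes:
                return role.attributes[attribute]
            role = role.super_role
        raise AttributeError(attribute)

    def drop(self):
        """Detach this role; the super-role and sibling roles stay untouched."""
        if self.super_role is not None:
            self.super_role.sub_roles.remove(self)
            self.super_role = None


def bind(roots, name):
    """Return every role called `name` reachable from the given main roles."""
    found, pending = [], list(roots)
    while pending:
        role = pending.pop()
        if role.name == name:
            found.append(role)
        pending.extend(role.sub_roles)
    return found


# One object: a Person main role with an Employee role and two Student roles
# (Student is declared as a "multiple optional" role in Fig. 5.6).
smith = Role("Person", Name="Smith", BirthYear=1960)
employee = Role("Employee", smith, Salary=2500.0, Job={"teacher"})
student_a = Role("Student", smith, Semester=3, StudentNo="A-1")
student_b = Role("Student", smith, Semester=1, StudentNo="B-7")
```

Binding by the name Student returns role identifiers rather than the object itself, so `bind([smith], "Student")` yields two roles of the same object, and `employee.get("Name")` is resolved in the Person super-role. The sketch does not enforce the mandatory qualifier and ignores classes and interfaces altogether.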
Any properties relevant to collections of objects being members of a class (class methods and class attributes in the C++ terminology) have to be stored in a separate class (a so-called “power set of” class), whose members are collections of objects (e.g. PersonCollectionClass).[21]

Classes are specified by interfaces, which (as in IDL CORBA and ODL) present the specification of public class properties. Interfaces contain all typing information (in particular, types of attributes and signatures of methods), but may also contain information irrelevant to types. Interfaces are the subject of inheritance relationships, which can be one of two kinds. “Static” inheritance concerns properties of a class (in Fig. 5.6 - an arrow with a white triangle end). “Dynamic” inheritance concerns properties of a super-role (in Fig. 5.6 - a double line with a black diamond end). Usually the inheritance among interfaces corresponds to the inheritance among the corresponding classes. However, this is not mandatory and may depend on the implementation of classes and the conceptual view on the corresponding interfaces.

The relationship between PersonCollectionInterface and PersonInterface (shown in Fig. 5.6 as a dashed double-line arrow) determines the dependency between a “power set of” class and the class of its elements. “Power set of” classes do not constitute a special kind of classes - they follow ordinary rules, but their members are collections of objects. All the lines, arrows and boxes in Fig. 5.6 are visual comments - the same information is explicitly present inside interfaces.

The syntax presented in Fig. 5.6 is an ad hoc extension of the ODL syntax. The differences from the ODL semantics are the following:
- Each interface is named and is associated with a class name. These names do not determine the names of the corresponding objects.
- Some interfaces (e.g. PersonInterface, EmployeeInterface) define names of the corresponding objects. Interfaces need not determine such a name (e.g.
PersonCollectionInterface).
- An interface to a role is associated with another (parent) interface. A role can be mandatory, optional, and/or multiple. A mandatory role means that its super-role must possess it. Usually roles are optional. A multiple role may have many instances within an object.
- We follow the convention in which objects consist of roles and possess one main role. Thus relationships, such as employs/works_in, connect roles rather than objects.
- As in ODL (and unlike UML and the CORBA Relationships Service), relationships are binary and have no attributes. In our opinion, more powerful relationships lead to clumsy programming options.
- Relationships can be bidirectional (with the inverse option, as for works_in/employs) or directed (without the inverse option, as for manager). Independently of the inverse option, the system is responsible for keeping the referential integrity of relationships, thus no dangling references can appear.
- Unlike in ODL, defining attributes whose values are references to objects/roles is not allowed.
- The interface PersonCollectionInterface has an attribute being a set of objects Person. The name for this attribute is undefined here, because the definition is present in the PersonInterface. An interface like the PersonCollectionInterface can define more such attributes, which makes it possible to define heterogeneous collections. Such a feature is postulated in the ODMG standard.

[21] Some models introduce so-called “static” properties and methods within a class, which concern class extents rather than class objects. Such an approach allows one to avoid the “power set of” classes. On the other hand, however, it mixes different semantic and conceptual entities, introduces some disorder in the model, and can be a source of misunderstanding.

5.4.4. Declarations of Data Structures

The schema presented in Fig. 5.6 maps interdependencies among interfaces, but it does not determine which objects are currently stored in the database.
However, defining stored objects is precisely the main function of a database schema, as this information is necessary for the application programmer. In ODMG, it is implicitly assumed that all interfaces, as shown in Fig. 5.6, determine stored objects (this follows from some examples presented in the ODMG standard). In addition, the user can use an explicit definition of a stored data structure (a so-called extent) associated with an ODMG class. The CORBA standard has no clauses determining stored data structures, but it is implicitly assumed that each IDL interface can be used to access CORBA objects, which are created in a way dependent on the particular object implementation. In relational systems, an interface (or type) definition is coupled with the definition of stored data structures. This is materialized in the SQL “create table” statement. The emerging SQL 1999 standard abandons this solution and assumes that a table type (i.e. an interface, in another terminology) can be defined independently of declarations/creations of stored tables. This is a return to the tradition of programming languages. Our view on this issue is the same: declarations of interfaces or types should be separated from declarations/creations of data structures. The lack of such separation causes problems in the following cases:
- One would like to define an abstract interface (to inherit from it), which has no direct member objects;
- One would like to define an interface to an attribute, to a sub-attribute, etc., i.e. to an object which does not belong to the “root” objects;
- One would like to define an interface to a single object (which does not participate in a collection);
- One would like to define an interface to a module (an entity encapsulating classes, interfaces, types, objects, etc.) and - independently - interfaces to objects stored within the module.

Thus, declarations of interfaces, as shown in Fig. 5.6, are not declarations of stored data structures.
Declarations of the structures constitute further meta-information, which has to be stored in the database catalog. In Fig. 5.7 we have declared:
- A single object named President (accessed by the AbstractPersonInterface - no roles).
- Two objects named YoungPersons and OldPersons (being collections of objects Person accessed by the PersonInterface; objects Person can be specialized by roles).
- A set of objects Company, named Companies, without a corresponding “power set of” class and thus without an interface to the collection. It is implicitly assumed that such a collection is accessible through a generic interface, which need not be defined explicitly by the designer or programmer. Any declared structure can be served by a query language, which presents such a generic interface.

    AbstractPersonInterface President;
    PersonCollectionInterface YoungPersons;
    PersonCollectionInterface OldPersons;
    set of CompanyInterface Companies;

Fig. 5.7 Declarations of data structures stored in the database

The presented idea concerns the conceptual view on objects, roles, their classes and types. In this paper, we do not deal with physical implementations. Usually an entire object occupies a contiguous part of the disk store. In contrast to this solution, roles can be stored in separate parts of the storage, connected with their parent roles e.g. by pointers.

5.4.5. Metadata Management

The notions of classes, types, interfaces and declarations of stored structures are the subject of metadata management. All the metadata are stored in the metadata repository (database catalogs). The following situations are possible:
- Non-updateable repository. Classes, types and interfaces originally exist as source texts. After introducing them to the repository they can be interrogated only; no updating is possible. Changing the repository state is sometimes possible by deleting or inserting such entities as a whole. This situation is typical for programming languages, where such entities have second-class citizenship.
Another example is the CORBA Interface Repository.
- Updateable repository. Classes, types and interfaces exist as updateable structures within the metabase. The system must support all the necessary operations, which make it possible to input a new entity to the metabase, to modify it, to delete it and to print the textual form of the database schema or its parts. This situation is more typical for database management systems, which allow the introduced entities to be dynamically changed and interrogated. For example, a user can insert/drop a table, a view, a procedure, a trigger, etc. We can also imagine that it could be necessary to insert a new method into a class/interface, a new event, etc. The approach is assumed in the ODMG standard, which provides a metamodel having updating operations for the metadata repository. The topic is known as schema evolution.

In the following, we skip the discussion of which updating operations make sense on the metadata repository, considering it a separate topic. In general, we assume that updating of the metadata repository should be available through some generic and universal query/programming interface (as in SQL-based relational systems).

The metamodel and its repository have to fulfill several functions:
- Conceptual modeling, to understand interdependencies in the underlying data model and in the schema language;
- Physical storage of all the information that is introduced in the model and in the database schema. The information is necessary for the database management system programmer to implement its generic capabilities properly. In this role, the metadata repository should also be prepared to store information on physical data properties, information necessary for optimizations, privacy and security information, and so on;
- Enabling the database application administrator to change the schema, for example, to insert a new class or a new method. More advanced changes are qualified as schema evolution;
- Enabling the database application programmer to retrieve all the information that could be necessary for generic programming through reflection (as, for instance, in dynamic SQL or in CORBA IR).

[Figure: UML class diagram with the classes Object Store, Stored Object (InstanceName, IsRoot?, IsPersistent?), Primitive Value, Complex, Role, MetaObject (InstanceName[0..1]), Interface (InterfaceName), Class Spec, Property, Static Property (IsOptional?, IsMultiple?), Attribute, Relationship, Method, Event (EventName), Parameter (ParameterName), Mutability, Type, Primitive Type (TypeName) and Structural Type, connected by the relationships defines, is_role_of (IsOptional?, IsMultiple?), has_superrole, is_implemented_by, inherits_from, has, returns, rises, connects and inverse.]

Fig. 5.8 A conceptual structure of the metadata repository (in UML)

These functions of the metamodel and its repository are to some extent contradictory. In particular, the first function requires presenting detailed concepts in the form of a complex diagram, while the last function requires that the metadata repository be as simple as possible (to avoid a metadata management nightmare when preparing application programs).

In Fig. 5.8 we present a conceptual structure of the metadata repository, which follows the first of the above-mentioned functions. In a practical implementation, other functions of the repository can be supported by “flattening” this model, i.e. storing structural information as values of attributes rather than as specialized classes, inheritance links and relationship links. Below we present some comments on the presented diagram.

Class MetaObject is a superclass of the classes Interface and Property. It stores (optionally) the name of an object instance, as presented in Fig. 5.6. Instances of the class Class Spec determine all public properties defined in the class.
When a new class is registered, all its public properties are determined within a corresponding instance of the Class Spec. The class itself is stored as a regular instance of the Stored Object class (possibly distinguished; this is not shown in the diagram). Instances of the class Class Spec can be used to define any number of interfaces (relationship is_implemented_by).

Role interfaces are regular interfaces, but for a role interface, the attribute InstanceName is obligatory. The relationship is_role_of connects a role interface to its super-role interfaces. In Fig. 5.6 we have assumed that the definition of a role interface contains the clause determining its super-role interface. Such a solution is closer to the classical object-oriented models, where each specialized class contains the reference(s) to its parent class(es). In contrast, in Fig. 5.8 we have assumed that the relationship is_role_of has the many-to-many cardinality and that the attributes IsOptional and IsMultiple are assigned to this relationship. The basic motivation for this extension is support for aspect-oriented programming, where role interfaces may represent cross-cutting aspects of objects (such as history of changes, ownership, security, privacy, visualization, synchronization, transaction processing, etc.). An aspect can be implemented as a class, which is represented by its interface. Such an interface can then be assigned as a role to many other interfaces. Of course, on the level of the object store each role still has at most one super-role. The feature requires special syntax (not determined yet) in our data definition language.

An Interface has properties (Property class) which can be static (Static Property) or behavioral (Method). Static properties are subdivided into attributes and relationships. Static properties can be qualified as multiple and/or optional. In this way we model collections (in the ODMG terminology).
Possibly, another qualifier IsOrdered can be introduced to model ordered collections. We avoid specialized parameterized interfaces to collections, as in the ODMG standard (see the critique presented in [Alag97]). Instead, we assume that a query language will contain several built-in operators to process collections in addition to the classical for each ... do ... operator; for example union, intersection, insertion, deletion, etc. These built-in operators can be used to implement specialized interfaces for particular collections. In general, in this model we do not introduce templates or other parameterized definitions.

Attributes, parameters and outputs of methods have associated types. A type can be primitive (integer, string, etc.) or structural. Structural types are particular cases of interfaces. In consequence, if one has to declare a structural type (struct, in the C++ terminology), one has to define an identical interface. In this way we avoid a (rather subtle) difference between interfaces and structural types.

Fig. 5.8 also presents the organization of the object store. Usually, this part is not considered a component of the metadata repository, but because of the connections of this part to the rest of the repository, we present everything on a single diagram. An instance of the class Stored Object can store any entity that is a database unit; in particular, objects, attributes, instances of relationships (pointer links), methods (including database views, procedures, rules, and reactions to events), classes and modules. A more detailed classification and interdependencies between these entities are not shown in this diagram. The property of being a root object is assigned directly to objects rather than to interfaces. Root objects are starting points for querying the database. The persistence property is assigned to objects rather than to interfaces or classes. This is the consequence of the principle of orthogonal persistence.
In consequence, transient objects can be explicit components of persistent objects, which supports some programming techniques (e.g. it makes it possible to temporarily store various flags inside objects, to enable concurrent processing).

Stored objects can be primitive or complex. Complex objects are determined by the consists_of relationship. Complex objects include collections, which are modeled as many objects having the same name (as illustrated in Fig. 5.4 and Fig. 5.5). Roles are a special case of complex objects. The relationship has_superrole is used to determine the inheritance hierarchy of roles.

Several features of the data model and the schema language are not presented in this diagram. In particular, we do not present the subdivision of object/class properties into public and private. Many other features, such as threads, triggers, rules, transactions, etc., are outside the scope of this paper.

5.5. Query Language for the Object Model with Dynamic Roles

In this section, we follow the stack-based approach (SBA) to query languages, see [SBMS94, SKL95, PK00] and many other papers. In our opinion, SBA is probably the only adequate theoretical paradigm for a query language defined for the object model with dynamic roles. SBA can be summarized as follows:
- It considers query languages a special kind of programming languages. Thus, it follows the semantics of classical programming languages rather than database theoretical concepts such as relational algebras, calculi or first-order logic. SBA is also a much more powerful alternative to novel theoretical concepts devoted to object-oriented databases, such as object algebras, object (domain) calculi, comprehensions, monoid calculus, F-logic, and others.
- Each run-time database or program entity (object, method, procedure, view, etc.) has an internal (not human-readable) identifier and an external name, which can be used in queries/programs.
- Each name occurring in a query/program is bound to a run-time entity (or entities) according to the current scope for this name. Scoping and binding rules follow the same discipline for each name: there is no difference between names of objects, attributes, methods, etc., and auxiliary names defined in a query. The binding returns some information about the run-time state (in the majority of cases, internal identifiers of objects).
- Scopes for names are organized in the environment stack, which is a generalized version of the classical environment stacks implemented in the majority of programming languages. The stack accomplishes the binding of names according to the classical “search-from-the-top” rule.
- Query operators are subdivided into algebraic, which do not deal with the environment stack, and non-algebraic, which (temporarily) change the environment stack. Typical algebraic operators are: =, +, <, union, etc. Typical non-algebraic operators are the following: where (selection), dot (projection or navigation), dependent join (known from ODMG OQL), quantifiers, etc.
- Queries are building blocks for database/program constructs and abstractions, such as views, procedures, methods, imperative statements (creating, updating, deleting) and control statements.

The stack-based approach is implemented in the prototype system Loqis [Subieta91]. In [Płodzień00] it is shown that the approach is also adequate for implementing query optimization methods that are much more general and powerful than similar methods developed for relational query languages. The methods can be easily extended to an object model with dynamic roles. Currently we are developing a prototype object-oriented DBMS dealing with roles (among several other new features). In this section, we present basic ideas related to the semantics of such a query language.

5.5.1. The Environment Stack

The environment stack (ES) is responsible for scope control and binding names.
In programming languages, it is managed according to procedure calls and program blocks. A new section of volatile objects (a so-called activation record) is pushed onto the stack when a procedure/block is started, and the section is popped when the procedure/block is terminated. The activation record for a procedure invocation contains volatile variables (objects) that are declared within this procedure, the actual parameters of the procedure, a procedure return address, and other data. Binding follows the “search from the top” rule. The last added section is the first visited during the binding, and objects from some sections remain invisible for the binding (because of so-called static scoping).

The stack-based semantics of object query languages is explained in [SBMS94, SKL95, PK00] and other sources. The idea is that some query operators (called non-algebraic) act on the stack in a similar way as invocations of program blocks. For instance, in the query (see footnote 22)

Employee where Salary < 2000 and Age > 40

the part Salary < 2000 and Age > 40 is a block evaluated in a new environment, which is determined by the currently tested Employee object. Thus, for the evaluation of this subquery, ES is augmented by a new section containing references to all internal properties of the object. After the evaluation this section is popped.

Fig. 5.9 An example state of the environment stack for classical object store model. The sections are listed in the order of search during binding name n, from the top of the stack:
Binders to properties of the currently processed Employee object
Binders to EmployeeClass properties
Binders to PersonClass properties
... other visible stack sections ...
... invisible sections due to static scoping ...
Binders to global properties of the current user session
Binders to root database objects, views, database procedures, ...
Binders to global library procedures, environment variables, ...
The figure labels the bottom of the stack as base sections.

The stack consists of sections, which are sets of binders.
A binder is a concept that allows us to explain and formally describe various naming issues that occur in object models and query/programming languages. Formally, a binder is a pair (n, x), where n is an external name, and x is some entity, in particular, a reference to an object. Such a pair will be written as n(x). Binders serve to bind names occurring in queries. If binder n(x) is present on ES and we want to bind the name n, then the result of the binding is x. The binding follows the “search from the top” rule: when n is bound, we are looking for a binder n(x) that is closest to the stack top. To cover bulk data structures of the store model we assume that the binding is multi-valued: if the relevant section contains more binders whose names are n: n(x1), n(x2), n(x3),..., then all of them form the result of binding. In such a case binding n returns the collection {x1, x2, x3,...}.

Footnote 22: Using OQL syntax the query can be formulated as: select e from Employee as e where e.Salary < 2000 and e.Age() > 40. SBQL avoids the “select...from” sugar, because it makes queries less orthogonal and sometimes unnatural.

Fig. 5.10 An example state of the environment stack for the object store with roles. The sections are listed in the order of search during binding name n, from the top of the stack:
Binders to properties of the currently processed Employee role r
Binders to EmployeeClass properties
Binders to properties of the Person super-role of r
Binders to PersonClass properties
... other visible stack sections ...
... invisible sections due to static scoping ...
Binders to global properties of the current user session
Binders to root database objects, views, database procedures, ...
Binders to global library procedures, environment variables, ...
The figure labels the bottom of the stack as base sections.

In Fig. 5.9 we present the state of ES during binding name n occurring within a query Employee where ... n ... for the classical object store model, as presented in Fig. 5.3.
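The binding rule described above can be illustrated with a minimal Python sketch. It is not the Loqis implementation: the class EnvironmentStack, its method names, and the representation of a section as a dictionary are assumptions made for illustration only, the identifiers are illustrative, and static scoping (invisible sections) is deliberately left out.

```python
from typing import Any, Dict, List

# Assumption: a section is modelled as a mapping from an external name to
# the list of entities bound under it, i.e. the binders n(x1), n(x2), ...
Section = Dict[str, List[Any]]


class EnvironmentStack:
    """Toy environment stack (hypothetical helper, not part of SBQL)."""

    def __init__(self) -> None:
        self._sections: List[Section] = []

    def push(self, section: Section) -> None:
        self._sections.append(section)

    def pop(self) -> Section:
        return self._sections.pop()

    def bind(self, name: str) -> List[Any]:
        # Search from the top: the first section that holds a binder for
        # `name` decides the result, and all of its binders are returned.
        for section in reversed(self._sections):
            if name in section:
                return list(section[name])
        return []  # assumption: an unbound name yields an empty result


es = EnvironmentStack()
# Database section with several binders sharing one name.
es.push({"Person": ["i1", "i4", "i7"], "Employee": ["i13", "i16"]})
employees = es.bind("Employee")       # multi-valued binding
# Section opened for one Employee; Salary is now bound at the top.
es.push({"Salary": ["i17"], "works_in": ["i18"]})
salary = es.bind("Salary")
still_visible = es.bind("Employee")   # found lower in the stack
es.pop()
after_pop = es.bind("Salary")         # the section is gone again
```

Returning all binders of the first matching section, rather than only one, is what makes the binding multi-valued in this sketch.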
Notice that below the internal environment of the currently processed object (top of the stack) there are the environments of its class (binders to EmployeeClass properties) and superclass (binders to PersonClass properties). Thus, for instance, for the query Employee where Age > 40 the binding of Age will be accomplished at the third section from the top, since this section contains the binder Age(i41). The detailed description of the stack organization and its behavior for particular query operators can be found in other sources. Thorough understanding may require deeper knowledge of the semantics of programming languages and of compiler construction.

In Fig. 5.10 we present an example state of the stack for the same query Employee where ... n ... during binding name n for the store model presented in Fig. 5.4 and Fig. 5.5. As we can see, the difference concerns only the new section of the Person super-role (third from the top) of the currently processed Employee role, which is inserted into the stack. Previously (in Fig. 5.9) binders to such properties were present at the top of the stack, among binders induced by the processed Employee object.

An obvious difference concerns the database section (in Fig. 5.9 and Fig. 5.10 second from the bottom). Previously it contained binders to root objects, e.g. Person(i1), Employee(i4), StudentEmployee(i9), ... for Fig. 5.3. Now this section contains binders to all root roles, e.g., for Fig. 5.4, Person(i1), Person(i4), Person(i7), Employee(i13), Employee(i16), Student(i19), ... .

5.5.2. Opening a New Scope on the Environment Stack

The rules for opening a new section on ES by a non-algebraic operator for the object model with roles are a natural modification of the rules for the object model introduced in [SKL95, PK00]. Consider a query q1 θ q2, where θ is a non-algebraic operator and q1 and q2 are sub-queries.

The object model without roles: Let q1 return the identifier of some object O.
Let object O be a member of class C1O, which inherits from C2O, which inherits from C3O, etc. Let O, C1O, C2O, C3O, ... have identifiers iO, iC1O, iC2O, iC3O, ..., correspondingly. Then θ pushes on the top of ES the corresponding sections in the order shown in Fig. 5.11, where nested is a function returning binders to internal properties of the object whose identifier is the argument of the function.

Fig. 5.11 Sections pushed onto ES by a non-algebraic operator in the classical object store model. From the top of ES:
nested(iO)
nested(iC1O)
nested(iC2O)
nested(iC3O)
...
(sections of the object O)

The situation is illustrated in Fig. 5.9, where the three top ES sections have just been opened by the non-algebraic operator where.

The object model with roles: Let q1 return the identifier of a role R1. Let R1 inherit dynamically from R2, which inherits dynamically from R3, etc. Let Ri (i = 1,2, ...) be a member of class C1Ri, which inherits from C2Ri, which inherits from C3Ri, etc. The corresponding identifiers are: iR1, iR2, iR3, ..., iC1R1, iC2R1, iC3R1, ..., iC1R2, etc. Then θ pushes on the top of ES the corresponding sections in the order shown in Fig. 5.12.

Fig. 5.12 Sections pushed onto ES by a non-algebraic operator in the object model with roles. From the top of ES:
nested(iR1), nested(iC1R1), nested(iC2R1), ... (sections of the role R1)
nested(iR2), nested(iC1R2), nested(iC2R2), ... (sections of the role R2)
nested(iR3), nested(iC1R3), ... (sections of the role R3)
...

The situation is illustrated in Fig. 5.10, where the four top ES sections have just been opened by the non-algebraic operator where. All opened sections are removed after processing the role R1. Other rules concerning opening/closing ES sections by a non-algebraic operator remain unchanged.

5.5.3. Thin and Thick Sections

In the previous chapter we described how the environment stack is modified by the non-algebraic operator where.
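The order in which a non-algebraic operator opens sections for a role (Fig. 5.12) can be sketched in Python as follows. This is only an illustration under stated assumptions: the types ClassDef and Role, the function sections_for, and the identifiers iEmployeeClass and iPersonClass are hypothetical, and the sketch returns section labels instead of real sets of binders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClassDef:
    ident: str
    superclass: Optional["ClassDef"] = None  # static inheritance


@dataclass
class Role:
    ident: str
    cls: ClassDef
    super_role: Optional["Role"] = None      # dynamic inheritance


def sections_for(role: Role) -> List[str]:
    """List section labels from the top of ES downwards."""
    order: List[str] = []
    current: Optional[Role] = role
    while current is not None:
        # First the role itself, then its class and all its superclasses.
        order.append(f"nested({current.ident})")
        cls: Optional[ClassDef] = current.cls
        while cls is not None:
            order.append(f"nested({cls.ident})")
            cls = cls.superclass
        # Then repeat the same for the super-role.
        current = current.super_role
    return order


# Hypothetical data: an Employee role whose super-role is a Person role.
person = Role("i7", ClassDef("iPersonClass"))
employee = Role("i16", ClassDef("iEmployeeClass"), super_role=person)
order = sections_for(employee)
```

For this data the sketch yields four sections, which corresponds to the four top sections of Fig. 5.10: the role, its class, the super-role, and the class of the super-role.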
The several sections created after pushing nested(q1) onto the top of ES can be treated as one “thick” section, e.g., when these sections are closed (popped) at the end of processing q1. Therefore, we introduce the concepts of thin and thick sections. When a new scope is opened on the ES, a new thick section, which consists of thin sections, is created. We will mark a thick section with a thick line and a thin section with a thin one (Fig. 5.13). Note that the concepts are introduced for better comprehension of the environment stack only. In particular, the division presented has no influence on the binding rules.

Fig. 5.13 States of ES during processing a query. The query is Employee where Salary < 2000 and Age > 40. The figure shows four consecutive stack states, each with the database section Person(i1), Person(i4), Person(i7), Employee(i13), Employee(i16), Student(i19) at the bottom. In the second and third states the following sections lie above the database section, from the top:
Salary(i17) works_in(i18)
ChangeSalary(i51) NetSalary(i52), ...
Name(i8) BirthYear(i9)
Age(i41), ...
In the third state the local environment of Age is additionally placed on top.

5.5.4. Private, Protected, and Public Properties

Fig. 5.11, Fig. 5.12, and Fig. 5.13 do not present situations on ES induced by encapsulation, which subdivides properties into public, protected, and private. However, the rule for the object model with roles remains the same as for the classical object store model. It is roughly the following:

Private properties of an object of a given class are available only to methods that are stored within this class;
Protected properties of an object of a given class are available to methods that are stored within this class or within parents of this class.

This rule can be easily implemented as a part of the environment stack mechanism.
A graphical solution (with protected properties omitted) is presented in Fig. B.5.

5.5.5. Binding

Binding rules for the store model with roles remain the same as for the classical object store model. All role names are bound in the database section of the stack (in Fig. 5.9 and Fig. 5.10 - second from the bottom). If the model with roles is used for programming of applications (see footnote 23), then role names would also be bound in the section of the current user session (in Fig. 5.9 and Fig. 5.10 - third from the bottom) and in sections containing local environments of procedures and methods. The rules for auxiliary naming (operator as known from OQL) are the same as for the classical object store model.

Footnote 23: In some future query/programming language.

In Fig. 5.13 we present example states of the environment stack during evaluation of a simple query in SBQL (the query language implemented in Loqis) for the object store presented in Fig. 5.5. Sections that are inessential for this example are not shown. The first ES state contains only the database section, where the name Employee is bound (returning {i13, i16}). The second state presents the case when the operator “where” processes i16. The name Salary is bound at the top (returning i17) and the name Age is bound in the PersonClass section, 4th from the top (returning i41). The next state presents the case when the method Age is executed: the method pushes its local environment onto the top of ES. During execution of the method, the section of the Employee role (2nd from the top) and the section of EmployeeClass (3rd from the top) are invisible due to static scoping. The final state, after executing the query, is the same as the initial state.

5.5.6. Polymorphism and Overriding

The stack-based semantics discussed above supports polymorphism because each role and its class are encapsulated. Thus, the designer can choose the same name for different methods stored within different classes.
Overriding is naturally supported by the scoping rules, as presented in Fig. 5.9. In particular, it is possible to override a method defined for a role by a method defined for its sub-role. The possibilities of overriding are extended. It is also possible to override an attribute defined for a role by a method defined for its sub-role, and vice versa. Such a feature can be useful, e.g., when in a specialized role one wants to substitute an attribute by a virtual attribute.

5.5.7. Creating and Deleting Roles

Creating and deleting objects and roles require some semantic changes to the create and delete operators. In this chapter, we briefly discuss that issue.

5.5.7.1. Create Operator

Because various constraints can be imposed on objects and roles, the create operator should be able to create not only single objects, but also objects together with their roles. The simplified grammar for creating objects and roles (in modified BNF notation; see footnotes 24 and 25) can be defined as follows:

(1) <create_def> ::= <create_expr> {<create_expr>}
(2) <create_expr> ::= create <create_spec> [as <auxname>] [(<attribute_list>)] [{<with_role_list>}];
(3) <create_spec> ::= <Class> | role <RoleClass> of <identifier>
(4) <attribute_list> ::= <attribute> {, <attribute>}
(5) <attribute> ::= <attribute_name> = <value>
(6) <value> ::= <literal> | <collection_value>
(7) <literal> ::= <integer_literal> | <float_literal> | <string_literal> | <name> | null
(8) <collection_value> ::= { <value_list> }
(9) <value_list> ::= <value> {, <value>}
(10) <with_role_list> ::= <with_role> {, <with_role>}
(11) <with_role> ::= with role <RoleClass> [as <auxname>] [(<attribute_list>)] [{<with_role_list>}]

Footnote 24: BNF - Backus Naur Form

Apart from the keywords and symbols, the following terminals are used:
name – denotes any name;
Class – denotes a class identifier;
RoleClass – denotes a role class identifier;
auxName – denotes an auxiliary name;
identifier – denotes an object identifier;
integer, float, string – denote an integer, real
number and string, respectively.

Footnote 25: The following metasymbols are used: {S} – S may be repeated; S must occur at least once; [S] – S may occur at most once.

Example:

create Company as C (name="IPI");
create Person (birthYear=1948, name="Doe");
create Person (birthYear=1951, name="Smith") {
  with role Student (semester=6, studentNo="223344", scholarship=1200),
  with role Employee (salary=1500, job="secretary", works_in=C)
};
create Person (birthYear=1975, name="Brown") {
  with role Employee (salary=2500, job="assistant")
};

5.5.7.2. Delete Operator

Removal of an object automatically leads to the deletion of all its roles, but obviously, removal of a role does not mean that its main object is deleted too. The role concept realizes the cascade semantics of removal, well known from many database systems. For example, the query

delete Person as p where p.name = "Brown"

deletes all persons whose name is Brown together with all their roles, while the instruction

delete Employee as e where e.salary > 3000

deletes those employees (Employee roles) who earn more than 3000.

5.5.8. Role-Specific Operators

For some kinds of queries, it can be useful to make an explicit conversion between different roles of an object. Such a feature is necessary e.g. for the query “get all employees who are at the same time students”. Thus, it is necessary to provide a cast operator, which will convert the identifier of a role into the identifier of another role. Similarly, we can introduce a boolean operator testing the presence of a given role within an object. Another operator may return identifiers of the roles that are currently present within an object. Such operators increase the possibilities of generic programming.

5.5.8.1. Casting Operator

In contrast to typical cast operators (known e.g. from C++), our casting operator not only converts types, but is a regular run-time operator mapping a collection of identifiers into another collection of identifiers.
Syntactically, the operator will be written as:

(name) query

where name is a role name, and query is a query returning identifiers of roles. The semantics of this operator is the following: it takes an identifier of a role returned by the query, then returns an identifier of a role within the same object, named name. If the object has no role named name, then the result is empty (null). If the object has more than one role named name, then identifiers of all these roles are returned. The casting operator is an algebraic operator. With this operator, we can express the following queries, for example:

Example: Select students who are at the same time employees.

(Student) Employee

Evaluation of the query Employee returns the collection of identifiers of the roles Employee in all objects. Then the operator (Student) converts each of them into an identifier of the role Student or into null, if the given object has the Employee role and has no Student role. The result is a collection of identifiers of the role Student; nulls are ignored; see Fig. 5.14.

Fig. 5.14 States of QRES during evaluation of the query: (Student) Employee. The figure shows the initial state; the QRES after eval(‘Employee’), holding Employee(i13) and Employee(i16); the QRES after eval(‘(Student)’), holding null and i19; and the final result, i19.

The casting operator is a powerful operator, which should also act on an object with nested roles. For instance, if we assume that a person who participates in a conference has acquired roles as depicted in Fig. 5.15, then we should be able to perform the query:

Example: Select persons who are reviewers and who are authors of accepted papers:

(Person)( (Reviewer) Author )

The above query is almost equivalent to the following query: (Person)( (Author) Reviewer ). The queries lead to the same results if duplicates are finally removed.

Example: Assume students have the attribute Scholarship.
For each Person return Name and income, which is Salary for employees, Scholarship for students, or the sum of Salary and Scholarship for working students. For a person who is neither an employee nor a student, the income is 0.

(Person as p) . ( p . Name, sum( 0, ((Student) p) . Scholarship, ((Employee) p) . Salary ) )

In the above example sum is an aggregate function similar to the corresponding SQL function.

Fig. 5.15 An example of an object with nested roles (Conference). The figure shows a Person object with roles labeled Programme Committee Member, Author, Participant, Author, Speaker, and Reviewer.

5.5.8.2. Hasrole Operator

The hasrole operator tests the presence of a given role within an object. Syntactically, the operator will be written as: query hasrole name, where name is a role name, and query is a query returning identifiers of objects or roles. The semantics of this operator is the following: it takes an identifier of an object (role) returned by the query, then checks whether a role named name exists within the same object (role). If the object (role) has no role named name, then the result is false. If the object has one or more roles named name, then true is returned. The hasrole operator is an algebraic operator.

Example: Select persons who are working and are more than sixty years old.

((Person where age > 60) as p) where (p hasrole Employee)

5.5.8.3. Roles Operator

The roles operator returns identifiers of certain roles that are currently present within an object. Syntactically, the operator will be written as: roles [name] of query, where name is a role name and query is a query returning identifiers of objects or roles. The semantics of this operator is the following:

If name is given, it takes an identifier of an object (role) returned by the query, then checks whether roles named name exist within the same object (role). If the object (role) has no such roles, then the result is empty (null). If the object has one or more roles named name, then their identifiers are retrieved.
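The casting and hasrole operators described above can be approximated by the following Python sketch. It is a simplification under explicit assumptions: the class RoleStore and its methods are hypothetical, all roles of one object are kept in a flat table (so nesting of roles is not distinguished), and the assignment of the Employee role i13 to the person i4 is assumed for illustration.

```python
from typing import Dict, List


class RoleStore:
    """Hypothetical flat store: every role records the object it belongs to."""

    def __init__(self) -> None:
        self.name_of: Dict[str, str] = {}    # role identifier -> role name
        self.object_of: Dict[str, str] = {}  # role identifier -> owning object

    def add(self, ident: str, name: str, obj: str) -> None:
        self.name_of[ident] = name
        self.object_of[ident] = obj

    def cast(self, name: str, idents: List[str]) -> List[str]:
        # (name) query: map each identifier to all roles called `name`
        # within the same object; identifiers without such a role vanish.
        result: List[str] = []
        for ident in idents:
            obj = self.object_of[ident]
            for other, owner in self.object_of.items():
                if owner == obj and self.name_of[other] == name:
                    result.append(other)
        return result

    def hasrole(self, ident: str, name: str) -> bool:
        # query hasrole name: true if at least one such role exists.
        return len(self.cast(name, [ident])) > 0


store = RoleStore()
for ident in ("i1", "i4", "i7"):
    store.add(ident, "Person", ident)   # a Person role roots its own object
store.add("i13", "Employee", "i4")      # assumed owner
store.add("i16", "Employee", "i7")
store.add("i19", "Student", "i7")

employees = ["i13", "i16"]                           # result of: Employee
students_who_work = store.cast("Student", employees)  # (Student) Employee
persons = store.cast("Person", students_who_work)     # (Person)((Student) Employee)
```

Here i13 disappears from the result because its object has no Student role, which mirrors the ignored null in Fig. 5.14.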
If name is not specified, it takes an identifier of an object (role) returned by the query, then checks whether any roles exist within the same object (role). If the object (role) has no roles, then the result is empty (null). If the object has one or more roles, then their identifiers are retrieved.

The roles operator is an algebraic operator.

Example: Retrieve names of all roles acquired by Smith.

unique ( nameof (((roles (Person where name = "Smith")) as r) close by ((roles r) as r)))

In the above example nameof is an operator that retrieves the name of an object (or role), unique is an operator for removing duplicates, and close by is an operator that performs the transitive closure (see footnote 26).

Footnote 26: The transitive closure makes it possible to process recursive data structures and to encapsulate some non-trivial iterations [AU79, SMA90]. For instance, the concept of a transitive closure may be required in order to compute the aggregate function weight for a data structure consisting of a class Part with the attributes name and mass and a recursive association is_made_from (with multiplicities 0..1 and *). The total weight of parts is a sum: sum(Part.mass, Part.is_made_from.Part.mass, …). The sum above can be computed by using the transitive closure operator close by: sum( (Part close by (Part.is_made_from.Part)).mass ). Other variants of the transitive closure, implemented in LOQIS, are discussed in [SMA90].

5.5.9. Query Optimization

Because the model with dynamic roles is built on the standard stack-based approach, the general idea of static query optimization through static analysis, as presented in [Płodzień00, PK00, PS01a, PS01b, PS01c], remains the same. To cover the concept of roles the database schema graph needs a new kind of nodes to store definitions of roles and a new kind of edges for the is_role_of relationship between roles. Some modifications are needed for the query optimization techniques.
Among others, the stack sizes calculated in the method of independent subqueries should take into account the modified environment stack storing sections for roles (see Chapter 5.5.2). In addition, methods that have not been considered so far may prove useful in the new model. For example, the casting operators discussed in the previous section may lead to a situation in which an auxiliary name cannot be eliminated but the query can still be rewritten to a more efficient form. The following illustrates the general idea. Consider the query in SBQL (adapted for the object model with roles) “get persons born after 1950 along with their companies”:

((Person as p) ⋈ ((Employee) p) . works_in . Company) where p.BirthYear > 1950

In this query the dependent join operator ⋈ joins each object Person (named p in this query) with the Company that the person works_in if the person has the Employee role, which is determined by the casting operator (Employee). The resulting pairs < p(iPerson), iCompany> (where iPerson, iCompany are references of Person and Company roles, correspondingly) are then filtered by the where operator, which leaves only those pairs in which the person was born after 1950. According to the optimization rules presented in the papers cited above, the selection predicate can be pushed before the ⋈ operator:

(((Person as p) where p.BirthYear > 1950) ⋈ ((Employee) p) . works_in . Company)

but the auxiliary name p cannot be removed, because it is used after the join. Nevertheless, we can perform the selection before introducing p:

(((Person where BirthYear > 1950) as p) ⋈ ((Employee) p) . works_in . Company)

Such a case could be especially common when the optimization concerns queries involving views, that is, when such a selection is applied to a view invocation, which is then macro-substituted. The example shows that query rewriting methods, e.g. the method known as “pushing a selection before a join”, can be applied to queries addressing the object model with roles.
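The effect of pushing the selection before the dependent join can be checked on a small Python sketch. The data layout, the helper companies_of, and both function names are assumptions made for illustration; they model the dependent join as a nested loop and do not reflect how an SBQL optimizer is implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical data: each person carries a list of Employee roles,
# and each Employee role points to a company through works_in.
people: List[Dict] = [
    {"Name": "Doe", "BirthYear": 1948, "employee_roles": [{"works_in": "IPI"}]},
    {"Name": "Smith", "BirthYear": 1951, "employee_roles": [{"works_in": "IPI"}]},
    {"Name": "Brown", "BirthYear": 1975, "employee_roles": []},
]


def companies_of(person: Dict) -> List[str]:
    # Plays the part of ((Employee) p) . works_in . Company;
    # a person without an Employee role contributes no companies.
    return [role["works_in"] for role in person["employee_roles"]]


def join_then_filter(persons: List[Dict]) -> List[Tuple[str, str]]:
    # Original form: build all pairs, then apply the selection.
    pairs = [(p, c) for p in persons for c in companies_of(p)]
    return [(p["Name"], c) for p, c in pairs if p["BirthYear"] > 1950]


def filter_then_join(persons: List[Dict]) -> List[Tuple[str, str]]:
    # Rewritten form: select first, then join only the survivors.
    selected = [p for p in persons if p["BirthYear"] > 1950]
    return [(p["Name"], c) for p in selected for c in companies_of(p)]


before = join_then_filter(people)
after = filter_then_join(people)
```

Both forms return the same pairs, but the rewritten form never navigates to the companies of persons who fail the selection, which is the intended gain of the rewriting.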
Low-level optimization techniques, such as applying indices, are practically unchanged for this model. Thus, although optimization techniques for the model need further development, we do not expect that the model implies totally new query optimization problems.

6. Extending UML with Dynamic Object Roles

The Unified Modeling Language [UML01] is becoming a widely used tool for conceptual modeling and systems design. In spite of its advantages, UML, as probably every tool, offers opportunities for improvement. Such opportunities are subjective to some degree, because what is a “natural improvement” for someone does not have to be perceived in this way by others. Nevertheless, discussions concerning the refinement of UML are necessary, because they make it possible to create a critical mass of ideas and remedies, which is essential for the success of the language.

The discussions on refining UML have existed since the advent of the standard and have resulted in a large number of publications. Some of them concern general aspects of UML; others concern its particular elements, including the class diagram. In this chapter, we focus on the integration of the promising concept of dynamic object roles into UML [JPSS02].

The term “role” has several specific meanings in software engineering and databases. The “dynamic object roles” that we deal with (see Chapter 5) have several interesting properties and prove to be especially useful for expressing some complex problem domains. Despite the advantages of the concept, some researchers consider its applications not sufficiently wide to justify the extra complexity it adds to conceptual modeling frameworks. In particular, it has not been introduced into UML, because it is “a valuable modeling concept that can be mapped into associations” ([UML99a], p.
54) and “the use of multiple classification or dynamic classification affects the dynamic execution semantics of the language, but is not usually apparent from a static model” ([UML01], pp. 3-87). Although we understand the motivation of this decision, we disagree with the conclusion.

On the other hand, some researchers propose to represent dynamic object roles informally through the already existing notions of conceptual modeling by using design patterns [GHJV95], which have been popularized in the context of object-orientation. The design pattern decorator [GHJV95] is considered in [Fowler97] a good mapping of dynamic roles, because it allows one to add functionality to a class without subclassing (see Chapter 4.7).

We argue in this chapter that this attitude to the concept of dynamic roles should be revised and that the complexity of many real-life problems requires special tools supporting the systems analyst.

Although UML does not support dynamic object roles, the standard includes some notions related to them. One of them is dynamic classification, defined as “a semantic variation of generalization in which an object may change its classifier” ([UML01], p. B-8). The concept of dynamic object roles emerges from dynamic classification, which per se is a powerful tool for conceptual modeling. Moreover, to some degree dynamic classification can be regarded as a conceptual framework for the idea of dynamic roles. However, the standard treats the concept superficially; in particular, it does not specify how and when to apply it, and offers no notation for expressing it on the class diagram.

Another relevant concept is a collaboration role that “defines a role to be played by an Instance within a Collaboration” ([UML01], p. 3-125). It is similar to the concept of dynamic roles presented in Chapter 5, because it can cover behavioral specification of objects, focusing on dynamic aspects of an object’s life.
In our opinion, dynamic roles should be incorporated first into the static view diagram (i.e. the class diagram), and only then into those of the other UML diagrams that are affected by this incorporation.

6.1. Dynamic Classification vs. Dynamic Object Roles

Generally, one can define a new element of a UML model through a stereotype. In our discussion we will make use of the dynamic classification example diagram presented in [FS97] (see Fig. 6.1), which uses the user-defined stereotype «dynamic». The model applies dynamic classification to a person’s job in order to express that a person can change his/her job. Note a serious disadvantage of this diagram: a person can have only one job at a time; this is illustrated in Fig. 6.2 – a man (modeled by a Male owner-object) has exactly one Salesman role, which means that at present he is working as a salesman only.

Fig. 6.1 An example of dynamic classification in [FS97]. The diagram shows the class Person specialized by Sex into Female and Male, and by Job («dynamic») into Manager, Engineer, and Salesman.

Fig. 6.2 A man has one role. The diagram shows a Male object with one Salesman role.

A problem domain can frequently be more complex, because a person can have several different jobs simultaneously. This is presented in Fig. 6.4 – a man is an engineer, a salesman, and a manager at the same time. To model such a situation, we can modify the diagram depicted in Fig. 6.1 by adding an overlapping-constraint to the specialization hierarchy with the Job aspect; see Fig. 6.3.

The situation in which we want to consider both the current and the past roles of an entity is similar to the situation in which past roles are neglected. A role needs a special flag to indicate whether it is current or past. With this in mind, without loss of generality we will consider only current roles in our discussion. Unfortunately, it is also necessary to introduce an additional grouping mechanism, which makes it possible to recognize and act on all roles of a given object (e.g., which can act on all jobs of a given person).
The idea of generalization/specialization assumes that child classes and their properties are not known by their parent class(es). In our approach, roles of a given object can be accessed from this object in a way that differs from simple navigation.

Another important difference between classical inheritance and (dynamic) inheritance among roles is related to class instantiation. Instantiation in the class hierarchy means that a single object, which represents the most specific class, is created. An instance is not related to other ones by inheritance, because inheritance relationships exist among generalizable model elements only (such as classes and other classifiers, associations, use cases, states, events, and collaborations). Thus the instantiation mechanism, which leads to the creation of an object with roles, in our opinion, should assume that an instance could be in a sense a generalizable element too.

Fig. 6.3 A class diagram modeling the situation from Fig. 6.4 through an overlapping-constraint. The diagram shows the class Person specialized by Sex into Female and Male, and by Job («dynamic» {overlapping}) into Manager, Engineer, and Salesman.

Fig. 6.4 A man has at most one role of a given kind. The diagram shows a Male object with the roles Salesman, Engineer, and Manager.

6.2. Composition vs. Dynamic Object Roles

In many cases, the situation can be more complicated than the one depicted in Fig. 6.3, e.g., when a person needs to have several jobs of the same kind simultaneously (the case of repeating inheritance). For example, the man modeled in Fig. 6.5 works not only as an engineer, but also has two positions as a salesman and three positions as a manager. Note that this cannot be expressed even with overlapping dynamic classification. Instead, some kind of association (the best solution is composition) has to be used; see Fig. 6.6, where the specialization hierarchy with the Job aspect has been replaced with composition, because:

An object with its own roles can be treated as a composite object.

Similarly to composition, a role cannot exist without its parent object.
The fact that a person can possess several roles of the same type is modeled through the “many” multiplicity items on the constituent ends of the composition.

One can specify in a simple way whether a role should be mandatory or optional, or define another multiplicity for it. (Usually roles are optional. A mandatory role means that its superrole must possess it. A multiple role may have many instances within an object, e.g., a person can work many times as a manager, but he/she can work at most once as an engineer; see Fig. 6.6.)

The removal semantics of objects with roles is emphasized as very similar to the removal semantics of composition.

However, from the viewpoint of conceptual modeling this approach has a very serious disadvantage: the specialization/generalization semantics between the Person class and its subclasses (Manager, Engineer, and Salesman) has been lost. To bring this semantics back to the diagram, one has to define, for instance, constraints for the composition. It is evident that this requires additional effort and may be error-prone. Moreover, it forbids applying the useful concept of generalization, which is one of the most significant tools of the object-oriented data model.

Fig. 6.5 A man has several roles of the same kind. The diagram shows a Male object with two Salesman roles, one Engineer role, and three Manager roles.

Fig. 6.6 A class diagram modeling the situation from Fig. 6.5 through composition. The diagram shows the class Person specialized by Sex into Female and Male, and composed of Manager (multiplicity *), Engineer (multiplicity 0..1), and Salesman (multiplicity *).

6.3. RoleOf Relationship

In order to be able to express dynamic object roles on the UML class diagram, we propose to introduce a new kind of relationship – the RoleOf relationship – along with a new graphical element drawn as a path with a white-and-black diamond.
The white top triangle (derived from the generalization notation) expresses the existence of (dynamic) generalization, while the black bottom triangle emphasizes the fact that a role cannot exist without its owner object and cannot be shared (as in composition). The white-and-black diamond is placed on the end of a path attached to the class whose instances are owner objects. As with generalization paths, names are not attached to RoleOf paths. A group of dynamic generalization paths for a given parent may be shown as a tree with a shared segment (including the white-and-black diamond) to the parent, branching into multiple paths to each child. An example of the shared target style is presented in Fig. 6.7, while an example of the separated target style is presented in Fig. 6.8.

For the RoleOf relationship, UML multiplicity items can be defined in the following fashion:
- Because a role cannot be shared between different objects (in other words, a role has exactly one owner), the multiplicity for the owner end is always one (it is the default) and does not have to be specified.
- The multiplicity for the other (i.e. role) end specifies how many roles that are instances of the same class a given owner object can potentially possess.

Fig. 6.7 presents a class diagram for our example problem: the information about holding a position is modeled through the Manager, Engineer, and Salesman classes, whose instances are roles of Person objects. The multiplicity items specify that a person can have an arbitrary number of positions of these kinds. Note that the main difference between the model from Fig. 6.6 and the model from Fig. 6.7 is not the syntax (that is, the graphical representation; in fact, they are very similar visually), but the semantics. Applying the role concept in Fig. 6.7 has important consequences.
Although in both models instances of the Manager, Engineer, and Salesman classes cannot exist by themselves (without their wholes or their owners), from the model in Fig. 6.7 we know directly that they are roles, to which the entire paradigm of dynamic object roles presented in Chapter 5 applies. In particular, for the model in Fig. 6.7 the properties of Person objects are dynamically inherited by their Manager, Engineer, and Salesman roles; for the model in Fig. 6.6 this has to be simulated by the programmer through additional code (see also design patterns in Chapter 4.7).

Fig. 6.7 A class diagram modeling the situation from Fig. 6.5 through dynamic object roles

Fig. 6.8 A class diagram from Fig. 6.7 in separated target style

We want to emphasize that this feature can be fully understood if one looks at dynamic roles not only from the analyst’s viewpoint, but also from the programmer’s point of view. Then one can see why the concept has the potential to significantly reduce the gap between conceptual modeling and coding. If dynamic roles are supported both in conceptual modeling and in programming, the programmer can directly code a class diagram into a programming language’s statements. Without any additional programming effort, he/she achieves the whole functionality of roles that have been identified by analysts. However, if a programming language does not support this concept, then programmers have to code this functionality by themselves. This may not be straightforward, especially for complicated models, for example:
- A class can be a subclass (the static specialization relationship) and simultaneously can model roles (the RoleOf relationship). In Fig. 6.9 the Manager class, which models roles of Person, is a subclass of Employee; in other words, a person can be a manager, which is a kind of employee.
- Classes of roles can be used to structure hierarchies of RoleOf relationship paths, just as in the classical object models (i.e. those not supporting dynamic roles) classes of objects can be used to structure hierarchies of static specialization. This approach is especially useful if in our example a person could also play roles other than Manager, Engineer, and Salesman; for instance, if a person could be a student. Conceptually, being a student is different from being a manager, an engineer, a salesman, etc. In order to include this information in the model, we put the Employee class on the same level of abstraction as the Student class to group the classes of the Salesman, Engineer, Manager, etc. roles; see Fig. 6.10.
- There is no restriction on how deep a RoleOf relationship hierarchy can be. In particular, instances of the classes Salesman, Engineer, etc. can have their own roles.
- For the RoleOf relationship, various constraints can be defined. The xor-constraint in Fig. 6.10 specifies that a person as an employee can be either a salesman or an engineer, but not both. More complex constraints can be described (as usual in UML) in a natural language.

Fig. 6.9 A diagram in which a class of roles is involved both in static specialization and the RoleOf relationship

6.4. Combining Class and Role Hierarchies

Let us consider one more example to extend our discussion and to illustrate the fact that the static specialization relationship and the RoleOf relationship can be freely combined even in the same hierarchy. The diagram in Fig.
6.11 presents a situation in which we have assumed that a person is involved in a conference only as a conference attendee (someone who pays a conference fee and attends the conference; modeled by the ConferenceAttendee class), a member of the conference steering committee (the SteeringCommitteeMember class), or a member of the conference program committee (the ProgramCommitteeMember class). Because conceptually being a conference committee member is different from being a conference attendee, a special class ConferenceCommitteeMember has been introduced to group the roles concerning committee membership. Static specialization has been applied to this class to express the fact that, according to our assumption, a conference committee member belongs either to the steering committee or to the program committee. If someone wants to take part in the process of reviewing the papers submitted to the conference (the Referee class), according to our model he/she has to be a member of the program committee.

Fig. 6.10 An example of the RoleOf relationship between roles with an xor-constraint

Both a conference attendee and a conference committee member can present papers at the conference; in other words, they can be speakers. This is represented through the Speaker class, which models roles both for attendees and for committee members. In this case, we apply the RoleOf relationship twice: from Speaker to ConferenceAttendee and from Speaker to ConferenceCommitteeMember. Note that because by definition roles cannot be shared by different objects, no explicit xor-constraint (it is the default here) is necessary for these two relationships. It is interesting to learn why these two RoleOf relationships, albeit applied together, do not imply the problems of multiple inheritance (e.g., name conflicts).
To this end, let us analyze how dynamic inheritance works in this case: a Speaker role is attached to its ConferenceAttendee owner or to its ConferenceCommitteeMember owner dynamically (that is, at run time). Thus this role dynamically inherits the properties of the object to which it has been attached. For instance, if it has been attached to a ConferenceAttendee object, then it inherits that object's properties (and does not inherit any properties of any ConferenceCommitteeMember). In consequence, RoleOf does not mix the properties of the two conceptually different entities.

Fig. 6.11 A class diagram involving static specialization and the RoleOf relationship in one hierarchy

6.5. Multiple-Aspect Inheritance

In this chapter, we want to apply the concept of dynamic object roles to multiple-aspect inheritance. Let us consider one more time the diagram from Fig. 6.1 and let us replace the specialization for the Sex aspect with specialization for a Hobby aspect (i.e. a person can be a soccer fan, a basketball fan, etc.). With dynamic roles, we can remove the static specialization for both aspects. We have already discussed how to use dynamic roles instead of static specialization for the Job aspect; the other aspect (i.e. Hobby) can be transformed similarly, as presented in Fig. 6.12. Note that in this case the type of the objects created for this class diagram is Person; the information about a person’s hobby is stored as a role (see Fig. 6.13).

Fig. 6.12 A diagram with multiple RoleOf relationships instead of multiple-aspect static inheritance

Fig. 6.13 A person with a SoccerFan role

6.6. Classical Inheritance vs.
RoleOf Relationship

We should notice further differences between classical generalization and the dynamic generalization introduced by the RoleOf relationship. The same dynamic role class can be defined for two or more classes with different semantics, located differently in a class hierarchy. An instance of such a situation is presented in Fig. 6.14. The Tax-Payer class is defined as a role class both for the Person class and for the Company class. Instantiation of the Tax-Payer class leads to the creation of a new role, which is either a role of a Company object or a role of a Person object (because a role cannot be a role of both objects at the same time). The semantics discussed above differs from inheritance semantics in UML, because the behavioral specifications of the Person and Company classes are different. More precisely, inheritance of structure without inheritance of behavioral specification is allowed in UML through private inheritance [UML01]. However, private inheritance does not follow substitutability and overriding, which are important for mapping a conceptual model into implementation structures.

Fig. 6.14 A class diagram in which the same role class is defined for two classes with different natures (Person and Company each provide taxes(year); Tax-Payer is a role class of both)

Fig. 6.15 Subscriber example (classes: Person with name, birth_date, address; Client with registration_date; Employee with employment_date, insurance_no, seniority(); Subscriber with subs_no, status, actualSubs(), pastSubs(date), isClient(), isEmployee(), discount(); Subscription with start_date, end_date; note on the diagram: for employees up to 80% of discount)

We mentioned in Chapter 6.4 that static and dynamic generalizations can be combined in a class hierarchy with dynamic roles. For example, in Fig. 6.15 a Subscriber role can access insurance_no or call the seniority method in order to calculate some discount if it is a role of an Employee object.
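The Subscriber example can be mimicked in a language without role support, which also shows what the programmer has to code by hand. The following is only a minimal sketch in C under our own assumptions: the struct layout, the OwnerKind tag, the employment_year field, and the discount rule (5% per year of seniority, nothing for clients) are hypothetical; only the names Subscriber, Employee, Client, seniority, discount, and the 80% cap are taken from Fig. 6.15.

```c
/* Hypothetical sketch: a Subscriber role that reaches its owner's properties. */
typedef enum { OWNER_CLIENT, OWNER_EMPLOYEE } OwnerKind;

typedef struct {
    OwnerKind kind;        /* which class the owner object belongs to */
    const char *name;      /* Person property, visible to every role */
    int employment_year;   /* Employee property; unused for clients */
} Owner;

typedef struct {
    const Owner *owner;    /* a role has exactly one owner */
    int subs_no;
} Subscriber;

/* Employee method: years of employment (assumed definition). */
int seniority(const Owner *o, int current_year) {
    return current_year - o->employment_year;
}

int is_employee(const Subscriber *s) {
    return s->owner->kind == OWNER_EMPLOYEE;
}

/* Discount in percent. The 5%-per-year rule is invented for illustration;
   only the 80% cap for employees is taken from Fig. 6.15. */
int discount(const Subscriber *s, int current_year) {
    if (!is_employee(s)) return 0;
    int percent = 5 * seniority(s->owner, current_year);
    if (percent < 0) percent = 0;
    return percent > 80 ? 80 : percent;
}

/* "Inherited" property: the role answers with its owner's name. */
const char *subscriber_name(const Subscriber *s) {
    return s->owner->name;
}
```

Every access to an owner property has to be forwarded explicitly here (s->owner->...); with dynamic roles supported by the language, this delegation would be provided by dynamic inheritance itself.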
In general, roles can dynamically specify abstract or concrete classes with some features whose domain is independent of the domains of these classes. Therefore, roles may be helpful to cover cross-cutting concerns (see Chapter 5.3.15). Cross-cutting concerns implemented as roles can be important in RDF-based ontologies and metamodels [W3C02] to cover such universal meta-information as security rules, ownership, copyrights, up-to-dateness, etc.

6.7. Notation for RoleOf Instances

Since instances of dynamic inheritance (the RoleOf relationship) exist between roles (see footnote 27), we propose changes to the UML object diagram. Roles are linked with their parent objects using a notation similar to that for class diagrams, i.e. by paths with a white-and-black diamond end; see Fig. 6.16.

Footnote 27: We consider that every object has a default main role, which includes all its properties. Therefore, objects can be treated as roles too.

Fig. 6.16 An object diagram from Fig. 6.5 (a :Male object with one :Engineer, two :Salesman, and three :Manager roles)

7. Implementation

The most important features of dynamic object roles introduced in Chapter 5 have been implemented in a prototype. The prototype contains the following modules:
- The object store;
- The ODL parser;
- The SBQL parser (the engine of static analysis);
- The query evaluation engine.

The object store consists of three levels:
(i) The physical level; it is the Virtual Persistent Object Store [Subieta91a, Subieta91b];
(ii) The logical level, which realizes the concept of SBA triples and performs basic operations on them (create, retrieve, update, and delete);
(iii) The conceptual level, which implements the main object-oriented concepts of SBA, such as class, encapsulation, inheritance, objects, and roles, and allows acting on them. The conceptual level also realizes the concept of SBA stacks (environment and result stacks).

The ODL parser checks a database schema defined by a user and records it in the store.
The static analysis engine performs static analysis of SBQL queries and builds a syntax tree for a given query; the module is a version of the engine implemented in [Płodzien00], extended to deal with role-specific operators; see Chapter 5.5.7.

The grammars implemented in the ODL and SBQL parsers are presented in Appendix A. There are some minor differences from the SBQL syntax used in the dissertation. For instance, the dynamic casting operator introduced in Chapter 5.5.8.1 is written in the prototype as (role name) query.

The query evaluation engine evaluates a query defined by a user. The environment stack implemented in the module supports the idea of thin and thick stacks introduced in Chapter 5.5.3. The result of query evaluation is presented in two forms:
- a pretty form, which details the values and type of the result;
- a rough form, which shows the physical representation of the result.

For the following examples, we show the results presented by the prototype directly to the user. For a query the following are returned:
- the class schema (the same for all examples; see Fig. 7.1);
- the original form of the query;
- the form of the query after static analysis;
- the result of the query (in pretty and rough forms).

Fig. 7.1 An example of class schema

Example 1. The query
    ((role Employee) Person).works_in.Company
was transformed to the syntax-tree form (see Fig. 7.2)
    ((((role Employee) Person).works_in).Company)
and then evaluated (see Fig. 7.3).

Fig. 7.2 The query from Example 1 after the syntax analysis.

Fig. 7.3 The result of query from Example 1.

Example 2. The result of the query
    (Person as p).(p.name, ((role Student) p).scholarship + ((role Employee) p).salary)
is depicted in Fig. 7.4 (we omitted the rough part of the result).

Fig. 7.4 The result of query from Example 2.

Example 3. The result of the query
    roles of (Person where name = "Smith")
is depicted in Fig. 7.5.

Fig. 7.5 The result of query from Example 3.
We should notice that the results of the queries presented in the above examples are as expected. This shows that our concept of dynamic object roles is implementable (see footnote 28) in the Stack Based Approach. We can also state that the object store with roles, the extended concept of environment stack, and the role-specific operators introduced in the dissertation have proved useful.

Footnote 28: The prototype has been constructed in Borland C++ Builder 6. It runs as a console application in the Win32 environment.

8. Conclusions

The main goal of this dissertation was to incorporate the concept of dynamic object roles into the stack-based approach, which we consider an alternative to the classical database object models (such as the ODMG object model), and into the UML class diagram, which is an important tool of conceptual modeling. The novelty of our approach is that:
- It makes use of a universal object data model (i.e., SBA).
- It incorporates role-specific operators into a query language (i.e., SBQL) in order to act on collections of objects and roles.
- It introduces a precise and simple proposal for extending the UML class diagram to deal with dynamic roles.

In Chapter 5 we have introduced a formal stack-based model with dynamic object roles and discussed how the classical concepts of object-orientedness, such as object identity, polymorphism, inheritance, and overriding, can be smoothly incorporated into the model. The model leads to some new concepts for meta-data management, a schema definition language, and a query language, discussed later in Chapter 5. In Chapter 6 we have presented a proposal for introducing dynamic object roles into the UML (Unified Modeling Language) class diagram and discussed the advantages of that approach.

The main results of our thesis are as follows:
- Dynamic object roles can enrich conceptual models of many applications without leading to the anomalies and limitations of multiple, repeating, or multiple-aspect inheritance.
The concept bridges the gap between perception, analysis, design, and programming.
- SBA is a universal data model that makes it possible to support dynamic object roles in a precise and universal fashion.
- Our extension to the UML class diagram, defining a new RoleOf relationship, has the potential to become a useful UML element. Its main advantage is the ability to model complex real-life situations in a natural manner, which otherwise would have to be expressed through lower-level mechanisms as an intricate combination of generalizations, associations, and constraints.
- The role-specific operators, dynamic casting in particular, are powerful mechanisms of a query language when they act on objects and roles.
- The concept of dynamic object roles is implementable in the stack-based approach both in the ODL and in the SBQL languages.

We can consider the following issues as extensions of our approach:
- The extension of the DDL language to support constraints among roles (see Chapter 6);
- Constructing a system of types in the object model with dynamic object roles;
- Introducing dynamic roles into the UML metamodel;
- Developing cross-cutting concerns in the spirit of AOP that make use of the concept of dynamic object roles. Each aspect can be encapsulated as a separate role, which would provide appropriate data, e.g., active rules, security, privacy, and visualization.

These extensions require further research.

Appendix A. The Prototype - General Description

In our approach, the implementation of the object store consists of three levels: physical, logical, and conceptual.

The physical level defines the physical organization of persistent memory storage and the operations that can be performed within the storage. The physical level of the object store specifies how data are physically represented in a computer’s persistent memory. It may be based on one of the well-known database management systems (object-oriented as well as relational).
However, in this work the physical level has been developed as a separately built memory package.

The logical level of the object store defines the fundamental concepts of the object store model, strictly related to the stack-based approach (SBA). At this level, all beings are represented through collections of simple or complex objects, without any further specialization of objects.

Classes and relationships among classes, as well as class instances and links between objects, can be defined at the highest (conceptual) level of the object store. The conceptual level also allows defining the new concepts introduced in this work, i.e., role classes and role instances.

In the next sections, details of all levels of the object store are presented.

A.1. The Physical Level

The physical level is responsible for storing, retrieving, and updating data within a given database or database management system. The database can support the object-oriented data model, the relational model, or any other model that is capable of capturing SBA features. In this work, the implementation of the physical store is based on the Virtual Persistent Object Store (VPOS) [Subieta91a, Subieta91b].

The VPOS is a package that includes a collection of C procedures to organize a persistent memory and to manipulate the data units within this memory, called atoms. From the programmer’s point of view, an atom is an independent unit storing some elementary information, e.g., an attribute value, a number, or a reference to another atom. Atoms are identified by long (4-byte) integers and can be referenced by their identifiers (see footnote 29). They can be freely created, deleted, and updated (see footnote 30). Updating an atom never changes the identifiers of other atoms. However, an atom’s own identifier may change when its size increases.

Every atom possesses its own name and type. Generally, the type determines whether an atom is atomic, structural, or serves another special purpose. Atomic atoms may store an atomic value, e.g. an integer, a real number, or a string.
When atoms store cross-references (representing links) to other atoms, they are called logical pointers or reference atoms. Structural atoms allow defining structures that constitute different data levels. The VPOS introduces two main types of structural atoms, called rings and spiders. From the functional point of view, rings and spiders are equivalent, because all functions of the VPOS act on them in the same manner (see footnote 31). Structural atoms consist of members, i.e. other atoms. Rings and spiders can be degenerate and have no members. The main ring is a special ring that contains members only and has no owner. A member of a given ring/spider may be another ring/spider, an atomic atom, or a reference atom.

VPOS supports full consistency of links, i.e. when an atom is moved to another place, all cross-references to it are automatically updated, and when an atom is deleted, all cross-references to it are automatically removed too.

An atom name need not be unique within the VPOS, nor within a given ring or spider. Names are represented by two-byte codes, and they are stored as strings within a special name dictionary. Some names (codes) are reserved for special purposes.

Footnote 29: Instead of the long type, an equivalent type is introduced: POINT.

Footnote 30: Atoms may have different sizes, from 16 to (16M - 1) bytes.

Footnote 31: A ring is a collection of atoms where one atom is called the owner and the others are called members. The owner contains a reference to the first member; the first member contains a reference to the second one, etc. A spider has a similar organization, but its owner contains a directory of all members (references, names, and types of members) and each member possesses a reference to the owner. Spiders allow faster processing at the cost of some extra storage space.

A.2. The Logical Level

The logical level of the object store defines objects according to the stack-based approach as simple and complex (structural) objects.
This layer is a gateway (wrapper) between the conceptual and the physical levels. It is strictly bound to the physical layer implementation, so for each different physical layer implementation an adequate logical layer implementation should be constructed. In this work, the physical layer is based on the VPOS; therefore the logical level uses some exported VPOS operations and often refers to the VPOS organization.

Because some objects have reserved names that cannot be identified by string equivalents, several functions defined at the logical and conceptual levels exist in two forms:
- using an object name as a string;
- using an object name as a short integer number (a code name). Names of such functions usually contain the infix Code.

A.2.1. The Definition of an Object

In the stack-based approach, objects are defined as triples:
(identifier, name, value), for simple objects,
(identifier, name, content), for complex objects,
where value either stores an atomic value (e.g., a number, character, or string) or contains a reference to another object (a logical pointer). In the first case objects are called atomic, in the second one reference objects. Content constitutes a set of members, which can be simple as well as complex objects. Naturally, simple objects are represented in the physical store by simple atoms, and complex objects are represented by structural atoms.

At the logical level of the store, only basic operations on objects are specified, allowing objects to be created, retrieved, updated, and deleted (CRUD). Other object features, such as class membership, encapsulation, and visibility, are defined at the higher store level and will be presented in the next sections. However, one ought to assume that they will be implemented through some special components of complex objects whose behavior must be somewhat different from the behavior of ordinary objects.
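The triple definition of A.2.1 can be sketched in C as a tagged structure. This is only an illustration under our own assumptions: the names SbaObject, SbaKind, ObjId, sba_is_simple, and sba_member_count are hypothetical and do not come from the prototype, which represents objects as VPOS atoms; atomic values are simplified here to a single long.

```c
#include <stddef.h>

typedef long ObjId;   /* stands in for the prototype's POINT identifier */

typedef enum { SBA_ATOMIC, SBA_REFERENCE, SBA_COMPLEX } SbaKind;

typedef struct SbaObject {
    ObjId id;           /* first element of the triple */
    const char *name;   /* second element of the triple */
    SbaKind kind;       /* selects which variant of the third element is used */
    union {
        long atomic;    /* value of an atomic object (simplified to a long) */
        ObjId ref;      /* value of a reference object: a logical pointer */
        struct {
            struct SbaObject **members;   /* content of a complex object */
            size_t count;
        } content;
    } u;
} SbaObject;

/* Simple objects are the atomic and the reference ones. */
int sba_is_simple(const SbaObject *o) {
    return o->kind != SBA_COMPLEX;
}

/* Number of members; simple objects have no content. */
size_t sba_member_count(const SbaObject *o) {
    return o->kind == SBA_COMPLEX ? o->u.content.count : 0;
}
```

For instance, a Person object with a name attribute and a reference to another object would be a complex SbaObject whose content holds one atomic and one reference member.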
The distinction between simple and complex objects should be considered a departure from treating objects in a uniform way. For example, simple objects cannot become complex objects and vice versa. However, it is worth noting that such a necessity will occur very rarely. Naturally, an assumption can be made that all objects have a complex structure, even if they store only an atomic value. For simplification, we adopt this point of view, and thus all beings of the real world will be represented as compound objects. Therefore, simple objects will rather represent properties of beings (e.g. attributes) or some programming entities (e.g. binders).

We noticed earlier that the implementation of the store is based on VPOS (Virtual Persistent Object Store) [Subieta91a, Subieta91b]. We should also mention several essential assumptions related to the object store:
- Objects are partially ordered by the first/next relation. This relation determines the order in which objects placed at the same data level in the store are searched, e.g. when they are members (subobjects) of a given object or when they are input objects in queries, called root objects. The relation is consistent with the logical data structure in the VPOS. At each data level, first returns an identifier that refers to the logically first subobject of a given object, and next returns a reference to the logically next object placed at a given data level. We assume, as in the VPOS, that the relation is stable (i.e. first and next return the same reference as long as a given object is not modified).
- A database is physically stored by the VPOS within just one structural object identified as VPOS_ENTRY.
- Root objects are subobjects of the main object (VPOS_ENTRY) and they may be directly accessed; all remaining objects are indirectly accessible through navigation within root objects.

A.2.2.
The Definition of Atomic Value

The notion of value is related to simple objects, which can store atomic values. Complex objects have no values; they store contents, which reveal a complex structure. However, the content of a complex object can be composed of one or more simple objects. In our store, two kinds of simple objects are distinguished: atomic and reference objects. Since reference objects contain references, which are a special kind of value, we consider them separately.

We assume that several types of atomic values should exist in the store:
- Float (8-byte float number);
- Integer (4-byte integer);
- String.

In business database applications there usually occur many more base types, e.g., short, long long, char, date, time. However, most of them can be treated as subtypes of the above three types, and supporting all of them does not bring any important quality to our concept of dynamic object roles or to its implementation.

For retrieving, updating, and creating atomic objects, the VALUE structure is introduced:

    typedef struct {
        ATOMIC_VALUE_TYPE typeValue; /* a value type: integer, float, string */
        IntegerT intValue;
        FloatT floatValue;
        StringT stringValue;
    } VALUE;

A.2.3. Creating Objects

Because there are three kinds of objects in the store, three different groups of operations for creating objects are specified: for atomic, reference, and complex objects.

    POINT AtomicCreate(char *Name, VALUE *Value, POINT Owner, FLAGS Flags)
    POINT AtomicCodeCreate(unsigned short int NameCode, VALUE *Value, POINT Owner, FLAGS Flags)

The functions create a new atomic object with a given name and value as a new member (subobject) of the Owner object. Additional features of the created object are specified by flags, e.g., whether the new object should be persistent (default) or transient (the Flags parameter is not implemented).
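As an illustration of how a VALUE might be filled in before it is passed to AtomicCreate, consider the following minimal sketch. It is not the prototype's code: the text names ATOMIC_VALUE_TYPE, IntegerT, FloatT, and StringT without defining them, so the definitions below are our assumptions, as are the helper names value_from_int, value_from_float, value_from_string, and value_equal.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-ins for prototype types that the text does not define. */
typedef enum { VT_INTEGER, VT_FLOAT, VT_STRING } ATOMIC_VALUE_TYPE;
typedef int32_t IntegerT;       /* 4-byte integer */
typedef double FloatT;          /* 8-byte float on common platforms */
typedef const char *StringT;    /* assumed: a C string */

typedef struct {
    ATOMIC_VALUE_TYPE typeValue;   /* selects the significant field below */
    IntegerT intValue;
    FloatT floatValue;
    StringT stringValue;
} VALUE;

/* Hypothetical helpers: build a VALUE with only the relevant field set. */
VALUE value_from_int(IntegerT v) {
    VALUE r = { VT_INTEGER, v, 0.0, NULL };
    return r;
}

VALUE value_from_float(FloatT v) {
    VALUE r = { VT_FLOAT, 0, v, NULL };
    return r;
}

VALUE value_from_string(StringT v) {
    VALUE r = { VT_STRING, 0, 0.0, v };
    return r;
}

/* Compare two values; only the field selected by typeValue is inspected. */
int value_equal(const VALUE *a, const VALUE *b) {
    if (a->typeValue != b->typeValue) return 0;
    switch (a->typeValue) {
    case VT_INTEGER: return a->intValue == b->intValue;
    case VT_FLOAT:   return a->floatValue == b->floatValue;
    case VT_STRING:  return a->stringValue != NULL && b->stringValue != NULL
                         && strcmp(a->stringValue, b->stringValue) == 0;
    }
    return 0;
}
```

Keeping all three fields in one struct, as the prototype's VALUE does, wastes a few bytes compared with a union but lets a caller read any field without first checking typeValue.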
    POINT ReferenceCreate(char *Name, POINT Ref, POINT Owner, FLAGS Flags)
    POINT ReferenceCodeCreate(unsigned short int NameCode, POINT Ref, POINT Owner, FLAGS Flags)

The functions create a new reference (pointer) object with a given name and the Ref value as a new member (subobject) of the Owner object. Additional features of the created object are specified by flags, e.g., whether the new object should be persistent (default) or transient (the Flags parameter is not implemented).

    POINT IntCreate(char *Name, VALUE *Value, POINT Owner)
    POINT IntCodeCreate(unsigned short int NameCode, VALUE *Value, POINT Owner)
    POINT ICreate(char *Name, IntegerT intValue, POINT Owner)
    POINT ICodeCreate(unsigned short int NameCode, IntegerT intValue, POINT Owner)

The functions create a new atomic object with a given name and an integer value as a new member (subobject) of the Owner object. Unlike AtomicCreate, these variants take no Flags parameter.

    POINT StringCreate(char *Name, VALUE *Value, POINT Owner)
    POINT StringCodeCreate(unsigned short int NameCode, VALUE *Value, POINT Owner)
    POINT SCreate(char *Name, StringT stringValue, POINT Owner)
    POINT SCodeCreate(unsigned short int NameCode, StringT stringValue, POINT Owner)

The functions create a new atomic object with a given name and a string value as a new member (subobject) of the Owner object. Unlike AtomicCreate, these variants take no Flags parameter.
    POINT FloatCreate(char *Name, VALUE *Value, POINT Owner)
    POINT FloatCodeCreate(unsigned short int NameCode, VALUE *Value, POINT Owner)
    POINT FCreate(char *Name, FloatT floatValue, POINT Owner)
    POINT FCodeCreate(unsigned short int NameCode, FloatT floatValue, POINT Owner)

The functions create a new atomic object with a given name and a float value as a new member (subobject) of the Owner object. Unlike AtomicCreate, these variants take no Flags parameter.

    POINT ComplexCreate(char *Name, POINT Owner, TypeKind Type, FLAGS Flags)
    POINT ComplexCodeCreate(unsigned short int NameCode, POINT Owner, TypeKind Type, FLAGS Flags)

The functions create a new complex object with a given name and with empty content as a new member (subobject) of the Owner object. Type determines whether the new object is ordinary (i.e. it represents a real-world being) or special purpose (e.g., a class will be represented as an object of a special type). Additional features of the created object are specified by flags, e.g., whether the new object should be persistent (default) or transient (the Flags parameter is not implemented).

A.2.4. Information Retrieval Functions

    int getType(POINT Object)
The function returns the type of a given object.

    char* getName(POINT Object, char *Name)
The function returns the name of a given object as a string.

    unsigned short int getCodeName(POINT Object)
The function returns the name of a given object as a code.

    int isReserved(POINT Object)
The function returns 1 when the object has a reserved name, and 0 otherwise.

    int isAtomic(POINT Object)
The function returns 1 if the object identified by Object is an atomic object, and 0 if it is a reference or complex object.

    int isPointer(POINT Object)
The function returns 1 if the object identified by Object is a reference (pointer) object, and 0 if it is not.
int isComplex( POINT Object )
The function returns 1 if the object identified by Object is complex, and 0 otherwise.

FLAGS getFlags( POINT Object )
The function returns the value of the flags associated with the object, e.g., those related to the object's persistence.

VALUE* getValue( POINT AtomicObject )
The function returns the value of the atomic object identified by AtomicObject. It returns NULLPNTR if the operation is performed on a reference or complex object.

FloatT getFloat( POINT FloatObj )
The function returns the float value stored within FloatObj.

IntegerT getInt( POINT IntObj )
The function returns the integer value stored within IntObj.

StringT getString( POINT StringObj )
The function returns the string value stored within StringObj.

POINT getRef( POINT RefObject )
The function returns the reference stored within the RefObject object. It returns NULLPNTR if the operation is performed on an atomic or complex object.

POINT findSubord( POINT Owner, char *Name, TypeKind Type, FLAGS Flags )
POINT findCodeSubord( POINT Owner, unsigned short int NameCode, TypeKind Type, FLAGS Flags )
The functions return the identifier of the first member (subobject) of Owner that has the given name, type, and flags. If Name is empty ("") then the reference to the first subobject of Owner is returned; if Flags == -1 then the Flags parameter is ignored during searching. (The Flags parameter is not implemented.) The functions assume that Owner refers to a complex object. When Owner is the identifier of an atomic or reference object, NULLPNTR is returned.

POINT findNext( POINT Entry, char *Name, TypeKind Type, FLAGS Flags )
POINT findCodeNext( POINT Entry, unsigned short int NameCode, TypeKind Type, FLAGS Flags )
The functions search for the next member (object) after the Entry member (at the same data level as the Entry object), according to the given Name, Type, and Flags criteria. If Name is empty ("") or Flags == -1 then the respective parameter is ignored during searching. (The Flags parameter is not implemented.)

POINT getSubord( POINT Object )
The function returns a reference to the first subordinated object of a given object, or NULLPNTR.

POINT getNext( POINT Object )
The function returns a reference to the object that follows a given one in the same (ring or spider) structure.

POINT getOwner( POINT Object )
The function finds the owner of a given object. When Object is a reference to a root object, NULLPNTR is returned.

A.2.5. Updating Functions

POINT setValue( POINT Object, Value* NewValue )
The function assigns a new value to the atomic object referenced by Object. The type of NewValue should be consistent with the actual type of the object's value. The function returns a reference to the given object after modification of its value. When the operation cannot be performed correctly (e.g., if Object refers to a reference or complex object) it returns NULLPNTR.

POINT setRef( POINT Object, POINT NewRef )
The function assigns the NewRef value to the reference object. It returns the value of the new reference, or NULLPNTR if an error occurred.

void ObjCodeRename( POINT Object, unsigned short int NewName )
The function renames a given object.

ObjCpIns( POINT Object, POINT Owner )
The function copies a given object and inserts the copy as the last member of Owner.

POINT CopyObj( POINT Object, POINT NewOwner, POINT After, OPTION Option, FLAGS Flags )
The function copies the object identified by Object, with its whole content, as a new member (subobject) of the NewOwner object. Option determines the kind of copying: 0 means standard copying (all references contained in the object's content before copying are preserved); 1 means isomorphic copying (references in the object's content that point inside the object are replaced by corresponding references that point inside the new copy). If Flags is set (<> -1) then the new Flags are assigned to the copy, otherwise the old flags are preserved. If After is not NULLPNTR and NewOwner is NULLPNTR then Object is copied as the object next to After (at the same data level as the After object). (The Option and Flags parameters are not implemented.)

void ObjMoveLast( POINT Object, POINT Owner )
The function disconnects an object (Object) from its old structure and inserts it into the new one (Owner) as its last member.

POINT ObjMoveAllMembers( POINT Old_Owner, POINT New_Owner )
The function moves all members of a given owner (Old_Owner) into a new owner (New_Owner) and then deletes the old owner.

POINT MoveObj( POINT Object, POINT NewOwner, POINT After, OPTION Option_in, OPTION Option_out, FLAGS Flags )
The function disconnects Object from its old data level and inserts it, with its whole content, into the object referenced by NewOwner (as a new member of NewOwner). If Option_in == 0 then all copied references that lead inside Object are exchanged isomorphically; if Option_in == 1 then these references are removed. If Option_out == 0 then all external references that lead to Object are deleted; if Option_out == 1 then all external references to Object are updated accordingly. If After is not NULLPNTR and NewOwner is NULLPNTR then Object is moved to the position next to After, otherwise Object is placed as the last member (subobject) of NewOwner. (The Option_in, Option_out, and Flags parameters are not implemented.)

void Delete( POINT Object )
The function removes the object referenced by the Object identifier. If the given object is complex then all its subobjects are recursively deleted too. All external references to deleted objects are also deleted recursively, i.e., all references to deleted pointers are removed.

FLAGS setFlag( POINT Object, FLAGS NewFlags )
The function sets new flags for the object identified by Object. It returns the value of the flags associated with Object after modification. (The method is not implemented.)

A.3. The Conceptual Level

The conceptual level is an abstract level of the object store.
It defines an object-oriented data model which consists of class and role class definitions, generalization relationships between classes, and binary associations, as well as class instances, role instances, and links between instances.

A.3.1. Class

A class is introduced at the conceptual level as a complex object of a special type, which may include several members (subobjects) representing attributes, methods, and relationships. An attribute definition is stored in a structural object with the reserved name ATTRDEF. Similarly, a method definition and a relationship definition are represented by complex objects with the reserved names METHDEF and RELDEF. If a given class has one or more parent classes, then the object containing the class definition includes one or more corresponding reference subobjects with the reserved name CLTOSUPERCLDEF.

POINT ClassDef( char* Name, POINT Owner )
The function defines a new class with a given name. The new class is inserted as the last member of the complex object identified by Owner. The operation returns a reference to the created class, or NULLPNTR if an error occurred.

POINT getSuperClass( POINT Class )
The function returns a reference to a superclass of a given class.

int setSuperClass( POINT Class, POINT SuperClass )
The function specifies that Class is a direct subclass of SuperClass. It returns 1 if successful and 0 otherwise.

int isSuperClass( POINT Class, POINT SuperClass )
The function determines whether Class is a direct subclass of SuperClass. It returns 1 if yes, and 0 if not.

POINTSET getSubClass( POINT Class )
The function returns the set of all subclasses of a given class. (The method is not implemented.)

POINTSET getExtent( POINT Class )
The function returns the extent of a given class, i.e., the actual set of all instances of this class. (The method is not implemented.)

A.3.1.1. Attribute

An attribute definition object contains:
- an atomic object with the reserved name ATTRNAME that stores the attribute name;
- an atomic object with the reserved name ATTRATOMTYPE that stores the atomic type of the attribute value;
- an atomic object with the reserved name ATTRCOLLTYPE that determines whether the attribute is multivalued (a set or list collection).

int AttributeDef( POINT Class, char *Name, ATOMIC_VALUE_TYPE AttrType, COLL_TYPE CollType )
The function defines a new attribute, with a given name (Name), of the class referenced by Class. The type of the atomic attribute value is specified by AttrType and can be integer, float, or string. CollType determines whether the new attribute has exactly one value or may have more than one value organized as a list or a set (NO_COLL, COLL_LIST, COLL_SET).

int getAttributeDef( POINT Class, char* Name, ATOMIC_VALUE_TYPE* AttrType, COLL_TYPE* CollType )
The function retrieves information about an attribute defined within a given class, namely its atomic type and whether the attribute constitutes a collection. The operation returns 1 if it is successful, and 0 otherwise.

unsigned short int getAttrCodeName( POINT ATTR )
The function returns the name (code) of an attribute. ATTR is a reference to an attribute definition object.

A.3.1.2. Method

A method definition object consists of two members, atomic objects with the reserved names METHSIG and METHBODY, which contain strings with the signature and the body of a given method, respectively.

int MethodDef( POINT Class, char* Signature, char* Body )
The function defines a new method of a given class with a certain signature. A signature specifies the method name as well as the parameter and return value types. We assume that a method body is a text in OQL-style syntax. Visibility can be defined similarly as for attributes. An instance-scope method is one that may be applied to individual objects. A class-scope method is one that can be applied to the class itself, but not to individual objects (e.g., a method that creates an instance of a class). Visibility is not implemented.

StringT getMethSig( POINT Method )
The function returns the signature of a given method (as a string).

A.3.1.3. Relationship

A relationship definition object contains:
- an atomic object with the reserved name RELNAME that stores the relationship name;
- a pointer object with the reserved name RELTARGETCLNAME that stores a reference to the target class;
- an atomic object with the reserved name RELCOLLTYPE that determines whether the relationship is multiple or not;
- an atomic object with the reserved name RELINVNAME that stores the inverse path name.

int RelationshipDef( POINT Class, char* Name, POINT RelTargetClass, COLL_TYPE TargetCollType, char* InverseName )
The function defines a binary relationship (association) between the classes Class and RelTargetClass. Although n-ary[32] relationships are introduced in many object-oriented methodologies and notations (e.g., in UML), most object-oriented standards and languages (e.g., ODMG) do not support n-ary associations. A binary relationship is a semantic relationship between two classes that involves connections among their instances, called links. A binary relationship has two ends and two names: Name and InverseName. Navigation from a Class instance to one or more instances of RelTargetClass is performed through the relationship name (Name), provided the binary relationship can be traversed from that end. When InverseName is given, the binary relationship is bidirectional and InverseName specifies the name used for navigation from the other relationship end. When InverseName is omitted, the relationship is directed. TargetCollType specifies the multiplicity of the target end, i.e., whether the number of RelTargetClass instances that may be related to a Class instance is exactly one or can be more than one. Each relationship end has its own multiplicity value.
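The difference between a bidirectional relationship (InverseName given) and a directed one (InverseName omitted) can be sketched in a few lines. The sketch below is an illustration only and assumes a simplified in-memory model: the `Instance` and `Ref` types and the `linkCreate` and `navigate` functions are hypothetical names, not part of the prototype's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REFS 8

/* Hypothetical in-memory instance holding named references to other instances. */
typedef struct Instance Instance;
typedef struct { const char *name; Instance *target; } Ref;
struct Instance { const char *label; Ref refs[MAX_REFS]; int nrefs; };

static int addRef(Instance *from, const char *name, Instance *to) {
    if (from->nrefs >= MAX_REFS) return 0;
    from->refs[from->nrefs].name = name;
    from->refs[from->nrefs].target = to;
    from->nrefs++;
    return 1;
}

/* Creates a link from a to b under `name`. If inverseName is not NULL the
   link is bidirectional and b also receives a reference back to a. */
static int linkCreate(Instance *a, const char *name,
                      Instance *b, const char *inverseName) {
    if (!addRef(a, name, b)) return 0;
    if (inverseName != NULL && !addRef(b, inverseName, a)) {
        a->nrefs--; /* undo the first half so the link is never half-built */
        return 0;
    }
    return 1;
}

/* Follows the first reference with the given name, or returns NULL. */
static Instance *navigate(const Instance *from, const char *name) {
    for (int i = 0; i < from->nrefs; i++)
        if (strcmp(from->refs[i].name, name) == 0) return from->refs[i].target;
    return NULL;
}

static int demoBidirectional(void) {
    Instance emp = { .label = "anEmployee" };
    Instance comp = { .label = "aCompany" };
    linkCreate(&emp, "works_in", &comp, "employs");
    return navigate(&emp, "works_in") == &comp
        && navigate(&comp, "employs") == &emp;
}

static int demoDirected(void) {
    Instance a = { .label = "a" };
    Instance b = { .label = "b" };
    linkCreate(&a, "manager", &b, NULL);
    return navigate(&a, "manager") == &b && b.nrefs == 0;
}
```

In the bidirectional case both ends can be traversed by name; in the directed case the target holds no reference back to the source.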
int getRelationshipDef( POINT Class, char* Name, POINT* RelTargetClass, COLL_TYPE* TargetCollType, char* InverseName )
The function retrieves information about the relationship with a given name that is defined within a given class, namely the target class identifier (RelTargetClass), whether the target end of the relationship is multiple or not (TargetCollType), and the inverse relationship name (InverseName).

unsigned short int getRelCodeName( POINT REL )
The function returns the name (code) of a relationship. REL is a reference to a relationship definition object.

unsigned short int getRelTargetClassName( POINT REL )
The function returns the name (code) of the target class. REL is a reference to a relationship definition object.

[32] An n-ary association is an association among three or more classes.

A.3.2. Role Class

A role class is represented by a class definition object that includes one or more complex objects with the reserved name ROLEDEF. Each ROLEDEF object contains:
- a reference object with the reserved name ROLEDEFTOCLNAME that stores a reference to a base class;
- an atomic object with the reserved name ROLEMULTIPLICITY that stores information on whether the role is mandatory or optional, and whether it is multiple.

POINT RoleDef( POINT BaseClass, POINT RoleClass, MultiplicityRange Multiplicity )
The function specifies that RoleClass is a role class of BaseClass. Multiplicity defines the possible number of role objects that can exist simultaneously for a particular BaseClass instance. Roles are optional if the multiplicity includes zero; roles are mandatory if the multiplicity is at least one.

int getRoleDef( POINT BaseClass, POINT RoleClass, MultiplicityRange* Multiplicity )
The function determines whether the class RoleClass is a role class of the class BaseClass. If so, the multiplicity of the role is returned through Multiplicity.

int isRoleClass( POINT Class )
The function returns 1 when Class is a role class.

A.3.3. Class Instance

We assume that classes determine the structure and behavior of their instances. In our approach, all instances are represented in the store as complex objects. An instance of a given class has its own identity, name and content. The content consists of values that are consistent with the corresponding specifications of invariants (e.g., attributes) within this class. There are two kinds of values: attribute values and reference values derived from links[33]. If an attribute is not multivalued then its atomic value is stored within an atomic object, but if the attribute can be multivalued (a set or list of atomic values) then all its values are stored within a complex object with a special structure. A binary link is represented in the store by two reference objects placed within the two (complex) objects that are connected by this link. There is no need to store the fixed parts of class definitions (e.g., methods, attribute types) within instances, thus these elements are represented within class definition objects only.

Instances are created at run time as a result of primitive creation operations. Conceptually, a new object is created completely in one step. However, from the programmer's point of view, this process consists of three steps. First, a new complex object is created with a new identity and with the same name as its class; then its data structure is created in accordance with the corresponding class definitions. Finally, several role instances whose parent object is the created object may be created. The creation of such role instances is governed by constraints imposed by certain role definitions, i.e., when the new objects are instances of classes for which mandatory roles are specified.

When a given class is a root class, its new instance obtains the values of attributes and references derived from this class only.
However, if a given class has one or more parent classes, then its new instance has the values of attributes and references derived both from this class and from its parent classes.

[33] A link between two objects is an instance of the corresponding relationship between classes. In our approach, we support only binary relationships.

VARIANT I (not implemented)

Due to encapsulation, and in order to support substitutability, the values of attributes and references derived from a given class should not be mixed up with the values of attributes and references derived from its parent classes. This is achieved by wrapping all attribute and reference values derived from a given parent class, in a recursive manner, as follows:
- Values of attributes and references derived from the class of which a given object is a direct instance are placed as members at the highest level of the content.
- Values of attributes and references derived from the classes of which a given object is an indirect instance are recursively captured and placed within a set of special structures. Each structure encapsulates all attribute and reference values derived from a given class.

An attribute value is represented in the store by an atomic or complex object with the reserved name ATTRVAL. When the attribute can be multivalued, the corresponding ATTRVAL object can contain more than one atomic object; all of them store values of this attribute and their names are the same as the attribute name. Otherwise, ATTRVAL is an atomic object with the same name as the attribute, which stores its value. Each relationship instance is represented in the store by a complex object with the reserved name REF, whose members are pointer objects named after the corresponding relationship. A REF object has one or more members in accordance with the multiplicity of the relationship.

Within an instance there is also stored the information about its class membership.
It is represented by reference objects with the reserved name OBJTOCLNAME, which store references to the class of which a given object is a direct instance. The information on whether a given object is an indirect instance of one or more classes is not stored within instances; it is available only in the class schema defined through class definition objects.

VARIANT II (implemented)

Encapsulation is a concept that occurs at the class schema level, and it is not really related to class instances. Thus, the content of a given instance is the union of all attribute values and all references derived from links connected to this object. This means, for example, that values derived from attributes defined within the class of which a given object is a member coexist side by side with values derived from attributes that this class inherits from its parent classes. To make it possible to identify what a given value means, the value must be related to the corresponding attribute.

Each attribute value is represented in the store by a complex object with the reserved name ATTRVAL, whose members are atomic objects with the reserved name ATOMICVAL. If the attribute can be multivalued then the corresponding ATTRVAL object can contain more than one ATOMICVAL member, all of which store values of this attribute; otherwise ATTRVAL has only one ATOMICVAL member, which stores the attribute value. An ATTRVAL object also contains a reference to the corresponding attribute definition, stored in a reference object with the reserved name ATTRDEFREF. Through this reference the attribute name, type and multiplicity of a given value are known.

Each link is represented in the store by a complex object with the reserved name REF, whose members are: a reference object with the reserved name REFVAL that stores a reference to another object, and a reference object with the reserved name REFDEFREF that stores a reference to the corresponding relationship definition.

Within a given instance there is also stored the information about its class membership.
It is represented by reference objects with the reserved name OBJTOCLNAME, which store identifiers of the class of which a given object is a direct instance. The information on whether a given object is an indirect instance of one or more classes is not represented within instances; it is available only in the class schema defined through class definition objects.

POINT InstanceCreate( POINT Class )
The function creates a new instance of the class identified by Class. Class has to be an ordinary class, i.e., it cannot be a role class. After creation, the new instance holds empty values for the attributes of Class and of its parent classes, and its references for the corresponding relationships (derived in the same manner) are set to NULLPNTR. If mandatory roles are specified for Class, then after the creation of a new instance of this class the corresponding roles must also be created. To assure consistency, a query language supporting dynamic object roles should provide a construct for creating instances that is capable of creating both single instances and instances together with their roles. The function returns the identifier of the created object, or NULLPNTR if an error occurred.

POINT setLinkValue( POINT Object, char *Name, POINT Ref )
The function sets the value of a given link that is stored within a reference object (Object).

POINT setAttrValue( POINT Object, char *Name, VALUE *NewValue )
POINT setAttrIntValue( POINT Object, char *Name, IntegerT intValue )
POINT setAttrStringValue( POINT Object, char *Name, StringT stringValue )
POINT setAttrFloatValue( POINT Object, char *Name, FloatT floatValue )
The functions set the value of a given attribute instance that is stored within Object.

A.3.4. Role Class Instance

Instances of role classes are created similarly to instances of ordinary classes. It was mentioned in the previous section that no role instance can exist without its parent object.
Therefore, a new role is always created for a given base object. The relationship between a role and its base object is represented within the role instance by a pointer object with the reserved name ROLETOBASELINK, which stores a reference to the base object. The remaining content of roles is represented analogously to the content of ordinary objects.

POINT RoleCreate( POINT RoleClass, POINT BaseObject )
The function creates a new role instance of RoleClass whose parent object is BaseObject. The identifier of the new object is returned if successful, otherwise NULLPNTR is returned.

A.3.5. Other Functions

Some further functions are needed to express operations that can be performed on instances, roles and their classes.

POINT getClass( POINT Object )
The function returns the identifier of the class whose direct instance is Object.

POINT isInstanceOf( POINT Object, POINT Class )
The function determines whether Object is a direct instance of Class.

int getRoleInfo( POINT Role, POINT Base, MultiplicityRange* Multiplicity )
The function determines whether a given object (Role) is a role of another object (Base). If so, information about the role multiplicity is returned through Multiplicity.

POINT getRoleBase( POINT Role )
The function returns the identifier of the parent object of a given role.

POINT getRole( POINT Object )
The function returns the identifier of the first base-to-role reference object within a given object. The reference stored by this reference object points to a role of Object.

POINT getRoleNext( POINT BaseToRoleLink )
The function returns the identifier of the next base-to-role reference object. The reference stored by this reference object points to another role.

A.4. Metabase, ODL, SBQL with Dynamic Roles

A metabase represents a static structural description of data. It concerns classes, types, interfaces and declarations of other stored structures.
In our implementation, we limit the metabase to representing classes and the relationships between them. This approach does not completely support the orthogonality of the SBA model. For simplification, it is assumed that:
- A class is a place for defining attributes of atomic types and method signatures (the current implementation does not cover further method specification or method execution);
- A class determines the structure of all its instances exactly;
- A class and all its direct instances have the same name;
- Attributes cannot be complex (with an arbitrary hierarchy of subordinated attributes), nor can they be treated as objects (links lead to objects only, not to attributes); however there is possible to define
- All classes and their instances are public; analogously, all invariants of classes and properties of objects are specified as public;
- Only one class schema can be defined at a time;
- The metabase is not modified at run time.

However, it is worth mentioning that the limitations described above are not essential for the concept of dynamic roles. Moreover, some of them could easily be overcome, e.g.:
- Class invariants can be subdivided into public, static and private;
- Metabase elements can be second-class citizens and can be modified at run time (class, attribute or method definitions can be added, changed or removed);
- More than one class schema can be stored at the same time (although only a chosen one will be considered to be the actual schema).

A.4.1. The Grammar of ODL

The ODL grammar is defined by means of a modified BNF notation. The following metasymbols are used:
{S} : S may be repeated; S must occur at least once;
[S] : S may occur at most once.
The grammar of ODL

(1) <specification> ::= <class> | <class> <specification>
(2) <class> ::= class <identifier> [extends <identifier>] [is <role_spec> roleof <identifier>] {<interface_body>}
(3) <role_spec> ::= <role_option> [multiple]
(4) <role_option> ::= optional | mandatory
(5) <interface_body> ::= <export> | <export> <interface_body>
(6) <export> ::= <const_dcl>; | <attr_dcl>; | <rel_dcl>; | <op_dcl>;
(7) <const_dcl> ::= const <base_type_spec> <identifier> = <unary_expr>
(8) <base_type_spec> ::= float | integer | string
(9) <unary_expr> ::= <unary_operator> <literal> | <literal>
(10) <unary_operator> ::= - | + | ~
(11) <literal> ::= <floating_pt_literal> | <integer_literal> | <character_literal> | <boolean_literal> | <string_literal>
(12) <boolean_literal> ::= FALSE | TRUE
(13) <attr_dcl> ::= attribute <simple_type_spec> <identifier>
(14) <simple_type_spec> ::= <base_type_spec> | <coll_type>
(15) <coll_type> ::= <coll_spec> < <simple_type_spec> >
(16) <coll_spec> ::= set | list
(17) <rel_dcl> ::= relationship <target_of_path> <identifier> inverse <identifier>::<identifier>
(18) <target_of_path> ::= <identifier> | <coll_spec> < <identifier> >
(19) <op_dcl> ::= <op_type_spec> <identifier> <parameter_dcls>
(20) <op_type_spec> ::= <simple_type_spec> | void
(21) <parameter_dcls> ::= ( [<param_dcl_list>] )
(22) <param_dcl_list> ::= <param_dcl> | <param_dcl>, <param_dcl_list>
(23) <param_dcl> ::= <param_attribute> <simple_type_spec> <identifier>
(24) <param_attribute> ::= in | out | inout

Apart from the keywords and symbols, the following terminals are used:
identifier: denotes any name;
integer, real, string: denote an integer, a real number and a string, respectively.
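Rules (3) and (4) are small enough to show how a role specification could be turned into multiplicity flags of the kind passed to RoleDef. The following is a minimal sketch, not the ODLParser implementation: the function name `parseRoleSpec` and the numeric values of the `SK_...` flags are assumptions made for illustration, and the input is taken to be the text between the keywords `is` and `roleof`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag values; the prototype's M_OPTIONAL, M_MANDATORY and
   M_MULTIPLE constants may be encoded differently. */
enum { SK_OPTIONAL = 1, SK_MANDATORY = 2, SK_MULTIPLE = 4 };

/* Parses <role_spec> ::= <role_option> [multiple], where
   <role_option> ::= optional | mandatory.
   Returns the combined flags, or -1 if the text does not match the rule. */
static int parseRoleSpec(const char *text) {
    char buf[64];
    if (strlen(text) >= sizeof buf) return -1;
    strcpy(buf, text);                 /* strtok modifies its argument */

    char *tok = strtok(buf, " \t");
    if (tok == NULL) return -1;

    int flags;
    if (strcmp(tok, "optional") == 0)       flags = SK_OPTIONAL;
    else if (strcmp(tok, "mandatory") == 0) flags = SK_MANDATORY;
    else return -1;

    tok = strtok(NULL, " \t");
    if (tok == NULL) return flags;     /* no "multiple" keyword */
    if (strcmp(tok, "multiple") != 0) return -1;
    flags |= SK_MULTIPLE;

    /* Anything after "multiple" is not allowed by rule (3). */
    return strtok(NULL, " \t") == NULL ? flags : -1;
}
```

For example, `parseRoleSpec("optional multiple")` yields `SK_OPTIONAL | SK_MULTIPLE`, while input in which `multiple` precedes the option keyword is rejected, because rule (3) places the option first.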
Example:

class Person {
    attribute string name;
    attribute integer birthYear;
    method integer age();
}
class Employee is mandatory roleof Person {
    attribute float salary;
    attribute set < string > job;
    method void changeSalary( in float NewSalary );
    method real netSalary();
    relationship Company works_in inverse employs;
}
class Company {
    attribute string Name;
    relationship set < Employee > employs inverse works_in;
    relationship Employee manager inverse is_manager;
}
class Student is optional multiple roleof Person {
    attribute integer semester;
    attribute string studentNo;
    attribute float scholarship;
    method integer avgScore();
    relationship University studies_at inverse studying;
    relationship set < Mark > has_marks inverse belongs_to;
}
class University extends Company {
    attribute string address;
    relationship set < Student > studying inverse studies_at;
}
class Mark {
    attribute string course;
    attribute integer mark;
    relationship Student belongs_to inverse has_marks;
}

A.4.2. ODL Parser

The parser of the ODL grammar has been implemented by the ODLParser function:

int ODLParser( char *ODLText )
The function takes a text as its parameter. If this text is recognized as a correct schema definition then the schema is stored in the database. The function returns 1 if parsing finished successfully and the ODL schema was saved, and 0 otherwise.
It uses the functions described in the previous section (ClassDef, setSuperClass, RoleDef, AttributeDef, MethodDef, RelationshipDef) in the way illustrated by the example below:

// first define all classes
PersonC = ClassDef( "Person", VPOS_ENTRY );
EmployeeC = ClassDef( "Employee", VPOS_ENTRY );
CompanyC = ClassDef( "Company", VPOS_ENTRY );
StudentC = ClassDef( "Student", VPOS_ENTRY );
UniversityC = ClassDef( "University", VPOS_ENTRY );
MarkC = ClassDef( "Mark", VPOS_ENTRY );

// define all generalizations (class-superclass relationships)
setSuperClass( UniversityC, CompanyC );

// now define the remaining schema
// Person class
AttributeDef( PersonC, "name", STRING, NO_COLL );
AttributeDef( PersonC, "birthYear", INTEGER, NO_COLL );
MethodDef( PersonC, "integer age( void )", "{};" );

// Employee role class (base class first, role class second)
RoleDef( PersonC, EmployeeC, M_MANDATORY );
AttributeDef( EmployeeC, "salary", FLOAT, NO_COLL );
AttributeDef( EmployeeC, "job", STRING, COLL_SET );
MethodDef( EmployeeC, "void changeSalary( in float NewSalary )", "{};" );
MethodDef( EmployeeC, "real netSalary( void )", "{};" );
RelationshipDef( EmployeeC, "works_in", CompanyC, NO_COLL, "employs" );

// Company class
AttributeDef( CompanyC, "name", STRING, NO_COLL );
RelationshipDef( CompanyC, "employs", EmployeeC, COLL_SET, "works_in" );
RelationshipDef( CompanyC, "manager", EmployeeC, NO_COLL, "is_manager" );

// Student role class
RoleDef( PersonC, StudentC, M_OPTIONAL | M_MULTIPLE );
AttributeDef( StudentC, "semester", INTEGER, NO_COLL );
AttributeDef( StudentC, "studentNo", STRING, NO_COLL );
AttributeDef( StudentC, "scholarship", FLOAT, NO_COLL );
MethodDef( StudentC, "integer avgScore( void )", "{};" );
RelationshipDef( StudentC, "studies_at", UniversityC, NO_COLL, "studying" );
RelationshipDef( StudentC, "has_marks", MarkC, COLL_SET, "belongs_to" );

// University class
AttributeDef( UniversityC, "address", STRING, NO_COLL );
RelationshipDef( UniversityC, "studying", StudentC, COLL_SET, "studies_at" );

// Mark class
AttributeDef( MarkC, "course", STRING, NO_COLL );
AttributeDef( MarkC, "mark", INTEGER, NO_COLL );
RelationshipDef( MarkC, "belongs_to", StudentC, NO_COLL, "has_marks" );

Other functions

void DispClassSchema( POINT Owner )
The function prints to the standard output device (stdout) the class schema stored within the Owner structure (Owner is VPOS_ENTRY by default).

A.4.3. SBQL Parser

Information may be extracted from the database (without changing it) by queries. The grammar of the query language implemented in our prototype is presented in the next section. The simplified SBQL with dynamic roles covers only expressions for retrieving information and some manipulation operators for creating and deleting objects and roles. The rest of the SBQL grammar, e.g., imperative statements, has not been implemented.

The SBQL parser builds a syntax tree for a given query. The parser used in the prototype has been taken from [Płodzień00] and extended to deal with dynamic roles. For instance, the syntax tree that models the query
((Employee as e) works_in.Company) where e.name = "Brown"
is shown in Fig. A.1.

[Fig. A.1. The syntax tree for the query. The root of the tree is the where operator; the remaining nodes are as, Employee, e, ., works_in, Company, =, name and "Brown".]

Functions of Parsers Module

QUERY AnalyzeQuerySyntax( FILE *file )
The function reads a query text from a file, analyzes the query, and builds the corresponding syntax tree. If parsing of the query is successful then the root of the syntax tree is returned, otherwise an error message is printed to the standard output device (stdout).

void PrintQuery( QUERY q, BOOLEAN withFathers )
The function prints the textual form of a query for a given root of a query syntax tree (stored in operational memory).

POINT SaveQueryNode( POINT Where, QUERY query )
The function stores a query syntax tree within the store as a complex object with the reserved name QUERY_NODE. The root of the query syntax tree is a member at the top level of the store (and then Where == VPOS_ENTRY).
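To make the notion of a query syntax tree more concrete, the sketch below builds a tree by hand for a simplified query, (Employee as e) where e.name = "Brown", and renders it back to text in the spirit of PrintQuery. It is an illustration under stated assumptions only: the `QNode` structure and the `emit` and `demoQueryText` functions are hypothetical and do not reproduce the QUERY type or the node layout used by the prototype's parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical syntax tree node: leaves carry a name or a literal,
   inner nodes carry a binary operator with two children. */
typedef struct QNode {
    const char *label;
    const struct QNode *left, *right;
} QNode;

/* Appends the fully parenthesized infix form of the tree to out. */
static void emit(const QNode *n, char *out, size_t cap) {
    if (n == NULL) return;
    size_t len = strlen(out);
    if (n->left == NULL && n->right == NULL) {
        snprintf(out + len, cap - len, "%s", n->label);
        return;
    }
    snprintf(out + len, cap - len, "(");
    emit(n->left, out, cap);
    len = strlen(out);
    snprintf(out + len, cap - len, " %s ", n->label);
    emit(n->right, out, cap);
    len = strlen(out);
    snprintf(out + len, cap - len, ")");
}

/* Builds the tree for: (Employee as e) where e.name = "Brown" */
static const char *demoQueryText(void) {
    static const QNode employee = { "Employee", NULL, NULL };
    static const QNode e1       = { "e", NULL, NULL };
    static const QNode e2       = { "e", NULL, NULL };
    static const QNode name     = { "name", NULL, NULL };
    static const QNode brown    = { "\"Brown\"", NULL, NULL };
    static const QNode asNode   = { "as", &employee, &e1 };
    static const QNode dot      = { ".", &e2, &name };
    static const QNode eq       = { "=", &dot, &brown };
    static const QNode root     = { "where", &asNode, &eq };
    static char buf[128];
    buf[0] = '\0';
    emit(&root, buf, sizeof buf);
    return buf;
}
```

The root of the tree is the where operator, as in Fig. A.1; the left subtree defines the auxiliary name e and the right subtree is the selection predicate.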
void PrintQueryS( POINT P, BOOLEAN withFathers )
The function prints the textual form of a query syntax tree which is held within the store.

A.4.4. The Grammar of SBQL

The SBQL with dynamic roles is a subset of the language[34] implemented in the LOQIS system [SMA90, Subieta90, Subieta91], extended with new constructs corresponding to the concept of dynamic roles:
- the isrole operator, which checks whether a given object is a role of another object: line (8);
- the dynamic casting operator (role roleName), which for a given object returns the identifiers of all its roles with a given name (if any) that exist in the role hierarchy determined by the object: line (16);
- the roles operator, which for a given object returns the identifiers of its roles with a given name (or of all its roles): line (16).

[34] The denotational semantics is described in [Subieta85, Subieta87].

The grammar of SBQL with dynamic roles is specified in a modified BNF notation. The following metasymbols are used:
{S} : S may be repeated; S must occur at least once;
[S] : S may occur at most once.
The grammar of SBQL

(1) <Expr> ::= <ExprQuantifiers> [{<Oper> <ExprQuantifiers> | as <auxName> }]
(2) <Oper> ::= , | where
(3) <ExprQuantifiers> ::= for each <Expr> holds <ExprQuantifiers> | exist <Expr> such that <ExprQuantifiers> | <Pred>
(4) <Pred> ::= <Pred1> [{ or <Pred1> }]
(5) <Pred1> ::= <Pred2> [{ and <Pred2> }]
(6) <Pred2> ::= [not] <Pred3>
(7) <Pred3> ::= <SExpr> [<RelOper> <SExpr>]
(8) <RelOper> ::= = | <> | < | <= | > | >= | in | contains | isrole
(9) <SExpr> ::= <SExpr1> [{<AdditOper> <SExpr1>}]
(10) <AdditOper> ::= - | +
(11) <SExpr1> ::= <ExprDepJoin> [{<MultiOper> <ExprDepJoin>}]
(12) <MultiOper> ::= * | /
(13) <ExprDepJoin> ::= <ExprNavig> [{ with <ExprNavig> }]
(14) <ExprNavig> ::= [<Sign>] <SExpr2> [{ . <SExpr2> }]
(15) <Sign> ::= - | +
(16) <SExpr2> ::= FALSE | TRUE | integer | float | string | <FuncCall> | <name> | (role <roleName>) <SExpr> | roles [<roleName>] of <SExpr> | nameof <SExpr> | (<Expr>)
(17) <FuncCall> ::= <funcName> [( <ActParamList> )]
(18) <ActParamList> ::= <Expr> [{;<Expr>}]

Apart from the keywords and symbols, the following terminals are used:
name – denotes any name;
roleName – denotes a role name;
auxName – denotes an auxiliary name;
integer, float, string – denote an integer, a real number and a string, respectively;
funcName – denotes the name of a function.

A.5. Query Evaluation Module

Queries are evaluated by the query evaluation module. The module consists of functions for organizing and manipulating the environment stack and the query result stack during the evaluation of queries. In this section, we describe the structures of the stacks and the main functions of the module.

A.5.1. The Organization of Environment Stack

The environment stack (ES) is held in the store as a complex object with the reserved name ENV_STACK. Initially, the environment stack contains three sections ordered as follows:
- The global section that stores binders to global procedures, variables, etc. (this section is empty in the prototype).
It is represented by a complex object with the reserved name GLOBAL_SECTION;
- The root section that stores binders to root database objects. It is represented by a complex object with the reserved name ROOT_SECTION;
- The current session section that stores binders to global properties of the current session (this section is empty in the prototype). It is represented by a complex object with the reserved name SESSION_SECTION.

Higher sections, which occur during query evaluation, are represented by complex objects with the reserved name ES_SECTION. Physically, each new (higher) ES section is created as a subobject of the current top section of ES. Thus, the top section of ES is the one that has no subobject named ES_SECTION, and the pop and push operations can be easily implemented.

The new concepts, i.e., class, inheritance, substitutability, and role, require that the classical organization of ES be extended with the new concept of a subsection. A subsection is a part of an ES section that contains binder(s) derived from a class or a role. It is represented by a complex object with the reserved name ES_SUBSECTION. Each section can contain one or more subsections, and each subsection can contain one or more nested subsections. Details of how sections and subsections are organized are given in the description of the Nested function. Sections and subsections consist of binders, which are represented by reference objects.

void ESInit( void )
The function creates the environment stack as a complex object with the reserved name ENV_STACK at the top level of the store. Then it creates the global, root and session sections. Finally, the function RootInit is called.

int RootInit( POINT Root_Section )
The function finds root objects and inserts binders derived from them into the root section.

A.5.2.
The Organization of Query Result Stack

The query result stack (QRES) stores all temporary and final results of a query or subquery. It is held in the store as a complex object with the reserved name QRES_STACK. The physical organization of QRES sections is similar to that of ES (except for subsections, which are not introduced for QRES in the prototype). When a new (higher) QRES section is created, a new object with the reserved name QRES_SECTION is created and inserted into the object that represents the current top section of QRES.

A result is a table, understood as a bag of rows. A row can contain the following elements:
- Atomic values, i.e., integer and float numbers or strings, which are represented by atomic objects with the reserved name QRES_VALUE;
- Identifiers, which are represented by reference objects with the reserved name QRES_VALUE;
- Binders n(x), where n is a name and x is a table; binders are represented by reference or complex objects with the name n. When x is a 1 × 1 table, the binder is stored as a reference object; otherwise the binder is stored as a complex object whose content consists of the table x.

void QRESInit( void )
The function creates the query result stack as a complex object with the reserved name QRES_STACK at the top level of the store.

A.5.3. Query Evaluation

The execution of queries requires implementing the eval procedure. The differences between eval in the prototype and eval in the classical SBA [SMA90, Subieta90, Subieta91] are related to:
- The modified nested function for a model with classes and roles;
- New role-specific SBQL operators, e.g., isrole and the dynamic casting operator.

void EvalQueryS(POINT P)
The function implements the eval procedure for a given query P placed in the store. The result of the evaluation is pushed onto the top of QRES in a new QRES section.
void PushESNested( POINT QRESValue )
The function opens a new scope of ES (creates a new section at the top of ES) and places in it the binders that result from the nested(QRESValue) procedure. QRESValue is an element of a QRES section.

void Nested( POINT SectES, POINT Ref )
The function inserts into a given ES section (SectES) binders to all properties available within the object identified by Ref, that is, binders to attributes, methods, and references placed within Ref, and binders to properties derived from the class hierarchy of which Ref is a member. When Ref is a role of another object OB1, binders to properties that are placed within OB1 or derived from the class hierarchy of which OB1 is a member are inserted into a new subsection of SectES. This is done recursively: when OB1 is a role of another object OB2, binders to properties that are placed within OB2 or derived from the class hierarchy of which OB2 is a member are inserted into a new subsection of SectES, and so on.

int BindName( POINT Name )
The function binds a given name n (which is placed in a node Name of the syntax tree). The binding mechanism looks for the ES section closest to the top of the stack that contains a binder n(x) (or binders n(x1), …, n(xm)). The results are inserted into a new QRES section as objects (atomic, reference or complex, depending on x) with the reserved name QRES_VALUE containing x. The function returns 0 if successful and 1 otherwise.

POINT SearchES( unsigned short int NameCode )
The function searches ES for a given name, starting from the top of ES. When the name is not found in a section, the search continues in its subsections (if they exist), and then in the next lower section. The function returns a reference to the section or subsection where the name is found, or NULLPNTR if the name was not found even in the bottom ES section.

POINT getTopES( void )
The function returns the top section of ES.

POINT NewSectionES( void )
The function creates a new ES section at the top of ES.
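The lookup order described for SearchES above can be sketched in plain C as follows. This is only an illustration under simplifying assumptions: the Scope structure, its lower pointer, and the functions scope_has_name, search_scope and search_es are hypothetical names introduced here, and in-memory pointers stand in for the store-based representation (complex objects named ES_SECTION and ES_SUBSECTION) used by the prototype.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NAMES 8
#define MAX_SUBS 4

/* A section or a subsection of ES (hypothetical in-memory model):
   the name codes of its binders plus its nested subsections. */
typedef struct Scope {
    unsigned short names[MAX_NAMES];
    int name_count;
    struct Scope *subs[MAX_SUBS];
    int sub_count;
    struct Scope *lower; /* next lower ES section; NULL for subsections and the bottom section */
} Scope;

/* Returns 1 if the scope itself holds a binder with the given name code. */
int scope_has_name(const Scope *s, unsigned short code) {
    assert(s != NULL);
    for (int k = 0; k < s->name_count; k++)
        if (s->names[k] == code) return 1;
    return 0;
}

/* Searches one section: first its own binders, then its subsections (depth-first). */
Scope *search_scope(Scope *s, unsigned short code) {
    if (scope_has_name(s, code)) return s;
    for (int k = 0; k < s->sub_count; k++) {
        Scope *hit = search_scope(s->subs[k], code);
        if (hit != NULL) return hit;
    }
    return NULL;
}

/* Searches ES from the top section down to the bottom one.
   Returns the (sub)section holding the name, or NULL if it is not found. */
Scope *search_es(Scope *top, unsigned short code) {
    for (Scope *sec = top; sec != NULL; sec = sec->lower) {
        Scope *hit = search_scope(sec, code);
        if (hit != NULL) return hit;
    }
    return NULL;
}
```

In this sketch a name found in a subsection of the top section hides a binder of the same name in any lower section, which matches the order "section, then its subsections, then the lower section" given in the description of SearchES.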
POINT NewSubSectionES( void )
The function creates a new ES subsection in the top ES section. A new subsection is represented by a complex object with the reserved name ES_SUBSECTION, placed as the last member of the structure owned by the object representing the current top ES section.

POINT getTopQRES( void )
The function returns the top section of QRES.

POINT NewSectionQRES( void )
The function creates a new QRES section at the top of QRES. A new QRES section is represented by a complex object with the reserved name QRES_SECTION, placed as a member of the object representing the current top QRES section.

POINT PushCQRES( POINT Constant )
The function pushes a constant, i.e., a boolean value, an integer or a float number, or a string, onto the top of QRES (it creates a new QRES section and inserts the constant into it). A constant is retrieved from a node Name within the syntax tree. The function returns a reference to a new atomic object with the reserved name QRES_VALUE.

POINT BinderRefCreate( unsigned short int NameCode, POINT RefValue, POINT Section )
The function creates in a given section a new reference binder, which has the name NameCode and points to RefValue.

POINT BinderIntCreate( unsigned short int NameCode, IntegerT IntValue, POINT Section )
The function creates in a given section a new integer binder, which has the name NameCode and stores the value IntValue.

POINT BinderFloatCreate(unsigned short int NameCode, FloatT FloatValue,POINT Section)
The function creates in a given section a new float binder, which has the name NameCode and stores the value FloatValue.

POINT BinderStringCreate( unsigned short int NameCode, StringT StringValue, POINT Section)
The function creates in a given section a new string binder, which has the name NameCode and stores the value StringValue.
void searchCodeNameRoleHierarchy( POINT Object, unsigned short int NameCode, POINT Result, unsigned short int NewName )
The function looks up, within the role hierarchy, objects with the name NameCode and inserts the references found into Result as pointer objects with the name NewName.

void PrintTopQRES( void )
void PrintTopES( void )
The functions print the content of the top section of QRES (ES) to the standard output device (stdout).

A.6. Types and Reserved Names

Types defined in the VPOS:

#define RING 0
#define AGGREGATE 1
#define VPOSLINK 6
#define VPOSINT 11
#define VPOSDATE 12 /* Integer denoting date in days */
#define VPOSTIME 13 /* Integer denoting time in seconds */
#define VPOSREAL 21
#define VPOSSTRING 22
#define FREEDATA 240
#define SPIDER 254

Types additionally defined in the prototype:

#define VPOSCLASS 3 /* class */
#define VPOSOBJECT 1 /* object */
#define OBJTOCLLINK VPOSLINK /* object-to-class link */
#define CLTOSUPERCLLINK VPOSLINK /* class-to-superclass link */
#define ROLEDEFTYPE AGGREGATE /* role definition object */
#define ROLEDEFTOCLLINK VPOSLINK /* role class-to-base class link */
#define ATTRDEFTYPE SPIDER /* attribute definition object */
#define METHDEFTYPE SPIDER /* method definition object */
#define METHSIGTYPE VPOSSTRING /* method signature */
#define METHBODYTYPE VPOSSTRING /* method body */
#define RELDEFTYPE SPIDER /* relationship definition object */
#define RELTARGETCLTYPE VPOSLINK /* link to target class within relationship */
#define ROLETOBASELINK VPOSLINK /* role-to-base link */
#define BASETOROLELINK VPOSLINK /* base-to-role link */
#define ATTRINSTTYPE SPIDER /* attribute instance object */
#define RELINSTTYPE SPIDER /* relationship instance object */
#define LINKTYPE VPOSLINK /* link */

The list of reserved names is as follows:

OBJTOCLNAME         object-to-class link object
CLTOSUPERCLNAME     class-to-superclass link object
ROLEDEF             role definition object
ROLEDEFTOCLNAME     role class to base class link
ROLEMULTIPLICITY    class role multiplicity
ATTRDEF             attribute definition object
ATTRNAME            attribute name
ATTRATOMTYPE        attribute atomic value type
ATTRCOLLTYPE        attribute collection type
METHDEF             method definition object
METHSIG             method signature
METHBODY            method body
RELDEF              relationship definition object
RELNAME             relationship name
RELTARGETCLNAME     relationship target class name
RELCOLLTYPE         relationship target's end multiplicity
RELINVNAME          relationship inverse name
ENV_STACK           environment stack
GLOBAL_SECTION      global section
ROOT_SECTION        root section
SESSION_SECTION     session section
QRES_STACK          query result stack
QRES_SECTION        QRES section
QRES_VALUE          QRES value
ES_SECTION          ES section
ES_SUBSECTION       ES subsection
TEMP_RESULT         temporary result
ATTRVAL             instance of attribute
ATOMICVAL           atomic value
ATTRDEFREF          attribute instance-attribute definition link
ROLETOBASENAME      role-base link
BASETOROLENAME      base-role link
RELVAL              instance of relationship
LINKVAL             link value
RELDEFREF           relationship instance-relationship definition link
QUERY_NODE          query node

Appendix B. The Stack-Based Approach

In this chapter, we present in detail those SBA concepts that are important for our optimization methods. We also present the semantics of SBQL and those of its features that are most relevant to our approach. Other concepts of object models (e.g., encapsulation, null values, variants, strong typing, etc.) occurring in SBA and SBQL, and other concepts of query/programming languages (e.g., imperative statements), are discussed in [Subieta91, SKL95].

B.1. Introduction

SBA assumes that query languages are a special case of programming languages; it is an attempt to build a uniform semantic foundation for integrated query and programming languages. The approach is abstract and universal, which makes it relevant to a very general object model.
SBA makes it possible to precisely determine the semantics of query languages, their relationships with object-oriented concepts, with imperative programming constructs, and with programming abstractions, including procedures, functional procedures, views, modules, etc. Its main features are the following:
- The naming-scoping-binding principle is assumed, which means that each name occurring in a query is bound to the appropriate run-time entity (an object, attribute, method parameter, etc.) according to the scope of this name.
- One of its basic mechanisms is an environment stack (ES). The stack is responsible for scope control and for binding names. In contrast to classical stacks it does not store objects, but some structures built upon object identifiers, names, and values.
- The principle of orthogonal persistence is assumed, which means that there are no differences in defining queries accessing persistent and volatile data.
- Results of functional procedures and methods belong to the same semantic category as results of queries. As a consequence, functions and methods can be invoked in queries. Those results can be additionally augmented with "virtual names", as in SQL.
- In contrast to relational languages and OQL, the relativity principle is assumed, that is, the syntax, semantics, and pragmatics are identical at an arbitrary level of the data hierarchy (in particular, an attribute is an object).
- Types are a mechanism to determine whether objects are built in a proper way (i.e., in accordance with the database schema).
- For objects the principle of internal identification is assumed (i.e., each run-time entity has a unique internal identifier).

B.2.
Objects, Classes and Abstract Object-Oriented Store Model

In SBA each object has the following features:
- An internal identifier (OID); identifiers cannot be directly written in queries and are not printable;
- An external name (invented by the programmer or database designer) that is used to access the object from a program;
- Contents, which can be a value, a link, or a set of objects.

Hence, the following three sets are used to define objects:
- I – the set of unique internal identifiers,
- N – the set of external data names,
- V – the set of atomic values, for instance integers, strings, pointers, blobs, etc. Procedures, functions, methods, views, and other procedural entities (to be precise, their codes) are atomic values too.

Formally, let i, i1, i2 ∈ I, n ∈ N, v ∈ V. Objects are modeled as the following triples:
- Atomic objects as <i, n, v>. (Because the codes of procedural entities are atomic values, subprograms are modeled as atomic objects.)
- Link objects as <i1, n, i2>.
- Complex objects as <i, n, S>, where S denotes a set of objects.

This definition is recursive and makes it possible to create complex objects with an arbitrary number of hierarchy levels. Relationships (e.g., associations) are modeled through link objects. In order to model collections, SBA does not assume the uniqueness of external names at any level of the data hierarchy. The unification of records, tuples, arrays and all bulk structures is assumed; SBA abstracts from their differences. In the model all names (of objects, attributes, relationships, etc.) are given first-class status.

In SBA classes are used as prototypes, which means that they are objects (i.e., they have the features of objects introduced above), but their task is different. A class object stores invariants (e.g., methods) of the objects that are instances of that class; a special relationship, instantiation, between a class and its instances is introduced.
Moreover, an inheritance relationship between class objects is assumed; this relationship makes it possible to apply the substitutability principle.

In SBA objects populate an object store, which is formed of:
- The structure of objects, subobjects, etc., as defined in this section.
- OIDs of root objects (they are accessible from the outside, that is, they are starting points for querying); it is assumed that objects modeling classes cannot be accessed directly by the programmer.
- Constraints (e.g., the uniqueness of OIDs, referential integrities, etc.).

The term "object" is associated exclusively with elements of the object store. In the model there are no other objects. Queries in this approach never return objects, but some structures built upon object identifiers, values and names. In consequence, the closure property is understood not as a closure over objects, but as a closure over such structures (for details see [SP00a]).

B.3. Example Database

In this section we present the schema (i.e., the class diagram in slightly modified UML) of an example database (which will be used in examples in this thesis) and discuss how this database is modeled in SBA. The schema is shown in Fig. B.1.

The schema defines five classes (i.e., five collections of objects): Lecture, Student, Professor, Person, and Faculty. The classes Lecture, Student, Professor and Faculty model lectures attended by students and given by professors working in faculties, respectively. Person is the superclass of the classes Student and Professor. Professor objects can contain multiple complex prev_job subobjects (previous jobs). The name of a class (attribute, etc.) is followed by its cardinality, unless the cardinality is 1. Student has one class property: the avgGrade method (it converts a student's grades to points according to some formula and returns the average of those points). All object attributes and class properties are public.
[Class diagram; the recoverable content is listed below.]
Person [0..*]: name: string; age: integer
Student [0..*] (subclass of Person): year: integer; grades [0..*]: string; avgGrade: real
Professor [0..*] (subclass of Person): salary: real; prev_job [0..*] (place: string; years: string)
Lecture [0..*]: subject: string; credits: integer
Faculty [0..*]: fname: string; loc [1..*]: string
Associations: Student attends Lecture, Lecture attended_by Student; Professor gives Lecture, Lecture given_by Professor; Professor works_in Faculty, Faculty employs Professor. (Multiplicity labels appearing in the figure: 1..30, 1, 0..*, 1..10, 1..*, 1.)

Fig. B.1. The class diagram of the example database

Fig. B.2 presents the object store of our tiny database, built in accordance with SBA upon the schema presented in Fig. B.1 (Fig. B.3 presents that store in graphical form, as a graph). The relationships between the class diagram and the object store are the following:
- For each class in the class diagram the object store contains one object modeling that class (i.e., storing its invariants; its name is the name of the class from the diagram augmented with the "Class" prefix) and (possibly) several objects modeling its instances, which store its attributes modeled as subobjects (the potential number of those instance objects is determined by the cardinality of the class; their names are the name of the class from the diagram). Moreover, the store contains the instances of the instantiation relationship modeled as pairs <io, ic>, where ic is the identifier of the object modeling a class and io is the identifier of its instance (in Fig. B.2 such relationships are designated by the INST set, and in Fig. B.3 by the thick black arrows).

Objects:
<i1, Professor, {<i2, name, "Smith">, <i3, age, 52>, <i4, salary, 3500>, <i5, works_in, i16>, <i6, gives, i21>}>
<i7, Professor, {<i8, name, "White">, <i9, age, 44>, <i10, salary, 2900>, <i11, works_in, i16>, <i12, gives, i26>, <i13, prev_job, {<i14, place, "UCLA">, <i15, years, "1985-1993">}>}>
<i16, Faculty, {<i17, fname, "engineering">, <i18, loc, "Elms St. 21">, <i19, employs, i1>, <i20, employs, i7>}>
<i21, Lecture, {<i22, subject, "algebra">, <i23, credits, 5>, <i24, given_by, i1>, <i25, attended_by, i40>}>
<i26, Lecture, {<i27, subject, "physics">, <i28, credits, 3>, <i29, given_by, i7>, <i30, attended_by, i33>, <i31, attended_by, i40>, <i32, attended_by, i47>}>
<i33, Student, {<i34, name, "Russell">, <i35, age, 23>, <i36, year, 2>, <i37, grades, "A">, <i38, grades, "B">, <i39, attends, i26>}>
<i40, Student, {<i41, name, "Jones">, <i42, age, 45>, <i43, year, 1>, <i44, grades, "A">, <i45, attends, i21>, <i46, attends, i26>}>
<i47, Student, {<i48, name, "Black">, <i49, age, 20>, <i50, year, 1>, <i51, attends, i26>}>
<i52, ClassPerson, {}>
<i53, ClassStudent, {<i54, avgGrade, (…the code of the method...)>}>
<i55, ClassProfessor, {}>
<i56, ClassLecture, {}>
<i57, ClassFaculty, {}>
INHER = {<i53, i52>, <i55, i52>}
INST = {<i33, i53>, <i47, i53>, <i40, i53>, <i1, i55>, <i7, i55>, <i16, i57>, <i21, i56>, <i26, i56>}
ROOT = {i1, i7, i16, i21, i26, i33, i40, i47}

Fig. B.2. The object store

- For each inheritance relationship in the class diagram between classes Class1 and Class2 (Class1 is a subclass of Class2) the object store contains one pair <ic, isc>, where ic is the identifier of the object modeling the Class1 class, and isc is the identifier of the object modeling the Class2 class (in Fig. B.2 such pairs are designated by the INHER set, and in Fig. B.3 by the thick gray arrows).
- Associations in the class diagram are unidirectional. For each association the object store contains a set of link subobjects modeling it (the number of such subobjects is determined by the cardinality of the relationship; in Fig. B.2 those subobjects are contained in their owners, and in Fig. B.3 they are designated by the dotted black arrows).
- The names of classes whose instances are root objects are in bold in the class diagram.
The object store contains a ROOT set, which comprises the identifiers of root objects (we assume that objects Person, Student, Professor, Lecture and Faculty are root ones). In Fig. B.2 and Fig. B.3 the identifiers of such objects are printed in bold.
- Constraints are embedded in the system (for instance as active rules).

Fig. B.3. The object store as a graph

B.4. Stacks

An environment stack is one of the most basic auxiliary data structures in programming languages. It implements the abstraction principle, which allows the programmer to treat the piece of code currently being written as independent of the context of its possible uses. The stack makes it possible to associate parameters and local variables with a particular invocation of a procedure (function, method, etc.). Thus procedures can safely be called from other procedures, including recursive calls. The stack is also used to implement strong typing, encapsulation, inheritance, and overriding.

In SBA the stack has a new function: processing queries acting on the object store. It makes it possible to control the scopes of all names occurring in a query in a simple and uniform way. It also makes it easier to understand the precise semantics of queries.

The construction of the SBA ES differs in some respects from that of stacks in programming languages. While programming languages usually assume that objects live on the stack (i.e., they are allocated dynamically in the proper stack sections), in SBA the object store and the stack are separate data structures; the stack contains only references to objects. The main reason for this assumption is that the same object can be referred to from different stack sections.

The stack consists of sections that are sets of binders. A binder is an SBA concept used to cope with various naming issues that occur in object models and their query languages.
Formally, a binder is a pair (n, x), where n is an external name (n ∈ N), and x is a reference to an object (x ∈ I); such a pair will be written as n(x). We will refer to n as the name of that binder, and to x as its value. The concept of a binder can be generalized; in particular, x can be an atomic value, or a complex structure (a table; see Section B.6.2). Moreover, if a binder models a procedural entity, then its value is the address of that subprogram.

B.5. Binding

The mechanism that determines the meaning of each name is called binding. Binding follows the "search from the top" rule: to bind a name n, the binding mechanism looks for the ES section closest to the top of the stack that contains a binder n(x) storing some value x. The result of the binding is x. To cover bulk data structures of the store model, SBA assumes that binding can be multi-valued, that is, if the relevant section contains several binders whose names are n: n(x1), n(x2), n(x3),..., then all of them contribute to the result of the binding. In such a case the binding of n returns the collection {x1, x2, x3, ...}.

Some modification of the binding rules is necessary to take into account inheritance and the substitutability principle. The principle means that while binding the name of some class the names of its subclasses are also bound. In particular, if c1 is the name of the objects from the extension of some class c1, c1 has a subclass c2, and ES contains a binder c2(x) storing some value x, then the binding of c1 is successful and returns x. For example, if we want to bind the name Person, and ES contains the binder Professor(i7), then the result of the binding is i7. This rule is generalized to multi-valued bindings too. (It is assumed that typing constraints disallow the use of properties specific to Professor when the binding concerns Person.)
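The binding rules above ("search from the top", multi-valued binding, and substitutability) can be sketched in C as follows. The sketch makes several assumptions that are not part of SBA itself: binder values are plain integers standing for identifiers (1 for i1, 7 for i7, etc.), the sizes are fixed, and the names EnvStack, Section, Binder, INHER, name_matches, push_section, add_binder and bind_name are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_BINDERS 16
#define MAX_SECTIONS 8

/* A binder n(x): an external name paired with an object identifier. */
typedef struct { const char *name; int oid; } Binder;

/* An ES section is a set of binders. */
typedef struct { Binder binders[MAX_BINDERS]; int count; } Section;

/* The environment stack; sections[top - 1] is the top section. */
typedef struct { Section sections[MAX_SECTIONS]; int top; } EnvStack;

/* Assumed inheritance table for the example schema: {subclass, superclass}. */
static const char *INHER[][2] = { {"Student", "Person"}, {"Professor", "Person"} };

/* Substitutability: a binder matches the wanted name if the names are equal
   or the binder name denotes a (direct or indirect) subclass of it. */
int name_matches(const char *binder_name, const char *wanted) {
    if (strcmp(binder_name, wanted) == 0) return 1;
    for (size_t k = 0; k < sizeof INHER / sizeof INHER[0]; k++)
        if (strcmp(INHER[k][0], binder_name) == 0 && name_matches(INHER[k][1], wanted))
            return 1;
    return 0;
}

/* Opens a new, empty section at the top of the stack. */
void push_section(EnvStack *es) {
    assert(es->top < MAX_SECTIONS);
    es->sections[es->top].count = 0;
    es->top++;
}

/* Adds the binder name(oid) to the current top section. */
void add_binder(EnvStack *es, const char *name, int oid) {
    assert(es->top > 0);
    Section *sec = &es->sections[es->top - 1];
    assert(sec->count < MAX_BINDERS);
    sec->binders[sec->count].name = name;
    sec->binders[sec->count].oid = oid;
    sec->count++;
}

/* Binds a name: scans sections from the top and returns, through out, the
   values of all matching binders in the first section that has any.
   The return value is the number of values written (0 if binding fails). */
int bind_name(const EnvStack *es, const char *name, int *out, int max_out) {
    for (int s = es->top - 1; s >= 0; s--) {
        const Section *sec = &es->sections[s];
        int found = 0;
        for (int b = 0; b < sec->count; b++)
            if (found < max_out && name_matches(sec->binders[b].name, name))
                out[found++] = sec->binders[b].oid;
        if (found > 0) return found;
    }
    return 0;
}
```

With a base section holding Professor(i1), Professor(i7) and Student(i33), binding Professor in this sketch yields the two identifiers 1 and 7, while binding Person yields all three values, since both classes are listed as subclasses of Person in the assumed INHER table.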
In SBA at the beginning of a user session ES consists of a single section containing binders for all root database objects. During query evaluation the stack grows and shrinks according to query nesting. Assuming no side effects in queries (that is, no calls of updating methods), the final ES state is exactly the same as the initial state. In Fig. B.4 we present the beginning state and the final state of the stack for the example database from Fig. B.2 and an evaluation of some interactive query (the query language is presented in Section B.6):

Professor where p

The initial state of ES:
  Professor(i1) Professor(i7) Faculty(i16) Lecture(i21) Lecture(i26) Student(i33) Student(i40) Student(i47)

The state of ES during evaluation of p in the query "Professor where p", in the second iteration of the loop:
  name(i8) age(i9) salary(i10) works_in(i11) gives(i12) prev_job(i13)
  Professor(i1) Professor(i7) Faculty(i16) Lecture(i21) Lecture(i26) Student(i33) Student(i40) Student(i47) (base section)

The final state of ES:
  Professor(i1) Professor(i7) Faculty(i16) Lecture(i21) Lecture(i26) Student(i33) Student(i40) Student(i47)

Fig. B.4. The beginning and final states of ES

In the general case, object-oriented data structures and complex queries/programs acting on them may create much more complex states of ES. In particular, ES can store class properties and subprograms' local environments; see Fig. B.5.

The query: Professor where ... p ... occurs within method m1, which is called from m2.
The state of the environment stack, from the top; this is also the order of search during binding of the name p:
  Binders to public attributes of current Professor object
  Binders to private attributes of current Professor object
  Binders to public properties of Professor class
  Binders to private properties of Professor class
  Binders to public properties of Person class
  Binders to local properties of m1 (parameters, etc.)
  ...... (stack sections induced within m2)
  Binders to local properties of m2 (parameters, etc.)
  ...... (stack sections induced by the caller of m2)
  Binders to global properties of the current session (base section)
  Binders to root database objects, views, ... (base section)
  Binders to global procedures, variables, ... (base section)

Fig. B.5. A more complex state of ES

The presented state concerns the binding of the name p occurring in the where clause of the query, which is invoked in the body of some method m1 called by some method m2 (suppose that the m1 method is defined in the Professor class). The name p can be the name of a Professor object attribute, the name of a method from the Professor class, the name of a method from the Person class, the name of a root database object, the name of a view, etc. The system tries to bind p to the proper entity of the environment following the order presented by the arrows. Lexical scoping is assumed; for instance, the environments of the m2 method and of its potential caller (and of its potential caller, and so on; those environments are designated by the black sections) are invisible within the body of the m1 method.

In this case at the beginning of a user session ES has more than one section: it has sections containing binders to the global properties of the current session, root database objects, views, global procedures, variables, etc. These sections are called the base sections (or the base environment/scope).

B.6. Query Language

In this subchapter we present SBQL, in particular its operational semantics. The denotational semantics is described in [Subieta85, Subieta87]. The language is implemented in the Loqis system ([SMA90, Subieta90, Subieta91]).

B.6.1. SBQL Syntax

SBQL is based on an abstract syntax and the principle of compositionality: syntactic sugar is avoided and query operators are syntactically separated as far as possible. The syntax of SBQL is as follows:
- A single name or a single literal is an (atomic) query. For instance, Student, name, year, x, y, "Smith", 2, 2500, etc., are queries.
- If q is a query, and σ is a unary operator (e.g., sum, count, distinct, sin, sqrt), then σ(q) is a query. (The operator defining a new name is a unary operator too (parameterized by the name); the traditional syntax is used: q as n (where n ∈ N).)
- If q1 and q2 are queries, and θ is a binary operator (e.g., where, dot, +, =, <, and, union), then q1 θ q2 is a query. (Quantifiers are considered binary operators too, but the traditional prefix forms are applied to them: ∀q1(q2) and ∃q1(q2).)

With the exception of typing constraints (which are implicit in this dissertation) the orthogonality of operators is assumed. For instance, Professor, name, age, and "White" are atomic queries; they can be used to build complex queries, e.g., "retrieve the ages of professors whose names are White" (when formulating queries in natural language and talking about objects, we will usually omit the phrase "… identifier(s) of …" if it does not cause any misunderstanding):

(Professor where (name = "White")).age

The query above has the following equivalent in SQL:

select age from Professor where name = "White"

and in OQL:

select p.age from Professors as p where p.name = "White"

In contrast to SQL and OQL, SBQL queries have an interesting and very useful property: they can be easily decomposed into subqueries, down to atomic ones, connected by unary or binary operators. In particular, name and age are queries in their own right. This is due to the fact that all queries in SBQL are evaluated relative to the current state of ES. Assuming the stack presented in Fig. B.4 (or in Fig. B.5), these queries can be evaluated according to the normal rules.

B.6.2. Results of SBQL Queries

Each SBQL (sub)query returns a table, understood as a bag of rows. Such a table can be empty, can contain a single row, or can contain an arbitrary number of rows. A row is a sequence of elements; all rows in a table have the same type.
A row can contain the following elements:
Atomic values v ∈ V,
Identifiers i ∈ I,
Binders n(x), where n ∈ N and x is a table.

In Fig. B.6 we present tables that can be returned as results of some queries. They represent the results of the queries 2+2, Student, Professor as p, etc. (cf. Fig. B.2). The last one represents the result of the query “Get the identifiers of professors’ names, their ages, and the identifiers of lectures they give naming them with an auxiliary name lect”.

[Figure: example result tables, among them {<4>}, {<i33>, <i40>, <i47>}, {<p(i1)>, <p(i7)>}, and {<"Russell">, <"Jones">, <"Black">}]
Fig. B.6. Example result tables

If a table has one row and one column, it is a 1×1 table. No formal distinction is made between such a table and the element stored in it. For instance, the first table from Fig. B.6 is equivalent to the value 4.

B.6.3. SBQL Semantics

To define the operational semantics of queries another concept is introduced: an auxiliary stack QRES (Query RESults), which stores all temporary and final results of (sub)queries. Its elements are tables, as defined in the previous subsection. At the beginning of evaluation the stack is empty.

In SBA a special recursive procedure eval is used to define the semantics of SBQL. It maps a syntactically correct query and a machine state to the result table of that query. The machine state consists of:
the state of the object store,
the state of ES.

The eval procedure modifies ES; however, the state of the stack after evaluation is always the same as before that evaluation (cf. Fig. B.4). The result of a (sub)query is pushed onto the top of QRES. The procedure is defined by cases:

For the query l, where l is a literal denoting an atomic value v ∈ V, eval(l) pushes a 1×1 table {<v>} onto the top of QRES. For instance, the query "2" pushes a 1×1 table containing the value 2.
For the query n, where n ∈ N, eval(n) inspects ES from top to bottom and pushes the result of binding n onto the top of QRES as a single-column table of the values of all the binders named n occurring in the first section containing one or more such binders. For instance (cf. Fig. B.4), the query Student pushes onto QRES the second table shown in Fig. B.6.

A slightly different case is when n is the name of a subprogram. In such a case n is evaluated as follows:
1. First, n is bound on ES in the standard way as described above, and the procedure is invoked. A new section storing its parameters, local environment, and return address is pushed onto ES.
2. Next, its body is executed. If it is a function, then it returns its result table just like any (sub)query by pushing it onto QRES.
3. Finally, it is terminated, and the ES section pushed for the procedure is popped.

Queries (and their results) are combined by operators, which in SBA are subdivided into algebraic and non-algebraic. The main difference between algebraic and non-algebraic operators is whether they modify the state of ES during evaluation.

B.6.3.1. Algebraic Operators

An operator is algebraic if it does not modify the state of ES. The majority of operators in SBQL are algebraic. They include numerical comparisons, numerical operators, string comparisons, Boolean and, or, not, aggregate functions, set, bag and sequence operators and comparisons, the Cartesian product (denoted by a comma), etc.

The definition of an algebraic operator is as follows. Let q1 Δ q2 be a query formed of two subqueries connected by a binary algebraic operator Δ. The eval procedure evaluates q1 and pushes its result table onto the top of QRES, then does the same with q2, and applies Δ to the two top tables of QRES as its arguments (here Δ stands for the semantics of the operator). Finally, it removes those two tables from QRES and pushes the final result onto it. (Similarly for a unary operator.)
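The evaluation scheme for literals and binary algebraic operators described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the Loqis implementation: the class names Literal and Algebraic, the OPERATORS table, and the function eval_query are hypothetical, tables are modelled as lists of tuples, and only 1×1 operand tables are handled.

```python
from dataclasses import dataclass
import operator


@dataclass
class Literal:
    """An atomic query that is a literal (hypothetical AST node)."""
    value: object


@dataclass
class Algebraic:
    """A query q1 op q2 with a binary algebraic operator (hypothetical AST node)."""
    op: str
    q1: object
    q2: object


# Assumed operator table; a real SBQL system supports many more operators.
OPERATORS = {"+": operator.add, "=": operator.eq, "<": operator.lt, ">": operator.gt}


def eval_query(query, qres):
    """Evaluate a query and push its result table onto the QRES stack (a list)."""
    if isinstance(query, Literal):
        # A literal yields a 1x1 table holding its value.
        qres.append([(query.value,)])
    elif isinstance(query, Algebraic):
        eval_query(query.q1, qres)   # result of q1 is pushed onto QRES
        eval_query(query.q2, qres)   # result of q2 is pushed onto QRES
        q2result = qres.pop()        # read and pop the result of q2
        q1result = qres.pop()        # read and pop the result of q1
        # Simplification: both operands are assumed to be 1x1 tables.
        left, right = q1result[0][0], q2result[0][0]
        qres.append([(OPERATORS[query.op](left, right),)])
    else:
        raise TypeError(f"unsupported query: {query!r}")


qres = []
eval_query(Algebraic("+", Literal(2), Literal(2)), qres)
print(qres)  # [[(4,)]]
```

Note that ES is never touched in this sketch, which is exactly what makes the operator algebraic in the sense used here.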
The corresponding part of the eval procedure is presented below:

procedure eval(query);
begin
  ...
  case query is q1 Δ q2: (*Δ is an algebraic operator*)
  begin
    q1result, q2result: table;
    eval(q1); (*evaluate q1 and push the result onto QRES*)
    eval(q2); (*evaluate q2 and push the result onto QRES*)
    q2result := top(QRES); (*read the result of q2*)
    pop(QRES); (*pop the result of q2*)
    q1result := top(QRES); (*read the result of q1*)
    pop(QRES); (*pop the result of q1*)
    push(QRES, q1result Δ q2result); (*apply Δ to the results of both the subqueries and push the final result onto QRES*)
  end; (*case*)
  ...
end; (*eval*)

One of the most useful algebraic operators is the definition of an auxiliary name n (for the result of a query q). The semantics of this operator is the following: it assigns the name n to each row of the result table returned by q. If, for example, the query Professor returns the table {<i1>, <i7>} then the query Professor as p returns the table {<p(i1)>, <p(i7)>} (cf. Fig. B.6). Similarly, if some query q returns the table {<"Russell">, <"Jones">, <"Black">} then q as N returns {<N("Russell")>, <N("Jones")>, <N("Black")>}.

Another naming operator is group as, which names the entire result of a query. The semantics of this operator is as follows: if q returns a table t, then the query q group as n returns a single binder n(t). For instance, for the previous query the result of q group as N is {N(<"Russell">, <"Jones">, <"Black">)}.

The dereferencing operator is algebraic as well; it is called implicitly by some operators (only objects storing atomic values can be dereferenced). For example, in the subquery age > 40 the subquery age returns a 1×1 table with an identifier, say i9. The comparison operator > calls dereferencing, which replaces this identifier with the value 44.

B.6.3.2. Non-Algebraic Operators

The evaluation of non-algebraic operators (a selection, a dot, a dependent join, quantifiers, etc.) is more complicated and requires further notions.
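Because non-algebraic operators work by pushing new sections onto ES, it may help to restate the name-binding rule from B.6.3 in executable form first. The Python sketch below is an illustration only: the function bind and the representation of a section as a list of (name, value) pairs are assumptions made for this example, and returning an empty table for an unbound name is likewise an assumption, since that case is not specified here. The identifiers are taken from the example database.

```python
def bind(name, es):
    """Bind a name against ES, searching from the top section to the bottom.

    Returns a single-column table (a list of 1-tuples) with the values of all
    binders called `name` found in the first section that has any such binder.
    """
    for section in reversed(es):  # the top of the stack is the last list element
        values = [value for binder_name, value in section if binder_name == name]
        if values:
            return [(value,) for value in values]
    return []  # assumption: an unbound name yields an empty table


# Base section with binders to root database objects.
base_section = [
    ("Student", "i33"), ("Student", "i40"), ("Student", "i47"),
    ("Professor", "i1"), ("Professor", "i7"),
    ("Faculty", "i16"), ("Lecture", "i21"), ("Lecture", "i26"),
]
# Section holding the binders to the attributes of the Student object i40.
student_i40 = [
    ("name", "i41"), ("age", "i42"), ("year", "i43"),
    ("grades", "i44"), ("attends", "i45"), ("attends", "i46"),
]

es = [base_section, student_i40]
print(bind("age", es))      # [('i42',)]
print(bind("Student", es))  # [('i33',), ('i40',), ('i47',)]
print(bind("attends", es))  # [('i45',), ('i46',)]
```

The search stops at the first section that contains a matching binder, so a binder in a section nearer the top hides binders with the same name further down.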
Recall that if Δ is an algebraic operator, then the subqueries q1 and q2 occurring in the query q1 Δ q2 are evaluated independently; then their results make up the final query result. This is not the case for non-algebraic operators. If the query q1 θ q2 involves a non-algebraic operator θ, then q2 is evaluated in the context determined by q1. This is the reason for which these operators are referred to as “non-algebraic”: they do not follow the basic property of algebraic expressions.

The query q1 θ q2 is evaluated as follows. For each row r of the table returned by q1, the subquery q2 is evaluated. Before each such evaluation ES is augmented with a new scope (that can be formed of a section or of several sections, cf. Fig. B.4 and Fig. B.5) determined by r. After evaluation the stack is popped to the previous state. A partial result of evaluation is a combination of a row r and the table returned by q2 for that row; the kind of combination depends on θ. Next, those partial results are merged to form the final result.

New stack section(s) (pushed onto ES as a new environment) are constructed and returned by a special function nested for a row r in the following way:

If r consists of a single identifier of a link object, e.g., i6, then one section is built; the section contains the binder of the object the link points to (for i6 it is Lecture(i21)).

If r contains a binder, then one section is built; the section contains that binder.

If r consists of a single identifier of a complex object, e.g., i7, then potentially several sections are built; the sections contain binders to attributes of that object and binders to the invariants of its class and superclass(es). For i7 the scope is the top five sections in Fig. B.5.
The top two sections contain binders to the internal properties of the object whose OID is i7; since in our example database all attributes are public, the top section contains the set of binders {name(i8), age(i9), salary(i10), works_in(i11), gives(i12), prev_job(i13)}, and the other one is empty (since the classes Professor and Person have no invariants, the other three sections are empty too).

For other kinds of elements of r the nested function does not build any section(s) (that is, its result is empty).

If as or group as has been applied to a table, then each of its rows or the whole table, respectively, is treated as a binder. The result of nested for one element of a table row r is called a list of sections. If r contains more than one element, then all the lists of sections built by nested form a new scope: sections of different lists that are at the same level are merged. Fig. B.7 presents the state of ES after a new scope was pushed by the where operator occurring in the following interactive query:

(Professor ⋈ gives.Lecture ⋈ (count(attended_by) as c)) where q

[Figure: the sections pushed by the where operator (binders to public attributes of the current Professor object and of the current Lecture object together with the binder for the result of the subquery (count(attended_by) as c); binders to public properties of the Lecture class and of the Professor class; binders to public properties of the Person class), placed on top of the base sections]
Fig. B.7. ES with a new scope

The thick horizontal line separates different environments, and the thin horizontal lines separate different sections.

The general pattern of evaluation of queries with non-algebraic operators is as follows:

procedure eval(query);
begin
  ...
  case query is q1 θ q2: (*θ is a non-algebraic operator*)
  begin
    partial_results: array of table;
    final_result: table;
    i: integer;
    i := 1;
    eval(q1); (*evaluate q1 and push the result onto QRES*)
    for each row r in top(QRES) do
    begin
      push(ES, nested(r)); (*open a new scope on ES*)
      eval(q2); (*evaluate q2 and push the result onto QRES*)
      partial_results[i] := combine(r, top(QRES)); (*create a partial result*)
      pop(ES); (*pop the current scope*)
      pop(QRES); (*pop the result of q2*)
      i := i + 1;
    end; (*for each*)
    final_result := merge(partial_results); (*create the final result*)
    pop(QRES); (*remove the table created by q1*)
    push(QRES, final_result); (*push the final result onto QRES*)
  end; (*case*)
  ...
end; (*eval*)

The pattern is a little different for the operators of ordering and transitive closure (see [SBMS93]). The SBA semantics is a uniform basis for the definition of several non-algebraic operators of OQL-like languages:

Selection: q1 where q2, where q1 is any query, and q2 is a Boolean-valued query. If q2 returns TRUE for a particular row r returned by q1, then r is an element of the final result table; otherwise it is skipped.

Navigation, projection, path expression: q1.q2. The final result table is the union of tables returned by q2 for each row r returned by q1. This construct covers generalized path expressions, e.g., q1.q2.q3.q4 is understood as ((q1.q2).q3).q4.

Dependent join, navigational join: q1 ⋈ q2. A partial result for a particular r returned by q1 is a table obtained by concatenating the row r with each row returned by q2 for this r. The final result is the union of the partial results.

Quantifiers: ∀q1(q2) and ∃q1(q2), where q1 is any query, and q2 is a Boolean-valued query. For ∀ the final result is FALSE if q2 returns FALSE for at least one row r returned by q1; otherwise the final result is TRUE. For ∃ a dual definition is applied.

B.7. An Example of Query Evaluation

To summarize we present in Fig.
B.8 all stages of an evaluation of a simple interactive query in SBA (for simplicity we assume that there is only one base section and we do not show sections opened for class properties):

Student where (age > 25)

[Figure: the initial state of ES; the states of ES during evaluation of (age > 25) for the rows i33, i40, and i47 returned by Student, where binding age yields i35, i42, and i49, dereferencing yields 23, 45, and 20, and the comparison with 25 yields false, true, and false; the final result is i40]
Fig. B.8. Evaluation of the query Student where (age > 25)

Appendix C. List of Figures

Fig. 1.1 Roles played by a person
Fig. 3.1 An example of inheritance
Fig. 3.2 Inheritance and availability
Fig. 3.3 An example of class diagram with objects
Fig. 3.4 An example of class diagram with multiple inheritance (in UML)
Fig. 3.5 The inheritance in a tree and lattice hierarchy
Fig. 3.6 The ambiguity of reference copying (deep, shallow copying)
Fig. 3.7 An example of multiple-aspect inheritance in UML
Fig. 3.8 Different inheritance paths
Fig. 4.1 An example of prototyping
Fig. 4.2 A relation between a table (persons) and a subtable (employees)
Fig. 4.3 A class diagram for Object Role pattern (taken from [BRSW97])
Fig. 5.1 Roles played by a person
Fig. 5.2 Objects with their roles and classes
Fig. 5.3 An example state of an object store in the classical object store model
Fig. 5.4 An example state of the object store in the object model with roles
Fig. 5.5 Graphical representation of the object store from Fig. 5.4
Fig. 5.6 Interfaces to database objects with roles
Fig. 5.7 Declarations of data structures stored in the database
Fig. 5.8 A conceptual structure of the metadata repository (in UML)
Fig. 5.9 An example state of the environment stack for classical object store model
Fig. 5.10 An example state of the environment stack for the object store with roles
Fig. 5.11 Sections pushed onto ES by a non-algebraic operator in the classical object store model
Fig. 5.12 Sections pushed onto ES by a non-algebraic operator in the object model with roles
Fig. 5.13 States of ES during processing a query
Fig. 5.14 States of QRES during evaluation of the query: (Student) Employee
Fig. 5.15 An example of an object with nested roles (Conference)
Fig. 6.1 An example of dynamic classification in [FS97]
Fig. 6.2 A man has one role
Fig. 6.4 A man has at most one role of a given kind
Fig. 6.5 A man has several roles of the same kind
Fig. 6.6 A class diagram modeling the situation from Fig. 6.5 through composition
Fig. 6.7 A class diagram modeling the situation from Fig. 6.5 through dynamic object roles
Fig. 6.8 A class diagram from Fig. 6.7 in separated target style
Fig. 6.9 A diagram in which a class of roles is involved both in static specialization and the RoleOf relationship
Fig. 6.10 An example of the RoleOf relationship between roles with an xor-constraint
Fig. 6.11 A class diagram involving static specialization and the RoleOf relationship in one hierarchy
Fig. 6.12 A diagram with multiple RoleOf relationships instead of multiple-aspect static inheritance
Fig. 6.13 A person with a SoccerFan role
Fig. 6.14 A class diagram in which the same role class is defined for two classes with different natures
Fig. 6.15 Subscriber example
Fig. 6.16 An object diagram from Fig. 6.5
Fig. 7.1 An example of class schema
Fig. 7.2 The query from Example 1 after the syntax analysis
Fig. 7.3 The result of query from Example 1
Fig. 7.4 The result of query from Example 2
Fig. 7.5 The result of query from Example 3
Fig. A.1 The syntax tree for query
Fig. B.1 The class diagram of the example database
Fig. B.2 The object store
Fig. B.3 The object store as a graph
Fig. B.4 The beginning and final states of ES
Fig. B.5 A more complex state of ES
Fig. B.6 Example result tables
Fig. B.7 ES with a new scope
Fig. B.8 Evaluation of the query Student where (age > 25)

Bibliography

[ABGO93] A. Albano, R. Bergamini, G. Ghelli, R. Orsini. An Object Data Model with Roles. Proc. of the 19th VLDB Conf., Dublin, Ireland, pp. 39-51, 1993
[AGO95] A. Albano, G. Ghelli, R. Orsini. Fibonacci: A Programming Language for Object Databases. VLDB Journal, vol. 4, no. 3, pp. 403-444, 1995
[AU79] A. V. Aho, J. D. Ullman. Universality of Data Retrieval Languages. Proc. of 6th ACM Symposium on Principles of Programming Languages, San Antonio, TX, ACM New York, pp. 110-117, Jan. 1979
[Alag97] S. Alagic. The ODMG Object Model: Does it Make Sense? Proc. of OOPSLA Conf., pp. 253-270, 1997
[AAG00] A. Albano, G. Antognoni, G. Ghelli. View Operations on Objects with Roles for a Statically Typed Database Language. TKDE 12(4), pp. 548-567, 2000
[ANSI99] American National Standards Institute (ANSI) Database Committee (X3H2), Database Language SQL Part 2: Foundation (SQL/Foundation), J. Melton, Editor, Working Draft, March 1999, <ftp://jerry.ece.umassd.edu/isowg3/dbl/BASEdocs/public/sql-foundationwd-1999-03.pdf>
[BCGKWB87] J. Banerjee, H. T. Chou, J. F. Garza, W. Kim, D. Woelk, N. Ballou.
Data Model Issues for Object-Oriented Applications, ACM Trans. on Office Information Systems, 5(1), pp. 3-26, January 1987
[BD77] C. Bachman, M. Daya. The Role Concept in Data Models. Proc. of the 3rd VLDB Conf., Tokyo, pp. 464-476, 1977
[BG95] E. Bertino, G. Guerrini. Objects with Multiple Most Specific Classes. Proc. of ECOOP Conf., Springer LNCS 952, pp. 102-126, 1995
[BO98] C. Bock, J. J. Odell. A More Complete Model of Relations and Their Implementation: Roles, Journal of Object-Oriented Programming, 11:2, pp. 51-54, 1998
[Booch94] G. Booch. Object-Oriented Analysis and Design with Applications, Addison-Wesley, Menlo Park, 1994
[BRSW00] D. Bäumer, D. Riehle, W. Siberski, M. Wulf. Role Object. In Pattern Languages of Program Design 4. Edited by N. Harrison, B. Foote, H. Rohnert. Addison-Wesley, pp. 15-32, 2000
[CM99] P. Coad, M. Mayfield. JAVA Design: Building Better Apps and Applets, 2nd edition, Yourdon Press, Upper Saddle River, 1999
[CZ97] W. W. Chu, G. Zhang. Associations and Roles in Object-Oriented Modeling, Proceedings of the 16th International Conference on Conceptual Modeling: ER’97, Springer Verlag, Berlin, pp. 257-270, 1997
[DCMI] DCMI. Dublin Core Metadata Initiative. http://dublincore.org
[Dijkstra76] E. W. Dijkstra. A Discipline of Programming. Englewood Cliffs, NJ, Prentice Hall, 1976
[EWH85] R. Elmasri, J. Weeldreyer, A. Hevner. The Category Concept: an Extension to the Entity Relationship Model, Data & Knowledge Engineering 1:1, pp. 75-116, 1985
[Fishman87] D. Fishman et al. IRIS: An Object-Oriented Database Management System. ACM Transactions on Office Information Systems, 5(1), pp. 48-69, 1987
[Fowler97] M. Fowler. Dealing with Roles. http://www.martinfowler.com/apsupp/roles.pdf, 1997
[FS97] M. Fowler, K. Scott. UML Distilled: Applying the Standard Object Modeling Language, Addison-Wesley, 1997
[GHJV95] E. Gamma, R. Helm, R. Johnson, J. Vlissides. Design Patterns - Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995
[Guarino92] N. Guarino. Concepts, Attributes and Arbitrary Relations, Data & Knowledge Engineering 8, pp. 249-261, 1992
[GSR96] G. Gottlob, M. Schrefl, B. Rock. Extending Object-Oriented Systems with Roles. ACM Transactions on Information Systems, pp. 268-296, 1996
[HO93] W. Harrison, A. Ossher. Subject-Oriented Programming (A Critique of Pure Objects). Proc. of 8th ECOOP Conf., ACM SIGPLAN Notices, vol. 28, no. 10, pp. 411-428, 1993
[JHPS01] A. Jodłowski, P. Habela, J. Płodzień, K. Subieta. Dynamic Object Roles in Conceptual Modeling and Databases. ICS PAS Reports No 932. Warszawa, November 2001
[JHPS02] A. Jodłowski, P. Habela, J. Płodzień, K. Subieta. Objects and Roles in the Stack Based Approach. Proceedings of the Database and Expert Systems Applications Conference, Springer LNCS 2453, pp. 514-523, 2002
[JPSS02] A. Jodłowski, J. Płodzień, E. Stemposz, K. Subieta. Introducing Dynamic Object Roles into the UML Class Diagram, Proc. of IASTED International Conference on Software Engineering and Applications (SEA), ACTA Press, pp. 629-634, Cambridge, MA, USA, 2002
[KRR00] G. Kappel, S. Rausch-Schott, W. Retschitzegger. A Framework for Workflow Management Systems Based on Objects, Rules and Roles. ACM Computing Surveys 32, p. 27, 2000
[KLMM+97] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J. Loingtier, J. Irwin. Aspect-Oriented Programming. Proc. of 11th ECOOP Conf., Springer LNCS 1241, 1997
[KØ96] B. B. Kristensen, K. Østerbye. Roles: Conceptual Abstraction Theory and Practical Language Issues. Theory and Practice of Object Systems, vol. 2, no. 3, pp. 143-160, 1996
[Kristensen95] B. B. Kristensen. Object-Oriented Modeling with Roles. Proc. of OOIS Conf., pp. 57-71, 1995
[LCW97] F. M. Lam, H. L. Chau, R. K. Wong. An Efficient Indexing Scheme for Objects with Roles. Proc. of BNCOD, pp. 139-153, 1997
[LL94] Q. Li, F. H. Lochovsky. Roles: Extending Object Behavior to Support Knowledge Semantics. Proc. of Intl. Symposium on Advanced Database Technologies and Their Integration, Nara, Japan, pp. 314-322, 1994
[LL98] Q. Li, F. H. Lochovsky. ADOME: An Advanced Object Modeling Environment. IEEE Transactions on Knowledge and Data Engineering, 10(2), pp. 255-276, 1998
[LW99] Q. Li, R. K. Wong. Multifaceted Object Modeling with Roles: A Comprehensive Approach. Information Sciences 117 (3-4), pp. 243-266, 1999
[MD93] G. Maughan, B. Durnota. MON: An Object Relationship Model Incorporating Roles, Classification, Publicity and Assertions, Proceedings of OOIS'94, International Conference on Object Oriented Information Systems, 1993
[MMW94] A. O. Mendelzon, T. Milo, E. Waller. Object Migration, Proceedings of the 13th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pp. 232-242, 1994
[ODMG00] Object Data Management Group. The Object Database Standard ODMG, Release 3.0. R. G. G. Cattell, D. K. Barry, Eds., Morgan Kaufmann, 2000
[OMG95] Object Management Group. CORBA: The Common Object Request Broker: Architecture and Specification. July 1995, Release 2.0. http://www.omg.org
[Papazoglou91] M. Papazoglou. Roles: A Methodology for Representing Multifaceted Objects. Proc. of Intl. Conf. on Database and Expert Systems Applications, pp. 7-12, 1991
[Pernici90] B. Pernici. Objects with Roles. Proc. of IEEE/ACM Conf. on Office Information Systems, Cambridge, Mass., 1990
[PK00] J. Płodzień, A. Kraken. Object Query Optimization through Detecting Independent Subqueries. Information Systems, Vol. 25, No. 8, pp. 467-490, 2000
[Płodzień00] J. Płodzień. Optimization Methods in Object Query Languages. Ph.D. Thesis. Institute of Computer Science, Polish Academy of Sciences, 2000, available via http://www.ipipan.waw.pl/~jpl
[PS01a] J. Płodzień, K. Subieta. Applying Low-Level Query Optimization Techniques by Rewriting, Proc. of Database and Expert Systems Applications (DEXA), Springer LNCS 2113, pp. 867-876, Munich, Germany, 2001
[PS01b] J. Płodzień, K. Subieta. Query Optimization through Removing Dead Subqueries, Proc. of Advances in Databases and Information Systems (ADBIS), Springer LNCS 2151, pp. 27-40, Vilnius, Lithuania, 2001
[PS01c] J. Płodzień, K. Subieta. Static Analysis of Queries as a Tool for Static Optimization, Proc. of International Database Engineering and Application Symposium (IDEAS), pp. 117-122, IEEE Computer Society, Grenoble, France, 2001
[Reimer98] U. Reimer. A Representation Construct for Roles, Data & Knowledge Engineering 1:3, pp. 233-251, 1985
[RG98] D. Riehle, T. Gross. Role Model Based Framework Design and Integration. Proc. of the 1998 Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA '98). ACM Press, pp. 117-133, 1998
[RH95] D. W. Renouf, B. Henderson-Sellers. Incorporating Roles into MOSES. In: C. Mingins, B. Meyer. Proceedings of the 15th International Conference on Technology of Object-Oriented Languages and Systems: Tools 15, Prentice Hall, pp. 71-82, 1995
[RS91] J. Richardson, P. Schwartz. Aspects: Extending Objects to Support Multiple, Independent Roles. Proc. of ACM SIGMOD Conf., pp. 298-307, 1991
[SBMS93] K. Subieta, C. Beeri, F. Matthes, J. W. Schmidt. A Stack-Based Approach to Query Languages, Institute of Computer Science, Polish Academy of Sciences, Report 738, Warsaw, December 1993
[SBMS94] K. Subieta, C. Beeri, F. Matthes, J. W. Schmidt. A Stack-Based Approach to Query Languages. Proc. of 2nd Intl. East-West Database Workshop, Klagenfurt, Austria, September 1994, Springer Workshops in Computing, pp. 159-180, 1995
[Sciore89] E. Sciore. Object Specialization. ACM Transactions on Information Systems, 7(1), pp. 103-122, April 1989
[SKL95] K. Subieta, Y. Kambayashi, J. Leszczyłowski. Procedures in Object-Oriented Query Languages. Proc. of VLDB Conf., pp. 182-193, 1995
[SMA90] K. Subieta, M. Missala, K. Anacki. The LOQIS System. Description and Programmer Manual, Institute of Computer Science, Polish Academy of Sciences, Report 695, Warsaw, November 1990
[SMSRW93] K. Subieta, F. Matthes, J. W. Schmidt, A. Rudloff, I. Wetzel. Viewers: A Data-World Analogue of Procedure Calls. Proc. 19th VLDB Conf., Dublin, Ireland, pp. 269-277, 1993
[SN88] M. Schrefl, E. Neuhold. Object Class Definition by Generalization Using Upward Inheritance. Proc. of 4th IEEE Intl. Conf. on Data Engineering, pp. 4-13, 1988
[Sowa88] J. F. Sowa. Conceptual Structures: Information Processing in Mind and Machine, Addison-Wesley, 1984
[Steimann00] F. Steimann. On the Representation of Roles in Object-Oriented and Conceptual Modeling. Data & Knowledge Engineering, 35(1), pp. 83-106, 2000
[Stein87] L. A. Stein. Delegation is Inheritance. Proc. ACM Conf. on Object-Oriented Programming Systems, Languages, and Applications, pp. 138-146, 1987
[Su91] J. Su. Dynamic Constraints and Object Migration. In: G. M. Lohman, A. Sernadas, R. Camps (eds), Proceedings of the 17th International Conference on Very Large Databases, VLDB Endowment Press, Saratoga, pp. 233-242, 1991
[Subieta85] K. Subieta. Semantics of Query Languages for Network Databases, ACM Transactions on Database Systems, Vol. 10, No. 3, pp. 347-394, 1985
[Subieta87] K. Subieta. Denotational Semantics of Query Languages, Information Systems, Vol. 12, No. 1, 1987
[Subieta90] K. Subieta. LOQIS: The Programming System Having Database Capabilities, Institute of Computer Science, Polish Academy of Sciences, Report 694, Warsaw, October 1990 (in Polish)
[Subieta91] K. Subieta. LOQIS: The Object-Oriented Database Programming System. Proc. of 1st Intl. East/West Database Workshop, Kiev, USSR 1990, Springer LNCS 504, pp. 403-421, 1991
[Subieta91a] K. Subieta. Virtual Persistent Object Store Package. Programmer Manual and Technical Description, 1991
[Subieta91b] K. Subieta. Dynamic Memory Package. User Manual for SUN, 1991
[Subieta97] K. Subieta. Object-Oriented Standards: Can ODMG OQL be Extended to a Programming Language? In: Cooperative Databases and Applications, World Scientific, pp. 459-468, 1997
[Subieta98] K. Subieta. Object-Orientedness in Design and Databases, Akademicka Oficyna Wydawnicza PLJ, Warsaw, 1998 (in Polish)
[Subieta99] K. Subieta. The Dictionary of Object-Oriented Terms, Akademicka Oficyna Wydawnicza PLJ, Warsaw, 1999 (in Polish)
[SZ89] L. Stein, S. Zdonik. Clovers: The Dynamic Behavior of Type and Instances. Technical Report CS-89-42, Brown University, November 1989
[UML99a] J. Rumbaugh, I. Jacobson, G. Booch. The Unified Modeling Language Reference Manual, Addison-Wesley, 1999
[UML01] Unified Modeling Language, version 1.4. Object Management Group, September 2001, http://www.omg.org
[W3CRDF] World Wide Web Consortium. Resource Description Framework (RDF), http://www.w3.org/RDF
[WCL96a] R. K. Wong, H. L. Chau, F. H. Lochovsky. DOOR: A Dynamic Object-Oriented Data Model With Roles. Proc. of 21st Intl. Conf. on Technology of Object-Oriented Languages and Systems (TOOLS), Prentice-Hall, November 1996
[WCL96b] R. K. Wong, H. L. Chau, F. H. Lochovsky. The Roles and Views of Multimedia Objects. Proc. of Intl. Conf. on Multi-Media Modeling. World Scientific Press, 1996
[WCL97] R. K. Wong, H. L. Chau, F. H. Lochovsky. A Data Model and Semantics of Objects with Dynamic Roles. Proc. of Intl. Conf. on Data Engineering, 1997
[WJ89] R. Wieringa, W. de Jonge. The Identification of Objects and Roles - Object Identifiers Revisited. Technical Report IR-267. Faculty of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, December 1991
[WJS95] R. Wieringa, W. de Jonge, P. Spruit. Using Dynamic Classes and Role Classes to Model Object Migration, Theory and Practice of Object Systems 1:1, pp. 61-83, 1995
[WL94] R. K. Wong, Q. Li. Advanced Object-Oriented Techniques on Modeling Automated Manufacturing Systems. Proc. of Intl. Conf. on Data and Knowledge Systems in Manufacturing Engineering, 1994
[WL95] R. K. Wong, Q. Li. Manufacturing Systems Modeling with Roles. A Comprehensive Approach. Proc. of IFIP WG2.6 Sixth Working Conference on Database Semantics (DS-6), Atlanta, Georgia, USA, pp. 461-478, 1995
[Wong99] R. K. Wong. Heterogeneous and Multifaceted Multimedia Objects in DOOR/MM: A Role-Based Approach with Views. Journal of Parallel and Distributed Computing 56 (3), pp. 251-271, 1999