Object Oriented Databases

1. Advanced Database Applications
2. Object-Oriented Concepts
3. OODBMS
4. Common Issues
5. ODMG 2.0

Ioan Despi

1. Advanced Database Applications

RDBMS:
• widespread acceptance for traditional business applications: order processing, inventory control, banking, airline reservations
• proven inadequate for new technologies: computer-aided design (CAD), computer-aided manufacturing (CAM), computer-aided software engineering (CASE), office information systems and multimedia systems, digital publishing, geographic information systems

Disadvantages of relational DBMSs:
• Poor representation of “real world” entities
• Semantic overloading
• Poor support for integrity and enterprise constraints
• Homogeneous data structure
• Limited operations
• Difficulty handling recursive queries
• Impedance mismatch
• Other problems concerning concurrency, schema access, navigational access, and so on

2. Object-Oriented Concepts

Loosely speaking, an object corresponds to an entity in the ER model.
The object-oriented paradigm is based on encapsulating the data and code related to an object into a single unit.

Abstraction: the process of identifying the essential aspects of an entity and ignoring the unimportant properties.
1. Encapsulation: an object contains both the data structure and the set of operations that can be used to manipulate it.
2. Information hiding: the external aspects of an object are separated from its internal details, which are hidden from the outside world. The internal details of an object can be changed without affecting the applications that use it.

The current state of an object is described by one or more attributes, or instance variables. The value of each variable is itself an object.
1. A simple attribute: can be a primitive type (integer, string, real, …)
2. A complex attribute: can contain collections and/or references
3. A reference attribute: represents a relationship between objects; it contains a value, or a collection of values, which are themselves objects (like a foreign key or a pointer)

Complex object: an object that contains one or more complex attributes.
Notation: “dot” notation: branch.street, branch.manager, branch.city

Object = a uniquely identifiable entity that contains both the attributes that describe the state of the object and the actions that are associated with it, that is, its behaviour. (Simula)

The behaviour of an object is given by:
• a set of messages to which the object responds; each message may have 0, 1 or more parameters
• a set of methods, each of which is a body of code that implements a message; a method returns a value as the response to the message

The physical representation of data is visible only to the implementor of the object. Messages and responses provide the only external interface to an object.
The term message does not necessarily imply physical message passing. Messages can be implemented as procedure calls.

[Figure: object showing attributes and methods; the attributes are surrounded by Method 1, Method 2, Method 3 and Method 4]

Methods are programs written in a general-purpose language, respecting the following restrictions:
1. Only variables in the object itself may be referenced directly
2. Data in other objects are referenced only by sending messages

They can be used to change the object’s state by modifying its attribute values, or to query the values of selected attributes.
A method consists of a name and a body that performs the behavior associated with the method name:

method void update_salary (float increment)
{
    salary = salary + increment;
}

Messages are the means by which objects communicate. A message is simply a request from one object (the sender) to another object (the receiver), asking the second object to execute one of its methods. The sender and receiver may be the same object.
The “dot” notation is generally used to access a method.
staff_object.update_salary(1000)

In a traditional programming language, a message would be written as a function call:

update_salary(staff_object, 1000)

Object classes

Similar objects (those that have the same attributes and respond to the same messages) are grouped into a class. The attributes and associated methods are defined once for the class.
All objects in a class have the same:
• variable types
• message interface
• methods
They may differ in the values assigned to variables.
Classes are analogous to entity sets in the ER model.
Example: all branch objects would be described by a single Branch class.

BRANCH
Attributes: bno, street, city, area, …
Methods: print, update_tel_no, …

Instances:
bno=B5, street=12 Deer St, city=Sidcup, area=London, ...
bno=B7, street=16 Dever St, city=Dyce, area=Aberdeen, ...
bno=B3, street=154 Main St, city=Partick, area=Glasgow, ...

A class is itself an object ===> it has its own class attributes and class methods.
The class is an instance of a higher-level class, called a metaclass.
Class attributes describe the general characteristics of the class, such as totals or averages (e.g. the total number of branches).
Class methods are used to change or query the state of class attributes.
There are special class methods to create and destroy instances of the class: new (the constructor) and the destructor.

In the following example, employment-length is a derived attribute. For strict encapsulation, methods to read and set the other variables are also needed.

class employee {
    /* Variables */
    string name;
    string address;
    date start-date;
    int salary;
    /* Messages */
    int annual-salary();
    string get-name();
    string get-address();
    int set-address(string new-address);
    int employment-length();
};

Inheritance

Inheritance allows one class (the subclass) to be defined as a special case of a more general class (the superclass).
The process of forming a superclass is referred to as generalization.
The process of forming a subclass is referred to as specialization.
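The class, method, message and specialization ideas above can be sketched in Python. This is only an illustration: the attribute names, the sample values and the whole-year computation of the derived attribute are assumptions, not part of the slides.

```python
from datetime import date


class Staff:
    """A class groups objects that share attributes and methods."""

    # Class attribute: describes the class as a whole (a running total).
    total_staff = 0

    def __init__(self, name, salary, start_date):
        # Instance attributes hold the state of one object.
        self.name = name
        self.salary = salary
        self.start_date = start_date
        Staff.total_staff += 1

    def update_salary(self, increment):
        # Method: the body of code that implements the update_salary message.
        self.salary = self.salary + increment

    def employment_length(self, today):
        # Derived attribute: computed from start_date rather than stored.
        # Simplification (assumption): difference of calendar years only.
        return today.year - self.start_date.year


class Manager(Staff):
    """Specialization: Manager is defined as a special case of Staff."""

    def __init__(self, name, salary, start_date, bonus):
        super().__init__(name, salary, start_date)
        self.bonus = bonus  # property defined only by the subclass


# Hypothetical sample object; the message is sent with "dot" notation.
staff_object = Manager("Susan", 30000.0, date(1990, 6, 1), bonus=2000.0)
staff_object.update_salary(1000)
```

The subclass reuses update_salary without redefining it, which is the inheritance behaviour the slides describe.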
By default, a subclass inherits all the properties of its superclass(es) and, additionally, defines its own unique properties.
A subclass can redefine inherited methods.
All instances of the subclass are also instances of the superclass.
Principle of substitutability: we can use an instance of the subclass whenever a method or a construct expects an instance of the superclass.

The relation between a subclass and its superclass: A KIND OF (AKO).
The relation between an instance and its class: IS-A.
Examples: Manager is AKO Staff. Susan Deer IS-A Manager.

Inheritance:
1. Single inheritance: the subclass inherits from no more than one superclass
2. Multiple inheritance: the subclass inherits from more than one superclass ===> conflicts!

[Figure: single inheritance (classes Person, Staff, Manager, Sales_Staff) and multiple inheritance (Sales_Manager, with superclasses Manager and Sales_Staff)]

3. Repeated inheritance: a special case of multiple inheritance, in which the superclasses inherit from a common superclass. The inheritance mechanism must ensure that the subclass does not inherit properties twice.

[Figure: repeated inheritance (classes Staff, Manager, Sales_Staff, Sales_Manager)]

4. Selective inheritance: allows a subclass to inherit a limited number of properties from the superclass.

Object Identity

Each object is assigned an Object Identifier (OID) when it is created. The OID is:
• system generated
• unique to that object
• invariant
• independent of the values of its attributes
• invisible to the user

Other concepts:
• overriding (+ overloading)
• polymorphism & dynamic binding
• complex objects
• persistence

3. OODBMS

[Figure: history of data models. First generation DBMS (1960-1970): Hierarchical Data Model (IMS) and Network Data Model. Second generation DBMS (1970-1980): Relational Data Model (E. Codd, 1970). ER Data Model (Chen, 1976) and Semantic Data Model (Hammer, McLeod, 1981). Third generation DBMS (1980-2000): Object-Relational Data Model and Object Oriented Data Model]

OODM: a logical data model that captures the semantics of objects supported in object-oriented programming
OODB: a persistent and sharable collection of objects defined by an OODM
OODBMS: the manager of an OODB
(W. Kim 1991)

Zdonik & Maier (1991): an OODBMS must, at minimum:
• provide database functionality
• support object identity
• provide encapsulation
• support objects with complex state
or (Khoshafian & Abnous 1990):
object oriented = ADT + inheritance + OID
OODBMS = OO + database capabilities

Origins of the Object Oriented Data Model:
• Traditional DBMS: persistence, sharing, transactions, concurrency control, recovery control, security, integrity, querying
• OO programming: object identity, encapsulation, inheritance, types & classes, methods, complex objects, polymorphism, extensibility
• Semantic data models: generalization, aggregation
• Special requirements: versioning, schema evolution

Strategies for Developing an OODBMS:
1. Extend an existing object-oriented programming language with database capabilities. Smalltalk, C++, Java --> GemStone
2. Provide extensible object-oriented DBMS libraries. Ontos, Versant, ObjectStore
3. Embed object-oriented database language constructs in a conventional host language. O2 embeds OODL in C.
4. Extend an existing database language with object-oriented capabilities. Extend SQL --> SQL3, OQL.
5. Develop a novel database data model / data language. SIM (Semantic Information Manager, 1988).

OODBMS Perspectives:

Modern database systems are characterized by their support of the following features:
1. Data model: a particular way of describing data, relationships between data, and constraints on the data
2. Data persistence: the ability of data to outlive the execution of a program, and possibly the lifetime of the program itself
3. Data sharing: the ability of multiple applications to access common data, possibly at the same time
4. Reliability: the assurance that the data in the database is protected from hardware and software failures
5. Security: the protection of the data against unauthorized access
7. Integrity: the assurance that the data conforms to specified correctness and consistency rules
8. Distribution: the ability to physically distribute a logically interrelated collection of shared data over a network

Traditional programming languages:
1. Provide constructs for procedural control and for data and functional abstraction
2. Lack built-in support for many of the above database features

Novel applications require functionality from both perspectives.

Issues: 1. Persistence

Persistent objects must survive after the user session or application program that created them has terminated.
Transient objects last only for the invocation of the program.

There are three schemes for implementing persistence in an OODB:

A. Checkpointing: copy all or part of a program’s address space to secondary storage.
• a checkpoint can only be used by the program that created it
• a checkpoint may contain a large amount of data that is of no use in subsequent executions

B. Serialization: copy the closure of a data structure to disk. A write operation on a data value involves traversing the graph of objects reachable from the value and then writing a flattened version of the structure to disk. The flattened structure can later be read back to form a new copy of the original structure. The process is also called pickling or marshaling.
• It does not preserve object identity: if two data structures that share a common substructure are serialized separately, then on retrieval the substructure will no longer be shared in the new copies.
• It is not incremental, so saving small changes to a large data structure is not efficient.

C. Explicit paging: paging objects between the application heap and the persistent store. It requires the conversion of object pointers from a disk-based scheme to a memory-based scheme.

There are two common methods for creating/updating persistent objects:

a. Reachability-based: an object will persist if it is reachable from a persistent root object. At any time after creation, an object can become persistent by adding it to the reachability tree.
Garbage collection deletes objects when they are no longer accessible from any other object.
Smalltalk, Java

b. Allocation-based: an object is explicitly declared as being persistent within the application program.
i) By class: a class is statically declared to be persistent --> all instances of the class are made persistent when they are created. A class may be a subclass of a system-supplied persistent class.
Ontos, Objectivity/DB
ii) By explicit call: an object may be specified as persistent when it is created or, in some cases, dynamically at runtime (by adding it to a persistent collection).
ObjectStore

An alternative way to provide persistence in a programming language is orthogonal persistence, based on the following principles:
1. Persistence independence: the persistence of a data object is independent of how the program manipulates the data object and, conversely, a fragment of the program is expressed independently of the persistence of the data it manipulates.
2. Data type orthogonality: all data objects should be allowed the full range of persistence, irrespective of their type: PS-algol, Napier88, Galileo, GemStone. In contrast, in some languages persistence is a quality attributable to only a subset of the language data types: Pascal/R, Amber, E, Avalon/C++.
3. Transitive persistence: the choice of how to identify and provide persistent objects at the language level is independent of the choice of data types in the language. The most used technique is reachability-based.

Orthogonal persistence: advantages
1. There is no need to define long-term data in a separate schema language
2. No special application code is required to access or update persistent data
3. There is no limit to the complexity of the data structures that can be made persistent
4. Improved programmer productivity from simpler semantics
5. Improved maintenance
6. Consistent protection mechanisms over the whole environment
7. Support for incremental evolution
8. Automatic referential integrity

Issues: 2. Pointer Swizzling Techniques

Pointer swizzling: the action of converting object identifiers (OIDs) to main memory pointers, and back again.
Aim: to optimize access to objects.
Obvious approach: hold a lookup table that maps OIDs to main memory pointers.
Pointer swizzling stores the main memory pointers in place of the referenced OIDs, and vice versa when the object has to be written back to disk.

A. No swizzling: the OID is used every time the object is accessed. The system maintains a lookup table, so that the object’s virtual memory pointer can be located and then used to access the object.
• Could be inefficient if the same objects are accessed repeatedly
• Could be acceptable if applications access an object only once

B. Object referencing: to be able to swizzle a persistent object’s OID to a virtual memory pointer, a mechanism is required to distinguish between resident and non-resident objects. Most techniques are variations of edge marking or node marking (Hosking & Moss, 1993).

Virtual memory is considered to be a directed graph, with objects as nodes and references as directed edges:
1. Edge marking marks every object pointer with a tag bit. If the bit is set, the reference is a virtual memory pointer. Otherwise, it is still an OID and needs to be swizzled when the object it refers to is faulted into the application’s memory space.
2. Node marking requires that all object references are immediately converted to virtual pointers when the object is faulted into memory.
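The swizzling idea described above can be sketched as a small simulation. Python has no raw pointers, so an ordinary object reference stands in for the main memory pointer and a boolean field plays the role of the edge-marking tag bit. The class names, the dictionary used as "disk" and the OID strings are all hypothetical.

```python
class Ref:
    """A reference slot: holds an OID until swizzled, then a direct reference."""

    def __init__(self, oid):
        self.oid = oid
        self.target = None
        self.swizzled = False  # stands in for the edge-marking tag bit


class ObjectStore:
    """Toy persistent store: 'disk' maps OID -> attribute dict (assumed layout)."""

    def __init__(self, disk):
        self.disk = disk
        self.resident = {}  # lookup table: OID -> in-memory object
        self.faults = 0     # how many times an object was loaded from 'disk'

    def fault_in(self, oid):
        # Load the object from 'disk' only if it is not already resident.
        if oid not in self.resident:
            self.faults += 1
            self.resident[oid] = dict(self.disk[oid])
        return self.resident[oid]

    def deref(self, ref):
        # Swizzle on first use: later accesses skip the OID lookup entirely.
        if not ref.swizzled:
            ref.target = self.fault_in(ref.oid)
            ref.swizzled = True
        return ref.target

    def unswizzle(self, ref):
        # Before write-back, the reference reverts to its OID form.
        ref.target = None
        ref.swizzled = False
        return ref.oid


store = ObjectStore({"oid:1": {"bno": "B5", "city": "Sidcup"}})
ref = Ref("oid:1")
first = store.deref(ref)   # faults the object in and swizzles the reference
second = store.deref(ref)  # follows the swizzled reference directly
```

With no swizzling, every call to deref would go through the resident lookup table; here only the first access does.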
Edge marking is a software-based technique; node marking can be implemented using software- or hardware-based techniques.

C. Hardware-based schemes: use virtual memory access protection violations to detect accesses to non-resident objects (Lamb91).
They use the standard virtual memory hardware to trigger the transfer of persistent data from disk to main memory. Once a page has been faulted in, objects on that page are accessed via normal virtual memory pointers.
The hardware approach avoids the overhead of the residency checks incurred by software approaches, but it limits the amount of data that can be accessed during a transaction to the size of virtual memory and complicates other issues, such as recovery and fine-grained locking.
ObjectStore, Texas

Issues: 3. Transactions
• in classical DBMSs: short duration transactions
• in CAD, CASE, …: long duration transactions (hours, days)
• a need for new protocols: nested transactions, sagas, multi-level transactions

Issues: 4. Versions
Ontos, Versant, ObjectStore, Objectivity/DB, Itasca
• object version = an identifiable state of an object
• version history = the evolution of an object
• version management = object references always point to the correct version of an object

Types of versions:
1. Transient version: unstable; it can be updated and deleted. It can be created from new by checking out a released version from a public database, or by deriving it from a working or transient version in a private database; in the latter case, the base transient version is promoted to a working version. It is always stored in the creator’s private workspace.
2. Working version: stable; it cannot be updated, but it can be deleted by its creator. It is stored in the creator’s private workspace.
3. Released version: stable; it cannot be updated or deleted. It is stored in a public database by checking in a working version from a private database.

Issues: 5. Schema evolution
Design is an incremental process.
To support this process, applications require flexibility in dynamically defining and modifying the database schema.
Typical changes to the schema include:
• changes to the class definition: modifying attributes, modifying methods
• changes to the inheritance hierarchy: making a class S the superclass of a class C, removing a class S from the list of superclasses of C, modifying the order of the superclasses of C
• changes to the set of classes: creating and deleting classes, modifying class names

Client-Server Architectures:

1. Object server: distributes the processing between the two components.
Server process: responsible for managing storage, locks, commits to secondary storage, logging, recovery, security, query optimization and executing stored procedures.
Client process: responsible for transaction management and interfacing to the programming languages.

2. Page server: most of the database processing is performed by the client.
Server process: responsible for secondary storage and for providing pages at the client’s request.

3. Database server: most of the database processing is performed by the server.
Client process: passes requests to the server, receives results and passes them on to the application.
Used by relational DBMSs.

In each case, the server resides on the same machine as the physical database. The client may reside on the same or a different machine.
If the client needs access to databases distributed across multiple machines, then the client communicates with a server on each machine.
There may be a number of clients communicating with one server: for example, one client for each user or application.
Advantages of OODBMSs:
• enriched modeling capabilities
• extensibility
• removal of impedance mismatch
• more expressive query language
• support for schema evolution
• support for long duration transactions
• applicability to advanced database applications
• improved performance

Disadvantages of OODBMSs:
• lack of a universal data model
• lack of experience
• lack of standards
• query optimization compromises encapsulation
• locking at object level may impact performance
• complexity
• lack of support for views
• lack of support for security

Object Database Standard ODMG 2.0 (1997)

The Object Database Management Group proposed an OODM consisting of:
1. An object model
2. An object definition language (ODL), like a traditional DDL
3. An object query language, with an SQL-like syntax

The ODMG object model is a superset of the Object Management Group (OMG) object model.
1990: OMG published its Object Management Architecture (OMA) Guide document. It specified a single terminology for object-oriented languages, systems, databases and applications.

[Figure: Object Management Architecture. Application objects (WP, spreadsheet, CAD), common facilities (help, email, browser), Object Request Broker, and object services (storage, transaction management, versioning, security, queries)]

1. The Object Model (OM)
• is a design-portable abstract model for communicating with OMG-compliant object-oriented systems
• a requester sends a request for object services to the ORB, which keeps track of all the objects in the system and the types of services they can provide
• the ORB then forwards the message to a provider, who acts on the message and passes a response back to the requester via the ORB

[Figure: requester, ORB, provider]

2. The Object Request Broker (ORB)
• handles the distribution of messages between application objects
• is a distributed ‘software bus’ that enables objects (requesters) to make requests to, and receive responses from, a provider
• on receipt of a response from the provider, the ORB translates the response into a form the original requester can understand
--> provides a mechanism by which objects make and receive requests and responses transparently
--> interoperability between applications in a heterogeneous distributed environment

3. The Object Services (OS)
Provide the main functions for realizing basic object functionality:
• collection: a uniform way to create and manipulate the most common collections generically: sets, queues, stacks, lists, binary trees
• concurrency control: a lock manager that enables multiple clients to coordinate their access to shared resources
• event management: allows components to dynamically register or unregister their interest in specific events
• externalization: provides protocols and conventions for externalizing and internalizing objects.
  externalization records the state of an object as a stream of data (in memory, on disk, across a network); internalization creates a new object from it in a different process
• licensing: operations for metering the use of components, to ensure fair compensation for their use and to protect intellectual property
• lifecycle: operations for creating, copying, moving, and deleting groups of related objects
• naming: facilities to bind a name to an object relative to a naming context
• persistence: interfaces to mechanisms for storing and managing objects persistently
• property: operations to associate named values (properties) with any (external) component
• query: declarative query statements with predicates, the ability to invoke operations and other object services
• relationship: a way to create dynamic associations between components that know nothing of each other
• security: services such as identification and authentication, authorization and access control, auditing, security of communication, non-repudiation, administration
• time: maintains a single notion of time across different machines
• trader: a matchmaking service for objects. It allows objects to dynamically advertise their services, and other objects to register for a service.
• transactions: two-phase commit coordination among recoverable components, using flat or nested transactions

4. The Common Facilities (CF)
• comprise a set of tasks that many applications must perform but that are traditionally duplicated within each one.
• they are made available through OMA-compliant class interfaces
• in the latest version, CF are split into horizontal common facilities (printing, electronic mail, and so on) and vertical domain facilities (finance, healthcare, manufacturing, e-commerce, transportation, telecommunications)

The Common Object Request Broker Architecture (CORBA)
• defines the architecture of ORB-based environments
• is the basis of any OMG component, defining the parts that form the ORB and its associated structure

1991: CORBA 1.1 defined:
• the Interface Definition Language (IDL)
• Application Programming Interfaces (APIs), which enable client-server interaction with a specific implementation of an ORB
December 1994: CORBA 2.0 improved interoperability and specified how ORBs from different vendors can interoperate
1997: CORBA 2.1

Main elements:
• IDL: permits the description of class interfaces independently of any particular DBMS or programming language
• a type model that defines the values that can be passed over the network
• an Interface Repository, which provides information on interfaces and types and is used by the Dynamic Invocation Interface to construct dynamic runtime requests
• methods for getting the interfaces and specifications of objects
• methods for transforming OIDs to and from strings

From the IDL definitions, CORBA objects can be mapped into particular programming languages, such as C, C++, Smalltalk and Java.
This produces interface stubs within the application programming language (client) that are used to invoke the requests.
The same stubs are used on the object implementation side (server) to create skeletons, which are completed to provide the requested behavior.

The ODMG Object Model

Vendors: GemStone Systems, Object Design, O2 Technology, Versant Object Technology, UniSQL, POET Software, Objectivity, IBEX Computing SA, Lockheed Martin formed the Object Database Management Group (ODMG).
It produced an object model that specifies a standard model for the semantics of database objects.
The model is important because it determines the built-in semantics that the OODBMS understands and can enforce.
The design of class libraries and applications that use these semantics should be portable across the various OODBMSs that support the object model.

The major components of the ODMG architecture for an OODBMS are:
1. Object model (OM)
2. Object definition language (ODL)
3. Object query language (OQL)
4. C++ language bindings
5. Smalltalk language bindings
6. Java language bindings

Initial ODMG standard: 1993
Major version: ODMG 2.0, September 1997

1. The Object Model (OM)

The ODMG object model:
• is a superset of the OMG object model
• enables both designs and implementations to be ported between compliant systems

Basic modeling primitives: the object and the literal.
Objects and literals can be categorized into types: all objects of a given type exhibit common behavior and state. A type is itself an object.
Behavior is defined by a set of operations that can be performed on or by the object.
State is defined by the values an object carries for a set of properties.
A property may be either an attribute or a relationship between the object and one or more other objects.

Literal_type
• Atomic_type: long, short, unsigned long, unsigned short, float, double, boolean, octet, char, string, enum < >
• Collection_literal: set < >, bag < >, list < >, array < >, dictionary < >
• Structured_literal: date, time, timestamp, interval, structure < >

Object_type
• Atomic_object
• Collection_object: Set < >, Bag < >, List < >, Array < >, Dictionary < >
• Structured_object: Date, Time, Timestamp, Interval

A database stores objects, enabling them to be shared by multiple users and applications.
A database is based on a schema that is defined in ODL. The database contains instances of the types defined by its schema.
Object types are atomic, collection or structured types.
Types shown in italics are abstract types. Types shown in normal type are directly instantiable. They are the only base types.
Types with < > indicate type generators.
Objects are created using the new() method of the corresponding factory interface provided by the language binding.
All objects have an ODL interface, which is implicitly inherited by the definitions of all user-defined objects:

interface Object {
    enum Lock_Type {read, write, upgrade};
    exception LockNotGranted {};
    void lock(in Lock_Type mode) raises (LockNotGranted);
    boolean try_lock(in Lock_Type mode);
    boolean same_as(in Object anObject);
    Object copy();
    void delete();
};

Each object has a unique identity, the OID, which does not change and is not reused when the object is deleted.
In addition, each object can be given one or more names that are meaningful to the user.
Objects can be transient or persistent.

Literals: atomic, collection, structured, null
The values of a literal’s properties may not change.
Literals do not have their own OID and cannot stand alone as objects: they are embedded in objects.
Structured literals contain a fixed number of named heterogeneous elements of the form < name, value >, where value may be of any literal type.

struct Address {
    string street;
    string area;
    string city;
    string post_code;
};
attribute Address branch_address;

Collections: contain an arbitrary number of unnamed homogeneous elements, each of which can be an instance of an atomic type, a collection or a literal type.
There are ordered and unordered collections. Ordered collections must be traversed first to last or vice versa; unordered collections have no fixed order of iteration.
• Set: unordered collection that does not allow duplicates
• Bag: unordered collection that does allow duplicates
• List: ordered collection that allows duplicates
• Array: one-dimensional array of dynamically varying length
• Dictionary: unordered sequence of key-value pairs with no duplicate keys
Each subtype has operations to create an instance of the type and to insert an element into the collection.
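As a rough analogy only, the collection kinds listed above behave much like some Python built-ins. The mapping below is an assumption made for illustration (Python is not an ODMG language binding), and the sample values are made up.

```python
from collections import Counter

# List: ordered, duplicates allowed.
cities_list = ["London", "Glasgow", "London"]

# Set: unordered, duplicates are discarded.
cities_set = set(cities_list)

# Bag: unordered, duplicates are kept; Counter records how often each occurs.
cities_bag = Counter(cities_list)

# Dictionary: key-value pairs, each key appears at most once.
branch_city = {"B5": "London", "B3": "Glasgow"}

# Inserting an element into each kind of collection.
cities_list.append("Aberdeen")
cities_set.add("Aberdeen")
cities_bag["Aberdeen"] += 1
branch_city["B7"] = "Aberdeen"

# The usual set operations.
northern = {"Glasgow", "Aberdeen"}
in_both = cities_set & northern     # intersection
in_either = cities_set | northern   # union
not_northern = cities_set - northern  # difference
```

The bag is the one kind with no direct built-in; Counter is used here because it keeps a count per element, which is enough to show that duplicates survive.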
Sets and Bags have the usual set operations: union, intersection and difference.

interface Collection : Object {
    exception InvalidCollectionType{};
    exception ElementNotFound{any element;};
    unsigned long cardinality();
    boolean is_empty();
    boolean is_ordered();
    boolean allows_duplicates();
    boolean contains_element(in any element);
    void insert_element(in any element);
    void remove_element(in any element) raises (ElementNotFound);
    Iterator create_iterator(in boolean stable);
    BidirectionalIterator create_bidirectional_iterator(in boolean stable) raises (InvalidCollectionType);
};

ODL interface for collections

A type has a specification and one or more implementations.
The (external) specification defines the properties and operations that can be invoked on instances of the type.
An implementation defines the data structures, exceptions and methods that operate on the data structures to support the required state and behavior.
Class: the combination of a type specification and an implementation.
An interface definition is a specification that defines only the abstract behavior of an object type: supertypes, extent and keys.
A literal definition defines only the abstract state of a literal type.

Properties in the ODMG object model: attributes and relationships.

An attribute:
• is defined on a single object type
• is not a “first class” object (it is not an object) --> it has no OID
• has a value that is a literal or an OID

Relationships:
• are only binary and are defined between types
• cardinality: 1:1, 1:M, M:N
• a relationship is not a “first class” object and does not have a name
• traversal paths are defined in the interface for each direction of traversal
• on the many side, objects can be unordered (set, bag) or ordered (list)
The OODBMS maintains referential integrity.
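A minimal sketch of what maintaining referential integrity means for a binary 1:M relationship with one traversal path in each direction. The Branch and Staff classes reuse names from earlier slides; the attribute and method names (has, works_at, form_works_at, drop_works_at) are hypothetical, and a real OODBMS would do this bookkeeping internally.

```python
class Branch:
    def __init__(self, bno):
        self.bno = bno
        self.has = set()      # traversal path Branch -> Staff (many side)


class Staff:
    def __init__(self, name):
        self.name = name
        self.works_at = None  # traversal path Staff -> Branch (one side)

    def form_works_at(self, branch):
        # Forming the relationship updates both traversal paths together.
        self.drop_works_at()
        self.works_at = branch
        branch.has.add(self)

    def drop_works_at(self):
        # Dropping it also removes the inverse reference, so neither
        # side is left pointing at an object that no longer points back.
        if self.works_at is not None:
            self.works_at.has.discard(self)
            self.works_at = None


b5 = Branch("B5")
b3 = Branch("B3")
member = Staff("Susan")   # sample name, for illustration only
member.form_works_at(b5)
member.form_works_at(b3)  # moving the member also updates B5's side
```

The design point is that the application never edits one direction alone; every change goes through an operation that keeps the two traversal paths consistent.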
63 Example: a Branch Has a set of Staff and a member of Staff WorksAt a Branch: interface Branch { relationship set <Staff> Has inverse Staff:: WorksAt } interface Staff { relationship Branch WorksAt inverse Branch:: Has} The model has built-in operations to form and to drop members from relationships and to manage the required referential integrity constraints attribute BranchWorksAt; void form_WorksAt(in Branch aBranch); void drop_WorkAt(in Branch aBranch); 64 2. The Object Definition Language --ODL is a specification language for defining the specifications of object types for OMG-complian systems. facilitates portability of schemes between compliant systems defines the attributes and relationships of types specifies (but not addresses the implementation of) the signature of the operations the syntax of ODL extends the IDL (Interface Definition Language) of the CORBA will be the basis for integrating schemas from multiple sources and applications 65 3. The Object Query Language --OQL provides declarative access to the object database using an SQL-like syntax. does not provide explicit update operators, but leaves this to the operations defined on object types. can be used as a standalone or as an embedded language in another language (now: C++, Smalltalk, Java). can invoke operations programmed in these languages An OQL query is a function that delivers an object whose type may be infered from the operator contributing to the query expression. 66 Query definition expression: DEFINE Q AS e /* defines a query with name Q given a query /* expression e 1. Elementary expressions: • an atomic literal: 10, 17.5, ‘c’, “qwerty”, false, nill • a named object: • an iterator variable from the FROM clause of the SELECT-FROM-WHERE: e as x or e x or x in e where e is of type collection(T), then x is of type T • a query definition expression (Q above) 67 2. 
Construction expression:
• If T is a type name with properties p1, p2, ..., pn and e1, e2, ..., en are expressions, then T(p1 : e1, p2 : e2, ..., pn : en) is an expression of type T. Example:

    Branch(bno : "B22", manager : "Susan Brand")

• Similarly, we can construct expressions using struct, set, list, bag and array. For example,

    struct(bno : "B22", street : "166 Main ST")

  is an expression that dynamically creates an instance of this type.

3. Atomic Type Expressions
• Expressions can be formed using the standard unary and binary operations on expressions.
• If S is a string, expressions can be formed using:
    - the string concatenation operation (|| or +)
    - a string offset S[i], meaning the (i+1)th character of the string
    - S[low : up], meaning the substring of S from the (low+1)th to the (up+1)th character
    - c in S (where c is a character), returning a boolean expression
    - S like pattern, where pattern contains the characters ? or _, meaning any character, or the wildcard characters * or %, meaning any substring; this returns a boolean expression

4. Object Expressions
• Expressions can be formed using the equality and inequality operations (= and !=), returning a boolean.
• If e is an expression of a type having an attribute or a relationship P of type T, then e.P and e->P are expressions of type T.
• In the same way, methods can be invoked to return an expression.
• If a method has no parameters, the brackets in the method call can be omitted.

5.
Collection expressions
Expressions can be formed using:
• universal quantification: for all
• existential quantification: exists
• membership testing: in
• select clause: select from where
• sort-by operator: sort
• unary set operators: min, max, count, sum, avg
• group-by operator: group

The format of the SELECT clause is similar to the standard SQL SELECT clause:

    SELECT [DISTINCT] <expression>
    FROM <from_list>
    [WHERE <expression>]
    [GROUP BY <attributes> [HAVING <predicate>]]
    [ORDER BY <expression>]

where:

    <from_list> ::= <variable_name> IN <expression>
                  | <variable_name> IN <expression>, <from_list>
                  | <expression> AS <variable_name>
                  | <expression> AS <variable_name>, <from_list>

The result of a SELECT DISTINCT query is a set.
The result of a SELECT query is a bag.

6. Indexed Collection Expressions
• If e1 and e2 are lists or arrays and e3 and e4 are integers, then e1[e3], e1[e3:e4], first(e1), last(e1) and (e1 + e2) are expressions.

7. Binary Set Expressions
• If e1 and e2 are sets or bags, then the set operators union, except and intersect of e1 and e2 are expressions.

8. Structure Expressions
• If e is an expression and p is a property name, then e.p and e->p are expressions, which extract the property p of an object e.

9. Conversion Expressions
• If e is an expression, then element(e) is an expression that checks that e is a singleton, raising an exception if it is not.
• If e is a list expression, then listtoset(e) is an expression that converts the list into a set.
• If e is a collection-valued expression, then flatten(e) is an expression that converts a collection of collections into a collection, that is, it flattens the structure.
• If e is an expression and c is a type name, then c(e) is an expression that asserts that e is an object of type c, raising an exception if it is not.

10. Object Expressions
• If e is an expression and f is an operation, then e.f and e->f are expressions that apply an operation to an object.
The operation can optionally take a number of expressions as parameters.

A query consists of a (possibly empty) set of query definition expressions followed by an expression. The result of a query is an object with or without identity.

Examples:

A. Get the set of all staff (with identity):

    staff

B. Get the set of all branch managers (with identity):

    branch_offices.ManagedBy

C. Get the set of all staff who live in London (without identity):

    define Londoners as
        select x from x in staff
        where x.address.city = "London"

    select distinct x.name.lname from x in Londoners

   This returns a literal of type set<string>.

D. Get the structured set (without identity) containing name, sex and age for all staff who live in London:

    select distinct struct(lname : x.name.lname, sex : x.sex, age : x.age)
    from x in staff
    where x.address.city = "London"

   This returns a literal of type set<struct>.

E. Get the structured set (with identity) containing name, sex and age for all deputy managers over 60:

    type deputies {attribute lname : string; sex : string; age : integer;}

    deputies(select struct(lname : x.name.lname, sex : x.sex, age : x.age)
             from x in (select y from y in staff where y.position = "Deputy")
             where x.age > 60)

F. Get a structured set (without identity) containing the branch number and the set of all Assistants at the branches in London:

    select struct(bno : x.bno,
                  assistants : (select y from y in x.Has
                                where y.position = "Assistant"))
    from x in (select z from z in branch_offices
               where z.address.city = "London")

Objects without identity are created using struct (see D and F).
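The difference between bag and set results, and what "without identity" means for a struct, can be illustrated with a rough Python analogue of queries C and D. This is only a sketch of the semantics, not OQL: the sample staff records are invented for illustration, `Counter` stands in for a bag, a Python `set` for a set, and a `namedtuple` called `Row` (a hypothetical name) for a struct literal.

```python
from collections import Counter, namedtuple

Address = namedtuple("Address", "city")
Name = namedtuple("Name", "fname lname")


class Staff:
    def __init__(self, fname, lname, sex, age, city):
        self.name = Name(fname, lname)
        self.sex = sex
        self.age = age
        self.address = Address(city)


# Hypothetical sample data standing in for the extent `staff`.
staff = [
    Staff("Ann", "Beech", "F", 34, "London"),
    Staff("John", "White", "M", 61, "London"),
    Staff("Mary", "Beech", "F", 29, "London"),
    Staff("David", "Ford", "M", 45, "Glasgow"),
]

# define Londoners as select x from x in staff where x.address.city = "London"
londoners = [x for x in staff if x.address.city == "London"]

# A select without DISTINCT yields a bag: duplicates are kept and counted.
lname_bag = Counter(x.name.lname for x in londoners)

# A select with DISTINCT yields a set: duplicates collapse.
lname_set = {x.name.lname for x in londoners}

# struct(...) builds values without identity: two rows with equal fields
# are equal, unlike two distinct Staff objects.
Row = namedtuple("Row", "lname sex age")
rows = [Row(x.name.lname, x.sex, x.age) for x in londoners]
```

Here the `Staff` instances play the role of objects with identity (two instances with identical fields are still different objects), while the `Row` tuples compare by value only.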