Designing Valid XML Views

Ya Bing Chen, Tok Wang Ling, Mong Li Lee
Department of Computer Science
National University of Singapore

Outline

I. Introduction
II. ORA-SS data model
III. Designing Valid XML Views
IV. Comparison with Related Work
V. Conclusion & Future Work

I. Introduction

Background

- XML views are views in XML form on top of the underlying data.
- XML views enable the presentation and exchange of data from underlying databases in XML form on the Internet.
- XML views are analogous to relational views. They offer:
  - Logical data independence.
  - Data protection.
  - Flexibility of data presentation.

Related Works

ActiveViews system [2]
- Based on XML documents, the system offers a novel declarative view specification language for describing views that include the relevant data and activities of each actor participating in electronic commerce activities.
- The novelty is in the combination with active features.
- However, the views considered are much simpler than views in databases. Generally, the views support simple query operators such as the selection operator.

Mediation of Information using XML (MIX) [4]
- MIX provides users with an integrated XML view of the underlying heterogeneous sources. The sources may be relational databases, OO databases or HTML files.
- MIX uses XML DTD as the data model of XML views and a declarative query language called XMAS to define views.
- The novelty of MIX is a graphical user interface that integrates browsing and querying of XML views. It supports the selection operator.
- However, a DTD is not expressive enough to capture the semantics that hold in XML data. For example, a DTD cannot distinguish whether an attribute belongs to an object class or to a relationship type.

Our contribution

Problems in the related works:
- They do not support validation of XML views. That is, the designed views may violate implicit semantics.
- They do not support more complex operators, such as join, projection and swap.

We use a systematic approach to solve the two problems above:
- Transform XML documents into an ORA-SS schema diagram.
- Enrich the schema diagram with semantics.
- Propose a set of rules to guide the design of valid XML views.

II. ORA-SS data model

Main Concepts

- ORA-SS: Object-Relationship-Attribute model for Semi-Structured data.
- Three main parts: object class, relationship type and attribute.
  - An object class is similar to an element in XML documents.
  - A relationship type describes a relationship among object classes.
  - An attribute is a property of an object class or a relationship type.

An example of an ORA-SS schema diagram

- An object class is represented as a labeled rectangle.
- An attribute is represented as a labeled circle. A key attribute is represented as a filled circle.
- Attributes of a relationship type have labels on their incoming edges, while attributes of an object class do not.
- A relationship type is described by name, n, p, c:
  - name denotes the name of the relationship type.
  - n is the degree of the relationship type.
  - p is the participation constraint of the parent object class in the relationship type.
  - c is the participation constraint of the child object class in the relationship type.

[Figure 1: An ORA-SS schema diagram. Object classes project, supplier and part are nested in that order, with attributes jno, sno and pno. Relationship types: js, 2, 1:n, 1:n; sp, 2, 1:n, 1:n; spj, 3, 1:n, 1:n. The attribute price is labelled sp and the attribute qty is labelled spj.]
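As a rough illustration of the ORA-SS concepts in Section II, the sketch below models object classes and relationship types as plain Python records and parses a relationship label of the form name, n, p, c. The class names, field names, the helper parse_label and the participant lists are assumptions made for this example; they are not part of ORA-SS or of any tool mentioned in the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectClass:
    """An object class, drawn as a labeled rectangle in the diagram."""
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class RelationshipType:
    """A relationship type described by name, n, p, c."""
    name: str
    degree: int                  # n: degree of the relationship type
    parent_participation: str    # p: participation constraint of the parent
    child_participation: str     # c: participation constraint of the child
    participants: List[str] = field(default_factory=list)
    attributes: List[str] = field(default_factory=list)


def parse_label(label: str) -> RelationshipType:
    """Parse an edge label written as 'name, n, p, c' (hypothetical helper)."""
    name, degree, parent, child = [piece.strip() for piece in label.split(",")]
    return RelationshipType(name, int(degree), parent, child)


# Figure 1 transcribed by hand; the participant lists are an assumption
# read off the nesting project > supplier > part.
project = ObjectClass("project", ["jno"])
supplier = ObjectClass("supplier", ["sno"])
part = ObjectClass("part", ["pno"])

js = parse_label("js, 2, 1:n, 1:n")
js.participants = ["project", "supplier"]

sp = parse_label("sp, 2, 1:n, 1:n")
sp.participants = ["supplier", "part"]
sp.attributes = ["price"]    # price belongs to sp, not to part

spj = parse_label("spj, 3, 1:n, 1:n")
spj.participants = ["project", "supplier", "part"]
spj.attributes = ["qty"]     # qty belongs to spj
```

Keeping price and qty on the relationship types rather than on part records the distinction that, according to the slides, a DTD cannot express.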
III. Designing Valid XML Views

An introduction

Before designing valid XML views, there are two preprocessing steps:
- Transform the XML document into an ORA-SS schema diagram.
- Enrich the ORA-SS schema diagram with semantics.

Based on the enriched ORA-SS schema diagram, we begin to design XML views. Four operators can be applied to the XML views: selection, projection, join and swap.
- The first three operators are similar to selection, projection and join in relational databases.
- The fourth operator exchanges the positions of parent and child object classes.

Selection operator

A selection operator filters data by using predicates. For example, we design a view that depicts projects for which there exist suppliers for which there exist parts with a price > 80.

[Figure: source schema (as in Figure 1) and the view schema produced by the selection operator, which has the same structure and carries the condition price > 80 on the attribute price.]

Features of the selection operator:
- Selection operators put predicates on the source schema to filter data. They do not restructure the source schema.
- The resulting view schema contains the conditions specified in the selection operator.

Projection operator

- A projection operator selects or drops object classes or attributes in the source schema.
- The source semantics may be affected.
- For example, the following view drops the object class supplier and its attributes.

[Figure: source schema (project, supplier and part with relationship types js, sp and spj, as in Figure 1) and the view schema produced by the projection operator, in which project and part are linked by the relationship type jp, 2, 1:n, 1:n, with attributes jno, pno and avg_price.]

Several changes appear in the view schema:
- The attribute sno has been dropped with supplier.
- The relationship types js, spj and sp have been dropped.
- The attribute price has been mapped into an aggregate attribute, e.g., avg_price, which represents the average price of each part in a given project.
- The attribute qty has been dropped.

The example shows that flexible views can be designed based on ORA-SS with its additional semantics. However, we need to handle the views properly so that semantics will not be violated.

The rules for applying projection operators:
- Rule Proj1. If an object class has been dropped, its attributes must be dropped too.
- Rule Proj2. If an object class has been dropped, all relationship types containing the object class must be dropped too. The attributes of these relationship types must be dropped, or mapped into attributes with some aggregate function, such as avg, max/min or sum, or mapped into attributes typed in bag of values if they cannot be aggregated.

Based on these rules, the views designed are guaranteed to be valid when projection operators are applied.

Join operator

A join operator joins two object classes and their attributes together by a key-foreign key reference. For example, the following view joins project and project' together.

[Figure: source schema and view schema for the join operator. In the source schema, supplier, part and the reference object class project' are linked by the relationship types sp and spj, with attributes pno, price, jno and qty, while employee and project are linked by the relationship type mj, 2, 1:n, 1:n, with attributes eno, ename, jno, jname and progress. In the view schema, project takes the place of project' and carries jno and jname.]

In the view, the attributes jno and jname of project are selected and placed below the object class project. However, the attribute progress is dropped because it belongs to the relationship type mj, which does not exist in the view. Alternatively, the attribute progress can be mapped into an attribute typed in bag of values if users want to keep it.
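The choice described for progress (drop it, or keep it as a bag of values) is the same choice Rule Proj2 offers for price (drop it, aggregate it, or keep a bag). A minimal sketch of that decision follows; the function name map_relationship_attribute, the policy strings and the sample values are all hypothetical and serve only to illustrate the options.

```python
from statistics import mean

# Aggregate functions named in Rule Proj2: avg, max/min and sum.
AGGREGATES = {"avg": mean, "max": max, "min": min, "sum": sum}


def map_relationship_attribute(values, policy):
    """Decide what a view keeps of an attribute whose relationship type
    has been dropped (hypothetical helper).

    policy is "drop", "bag", or one of the keys of AGGREGATES.
    """
    values = list(values)
    if policy == "drop":
        return None                 # the attribute does not appear in the view
    if policy == "bag":
        return values               # keep every value, duplicates included
    if policy in AGGREGATES:
        return AGGREGATES[policy](values)
    raise ValueError(f"unknown policy: {policy!r}")


# Made-up sample values for one part within one project.
prices = [80, 100, 90]
avg_price = map_relationship_attribute(prices, "avg")

# Made-up progress values for one project across several employees.
progress_bag = map_relationship_attribute(["50%", "80%", "50%"], "bag")
dropped = map_relationship_attribute(["50%", "80%", "50%"], "drop")
```

Values such as the progress strings cannot be averaged meaningfully, which is the situation in which the slides suggest a bag of values instead of an aggregate.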
The rules for applying join operators:
- Rule Join1. When a join operator is applied to two object classes, if there are relationship types below the referenced object class that contain object classes above the referenced object class, then these relationship types must be dropped. The attributes of these relationship types must be dropped too, or mapped into attributes with some aggregate function, or mapped into attributes typed in bag of values if they cannot be aggregated.
- Rule Join2. When a join operator is applied to two object classes, if there are relationship types below the referenced object class that do not contain any object classes above the referenced object class, then these relationship types can be selected or dropped in the view according to the users' requirements. The attributes of these relationship types can likewise be selected or dropped according to the users' requirements.

Swap operator

A swap operator exchanges the positions of a parent object class and one of its child object classes. For example, the following view swaps supplier and part.

[Figure: source schema and view schema for the swap operator, with object classes project, supplier and part, relationship type labels ps, 2, 1:n, 1:n; jp, 2, 1:n, 1:n; sp, 2, 1:n, 1:n and spj, 3, 1:n, 1:n, and attributes jno, sno, pno, price and qty.]

In the view, the parent object class part and its child object class supplier are swapped, and the attribute sno moves with its object class supplier. However, the attribute price does not move with supplier. Because price is an attribute of the relationship type sp, it stays below the new lowest object class (part) of sp in the view. Similarly, since the attribute qty belongs to the relationship type spj, it also stays below the lowest object class (part) of spj.
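The placement of attributes after a swap can be sketched as a small computation: attributes of an object class travel with that class, while attributes of a relationship type are attached to whichever participating object class sits lowest in the view. The sketch assumes a single nesting path listed from the root downwards, and it assumes the view order project, supplier, part so that part is the lowest class of sp and spj, as the text above states. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def lowest_participant(view_order: List[str], participants: List[str]) -> str:
    """Return the participant nested deepest in the view (root comes first)."""
    return max(participants, key=view_order.index)


def place_attributes(
    view_order: List[str],
    object_attrs: Dict[str, List[str]],
    relationship_types: Dict[str, Tuple[List[str], List[str]]],
) -> Dict[str, List[str]]:
    """Compute which attributes appear below each object class in the view."""
    # Attributes of an object class stay with that object class.
    placement = {name: list(object_attrs.get(name, [])) for name in view_order}
    # Attributes of a relationship type go below its lowest participant.
    for participants, attrs in relationship_types.values():
        placement[lowest_participant(view_order, participants)].extend(attrs)
    return placement


# Assumed view order: part is the lowest object class of sp and spj.
view_order = ["project", "supplier", "part"]
object_attrs = {"project": ["jno"], "supplier": ["sno"], "part": ["pno"]}
relationship_types = {
    "sp": (["supplier", "part"], ["price"]),
    "spj": (["project", "supplier", "part"], ["qty"]),
}

placement = place_attributes(view_order, object_attrs, relationship_types)
```

Calling place_attributes again with the two lower classes listed in the opposite order moves price and qty to the other class while sno and pno stay with their own object classes, which is why the relationship-type attributes are computed from the view order rather than stored on an object class.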
The rules for applying swap operators:
- Rule Swap1. If two object classes are swapped in the view, then the attributes of each of the object classes must stay with the object class.
- Rule Swap2. If two object classes are swapped in the view, then the attributes of relationship types involving the two object classes must stay below the lowest participating object class in the relationship types.

Views on schema with IDD relationship

IDentifier Dependency (IDD) relationship

[Figure: ORA-SS source schema diagram of an IDD relationship type. The object class employee (attribute eno) is the parent of the object class child (attributes cname, sex and DoB) through the relationship type IDD, 2, 0:n, 1:1.]

Definition 1. An object class A is said to be ID Dependent (IDD) on its parent object class B if A does not have its own identifier attributes, and an A object can only be identified by its parent's key value (say k1) together with some of its own attributes (say k2). That is, the key of A is {k1, k2}. The relationship type between A and B is then called an IDD relationship type.

When projection, join or swap operators are applied to an IDD relationship type, the rules need to be modified. For example, we design a view that swaps employee and child. In the view, the key attribute of employee, eno, is added under the object class child so that {eno, cname} becomes a composite key for child.

[Figure: source schema of an IDD relationship type (employee with eno, child with cname, relationship type IDD, 2, 1:n, 1:1) and the view schema that swaps employee and child, in which child carries eno and cname and employee carries a derived eno. Constraint: the eno of employee must be the same as the eno of child.]

The rules for IDD relationships:
- Rule Proj_IDD. If a parent object class of an IDD relationship type is dropped in the view, then its key attribute must be added to the child object class to construct a key for the child.
- Rule Join_IDD. If a child object class of an IDD relationship type that is referenced by another object class in the source schema is joined in the view, then the key attribute of the parent object class must be added to the child to construct a key for the child.
- Rule Swap_IDD. If two object classes of an IDD relationship type are swapped in the view, then the key attribute of the parent object class must be added to the child object class to construct a key for the child.

View validation algorithm

- All the given rules are integrated into an algorithm to validate XML views.
- The algorithm monitors the process of designing a view until the view is completely designed.
- Depending on the operator applied, the algorithm uses the corresponding rules to modify the view schema and keep it valid.
- Once an operator is applied to the view, the algorithm first checks whether an IDD relationship type is involved and applies the IDD rules if so. Then the algorithm applies the normal rules for the operator.

IV. Comparison with Related Work

                           ActiveViews system   MIX system   Our approach
Data model                 XML                  XML DTD      ORA-SS
Projection operator        No                   No           Yes
Join operator              No                   No           Yes
Swap operator              No                   No           Yes
Validate views             No                   No           Yes
Design views graphically   No                   No           Yes

V. Conclusion & Future Work

Conclusion

We proposed a systematic approach for the design of valid XML views:
1. Transform an XML document into an ORA-SS schema diagram.
2. Enrich the ORA-SS schema diagram with additional semantics.
3. Develop a set of rules to guide the design of valid XML views.

The approach guarantees the validity of XML views and supports four operators: selection, projection, join and swap. The approach also handles IDD relationships.

Future work

- View definition generation. Generate the view definition in XQuery from the graphical view schema that has been designed.
- Query rewriting. Rewrite queries on views into queries on source data.
- View update. Which views are updateable and which are not? How should the updateable views be updated?

Q&A

References

1. S. Abiteboul. On views and XML. In Proceedings of the Eighteenth ACM Symposium on Principles of Database Systems, ACM Press, pages 1-9, 1999.
2. S. Abiteboul, B. Amann, S. Cluet, A. Eyal, L. Mignet, and T. Milo. Active views for electronic commerce. In Int. Conf. on Very Large DataBases (VLDB), Edinburgh, Scotland, pages 138-149, 1999.
3. S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. L. Wiener. The Lorel query language for semistructured data. International Journal of Digital Libraries, Volume 1, No. 1, pages 68-88, 1997.
4. C. Baru, A. Gupta, B. Ludaescher, R. Marciano, Y. Papakonstantinou, and P. Velikhov. XML-Based Information Mediation with MIX. ACM-SIGMOD, Philadelphia, PA, pages 597-599, 1999.
5. Gillian Dobbie, Xiaoying Wu, Tok Wang Ling, Mong Li Lee. ORA-SS: An Object-Relationship-Attribute Model for Semi-Structured Data. Technical Report TR21/00, School of Computing, National University of Singapore, 2000.
6. Tok Wang Ling, Mong Li Lee, Gillian Dobbie. Application of ORA-SS: An Object-Relationship-Attribute Model for Semi-Structured Data. In Proceedings of the Third International Conference on Information Integration and Web-based Applications & Services (IIWAS), Linz, Austria, 2001.
7. http://www.w3.org/TR/xquery.
8. http://www.w3.org/XML/Schema.