REVIEW - DBGroup

We made some minor changes (some of them to satisfy the reviewers' requirements). In particular:

Introduction:
- we moved Figure 1 to section 1
- page 3, annotations of the GVV: we extracted the definition of annotation from the bullet list

Section 1
- section 1.3: "the first step of the ontology development" --> "the first step of the ontology building"
- section 1.5: we introduced footnote 5, we changed the explanation of schema-derived relationships (pg. 9), and we changed the representation of relationships producer in the example (pg. 9)

Section 3
- "Supporting the evolution of an ontology represents a challenging issue (to be faced)." --> "Supporting the evolution of an ontology represents a challenging issue to be faced."

The following pages include detailed answers to the reviewer.

REVIEW
******************************

Reviewer: 3

A. Reader Interest

1. Which category describes this manuscript?
Application

2. How relevant is this manuscript to the readers of this periodical? Please explain your rating.
Relevant

B.

1. Please summarize what you view as the key point(s) of the manuscript and the importance of this content to the readers of this periodical.
The paper presents an approach to use annotations of local database schemas in conjunction with a common thesaurus to generate a global virtual view along with associated annotations and extension of a built up ontology by addition of another source.

2. Is the manuscript technically sound? Please explain your answer.
Appears to be - but didn't check completely

C. Presentation

1. Are the title, abstract, and keywords appropriate? Please comment.
In the revised paper, they are more appropriate.

2. Does the manuscript contain sufficient and appropriate reference? Please comment.
Important references are missing: more references are needed

3. Does the introduction state the objective of the manuscript in terms that encourage the reader to read on? Please explain your answer.
Yes. Better than before

4. How would you rate the organization of the manuscript? Is it focused? Is the length appropriate?
Satisfactory

5. Please rate and comment on the readability of this manuscript.
Easy to Read

Section II. Summary and Recommendation

A. Evaluation
Fair

B. Recommendation
Please make your recommendation and explain your decision.
I would recommend a revision of the paper addressing the issues mentioned below.

Section III. Detailed Comments

A. Public Comments (these will be made available to the author)

The authors have described a high level process of creating a virtual view by using annotations in conjunction with a lexical thesaurus. However important details appear to be missing:

- In the case of annotations of attributes, what about the annotations of the values of those attributes... how will the common thesaurus help in this case? In your response you have mentioned that you focus only on the element names and not the values. It is immaterial whether you have covered it in another paper or not. The point here being, that annotations of the names and the values will give you better help in understanding the domains and ranges of the classes/attributes and increase the quality of the schemas generated. If you do not believe that much value is added, please give convincing arguments to the same.

We agree with your opinion: analyzing the values of attributes helps in understanding the domains and ranges of the classes/attributes. For example, the domains and ranges of attributes are properly specified in the schema of relational databases; these descriptions are translated into the internal ODLI3 language and taken into account in the integration process. We assume that the annotation phase is carried out by a source expert. For this reason, we are considering moving the annotation phase from a "global level" to a "local level", associating it with the wrapping phase.
In this way, when a new source has to be involved, a wrapper has to be put in place to manage the source and an expert has to annotate the source. Preliminary work on this topic will be published in the International Journal of Web Engineering and Technology (IJWET), ISSN: 1476-1289. Moreover, in that particular case, where we consider only HTML sources that are translated by “commercial” wrappers into XML and DTD files (which are directly managed by our wrapper), ranges and domains are not relevant, because DTDs do not carry this kind of information.

[Note: add the discussion about the domain expert and annotation at the wrapper level (IJWET)]

- Local schema constraints such as key and integrity constraints also can play an important role in the process. This has not been explored. You have explained it in the response. Please include it in the paper, it will add to the completeness of the discussion.

DONE (see section 1.5).

- The Common Thesaurus generation process needs to be described in more detail:

- How are the schema relationships analyzed and derived? Your response discussed the use of primary and foreign keys to deduce BT/NT relationships. Actually you can deduce subclass of relationships, much stronger than BT/NT. Please include that discussion in your paper.

We included in section 1.5 a discussion of schema-derived relationships in XML data files. Your consideration about primary and foreign keys is correct, but it concerns relational databases, which are not the focus of our paper. For this reason we prefer to include this discussion as a footnote. Regarding subclass-of relationships being much stronger than BT/NT: other papers related to our integration methodology take into account both intensional relationships (terminological relationships, with no implications on the extension of the classes) and extensional relationships (with implications on the extension of the classes). In particular, primary and foreign keys produce BT/NT relationships that are both intensional and extensional.
For more details, see [5] and: D. Beneventano, S. Bergamaschi, F. Mandreoli: "Extensional Knowledge for semantic query optimization in a mediator based system", International Workshop on Foundations of Models for Information Integration (FMII-2001), Viterbo, Italy, 16-18 September, 2001. For the goal of this paper the distinction is not relevant; for this reason we preferred to omit this specification and considered only intensional relationships, which the paper simply calls relationships.

- How are DLs used to infer new relationships? Do you interpret hypernymy/hyponymy using the subclass relationship? Does it not generate spurious relationships? Seems to me that there are two sources of generating the BT/NT relationships. One is the intra-schema key constraints in which case they can be represented using the DL subsumption operation. The other source is the lexical ontology (WordNet) itself. You have not displayed any examples of it, but there could be BT/NT relationships thrown up which are NOT subclass of relationships. Do you have approaches of avoiding this happening manually? What impact will it have on the quality of the GVV in the above case? You need to discuss these issues clearly and convincingly.

Some relationships are derived directly from the lexical ontology (WordNet). For example (see section 1.4), since the meaning assigned to Article is a hyponym of the meaning assigned to Publication, the tool derives the following lexicon relationship:

UNI.Article NT CS.Publication

This relationship is inserted in the Common Thesaurus and participates in the clustering phase even if it is not a subclass relationship; in other words, lexicon-derived relationships are established independently of the structures of the classes involved.
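As a minimal sketch of the lexicon-derived relationship just described, the following Python fragment derives NT triples purely from the meanings assigned to class names. It is only an illustration under stated assumptions: the small HYPERNYM table is a hypothetical stand-in for WordNet, and the names ANNOTATIONS, is_hyponym and lexicon_relationships are invented here; none of this is the tool's actual implementation.

```python
# Toy stand-in for WordNet (assumed data): meaning -> direct hypernym.
HYPERNYM = {
    "article": "publication",
    "publication": "work",
}

# Meanings assigned to class names during annotation (assumed data).
ANNOTATIONS = {
    "UNI.Article": "article",
    "CS.Publication": "publication",
}


def is_hyponym(meaning, ancestor):
    """Return True if `ancestor` is reachable from `meaning` via hypernym links."""
    seen = set()
    current = HYPERNYM.get(meaning)
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = HYPERNYM.get(current)
    return False


def lexicon_relationships(annotations):
    """Derive (narrower, "NT", broader) triples from annotated meanings only.

    Class structures (attributes, keys) are deliberately ignored, mirroring
    the point that lexicon-derived relationships do not depend on them.
    """
    triples = []
    for name_a, meaning_a in sorted(annotations.items()):
        for name_b, meaning_b in sorted(annotations.items()):
            if name_a != name_b and is_hyponym(meaning_a, meaning_b):
                triples.append((name_a, "NT", name_b))
    return triples


print(lexicon_relationships(ANNOTATIONS))
# prints [('UNI.Article', 'NT', 'CS.Publication')]
```

Because the function looks only at meanings, a triple can be produced between classes whose structures differ, which is why a later validation step is needed to discard spurious relationships.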
In the clustering phase, the tool places the classes UNI.Article and CS.Publication in the same cluster, and the generated Global Class (shown in table 4) takes their different structures into account by means of the mapping tables. As we showed in [5], ODLI3 relationships are translated into OLCD descriptions in order to perform the inference tasks typical of Description Logics. New relationships are inferred by using the OLCD subsumption algorithm. A further phase, relationship validation, exploits OLCD to validate relationships between attributes in the Common Thesaurus and to delete spurious relationships. The validation is based on the compatibility of the domains associated with the attributes, and distinguishes valid from invalid relationships.

- In the GVV generation process, what if there are multiple BT terms that are not comparable to each other (assuming a lattice structure of the thesaurus)? Your response indicates that you chose a union of the BTs. What you need to do is to consider the least upper bound (or most common ancestor). Explain why choosing the union works well in most cases.

Our previous answer, "If they are not comparable BT terms that are associated to a global class, we consider the union of these terms (see example of global class GC1 in section 2.1)", referred to the GCB definition in section 2.1. In defining GCB (in order to semi-automatically associate an annotation with the global class GC), we did not consider the least upper bound, since we want to consider the set of "broadest" local classes that belong to GC. Let us consider the following example:

Local Classes = {L1, L2, L3, L4}
Common Thesaurus = {(L2 NT L1), (L3 NT L2), (L4 NT L2)}
Global Classes: GC1 = {L1, L2}, GC2 = {L3, L4}

In this case, the least upper bound of GC2 = {L3, L4} is L2, but we cannot use the L2 annotation, because L2 belongs to a different Global Class (GC1).

************************************
End of Review
