Knowledge Modeling, use of information sources in the study of relationships

Knowledge Modeling, use of information sources in the study of domains and inter-domain relationships - A Learning Paradigm
by Sanjeev Thacker

Introduction
• Introduction
  – Background
  – ADEPT
  – Problems
  – Contributions

Background
• Web is an ever-increasing source of information
• Information of interest to the user is distributed across multiple heterogeneous sources
• Need for integration to provide one-point access for querying

ADEPT
• Besides querying, use the data sources to extract useful knowledge
• Provide an environment for studying domains
• Provide means to study and explore complex inter-domain relationships
• Ability to pose complex information requests across multiple domains

Problems
• Diverse and distributed sources
• Web sources are unlike databases
  – Unstructured or semi-structured
  – Inconsistencies and overlapping information
• Heterogeneities
  – Semantic
  – Structural
  – Syntactic

Problems
• Representation of complex relationships
• Use of Knowledge Model for complex information request capability with embedded semantic information

Contribution
• Knowledge Model
• Information Scape Model
• Learning Paradigm
• Visual Interfaces

Outline
• Knowledge Modeling
• Information Scapes
• Learning Paradigm
• Visual Interfaces
• Related Work
• Future Work
• Demo

Knowledge Modeling
• Approach to source modeling
  – Global model and source model
  – Source centric / query centric

Source Centric Advantages
  – Global model independent of source model
  – Modeling a source is independent of other sources
  – Dynamic addition, removal and modification of sources
  – Global view remains unaffected
  – No source mapping required during information integration
  – More suitable for sources other than database sources (web sources)

Knowledge Base
• Comprises
  – Ontologies (Domain model)
  – Resources
  – Relationships
  – Operations

Domain Hierarchy

Ontology
• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse
• Resolve resource model differences by mapping them to the global model of the ontology they represent
• Global interface

Ontology
• Description includes
  – Attributes
  – Domain Rules
  – Functional Dependencies

Resource
• Desirable characteristics:
  – Add, modify and delete resources for an ontology dynamically without affecting the system's knowledge
  – Specify the sources in a manner such that one can declaratively query them
  – Since the number of resources is large, there is a need to identify the exact usefulness of resources from the query viewpoint and prune the others

Resource
• Description includes
  – Attributes
  – Binding Patterns
  – Data Characteristics
  – Local Completeness

Relationships
• Simple relationships:
  – equals, less-than, like, is-a, is-part-of
  – Are hierarchical or similarity based
• Complex relationships
  – “Earthquakes cause Tsunami”, “Nuclear explosions cause earthquakes”, “Air pollution affects vegetation”

Relationships
• Characteristics
  – Involves multiple ontologies
  – Requires understanding the semantics involved in their interaction
  – Cannot be expressed by simple relational and logical operators alone
  – Involves use of complex operations like functions and simulations

Relationship
• Example
  – “Nuclear explosion causes Earthquakes”
• NuclearTest Causes Earthquake: dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000

Operations
• Functions, Simulations
• Functions
  – user defined
  – used to model the semantics involved in the relationships
  – used in post processing of result data
  – example: distance, dateDifference
• Simulations
  – independent programs
  – used for post processing of result data
  – example: Clarke urban growth model

Information Scape (IScape)
• Representation of an information request across multiple domains
• Can be deployed and executed
• Sources are not explicitly specified, unlike in a query
• System is aware of the sources and is able to identify the useful sources
• Semantic correlation across domains is embedded within the information request

Information Scape
• Definition
  – An IScape may be defined as an information request over distributed heterogeneous sources of information involving multiple ontologies and the relationships between them, containing meta-information constructed to facilitate the bridging of semantic relationships between individual sources.

Information Scape
• Ontologies
• Relationships
• Constraint
  – Conjunctive boolean expression
• Runtime configurable constraint
  – Conceptually different
• Grouping and group constraint
  – Similar to the HAVING clause in SQL
• Projection list

Learning Paradigm
• Study of domain
• Use IScapes to study the domain interaction by using relationships
• Relationships could lead to transitive findings
• Explore the hypothetical relationships to validate and establish them or invalidate them

Learning Paradigm
• Data mining
  – Age and breast cancer
• Relationships
  – Nuclear Explosion causes Earthquakes
• Post processing
  – Functions
  – Simulations
  – Charting tool

Learning Paradigm
• Find the earliest recorded Nuclear test conducted
• Plot a graph of the average number of Earthquakes of magnitude greater than 5.8 per year starting from 1900
• Find the average number of Earthquakes of magnitude greater than 5.8 between 1900-1949 and between 1950-present

Learning Paradigm
• Find the average number of Earthquakes of magnitude greater than 7 between 1900-1949 and between 1950-present
• Find pairs of Nuclear tests and Earthquakes that occurred within a certain radius and a certain time period of the explosion

Visual Interfaces
• Knowledge Builder
• IScape Builder
• Web Interface
• IScape Processing Monitor

Knowledge Builder
• GUI to build the knowledge base
  – fast and easy to use
  – Manually creating the knowledge could be arduous and error prone
• Knowledge is stored in the standard XML format
• Abstraction from the underlying format and other technical details

Knowledge Builder
• Assists in the creation, deletion and modification of the knowledge base
• Automatically creates a knowledge tree that assists in relating the knowledge in a better manner

Knowledge Builder Knowledge Hierarchy

IScape Builder
• GUI to create, deploy and execute IScapes in a step by step manner
• IScape stored in XML format
• User abstraction from the underlying structure
• Validity checks implemented
• Integrated tools
  – the charting tool to plot charts with the result data

IScape Builder

Web Interface
• Web accessible
  – Knowledge Base
  – Existing IScapes
• Set the runtime configurable constraint
• Execute existing IScapes
• View the tabulated results
• Cannot create new IScapes

Web Interface Result Screen

IScape Processing Monitor
• Color coded log entries describing the IScape processing are generated
  – Brief message along with agent name
  – Time stamp
  – Detailed description and associated data, if any
  – IScape plan for the existing sources
  – Intermediate results
• High level debugging tool
  – Understand execution, locate failures
• Not available with the web interface

Monitor GUI

Related Work
• State of the art
  – SIMS, TSIMMIS, Information Manifold, OBSERVER, InfoSleuth
• Mainly focused on one-point access for querying of integrated data of a domain
• What makes ADEPT unique
  – Relationships, IScapes and the learning paradigm distinguish our system from prior work

Future Work
• Support rules of type “if-then” and use of induction learning to speed up the processing
• Recursive query capability required
• IScape over IScape support required
• Simulations currently supported as specialized function in our framework
• Statistical analysis tools like SAS for time series analysis, logistic regression
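
The "Nuclear explosion causes Earthquakes" relationship from the Relationship slide combines two user-defined functions, dateDifference and distance, with numeric thresholds. A minimal Python sketch of how such a relationship might be evaluated over pairs of events follows. It is not ADEPT's implementation: the slides give no units, so days and kilometres are assumed here, dateDifference is assumed to return an absolute difference, distance is assumed to be a great-circle distance computed with the haversine formula, and the Event record and all Python names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; kilometres are an assumed unit


@dataclass(frozen=True)
class Event:
    """Hypothetical record holding the attributes the relationship refers to."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(d1: date, d2: date) -> int:
    """Absolute difference in days (assumed meaning of dateDifference)."""
    return abs((d2 - d1).days)


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km via the haversine formula (an assumption)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding pushing the value slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def causes(test: Event, quake: Event,
           max_days: int = 30, max_km: float = 10000) -> bool:
    """The relationship predicate: both conditions must hold (logical AND)."""
    return (date_difference(test.event_date, quake.event_date) < max_days
            and distance(test.latitude, test.longitude,
                         quake.latitude, quake.longitude) < max_km)


def related_pairs(tests, quakes, **thresholds):
    """All (nuclear test, earthquake) pairs that satisfy the relationship."""
    return [(t, q) for t in tests for q in quakes if causes(t, q, **thresholds)]
```

Keeping the thresholds as parameters mirrors the idea of a runtime configurable constraint: the same relationship can be re-evaluated with a different radius or time window without changing its definition.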
