Sharing our understanding: ontology is useful, but we need more…
Mark Gahegan, GeoVISTA Center, Penn State University, USA

Credits
GeoVISTA Center, Penn State (GEON, HERO, Dialog-Plus):
– Junyan Luo
– Bill Pike (Pacific Northwest National Laboratory)
– Boyan Brodaric (Geological Survey of Canada)
– Tawan Banchuen
– A-J Jaiswal
– Kean-Huat Soon
– Steve Weaver
The Geosciences Network (GEON): www.geongrid.org

Hypothesis: We are relying too much on ontology as our ‘carrier of meaning’…
• “You can know the name of a bird in all the languages of the world, but when you're finished, you'll know absolutely nothing whatever about the bird... So let's look at the bird and see what it's doing -- that's what counts.” -- Richard Feynman
• Ontology tells us what is known, but epistemology considers how it is known, how it came to be, why it came to be the way it is (and not some other way), how it is used, why it is used…

What kind of science record are we currently leaving behind?
• Can we understand our products into the future? The “knowledge as fish” problem:
– 2 days?
– 2 months?
– 20 years?
• Can other researchers understand them?
– Often, no!
– What are the problems they face?
• Changing taxonomies, non-commensurate semantics
• Changing methods, workflows
• Changing practices, individuals

Where does meaning come from?
• The concepts & relations we use to describe the world do not exist in the world—we create them.
• How we do this—create meaning out of data—is often unrecorded…
– for lack of a model of the scientific process that can capture knowledge as it is created and used.
• We need an approach to representing scientific concepts that reflects:
– the situated processes of science work,
– the social construction of knowledge, and
– the evolution of knowledge over time.
• In this model, knowledge is the result of investigation, negotiation, and collaboration by teams of researchers.

Where does meaning come from?
• We ‘know’ things in many ways:
– Theoretical, Experiential, Procedural
• Domain understanding / theory (ontology)
• The way science is done (epistemology)
– How are resources created and used (work practices / situations)?
• Social negotiation among the community of users (social network, group cognition)
• i.e. the interplay of top-down and bottom-up knowledge, played out in private and social situations

“Knowledge soup” – Sowa, 2002
The world is complex; our understanding is complex also
• “Human knowledge is a process of approximation. In the focus of experience, there is comparative clarity. But the discrimination of this clarity leads into the penumbral background. There are always questions left over. The problem is to discriminate exactly what we know vaguely.” -- Alfred North Whitehead, Essays in Science and Philosophy
• “Get rid, thoughtful Reader, of the Ockhamistic prejudice of political partisanship that in thought, in being, and in development the indefinite is due to a degeneration from a primal state of perfect definiteness. The truth is rather on the side of the Scholastic realists that the unsettled is the primal state, and that definiteness and determinateness, the two poles of settledness, are, in the large, approximations, developmentally, epistemologically, and metaphysically.” -- C.S. Peirce

P. J. Braspenning: Symposium on Intelligent Agents in Software Engineering for Planning, KaHo St.-Lieven, Gent, 23rd February 2000

Meaning and language (from P. J. Braspenning)
• Many people assume that conveying understanding is a merely linguistic problem that would not occur in a purified language like description logic (ontology).
– Yet that assumption is wrong. Most of our problems are caused by the complexity of the world itself, and our uneven experiences of it.
• The knowledge soup has a loose organization characterized by “disorder” and “leftover questions”
– The problem is to “discriminate exactly what we know vaguely”
– The task therefore is to make “little bits of order” that organize, interpret, and give meaning to the disorder.
• For both Whitehead and Peirce, language is a tool for discriminating [DISCOVERING] and creating [INVENTING] structure out of the primordial knowledge soup.
– i.e. Emergence and Imposition
• This structure is essential for precise reasoning, and any reasoning system—human or artificial—must either find structure in the soup or create structure that can provide, in Peirce’s terms, “a solid foundation for great and weighty thought.”

What’s in the soup? A nexus of knowledge structures (Whitehead, 1923)

Describing our resources: options
(Do nothing... it’s a hard problem, after all)
1. Build community ontologies—register to these
2. Allow users to describe resources themselves—build very good matching tools!
3. Infer from usage patterns—needs data mining technology
4. Infer from the workflows used to create the resource—needs workflow capture & representation

Why ontologies? (Noy and McGuinness)
• To share common understanding of the structure of information among people or software agents
• To enable reuse of domain knowledge
• To make domain assumptions explicit
• To automatically integrate disparate databases…

Rock taxonomy (ontologically based)
• Geological taxonomy converted to an ontology
• Gathered from experts during a specially convened workshop
• Formalizes relationships between concepts

Ontology: the fine print
1. Lack of working processes by which to capture ontological knowledge effectively
2. Lack of tools to adequately convey ontological meaning to others (and to ourselves)
3. Lack of useful matching measures to show how ontologically close one concept is to another, and hence problems of interpretation for the user
4. Lack of mechanisms to achieve community consensus & manage ontology releases
5. The world is in constant flux, as is our own understanding of the world, but our ontologies are static:
– Can we support versioning, revision, refinement?
– How do we do so WITHIN A KNOWLEDGE COMMUNITY?
6. The world is very complex:
– By taking what is essentially a reductionist approach, we remove potentially important variance
7. There are no natural or given categories in nature—we invent & impose them all:
– No rose, no red, no love, no like—no colours, no tastes, no forests, no oceans, no granite outcrops
– But we invent them differently from one another, we differ in how we understand or differentiate them, and we may even be inconsistent with ourselves over time
8. What, then, carries meaning? Is it these ontological labels? Or the processes by which we arrived at them (epistemology)? Or the ways they are used in practice (pragmatics)?
– Cf. map accuracy: given the features on a map, is my map fit for use?
– I would strongly argue that it is mostly the latter two: things are what they are because of the processes we have developed to isolate and label them—either with computers or with our own heads, and the protocols for matching / comparing them. The problem is that we are not very good at describing said processes.
9. How do we retrospectively tag all the millions of data resources we already have?

Remembering situations
Situation of creation     Situation of application     Represented by
Who did it?               Who should use it?           Collections of people
Where was it made?        Where does it apply?         Collections of sites / scales
When was it made?         When does it apply?          Collections of temporal intervals
How was it made?          How should it be used?       Collections of methods and data
Why was it made?          Why should it be used?       Collections of research questions, motivations, theories

What is in an e-Science Knowledge Soup?
People:
– PIs and Co-PIs
– Contributors
– Developers
– Users
– Knowledge engineers
– Reviewers
Organizations:
– Sponsors
– Participants
– Hosts
Resources:
– Datasets
– Methods
– Workflows
– Ontologies / vocabularies
– Articles / reports
– ‘Signifiers’
Tools:
– Concept browsing / mapping
– Concept matching
– Knowledge engineering

Codex: situating resources in Whitehead’s nexus

Perspectives as filters
• Perspectives filter an information space according to particular situations. Perspectives A and B preferentially select different types of resources and relations; the ability to view perspectives can show how someone else made sense of a given set of resources.
• Four perspectives on a “seismic velocity” concept (red node):
a) Intensional concept structure.
b) A task that describes how seismic velocity can be measured.
c) A social network built around users of the concept.
d) Data resources that have been used to describe seismic velocity.

Concept use and evolution
• Evolution of the “Depositional environment” concept through use by different researchers over time, progressing from upper left to lower right.

ConceptVista (CV4): navigating through GEON’s conceptual universe
Some GEON themes:
• GEON data formats and ESRI ShapeFile instances
• GEON: institutions, personnel, PIs, Co-PIs, grad students
• Combining perspectives: e.g. GEON institutions, publications and personnel
• Articles, authors, readers, keywords, themes
• Intersecting research interests: What did A create that B used? Users’ interests vs. declared themes?

Can we do more?
• Can we still have ontology, but with perspectives, to map to specific ways of thinking / specific tasks?
• Perspectives allow us to ‘discriminate clarity from the penumbral background’ (after Whitehead)
• Knowledge horizons: an idea from hermeneutics
– Useful for working out whether mappings between ontologies / perspectives are possible
– Creating flexible horizons
– Relations become properties (internalized); properties become relations (externalized)
– Perspectives can be applied locally or globally

Perspectives: folding in, folding out
[Figure: two image resources with properties (Date: ddmmyyyy; Scale: 1:xxxxxxxx; User: (A, B, C); Country: “………..”; Content: (m, n, o, p)). The relations to users A, B and C fold in to become a property of the image, while the country and content properties fold out to become relations to nodes m, n, o and p.]

Perspectives in CV4: Person as Contributor… Person as User

Hermeneutics
• Any knowledge fragment that we represent may be only the tip of the iceberg in terms of meaning.
• “How much is that Doggie in the window?”
– This phrase can be perfectly represented in an ontology or set of relations.

Some possible meanings?
[Figure: the phrase decomposed into ‘doggie’, ‘in window’, and ‘How much cost?’]
• I want to buy the doggie in the window (face value)
• I want to buy a dog (generalization)
• I am researching the cost of pet ownership (context missing—but still face value)
• I no longer want to buy a hamster (emphasis)
• I want you to think I want a dog (deception)
• I want to let you know the dog is not in its cage (cryptic)
• I want to convey that I understand that I must pay for the dog (meta-level communication)

Horizons
• The connection between intended meaning and the phrases used may not be straightforward
– Two communicators share a background context that is unknown to an eavesdropper (the first and second share a horizon that is not shared by a third)
– Therefore, much knowledge in a computer is semantics without a role, without a horizon…
• Intended and possible meanings may need to be carefully represented along with knowledge fragments
– Knowledge roles drive at this problem to some extent
– But knowledge roles may be unknown (unless you define them)

Knowledge Perspectives & Horizons in a systems context
[Figure: pairs of knowledge perspectives and knowledge horizons: concentric knowledge; overlapping knowledge with overlapping or common horizons; overlapping knowledge where the other is partly beyond, or partly within, one’s horizon; disjoint knowledge with overlapping horizons; disjoint knowledge where the other is beyond one’s horizon; disjoint knowledge with disjoint horizons. Surveyor. © 2002-6 by Infomaniacs/Neological. All Rights Reserved. VonSchweber]

CV4 can now browse large concept universes (e.g. OpenCyc)

WordNet RMI Service
• Remote service (RMI)
• Implements a caching mechanism
• Hides the underlying architecture from clients
• Supported operations:
– Synonyms, antonyms
– Hypernyms, hyponyms
– Meronyms, holonyms
– Definitions
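As a rough illustration of the operations and caching behaviour listed above, here is a minimal Python sketch of such a lookup service. This is not the CV4 implementation (which is a Java RMI service backed by the WordNet database and a Lucene index); the class, the tiny lexicon, and its entries are invented for the example.

```python
# Sketch only: an in-memory toy lexicon stands in for the WordNet database.
# The geological entries below are illustrative, not real WordNet content.
TOY_LEXICON = {
    "granite": {
        "synonyms": [],
        "hypernyms": ["igneous rock"],       # more general terms
        "hyponyms": [],                      # more specific terms
        "meronyms": ["quartz", "feldspar"],  # parts of the whole
        "holonyms": ["pluton"],              # wholes this is part of
        "antonyms": [],
        "definition": "a coarse-grained intrusive igneous rock",
    },
    "igneous rock": {
        "synonyms": [],
        "hypernyms": ["rock"],
        "hyponyms": ["granite", "basalt"],
        "meronyms": [],
        "holonyms": [],
        "antonyms": [],
        "definition": "rock formed by the cooling of magma or lava",
    },
}

class WordNetService:
    """Exposes only the supported lookup operations, hiding the backing
    store, and caches previous lookups as the slide describes."""

    def __init__(self, lexicon):
        self._lexicon = lexicon
        self._cache = {}  # word -> word model

    def _lookup(self, word):
        # Cache miss: build the word model from the backing store once.
        if word not in self._cache:
            self._cache[word] = self._lexicon.get(word.lower(), {})
        return self._cache[word]

    def synonyms(self, word):   return self._lookup(word).get("synonyms", [])
    def antonyms(self, word):   return self._lookup(word).get("antonyms", [])
    def hypernyms(self, word):  return self._lookup(word).get("hypernyms", [])
    def hyponyms(self, word):   return self._lookup(word).get("hyponyms", [])
    def meronyms(self, word):   return self._lookup(word).get("meronyms", [])
    def holonyms(self, word):   return self._lookup(word).get("holonyms", [])
    def definition(self, word): return self._lookup(word).get("definition", "")

service = WordNetService(TOY_LEXICON)
print(service.hypernyms("granite"))  # ['igneous rock']
print(service.meronyms("granite"))   # ['quartz', 'feldspar']
```

A remote client would see only these operations, and repeated lookups of the same word never touch the backing database twice — the same hiding-and-caching role the slide assigns to the Java service.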
Architecture
[Figure: client/server architecture over a network. The client application sends a word to the WordNet server; the server checks the word against a Lucene index, and if the word model is not present, the word model creator builds it via the WordNet API, the indexer and the index searcher over the WordNet database. The resulting word model is returned to the client.]

Web Browser Integration
• JDesktop Integration Components (JDIC)
• Enables embedding a native web browser within Java applications
• Supported browsers:
– Internet Explorer
– Mozilla
• Support for highlighting words in HTML pages

Web Page Highlight Mechanism
[Figure: clicking the Highlight Translation button rebuilds relative URLs in the web page, fetches word translations from the WordNet RMI service, highlights the cached page and writes it to disk, then reloads the embedded browser (native Internet Explorer). Clicking a highlighted word sends data via an HTTP server, which is decoded and fired as an event for the application.]

Concept Mapping: Formal Ontology ↔ Informal Ontology
• A formal ontology represents mutual agreement; an informal ontology represents individual beliefs
• Matching approaches: string matching, Nym service, by types & values
• Matching result: a merged ontology

Concept Mapping Wizard
• Step 1: informal ontology selection
• Step 2 (option 1): formal ontology selection, extracted from a formatted text: ontology name, URL, and description
• Step 2 (option 2): mapping approaches

Matching imprecise / vague / informal concepts…
• Using WordNet / formal ontologies to match ontologies (i.e.
informal and formal), in addition to the exact & partial string matching
• Based on linguistic/semantic relationships, such as antonyms, synonyms, hyponyms, meronyms, holonyms, …
[Figure: concepts in four ontologies (Ontology 1–4), including “Dew Point Temperature”, “Temperature” and “Physical Property”, linked by candidate matches.]

Matching wizard
• Matching based on properties

Summary
• Rich, living knowledge:
– “Knowledge keeps no better than fish” -- Alfred North Whitehead
– “You cannot put your foot in the same stream twice” -- Heraclitus
– “…So let's look at the bird and see what it's doing -- that's what counts.” -- Richard Feynman
• Perspectives allow scientists to ‘describe what they know’ onto shared ontological resources.
• The irony of ontology is that ontologically-based languages can be used to represent its obverse—epistemology.

End
Questions?
www.geovista.psu.edu/conceptvista
William Pike’s PhD dissertation: online dissertation library at Penn State

e-Learning objects
[Figure: a learning activity relating a learning approach, a subject (GPS), interactions, tasks, and outcomes.]

Current work: integrating data analysis and concepts in a single system

Workshop Questions
• What are effective carriers of meaning (between researchers) in the geosciences?
– A field trip? A photograph? Text? An article? The same for us all?
– Can these carriers be represented / emulated in systems?
– How are they best represented / signified?
– Are they top down, or bottom up, or both?