Intelligent Information Systems
8. Educational Challenges
Gio Wiederhold
EPFL, April-June 2000, at 14:15 - 15:15, room INJ 289
7/26/2016 EPFL - Gio spring 2000 1

Schedule
Presentations in English -- but I'll try to manage discussions in French and/or German.
1. 13/4 Historical background, enabling technology: ARPA, Internet, DB, OO, AI, IR.
2. 27/4 Search engines and methods (recall, precision, overload, semantic problems).
3. 4/5 Digital libraries, information resources. Value of services, copyright.
4. 11/5 E-commerce. Client-servers. Portals. Payment mechanisms, dynamic pricing.
5. 19/5 Mediated systems. Functions, interfaces, and standards. Intelligence in processing. Role of humans and automation, maintenance.
6. 26/5 Software composition. Distribution of functions. Parallelism. [ww D.Beringer]
7. 31/5 Application to Bioinformatics.
8. 15/6 Educational challenges. Expected changes in teaching and learning.
9. 22/6 Privacy protection and security. Security mediation.
10. 29/6 Summary and projection for the future.
• Feedback and comments are appreciated.

Open question?
• Web enables remote education

Stanford Model
• Based on TV courses offered to industry
• Part of normal curriculum
– TV operator in special classroom shows notes (must be legible), blackboard, teacher
– tutor at remote site (has taken class earlier)
– voice link for questions (if live TV)
• Can be replayed on web in students' rooms, …
– morning classes are getting to be empty

Threat to smaller schools
Alternatives
• Overloaded professor with older material
• Inaccessible professor with up-to-date material
– technology from the entertainment industry
• Education when and where wanted

HPKB
Master file on Birch
SKC: Scalable Knowledge Composition
September 1997
Gio Wiederhold, Stanford University

An abstract concept is like a valise with a false bottom.
you may put in what you please, and take them out again, without being observed.
Alexis de Tocqueville, Democracy in America, 1838.

What are Ontologies?
Ontologies list the terms and their relationships that allow communication among partners in enterprises (in machine-readable form)
Relationships determine meaning - parent, school, company
Databases use ontologies during design in their E-R diagrams (implicitly) and represent the leaf nodes in their schemas
Knowledge-bases use ontologies (often implicitly) and add class definitions (to hold instances), constraints, and operations among the terms

Functions of Ontologies
• Define Terms used in System Construction to enable Correctness in Understanding
  system = designers, implementors, users, maintainers
  designers = implementors = users = maintainers
• Define Higher-level Abstractions needed to communicate in larger contexts
  managers, decision-makers, systems in own, other domains
• Share the Cost of Knowledge Acquisition & Maintenance
  reuse encoded knowledge, remain up-to-date as domains change

Ancestors of Ontologies
Lexicons: collect terms used in information systems
Taxonomies: categorize, abstract, classify terms
Schemas of databases: attributes, ranges filed
Data dictionaries: integration of files, attributes
Object libraries: grouped attributes, methods
Symbol tables: collect terms used in a program
Domain object models: . . .
More Knowledge
re-engineering terms

Establishing Ontologies
Top-down:
– Commonly acceptable UPPER layers
Domain-specific
– Sharing tools
– Object based
Bottom-up
– Pragmatic, TASK-specific collections
– Database schemas and models
IFIP note

Ich weiss nicht was soll es bedeuten, ... ("I do not know what it is supposed to mean, ...")
-- an early complaint about semantics [Heinrich Heine: Die Lorelei]

Ontologies in Use
Implicit Ontologies are a prerequisite for communication among humans and organizations.
Knowledge is explicitly represented in AI-systems; sometimes the ontology is explicit as well.
Database schemas are partial explicit ontologies
• Relational schemas contain only terms & 1:1 dependencies.
• E-R designs contain 1:n, m:n cardinalities
• Structural schemas contain semantic dependency types
Conceptual graphs define terms of discourse and a modest number of relationship types
Variables in software represent ontologies poorly.

Ontologies at work
per Hans Akkermans (VU Amsterdam, consulting)
• Knowledge elicitation for experts
– tacit knowledge in organizations
• PDES/STEP annotation
• adding knowledge to processes [Unilever]
• Software requirements engineering
– what does the client really want
– definition of domain content for CS folk
– reuse across very disparate domains [viz Musen]
– relates to OO work and recognition of patterns
– distributed service integration (AMR, DA,

Large Ontologies?
Have all the Knowledge together
+ simple for customers of KBs
– hard for owners of KBs
Large KB will cover multiple domains
created by a committee -- slow
maintained by a committee -- costly
Differences in level of abstraction -- efficiency
homeowner: nail
carpenter: sinker, brad, boxnail, . . .

SKC Objective
Provide for Maintainable Ontologies
• devolve maintenance onto many domain-specific experts / authorities
• provide an algebra to compute composed ontologies that are limited to their articulation terms
• enable interpretation within the source contexts

SKC Working Definition
• Ontology: a set of terms and their relationships
• Term: a reference to real-world and abstract objects
• Relationship: a named and typed set of links between objects
• Reference: a label that names objects
• Real-world object: an entity instance with a physical manifestation
• Abstract object: a concept which refers to other objects

Domains and Consistency
• a domain will contain many objects
• the object configuration is consistent
• within a domain all terms are consistent &
• relationships among objects are consistent
Domain Ontology
• context is implicit
No committee is needed to forge compromises *
* within a domain
Compromises hide valuable details

We consider to be ontologies:
• Object oriented class hierarchies (snapshots of executing programs capture object instances)
• Database schemas (via their E-R or structural models)
• Semi-structured databases (OEM <OID, label, type, value>)
• Definitional thesauri (UMLS: see http://www.lexical.com)
• Knowledge bases (CYC, Ontolingua)
SKC specifically does not restrict its applicability to a purely extensional (object) or intensional (schema) definition of ontology, since its purpose is to support useful processing of extensions using intensional knowledge for all parties. To that end it is important that the intensional specifications include predicates or methods that permit the collection of extensional access to real-world objects. We do not require ontologies to be complete specifications of a domain, but rather that usage of an ontology provide results complete with respect to the ontology.

Aspects that Focus SKC
• The mapping of terms to objects differs between autonomous domains.
• The collections of real-world objects provide a grounding for the definitions, and an opportunity for validation of the meaning of the terms being employed.
• Relationships have semantic, and derived from that, structural significance. Multiple relationship types may share structural characteristics, such as IS-A, Ownership, Part-of, Reference.
• We will keep the number of primitive relationships limited.
• The mapping of relationship types differs between autonomous domains.
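
The working definition (terms, plus named and typed relationships, consistent only within one domain) can be made concrete with a small data structure. The following is a minimal sketch, not SKC's actual representation: the class names `Relationship` and `Ontology` and the method names are hypothetical, and the two toy domains borrow the "nail" term that appears elsewhere in these slides.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass(frozen=True)
class Relationship:
    """A named, typed link between two terms of one ontology (hypothetical class)."""
    name: str    # label used inside the domain, e.g. "part-of"
    kind: str    # primitive type, e.g. "IS-A", "Part-of", "Ownership", "Reference"
    source: str
    target: str


@dataclass
class Ontology:
    """A domain ontology: a set of terms and their relationships (hypothetical class)."""
    domain: str
    terms: Set[str] = field(default_factory=set)
    relationships: Set[Relationship] = field(default_factory=set)

    def relate(self, name: str, kind: str, source: str, target: str) -> None:
        # Register both end points as terms, then record the typed link.
        self.terms.update({source, target})
        self.relationships.add(Relationship(name, kind, source, target))

    def links_from(self, term: str) -> Set[Tuple[str, str]]:
        # All (relationship type, target term) pairs that start at `term`.
        return {(r.kind, r.target) for r in self.relationships if r.source == term}


# Two autonomous domains use the same label for different objects.
anatomy = Ontology("anatomy")
anatomy.relate("part-of", "Part-of", "nail", "toe")

hardware = Ontology("hardware")
hardware.relate("is-a", "IS-A", "nail", "fastener")

print(anatomy.links_from("nail"))   # {('Part-of', 'toe')}
print(hardware.links_from("nail"))  # {('IS-A', 'fastener')}
```

The label "nail" is identical in both domains, but its relationships differ, which is why a label match alone cannot decide whether two terms from autonomous domains mean the same thing.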
Heterogeneity among Domains
If interoperation involves distinct domains, mismatch ensues
• Autonomy conflicts with consistency,
– Local Needs have Priority,
– Outside uses are a Byproduct
Heterogeneity must be addressed
• Platform and Operating Systems ✔ ✔
• Representation and Access Conventions ✔
• Naming and Ontology ✚

An Ontology Algebra
A knowledge-based algebra for ontologies
Intersection: create a subset ontology, keep sharable entries
Union: create a joint ontology, merge entries
Difference: create a distinct ontology, remove shared entries
The Articulation Ontology (AO) consists of matching rules that link domain ontologies

Sample Operation: INTERSECTION
Source Domain 1: Store Ontology, owned and maintained by Store
Source Domain 2: Factory Ontology, owned and maintained by Factory
Result contains shared terms: terms useful for purchasing
INTERSECTION support: Articulation ontology, matching rules that use terms from the 2 source domains

Sample Intersections
Articulation ontology matching rules:
size = size
color = table(colcode)
style = style
foot = foot
Shoe Store
• Shoes { . . . }
• Customers { . . . }
• Employees { . . . }
Shoe Factory
• Material inventory { . . . }
• Employees { . . . }
• Machinery { . . . }
• Processes { . . . }
• Shoes { . . . }
Anatomy { . . . }: Employees, Nail (toe, foot), ...
Department Store, Hardware: Employees, Nail (fastener), ...
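
The INTERSECTION operation and its matching rules can be illustrated with the shoe store and shoe factory. This is only a sketch under stated assumptions: the `Rule` class, the functions `intersect` and `translate`, the contents of the color table, and the sample record are all hypothetical; only the three rules (size = size, color = table(colcode), style = style) come from the slide.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Set


@dataclass(frozen=True)
class Rule:
    """One matching rule of the articulation ontology (hypothetical class)."""
    store_term: str
    factory_term: str
    convert: Optional[Callable[[Any], Any]] = None  # factory value -> store value


# Hypothetical lookup table standing in for table(colcode).
COLOR_TABLE = {1: "black", 2: "brown"}

RULES = [
    Rule("size", "size"),
    Rule("color", "colcode", COLOR_TABLE.get),
    Rule("style", "style"),
]


def intersect(store_terms: Set[str], factory_terms: Set[str],
              rules: List[Rule]) -> List[Rule]:
    """Keep only the rules whose terms exist in both source ontologies."""
    return [r for r in rules
            if r.store_term in store_terms and r.factory_term in factory_terms]


def translate(factory_record: Dict[str, Any],
              articulation: List[Rule]) -> Dict[str, Any]:
    """Express a factory record in store terms, using articulated terms only."""
    result = {}
    for rule in articulation:
        value = factory_record[rule.factory_term]
        result[rule.store_term] = rule.convert(value) if rule.convert else value
    return result


store_terms = {"size", "color", "style", "customers", "employees"}
factory_terms = {"size", "colcode", "style", "machinery", "employees"}

articulation = intersect(store_terms, factory_terms, RULES)
shared = [r.store_term for r in articulation]
print(shared)  # ['size', 'color', 'style']

record = {"size": 42, "colcode": 2, "style": "loafer", "machinery": "M7"}
print(translate(record, articulation))
# {'size': 42, 'color': 'brown', 'style': 'loafer'}
```

In this sketch "employees" occurs in both sources but is not articulated, because no rule links the two uses. That is one reading of the slide: the intersection is driven by explicit rules, not by matching labels.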
Other Basic Operations
DIFFERENCE: material fully under local control
UNION: merging entire ontologies
Articulation ontology: typically prior intersections

Features of an algebra
Operations can be composed
Operations can be rearranged
Alternate arrangements can be evaluated
Optimization is enabled
The record of past operations can be kept and reused

Knowledge Composition
[Figure: knowledge resources A, B, C, D, E linked pairwise by articulation knowledge]
Legend: ∪ : union, ∩ : intersection
Composed knowledge for applications using A, B, C, E: (A ∩ B) ∪ (B ∩ C) ∪ (C ∩ E)
Articulation knowledge: (A ∩ B), (B ∩ C), (C ∩ E), (C ∩ D)

Primitive Operations
Model and Instance
Unary
• Summarize -- structure up
• Glossarize -- list terms
• Filter -- reduce instances
• Extract -- circumscription
Binary
• Match -- data corroboration
• Difference -- distance measure
• Intersect -- schema discovery
• Blend -- schema extension
Constructors
• create object
• create set
Connectors
• match object
• match set
Editors
• insert value
• edit value
• move value
• delete value
Converters
• object - value
• object indirection
• reference indirection

Exploiting the result
Result has links to source
Avoid the n^2 problem of interpreter mapping, stated by Swartout as an issue in HPKB year 1
Processing & query evaluation is best performed within Source Domains & by their engines

SKC Synopsis
• Research: Reliable query answers from heterogeneous, imperfect data sources
• Sources:
– General: CIA World Factbook '96, UN WWW
– Topical: OPEC, BattleSpace Sensors
• Client: DARPA High Performance Knowledge Base (HPKB) project
• Theory: Rule-based algebra
– Translation & Composition primitives

Innovation in SKC
• No need to harmonize full ontologies
• Focus on what is critical for interoperation
• Rules specific for articulation
• Potentially many sets of articulation rules
• Maintenance is distributed
– to n sources
– to m articulation agents
  is m < n^2? Depends on architecture density: a research question

Domain Specialization
• Knowledge Acquisition (20% effort) &
• Knowledge Maintenance (80% effort *)
to be performed by
• Domain specialists
• Professional organizations
• Field teams of modest size
autonomously maintainable
Empowerment
* based on experience with software

Rules for Real-Time Data
if [base_station.receiving] = true
then satellite_data = [base_station]
     satellite_data.timestamp = now
if [satellite_data.age] < 24 hours or [radio_jamming.level] > 30%
then recon_data = [satellite_data]
except when [flight_data.age] < 1 hour or [rain_sensor.daytotal] > 1 inch
then recon_data = [flight_data]
assert [recon_data]

Sample Processing in HPKB
• What is the most recent year an OPEC member nation was on the UN security council?
– Related to DARPA HPKB Challenge Problem
– SKC resolves 3 Sources
  • CIA Factbook '96 (nation)
  • OPEC (members, dates)
  • UN (SC members, years)
– SKC obtains the Correct Answer
  • 1996 (Indonesia)
– Problems resolved by SKC
  * Factbook has out of date OPEC & UN SC lists
    – Indonesia not listed
    – Gabon (left OPEC 1994)
  * different country names
    – Gambia => The Gambia
  * historical country names
    – Yugoslavia
  • UN lists future security council members
    – Gabon 1999
  • intent of original question
    – Temporal variants

Status September 1997
• Base HPKB funding from AFOSR
– New World Vistas
– some industrial co-funding
• Prior work supported through CommerceNet
– support for common representation, an interlingua
• Acquiring ontologies that
– are interesting to HPKB projects
– are not trivial, i.e., represent realistic activities
– are intersectable
– Logistics: DoD CIM, CIA, Cyc, . . .
• Starting smart students
• Integrating into architecture managed by TFS

Information Flow for Training Initiative
[Figure, draft 1. Labels: sample scenarios; scenario refinement; trainer / controller; aggregation / analysis / evaluation; ISI scenario language; Scenarios; Objectives; tasks; explosion; aggregation; doctrine; TRADOC; mediator; knowledge base; Requirements; exercise design; sources; scenario justification; Data collection; Probepoint settings; Legend]

Interlingua(s)
Interlingua: Object Exchange Model (OEM)
{ OID, LABEL, TYPE, VALUE }
Query: Mediator Specification Language (MSL)
<document {<author AUTHOR> <title TITLE>}> :-
    <biblioentry {<author AUTHOR>}>@biblio
    AND <inproceedings {<title TITLE>}>@sybase
    AND Equal(AUTHOR, "Jeff Ullman")
Interlingua: Knowledge Interchange Format (KIF)
Query: Knowledge Query and Manipulation Language (KQML)
(PACKAGE :FROM ap001 :TO ap002
  :CONTENT (MSG :TYPE query :CONTENT-LANGUAGE KIF
    :CONTENT (and (document (author@biblio ?a) (title@sybase ?t))
                  (eq "Jeff Ullman" ?a))))

Support for KB-Algebra
• Ontolingua [Gruber, Fikes @ Stanford KSL]: Repository for Domain Terminologies
  Used for
mechanical design, bibliographies, catalogs
• LOOM [MacGregor @ USC ISI]: Classification-based Expert System
  Helps in structuring and processing ontologies
• PROTÉGÉ [Musen @ Stanford MIS]: Reuse
• Penguin [Barsalou, Keller @ Stanford MIS, CIFE]: Object manipulation based on Relational Algebra
  Used for genetics laboratory, building design

Current Directions
• Experience with real world (imperfect) data confirms validity of our approach
– Expert sources are better maintained than general sources
– Rules applied to multiple sources provide more reliable and accurate query results
– Component architecture enables scalable, maintainable knowledge base development
• Developing proof of concept environment with HPKB standard knowledge base connectivity interface

Summary
• Algebra enables Interoperation by
  dealing explicitly with differences by knowledge
  identifying maintenance domains
  keeping sources autonomous
• Assumes domain has a common ontology
  composing domain ontologies requires the algebra to manage the linkages where articulation occurs
  processes are best executed within the domains
• Knowledge about articulation is disjoint
  allows integration specialists to work independently
  supports multiple intersections and views
• Maintenance is structured and partitioned
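
The claim that rules applied to multiple sources give more reliable answers can be illustrated with the OPEC / UN Security Council question from the HPKB slide. The sketch below is a toy reconstruction, not the SKC implementation: the dictionaries contain only the facts mentioned on the slides (Indonesia 1996, Gabon left OPEC in 1994, the UN list showing Gabon for 1999, the alias Gambia => The Gambia), the function names are hypothetical, and it assumes a nation counts as a member in every year before the year it left.

```python
from typing import Dict, List, Optional, Tuple

AS_OF_YEAR = 1997  # the status slide is dated September 1997

# Toy extracts from the expert sources; only entries named on the slides.
OPEC_LEFT: Dict[str, Optional[int]] = {   # nation -> year it left OPEC (None = still a member)
    "Indonesia": None,
    "Gabon": 1994,
}
UN_SC_YEARS: Dict[str, List[int]] = {     # nation -> years listed on the Security Council
    "Indonesia": [1996],
    "Gabon": [1999],                      # a future member, as listed by the UN source
}
ALIASES = {"Gambia": "The Gambia"}        # articulation rule for differing country names


def canonical(name: str) -> str:
    """Map a source-specific country name to the shared name."""
    return ALIASES.get(name, name)


def was_opec_member(nation: str, year: int) -> bool:
    """True if the topical OPEC source says the nation was a member in that year."""
    if nation not in OPEC_LEFT:
        return False
    left = OPEC_LEFT[nation]
    return left is None or year < left


def most_recent_year(as_of: int) -> Optional[Tuple[int, str]]:
    """Latest (year, nation) where an OPEC member sat on the Security Council."""
    candidates = [
        (year, canonical(nation))
        for nation, years in UN_SC_YEARS.items()
        for year in years
        # Rule 1: ignore future listings. Rule 2: check membership in that year.
        if year <= as_of and was_opec_member(canonical(nation), year)
    ]
    return max(candidates) if candidates else None


answer = most_recent_year(AS_OF_YEAR)
print(answer)  # (1996, 'Indonesia')
```

Each rule touches only the terms needed to link the sources (country name, year, membership), while the sources themselves stay autonomous and are maintained by their owners.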