Technology for Superimposed Information

Lois Delcambre
with Shawn Bowers, David Maier, Mat Weaver
Database and Object Technology Lab
Computer Science and Engineering Department
Oregon Graduate Institute

Superimposed Information - Stanford DB talk

Outline
• introduction to superimposed information
• a superimposed application: SLIMPad (DLI2 Project)
• model-based representation and transformation of information
• harvesting information to sustain our forests (NSF Digital Government project)

What is Superimposed Information?
Data “placed over” existing information sources to highlight, annotate, elaborate, select, collect, organize, connect, or reuse information elements, often to support new applications beyond the original.

Examples of Superimposed Information
Non-electronic examples:
– Commentaries on religious texts, law, literature
– Concordances, citation indexes
Electronic examples:
– Your bookmark file in your web browser
– RDF metadata

Why work on it now?
• Broadening range of digital information
– Easier to overlay than “hard copy” forms
– More and more sources of base information
• Accessibility/addressability to base information
– Reference (e.g., URL) can be resolved quickly
– Addressing at various levels of granularity
• Emerging Standards: RDF, Topic Maps, XLink

The superimposed and base layers with marks
[Figure: a Superimposed Layer connected by marks to a Base Layer made up of Information Source 1, Information Source 2, …, Information Source n.]

Outline
• introduction to superimposed information
• a superimposed application: SLIMPad (DLI2 Project)
• model-based representation and transformation of information
• harvesting information to sustain our forests (NSF Digital Government project)

Paul Gorman, MD
Lois Delcambre, PhD
David Maier, PhD

Bundles in the wild…
Observational team: Paul Gorman, Joan Ash, Mary Lavelle, Jason Lyman
…Bundles in captivity
Computer science team: Lois Delcambre, Dave Maier, Shawn Bowers, Longxing Deng, Mathew Weaver

Let’s take a trip to the ICU

(Wild) Bundles
• manage information for diverse, complex tasks
• contain selected, collected, structured, annotated information
• are often used in settings with:
– high uncertainty
– low predictability
– potentially grave outcomes
– highly constrained time & attention

(Wild) Bundles
• There is benefit in creating (active processing of information)
• There is benefit in reusing (trigger memory)
• There is benefit in sharing (establish collective, situated awareness)

Given…
• bundles are everywhere!
• access to bundles provides access to important information
• information in bundles is often copied from other information sources
• we can keep copied/referenced information linked through the use of marks

(Captive) Bundles
• SLIMPad - a scratchpad application to create bundles, but with referenced information connected to the underlying source data
• helping us explore architectural issues for building superimposed applications
• motivating definition of a metamodel to represent information, with mappings to transform it
• inspired by the observational work (but not focused on a specific medical task)

SLIMPad demo

Superimposed Layer Information Manager (SLIM) Architecture: Contributions
• Mark Management - to create/resolve marks
• SLIM API - for the application developer
• TRIM store - for generic storage of superimposed information

Superimposed Information Management
[Figure: the general architecture for managing superimposed information. Labels: Superimposed Application (creates and manages Application Data); Application-Specific API; Generic Management; TRIM Store; Mark Management.]

Superimposed Information Management: SLIMPad Mark Management
[Figure: the user works with SLIMPad and with base applications (PDF Viewer, IE Explorer, MS Excel, MS PowerPoint, XML Viewer). The Mark Manager, with its Mark DB, has one module per kind of source: PDF Module (PDF files), HTML Module (Web pages), Excel Module (Excel spreadsheets), PowerPoint Module (PPT files), XML Module (XML documents).]

SLIM API: as seen by application
[UML class diagram: Structured Bundle Model for SLIMPad. The association lines did not survive extraction; only their multiplicity labels (1, *, 0..1) remain.]
• SLIMPad: padName : String
• AbstractBundle
• Bundle: bundleName : String, bundleXPos : Number, bundleYPos : Number, bundleHeight : Number, bundleWidth : Number
• Scrap: scrapName : String, scrapXPos : Number, scrapYPos : Number
• Mark: markId : String

What’s Next for this Project?
• Validation - cardiologists, ICU nurses, …
• Extend the informational model of SLIMPad
• Extend SLIMPad to suit a selected medical task
• Extension of observational work to other domains

www.cse.ogi.edu/footprints
• demos - including the QTVR of the ICU (with toys) and SLIMPad
• personnel
• project description
• papers
– “Bundles in the Wild: Tools for Managing Information to Maintain Situation Awareness”
– “Bundles in Captivity: An Application of Superimposed Information”
– papers discussing superimposed information

Outline
• introduction to superimposed information
• a superimposed application: SLIMPad (DLI2 Project)
• model-based representation and transformation of information
• harvesting information to sustain our forests (NSF Digital Government project)

Model-Based Superimposed Information
[Figure: the Superimposed Layer holds a Model, Schema Data, and Instance Data with Marks; marks connect it to Information Source 1 and Information Source 2 in the Base Layer.]
But the model and schema are optional

Our Goals
• Represent information generically, for various models
• Convert information from one representation scheme to another

Transforming Information
[Figure: three kinds of information, each with its own generic representation: a topic map shown in a TM Browser (Topic Map model; labels include Painting, Painter, by painter, influenced by, mentioned, critiqued, biography), XML shown in an XML Viewer (XML model), and a SQL DB (Relational model). Arrows labeled “convert” link the generic representations.]

Our Approach
• Metamodel – to represent multiple data models
• Generic, Uniform Representation Scheme – to store model, schema, and instances for model-based information
• Mapping Formalism – to transform between representation schemes

The Metamodel
• Provides a level of abstraction above models
• Describes the structural features of models
[Figure: levels of description, from top to bottom]
Metamodel (basic set of abstractions)
Model (constructs and relationships): Topic Map; XML
Schema-level data: Topic Map Definitions; DTD
Instance-level data: Topic Map Instances; XML Document

XML Model, Schema, and Instance
XML Model (model constructs and relationships defined using the metamodel)
• Elements, Element Types, Attributes, Attribute Types
• Elements contain Attributes
• Elements can be nested
XML DTD (Schema)
<!ELEMENT schedule (flight*)>
<!ELEMENT flight (from, to, price)>
<!ATTLIST flight name CDATA #REQUIRED>
XML Document (Instances)
<schedule>
  <flight name="Air Canada Flight 1575">
    <from> PDX </from>
    <to> YVR </to>
    <price> $213.84 </price>
  </flight>
  ...
</schedule>

Topic Map Example
[Figure: a topic map over the topic types Painting and Painter. Labels shown include “Captive”, “1914”, “Paul Klee”, and “Francisco de Goya”; the relationship types by painter and influenced by; and anchors (mentioned, critiqued, biography) that point to http://... addresses.]
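To make the Topic Map Example concrete, here is a minimal Python sketch of the same information as plain data. It is an illustration only, not the project's representation. The ids (`painting1`, `painter1`, `painter2`), the tuple layouts, and the helper names are hypothetical. Which topic each anchor attaches to, and the direction of "influenced by", are assumptions read off the flattened figure. The slide elides the URLs, so they stay elided here.

```python
# Topics keyed by a hypothetical local id -> (topic type, title).
# Types and titles come from the slide; the ids are made up for illustration.
topics = {
    "painting1": ("Painting", "Captive"),
    "painter1": ("Painter", "Paul Klee"),
    "painter2": ("Painter", "Francisco de Goya"),
}

# (relationship type, source topic id, target topic id).
# Assumption: "influenced by" runs from Paul Klee to Francisco de Goya.
relationships = [
    ("by painter", "painting1", "painter1"),
    ("influenced by", "painter1", "painter2"),
]

# (anchor role, topic id, base-layer address).
# Assumption: which topic each anchor belongs to. URLs are elided on the slide.
anchors = [
    ("critiqued", "painting1", "http://..."),
    ("biography", "painter1", "http://..."),
    ("biography", "painter2", "http://..."),
]


def related(topic_id):
    """Return (relationship type, title) pairs for relationships leaving topic_id."""
    return [(rel, topics[dst][1]) for rel, src, dst in relationships if src == topic_id]


def anchor_roles(topic_id):
    """Return the anchor roles attached to topic_id, in declaration order."""
    return [role for role, tid, _address in anchors if tid == topic_id]


if __name__ == "__main__":
    for topic_id, (topic_type, title) in topics.items():
        print(title, f"({topic_type})", related(topic_id), anchor_roles(topic_id))
```

Topics, relationships, and anchors are kept in separate collections because the topic map model treats them as distinct kinds of things; the anchors are the only part that reaches into the base layer.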
Topic Map Model in UML
[UML class diagram. The association lines and multiplicities did not survive extraction; the classes and connector names are listed below.]
Classes:
• TopicType: ttypename : String
• TopicInstance: title : String, topicInsID : Number
• TopicRelType: relType : String
• TopicRelInst
• AnchorType: anchorRole : String
• AnchorInst
• <<Mark>> Address: markID : String
<<conformance>> connectors: topic_instOf, rel_instOf, anchor_instOf
Other connectors: topicType, topicType1, topicType2, topicIns, topicIns1, topicIns2, address

Generic, Uniform Representation
• We use RDF and RDF Schema to represent model, schema, and instance uniformly
RDF Triples
(creator, ‘http://…/~john’, person1)
(name, ‘person1’, ‘John Smith’)
[RDF Graph: http://…/~john has creator person1; person1 has name ‘John Smith’.]
RDF Schema Triples
(type, ‘creator’, Property)
(domain, ‘creator’, WebPage)
(range, ‘creator’, Person)
(type, ‘Person’, Class)
(type, ‘WebPage’, Class)
[RDF Schema Graph: creator has type Property, domain WebPage, and range Person; WebPage and Person have type Class.]

The Metamodel Definition
[Figure: Basic Metamodel Elements (Construct, Connector; a connector connects 2 constructs) and Special Elements (Mark, Lexical, Structural, Conformance, Generalization).]
Construct: A basic structural unit
Mark: A connection-point to the base-layer
Lexical: A primitive-value type
Connector: A relationship between 2 constructs
Conformance: A schema-instance relationship
Generalization: An inheritance relationship

Representing Models
[UML fragment: TopicType (ttypename : String) 1 <<conformance>> topic_instOf * TopicInstance]
(instanceOf, “TopicType”, Construct)
(instanceOf, “TopicInstance”, Construct)

(instanceOf, “topic_instOf”, Conformance)
(domain, “topic_instOf”, TopicInstance)
(range, “topic_instOf”, TopicType)
(domainMult, “topic_instOf”, “*”)
(rangeMult, “topic_instOf”, “1”)

(instanceOf, “ttypename”, Connector)
(domain, “ttypename”, TopicType)
(range, “ttypename”, String)
(domainMult, “ttypename”, “*”)
(rangeMult, “ttypename”, “1”)

Representing Schema
[Figure: painting, by painter, painter, biography]
Topic Types (schema): painting, painter
(instanceOf, “painting_tt”, TopicType)
(ttypename, “painting_tt”, “painting”)
(instanceOf, “painter_tt”, TopicType)
(ttypename, “painter_tt”, “painter”)
Topic Rel Types (schema): by painter
(instanceOf, “byPainter_rt”, TopicRelType)
(relType, “byPainter_rt”, “by painter”)
(topicType1, “byPainter_rt”, painting_tt)
(topicType2, “byPainter_rt”, painter_tt)
Anchor Types (schema): biography
(instanceOf, “biography_at”, AnchorType)
(anchorRole, “biography_at”, “biography”)
(topicType, “biography_at”, painter_tt)

Representing Instances
Topic (instances): Paul Klee, Captive
(instanceOf, “painter1”, TopicInstance)
(title, “painter1”, “Paul Klee”)
(topicInsID, “painter1”, “5”)
(topic_instOf, “painter1”, painter_tt)
(instanceOf, “painting1”, TopicInstance)
(title, “painting1”, “Captive”)
(topicInsID, “painting1”, “19”)
(topic_instOf, “painting1”, painting_tt)
Topic Relationship (instance): a by painter relationship
(instanceOf, “byPainter1”, TopicRelInst)
(rel_instOf, “byPainter1”, byPainter_rt)
(topicIns1, “byPainter1”, painting1)
(topicIns2, “byPainter1”, painter1)
Anchor (instance): a biography anchor
(instanceOf, “biography1”, AnchorInst)
(anchor_instOf, “biography1”, biography_at)
(address, “biography1”, a1)
Address (instance): mark to URL
(instanceOf, “a1”, Address)
(markID, “a1”, “URLMarkManager@954308545”)

Basic Types of Mappings
[Figure: three types of mappings (Inter-Schema, Inter-Model, Model-to-Schema), each drawn over the Model, Schema, and Instances levels, with “Mapped” and “Converted” arrows between the source (Model1, Schema1, Instances1) and the target (Model2, Schema2, Instances2).]

Mapping Rules
Simple production rules over triples
[Figure: TopicInstance is mapped to XMLElem]
S(‘source’, (‘instanceOf’, X, ‘TopicInstance’))
S(‘target’, (‘instanceOf’, X, ‘XMLElem’))

Mapping Rules (cont.)
[Figure: TopicInstance, topic_instOf, and TopicType are mapped to XMLElem, elem_instOf, and XMLElemType]
S(‘source’, (‘topic_instOf’, X, Y))
S(‘target’, (‘instanceOf’, X, ‘XMLElem’))
S(‘target’, (‘instanceOf’, Y, ‘XMLElemType’))
S(‘target’, (‘elem_instOf’, X, Y))

Superimposed Information Management
[Figure, repeated from earlier in the talk: the general architecture for managing superimposed information. Labels: Superimposed Application (creates and manages Application Data); Application-Specific API; Generic Management; TRIM Store; Mark Management.]

Applications
• SLIMPad
– Scratchpad application with Bundle-Scrap model (uses superimposed information)
• XML Extractor
– “Extracts” XML information and transforms it into a Topic Map for searching/browsing
[Figure labels: XML Files, XML Extractor (in, out), Generic Rep. (XML model), mapped, Generic Rep. (TM model), stored, DBMS, Topic Map Browser.]

IDMEF to CISL
• IDMEF - Intrusion Detection

Harvesting Information to Sustain our Forests: Creating an Adaptive Management Portal
NSF DIGITAL GOVERNMENT PROGRAM
Tim Tolle (ttolle@fs.fed.us) & Lois Delcambre (lmd@cse.ogi.edu)
Co-Project Directors

Project focuses on the Adaptive Management Areas
USDA Forest Service
USDI Bureau of Land Management
USDI Fish and Wildlife Service

Adaptive Management Portal: a value-added, Internet-based service
• Provide multiple access paths to forest information.
• Preserve local autonomy and local focus of each site.
• Support diverse users and types of information.
• Use proposed, existing, and de facto standards for content, classification, and technology.
• Be low-cost, scalable, extensible.

Project Funding
• Duration: 3 years
• Budget: $1.5 million
• Principal financial sponsors
– National Science Foundation
– Bureau of Land Management (Oregon State Office)
– Forest Service (R-6 and PNW Station)
– National Park Service (Western Region)

Team Members
[Legend: forest/environmental expertise; computer science expertise]
Tim Tolle: Regional Coordinator for AMA, US Forest Service
Eric Landis: Forest Information System Specialist, Consultant
Craig Palmer: Natural Resources Monitoring Expert, UNLV
Fred Phillips: Professor, Head, Mgt. of Science and Tech., OGI
Patty Toccalino: Asst. Prof., Environmental Science and Eng., OGI
Lois Delcambre: Professor, Computer Science and Eng., OGI
David Maier: Professor, Computer Science and Eng., OGI
Shawn Bowers: PhD Student, Computer Science and Eng., OGI
Mat Weaver: PhD Student, Computer Science and Eng., OGI

Advisory Board
[Legend: forest/environmental expertise; computer science expertise]
Michel Biezunski: Co-Inventor of the Topic Map Model
Jeff Burley: President, IUFRO, Oxford Forestry Institute, Dept of Plant Sciences
Robert Devlin: USDA Forest Service, Pacific NW Region
Martin Goebel: Sustainable Northwest
Paul Gorman: MD, Asst. Professor, Division of Medical Informatics and Outcomes Research, OHSU
Fred Johnson: Executive Director, IMFN Secretariat
Monty Knudsen: Chief, Office of Technical Support, Forest Resources, USDI Fish and Wildlife Service
Cynthia L. Miner: Communications Director, USDA Forest Service, PNW Research Station
Regina Rochefort: Science Advisor, USDI, National Park Service
Mark Whiting: Staff Scientist, Pacific Northwest National Laboratory

Task 1 – Status
• Workshops @ Snoqualmie Pass Adaptive Management Area, Cle Elum, WA (June and July)
• Interviews with Forest Service Corvallis Forest Sciences Lab and USGS FRESC, Corvallis (August)
• Interviews with Central Cascades Adaptive Management Area, Eugene (August)
• Interviews with the Applegate Partnership and its associated agencies (August)
• Rainier National Park (planned for October)

Things we’ve learned from Task 1
NSF Digital Government
• work is project-based
• primary product is information: assessments, studies, surveys, environmental impact statements
• multiple agencies are involved
• each agency serves as information gatherer, information broker, and information consumer
• even though information is a primary product, information technology is secondary (stewardship of the land is the primary mission)

Research Issues
• Models for the superimposed layer
• How does the superimposed model influence the capabilities it supports?
• How does the form of superimposed information affect the effort to construct and maintain it?
– Are some forms more robust to updates in the base layer?
– What forms map onto current information management tools?

Research Issues (2)
• Challenges when superimposed and base layer have different models
– E.g., structured over unstructured, or vice versa
• Bi-level tools
– Browsing between layers
– Queries over both layers
• How do we delimit the universe of discourse in the base layer?
• Is it easier to fuse superimposed information than base information?

Research Issues (3)
• Variations on the conceptual architecture
– Commingled layers
– “Super-superimposed information”
• How do capabilities of base layer affect structure and operations over superimposed information?
– Addressing modes
– Address comparison
– Querying
• Addressing for non-web sources
– Relational, object-oriented DBs

Research Issues (4)
• How to extend DBMSs to better deal with information they don’t store
• How to help populate superimposed information spaces
• What are good formats for representation and exchange of superimposed information?

Why Databases Don’t (Currently) Solve It
• Seems closely related to view and data integration
• However
– Superimposed information can’t always be derived from the base data
– DB approaches assume schema and common model
– DBs like to work with data they control
– Traditional approaches are heavyweight
   • semantic analysis
   • schema integration
   • query mapping
   • on a source-by-source basis