Technology for Superimposed Information

Lois Delcambre
with Shawn Bowers, David Maier, Mat Weaver
Database and Object Technology Lab
Computer Science and Engineering Department
Oregon Graduate Institute
Superimposed Information - Stanford DB talk
Outline
• introduction to superimposed information
• a superimposed application: SLIMPad (DLI2 Project)
• model-based representation and transformation of
information
• harvesting information to sustain our forests
(NSF Digital Government project)
What is Superimposed Information?
data “placed over” existing information sources to:
• highlight
• annotate
• elaborate
• select
• collect
• organize
• connect
• reuse information elements
often to support new applications, beyond the original
Examples of Superimposed Information
• Non-electronic examples:
– Commentaries on religious texts, law, literature
– Concordances, citation indexes
• Electronic examples:
– Your bookmark file in your web browser
– RDF metadata
Why work on it now?
• Broadening range of digital information
– Easier to overlay than “hard copy” forms
– More and more sources of base information
• Accessibility/addressability to base information
– Reference (e.g., URL) can be resolved quickly
– Addressing at various levels of granularity
• Emerging Standards: RDF, Topic Maps, XLink
The superimposed and base layers with marks
[Figure: a Superimposed Layer sits above a Base Layer and is connected to it by marks. The Base Layer consists of Information Source 1, Information Source 2, …, Information Source n.]
Outline
• introduction to superimposed information
• a superimposed application: SLIMPad (DLI2 Project)
• model-based representation and transformation of
information
• harvesting information to sustain our forests
(NSF Digital Government project)
Paul Gorman, MD
Lois Delcambre, PhD
David Maier, PhD
Bundles in the wild………..
Observational team:
Paul Gorman
Joan Ash
Mary Lavelle
Jason Lyman
…………..Bundles in captivity
Computer science team:
Lois Delcambre
Dave Maier
Shawn Bowers
Longxing Deng
Mathew Weaver
Let’s take a trip to the ICU
(Wild) Bundles
• manage information for diverse, complex tasks
• contain selected, collected, structured, annotated information
• are often used in settings with:
– high uncertainty
– low predictability
– potentially grave outcomes
– highly constrained time and attention
(Wild) Bundles
• There is benefit in creating
(active processing of information)
• There is benefit in reusing
(trigger memory)
• There is benefit in sharing
(establish collective, situated awareness)
Given….
• bundles are everywhere!
• access to bundles provides access to important
information
• information in bundles is often copied from other
information sources
• we can keep copied/referenced information linked
through the use of marks
(Captive) Bundles
• SLIMPad - a scratchpad application to create bundles
but….with referenced information connected to the
underlying source data
• helping us explore architectural issues for building
superimposed applications
• motivating definition of a metamodel to represent
information with mappings to transform
• inspired by the observational work (but not focused on a
specific medical task)
SLIMPad demo
Superimposed Layer Information Manager (SLIM) Architecture: Contributions
• Mark Management - to create/resolve marks
• SLIM API - for the application developer
• TRIM store - for generic storage of superimposed
information
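A minimal sketch of what "create/resolve marks" could look like, assuming one resolver module per base-source type. The names `Mark`, `MarkManager`, `register_module`, `create_mark`, and `resolve` are hypothetical and are not the SLIM API; the base layer is stood in for by a plain dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Mark:
    """An addressable reference into a base-layer source (hypothetical shape)."""
    mark_id: str
    source_type: str  # e.g. "html" or "pdf"; these labels are assumptions
    address: str      # whatever the module needs to locate the marked element


class MarkManager:
    """Creates marks and resolves them through per-source-type modules."""

    def __init__(self) -> None:
        self._modules: Dict[str, Callable[[str], str]] = {}
        self._marks: Dict[str, Mark] = {}
        self._next_id = 0

    def register_module(self, source_type: str,
                        resolver: Callable[[str], str]) -> None:
        # A module is modeled here as a function from address to content.
        self._modules[source_type] = resolver

    def create_mark(self, source_type: str, address: str) -> Mark:
        if source_type not in self._modules:
            raise KeyError(f"no module registered for {source_type!r}")
        self._next_id += 1
        mark = Mark(f"mark-{self._next_id}", source_type, address)
        self._marks[mark.mark_id] = mark
        return mark

    def resolve(self, mark_id: str) -> str:
        # Look up the stored mark, then delegate to the matching module.
        mark = self._marks[mark_id]
        return self._modules[mark.source_type](mark.address)


# Stand-in base layer: a dict instead of a real viewer or document.
pages = {"http://example.org/page#p1": "text of the marked region"}
manager = MarkManager()
manager.register_module("html", lambda address: pages[address])
page_mark = manager.create_mark("html", "http://example.org/page#p1")
resolved = manager.resolve(page_mark.mark_id)
```

The superimposed layer only ever stores `mark_id`; everything source-specific stays behind the module boundary.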
Superimposed Information Management
[Figure: The general architecture for managing superimposed information. Components shown: Superimposed Application (creates and manages Application Data), Application-Specific API, Generic Management, TRIM Store, Mark Management.]
[Figure: SLIMPad architecture. Superimposed Information Management: SLIMPad. Mark Management: a Mark Manager with a Mark DB and one module per base source type. The user also works with the base-layer viewers: PDF Viewer, IE Explorer, MS Excel, MS PowerPoint, XML Viewer.]
• PDF Module: PDF files
• HTML Module: Web Pages
• Excel Module: Excel Spreadsheets
• PowerPoint Module: PPT Files
• XML Module: XML Documents
SLIM API: as seen by application
[UML class diagram with associations of multiplicity 1, *, and 0..1 among the classes below]
• AbstractBundle
• SLIMPad
– padName : String
• Bundle
– bundleName : String
– bundleXPos : Number
– bundleYPos : Number
– bundleHeight : Number
– bundleWidth : Number
• Scrap
– scrapName : String
– scrapXPos : Number
– scrapYPos : Number
• Mark
– markId : String
Structured Bundle Model for SLIMPad.
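A rough sketch of the bundle model as Python dataclasses. The attributes come from the slide; the associations are assumptions (a pad has one root bundle, bundles nest bundles and scraps, a scrap has at most one mark), since the slide text lists multiplicities without saying which association each belongs to. AbstractBundle is left out for the same reason, and `all_mark_ids` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mark:
    mark_id: str


@dataclass
class Scrap:
    scrap_name: str
    scrap_x_pos: float
    scrap_y_pos: float
    mark: Optional[Mark] = None  # assumed: 0..1 mark per scrap


@dataclass
class Bundle:
    bundle_name: str
    bundle_x_pos: float
    bundle_y_pos: float
    bundle_height: float
    bundle_width: float
    scraps: List[Scrap] = field(default_factory=list)      # assumed: *
    bundles: List["Bundle"] = field(default_factory=list)  # assumed: nesting


@dataclass
class SLIMPad:
    pad_name: str
    root_bundle: Bundle  # assumed: one root bundle per pad


def all_mark_ids(bundle: Bundle) -> List[str]:
    """Collect mark IDs depth-first: a bundle's own scraps, then nested bundles."""
    ids = [s.mark.mark_id for s in bundle.scraps if s.mark is not None]
    for nested in bundle.bundles:
        ids.extend(all_mark_ids(nested))
    return ids


labs = Bundle("labs", 10, 10, 80, 120,
              scraps=[Scrap("potassium", 12, 14, Mark("m2"))])
root = Bundle("patient", 0, 0, 300, 400,
              scraps=[Scrap("note", 5, 5), Scrap("x-ray", 5, 40, Mark("m1"))],
              bundles=[labs])
pad = SLIMPad("rounds", root)
mark_ids = all_mark_ids(pad.root_bundle)
```

A scrap without a mark (the "note" above) is plain superimposed content; scraps with marks stay linked to the base layer.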
What’s Next for this Project?
• Validation - cardiologists, ICU nurses, …
• Extend the informational model of SLIMPad
• Extend SLIMPad to suit a selected medical task
• Extension of observational work to other domains
www.cse.ogi.edu/footprints
• demos - including the QTVR of the ICU (with toys)
and SLIMPad
• personnel
• project description
• papers
– “Bundles in the Wild: Tools for Managing Information to
Maintain Situation Awareness”
– “Bundles in Captivity: An Application of Superimposed
Information”
– papers discussing superimposed information
Outline
• introduction to superimposed information
• a superimposed application: SLIMPad (DLI2 Project)
• model-based representation and transformation of
information
• harvesting information to sustain our forests
(NSF Digital Government project)
Model-Based Superimposed Information
[Figure: the Superimposed Layer holds a Model, Schema Data, and Instance Data with Marks. The marks connect it to the Base Layer: Information Source 1 and Information Source 2.]
But the model and schema are optional
Our Goals
• Represent information generically, for various models
• Convert information from one representation scheme
to another
Transforming Information
[Figure: three tools, each over its own generic representation, with convert steps between the representations:]
• TM Browser: Generic Rep. (Topic Map model)
• XML Viewer over XML: Generic Rep. (XML model)
• SQL over a DB: Generic Rep. (Relational model)
[The Topic Map shown relates Painting and Painter through “by painter” and “influenced by”, with anchors labeled mentioned, critiqued, and biography.]
Our Approach
• Metamodel
– to represent multiple data models
• Generic, Uniform Representation Scheme
– to store model, schema, and instances for model-based
information
• Mapping Formalism
– to transform between representation schemes
The Metamodel
• Provides a level of abstraction above models
• Describes the structural features of models
[Figure: four levels, top to bottom:]
• Metamodel: Basic Set of Abstractions
• Model Constructs and Relationships: Topic Map, XML
• Schema-Level Data: Topic Map Definitions, DTD
• Instance-Level Data: Topic Map Instances, XML Document
XML Model, Schema, and Instance
XML Model:
• Elements, Element Types, Attributes, Attribute Types
• Elements contain Attributes
• Elements can be nested
(Model constructs and relationships defined using the metamodel)
XML DTD (Schema):
<!ELEMENT schedule (flight*)>
<!ELEMENT flight (from, to, price)>
<!ATTLIST flight name CDATA #REQUIRED>
XML Document (Instances):
<schedule>
<flight name="Air Canada Flight 1575">
<from> PDX </from>
<to> YVR </to>
<price> $213.84 </price>
</flight>
...
</schedule>
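As an illustration of how such a document could be flattened into a generic triple form, here is a small sketch using Python's standard `xml.etree.ElementTree`. The triple order (property, subject, value) follows the later slides; the property names `elemType`, `nestedIn`, `attr:<name>`, and `text`, and the generated IDs `e1`, `e2`, …, are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (property, subject, value)

DOC = """<schedule>
  <flight name="Air Canada Flight 1575">
    <from> PDX </from>
    <to> YVR </to>
    <price> $213.84 </price>
  </flight>
</schedule>"""


def element_triples(root: ET.Element) -> List[Triple]:
    """Walk the tree in document order and emit triples for every element."""
    triples: List[Triple] = []
    counter = 0

    def visit(elem: ET.Element, parent_id: Optional[str]) -> None:
        nonlocal counter
        counter += 1
        elem_id = f"e{counter}"
        triples.append(("instanceOf", elem_id, "XMLElem"))
        triples.append(("elemType", elem_id, elem.tag))
        if parent_id is not None:
            triples.append(("nestedIn", elem_id, parent_id))
        for name, value in elem.attrib.items():
            triples.append((f"attr:{name}", elem_id, value))
        text = (elem.text or "").strip()
        if text:  # skip whitespace-only content between child elements
            triples.append(("text", elem_id, text))
        for child in elem:
            visit(child, elem_id)

    visit(root, None)
    return triples


triples = element_triples(ET.fromstring(DOC))
```

Nesting, attributes, and text each become ordinary triples, so the same storage and query machinery can serve XML and any other model.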
Topic Map Example
[Figure: a topic map over web documents.]
• Topic types: Painting, Painter
• Topics shown: “Captive”, “1914”, “Paul Klee”, “Francisco de Goya”
• Relationships: by painter, influenced by
• Anchors to base documents (http://...): mentioned, critiqued, biography
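Topic Map Model in UML
[UML class diagram, summarized:]
Schema-level classes:
• TopicType (ttypename : String)
• TopicRelType (relType : String); topicType1 and topicType2 refer to TopicType
• AnchorType (anchorRole : String); topicType refers to TopicType
Instance-level classes:
• TopicInstance (title : String, topicInsID : Number)
• TopicRelInst; topicIns1 and topicIns2 refer to TopicInstance
• AnchorInst; topicIns refers to TopicInstance, address refers to Address
• <<Mark>> Address (markID : String)
<<conformance>> connectors:
• topic_instOf: TopicInstance to TopicType
• rel_instOf: TopicRelInst to TopicRelType
• anchor_instOf: AnchorInst to AnchorType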
Generic, Uniform Representation
• We use RDF and RDF Schema to represent model, schema, and instance uniformly
RDF Triples:
(creator, ‘http://…/~john’, person1)
(name, ‘person1’, ‘John Smith’)
RDF Graph: http://…/~john --creator--> person1 --name--> ‘John Smith’
RDF Schema Triples:
(type, ‘creator’, Property)
(domain, ‘creator’, WebPage)
(range, ‘creator’, Person)
(type, ‘Person’, Class)
(type, ‘WebPage’, Class)
RDF Schema Graph: creator has type Property, domain WebPage, and range Person; WebPage and Person have type Class
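To make the uniform representation concrete, the sketch below holds the slide's instance and schema triples in one plain Python list and answers both kinds of question with the same pattern match. The `match` helper is hypothetical, and the elided URL is kept exactly as it appears on the slide.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (property, subject, value), as on the slide

store: List[Triple] = [
    # instance triples
    ("creator", "http://…/~john", "person1"),
    ("name", "person1", "John Smith"),
    # schema triples
    ("type", "creator", "Property"),
    ("domain", "creator", "WebPage"),
    ("range", "creator", "Person"),
    ("type", "Person", "Class"),
    ("type", "WebPage", "Class"),
]


def match(triples: List[Triple],
          prop: Optional[str] = None,
          subj: Optional[str] = None,
          value: Optional[str] = None) -> List[Triple]:
    """Return triples that agree with every argument that is not None."""
    return [t for t in triples
            if (prop is None or t[0] == prop)
            and (subj is None or t[1] == subj)
            and (value is None or t[2] == value)]


# Instance-level question: what is the name of the page's creator?
creator_id = match(store, prop="creator", subj="http://…/~john")[0][2]
creator_name = match(store, prop="name", subj=creator_id)[0][2]

# Schema-level question, answered by the same function: what may 'creator' point to?
creator_range = match(store, prop="range", subj="creator")[0][2]
classes = sorted(t[1] for t in match(store, prop="type", value="Class"))
```

Because schema and instances share one shape, no separate catalog API is needed to inspect the schema.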
The Metamodel Definition
Basic metamodel elements (structural): Construct, Connector
Special elements: Mark, Lexical, Conformance, Generalization
• Construct: A basic structural unit
– Mark: A connection-point to the base-layer
– Lexical: A primitive-value type
• Connector: A relationship between 2 constructs
– Conformance: A schema-instance relationship
– Generalization: An inheritance relationship
Representing Models
[UML fragment: TopicType (ttypename : String) 1 <-- <<conformance>> topic_instOf -- * TopicInstance]
Constructs:
(instanceOf, “TopicType”, Construct)
(instanceOf, “TopicInstance”, Construct)
Conformance connector topic_instOf:
(instanceOf, “topic_instOf”, Conformance)
(domain, “topic_instOf”, TopicInstance)
(range, “topic_instOf”, TopicType)
(domainMult, “topic_instOf”, “*”)
(rangeMult, “topic_instOf”, “1”)
Connector ttypename:
(instanceOf, “ttypename”, Connector)
(domain, “ttypename”, TopicType)
(range, “ttypename”, String)
(domainMult, “ttypename”, “*”)
(rangeMult, “ttypename”, “1”)
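One benefit of storing a model as triples is that simple well-formedness checks become queries. The sketch below, a hypothetical check rather than anything described in the talk, verifies that every connector's domain and range name a declared construct. The triple declaring String as a Lexical is an assumption added here; it does not appear on the slide.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (property, subject, value)

model: List[Triple] = [
    ("instanceOf", "TopicType", "Construct"),
    ("instanceOf", "TopicInstance", "Construct"),
    ("instanceOf", "String", "Lexical"),  # assumed; not on the slide
    ("instanceOf", "topic_instOf", "Conformance"),
    ("domain", "topic_instOf", "TopicInstance"),
    ("range", "topic_instOf", "TopicType"),
    ("domainMult", "topic_instOf", "*"),
    ("rangeMult", "topic_instOf", "1"),
    ("instanceOf", "ttypename", "Connector"),
    ("domain", "ttypename", "TopicType"),
    ("range", "ttypename", "String"),
    ("domainMult", "ttypename", "*"),
    ("rangeMult", "ttypename", "1"),
]

NODE_KINDS = {"Construct", "Mark", "Lexical"}
LINK_KINDS = {"Connector", "Conformance", "Generalization"}


def dangling_endpoints(
        triples: List[Triple]) -> List[Tuple[str, str, Optional[str]]]:
    """List (connector, end, target) where an end is missing or undeclared."""
    nodes = {s for p, s, v in triples if p == "instanceOf" and v in NODE_KINDS}
    links = {s for p, s, v in triples if p == "instanceOf" and v in LINK_KINDS}
    problems: List[Tuple[str, str, Optional[str]]] = []
    for link in sorted(links):
        for end in ("domain", "range"):
            targets = [v for p, s, v in triples if p == end and s == link]
            if not targets:
                problems.append((link, end, None))
            problems.extend((link, end, t) for t in targets if t not in nodes)
    return problems


ok = dangling_endpoints(model)
broken = dangling_endpoints([t for t in model if t[1] != "String"])
```

With the assumed String declaration removed, the check reports that `ttypename` has an undeclared range.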
Representing Schema
Topic Types (schema): painting, painter
(instanceOf, “painting_tt”, TopicType)
(ttypename, “painting_tt”, “painting”)
(instanceOf, “painter_tt”, TopicType)
(ttypename, “painter_tt”, “painter”)
Topic Rel Types (schema): by painter
(instanceOf, “byPainter_rt”, TopicRelType)
(relType, “byPainter_rt”, “by painter”)
(topicType1, “byPainter_rt”, painting_tt)
(topicType2, “byPainter_rt”, painter_tt)
Anchor Types (schema): biography
(instanceOf, “biography_at”, AnchorType)
(anchorRole, “biography_at”, “biography”)
(topicType, “biography_at”, painter_tt)
[Diagram: painting --by painter--> painter; painter has a biography anchor]
Representing Instances
Topic (instances): Paul Klee, Captive
(instanceOf, “painter1”, TopicInstance)
(title, “painter1”, “Paul Klee”)
(topicInsID, “painter1”, “5”)
(topic_instOf, “painter1”, painter_tt)
(instanceOf, “painting1”, TopicInstance)
(title, “painting1”, “Captive”)
(topicInsID, “painting1”, “19”)
(topic_instOf, “painting1”, painting_tt)
Topic Relationship (instance): a by painter relationship
(instanceOf, “byPainter1”, TopicRelInst)
(rel_instOf, “byPainter1”, byPainter_rt)
(topicIns1, “byPainter1”, painting1)
(topicIns2, “byPainter1”, painter1)
Anchor (instance): a biography anchor
(instanceOf, “biography1”, AnchorInst)
(anchor_instOf, “biography1”, biography_at)
(address, “biography1”, a1)
Address (instance): mark to URL
(instanceOf, “a1”, Address)
(markID, “a1”, “URLMarkManager@954308545”)
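With schema and instances in the same triple form, a conformance check reduces to a few set lookups. The sketch below is one possible reading of the `topic_instOf` conformance connector with range multiplicity “1”: each TopicInstance must point to exactly one TopicType. The function name and the violation messages are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (property, subject, value)

triples: List[Triple] = [
    # schema
    ("instanceOf", "painting_tt", "TopicType"),
    ("instanceOf", "painter_tt", "TopicType"),
    # instances
    ("instanceOf", "painter1", "TopicInstance"),
    ("title", "painter1", "Paul Klee"),
    ("topic_instOf", "painter1", "painter_tt"),
    ("instanceOf", "painting1", "TopicInstance"),
    ("title", "painting1", "Captive"),
    ("topic_instOf", "painting1", "painting_tt"),
]


def topic_conformance_violations(data: List[Triple]) -> List[Tuple[str, str]]:
    """Check that every TopicInstance conforms to exactly one TopicType."""
    types = {s for p, s, v in data if p == "instanceOf" and v == "TopicType"}
    topics = {s for p, s, v in data
              if p == "instanceOf" and v == "TopicInstance"}
    violations: List[Tuple[str, str]] = []
    for topic in sorted(topics):
        targets = [v for p, s, v in data
                   if p == "topic_instOf" and s == topic]
        if len(targets) != 1:
            violations.append((topic, "expected exactly one topic_instOf"))
        for target in targets:
            if target not in types:
                violations.append((topic, f"{target} is not a TopicType"))
    return violations


clean = topic_conformance_violations(triples)
# Add a topic with no type, and one that points at a non-type.
extra = triples + [
    ("instanceOf", "painter2", "TopicInstance"),
    ("instanceOf", "painting2", "TopicInstance"),
    ("topic_instOf", "painting2", "painter1"),
]
found = topic_conformance_violations(extra)
```

The same pattern would extend to `rel_instOf` and `anchor_instOf` by swapping the class and connector names.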
Basic Types of Mappings
[Figure: three basic types of mappings: Inter-Model, Inter-Schema, and Model-to-Schema. In each, one level is Mapped and other levels are Converted. For example, when Model1 is mapped to Model2, Schema1 is converted to Schema2 and Instances1 are converted to Instances2.]
Mapping Rules
Simple production rules over triples
Mapping: TopicInstance --Mapped--> XMLElem
S(‘source’, (‘instanceOf’, X, ‘TopicInstance’))
→
S(‘target’, (‘instanceOf’, X, ‘XMLElem’))
Mapping Rules (cont.)
Mapping: TopicInstance --Mapped--> XMLElem; TopicType --Mapped--> XMLElemType; topic_instOf --Mapped--> elem_instOf
S(‘source’, (‘topic_instOf’, X, Y))
S(‘target’, (‘instanceOf’, X, ‘XMLElem’))
S(‘target’, (‘instanceOf’, Y, ‘XMLElemType’))
→
S(‘target’, (‘elem_instOf’, X, Y))
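A small sketch of how production rules of this shape could be evaluated: each rule has a body of patterns over the source or target store and a head that adds a triple to the target, and rules are applied until nothing new is produced. The engine (`Var`, `unify`, `solve`, `apply_rules`) is hypothetical and not the project's implementation. The rule mapping TopicType to XMLElemType is assumed by analogy with the TopicInstance rule; the slides give no explicit rule for it.

```python
from typing import Dict, Iterator, List, NamedTuple, Optional, Set, Tuple


class Var(NamedTuple):
    """A rule variable such as X or Y."""
    name: str


Fact = Tuple[str, str, str]                        # (property, subject, value)
Pattern = Tuple[str, Tuple[object, object, object]]  # (store name, triple pattern)
Rule = Tuple[List[Pattern], Pattern]                 # (body, head)


def unify(pattern, fact: Fact, binding: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend binding so that pattern equals fact, or return None on a clash."""
    new = dict(binding)
    for term, val in zip(pattern, fact):
        if isinstance(term, Var):
            if new.setdefault(term.name, val) != val:
                return None
        elif term != val:
            return None
    return new


def solve(body: List[Pattern], stores: Dict[str, Set[Fact]],
          binding: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield every binding that satisfies all patterns in the body."""
    if not body:
        yield binding
        return
    (store, pattern), rest = body[0], body[1:]
    for fact in list(stores[store]):
        extended = unify(pattern, fact, binding)
        if extended is not None:
            yield from solve(rest, stores, extended)


def apply_rules(rules: List[Rule], source: List[Fact]) -> Set[Fact]:
    """Apply rules repeatedly until the target store stops growing."""
    stores: Dict[str, Set[Fact]] = {"source": set(source), "target": set()}
    changed = True
    while changed:
        changed = False
        for body, (head_store, head) in rules:
            for b in list(solve(body, stores, {})):
                fact = tuple(b[t.name] if isinstance(t, Var) else t for t in head)
                if fact not in stores[head_store]:
                    stores[head_store].add(fact)
                    changed = True
    return stores["target"]


X, Y = Var("X"), Var("Y")
rules: List[Rule] = [
    ([("source", ("instanceOf", X, "TopicInstance"))],
     ("target", ("instanceOf", X, "XMLElem"))),
    ([("source", ("instanceOf", X, "TopicType"))],       # assumed rule
     ("target", ("instanceOf", X, "XMLElemType"))),
    ([("source", ("topic_instOf", X, Y)),
      ("target", ("instanceOf", X, "XMLElem")),
      ("target", ("instanceOf", Y, "XMLElemType"))],
     ("target", ("elem_instOf", X, Y))),
]
source = [
    ("instanceOf", "painter1", "TopicInstance"),
    ("instanceOf", "painter_tt", "TopicType"),
    ("topic_instOf", "painter1", "painter_tt"),
]
target = apply_rules(rules, source)
```

The third rule fires only after the first two have populated the target, which is why the loop runs to a fixpoint instead of making a single pass.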
Superimposed Information Management
[Figure: The general architecture for managing superimposed information. Components shown: Superimposed Application (creates and manages Application Data), Application-Specific API, Generic Management, TRIM Store, Mark Management.]
Applications
• SLIM Pad
– Scratchpad application with Bundle-Scrap model
(uses superimposed information)
• XML Extractor
– “Extracts” XML information and transforms it into a Topic
Map for searching/browsing
[Figure: XML Files go in to the XML Extractor and are stored in a DBMS as a Generic Rep. (XML model). That representation is mapped to a Generic Rep. (TM model), which goes out to the Topic Map Browser.]
IDMEF to CISL
• IDMEF - Intrusion Detection
Harvesting Information to Sustain our Forests:
Creating an Adaptive Management Portal
NSF DIGITAL GOVERNMENT PROGRAM
Tim Tolle (ttolle@fs.fed.us) & Lois Delcambre (lmd@cse.ogi.edu)
Co-Project Directors
Project focuses on the Adaptive Management Areas
• USDA Forest Service
• USDI Bureau of Land Management
• USDI Fish and Wildlife Service
Adaptive Management Portal:
a value-added, Internet-based service
• Provide multiple access paths to forest information.
• Preserve local autonomy and local focus of each site.
• Support diverse users and types of information.
• Use proposed, existing, and de facto standards for content,
classification, and technology.
• Be low-cost, scalable, extensible.
Project Funding
• Duration: 3 years
• Budget: $1.5 million
• Principal financial sponsors
– National Science Foundation
– Bureau of Land Management (Oregon State Office)
– Forest Service (R-6 and PNW Station)
– National Park Service (Western Region)
Team Members
(Slide legend: forest/environmental expertise; computer science expertise)
• Tim Tolle: Regional Coordinator for AMA, US Forest Service
• Eric Landis: Forest Information System Specialist, Consultant
• Craig Palmer: Natural Resources Monitoring Expert, UNLV
• Fred Phillips: Professor, Head, Mgt. of Science and Tech., OGI
• Patty Toccalino: Asst. Prof., Environmental Science and Eng., OGI
• Lois Delcambre: Professor, Computer Science and Eng., OGI
• David Maier: Professor, Computer Science and Eng., OGI
• Shawn Bowers: PhD Student, Computer Science and Eng., OGI
• Mat Weaver: PhD Student, Computer Science and Eng., OGI
Advisory Board
(Slide legend: forest/environmental expertise; computer science expertise)
• Michel Biezunski: Co-Inventor of the Topic Map Model
• Jeff Burley: President, IUFRO, Oxford Forestry Institute, Dept of Plant Sciences
• Robert Devlin: USDA Forest Service, Pacific NW Region
• Martin Goebel: Sustainable Northwest
• Paul Gorman: MD, Asst. Professor, Division of Medical Informatics and Outcomes Research, OHSU
• Fred Johnson: Executive Director, IMFN Secretariat
• Monty Knudsen: Chief, Office of Technical Support, Forest Resources, USDI Fish and Wildlife Service
• Cynthia L. Miner: Communications Director, USDA Forest Service, PNW Research Station
• Regina Rochefort: Science Advisor, USDI, National Park Service
• Mark Whiting: Staff Scientist, Pacific Northwest National Laboratory
Task 1 – Status
• Workshops @ Snoqualmie Pass Adaptive Management Area,
Cle Elum, WA (June and July)
• Interviews with Forest Service Corvallis Forest Sciences Lab
and USGS FRESC, Corvallis (August)
• Interviews with Central Cascades Adaptive Management Area,
Eugene (August)
• Interviews with the Applegate Partnership and its associated
agencies (August)
• Rainier National Park (planned for October)
Things we’ve learned from Task 1
NSF Digital Government
• work is project-based
• primary product is information: assessments, studies,
surveys, environmental impact statements
• multiple agencies are involved
• each agency serves as information gatherer; information
broker; information consumer
• even though information is a primary product, information
technology is secondary (stewardship of the land is the
primary mission)
Research Issues
• Models for the superimposed layer
• How does the superimposed model influence the
capabilities it supports?
• How does the form of superimposed information
affect the effort to construct and maintain it?
– Are some forms more robust to updates in the base layer?
– What forms map onto current information management tools?
Research Issues (2)
• Challenges when superimposed and base layer have
different models
– E.g., structured over unstructured, or vice versa
• Bi-level tools
– Browsing between layers
– Queries over both layers
• How do we delimit the universe of discourse in the
base layer?
• Is it easier to fuse superimposed information than
base information?
Research Issues (3)
• Variations on the conceptual architecture
– Commingled layers
– “Super-superimposed information”
• How do capabilities of base layer affect structure and
operations over superimposed information?
– Addressing modes
– Address comparison
– Querying
• Addressing for non-web sources
– Relational, object-oriented DBs
Research Issues (4)
• How to extend DBMSs to better deal with information
they don’t store.
• How to help populate superimposed information
spaces.
• What are good formats for representation and
exchange of superimposed information?
Why Databases Don’t (Currently) Solve It
• Seems closely related to view and data integration
• However
– Superimposed information can’t always be derived from the
base data
– DB approaches assume schema and common model
– DBs like to work with data they control
– Traditional approaches are heavyweight
• semantic analysis
• schema integration
• query mapping
• on a source-by-source basis