20090629FoxDLorgFunWG - Edward A. Fox

DL.Org (Digital Library Interoperability, Best
Practices and Modeling Foundations)
Functionality Working Group Mtg
29-30 June 2009, Athens
“Functionality modeling and functionality
interoperability, Session 1”
Functionality and Interoperability with 5S
by Edward A. Fox
• fox@vt.edu http://fox.cs.vt.edu
• Dept. of Computer Science, Virginia Tech
• Blacksburg, VA 24061 USA
1
Acknowledgements
• Mentors (Licklider, Kessler, Salton)
• Virginia Tech, CS, Digital Library Research Laboratory
• NSF and other sponsors, e.g., grants
– DUE-0840719, CCF-0722259, IIS-0535057, IIS-0325579
• Students, colleagues, co-investigators
• Robert France, Marcos André Gonçalves, Doug
Gorton, Yi Ma, Uma Murthy, Rao Shen, Hussein
Suleman, Ricardo da Silva Torres, ...
• Barbara Wildemuth, Jeffrey Pomerantz, Sanghee Oh,
Seungwon Yang
2
Theses and Dissertations
• Douglas Gorton, "Practical Digital Library Generation into DSpace with the 5S
Framework", April 2007, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-04252007-161736/
• Rao Shen, "Applying the 5S Framework To Integrating Digital Libraries", April 2006,
PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-04212006-135018/
• Ananth Raghavan, "Schema Mapper: A Visualization Tool for Incremental Semi-automatic
Mapping-based Integration of Heterogeneous Collections into Archaeological Digital
Libraries: The ETANA-DL Case Study", May 2005, MS thesis,
http://scholar.lib.vt.edu/theses/available/etd-05182005-114155/
• Marcos Andre Goncalves, "Streams, Structures, Spaces, Scenarios, and Societies
(5S): A Formal Digital Library Framework and Its Applications", Nov. 2004, PhD
dissertation, http://scholar.lib.vt.edu/theses/available/etd-12052004-135923/
• Rohit Dilip Kelapure, "Scenario-Based Generation of Digital Library Services", June
2003, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-06182003-055012/
• Hussein Suleman, "Open Digital Libraries", Nov. 2002, PhD dissertation,
http://scholar.lib.vt.edu/theses/available/etd-11222002-155624/
• Qinwei Zhu, "5SGraph: A Modeling Tool for Digital Libraries", Nov. 2002, MS thesis,
http://scholar.lib.vt.edu/theses/available/etd-11272002-210531/
• Jun Wang, "VIDI: A Lightweight Protocol Between Visualization Systems and Digital
Libraries", May 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-07012002-145841/
3
Other Selected References
• Marcos Andre Goncalves, Robert K. France, Edward A. Fox. MARIAN: Flexible
Interoperability for Federated Digital Libraries. ECDL 2001, 173-186
• Hussein Suleman, Edward A. Fox. The Open Archives Initiative: Realizing Simple and
Effective Digital Library Interoperability. J. Library Administration, 35(1/2): 125-145, 2002
• Marcos Andre Goncalves, Edward A. Fox. 5SL - A Language for Declarative Specification
and Generation of Digital Libraries. JCDL 2002, 263-272
• Marcos Andre Goncalves, Ming Luo, Rao Shen, Mir Farooq Ali, Edward A. Fox. An XML
Log Standard and Tool for Digital Library Logging Analysis. ECDL 2002, 129-143
• Marcos Andre Goncalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron
Krowne, Edward A. Fox, Filip Jagodzinski, Lillian Cassel. The XML Log Standard for Digital
Libraries: Analysis, Evolution, and Deployment. JCDL 2003, 312-314
• Hussein Suleman, Edward A. Fox, Rohit Kelapure, Aaron Krowne, Ming Luo. Building Digital
Libraries from Simple Building Blocks. Online Information Review, 27(5): 301-310, 2003
• Marcos Andre Goncalves, Edward A. Fox, Layne T. Watson, Neill A. Kipp. Streams, Structures,
Spaces, Scenarios, Societies (5S): A Formal Model for Digital Libraries. TOIS, 22(2): 270-312, 2004
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Ricardo da S. Torres, Edward A. Fox. Exploring
Digital Libraries: Integrating Browsing, Searching, and Visualization. JCDL 2006, 1-10
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. What is a Successful
Digital Library? ECDL 2006, 208-219
4
Other Selected References - 2
• Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang, Edward A. Fox, Barbara M.
Wildemuth. The Core: Digital Library Education in Library and Information Science
Programs. D-Lib Magazine, 12(11), Nov. 2006
• Marcos Andre Goncalves, Barbara L. Moreira, Edward A. Fox, Layne T. Watson. "What
is a good digital library?" - A quality model for digital libraries. Information Processing
and Management, 43(5): 1416-1437, 2007
• Uma Murthy, Douglas Gorton, Ricardo Torres, Marcos Goncalves, Edward Fox, Lois
Delcambre. Extending the 5S Digital Library (DL) Framework: From a Minimal DL
towards a DL Reference Model. JCDL 2007 Workshop on Digital Library Foundations
• Barbara L. Moreira, Marcos A. Goncalves, Alberto H. F. Laender, Edward A. Fox.
Evaluating Digital Libraries with 5SQual. ECDL 2007, 466-470
• Yi Ma, Edward A. Fox, Marcos A. Goncalves. Personal Digital Library: PIM upon 5S
Framework. CIKM 2007 Workshop: PIKM07, Lisbon, Nov. 2007, 117-124
• Marcos Andre Goncalves, Edward A. Fox, Layne T. Watson. Towards a Digital Library
Theory: A Formal Digital Library Ontology. Int. J. Digital Libraries, 8(2): 91-114, 2008
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. Integration of Complex
Archaeology Digital Libraries: An ETANA-DL Experience. Information Systems, 33(7-8):
699-723, 2008
• Barbara L. Moreira, Marcos Andre Goncalves, Alberto H. F. Laender, Edward A. Fox.
Automatic Evaluation of Digital Libraries with 5SQual. J. Informetrics, 3(2): 102-123,
2009
5
Outline
• Contextual Background
– DL Definitions, Scope
– DL Curricula Efforts
– Interoperability Approaches
• 5S
• 5S Services Work
• International Repository Infrastructure
Workshop (Amsterdam, Mar 16-17, 2009)
• Discussion Topics
6
DL Definitions
• Issues and Spectra
– Collection vs. Institution
– Content vs. System
– Access vs. Preservation
– “Free” vs. Quality
– Managed vs. Comprehensive
– Centralized vs. Distributed
7
Information Life Cycle
Borgman et al.: Workshop Report on Social Aspects of Digital Libraries:
http://www-lis.gseis.ucla.edu/DL/
8
Information Life Cycle
[Figure: information life cycle. Stages shown: Creating, Authoring, Modifying, Organizing,
Indexing, Storing, Retrieving, Distributing, Networking, Accessing, Filtering, Using,
Retention / Mining]
9
Digital Libraries
Shorten the Chain from
Editor
Reviewer
Publisher
A&I
Consolidator
Library
10
DLs Shorten the Chain to
Author
Teacher
Digital
Reader
Editor
Reviewer
Learner
Library
Librarian
11
DL Curric. Project
• NSF awards to VT and UNC-CH
• CS and LIS
• http://curric.dlib.vt.edu/
• http://curric.dlib.vt.edu/wiki/index.php/Main_Page
• http://curric.dlib.vt.edu/modDev/modDev.html
12
RELATED
TOPICS
CORE DL
TOPICS
COURSE
STRUCTURE
DL Curriculum Framework
Semester 1:
DL collections:
development/creation
Digitization
Storage
Interchange
Metadata
Cataloging
Author
submission
Digital objects
Composites
Packages
Semester 2:
DL services and
sustainability
Architectures
(agents, buses,
wrappers/mediators)
Interoperability
Spaces
(conceptual,
geographic,
2/3D, VR)
Documents
E-publishing
Markup
Multimedia
streams/structures
Capture/representation
Compression/coding
Bibliographic
information
Bibliometrics
Citations
Content-based
analysis
Multimedia
indexing
Naming
Repositories
Archives
Services
(searching,
linking,
browsing, etc.)
Archiving and
preservation
Integrity
Architectures
(agents, buses,
wrappers/mediators)
Interoperability
Thesauri
Ontologies
Classification
Categorization
Multimedia
presentation,
rendering
Info. Needs
Relevance
Evaluation
Effectiveness
Intellectual property
rights mgmt.
Privacy
Protection (watermarking)
Routing
Filtering
Community
filtering
Search & search strategy
Info seeking behavior
User modeling
Feedback
Info
summarization
Visualization
13
DL Curric. Modules - 1
• Module 1-b: History of digital libraries
and library automation
• Module 2-c: File Formats,
Transformation, and Migration
• Module 3-b: Digitization
• Module 4-b: Metadata
• Module 5-a: Architecture overviews
14
DL Curric. Modules - 2
• Module 5-b: Application software
• Module 5-d: Protocols
• Module 6-a: Information
needs/relevance
• Module 6-b: Online information seeking
behaviors and search strategies
• Module 6-d: Interaction design and
usability assessment
15
DL Curric. Modules - 3
• Module 7-b: Reference Services
• Module 7-g: Personalization
• Module 8-b: Web Archiving
• Module 9-c: Digital library evaluation,
user studies
16
Interoperability Approaches
• Browsers (Mosaic)
• Federation
• Heterogeneous, Homogeneous
• Protocols (OAI-PMH)
• Repositories
• Content Standards (XML), Mapping
• Integration (ETANA)
• Services (Superimposed Information)
17
Integration: Challenges
• “Semantic Web” is vision, not reality.
• How can we integrate without a theory?
• How can we interoperate without a
common framework?
• How can we have a science of DLs if
we lack agreement on definitions (so
we can reason and discuss) and
measures of quality (so we can
compare and improve)?
18
Informal 5S & DL Definitions
DLs are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)
19
5S Layers
Societies
Scenarios
Spaces
Structures
Streams
20
5Ss
Ss / Examples / Objectives
• Streams
– Examples: Text; video; audio; image
– Objectives: Describes properties of the DL content such as encoding and language for
textual material or particular forms of multimedia data
• Structures
– Examples: Collection; catalog; hypertext; document; metadata
– Objectives: Specifies organizational aspects of the DL content
• Spaces
– Examples: Measure; measurable, topological, vector, probabilistic
– Objectives: Defines logical and presentational views of several DL components
• Scenarios
– Examples: Searching, browsing, recommending
– Objectives: Details the behavior of DL services
• Societies
– Examples: Service managers, learners, teachers, etc.
– Objectives: Defines managers, responsible for running DL services; actors, that use
those services; and relationships among them
21
5S Overview
• 5S and Generating DLs
– 5S Framework
– 5S definitions, services taxonomy, ontology
– 5SL
– 5SGraph
– 5SGen (and DL development)
– DL development of union DL, DL integration
– 5SGen into DSpace
• 5S Metamodels
– Minimal DL
– Archaeology DL
– CBIR DL
– Union DL
Streams
Digital Library Content
Content
Types
Text
Documents
Video
Audio
Geographic
Information
Software,
Programs
Bio
Information
Images and
Graphics
Articles,
Reports,
Books
Speech,
Music
(Aerial)
Photos
Models
Simulations
Genome
Human,
animal,
plant
2D, 3D,
VR,
CAT
23
Structure
(Degrees, Terminology)
Web
DLs
DBs
Chaotic
Organized
Structured
24
Digital Objects (DOs)
• Born digital
• Digitized version of “real” object
– Is the DO version the same, better, or worse?
– Decision for ETDs: structured + rendered
• Surrogate for “real” object
– Not covered explicitly in metamodel for a
minimal DL
– Crucial in metamodel for archaeology DL
25
Databases
• 5S perspective: structures, streams,
scenarios
• Extending database technology
• Structured and unstructured info
• Multimedia databases
• Link databases
• Performance, transaction processing
• Replicated storage, rollback/recovery
26
Spaces
User interfaces and visualization
• 2D interfaces
• 3D interfaces
• GIS
• Other paradigms
27
Scenarios
• Services (see later)
• Scenario-based design, use cases
• Functionality
• Representation and processing for
humans and machines
28
Societies
• User communities
– Authors, editors, teachers, students, readers
– Personal(ization), group(ware), community, global
– Accessibility, universal access
• Librarians: reference, acquisition, operations
• Research community
– Associations, conferences, publications, labs, projects
• Economics
– Copyright, intellectual property rights, digital rights
management, authorization, authentication, security,
privacy, self-archiving (eprints)
– Publishers, catalogers, distributors, sustainability
– Open source, commercial, hybrid
29
Higher DL Constructs
• Collections
• Catalogs
• Repositories and Archives
• Services
• Systems
• Case Studies
30
Collections
• Terminology: set, “database”
• Distributed: basis, efficiency/effectiveness
• Parallelism: federation, harvesting
• Scale: object size, compression, replication,
stream splitting
• Intelligence/processing granularity: object,
cluster, collection, repository
31
NSDL Collections
• Discovery of content
• Classification and cataloguing
• Acquisition and/or linking; referencing
• Disciplinary-based themes define a natural body of
content, but other possibilities are also encouraged
• Access to massive real-time or archived datasets
• Software tool suites for analysis, modeling,
simulation, or visualization
• Reviewed commentary on learning materials and
pedagogy
32
Catalogs
• OPACs
• Distributed vs. centralized
• Coverage, breadth
• Specificity, depth
• Management: versioning, works
33
Repositories and Archives
• Naming, identifiers
• Architectures, interoperability
– OAI: harvesting
– SRU/SRW: federation
• Preservation, archives
– LOCKSS, UVC, emulation/migration
• Scalability, storage
• Institutional repositories, Open Access
34
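As a rough illustration of the "OAI: harvesting" bullet, the sketch below builds OAI-PMH `ListRecords` request URLs. The verb and the `metadataPrefix`, `from`, `set`, and `resumptionToken` arguments come from the OAI-PMH protocol; the base URL and the function names are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (string only, no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec:
        params["set"] = set_spec    # selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build the follow-up request for a partial list.

    In OAI-PMH, resumptionToken is an exclusive argument: it is sent
    with the verb only, without metadataPrefix, from, or set.
    """
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


if __name__ == "__main__":
    # Hypothetical repository endpoint, used only for illustration.
    print(list_records_url("http://repository.example.org/oai", from_date="2009-01-01"))
```

A harvester would fetch the first URL, store the returned records, and keep requesting `resume_url(...)` until the response carries no further resumption token.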
Services
• NSDL Services
• Taxonomy of services
• Ontology, composition, reuse
• Evaluation
• Key services in-depth:
– Crawling, indexing
– Clustering, classifying
– Recommending, using social networks
– Logging
35
NSDL Services
• Help services, frequently asked questions, etc.
• Synchronous/asynchronous collaborative
learning environments using shared resources
• Mechanisms for building personal annotated
digital information spaces
• Reliability testing for applets or other digital
learning objects
• Audio, image, and video search capability
• Metadata system translation
• Community feedback mechanisms
36
Infrastructure Services
• Repository-Building
– Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing,
Federating, Harvesting, Purchasing, Submitting
– Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing,
Translating (format)
• Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing,
Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
Information Satisfaction Services
• Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending,
Requesting, Searching, Visualizing
37
Services Ontology: Applications
38
Ontology: Applications
• Expand definition of minimal DL by
characterizing
– typical DL services
– in the context of “employs” and “produces”
relationships
• Use characterization to:
– Reason about how DL services can be built
from other DL components
– As well as be composed with other services
through extension or reuse
39
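One way to picture reasoning over "employs" and "produces" relationships is a small table of services and a check for which service's output can feed another. This is only a sketch: the three services and the components listed for them are illustrative assumptions, not the relationships defined in the 5S services ontology.

```python
# Illustrative (assumed) employs/produces sets; not taken from the ontology itself.
SERVICES = {
    "Indexing": {"employs": {"collection"}, "produces": {"index"}},
    "Searching": {"employs": {"index", "query"}, "produces": {"digital objects"}},
    "Visualizing": {"employs": {"digital objects"}, "produces": {"space"}},
}


def feeds(services):
    """Return sorted (producer, consumer) pairs where the producer's output
    includes at least one component that the consumer employs."""
    pairs = []
    for producer, p_spec in services.items():
        for consumer, c_spec in services.items():
            if producer != consumer and p_spec["produces"] & c_spec["employs"]:
                pairs.append((producer, consumer))
    return sorted(pairs)


if __name__ == "__main__":
    for producer, consumer in feeds(SERVICES):
        print(f"{producer} -> {consumer}")
```

With these assumed sets, Indexing can feed Searching and Searching can feed Visualizing, which is the kind of composition the characterization is meant to support.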
[Figure: services ontology. Infrastructure services (Add_Value: Rating, Indexing, Training)
and information satisfaction services (Browsing, Requesting, Recommending, Searching,
Filtering, Binding, Visualizing, Expanding query) are connected by edges labeled "p" and "e"
to DL components and societal elements, including actor, handle, anchor, classifier, index,
user model, query/category, collection, {digital object}, binder, transformer, space, and
query’. A legend distinguishes fundamental from composite services.]
40
5S and DL formal definitions and compositions (April 2004 TOIS)
relation (d. 1)
sequence graph (d. 6)
(d. 3)
measurable(d.12), measure(d.13), probability (d.14),
language (d.5)
vector (d.15), topological (d.16) spaces
sequence
tuple (d. 4)*
(d.
3)
function
state (d. 18)
event (d.10)
(d. 2)
5S
grammar (d. 7)
streams (d.9)
structures (d.10) spaces (d.18) scenarios (d.21) societies
(d. 24)
services (d.22)
structured
stream (d.29)
digital
object
(d.30)
structural
metadata
specification
(d.25)
transmission collection (d. 31)
(d.23)
repository
(d. 33)
descriptive
metadata
specification
(d.26)
metadata catalog
(d.32)
(d.34)indexing
service
hypertext
(d.36)
browsing
service
(d.37)
digital
library
(minimal) (d. 38)
searching
service (d.35)
41
Streams
image
contains
metadata
specifications


describes
Collection
Catalog
text
audio
video
contains
Structures
is_version_of/
cites/links_to
describes
digital
object
Index
stores
Measurable
is_a
Measure
employs
produces
Topological
Repository
employs
produces
is_a
is_a Vector Metric
Probabilistic
Spaces
employs
produces
inherits_from/includes
runs
Service

extends
reuses
Scenario
precedes
contains
happens_before
event
Scenarios
Societies
Service
Manager
uses
participates_in Actor
recipient

association
operation
executes
42
redefines
invokes
XML-based DL Log Standard
• Log analysis
– is a source of information on:
• How patrons really use DL services
• How systems behave while supporting user information
seeking activities
• Used to:
– Evaluate and enhance services
– Guide allocation of resources
• Common practice in the web setting
– Supported by web servers, proxy caches
• DL Logging can be more detailed
43
The XML Log Format
Log
Transaction SessionId MachineInfo Timestamp
Event
StatusInfo
Search
SearchBy
SessionInfo
RegisterInfo
Timestamp
Statement
Action
Browse
QueryString
Statement
Update
Collection Catalog
StoreSysInfo
Timeout
PresentationInfo
44
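To make the log format more concrete, the following sketch builds one search transaction using element names that appear in the figure (Log, Transaction, SessionId, MachineInfo, Timestamp, Event, Action, Search, Collection, SearchBy, QueryString). The nesting order and the sample values are assumptions made for illustration; the published XML log standard should be consulted for the actual schema.

```python
import xml.etree.ElementTree as ET


def search_transaction(session_id, machine, timestamp, collection, query):
    """Build a <Log> element holding a single search transaction.

    Element names follow the figure; the parent/child arrangement is assumed.
    """
    log = ET.Element("Log")
    tx = ET.SubElement(log, "Transaction")
    ET.SubElement(tx, "SessionId").text = session_id
    ET.SubElement(tx, "MachineInfo").text = machine
    ET.SubElement(tx, "Timestamp").text = timestamp
    event = ET.SubElement(tx, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    ET.SubElement(search, "Collection").text = collection
    search_by = ET.SubElement(search, "SearchBy")
    ET.SubElement(search_by, "QueryString").text = query
    return log


if __name__ == "__main__":
    # Sample values are invented for the example.
    log = search_transaction("s-001", "client.example.org", "2009-06-29T10:00:00",
                             "ETD", "digital library")
    print(ET.tostring(log, encoding="unicode"))
```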
Systems
• Architectures
– Client-server, service-oriented
– P2P, Grid
• System descriptions and comparisons
– Personal DLs; Institutional to global
– DSpace, Eprints, Fedora, Greenstone, Kepler
• ODL
• 5S Suite: language, visualization,
generation, logging
45
Architectural Issues
•
•
•
•
•
Independent system vs. part of federation
Centralized vs. distributed vs. open services
Monolithic vs. modular vs. componentized
Topologies: bus vs. star vs. hierarchical vs. network
Decompositions vary
– search engine, browser, DBMS, MM support
– repository, handle server, client
– information resources + mediators, bus or agent
collection + client with workspace/environment
46
NSDL Information Architecture
Essentially as developed by the Technical Infrastructure Workgroup
Portals &
Portals &
Clients
Portals &
Clients
Clients
User
Interfaces
Core
NSDL
“Bus”
NSDL
NSDL
NSDL
Collections
Collections
Collections
Collection
Building
referenced
referenced
items&&
Special
items
collections
Databases
collections
Core
Core Services:
Collectionmetadata
Building
Core gathering
CollectionServices
protocols
Building
Services
harvesting
NSDL
NSDL
Services
Other
NSDL
Services
Services
Usage
Enhancement
Core
Services:
CI Services
information
retrieval
CI Services
browsing
CI
Services
authentication
CI Services
personalization
CI Services
discussion
annotation
47
5S Modeling -> Systems
represented by
Domain
Concepts
(theory)
instance of
interpreted as
used
to compose
abstracted
from
Modeling
Language
(Meta-Model)
instance of
represented by
DL
Architecture
Model
interpreted as
instance of
instance of
Running
DL
“real” world
object
Actors
Q
“Real”
World
48
Tools/Applications
5S
Meta
Model
DL
Expert
5SGraph
DL
Designer
Practitioner
5SL
DL
Model
Teacher
component
pool
ODLSearch,
ODLBrowse,
ODLRate,
ODLReview,
…….
Researcher
5SLGen
Tailored
DL
Logging Module
XML
Log
49
Formal
Theory/
Metamodel
5S
Requirements
5SGraph
5SL
Analysis
DL XML
Log
5SLGen
OO Classes
Workflow
Design
Components
Implementation
DL
Evaluation
Test
50
5SL: a DL design language
• Domain specific languages
– Address a particular class of problems by offering
specific abstractions and notations for the domain at
hand
– Advantages: domain-specific analysis, program
management, visualization, testing, maintenance,
modeling, and rapid prototyping.
• XML-based realization of 5S
– Interoperability
– Use of many sub-languages (e.g., MIME types, XML
Schemas, UML notations)
51
5SL – The Minimal DL Metamodel
Scenarios
(Meta-) Model
Societal
(Meta-) Model
Meta-Models
Meta-Models
Primitives
uses Actor
runs
Service
Scenario
receiver
Community
Service
Event
Manager
Interface
Manager
Index
Manager
Search
Manager
Collection
Index
User
Repository
Manager
Browsing
Manager
Catalog
Interface
Document
Metadata
Retrieval
Model
Text
Spatial
Stream
(Meta-) Model
(Meta-)Model
Video
Audio
Structural
(Meta-) Model
Image
52
Example of Document declaration in the Structures Model
<document name='ETD'>
  <stream_enumeration>
    <stream value='ETDText'/>
    <stream value='ETDAudio'/>
    ...
  </stream_enumeration>
  <structured_stream>
    %XMLSchema%
  </structured_stream>
</document>
Example of Actors declaration in the Societies Model
<Society>
  <Actor>
    <Community name='Patron'>
      <Attribute name='name' type='String'/>
      <Attribute name='ID' type='Integer'/>
      ...
    </Community>
    <Community name='Student'>
      <Service>Converting</Service>
    </Community>
    <Community name='ETDReviewer'>
      <Service>Reviewing</Service>
    </Community>
    <Community name='ETDCataloguer'>
      <Service>Cataloguing</Service>
    </Community>
  </Actor>
  ………
Example of Service declaration in the Scenario Model
<SERVICE name='Searching'>
  <SCENARIO name='SimpleSearching'>
    <NOTE>Simple scenario for an NDLTD site searching service</NOTE>
    <EVENT>
      <SENDER>Patron</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <OPERATION name='SearchCriteria'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>InterfaceManager</SENDER>
      <RECEIVER>SearchManager</RECEIVER>
      <OPERATION name='Search'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>SearchManager</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <PARAMETER name='Results'>WtdSet</PARAMETER>
    </EVENT>
    ….
53
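A scenario declaration of this kind can be read as an ordered list of messages between actors and service managers. The sketch below parses a well-formed copy of the first two events of the SimpleSearching scenario with Python's standard XML library. It is an illustration of how such a file might be consumed, not the 5SLGen implementation; the function name and the dictionary layout are assumptions.

```python
import xml.etree.ElementTree as ET

# Well-formed excerpt of the scenario shown on the slide (first two events only).
SCENARIO_XML = """
<SERVICE name='Searching'>
  <SCENARIO name='SimpleSearching'>
    <EVENT>
      <SENDER>Patron</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <OPERATION name='SearchCriteria'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>InterfaceManager</SENDER>
      <RECEIVER>SearchManager</RECEIVER>
      <OPERATION name='Search'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
  </SCENARIO>
</SERVICE>
"""


def scenario_events(xml_text):
    """Return the events of a scenario, in document order, as plain dictionaries."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.iter("EVENT"):
        operation = event.find("OPERATION")
        events.append({
            "sender": event.findtext("SENDER"),
            "receiver": event.findtext("RECEIVER"),
            # Some events (e.g., result messages) carry no OPERATION element.
            "operation": operation.get("name") if operation is not None else None,
            "parameters": [p.text for p in event.findall("PARAMETER")],
        })
    return events


if __name__ == "__main__":
    for e in scenario_events(SCENARIO_XML):
        print(f"{e['sender']} -> {e['receiver']}: {e['operation']}({', '.join(e['parameters'])})")
```

Reading the events in order gives the message sequence from which a generator could derive a state machine or controller for the service.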
5SGraph: A DL Modeling Tool
• Help users model their own instances of a
digital library (DL) in the 5S language (5SL).
• A simple modeling process which enables rapid
generation of digital libraries
• Features
– 5SGraph loads and displays a metamodel in a
structured toolbox.
– The structured editor of 5SGraph provides a top-down
visual building environment for the DL designer.
– 5SGraph produces syntactically correct 5SL files
according to the visual model built by the designer.
54
Overview of 5SGraph
Workspace
(instance model)
Structured
toolbox
(metamodel)
55
56
5SGen
• Version 1 -- MARIAN as the target system
– Focused on rich structures: semantic networks
– Behavior attached to nodes/links
• Version 2 -- Shifted for later work to
componentized (ODL) approach
– Focused on scenarios/societies
– Structures/Spaces encapsulated within components
(e.g., relational tables, indexes)
– Only textual streams supported
• Version 3 – Practical DL (w. DSpace) –
Doug Gorton
57
5SLGen – Version 2: ODL,
Services, Scenarios
5SL-Scenario
Model (6)
DL
Designer
Component
Pool
XMI:Class
Model (3)
ODL
Search
Wrapping
Wrapping
import
import
Scenario
Synthesis (9)
Deterministic
FSM (10)
Xmi2Java (4)
Java
Classes
Model (5)
DL
Designer
StateChart
Model (8)
5SLGen
Java
ODL
Browse
XPath/JDOM
Transform (7)
XPATH/JDOM
Transform (2)
.
.
.
Java
5SL-Societies
Model (1)
SMC (11)
superclass
Java
Finite
State Machine
Class
Controller (12)
binds
JSP
User
Interface
View (13)
58
Generated DL Services
Requirements (1)
5S
Meta
Model
DL
Expert
Analysis (2)
DL
Designer
5SGraph
Practitioner
5SL
DL
Model
component
pool
ODLSearch,
ODLBrowse,
ODLRate,
ODLReview,
…….
Teacher
Design (3)
Researcher
Tailored
DL
Services
5SLGen
Implementation (4)
5SSuite
5SGraph
5SGen
Mapping Tool
59
Describing Quality in
Digital Libraries
• What’s a “good” digital Library?
– Central Concept: Quality!
– Hypotheses of this work:
• Formal theory can help to define “what’s a good
digital library” by:
• New formalizations of quality indicators for DLs
within our 5S framework
• Contextualizing these measures within the
Information Life Cycle
60
Quality and the Information Life Cycle
[Figure: the information life cycle annotated with quality dimensions.
Phases: Active, Semi-Active, Inactive.
Activities: Creation, Authoring, Modifying, Describing, Organizing, Indexing, Storing,
Accessing, Filtering, Archiving, Distribution, Networking, Seeking, Searching, Browsing,
Recommending, Utilization, Retention, Mining, Discard.
Quality dimensions: Accuracy, Completeness, Conformance, Timeliness, Similarity,
Preservability, Pertinence, Significance, Accessibility, Relevance.]
61
Quality Dimensions
DL Concept: Dimensions of Quality
• Digital object: Accessibility, Pertinence, Preservability, Relevance, Similarity,
Significance, Timeliness
• Metadata specification: Accuracy, Completeness, Conformance
• Collection: Completeness, Impact Factor
• Catalog: Completeness, Consistency
• Repository: Completeness, Consistency
• Services: Composability, Efficiency, Effectiveness, Extensibility, Reusability,
Reliability
62
Services: Efficiency / Effectiveness
• Effectiveness
– Very common measures: Precision, Recall, F1, 10-precision, R-Precision
– Other services may have different measures: e.g.,
Recommending, etc.
• Efficiency
– let $t(e)$ be the time of an event $e$
– let $e_{i_x}$ and $e_{f_x}$ be the initial and the final event of
service $se_x$.
– For service $se_x$, efficiency is defined as:
• $\mathrm{Efficiency}(se_x) = t(e_{f_x}) - t(e_{i_x})$
63
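The measures above can be sketched in a few lines. Precision, recall, and F1 follow their standard set-based definitions, and efficiency is the time of the final event minus the time of the initial event. The function names, the event tuple layout, and the sample data are assumptions made for illustration.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def efficiency(events):
    """Efficiency of one service execution: t(final event) - t(initial event).

    `events` is a list of (name, timestamp) pairs in execution order;
    a smaller value means a faster service.
    """
    first_time = events[0][1]
    final_time = events[-1][1]
    return final_time - first_time


if __name__ == "__main__":
    p, r, f1 = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
    print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
    print(efficiency([("query_received", 10.0), ("results_sent", 10.25)]))
```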
DL Integration
• What is “DL Integration”
– Hide distribution
– Hide heterogeneity
– Enable autonomy of individual component
• Why Integration
– island-DLs
– inability to seamlessly and transparently
access knowledge across DLs
Utilize various autonomous DLs in concert
64
Integration: Urgency, Longevity
• If we collect, capture, acquire, or produce
information, will it be usable in 100 years?
• NSF Digital Archiving Program
• Library of Congress National Digital
Information Infrastructure and
Preservation Program
65
DL integration formalization
based on
DL interoperability approach
Consists of
Intermediary-based
Interrelated with
mapping-based
use
mediator
wrapper
use
agent
schema mapping
used in
two architectures
Consists of
federation
Union Archiving
use
hybrid mapper
composite mapper
trained by
GA
66
Union DL Definitions
• A Minimal Union Digital Library integrated from n
DLs is given as a four-tuple:
MinUnionDL=(Union Repository, Union
Catalog, Minimal Union Services, Union
Society).
• DL Integration Problem Definition: Given n
individual digital libraries (DL1, DL2, …, DLn),
each defined as described above, to integrate
the n DLs is to create a Union DL.
Union Catalog Quality Measurement
• Complete
– All the catalogs to be integrated are complete.
• Consistent
– All the catalogs to be integrated are consistent.
– Each descriptive metadata specification in the
union catalog describes only one digital object.
68
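The four-tuple and the catalog checks can be pictured with a small data structure. This is a minimal sketch under stated assumptions: the field types, the mapping from metadata specification IDs to digital object IDs, and both function names are hypothetical, and "complete" is read here as "every object in the repository is described by at least one specification", which is one interpretation rather than the formal definition.

```python
from dataclasses import dataclass


@dataclass
class MinUnionDL:
    """MinUnionDL = (Union Repository, Union Catalog, Minimal Union Services, Union Society)."""
    union_repository: set        # digital object ids
    union_catalog: dict          # metadata spec id -> set of digital object ids it describes
    minimal_union_services: set  # service names
    union_society: set           # community names


def inconsistent_specs(catalog):
    """Specs that violate 'each specification describes only one digital object'."""
    return sorted(spec for spec, objects in catalog.items() if len(objects) != 1)


def undescribed_objects(repository, catalog):
    """Objects with no metadata specification (assumed reading of completeness)."""
    described = set().union(*catalog.values()) if catalog else set()
    return sorted(repository - described)


if __name__ == "__main__":
    # Sample identifiers are invented for the example.
    dl = MinUnionDL(
        union_repository={"obj1", "obj2", "obj3"},
        union_catalog={"md1": {"obj1"}, "md2": {"obj2", "obj3"}},
        minimal_union_services={"Searching", "Browsing"},
        union_society={"Archaeologists", "General Public"},
    )
    print(inconsistent_specs(dl.union_catalog))
    print(undescribed_objects(dl.union_repository, dl.union_catalog))
```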
Member DLs of ETANA-DL
• Member DLs: Lahav, Madaba, Megiddo, Umayri, …
• Each member DL comprises:
– Society: Archaeologists
– Service: Database Searching and Browsing
– Catalog
– Repository
Architecture of ETANA-DL, with
centralized catalog and partially
decentralized repository
Union Society

Archaeologists
General Public
Union Services
Harvesting, Mapping
Searching, Browsing, Recommendation,
Annotation, Object Comparison, Object Sharing
Binding, Visualization
Union Catalog
Union Repository
Mapping confirmation
Mapping history
71
Union Catalog Integration
Virtual Nimrin
(VN)
VN Metadata
Format
Mapping
Tool
Union ArchDL
VN
Catalog
Halif DigMaster
(HD)
Wrapper
Union
Catalog
HD
Catalog
Global Metadata
Format
Wrapper
HD Metadata
Format
Mapping
Tool
72
ArchDL Expert
5S Archaeology
MetaModel
ArchDL Designer
5SGraph
VN Metadata Format
Scenario
Sub-model
ETANA-DL
Union Services
Descriptions
ETANA-DL Metadata Format
VN
Catalog
HD
Catalog
Mapping Tool
Wrapper4VN
Harvesting
Mapping
Searching
Browsing
…
Wrapper4HD
Structure
Inverted FilesSub-model
Search
Service
XOAI
Browse DB
Browse
Service
Component
Pool
Services DB
5SGen
Other
XOAI
ETANA-DL
Services
Web Interface
Union
Catalog
Browsing
…
HD Metadata Format
73
5S definitional structure
Streams
Structured
Stream
Structures
Spaces
Structural
Metadata
Specification
Scenarios
Societies
services
Descriptive
Metadata
Specification
indexing
browsing searching
hypertext
Digital Object
Collection
Metadata Catalog
Repository
Minimal DL
Minimal archaeological DL in the
5S framework
(A.i is from minimal DL, j is new)
[Figure: definitional structure of the minimal archaeological DL.
Concepts labeled A.1 to A.12 (from the minimal DL): Streams, Structures, Spaces, Scenarios,
Societies, Structured Stream, Descriptive Metadata specification, services, indexing,
browsing, searching, hypertext.
New concepts labeled 1 to 10: SpaTemOrg, StraDia, ArchDescriptive Metadata specification,
ArchObj, ArchDO, ArchMetadata catalog, ArchColl, ArchDColl, ArchDR, Minimal ArchDL.]
Minimal CBIR DL
Stream
Image
Stream
Space
Feature
Vector
Image
Descriptor
Composite
Descriptor
Structure
Service
Society
KNNQ
User Info
Need
Structured
Feature
Vector
Image
Content
Description
Image
Object
Visualization
Operation
Image
Digital
Object
Image Descriptor
Metadata Catalog
Image
Collection
RQ
Content-based Image
Searching Service
DL Ref. Model Concepts - 5S (see II.4.2)
• User -> Societies
– Human and machine actors
– End-users, Designers, Administrators,
Application Developers + Librarians (DL curric)
• Content -> Streams, Structures
• Functionality -> Services -> Scenarios
• Quality -> Services (recall 5SQual)
• Policy -> Scenarios, Societies
• Architecture -> Scenarios, Structures, Spaces
(components, protocols, standards, specs)
77
International Repository Infrastructure
Workshop (Amsterdam, Mar 16-17, 2009)
• How can we strengthen the infrastructure
for repositories: key solvable problems:
• Citation services - making citation data
more easily available from repositories
• Repository handshake – talking to each
other, user deposit into several at once
• Interoperable identification infrastructure –
unambiguous people, documents (FRBR)
78
International Repository Infrastructure
Workshop – and DL.org
• How are these 2 related?
• Can we learn from the Amsterdam
meeting and focus on some important and
solvable issues immediately?
79
Discussion Topics
• Faced in MARIAN, NCSTRL, CITIDEL,
Ensemble, NSDL, ETANA
• Already solved: OAI-PMH
• Focus
– Superimposed information / annotation
– Citation information
• Approaches
– 5S: 5SL, 5SGen, 5SQual
– XML representations
– Protocols (VIDI)
80
Summary
• Contextual Background
– DL Definitions, Scope
– DL Curricula Efforts
– Interoperability Approaches
• 5S
• 5S Services Work
• International Repository Infrastructure
Workshop (Amsterdam, Mar 16-17, 2009)
• Discussion Topics
81
Questions?
Discussion?
Thank You!
82