20070214DELOSkeynotePisa - Edward A. Fox

DELOS Conference
(Pisa, Italy –14 Feb 2007)
Digital Libraries:
From Proposals to Projects
to Systems to Theory
to Curricula
Edward A. Fox
Virginia Tech
Blacksburg, VA 24061 USA
1
Outline
• Acknowledgments
• Introduction
• Proposals
• Projects
• Systems
• Theory
• Curricula
• Examples
• Summary
• Discussion
2
Acknowledgements
• Students
• Faculty, Staff
• Collaborators
• Support
• Mentors
3
Acknowledgements: Students
• Pavel Calado, Yuxin Chen, Fernando Das Neves,
Shahrooz Feizabadi, Robert France, Marcos
Gonçalves, Doug Gorton, Nithiwat Kampanya,
Rohit Kelapure, S.H. Kim, Neill Kipp, Aaron
Krowne, Bing Liu, Ming Luo, Paul Mather, Uma
Murthy, Sanghee Oh, Ananth Raghavan, Unni
Ravindranathan, Ryan Richardson, Rao Shen,
Ohm Sornil, Hussein Suleman, Ricardo da Silva
Torres, Srinivas Vemuri, Wensi Xi, Seungwon
Yang, Baoping Zhang, Qinwei Zhu, …
4
Acknowledgements: Faculty, Staff
• Lillian Cassel, Lois Delcambre, Debra Dudley,
Roger Ehrich, Joanne Eustis, Weiguo Fan,
James Flanagan, C. Lee Giles, Sandy Grant,
Eric Hallerman, Eberhard Hilf, John
Impagliazzo, Filip Jagodzinski, Douglas Knight,
Deborah Knox, Alberto Laender, David Maier,
Gail McMillan, Claudia Medeiros, Manuel
Perez-Quinones, Jeff Pomerantz, Naren
Ramakrishnan, Layne Watson, Barbara
Wildemuth, …
5
Other Collaborators (Selected)
• Brazil: FUA, UFMG, UNICAMP
• Case Western Reserve University
• Emory, Notre Dame, Oregon State
• Germany: Univ. Oldenburg
• Mexico: UDLA (Puebla), Monterrey
• College of NJ, Hofstra, Penn State, Villanova
• Portland State University
• University of Arizona, University of Florida, Univ. of Illinois, University of Virginia
• VTLS (slides on digital repositories, NDLTD)
6
Acknowledgements: Support
ACM, Adobe, AOL, CAPES, CNI,
CONACyT, DFG, IBM, IMLS, Microsoft,
NASA, NDLTD, NLM, NSF (IIS-9986089,
0080748, 0086227, 0307867, 0325579,
0532825, 0535057, 0535060; ITR-0325579; DUE-0121679, 0121741,
0136690, 0333531, 0333601, 0435059),
OCLC, SOLINET, SUN, SURA, UNESCO,
US Dept. Ed. (FIPSE), VTLS, …
Acknowledgements - Mentors
• JCR Licklider – undergrad advisor (1969-71)
– Author in 1965 of “Libraries of the Future”
– Before, at ARPA, funded start of Internet
• Michael Kessler – BS thesis advisor
– Project TIP (technical information project)
– Defined bibliographic coupling
• Gerard Salton – graduate advisor (1978-83)
– “Father of Information Retrieval”
– Application of Scientific Methods toward Integration of
Theory, Systems, Experiments, and Education
8
Libraries of the Future
JCR Licklider, 1965, MIT Press
World
Nation
State
City
Community
9
Introduction – Mentor Challenges
• Scientific method
– “Leonardo da Vinci: The first scientist”
• Theory-based -> integration
– Across computing disciplines
– Over content, representations, services
• Experimentally proven
– Evaluation: formative, summative
• Practically useful and beneficial
– Make the world better (smaller)
– Task support, effectiveness, efficiency
10
Digital Libraries --- Objectives
• World Lit.: 24hr / 7day / from desktop
• Integrated “super” information systems: 5S:
Table of related areas and their coverage
• Ubiquitous, Higher Quality, Lower Cost
• Education, Knowledge Sharing, Discovery
• Disintermediation -> Collaboration
• Universities Reclaim Property
• Interactive Courseware, Student Works
• Scalable, Sustainable, Usable, Useful
Digital Libraries
Shorten the Chain from
Editor
Reviewer
Publisher
A&I
Consolidator
Library
12
DLs Shorten the Chain to
Author
Teacher
Digital
Reader
Editor
Reviewer
Learner
Library
Librarian
13
Introduction – 1991 Workshop
• ACM SIGIR ’91 (Chicago)
• Workshop on Future Directions in IR
• Report planning with
– Michael McGill
– Michael Lesk
• How can we accomplish something?
– Address society’s needs
• What if all undergrads had info. access?
• Funding lobbying leading to: DLI, NSDL
14
15
Locating Digital Libraries in Computing and Communications Technology Space
[Figure: axes are Computing (flops), Communications (bandwidth, connectivity), and Digital content (less to more). The Digital Libraries technology trajectory: intellectual access to globally distributed information]
Note: we should consider 4 dimensions: computing, communications, content, and community (people)
Challenges, Apps, Projects
• US-Korea Collaboration on DLs Workshop
• Reagan Moore and Ed Fox report
• Chart Headings:
– Application Domain
– Related Institutions
– Examples
– Technical Challenges
– Benefit/Impact
17
Source: Reagan Moore and Ed Fox, June 2002, for NSF
Application Domain | Related Institutions | Examples | Technical Challenges | Benefit / Impact
Publishing | Publishers, Eprint archives | OAI | Quality control, openness | Aggregation, organization
Education | Schools, colleges, universities | NSDL, NCSTRL | Knowledge management, reusability | Access to data
Art, Culture | Museum | AMICO, PRDLA | Digitization, describing, cataloging | Global understanding
Science | Government, Academia, Commerce | NVO, PDG, SwissProt, UK eScience, European Union Commission | Data models | Reproducibility, faster reuse, faster advance
(e) Government | Government Agencies (all levels) | Census | Intellectual property rights, privacy, multi-national | Accountability, homeland security
(e) Commerce, (e) Industry | Legal institutions | Court cases, patents | Developing standards | Standardization, economic development
History, Heritage | Foundations | American Memory | Content, context, interpretation | Long term view, perspective, documentation, recording, facilitating, interpretation, understanding
Crosscutting | Library, Archive | Web, personal collections | Multi-language, preservation, scalability, interoperability, dynamic behavior, workflow, sustainability, ontologies, distributed data, infrastructure | Reduced cost, increased access, preservation, democratization, leveling, peace, competitiveness
18
Introduction – Alliteration
• 5S
– Societies
– Scenarios
– Spaces
– Structures
– Streams
• 3C
– Content
– Context
– Criticism, commentary
19
Introduction – Alliteration
• 5S
– Societies
• Users
• Collaboration, Web 2.0
– Scenarios
• Workflow, Stories
• Services, Components
– Spaces: GIS
– Structures: DBMS
– Streams: DSMS
• 3C
– Content
• Content Management Systems
– Context
• Link Structure
• NLP
• Mental models
– Criticism, commentary
• Annotation, Talmud
• Cataloging, indexing
• Abstracting
• Summarizing
• Secondary literature
20
Introduction – Time to:
• Treat DL as a serious field
• Achieve balance
– Research & Development
– Systems & Services
– Practice, Continuous Quality Improvement
– Use, Benefit
• Train digital librarians
• Achieve sustainability
21
Introduction - Approach
1. Proposals -> Vision
2. Projects -> Objectives
3. Systems -> Generality
4. Theory -> Abstraction, conceptualization
5. Curricula -> Education
– Structure
– Pedagogy
22
Introduction - Proposals
• Early visions
• Providing rationale for funding, programs
• USA
• Europe
• India, China, New Zealand, Australia, …
• Sustainability, follow-on
• Technology transfer
– Stanford DLI-1 -> Google
23
Introduction - Projects
• Body of information
• Media type (maps, video, speech, photos)
• Representation (DC, METS, FRBR)
• Architecture (SOA)
• Interoperability (OAI)
• Archiving and Preservation (UVC)
• Devices (SenseCam, PIM)
• Links with other fields
24
Introduction – Projects -2
• Body of information
– Person’s works (Cervantes)
– Content by organization
• Library (Library of Congress)
• Publisher (ACM)
• Million books project
• Google consortium
– Content by discipline (Physics, CS, Archaeology)
– Content by genre (ETDs)
– Content by target audience (TEL, Learners)
25
NSDL Information Architecture
Essentially as developed by the Technical Infrastructure Workgroup
[Figure: layered architecture. User Interfaces: Portals & Clients. Core: the NSDL “Bus”. Services: Usage Enhancement and other NSDL services. Core Services: Collection Building (metadata gathering, protocols, harvesting) and CI Services (information retrieval, browsing, authentication, personalization, discussion, annotation). Content: NSDL Collections, Special Databases, and referenced items & collections]
26
Digital Library Content
Content Types
• Text Documents: Articles, Reports, Books
• Video
• Audio: Speech, Music
• Geographic Information: (Aerial) Photos
• Software, Programs: Models, Simulations
• Bio Information: Genome (Human, animal, plant)
• Images and Graphics: 2D, 3D, VR, CAT
27
Introduction – Projects - 5
• Links with other fields
– Art, sculpture, music, speech
– Medicine: images, datasets, genomics
– Law, government
• Statutes, regulations
• Citations, commentaries
– Supercomputers, Grid
– HCI, Cognitive Psychology
– IR, HT, MM
28
CC2001 Information Management Areas
IM1. Information models and systems*
IM2. Database systems*
IM3. Data modeling*
IM4. Relational DBs
IM5. Database query languages
IM6. Relational DB design
IM7. Transaction processing
IM8. Distributed DBs
IM9. Physical DB design
IM10. Data mining
IM11. Information storage and retrieval
IM12. Hypertext and hypermedia
IM13. Multimedia information & systems
IM14. Digital libraries
* Core components
29
Introduction - Systems
• IBM DL -> content management system
• MARIAN, ODL, WS-ODL
• Greenstone
• DSpace
• Fedora
• DELOS
– DLMS
– ISIS & OSIRIS
30
Introduction - Theory
• Definitions: Key ideas, concepts
• Taxonomy: Groups, clusters
• Abstraction/generalization: Components
• Models, metamodels
• Proofs: relationships, improvements
• Uses, benefits
– Interoperability (map, wrap, mediate, harvest)
• User interface: Explore: browse/search/visualize
– Automation (lex/yacc -> 5SGraph, 5SGen)
31
Introduction - Curricula
• Audience
– LIKES, LIS, CS
– Developer, implementer, systems librarian
– D. Librarian (reference, coll. development)
• Core
• Tracks
– Libraries: public, school/univ., corporation
– Cultural heritage
– Science (research, education)
– Persons (PIM)
32
Living In the KnowlEdge Society (LIKES):
Core surrounded by enabling computing concepts
and problem providing disciplines
[Figure: “Knowledge Society” at the center, surrounded by these labels: Economics, Math, Political Science, Architecture, Marketing, Biology, Algorithms, HCI, Sociology, Visualization, Geography, Database, Social & Ethical, Chemistry, Intelligent Systems, Finance, Systems Analysis & Design, Physics, Art, Simulation, Programming, Music, Knowledge Management, Architecture, History, Psychology, Net-Centricity, Healthcare, Engineering, Modeling, Communications, Library & Information Science, English]
33
DL Curricula
• “Curriculum Development for Digital
Libraries” – NSF grant to VT, UNC-CH
• Studied body of literature
• Modules: core, related
• Invite collaboration worldwide
34
Digital Librarian:
Needed Skills and Knowledge
• Choi, Y., & Rasmussen, E. (2006)
• What is needed to educate future digital
librarians: A study of current practice and
staffing patterns in academic and research
libraries.
• D-Lib Magazine, 12(9)
• doi:10.1045/september2006-choi.
35
D.Librarian Skills & Knowledge:
Technology Related
• DL architecture and software
• Technical and quality standards
• Web markup languages
• Database development and DBMS
• Web design skills
36
D.Librarian Skills & Knowledge:
Library Related
• The needs of users
• Digital archiving and preservation
• Cataloging, metadata
• Indexing
• Collection development
37
D.Librarian Skills & Knowledge:
Other
• Communication and interpersonal skills
• Project management and leadership skills
• Legal issues
• Grant/proposal writing skills
• Teaching and group presentation skills
38
Development & Evaluation Process
Vision/plan
· From research team (VT & UNC)
· From current courses at VT & UNC
· From Advisory Board
· From CC 2001
Analyze
· Specific strengths
· Specific weaknesses
· CC 2001 context
· Curricular needs
· Student background
Design
· Modules
· Lessons
Evaluate
· Inspection by Advisory Board
· Inspection by external experts
· Inspection by Doctoral Consortium participants
Revise & Implement
· At UNC & VT
· At additional universities (in CS & LIS programs)
Evaluate in the field
· Teacher perceptions
· Student perceptions
· Student outcomes
Products
· Modules ready for use
· Lessons ready for use
Feedback
39
Figure 1. Curriculum framework
Column headings: COURSE STRUCTURE, CORE DL TOPICS, RELATED TOPICS
Semester 1: DL collections: development/creation
Semester 2: DL services and sustainability
Module 1: Digitization, Storage, Interchange
Module 2: Digital objects, Composites, Packages
Module 3: Metadata, Cataloging, Author submission
Module 4: Naming, Repositories, Archives
Module 5: Spaces (conceptual, geographic, 2/3D, VR)
Module 6 (shown twice in the figure): Architectures (agents, buses, wrappers/mediators), Interoperability
Module 7: Services (searching, linking, browsing, etc.)
Module 8: Intellectual property rights management, Privacy, Protection (watermarking)
Module 9: Archiving and preservation, Integrity
Module 10: Multimedia streams/structures, Capture/representation, Compression/coding
Module 11: Content-based analysis, Multimedia indexing and retrieval
Module 12: Multimedia presentation and rendering
Module 13: Documents, E-publishing, Markup
Module 14: Info. needs, Relevance, Evaluation, Effectiveness
Module 15: Thesauri, Ontologies, Classification, Categorization
Module 16: Bibliographic information, Bibliometrics, Citations
Module 17: Routing, Filtering, Community filtering
Module 18: Search & search strategy, Info seeking behavior, User modeling, Feedback
Module 19: Information summarization, Visualization
40
Modules
1. Collection Development
2. Digital objects / Composites / Packages
3. Metadata, Cataloging, Author submission
4. Architecture, Interoperability
5. Data visualization
6. Services
7. Intellectual property rights management, Privacy, Protection
8. Social issues / Future of DLs
9. Archiving and Preservation
41
Conference papers x modules
[Chart: number of conference papers (axis 0 to 200) per Module ID (1 to 9), stacked by conference: ACM DL 96, ACM DL 97, ACM DL 98, ACM DL 99, ACM DL 00, JCDL 01, JCDL 02, JCDL 03, JCDL 04, JCDL 05]
42
Taxonomy of DL Educational Resources
43
CORE TOPICS
1 Overview
1-a (10-c): Conceptual frameworks, theories
2 Collection Development
2-a: Collection development/selection policies
2-b: Digitization
2-c: Harvesting
2-d: Document and e-publishing/presentation markup
3 Digital Objects
3-a: Text resources
3-b: Multimedia
3-c (8-b): File formats, transformation, migration
4 Info/ Knowledge Organization
4-a: Metadata, cataloging, metadata markup, metadata harvesting
4-b: Ontologies, classification, categorization
4-c: Vocabulary control, thesauri, terminologies
4-d: Subject description
4-e: Information architecture (e.g., hypertext, hypermedia)
4-f: Object description and organization for a specific domain
5 Architecture (agents, mediators)
5-a: Architecture overviews/models
5-b: Applications
5-c: Identifiers, handles, DOI, PURL
5-d: Protocols
5-e: Interoperability
5-f: Security
6 User Behavior/ Interactions
6-a: Info needs, relevance, evaluation
6-b: Search strategy, info seeking behavior, user modeling
6-c: Sharing, networking, interchange (e.g., social)
6-d: Interaction design, info summarization and visualization, usability assessment
7 Services
7-a: Search engines, IR, indexing methods
7-b: Reference services
7-c: Recommender systems
7-d: Routing, community filtering
7-e: Web publishing (e.g., wiki, rss, Moodle, etc.)
8 Archiving and Preservation Integrity
8-a: Repositories, archives, storage
8-b (3-c): File formats, transformation, migration
8-c: Sustainability
9 Management and Evaluation
9-a: Project management
9-b: DL case studies
9-c: DL evaluation
9-d: Usability assessment, user studies
9-e: Bibliometrics, Webometrics
9-f: Legal issues (e.g., copyright)
9-g: Cost/economic issues
9-h: Social issues
10 DL education and research
10-a: Future of DLs
10-b: Education for digital librarians
10-c (1-a): Conceptual framework, theories
10-d: DL research initiatives
44
1
Overview
1-a (10-c): Conceptual frameworks, theories
45
2
Collection
Development
2-a: Collection development/selection policies
2-b: Digitization
2-c: Harvesting
2-d: Document and e-publishing/presentation markup
46
3
Digital Objects
3-a: Text resources
3-b: Multimedia
3-c (8-b): File formats, transformation, migration
47
4
Info/ Knowledge
Organization
4-a: Metadata, cataloging, metadata markup, metadata
harvesting
4-b: Ontologies, classification, categorization
4-c: Vocabulary control, thesauri, terminologies
4-d: Subject description
4-e: Information architecture (e.g., hypertext, hypermedia)
4-f: Object description and organization for a specific domain
48
5
Architecture
(agents, mediators)
5-a: Architecture overviews/models
5-b: Applications
5-c: Identifiers, handles, DOI, PURL
5-d: Protocols
5-e: Interoperability
5-f: Security
49
6
User Behavior/
Interactions
6-a: Info needs, relevance, evaluation
6-b: Search strategy, info seeking behavior, user modeling
6-c: Sharing, networking, interchange (e.g., social)
6-d: Interaction design, info summarization and visualization,
usability assessment
50
7
Services
7-a: Search engines, IR, indexing methods
7-b: Reference services
7-c: Recommender systems
7-d: Routing, community filtering
7-e: Web publishing (e.g., wiki, rss, Moodle, etc.)
51
8
Archiving and
Preservation
Integrity
8-a: Repositories, archives, storage
8-b (3-c): File formats, transformation, migration
8-c: Sustainability
52
9
Management and
Evaluation
9-a: Project management
9-b: DL case studies
9-c: DL evaluation
9-d: Usability assessment, user studies
9-e: Bibliometrics, Webometrics
9-f: Legal issues (e.g., copyright)
9-g: Cost/economic issues
9-h: Social issues
53
10
DL education
and research
10-a: Future of DLs
10-b: Education for digital librarians
10-c (1-a): Conceptual framework, theories
10-d: DL research initiatives
54
Personalizing A Course
Website Using the NSDL
William Cameron (2), Boots Cassel (2),
Edward Fox (1), Manuel Perez-Quinones (1),
Manas Tungare (1), Xiaoyan Yu (1)
(1) Virginia Tech, (2) Villanova
55
Syllabus Collection …
Towards an intelligent educational system
[Figure: pipeline components: Crawler, Potential Syllabus Text, Syllabus Classifier, Unstructured Syllabus Text, Extractor, Structured Syllabus Text, Syllabus Ontology, Classification Scheme, Resource Classifier, Other NSDL Resources. Services: Publisher, Recommender, Searcher, Editor]
56
Syllabus Ontology
• Standard, machine understandable
• Ontology Editor: Protégé
• Syllabus Schema: SylVia
• http://doc.cs.vt.edu/ontologies/
57
Creating new syllabus
• Web-based application to support entry of syllabi into collection
• Moodle Plug-in in the works
• Uses CC 2001 to select topics for a course
58
Example: CBIR + SI
• Integration of
– CBIR
– Superimposed information (annotations …)
• Application to
– Biodiversity, fisheries and wildlife
– Archaeology
• Systems
– CBISC, SIMPEL, SIERRA
59
EKEY: The electronic key for identifying
freshwater fishes
60
Biodiversity Information Systems
• Retrieve descriptions of all fish whose shape is similar to that shown in the figure below, that belong to genus “Notropis”, that have “large eyes” and a “dorsal stripe”, and that have been observed within the catchments of the “Tennessee” river
61
Here is another scenario …
• An archeologist wants to write commentaries on artifacts discovered in the field
• Using an Archeology digital library in his study, he wants to be able to:
– Manually annotate images (and parts)
– Search for images (and parts), and annotations
– Automatically annotate/tag similar images (and parts)
– Share annotations and images
Source: http://www.bewegende-plaatjes.net
Sources: http://www.dorsetforyou.com, http://www.archaeology.org
62
Functionality required
• Digital Library (DL) users need, but get little, assistance with these tasks:
– Selecting and annotating images and parts of images
• Preserve original context of information
• Manual and automated annotation
– Content-based image retrieval of images and parts of images
– Combined text- and content-based image retrieval of images and parts of images
– Sharing selections and annotations
63
Layers in an SI system
[Figure: a Superimposed Layer sits above a Base Layer of Information Sources 1, 2, …, n; the two layers are connected by marks]
* Source: ICDE04 presentation by Murthy et al.
64
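The layering above can be sketched as a small data model. This is a minimal illustration, not the design of any system named in the talk: the `Mark` and `SuperimposedItem` classes, the `resolve` helper, and the sample text are all hypothetical, and a mark is assumed to be a character range in a text source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mark:
    """Hypothetical mark: addresses a character range inside one base source."""
    source_id: str
    start: int
    end: int


@dataclass
class SuperimposedItem:
    """Information in the superimposed layer, tied to the base layer by a mark."""
    note: str
    mark: Mark


def resolve(mark, sources):
    """Follow a mark into the base layer and return the excerpt it addresses."""
    return sources[mark.source_id][mark.start:mark.end]


# Base layer: the information sources stay unchanged and are only referenced.
base_layer = {"report.txt": "Notropis species have large eyes and a dorsal stripe."}

phrase = "large eyes"
start = base_layer["report.txt"].index(phrase)
item = SuperimposedItem(
    note="diagnostic trait",
    mark=Mark("report.txt", start, start + len(phrase)),
)
excerpt = resolve(item.mark, base_layer)
```

Keeping the note separate from the source means the base information keeps its original context, and the same source can carry any number of superimposed items.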
Superimposed Applications
Enhanced CMapTools
SIMPEL: A SuperImposed Multimedia Presentation Editor and pLayer
65
Content-Based Image Retrieval (CBIR)
• Retrieve images similar to a user-defined
specification or pattern (e.g., shape sketch,
image example)
• Goal: To support image retrieval based on
content properties (e.g., shape, color or
texture), usually encoded into feature
vectors
66
Effective Image Descriptor
Feature Vector
67
Image descriptors
• Image Descriptor
Example: Histogram
• Frequency count of each individual color
• Most commonly used color feature
representation
Image
Corresponding histogram
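A color histogram of this kind can be sketched in a few lines. The function name `color_histogram`, the quantization into `levels` bins per channel, and the four-pixel sample image are illustrative assumptions, not details taken from the slides.

```python
from collections import Counter


def color_histogram(pixels, levels=4):
    """Return the relative frequency of each quantized RGB color.

    pixels is a list of (r, g, b) tuples with channel values in 0..255.
    Each channel is reduced to `levels` bins so similar colors share a bucket.
    """
    if not pixels:
        return {}
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}


# Tiny illustrative "image": two reddish pixels, one blue, one green.
image = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 255, 0)]
hist = color_histogram(image)
```

Normalizing by the pixel count makes histograms of differently sized images comparable, which matters once they are used as feature vectors.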
69
Source: Andrade, D.
Texture Descriptors
70
A typical CBIR system
[Figure: the Interface offers Data Insertion, Query Specification, and Visualization. The Query-processing Module takes a Query Pattern through Feature Vector Extraction, Similarity Computation, and Ranking, and returns Similar Images. The Image Database stores Images and their Feature Vectors]
71
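The query-processing path of such a system (extract a feature vector for the query, compute similarity against stored vectors, rank) can be sketched as follows. This is a minimal sketch under stated assumptions: Euclidean distance stands in for whatever similarity function a real descriptor would use, and the file names and three-dimensional vectors are made up for illustration.

```python
import math


def euclidean(u, v):
    """Distance between two equal-length feature vectors (smaller = more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_similar(query_vector, database, k=3):
    """Return the ids of the k stored images closest to the query vector."""
    scored = sorted(
        database.items(),
        key=lambda item: euclidean(query_vector, item[1]),
    )
    return [image_id for image_id, _ in scored[:k]]


# Hypothetical image database: id -> precomputed feature vector.
database = {
    "fish_a.png": [0.9, 0.1, 0.0],
    "fish_b.png": [0.2, 0.7, 0.1],
    "fish_c.png": [0.8, 0.2, 0.0],
}
query = [1.0, 0.0, 0.0]  # feature vector extracted from the query pattern
top = rank_similar(query, database, k=2)
```

A linear scan like this is fine for a sketch; large collections would normally need an index structure over the feature space.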
CBISC Architecture
72
CBISC in ETANA
73
SIERRA
• A tool that allows users to select parts of
images and associate them with text
annotations.
• Retrieves annotations and their associated marks in two
ways, by searching for either:
– images or marks similar (in content) to a
specified image or mark
– annotations containing specified query terms
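The second search mode, finding annotations that contain given query terms, can be sketched like this. The `Annotation` record, its rectangular `region`, and the sample data are hypothetical and are not SIERRA's actual data model; matching is assumed to be case-insensitive on whole words.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical record: a text note attached to a rectangular part of an image."""
    image_id: str
    region: tuple  # (x, y, width, height) of the marked sub-image
    text: str


def search_annotations(annotations, query):
    """Return annotations whose text contains every query term (case-insensitive)."""
    terms = query.lower().split()
    return [
        a for a in annotations
        if all(term in a.text.lower().split() for term in terms)
    ]


annotations = [
    Annotation("pot_01.jpg", (10, 20, 50, 40), "red slip rim fragment"),
    Annotation("pot_02.jpg", (0, 0, 30, 30), "incised handle"),
    Annotation("pot_03.jpg", (5, 5, 25, 25), "Rim with red paint"),
]
hits = search_annotations(annotations, "red rim")
```

Because each hit carries its `image_id` and `region`, a result can be shown as the marked part of the image rather than the whole image.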
74
Annotating an image
75
Searching over annotations
76
Searching over images/sub-images
77
Theory
78
Informal 5S & DL Definitions
DLs are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)
79
5Ss
Ss | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata | Specifies organizational aspects of the DL content
Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, that use those services; and relationships among them
80
5S and DL formal definitions and compositions
(April 2004 TOIS)
[Figure: dependency graph of the formal definitions. Mathematical preliminaries: relation (d. 1), function (d. 2), sequence (d. 3), tuple (d. 4), language (d. 5), graph (d. 6), grammar (d. 7), event (d.10), measurable (d.12), measure (d.13), probability (d.14), vector (d.15), and topological (d.16) spaces, state (d. 18). 5S: streams (d.9), structures (d.10), spaces (d.18), scenarios (d.21), services (d.22), societies (d. 24). DL constructs: structural metadata specification (d.25), descriptive metadata specification (d.26), structured stream (d.29), digital object (d.30), collection, metadata catalog, transmission, repository (d. 33), indexing service (d.34), searching service (d.35), hypertext (d.36), browsing service (d.37), digital library (minimal) (d. 38)]
81
5SL – The Minimal DL Metamodel
Scenarios
(Meta-) Model
Societal
(Meta-) Model
Meta-Models
Meta-Models
Primitives
uses Actor
runs
Service
Scenario
receiver
Community
Service
Event
Manager
Interface
Manager
Index
Manager
Search
Manager
Collection
Index
User
Repository
Manager
Browsing
Manager
Catalog
Interface
Document
Metadata
Retrieval
Model
Text
Spatial
Stream
(Meta-) Model
(Meta-)Model
Video
Audio
Structural
(Meta-) Model
Image
82
[Figure: 5S ontology, grouped into Streams, Structures, Spaces, Scenarios, and Societies. Concept labels: text, audio, video, image, metadata specifications, Collection, Catalog, digital object, Index, Repository, Measurable, Measure, Topological, Vector, Metric, Probabilistic, Service, Scenario, event, operation, Service Manager, Actor. Relationship labels: contains, describes, stores, is_a, employs, produces, is_version_of/cites/links_to, inherits_from/includes, runs, extends, reuses, precedes, happens_before, uses, participates_in, recipient, association, executes, redefines, invokes]
83
Infrastructure Services
• Repository-Building
– Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
– Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
• Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
Information Satisfaction Services
• Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing
84
Ontology: Applications
85
[Figure: service composition graph. Infrastructure Services (Add_Value): Rating, Indexing, Training. Information Satisfaction Services: Browsing, Requesting, Recommending, Searching, Filtering, Binding, Visualizing, Expanding query. Society: actor. Edges are labeled p or e and carry data such as {(digital object, actor, rate)}, Index, classifier, handle, anchor, user model, query/category, query, query’, Collection, {digital object}, binder, transformer, space. Legend: fundamental, composite]
86
Formal
Theory/
Metamodel
5S
Requirements
5SGraph
5SL
Analysis
DL XML
Log
5SLGen
OO Classes
Workflow
Design
Components
Implementation
DL
Evaluation
Test
87
A Minimal DL in the 5S Framework
Streams
Structured
Stream
Structures
Spaces
Structural
Metadata
Specification
Scenarios
Societies
services
Descriptive
Metadata
Specification
indexing
browsing searching
hypertext
Digital Object
Collection
Metadata Catalog
Repository
Minimal DL
88
A Minimal ArchDL in the 5S Framework
Streams
Structures
Structured
Stream
Spaces
Descriptive
Metadata
specification
Scenarios
Societies
services
SpaTemOrg
StraDia
Arch Descriptive
Metadata specification
ArchObj
indexing
browsing searching
hypertext
ArchDO
Arch Metadata catalog
ArchColl
ArchDColl
ArchDR
Minimal ArchDL
89
Tools/Applications
5S
Meta
Model
DL
Expert
5SGraph
DL
Designer
Practitioner
5SL
DL
Model
Teacher
component
pool
ODLSearch,
ODLBrowse,
ODLRate,
ODLReview,
…….
Researcher
5SLGen
Tailored
DL
Logging Module
XML
Log
90
5SGen – Version 2: ODL,
Services, Scenarios
5SL-Scenario
Model (6)
DL
Designer
Component
Pool
XMI:Class
Model (3)
ODL
Search
Wrapping
Wrapping
import
import
Scenario
Synthesis (9)
Deterministic
FSM (10)
Xmi2Java (4)
Java
Classes
Model (5)
DL
Designer
StateChart
Model (8)
5SLGen
Java
ODL
Browse
XPath/JDOM
Transform (7)
XPATH/JDOM
Transform (2)
.
.
.
Java
5SL-Societies
Model (1)
SMC (11)
superclass
Java
Finite
State Machine
Class
Controller (12)
binds
JSP
User
Interface
View (13)
91
Generated DL Services
5SGraph
Workspace
(instance model)
Structured
toolbox
(metamodel)
92
93
Information model
94
95
Formal Definition of DL Integration
• $DL_i = (R_i, DM_i, Serv_i, Soc_i),\ 1 \le i \le n$
– $R_i$ is a network accessible repository
– $DM_i$ is a set of metadata catalogs for all collections
– $Serv_i$ is a set of services
– $Soc_i$ is a society
• UnionRep
• UnionCat
• UnionServices
• UnionSociety
• Given n individual libraries, integrate the n DLs to create a UnionDL.
96
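The four-part definition above suggests a simple data structure. The sketch below treats each union as a plain set union, which is an assumption made only for illustration: the slides do not say how UnionRep, UnionCat, UnionServices, or UnionSociety are computed, and real integration would also need schema mapping and duplicate detection. The `DL` class and `union_dl` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class DL:
    """One digital library DL_i = (R_i, DM_i, Serv_i, Soc_i), simplified to sets."""
    repository: set   # R_i: identifiers of stored digital objects
    catalogs: dict    # DM_i: collection name -> set of metadata record ids
    services: set     # Serv_i: names of offered services
    society: set      # Soc_i: names of user communities


def union_dl(libraries):
    """Naive UnionDL: component-wise union of n libraries (inputs are not modified)."""
    union = DL(set(), {}, set(), set())
    for dl in libraries:
        union.repository |= dl.repository
        for collection, records in dl.catalogs.items():
            union.catalogs.setdefault(collection, set()).update(records)
        union.services |= dl.services
        union.society |= dl.society
    return union


dl_a = DL({"doc1", "doc2"}, {"theses": {"m1"}}, {"searching"}, {"students"})
dl_b = DL({"doc2", "doc3"}, {"theses": {"m2"}, "maps": {"m3"}},
          {"searching", "browsing"}, {"faculty"})
union = union_dl([dl_a, dl_b])
```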
Taxonomy of Union Services
Infrastructure Services
Information Satisfaction Services
Essential
Add_Value
Essential
indexing
harvesting
mapping
(Schema
registry with
analyses &
mapping)
(data) cleaning
(focused) crawling
copying (replicating)
logging
(format) translating
(Service to support
annotation)
(Metadata validation)
searching access control
browsing binding
comparison
(forum) discussion
(query) expansion
filtering
recommendation
visualization
Add_value
Note: Suggested NSDL services are shown in blue.
97
Union Catalog Integration
Virtual Nimrin
(VN)
VN Metadata
Format
Mapping
Tool
Union ArchDL
VN
Catalog
Halif DigMaster
(HD)
Wrapper
Union
Catalog
HD
Catalog
Global Metadata
Format
Wrapper
HD Metadata
Format
Mapping
Tool
98
local schema
global schema
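Mapping a local schema to the global schema can be sketched as a field-renaming table applied by a wrapper. All field names below (`find_no`, `locus`, `reg_id`, and so on) are invented for illustration and are not the actual Virtual Nimrin, Halif DigMaster, or global metadata formats.

```python
# Hypothetical local-to-global field mappings, one per source DL.
VN_TO_GLOBAL = {"find_no": "identifier", "locus": "context", "desc": "description"}
HD_TO_GLOBAL = {"reg_id": "identifier", "stratum": "context", "notes": "description"}


def to_global(record, mapping):
    """Rename local fields to global ones; fields without a mapping are dropped."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


vn_record = {"find_no": "VN-12", "locus": "Area II", "desc": "bronze pin"}
hd_record = {"reg_id": "HD-7", "stratum": "Field IV", "notes": "jar handle",
             "photo_roll": "17"}

# The union catalog holds records from both sources in the one global format.
union_catalog = [
    to_global(vn_record, VN_TO_GLOBAL),
    to_global(hd_record, HD_TO_GLOBAL),
]
```

Dropping unmapped fields keeps the sketch short; a real mapping would more likely record them so that no local information is lost.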
99
5SQual Tool
Implementing a Tool Aimed
at Automatic Quality Assessment in
Digital Libraries
Bárbara Lagoeiro Moreira
100
Quality Base Model
Digital Object
• Accessibility
• Pertinence
• Preservability
• Relevance
• Similarity
• Significance
• Timeliness
Metadata
• Accuracy
• Completeness
• Conformance
Collection
• Completeness
• Impact Factor
Catalog
• Completeness
• Consistency
Repository
• Completeness
• Consistency
Services
• Composability
• Efficiency
• Effectiveness
• Extensibility
• Reusability
• Reliability
Numeric Indicators
101
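A numeric indicator for one of these dimensions, metadata Completeness, might be computed as below. The formula (share of required fields that are filled) is an assumption chosen for illustration and is not claimed to be the one 5SQual uses; the function name and field names are hypothetical.

```python
def completeness(record, required_fields):
    """Share of required fields that hold a non-empty value, from 0.0 to 1.0."""
    if not required_fields:
        return 1.0
    filled = sum(1 for f in required_fields if record.get(f) not in (None, ""))
    return filled / len(required_fields)


required = ["title", "creator", "date", "subject"]
record = {"title": "Freshwater fishes", "creator": "E. Hallerman", "date": ""}
score = completeness(record, required)
```

Averaging this score over all records in a catalog would give one possible catalog-level indicator.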
DL Success Model
[Figure: information quality (IQ): relevance, adequacy, timeliness, reliability, understandability, scope. System quality (SQ): user interface, ease of use, accessibility, joy of use, reliability. Other constructs: performance expectancy (PE), social influence (SI), satisfaction, behavioral intention to (re)use]
102
Systems
103
DL Manifesto - 1
• DL Reference Model
• In support of the future European Digital Library
• Developed by team connected with DELOS
(Candela, Castelli, Ioannidis, Koutrika, Meghini,
Pagano, Ross, Schek, Schuldt)
• Draft 2.2 presented in Frascati, near Rome,
June 2006 – 79 pages
• Could be integrated with work of DLF, JISC, etc.
104
DL Manifesto – 2: 3 Tiers
105
DL Manifesto – 3: Main Concepts
106
DL Manifesto – 4: Actor Roles
107
108
SIMILE
Objectives, Current Status,
and Demonstration
Stephen J. Garland, MIT CSAIL
Mick Bass, HP Labs
DSpace User Group Meeting
Cambridge, MA
March 11, 2004
109
Simile Goals
• Make the Semantic Web a reality
– For libraries and their users
– Support heterogeneous, multi-community
metadata
– Provide tools for viewing, browsing, searching
• Assess current state of Semantic Web
– Explore utility of standards (RDF, RDFS, OWL)
– Extend Semantic Web tool stack for libraries
– Identify issues, gaps, opportunities, best
practices for digital libraries
110
What is Fedora™?
Flexible Extensible Digital Object
Repository Architecture
• Slides courtesy Vinod Chachra of VTLS
111
Client
Application
Fedora™
Repository
Batch
Program
Web
Browser
HTTP SOAP
HTTP SOAP
HTTP SOAP
Manage
Access
Search
Server
Application
Web Service
Web Service
Exposure
Exposure
Layer
Layer
HTTP
OAI Provider
Session Management
User Authentication
Management
Subsystem
Security
Subsystem
Access
Subsystem
Policy Mgmt
Object Reflection
Component Mgmt
Policy Enforcement
Object Dissemination
HTTP
Object Validation
Users/Groups
PID Generation
External
Content
Source
HTTP
FTP
External Content
Retriever
Digital Objects
XML Files
Datastreams
HTTP
Local
Service
Policies
Storage Subsystem
FTP
External
Content
Source
SOAP
Object Mgmt
Remote
Service
Content
Relational DB
Adapted from Slide by V. Chachra, VTLS
112
VITAL / Fedora Relationship
113
OCKHAM Library Network
NSDL
Services
NSDL
OCKHAM
Library
Network
OCKHAM
Services
Library
Services
Teachers
Learners
Librarians
114
OCKHAM
• Simplicity (à la Occam’s razor)
• Support by Mellon and DLF
• Four main ideas:
1. Components
2. Lightweight protocols
3. Open reference models (e.g., 5S, OAIS)
4. Community perspective and involvement
• Funded by NSF in NSDL, with P2P
115
Summary
• Acknowledgments
• Introduction
• Proposals
• Projects
• Systems
• Theory
• Curricula
• Examples
• Summary
• Discussion
116
Questions?
Comments?
See http://fox.cs.vt.edu/talks/
117