DLCourseSlidesPart1

Digital Libraries

Based on Draft Book

“Foundations for Information Systems:

Digital Libraries and the 5S Framework” by Edward A. Fox and

Marcos André Gonçalves

• See content of Preface in the next slides.

• See table of contents / outline, and then corresponding content, following.

Disclaimer

Everything can change!

For More Information

• Magazine : www.dlib.org

• Books : http://fox.cs.vt.edu/DLSB.html (1994)

– MIT Press: Arms, plus books by Borgman, Licklider (1965)

– Morgan Kaufmann: Witten... (several), Lesk (2nd edition)

• Conferences

– ECDL: www.ecdl2005.org

– ICADL: http://icadl2004.sjtu.edu.cn

– JCDL: www.jcdl2005.org

• Associations

– ASIS&T ACM DL SIG

– IEEE TCDL: www.ieee-tcdl.org (student awards, doctoral consortia)

• NSF : www.dli2.nsf.gov

• Labs : VT: www.dlib.vt.edu, http://ei.cs.vt.edu/~dlib/

DL Challenges

• Preservation - so people will trust DLs

• Supporting infrastructure - networks, ...

• Scalability, sustainability, interoperability

• DL industry - critical mass by covering libraries, archives, museums, corporate info, govt info, personal info: a “quality WWW” integrating IR, HT, MM, ...

– Need tools & methods to make them easier to build

DL Challenges – 2: Terminology

• Digital / electronic / virtual library

• Born digital, hybrid (digital/physical)

• Universal access (all people/places/times)

– Accommodate disabilities (color, visual, auditory)

– Mobile (office, home, laptop, PDA, mobile)

• Archiving, self-archiving

• Open (source, standards, archives)

How to organize a DL course?

• Various frameworks

– What, Why, How

– History, Current status, Future (research)

– Economics: open source, sustainability

– Social: users/patrons, management

– Technical: HCI, HT, IR, LIS, Web

CC2001 Information Management Areas

IM1. Information models and systems*

IM2. Database systems*

IM3. Data modeling*

IM4. Relational DBs

IM5. Database query languages

IM6. Relational DB design

IM7. Transaction processing

IM8. Distributed DBs

IM9. Physical DB design

IM10. Data mining

IM11. Information storage and retrieval

IM12. Hypertext and hypermedia

IM13. Multimedia information & systems

IM14. Digital libraries

* Core components

DL Curriculum Framework

Semester 1:

DL collections: development/creation

Semester 2:

DL services and sustainability

Digitization

Storage

Interchange

Digital objects

Composites

Packages

Metadata

Cataloging

Author submission

Spaces

(conceptual, geographic,

2/3D, VR)

Architectures

(agents, buses, wrappers/mediators)

Interoperability

Naming

Repositories

Archives

Architectures

(agents, buses, wrappers/mediators)

Interoperability

Services

(searching, linking, browsing, etc.)

Archiving and preservation

Integrity

Intellectual property rights mgmt.

Privacy

Protection (watermarking)

Documents

E-publishing

Markup

Multimedia streams/structures

Capture/representation

Compression/coding

Thesauri

Ontologies

Classification

Categorization

Bibliographic information

Bibliometrics

Citations

Content-based analysis

Multimedia indexing

Multimedia presentation, rendering

Info. Needs

Relevance

Evaluation

Effectiveness

Routing

Filtering

Community filtering

Search & search strategy

Info seeking behavior

User modeling

Feedback

Info summarization

Visualization

Book Parts

• Ch. 1. Introduction (Motivation, Synopsis)

• Part 1 – The “Ss”

• Part 2 – Higher DL Constructs

• Part 3 – Advanced Topics

• Appendix

Book Parts and Chapters - 1

• Ch. 1. Introduction (Motivation, Synopsis)

• Part 1 – The “Ss”

– Ch. 2: Streams

– Ch. 3: Structures

– Ch. 4: Spaces

– Ch. 5: Scenarios

– Ch. 6: Societies

Book Parts and Chapters - 2

• Part 2 – Higher DL Constructs

– Ch. 7: Collections

– Ch. 8: Catalogs

– Ch. 9: Repositories and Archives

– Ch. 10: Services

– Ch. 11: Systems

– Ch. 12: Case Studies

Book Parts and Chapters - 3

• Part 3 – Advanced Topics

– Ch. 13: Quality

– Ch. 14: Research Challenges

• Appendix

– A: Mathematical preliminaries

– B: Formal Definitions: Ss, DL terms

– C: Glossary of terms, mappings

Acknowledgements

• Students

• Faculty, Staff

• Collaborators

• Support

• Mentors

Acknowledgements: Students

• Pavel Calado, Yuxin Chen, Fernando Das Neves, Shahrooz Feizabadi, Robert France, Marcos Gonçalves, Nithiwat Kampanya, S.H. Kim, Aaron Krowne, Bing Liu, Ming Luo, Paul Mather, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ohm Sornil, Hussein Suleman, Ricardo Torres, Wensi Xi, Baoping Zhang, Qinwei Zhu, …

Acknowledgements: Faculty, Staff

• Lillian Cassel, Debra Dudley, Roger

Ehrich, Joanne Eustis, Weiguo Fan,

James Flanagan, C. Lee Giles, Eberhard

Hilf, John Impagliazzo, Filip Jagodzinski,

Rohit Kelapure, Neill Kipp, Douglas

Knight, Deborah Knox, Aaron Krowne,

Alberto Laender, Gail McMillan, Claudia

Medeiros, Manuel Perez, Naren

Ramakrishnan, Layne Watson, …

Other Collaborators (Selected)

• Brazil : FUA, UFMG, UNICAMP

• Case Western Reserve University

• Emory, Notre Dame, Oregon State

• Germany : Univ. Oldenburg

• Mexico : UDLA (Puebla), Monterrey

• College of NJ, Hofstra, Penn State, Villanova

• University of Arizona

• University of Florida, Univ. of Illinois

• University of Virginia

• VTLS (slides on digital repositories, NDLTD)

Acknowledgements: Support

• Course: UNESCO, CETREDE, IFLA-LAC, AUGM, CLEI, UFC

• Sponsors: ACM, Adobe, AOL, CAPES, CNI, CONACyT, DFG, IBM, Microsoft, NASA, NDLTD, NLM, NSF (IIS-9986089, 0086227, 0080748, 0325579; ITR-0325579; DUE-0121679, 0136690, 0121741, 0333601), OCLC, SOLINET, SUN, SURA, UNESCO, US Dept. Ed. (FIPSE), VTLS

Acknowledgements - Mentors

• JCR Licklider – undergrad advisor (1969-71)

– Author in 1965 of “Libraries of the Future”

– Before, at ARPA, funded start of Internet

• Michael Kessler – BS thesis advisor

– Project TIP (technical information project)

– Defined bibliographic coupling

• Gerard Salton – graduate advisor (1978-83)

– “Father of Information Retrieval”

Chapter 1 - Introduction

Chapter 1 Overview

• Why digital libraries?

• What are digital libraries (DLs)?

• Why is 5S helpful in a DL book?

• How do digital libraries work?

• History: Memex, 1990s, proliferation

• Related areas: LIS, linguistics, IR, AI, DBs, knowledge management, content management, probability/statistics

Synchronous

Scholarly Communication

Same time, Same or different place

Asynchronous, Digital Library

Mediated Scholarly Communication

Different time and/or place

DL Overview

Why of Global Interest?

• National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly

• Knowledge and information are essential to economic and technological growth, education

• DL - a domain for international collaboration

– wherein all can contribute and benefit

– which leverages investment in networking

– which provides useful content on Internet & WWW

– which will tie nations and peoples together more strongly and through deeper understanding

Digital Libraries --- Objectives

• World Lit.: 24hr / 7day / from desktop

• Integrated “super” information systems: 5S:

Table of related areas and their coverage

• Ubiquitous, Higher Quality, Lower Cost

• Education, Knowledge Sharing, Discovery

• Disintermediation -> Collaboration

• Universities Reclaim Property

• Interactive Courseware, Student Works

• Scalable, Sustainable, Usable, Useful

Libraries of the Future

JCR Licklider, 1965, MIT Press

World

Nation

State

City

Community

Locating Digital Libraries in Computing and

Communications Technology Space

Digital Libraries technology trajectory: intellectual access to globally distributed information

Axes: digital content (less to more); computing (flops)

Note: we should consider 4 dimensions: computing, communications, content, and community (people)

Information Life Cycle

Borgman et al.: Workshop Report on Social Aspects of Digital Libraries: http://www-lis.gseis.ucla.edu/DL/

Information Life Cycle

Using

Creating

Retention

/ Mining

Accessing

Filtering

Authoring

Modifying

Organizing

Indexing

Storing

Retrieving

Distributing

Networking

Digital Libraries

Shorten the Chain from

Editor Reviewer

Publisher

A&I

Consolidator

Library

DLs Shorten the Chain to

Author Teacher

Digital

Reader

Editor

Reviewer

Learner

Librarian

Library

How is a DL different from a database?

A traditional SQL database has as its basic element data items in a relation:

– select name from employee, project where employee.deptnumber = '25' AND project.number = '100'

– databases exploit known structures and relations

DBMS retrieval is not probabilistic (Frakes, Baeza-Yates, p. 3)
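
The contrast between exact-match and ranked retrieval can be sketched in a few lines. This is only an illustration under stated assumptions: the table contents, the sample documents, and the `rank` helper are invented here, and a simple term-overlap count stands in for a real probabilistic model. The SQL query returns exactly the rows that satisfy its predicate, while the IR-style function scores every document and keeps partial matches.

```python
import sqlite3
from collections import Counter

# Exact-match retrieval: a row either satisfies the predicate or it does not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, deptnumber TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ada", "25"), ("Grace", "25"), ("Alan", "30")],  # made-up rows
)
exact = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE deptnumber = ? ORDER BY name", ("25",)
)]

# Ranked retrieval: every document gets a score; partial matches are kept.
docs = {  # made-up documents
    "d1": "digital library preservation",
    "d2": "library card catalog",
    "d3": "database transaction processing",
}

def rank(query):
    """Score each document by the number of query-term occurrences it contains."""
    terms = query.lower().split()
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        score = sum(counts[t] for t in terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

ranked = rank("digital library")
```

Here `exact` holds only the employees in department 25, whereas `ranked` orders `d1` ahead of `d2` because it matches more query terms.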

How is a DL different from the WWW?

• The keyword is managed

– The WWW is not managed

• Some meta searchers (Yahoo, Lycos) attempt to add an organizational framework to their web holdings

– However, most are focused on keyword searching (e.g., Google)

How is a DL different from the WWW?

• Another key difference is who controls the input into the system

– most meta searchers hunt down their holdings

• Lycos is short for Lycosidae lycosa (the “wolf spider”), which pursues its prey and does not build a web (Mauldin, IEEE Expert, 1/97)

– some (Yahoo) have humans in the loop for review and classification

• To date, DLs are generally more tightly controlled, and have a targeted customer set

DL = Content + Services

Vector and/or Boolean Search Engines (traditional IR)

WWW (http) Access (most common)

Digital Library Services (searching, browsing, citation analysis, usage analysis, alerts)

non-WWW Access (now uncommon)

RDBMS

File Systems

Other Technologies

Content

• “ “?

– WWW by itself has low archival & management characteristics

• “ “

– In the same way that a card catalog is not a TL, a RDBMS is candidate technology for use in DLs

• DL is the union of the content and services defined on the content

How is a DL Different from a Traditional Library?

TL has as its focus physical objects

– even if the card catalog (metadata) is electronic, the purpose is to point you to a physical location

– trafficking in physical objects has both obvious and subtle implications

• object can exist only in 1 place: if you have it, I can't

I have to go to the object, or wait for it to come to me

TLs vs. DLs

• DLs clearly better than TLs at:

– Dissemination, storing information variety

• However, TL objects are more survivable

– Who will archive the research information?

• the publishers?

• the institutions?

• the authors?

– Will the average DL object still be accessible in 10 years?

• take my digital preservation seminar in the spring!

image from: http://www.ancientegypt.co.uk/writing/rosetta.html

How is a DL Different from a Traditional Library?

Digital Library

– removing the physical restriction has obvious benefits

• multiple access, multiple listings, electronic transmission

– also complicates many other issues...

• intellectual property, terms and conditions, etc.

Note that a TL offers additional social and educational benefits

– Most TLs offer hybrid services too.

from Lesk, http://community.bellcore.com/lesk/columbia/session1/

TLs vs. DLs

• Where does publishing stop, and libraries begin?

– there have always been tensions between TLs and traditional publishers, but the roles were fairly well defined

– DLs can muddle the separation of these responsibilities

• result: conflict, and/or new models

Traditional Players

publisher, book store, library, archive: responsibility over time

DL Definitions - 1

• “A digital library is an organized and focused collection of digital objects, including text, images, video, and audio, along with methods of access and retrieval, and for selection, creation, organization, maintenance, and sharing of the collection.”

• Witten & Bainbridge – “How to Build a Digital Library” – Morgan Kaufmann 2003

DL Definitions - 2

• “Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities”

• Waters, D.J. CLIR Issues, July/August 1998

• www.clir.org/pubs/issues/issues04.html

DL Definitions - 3

• Issues and Spectra

– Collection vs. Institution

– Content vs. System

– Access vs. Preservation

– “Free” vs. Quality

– Managed vs. Comprehensive

– Centralized vs. Distributed

DL Definitions - 4

• NOT a “digitized library”

• NOT a “deconstruction” of existing systems and institutions, moving them to an electronic box in a Library

• IS a new way to deal with knowledge

– Authoring, Self-archiving, Collecting,

– Organizing, Preserving,

– Accessing, Propagating, Re-using

Digital Library Content

Content

Types

Text

Documents

Articles,

Reports,

Books

Video

Audio

Speech,

Music

Geographic

Information

Software,

Programs

Bio

Information

Images and

Graphics

(Aerial)

Photos

Models

Simulations

Genome

Human, animal, plant

2D, 3D,

VR,

CAT

Content Area Description

[Table: number of items per content area, by format. Column headers as extracted: Audio, Digital, Finding Aid, MSS, Other, Photo, Video, MF, Print, Total. The per-row counts could not be recovered from the slide.]

Content areas (rows):

• African-American cultural life

• Agricultural crisis of late 19th century

• Codification of segregation laws

• Configuration of white supremacy

• Cultural values and activities

• Disenfranchising movements

• Educational movements

• Emergence of Holiness & Pentecostal Groups

• Emergence of new musical forms

• Emergence of organized groups expressing farmers' concerns

• …

Total Each Format: 41 14 51 161 38 133 13 79 301 831

Outline

• Ch. 1. Introduction (Motivation, Synopsis)

• Part 1 – The “Ss”

– Ch. 2: Streams

– Ch. 3: Structures

– Ch. 4: Spaces

– Ch. 5: Scenarios

– Ch. 6: Societies

Motivation

• Digital Libraries (DLs): what are they?

– No definitional consensus

– Conflicting views

– Makes interoperability a hard problem

• DLs are not benefiting from formal theories as are other CS fields: DB, IR, PL, etc.

• DL construction: difficult, ad-hoc, lack of support for tailoring/customization

• Conceptual modeling, requirements analysis, and methodological approaches are rarely supported in DL development.

– Lack of specific DL models, formalisms, languages

Informal 5S & DL Definitions

DLs are complex systems that

• help satisfy info needs of users (societies)

• provide info services (scenarios)

• organize info in usable ways (structures)

• present info in usable ways (spaces)

• communicate info with users (streams)

5S Layers

5S layer : 5 Elements

• Societies : Fire

• Scenarios : Wood

• Spaces : Earth

• Structures : Metal

• Streams : Water

Hypotheses

• A formal theory for DLs can be built based on 5S.

• The formalization can serve as a basis for modeling and building high-quality DLs.

Research Questions

1. Can we formally elaborate 5S?

2. How can we use 5S to formally describe digital libraries?

3. What are the fundamental relationships among the Ss and high-level DL concepts?

4. How can we allow digital librarians to easily express those relationships?

5. Which are the fundamental quality properties of a DL? Can we use the formalized DL framework to characterize those properties?

6. Where in the life cycle of digital libraries can key aspects of quality be measured and how?

5Ss

Ss, Examples, Objectives

• Streams

– Examples: Text; video; audio; image

– Objectives: Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data

• Structures

– Examples: Collection; catalog; hypertext; document; metadata

– Objectives: Specifies organizational aspects of the DL content

• Spaces

– Examples: Measure; measurable, topological, vector, probabilistic

– Objectives: Defines logical and presentational views of several DL components

• Scenarios

– Examples: Searching, browsing, recommending

– Objectives: Details the behavior of DL services

• Societies

– Examples: Service managers, learners, teachers, etc.

– Objectives: Defines managers, responsible for running DL services; actors, that use those services; and relationships among them

5S and DL formal definitions and compositions (April 2004 TOIS)

• Mathematical preliminaries: relation (d. 1), function (d. 2), sequence (d. 3), tuple (d. 4)*, language (d. 5), graph (d. 6), grammar (d. 7), event (d. 10), measurable (d. 12), measure (d. 13), probability (d. 14), vector (d. 15), topological (d. 16) spaces, state (d. 18)

• 5S: streams (d. 9), structures (d. 10), spaces (d. 18), scenarios (d. 21), societies (d. 24)

• DL constructs: services (d. 22), transmission (d. 23), structural metadata specification (d. 25), descriptive metadata specification (d. 26), structured stream (d. 29), digital object (d. 30), collection (d. 31), metadata catalog (d. 32), repository (d. 33), indexing service (d. 34), searching service (d. 35), hypertext (d. 36), browsing service (d. 37), digital library (minimal) (d. 38)
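
One loose way to read the composition above is as a set of nested data structures. The sketch below is an assumption-laden illustration, not the formal definitions from the TOIS paper: the class and field names are invented here, and the rule that a minimal DL needs indexing, searching, and browsing services is inferred from the three services named in the diagram.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DigitalObject:
    handle: str                                   # unique identifier
    streams: list                                 # raw content (text, image, ...)
    metadata: dict = field(default_factory=dict)  # descriptive metadata

@dataclass
class Repository:
    # collection name -> {handle: DigitalObject}
    collections: dict = field(default_factory=dict)

    def store(self, collection: str, obj: DigitalObject) -> None:
        self.collections.setdefault(collection, {})[obj.handle] = obj

    def get(self, handle: str) -> Optional[DigitalObject]:
        for objects in self.collections.values():
            if handle in objects:
                return objects[handle]
        return None

# Assumption: the services a minimal DL must offer, as read from the diagram.
REQUIRED_SERVICES = {"indexing", "searching", "browsing"}

@dataclass
class DigitalLibrary:
    repository: Repository
    catalogs: dict = field(default_factory=dict)  # collection -> {handle: metadata}
    services: dict = field(default_factory=dict)  # service name -> callable
    society: set = field(default_factory=set)     # actors and service managers

    def is_minimal(self) -> bool:
        """True when all assumed required services are present."""
        return REQUIRED_SERVICES <= set(self.services)

repo = Repository()
repo.store("pottery", DigitalObject("obj-1", [b"..."], {"title": "Jar"}))
dl = DigitalLibrary(repo, services={"indexing": print, "searching": print})
```

With only two services registered, `dl.is_minimal()` is false; adding a `"browsing"` entry makes it true.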

ETANA-DL

• Archaeological DL

• Integrated DL

– Heterogeneous data handling

• Applies and extends the OAI-PMH

– Open Archives Initiative Protocol for Metadata Harvesting

• Design considerations

– Componentized

– Extensible

– Portable
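
As a rough illustration of what an OAI-PMH harvest involves, the sketch below builds a `ListRecords` request URL and extracts identifiers and Dublin Core titles from a response. The base URL, the sample response, and the helper names are assumptions made for this example; the slides do not describe ETANA-DL's actual harvester, and no network call is made here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request for a (hypothetical) OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

# Hand-written sample response; a real one would come from the data provider.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:locus-86</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Locus sheet 86</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        pairs.append((identifier, title))
    return pairs

url = list_records_url("http://example.org/oai")
records = parse_records(SAMPLE_RESPONSE)
```

A full harvester would also follow resumption tokens and handle error responses, which this sketch leaves out.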

Initial ETANA-DL Member Locations

Canadian University College

Walla Walla College

Willamette University

Andrews University

CWRU

Virginia Tech

Vanderbilt University

Mississippi State University

Map courtesy: www.enchantedlearning.com

Lahav Website

Megiddo Opening Screen

Locus Screen:

Pictures

View all

Area Screen

ETANA-DL Approach

• Applying and extending Digital Library (DL) techniques to solve key problems: making primary data available, data preservation, and interoperability

• Modeling archaeological information systems using

5S to better understand the domain and design the system and the supporting services

• Rapidly prototyping DLs that handle heterogeneous archaeological data using componentized frameworks:

– eliciting requirements

– refining metamodel and union schema

– modeling sites

– mapping

– harvesting

– providing useful services

ETANA-DL Website

Marking Items

Marking – writing notes for a specific user

Sender, Date,

Object OAI ID

Sender

Comments

Options:

View Record,

Add record to Items Of Interest,

Re-mark item (Redirect),

Unmark item (Remove item from list)

Marked Items Display

Discussions Page

Discussions about an object

View/Post messages, create new threads

Recommendations

Items recommended on the basis of similar interests

ETANA-DL Searching Service

Search

ETANA-DL Multi-dimensional Browsing

3 new sites

2 new types of artifacts

ETANA-DL Visual Browsing Service

By site

Visual Browse

Visual Browsing Nimrin:

Topographical Drawings

Full site

Square:

N40/W20

North west quadrant

Visual Browsing Nimrin : Square information

Square:

N40/W20

Locus: 86

Loci layout

Visual Browsing Nimrin : locus sheet

Visual Browsing

Bab edh-Dhra'

Cemetery

Pottery # 25

ETANA Societies

1. Historic and pre-historic societies (being studied)

2. Archaeologists (in academic institutes, fieldwork settings, or local and national governmental bodies)

3. Project directors

4. Technical staff (consisting of photographers, technical illustrators, and their assistants)

5. Field staff (responsible for the actual work of excavation)

6. Camp staff (e.g., camp managers, registrars, tool stewards)

7. General public (e.g., educators, learners, citizens)

ETANA Societies

• Social issues

1. Who owns the finds?

2. Where should they be preserved?

3. What nationality and ethnicity do they represent?

4. Who has publication rights?

5. What interactions took place between those at the site studied, and others? What theories are proposed by whom about this?

ETANA Scenarios

1. Life in the site in former times

2. Digital recording: the planning stage and the excavation stage

3. Planning stage: remote sensing, fieldwalking, field surveys, building surveys, consulting historical and other documentary sources, and managing the sites and monuments

4. Excavation

   1. Detailed information is recorded, including for each layer of soil, and for features such as pole holes, pits, and ditches.

   2. Data about each artifact is recorded together with information about its exact find spot.

   3. Numerous environmental and other samples are taken for laboratory analysis, and the location and purpose of each is carefully recorded.

   4. Large numbers of photographs are taken, both general views of the progress of excavation and detailed shots showing the contexts of finds.

5. Organization and storage of material

6. Analysis and hypotheses generation and testing

7. Publications, museum displays

8. Information services for the general public

ETANA Spaces

1. Geographic distribution of found artifacts

2. Temporal dimension (as inferred by archaeologists)

3. Metric or vector spaces

1. used to support retrieval operations, and to calculate distance (and similarity)

2. used to browse / constrain searches spatially

4. 3D models of the past, used to reconstruct and visualize archaeological ruins

5. 2D interfaces for human-computer interaction
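
The two uses of metric spaces listed above (computing distance, and constraining a search spatially) can be sketched as follows. The local grid coordinates, the `Find` record, and the sample finds are assumptions made for illustration; they are not ETANA-DL data.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Find:
    label: str
    x: float  # metres east of an assumed site datum
    y: float  # metres north of an assumed site datum

def within_box(finds, x_min, y_min, x_max, y_max):
    """Constrain a search spatially: keep finds inside a rectangle."""
    return [f for f in finds if x_min <= f.x <= x_max and y_min <= f.y <= y_max]

def nearest(finds, x, y, k=1):
    """Rank finds by Euclidean distance from a query point."""
    return sorted(finds, key=lambda f: math.hypot(f.x - x, f.y - y))[:k]

finds = [  # made-up find spots
    Find("jar", 2.0, 3.0),
    Find("bead", 8.0, 1.0),
    Find("seed sample", 2.5, 3.5),
]
in_square = within_box(finds, 0, 0, 5, 5)
closest = nearest(finds, 2.4, 3.4)
```

Here `in_square` keeps the jar and the seed sample, and `closest` returns the seed sample as the find nearest to the query point.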

ETANA Structures

1. Site Organization

1. Region, site, partition, sub-partition, locus,

2. Temporal orderings (ages, periods)

3. Taxonomies

1. for bones, seeds, building materials, …

4. Stratigraphic relationships

1. above, beneath, coexistent

ETANA Streams

1. successive photos and drawings of excavation sites, loci, unearthed artifacts

2. audio and video recordings of excavation activities and discussions

3. textual reports

4. 3D models used to reconstruct and visualize archaeological ruins.

Exercise 1

• Form groups of 2.

• Select a digital library you wish to build, improve, or study.

• As was done for ETANA, discuss it using the 5S perspective.

• Present a summary to the class and lead a discussion.

Chapter 2 Overview

• Multiple media types and representation

– See ch. 4 for IR (except some here for non-text)

– Standards for each, and for some combinations

• Text

– Character strings, encoding (Unicode)

– Morphology -> Stemming

– Syntax, semantics -> stop words

– ** POS tagging, phrases

• Images, Audio, Video, Graphics, Animation

– Capture, digitization, representation

– CBIR for each

• ** Compression, processing, analysis

• **Synchronization, rendering, presentation, interchange

– RealVideo, SMIL, QoS
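
The text-processing steps named in the overview (stop-word removal and stemming) can be illustrated with a deliberately naive sketch. The stop-word list and the suffix-stripping rule below are simplifications assumed for this example; production systems typically use a proper stemmer such as Porter's algorithm.

```python
import re

# Assumed, tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "are", "for"}
# Suffixes tried in order; a crude stand-in for real morphological analysis.
SUFFIXES = ("ing", "ies", "es", "s", "ed")

def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Lower-case, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]

terms = index_terms("Searching the digital libraries")
```

The crude rule maps "libraries" to "librar", which shows why rule order and exceptions matter in a real stemmer.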

Content Based

Information

Retrieval

Problems

• Image similarity is subjective

– Personal Interpretation

• Concept x Appearance

By Visual features

– Retrieve images with 50 percent of white colour and 50 percent of black colour

Query result
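
A query by visual features like the one above can be approximated with a colour-proportion test. In this sketch, images are represented as flat lists of grey levels (0 to 255), and the black and white thresholds and the tolerance are assumptions chosen for illustration; real CBIR systems use richer descriptors such as colour histograms.

```python
def colour_fractions(pixels, black_max=32, white_min=223):
    """Return (black fraction, white fraction) for a non-empty list of grey levels."""
    pixels = list(pixels)
    n = len(pixels)
    black = sum(1 for p in pixels if p <= black_max) / n
    white = sum(1 for p in pixels if p >= white_min) / n
    return black, white

def matches_half_black_half_white(pixels, tolerance=0.1):
    """True when both fractions are within `tolerance` of 50 percent."""
    black, white = colour_fractions(pixels)
    return abs(black - 0.5) <= tolerance and abs(white - 0.5) <= tolerance

images = {  # made-up 16-pixel images
    "checkerboard": [0, 255] * 8,
    "mostly_white": [255] * 14 + [0] * 2,
    "mid_grey": [128] * 16,
}
hits = [name for name, px in images.items() if matches_half_black_half_white(px)]
```

Only the checkerboard matches. Such a query retrieves by appearance and ignores what the image depicts, which is the concept-versus-appearance problem noted earlier.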

Textual information retrieval

Query on Google using Sunset and Rio de Janeiro

Image Classification by shape

VITAL Web Portal

Clicking on the thumbnail image from this screen will launch the VITAL Hi-Res Image Navigator – a tool which provides for detailed examination of these wavelet compressed image files

Institutions have considerable flexibility in the way they present their collections – the examples here show two different approaches to presenting EAD (Encoded Archival Description) metadata objects

VITAL Web Portal

MrSID and JPEG2000 wavelet compressed images can be stored in the repository and displayed to the user via the integrated VITAL Hi-Res Image Navigator

The AMICO Library™

VITAL Web Portal

The AMICO Library in VITAL

Implementation Options

• The Fedora™ package

– Fedora™ open source software (free)

– VTLS installation, training, and support

Implementation Options

• The Full VITAL package

– Fedora™ open source software (free)

– VTLS software and hardware extensions, with features and workflows

– VTLS installation, training, support, integration and documentation

Implementation Options

• VITAL Hosted Solution

– VTLS provides ASP services for your digital collections

• VTLS Professional Digital Imaging Services

– Imaging services and project consulting can be combined with any of the above packages to provide a solution tailored to your needs

DL Student Research: Torres

• Search in collections of fish images

• using combination of

• image properties (CBIR) and

• textual descriptions

Motivation

• Query 1:

– List all metadata related to fish which were observed in the Amazon River

• Query 2:

– Retrieve images of fishes whose shape is similar to that in the example

• Query 3:

– List all metadata related to fishes that were observed in the Amazon River and whose shape is similar to that in the example

Motivation

• Retrieve fish descriptions whose shapes are similar to the one shown below, that belong to the “Notropis” genus, that have “large eyes”, and that have been observed in the “Tennessee River”
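
A combined query of this kind can be sketched as a metadata filter followed by a ranking on a shape descriptor. Everything specific below is an assumption for illustration: the records are made up, the shape descriptor is a hypothetical three-number vector, and Euclidean distance stands in for whatever shape similarity measure a real system would use.

```python
import math

fish = [  # made-up records; "shape" is a hypothetical descriptor vector
    {"id": "f1", "genus": "Notropis", "river": "Tennessee River", "shape": (0.9, 0.1, 0.3)},
    {"id": "f2", "genus": "Notropis", "river": "Tennessee River", "shape": (0.2, 0.8, 0.5)},
    {"id": "f3", "genus": "Notropis", "river": "Amazon River", "shape": (0.9, 0.1, 0.3)},
    {"id": "f4", "genus": "Cichla", "river": "Tennessee River", "shape": (0.9, 0.1, 0.3)},
]

def combined_query(records, example_shape, max_distance, **metadata):
    """Filter on exact metadata values, then rank by shape distance."""
    hits = []
    for rec in records:
        if any(rec.get(key) != value for key, value in metadata.items()):
            continue  # metadata predicate failed
        distance = math.dist(rec["shape"], example_shape)
        if distance <= max_distance:
            hits.append((distance, rec["id"]))
    return [rec_id for _, rec_id in sorted(hits)]

result = combined_query(
    fish, (0.85, 0.15, 0.3), 0.2, genus="Notropis", river="Tennessee River"
)
```

Only `f1` satisfies both the metadata constraints and the shape threshold; `f3` and `f4` have the right shape but fail the metadata filter.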

Problem

• There is no Biodiversity Information System which allows queries involving:

– Geographic data

– Species metadata

– Image Descriptors

• Existing systems:

– Metadata or

– Metadata + spatial data

– Images are stored as separate files

• With no possibility of retrieval by content

WeBioS

Torres: Visualizations

Concentric Rings Pattern

Spiral Pattern

Chapter 3 Overview

• Digital Objects

– Documents, digitization, packaging (METS), interchange, standards, format conversion

– Genre: plays, encyclopedia, dictionaries, educational resources: courses (e.g., syllabi) and lessons

– Structural organizations (books, chapters, sections), excerpts/spans (mark, superimposed info)

• Metadata: standards, markup

• Knowledge Structures & Representations

– Databases, Schema, Ontologies, Thesauri, Lexicons, Authority files, Concept maps, Semantic networks

• Indexes

– Inverted files, signature files, R-trees, Quad trees, etc.

• Clusters & Classification Schemes
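
Of the index structures listed above, the inverted file is the simplest to illustrate. The following minimal sketch maps each term to the set of documents containing it and answers a conjunctive query by intersecting posting sets; the sample documents and function names are assumptions for this example.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, *terms):
    """Return ids of documents containing every query term, sorted."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

docs = {  # made-up documents
    "d1": "digital library services",
    "d2": "digital preservation",
    "d3": "library catalog",
}
index = build_inverted_index(docs)
both = boolean_and(index, "digital", "library")
```

Real inverted files usually also store term positions or frequencies in each posting, which this sketch omits.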

Degree of Structure

Web

Chaotic

DLs

Organized

DBs

Structured

Digital Objects (DOs)

• Born digital

• Digitized version of “real” object

– Is the DO version the same, better, or worse?

– Decision for ETDs: structured + rendered

• Surrogate for “real” object

– Not covered explicitly in metamodel for a minimal DL

– Crucial in metamodel for archaeology DL

Metadata Objects (MDOs)

• MARC

• Dublin Core

• RDF

• IMS

• OAI (Open Archives Initiative)

• Crosswalks, mappings

• Ontologies

• Topic maps, concept maps
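
A crosswalk between metadata schemes is, at its simplest, a field-to-field mapping. The sketch below maps a handful of MARC fields to Dublin Core elements. The mapping is a small subset that follows commonly cited MARC-to-Dublin-Core correspondences, and the flat keys such as `"260c"` (tag plus subfield) are a simplification assumed here rather than real MARC syntax.

```python
# Assumed subset of a MARC -> Dublin Core crosswalk (tag or tag+subfield -> element).
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "260b": "publisher",
    "260c": "date",
    "520": "description",
    "650": "subject",
}

def crosswalk(marc_record):
    """Translate a simplified MARC record into Dublin Core elements.

    Fields without a mapping are dropped, so the conversion is lossy.
    """
    dc = {}
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue
        dc.setdefault(element, []).extend(values)
    return dc

record = {
    "100": ["Licklider, J. C. R."],
    "245": ["Libraries of the Future"],
    "260c": ["1965"],
    "300": ["(physical description)"],  # no mapping in this sketch, so dropped
}
dc = crosswalk(record)
```

The dropped `300` field illustrates why mapping from a complex scheme to a simple one loses information.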

Complex to Simple

MARC ($50) Dublin Core (DC)

+ thesis

Also Important: Epub, SGML, XML

• 5S perspective: streams, structures, scenarios

• Authoring

• Rendering, presenting

• Tagging, Markup, DOM

• Semi-structured information

• Dual-publishing, eBooks

• Styles (XSL, XSLT)

• Structured queries

Databases

• 5S perspective: structures, streams, scenarios

• Extending database technology

• Structured and unstructured info

• Multimedia databases

• Link databases

• Performance, transaction processing

• Replicated storage, rollback/recovery

PACS Automatic Classification

Chapter 4 Overview

• Retrieval models

– Boolean, extended Boolean

– Vector, LSI

– Probabilistic: classical, belief network, inference network, language models

• User interfaces and visualization
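
To make the vector model concrete, the sketch below ranks documents by cosine similarity between tf-idf vectors. The weighting formula (raw term frequency times a smoothed log idf), the sample documents, and the function names are assumptions chosen for brevity; many other weighting variants exist.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Return ({doc_id: {term: weight}}, idf) using tf * (ln(N/df) + 1)."""
    n = len(docs)
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    idf = {t: math.log(n / count) + 1.0 for t, count in df.items()}
    vectors = {
        d: {t: c * idf[t] for t, c in Counter(tokens).items()}
        for d, tokens in tokenized.items()
    }
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def search(query, docs):
    """Return ids of documents with non-zero similarity, best first."""
    vectors, idf = tf_idf_vectors(docs)
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(query.lower().split()).items()}
    scored = [(cosine(q, vec), d) for d, vec in vectors.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [d for score, d in ranked if score > 0]

docs = {  # made-up documents
    "a": "image retrieval by shape",
    "b": "text retrieval models",
    "c": "database transaction processing",
}
results = search("image retrieval", docs)
```

Document `a` matches both query terms and ranks first, `b` matches one term, and `c` is excluded because its similarity is zero.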

User interfaces and visualization

• 2D interfaces

• 3D interfaces

• GIS

• Other paradigms

• Stepping Stones and Pathways

– http://fox.cs.vt.edu/SSP/

Chapter 5 Overview

• Recall OO for streams – now have objects as well as scenarios – e.g., interface components

• Information Access

– Searching: ad hoc, filtering/routing

– Browsing: using an organization, using a visualization, using links (i.e., hypertext, hypermedia)

– Workflow: sessions, feedback, etc.

• Scenario-based Design

• Usability: goals, tasks, claims

• NOTE: this is covered in the outline

Chapter 6 Overview

• User communities

– Authors, editors, teachers, students, readers

– Personal(ization), group(ware), community, global

– Accessibility, universal access

• Librarians: reference, acquisition, operations

• Research community

– Associations, conferences, publications, labs, projects

• Economics

– Copyright, intellectual property rights, digital rights management, authorization, authentication, security, privacy, self-archiving (eprints)

– Publishers, catalogers, distributors, sustainability

– Open source, commercial, hybrid
