DSpace Introduction

MacKenzie Smith
Associate Director for Technology
MIT Libraries
Agenda
Introduction
DSpace demo
Technical architecture
Organizational model
MIT case study
DSpace Federation
Q&A at the end of each presentation
General Q&A at the close
DSPACE INTRODUCTION
DSpace
Vision (1999)

A federated repository that makes available the
collective intellectual resources of the world’s
leading research institutions
Mission
Create a scalable digital archive that preserves and
communicates the intellectual output of MIT’s
faculty and researchers
 Support adoption by and federation with other
research institutions

DSpace is…
An open source technology platform
A service model for open access and/or digital
archiving
A platform to build an Institutional Repository
A (proposed) federation of digital repositories across
multiple academic research institutions
A production service of the MIT Libraries to the
local research community
Institutional Repositories
Institution-based
Scholarly material in digital formats
Cumulative and perpetual
Open and interoperable
The DSpace Repository
Institutional Repository for MIT faculty’s
digital research materials
MIT Libraries - Hewlett Packard Research
Labs collaborative development project
Open Source system
Federated system
Preservation archive
DSpace Functions
Captures
Digital research material (any format)
Directly from creators (e.g. faculty)
Large-scale, stable, managed long-term storage
Describes
Descriptive, technical, rights metadata
Persistent identifiers
Distributes
Via WWW, with necessary access control
Preserves
Possible Content
Preprints, articles
Technical Reports
Working Papers
Conference Papers
E-theses
Datasets
statistical, geospatial, MATLAB, etc.
Images
visual, scientific, etc.
Audio files
Video files
Learning Objects
Reformatted digital
library collections
Why Libraries?
Expertise
Large-scale collection management
Assessment/collection policies
Preservation
Metadata
Solid business practices
Commitment
Long time frames
Mission scope
CHALLENGES
Challenges
Faculty Acceptance

Valuing and trusting an institutional archive
Sustainability

institutional, financial
Digital Preservation
Digital Preservation
Philosophy
Lots of digital material is already lost
 Most digital material is at risk
 Better to have it, do bit preservation, than to
lose it completely
 Need to capture as much information as
possible to support functional preservation
 Cost/benefit tradeoffs

Digital Preservation
MIT’s commitment levels

Known/supported
TIFF, SGML/XML, AIFF, PDF
Known/unsupported
Microsoft Word, PowerPoint (common, proprietary)
Lotus 1-2-3, VisiCalc, WordPerfect (less common)
Unknown/unsupported
One-of-a-kind software program
Digital Preservation
Supported = migration and/or emulation
Migration for texts, images, audio, etc.
Emulation for software, multimedia?
Unsupported
Bit preservation at minimum
Format migration where possible
Commercial conversion services
Global Digital Format Registry
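The commitment levels above can be read as a lookup from file format to preservation treatment. A minimal sketch, assuming the format lists on the slides are the whole registry; the names `Commitment`, `commitment_for`, and `treatments_for` are hypothetical and are not part of DSpace:

```python
from enum import Enum
from typing import List


class Commitment(Enum):
    KNOWN_SUPPORTED = "known/supported"
    KNOWN_UNSUPPORTED = "known/unsupported"
    UNKNOWN_UNSUPPORTED = "unknown/unsupported"


# Format names copied from the slides; a real registry would be much larger.
SUPPORTED = {"TIFF", "SGML", "XML", "AIFF", "PDF"}
KNOWN_UNSUPPORTED = {"Microsoft Word", "PowerPoint", "Lotus 1-2-3",
                     "VisiCalc", "WordPerfect"}


def commitment_for(fmt: str) -> Commitment:
    """Classify a format name into one of the three commitment levels."""
    if fmt in SUPPORTED:
        return Commitment.KNOWN_SUPPORTED
    if fmt in KNOWN_UNSUPPORTED:
        return Commitment.KNOWN_UNSUPPORTED
    return Commitment.UNKNOWN_UNSUPPORTED


def treatments_for(fmt: str) -> List[str]:
    """Return the preservation treatments the slides attach to each level."""
    if commitment_for(fmt) is Commitment.KNOWN_SUPPORTED:
        return ["migration and/or emulation"]
    # Unsupported formats: bits are kept at minimum, migrated where possible.
    return ["bit preservation", "format migration where possible"]
```

For example, `treatments_for("PDF")` yields migration and/or emulation, while a one-of-a-kind program falls through to bit preservation.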
DESIGN
Information Model
Communities
Research units of the organization
Collections (in communities)
Distinct groupings of like items
Items (in collections)
Logical content objects
Receive persistent identifier
Bitstreams (in items)
Individual files
Receive preservation treatment
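The containment hierarchy above (communities, collections, items, bitstreams) can be sketched as nested data structures. This is an illustration only, not DSpace's actual classes: the class names mirror the slide, while the fields, the `example:` identifier scheme, and `all_bitstreams` are assumptions made for the sketch.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    """An individual file; the unit that receives preservation treatment."""
    filename: str
    mime_type: str


@dataclass
class Item:
    """A logical content object; receives a persistent identifier."""
    title: str
    bitstreams: List[Bitstream] = field(default_factory=list)
    # Hypothetical identifier scheme, for illustration only.
    identifier: str = field(default_factory=lambda: f"example:{uuid.uuid4()}")


@dataclass
class Collection:
    """A distinct grouping of like items."""
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    """A research unit of the organization."""
    name: str
    collections: List[Collection] = field(default_factory=list)

    def all_bitstreams(self) -> List[Bitstream]:
        """Walk the hierarchy and collect every file held by the community."""
        return [b for c in self.collections for i in c.items for b in i.bitstreams]


report = Item("Technical Report 1", [Bitstream("tr1.pdf", "application/pdf"),
                                     Bitstream("tr1.xml", "application/xml")])
lab = Community("Example Lab", [Collection("Technical Reports", [report])])
```

Here `lab.all_bitstreams()` returns the two files of the single report, and the identifier is attached at the item level rather than to each file, matching the slide.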
Information Model
Versioning
Item “versions” can be
All instances of a work in different formats
E.g. the XML, PDF, and PostScript versions
All editions of a work over time
Official changes (e.g. addenda or new release)
Periodic snapshots (e.g. web sites)
Metadata lists all available versions of items
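One way to picture the two kinds of version is a metadata record that lists every version and tags each as a format instance or an edition over time. A minimal sketch under that assumption; `Version`, `ItemMetadata`, and the `kind` values are hypothetical names, not DSpace's schema:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Version:
    label: str  # e.g. "PDF" or "addendum"
    kind: str   # "format" (same work, other format) or "edition" (change over time)


@dataclass
class ItemMetadata:
    title: str
    versions: List[Version] = field(default_factory=list)

    def labels(self, kind: str) -> List[str]:
        """List the labels of all versions of the given kind, in stored order."""
        return [v.label for v in self.versions if v.kind == kind]


meta = ItemMetadata("Example Working Paper", [
    Version("XML", "format"),
    Version("PDF", "format"),
    Version("PostScript", "format"),
    Version("addendum", "edition"),
])
```

Calling `meta.labels("format")` gives the three format instances, and `meta.labels("edition")` gives the single official change.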
Communities
Research units of the organization
Schools, Departments, Research Labs,
Research Centers, Programs, etc.
 Individuals

Community “home page” with logo, custom
description, etc.

Or contract with library
Communities
Local, distributed policy decisions


Who can contribute, access material
Submission workflow


Submitters, approvers, reviewers, editors
Collections definition, management
Local, distributed production work

Communities supply metadata, files
Partnership between library and communities
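The local policy decisions above can be sketched as a per-community configuration object. This is an assumption-laden illustration: `CommunityPolicy` and its fields are hypothetical, the slides do not fix the order of workflow roles, and treating an empty reader list as open access is a choice made only for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class CommunityPolicy:
    contributors: Set[str] = field(default_factory=set)      # who can contribute
    readers: Set[str] = field(default_factory=set)           # who can access
    workflow_roles: List[str] = field(default_factory=list)  # chosen per community

    def can_contribute(self, user: str) -> bool:
        return user in self.contributors

    def can_access(self, user: str) -> bool:
        # Assumption: an empty reader set means the material is open to everyone.
        return not self.readers or user in self.readers

    def next_role(self, steps_done: int) -> Optional[str]:
        """Return the role that handles the next workflow step, or None when done."""
        if steps_done < len(self.workflow_roles):
            return self.workflow_roles[steps_done]
        return None


lab_policy = CommunityPolicy(contributors={"alice"},
                             workflow_roles=["reviewer", "approver"])
```

With this policy, `alice` may submit, anyone may read, and a new submission goes to a reviewer and then an approver before it is accepted.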
Communities
[Diagram: DSpace system. Labels: communities (schools, departments, labs, centers, programs); collection and items; submission workflow; archival storage; metadata (database); search/browse; web user interface; users]
EDUCATIONAL TECHNOLOGY
Problem
Lack of persistent repository for Learning
Objects
Needed for reuse of
Entire courses
 Useful “learning objects”

Prior efforts not institution-based

Merlot, HEAL, etc.
Open Knowledge Initiative
Defines API for interoperation between

Course/Learning Management Systems
Open source (e.g. Coursework, Stellar)
 Commercial (e.g. Blackboard, WebCT)


Digital Repositories
Open source (e.g. DSpace, FEDORA)
 Commercial (e.g. TEAMS, Bulldog)

Collaborating with IMS Digital Repository
working group
OpenCourseWare
“Make MIT course materials that are used in the teaching of
almost all undergraduate and graduate subjects available
on the Web, free of charge, to any user anywhere in the
world.”
“Course materials contained on the MIT OCW Web site may
be used, copied, distributed, translated, and modified, but
only for non-commercial educational purposes that are
made freely available to other users under the same terms
defined by the MIT OCW legal notice.”
OpenCourseWare
Publication of all course content on the Web
Faculty-authored
 3rd party produced
 Metadata based on IMS specifications

DSpace
Archive for entire course web site
 Archive of significant content items or “learning
assets” for rediscovery and reuse

Metadata
SIMILE
Flexible metadata infrastructure
e.g. support for IMS/SCORM schema
HP/MIT Alliance-funded project
HP Labs
W3C’s Semantic Web activity
MIT Lab for Computer Science researcher (David Karger)
Haystack project on personalized information management
MIT Libraries’ DSpace providing test-bed, real-world
applications
RESEARCH AGENDA
Further R&D
Digital preservation
Datasets, multimedia, websites, programs
 Economics and user requirements

Publishing
E-journal alternatives
 Collaborative, iterative authoring tools

Rights management for academia