13_pulling_it_together - KCL Digital Consultancy Service

Pulling it all together…
with thanks to Sheila Anderson
Think of it as a continuum…
[Diagram: the resource lifecycle as a continuum of numbered stages.]
Stages shown: Creation; Submission; Quality Assessment and Publication; Revision(s); Technical Obsolescence; Review Retention; Withdraw
Activities shown along the continuum:
File Format & Content Types Determined
Resource Discovery Metadata
Technical Metadata
Rights Metadata
File Format Conversion
Unique, Persistent Identifier
Version Control
Migration, Emulation
Other Preservation Action
The OAIS Model
Vocabulary for Digital Repository practitioners
Sub-divides a complex task into related functions
Provides logic and structure to allow digital holdings to
be managed and processed
Six Primary Functions
Ingest
Archival Store
Access
Data Management
Administration
Preservation Planning
OAIS Functional Model
Some thoughts…
OAIS is a conceptual reference model and not a system
design model
Functions in the OAIS Reference Model may not
necessarily correspond with the functional modules of a
system that would implement the model.
(Organisational needs vary!)
Functionality of an actual digital repository may be
much more limited than in the OAIS Reference Model,
and may include additions
Take from it only what you need to manage your digital
collections
Process and Activities
Business and strategic planning, including costs and fund raising
Resource creation (not included in OAIS model)
Resource preparation and enhancement (not included in OAIS)
Resource presentation: access and delivery (Access)
Resource curation / Data Management (Ingest)
Preservation planning (Preservation Planning)
Archival storage (Archival Store)
Management and administration (Admin + bits of data
management)
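The mapping in the list above can be written down as a small lookup table, which makes it easy to see which activities fall outside the OAIS model. This is only a sketch: the names OaisFunction, ACTIVITY_TO_OAIS and activities_outside_oais are hypothetical, and treating business and strategic planning as unmapped is an assumption, since the list gives no OAIS counterpart for it.

```python
from enum import Enum


class OaisFunction(Enum):
    """The six primary functions, named as in this presentation."""
    INGEST = "Ingest"
    ARCHIVAL_STORE = "Archival Store"
    ACCESS = "Access"
    DATA_MANAGEMENT = "Data Management"
    ADMINISTRATION = "Administration"
    PRESERVATION_PLANNING = "Preservation Planning"


# Activity -> OAIS functions it draws on. An empty tuple means the list
# above gives no OAIS counterpart for that activity.
ACTIVITY_TO_OAIS = {
    "Business and strategic planning": (),       # assumption: no mapping stated
    "Resource creation": (),                     # stated as not included in OAIS
    "Resource preparation and enhancement": (),  # stated as not included in OAIS
    "Resource presentation": (OaisFunction.ACCESS,),
    "Resource curation": (OaisFunction.INGEST,),
    "Preservation planning": (OaisFunction.PRESERVATION_PLANNING,),
    "Archival storage": (OaisFunction.ARCHIVAL_STORE,),
    "Management and administration": (
        OaisFunction.ADMINISTRATION,
        OaisFunction.DATA_MANAGEMENT,
    ),
}


def activities_outside_oais(mapping):
    """Return the activities that have no OAIS function attached."""
    return [name for name, functions in mapping.items() if not functions]


if __name__ == "__main__":
    for name in activities_outside_oais(ACTIVITY_TO_OAIS):
        print(name)
```

Run as a script, this prints the three activities that need planning beyond what the reference model describes.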
Process and Activities
Strategic and business planning; fund raising
What is the purpose of your data creation efforts?
Are you undertaking a one-off project or establishing
something longer term?
If the latter, is it short, medium or long-term?
Will you be providing a service for others to use?
What skills will you need?
What infrastructure?
What will it cost?
How will you raise the money?
Do you want to do everything in-house or outsource some or
all of the tasks?
Process and Activities
Resource creation: the methods used to convert
analogue information into raw digital data
Scanning
Digital photography
Transcription
OCR
Etc. etc.
Process and Activities
Resource preparation and enhancement: methods used
to transform raw data into information
Modeling and design of the resource’s information
structure
Normalisation and enhancement of data
Structuring
Encoding
Manipulation
Adding and creating metadata
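As one illustration of "adding and creating metadata", a basic technical metadata record can be gathered automatically from a file. The sketch below uses only the Python standard library; the function name technical_metadata and the field names are hypothetical, not taken from any metadata standard, and the MIME type is guessed from the file extension rather than from the file's content.

```python
import mimetypes
import os
from datetime import datetime, timezone


def technical_metadata(path):
    """Build a minimal technical metadata record for one file."""
    stat = os.stat(path)
    # guess_type looks at the file name only; it returns None if unknown.
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        "file_name": os.path.basename(path),
        "size_bytes": stat.st_size,
        "mime_type": mime_type,
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
```

A real repository would normally map fields like these onto whichever metadata standard it has adopted and would use a format identification tool rather than the file extension.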
Process and Activities
Resource presentation: providing access, delivery and
use of the resource
Static or interactive web-based services; Off-line services
Resource discovery
On-line browsing and selection
Requests from users
Coordinates requests and execution of requests
Applies controls to limit access as needed
Generates information for dissemination
Delivers information (data) to users
Process and Activities
Resource curation: management of the resource over the
long-term
Maintaining schema, view definitions, referential integrity
Receiving/managing updates & additions to data & metadata
Version Control
Performing quality assurance
Generating archive versions to archive’s data formatting and
documentation standards
Extracting descriptive information for data management and to
support search and retrieval functions
Coordinating updates and transfer of the data and metadata to
appropriate places
Process and Activities
Preservation planning
Evaluating contents of archive; periodically
recommending updates to migrate current holdings
Developing and maintaining recommendations for
archive standards and policies
Monitoring changes in technology environment,
users’ service requests, and knowledge base
Developing and maintaining detailed migration
plans, software prototypes and test plans
Process and Activities
Archival storage
Receiving data from curation process and adding to
permanent storage
Managing storage hierarchy
Refreshing/replacing media
Performing error checks
Providing disaster recovery capabilities
Duplicating contents for off-site storage
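"Performing error checks" is often done by comparing each stored file against a checksum recorded when it entered storage. The following is a minimal sketch of that idea, assuming SHA-256 checksums and a manifest held as a simple dictionary; the names sha256_of and failed_fixity are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def failed_fixity(manifest):
    """Check files against a manifest of {path: expected_digest}.

    Returns a sorted list of paths that are missing, unreadable,
    or whose current digest differs from the recorded one.
    """
    failures = []
    for path, expected in manifest.items():
        try:
            actual = sha256_of(path)
        except OSError:
            failures.append(path)  # missing or unreadable file
            continue
        if actual != expected:
            failures.append(path)
    return sorted(failures)
```

The same check can be run on the off-site duplicate, so that a damaged copy in one location can be replaced from the other.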
Process and Activities
Management and administration
Negotiating agreements and licences
Managing IPR, copyright issues
Managing quality control and audit process
Monitoring systems operations
Providing inventories, reports, updates
Establishing and maintaining archive standards and
policies
Providing user support
Requirements
[Diagram. Labels shown: Strategic Business Planning; OAIS-Type Repository; Money; Creation; Preparation; Access; Curation; Preservation Planning; Archival store; Management and Administration; Infrastructure; Skills]
Infrastructure
Space
Equipment
Facilities
Project Management
Systems and tools
Software and hardware
Standards – technical, metadata etc.
Skills
Scholars / Curators / digital librarians (creation, acquisition
and curation)
Scanner operatives, photographers etc. etc.
Information professionals (metadata management & creation)
Computer scientists (systems design and development,
management tools)
Preservation professionals (preservation planning and
management, migration and emulation services)
Systems administrators (systems management, disc
partitioning, refreshment, back-up)
Managers and administrators: project, process, strategic and
financial etc.
Costs
Staff salaries and benefits
Operating expenses
Software
System equipment
Other?
Solutions: In-house Repository
“…is a set of services that an organisation or
institution offers to the members of its community for
the creation, management and dissemination of
digital materials created by the organisation or
institution and its community members. It is most
essentially an organisational commitment to the
stewardship of these digital materials, including
long-term preservation where appropriate, as well as
organisation and access or distribution.”
Lynch, C., ARL Bimonthly Report 226,
http://www.arl.org/newsltr/226/ir.htm
A very long quote…
As with computing, the cost of data repositories (done correctly) will be dominated
by the recurring costs of personnel performing curation, maintenance and
upgrade, and providing user advice, assistance, and support. The most sophisticated
of these personnel need professional skills in the relevant aspects of information
management and information technology (e.g. databases, archival file systems,
building portals), and will be developing and maintaining custom software. By using
a combination of high-speed networks and local high-speed caches, there is no hard
requirement to co-locate professional staff with physical storage, particularly staff
performing data acquisition and curation functions as opposed to disk partitioning,
regeneration, and backup functions. As with computing, there is need for support
personnel at local institutions, in discipline-specific groups (often located in
centers), and centralized in centers. Although further analysis is needed, we expect
that the most efficient approach will be to have relatively centralized storage
hardware (with supporting staff) but distributed data acquisition and curation
personnel.
http://www.communitytechnology.org/nsf_ci_report/report.pdf (page 77)
Solutions: Sharing; Outsourcing
Sharing infrastructure and costs
Institutional consortia
Regional consortia
Local/national alliances
Federated services
Local/national/regional
Outsourcing some or all of the tasks
Creation
Preservation
Example: Shared Archives Model
[Diagram. Labels shown:]
Users
Institutional Repository: accessions institutional collections and provides access and delivery services
Harvest preservation (and discovery) metadata
Preservation Service
Metadata integrated into preservation metadata store and on-line catalogue facilities
Capture preservation copies
Preservation System
Risk assessment process
Preservation Services
Benefits and Challenges
Benefits
Shared infrastructure
Shared expertise
Effective use of resources – human, technical and financial
Shared costs
Experts doing what they are best at
Challenges
Creating partnerships (partnerships or outsourcing?)
Agreement on policy and procedures
Establishing relationships of trust
Consistency of practice and outputs
Now it’s your turn…
Recap: Questions to Ask
What is the purpose of your data creation efforts?
Are you undertaking a one-off project or establishing
something longer term?
If the latter, is it short, medium or long-term?
Will you be providing a service for others to use?
What skills will you need?
What infrastructure?
What will it cost?
Do you want to do everything in-house or outsource some or
all of the tasks?
Resource creation
Skills?
Infrastructure?
Likely costs (high, medium, low)?
Possible solutions (in-house, outsource, other)?
Other issues?
Resource enhancement
Skills?
Infrastructure?
Likely costs (high, medium, low)?
Possible solutions (in-house, outsource, other)?
Other issues?
Resource presentation
Skills?
Infrastructure?
Likely costs (high, medium, low)?
Possible solutions (in-house, outsource, other)?
Other issues?
Resource curation
Skills?
Infrastructure?
Likely costs (high, medium, low)?
Possible solutions (in-house, outsource, other)?
Other issues?
Preservation planning
Skills?
Infrastructure?
Likely costs (high, medium, low)?
Possible solutions (in-house, outsource, other)?
Other issues?
Archival storage
Skills?
Infrastructure?
Likely costs (high, medium, low)?
Possible solutions (in-house, outsource, other)?
Other issues?
Management and administration
Skills?
Infrastructure?
Likely costs (high, medium, low)?
Possible solutions (in-house, outsource, other)?
Other issues?
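The worksheet above asks the same five questions of each activity, so the answers can be recorded in one small structure and checked for gaps. This is only a sketch of one way to do that; the names ActivityAssessment, blank_worksheet and unanswered are hypothetical, and treating "other issues" as optional is an assumption.

```python
from dataclasses import dataclass, field

# The seven activities, in the order the worksheet lists them.
ACTIVITIES = [
    "Resource creation",
    "Resource enhancement",
    "Resource presentation",
    "Resource curation",
    "Preservation planning",
    "Archival storage",
    "Management and administration",
]

COST_LEVELS = ("high", "medium", "low")
SOLUTIONS = ("in-house", "outsource", "other")


@dataclass
class ActivityAssessment:
    """Answers to the worksheet questions for one activity."""
    activity: str
    skills: list = field(default_factory=list)
    infrastructure: list = field(default_factory=list)
    likely_cost: str = ""   # one of COST_LEVELS once answered
    solution: str = ""      # one of SOLUTIONS once answered
    other_issues: list = field(default_factory=list)  # optional

    def is_complete(self):
        """True when every required question has a valid answer."""
        return (
            bool(self.skills)
            and bool(self.infrastructure)
            and self.likely_cost in COST_LEVELS
            and self.solution in SOLUTIONS
        )


def blank_worksheet():
    """Return an empty assessment for each activity."""
    return {name: ActivityAssessment(name) for name in ACTIVITIES}


def unanswered(worksheet):
    """List the activities whose assessment is not yet complete."""
    return [name for name, entry in worksheet.items() if not entry.is_complete()]
```

Calling unanswered(blank_worksheet()) returns all seven activities; the list shrinks as each assessment is filled in.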