DCC Development David Giaretta Associate Director (Development) Digital Curation Centre

Digital Curation Centre
a centre of expertise in data curation and preservation
DCC Development
David Giaretta
Associate Director (Development)
Funders:
Organisation to Engage & Collaborate
[Diagram: curation organisations (e.g. DPC); communities of practice: users; community support & outreach; Collaborative Associates Network of Data Organisations; service definition & delivery; management & admin support; research; research collaborators; development co-ordination; testbeds & tools; Industry; standards bodies]
Development – initial plans
• Registries/Repositories
– offering a repository of tools and technical information, a focal point for digital curators
– metadata standards
• Testbeds
– for testing and evaluating tools, methods, standards and policies in realistic settings
• Certification
– standards
What can we rely on in the Long Term?
• The bits (original or migrated)
– let us for the moment put to one side the issue of BIT PRESERVATION (but it is an issue)
• Physical documents that people can read
– e.g. ISO standards on paper
• Additional information we collect – either held by the DCC, its collaborators or successors
Preservation “vs” Current Use
• There are already very many architectures to support immediate use of information
– Aim to support these
• We therefore chose to be guided by
– long-term preservation aspects
• try to ensure that components of the preservation architecture can supplement other “current use” architectures.
– to promote this we should emphasise “interoperability” and “automated use” as far as possible.
– based initially on the OAIS Reference Model – but not limited to that
OAIS Reference Model – Functional Model
[Figure: OAIS functional model (labelled 4-1.2). Actors: Producer, Consumer, Management. Functional entities: Ingest, Archival Storage, Data Management, Access, Administration, Preservation Planning. Flows shown: SIP, AIP, DIP, Descriptive Info, queries, result sets, orders.]
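The flow through the functional model can be sketched in code. This is a minimal illustration only, not part of the OAIS standard or of any DCC software: the names `SIP`, `AIP`, `DIP`, `Archive`, `ingest`, `query` and `access` are hypothetical, and the sketch leaves out Administration and Preservation Planning entirely.

```python
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package, as delivered by a Producer."""
    content: bytes
    description: str


@dataclass
class AIP:
    """Archival Information Package, as held in Archival Storage."""
    identifier: str
    content: bytes
    description: str


@dataclass
class DIP:
    """Dissemination Information Package, as delivered to a Consumer."""
    identifier: str
    content: bytes


@dataclass
class Archive:
    """Toy archive (hypothetical): Ingest, Archival Storage, Data Management, Access."""
    storage: dict = field(default_factory=dict)    # Archival Storage: id -> AIP
    catalogue: dict = field(default_factory=dict)  # Data Management: id -> Descriptive Info

    def ingest(self, sip: SIP) -> str:
        """Turn a SIP into an AIP, store it, and record its Descriptive Info."""
        identifier = f"aip-{len(self.storage) + 1}"
        self.storage[identifier] = AIP(identifier, sip.content, sip.description)
        self.catalogue[identifier] = sip.description
        return identifier

    def query(self, term: str) -> list:
        """Return identifiers whose Descriptive Info mentions the term (a result set)."""
        return sorted(i for i, d in self.catalogue.items() if term.lower() in d.lower())

    def access(self, identifier: str) -> DIP:
        """Fulfil an order by building a DIP from the stored AIP."""
        aip = self.storage[identifier]
        return DIP(aip.identifier, aip.content)
```

A Producer would call `ingest`, and a Consumer would call `query` followed by `access`, which mirrors the SIP, AIP and DIP flows in the figure.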
OAIS – Preservation Planning key aspects
• Designated Communities & Knowledge Base
• Representation Net
Representation Net
Representation Information vs File Format
• File Format provides only limited information
– Knowing that a file is in Word 6.0 format does not allow one to understand its contents, e.g.
• File contains French text
• File has text with specialised terms
– Science data file (e.g. FITS) also has keywords and values
• What do they mean?
• Representation Information is not limited in this way
– N.B. includes File Format
• See DCC demo
– Registry/Repository of Representation Info
• Low cost of “buy-in”
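The distinction can be illustrated with a small sketch. The data model below is an assumption made for illustration, not the schema of the DCC Registry/Repository: `RepInfo`, `depends_on` and `collect` are hypothetical names. It treats the file format as just one node in a network of Representation Information that also records language and specialised terminology.

```python
from dataclasses import dataclass, field


@dataclass
class RepInfo:
    """One piece of Representation Information (hypothetical model)."""
    label: str
    # Further Representation Information needed to understand this item.
    depends_on: list = field(default_factory=list)


def collect(rep: RepInfo, seen=None) -> list:
    """Return the labels of everything reachable from rep, depth-first, without repeats."""
    if seen is None:
        seen = []
    if rep.label not in seen:
        seen.append(rep.label)
        for child in rep.depends_on:
            collect(child, seen)
    return seen


# The file format is only one of the things a future reader needs.
file_format = RepInfo("Word 6.0 file format")
language = RepInfo("French language")
glossary = RepInfo("Glossary of specialised terms", depends_on=[language])
document = RepInfo("Report.doc", depends_on=[file_format, language, glossary])

print(collect(document))
```

Walking the network from the document lists the file format alongside the language and the glossary, which is the sense in which Representation Information includes, but is not limited to, file format.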
Archival Information Package
Testbeds
• Hardware used by “curators” in the wild
• Hardware suppliers
• Software suppliers
– Commercial
– Non-commercial
– Details from projects
Standards and Audit & Certification
• How can people know to whom their information can be entrusted?
• OAIS follow-on standard(s) underway
– on which a certification program can be based
• From the standards:
– need to establish accreditation and certification bodies in preparation for offering audit and certification services
– audit, certification and accreditation are potential sources of long term funding for the DCC
– Testbeds and testing procedures
• for software certification
• hardware and software systems will need to be purchased, hired or borrowed.
– we expect to work with hardware and software manufacturers to certify hardware and software components
Working with Others
• Digital Library Federation
• The National Archives
• Global Grid Forum
• NARA
• Library of Congress
• Research Libraries Group
• Digital Preservation Coalition
• JISC community
• E-Science Community
• Associates Network
• …and many more
Development info – see http://dev.dcc.rl.ac.uk/twiki/bin/view for details of Wiki and email list (open to all)
The Virtuous Circle
[Diagram repeated from “Organisation to Engage & Collaborate”]