a centre of expertise in data curation and preservation

DCC Curation Lifecycle Model Project

Sarah Higgins
Sarah.Higgins@ed.ac.uk

Funded by:

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

Curation 101 – Edinburgh: 6 October 2008

Lifecycle Approach to Digital Curation and Preservation

A lifecycle approach is necessary because:
• Digital materials are fragile and susceptible to change from technological advances, from creation onwards
• Activities (or the lack of them) at each lifecycle stage influence the ability to manage and preserve materials in subsequent stages
• Reliable re-use of digital materials is only possible if materials are curated in such a way that their authenticity and integrity are retained

A lifecycle approach also:
• Facilitates continuity of service
• Supports verification of provenance
• Helps maximise the initial investment made in creating or gathering data
• Requires significant input and buy-in from the range of stakeholders – creators, curators, IT staff, management

From: Pennock, Maureen, Digital Curation: A Life-Cycle Approach to Managing and Preserving Usable Digital Information (2007)

Project Aim - Internal
• Planning tool
• Framework for DCC output – tools, services, research, eScience liaison
• Identify gaps in DCC materials
• Move to lifecycle approach to resource creation, management and presentation

• Standards Watch Paper: ISO 15489
• Briefing Paper: Using OAIS for Curation
• Briefing Papers: Persistent Identifiers
• Standards Watch Paper: PREMIS Data Dictionary
• Briefing Papers: Digital Preservation: Continued access to authentic digital assets

Project Aim - External
Enable others to:
• Plan curation and preservation activities
• Map granular functionality
• Define roles and responsibilities
• Build framework of standards and technologies
• Identify additional steps required
• Identify steps not required
• Ensure adequate documentation

DCC Curation Lifecycle Model Aims
• Curation specific model
• Generic graphical high level overview of lifecycle stages
• Organisational planning tool
• Extensible for greater granularity
• Adaptable for different domains
• Complement – not restate – a number of lifecycle applicable standards:
  • OAIS
  • ISO 15489
  • MoReq2
• Indicative not exhaustive

Scope notes
• Added in response to DCC team comment
• Generic language
• Cross domain understandability
• No specific standard quoted

Related Lifecycle Activity
• Records Continuum Model
• eScience Curation Report
• InterPARES 2
• SherpaDP
• Life Project
• Paradigm Project
• MoReq2

Records Continuum Model - 1996
• Archives and records management discipline
• Identification, control and access of electronic records
Upward, Frank, Structuring the Records Continuum -, 1996

E-Science Curation Report - 2003
• E-science discipline databases
• Takes integrated look at higher education data curation problems
• Poor granularity on curation activities

InterPARES - Chain of Preservation - 2005
• Records management / archives discipline
• Very detailed workflow for preservation activities
• Identifies key relationships and activities
• An audit tool for lifecycle management or service development

SherpaDP - 2006
• Institutional Repository focus
• Specific to ePrints
• Policies on submission format and permission for preservation processes
• Creator / preserver relationships
Knight, Gareth, A lifecycle model for an e-print in the institutional repository, 2006

Life Model v1.1 - 2007
• Typical chronological processes for preservation of digital objects
• Deals with costing activities
• Enables cost of preservation to be calculated

Paradigm Project - 2007
• Archives and records management discipline
• OAIS implementation
• Project specific

MoReq2 - 2008
• Electronic records management discipline
• EU Project by Serco
• Identifies which standards to use and where to use them
[Diagram: standards positioned around RECORDS. Labels: ISO 12033; GUID; X.509, XKMS; ISO 15801, 12654; PDF/A; ISO 18492, OAIS; RFC 2821, 2822; TIFF, JPEG 2000, ISO 216; ISO 12037; MoReq2, ISO 15489; XML; Metadata: DC, ISAAR, ISO 23081, 639, 2788, 5964, 8601; ISO 12142]

Reference Models / Frameworks
• ISO 14721:2003 - Open Archival Information Systems Reference Model (OAIS)
• ISO 20652:2006 - Producer-archive interface -- Methodology abstract standard (PAIMAS)
• ISO 15489:2001 - Information and documentation -- Records management

OAIS - 1
• Consultative Committee for Space Data Systems
• Good on ingest and post-ingest activities
• Weak on pre-ingest activities
• Basis for other initiatives
  • PREMIS data dictionary
  • Trusted Digital Repository Audit and Certification

OAIS - 2
• Mandatory responsibilities
  • Negotiate for and accept appropriate information from information producers
  • Obtain sufficient control of information to ensure Long-Term Preservation
  • Determine the Designated Community
  • Ensure information preserved is independently understandable to the Designated Community without expert assistance
  • Follow documented policies and procedures which ensure that the information is preserved against all reasonable contingencies, and which enable the information to be disseminated as authenticated copies of the original or traceable as the original
  • Make the preserved information available to the Designated Community

OAIS - 3
• Functional model
[Diagram: OAIS functional model. Entities: Ingest, Data Management, Archival Storage, Access, Preservation Planning, Administration. External actors: PRODUCER, CONSUMER, MANAGEMENT. Flow labels: SIP, AIP, DIP, Descriptive Info, queries, result sets, orders]
SIP = Submission Information Package
AIP = Archival Information Package
DIP = Dissemination Information Package

OAIS - 4
• Information model
  • Data object
  • Metadata
  • Preservation
  • Hierarchical description
  • Representation information
  • Packaging
  • Persistent identifiers

OAIS - 5
• Not for the faint hearted
• No implementation recommendations
• Break down into constituent parts
• Identify staff responsibilities at start
• See Cornell University’s project: MathArc: Ensuring Access to Mathematics Over Time

PAIMAS
• OAIS “prequel”
• Deals with pre-ingest actions
• Pre-ingest – initial contact, feasibility studies, scope, draft SIP definition, draft submission agreement
• Formal definition – final SIP design, transfer conditions, access restrictions, delivery
• Transfer phase – actual transfer, preliminary processing of SIP
• Validation phase – SIP validation, follow-up action with producer

OAIS Entity 2 - Ingest
[Diagram: Ingest workflow. Labels: Update SIP; PAIMAS; Receive Submission; Generate descriptive info; Format standards, document standards, procedures; Quality assurance; Ingest]
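
The PAIMAS phases and the SIP, AIP and DIP packages named on the slides above can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not an OAIS implementation: the class fields, the `ingest`, `disseminate` and `next_phase` helpers, the example identifier `aip-0001` and the use of SHA-256 checksums as a stand-in for integrity information are all hypothetical and are not prescribed by OAIS, PAIMAS or the slides.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional
import hashlib


class PaimasPhase(Enum):
    # Phase names follow the PAIMAS slide, in the order given there.
    PRE_INGEST = 1
    FORMAL_DEFINITION = 2
    TRANSFER = 3
    VALIDATION = 4


@dataclass
class SIP:
    """Submission Information Package received from the producer."""
    producer: str
    files: Dict[str, bytes]  # hypothetical layout: filename -> content


@dataclass
class AIP:
    """Archival Information Package held in archival storage."""
    identifier: str  # hypothetical persistent identifier
    files: Dict[str, bytes]
    descriptive_info: Dict[str, str]
    checksums: Dict[str, str]


@dataclass
class DIP:
    """Dissemination Information Package delivered to the consumer."""
    identifier: str
    files: Dict[str, bytes]


def next_phase(phase: PaimasPhase) -> Optional[PaimasPhase]:
    """Return the phase that follows `phase`, or None after validation."""
    phases = list(PaimasPhase)
    i = phases.index(phase)
    return phases[i + 1] if i + 1 < len(phases) else None


def ingest(sip: SIP, identifier: str) -> AIP:
    """Quality-check a SIP, generate descriptive info and build an AIP."""
    if not sip.files:
        raise ValueError("SIP failed quality assurance: no files")
    # Assumption: SHA-256 digests recorded at ingest for later integrity checks.
    checksums = {name: hashlib.sha256(data).hexdigest()
                 for name, data in sip.files.items()}
    descriptive_info = {"producer": sip.producer,
                        "file_count": str(len(sip.files))}
    return AIP(identifier, dict(sip.files), descriptive_info, checksums)


def disseminate(aip: AIP, wanted: List[str]) -> DIP:
    """Answer a consumer order by copying the requested files into a DIP."""
    missing = [name for name in wanted if name not in aip.files]
    if missing:
        raise KeyError(f"not in AIP: {missing}")
    return DIP(aip.identifier, {name: aip.files[name] for name in wanted})


sip = SIP("Example Lab", {"data.csv": b"a,b\n1,2\n"})
aip = ingest(sip, "aip-0001")
dip = disseminate(aip, ["data.csv"])
print(dip.identifier, sorted(dip.files))  # prints: aip-0001 ['data.csv']
```

Keeping the three package types as separate classes mirrors the slide's distinction between what the producer submits, what the archive stores and what the consumer receives.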

ISO 15489 - 1
• Origin: Australian Records Management standard
• Part 1: General – best practice framework
• Part 2: Guidelines – implementation guidelines
• Covers full life-cycle of a record
  • What to create
  • Capture – formats, technologies
  • Documentation
  • Appraisal
  • Retention or destruction

ISO 15489 - 2
• Ensures records are
  • Properly maintained
  • Easily accessible
  • Correctly documented across lifecycle
  • Disposal
    • Transparent
    • Pre-determined criteria

ISO 15489 - 3
• Defines records management principles: general
  • What to create
  • Format, structure, technology
  • Metadata – linkage, management
  • Use requirements
  • Organisation of records
  • Risk – not maintaining records
  • Legal and organisational requirements – standards
  • Safe storage and maintenance
  • Evaluation of processes

[Diagram: ISO 15489 steps. Other labels: Policy, Design, Standards, Implementation]
Step A Conduct preliminary investigation
Step B Analyse business activity
Step C Identify requirements for records
Step D Assess existing system
Step E Identify strategies
Step F Design records system
Step G Implement records system
Step H Conduct post-implementation review

ISO 15489 - 5
• Practical implementation - help
  • PD ISO/TR 15489-2:2001 Information and documentation – Records management – Part 2: Guidelines
  • Effective Records Management. A Management Guide to the Value of ISO 15489-1
  • Effective Records Management. Practical Implementation of ISO 15489-1
  • Effective Records Management. Performance Management for ISO 15489-1

ISO 15489 - 6
• Implementation benefits
  • Policy and procedures reviewed or developed
  • High profile for project
  • Awareness of issues
  • Recognition that all are involved
  • Organisational asset of records recognised
Dr Julie McLeod and Sue Childs – Assessing the impact of ISO 15489: the first international standard for records management, October 2005

ISO 15489
[Diagram labels: Step A: Conduct preliminary investigation; Step B: Analyse business activity; Steps C, D, E & F: Design records system; Step G: Implement records system; Step H: Review; Standards]

Planning Curation and Preservation Activities
• Customise the model
• Identify stages in the process
• Plan granular functionality
• Define roles and responsibilities
• Build framework of standards and technologies
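
One way to picture "Define roles and responsibilities" is to list the ISO 15489 steps A to H and check which of them nobody owns yet. The Python sketch below is a planning aid under stated assumptions, not part of the standard: the step names come from the slides, while the `ROLES` assignments and the `unassigned_steps` helper are hypothetical examples for an imagined organisation.

```python
from typing import Dict, List

# Step names as given on the ISO 15489 slides.
STEPS: Dict[str, str] = {
    "A": "Conduct preliminary investigation",
    "B": "Analyse business activity",
    "C": "Identify requirements for records",
    "D": "Assess existing system",
    "E": "Identify strategies",
    "F": "Design records system",
    "G": "Implement records system",
    "H": "Conduct post-implementation review",
}

# Hypothetical role assignments; the role names reuse the stakeholder
# groups mentioned in the deck (creators, curators, IT staff, management).
ROLES: Dict[str, List[str]] = {
    "A": ["management"],
    "B": ["curator"],
    "C": ["curator", "creator"],
    "F": ["curator", "IT staff"],
    "G": ["IT staff"],
}


def unassigned_steps(steps: Dict[str, str],
                     roles: Dict[str, List[str]]) -> List[str]:
    """Return the letters of steps with no role assigned, in step order."""
    return [letter for letter in steps if not roles.get(letter)]


for letter in unassigned_steps(STEPS, ROLES):
    print(f"Step {letter}: {STEPS[letter]} has no owner")
# prints:
# Step D: Assess existing system has no owner
# Step E: Identify strategies has no owner
# Step H: Conduct post-implementation review has no owner
```

The same gap check could be run against the stages of a customised lifecycle model instead of the ISO 15489 steps; only the `STEPS` table would change.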