Alliance Permanent Access to the Records of Science in Europe Network
Co-funded by the European Union under FP7-ICT-2009-6
aparsen.eu #APARSEN

The VCoE: what it offers
David Giaretta, APA, director@alliancepermanentaccess.org
APARSEN webinar, November 2014

Solutions to problems related to digital preservation
• Based on the experience and expertise of the “pioneers” of digital preservation
• All kinds of digital objects, but:
  – Audio-visual: perhaps better done by PrestoCentre
  – Documents and images: perhaps better done by OPF
• Need to:
  – Define the solution: CONSULTANCY
  – Using: TOOLS & SERVICES
  – Your people then need: TRAINING

CHALLENGES
• Different types of digital objects, e.g.
  – Rendered: simple images, sound, video, documents
  – Data: needs the meaning of the numbers etc.
  – Software
  – Time dependent
• Many different tools and services, many lists
• Many sources: APARSEN members and others
• Many terminologies, many glossaries
• NEED AN INTEGRATED VIEW

Many models – why another?
• See http://blogs.loc.gov/digitalpreservation/2012/02/life-cycle-models-for-digital-stewardship/
• Data Lifecycle Models and Concepts by CEOS, 2012, see http://www.ceos.org/images/DSIG/Data%20Lifecycle%20Models%20and%20Concepts%20v13.docx
• …and more

Integrated vision
• If the payback is not immediate then the resources need to be justified
• Since the resources have to be found somehow, the question “who pays and why?” is often heard
• To justify the resources needed for preservation one needs to identify the potential value
• To maintain, or even increase, the likely value, the techniques chosen for preservation play a key role
• http://www.alliancepermanentaccess.org/index.php/community/common-vision/
• Clickable image showing related research

Training
• Consistency through the “Integrated View”

Click on PRESERVATION:

Basic preservation activities
Libraries say: “Emulate or migrate”
• Emulate: can repeat what has been done before, BUT
  – Cannot use new applications
• Migrate: convert to a format which new software can use, BUT
  – What if there are many software systems?
• In both cases:
  – Works well with data only in special cases
  – Can repeat what was done before, but cannot do new things
  – Does not help with building cross-disciplinary communities

Preservation techniques
For each technique:
• Look for evidence (what evidence?)
• Must at least make sure we consider different types of data
  – rendered vs non-rendered
  – composite vs simple
  – dynamic vs static
  – active vs passive
• Must look at all types of threats

Evidence
• APA/APARSEN list of tools: http://www.alliancepermanentaccess.org/index.php/tools/tools-for-preservation/
  – Details of preservation-related software, examples of data, and the evidence of preservation linking software to types of data
  – Some of this evidence comes from specific testbeds but much comes from user scenarios

Tools
• Evidence-based selection
• Linked to SCIDIP-ES toolkits

Other evidence and tools
• The CASPAR project collected a large amount of evidence about the effectiveness of tools and services for many different types of data:
  – Scientific
  – Cultural heritage
  – Contemporary performing arts
• Prototyped the tools and services
• These have been developed in SCIDIP-ES: http://www.scidip-es.eu and http://intplatform.digitalpreserve.info
• Also a massive collection of information about the views of thousands of researchers, data managers and publishers across disciplines and around the world
  – PARSE.Insight

When things change
• We need to:
  – Know something has changed: Orchestration Service
  – Identify the implications of that change: Gap Identification Service
  – Decide on the best course of action for preservation: Preservation Strategy Toolkit
  – What RepInfo we need to fill the gaps (created by someone else or creating a new one): RepInfo Registry Service
  – If transformed, how to maintain data authenticity: Authenticity Toolkit
  – Alternatively, hand it over to another repository: Storage Service
  – Make sure data continues to be usable: RepInfo Toolkit, Data Virtualisation Toolkit, Process Virtualisation Toolkit

Threats and requirements for solution
• Threat: Users may be unable to understand or use the data, e.g. the semantics, format, processes or algorithms involved
  – Requirement for solution: Ability to create and maintain adequate Representation Information
  – Tools: RepInfo toolkit, Packager and Registry to create and store Representation Information. In addition the Orchestration Manager and Knowledge Gap Manager help to ensure that the RepInfo is adequate.
• Threat: Non-maintainability of essential hardware, software or support environment may make the information inaccessible
  – Requirement for solution: Ability to share information about the availability of hardware and software and their replacements/substitutes
  – Tools: Registry and Orchestration Manager to exchange information about the obsolescence of hardware and software, amongst other changes. The Representation Information will include such things as software source code and emulators.
• Threat: The chain of evidence may be lost and there may be lack of certainty of provenance or authenticity
  – Requirement for solution: Ability to bring together evidence from diverse sources about the Authenticity of a digital object
  – Tools: Authenticity toolkit will allow one to capture evidence from many sources which may be used to judge Authenticity
• Threat: Access and use restrictions may make it difficult to reuse data, or alternatively may not be respected in future
  – Requirement for solution: Ability to deal with Digital Rights correctly in a changing and evolving environment
  – Tools: Packaging toolkit to package access rights policy into AIP
• Threat: Loss of ability to identify the location of data
  – Requirement for solution: An ID resolver which is really persistent
  – Tools: Persistent Identifier system: such a system will allow objects to be located over time
• Threat: The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future
  – Requirement for solution: Brokering of organisations to hold data and the ability to package together the information needed to transfer information between organisations ready for long term preservation
  – Tools: Orchestration Manager will, amongst other things, allow the exchange of information about datasets which need to be passed from one curator to another
• Threat: The ones we trust to look after the digital holdings may let us down
  – Requirement for solution: Certification process so that one can have confidence about whom to trust to preserve data holdings over the long term
  – Tools: Certification toolkit to help repository manager capture evidence for ISO 16363 Audit and Certification

APARSEN test audit findings
• Lack of definition of Designated Community
  – SCIDIP-ES Gap Identification Service helps
• Lack of adequate Representation Information
  – SCIDIP-ES RepInfo Registry, Preservation Strategy Toolkit and RepInfo Toolkit help to create/share RepInfo
  – Orchestration Service and Gap Identification Service help the repository manager
• Inadequate Archival Information Packages
  – SCIDIP-ES Packaging tools help create AIPs (several flavours)
• Lack of hand-over plans
  – Orchestration Service helps find partners

Terminologies – APARSEN DP Glossary
Why another?
• Provides relationships between different terms from the various glossaries: OAIS, APARSEN, DPC, ANZ, SNIA, INTERPARES, TDR (ISO 16363)
  – Uses SKOS to organise the terms
  – Tells us whether a term is broader than, narrower than, or related to another term
• See http://www.alliancepermanentaccess.org/index.php/consultancy/dpglossary/
• Each term has a URI, e.g. Representation Information: http://www.alliancepermanentaccess.org/index.php/consultancy/dpglossary/#Representation_Information

Representation_Information
In scheme: OAIS
Definition: The information that maps a Data Object into more meaningful concepts.
An example of Representation Information for a bit sequence which is a FITS file might consist of the FITS standard which defines the format plus a dictionary which defines the meaning in the file of keywords which are not part of the standard. Another example is JPEG software which is used to render a JPEG file; rendering the JPEG file as bits is not very meaningful to humans but the software, which embodies an understanding of the JPEG standard, maps the bits into pixels which can then be rendered as an image for human viewing.
Pref Label: 0
Narrower terms: Format, Representation_Network, Semantic_Information, Structure_Information, Data_Type, Documentation, Format_ANZ, Logical_format, Packaging_Information
Broader term: Metadata
Related terms: Data_format, Form, Registry_Repository-of-Representation_Information

Standards support
• OAIS ISO 14721:2012
• Audit and Certification ISO 16363:2014

Solution providers
• Many possible solution providers
• APA members and others, e.g.
  – http://www.giaretta.org
  – http://www.iso16363.org
• Links to
  – Audit and Certification
  – Developing new standards following on from OAIS on the whole data lifecycle
• Links to RDA, e.g.
  – Active Data Management Plans: https://rdalliance.org/groups/active-data-management-plans.html
  – Preservation infrastructures: https://rdalliance.org/groups/preservation-e-infrastructure-ig.html

Summary
Solutions (consultancy, tools & services, training) based on:
• Experience and expertise of digital pioneers
• Evidence-based tools and services
  – All types of digital objects
• Training materials
  – On-line and face-to-face
• Consistency provided by
  – Integrated View
  – Terminology brought together by SKOS Glossary
  – Standards database
• Consistent with (and contributed to) the digital preservation fundamental standards ISO 14721 (OAIS) and ISO 16363 (Audit and Certification)

Resources
All resources are available on the website: http://www.alliancepermanentaccess.org
Contact: director@alliancepermanentaccess.org or david@giaretta.org

aparsen.eu Network of Excellence #APARSEN
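
Note on the SKOS glossary: the glossary slides say that SKOS records whether one term is broader than, narrower than, or related to another. The sketch below illustrates only that idea in plain Python. The `Glossary` class and its method names are hypothetical helpers invented for this note, not part of the APARSEN glossary or of any SKOS library, and the handful of terms is copied from the Representation_Information entry shown earlier. Following broader links upwards, as `ancestors` does, corresponds roughly to what SKOS calls the transitive form of "broader".

```python
from collections import defaultdict


class Glossary:
    """Tiny in-memory stand-in for a SKOS concept scheme (hypothetical helper)."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> directly broader terms
        self.narrower = defaultdict(set)  # term -> directly narrower terms
        self.related = defaultdict(set)   # term -> related terms (symmetric)

    def add_broader(self, term, broader_term):
        # Broader and narrower are inverse relations, so record both directions.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def add_related(self, a, b):
        # "Related" is symmetric, so record it for both terms.
        self.related[a].add(b)
        self.related[b].add(a)

    def ancestors(self, term):
        """Return every term reachable by following 'broader' links upwards."""
        seen, stack = set(), [term]
        while stack:
            for parent in self.broader.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


def build_example():
    # A few relationships copied from the Representation_Information entry.
    g = Glossary()
    for narrower in ("Semantic_Information", "Structure_Information"):
        g.add_broader(narrower, "Representation_Information")
    g.add_broader("Representation_Information", "Metadata")
    g.add_related("Representation_Information", "Data_format")
    return g


glossary = build_example()
print(sorted(glossary.narrower["Representation_Information"]))
print(sorted(glossary.ancestors("Semantic_Information")))
```

With relationships stored this way, a term taken from one source glossary can be traced up to the broader concept it shares with terms from the others, which is the kind of consistency the Integrated View relies on.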