Publishing Cultural Heritage
Alastair Dunning
Digitisation Programme Manager
JISC (Joint Information Systems Committee)
a.dunning@jisc.ac.uk, 0203 006 6065
UCL Presentation, 19th June
Joint Information Systems Committee 31 May 2016 | Programme Meeting | Slide 1

JISC Digitisation
Programme Manager for 8 projects, part of a 16-project programme to digitise UK cultural heritage. For example
– British Newspapers 1620-1900
– Pre-Raphaelite Art
– Images from Scott Polar Research Institute
– Nineteenth-Century Pamphlets
– 20th-century Government Cabinet Papers
– http://www.jisc.ac.uk/digitisation
Started April 2007, finishing March 2009

Digitisation is easy
http://homepage.mac.com/xcia0069/lizzie-innes/index.htm

Growth of Digitisation
Possibilities of the Internet inspired rapid data capture of precious objects all over the world
But maybe this started out as a reactive cottage industry?
– Museums, Libraries and Archives rushing to digitise material and dump it on the web
How long does this material last on the Internet? Is it good quality? Can people locate it? Can they use it?
Quantity of material and the issue of long-term digitisation affect published material.
Added pressure supplied by Google digitisation programme
…. Digitisation is difficult

Need for an infrastructure
To address the issues raised in the previous slide
– How long does this material last on the Internet? Is it good quality? Can users locate it? Can they use it?
Illustrations from the British model; other countries’ models may be different
Demonstration that mass digitisation is complex, involving multiple players and technologies
Good infrastructure allows publication of cultural heritage to happen quickly; to show value for money; to be usable; to be easily accessible by educational communities and the general public

Data capture
To convert the physical to digital
– Flat scanners, robotic scanners, 3D scanners, direct capture via digital camera, remote-controlled camera, conversion via medium (e.g. microfilm), reel-to-digital, millions of typists
To cope with all kinds of material (newspapers, stained glass, banners, posters, maps, census, reports, grey literature, artefacts, film, audio … )
Need to have a keen idea of priorities for digitisation
Ensure competition but not redundancy (Keep machines working; keep staff in place)
Requires research on success of methodologies, dialogue with other subject areas (e.g. sciences)

University of Southampton Robotic Scanner
– Details at http://www.soton.ac.uk/mediacentre/news/2004/nov/04_181.shtml
If you don’t have a range of options for data capture – cultural heritage won’t get digitised

Standards and Formats
What file formats to ensure high-quality, long-term use
– Images – TIFF, but also JPEG2000, PNG
– Text – XML (and flavours thereof), but also RTF, Word
– Sound – WAV, AIFF, MP3, Ogg (formats and wrappers)
– Film – MJPEG, MPEG4, AVI, Quicktime, Flash (ditto)
Normally developed internationally, but local variations occur
Co-ordination, certification, co-operation, involvement and decisiveness at national and international levels
As with all parts of infrastructure, research and innovation
If you don’t have this – see current mess over video!
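As a rough illustration of the Standards and Formats slide, the sketch below groups the formats it names by content type and classifies a file by its extension. The extension spellings (for example `.jp2` for JPEG2000 or `.mov` for Quicktime) and the names `SLIDE_FORMATS` and `classify` are assumptions made for this example; the slide itself lists only format names, and a real workflow would inspect file contents rather than trust extensions.

```python
from pathlib import Path
from typing import Optional

# Formats named on the slide, grouped by content type.
# The file extensions are assumed conventional spellings, not taken from the slide.
SLIDE_FORMATS = {
    "image": {".tif", ".tiff", ".jp2", ".png"},
    "text": {".xml", ".rtf", ".doc"},
    "sound": {".wav", ".aiff", ".aif", ".mp3", ".ogg"},
    "film": {".mjpeg", ".mp4", ".avi", ".mov", ".swf"},
}


def classify(filename: str) -> Optional[str]:
    """Return the content type for a filename, or None if its extension is not listed."""
    ext = Path(filename).suffix.lower()
    for content_type, extensions in SLIDE_FORMATS.items():
        if ext in extensions:
            return content_type
    return None


if __name__ == "__main__":
    for name in ["page_001.TIF", "interview.ogg", "notes.txt"]:
        print(name, "->", classify(name))
```

A lookup table like this only flags files outside an agreed list; deciding which formats belong on the list is the co-ordination problem the slide describes.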
Metadata
Requires experts with sophisticated knowledge of the digital objects (e.g. newspapers, sound recordings, census reports)
As before, international co-ordination, certification, co-operation to develop international schemas and vocabularies
These are required at subject level, format level, technical level, preservation level. For example
– Dublin Core, MODS – generic resource description
– VRA4 – digital image description, including technical details
– METS – wraps together different information on a digital object
– PREMIS – preservation metadata over the long term
If you don’t have this – trust and authenticity, interoperability, resource discovery are severely hindered

Data Delivery
I.e. the people that build websites
Complex engagement between commercial (Google, ProQuest, Thomson Gale, JSTOR) and non-commercial suppliers (universities, museums etc.)
Huge range of potential business models
– Institutional subscription, Personal subscription
– Pay-per-view, Google Ads
– Open Access
– Mixed model
But no definitive answers about which is the more successful

Data Delivery – What is required
Ability to regularly serve up websites and data
Systems to deliver a range of digital content (e.g. newspapers, audio, posters, artefacts)
Low overheads and year-on-year costs
Good understanding of end-users
Working in partnership with other content providers
Commitment to innovation and good practice
If you don’t have this – wheel will be constantly reinvented, users will be driven away, material will be siloed

Preservation Facilities
Digital objects become obsolete with time.
Experts are required to ensure this does not happen
– Expertise in handling digital assets (content and all metadata) over the long term, and preferably also the hardware and media that hold such content
– Must be trusted and reliable
– Good relationship with data delivery providers
– Continual research – why, what and how to preserve?
Without this, digital data will be lost, endangering the entire investment made in digitisation

Preservation Facilities – Case Study
A good example from the late 1990s
Orphaned archaeological data rescued from obsolescence
CDs, floppy discs, PCs, databases, Word files, CAD files all left
But lack of metadata meant not all data could be retrieved
http://ahds.ac.uk/creating/case-studies/newham/

Digitisation Infrastructure
Network capabilities
Data capture
Authentication
Standards, Formats
Tools Development
Metadata
Usability testing
Data Delivery
Copyright clearing houses
Preservation
Consultants
Trained expert staff
And of course Money
Skill is in making sure these pieces fit together
Suitable courses