From Individual Solutions to Generic Tools – Digitization at the Max Planck Society
Digitization Day 2012, Geneva
Andrea Kulas

To start with…
Differences…
– Journals (different locations!)
– Rare books (dating back to the 6th century!)
But… similar challenges (Exchange!)
One infrastructure for different collections (Visibility!)
-> service to support the digitization of library holdings at Max Planck Institutes (goal: make digital objects usable for scientists)

14.09.2012 Digitization Day 2012

Digitization Lifecycle - a short overview
February 1st 2011 – January 31st 2013
Partners:
– MPI for European Legal History, Frankfurt
– MPI for Human Development, Berlin
– Kunsthistorisches Institut, Florence
– Bibliotheca Hertziana, Rome
– Max Planck Digital Library, Munich
Affiliated/Associated:
– MPI for Medical Research
– MPI for Mathematics in the Sciences
– MPI for the History of Science

Lifecycle?
[Diagram; labels as they appear on the slide]
– Tools
– Guideline
– Materials: Monographs, Multivolumes, Volumes
– Import & View, Scan & (OCR), Edit, Prepare, Publish, Virtual Research

What is generic about the tools in Digitization Lifecycle?
– Not tailor-made for one specific problem, but able to perform different types of tasks
– Individual requirements <-> compromise
– 4 -> 80 MPG Institutes?

Ingest, Export and Data Formats
– TEI P5 (powerful text encoding format <-> exchange)
– DLC Schema (-> Transformations)
– MAB.XML (-> MODS, Transformations)
– TIFF, JPEG, PNG
– Export: PDF, DFG Viewer, METS/MODS, TEI + OAI-PMH

Editing & Pagination
– Page-based
– Manipulate table of contents: changing hierarchies and deleting structural elements
– Flexibility: sequence of steps
– Optional: manual setting of end-points for chapters
– Batch processes for pagination

A Timeline
[Timeline figure; labels as they appear on the slide]
– Time axis: Aug 2012, Sept., Oct., Nov., Dec., Jan. 2013
– Milestones: GUI Version 1, Testing, DLC Prototype, DLC Application (Open Source), Bugfixing and Release, GUI Version 2, Migration and Ingest

Andrea Kulas (kulas@mpdl.mpg.de)
Lu Yu (Yu@mpdl.mpg.de)

DLC Technical Overview
Digitization Day 2012, Geneva
Lu Yu

Agenda
– eSciDoc Overview
– DLC data model & system architecture
– First DLC experience

eSciDoc Project

eSciDoc Services and Solutions
Services
– generalized resources (Items, Containers, Contexts)
– versioning, persistent identification, searching, statistics, authentication, authorization
– used by developers, end users and non-human service requestors
Solutions
– work with specialized resources: publication items, images and image albums, digitized texts, language resources, transcriptions, translations
– enable different resource-specific workflows
– visualize and reuse services and add value (e.g. data mash-ups, specific views)
For more information, visit http://escidoc.org and http://colab.mpdl.mpg.de/mediawiki/Portal:ESciDoc

Agenda
– eSciDoc Overview
– DLC data model & system architecture
– First DLC experience

Data model
[Diagram; labels as they appear on the slide]
– Users: admin, depositor, moderator
– Status: pending, released
– Scans (JPEG, TIFF, PNG)
– Bibliographic metadata (e.g. MAB)
– Full text (TEI-P5)
– Collection
– Organization

System architecture
[Diagram; labels as they appear on the slide]
– DLC application: (Batch) Upload (scans, MD, full text), View, Edit (Structure), Search, Export, Annotation
– Scanserver (Digilib)
– eSciDoc Services
– Annotation Server (Yuma) (PostgreSQL)
– MODS XML, TEI XML
– eSciDoc Core (Fedora)

Open Source
– Common Development and Distribution License (CDDL, OSI-approved)
– Technologies: JSF 2.2, Richfaces 4, Tomcat 7, eSciDoc services
– Check out source code from the repository: https://subversion.mpdl.mpg.de/repos/virr/digi_lifecycle

Agenda
– eSciDoc Overview
– DLC data model & system architecture
– First DLC live experience

Thank you for your attention!
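
Note on "Batch processes for pagination" (Editing & Pagination slide): the slides do not say how DLC implements this step. The sketch below is only an illustration of what a batch pagination pass could look like, under the assumption that the first scans of a volume are front matter labelled with Roman numerals and the rest are numbered in Arabic. The function names `to_roman` and `paginate` and their parameters are hypothetical and are not taken from the DLC source code.

```python
from typing import List


def to_roman(n: int) -> str:
    """Convert a positive integer to a lowercase Roman numeral."""
    if n <= 0:
        raise ValueError("Roman numerals need a positive integer")
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    parts = []
    for value, symbol in numerals:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def paginate(scan_count: int, front_matter: int = 0, first_page: int = 1) -> List[str]:
    """Return one page label per scan, in scan order (hypothetical helper).

    The first `front_matter` scans get Roman labels (i, ii, ...);
    the remaining scans get Arabic labels starting at `first_page`.
    """
    if scan_count < 0 or not 0 <= front_matter <= scan_count:
        raise ValueError("front_matter must be between 0 and scan_count")
    roman = [to_roman(i) for i in range(1, front_matter + 1)]
    arabic = [str(first_page + i) for i in range(scan_count - front_matter)]
    return roman + arabic


# Six scans, of which the first two are front matter.
labels = paginate(scan_count=6, front_matter=2)
print(labels)  # ['i', 'ii', '1', '2', '3', '4']
```

Computing all labels in one pass, rather than page by page, is what makes the step a batch operation: an editor would only set the front-matter count and the first page number, then correct individual exceptions by hand.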