From Individual Solutions to Generic Tools Digitization Day 2012, Geneva Andrea Kulas

From Individual Solutions to Generic Tools
– Digitization at the Max Planck Society
Digitization Day 2012, Geneva
Andrea Kulas
To start with...
Differences...
Journals (different locations!)
Rare books (dating back to the 6th century!)
But...
Similar challenges (exchange!)
One infrastructure for different collections (visibility!)
-> a service to support the digitization of library holdings at Max Planck Institutes
(goal: make digital objects usable for scientists)
14.09.2012
Digitization Day 2012
Digitization Lifecycle - a short overview
February 1st, 2011 to January 31st, 2013
Partners:
MPI for European Legal History, Frankfurt
MPI for Human Development, Berlin
Kunsthistorisches Institut, Florence
Bibliotheca Hertziana, Rome
Max Planck Digital Library, Munich
Affiliated/Associated:
MPI for Medical Research
MPI for Mathematics in the Sciences
MPI for the History of Science
Lifecycle?
Stages: Prepare, Scan & (OCR), Import & View, Edit, Publish, Virtual Research
Tools, Guideline
Materials: Monographs, Multivolumes, Volumes
What is generic about the tools in Digitization Lifecycle?
Not tailor-made for one specific problem, but able to perform different types of tasks
Individual requirements <-> compromise
4 -> 80 MPG Institutes?
Ingest, Export and Data Formats
TEI P5 (powerful Text Encoding Format <-> exchange)
DLC Schema (-> Transformations)
MAB.XML (-> MODS, Transformations)
TIFF, JPEG, PNG
Export: PDF, DFG Viewer, METS/MODS, TEI + OAI-PMH
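As a rough illustration of the OAI-PMH export path listed above, the following sketch builds a ListRecords request URL. The base URL is a placeholder (the slides do not give the actual endpoint), and the function name and the optional set name are hypothetical. Only the verb and argument names (verb, metadataPrefix, set, from) come from the OAI-PMH protocol itself; which metadata prefixes besides oai_dc a given repository offers is an assumption to be checked against that repository.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the slides do not state the real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # optional selective harvesting by set
    if from_date:
        params["from"] = from_date      # optional lower date bound, e.g. "2012-09-14"
    return f"{base_url}?{urlencode(params)}"


print(list_records_url(BASE_URL))
```

A harvester would fetch this URL and follow any resumption tokens in the response; that part is omitted here.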
Editing & Pagination
Page-based
Manipulate Table of Contents:
Changing hierarchies and deleting structural elements
Flexibility: Sequence of steps
Optional:
Manual setting of end-points for chapters
Batch Processes for Pagination
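To make the idea of a batch process for pagination concrete, here is a minimal sketch that assigns consecutive page labels to a range of scans in one step. It is not the DLC implementation; the function names, the 0-based scan indices, and the choice of Roman versus Arabic styles are all assumptions made for illustration.

```python
def to_roman(n):
    """Convert a positive integer to a lowercase Roman numeral."""
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def paginate(labels, start, end, first_number=1, style="arabic"):
    """Return a copy of labels with scans start..end (0-based, inclusive)
    given consecutive page numbers beginning at first_number."""
    result = list(labels)
    for offset, index in enumerate(range(start, end + 1)):
        number = first_number + offset
        result[index] = to_roman(number) if style == "roman" else str(number)
    return result


# Six scans: two front-matter pages in Roman numerals, then pages 1 to 4.
scans = [None] * 6
scans = paginate(scans, 0, 1, style="roman")
scans = paginate(scans, 2, 5)
print(scans)  # ['i', 'ii', '1', '2', '3', '4']
```

Applying the function range by range mirrors the page-based editing described above: each call touches only the selected scans and leaves the rest unchanged.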
A Timeline
Months: Aug 2012, Sept., Oct., Nov., Dec., Jan. 2013
Milestones: GUI Version 1; Testing; DLC Prototype; DLC Application (Open Source); Bugfixing and Release; GUI Version 2; Migration and Ingest
Andrea Kulas (kulas@mpdl.mpg.de)
Lu Yu (Yu@mpdl.mpg.de)
DLC Technical Overview
Digitization Day 2012, Geneva
Lu Yu
Agenda
eSciDoc Overview
DLC data model & system architecture
First DLC experience
eSciDoc Project
eSciDoc Services and Solutions
Services
– generalized resources (Items, Containers, Contexts)
– versioning, persistent identification, searching, statistics, authentication, authorization
– used by developers, end users and non-human service requestors
Solutions
– work with specialized resources: publication items, images and image albums, digitized
texts, language resources, transcriptions, translations
– enable different resource-specific workflows
– visualize and reuse services and add value (e.g. data mash-ups, specific views)
For more information, visit http://escidoc.org
and http://colab.mpdl.mpg.de/mediawiki/Portal:ESciDoc
Data model
Users: admin, depositor, moderator
Content: Scans (jpeg, tiff, png), Bibliographic metadata (e.g. MAB), Full text (TEI-P5)
States: pending, released
Collection
Organization
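The entities on this slide can be sketched as plain data classes. This is only an illustration under stated assumptions: the class and field names (Volume, Collection, release, and so on) are hypothetical, the relationships between them are inferred from the slide rather than taken from the DLC schema, and no rule about which user role may release content is modelled because the slide does not state one.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Role(Enum):
    ADMIN = "admin"
    DEPOSITOR = "depositor"
    MODERATOR = "moderator"


class State(Enum):
    PENDING = "pending"
    RELEASED = "released"


@dataclass
class User:
    name: str
    role: Role


@dataclass
class Volume:
    scans: List[str]                  # file names of jpeg, tiff or png scans
    metadata: str                     # bibliographic metadata, e.g. a MAB record
    full_text: Optional[str] = None   # optional TEI-P5 full text
    state: State = State.PENDING      # assumption: new content starts as pending

    def release(self):
        """Move the volume from pending to released."""
        self.state = State.RELEASED


@dataclass
class Collection:
    name: str
    organization: str                 # owning organization (assumed relationship)
    volumes: List[Volume] = field(default_factory=list)


collection = Collection(name="Rare books", organization="Example institute")
volume = Volume(scans=["0001.tif", "0002.tif"], metadata="<mab/>")
collection.volumes.append(volume)
volume.release()
print(volume.state.value)  # released
```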
System architecture
DLC application: (Batch) Upload (Scans, MD, full text), View, Edit (Structure), Search, Export, Annotation
Scanserver (Digilib)
Annotation Server (Yuma) (PostgreSQL)
eSciDoc Services: eSciDoc Core (Fedora)
Data: MODS XML, TEI XML
Open Source
Common Development and Distribution License (CDDL, OSI-approved)
Technologies: JSF 2.2, RichFaces 4, Tomcat 7, eSciDoc services
Check out source code from the repository:
https://subversion.mpdl.mpg.de/repos/virr/digi_lifecycle
Agenda
eSciDoc Overview
DLC data model & system architecture
First DLC live experience
Thank you for your attention!