Archiving and Presenting Journals with Rosetta
DRAG Dresden 2014
Matthias Groß, Bavarian State Library, Munich, Germany
10th IGeLU Conference, Budapest, September 2nd 2015
Short timeline (1) - DigiTool
BVB: Bavarian Library Network, a regional consortium of
research libraries
Head Office: department of the Bavarian State Library
2004-2006: looking for powerful „multimedia“ software
2006-:
implementing DigiTool, going live 2007/08
How to manage journals?
complex objects / collections / METS objects
→ BVB chooses METS-objects for journals
Short timeline (2) - Rosetta
2010-:
implementing Rosetta at BSB
journals not included in pilot workflows
How to manage journals?
collections / METS-objects/…
2013/14:
collection management gets better, but …
2014:
… decision to follow own approach in parallel
2015:
struggling with some problems, then:
→ Welcome, journals, to Rosetta!
Presenting journals with Rosetta
• BSB uses Rosetta as „light“ archive
whenever reasonable
• A tree structure with several levels
(unlimited depth) is powerful enough
to handle most common journal
structures and seems natural for end
user presentation
• If the tree structure is represented by
an „object“, this can correspond with
catalogue entries / persistent identifier
on the title level
WANTED:
WANTED:
(elsewhere)
Re-shaping our DigiTool concept for Rosetta
• In the „Manual Legal Deposit“ workflow, new issues are
ingested as new IEs
• Testing collection management in Rosetta in 2014, we
still saw some shortcomings (addressed in the Pressure
Points document)
• Adding new components (issues) to METS-objects would
create new versions and lead to a confusing situation,
obfuscating genuine preservation actions
→ BVB wants something that acts like METS, but is not a
METS-object
Starting at the end …
BVB developed its own METS viewer for DigiTool in 2012/13, which is
basically independent of the system holding the objects;
the display uses jQuery/CSS.
Only a few interfaces to the system are needed:
1. Table of contents: from StructMap/FileSec → JSON (Precache),
a tree structure with DigiTool PIDs of components as leaves
2. Bibliographic metadata: on-the-fly from original
MARC/MODS/DC data (2-layer XSLT transformation to JSON)
3. Request for a child object: uses delivery URL for embedded
mode (provides main title and stream)
4. Thumbnail preview: based on Table of contents using special
Delivery Rule
Facial composite of the solution (1)
1. Table of contents as „near-METS“
• All components of a journal share the same
bibliographic ID in dc:relation
• Store reference data (volume, issue, year)
in dcterms:bibliographicCitation
(trick: use OpenURL 1.0)
• Based on this information, a ToC can be
created and stored in the file system as
BibID.json with Rosetta's IE IDs as leaves.
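To illustrate the last bullet, here is a minimal sketch of how such a ToC could be assembled. The nested year / volume / issue layout, the function name build_toc and the sample IE IDs are assumptions made for this example; the slides do not show the actual structure of BibID.json.

```python
import json

def build_toc(components):
    """Group journal components into a nested tree: year -> volume -> issue.

    Each component is a dict with 'year', 'volume', 'issue' and 'ie_id'.
    The leaves of the tree are the IE IDs (hypothetical values here).
    """
    tree = {}
    for comp in components:
        year = tree.setdefault(comp["year"], {})
        volume = year.setdefault(comp["volume"], {})
        volume[comp["issue"]] = comp["ie_id"]  # leaf: IE ID of the issue
    return tree

# Sample components that all share one bibliographic ID (assumed data).
components = [
    {"year": "2015", "volume": "2", "issue": "3", "ie_id": "IE1003"},
    {"year": "2015", "volume": "2", "issue": "1", "ie_id": "IE1001"},
    {"year": "2014", "volume": "1", "issue": "4", "ie_id": "IE0904"},
]

toc = build_toc(components)
# This JSON text is what would be stored as e.g. BV123456789.json.
print(json.dumps(toc, indent=2, sort_keys=True))
```

A plain nested dictionary keeps the tree depth flexible, so additional levels (for example supplements) could be added without changing the viewer's basic traversal.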
Facial composite of the solution (1a)
OpenURL as container
Plan: use MARC/MODS metadata instead; the OpenURL
trick is not very friendly for human editing
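As an illustration of the "OpenURL as container" idea, the following sketch reads volume, issue and year from an OpenURL 1.0 key/encoded-value string. The keys rft.volume, rft.issue and rft.date come from the OpenURL journal format; which keys BVB actually stores, and the sample string itself, are assumptions.

```python
from urllib.parse import parse_qs

def parse_citation(kev):
    """Extract reference data from an OpenURL 1.0 KEV string."""
    fields = parse_qs(kev)

    def first(key):
        # parse_qs returns a list per key; take the first value if present.
        values = fields.get(key)
        return values[0] if values else None

    return {
        "volume": first("rft.volume"),
        "issue": first("rft.issue"),
        "year": first("rft.date"),
    }

# Assumed example of a value held in dcterms:bibliographicCitation.
example = (
    "ctx_ver=Z39.88-2004"
    "&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal"
    "&rft.volume=2&rft.issue=3&rft.date=2015"
)

citation = parse_citation(example)
print(citation)
```

The string is easy for software to read but, as noted above, awkward to edit by hand because of the percent-encoding.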
Facial composite of the solution (2)
2. Bibliographic metadata
The BibID is known (from each component); for display, fetch
the current MARC-XML record via the Aleph SRU interface
3. Request for child object
DeliveryRule „embedded“ in Rosetta
4. Thumbnail preview
DeliveryFunction „thumbnail“ in Rosetta
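For point 2 above, a request to an SRU interface could be built roughly as follows. The parameters version, operation, query, recordSchema and maximumRecords are standard SRU parameters; the base URL, the index name rec.id and the schema name marcxml are placeholders, since the slides do not give the actual Aleph configuration.

```python
from urllib.parse import urlencode

def sru_url(base_url, bib_id):
    """Build an SRU searchRetrieve URL for one bibliographic record."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        # Index name is an assumption; the real server may use another one.
        "query": f'rec.id="{bib_id}"',
        # Schema name is an assumption; servers advertise theirs via "explain".
        "recordSchema": "marcxml",
        "maximumRecords": "1",
    }
    return f"{base_url}?{urlencode(params)}"

# Placeholder endpoint, not the real Aleph SRU address.
url = sru_url("https://sru.example.org/sru", "BV123456789")
print(url)
```

Because the record is fetched at display time, the viewer always shows the current catalogue data without storing a copy next to the ToC.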
Proof of concept
Creation of near-METS industrialized
Our approach: Harvesting the OAI interface (good
experience with DigiTool)
However, we had problems getting valid XML output
from Rosetta. After some months it turned out that there is a
config parameter ‚dublincore_additional_namespaces‘
(see Home > Advanced > Configuration > General >
General Parameters) that should be defined as [blank],
which was not the case in our installation.
Data processing (simplified: without deletions)
1. Rosetta OAI repository
2. Harvest: What's new since …? (filter by journal)
3. Found new component?
   e.g. BibID BV123456789, issue 3, vol. 2, year 2015
   - Known journal: add to StructMap BV123456789.json
   - New journal: create StructMap BV123456789.json
     get bibliographic MD from Aleph
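The processing described above can be sketched in a few lines. This is only an illustration under assumptions: the OAI base URL, the use of the oai_dc metadata prefix, the function names and the JSON layout (year / volume / issue with IE IDs as leaves) are not taken from the slides. Only verb, metadataPrefix and from are standard OAI-PMH arguments.

```python
import json
import tempfile
from pathlib import Path
from urllib.parse import urlencode

def oai_list_records_url(base_url, since):
    """URL for an incremental harvest: 'What's new since ...?'"""
    args = {"verb": "ListRecords", "metadataPrefix": "oai_dc", "from": since}
    return f"{base_url}?{urlencode(args)}"

def add_component(toc_dir, bib_id, year, volume, issue, ie_id):
    """Add one harvested component to <bib_id>.json, creating it if needed.

    Returns True for a new journal (file created), False for a known one.
    """
    path = Path(toc_dir) / f"{bib_id}.json"
    is_new = not path.exists()
    toc = {} if is_new else json.loads(path.read_text(encoding="utf-8"))
    toc.setdefault(year, {}).setdefault(volume, {})[issue] = ie_id
    path.write_text(json.dumps(toc, indent=2, sort_keys=True), encoding="utf-8")
    return is_new

# Placeholder endpoint, not a real Rosetta address.
harvest_url = oai_list_records_url("https://rosetta.example.org/oai", "2015-09-01")

# Simulate two harvested components of the same journal (assumed IE IDs).
with tempfile.TemporaryDirectory() as toc_dir:
    first = add_component(toc_dir, "BV123456789", "2015", "2", "3", "IE1003")
    second = add_component(toc_dir, "BV123456789", "2015", "2", "4", "IE1004")
    stored = json.loads(
        (Path(toc_dir) / "BV123456789.json").read_text(encoding="utf-8")
    )

print(harvest_url)
print(first, second, stored)
```

The return value of add_component marks the "new journal" branch, which is the point where a real implementation would additionally fetch the bibliographic metadata from Aleph.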
Following two tracks
Combining near-METS with Rosetta-Collections
1 collection equals 1 journal

Metadata on journal level

URN on journal level (PP: CM 2.2.2)

AssignCMS for journal level (metadata in Rosetta // URN, ArchiveURL in
ALEPH) (Collection Support – WP, 2012)

Searching monographs and journals in parallel (IEs and collections, PP:
CM 2.2.3)

Manual Legal Deposit: issue goes to the correct journal „automatically“

Easy administration of IEs in Rosetta
They are waiting:
Legal Deposit:
- in DigiTool: 450 journals, 15,000 issues
- in the backlog: 100+ journals, new titles arriving constantly
OA publications
- finalizing collection strategy for Bavarica and special subject fields
Licensed publications (E-journal backfiles):
- responsibility on national, regional and local levels
for hosting and long-term preservation
Digitized material
- from ZEND / TSM
Thank you very much for your interest
in the most fascinating format
of scientific literature!
gross@bsb-muenchen.de