Digital Preservation - Digital Humanities Austria

Österreichische Tage der Digitalen Geisteswissenschaften
save the data - workshop on digital repositories, Dec 2nd 2014
Johannes Spitzbart
Phonogrammarchiv, Austrian Academy of Sciences

Collecting, preserving and documenting unique primary audiovisual
research sources and making them available for research use

Scholarly research (ethnomusicology, ethnology, linguistics, context
& discourse)

Technical research & development (standardization, e.g. IASA-TC;
preservation, replay & digitization of recordings; patent “Method
for Reconditioning of Data Carriers”) and training

Role in the digital research infrastructure:

Digitization, long-term preservation, data accessibility

Metadata capture, comprehensive documentation, database
access, searchability (online catalogue)
Manually managed system:

File Server – RAID 10 (= striped and mirrored) HDD array, 40 TB usable capacity; LTO Tape Library
(two backup copies)

Eternal preservation through continuous data migration and, if needed, format migration

Checksums for file integrity checks
Pros:
easy file/folder management, in-house maintenance, flexibility (independent of
proprietary management software; hosting of any file format), probably easier disaster
management
Con:
additional effort for manual file/folder management and linking to the database (human
resources)
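
The checksum-based integrity check mentioned above could look roughly like the following. This is a minimal sketch, not the archive's actual tooling: the choice of SHA-256, the function names `build_manifest` and `verify`, and the idea of a per-folder manifest are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file below root (as a relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems
```

In a manually managed system, a manifest of this kind would be built once at ingest and re-checked after every copy or migration step, for example after writing to tape and reading back.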
Content:
• Digitized (approx. 50%) tape/disc collection (audio)
• Born-digital output of supported external & own projects (audio)
• Digitized acquired collections (audio)
• Workspace (for temp data)

System diagram (labels):
• Several workstations
• Network switch (1-Gigabit Ethernet)
• File servers (Gigabit Ethernet): 2 independent, scalable hard drive storage units in
storage area network environment (mirrored hard drives, RAID 10)
• Tape library with LTO (= tape data storage) drive
• Data backups managed by system administrator

Custom-developed MySQL database with PHP frontend (daily backup onto
storage system)
Elaborate structure due to archival documentation needs
(comprehensive metadata capture)
Taxonomies & controlled vocabularies (Hornbostel-Sachs, languages,
ethnic groups, etc.)
Different access levels (visitors: read only; admin: taxonomies and
controlled vocabularies)
AV playback in browser (MP3, MPEG-2)
English version (work in progress)
Content:
Comprehensive documentation (technical, content-descriptive and
contextual) at item level (= recording), which is a prerequisite for
accessibility and potential use
… slimmed-down copy of in-house database on dedicated
“exposed” server
• Reduced data set & short samples
• Focus on usability and sophisticated search possibilities
• Connected to Europeana through Dismarc (weekly updated)
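
Deriving the reduced public data set from the in-house records can be pictured as a simple field whitelist plus a cap on sample length. The sketch below is purely illustrative: the field names, the 30-second sample limit and the function `to_public_record` are hypothetical and are not taken from the actual catalogue.

```python
# Hypothetical whitelist of fields that may appear in the online catalogue.
PUBLIC_FIELDS = ("id", "title", "language", "year")

# Hypothetical maximum length of an online audio sample, in seconds.
SAMPLE_SECONDS = 30


def to_public_record(record: dict) -> dict:
    """Return a reduced copy of an in-house record for the exposed server."""
    # Keep only whitelisted fields; everything else stays in-house.
    public = {key: record[key] for key in PUBLIC_FIELDS if key in record}
    # Offer a short sample instead of the full-length recording.
    duration = record.get("duration_seconds")
    if duration is not None:
        public["sample_seconds"] = min(duration, SAMPLE_SECONDS)
    return public
```

A whitelist (rather than a blacklist of sensitive fields) means that any field newly added to the in-house database stays private until it is explicitly released.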
Open Access?

Legal constraints
(Intellectual property rights)

Ethical issues (sensitive content)
→ full-length online publication not possible to date

INPUT (external, main part):
Supported research projects get ...
◦ Methodological support/advice
◦ Technical support (recording equipment & training)
◦ Preservation of the outcome (data: field recordings, metadata)
◦ Exclusive usage right for six years
… and provide on their part ...
◦ The original field recordings
◦ Their description (= predefined set of metadata) for proper documentation

OUTPUT:
Interested users ...
◦ Browse online catalogue, listen to samples, inquire via email
◦ Are provided with access copies via download (small handling fees, or fixed
rates for commercial use, e.g. media, exhibitions)