Peter Buneman (Univ. of Edinburgh), Vassilis Christophides (Univ. of Crete and FORTH-ICS)

23 March 2007
PresDb’07
PresDB 2007

Informal venue aiming to bring together researchers and
practitioners addressing archival issues associated with
databases

8 invited talks, 13 short presentations

Thanks to speakers

Thanks to executive committee
Peter Buneman, Bertram Ludaescher, Chris Rusbridge, Wang-Chiew Tan, Ken Thibodeau

Thanks to organizing committee
Joy Davidson, Yrsa Roca Fannberg, Florance Kennedy, Heiko Mueller





The Importance of Scientific Data
Much of the data is impossible to reproduce
  e.g. climate and demographic data
Much of the data can only be recovered at enormous cost
  e.g. data from high energy physics experiments or space flight missions
Nearly every reference manual, dictionary and gazetteer benefits from some form of database management support
  there has been an explosion in the number of curated databases (e.g., in biology)
These databases represent a huge investment of human effort!
The need for preservation is self-evident

Preserving Digital Objects
Digital Preservation aims to maintain and add value to a trusted body of digital objects for current and future use
  digital objects are maintained in an archive without being damaged, lost or maliciously altered (integrity, authenticity)
  digital objects can be found, extracted and served to a user (accessibility, retrievability)
  digital objects can be interpreted and understood by the user (readability, interpretability)

Preservation life cycle: appraisal, extraction, ingestion,
description, maintenance, access, and dissemination
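
As one illustration of the integrity goal above, an archive can record a cryptographic digest when an object is ingested and recompute it later to detect damage or alteration. The sketch below is not taken from the workshop material: the function names and the sample bytes are invented for illustration, and SHA-256 is simply one common choice of hash function.

```python
import hashlib


def fixity_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of an archived object's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """Check the object's current bytes against the digest recorded at ingestion."""
    return fixity_digest(data) == recorded_digest


# Hypothetical archived object (made-up content, for illustration only).
archived_object = b"example archived record\n"
digest_at_ingest = fixity_digest(archived_object)

# Later audit: an unchanged copy passes, a modified copy fails.
unchanged_copy = bytes(archived_object)
altered_copy = archived_object + b"tampered"
audit_unchanged = is_intact(unchanged_copy, digest_at_ingest)
audit_altered = is_intact(altered_copy, digest_at_ingest)
```

A matching digest only shows that the bytes agree with the recorded digest; authenticity additionally depends on the recorded digest itself being kept trustworthy.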

Why are Databases Different?
How is a database different from a “fixed” digital object?
  It has internal structure, understandable by both people and programs
  It changes over time
  It has internal consistency
We consider everything from relational and object-oriented
databases to data held in XML, scientific data formats and
ontologies to be “databases”

Preserving Databases
Database Preservation poses new technical, economic and legal challenges
databases are structured: what do we really need to preserve?
  Queries (which?), DBMS environment, application semantics
databases evolve: we need both
  old versions of the database, and
  time-based queries about the change (e.g. how has the number of smokers in Greece changed in the past 20 years?)
databases are centrally managed: data survival depends on the continued existence (funding?) of the host organization
  can we move to a distributed, redundant model of database preservation?
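
The two needs listed for evolving databases (old versions plus time-based queries) can be sketched with a history table that stamps each value with the period during which it was current. This is a minimal sketch under assumptions, not a method proposed in these slides: the table layout, the function names and the sample values are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE history (
        item       TEXT    NOT NULL,
        amount     INTEGER NOT NULL,
        valid_from INTEGER NOT NULL,  -- year this value became current
        valid_to   INTEGER            -- year it was superseded, NULL if current
    )
    """
)


def record(item, amount, year):
    """Close the current version of `item` (if any) and store the new value."""
    conn.execute(
        "UPDATE history SET valid_to = ? WHERE item = ? AND valid_to IS NULL",
        (year, item),
    )
    conn.execute(
        "INSERT INTO history (item, amount, valid_from, valid_to) "
        "VALUES (?, ?, ?, NULL)",
        (item, amount, year),
    )


def value_as_of(item, year):
    """Return the value that was current in `year`, or None if none existed."""
    row = conn.execute(
        "SELECT amount FROM history WHERE item = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (item, year, year),
    ).fetchone()
    return row[0] if row else None


def changes(item):
    """Return every recorded (year, value) pair for `item`, oldest first."""
    return conn.execute(
        "SELECT valid_from, amount FROM history WHERE item = ? ORDER BY valid_from",
        (item,),
    ).fetchall()


# Made-up values, for illustration only.
record("sample_count", 100, 1990)
record("sample_count", 120, 2000)
record("sample_count", 90, 2007)
```

Here `value_as_of("sample_count", 1995)` answers a question about an old version, while `changes("sample_count")` returns the full series needed for a "how has this changed" query. Nothing is ever overwritten, which is the property an archive needs but which costs storage as the database grows.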
Program
8:45 - 9:00 Registration
9:00 - 9:15 Opening
9:15 - 11:15 A Computer Scientist’s Perspective on Database
Preservation
11:30 - 13:00 Brainstorming Session
13:00 - 14:30 Lunch
14:30 - 16:30 An Archivist’s Perspective on Database
Preservation
16:30 - 16:45 Break
16:45 - 18:30 Brainstorming Session
18:30 - 19:00 Closing Remarks
Logistics

Don’t forget your registration!
Full Talks: 30min (20min presentation + 5min discussion)
Short Talks: 15min (10min presentation + 5min discussion)

Reminder: load presentations during the breaks!


Acknowledgements
Program
8:45 - 9:00 Registration
9:00 - 9:15 Opening (Vassilis Christophides)
9:15 - 11:15 A Computer Scientist’s Perspective on Database
Preservation
9:15 - 9:45 Giorgos Flouris and Carlo Meghini
Steps Towards a Theory of Information Preservation
9:45 - 10:15 Mema Roussopoulos
A Fresh Look at the Reliability of Long-term Digital Storage
10:15 - 10:45 David Rosenthal
Engineering Issues in the Preservation of Databases
10:45 - 11:15 David Gross-Amblard
Database Watermarking: Protection by Alteration
11:15 - 11:30 Break
Program
11:30 - 13:00 Brainstorming Session: questions to be addressed
  How do we keep archived databases readable and usable in the long term (at acceptable cost)?
  How do we separate the data from a specific database management environment?
  How can we preserve the original data semantics and structure?
  How can we preserve authenticity and provenance of databases?
  How can we preserve data while it continues to evolve?
  How can we have efficient preservation frameworks, while retaining the ability to query different database versions?
  How can multi-user online access be provided to hundreds of archived databases containing terabytes of data?
  Can we move from a centralized model to a distributed, redundant model of database preservation?
Program
11:30 - 11:45 Peter Buneman
Why current database technology does not support preservation
11:45 - 12:00 Panos Vassiliadis, George Papastefanatos, and Timos Sellis
Management of the Evolution of Database-Centric Information
Systems
12:00 - 12:15 Stefan Brandl and Peter Keller-Marxer
Long-term Archiving of Relational Databases with Chronos
12:15 - 12:30 Gabriel David
Data Warehouses in the Path from Databases to Archives
12:30 - 12:45 Norman Swindells
Sustainable Data - Data representation by standardised information
models
12:45 - 13:00 Ulf Andersson
Information and Operational applications and LTPA (long term
preservation application)
13:00 - 14:30 Lunch
Program
14:30 - 16:30 An Archivist’s Perspective on Database Preservation
14:30 - 15:00 John A. Kunze
Practical Citation in a World of Evolving Data
15:00 - 15:30 Kevin Ashley
Preserving the Imperfect
15:30 - 16:00 Bill Roberts
A success story and an unsolved problem
16:00 - 16:30 Michael Lesk
Data Preservation: It's a People Problem
16:30 - 16:45 Break
Program
16:45 - 18:30 Brainstorming Session: questions to be addressed
  What are the salient features of a database that should be preserved?
  What are the different stages in the database preservation's life cycle?
  What documentation is preserved together with a database, and in what format?
  What are the legal encumbrances on database preservation?
  What can be learned from traditional archival appraisal for the selection of databases for preservation?
  To what extent can the preservation strategies and procedural policies developed by archivists be adapted for databases?
Program
16:45 - 17:00 W. Christopher Lenhardt
Promoting Trusted Digital Repositories to Support Database
Preservation
17:00 - 17:15 Katerina Tsakona
Legal Awareness on Database Preservation
17:15 - 17:30 Luís Faria and Rui Castro
RODA - Repository of Authentic Digital Objects
17:30 - 17:45 Dirk Roorda
MIXED: Migration to Intermediate XML for Electronic Data
17:45 - 18:00 Seamus Ross and Sarah Jones
Performing Arts Databases: Use Cases
18:00 - 18:15 Jonathan Bard
$2B of irradiation data from the '50s, archived in an incoherent DBMS
18:15 - 18:30 Rolf Lang
Position Paper
18:30 - 19:00 Closing Remarks