Emulation, Migration and
Long-Term Preservation of
Electronic Records
Cal Lee
University of Michigan
School of Information
ECURE 2001: Preservation and Access for
Electronic College and University Records
October 13, 2001
Outline
• The Digital Preservation Problem
• Base-Line Assumptions
• Major Approaches: Migration and
Emulation
• Migration
• Emulation
• For Further Reference
The Digital Preservation Problem
Technological Dependency
• Digital objects are useless if we can’t
interact with them
• Those interactions depend on numerous
technical components.
Key Concept - Abstraction
"Computer science is largely a matter of abstraction: identifying a
wide range of applications that include some overlapping functionality,
and then working to abstract out that shared functionality into a
distinct service layer (or module, or language, or whatever). That new
service layer then becomes a platform on top of which many other
functionalities can be built that had previously been impractical or
even unimagined. How does this activity of abstraction work as a
practical matter? It's technical work, of course, but it's also social
work. It is unlikely that any one computer scientist will be an expert
in every one of the important applications areas that may benefit from
the abstract service. So collaboration will be required." (emphasis
added)
- Phil Agre, Red Rock Eater, March 25, 2000
Oh so many layers
• Physical medium - only layer yielding real consensus
• Bit
• Byte
• Character encoding
• Instruction set architecture
• Physical organization of bytes
• Logical organization of chunks
• Reading hardware
• Input/output hardware
• Input/output software
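To make the character-encoding layer concrete, here is a minimal sketch that reads the same stored bytes under two different encodings. The sample bytes and the two encodings chosen are assumptions for illustration only; they do not come from the presentation.

```python
# Hypothetical sample: four bytes written by a system that used Latin-1.
raw = bytes([0x63, 0x61, 0x66, 0xE9])

# The bit stream is identical in both cases; only the encoding layer differs.
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")

print(as_latin1)
print(as_cp437)
print(as_latin1 == as_cp437)
```

The bytes survive intact either way, but the last character a reader sees depends on which encoding is assumed, so the encoding has to be recorded alongside the bits.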
But, wait, there’s more
• Operating system kernel
• Network operating system
• Networking protocols
• Desktop and windowing environment
• Data syntax
• Data structure
• Data semantics
• Data content
• Data values
• Contextual linking within and between objects
Obsolescence
"Those who forget the past are
condemned to reload it."
- Nick Montfort, July 2000
• All layers undergo change over time, at
varying rates.
Some Base-Line Assumptions
• I will take several assumptions as
given.
• Making them explicit can help us to be
more precise about available options and
their costs/benefits.
Assumption #1: Digital objects are
instructions for future interaction
• Only a small part of preservation work is
about treating them like physical artifacts.
• Jeff Rothenberg takes this even further,
contending that all digital objects should
be seen as programs.
Assumption #2: Bits will be Bits
• Bit rot and advantages of newer media both call
for periodic refresh and reformatting.
• Ensuring the integrity of the bit stream in
such transfers is extremely important.
• See Charles Dollar’s 1999 book for an excellent
explanation of these processes.
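One common way to check bit-stream integrity during a refresh is to compare checksums before and after the copy. The sketch below assumes SHA-256 and uses hypothetical helper names (`sha256_of`, `refresh`); it is an illustration, not a procedure taken from the presentation or from Dollar's book.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(src, dst):
    """Copy src to dst (standing in for newer media) and verify the bits."""
    before = sha256_of(src)
    shutil.copyfile(src, dst)
    after = sha256_of(dst)
    if before != after:
        raise RuntimeError("bit stream changed during refresh")
    return after


# Usage with throwaway files; a real refresh would copy between storage media.
with tempfile.TemporaryDirectory() as workdir:
    old_copy = os.path.join(workdir, "old_medium.bin")
    new_copy = os.path.join(workdir, "new_medium.bin")
    with open(old_copy, "wb") as f:
        f.write(b"electronic record" * 100)
    print(refresh(old_copy, new_copy))
```

Storing the digest with the object lets later refresh cycles be checked against the same value.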
Assumption #3: Change Happens
• Any long-term strategy must recognize that any
underlying technical platform will eventually be
abandoned by the industry and thereafter
increasingly difficult to support.
• Ongoing preservation effort is assumed, regardless
of the strategy adopted.
• Goal is to minimize (rather than eliminate) work and
maximize the benefits.
Assumption #4: Must identify what’s
desirable and what’s possible
• Best, most informed guess about how
objects will be used.
• Characteristics that support such use.
• Currently available technical approaches.
• Whether using any given approach can
cost-effectively preserve those characteristics.
• All of these decisions should be well
documented and revisited periodically.
Major Approaches:
Migration and
Emulation
Migration
• Periodic transformation of the bits/bytes to
run directly on newer platforms.
• Used widely as an approach to actively
managing legacy systems.
• Work can be expensive and introduce errors
of translation.
• Since the resulting objects can run directly
on newer platforms, layers of technology
can be minimized.
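A minimal sketch of a migration step, assuming a hypothetical fixed-width legacy record that is transformed into JSON. The layout, field names and sample record are invented for illustration; the final check stands in for guarding against errors of translation.

```python
import json

# Hypothetical legacy layout: name in columns 0-19, year in columns 20-23.
LEGACY_LAYOUT = {"name": (0, 20), "year": (20, 24)}


def parse_legacy(line):
    """Read one fixed-width record into a dict of stripped strings."""
    return {
        field: line[start:end].strip()
        for field, (start, end) in LEGACY_LAYOUT.items()
    }


def migrate(line):
    """Transform a legacy record into JSON that newer platforms read directly."""
    record = parse_legacy(line)
    record["year"] = int(record["year"])
    return json.dumps(record, sort_keys=True)


legacy_line = "Registrar memo".ljust(20) + "1987"
migrated = migrate(legacy_line)

# Check that the values chosen for preservation survived the transformation.
restored = json.loads(migrated)
original = parse_legacy(legacy_line)
assert restored["name"] == original["name"]
assert str(restored["year"]) == original["year"]
print(migrated)
```

The check only covers the properties someone decided to compare; anything else about the legacy form, such as the column padding, is silently dropped.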
Emulation - Oxford English
Dictionary, Second Edition
“To reproduce the action of or behave like (a
different type of computer) with the aid of
hardware or software designed to effect this;
to run (a program, etc., written for another
type of computer) by this means.”
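As a toy illustration of this definition, the sketch below emulates a hypothetical three-instruction accumulator machine in software. The instruction set and the sample program are invented for this example and do not correspond to any real computer.

```python
def run(program):
    """Emulate a hypothetical accumulator machine and return what it prints."""
    acc = 0
    output = []
    for op, arg in program:
        if op == "LOAD":
            acc = arg
        elif op == "ADD":
            acc += arg
        elif op == "PRINT":
            output.append(acc)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return output


# The "old" program is kept unchanged; only the emulator targets the new platform.
legacy_program = [("LOAD", 2), ("ADD", 3), ("PRINT", None)]
print(run(legacy_program))
```

In contrast with migration, the preserved object is left as it was and the work goes into reproducing the behavior of its original environment.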
Popular Examples from the History of
Emulation
• Hardware and software - IBM System/360 (1963)
• Operating systems
– IBM MVS (1972)
– Amiga (1985)
– Microsoft Z80 Softcard (1989)
– DOS emulation in Windows (1987)
– SoftWindows (for Macintosh)
– Virtual PC (1997)
– Wine (Windows Emulator, 1993)
More Emulation Examples
• Processors - Intel 8080 (1974)
• Virtual Machines - Java (1995)
• Terminal emulators - Telnet (1969),
WinFrame (1995)
• Lots and lots of games
Broad Issues to Address
• What level to emulate
• When to create the emulator - now
vs. later, once vs. periodically
• How to develop emulators - what
language, what platform
• Intellectual property rights
Arguments for preservation using
emulation
• Rothenberg - specification, interpreter, virtual
machine
• IBM - distinction between preserving data files and
programs, create emulators to run on Universal
Virtual Computer (UVC)
• CEDARS - maintain byte stream, focus on
preserving the significant properties of its
underlying abstract form (UAF)
• CAMiLEON - create emulator in a (simplified)
high-level language, migrate emulator across
platforms when necessary
Critiques of Emulation
• David Bearman is the most vocal critic
• Metadata and functional requirements are what
counts for preserving electronic records
• Emulation attempts to capture too much (full
functionality of technical environment) and not
enough (essential characteristics of records)
A Balanced Perspective on
Preservation Strategies
• No single solution
• Identify requirements THEN evaluate the
technical options.
• What attributes should be preserved (which
differences matter)?
• Make (and document) educated guesses of
costs and benefits.
For Further Reference
• Growing literature on these issues
• Several prominent projects now and in recent
years
• Please see the bibliography associated with this
presentation
Thank you!