What to do with the Bits? Triage, First Aid, Clean Room

Patricia Galloway
School of Information
University of Texas at Austin
First step: DO NOTHING



Digital records are harder to destroy
completely than most believe
But it is very easy to alter them and thus
destroy their authenticity
Hence: you must proceed forensically



Ideas from digital discovery/digital forensics
Archives CSI!
First step: look but don’t touch
What do you have? Inventory



Find media and computers in collection(s)
Note any evidence from original order
Categorize and date them based on physical
evidence

Media names and formatting as stated on the
media themselves


Timeline: http://en.wikipedia.org/wiki/Floppy_disk
Labels on the media, even multiple ones (should
you peel them off?)
How does it fit? Context

What are your working hypotheses?





Who created it? (evidence from the fonds)
When? (scope note?)
How does it compare in amount to paper?
How might it be relevant?
What is the computing history of the fonds
creator?

Construct a technology timeline (cf. Maria
Esteva’s discoveries)
Triage


How old/outdated is it?
How important is it?






Does it likely have a paper counterpart?
Will that counterpart preserve the affordances of the digital original?
Might the digital amplify evidence?
How much will it cost to retrieve?
How much needs to be retrieved?
Do you need to know what’s there before you
can decide?
First aid: What can you find
out without killing the patient?




Media format + operating system + application
software = accessibility
BUT Media format + operating system + application
software = potential danger to authenticity
Mining a digital fonds without reading it (MPLP?)
Without opening any file you can potentially see:




File arrangement
Detailed directory listing
File naming conventions
But how to do it without risk?
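
One low-risk way to see arrangement, directory detail, and naming conventions is to ask the file system for metadata only and never open a file. The following is a minimal Python sketch under stated assumptions: the medium is already write-protected and mounted read-only, and the function name `list_tree` and the record fields are illustrative, not taken from any particular archival tool. Some operating systems may still update access times on directories when they are listed, which is one more reason to work from a write-protected original.

```python
import os
from datetime import datetime, timezone


def list_tree(root):
    """Return one record per file under root using only directory and stat calls.

    No file is opened; only file-system metadata is read.
    """
    records = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # metadata only, does not follow symlinks
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            records.append({
                "path": os.path.relpath(path, root),      # shows file arrangement
                "size": info.st_size,                     # bytes
                "modified": modified.isoformat(),         # OS-managed date
                "extension": os.path.splitext(name)[1].lower(),  # naming convention clue
            })
    return records
```

Calling `list_tree` on the mount point of the medium returns a list that can be sorted by path, date, or extension to test working hypotheses about the creator's habits before any file is read.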
How can you find out?



Do you have drives to read media?
Do you have software to read/render/list the
contents?
Can you do this nondestructively?

Does it matter?



Are the materials well-documented and already an
intentional copy?
Do you need to recover process as well as content?
If you don’t know, assume it does matter
Authenticity warning 1


Creation date is crucial to archival interest
Creation date may appear in many forms




Metadata as part of file
Metadata as auxiliary file (Mac resource fork)
Metadata as managed by OS
Creation date as managed by the OS may be
changed systematically


On copy
On saving an opened file
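
The effect of an ordinary copy on OS-managed dates can be demonstrated on a scratch file (never on an original). The sketch below is an illustration under stated assumptions: it uses the modification time as a stand-in, because how the creation time is exposed differs from one operating system to another, and the function name `compare_copy_dates` is hypothetical. It contrasts Python's `shutil.copy`, which does not carry timestamps over, with `shutil.copy2`, which attempts to preserve them.

```python
import os
import shutil


def compare_copy_dates(source_path, work_dir):
    """Copy a scratch file two ways and report the modification time of each result."""
    # Plain copy: contents and permission bits only; the copy gets a fresh date.
    plain = shutil.copy(source_path, os.path.join(work_dir, "plain_copy"))
    # Metadata-preserving copy: also attempts to carry over the timestamps.
    preserving = shutil.copy2(source_path, os.path.join(work_dir, "preserving_copy"))
    return {
        "original": os.stat(source_path).st_mtime,
        "plain_copy": os.stat(plain).st_mtime,
        "preserving_copy": os.stat(preserving).st_mtime,
    }
```

On a test file with an old date, the plain copy would be expected to show the time of copying while the preserving copy keeps the original modification time. Even the preserving copy does not guarantee that every date survives, so the safe assumption remains that any copy made through the OS can change dates.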
Authenticity warning 2

Creator/author metadata



Placed by software
Usually haphazardly set up by individuals
May not identify the individual author if set up by a company
Cheap and cheerful: checking
out floppies



Apply hardware write-protect
Try to read the medium
If no adverse message appears, such as
“Do you want to format this disk?”
“Disk is unreadable”
then copy to another medium



Using forensic-copy software: maintains metadata
Using your OS: dates and other metadata will be altered,
so metadata must be captured before the copy
And set original aside
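
When only the OS is available, the "capture metadata before copy" step can be sketched as follows. This is a minimal illustration, not a substitute for forensic-copy software: the function names `sha256_of` and `record_then_copy` and the manifest fields are hypothetical, and the SHA-256 comparison is an added assumption (a common fixity check) rather than something the slides prescribe. It assumes the source medium is hardware write-protected.

```python
import hashlib
import json
import os
import shutil


def sha256_of(path, chunk_size=65536):
    """Hash a file's bytes in chunks, opening it read-only in binary mode."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_then_copy(source_path, dest_path, manifest_path):
    """Record OS-managed metadata first, then copy the bytes and verify the copy."""
    info = os.stat(source_path)  # captured BEFORE the file is read or copied
    record = {
        "source": source_path,
        "size": info.st_size,
        "modified_ns": info.st_mtime_ns,
        "accessed_ns": info.st_atime_ns,
    }
    shutil.copyfile(source_path, dest_path)  # copies contents only, no metadata
    record["sha256_source"] = sha256_of(source_path)
    record["sha256_copy"] = sha256_of(dest_path)
    record["copy_matches"] = record["sha256_source"] == record["sha256_copy"]
    with open(manifest_path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2)  # manifest travels with the copy
    return record
```

The manifest keeps the dates the OS reported before the copy, so they remain available as evidence even though the copy itself carries new dates.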
Clean room procedure

Digital environments can eat their young





Alteration of metadata
Alteration of format
Neutral “clean room” environment needed: where
object is seen ONLY as sequence of bits
Tools for nondestructive copy out of original and into
clean room: digital discovery
Tools for nondestructive analysis of file system:
digital forensics
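
Seeing an object "only as a sequence of bits" can be illustrated with a short hex view that reads a working copy in binary mode and hands it to no application for interpretation. This is a sketch only, assuming it is run on a copy inside the clean room and not on the original; the function name `hex_view` and its parameters are hypothetical.

```python
def hex_view(path, length=64, width=16):
    """Return the first `length` bytes of a file as offset / hex / printable-text lines."""
    with open(path, "rb") as handle:  # binary, read-only: no application interprets the file
        data = handle.read(length)
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in row)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in row)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return lines
```

The opening bytes of a file often hint at its format, which supports identification without launching the software that created it and without letting that software rewrite the file.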
Is this the future?





What do we really know about paper, after all?
What tools do we use to decide how valuable it is?
What can we know about digital objects if we are
careful?
What tools can we use to decide how valuable it is?
Compare in terms of MPLP


Paper: settle for high-level aggregate knowledge
Digital: organize at will, mine out subjects, locate every
item