MyLifeBits: Personal archive issues Imaging Sci. & Tech. Conf. San

Challenges in building and using a
Lifetime Personal Information Store
based on MyLifeBits
Gordon Bell
Accelerating Change ─ 6 November 2004
The 1 TB Life

1 TB gives you 65+ years of:
100 email messages a day (5 KB each)
100 web pages a day (50 KB each)
5 scanned pages a day (100 KB each)
1 book every 10 days (1 MB each)
10 photos per day (400 KB JPEG each)
8 hours per day of sound, e.g. telephone, voice annotations, and meeting recordings (8 Kb/s)
1 new music CD every 10 days (45 min each at 128 Kb/s)

It will take you 5 years to fill up your 80 GB drive
Want video? Buy more cheap drives (1 TB/year lets you record 4 hours/day of 1.5 Mb/s video)
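The budget above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal units (1 KB = 1000 bytes, 1 TB = 10^12 bytes) and 365-day years; the names `DAILY_BYTES` and `stream_bytes` are illustrative and not from the talk. Under these assumptions the total is roughly 43 MB per day, or about 63 years per terabyte, which is in the same range as the slide's figure.

```python
# Back-of-the-envelope check of the "1 TB life" budget.
# Assumption: decimal units (1 KB = 1000 bytes, 1 TB = 10**12 bytes).
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12
SECONDS_PER_HOUR = 3600


def stream_bytes(bits_per_second, seconds):
    """Bytes produced by a constant-bit-rate stream."""
    return bits_per_second * seconds / 8


# Average bytes captured per day, one entry per bullet on the slide.
DAILY_BYTES = {
    "email": 100 * 5 * KB,
    "web pages": 100 * 50 * KB,
    "scanned pages": 5 * 100 * KB,
    "books": 1 * MB / 10,                           # one 1 MB book every 10 days
    "photos": 10 * 400 * KB,
    "sound": stream_bytes(8_000, 8 * SECONDS_PER_HOUR),
    "music": stream_bytes(128_000, 45 * 60) / 10,   # one 45-minute CD every 10 days
}

bytes_per_day = sum(DAILY_BYTES.values())
bytes_per_year = bytes_per_day * 365

years_per_terabyte = TB / bytes_per_year
years_to_fill_80gb = 80 * GB / bytes_per_year
video_bytes_per_year = stream_bytes(1_500_000, 4 * SECONDS_PER_HOUR) * 365

if __name__ == "__main__":
    print(f"per day:            {bytes_per_day / MB:.1f} MB")
    print(f"years per 1 TB:     {years_per_terabyte:.1f}")
    print(f"years to fill 80GB: {years_to_fill_80gb:.1f}")
    print(f"video per year:     {video_bytes_per_year / TB:.2f} TB")
```

Sound dominates the non-video budget: eight hours a day at 8 Kb/s is about two thirds of the daily total, more than email, web pages, scans, books and photos combined.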
Everything goes in a database

You need all the features of a database (consistency, indexing, pivoting, queries, speed/scalability, backup, replication)
If you don’t use one, you will find yourself creating one!
Files as blobs, also sync with file system for legacy apps
SQL
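To make "files as blobs" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite is a stand-in chosen for the example, and the `item` table, its columns, and the helper names are hypothetical; the slides do not show the actual MyLifeBits schema.

```python
import sqlite3

# Hypothetical minimal schema: one row per captured item, content kept as a blob.
SCHEMA = """
CREATE TABLE item (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    kind     TEXT NOT NULL,      -- e.g. 'photo', 'email', 'document'
    created  TEXT NOT NULL,      -- ISO 8601 timestamp
    content  BLOB NOT NULL
);
CREATE INDEX item_created ON item (created);
CREATE INDEX item_kind ON item (kind);
"""


def open_store(path=":memory:"):
    """Open (or create) the store and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_item(conn, name, kind, created, content):
    """Insert one item and return its database-assigned id."""
    cur = conn.execute(
        "INSERT INTO item (name, kind, created, content) VALUES (?, ?, ?, ?)",
        (name, kind, created, content),
    )
    conn.commit()
    return cur.lastrowid


def items_between(conn, start, end):
    """Items created in [start, end), oldest first; uses the index on created."""
    rows = conn.execute(
        "SELECT id, name, kind FROM item "
        "WHERE created >= ? AND created < ? ORDER BY created",
        (start, end),
    )
    return rows.fetchall()


def read_content(conn, item_id):
    """Return the stored bytes for one item."""
    row = conn.execute("SELECT content FROM item WHERE id = ?", (item_id,)).fetchone()
    return row[0]
```

The point of the sketch is that indexing and range queries come for free once items live in a database, which is the argument the slide makes against building these features by hand.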
MyLifeBits Software
[Diagram: components of the MyLifeBits software. MyLifeBits store (database); MyLifeBits Shell; GPS import & map display; TV capture tool; SenseCam; telephone capture tool; Internet; TV EPG download tool; browser tool; screen saver; PocketPC transfer tool; PocketRadio player; radio capture & EPG; MAPI interface; legacy email client; files; legacy applications; IM capture; voice annotation tool; text annotation tool; import files]
Memex
As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
Full-text search, text & audio annotations, and hyperlinks
I am data
The guinea pig
Gordon Bell is digitizing his life
Has now scanned virtually all:
Books written (and read when possible)
Personal documents (correspondence including memos and email, bills, legal documents, papers written, …)
Photos
Posters, paintings, photos of things (artifacts, … medals, plaques)
Home movies and videos
CD collection
And, of course, all PC files
Now recording: phone, radio, TV (movies), web pages… conversations and meetings to come
Paperless throughout 2002. 12” scanned, 12’ discarded.
Only 44 GB, incl. 10 wma, 14 SQL!!! Video: O(100) + 500 mov
Capture and encoding
I mean everything
50+ year old newspaper clippings
400 year old books
O(100s) tapes from videotape “black hole”
Personal LifeLog Applications
[Chart: applications arranged by who uses them (Self to Others) and who controls them (Self, Personal Proxy, Others). Applications shown: Diary/Journal, Tutor, Mentor, Advisor, Babysitter, Financial Manager, Medical Manager, Companion, Caretaker, Parole Officer, Assistant for Elderly, Pers Flight Recorder, Meeting Prep, Personal Assistant, Photo Album, Autobiography, Captain’s Log, Conservator, Biography, Baby Book, Trustee, Obituary, Executor]
Personal Search is not
Professional or Web search
System sees every entry & access
Everything, not just a professional life
Limited to SIS, not an infinite amount, covers a profession & personal life
[Figure: depth (e.g. information item types & coverage) against knowledge breadth (e.g. Dewey classification), comparing MyLifeBits, a professional user, and the Web as seen by search engines]
Why bother? ..some reasons
Technologist: “we can”, an opportunity e.g. 1 TB disks
For all of us with new media: a need e.g. JPG, MP3
Environmentalist: eliminates “atoms” (paper, CDs…)
For business: memory enhancement & faster search
Let content analysis and data mining discover trends and correlations in our lives… that even we don’t know.
Business: It costs more to delete than it costs to store
Preservationist: it decays or disappears unless it’s saved
For the human pack rat: “I may need it some day.”
For posterity and nostalgia: “Maybe others will want it.”
Stories and ambience: basis for creating content
For the aging & failed memory: surrogate memory
So you’ve got it – now what do you do with it?
“A record if it is to be useful … must be continuously extended, it must be stored, and above all it must be consulted”
“The difficulty seems to be, not so much that we publish unduly … but rather that publication has been extended far beyond our present ability to make real use of the record”
- Vannevar Bush
Using my life bits:
beyond folders
#1: Folders
One item. One place.
It worked for 1000s of years.
[Diagram: the “My docs and archive” folder hierarchy. Branches for Self, Business, Active Employer and X-Employer; under them Project folders and many Library/file cab folders, including one marked <1995. Other labels: Invests, family $s, & Legal; Personal, including Medical]
Freedom from hierarchy
c:\my documents\talks\MyLifeBits.ppt
ID=location=organization=display string
Don’t make me invent unique names
Don’t make me file everything
Or let me pick multiple folders
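One way to read these three wishes is as a data model in which identity, name and filing are separate. The sketch below is an illustration under that assumption, not the MyLifeBits design: the `Store` class and its method names are hypothetical. Ids are assigned by the store, display strings need not be unique, and an item may sit in zero, one or many collections.

```python
from collections import defaultdict
from itertools import count


class Store:
    """Items identified by a generated id, filed in any number of collections."""

    def __init__(self):
        self._ids = count(1)
        self.display = {}                   # item id -> display string (may repeat)
        self.members = defaultdict(set)     # collection name -> set of item ids

    def add(self, display, collections=()):
        """Add an item; filing it anywhere is optional."""
        item_id = next(self._ids)
        self.display[item_id] = display
        for name in collections:
            self.members[name].add(item_id)
        return item_id

    def file_under(self, item_id, collection):
        """Add an existing item to one more collection."""
        self.members[collection].add(item_id)

    def collections_of(self, item_id):
        return sorted(name for name, ids in self.members.items() if item_id in ids)

    def members_of(self, collection):
        return sorted(self.members.get(collection, set()))

    def unfiled(self):
        """Items that were never filed anywhere; still retrievable by id."""
        filed = set().union(*self.members.values()) if self.members else set()
        return sorted(set(self.display) - filed)
```

Contrast this with a path such as `c:\my documents\talks\MyLifeBits.ppt`, where one string has to serve as identifier, location, organization and display name at once.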

Using my life bits:
easily adding valuable content
#2: Text annotations
Making bits more valuable and retrievable.
“It’s just bits until it is annotated”
Getting the user to tell a story is the ultimate in media value
A story is a “layout” in time and space
Most valuable content (by selection, and by being well annotated)
Stories must include links to any media they use (for future navigation/search – “transclusion”).
Cf: MovieMaker; Creative Memories PhotoAlbums
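The idea that a story links to its media rather than copying it can be sketched as follows. This is a hedged illustration: `Story`, `Archive` and their methods are hypothetical names, and plain substring matching stands in for real full-text search. Because a story holds item ids, a search that matches the story's text also finds the media it uses, and each item can list the stories that use it.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    title: str
    text: str
    media_ids: list = field(default_factory=list)  # links to items, not copies


class Archive:
    def __init__(self):
        self.annotations = {}   # item id -> list of annotation strings
        self.stories = []

    def annotate(self, item_id, text):
        """Attach a free-text annotation to an item."""
        self.annotations.setdefault(item_id, []).append(text)

    def add_story(self, story):
        self.stories.append(story)

    def search(self, phrase):
        """Ids of items whose annotations, or whose linking stories, mention phrase."""
        needle = phrase.lower()
        hits = {
            item_id
            for item_id, notes in self.annotations.items()
            if any(needle in note.lower() for note in notes)
        }
        for story in self.stories:
            if needle in story.title.lower() or needle in story.text.lower():
                hits.update(story.media_ids)
        return sorted(hits)

    def stories_using(self, item_id):
        """Titles of the stories that link to this item."""
        return [s.title for s in self.stories if item_id in s.media_ids]
```

An unannotated photo becomes findable as soon as any story links to it, which is one reading of "it's just bits until it is annotated".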
Dapeng was an intern at BARC for the summer of 2000
We took him to lunch at our favorite Dim Sum place to say farewell
At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim
Annotation like this…
Voice Annotation
Annotation when you feel like it, how you feel like it

Screensaver is the killer app!
Using my life bits:
the value of time & time posts
#3: “I remember when…”
The 1st or 2nd most important retrieval handle.
MyLifeBits time overlap
MyLifeBits on-the-fly time clustering
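The slides do not say how the on-the-fly clustering works. One simple possibility, shown here purely as an assumption, is gap-based clustering: sort the timestamps and start a new cluster whenever the gap to the previous item exceeds a threshold. The function name and the two-hour default are illustrative.

```python
from datetime import datetime, timedelta


def cluster_by_time(timestamps, max_gap=timedelta(hours=2)):
    """Group timestamps into runs whose consecutive gaps are at most max_gap."""
    clusters = []
    for ts in sorted(timestamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)      # close enough: same cluster
        else:
            clusters.append([ts])        # gap too large: start a new cluster
    return clusters


if __name__ == "__main__":
    photos = [
        datetime(2004, 11, 6, 9, 0),
        datetime(2004, 11, 6, 9, 40),
        datetime(2004, 11, 6, 15, 0),
    ]
    for group in cluster_by_time(photos):
        print(group[0], "to", group[-1], f"({len(group)} items)")
```

Because the threshold is a parameter, the same items can be regrouped at query time (by hour, by day, by trip) without storing any cluster assignment.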
MSR Next Media Team
Mark Stewart’s Lifeline
M Stewart Lifeline v2, Copyright Mark Stewart, 2004
[Figure: lifeline chart spanning 1900 to 2010, with rows for family (father, mother, self, sister, spouse, son, daughter, grandchildren, significant other), education, work and organization. People shown: Chester Bell, Lola Bell, Gordon Bell, Sharon (Smith), Gwen Druyor Bell, Brigham (son), Laura (daughter), Fiona Bell, Bridget Bell, Kolbe Schultz, Stryker Schultz, Sheridan Forbes. Education and work entries: Kirksville, MO; M.I.T.; U. of N.S.W.; M.I.T. Speech Lab; Digital (DEC); CMU; Encore; NSF; Ardent; Bell Ltd.; Microsoft Res.; Computer Museum]
Using my life bits:
Where, an essential attribute
#4: I remember where
Just essential.
Using my life bits:
pivoting on data to aid recall
#5: Relationships (links)
Using something near “it” to find “it”.
MyLifeBits Entities & Links
Photo of Event
Caller in Phone Call
Annotates
Transcludes
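A small sketch of typed links between entities, using the link types named on the slide. The `LinkStore` class, its methods and the entity names are hypothetical. Links are indexed in both directions so that a query can pivot from either end, which is what "using something near it to find it" requires.

```python
from collections import defaultdict


class LinkStore:
    """Typed, directed links between entities, queryable from either end."""

    def __init__(self):
        self._out = defaultdict(set)   # source -> {(link_type, target)}
        self._in = defaultdict(set)    # target -> {(link_type, source)}

    def link(self, source, link_type, target):
        self._out[source].add((link_type, target))
        self._in[target].add((link_type, source))

    def neighbors(self, entity, link_type=None):
        """Everything one link away from entity, optionally of one link type."""
        pairs = self._out.get(entity, set()) | self._in.get(entity, set())
        return sorted(
            {other for kind, other in pairs if link_type is None or kind == link_type}
        )


if __name__ == "__main__":
    links = LinkStore()
    links.link("photo-17", "photo of event", "farewell lunch")
    links.link("note-3", "annotates", "photo-17")
    links.link("story-1", "transcludes", "photo-17")
    # Pivot: start from the event you remember, reach the photo, then its notes.
    for photo in links.neighbors("farewell lunch", "photo of event"):
        print(photo, "->", links.neighbors(photo))
```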
PhotoFinder - Shneiderman and Kang
Using my life bits:
never enough meta-data …
but can you afford it?
#6: more meta-data (properties)
I remember something about the content
(understanding a person’s work)
Lederberg Finder page
Dublin core of a given item
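For reference, the Dublin Core element set has fifteen elements. The sketch below builds a record from keyword arguments and reports which elements are still empty, one way to see the cost question raised above. The helper names and the sample values are illustrative assumptions, not taken from the Lederberg site or from MyLifeBits.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_record(**fields):
    """Build a record, rejecting names that are not Dublin Core elements."""
    unknown = sorted(set(fields) - set(DUBLIN_CORE_ELEMENTS))
    if unknown:
        raise ValueError(f"not Dublin Core elements: {unknown}")
    # Keep the canonical element order regardless of argument order.
    return {name: fields[name] for name in DUBLIN_CORE_ELEMENTS if name in fields}


def missing_elements(record):
    """Elements nobody has filled in yet."""
    return [name for name in DUBLIN_CORE_ELEMENTS if name not in record]


if __name__ == "__main__":
    record = dublin_core_record(
        title="MyLifeBits.ppt",          # illustrative values
        creator="Gordon Bell",
        date="2004-11-06",
    )
    print(record)
    print("still missing:", missing_elements(record))
```

Elements such as title, date and format can often be filled automatically at capture time; subject, description and coverage usually need a person, which is where the expense lies.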
Using my life bits:
classification of everything
#7: classification
Is any gain from non-automated classification worth the cost and pain?
Is traditional classification required?
…at OCLC there was unanimous agreement among faculty and participants that “access to electronic resources requires controlled vocabulary and classification”
OCLC Institute, “Knowledge Access Management: Tools and Concepts for Next Generation Catalogers”, 17-19 November 1997, Dublin, Ohio.
“I have watched as hundreds of millions of dollars have been invested to re-invent the wheel - often badly.”
-Marcia Bates
www.alberteinstein.info
Professional Life:
Organizations
Administrivia
Projects
Library
Lederberg papers official reports
Number of document segments
Lederberg Artifact types
Abstracts
Agendas not
Announcements m;
Application forms
Articles m
Autobiographies m
Bibliographies m
Biographies m
Brochures m
Certificates m
Correspondence m
Diaries m
Drafts (documents)
Drawings m
Electronic images m
Essays m
Eulogies
Excerpts
Grant proposals
Interviews m
Invitations
Laboratory notebooks m
Laboratory notes
Lecture notes
Lectures m
Legal documents m
Legislative records
Lists
Manifestoes
Memoirs m
Minutes
Monographs m
Narratives
Newsletters
Newspaper columns m
Notebooks m
Notes
Obituaries
Official reports
Oral histories m
Petitions
Photographic prints m
Press releases m
Procedures
Proceedings m
Programs m
Proposals m
Questionnaires
Reminiscences
Reports m
Resolutions
Resumes
Reviews m
School records
Speeches m
Summaries
Tables (documents)
Technical reports m
Transcripts m
Typescripts
Video recordings m
Species: Animals: Chordata: Vertebrata: bony fish
Computer structures: digital computer: minicomputer
(refined: Digital Equipment Corp.)
Computer structures taxonomy: computers
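Colon-separated paths like the two above can be treated as positions in a tree, so that asking for a node returns everything classified beneath it. A minimal sketch, assuming case-insensitive matching; the function names and the item names in the usage example are hypothetical.

```python
def parse_path(text):
    """'Species: Animals: Chordata' -> ('species', 'animals', 'chordata')."""
    return tuple(part.strip().lower() for part in text.split(":"))


def is_under(path, ancestor):
    """True if ancestor is a prefix of path (a node counts as under itself)."""
    return path[:len(ancestor)] == ancestor


def find_under(classified, ancestor_text):
    """Names of items whose classification lies at or below the given node."""
    ancestor = parse_path(ancestor_text)
    return sorted(
        name
        for name, path_text in classified.items()
        if is_under(parse_path(path_text), ancestor)
    )


if __name__ == "__main__":
    classified = {
        "PDP-8 handbook": "Computer structures: digital computer: minicomputer",
        "Trout sketch": "Species: Animals: Chordata: Vertebrata: bony fish",
    }
    print(find_under(classified, "Computer structures: digital computer"))
```

The refinement shown on the slide (minicomputer refined to Digital Equipment Corp.) would simply be one more path segment.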
Classification wish list
Download classifications rather than build them
Definitions & synonyms should help find what I want
Today it is too expensive to manually classify my scanned paper. E.g. “right time” meta-data is critical!
Next year we hope “the system” can classify papers and other documents e.g. bills
In 10 years we expect all documents to appear electronically & classified with a little help from me
Using my life bits:
Ontologies…
useful? or fool’s errand?
#8: “ontology”???
“Succumbing to the ‘ontology’ fallacy”
-Bates
MyLifeBits: Some Lives(t)
Personal
  Parents, children, grandkids
  CGB himself
  GKB
  SSF
  Close friends
GB $s; Legal entities
  Personal incl. several legal structures
  Properties: autos, real estate
  Investments & contracts
Past prof. companies/organiz’ns
  DEC
  Carnegie-Mellon U.
  DEC, NSF, Encore, Ardent, Me Inc., Bell-Mason
Bell-Mason Director
Diamond & Vanguard Brds.
Startups & boards
CGB@ Microsoft
  MLB
  Clusters
  Telepresence
  WWW presence
Computer History Museum
  BOD member
  Fund-raising
  CyberMuseum
Using my life bits:
Providing insight, including…
Where did I spend my time?
What has been my output?
#9: logging & reports
Interface to xls
TV Usage
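A report such as "where did I spend my time?" is an aggregation over logged intervals. The sketch below sums hours per activity and writes CSV, which a spreadsheet can open; the log format, the function names and the sample activities are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def hours_by_activity(log):
    """Sum hours per activity from (activity, start, end) entries."""
    totals = defaultdict(float)
    for activity, start, end in log:
        totals[activity] += (end - start).total_seconds() / 3600
    return dict(totals)


def to_csv(totals):
    """Render the totals as CSV text, largest total first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["activity", "hours"])
    for activity, hours in sorted(totals.items(), key=lambda item: -item[1]):
        writer.writerow([activity, f"{hours:.2f}"])
    return buffer.getvalue()


if __name__ == "__main__":
    log = [
        ("email", datetime(2004, 11, 5, 9, 0), datetime(2004, 11, 5, 10, 30)),
        ("TV", datetime(2004, 11, 5, 20, 0), datetime(2004, 11, 5, 21, 0)),
    ]
    print(to_csv(hours_by_activity(log)), end="")
```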
Using my life bits:
Recording everything!
#10: CARPE
Continuous archival recording of
personal experiences
The A/V/real time data Future:
new capture modes/devices
Deja View
SenseCam
Body Media
Quindi
Sensecam &
Interactive jewellery
Open Problems
The Agenda for the Tbyte(s), Lifetime, PC:
The killer app after office and mail.searching
1. Guarantee that data will live forever! “dear appy” problem
2. Cheap, easy, and data-rich (e.g. time, place) capture:
   GPS and time everywhere
   Paper capture has to be as easy as discarding (scanner/shredder)
   Personal meeting capture...
   E-book…e-magazines & journals need to have critical mass!
   Telephony and audio capture with indexing
   Media Center compatible for entertainment (photos, video, TV, radio)
3. Content analysis (critical for photo & video!); doable for text. Needs doing!
4. Information control: privacy, security, expunge/deniability,…
5. Having to be schizophrenic or have a lobotomy when leaving a “life”
6. One dbase for everything (articles, books, conversations, ... financial transactions) …vs. long-term use of hierarchical files. Is dbase intuitive?
7. Annotations/meta-information add ever-increasing value
   Easy annotation for aiding search and it becomes the content
8. Other “killer apps”: Alzheimer, immortality, surrogate memory?
9. GUIs to improve use (e.g. time to learn, use, retention)
www.MyLifeBits.com
The “dear appy” problem
Dear Appy,
How committed are you?
Please come back to me.
Forever yours truly,
Lost and forgotten data

Who’s responsible?
Media or 8 track cassette, 8” floppy
Evolving platform, file, and database
Evolving, incompatible standards & formats for
legacy data that disregard ancestors
Evolving and/or disappearing apps
A Storocratic Oath
1. Do no harm to dates (File creation, Photo taken)
2. Do no harm to device created & other meta-data.
   • Camera data & location data are sacred.
3. Support & aid the creation of critical metadata.
   • When/how the user feels like it
   • Auto-magically!
4. Maintain user confidentiality
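As a small illustration of "do no harm to dates", the sketch below copies a file into an archive directory with `shutil.copy2`, which copies the contents and also attempts to preserve file metadata such as the modification time, and it refuses to overwrite an existing file. The function name `archive_copy` is hypothetical. File creation time and metadata embedded in the file (camera data, for instance) are outside what this sketch handles.

```python
import os
import shutil
import tempfile
from pathlib import Path


def archive_copy(source, archive_dir):
    """Copy source into archive_dir, keeping its modification time."""
    source = Path(source)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    if target.exists():
        # Do no harm: never silently replace something already archived.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(source, target)   # contents plus metadata where the platform allows
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        photo = Path(tmp) / "photo.jpg"
        photo.write_bytes(b"not really a JPEG")
        os.utime(photo, (1_000_000_000, 1_000_000_000))   # pretend it is from 2001
        copied = archive_copy(photo, Path(tmp) / "archive")
        print(int(copied.stat().st_mtime))
```

A plain `shutil.copy` or a read-then-write loop would stamp the copy with the time of copying, which is exactly the harm the oath warns against.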
The killer app??
Input, File, Classify, and Find…
Operational
Observe every action…
“Stuff I’ve Seen” (e.g. msg, name, paper, fact, birthday, phone call, photo)
Time & motion (routing, communicating, scheduling … thinking)
Archival one’s self
Finder aka Table of Contents aka Site Map
Story telling.
Screen saver & personal ambience