Challenges in Using
Lifetime Personal Information Stores
based on MyLifeBits
Gordon Bell, Jim Gemmell, Roger Lueder
SIGIR
University of Sheffield, July 26, 2004
“I have watched as hundreds
of millions of dollars have been
invested to re-invent the wheel
- often badly.”
-Marcia Bates
The 1 TB Life

1TB gives you 65+ years of:
100 email messages a day (5KB each)
100 web pages a day (50KB each)
5 scanned pages a day (100KB each)
1 book every 10 days (1 MB each)
10 photos per day (400 KB JPEG each)
8 hours per day of sound - e.g. telephone,
voice annotations, and meeting recordings (8 Kb/s)
1 new music CD every 10 days (45 min each at 128 Kb/s)
It will take you 5 years to fill up your 80 GB drive
Want video? Buy more cheap drives (1 TB/year lets
you record 4 hours/day of 1.5 Mb/s video)
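The budget above can be checked with a few lines of arithmetic. The following is a rough sketch, assuming decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes) and a 365-day year; the slide states neither, and binary units would stretch the estimate by a few years.

```python
# Back-of-envelope check of the "1 TB life" budget.
# Assumptions (not on the slide): decimal units and a 365-day year.

KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12
SECONDS_PER_HOUR = 3600


def stream_bytes(bits_per_second, seconds):
    """Bytes produced by a constant-bit-rate stream."""
    return bits_per_second * seconds / 8


DAILY_BYTES = {
    "email": 100 * 5 * KB,
    "web pages": 100 * 50 * KB,
    "scanned pages": 5 * 100 * KB,
    "books": 1 * MB / 10,                           # one book every 10 days
    "photos": 10 * 400 * KB,
    "sound": stream_bytes(8_000, 8 * SECONDS_PER_HOUR),
    "music": stream_bytes(128_000, 45 * 60) / 10,   # one CD every 10 days
}

daily_total = sum(DAILY_BYTES.values())
yearly_total = daily_total * 365

years_per_tb = TB / yearly_total
years_per_80gb = 80 * GB / yearly_total
video_per_year = stream_bytes(1_500_000, 4 * SECONDS_PER_HOUR) * 365

print(f"per day:            {daily_total / MB:.2f} MB")
print(f"years per 1 TB:     {years_per_tb:.1f}")
print(f"years per 80 GB:    {years_per_80gb:.1f}")
print(f"video (4 h/day):    {video_per_year / TB:.2f} TB/year")
```

Sound dominates the non-video total: eight hours a day at 8 Kb/s is roughly two thirds of the daily volume.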
Everything goes in a database
You need all the features of a database
(Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication)
If you don’t use one, you will find yourself creating one!
Files as blobs, also sync with file system for legacy apps
SQL
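As an illustration of "files as blobs" with a path back to the file system for legacy applications, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, and function names are hypothetical and are not the MyLifeBits schema.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small item store. The schema is illustrative only."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS items (
               id          INTEGER PRIMARY KEY,
               name        TEXT NOT NULL,
               mime_type   TEXT,
               created_utc TEXT NOT NULL,
               sha256      TEXT NOT NULL,
               content     BLOB NOT NULL
           )"""
    )
    # An index on the timestamp keeps "I remember when" queries fast.
    db.execute("CREATE INDEX IF NOT EXISTS items_created ON items (created_utc)")
    return db


def add_item(db, name, content, mime_type, created_utc):
    """Store the file bytes as a blob and return the new item id."""
    digest = hashlib.sha256(content).hexdigest()
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT INTO items (name, mime_type, created_utc, sha256, content) "
            "VALUES (?, ?, ?, ?, ?)",
            (name, mime_type, created_utc, digest, content),
        )
    return cur.lastrowid


def export_item(db, item_id, dest_path):
    """Write a blob back to the file system so a legacy app can open it."""
    row = db.execute("SELECT content FROM items WHERE id = ?", (item_id,)).fetchone()
    if row is None:
        raise KeyError(item_id)
    with open(dest_path, "wb") as handle:
        handle.write(row[0])


db = open_store()
item_id = add_item(db, "MyLifeBits.ppt", b"fake slide bytes",
                   "application/vnd.ms-powerpoint", "2004-07-26T09:00:00Z")
```

Keeping the bytes in the database gives the blob the same consistency, backup, and replication as its metadata; the export step exists only for applications that insist on a path.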
MyLifeBits Software
[Figure: MyLifeBits software components]
Components shown: GPS import & map display; TV capture tool; SenseCam; telephone capture tool; MyLifeBits store; database; files; Internet; TV EPG download tool; browser tool; MyLifeBits Shell; screen saver; PocketPC transfer tool; PocketRadio player; radio capture & EPG; MAPI interface; legacy email client; legacy applications; IM capture; voice annotation tool; text annotation tool; import files
Memex
As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all
his books, records, and communications, and which
is mechanized so that it may be consulted with
exceeding speed and flexibility”
- Full-text search, text & audio annotations, and hyperlinks
I am data
The guinea pig
Gordon Bell is digitizing his life
Has now scanned virtually all:
Books written (and read when possible)
Personal documents (correspondence including memos and email,
bills, legal documents, papers written, …)
Photos
Posters, paintings, photos of things (artifacts, …medals, plaques)
Home movies and videos
CD collection
And, of course, all PC files
Now recording: phone, radio, TV (movies), web pages…
conversations and meetings to come
Paperless throughout 2002. 12” scanned, 12’ discarded.
Only 30 GB!!!
Capture and encoding
I mean everything
50+ year old newspaper clippings
400 year old books
O(100s) tapes from videotape “black hole”
Personal LifeLog Applications
[Figure: applications arranged by who uses them (Self to Others) and who controls them (Self, Personal Proxy, Others)]
Applications shown: Diary/Journal; Tutor; Mentor; Advisor; Babysitter; Financial Manager; Medical Manager; Companion; Caretaker; Parole Officer; Assistant for Elderly; Pers Flight Recorder; Meeting Prep; Personal Assistant; Photo Album; Autobiography; Captain’s Log; Conservator; Biography; Baby Book; Trustee; Obituary; Executor
Why bother? ...some reasons
Technology creates an opportunity e.g. 1 TB disks
Technology creates a need e.g. jpg
It will decay or disappear if you don’t save it
To eliminate physical storage (paper, CDs…)
It costs more (in time) to delete than it costs to store
The mantra of the squirrel: “I may need it some day.”
For posterity and nostalgia: “Maybe others will want it.”
For memory enhancement & faster search
(search your LifeBits rather than the web or your colleagues …
a single source to look for “stuff I’ve seen”)
Let content analysis and data mining discover trends
and correlations in our lives…that even we don’t know.
Aid to aging or failed memories
So you’ve got it – now what do you
do with it?
“A record if it is to be useful … must be
continuously extended, it must be stored,
and above all it must be consulted”
“The difficulty seems to be, not so much that
we publish unduly … but rather that
publication has been extended far beyond
our present ability to make real use of the
record”
- Vannevar Bush
Trying to use my life bits
#1: Folders
One item. One place.
It worked for 1000s of years.
My docs and archive
[Figure: folder hierarchy. Labels: Self; Active Employer; X-Employer; Employer; Project; Business; Invests, family $s, & Legal; Personal, including Medical; Library/file cab (repeated under most branches); <1995 Library/file cab]
Freedom from hierarchy
c:\my documents\talks\MyLifeBits.ppt
ID=location=organization=display string
- Don’t make me invent unique names
- Don’t make me file everything
- Or let me pick multiple folders
“multiple categorization not only improves
organization and retrieval times but also
matches more closely with the way users
naturally think about organizing their
information” – Quan et al (MIT’s Haystack)
MyLifeBits
collection dialog
Of course, aliases and shortcuts can be used, albeit painfully,
to file by time and/or event, subject, location, type.
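One way to picture freedom from hierarchy is to separate an item's identity from its display string and let it sit in any number of collections. This is a minimal in-memory sketch; the class and method names are invented for illustration and do not describe the MyLifeBits collection dialog.

```python
import itertools
from collections import defaultdict


class CollectionStore:
    """Items get opaque ids; names need not be unique; filing is optional."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.names = {}                         # item id -> display string
        self.members = defaultdict(set)         # collection -> item ids
        self.collections_of = defaultdict(set)  # item id -> collections

    def add(self, display_name, *collections):
        """Add an item; zero collections is fine ("don't make me file")."""
        item_id = next(self._next_id)
        self.names[item_id] = display_name
        for collection in collections:
            self.file(item_id, collection)
        return item_id

    def file(self, item_id, collection):
        """Put an existing item into one more collection."""
        self.members[collection].add(item_id)
        self.collections_of[item_id].add(collection)

    def in_all(self, *collections):
        """Ids of items that belong to every named collection."""
        sets = [self.members.get(name, set()) for name in collections]
        return set.intersection(*sets) if sets else set()


store = CollectionStore()
talk = store.add("MyLifeBits.ppt", "talks", "SIGIR 2004", "Sheffield")
draft = store.add("MyLifeBits.ppt", "talks")   # same name, different item
loose = store.add("scan0001.tif")              # never filed at all
```

Here `store.in_all("talks", "SIGIR 2004")` finds the talk by either handle, and the duplicate display string causes no conflict because the id, not the name, identifies the item.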
Trying to use my life bits
#2: Text annotations
Making bits more valuable and retrievable.
“It’s just bits until it is annotated”
Getting the user to tell a story is the
ultimate in media value
A story is a “layout” in time and space
Most valuable content (by selection, and by being well annotated)
Stories must include links to any media they use (for future navigation/search –
“transclusion”).
Cf: MovieMaker; Creative Memories PhotoAlbums
Dapeng was an
intern at BARC
for the summer
of 2000
We took him to
lunch at our
favorite Dim Sum
place to say
farewell
At table L-R: Dapeng, Gordon, Tom, Jim,
Don, Vicky, Patrick, Jim
Annotation like this…
Voice
Annotation
Annotation when you feel like it,
how you feel like it

Screensaver is the killer app!
Trying to use my life bits
#3: “I remember when…”
The 1st or 2nd most important retrieval handle.
MyLifeBits time overlap
MyLifeBits on-the-fly time clustering
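The slides name time overlap and on-the-fly time clustering without describing the algorithms. As a rough sketch of what such features could look like, the code below uses a closed-interval overlap test and a simple gap-based clustering; both are assumptions for illustration, not the MyLifeBits implementation, and the two-hour gap is arbitrary.

```python
from datetime import datetime, timedelta


def overlaps(a_start, a_end, b_start, b_end):
    """True if the closed intervals [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


def cluster_by_gap(timestamps, max_gap=timedelta(hours=2)):
    """Group timestamps so that neighbours within a cluster are at most max_gap apart."""
    clusters = []
    for ts in sorted(timestamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)   # close enough to the previous item
        else:
            clusters.append([ts])     # gap too large: start a new cluster
    return clusters


photos = [
    datetime(2004, 7, 26, 9, 0),
    datetime(2004, 7, 26, 9, 30),
    datetime(2004, 7, 26, 15, 0),
    datetime(2004, 7, 27, 9, 0),
]
clusters = cluster_by_gap(photos)
```

Because the clustering is a single pass over sorted timestamps, it can be rerun with a different `max_gap` whenever the user zooms the time view in or out.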
MSR Next Media Team
Mark Stewart’s
Lifeline
M Stewart Lifeline v2
Copyright Mark Stewart, 2004
Trying to use my life bits
#4: Relationships (links)
Using something near “it” to find “it”.
Mark Stewart’s first page
Copyright Mark Stewart, 2004
The Stew family tree
Copyright Mark Stewart, 2004
PhotoFinder - Shneiderman and Kang
MyLifeBits Entities & Links
Photo of Event
Caller in Phone Call
Annotates
Transcludes
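Typed links like these can be sketched as a small graph in which a link is indexed under both of its endpoints, so that anything near "it" leads to "it". The class, the identifier strings, and the link-kind spellings below are illustrative assumptions.

```python
from collections import defaultdict
from typing import NamedTuple


class Link(NamedTuple):
    source: str
    kind: str
    target: str


class LinkStore:
    """Typed links between entities, navigable from either end."""

    def __init__(self):
        self._by_node = defaultdict(list)

    def add(self, source, kind, target):
        link = Link(source, kind, target)
        self._by_node[source].append(link)
        self._by_node[target].append(link)

    def neighbors(self, node, kind=None):
        """Entities one link away from node, optionally filtered by link kind."""
        result = []
        for link in self._by_node.get(node, []):
            if kind is not None and link.kind != kind:
                continue
            result.append(link.target if link.source == node else link.source)
        return result


links = LinkStore()
links.add("photo:dim-sum-table", "photo of", "event:farewell lunch")
links.add("person:Dapeng", "caller in", "call:2000-08-30")
links.add("note:At table L-R", "annotates", "photo:dim-sum-table")
links.add("story:summer 2000", "transcludes", "photo:dim-sum-table")

# Remembering only the event is enough to reach the photo...
found = links.neighbors("event:farewell lunch", kind="photo of")
# ...and from the photo, everything that annotates or reuses it.
around_photo = links.neighbors("photo:dim-sum-table")
```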
Trying to use my life bits
#5: I remember where
Just essential.
Trying to use my life bits
#6: more meta-data (properties)
I remember something about the content
(understanding a person’s work)
Lederberg Finder page
Dublin core of a given item
Trying to use my life bits
#7: classification
Moving toward the ultimate time sink.
Is traditional classification required?
…at OCLC there was unanimous agreement
among faculty and participants that
“access to electronic resources
requires controlled vocabulary and
classification”
OCLC Institute, “Knowledge Access Management: Tools
and Concepts for Next Generation Catalogers”, 17-19
November 1997, Dublin, Ohio.
www.alberteinstein.info
Professional Life:
Organizations
Administrivia
Projects
Library
Lederberg papers official reports
Number of document segments
Lederberg Artifact types
Abstracts
Agendas not
Announcements m;
Application forms
Articles m
Autobiographies m
Bibliographies m
Biographies m
Brochures m
Certificates m
Correspondence m
Diaries m
Drafts (documents)
Drawings m
Electronic images m
Essays m
Eulogies
Excerpts
Grant proposals
Interviews m
Invitations
Laboratory notebooks m
Laboratory notes
Lecture notes
Lectures m
Legal documents m
Legislative records
Lists
Manifestoes
Memoirs m
Minutes
Monographs m
Narratives
Newsletters
Newspaper columns m
Notebooks m
Notes
Obituaries
Official reports
Oral histories m
Petitions
Photographic prints m
Press releases m
Procedures
Proceedings m
Programs m
Proposals m
Questionnaires
Reminiscences
Reports m
Resolutions
Resumes
Reviews m
School records
Speeches m
Summaries
Tables (documents)
Technical reports m
Transcripts m
Typescripts
Video recordings m
Species: Animals: Chordata: Vertebrata: bony fish
Computer structures: digital computer: minicomputer
Computer structures: digital computer: minicomputer
(refined: Digital Equipment Corp.)
Computer structures taxonomy: computers
Trying to use my life bits
#8: “ontology”???
“Succumbing to the ‘ontology’ fallacy”
-Bates
[Figure: ontology diagram centered on Self]
Nodes: Media; Ancestors, Parents, Siblings; Diaries; Comm.; Self; Artifacts; Friends, Family & related social; Children; Spouse/Significant Other
Organizations: Company (1); Employer (2); Non-profit (3); Family Business (2); Academic Inst. (2)
Family ($, property, legal, health), potentially private…: Legal; Health; Property (Auto, home & other “things”); Financial Assets
Articles, bio, books, interviews, talks, … web pages
Library & archives: info & records. Personal archives (Ambiance…); Library (4)
Institution type: academic, … companies, family, other Orgs … self
Notes:
1. Generic organization: Correspondence, financial, manuals, notebooks, org chart, plans, products, stocks, etc. Facets: doc type, dissemination, institution type
2. Generic org. plus projects x roles; facets: financial; legal
3. Generic organization for club, foundation, museum, professional org, religious, sport, etc.
4. Books, CDs, papers, videos. Facets: media type,
MyLifeBits: Some Lives(t)
Personal
- Parents, children, grandkids
- CGB himself
- GKB
- SSF
- Close friends
GB $s; Legal entities
- Personal incl. several legal structures
- Properties: autos, real estate,
- Investments & contracts
Past prof. companies/organizations
- DEC
- Carnegie-Mellon U.
- DEC, NSF, Encore, Ardent, Me Inc., Bell-Mason
Bell-Mason Director
Diamond & Vanguard Brds.
Startups & boards
CGB@ Microsoft
- MLB
- Clusters
- Telepresence
- WWW presence
Computer History Museum
- BOD member
- Fund-raising
- CyberMuseum
GB Timeline
[Figure: timeline from 1900 to 2010, marked by decade, showing people, places, and institutions in Gordon Bell's life, each tagged with a letter code (F, E, W, or O). Legible labels include Kirksville, MO; M.I.T.; Digital (DEC); Brigham (son); Laura (daughter); CMU; Encore; NSF; Ardent; Microsoft Res.]
Roles & Institutions
I <am son of> ….
I <am father of> Brigham <1960->, Laura <1963->
I <studied at> MIT <1952-1957; 1959-1960>
I <worked for> DEC <1960-1966; 1972-1983>
I <am a member of> ACM <1960- ->… NAE
I <am on the board of> Computer Museum…
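The role statements above read naturally as relation triples with a time interval attached. Here is a small sketch of that reading, using only the rows whose dates appear on the slide; the `Role` type and `active_in` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Role(NamedTuple):
    relation: str
    other: str
    start: int
    end: Optional[int]  # None means the role is still ongoing


# Rows taken from the slide; entries with elided details are left out.
ROLES = [
    Role("am father of", "Brigham", 1960, None),
    Role("am father of", "Laura", 1963, None),
    Role("studied at", "MIT", 1952, 1957),
    Role("studied at", "MIT", 1959, 1960),
    Role("worked for", "DEC", 1960, 1966),
    Role("worked for", "DEC", 1972, 1983),
    Role("am a member of", "ACM", 1960, None),
]


def active_in(roles, year):
    """Roles whose interval contains the given year (end years inclusive)."""
    return [r for r in roles if r.start <= year and (r.end is None or year <= r.end)]


for role in active_in(ROLES, 1965):
    print(f"I <{role.relation}> {role.other}")
```

Storing each stint as its own row keeps the gap between the two DEC periods visible, so a query for 1970 returns no employer rather than a merged 1960-1983 span.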
Things

Can everything be part of the model?
- Pets
- Houses
- Cars
- Assets
Trying to use my life bits
#9: logging & reports
Interface to xls
TV Usage
MyLifeBits Log of a video file
Open Problems
The “dear appy” problem
Dear Appy,
How committed are you?
Please come back to me.
Forever yours truly,
Lost and forgotten data

Who’s responsible?
Media or 8 track cassette, 8” floppy
Evolving platform, file, and database
Evolving, incompatible standards & formats for
legacy data that disregard ancestors
Evolving and/or disappearing apps
A Storocratic Oath
1. Do no harm to dates (File creation, Photo taken)
2. Do no harm to device created & other meta-data.
• Camera data & location data are sacred.
3. Support & aid the creation of critical metadata.
• When/how the user feels like it
• Auto-magically!
4. Maintain user confidentiality
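In practice, "do no harm to dates" often comes down to how a file is copied. The sketch below, an illustration rather than anything from MyLifeBits, uses Python's `shutil.copy2`, which copies file data and attempts to preserve metadata such as the modification time; the `archive_copy` wrapper is a hypothetical name. It does not touch dates stored inside the file, such as a photo's capture time.

```python
import os
import shutil


def archive_copy(src, dst):
    """Copy src to dst and fail loudly if the modification time was not kept."""
    before = os.stat(src)
    shutil.copy2(src, dst)  # plain shutil.copy would stamp dst with the current time
    after = os.stat(dst)
    # Compare whole seconds: file systems differ in timestamp precision.
    if int(after.st_mtime) != int(before.st_mtime):
        raise RuntimeError(f"modification time lost while copying {src!r}")
    # Return the original values so they can also be stored as metadata.
    return {"mtime": before.st_mtime, "size": before.st_size}
```

Returning the original timestamp lets the caller record it in the store as well, so the date survives even if a later tool rewrites the file.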
Classification wish list
Download classifications rather than build them
Definitions & synonyms should help find what I want
Today it is too expensive to manually classify my
scanned paper. E.g. “right time” meta-data is critical!
Next year I hope “the system” can classify my papers
In 10 years I expect all documents to appear
electronically & classified with a little help from me
Personal Search is not
Professional or Web search
System sees every entry & access
- Everything, not just a professional life
- Limited to SIS, not an infinite amount, covers a profession & personal life
[Figure: MyLifeBits, a professional user, and the Web as seen by search engines, compared by depth (e.g. information item types & coverage) and by knowledge breadth (e.g. Dewey classification)]
The killer app??
Input, File, Classify, and Find…
- Observe every action…
- Operational
SIS (e.g. msg, name, paper, fact, birthday, phone call,
- Time & motion (routing, communicating, scheduling … thinking)
Archival one’s self
Finder aka Table of Contents aka Site Map
- Story telling.
Screen saver & personal ambience
The A/V/real time data Future:
new capture modes/devices
Deja View
SenseCam
Body Media
Quindi
Sensecam &
Interactive jewellery
www.MyLifeBits.com