MyLifeBits - Microsoft Research

MyLifeBits:
Realizing the Memex Vision
Santa Clara University
13 May 2004
Gordon Bell,
Jim Gemmell & Roger Lueder
www.MyLifeBits.com
www.research.microsoft.com/~gbell
1
MyLifeBits collage
2
Outline … MyLifeBits
Background…fulfilling the Memex vision
• Cyberizing everything
• File to database transition
• Use…beyond search
• Working with Media Center for home use
• Long-term agenda and outlook
Archiving persons and things.
3
Memex
As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all
his books, records, and communications, and which
is mechanized so that it may be consulted with
exceeding speed and flexibility”
• Full-text search, text & audio annotations, and hyperlinks
4
Capturing what you see
5
I am data
6
The guinea pig
• Gordon Bell is digitizing his life
• Has now scanned virtually all:
– Books written (and read when possible)
– Personal documents (correspondence including memos and email, bills, legal documents, papers written, …)
– Photos
– Posters, paintings, photos of things (artifacts, …medals, plaques)
– Home movies and videos
– CD collection
– And, of course, all PC files
• Now recording: phone, radio, TV (movies), web pages… conversations and meetings to come
• Paperless throughout 2002. 12” scanned, 12’ discarded.
• Only 30 GB!!!
7
Capture and encoding
8
Quindi conference capture
9
I mean everything
10
Wearable & interactive jewellery LEDs flash
according to sensor type triggered
11
Potentially useful trivia – but not
normally photographed
12
GPS: tells where and when
13
Kentaro Toyama
wwmx.org
14
gbell wag: 67 yr, 25Kday life
[Chart: Lifetime storage (GB), log scale from 1 to 1,000,000, by item type: Msgs, pages, Tifs, Books, jpegs, sound, songs, Videos]
Item sizes listed: 5KB, 150KB, 100KB, 1MB, 400KB, 1KBps, 100MB, 10GB
15
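The arithmetic behind a lifetime-storage estimate can be sketched as items per day times item size times days lived. This is only an illustration: the function name and the decimal units are assumptions, and the one data point used (100 messages a day at 5 KB each) is an assumed reading of the chart labels, not a confirmed figure.

```python
DAYS_IN_LIFE = 25_000  # "25Kday life" from the slide (roughly 67 years)


def lifetime_gb(items_per_day: float, item_kb: float, days: int = DAYS_IN_LIFE) -> float:
    """Estimate lifetime storage in GB, using decimal units (1 GB = 1,000,000 KB)."""
    return items_per_day * item_kb * days / 1_000_000


# Assumed reading of the chart: 100 messages per day at 5 KB each.
msgs_gb = lifetime_gb(100, 5)
print(f"messages: {msgs_gb} GB over {DAYS_IN_LIFE} days")
```

Under these assumptions a lifetime of messages comes to about 12.5 GB, which is why small text items barely register next to video on a log scale.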
MyLifeBits organization: time and space
[Diagram: Timeline/Context (space) against Archival and Working (time), for Personal (some $s), GB Co. (angel, etc.), and Professional (ACM, etc., …, @Microsoft.com, New co’s.)]
16
MyLifeBits: Some Lives(t)
• Personal
– Parents, children, grandkids
– CGB himself
– GKB
– Close friends
• GB $s
– Personal incl. several legal structures
– Properties: autos, real estate,
– Investments & contracts
• Past prof. companies/organiz’ns
– DEC
– Carnegie-Mellon U.
– DEC, NSF, Encore, Ardent, Me Inc.,
• CGB@ Microsoft
– MLB
– Clusters
– Telepresence
– WWW presence
• Computer History Museum
– BOD member
– Fund-raising
– CyberMuseum
• Startups & boards
• Bell-Mason Director
• Diamond & Vanguard Brds.
17
Bell Lives timeline
[Timeline figure, 1900 to 2010: rows for Where, Education, Work, Books, and Computers]
18
Personal LifeLog Applications
[Diagram: applications arranged by “Application used by:” (Self to Others) and “Application controlled by:” (Self, Personal Proxy, Others)]
Applications shown: Diary/Journal, Tutor, Mentor, Advisor, Babysitter, Financial Manager, Medical Manager, Companion, Caretaker, Parole Officer, Assistant for Elderly, Pers Flight Recorder, Meeting Prep, Personal Assistant, Photo Album, Autobiography, Captain’s Log, Conservator, Biography, Baby Book, Trustee, Obituary, Executor
19
MyLifeBits Software
[Diagram of components around the MyLifeBits store]
• Radio capture tool
• TV capture tool
• Telephone capture tool
• MyLifeBits store
• Internet
• TV EPG download tool
• database
• Browser tool
• MyLifeBits Shell
• PocketPC transfer tool
• PocketRadio player
• Radio EPG tool
• MAPI interface
• Legacy email client
• files
• Legacy applications
• IM capture
• Voice annotation tool
• Text annotation tool
• Import files
20
MyLifeBits is:
• Memex and more (audio and video)
• Universal store for all personal stuff
• Guiding principles for the system:
1. Full text search & collections (> than hierarchy)
2. Visualizations for search, display, insight
3. Annotations and links add value and are essential
– Increase search ability and value of information.
– So make many kinds and make them easy to create!
– Stories are the ultimate annotation
4. Keep the links when you author: “transclusion”
21
MLB database: size and content?
• Database features are essential: Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication.
• Folders & Files were the starting point >> database into sets aka “collections” that are identical to the folder structure
• Outlook (msgs, attachments, calendar, contacts)
• Web trails including voice message annotation
• Journal (Outlook), trails: every document use & transaction
• What about?
– Money (transactions, payees, etc.)…is their lifelog/trail
– Streets and trips to cross-index to all docs
– Attributes for photos for retrieval? Location, time, settings
– Presentations as a report or trail. Each slide an object!
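The move from folders to database "collections" can be sketched with SQLite. This is a minimal illustration, not the MyLifeBits schema: every table, column, and sample row here is hypothetical. It shows the two properties the slide stresses: an item can belong to several collections at once (unlike a file in one folder), and an index on time allows pivoting across collections.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, created TEXT);
CREATE TABLE collections (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
-- Many-to-many: unlike a folder, an item may sit in several collections.
CREATE TABLE membership (
    item_id INTEGER REFERENCES items(id),
    collection_id INTEGER REFERENCES collections(id),
    PRIMARY KEY (item_id, collection_id)
);
CREATE INDEX items_by_time ON items(created);
""")
# Hypothetical sample data.
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", [
    (1, "memo.doc", "document", "2002-03-01"),
    (2, "lunch.jpg", "photo", "2002-03-01"),
    (3, "trip.jpg", "photo", "2003-07-15"),
])
conn.executemany("INSERT INTO collections VALUES (?, ?)", [(1, "Work"), (2, "Photos")])
conn.executemany("INSERT INTO membership VALUES (?, ?)", [(1, 1), (2, 1), (2, 2), (3, 2)])


def items_in(collection: str) -> list[str]:
    """Names of all items in a collection, oldest first."""
    rows = conn.execute(
        """SELECT i.name FROM items i
           JOIN membership m ON m.item_id = i.id
           JOIN collections c ON c.id = m.collection_id
           WHERE c.name = ? ORDER BY i.created, i.name""",
        (collection,),
    )
    return [r[0] for r in rows]


def items_on(day: str) -> list[str]:
    """Pivot by time: everything created on a day, regardless of collection."""
    rows = conn.execute("SELECT name FROM items WHERE created = ? ORDER BY name", (day,))
    return [r[0] for r in rows]


print(items_in("Work"), items_in("Photos"), items_on("2002-03-01"))
```

Here `lunch.jpg` appears in both "Work" and "Photos", something a strict folder hierarchy would need a copy or shortcut to express.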
22
Why bother? An existence proof.
The following exist in abundance:
• Shoeboxes full of photos
• Photo albums & framed photos
– Creative Memories is a thriving business selling resources for creating high-end photo albums that are well laid out and highly annotated, using long-lasting materials.
• Home videos
• Bookshelves and filing cabinets
• Old bundles of letters
• Professional video/photo companies do capture at kids’ sports events and sell content like hotcakes
• Probably not accessed very often but TREASURED (what’s the one thing you would save in a fire?)
23
Why bother? ..more reasons
• To eliminate physical storage (paper, CDs…)
• It costs more (in time) to delete than the cost of the storage
• You may only want to retrieve one of many items in the future, but cannot predict which one (which is why you file many things now)
• For posterity and nostalgia
• For memory enhancement & faster search (search your LifeBits rather than the web … a single source to look for anything you have ever seen)
• Let content analysis and data mining discover trends and correlations in your life
24
Extensible XML schemas
Logical views
Programmatic relationships
Synchronization service
Information agents
[Diagram: people, user, infrastructure, and system layers, each with application specific data]
Annotation like this…
Voice
Annotation
26
Pivot to look at all of MLB(t)
Call, contact, pivot by time to find web page
28
Find brig, image, and look for 80
29
Here are the photos
30
Timeline view tells a story
31
Interface to xls
32
Statistics of use
33
Value of media depends on annotations
“It’s just bits until it is annotated”
35
Getting the user to tell a story is the ultimate in media value
• A story is a “layout” in time and space
• Most valuable content (by selection, and by being well annotated)
• Stories must include links to any media they use (for future navigation/search – “transclusion”).
• Cf: MovieMaker; Creative Memories PhotoAlbums
Dapeng was an intern at BARC for the summer of 2000. We took him to lunch at our favorite Dim Sum place to say farewell.
At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim
36
Value of media depends on annotations
“It’s just bits until it is annotated”
[Chart: value of media by annotation level: user-story, user-basic, auto-usage, auto, none]
• Auto-annotate whenever possible e.g. GPS cameras
• Make manual annotation as easy as possible. XP photo capture, voice, photos with voice, etc
• Support gang annotation
• Make stories easy
37
Future work: Visualizations
Don't give me a little card image and say, "That's all you've got, because that's what I thought you should want for your virtual shoebox." There have got to be multiple modalities and the designers have to be able to deal with that. … don't metaphor me in, don't give me only one way of looking at things.
-Andy van Dam, Hypertext '87 Keynote Address
[Images: Web Scout, U. Maryland, IN-SPIRE, Next Media]
38
LifeLines (Plaisant et al.), University of Maryland
www.cs.umd.edu/hcil/lifelines
39
Rethinking collections & files
• Date collections (“summer 99”)
• By Person (“Photos of Bill”)
– Better as links of type “photo of” to person “Bill”
• By Event (“Trip to UCLA”)
– Much better as a query
– Better as links to event in calendar
• Working set
– Better as query that figures it out for me so I don’t need to maintain it
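One way to picture typed links and queries replacing hand-maintained collections is the sketch below. It is an assumption-laden illustration, not the project's design: the `objects` and `links` tables, the column names, and the sample photos are all hypothetical. A "Photos of Bill" folder becomes a lookup over "photo of" links, and a "summer 99" folder becomes a date-range query.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE objects (id INTEGER PRIMARY KEY, kind TEXT, label TEXT, taken TEXT);
-- A typed link from one object to another, e.g. photo --"photo of"--> person.
CREATE TABLE links (src INTEGER, dst INTEGER, type TEXT);
""")
# Hypothetical sample data.
db.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", [
    (1, "person", "Bill", None),
    (2, "photo", "img_001.jpg", "1999-07-04"),
    (3, "photo", "img_002.jpg", "1999-12-25"),
    (4, "photo", "img_003.jpg", "1999-08-10"),
])
db.executemany("INSERT INTO links VALUES (?, ?, ?)", [
    (2, 1, "photo of"),
    (3, 1, "photo of"),
])


def photos_of(person: str) -> list[str]:
    """Replace a 'Photos of <person>' folder with a lookup over typed links."""
    rows = db.execute(
        """SELECT p.label FROM objects p
           JOIN links l ON l.src = p.id AND l.type = 'photo of'
           JOIN objects who ON who.id = l.dst
           WHERE who.kind = 'person' AND who.label = ?
           ORDER BY p.label""",
        (person,),
    )
    return [r[0] for r in rows]


def photos_between(start: str, end: str) -> list[str]:
    """Replace a date collection such as 'summer 99' with a date-range query."""
    rows = db.execute(
        "SELECT label FROM objects WHERE kind = 'photo' AND taken BETWEEN ? AND ? ORDER BY taken",
        (start, end),
    )
    return [r[0] for r in rows]


print(photos_of("Bill"))
print(photos_between("1999-06-21", "1999-09-22"))
```

Neither result set has to be maintained by hand: adding a new photo with a link or a date makes it show up in the right "collection" automatically.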
40
Facets and people
• Time (& stage of life). Events…
• Location (lat/long vs home, vacation)
• Institution (relations including family, work, clubs,…)
• Role (student, professional, parent, owner, etc.)
• Content type
– Audio, graphics, photo, video aka moving picture
– Document type O(200) plus profession specific: ad, bill…will, cards (calling, credit, grade, greeting), certificate (birth…death), correspondence, diary, essay, forms, legal (6), instructions, lists, resume, reservation, scrapbook, transcript,
• Dissemination
– Book, electronic, serial, unpublished,
• Special collections (e.g. geology, stamps, species, places)
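Faceted retrieval over a list like this can be sketched in a few lines. The facet names loosely follow the slide, but the field names, the sample items, and both helper functions are invented for illustration only.

```python
from collections import Counter

# Invented sample items; facet values are illustrative, not from the project.
ITEMS = [
    {"name": "birth_cert.tif", "content_type": "document", "doc_type": "certificate", "role": "parent"},
    {"name": "grade_card.tif", "content_type": "document", "doc_type": "card", "role": "student"},
    {"name": "lecture.wav", "content_type": "audio", "doc_type": None, "role": "professional"},
    {"name": "diploma.tif", "content_type": "document", "doc_type": "certificate", "role": "student"},
]


def select(items, **facets):
    """Return names of items whose facet values match every given facet."""
    return [it["name"] for it in items
            if all(it.get(key) == value for key, value in facets.items())]


def facet_counts(items, facet):
    """Count items per value of one facet, skipping items without that facet."""
    return Counter(it[facet] for it in items if it.get(facet) is not None)


print(select(ITEMS, content_type="document", doc_type="certificate"))
print(facet_counts(ITEMS, "doc_type"))
```

Combining facets narrows the result, while the counts are what a facet list in a user interface would display next to each value.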
41
Facet Lists
42
Certificate facets
43
“By region” and “by time” should be facets!
44
Telephone, Television,
and Radio in the
Home of the Future
45
Evolution of media in the home
Yesterday:
• Analog storage and transmission on separate networks
• Physical space limitations
• Tedious management and manual search
Today:
• Digital storage (CDs, DVDs, PVRs, MPEG & WMA/V)
• Digital cable, internet radio, but phone is mostly analog
• Still limitations on what we can store
• Different stores for different stuff
Tomorrow:
• All digital
• Everything connected
• Unlimited storage
• Everything in a database
SQL
46
[Wiring diagram: Media Center computer with keyboard and mouse, plasma panel, receiver, 5 speakers, CD, DVD, legacy VCR, legacy stereo, cassette, set-top boxes (cable/satellite), Ethernet, camera, and mic, connected by stereo, 5.1 digital, Video*, SVHS-wide, and IR links]
Cables/links
Speaker 5+1
Plasma 2 or 3
Cable/Enet 2
IR 8
Stereo 4
5.1 digital 2
Comp./S-video 3
Plasma panel 1
Power 10
Kbd/mse 2
Monitor II (opt.) 4
Camera 2
Total 42 – 46
Things 18+remote
*Video = composite or S-video
47
48
The Agenda for the Tbyte(s), Lifetime, PC:
The killer app after office and mail.
1. Guarantee that data will live forever! “dear appy” problem
2. Cheap, easy, and data-rich (e.g. time, place) capture:
– GPS and time everywhere
– Paper capture has to be as easy as discarding (scanner/shredder)
– Personal meeting capture...
– E-book…e-magazines & journals need to have critical mass!
– Telephony and audio capture with indexing
– Media Center compatible for entertainment (photos, video, TV, radio)
3. Content analysis (critical for photo & video!)
4. Information control: privacy, security, expunge/deniability,…
– Having to be schizophrenic or have a lobotomy when leaving a “life”
5. One dbase for everything (articles, books, conversations, ... financial transactions) …vs. long-term use of hierarchical files. Is dbase intuitive?
6. Annotations/meta-information add ever-increasing value
7. Easy annotation for aiding search and it becomes the content
8. The “killer apps”: Alzheimer, immortality, surrogate memory?
9. GUI’s to improve use (e.g. time to learn, use, retention)
50
The “dear appy” problem
Dear Appy,
How committed are you?
Please come back to me,
Lost and forgotten data

Who’s responsible?
• media
• platform, file, and databases
• evolving standards and formats
• evolving and/or disappearing apps
51
Problems: “Amnesia” control & deleting corporate “life” bits
• Full sharing of bits that are mine
– I created them, OK to copy and distribute
• DRM: purchased for my own use
– “OK to look at, but I only own half the bits”
• Controlling forgetfulness
– Private, do not “demo”
– Expunge forever... “this never happened”
• The bits “belong” to a corporation or org.
52
The Content Analysis Problem
1. “Cliplets”: Automatic segmentation of a pile of documents and video into individual documents and scenes.
2. Item typing: Would like a minimal Dublin Core for each item: date, creator, title, source, abstract, and type
3. “Type” classification: articles, letters, memos, etc.
4. Ontology creation for collections
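The "minimal Dublin Core" record named in item 2 might look like the sketch below. The six fields are the ones listed on the slide; the class name, the helper function, and the sample values are hypothetical, and this is not an implementation of the full Dublin Core element set.

```python
from dataclasses import dataclass, asdict


@dataclass
class ItemRecord:
    """Minimal per-item metadata using the six fields listed on the slide."""
    date: str
    creator: str
    title: str
    source: str
    abstract: str
    type: str


def missing_fields(record: ItemRecord) -> list[str]:
    """List the fields still empty, i.e. what content analysis has yet to fill in."""
    return [name for name, value in asdict(record).items() if not value]


# Invented example: a scanned memo with only partial metadata so far.
record = ItemRecord(
    date="2002-03-01",
    creator="A. Author",
    title="Trip report",
    source="",
    abstract="",
    type="memo",
)
print(missing_fields(record))
```

A record with gaps like this one shows where automatic typing and abstracting would have to supply what capture alone does not.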
53
The End
54
Archiving persons and things…
• www.oac.cdlib.org for O(1K) corporations, people, places, things.
– List of finders, usually -> paper boxes!
– E.g. Apple collection at Stanford points to 600’ or say $1K/ft.
• www.AlbertEinstein.org Einstein’s papers, etc.
• diva.library.cmu.edu/Newell/ for Allen Newell
• profiles.nlm.nih.gov/ Nobel Prize winners, Lederberg
• www.ComputerHistory.org computing artifacts
• www.MyLifeBits.com project to capture entire life
55
List of finding aids
56
Apple at Stanford
57
www.alberteinstein.info
58
Allen Newell page
59
Lederberg
60
Computer History Museum
• 1401 Shoreline, Mountain View
61
Archiving computing artifacts
• Charles Babbage Institute …Smithsonian is similar
– 135 collections 8K cu.ft. (20 M pages; 2 TB)
– 160 oral histories (30MB/hr =6000 MB)
– 150 K photos (@1MB, 150 GB)
• Computer History Museum
– 6 K physical objects: world’s best artifact collection
– 10 K photos
– 2 K videos (<1 TB); including recent DV taped interviews
– 12 M pages books, manuals, brochures, papers, (1.2 TB)
– ?? of executable source & object codes
– 200 volunteers & many more world-wide
Amateurs versus professionals.
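The storage figures on this slide can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 KB = 1,000 bytes, and so on), which the slide does not state; the variable and function names are illustrative.

```python
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12  # decimal units (an assumption)


def bytes_per_page(total_bytes: int, pages: int) -> int:
    """Average size of one scanned page."""
    return total_bytes // pages


# Charles Babbage Institute: 20 M pages in 2 TB.
cbi_page = bytes_per_page(2 * TB, 20_000_000)
# Computer History Museum: 12 M pages in 1.2 TB.
chm_page = bytes_per_page(1_200 * GB, 12_000_000)
# 150 K photos at 1 MB each.
photos_total = 150_000 * MB
# Oral histories: 6000 MB at 30 MB per hour implies this many hours of audio.
oral_hours = (6000 * MB) // (30 * MB)

print(cbi_page // KB, chm_page // KB, photos_total // GB, oral_hours)
# prints: 100 100 150 200
```

Both page collections work out to about 100 KB per page, the photo total matches the 150 GB on the slide, and the oral-history figure implies roughly 200 hours of recordings in total.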
62
Computer History Museum
Artifact Collecting… the world is bits
• Artifact (“the machine”)
– Dormant or operating
– Hardware or software
• Project, people, plan
– Timeline of project
– Plan, schedule
– Specification, manuals
– Design
– Organization
– Communication
– Articles, books
– Interviews, talks, etc.
• Business aspects
– Plan, sales, marketing
– Ads, brochures, etc.
– Competitors
• Use
– User experience
– Video about its use
• Accessibility
– Raw bits, finding aid
– Interpreted story
– Exhibit
63
ChM Software Acquisition
64