MyLifeBits: Attempting to realize the Memex Vision
Jim Gemmell & Roger Lueder
Gordon Bell
http://research.microsoft.com/barc/MediaPresence/MyLifeBits.aspx

Outline … MyLifeBits
  Background…fulfilling the Memex vision
  Cyberizing everything
  File to database transition
  Use…beyond search
  Long-term agenda and outlook

Memex
  Posited by Vannevar Bush in “As We May Think”, The Atlantic Monthly, July 1945
  “A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
  Supports: annotations, links between documents, and “trails” through the documents
  “yet if the user inserted 5000 pages of material a day it would take him hundreds of years to fill the repository, so that he can be profligate and enter material freely”

Sketch of memex

Bush’s camera on the head
  Capturing what you see

“The PC is going to be the place where you store the information and really the center of control” Billg 1/7/2001

MyLifeBits is a project to “cyberize” everything!
  What? Recall of all articles, books, CDs, photos, video, communication (e.g. mail, phone), meetings, and web
  Why? …“because we can”
    Office: communicate, store, & work
    Home & Media Center: ambiance & entertainment
    Immortality for progeny. Memory aids
  Goal: understand the 1 TByte PC for Longhorn: need, utility, cost, feasibility and tools.
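
Bush's capacity claim on the Memex slide (5000 pages a day for "hundreds of years") can be checked with quick arithmetic. The following is a minimal sketch, not anything from Bush or the MyLifeBits project: it assumes 365-day years and samples 100, 200, and 300 years for the unspecific "hundreds of years". The names `PAGES_PER_DAY` and `pages_after` are hypothetical.

```python
PAGES_PER_DAY = 5000  # figure quoted from Bush on the Memex slide
DAYS_PER_YEAR = 365   # assumption: leap days ignored


def pages_after(years: int, pages_per_day: int = PAGES_PER_DAY) -> int:
    """Total pages inserted after `years` of daily entry at a fixed rate."""
    return pages_per_day * DAYS_PER_YEAR * years


if __name__ == "__main__":
    # "hundreds of years" is not a precise figure; these spans are assumptions.
    for years in (100, 200, 300):
        print(f"{years} years -> {pages_after(years):,} pages")
```

Under these assumptions the implied repository holds on the order of hundreds of millions of pages.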
LifeLog: A potential research program
  LifeLog: A (sub)system that captures, stores, and makes accessible the flow of one person’s experience in and interactions with the world
  LifeLog Thrust: Capture the “story” of a human
  Living Content Ontology (format)

The End of the Line…
  Cave Paintings
  Biographies
  Sagas
  Family Bibles
  Photo Albums
  Videos
  Home Movies
  Blogs
  LifeLog

The guinea pig
  Gordon Bell is digitizing his life
  Has now scanned virtually all:
    Books written (and read when possible)
    Personal documents (correspondence including memos and email, bills, legal documents, papers written, …)
    Photos
    Posters, paintings, photo of things (artifacts, …medals, plaques)
    Home movies and videos
    CD collection
    And, of course, all PC files
  Now recording: phone, radio, TV (movies), web pages… conversations and meetings to come
  Paperless throughout 2002. 12” scanned, 12’ discarded.
  Only 30 GB!!!

I am data

Capture and encoding

Quindi conference capture

I mean everything

gbell wag: 67 yr, 25Kday life
  [Chart: Lifetime storage (GB), log scale from 1 to 1,000,000, by item type: Msgs, pages, Tifs, Books, jpegs, sound, songs, Videos. Item size labels: 5KB, 150KB, 100KB, 1MB, 400KB, 1KBps, 100MB, 10GB.]

MyLifeBits organization: time and space
  Timeline/Context (space)
  Archival (time)
  Working
  Personal (some $s)
  GB Co. (angel, etc.)
  Professional: ACM, etc., … @Microsoft.com, New co’s.

MyLifeBits: Some Lives(t)
  Personal: Parents, children, grandkids; CGB himself; GKB; Close friends
  GB $s: Personal incl. several legal structures; Properties: autos, real estate; Investments & contracts
  Past prof. companies/organiz’ns: DEC; Carnegie-Mellon U.; DEC, NSF, Encore, Ardent, Me Inc.
  CGB@ Microsoft: MLB; Clusters; Telepresence; WWW presence
  Computer History Museum: BOD member; Fund-raising; CyberMuseum
  Startups & boards: Bell-Mason Director; Diamond & Vanguard Brds.
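
The "67 yr, 25Kday life" figure on the lifetime storage slide, combined with the 1 TByte PC goal stated earlier, gives a rough daily capture budget. This is a back-of-envelope sketch only: it assumes 365.25-day years and 1 TByte = 10^12 bytes, and the function names are hypothetical.

```python
LIFE_YEARS = 67         # from the "gbell wag" slide
LIFE_DAYS = 25_000      # the slide's rounded "25Kday life"
DAYS_PER_YEAR = 365.25  # assumption: average year length
DISK_BYTES = 10**12     # assumption: 1 TByte taken as 10^12 bytes


def life_days(years: float = LIFE_YEARS) -> float:
    """Days in a life of the given length, to compare with the slide's 25K."""
    return years * DAYS_PER_YEAR


def daily_budget_bytes(disk_bytes: int = DISK_BYTES, days: int = LIFE_DAYS) -> float:
    """Bytes that can be captured per day if the disk must last `days` days."""
    return disk_bytes / days


if __name__ == "__main__":
    print(f"{LIFE_YEARS} years is about {life_days():,.0f} days")
    print(f"daily budget: {daily_budget_bytes() / 1e6:.0f} MB")
```

Under these assumptions a 1 TByte disk spread over 25,000 days allows about 40 MB of capture per day.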
Personal LifeLog Applications
  [Diagram: applications placed by “Application used by” (Self, Others) and “Application controlled by” (Self, Personal Proxy, Others).]
  Applications shown: Diary/Journal, Tutor, Mentor, Advisor, Babysitter, Financial Manager, Medical Manager, Companion, Caretaker, Parole Officer, Assistant for Elderly, Pers Flight Recorder, Meeting Prep, Personal Assistant, Photo Album, Autobiography, Captain’s Log, Conservator, Biography, Baby Book, Trustee, Obituary, Executor

MyLifeBits is:
  Memex and more (audio and video)
  Universal store for all personal stuff
  Guiding principles for the system:
    1. Full text search & collections (> than hierarchy)
    2. Visualizations for search, display, insight
    3. Annotations and links add value and are essential
    4. Increase search ability and value of information. So make many kinds and make them easy to create!
    Stories are the ultimate annotation
    Keep the links when you author: “transclusion”

MLB database: size and content?
  Database features are essential: Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication.
  Folders & Files were the starting point >> database into sets aka “collections” that are identical to the folder structure
  Outlook (msgs, attachments, calendar, contacts)
  Web trails including voice message annotation
  Journal (Outlook), trails: every document use & transaction
  What about?
    Money (transactions, payees, etc.)…is their lifelog/trail
    Streets and trips to cross-index to all docs
    Attributes for photos for retrieval? Location, time, settings
    Presentations as a report or trail. Each slide an object!
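
The move from folders and files to database “collections” that start out identical to the folder structure can be sketched as below. This is an illustrative sketch, not the MyLifeBits schema: the table names, function names, and file paths are all hypothetical, and SQLite stands in for whatever database the system actually uses.

```python
import sqlite3
from pathlib import PurePosixPath

SCHEMA = """
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE collection (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE member (
    collection_id INTEGER REFERENCES collection(id),
    item_id INTEGER REFERENCES item(id),
    PRIMARY KEY (collection_id, item_id)
);
"""


def build_store(paths):
    """Load file paths into an in-memory database, one collection per folder."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    for path in map(PurePosixPath, paths):
        item_id = db.execute(
            "INSERT INTO item (name) VALUES (?)", (path.name,)
        ).lastrowid
        folder = str(path.parent)
        db.execute("INSERT OR IGNORE INTO collection (name) VALUES (?)", (folder,))
        (coll_id,) = db.execute(
            "SELECT id FROM collection WHERE name = ?", (folder,)
        ).fetchone()
        db.execute("INSERT INTO member VALUES (?, ?)", (coll_id, item_id))
    return db


def add_to_collection(db, item_name, collection_name):
    """Put an existing item into a further collection without copying it."""
    db.execute(
        "INSERT OR IGNORE INTO collection (name) VALUES (?)", (collection_name,)
    )
    db.execute(
        """INSERT OR IGNORE INTO member
           SELECT c.id, i.id FROM collection c, item i
           WHERE c.name = ? AND i.name = ?""",
        (collection_name, item_name),
    )


def items_in(db, collection_name):
    """Names of all items in a collection, sorted alphabetically."""
    rows = db.execute(
        """SELECT i.name FROM item i
           JOIN member m ON m.item_id = i.id
           JOIN collection c ON c.id = m.collection_id
           WHERE c.name = ? ORDER BY i.name""",
        (collection_name,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    db = build_store(["photos/2000/dimsum.jpg", "mail/farewell.msg"])
    add_to_collection(db, "dimsum.jpg", "BARC interns")
    print(items_in(db, "photos/2000"), items_in(db, "BARC interns"))
```

The point of the many-to-many `member` table is the “> than hierarchy” principle: an item keeps its original folder-derived collection and can join any number of others.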
[Architecture diagram. Components: Radio capture tool, Telephone capture tool, PocketPC transfer tool, PocketRadio player, TV capture tool, Internet, MyLifeBits store (database, files), Radio EPG tool, TV EPG download tool, MAPI interface, Browser tool, Legacy applications, MyLifeBits Shell, Voice annotation tool, Text annotation tool, Legacy email client.]

Annotation like this…
  Voice Annotation

Pivot to look at all of MLB(t)
  Call, contact, pivot by time to find web page

Find brig, image, and look for 80

Here are the photos

Timeline view tells a story

Finding scatological works

Statistics of use

Value of media depends on annotations
  “It’s just bits until it is annotated”

System annotations provide base level of value
  Date 7/7/2000

Tracking usage – even better
  Date 7/7/2000. Opened 30 times, emailed to 10 people (it’s valued by the user!)

Getting the user to say a little something is a big jump
  Date 7/7/2000. Opened 30 times, emailed to 10 people. “BARC dim sum intern farewell Lunch”

Getting the user to tell a story is the ultimate in media value
  A story is a “layout” in time and space
  Most valuable content (by selection, and by being well annotated)
  Stories must include links to any media they use (for future navigation/search – “transclusion”). Cf: MovieMaker; Creative Memories PhotoAlbums
  Dapeng was an intern at BARC for the summer of 2000
  We took him to lunch at our favorite Dim Sum place to say farewell
  At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim

Value of media depends on annotations
  “It’s just bits until it is annotated”
  Annotations: user-story, user-basic, auto-usage, auto, none
  Auto-annotate whenever possible e.g. GPS cameras
  Make manual annotation as easy as possible. XP photo capture, voice, photos with voice, etc
  Support gang annotation
  Make stories easy

[Media center wiring diagram. Labeled components: speakers (5), CD, legacy VCR, legacy cassette, legacy set top, DVD, receiver, cable/satellite set top, Ethernet, camera, mic, Media Center computer, keyboard and mouse, plasma panel. Labeled links: stereo, 5.1 digital, component, SVHS-wide, IR, Video*.]
*Video = composite or S-video

  Cables/links         Count
  Speaker              5+1
  Plasma               2 or 3
  Cable/Enet           2
  IR                   8
  Stereo               4
  5.1 digital          2
  Comp./S-video        3
  Plasma panel         1
  Power                10
  Kbd/mse              2
  Monitor II (opt.)    4
  Camera               2
  Total                42 – 46
  Things               18+remote

Media center 2

Photos

Caneel Bay Vacation Jan. 1998
  Gordon, Gwen, Brig, Pam, Fiona, Bob, Laura and Kolbe

The Agenda for the Tbyte(s), Lifetime, PC: The killer app after office and mail.
  1. Guarantee that data will live forever! “dear appy” problem
  2. Cheap, easy, and data-rich (e.g. time, place) capture:
       GPS and time everywhere
       Paper capture has to be as easy as discard (scanner/shredder)
       Personal meeting capture...
       E-book…e-magazines & journals need to have critical mass!
       Telephony and audio capture with indexing
       Media Center compatible for entertainment (photos, video, TV, radio)
  3. Content analysis (critical for photo & video!)
  4. Information control: privacy, security, expunge/deniability,…
  5. One dbase for everything (articles, books, conversations, ... financial transactions) …vs. long-term use of hierarchical files. Is dbase intuitive?
  6. Annotations/meta-information add ever-increasing value
  7. Easy annotation for aiding search and it becomes the content
  The “killer apps”: Alzheimer, immortality, surrogate memory?

The “dear appy” problem
  Dear Appy, How committed are you? Please come back to me, Lost and forgotten data
  Who’s responsible?
    media platform, file, and databases
    evolving standards and formats
    evolving and/or disappearing apps

The Amnesia Control Problem
  Full sharing of bits that are mine
    I created them, OK to copy and distribute
  DRM: purchased for my own use
    “OK to look at, but I only own half the bits”
  Controlling forgetfulness
    Private, do not “demo”
    Expunge forever... “this never happened”

The Content Analysis Problem
  1. “Cliplets”: Automatic segmentation of a pile of documents and video into individual documents and scenes.
  2. Item typing: Would like a minimal Dublin Core for each item: date, creator, title, source, abstract, and type
  3. “Type” classification: articles, letters, memos, etc.
  4. Ontology creation for collections

The End