Challenges in Using Lifetime Personal Information Stores based on MyLifeBits
Gordon Bell, Jim Gemmell, Roger Lueder
SIGIR, University of Sheffield, July 26, 2004

“I have watched as hundreds of millions of dollars have been invested to re-invent the wheel - often badly.” - Marcia Bates

The 1 TB Life
1 TB gives you 65+ years of:
  100 email messages a day (5 KB each)
  100 web pages a day (50 KB each)
  5 scanned pages a day (100 KB each)
  1 book every 10 days (1 MB each)
  10 photos a day (400 KB JPEG each)
  8 hours a day of sound, e.g. telephone, voice annotations, and meeting recordings (8 Kb/s)
  1 new music CD every 10 days (45 min each at 128 Kb/s)
It will take you 5 years to fill up your 80 GB drive.
Want video? Buy more cheap drives (1 TB/year lets you record 4 hours/day of 1.5 Mb/s video).

Everything goes in a database
  You need all the features of a database (consistency, indexing, pivoting, queries, speed/scalability, backup, replication).
  If you don’t use one, you will find yourself creating one!
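The storage budget on the "1 TB Life" slide can be checked with a few lines of arithmetic. The sketch below is not from the talk. It assumes decimal units throughout (1 KB = 1000 bytes, 1 Kb/s = 1000 bits per second) and 365 days per year; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the "1 TB Life" figures.
# Assumption (not stated on the slide): decimal units and 365-day years.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12


def stream_bytes(bits_per_second, seconds):
    """Bytes produced by a constant-bit-rate stream."""
    return bits_per_second / 8 * seconds


# Average bytes captured per day, one entry per bullet on the slide.
daily_bytes = {
    "email": 100 * 5 * KB,
    "web pages": 100 * 50 * KB,
    "scanned pages": 5 * 100 * KB,
    "books": 1 * MB / 10,                                  # 1 book every 10 days
    "photos": 10 * 400 * KB,
    "sound": stream_bytes(8_000, 8 * 3600),                # 8 h/day at 8 Kb/s
    "music CDs": stream_bytes(128_000, 45 * 60) / 10,      # 1 CD every 10 days
}

per_day = sum(daily_bytes.values())
per_year = per_day * 365

years_per_tb = TB / per_year
years_to_fill_80gb = 80 * GB / per_year

# Video is budgeted separately: 4 hours/day at 1.5 Mb/s.
video_per_year = stream_bytes(1_500_000, 4 * 3600) * 365

print(f"{per_day / MB:.1f} MB/day, {per_year / GB:.1f} GB/year")
print(f"1 TB lasts {years_per_tb:.0f} years; 80 GB lasts {years_to_fill_80gb:.1f} years")
print(f"video: {video_per_year / TB:.2f} TB/year")
```

Under these assumptions the non-video total is roughly 43 MB a day, or about 16 GB a year, so 1 TB lasts a little over 60 years and an 80 GB drive about 5. Counting file sizes and the terabyte in binary units instead gives about 69 years, so the slide's "65+" sits between the two conventions. Sound recording dominates the daily total, and the video figure comes out just under 1 TB a year.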
  Files as blobs, also sync with file system for legacy apps
  SQL

MyLifeBits Software
[Figure: architecture diagram. Component labels: GPS import & Map display; TV capture tool; SenseCam; Telephone capture tool; MyLifeBits store; Internet; TV EPG download tool; database; Browser tool; MyLifeBits Shell; Screen saver; PocketPC transfer tool; PocketRadio player; Radio capture & EPG; MAPI interface; Legacy email client; files; Legacy applications; IM capture; Voice annotation tool; Text annotation tool; Import files]

Memex
As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
Full-text search, text & audio annotations, and hyperlinks

I am data
The guinea pig
Gordon Bell is digitizing his life
Has now scanned virtually all:
  Books written (and read when possible)
  Personal documents (correspondence including memos and email, bills, legal documents, papers written, …)
  Photos
  Posters, paintings, photo of things (artifacts, …medals, plaques)
  Home movies and videos
  CD collection
  And, of course, all PC files
Now recording: phone, radio, TV (movies), web pages… conversations and meetings to come
Paperless throughout 2002. 12” scanned, 12’ discarded. Only 30 GB!!!

Capture and encoding

I mean everything
  50+ year old newspaper clippings
  400 year old books
  O(100s) tapes from videotape “black hole”

Personal LifeLog Applications
[Figure: grid of applications. Labels as extracted: Self Diary/Journal Tutor Mentor Advisor Others Application used by: Babysitter Financial Manager Medical Manager Companion Caretaker Parole Officer Assistant for Elderly Pers Flight Recorder Meeting Prep Personal Assistant Photo Album Autobiography Captain’s Log Conservator Biography Baby Book Trustee Obituary Executor Others Application controlled by: Personal Proxy Self]

Why bother? ..some reasons
  Technology creates an opportunity, e.g. 1 TB disks
  Technology creates a need, e.g. jpg
  It will decay or disappear if you don’t save it
  To eliminate physical storage (paper, CDs…)
  It costs more (in time) to delete than it costs to store
  The mantra of the squirrel: “I may need it some day.”
  For posterity and nostalgia: “Maybe others will want it.”
  For memory enhancement & faster search (search your LifeBits rather than the web or your colleagues … a single source to look for “stuff I’ve seen”)
  Let content analysis and data mining discover trends and correlations in our lives…that even we don’t know.
  Aid to aging or failed memories

So you’ve got it – now what do you do with it?
“A record if it is to be useful … must be continuously extended, it must be stored, and above all it must be consulted”
“The difficulty seems to be, not so much that we publish unduly … but rather that publication has been extended far beyond our present ability to make real use of the record”
- Vannevar Bush

Trying to use my life bits #1: Folders
One item. One place. It worked for 1000s of years.
[Figure: “My docs and archive” folder hierarchy. Recoverable labels: Self; Active Employer; X-Employer; Employer; Project; Library/file cab; <1995; Business Invests, family $s, & Legal; Personal, including Medical]

Freedom from hierarchy
c:\my documents\talks\MyLifeBits.ppt
ID=location=organization=display string
  Don’t make me invent unique names
  Don’t make me file everything
  Or let me pick multiple folders
“multiple categorization not only improves organization and retrieval times but also matches more closely with the way users naturally think about organizing their information” – Quan et al (MIT’s Haystack)

MyLifeBits collection dialog
Of course Aliases and Shortcuts can be used, albeit painfully, to file by time and/or event, subject, location, type.
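One way to read "freedom from hierarchy" is that an item's identity, its name, and its place in an organization become three separate things once items live in a database. The sketch below is an assumption about how that could look, not the MyLifeBits schema: the table names, the `file_item` and `items_in` helpers, and the second sample item are all hypothetical. It uses SQLite through Python's standard `sqlite3` module.

```python
import sqlite3

# In-memory database; the schema is illustrative, not the MyLifeBits schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id      INTEGER PRIMARY KEY,   -- identity is a number, not a path
        title   TEXT,                  -- display string, need not be unique
        content BLOB                   -- the file itself, stored as a blob
    );
    CREATE TABLE collections (
        id   INTEGER PRIMARY KEY,
        name TEXT UNIQUE
    );
    CREATE TABLE membership (          -- many-to-many: item <-> collection
        item_id       INTEGER REFERENCES items(id),
        collection_id INTEGER REFERENCES collections(id),
        PRIMARY KEY (item_id, collection_id)
    );
""")


def file_item(title, content, collection_names=()):
    """Store an item and put it in zero or more collections."""
    item_id = conn.execute(
        "INSERT INTO items (title, content) VALUES (?, ?)", (title, content)
    ).lastrowid
    for name in collection_names:
        conn.execute("INSERT OR IGNORE INTO collections (name) VALUES (?)", (name,))
        (collection_id,) = conn.execute(
            "SELECT id FROM collections WHERE name = ?", (name,)
        ).fetchone()
        conn.execute("INSERT INTO membership VALUES (?, ?)", (item_id, collection_id))
    return item_id


def items_in(collection_name):
    """Titles of all items in a collection, in insertion order."""
    rows = conn.execute(
        """
        SELECT items.title
        FROM items
        JOIN membership  ON membership.item_id = items.id
        JOIN collections ON collections.id = membership.collection_id
        WHERE collections.name = ?
        ORDER BY items.id
        """,
        (collection_name,),
    )
    return [title for (title,) in rows]


# One item, several collections; a second item filed in only some of them.
file_item("MyLifeBits.ppt", b"placeholder bytes", ["talks", "SIGIR 2004", "Sheffield"])
file_item("trip-receipt.pdf", b"placeholder bytes", ["Sheffield", "expenses"])
file_item("untitled scan", b"placeholder bytes")   # not filed anywhere at all

print(items_in("Sheffield"))
print(items_in("talks"))
```

The same presentation shows up under "talks", under the event, and under the place, without aliases or shortcuts, and the unfiled scan is still stored and can be found later by other means such as time or full-text search.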
Trying to use my life bits #2: Text annotations
Making bits more valuable and retrievable.
“It’s just bits until it is annotated”
  Getting the user to tell a story is the ultimate in media value
  A story is a “layout” in time and space
  Most valuable content (by selection, and by being well annotated)
  Stories must include links to any media they use (for future navigation/search – “transclusion”).
  Cf: MovieMaker; Creative Memories PhotoAlbums

Annotation like this…
  Dapeng was an intern at BARC for the summer of 2000
  We took him to lunch at our favorite Dim Sum place to say farewell
  At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim

Voice Annotation
Annotation when you feel like it, how you feel like it
Screensaver is the killer app!

Trying to use my life bits #3: “I remember when…”
The 1st or 2nd most important retrieval handle.
  MyLifeBits time overlap
  MyLifeBits on-the-fly time clustering
  MSR Next Media Team
  Mark Stewart’s Lifeline
  M Stewart Lifeline v2 (Copyright Mark Stewart, 2004)

Trying to use my life bits #4: Relationships (links)
Using something near “it” to find “it”.
  Mark Stewart’s first page (Copyright Mark Stewart, 2004)
  The Stew family tree (Copyright Mark Stewart, 2004)
  PhotoFinder - Shneiderman and Kang
  MyLifeBits Entities & Links: Photo of Event; Caller in Phone Call; Annotates; Transcludes

Trying to use my life bits #5: I remember where
Just essential.

Trying to use my life bits #6: more meta-data (properties)
I remember something about the content (understanding a person’s work)
  Lederberg Finder page
  Dublin core of a given item

Trying to use my life bits #7: classification
Moving toward the ultimate time sink.
Is traditional classification required?
…at OCLC there was unanimous agreement among faculty and participants that “access to electronic resources requires controlled vocabulary and classification”
OCLC Institute, “Knowledge Access Management: Tools and Concepts for Next Generation Catalogers”, 17-19 November 1997, Dublin, Ohio.
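To make "controlled vocabulary" concrete: the idea is that many free-text terms map onto one preferred term, so a search under any synonym reaches the same items. The sketch below is a toy illustration, not an OCLC or MyLifeBits mechanism. Every vocabulary entry and the function names `normalize`, `classify`, and `matches` are made up for the example.

```python
# Hypothetical controlled vocabulary: synonym -> preferred term.
# All entries are invented for illustration.
PREFERRED = {
    "car": "automobiles",
    "cars": "automobiles",
    "auto": "automobiles",
    "automobile": "automobiles",
    "letter": "correspondence",
    "letters": "correspondence",
    "memo": "correspondence",
    "email": "correspondence",
}


def normalize(term):
    """Map a free-text term to its preferred term, or keep it lower-cased."""
    key = term.strip().lower()
    return PREFERRED.get(key, key)


def classify(free_text_tags):
    """Reduce a user's free-text tags to a sorted set of preferred terms."""
    return sorted({normalize(tag) for tag in free_text_tags})


def matches(query, item_tags):
    """True if the query and any of the item's tags share a preferred term."""
    return normalize(query) in classify(item_tags)


print(classify(["Memo", "car", "DEC"]))
print(matches("letters", ["Memo"]))
```

A memo tagged "Memo" is found by a search for "letters" because both normalize to "correspondence". The cost the slide alludes to is in building and maintaining the mapping, which is why downloading an existing vocabulary is attractive compared with inventing one.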
www.alberteinstein.info

Professional Life: Organizations Administrivia Projects Library

Lederberg papers official reports
Number of document segments

Lederberg Artifact types
Abstracts Agendas not Announcements m; Application forms Articles m Autobiographies m Bibliographies m Biographies m Brochures m Certificates m Correspondence m Diaries m Drafts (documents) Drawings m Electronic images m Essays m Eulogies Excerpts Grant proposals Interviews m Invitations Laboratory notebooks m Laboratory notes Lecture notes Lectures m Legal documents m Legislative records Lists Manifestoes Memoirs m Minutes Monographs m Narratives Newsletters Newspaper columns m Notebooks m Notes Obituaries Official reports Oral histories m Petitions Photographic prints m Press releases m Procedures Proceedings m Programs m Proposals m Questionnaires Reminiscences Reports m Resolutions Resumes Reviews m School records Speeches m Summaries Tables (documents) Technical reports m Transcripts m Typescripts Video recordings m

Species: Animals: Chordata: Vertebrata: bony fish
Computer structures: digital computer: minicomputer
Computer structures: digital computer: minicomputer (refined: Digital Equipment Corp.)
Computer structures taxonomy: computers

Trying to use my life bits #8: “ontology”???
“Succumbing to the ‘ontology’ fallacy” - Bates
[Figure: diagram centered on Self. Labels: Media; Ancestors, Parents, Siblings; Diaries; Comm.; Artifacts; Friends; Family & related social; Children; Spouse/Significant Other; Organizations; Company [1]; Employer [2]; Family Business [2]; Academic Inst. [2]; Non-profit [3]]
  1. Generic organization: Correspondence, financial, manuals, notebooks, org chart, plans, products, stocks, etc. Facets: doc type, dissemination, institution type
  2. Generic org. plus projects x roles; facets: financial; legal
  3. Generic organization for club, foundation, museum, professional org, religious, sport, etc.
  4. Books, CDs, papers, videos. Facets: media type,
[Figure labels, continued: Family ($, property, legal, health) potentially private…; Legal; Health; Property: Auto, home & other “things”; Financial Assets; Articles, bio, books, interviews, talks, …web pages; Library & archives: info & records; Personal archives (Ambiance…); Library [4]; Institution type: academic,… companies, family, other Orgs…self]

MyLifeBits: Some Lives(t)
[Figure labels as extracted: Personal Parents, children, grandkids CGB himself GKB SSF Close friends GB $s; Legal entities Personal incl. several legal structures Properties: autos, real estate, Investments & contracts Past prof. companies/organiz’ns DEC Carnegie-Mellon U. DEC, NSF, Encore, Ardent, Me Inc., Bell-Mason Bell-Mason Director Diamond & Vanguard Brds. Startups & boards CGB@ Microsoft MLB Clusters Telepresence WWW presence Computer History Museum BOD member Fund-raising CyberMuseum]

[Figure: GB Timeline, 1900 to 2010]

Roles & Institutions
  I <am son of> ….
  I <am father of> Brigham <1960->, Laura <1963->
  I <studied at> MIT <1952-1957; 1959-1960>
  I <worked for> DEC <1960-1966; 1972-1983>
  I <am a member of> ACM <1960- ->… NAE
  I <am on the board of> Computer Museum…

Things
Can everything be part of the model?
  Pets
  Houses
  Cars
  Assets

Trying to use my life bits #9: logging & reports
  Interface to xls
  TV Usage
  MyLifeBits Log of a video file

Open Problems

The “dear appy” problem
Dear Appy,
How committed are you? Please come back to me.
Forever yours truly,
Lost and forgotten data

Who’s responsible?
  Media or 8 track cassette, 8” floppy
  Evolving platform, file, and database
  Evolving, incompatible standards & formats for legacy data that disregard ancestors
  Evolving and/or disappearing apps

A Storocratic Oath
  1. Do no harm to dates (File creation, Photo taken)
  2. Do no harm to device created & other meta-data.
     Camera data & location data are sacred.
  3. Support & aid the creation of critical metadata.
     When/how the user feels like it
     Auto-magically!
  4. Maintain user confidentiality

Classification wish list
  Download classifications rather than build them
  Definitions & synonyms should help find what I want
  Today it is too expensive to manually classify my scanned paper. E.g. “right time” meta-data is critical!
  Next year I hope “the system” can classify my papers
  In 10 years I expect all documents to appear electronically & classified with a little help from me

Personal Search is not Professional or Web search
[Figure: depth vs. breadth diagram. Labels: System sees every entry & access; Everything, not just a professional life; Limited to SIS, not an infinite amount, covers a profession & personal life; MyLifeBits; Professional user; Depth e.g. information item types & coverage; Web as seen by search engines; Knowledge breadth e.g. Dewey classification]

The killer app??
  Input, File, Classify, and Find… Observe every action…
  Operational SIS (e.g. msg, name, paper, fact, birthday, phone call,
  Time & motion (routing, communicating, scheduling … thinking)
  Archival one’s self
  Finder aka Table of Contents aka Site Map
  Story telling.
  Screen saver & personal ambience
  The A/V/real time data

Future: new capture modes/devices
  Deja View
  SenseCam
  Body Media
  Quindi
  Sensecam & Interactive jewellery

www.MyLifeBits.com