Challenges in Using Lifetime Personal Information Stores based on MyLifeBits
Gordon Bell
Alpbach Forum, 26 August 2004

The 1 TB Life
1 TB gives you 65+ years of:
- 100 email messages a day (5 KB each)
- 100 web pages a day (50 KB each)
- 5 scanned pages a day (100 KB each)
- 1 book every 10 days (1 MB each)
- 10 photos a day (400 KB JPEG each)
- 8 hours a day of sound, e.g. telephone, voice annotations, and meeting recordings (8 Kb/s)
- 1 new music CD every 10 days (45 min each at 128 Kb/s)
It will take you 5 years to fill up your 80 GB drive.
Want video? Buy more cheap drives (1 TB/year lets you record 4 hours/day of 1.5 Mb/s video).

Everything goes in a database
- You need all the features of a database (consistency, indexing, pivoting, queries, speed/scalability, backup, replication)
- If you don’t use one, you will find yourself creating one!
- Files as blobs, also sync with file system for legacy apps
- SQL

MyLifeBits Software
[Diagram: the MyLifeBits store (database) and the MyLifeBits Shell, surrounded by the tools and sources that connect to them. Labels: GPS import & map display; TV capture tool; SenseCam; Telephone capture tool; Internet; TV EPG download tool; Browser tool; Screen saver; PocketPC transfer tool; PocketRadio player; Radio capture & EPG; MAPI interface; Legacy email client; files; Legacy applications; IM capture; Voice annotation tool; Text annotation tool; Import files]

Memex
As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
Full-text search, text & audio annotations, and hyperlinks

I am data
Capture and encoding
I mean everything

Personal Search is not Professional or Web search
- System sees every entry & access
- Everything, not just a professional life
- Limited to SIS, not an infinite amount, covers a profession & personal life
[Diagram: regions for MyLifeBits, Professional user, and Web as seen by search engines, plotted by depth (e.g. information item types & coverage) against knowledge breadth (e.g. Dewey classification)]

Why bother?
…some reasons
- Technologist: “we can”; an opportunity, e.g. 1 TB disks
- For all of us with new media: a need, e.g. JPG, MP3
- Environmentalist: eliminates “atoms” (paper, CDs…)
- For business, memory enhancement & faster search: let content analysis and data mining discover trends and correlations in our lives… that even we don’t know.
- Business: it costs more to delete than it costs to store
- Preservationist: it decays or disappears unless it’s saved
- For the human pack rat: “I may need it some day.”
- For posterity and nostalgia: “Maybe others will want it.”
- Stories and ambience: basis for creating content
- For the aging & failed memory: surrogate memory

Using my life bits: beyond folders
#1: Folders
One item. One place. It worked for 1000s of years.

[Diagram: folder tree under “My docs and archive”, one library/file cabinet per branch. Labels: Self; Active Employer; X-Employer; Employer; Project; Business (invests, family $s, & legal); Personal, including medical; <1995]

Freedom from hierarchy
- c:\my documents\talks\MyLifeBits.ppt
- ID=location=organization=display string
- Don’t make me invent unique names
- Don’t make me file everything
- Or let me pick multiple folders
- “multiple categorization not only improves organization and retrieval times but also matches more closely with the way users naturally think about organizing their information” (Quan et al., MIT’s Haystack)

MyLifeBits collection dialog
Of course, aliases and shortcuts can be used, albeit painfully, to file by time and/or event, subject, location, type.

Using my life bits: easily adding valuable content
#2: Text annotations
Making bits more valuable and retrievable.
“It’s just bits until it is annotated”
- Getting the user to tell a story is the ultimate in media value
- A story is a “layout” in time and space
- Most valuable content (by selection, and by being well annotated)
- Stories must include links to any media they use (for future navigation/search: “transclusion”)
- Cf. MovieMaker; Creative Memories PhotoAlbums

[Annotated photo. Annotation text:]
Dapeng was an intern at BARC for the summer of 2000.
We took him to lunch at our favorite Dim Sum place to say farewell.
At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim

Using my life bits: the value of time & time posts
#3: “I remember when…”
The 1st or 2nd most important retrieval handle.

Mark Stewart’s Lifeline
[Figure: lifeline chart spanning 1900 to 2010, with rows labeled F: father, F: mother, F: self, F: sister, F: spouse, F: son, F: daughter, F: grandchild, F: significant other, Education, W/Education, Work, and Organization. Entries include Chester Bell, Lola Bell, Gordon Bell, Sharon (Smith), Gwen Druyor Bell, Brigham (son), Laura (daughter), Fiona Bell, Bridget Bell; Kirksville, MO; M.I.T.; U. of N.S.W.; M.I.T. Speech Lab; Digital (DEC); CMU; Encore; NSF; Ardent; Bell Ltd.; Microsoft Res.; Computer Museum; Kolbe; Schultz; Stryker; Schultz; Sheridan; Forbes]
M Stewart Lifeline v2, Copyright Mark Stewart, 2004, MSR Next Media Team

Using my life bits: Where, an essential attribute
#4: I remember where
Just essential.

Using my life bits: pivoting on data to aid recall
#5: Relationships (links)
Using something near “it” to find “it”.
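The idea of using something near “it” to find “it” can be sketched as a small table of typed links between items. This is only an illustration: the table names, column names, and sample rows below are assumptions made for the example, not the actual MyLifeBits schema. The link kinds are borrowed from the Entities & Links slide, and the sample annotation reuses the Dapeng photo caption.

```python
import sqlite3

# Hypothetical schema for illustration only; not the real MyLifeBits design.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, kind TEXT, title TEXT);
CREATE TABLE link (src INTEGER REFERENCES item(id),
                   dst INTEGER REFERENCES item(id),
                   kind TEXT);
""")
conn.executemany("INSERT INTO item VALUES (?, ?, ?)", [
    (1, "event", "Farewell lunch for Dapeng"),
    (2, "photo", "Group at the table"),
    (3, "text", "At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim"),
    (4, "photo", "Unrelated photo"),
])
conn.executemany("INSERT INTO link VALUES (?, ?, ?)", [
    (2, 1, "photo of event"),  # the photo was taken at the event
    (3, 2, "annotates"),       # the text annotates the photo
])


def neighbors(conn, item_id):
    """Return (link kind, item kind, title) for every item one link away,
    following links in either direction."""
    rows = conn.execute(
        """
        SELECT l.kind, i.kind, i.title
          FROM link l JOIN item i ON i.id = l.dst
         WHERE l.src = ?
        UNION
        SELECT l.kind, i.kind, i.title
          FROM link l JOIN item i ON i.id = l.src
         WHERE l.dst = ?
        """,
        (item_id, item_id),
    ).fetchall()
    return sorted(rows)


if __name__ == "__main__":
    # Pivot on the photo: recall the event it belongs to and its annotation.
    for link_kind, item_kind, title in neighbors(conn, 2):
        print(f"{link_kind}: [{item_kind}] {title}")
```

Starting from the photo, one hop reaches both the event and the annotation, while the unrelated photo has no neighbors. Following links in both directions is a choice made for this sketch; it lets any remembered item serve as the starting point.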
MyLifeBits Entities & Links
[Diagram: entities joined by typed links. Link labels: Photo of Event; Caller in Phone Call; Annotates; Transcludes]
PhotoFinder - Shneiderman and Kang

Using my life bits: never enough meta-data… but can you afford it?
#6: more meta-data (properties)
I remember something about the content (understanding a person’s work)
Lederberg Finder page
Dublin Core of a given item

Using my life bits: classification of everything
#7: classification
Is any gain from non-automated classification worth the cost and pain?
Is traditional classification required?
“…at OCLC there was unanimous agreement among faculty and participants that ‘access to electronic resources requires controlled vocabulary and classification’”
OCLC Institute, “Knowledge Access Management: Tools and Concepts for Next Generation Catalogers”, 17-19 November 1997, Dublin, Ohio.

www.alberteinstein.info

Professional Life: Organizations; Administrivia; Projects; Library

Lederberg Artifact types
Abstracts; Agendas not; Announcements m; Application forms; Articles m; Autobiographies m; Bibliographies m; Biographies m; Brochures m; Certificates m; Correspondence m; Diaries m; Drafts (documents); Drawings m; Electronic images m; Essays m; Eulogies; Excerpts; Grant proposals; Interviews m; Invitations; Laboratory notebooks m; Laboratory notes; Lecture notes; Lectures m; Legal documents m; Legislative records; Lists; Manifestoes; Memoirs m; Minutes; Monographs m; Narratives; Newsletters; Newspaper columns m; Notebooks m; Notes; Obituaries; Official reports; Oral histories m; Petitions; Photographic prints m; Press releases m; Procedures; Proceedings m; Programs m; Proposals m; Questionnaires; Reminiscences; Reports m; Resolutions; Resumes; Reviews m; School records; Speeches m; Summaries; Tables (documents); Technical reports m; Transcripts m; Typescripts; Video recordings m

Species: Animals: Chordata: Vertebrata: bony fish
Computer structures: digital computer: minicomputer

Classification wish list
- Download classifications rather than build them
- Definitions & synonyms should help find what I want
- Today it is
too expensive to manually classify my scanned paper. E.g. “right time” meta-data is critical!
- Next year we hope “the system” can classify papers and other documents, e.g. bills
- In 10 years we expect all documents to appear electronically & classified with a little help from me

Using my life bits: Ontologies… useful? or fool’s errand?
#8: “ontology”???
“Succumbing to the ‘ontology’ fallacy” - Bates

[Diagram: a personal ontology centered on Self. Labels: Media; Diaries; Comm.; Artifacts; Ancestors, Parents, Siblings; Spouse | Significant Other; Children; Friends; Family & related social; Family ($, property, legal, health), potentially private…; Organizations: Company [1], Employer [2], Non-profit [3], Family Business [2], Academic Inst. [2]; Legal; Health; Property: auto, home & other “things”; Financial Assets; Articles, bio, books, interviews, talks, … web pages; Library & archives: info & records; Personal archives (Ambiance…); Library [4]; Institution type: academic, … companies, family, other organizations… self vs. complex contact??]
Notes:
1. Generic organization: correspondence, financial, manuals, notebooks, org chart, plans, products, stocks, etc. Facets: doc type, dissemination, institution type
2. Generic org. plus projects x roles; facets: financial; legal
3. Generic organization for club, foundation, museum, professional org, religious, sport, etc.
4. Books, CDs, papers, videos. Facets: media type

Using my life bits: Providing insight, including… Where did I spend my time? What has been my output?
#9: logging & reports
Interface to xls
TV Usage
MyLifeBits Log of a video file

Using my life bits: Recording everything!
#10: CARPE
Continuous archival recording of personal experiences
The A/V/real time data
Future: new capture modes/devices
Deja View; SenseCam; Body Media; Quindi
www.joshgemmell.com

Open Problems
The Agenda for the Tbyte(s), Lifetime, PC: The killer app after office and mail… searching
- Guarantee that data will live forever! “dear appy” problem
- Cheap, easy, and data-rich (e.g.
time, place) capture: GPS and time everywhere
  - Paper capture has to be as easy as discarding (scanner/shredder)
  - Personal meeting capture… perhaps by the room
  - E-books… e-magazines & journals need to have critical mass!
  - Telephony and audio capture with indexing (telephonic speech-to-text needed)
  - Media Center compatible for entertainment (photos, video, TV, radio)
- Content analysis (critical for photo & video!); doable for text
- Information control: privacy, security, expunge/deniability, …
- Having to be schizophrenic or have a lobotomy when leaving a “life” or being a part of some other person’s life recording
- One dbase for everything (articles, books, conversations, … financial transactions)… vs. long-term use of hierarchical files. Is dbase intuitive?
- Annotations/meta-information add ever-increasing value at high cost!
- Easy annotation for aiding search, and it becomes the content
- Other “killer apps”: Alzheimer, immortality, surrogate memory?
- GUIs to improve use (e.g. time to learn, use, aid in retention)

www.MyLifeBits.com
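
The storage budget on “The 1 TB Life” slide can be checked with a few lines of arithmetic. The sketch below is not from the talk. It assumes decimal units (1 KB = 1,000 bytes, 1 Kb/s = 1,000 bits per second, 1 TB = 10^12 bytes) and 365-day years, none of which the slide states, and all function and variable names are made up for the example.

```python
# Assumptions (not stated on the slide): decimal units and 365-day years.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12


def stream_bytes(hours, kbit_per_s):
    """Bytes produced by recording `hours` of a stream at `kbit_per_s`."""
    return hours * 3600 * kbit_per_s * 1000 / 8


# Bytes added per day, item by item, using the figures from the slide.
DAILY_BYTES = {
    "email": 100 * 5 * KB,                    # 100 messages at 5 KB
    "web": 100 * 50 * KB,                     # 100 pages at 50 KB
    "scans": 5 * 100 * KB,                    # 5 scanned pages at 100 KB
    "books": 1 * MB / 10,                     # 1 MB book every 10 days
    "photos": 10 * 400 * KB,                  # 10 JPEGs at 400 KB
    "sound": stream_bytes(8, 8),              # 8 hours at 8 Kb/s
    "music": stream_bytes(0.75, 128) / 10,    # 45-minute CD every 10 days
}


def years_to_fill(capacity_bytes):
    """Years until a drive of `capacity_bytes` is full at the daily rate above."""
    return capacity_bytes / (sum(DAILY_BYTES.values()) * 365)


def video_bytes_per_year(hours_per_day=4, mbit_per_s=1.5):
    """Bytes per year for daily video recording at the given bit rate."""
    return stream_bytes(hours_per_day, mbit_per_s * 1000) * 365


if __name__ == "__main__":
    print(f"daily total: {sum(DAILY_BYTES.values()) / MB:.2f} MB")
    print(f"1 TB lasts:  {years_to_fill(TB):.1f} years")
    print(f"80 GB lasts: {years_to_fill(80 * GB):.1f} years")
    print(f"video:       {video_bytes_per_year() / TB:.2f} TB/year")
```

Under these assumptions the daily total is about 43 MB, so 1 TB lasts roughly 63 years, 80 GB lasts about 5 years, and 4 hours a day of 1.5 Mb/s video comes to just under 1 TB a year. The 1 TB figure is close to the slide's “65+ years”; the exact number shifts with the unit convention chosen.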