Personal Information Management
Uichin Lee
KAIST KSE

Personal Information Management, William Jones, Annual Review of Information Science and Technology, Volume 41, Issue 1, pages 453–504, 2007
Some slides are based on Bob Glushko’s lecture on PIM: http://courses.ischool.berkeley.edu/i202/f08/lectures/202-20081020.pdf

Plan for Today’s Class
• Defining Personal Information Management
• Review key historical influences
• Analysis of PIM
• Understand how people do PIM
• Cognitive overhead (information utility)
• Approaches to PIM integration

Benjamin Franklin’s 13 virtues
1. Temperance
2. Silence
3. Order
4. Resolution
5. Frugality
6. Industry
7. Sincerity
8. Justice
9. Moderation
10. Cleanliness
11. Tranquility
12. Chastity
13. Humility
“Order .. with regard to places for things, papers, etc., I found extremely difficult to acquire.”
Slides from "http://people.csail.mit.edu/teevan/work/publications/talks/campustech06.ppt"

Personal Information Management (PIM) --- A Fanciful Definition
• PIM is a game of catch ... in which a person tosses their personal information into the future ...
in the hope of being able to catch it later
• Maybe "later" is "forever" and we hope that someone else will do the catching

PIM - A More Serious Definition
• "The practice and the study of the activities that people perform to acquire, organize, maintain, and retrieve information for everyday use"
• So we limit PIM to cover actions (or inactions) that are the result of individual choices
• It is also personal, rather than social, in that we decide on the activities and organizational schemes and carry them out by ourselves (although it may be part of our PIM strategy to rely partly on others)
• Having this discretion about information organization and "making sense of it" to make decisions is a defining characteristic of professional, as opposed to clerical, information work

Why PIM Matters?
• PIM matters to us as individuals and professionals because better PIM results in better use of our time and attention and ultimately in better quality of life
• PIM matters within enterprises because better PIM means increased productivity (in the short term) and better knowledge management (in the long term)
• Advances in PIM may also translate to improvements in education and in helping old people "match their mental lifespan to their physical lifespan"

PIM and Prevailing Technology
• We can view many inventions as responses to the need for PIM
• Because PIM is embedded in user tasks and work context, it reflects the prevailing technology support for information work
• The personal computer has had the greatest impact (so far) on PIM technology
• But as processors and connectivity are increasingly embedded in objects of all kinds, information "from and about stuff" will have to be managed and the PC may lose its central role
• And replication of digital objects so they can be stored "in the cloud" is becoming common

19th Century PIM Technology Breakthroughs

Vannevar Bush's Personal Information Manager?
As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
• Full-text search, text & audio annotations, and hyperlinks

20th Century PIM Technology
• 1970s-80s: Ringbound organizer (aka filofax)
• Late 1980s: Early electronic organizer
  – Compatibility/data format problems
• 1990s-2000s: Enter PIM/PDA devices
  – PC link, standard functions, user friendliness, stylus input, compatible s/w
  – Leading companies:
    • Casio, Sharp, Psion, etc.
    • Palm pilots, iPaq, etc.
Reference: http://www.bbc.co.uk/dna/h2g2/A2284229
(Images: Filofax, Psion Organizer, Palm Zire, Compaq iPaq)

21st Century PIM
• 2000s-now: Smartphone revolution:
  – Phones, emails, social nets, user generated content (text, audio, photos, videos, sensors)
  – Leading companies:
    • Handset manufacturers
    • S/W giants (Google, MS) + many s/w developers

Information and Information Item
• What is information?
  – Aural comments, emails, web pages, hand-written notes, etc.
• Information item: packaging of information
  – Encapsulates information in a persistent form that can be created, stored, moved, given a name and other properties, copied, distributed, deleted, transformed, and so forth
  – Examples: Paper documents, electronic documents and other files, e-mail messages, web pages, references
• Information form:
  – Determined by the tools and applications

A Personal Space of Information (PSI)
• Personal information
  – Information that a person keeps for personal use
  – Information about a person kept by and under the control of others (e.g., health records)
  – Information experienced by a person (e.g., browsing a book, reading web pages, etc.); either pushed or pulled
• Personal Space of Information (PSI)
  – All the information items that are at least nominally, if not exclusively, under an individual’s control
  – Examples: books, emails, e-documents, web pages, other files (on various computers)
  – A person has only a single PSI
  – People have some sense of control over the items in their PSIs, which is partly illusory

Personal Information Collections or Containers (PICs)
• Use of collection/container in managing personal info
  – e.g., piles of papers, papers in a cabinet, project related items, a collection of bookmarks
• PICs are “islands” in our PSIs where we have made some conscious effort to control both the information that goes in and the manner in which it is organized
(Diagram: a PSI containing several PICs, e.g., a “Content Networking Project” PIC and a “My Photos” PIC)

PIM Activities
• Finding/re-finding activities:
  – Move from need to information and affect the output of information from a PSI
• Keeping activities:
  – Move from information to need and affect the input of information into a PSI
• Meta-level (or mapping) activities:
  – Curatorship: focus on the PSI itself and on the management and organization of PICs within it
  – Efforts to get organized in a physical office, e.g., constitute one kind of meta-level activity

PIM Activities
• PIM activities viewed as an effort to establish, use, and maintain a mapping between needs and information
(Diagram)
  – Needs: Remind John about meeting; Listen to relaxing music?
  – Mapping: Calendar; Folders; Desktop; Searchable content
  – Information: Phone # for John; Meeting time; Smooth jazz in mp3 file
  – Activities shown: M-level activities, Finding activities, Keeping activities

Finding: from Need to Information
• Difference between seeking and finding?
• Finding is more inclusive
• Wilson’s definition (2000):
  – The purposive seeking for information as a consequence of a need to satisfy some goal.
    In the course of seeking, the individual may interact with manual information systems (such as a newspaper, or a library), or with computer-based systems (e.g., WWW)
• Information finding:
  – Find information outside a PSI (e.g., brick/mortar library or WWW)
  – Finding is more direct (expressing goals): locating items that meet the need
  – People find not only information but also physical items
  – Finding is complementary to keeping; e.g., old saying, what we find, we can keep
• Trade-off exists: keeping vs. finding

Finding: from Need to Information
• Finding public information (on-line search)
  – Berrypicking model (Bates, 1989)
    • Information is gathered in bits and pieces over a series of steps in which the user’s expression of need, as reflected in the current query, evolves
  – Stepwise orienteering (2004)
    • People have a sense of control and context over the search progress; this lessens the cognitive burden associated with query articulation

Finding: from Need to Information
• How to re-find personal information from a PSI?
• Four steps:
  1. Remembering to look
  2. Recalling information about the information that can help to narrow the subsequent scan
  3. Recognizing the desired items
  4. Repeating as needed in order to “re-collect” the set of items required to meet the current need

Finding: from Need to Information
• Step 1: Remembering
  – Related to awareness of information
  – Reminders: visibility! (e.g., desktop screen, post-its, e-mail reminders, etc.)
  – Why is reminding so important?
    • Answer: information fragmentation
    • Information items are scattered in different forms across various organizational devices
    • Support for grouping and interrelating items is not well developed

Finding: from Need to Information
• Step 2/3: Recall and Recognition
  – Recall: typing a search word
  – Recognize: scan through a list of results
  – These steps may be tightly coupled (e.g., desktop search tool)
• Why do people prefer to use a file browser?
  – People use a desktop search tool as a last resort
  – Some explanations:
    • Easy to recall, orienteering is feasible (e.g., smaller steps, less burden on working memory)
    • Generation effect (when a person names a file)
      – Not all files are named by the person

Finding: from Need to Information
• Step 4: Repeating
  – People want to access a collection of items (which may be scattered in different forms within different organizations)
  – Output interference?
    • Items retrieved first may interfere with the retrieval of later items in a set
    • E.g., recalling all the people who partied together

Keeping: from Information to Need
• People encounter information and try to determine what, if anything, they should do with it
• Keeping: decisions and actions relating to encountered information
  – Any type of information, including magazine subscriptions, RSS/Twitter feeds, even your friends
  – Consume immediately? Ignore? Keep?

Keeping: from Information to Need
• Keeping is difficult and error-prone
  – Syndrome: damned if you do, damned if you don’t
    • If you keep information, you may never access it; if you don’t keep it, you may need it later; further, if you keep it in the wrong way, it is useless
• Filing information items is cognitively difficult and error-prone
  – Exemplar document classification behavior
    • Attributes (title/author), keeping behavior (discard, keep), order/scheme, time, value, etc.
• Piling information items is also problematic:
  – Low accessibility and visibility
• Do nothing:
  – Especially for web search: search again, type a few characters, or navigate to the web page from another web site, etc.

Keeping: from Information to Need
• Keeping everything? (w/ automated keeping)
  – Could be faulty (e.g., rule-based auto classification of incoming emails?)
  – People may forget to look again later (can’t remember what’s in a PSI)

Meta-level: Mapping between Need and Information
• How to map between need and information?
• Two aspects:
  – Maintenance and organization (folders)
    • Neat or messy?
    • Information fragmentation makes this hard
    • Old magazine effect: keeping cost outweighs potential use
    • Deletion paradox: spending precious time on items of little value
    • Stuff goes in but doesn’t come back out – it just builds up?
  – Making sense of information and planning its use
    • Too many folders? Why not use a flat folder and a search tool instead (e.g., Google Desktop)?
    • Folder hierarchy is still important: internal category, easy to access, generation effect, category name could be associated with task/project/goals

Kirsh on Cognitive Overload
• Kirsh points out that "where we work" is a superposition of many specific environments and applications that we move in and out of
• Each of these environments and applications has its own cost structure for handling information based on the tools and resources it makes available
• These diverse cost structures result in "computational complexity" for making PIM decisions about keeping and finding information and encourage suboptimal reactive methods rather than careful planning

Kirsh on "Information Utility"
• People think about and value different kinds of information in different ways - their utility functions are non-monotonic and non-linear
• When you are looking for information, can you tell how hard it will be to find, or how long it will take?
• How long should you look before you give up?
• How valuable will information you don't yet have turn out to be?

Kirsh's Information Utility Functions
• Utility means how valuable information is in improving performance.

PIM Strategies
• The lack of a coherent utility or demand function for information means that different people (or the same person at different times) will follow radically different PIM strategies
• Pack Rat or Blind Accumulation
  – Just saves everything, usually spends excessive time filing
• Insurer
  – Doesn't keep everything, but creates multiple copies (paper and digital) of information items to maximize re-finding
• Surface Clutterer
  – Doesn't keep everything, but strives to keep information accessible, often in spatially organized piles
• Just-in-Case Learner
  – Spends excessive time consuming information when it arrives so as to always be prepared for some future information need
• Just-in-Time Gatherer
  – Ignores all information except what is needed immediately for current tasks. Maximizes the average value of information items, but some high-value information can't be found this way

Approaches to PIM Integration
• Through e-mails: Taskmaster
• Through search: Google Desktop, Spotlight, SIS (Microsoft Stuff I’ve Seen)
• Through projects/tasks: Taskmaster
• Through properties: PRESTO (structureless approach), MEMOIRS (time-based)
• Through a common representation (e.g., RDF?)
• Through a digital recording of everything? (or Lifelogging)
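
Illustration (not from the original slides): a minimal Python sketch of what integration "through properties" and "through search" might look like, assuming every information item, whatever its form, can carry a small set of key/value properties. The `InformationItem` class, its field names, the `recall` and `by_form` helpers, and the sample items are all hypothetical and made up for this example; none of this describes how PRESTO, Spotlight, or any other tool listed above is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class InformationItem:
    """One information item in a PSI (hypothetical model for illustration)."""
    name: str
    form: str  # the tool/application that holds it, e.g. "email", "file"
    properties: dict = field(default_factory=dict)


def recall(items, **wanted):
    """Recall step: keep only items whose properties match every wanted value."""
    return [
        item for item in items
        if all(item.properties.get(key) == value for key, value in wanted.items())
    ]


def by_form(items):
    """Group item names by form, showing how one need is fragmented across tools."""
    groups = {}
    for item in items:
        groups.setdefault(item.form, []).append(item.name)
    return groups


# A tiny made-up PSI: one project is scattered across three information forms.
psi = [
    InformationItem("Meeting time with John", "email",
                    {"project": "content-networking", "person": "John"}),
    InformationItem("slides-draft.ppt", "file",
                    {"project": "content-networking"}),
    InformationItem("Smooth jazz.mp3", "file", {"mood": "relaxing"}),
    InformationItem("PIM lecture notes", "bookmark",
                    {"project": "content-networking"}),
]

# Recall by property narrows the PSI; the person then recognizes items in the list.
project_items = recall(psi, project="content-networking")
print([item.name for item in project_items])
print(by_form(project_items))
```

The point of the sketch is that a single property query cuts across e-mail, files, and bookmarks, which is one way to counter the information fragmentation discussed earlier; a folder hierarchy, by contrast, is usually tied to one information form.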