The Information School at the University of Washington

Information Inventories
Bob Boiko
UW iSchool, ischool.washington.edu
Metatorial Services Inc., www.metatorial.com

What We Will Cover
• What is an inventory?
• What are the deliverables?

What is an Info Inventory?
What is an inventory anywhere?

What’s the Point?
• What files exactly do we have?
• Can we get them good and organized for the system we are building?
  – Use
  – Audience
  – Subject
  – Types
  – Formats
  – Source

A Readiness Inventory vs. Info Inventory

Doc Inventory for a Readiness Assessment    Info Inventory for Some Project
Docs about the initiative                   Docs for the initiative
A small number                              A large number
Informally organized                        Formally organized
No metadata                                 As much metadata as possible
No further destination                      Destined for other people to use

Info Inventory vs. Info Audit

Inventory        Audit
Find files       Find opportunities
Tag files        Define gaps
Deliver files    Deliver recommendations

What’s the Overall Process?
1. Establish a domain of interest
2. Establish where the files and other information within the domain live
3. Establish a metadata set
4. Tag for that set
5. Amass the metadata and files for delivery

An Iterative Model
Ready, shoot, aim, shoot, aim, shoot, aim…
Do the least work on the most files
• Domain → Shares → Directories → Files
• Files (small random sample) → Files (bigger sample) → Files (bigger still) → Files (all)

What are the Deliverables?
1. Establish a domain
   • A strategy
2. Establish locations
   • Location list with priorities
3. Establish metadata
   • Metatorial guide
   • Process plan
4. Meta-tag
   • Collection tools
   • A Repository
   • Reports
5. Amass the metadata and files
   • Delivery tools
   • Metadata collection
   • File collection

The Domain of Interest
• What statement describes the files we are looking for?
• How do you know if a file qualifies?
• Where do these kinds of files reside?
  – LAN
  – WAN
  – Local hard drives
  – Webs
  – Public sources
• How can you get access to them?

Deliverables: A Strategy
How you will interact with the organization to find and tag information
• What: A written plan with
  – Who you need
  – What you will need from them
• How
  – Whatever mandate you might have
  – Consensus building
  – Establish span of control
  – Provide plans
  – Get buy-in

Deliverables: A Location List
What files and other information we will tag
• What: spreadsheet, database table or XML structure
  – Location
  – Number of files and size
  – Types of files
  – Process for deepening the analysis
• How
  – Browsing, observing, asking
  – File statistics
  – Sampling

What Will It Take: Metadata ROI
• What metadata do we need?
• What metadata can we afford?
  – What will each kind cost?
  – Who will each kind take?

What Will It Take: Tagging
For each type of metadata:
• One value or many?
• How long will it take per file?
• What expertise will taggers need?
• With what certainty will taggers be able to discern metadata?

Typical Tagging Profile
[Figure not reproduced]

Should I Automate?
For each type of metadata:
  – Is it auto-detectable?
  – In what percent of the files?
  – What will it take to create a tool?
  – Is it worth it?

Deliverables: Metatorial Guide
A definitive guide to how to tag
• What: An MS Word file or Web page
  – Why are you tagging?
  – What is the overall process?
  – For each tag:
    • What does it mean?
    • When do you use it?
    • What are its allowed values?
• How
  – Existing metadata distinctions
  – File statistics
  – Automated metadata discovery tools
  – Feedback and revision process

Deliverables: Process Plan
What each person should be doing and when
• What: MS Project or other planning system
  – Each person’s time commitment
  – Each person’s assignment
  – Due dates
• How
  – Lots of negotiation
  – Process for constant evaluation and reassignment
  – Process for training
  – Relief valves
  – Process for QC

Deliverables: Collection Tools
Aids to effective data entry
• What: Templates and small programs
  – Preloaded spreadsheets
  – Web forms
  – Data validation
• How
  – Automated metadata discovery tools
  – MS Office power use & programming
  – Web programming

Deliverables: A Repository
A place to put the metadata you collect
• What
  – Databases and/or XML structures
  – Controlled vocabularies
  – Taxonomies
  – Management info
• How
  – Loaders from collection tools
  – Schema development
  – RDB or XML programming

Deliverables: Reports
What do we have, what do we still need, and how are we doing?
• What
  – Word files
  – Email messages
  – Spreadsheets
• How
  – RDB or XML programming
  – Statistical analysis
  – Roughing it out

Deliverables: Metadata & Files
The results in a useful way
• What
  – The database or XML in a friendly form
  – Web sites with navigation, metadata and files
  – CDs or DVDs with UI
• How
  – RDB or XML programming
  – File collection tools
  – UI creation (HTML or otherwise)
  – Final reports

What We Will Cover
• What are the goals of the project?
• What is the overall process?
• What are the deliverables?
• What does the plan look like?
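
Before turning to the plan, here is a rough illustration of two ideas from the deliverables: the location list (location, number of files and size, types of files) and the iterative model of tagging ever larger random samples. It is a minimal sketch, not a prescribed tool. The function names are hypothetical, and treating the file extension as the "type" is an assumption made only for this example.

```python
import random
from collections import Counter
from pathlib import Path


def list_files(root):
    """Return every regular file under root as a sorted list of Paths."""
    return sorted(p for p in Path(root).rglob("*") if p.is_file())


def summarize_location(root):
    """Build one location-list row: where, how many files, how big, what types."""
    files = list_files(root)
    return {
        "location": str(root),
        "file_count": len(files),
        "total_bytes": sum(p.stat().st_size for p in files),
        # Assumption for this sketch: the file extension stands in for "type".
        "types": dict(Counter(p.suffix.lower() or "(none)" for p in files)),
    }


def sample_files(files, size, seed=0):
    """Draw a reproducible random sample, capped at the number of files available."""
    rng = random.Random(seed)
    return rng.sample(list(files), min(size, len(files)))
```

One way to use this would be to call `summarize_location` once per share or directory and write each row to the spreadsheet, then call `sample_files` with a small `size` first and larger sizes on later passes, revising the metadata set between passes.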

The Team
• Project management
  – Traffic manager
  – Issues manager
• Process designer
• Tool developer
• Quality measurement and control staff

The Rest of the Organization
• Who
  – Info Taggers
  – Info Finders
  – QC staff
  – People in charge of the above
• How much time can you expect from them?
• How much mind-share can you expect?
• How will you establish span of control?

Third Parties
• Software development
• Tagging support
• Project management

In Sum…
• The goal of the project is to amass and deliver a well-described body of information
• The process is to establish the guidelines, tag the files, and collect them for delivery
• The ultimate deliverable is a set of files and their related metadata
• The plan matches a small team with the largest possible staff of knowledgeable insiders and a small set of external experts

Nagging Questions
• Can you get mind-share?
• How do you know what your reuse rights are on each file?
• What do you do with composite files?
• When do you stop?
• How do you avoid the bottlenecks of the SMEs?
• How do you take back an early mistake?
• How do you scale back?