Information Inventories

The Information School at the University of Washington
Information Inventories
Bob Boiko
UW iSchool
ischool.washington.edu
Metatorial Services Inc.
www.metatorial.com
What we will cover
• What is an inventory?
• What are the deliverables?
What is an Info Inventory?
What is an inventory anywhere?
What’s the Point?
• What files exactly do we have?
• Can we get them good and organized for the system we are building?
– Use
– Audience
– Subject
– Types
– Formats
– Source
A Readiness Inventory vs. Info Inventory
Doc Inventory for a Readiness Assessment | Info Inventory for Some Project
Docs about the initiative                | Docs for the initiative
A small number                           | A large number
Informally organized                     | Formally organized
No metadata                              | As much metadata as possible
No further destination                   | Destined for other people to use
Info Inventory vs. Info Audit
Inventory     | Audit
Find files    | Find opportunities
Tag files     | Define gaps
Deliver files | Deliver recommendations
What’s the Overall Process?
1. Establish a domain of interest
2. Establish where the files and other information within the domain live
3. Establish a metadata set
4. Tag for that set
5. Amass the metadata and files for delivery
An Iterative Model
Ready, shoot, aim, shoot, aim, shoot, aim…
Do the least work on the most files
• Domain → Shares → Directories → Files
• Files (small random sample) → Files (bigger sample) → Files (bigger still) → Files (all)
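The sample progression above can be sketched in a few lines of Python. This is only an illustration: the function name `growing_samples`, the sample fractions, and the made-up file list are assumptions, not part of the original method. Each pass is a superset of the one before, so tagging work done on a small sample is never thrown away.

```python
import random

def growing_samples(files, fractions=(0.01, 0.05, 0.25, 1.0), seed=0):
    """Yield nested random samples of `files`, each larger than the last."""
    rng = random.Random(seed)      # fixed seed so the passes are repeatable
    shuffled = list(files)
    rng.shuffle(shuffled)
    for fraction in fractions:
        size = max(1, round(len(shuffled) * fraction))
        # Slicing one shuffled list keeps every earlier sample inside later ones.
        yield shuffled[:size]

# Hypothetical file list standing in for a real share.
files = [f"doc_{i:04d}.txt" for i in range(1000)]
passes = list(growing_samples(files))
sizes = [len(p) for p in passes]
```

With these assumed fractions, the four passes cover 1%, 5%, 25%, and finally all of the files.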
What are the Deliverables?
1. Establish a domain
   • A strategy
2. Establish locations
   • Location list with priorities
3. Establish metadata
   • Metatorial guide
   • Process plan
4. Meta-tag
   • Collection tools
   • A repository
   • Reports
5. Amass the metadata and files
   • Delivery tools
   • Metadata collection
   • File collection
The Domain of Interest
• What statement describes the files we are looking for?
• How do you know if a file qualifies?
• Where do these kinds of files reside?
– LAN
– WAN
– Local hard drives
– Webs
– Public sources
• How can you get access to them?
Deliverables: A Strategy
How you will interact with the organization to
find and tag information
• What: A Written Plan with
– Who you need
– What you will need from them
• How
– Whatever mandate you might have
– Consensus building
– Establish span of control
– Provide plans
– Get buy-in
Deliverables: A Location List
What files and other information we will tag
• What: spreadsheet, database table, or XML structure
– Location
– Number of files and size
– Types of files
– Process for deepening the analysis
• How
– Browsing, observing, asking
– File statistics
– Sampling
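As a rough sketch of the "file statistics" step, the Python below walks one location and produces a single row for the location list: number of files, total size, and file types. The function name `location_stats` and the row layout are assumptions made for illustration. File type is approximated by extension, which is only a first guess.

```python
import os
from collections import Counter

def location_stats(root):
    """Return file count, total bytes, and extension counts under `root`."""
    count, total_bytes, types = 0, 0, Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total_bytes += os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: leave it out
            count += 1
            extension = os.path.splitext(name)[1].lower() or "(none)"
            types[extension] += 1
    return {"location": root, "files": count,
            "bytes": total_bytes, "types": dict(types)}
```

Running this once per share or directory gives rows that can be pasted into a spreadsheet and sorted to set priorities.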
What Will It Take: Metadata ROI
• What metadata do we need?
• What metadata can we afford?
– What will each kind cost?
– Who will each kind require?
What Will It Take: Tagging
For each type of metadata:
• One value or many?
• How long will it take per file?
• What expertise will taggers need?
• With what certainty will taggers be able to discern metadata?
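The cost questions above come down to simple arithmetic once you have estimates per metadata type. The Python sketch below is one way to run the numbers. The class `TagField`, the function `tagging_cost`, and every figure in the example are assumptions made up for illustration, not measurements.

```python
from dataclasses import dataclass

@dataclass
class TagField:
    name: str
    minutes_per_file: float   # assumed average tagging time per file
    hourly_rate: float        # assumed cost of the expertise required

def tagging_cost(fields, file_count):
    """Return {field name: (hours, cost)} for tagging `file_count` files."""
    result = {}
    for field in fields:
        hours = field.minutes_per_file * file_count / 60
        result[field.name] = (hours, hours * field.hourly_rate)
    return result

# Hypothetical estimates for two metadata types.
fields = [
    TagField("audience", minutes_per_file=0.5, hourly_rate=30.0),
    TagField("subject", minutes_per_file=2.0, hourly_rate=60.0),
]
estimate = tagging_cost(fields, file_count=1200)
```

With these assumed numbers, tagging audience takes 10 hours and tagging subject takes 40 hours, which makes the "what can we afford" question concrete for each type.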
Typical Tagging Profile
Should I Automate?
For each type of metadata:
– Is it auto-detectable?
– In what percent of the files?
– What will it take to create a tool?
– Is it worth it?
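"Is it worth it?" can be roughed out as a break-even calculation: manual hours avoided versus hours spent building the tool. The Python below is a minimal sketch under that assumption. The function name `automation_saving` and the example figures are hypothetical, and the sketch ignores the cost of checking the tool's output.

```python
def automation_saving(file_count, detectable_share, manual_minutes_per_file,
                      tool_build_hours):
    """Hours saved by automating one metadata type (negative means a loss)."""
    manual_hours_avoided = (
        file_count * detectable_share * manual_minutes_per_file / 60
    )
    return manual_hours_avoided - tool_build_hours

# Hypothetical case: the value is auto-detectable in half of 6000 files,
# manual tagging takes a minute per file, and the tool takes 40 hours to build.
saving = automation_saving(file_count=6000, detectable_share=0.5,
                           manual_minutes_per_file=1.0, tool_build_hours=40)
```

In this made-up case the tool saves 10 hours. With only 1200 files the same tool would lose 30 hours, so the answer depends heavily on volume.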
Deliverables: Metatorial Guide
A definitive guide to how to tag
• What: An MS Word file or Web page
– Why are you tagging?
– What is the overall process?
– For each tag:
• What does it mean?
• When do you use it?
• What are its allowed values?
• How:
– Existing metadata distinctions
– File statistics
– Automated metadata discovery tools
– Feedback and revision process
Deliverables: Process Plan
What each person should be doing and when
• What: MS Project or other planning system
– Each person’s time commitment
– Each person’s assignment
– Due dates
• How
– Lots of negotiation
– Process for constant evaluation and reassignment
– Process for training
– Relief valves
– Process for QC
Deliverables: Collection Tools
Aids to effective data entry
• What: Templates and small programs
– Preloaded spreadsheets
– Web forms
– Data validation
• How
– Automated metadata discovery tools
– MS Office power use & programming
– Web programming
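Data validation in a collection tool can be as simple as checking each entry against the allowed values for each tag. The Python below is a sketch of that idea. The tag names, the allowed values, the semicolon separator, and the function `validate` are all assumptions for illustration.

```python
# Hypothetical tag definitions, as a tagging guide might list them.
ALLOWED_VALUES = {
    "audience": {"staff", "customers", "partners"},
    "format": {"doc", "pdf", "html"},
}
MULTI_VALUED = {"audience"}  # tags that may carry more than one value

def validate(record):
    """Return a list of problems found in one tagging record (a dict)."""
    problems = []
    for tag, allowed in ALLOWED_VALUES.items():
        raw = record.get(tag, "")
        # Assumed entry convention: multiple values separated by semicolons.
        values = [v.strip().lower() for v in raw.split(";") if v.strip()]
        if not values:
            problems.append(f"{tag}: missing")
        elif len(values) > 1 and tag not in MULTI_VALUED:
            problems.append(f"{tag}: only one value allowed")
        for value in values:
            if value not in allowed:
                problems.append(f"{tag}: '{value}' is not an allowed value")
    return problems
```

An empty list means the record passes. Anything else can be shown to the tagger at entry time, before a bad value reaches the repository.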
Deliverables: A Repository
A place to put the metadata you collect
• What
– Databases and/or XML structures
– Controlled vocabularies
– Taxonomies
– Management info
• How
– Loaders from collection tools
– Schema development
– RDB or XML programming
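For the relational option, a very small schema might look like the Python and SQLite sketch below. The table and column names are assumptions chosen for illustration, not a prescribed design. The foreign key makes the database itself reject any tag value that is not in the controlled vocabulary.

```python
import sqlite3

SCHEMA = """
CREATE TABLE file (
    id        INTEGER PRIMARY KEY,
    path      TEXT NOT NULL UNIQUE
);
CREATE TABLE vocabulary (           -- controlled vocabulary terms
    tag       TEXT NOT NULL,
    value     TEXT NOT NULL,
    PRIMARY KEY (tag, value)
);
CREATE TABLE file_tag (             -- one row per value applied to a file
    file_id   INTEGER NOT NULL REFERENCES file(id),
    tag       TEXT NOT NULL,
    value     TEXT NOT NULL,
    tagged_by TEXT,                 -- management info: who tagged it
    PRIMARY KEY (file_id, tag, value),
    FOREIGN KEY (tag, value) REFERENCES vocabulary(tag, value)
);
"""

def open_repository(path=":memory:"):
    """Create the sketch schema and return an open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn
```

A loader from the collection tools would insert into `file` and `file_tag`. A taxonomy would need something more, such as a parent column on `vocabulary`, which this sketch leaves out.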
Deliverables: Reports
What do we have, what do we still need, and
how are we doing?
• What
– Word files
– Email messages
– Spreadsheets
• How
– RDB or XML programming
– Statistical analysis
– Roughing it out
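A "how are we doing?" report can start as a count of tagged versus total files per location. The Python below is a rough sketch of that. The function names and the input shape, a list of (location, is_tagged) pairs, are assumptions for illustration.

```python
from collections import defaultdict

def progress_by_location(records):
    """records: iterable of (location, is_tagged) pairs.

    Returns {location: (tagged, total)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for location, is_tagged in records:
        totals[location][1] += 1
        if is_tagged:
            totals[location][0] += 1
    return {loc: tuple(counts) for loc, counts in totals.items()}

def format_report(progress):
    """Render one plain-text line per location, suitable for an email."""
    lines = []
    for loc in sorted(progress):
        tagged, total = progress[loc]
        lines.append(f"{loc}: {tagged}/{total} tagged ({tagged / total:.0%})")
    return "\n".join(lines)
```

The output is plain text on purpose, so it can go straight into an email message or a Word file.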
Deliverables: Metadata & Files
The results, delivered in a useful form
• What
– The database or XML in a friendly form
– Web sites with navigation, metadata and files
– CDs or DVDs with a UI
• How
– RDB or XML programming
– File collection tools
– UI creation (HTML or otherwise)
– Final reports
What We Will Cover
• What are the goals of the project?
• What is the overall process?
• What are the deliverables?
• What does the plan look like?
The Team
• Project management
– Traffic manager
– Issues manager
• Process designer
• Tool developer
• Quality measurement and control staff
The Rest of the Organization
• Who
– Info Taggers
– Info Finders
– QC staff
– People in charge of the above
• How much time can you expect from them?
• How much mind-share can you expect?
• How will you establish span of control?
Third Parties
• Software development
• Tagging support
• Project management
In Sum…
• The goal of the project is to amass and deliver a well-described body of information
• The process is to establish the guidelines, tag the files, and collect them for delivery
• The ultimate deliverable is a set of files and their related metadata
• The plan matches a small team with the largest possible staff of knowledgeable insiders and a small set of external experts
Nagging Questions
• Can you get mindshare?
• How do you know what your reuse rights are on each file?
• What do you do with composite files?
• When do you stop?
• How do you avoid the bottlenecks of the SMEs?
• How do you take back an early mistake?
• How do you scale back?