Taxonomy in Context

Taxonomy and Knowledge Organization
Taxonomy in Context
Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Agenda
 Introduction: Time for Taxonomies
 Taxonomy Types: Strengths and Weaknesses
– Formal and Browse
 Taxonomy in the Organization: Intellectual Infrastructure
– Content, People, Activities
 Taxonomy Tips and Techniques
– Development Stages
– Issues and Ideas
 Future Directions
– Building on the Intellectual Infrastructure
KAPS Group




 Knowledge Architecture Professional Services (KAPS)
 Consulting, strategy recommendations
 Knowledge architecture audits
 Partners – Convera and others
– First Convera Certified Taxonomy Developers
 Taxonomies: Enterprise, Marketing, Insurance, etc.
– Taxonomy customization
 Intellectual infrastructure for organizations
– Knowledge organization, technology, people and processes
– Search, content management, portals, collaboration, knowledge management, e-learning, etc.
Time for Taxonomies
 Taxonomy Time: Technology is not delivering
– Professionals spend more time looking for information than using it
– 50% of them spend > 2 hours a day looking
 Search not enough – text strings vs. concepts
– Relevance isn’t very relevant
 Data mining misses 80% of significant content
– Text mining needs more structure (taxonomies)
 Surveys
– 76% say taxonomies are important
– 90% plan on a taxonomy strategy in 24 months
Time for Taxonomies: Word of Caution
 Taxonomy is not the answer
– Is this a taxonomy?
• Inventories, catalogs, classifications, categorization schemas, thesauri, controlled vocabularies
– Taxonomy not enough – need other structures
• Metadata, facets
– Taxonomies have to be used to be useful
 How to fail:
– Taxonomy as a project
– Taxonomy as a search engine project afterthought
Two Types of Taxonomies: Browse and Formal
Browse Taxonomy – Yahoo
Browse Taxonomies: Strengths and Weaknesses
 Strengths: Browse is better than search
– Context and discovery
– Browse by task, type, etc.
 Weaknesses:
– Mix of organization
• Catalogs, alphabetical listings, inventories
• Subject matter, functional, publisher, document type
– Vocabulary and nomenclature issues
– Problems with maintenance, new material
– Poor granularity and little relationship between parts
• Web site unit of organization
– No foundation for standards
Formal Taxonomies: Strengths and Weaknesses
 Strengths:
– Fixed Resource – little or no maintenance
– Communication Platform – share ideas, standards
– Infrastructure Resource
• Controlled vocabulary and keywords
• More depth, finer granularity
 Weaknesses:
– Difficult to develop and customize
– Don’t reflect users’ perspectives
• Users have to adapt to language
Dynamic Classification: Best of Both Worlds
 Search and browse better than either alone
– Categorized search – context
– Browse as an advanced search
 Dynamic search and browse is best
– Can’t predict all the ways people think
• Advanced cognitive differences
• Panda, Monkey, Banana
– Can’t predict all the questions and activities
• Intersections of what users are looking for and what documents are often about
• China and Biotech
• Economics and Regulatory
 Facet Taxonomies
– Actors, events, functions, geography
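
The facet idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the approach of any particular product: the document IDs, the facet names, and the `build_index` and `browse` helpers are all hypothetical. Each document carries values for several facets, and a dynamic browse is simply the intersection of the selected facet values, so a question such as "China and Biotech" needs no pre-built node in a hierarchy.

```python
from collections import defaultdict

# Hypothetical documents tagged with facet values (illustrative data only).
DOCS = {
    "doc1": {"geography": {"China"}, "topic": {"Biotech"}},
    "doc2": {"geography": {"China"}, "topic": {"Economics", "Regulatory"}},
    "doc3": {"geography": {"India"}, "topic": {"Biotech", "Regulatory"}},
}

def build_index(docs):
    """Map each (facet, value) pair to the set of documents carrying it."""
    index = defaultdict(set)
    for doc_id, facets in docs.items():
        for facet, values in facets.items():
            for value in values:
                index[(facet, value)].add(doc_id)
    return index

def browse(index, **selected):
    """Return the documents that match every selected facet value."""
    result = None
    for facet, value in selected.items():
        matches = index.get((facet, value), set())
        result = matches if result is None else result & matches
    return sorted(result or [])

INDEX = build_index(DOCS)
print(browse(INDEX, geography="China", topic="Biotech"))
print(browse(INDEX, topic="Regulatory"))
```

Because the intersection is computed at query time, adding a new facet (for example actors or events) only means tagging documents with it; no existing hierarchy has to be reorganized.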
Taxonomy in Context: Intellectual Infrastructure
 3 infrastructures: technology, organizational, intellectual
– Technology – systems and applications, servers and desktops, programmers and help desks, etc.
– Organizational – business units and project groups, policies and procedures, administrators and facilitators
– Intellectual – information and knowledge, vocabularies and applications, authors and editors and librarians
 Taxonomy at the nexus of the three infrastructures
 Taxonomy enables communication among people, content,
and technology
Taxonomy in the Organization:
Project Approach or Infrastructure Approach
 Situation: Problem with access to information
– Project Approach
• Publish everything on the intranet
• Buy a search engine
• Do some keyword and usability tests
• Buy a portal (or two)
• Buy content management software
• Try knowledge organization – taxonomy?
– Infrastructure Approach
• “The path up and down is one and the same.” (Heraclitus)
Taxonomy in the Organization:
Why an Infrastructure Approach?
 Immanuel Kant
– “Concepts without percepts are empty.”
– “Percepts without concepts are blind.”
 Knowledge Management (KM) / Information Projects
– KM without applications is empty
• Strategy only, management fad
• Elegant taxonomies – unused
 Applications without knowledge architecture (KA) are blind
– IT based KM
– Fragmented applications
Taxonomy in the Organization:
Structuring Content
 All kinds of content
– Structured and unstructured, Internet and desktop
 Metadata standards – Dublin Core+
– Keywords – poor performance
– Need controlled vocabulary, taxonomies, semantic network
 Document Type
– Form, policy, how-to, etc.
– Dynamic classification with subject matter taxonomies
 Audience
– Role, function, expertise, information behaviors
– Consistent across subject matter and people
 Best bets metadata
Taxonomy in the Organization:
Structuring People
 Individual People
– Tacit knowledge, information behaviors
– Advanced personalization – category priority
• Sales – forms ---- New Account Form
• Accountant ---- New Accounts ---- Forms
 Communities
– Variety of types – map of formal and informal
– Variety of subject matter – vaccines, research, scuba
– Variety of communication channels and information behaviors
– Community-specific vocabularies, need for inter-community communication (Cortical organization model)
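
One way to read the "category priority" example under Individual People is that the same document is reached through a different ordering of categories depending on the user's role. The sketch below assumes that reading. The role names, facet names, and the `browse_path` helper are hypothetical and are not taken from the talk.

```python
# Hypothetical facet priority per role (illustrative only).
ROLE_PRIORITY = {
    "sales": ["document_type", "subject"],
    "accountant": ["subject", "document_type"],
}

# One document, described once with two facets.
DOCUMENT = {
    "title": "New Account Form",
    "document_type": "Forms",
    "subject": "New Accounts",
}

def browse_path(document, role):
    """Order the document's categories by the role's facet priority."""
    categories = [document[facet] for facet in ROLE_PRIORITY[role]]
    return categories + [document["title"]]

print(" > ".join(browse_path(DOCUMENT, "sales")))
print(" > ".join(browse_path(DOCUMENT, "accountant")))
```

The document is tagged only once; personalization here is just a per-role ordering of the same facets.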
Taxonomy in the Organization:
Structuring Processes and Technology
 Technology: infrastructure and applications
– Enterprise platforms: from creation to retrieval to application
– Taxonomy as the computer network
• Applications – integrated meaning, not just data
 Creation – content management, innovation, communities of practice (CoPs)
– When, who, how, and how much structure to add
– Workflow with meaning, distributed subject matter experts (SMEs) and centralized teams
 Retrieval – standalone and embedded in applications and business processes
– Portals, collaboration, text mining, business intelligence, CRM
Taxonomy in the Organization:
The Integrating Infrastructure
 Starting point: knowledge architecture audit, K-Map
– Social network analysis, information behaviors
 People – knowledge architecture team
– Infrastructure activities – taxonomies, analytics, best bets
– Facilitation – knowledge transfer, partner with SMEs
 “Taxonomies” of content, people, and activities
– Dynamic Dimension – complexity not chaos
– Analytics based on concepts, information behaviors
 Taxonomy is the answer
– In an Infrastructure Context
Taxonomy Development: Tips and Techniques
Stage One – How to Begin
 Step One: Strategic Questions – why, what value from the
taxonomy, how are you going to use it
– Variety of taxonomies – important to know the differences, when to use what.
 Step Two: Get a good taxonomist! (or learn)
– Library Science + Cognitive Science + Cognitive Anthropology
 Step Three: Software Shopping
– Automatic Software – Fun Diversion for a rainy day
• Uneven hierarchy, strange node names, weird clusters
– Taxonomy Management, Entity Extraction, Visualization
 Step Four: Get a good taxonomy!
– Glossary, Index, Pull from multiple sources
– Get a good document collection
Taxonomy Development: Tips and Techniques
Stage Two: Development and/or Customization
 Combination of top down and bottom up (and Essences)
– Top: Design an ontology, facet selection
– Bottom: Vocabulary extraction – documents, search logs, interview authors and users
– Develop essential examples (Prototypes)
• Most Intuitive Level – genus (oak, maple, rabbit)
• Quintessential Chair – all the essential characteristics, no more
– Work toward the prototype and out and up and down
– Repeat until dizzy or done
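
The bottom-up step (vocabulary extraction from search logs) can be approximated with a simple phrase count. This is a rough sketch and an assumption about how one might start, not a description of any taxonomy tool: the sample queries, the stopword list, and the `candidate_terms` helper are hypothetical. Frequent words and phrases are only candidates, and a taxonomist still decides which become node labels or synonyms.

```python
import re
from collections import Counter

# Hypothetical search-log queries (illustrative only).
SEARCH_LOG = [
    "new account form",
    "New Account Form download",
    "expense policy",
    "travel expense policy",
    "account form",
]

# Words that carry no subject meaning in this (assumed) log.
STOPWORDS = {"the", "a", "of", "for", "and", "download"}

def candidate_terms(queries, max_words=3):
    """Count word n-grams (1..max_words) across queries, ignoring stopwords."""
    counts = Counter()
    for query in queries:
        words = [w for w in re.findall(r"[a-z0-9]+", query.lower())
                 if w not in STOPWORDS]
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts

terms = candidate_terms(SEARCH_LOG)
for term, count in terms.most_common(5):
    print(count, term)
```

The same function can be run over document titles or headings to compare the users' vocabulary with the authors' vocabulary.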
Taxonomy Development: Tips and Techniques
Stage Three: Evaluate and Refine
 Formal Evaluation
– Quality of corpus – size, homogeneity, representative
– Breadth of coverage – main ideas, outlier ideas (see next)
– Structure – balance of depth and width
– Kill the verbs
– Evaluate speciation steps – understandable and systematic
• Person – Unwelcome person – Unpleasant person – Selfish person
– Avoid binary levels, duplication of contrasts
– Primary and secondary education, public and private
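
Some of the structural checks above lend themselves to a quick script. The sketch below is a hedged illustration only: the nested-dictionary representation, the sample taxonomy, and the helper names `depth`, `binary_nodes`, and `duplicated_contrasts` are assumptions made for this example. It measures depth, lists nodes with exactly two children (binary levels), and reports child sets that repeat under more than one parent (duplicated contrasts such as public and private).

```python
from collections import defaultdict

# Hypothetical taxonomy as nested dicts: node label -> children.
TAXONOMY = {
    "Education": {
        "Primary": {"Public": {}, "Private": {}},
        "Secondary": {"Public": {}, "Private": {}},
    },
    "Health": {
        "Vaccines": {},
        "Research": {},
        "Regulation": {},
    },
}

def depth(tree):
    """Number of levels in the tree (0 for an empty dict)."""
    return 0 if not tree else 1 + max(depth(c) for c in tree.values())

def binary_nodes(tree, path=()):
    """Yield the paths of nodes that have exactly two children."""
    for label, children in tree.items():
        here = path + (label,)
        if len(children) == 2:
            yield " > ".join(here)
        yield from binary_nodes(children, here)

def duplicated_contrasts(tree):
    """Return child-label sets that appear under more than one parent."""
    parents = defaultdict(list)

    def walk(subtree, path):
        for label, children in subtree.items():
            here = path + (label,)
            if children:
                parents[frozenset(children)].append(" > ".join(here))
            walk(children, here)

    walk(tree, ())
    return {tuple(sorted(k)): v for k, v in parents.items() if len(v) > 1}

print("depth:", depth(TAXONOMY))
print("top-level width:", len(TAXONOMY))
print("binary nodes:", list(binary_nodes(TAXONOMY)))
print("duplicated contrasts:", duplicated_contrasts(TAXONOMY))
```

A repeated contrast is a hint that the distinction may work better as a separate facet than as a level in the hierarchy.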
Taxonomy Development: Tips and Techniques
Stage Three: Evaluate and Refine
 Practical Evaluation
– Test in real life application
– Select representative users and documents
– Test node labels with Subject Matter Experts
• Balance of making sense and jargon
– Test with representative key concepts
– Test for un-representative strange little concepts that only mean something to a few people but the people and ideas are key and are normally impossible to find
Taxonomy Development: Tips and Techniques
Issues and Ideas
 Complex Topics – intersection of subject domains and
facets
– What documents are often about is the intersection
– Example – China and Biotech
 Standards and Customization
– Balance of corporate communication and departmental specifics
– At what level are differences represented?
– Customize pre-defined taxonomy – additional structure, add synonyms and acronyms and vocabulary
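
Customizing a pre-defined taxonomy with synonyms and acronyms can be pictured as a lookup layer on top of the standard labels. The following is a minimal sketch under that assumption: the preferred labels, the departmental synonym table, and the `normalize` helper are hypothetical. The corporate labels stay fixed while each department adds its own entry terms.

```python
# Hypothetical preferred labels from a pre-defined taxonomy.
PREFERRED = {"Customer Relationship Management", "Knowledge Management"}

# Hypothetical departmental additions: synonyms and acronyms
# mapped to the preferred labels above.
SYNONYMS = {
    "crm": "Customer Relationship Management",
    "client management": "Customer Relationship Management",
    "km": "Knowledge Management",
}

def normalize(term):
    """Return the preferred label for a term, or None if it is unknown."""
    key = " ".join(term.lower().split())  # fold case, collapse whitespace
    for label in PREFERRED:
        if label.lower() == key:
            return label
    return SYNONYMS.get(key)

print(normalize("CRM"))
print(normalize("client  management"))
print(normalize("scuba"))
```

Terms that return `None` are useful output in their own right: they are candidates for new synonyms or new nodes.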
Taxonomy Development: Tips and Techniques
Issues and Ideas
 Enterprise Taxonomy
– No single subject matter taxonomy
– Need an ontology of facets or domains
 Enterprise Facet Model:
– Actors, Events, Functions, Locations, Objects, Information Resources
– Combine and map to subject domains
Future Directions: Knowledge Organization
 New analytic methods
– Cognitive anthropology, history of ideas, ESNA
 New metadata schemas
– SCORM, RDF and semantic Web
– Learning and knowledge objects
 New people models
– Bloom’s Taxonomy, Gardner’s 7 Intelligences
 Advanced personalization
– Community-based, cognitive-based
– Adaptive, dynamic presentation variations
Future Directions: Technology
 Taxonomies within applications
– Richer world knowledge and better learning
 Entity extraction and fact extraction
 Natural language processing (NLP) search – answers, not document lists
 Integrated KM platform
– Creation, structure, retrieval, application, measurement
– Integrated KM/KA team
– Contextualizing content: related content, best bets, expertise, communities
Future Directions: Well-Articulated Organization
 Learning takes place throughout the system
– Smart applications – adapt to users’ and communities’ activities
– Just-in-time training and performance support
 Combination of analytics and knowledge organization
– Concept-level, not document-level
– Taxonomy is the brain, analytics are the eyes
 Self-knowledge – highest form of knowledge
– “Unexamined life is not worth living.” (Plato)
– Unexamined, inarticulate enterprise is not worth having
The Contextual Desktop: Document, List of Documents, Applications Screen
 Before you view:
– Agent keeps you up to date
– Your connections to content and communities, your preferences
– Your history and the history of other members of your communities
 When you look for information
– Taxonomy-based dynamic browse
– Entities
• People, companies, wells
– Related content
• Regulatory, patents, BI-CI
• Geological data
• News stories
– Dictionaries, USGS data, databases
– Experts
• Ask questions, chat
 When you add/change content
– Suggests categorization value, metadata values
– Routes to appropriate content and communities
– Prompt on unusual connections
• Pre-existing content
• Related content
• Regulatory issues
• Ask the question – route to experts?
 When you use information
– Communities
• Search, chat, email
– Performance aids, classes
– Stories
Sources
 Books
– Women, Fire, and Dangerous Things
• What Categories Reveal about the Mind
• George Lakoff
– The Geography of Thought
• Richard E. Nisbett
 Software
– Convera RetrievalWare
– Inxight Smart Discovery – entity and fact extraction
 Courses
– Convera Taxonomy Certification
Questions?
Tom Reamy
tomr@kapsgroup.com
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com