Metadata Strategy

Selecting Taxonomy Software
Who, Why, How
Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Agenda
• Introduction: Basic Decision Context
  – What, Why, and How
• Evaluating Software
  – Features – good, bad, and ugly
  – History, Philosophy, and Evolution
• Conclusion

KAPS Group: General
• Knowledge Architecture Professional Services
• Virtual Company: Network of consultants – 12-15
• Partners – Convera, Inxight, FAST, etc.
• Consulting, Strategy, Knowledge architecture audit
• Taxonomies: Enterprise, Marketing, Insurance, etc.
• Services:
  – Taxonomy development, consulting, customization
  – Technology Consulting – Search, CMS, Portals, etc.
  – Metadata standards and implementation
  – Knowledge Management: Collaboration, Expertise, e-learning
  – Applied Theory – Faceted taxonomies, complexity theory, natural categories

Varieties of Taxonomy Software
• Taxonomy Management
  – Multi-Tes, Data Harmony, SchemaLogic
• Distributed Taxonomy Development
  – Wordmap, Wikionomy
• Text Analytics – Entity Extraction
  – ClearForest, Inxight, Teragram
• Auto-Categorization
  – ClearForest, Inxight, Teragram
• Embedded software – Content Management, Search

Why Taxonomy Software?
• If you have to ask, you can’t afford it
• Spreadsheets
  – Good for calculations, but their days for taxonomy development are over (almost)
• Ease of use – more productive
  – Increase speed of taxonomy development
  – Better Quality – synonyms, related terms, etc.
• Distributed development – lower cost, user input (good and bad)

Decision Points
• Dedicated taxonomy management software
  – Small company, specialized taxonomy
• Real issue is how it will be integrated
• Text analytics / auto-categorization
  – Dedicated software or use features of CM and/or enterprise search
• Combination of dedicated and embedded
  – Integration – export and import is critical
• Integration with Policy / Procedure
  – Distributed contributions

Taxonomy – How will it be used?
• Browse front end to portal
• Search engine indexing
  – Keyword searching
  – Hierarchical browsing – formal structure
• Faceted navigation
  – Subject taxonomy and lots of metadata
• Controlled vocabulary for entering metadata
• Applications – text and data mining, alerts, etc.
• Semantic Infrastructure

Evaluating Taxonomy Software
Historical Perspective: Four Methods
• Spreadsheets were good enough for my father
• Flip a Coin
  – 50-50 chance
• Ask a Friend (Industry Recommendation)
  – Historical Accident?
• Feature Check List and Score
  – Basic taxonomy functionality
• Which method produces different results?

Evaluating Taxonomy Software
Feature Checklist and Score: Basic Features
• New, copy, rename, delete, merge
  – Branches not just nodes
• Scope Notes
• Spell check
• Search – all parts and selected (only taxonomy nodes)
• Names and Identifiers for terms and nodes
• Versioning

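To make the basic features above concrete, here is a minimal Python sketch of a taxonomy node that keeps a stable identifier separate from its display name, carries a scope note, and supports moving and merging whole branches rather than single nodes. The class and function names (Node, move_branch, merge) are illustrative assumptions, not the data model of any product named in this deck.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality; avoids comparing whole subtrees
class Node:
    node_id: str                 # stable identifier, independent of the name
    name: str                    # display label; renaming does not change node_id
    scope_note: str = ""
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

    def branch_ids(self) -> List[str]:
        """Identifiers of this node and everything below it."""
        ids = [self.node_id]
        for child in self.children:
            ids.extend(child.branch_ids())
        return ids


def move_branch(node: Node, new_parent: Node) -> None:
    """Move a node together with its whole branch under new_parent."""
    if new_parent.node_id in node.branch_ids():
        raise ValueError("cannot move a branch underneath itself")
    if node.parent is not None:
        node.parent.children.remove(node)
    new_parent.add_child(node)


def merge(source: Node, target: Node) -> None:
    """Merge source into target: re-parent the children, then drop source."""
    for child in list(source.children):
        move_branch(child, target)
    if source.parent is not None:
        source.parent.children.remove(source)
        source.parent = None


# Hypothetical example data
root = Node("n0", "Insurance")
claims = root.add_child(Node("n1", "Claims", scope_note="Filed claims only"))
claims.add_child(Node("n2", "Auto claims"))
claims.add_child(Node("n3", "Home claims"))
losses = root.add_child(Node("n4", "Losses"))

merge(claims, losses)      # the whole Claims branch now lives under Losses
losses.name = "Claims and losses"   # rename; node_id "n4" is unchanged
```

Keeping the identifier separate from the name is what lets a rename or merge happen without breaking anything that already references the node.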
Evaluating Taxonomy Software
Feature Checklist and Score: Usability
• Ease of use – copy, paste, rename, merge, etc.
• User Documentation, user manuals, on-line help, training and tutorials
• Visualization
  – file structure, tree
  – Hierarchy and alphabetical?
• Automatic Taxonomy/Node Generation
  – Nonsense for Taxonomy
  – Node – suggestions – perhaps
  – List of terms out of context versus reading

Evaluating Taxonomy Software
Feature Checklist and Score: Additional Features
• Language support – international
  – If you have need for it
• Scalability – Size of taxonomy rarely important
  – More important for auto-categorization
• Import-Export – XML and SKOS
• Support standards – NISO, etc.
• Mapping between taxonomies
• API / SDK
• Security, Access Rights, Roles – See integration

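As an illustration of the XML and SKOS import-export item above, the following Python sketch writes a tiny taxonomy as SKOS in RDF/XML using only the standard library. The SKOS and RDF namespaces and the property names (prefLabel, altLabel, broader, scopeNote) come from the W3C SKOS vocabulary; the base URI, the concept records, and the to_skos_xml function are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/taxonomy/"  # hypothetical base URI

# Hypothetical taxonomy records: id, preferred label, synonyms, parent id
CONCEPTS = [
    {"id": "n1", "label": "Insurance", "synonyms": [], "broader": None,
     "scope_note": "All lines of business"},
    {"id": "n2", "label": "Auto insurance", "synonyms": ["Car insurance"],
     "broader": "n1", "scope_note": ""},
]


def to_skos_xml(concepts):
    """Serialize concept records as SKOS concepts in RDF/XML."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    for c in concepts:
        el = ET.SubElement(root, f"{{{SKOS}}}Concept",
                           {f"{{{RDF}}}about": BASE + c["id"]})
        ET.SubElement(el, f"{{{SKOS}}}prefLabel").text = c["label"]
        for synonym in c["synonyms"]:
            ET.SubElement(el, f"{{{SKOS}}}altLabel").text = synonym
        if c["scope_note"]:
            ET.SubElement(el, f"{{{SKOS}}}scopeNote").text = c["scope_note"]
        if c["broader"]:
            ET.SubElement(el, f"{{{SKOS}}}broader",
                          {f"{{{RDF}}}resource": BASE + c["broader"]})
    return ET.tostring(root, encoding="unicode")


xml_text = to_skos_xml(CONCEPTS)
```

A round trip (export from one tool, import into another, compare) on a small branch like this is a cheap way to check a vendor's integration claims during a pilot.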
Evaluating Taxonomy Software
Advanced Features – Taxonomy as Platform
• Text Analytics – multiple document types
• Entity Extraction
  – Multiple types, custom classes
• Auto-categorization
  – Training sets
  – Terms – literal strings, stemming, dictionary of related terms
  – Rules – simple – position in text (Title, body, URL)
  – Advanced – saved search queries (full search syntax)
  – NEAR, SENTENCE, PARAGRAPH
  – Boolean – X NEAR Y and Not-Z
• Advanced Features
  – Facts / ontologies / Semantic Web – RDF +

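The Boolean rule mentioned above, X NEAR Y and Not-Z, can be sketched in a few lines of Python. This is only a sketch under stated assumptions: NEAR is taken to mean "within a fixed number of tokens", the tokenizer is a simple lowercase word split with no stemming, and the function names are invented for the example. Commercial categorization engines define their own rule syntax and proximity semantics.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def near(tokens, x, y, window=5):
    """True if some occurrence of x is within `window` tokens of some y."""
    x_positions = [i for i, t in enumerate(tokens) if t == x]
    y_positions = [i for i, t in enumerate(tokens) if t == y]
    return any(abs(i - j) <= window for i in x_positions for j in y_positions)


def matches_rule(text, x, y, not_z, window=5):
    """Evaluate the rule: x NEAR y AND NOT not_z."""
    tokens = tokenize(text)
    return near(tokens, x, y, window) and not_z not in tokens


# Hypothetical rule for a "Banking" category: bank NEAR rates AND NOT river
finance_doc = "Interest rates rose as the bank tightened lending"
river_doc = "The river bank eroded as flood rates increased"

is_finance = matches_rule(finance_doc, "bank", "rates", "river")
is_river_finance = matches_rule(river_doc, "bank", "rates", "river")
```

Here the first document satisfies the rule and the second is excluded by the NOT term, which is the kind of disambiguation that literal term matching alone cannot provide.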
Evaluating Taxonomy Software
“Philosophy” Perspective
• Self-Knowledge is the highest form of knowledge.
• It’s not what you know, it’s who you know.
  – Importance of who is on the team
• Life is meaningless and absurd
  – And so are most search/categorization results
• Beauty and Meaning are in the eye of the beholder
  – Raise your hand if you think I’m more beautiful than …
• “The real constitution of things is accustomed to hide itself”
  – Beware 2.0 “solutions”

Self Knowledge is the highest form of knowledge
• Start with self knowledge – KA audit – content, users, technology, business and information behaviors
• Develop a model of taxonomy use in your enterprise
• Ask Experts – Taxonomy is not for the faint of heart
• If you test, use your own content
  – Balance of current application and platform
  – Use the test to get a head start on taxonomy development
• Spend more time on self knowledge than vendor capability.

Evaluating Taxonomy Software
Self Knowledge – Distributed model of taxonomy in action
• People
  – Interdisciplinary Team
  – Knowledge architects, editors, SME, users
• Roles
  – Select and implement taxonomy software, input into CM, Search
  – Care and feeding of taxonomies, metadata, vocabularies
  – Initial filter of user input, monitoring user input, answer questions
  – Provide input – what works and not, new terms
• Technology
  – Develop taxonomies, vocabularies, facets
  – Integrate taxonomy into CM, search, applications
• Activities
  – Information needs and behaviors – support with advanced features

It’s not what you know, it’s who you know
Design of the Taxonomy Selection Team
• Traditional Candidates - IT
• Experience with large software purchases
  – Search/Categorization is unlike other software
• Experience with needs assessments
  – Need more – know what questions to ask, knowledge audit
• Objective criteria
  – Looking where there is light?
  – Asking IT to select taxonomy software is like asking a construction company to select the design of your house.
• They have the budget
  – OK, they can play.

It’s not what you know, it’s who you know
Design of the Taxonomy Selection Team
• Traditional Candidates - Business Owners
• Understand the business
  – But don’t understand information behavior
• Focus on business value, not technology
  – Focus on semantics is needed
• They can get executive sponsorship, support, and budget.
  – OK, they can play

It’s not what you know, it’s who you know
Design of the Taxonomy Selection Team
• Traditional Candidates - Library
• Understand information structure
  – But not how it is used in the business
• Experts in search experience and categorization
  – Suitable for experts, not regular users
• Experience with a variety of search engines, taxonomy software, integration issues
  – OK, they can play

It’s not what you know, it’s who you know
Design of the Taxonomy Selection Team
• Interdisciplinary Team, headed by Information Professionals
• Relative Contributions
  – IT – Set necessary conditions, support tests
  – Business – provide input into requirements, support project
  – Library – provide input into requirements, add understanding of search semantics and functionality
• IP – Rank the relative contributions
  – Knowledge Audit – understand information behaviors
  – Taxonomy in full context

Evaluating Taxonomy Software
Evolutionary Approach
• Eliminate the unfit
  – Filter One – Ask Experts – reputation, research – Gartner, etc.
    • Market strength of vendor, platforms, etc.
    • Look for minimum features
  – Filter Two – Technology Filter – match to your overall scope and capabilities – Filter not a focus
  – Filter Three – Focus Group one day visit – 3-4 vendors
  – Filter Four – deep pilot (2) – advanced, integration
• Evolve higher life forms
  – Focus on working relationship with vendor.
  – Focus on ease of customization

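The evolutionary approach above treats the basic feature checklist as a pass/fail filter and saves ranking for the pilot. A minimal Python sketch of that idea follows. The vendor names, feature names, pilot criteria, scores, and weights are all hypothetical; a real evaluation would take them from your own knowledge audit and pilot results.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Vendor:
    name: str
    features: Set[str]              # basic checklist items the product supports
    pilot_scores: Dict[str, float]  # 0-10 ratings gathered during a pilot


# Hypothetical minimum features: a vendor lacking any of them is eliminated
REQUIRED = {"import_export", "versioning", "scope_notes"}

# Hypothetical weights for the criteria used to rank the survivors
WEIGHTS = {"vendor_relationship": 0.4, "customization": 0.4, "integration": 0.2}


def shortlist(vendors: List[Vendor], required: Set[str],
              weights: Dict[str, float]) -> List[Vendor]:
    """Filter on required features first, then rank survivors by pilot score."""
    survivors = [v for v in vendors if required <= v.features]

    def weighted_score(v: Vendor) -> float:
        return sum(w * v.pilot_scores.get(k, 0.0) for k, w in weights.items())

    return sorted(survivors, key=weighted_score, reverse=True)


vendors = [
    Vendor("Vendor A", {"import_export", "versioning", "scope_notes", "spell_check"},
           {"vendor_relationship": 6, "customization": 5, "integration": 8}),
    Vendor("Vendor B", {"import_export", "scope_notes", "spell_check", "api"},
           {"vendor_relationship": 9, "customization": 9, "integration": 9}),
    Vendor("Vendor C", {"import_export", "versioning", "scope_notes"},
           {"vendor_relationship": 8, "customization": 8, "integration": 4}),
]

ranked = [v.name for v in shortlist(vendors, REQUIRED, WEIGHTS)]
```

Vendor B has the best pilot ratings and the longest feature list but lacks versioning, so it never reaches the ranking step. Extra checklist items earn no points; they only decide who gets piloted.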
Conclusion
• Start with self-knowledge
• Taxonomy is not an end in itself – what will you use it for?
• Basic Features are only filters, not scores
• Integration – need an integrated team (IT, Business, KA)
• Integration – right balance, location (dedicated or embedded)
• Integration – Distributed model of taxonomy development and applications
  – Central team and distributed authors, users
  – CM, SharePoint, Search, Advanced Applications

Questions?
Tom Reamy
tomr@kapsgroup.com
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com