Selecting Taxonomy Software
Who, Why, How

Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com

Agenda
Introduction: Basic Decision Context
  • What, Why, and How
Evaluating Software
  • Features – good, bad, and ugly
  • History, Philosophy, and Evolution
Conclusion

KAPS Group: General
Knowledge Architecture Professional Services
Virtual Company: Network of consultants – 12-15
Partners – Convera, Inxight, FAST, etc.
Consulting, Strategy, Knowledge architecture audit
Taxonomies: Enterprise, Marketing, Insurance, etc.
Services:
  – Taxonomy development, consulting, customization
  – Technology Consulting – Search, CMS, Portals, etc.
  – Metadata standards and implementation
  – Knowledge Management: Collaboration, Expertise, e-learning
  – Applied Theory – Faceted taxonomies, complexity theory, natural categories

Varieties of Taxonomy Software
Taxonomy Management
  – Multi-Tes, Data Harmony, SchemaLogic
Distributed Taxonomy Development
  – Wordmap, Wikionomy
Text Analytics
  – Entity Extraction – ClearForest, Inxight, Teragram
Auto-Categorization – ClearForest, Inxight, Teragram
Embedded software – Content Management, Search

Why Taxonomy Software?
If you have to ask, you can’t afford it
Spreadsheets
  – Good for calculations, days of taxonomy development over (almost)
Ease of use – more productive
  – Increase speed of taxonomy development
  – Better Quality – synonyms, related terms, etc.
Distributed development – lower cost, user input (good and bad)

Decision Points
Dedicated taxonomy management software
  – Small company, specialized taxonomy
Real issue is how it will be integrated
Text analytics / auto-categorization
  – Dedicated software or use features of CM and/or enterprise search
Combination of dedicated and embedded
  – Integration – export and import is critical
Integration with Policy / Procedure
  – Distributed contributions

Taxonomy – How will it be used?
Browse front end to portal
Search engine indexing
  – Keyword searching
  – Hierarchical browsing – formal structure
Faceted navigation – Subject taxonomy and lots of metadata
Controlled vocabulary for entering metadata
Applications – text and data mining, alerts, etc.
Semantic Infrastructure

Evaluating Taxonomy Software
Historical Perspective: Four Methods
Spreadsheets were good enough for my father
Flip a Coin – 50-50 chance
Ask a Friend (Industry Recommendation) – Historical Accident?
Feature Check List and Score – Basic taxonomy functionality
Which method produces different results?

Evaluating Taxonomy Software
Feature Checklist and Score: Basic Features
New, copy, rename, delete, merge
  – Branches not just nodes
Scope Notes
Spell check
Search – all parts and selected (only taxonomy nodes)
Names and Identifiers for terms and nodes
Versioning

Evaluating Taxonomy Software
Feature Checklist and Score: Usability
Ease of use – copy, paste, rename, merge, etc.
User Documentation, user manuals, on-line help, training and tutorials
Visualization – file structure, tree
  – Hierarchy and alphabetical?
Automatic Taxonomy/Node Generation
  – Nonsense for Taxonomy
  – Node – suggestions – perhaps
  – List of terms out of context versus reading

Evaluating Taxonomy Software
Feature Checklist and Score: Additional Features
Language support – international
  – If you have need for it
Scalability
  – Size of taxonomy rarely important
  – More important for auto-categorization
Import-Export – XML and SKOS
Support standards – NISO, etc.
Mapping between taxonomies
API / SDK
Security, Access Rights, Roles
  – See integration

Evaluating Taxonomy Software
Advanced Features – Taxonomy as Platform
Text Analytics – multiple document types
Entity Extraction – Multiple types, custom classes
Auto-categorization
  – Training sets
  – Terms – literal strings, stemming, dictionary of related terms
  – Rules – simple – position in text (Title, body, url)
  – Advanced – saved search queries (full search syntax)
  – NEAR, SENTENCE, PARAGRAPH
  – Boolean – X NEAR Y and Not-Z
Advanced Features – Facts / ontologies / Semantic Web – RDF +

Evaluating Taxonomy Software
“Philosophy” Perspective
Self-Knowledge is the highest form of knowledge.
It’s not what you do, it’s who you know.
  – Importance of who on team
Life is meaningless and absurd
  – And so are most search/categorization results
Beauty and Meaning are in the eye of the beholder
  – Raise your hand if you think I’m more beautiful than …
“The real constitution of things is accustomed to hide itself”
  – Beware 2.0 “solutions”

Self Knowledge is the highest form of knowledge
Start with self knowledge – KA audit – content, users, technology, business and information behaviors
Develop a model of taxonomy use in your enterprise
Ask Experts – Taxonomy is not for faint of heart
If test – use own content
  – Balance of current application and platform
  – Use the test to get a head start on taxonomy development
Spend more time on self knowledge than vendor capability.
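
The auto-categorization rule types listed under Advanced Features (position in text, NEAR, and the Boolean form "X NEAR Y and Not-Z") can be illustrated with a minimal sketch. This is not any vendor's rule syntax: the function names, the five-token window, and the example terms are all assumptions chosen for illustration, and real products typically add stemming, phrase matching, and SENTENCE/PARAGRAPH scopes.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def near(tokens, x, y, window=5):
    """True if terms x and y occur within `window` tokens of each other."""
    xs = [i for i, t in enumerate(tokens) if t == x]
    ys = [i for i, t in enumerate(tokens) if t == y]
    return any(abs(i - j) <= window for i in xs for j in ys)

def matches_rule(text, x, y, not_z, window=5):
    """Hypothetical rule of the form: X NEAR Y and Not-Z."""
    tokens = tokenize(text)
    return near(tokens, x, y, window) and not_z not in tokens

def categorize(doc, rule, category):
    """Apply the rule by position in text: title first, then body.

    `doc` is an assumed dict with "title" and "body" strings.
    Returns (category, field_that_matched) or (None, None).
    """
    for field in ("title", "body"):
        if matches_rule(doc.get(field, ""), **rule):
            return category, field
    return None, None

# Illustrative rule: "policy" NEAR "renewal" and not "fraud"
rule = {"x": "policy", "y": "renewal", "not_z": "fraud", "window": 5}
doc = {"title": "Policy renewal notice", "body": "Your coverage continues."}
result = categorize(doc, rule, "Renewals")
```

Testing a vendor's rule language against your own content, as suggested above, amounts to writing rules like this one and checking which documents they do and do not catch.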

Evaluating Taxonomy Software
Self Knowledge – Distributed model of taxonomy in action
People – Interdisciplinary Team
  – Knowledge architects, editors, SME, users
Roles
  – Select and implement taxonomy software, input into CM, Search
  – Care and feeding of taxonomies, metadata, vocabularies
  – Initial filter of user input, monitoring user input, answer questions
  – Provide input – what works and not, new terms
Technology
  – Develop taxonomies, vocabularies, facets
  – Integrate taxonomy into CM, search, applications
Activities
  – Information needs and behaviors – support with advanced features

It’s not what you know, it’s who you know
Design of the Taxonomy Selection Team
Traditional Candidates - IT
Experience with large software purchases
  – Search/Categorization is unlike other software
Experience with needs assessments
  – Need more – know what questions to ask, knowledge audit
Objective criteria
  – Looking where there is light?
  – Asking IT to select taxonomy software is like asking a construction company to select the design of your house.
They have the budget
  – OK, they can play.

It’s not what you know, it’s who you know
Design of the Taxonomy Selection Team
Traditional Candidates - Business Owners
Understand the business
  – But don’t understand information behavior
Focus on business value, not technology
  – Focus on semantics is needed
They can get executive sponsorship, support, and budget.
  – OK, they can play

It’s not what you know, it’s who you know
Design of the Taxonomy Selection Team
Traditional Candidates - Library
Understand information structure
  – But not how it is used in the business
Experts in search experience and categorization
  – Suitable for experts, not regular users
Experience with variety of search engines, taxonomy software, integration issues
  – OK, they can play

It’s not what you know, it’s who you know
Design of the Taxonomy Selection Team
Interdisciplinary Team, headed by Information Professionals
Relative Contributions
  – IT – Set necessary conditions, support tests
  – Business – provide input into requirements, support project
  – Library – provide input into requirements, add understanding of search semantics and functionality
IP – Rank the relative contributions
  – Knowledge Audit – understand information behaviors
  – Taxonomy in full context

Evaluating Taxonomy Software
Evolutionary Approach
Eliminate the unfit
  – Filter One – Ask Experts – reputation, research – Gartner, etc.
      • Market strength of vendor, platforms, etc.
      • Look for minimum features
  – Filter Two – Technology Filter – match to your overall scope and capabilities – Filter not a focus
  – Filter Three – Focus Group one day visit – 3-4 vendors
  – Filter Four – deep pilot (2) – advanced, integration
Evolve higher life forms
  – Focus on working relationship with vendor.
  – Focus on ease of customization

Conclusion
Start with self-knowledge
Taxonomy is not an end in itself – what will you use it for?
Basic Features are only filters, not scores
Integration – need an integrated team (IT, Business, KA)
Integration – right balance, location (dedicated or embedded)
Integration – Distributed model of taxonomy development and applications
  – Central team and distributed authors, users
  – CM, SharePoint, Search, Advanced Applications

Questions?
Tom Reamy
tomr@kapsgroup.com
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com