Advanced Semantics and Search

Beyond Tag Clouds and Taxonomies
Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Agenda
 Introduction
– 2.0 is really 1.35
 Semantic Search - Integrated Design
– Examples – Good, Bad, Ugly
– Themes and Conclusions
 Integrated Solutions – How to Beat the Crowd
– People, Technology, Tags, Semantics
 Conclusion
KAPS Group: General
 Knowledge Architecture Professional Services
 Virtual Company: Network of consultants – 12-15
 Partners – FAST, Inxight, Siderean, Nstein, etc.
 Consulting, Strategy, Knowledge architecture audit
 Taxonomies: Enterprise, Marketing, Insurance, etc.
 Services:
– Taxonomy development, consulting, customization
– Technology Consulting – Search, CMS, Portals, etc.
– Metadata standards and implementation
– Knowledge Management: Collaboration, Expertise, e-learning
– Applied Theory – Faceted taxonomies, complexity theory, natural categories
2.0 – Reality Check - General
 Evolution, not Revolution
 Tyranny of the majority - worst type of central authority
 More Madness of Crowds than Wisdom of Crowds
 Enterprise 2.0 – still looking for a problem to solve
– Social Networking is a small part of business
 “Things fall apart; the center cannot hold;
Mere anarchy is loosed upon the world, …
The best lack all conviction, while the worst
Are full of passionate intensity.” - The Second Coming – W.B. Yeats
2.0 – Reality Check - Search
 Folksonomies don’t compare with taxonomies or ontologies
 Serendipity browsing is a small part of search
 Fundamental Limits
– Limited areas of success – popular sites are popular
– Quality Content – finance, science, etc. – not good candidates
– No mechanism for improving folksonomies
– Scale – Too Big (million hits) – Too Little (200 items) – Amazon and LibraryThing
– Need intrinsic value of tagging – not tagging for better tags
 Bad Tags - idiosyncratic or too broad, errors, limited reach
– Most people can’t tag very well – learned skill
Semantics and Search: An Integrated Approach: Elements
 Multiple Knowledge Structures
– Facet – orthogonal dimension of metadata
– Taxonomy - Subject matter / aboutness
– Ontology – Relationships / Facts
• Subject – Verb - Object
 Software - Text analytics, auto-categorization, entity extraction
 People – tagging, evaluating tags, fine tune rules and taxonomy
 People – Users, social tagging, suggestions
 Rich Search Results – context and conversation
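One way to picture the three knowledge structures named on this slide is as three small data shapes. The sketch below is illustrative only: the `Document` class, the `TAXONOMY` and `TRIPLES` tables, the helper names, and all sample values are assumptions made for this example, not material from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    # Facets: orthogonal metadata dimensions, one value per dimension.
    facets: dict = field(default_factory=dict)
    # Taxonomy topics: what the document is about.
    topics: set = field(default_factory=set)


# Taxonomy (hypothetical): child topic -> parent topic.
TAXONOMY = {
    "auto insurance": "insurance",
    "life insurance": "insurance",
    "insurance": "finance",
}

# Ontology (hypothetical): facts stored as (subject, verb, object) triples.
TRIPLES = [
    ("Acme Corp", "sells", "auto insurance"),
    ("Acme Corp", "located_in", "Boston"),
]


def ancestors(topic):
    """Walk up the taxonomy from a topic toward its root."""
    chain = []
    while topic in TAXONOMY:
        topic = TAXONOMY[topic]
        chain.append(topic)
    return chain


def objects_of(subject, verb):
    """Return every object linked to a subject by the given verb."""
    return [o for s, v, o in TRIPLES if s == subject and v == verb]


doc = Document(
    title="Auto policy FAQ",
    facets={"source": "Claims", "type": "faq"},
    topics={"auto insurance"},
)
broader_topics = ancestors("auto insurance")
products = objects_of("Acme Corp", "sells")
```

The facets answer "which slice of metadata?", the taxonomy answers "what is it about, and what is that a kind of?", and the triples answer "what facts connect these things?". Keeping them separate is what lets a search interface combine them.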
Integrated Design – Facets & Semantics
Design Issues - General
 What is the right combination of elements?
– Faceted navigation, metadata, browse, search, categorized search results, file plan
 What is the right balance of elements?
– Dominant dimension or equal facets
 Full Facets – Multiple intersecting filters
– 1 or 2 filters (source / type) – No
 When to combine search, topics, and facets?
– Search first and then filter by topics / facet
– Browse/facet front end with a search box
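The "search first, then filter by facet" pattern with multiple intersecting filters can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `DOCS` records, the facet names, and the `search`, `apply_filters`, and `facet_counts` helpers are all hypothetical, and the title substring match stands in for a real search engine.

```python
from collections import Counter

# Hypothetical documents, each carrying several independent facets.
DOCS = [
    {"title": "Q3 earnings report", "source": "Finance", "type": "report", "place": "Boston"},
    {"title": "Earnings call notes", "source": "Finance", "type": "memo", "place": "Denver"},
    {"title": "Hiring report", "source": "HR", "type": "report", "place": "Boston"},
]


def search(docs, query):
    """Naive keyword search: case-insensitive substring match on the title."""
    q = query.lower()
    return [d for d in docs if q in d["title"].lower()]


def apply_filters(docs, filters):
    """Keep documents that match every selected facet value (intersection)."""
    return [d for d in docs if all(d.get(f) == v for f, v in filters.items())]


def facet_counts(docs, facet):
    """Count documents per facet value, for display next to each filter."""
    return Counter(d[facet] for d in docs if facet in d)


hits = search(DOCS, "report")              # step 1: search
counts = facet_counts(hits, "source")      # step 2: show facet counts for the hits
narrowed = apply_filters(hits, {"source": "Finance", "place": "Boston"})  # step 3: intersect filters
```

Because each filter is an independent dimension, adding a third or fourth facet only adds another key to the `filters` dictionary. That is the practical difference between full facets and a fixed source/type pair.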
Integrated Design – Facets & Semantics
Design Issues - General
 Good Information Architecture
– Space wars – summary or full facet display
– Simplicity vs. research power
– Source and Type are basics
– Standard Facets – People, Companies, Place, Industry
– Interactive interface – sliders, date ranges
 Semantics still hardest – summaries, related, rank
 Taxonomy – just another facet?
– Keywords vs. simple taxonomy
 Tag Clouds / Clusters – how useful?
 Feedback – numbers of stories vs. top stories
Integrated Design – Facets & Semantics
Design Issues - Users
 Homogeneity of Audience and Content
 Model of the Domain – broad
– How many facets do you need?
– More facets and let users decide
– Allow for customization – can’t define a single set
 User Analysis – tasks, labeling, communities
• Issue – the labels that people use to describe their business vs. the labels that they use to find information
 Match the structure to domain and task
– Users can understand different structures
Integrated Solution: Enterprise and eCommerce
 Semantics, Technology, People, Policy
 Design the right balance for each area
– Products – facets, Publishing – more software emphasis – for tags
– Enterprise – more precise targets, high quality content, more direct role for policy
 New Relationship of Central and Crowd
– Not top down or bottom up
– Interpenetration of opposites
 Variety of Knowledge structures
– Folksonomies, taxonomies, ontologies, facets
Integrated Solutions: Technology
 Text Analytics – Taxonomy management, entity extraction, categorization, sentiment
– Auto-populate variety of metadata – author, title, date, etc.
– Relevance – best bets to weights and classes of documents
 Search – Integrated features, facets and clusters and tag clouds and feedback
 Enterprise Content Management
– Place to add metadata, supported by policy
– Gather input from authors, tag clouds plus
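To make "entity extraction" and "auto-categorization" concrete, here is a deliberately simple sketch of how software might auto-populate metadata for a document. It assumes a hand-built name list (`GAZETTEER`) and keyword rules (`CATEGORY_RULES`), both invented for this example. Commercial text analytics products use far richer rules and models, so treat this as an illustration of the idea rather than of any vendor's method.

```python
import re

# Hypothetical gazetteer: known entity names grouped by entity type.
GAZETTEER = {
    "company": ["Acme Corp", "Globex"],
    "place": ["Boston", "Denver"],
}

# Hypothetical categorization rules: taxonomy node -> indicative terms.
CATEGORY_RULES = {
    "insurance": ["policy", "premium", "claim"],
    "marketing": ["campaign", "brand"],
}


def extract_entities(text):
    """Return the gazetteer names that appear in the text, grouped by type."""
    found = {}
    for entity_type, names in GAZETTEER.items():
        matches = [
            n for n in names
            if re.search(r"\b" + re.escape(n) + r"\b", text)
        ]
        if matches:
            found[entity_type] = matches
    return found


def categorize(text, min_hits=2):
    """Assign every category whose terms occur at least min_hits times."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        cat: sum(words.count(term) for term in terms)
        for cat, terms in CATEGORY_RULES.items()
    }
    return sorted(cat for cat, score in scores.items() if score >= min_hits)


def auto_metadata(text):
    """Build suggested metadata that a person can then review and correct."""
    return {"entities": extract_entities(text), "categories": categorize(text)}


text = "Acme Corp raised the premium on every policy sold in Boston."
meta = auto_metadata(text)
```

The output is a suggestion, not a final answer: in the integrated approach described here, authors and taxonomists review the proposed values and tune the rules over time.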
Integrated Solution: People
 Programmers, Librarians, Taxonomists, Metadata specialists
– Integrate, design, develop rules, monitor activity & quality
 Authors, Subject Matter Experts
– Input into design (important facets), rules, activity meaning
 Users – Web 2.0
– Feedback – quality and usability
– Suggestions – missing terms, bad categorization & entity
– Tag Clouds & folksonomy – for social networking features, not for information retrieval
Conclusions
 90% of what you hear about Folksonomies (2.0) is hype – again
– Folksonomies are a great source for first drafts and social research
– Social Networking is really good – for social networking
 Semantic Infrastructure solution (people, policy, technology, semantics) and feedback is best approach
 Integrated design is essential – not facets as add on
 Semantics is still not there – hardest, but some progress
 Text Analytics (Entity extraction and auto-categorization) are essential
 Future – new kinds of applications:
– Text Mining, research tools, sentiment
Questions?
Tom Reamy
tomr@kapsgroup.com
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com