Taxonomy Strategies

The Search for Meaning and Semantics: Taxonomies Get It Done
Joseph Busch – Why Semantics Matter
June 9, 2014
Copyright 2014 Taxonomy Strategies. All rights reserved.

Agenda
  Why semantics matter (… a quick review from 2001)
  What is semantic search, SKOS and Linked Data?
  Some semantic search examples?

Taxonomy Strategies The business of organized information

Why Semantics Matter
May 20, 2001

When you own a Rembrandt you can spell his name any way you want.

But when you want to find a Rembrandt … you better spell his name correctly.

Vocabulary resources can help find the right artist even if their name is typed incorrectly.

Users cannot type in the complex queries needed to find all the relevant items... But this can be done automatically.

Complex queries are even more important when you search the entire web.

So you find Rembrandt the Dutch guy...

… And not Rembrandt the toothpaste.

Getty Vocabularies Linked Data Services
February 19, 2014

Agenda
  Why semantics matter
  What is semantic search, SKOS and Linked Data?
  Some semantic search examples?

Search Failure
  19% Character errors. (Young, et al)
  40% Vocabulary errors. (Seaman. Norgard, et al)
  20% Index confusion.
  21% Successful (Nielsen)

Semantic search solution
Semantic search improves search accuracy by inferring the contextual meaning of terms via:
  Disambiguation
  Part of speech (POS) analysis
  Synonyms, variations and quasi-synonyms
  Concept matching
  Natural language query analysis
  Key sentence detection
Generate more consistent content to search on.
Correct user errors.
Map the language of users to the language of the target content.
Augment search results with linked data.

What semantics do for search?
Function                Description
Related search          Query corrections … did you mean?
Concept search          Query expansion with synonyms, abbreviations, acronyms, etc. … do you also want?
Ontology-based search   Query expansion with narrower or broader terms; scoping exhaustive search results
Faceted search          Dynamic filtering of search results; online shopping
Clustering              Dynamically bucketing search results into predefined categories
Stored queries          RSS feeds, alerts, SDI (selective dissemination of information), etc.
Personalization         Weighting search results based on explicit profiles and implicit data (where you’ve been and what you’ve done)

What is SKOS?
Provides the basis for any user, tool, or program to identify, define and link concept vocabularies.
Relationship       Definition
Concept            A unit of thought, an idea, meaning, or category of objects or events. A Concept is independent of the terms used to label it.
Preferred Label    A preferred lexical label for the resource such as a term used in a digital asset management system.
Alternate Label    An alternative label for the resource such as a synonym or quasi-synonym.
Broader Concept    Hierarchical link between two Concepts where one Concept is more general than the other.
Narrower Concept   Hierarchical link between two Concepts where one Concept is more specific than the other.
Related Concept    Link between two Concepts where the two are inherently "related", but that one is not in any way more general than the other.

[Diagram: the concepts lc:sh85052028 and trt:Brddf with their prefLabel and altLabel values, and a broader link from trt:Brddf to trt:Brdd (prefLabel Parking).]
Subject         Predicate        Object
lc:sh85052028   skos:prefLabel   Fringe parking
lc:sh85052028   skos:altLabel    Park and ride systems
lc:sh85052028   skos:altLabel    Park and ride
lc:sh85052028   skos:altLabel    Park & ride
lc:sh85052028   skos:altLabel    Park-n-ride
trt:Brddf       skos:prefLabel   Fringe parking
trt:Brddf       skos:altLabel    Park and ride
trt:Brddf       skos:altLabel    P&R system
trt:Brdd        skos:broader     Parking

Why SKOS?
According to Alistair Miles* (SKOS co-author)
Ease of combination with other standards
  Vocabularies are used in a great variety of contexts.
    – E.g., databases, faceted navigation, website browsing, linked open data, spellcheckers, etc.
  Vocabularies are re-used in combination with other vocabularies.
    – E.g., Library of Congress Subject Headings + Transportation Research Thesaurus; USPS states + USPS zip codes + US Congressional districts; etc.
Flexibility and extensibility to cope with variations in structure and style
  Variations between types of vocabularies
    – E.g., list vs. classification scheme
  Variations within types of vocabularies
    – E.g., Z39.19-2005 monolingual controlled vocabularies and the Transportation Research Thesaurus
* Head of Epidemiological Informatics at Oxford University Wellcome Trust Centre for Human Genetics (formerly OUP Senior Computing Officer)

Why SKOS?
(2)
Publish managed vocabularies so they can readily be consumed by applications
  Identify the concepts
    – What are the named entities?
  Describe the relationships
    – Labels, definitions and other properties
  Publish the data
    – Convert data structure to standard format
    – Put files on an HTTP server (or load statements into an RDF server)
Ease of integration with external applications
  Use web services to use or link to a published concept, or to one or more entire vocabularies.
    – E.g., Google Maps API, NY Times article search API, Linked open data; etc.
A W3C standard like HTML, CSS, XML and RDF, RDFS, and OWL.

Agenda
  Why semantics matter
  What is semantic search, SKOS and Linked Data?
  Some semantic search examples?

Taxonomy browser

Taxonomy-powered search results

Oracle.com top-level taxonomy
[Diagram. Nodes: Person, Organization, Location, Content Type, Product Line, Technology, Audience, Products, Application, Industry, Solution. Link types: Has a, Is a.]

Oracle event finder
http://events.oracle.com/
  Filter on Location and Language
  More filters based on this result
  Subscribe to RSS feed based on the criteria set on this page
  Results shown on Google maps UI

APS Taxonomy browser

Linked data example
A faceted taxonomy of concepts in physics
APS Taxonomy
  Broad Subject Areas
  Methods & Theories
  Phenomena
  Physical Systems
Physical Systems
  Astronomical systems
  Atomic-scale objects
  Beams
  Complex systems
  Dynamical systems
  Electric & magnetic fields
  Engineered materials
  Fundamental particles
  Gases
  Information systems
  Liquids
  Materials
  Nonlinear systems
  Nuclei
  Plasma
  Quasiparticles
Materials
  Materials by Composition
  Materials by Dimensionality
  Materials by Property
  Materials by Structure
Elements of the periodic table, and common isotopes
Elements by Group
  Group 1
  Group 2
  Group 3
  Group 4
  Group 5
  Group 6
  Group 7
  Group 8
  Group 9
  Group 10
  Group 11
  Group 12
    – Cadmium
    – Copernicium
    – Mercury
    – Zinc
  Group 13
  Group 14
  Group 15
  Group 16
  Group 17
  Group 18
Mercury isotopes: 194Hg, 196Hg, 198Hg, 199Hg, 200Hg, 201Hg, 202Hg, 204Hg

Paper submission tagging (prototype)

Joseph A Busch
Mobile 415-377-7912
jbusch@taxonomystrategies.com
QUESTIONS

Session description
Semantic search – a phrase that is increasingly used in the popular as well as the professional literature. What does it look like, and how will it work? Panelists will present their visions of semantic search. Program is designed to be interactive with audience participation – suggestions for functions and features they see in the future.
  What is semantic search?
  What are the components of semantic search?
  How can it be used in libraries?
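
Example sketch: concept search with SKOS labels
The "What semantics do for search?" slide describes concept search as query expansion with synonyms. The following is a minimal sketch of that idea in Python, assuming the label triples from the Fringe parking slide are held in a plain in-memory list. The names TRIPLES, build_label_index and expand_query are hypothetical and are not part of SKOS or of any library. The sketch treats the term a user types as a label, finds every concept that carries that label, and returns all the labels of those concepts.

```python
from collections import defaultdict

# Label triples copied from the "Fringe parking" slide: (subject, predicate, object).
# Holding them in a Python list is an assumption made for illustration only.
TRIPLES = [
    ("lc:sh85052028", "skos:prefLabel", "Fringe parking"),
    ("lc:sh85052028", "skos:altLabel", "Park and ride systems"),
    ("lc:sh85052028", "skos:altLabel", "Park and ride"),
    ("lc:sh85052028", "skos:altLabel", "Park & ride"),
    ("lc:sh85052028", "skos:altLabel", "Park-n-ride"),
    ("trt:Brddf", "skos:prefLabel", "Fringe parking"),
    ("trt:Brddf", "skos:altLabel", "Park and ride"),
    ("trt:Brddf", "skos:altLabel", "P&R system"),
]

LABEL_PREDICATES = {"skos:prefLabel", "skos:altLabel"}


def build_label_index(triples):
    """Map each concept to its labels, and each lowercased label to its concepts."""
    labels_by_concept = defaultdict(list)
    concepts_by_label = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in LABEL_PREDICATES:
            if obj not in labels_by_concept[subject]:
                labels_by_concept[subject].append(obj)
            concepts_by_label[obj.lower()].add(subject)
    return labels_by_concept, concepts_by_label


def expand_query(term, triples=TRIPLES):
    """Return the sorted labels of every concept that the given term labels."""
    labels_by_concept, concepts_by_label = build_label_index(triples)
    expanded = set()
    # Matching is case-insensitive; an unknown term expands to nothing.
    for concept in concepts_by_label.get(term.strip().lower(), ()):
        expanded.update(labels_by_concept[concept])
    return sorted(expanded)


if __name__ == "__main__":
    print(expand_query("park and ride"))
```

A search for "park and ride" would then also be run against "Fringe parking", "P&R system" and the other variants, because both vocabularies attach that label to a concept. A term such as "toothpaste" that matches no label comes back as an empty list, which a real system would presumably pass through unexpanded.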