TopQuadrant Technology Research
TQTR-Search02
Dictionary of Search Terminology

TQTR-Search02_color.doc
Date 4/10/2003 10:52 AM
Copyright © 2002 - 2003 TopQuadrant, Inc. All Rights Reserved. Printed in U.S.A.
Confidential, Unpublished Property of TopQuadrant

Table of Contents

TopQuadrant Technology Research
Dictionary of Search Terminology
Search Technology Overview
    How Search Works
    Categorization and Search
    The Reasons for publishing and using a Dictionary of Search Terminology
Dictionary
    Adaptive probabilistic concept modeling (APCM)
    Boolean Search
    Bayesian Inference or Bayesian Statistics
    Capitalization
    Case Based Reasoning
    Categorization
    Controlled Vocabulary
    Corpus
    Dublin Core
    Fuzzy Search
    Genre Detection
    Grammatical analysis
    Guided Search
    Inbound Link
    Index File
    Information Gain
    Information Visualization
    Inverse Document Frequency (IDF)
    Inverted File
    Keyword Search
    Keyword targeting
    Knowledge Extraction
    Knowledge Model
    Knowledge Representation Language
    Language Identification
    Lexical analysis or Tokenizing
    Link Tracking
    Log File Analysis
    Metadata
    Meta Search Engine
    Meta Tag
    Natural Language Query
    Natural Language Processing
    Navigational Search
    Ontology
    Ontology Model
    Parametric Search
    Pattern Matching
    Phonetic Analysis
    Phrase Extraction
    Precision
    Pragmatic Analysis
    Proximity Search
    Query by Example
    Ranking
    Recall
    Relevance
    Relevance Modeling Technology
    Results Management
    Semantic Analysis
    Semantic Web
    Similarity Measures
    Spiders or Crawlers
    Stemming
    Soundex Search
    Summarization
    Syntactic Analysis
    Taxonomy
    Term Frequency (TF)
    Term Vectors
    Thesaurus
    Word Exclusion and Meaningless Terms
    Word Location
    Word Proximity
Emerging Standards
    Knowledge Representation
        DAML
        OIL
        OWL
        RDF
        RDF Schema
        TopicMaps
    Metadata
        Dublin Core
        ISO/IEC 11179
About TopQuadrant
Additional TopQuadrant Technology Briefings are Available


Search Technology Overview

How Search Works

Figure 1: Basic components of the search process

Categorization and Search

Figure 2: Categorization Engine

Figure 3: Taxonomy-based Solution Lifecycle

The Reasons for publishing and using a Dictionary of Search Terminology

Dictionary

Adaptive probabilistic concept modeling (APCM)

Boolean Search

Bayesian Inference or Bayesian Statistics

Capitalization

Case Based Reasoning

Categorization

Controlled Vocabulary

Corpus

Dublin Core

Fuzzy Search

Genre Detection

Grammatical analysis

Guided Search

Inbound Link
    Link Tracking

Index File

Information Gain

Information Visualization

Inverse Document Frequency (IDF)

Inverted File

Keyword Search

Keyword targeting

Knowledge Extraction

Knowledge Model

Knowledge Representation Language

Language Identification

Lexical analysis or Tokenizing

Link Tracking

Log File Analysis

Metadata

Meta Search Engine

Meta Tag

Natural Language Query

Natural Language Processing

Navigational Search

Ontology

Ontology Model

Parametric Search

Pattern Matching

Phonetic Analysis

Phrase Extraction

Precision

Pragmatic Analysis

Proximity Search

Query by Example

Ranking

Recall

Relevance

Relevance Modeling Technology

Results Management

Semantic Analysis

Semantic Web
    http://www.SemanticWeb.org

Similarity Measures

Spiders or Crawlers

Stemming

Soundex Search

Summarization

Syntactic Analysis

Taxonomy

Term Frequency (TF)
    Inverse Document Frequency

Term Vectors

Thesaurus

Word Exclusion and Meaningless Terms

Word Location

Word Proximity


Emerging Standards

Knowledge Representation

DAML

OIL

OWL

RDF

RDF Schema

TopicMaps

Table 1: Adoption Level of Knowledge Representation Languages

Metadata

Dublin Core

Table 2: Dublin Core metadata example

ISO/IEC 11179


About TopQuadrant

Additional TopQuadrant Technology Briefings are Available