Dictionary of Search Terminology

TopQuadrant Technology Research TQTR-Search02
TQTR-Search02_color.doc
Date: 4/10/2003 10:52 AM
Copyright ® 2002 - 2003 TopQuadrant, Inc.
All Rights Reserved. Printed in U.S.A. Confidential, Unpublished Property of TopQuadrant
Table of Contents
TopQuadrant Technology Research
Dictionary of Search Terminology
Search Technology Overview
  How Search Works
  Categorization and Search
  The Reasons for publishing and using a Dictionary of Search Terminology
Dictionary
  Adaptive probabilistic concept modeling (APCM)
  Boolean Search
  Bayesian Inference or Bayesian Statistics
  Capitalization
  Case Based Reasoning
  Categorization
  Controlled Vocabulary
  Corpus
  Dublin Core
  Fuzzy Search
  Genre Detection
  Grammatical analysis
  Guided Search
  Inbound Link
  Index File
  Information Gain
  Information Visualization
  Inverse Document Frequency (IDF)
  Inverted File
  Keyword Search
  Keyword targeting
  Knowledge Extraction
  Knowledge Model
  Knowledge Representation Language
  Language Identification
  Lexical analysis or Tokenizing
  Link Tracking
  Log File Analysis
  Metadata
  Meta Search Engine
  Meta Tag
  Natural Language Query
  Natural Language Processing
  Navigational Search
  Ontology
  Ontology Model
  Parametric Search
  Pattern Matching
  Phonetic Analysis
  Phrase Extraction
  Precision
  Pragmatic Analysis
  Proximity Search
  Query by Example
  Ranking
  Recall
  Relevance
  Relevance Modeling Technology
  Results Management
  Semantic Analysis
  Semantic Web
  Similarity Measures
  Spiders or Crawlers
  Stemming
  Soundex Search
  Summarization
  Syntactic Analysis
  Taxonomy
  Term Frequency (TF)
  Term Vectors
  Thesaurus
  Word Exclusion and Meaningless Terms
  Word Location
  Word Proximity
Emerging Standards
  Knowledge Representation
    DAML
    OIL
    OWL
    RDF
    RDF Schema
    TopicMaps
  Metadata
    Dublin Core
    ISO/IEC 11179
About TopQuadrant
Additional TopQuadrant Technology Briefings are Available
Search Technology Overview
How Search Works
Figure 1: Basic components of the search process
Categorization and Search
Figure 2: Categorization Engine
Figure 3: Taxonomy-based Solution Lifecycle
The Reasons for publishing and using a Dictionary of Search Terminology
Dictionary
Adaptive probabilistic concept modeling (APCM)
Boolean Search
Bayesian Inference or Bayesian Statistics
Capitalization
Case Based Reasoning
Categorization
Controlled Vocabulary
Corpus
Dublin Core
Fuzzy Search
Genre Detection
Grammatical analysis
Guided Search
Inbound Link
See Link Tracking.
Index File
Information Gain
Information Visualization
Inverse Document Frequency (IDF)
Inverted File
Keyword Search
Keyword targeting
Knowledge Extraction
Knowledge Model
Knowledge Representation Language
Language Identification
Lexical analysis or Tokenizing
Link Tracking
Log File Analysis
Metadata
Meta Search Engine
Meta Tag
Natural Language Query
Natural Language Processing
Navigational Search
Ontology
Ontology Model
Parametric Search
Pattern Matching
Phonetic Analysis
Phrase Extraction
Precision
Pragmatic Analysis
Proximity Search
Query by Example
Ranking
Recall
Relevance
Relevance Modeling Technology
Results Management
Semantic Analysis
Semantic Web
(http://www.SemanticWeb.org)
Similarity Measures
Spiders or Crawlers
Stemming
Soundex Search
Summarization
Syntactic Analysis
Taxonomy
Term Frequency (TF)
See Inverse Document Frequency (IDF).
Term Vectors
Thesaurus
Word Exclusion and Meaningless Terms
Word Location
Word Proximity
Emerging Standards
Knowledge Representation
DAML
OIL
OWL
RDF
RDF Schema
TopicMaps
Table 1: Adoption Level of Knowledge Representation Languages
Metadata
Dublin Core
Table 2: Dublin Core metadata example
ISO/IEC 11179
About TopQuadrant
Additional TopQuadrant Technology Briefings are Available