ABSTRACTS SELECTED FOR PRESENTATIONS
AT AQUAINT 18-MONTH WORKSHOP
JUNE 2003
SAN DIEGO, CA
Blair-Goldensohn, Sasha J., Kathleen R. McKeown, and Andrew H. Schlaikjer.
“Creating Paragraph-Long Answers to Definitional Questions.” Columbia
University.
Ciany, Gary, Anita Kulman, Patrick Schone and Carol Van Ess-Dykema. “Insights into
Multilingual and Multimedia Question Answering.” Dragon Development/U.S.
Department of Defense.
Croft, Bruce and Stephen Cronen-Townsend. “Predicting Question Quality.” University
of Massachusetts Amherst.
Feldman, Jerome. “Dynamic and Probabilistic Inference.” ICSI/Berkeley.
Gish, Herb. “Answer Spotting: Finding Answers in Conversational Speech.” BBN
Technologies.
Hacioglu, Kadri, Sameer Pradhan, Valerie Krugler, Steven Bethard,
Ashley Thornton, Wayne Ward, Dan Jurafsky, and James Martin. “Improved
Semantic Role Parsing.” University of Colorado, Boulder.
Israel, David. “Natural Language Querying of the Semantic Web.” SRI International.
Kantor, Paul. “Data Fusion for Advanced Question-Answering.” University at
Albany/Rutgers University.
Korelsky, Tanya. “Ontology-based Multi-modal User Interface for Question Answering
in MOQA.” CoGenTex, Inc.
Nyberg, Eric. “Towards Light Semantic Processing for Question Answering.” CMU.
Ogden, W., J. McDonald, R. Zacharski, P. Bernick, and R. Chadwick. “Evaluating the
Habitability of Q&A with User-Generated Tasks.” NMSU.
Snow, Rion L. “Automatic Construction of Semantic Hierarchies.” HNC.
Starr, Barbara. “CNS Knowledge Base.” SAIC.
Weischedel, Ralph, Jinxi Xu, and Ana Licuanan. “A Hybrid Approach to Answering
Biographical/Definitional Questions.” BBN Technologies.
Yan, Rong, Alexander Hauptmann and Rong Jin. “Negative Pseudo-Relevance Feedback
in Content-based Video Retrieval.” CMU.
Creating Paragraph-Long Answers to Definitional Questions
Sasha J. Blair-Goldensohn, Kathleen R. McKeown, and Andrew H. Schlaikjer
Department of Computer Science
Columbia University
Questions such as "What is X?" can sometimes be answered with a short phrase; often,
however, the optimal answer is longer and includes information of different types such as
more general related terms, background information, and historical examples. We will
present DefScriber, a fully implemented component of our question-answering system
that combines knowledge-based and statistical methods in forming multi-sentence
answers to open-ended definitional questions of the form
"What is X?". DefScriber analyzes texts from multiple sources, matching text fragments
to a set of definitional predicates proposed as the knowledge-based side of our approach.
On the statistical side, we use clustering techniques to detect similar sentences across
multiple sources, and lexical cohesion measures to re-order the sentences in a fluent,
natural definition. We will present results of a recent human evaluation of definitions
generated by DefScriber from Internet documents.
Top-down techniques in DefScriber are based on key elements of definitions as identified
in the literature and in our own empirical study of definitions. One such element is
information on the term's category (Genus) and/or important properties (Species). For
instance, category, or Genus, information about the term "Hajj" is given in the sentence
"The Hajj is a type of ritual." DefScriber specifically searches for sentences that convey
these definitional information types in building a definitional description.
Since relevant information for a given definition may not be entirely modeled by
predicates, we complement our top-down approach with data-driven techniques adapted
from work in multi-document summarization. These techniques take advantage of
redundancy on the web to identify good definitional sentences. Using centroid-based
metrics and clustering, DefScriber finds similarities in documents that focus on a given
term and includes them in the response. These techniques allow us to include core
information in the definition even when we don't have a specific predicate to model its
semantic type. Lexical cohesion measurements allow us to reorder the selected sentences
to maximize the readability of the definition as a whole.
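As a concrete illustration of the data-driven side, the following minimal Python sketch
scores candidate sentences by similarity to the centroid of all sentences gathered for a
term, so that content repeated across source documents rises to the top. All names are
illustrative assumptions; this is not DefScriber's actual code.

    # Minimal sketch of centroid-based sentence selection (illustrative only).
    from collections import Counter
    from math import sqrt

    def vectorize(sentence):
        """Bag-of-words term-frequency vector for one sentence."""
        return Counter(sentence.lower().split())

    def cosine(v1, v2):
        dot = sum(v1[t] * v2[t] for t in v1 if t in v2)
        norm = sqrt(sum(c * c for c in v1.values())) * sqrt(sum(c * c for c in v2.values()))
        return dot / norm if norm else 0.0

    def centroid_rank(sentences):
        """Rank sentences by similarity to the centroid of all candidates, so that
        information repeated across the source documents scores highly."""
        vectors = [vectorize(s) for s in sentences]
        centroid = Counter()
        for v in vectors:
            centroid.update(v)
        scored = sorted(zip(vectors, sentences),
                        key=lambda vs: cosine(vs[0], centroid), reverse=True)
        return [s for _, s in scored]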
3
05/02/03
Insights into Multilingual and Multimedia Question Answering
Gary Ciany*, Anita Kulman‡, Patrick Schone‡, Carol Van Ess-Dykema‡
(*Dragon Development, ‡ U.S. Department of Defense)
Until recently, question answering systems have focused on extracting answers to
factoid-style questions from a single document contained in a collection of authored
English newswire texts. However, from the point of view of an Intelligence Community
user, such systems are far too limited to successfully respond to the range of data types
and question diversity that occur in an analyst’s everyday experience. These challenges
include processing multi-agency data collections comprising multilingual, multi-genre,
and multimedia documents; allowing questions whose answers can only be found from
merging information from multiple documents or multiple languages; and questions that
make reference to individual files in addition to the data collection at large. Our system
is being developed with these challenges in mind. Our prototype system is being
designed to answer questions with English or Spanish queries and data, where the data
collections are derived from TREC Newswire, reference and errorful transcripts from
CallHome and Switchboard telephone conversations, and other data sources.
In our presentation, we will provide a number of insights that we have gained as we have
proceeded in our system development. We present these insights with the goal of
motivating the AQUAINT community to develop systems that respond to these user
requirements, as well as developing modules that can be integrated into our and other
AQUAINT systems. In particular, we will share our observations in response to the
following technical issues:
• How do TREC-style questions differ from those that might be presented by an
intelligence analyst? For example, an analyst might ask the question “Who is
speaking?” This kind of question cannot typically be answered by a TREC-style
QA system and requires integrating metadata into the knowledge sources
accessed by the system.
• How does question answering on newswire data differ from that on human transcripts
of conversational speech? For example, if parsing is needed to answer the question,
what degradation does one experience when tools that were developed for newswire
are applied to transcripts?
• If one further needs to process automatic, errorful transcripts of conversational
speech, how does this affect the development of the overall system? For example, if
the error rate of a transcript becomes too high, will a QA system that works well for
newswire even have any value on the errorful data?
• What changes are needed to extend a QA system from one that processes English to
one that processes Spanish? Are these same changes adequate for extending to
another language like Arabic? Do the question structures change? Do the tools need
to be modified?
• What issues might one experience when the multilingual corpora are combined into
a single conglomerated corpus? For example, an analyst may need to process
documents in more than one language that have not been partitioned by language.
Predicting Question Quality
Bruce Croft and Stephen Cronen-Townsend
Department of Computer Science
University of Massachusetts Amherst
We develop a method for predicting the quality of passage retrieval in question
answering systems. Since high-quality passages form the basis for accurate answer
extraction, our method would naturally extend to prediction of an entire system's
effectiveness at extracting a correct answer for each given question. Such predictions of
question performance may lead to ways of automatically improving questions or guiding
users in improving them.
Building on previous work on predicting the performance of queries for document
retrieval, we compute the clarity score for questions using passage-based collections. We
show that this score is correlated with average precision in a TREC-9 based system,
break down the correlation by question type, and discuss example questions. We also
study a more general set of queries extracted from a Web log to help make the case for
the general usefulness of performance prediction based on question clarity scores.
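For illustration, the sketch below computes a clarity-style score as the relative entropy
of a question's language model with respect to the collection model, both assumed to be
given as word-to-probability dictionaries. The estimation of those models follows the
cited prior work and is not reproduced here; this is only a hedged sketch of the scoring
step.

    # Illustrative clarity-style score: KL divergence of the question model from the
    # collection model. Model estimation (from retrieved passages) is assumed done elsewhere.
    from math import log2

    def clarity_score(question_model, collection_model, floor=1e-10):
        """Higher values indicate a question whose language diverges sharply from the
        collection as a whole, which the work above relates to retrieval quality."""
        return sum(p * log2(p / max(collection_model.get(w, 0.0), floor))
                   for w, p in question_model.items() if p > 0)

    print(clarity_score({"hajj": 0.6, "ritual": 0.4},
                        {"hajj": 0.001, "ritual": 0.002, "the": 0.05}))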
Clarity scores may also help predict when it will be effective to expand a question with
related terms. Preliminary results with an approach that calculates clarity improvements
after expansion show that it may be possible to improve answer passage retrieval.
Dynamic and Probabilistic Inference
Jerome Feldman
ICSI – Berkeley
For advanced question answering, we need to compute information that is not explicitly
in the text, but can be inferred from it. A core task of the Quasi effort has been the
development of advanced inference algorithms. Our previous work showed how separate
dynamic and probabilistic methods can yield information on both the possible causes and
potential consequences of an event. The new result is that we now have a unified
methodology for both modes of inference and this appears to significantly advance the
state of the art.
Answer Spotting: Finding Answers in Conversational Speech
Herb Gish
BBN Technologies
Conversational speech corpora represent a unique and vitally important source of
information for analysts in accomplishing their mission. In our presentation we will
emphasize how an analyst may query and explore a conversational speech corpus using
the Speech Navigator, our demo Answer Spotting system. Before processing, a speech
corpus has little structure, and answers to questions that may be contained in the corpus
can only be obtained by tedious, unguided listening to the audio files. Previously we
have shown that an analyst, by providing a small amount of transcribed and annotated
data to the system, can create a structure in a speech corpus that makes it amenable to
queries concerning the domains of interest to the analyst. In our latest work we have
shown that significant structure can be incorporated into a speech corpus without the
need for any transcriptions. In this no-transcription regime, the analyst either gives the
system preferences for certain types of conversations, based on examples, or allows the
system to self-organize the speech corpus and queries it in an exploratory, data-mining
mode.
In our presentation we will briefly describe the underlying speech technologies that we
have developed but our primary emphasis will be on the exploitation of these
technologies by the analyst through the use of our Speech Navigator. The Speech
Navigator, through the use of audio and visual tools, facilitates the analyst’s querying the
corpus of interest. We will demonstrate how an analyst might use these tools.
Improved Semantic Role Parsing
Kadri Hacioglu, Sameer Pradhan, Valerie Krugler, Steven Bethard,
Ashley Thornton, Wayne Ward, Dan Jurafsky, and James Martin
University of Colorado, Boulder
One of our core technologies is our semantic role parser, which annotates input sentences
with the roles played by constituents relative to target verbs. We use a set of 22 Thematic
Roles such as Agent, Manner, Theme, Reason, etc. We report on the performance of
three systems developed for semantic annotation of text.
1) Our baseline system was developed from the design of [Gildea & Jurafsky 2002]. This
system first uses the Charniak [Charniak 2001] syntactic parser to identify syntactic
constituents, and then labels each constituent with a Thematic Role (including NULL).
The system estimates posterior probabilities of role assignments for the constituents
given sets of features and combines the estimates to assign the final role labels. Using
training and test sets from the PropBank [Kingsbury & Palmer 2002] corpus, this initial
system achieved a performance of 57% precision and 70% recall. After a number of
improvements to this system (such as clustering target verb classes) the precision and
recall rose to 69% / 74%.
2) We then developed a new classifier based on Support Vector Machines. As in the
baseline system, a syntactic parse is generated by the Charniak parser, and each
constituent is classified. In this system, the role classification is done by an SVM. Using
the same training and test sets as the previous system, the SVM classifier achieves
precision and recall of 77% / 82%. This is by far the highest performance ever reported
for this semantic parsing task.
3) We also experimented with a different semantic parsing algorithm based on SVMs that
treats the problem as a chunking task. Rather than use a syntactic parser to identify
constituents that are then classified, the SVM is used to both segment and label the
semantic roles. A chunk is the sequence of words that fills a semantic role. This work
extends previous work [Kudo and Matsumoto, 2000] which used SVMs to do syntactic
chunking. We have not yet evaluated this system on the PropBank corpus, but initial
evaluations on the FrameNet corpus are very encouraging. The potential advantages of
this system are that it is very efficient and does not require a separate syntactic parser.
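To make the constituent-by-constituent classification of system (2) concrete, here is a
minimal sketch that labels parse-tree constituents with thematic roles using an SVM over
Gildea & Jurafsky-style features. The feature set, toy data, and use of scikit-learn are
assumptions for illustration, not the system's actual implementation.

    # Illustrative SVM role labeling of syntactic constituents (toy data).
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.svm import SVC

    def constituent_features(phrase_type, head_word, path_to_verb, position, voice):
        """Features for one constituent relative to the target verb."""
        return {"phrase_type": phrase_type, "head_word": head_word,
                "path": path_to_verb, "position": position, "voice": voice}

    # Tiny hand-made training set: each constituent is labeled with a role (or NULL).
    train_X = [constituent_features("NP", "board", "NP>S<VP", "before", "active"),
               constituent_features("NP", "seat", "NP<VP", "after", "active"),
               constituent_features("PP", "in", "PP<VP", "after", "active")]
    train_y = ["Agent", "Theme", "NULL"]

    vec = DictVectorizer()
    clf = SVC(kernel="linear").fit(vec.fit_transform(train_X), train_y)
    test = constituent_features("NP", "board", "NP>S<VP", "before", "active")
    print(clf.predict(vec.transform([test])))   # -> ['Agent']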
References:
Eugene Charniak. 2001. Immediate-head parsing for language models. In Proceedings of
the 39th Annual Conference of the Association for Computational Linguistics (ACL-01),
Toulouse, France.
Daniel Gildea and Daniel Jurafsky. 2002. Automatic labelling of semantic roles.
Computational Linguistics, 28(3): 245-288
Paul Kingsbury and Martha Palmer. 2002. From TreeBank to PropBank. In Proceedings
of the 3rd International Conference on Language Resources and Evaluation (LREC 2002),
Las Palmas, Canary Islands, Spain.
Taku Kudo and Yuji Matsumoto. 2000. Use of support vector learning for chunk
identification. In Proceedings of the 4th Conference on Very Large Corpora, pages
142-144.
Natural Language Querying of the Semantic Web
David Israel
SRI International
ASCS (the Agent Semantic Communications Service) is a search engine for the Semantic
Web. Developed by Teknowledge, Inc., it searches the entire Web and indexes all pages
encoded in DAML+OIL, a markup language that grew out of the DAML (DARPA Agent
Markup Language) and other research programs.
ASCS allows an interlocutor to make precise queries for information expressed in any of
those pages. ASCS also supports certain kinds of simple inference; for instance, queries
can be broadened or relaxed. Although it provides a graphical user interface, ASCS can
also be used directly by web-based agents to support semantic search and ontology
translation. There is, however, a significant barrier to the use of ASCS: you have to be
familiar with logic and DAML+OIL to use it. In collaboration with Teknowledge, we
have integrated ASCS into Quark, our AQUAINT system, so that ASCS can be
interrogated by posing natural-language queries.
Quark employs a human-language parser, Gemini, to translate English queries into a
logical form. This form is phrased as a conjecture to SNARK, an automatic theorem
prover. In the light of the knowledge in its application-domain theory, SNARK
transforms the query and decomposes it into subqueries, which are themselves further
decomposed into sub-subqueries, and so on. If an appropriate combination of these
subqueries can be
answered, the proof is complete. By means of an answer-extraction mechanism, SNARK
will deduce or compute an answer to the original query from answers to the solved
subqueries.
SNARK has a procedural-attachment mechanism, which enables us to link symbols from
its theory to external procedures, including web-based knowledge sources such as ASCS.
The effect of this is to allow information possessed by the linked source to be provided to
SNARK on demand to answer a subquery, while the proof is still in progress, just as if
that information were part of SNARK's theory.
We have experimented with using ASCS to query the CIA World Factbook, since much
of the Factbook has been translated into DAML and made available on the Web. For
example, suppose we have a query "Find the capital of an Islamic country that borders
Afghanistan." This is translated into a conjecture, which posits the existence of the
capital of such a country. The conjecture is submitted to SNARK's inference procedure.
It is transformed and decomposed into subqueries that involve symbols such as "borders",
"religion" and "capital". For each of these symbols, we have introduced procedural
attachments to ASCS. Thus, from a subquery
borders(afghanistan, ?country)
(What is a country that borders Afghanistan?) ASCS returns as one answer
borders(afghanistan, pakistan),
which tells us that Pakistan is a country that borders Afghanistan.
Similar queries to ASCS tell us that the religion of Pakistan is principally Muslim (which
SNARK knows implies Islamic), and that the capital of Pakistan is Islamabad. This is the
answer passed back to the interlocutor by the answer-extraction mechanism.
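The toy Python sketch below mimics the procedural-attachment idea: when the prover
reaches a subquery whose predicate is linked to an external source, that source is queried
on demand and its bindings flow back into the proof. The facts and function names are
illustrative stand-ins for ASCS and the Factbook, not the actual SNARK mechanism.

    # Toy illustration of answering "the capital of an Islamic country that borders
    # Afghanistan" via procedural attachments to an external source (illustrative facts).
    EXTERNAL = {
        "borders":  [("afghanistan", "pakistan"), ("afghanistan", "tajikistan")],
        "religion": [("pakistan", "muslim"), ("tajikistan", "muslim")],
        "capital":  [("pakistan", "islamabad"), ("tajikistan", "dushanbe")],
    }

    def answer(predicate, first_arg):
        """Answer a subquery such as borders(afghanistan, ?x) by calling the attached source."""
        return [b for a, b in EXTERNAL[predicate] if a == first_arg]

    for country in answer("borders", "afghanistan"):
        if "muslim" in answer("religion", country):     # Factbook: religions include Muslim
            print(answer("capital", country)[0])        # -> islamabad, then dushanbe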
Note that the use of SNARK allows us to perform inferences that may go beyond what
ASCS could do itself; thus the query refers to Islamic countries, but the Factbook prefers
to speak of countries whose religions include Muslim.
Further answers to the question can be obtained by asking "Are there any others?" which
reactivates SNARK to produce alternative proofs. Thus, we can get another answer to
the same question, Dushanbe, which is the capital of Tajikistan, another Islamic country
that borders Afghanistan. By repeated probing we can get all the cities known to the
Factbook and
other sources that satisfy the condition.
In addition to inference and natural-language querying, by integrating
ASCS into Quark we obtain the ability to cooperate with other external knowledge
sources in finding and presenting answers. For example, we can construct
three-dimensional terrain visualizations (from satellite imagery via TerraVision) and
display specialized maps (via NIMA's Geospatial Engine or Generic Mapping Tools). We
can invoke the
Alexandria Digital Library Gazetteer (about 6 million pages of geographic data), the
TextPro information-extraction engine, and other sources to search for knowledge not
available through ASCS, and combine it with ASCS information in answering a query.
For instance, when Quark answers the query "Find a cave within 50 miles of an airport
that is south of the capital of Afghanistan," the Factbook (via ASCS) provides the capital,
the ADL Gazetteer finds the airport and cave, and another source computes the distances.
Data Fusion for Advanced Question-Answering
Paul Kantor
HITIQA Project
University at Albany/Rutgers University
Data fusion enters advanced question-answering in many ways. Our project has so far
focused on its use in enriching the set of documents from which answers are to be
synthesized or extracted. It is thus an application of data fusion in Information Retrieval.
We have worked with three systems: Lemur, Smart, and InQuery. At present, two of
them, Smart and InQuery, are included in the HITIQA system. We have found that it is
not generally possible to build good linear fusion rules that will work across a broad
range of topics. However, we have found that fusion rules tuned to a specific topic
almost always produce improvement, and sometimes significant improvement, over the
best of the individual systems.
We believe that in the target situation, where analysts deal with incoming streams of
information, this “topic-specific” or “localized” data fusion is entirely appropriate. We
find, by exploratory analysis of the data, that there is substantial potential for improving
our performance even further through the use of nonlinear rules such as logical fusion
rules, Support Vector Machines, and nonlinear classifiers of other types. We are also
investigating whether the type of fusion rule can be correlated, a priori or even a
posteriori, with features of the topic itself.
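A minimal sketch of topic-specific linear fusion follows, assuming each retrieval system
returns a dictionary of document scores and that the per-topic weights have been tuned on
held-out judgments. The weights and data are placeholders, not HITIQA's tuned values.

    # Illustrative linear score fusion over two retrieval systems for one topic.
    def fuse(scores_a, scores_b, w_a, w_b):
        """Combined score is a weighted sum of max-normalized per-system scores."""
        def normalize(scores):
            top = max(scores.values())
            return {d: s / top for d, s in scores.items()} if top else scores
        a, b = normalize(scores_a), normalize(scores_b)
        docs = set(a) | set(b)
        return sorted(((w_a * a.get(d, 0.0) + w_b * b.get(d, 0.0), d) for d in docs),
                      reverse=True)

    # Weights tuned for this topic might favor the first system.
    print(fuse({"d1": 2.0, "d2": 1.0}, {"d2": 0.9, "d3": 0.4}, w_a=0.7, w_b=0.3))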
Ontology-based Multi-modal User Interface for Question Answering
in MOQA
Tanya Korelsky
CoGenTex, Inc.
This report focuses on work by CoGenTex, Inc. under the project “Meaning-Oriented
Question Answering with Ontological Semantics” (MOQA, in collaboration with NMSU
CRL and UMBC ILIT). Since the start of the project in August 2002, we have designed
and implemented an innovative web-based multi-modal question-answering interface to
MOQA’s fact repository. The first version of MOQA concentrates on answering
questions about people, organizations, contacts, and travel. The presentation of the search
results includes tables, maps, a timeline, and graphs representing social networks and
various types of contacts, in addition to textual summaries. All these modes are
interconnected by hyperlinks, providing the analyst with comprehensive support for
intuitive browsing and follow-up search.
One of the distinguishing features of the interface is the use of natural language
generation for textual summaries and for query validation. Currently, the analyzed
queries are automatically paraphrased back to the user using a disambiguated form of
language, with terms from the fact repository and resolved references, informing the user
of the system’s interpretation of the query. Such paraphrases are a crucial feature of
ontology-based query clarification and repair dialogs. Textual summaries of search
results complement graphics and tables by highlighting salient facts and events and
clustering data in meaningful ways. Since summaries are in hypertext, they enable data
exploration via “drilldown”.
The MOQA question-answering interface is currently fully functional. It is designed to be
portable between subject domains with minimal effort. The implementation supports
portability by using XML-based technology and domain-independent generic text plans.
Towards Light Semantic Processing for Question Answering
Eric Nyberg
Carnegie Mellon University
This presentation focuses on a lightweight knowledge-based reasoning framework which
is currently being implemented for the JAVELIN QA system. The question is mapped
into a logical predicate representation, which is grounded on the lexical labels provided
by the parser. Passages which are judged relevant to the question are also parsed into the
same logical representation. These two representations are matched using a flexible
unification strategy which assigns a match score for partial matches between
representations. The passages with the best match are selected as answer candidates. At
the level of individual terms (atoms), unification is based on the output of a similarity
function. The similarity function can be based on semantic similarity (e.g., calculated by
searching WordNet) or on statistical models of term similarity trained on large corpora.
The predicate representation and unification algorithm are implemented separately from
the similarity metric, so that different metrics can be compared empirically.
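The following sketch illustrates the flexible matching idea: question and passage
predicates are aligned term by term, and partial matches are scored with a pluggable
term-similarity function, so that a WordNet-based or corpus-trained metric can be swapped
in without changing the matcher. The stub similarity function and predicate format are
assumptions, not JAVELIN's actual representation.

    # Illustrative scored matching of predicate tuples (name, arg1, arg2).
    def term_similarity(a, b):
        """Stub metric: exact match scores 1.0, a toy near-synonym pair scores 0.5.
        A WordNet- or corpus-based metric would be plugged in here instead."""
        near_synonyms = {frozenset(("purchase", "buy")), frozenset(("company", "firm"))}
        if a == b:
            return 1.0
        return 0.5 if frozenset((a, b)) in near_synonyms else 0.0

    def match_score(question_preds, passage_preds):
        """Average, over question predicates, of the best partial match in the passage."""
        if not question_preds or not passage_preds:
            return 0.0
        best_matches = [max(sum(term_similarity(x, y) for x, y in zip(q, p)) / len(q)
                            for p in passage_preds)
                        for q in question_preds]
        return sum(best_matches) / len(best_matches)

    print(match_score([("buy", "ibm", "company")], [("purchase", "ibm", "firm")]))   # ~0.67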
Evaluating the Habitability of Q&A with User-Generated Tasks
W. Ogden, J. McDonald, R. Zacharski, P. Bernick & R. Chadwick
Computing Research Laboratory
New Mexico State University
Ultimately, the goal of evaluating question and answering (Q&A) systems, and
information retrieval (IR) systems in general, is to discover ways to improve the
technology and to make the systems more useful. Traditionally, however, researchers
have sought to develop methodologies and metrics that are better suited to compare
systems than to identify ways of improving them. These methodologies have proven to
be extremely unproductive for evaluating the usability of interactive information-retrieval
systems with “real” users.
In this talk we discuss the application of ‘comparison-based’ evaluation methodologies to
the evaluation of interactive Q&A and search interfaces and show how using controlled
tasks to represent information needs may be one source of the problems with these
methods (e.g., determining user motives, judging answer completeness). For example, we
have discovered that real information needs are often different from the needs expressed
in the original question. We also discuss why comparison-based methods do not provide
the intended control necessary to compare systems. We have begun to evaluate Q&A
systems and search interfaces using ‘user-generated’ information needs and will discuss
how this approach more directly addresses the goal of improving the habitability of Q&A
systems. We will describe how this methodology has been used in our initial evaluation
of Language Computer Corporation’s web Q&A system. Furthermore, we will discuss
how to evaluate the advantages and disadvantages of a natural language interface and
how to identify ways to improve these interfaces so users will be more productive and
satisfied.
Automatic Construction of Semantic Hierarchies
Rion L. Snow
HNC Software
In the past, construction of semantic hierarchies (for example, WordNet) has been
performed manually, requiring great expenditure of human effort and language expertise.
We present a completely automated technique for creating hierarchies of nested word
groupings according to semantic content. This domain- and language-independent
operation requires no prior knowledge about the vocabulary or grammar of the language,
and thus we can create both general and domain-specific semantic hierarchies in many
languages directly from untagged corpora. We will present multiple examples of
automatically-generated semantic hierarchies using the AQUAINT newswire corpus, and
of more domain-specific hierarchies using the CNS corpus.
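To illustrate the general approach of building hierarchies directly from untagged text,
the sketch below clusters words agglomeratively by the similarity of their co-occurrence
vectors, producing a nested grouping. This is a generic illustration under stated
assumptions, not HNC's algorithm.

    # Illustrative agglomerative clustering of words by co-occurrence (untagged text in,
    # nested-tuple hierarchy out). Window size and similarity measure are arbitrary choices.
    from collections import Counter, defaultdict
    from math import sqrt

    def cooccurrence_vectors(sentences, window=2):
        vectors = defaultdict(Counter)
        for sent in sentences:
            words = sent.lower().split()
            for i, w in enumerate(words):
                for c in words[max(0, i - window):i + window + 1]:
                    if c != w:
                        vectors[w][c] += 1
        return vectors

    def cosine(v1, v2):
        dot = sum(v1[t] * v2[t] for t in v1 if t in v2)
        norm = sqrt(sum(x * x for x in v1.values())) * sqrt(sum(x * x for x in v2.values()))
        return dot / norm if norm else 0.0

    def build_hierarchy(vectors):
        """Repeatedly merge the two most similar clusters into a nested tuple."""
        clusters = [(w, v) for w, v in vectors.items()]
        while len(clusters) > 1:
            pairs = [(cosine(clusters[i][1], clusters[j][1]), i, j)
                     for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
            _, i, j = max(pairs)
            merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
            clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        return clusters[0][0]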
CNS Knowledge Base
Barbara Starr
SAIC
A recently initiated by-product of the SAIC AQUA program is a sharable ontology and
knowledge base for the AQUAINT program. SAIC is spearheading a federation of
developers to develop an ontology and populate knowledge bases by extraction from
textual data provided by the Center for Nonproliferation Studies (CNS). Federation
participants are Stanford KSL, Xerox PARC, Battelle, IBM, and other interested parties. The
intent of this federation is to make the CNS ontology available to all AQUAINT
participants.
The initial source of terms for the CNS ontology is the CNS Verity TopicSet, which is
organized as a hierarchy compatible with the Verity search engine, not as a formal
ontology. Below the upper categorical levels the relations are word associations, phrase
decompositions, exemplary instances, and aliases. Topic sets include: treaties,
non-proliferation organizations, dual-use materials, nuclear facilities and component
technologies, chemical and biological weapons, etc.
Beginning with a base of existing ontologies, including the HPKB-Upper-level-kernel-latest,
Y2-PQ, SAIC-Merged, and World-Fact-Book, we are augmenting these with CNS
topic set elements, not previously included, to create the CNS-ontology. We have also
drawn from other sources, such as the Terrorist Knowledge Base (TKB), developed under
DARPA’s High-Performance Knowledge Base (HPKB) program, and continued in
support of their Rapid Knowledge Formation (RKF) program. The Teknowledge LGPL
WMD ontology is also under consideration.
Three knowledge base segments are to be maintained in KIF, resident in the
Stanford/KSL Ontolingua Server. (A move towards DAML for information storage is
also under consideration, as this would enable use of the JTP DAML reasoner where
possible. Restrictions due to the lack of expressivity in DAML apply.) Three different
extractors will be used to populate the possibly overlapping segments. The first
extraction is provided by the NMSU/UMBC/Onyx MOCA system, and then processed by
the SAIC mapper/translator to produce a KIF ontology. KSL is developing the second
extractor, which works on formatted and semi-formatted data. Third, the shared ontology
is being made available to IBM, who will in turn be performing information extraction of
relations over the CNS data.
The final product will be an expanded ontology, containing knowledge on terrorist
groups and acts and non-proliferation issues, which would be of value to researchers
involved with such concerns, such as ARDA’s Novel Intelligence from Massive Data
(NIMD) program and DARPA’s TKB and Total Information Awareness (TIA) programs,
as well as AQUAINT participants. Browsing the CNS Ontology will be available during
the demonstrations.
A Hybrid Approach to Answering Biographical/Definitional Questions
Ralph Weischedel, Jinxi Xu, Ana Licuanan
BBN Technologies
This paper focuses on our approach to generating extended answers to
biographical/definitional questions. The approach combines the following components:
• Information retrieval, to judge relevance of source passages/sentences,
• Information extraction, e.g., to find all the mentions of a person, to find relations
appropriate for social network analysis, to find organizational positions and titles,
etc.,
• Linguistic analyses to find important descriptions, e.g., from appositive
constructions, copula (“be”, “become”) clauses, and relative clauses where the
target entity is the focus, and
• Summarization (compression) of the information.
Each of the components can and will be improved.
This paper will show the contribution of each component separately, through example
answers, as well as the result of the hybrid of these technologies.
Though there is no established evaluation metric yet for this class of questions, we
will also report on the contribution of each component using the BLEU scorer, previously
used in machine translation evaluations, as well as our subjective evaluation.
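As a hedged illustration of the evaluation step, the sketch below scores a generated
answer against reference answers with BLEU, using NLTK's implementation; the authors'
actual scorer, reference data, and tokenization are not specified here.

    # Illustrative BLEU scoring of a generated answer against reference answers.
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    references = ["the first president of the united states".split(),
                  "george washington was the first u.s. president".split()]
    candidate = "george washington was the first president of the united states".split()

    score = sentence_bleu(references, candidate,
                          smoothing_function=SmoothingFunction().method1)
    print(f"BLEU against reference answers: {score:.3f}")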
Negative Pseudo-Relevance Feedback in Content-based Video Retrieval
Rong Yan, Alexander Hauptmann and Rong Jin
Informedia Project
Carnegie Mellon University
Pittsburgh, PA 15213
Video information retrieval requires a system to find a visual answer to a question which
may be represented simultaneously in different ways through a text description, audio,
still images and/or video sequences. We present a novel approach that uses
pseudo-relevance feedback from retrieved answers that are NOT similar to the query items
without requiring further user feedback. We provide insight into this approach using a
statistical model and suggest a score combination scheme via posterior probability
estimation.
An evaluation on the 2002 TREC Video Track queries shows that this technique can
improve video retrieval performance on a real collection. Negative pseudo-relevance
feedback shows great promise for very difficult multimedia retrieval tasks, especially
when combined with other different retrieval algorithms.
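For illustration, the sketch below applies the negative-feedback idea in its simplest form:
items at the bottom of the initial ranking are treated as negative examples, and shots
similar to that negative set are demoted. The linear demotion here is a stand-in for the
posterior-probability combination described above; names and weights are assumptions.

    # Illustrative re-ranking with negative pseudo-relevance feedback (toy data).
    def rerank_with_negative_feedback(initial_scores, similarity_to_negatives, alpha=0.3):
        """initial_scores: shot id -> retrieval score;
        similarity_to_negatives: shot id -> similarity to the bottom-ranked (negative) set."""
        return sorted(((score - alpha * similarity_to_negatives.get(sid, 0.0), sid)
                       for sid, score in initial_scores.items()), reverse=True)

    # A shot that resembles the known-bad bottom of the ranking is demoted.
    print(rerank_with_negative_feedback({"shot1": 0.90, "shot2": 0.85, "shot3": 0.40},
                                        {"shot2": 0.70}))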