Evaluating the Contribution of EuroWordNet and Word Sense Disambiguation to Cross-Language Information Retrieval

Paul Clough (1) and Mark Stevenson (2)

(1) Department of Information Studies

(2) Department of Computer Science

University of Sheffield, UK

GWC2004

Outline

• Introduction

• Word sense disambiguation

• Experimental setup

• CLIR evaluation

• WSD evaluation

• Discussion and conclusion

20th January 2004

Introduction

CLIR – search for documents written in one language (target) with queries written in another (source)

Approaches – translate query, documents or both

Translation methods – e.g. MT, MRDs, parallel corpora, controlled vocabulary

Problems – e.g. lexical coverage, ambiguity, small context, proper names, compound words

WSD – to identify the correct sense of a word during translation

Experiments – with EuroWordNet and “standard” IR test collection resources


Example translation

<num> Number: CL1

Source query: <S-title> Caso Waldenheim

Disambiguation needed?

EuroWordNet

caso#1 --> [case#9:grammatical case#1:](4167794)

"nouns or pronouns or adjectives (often marked by inflection) related in some way to other words in a sentence"

caso#2 --> [case#12:instance#2:](4704301)

"an occurrence of something; "it was a case of bad judgment""

caso#3 --> [case#16:event#2:](8533655)

"a special set of circumstances; "in that event, the first possibility is excluded""

Target query: case (event) Waldenheim


Word sense disambiguation

• Each Spanish noun can be associated with multiple synsets; in addition, each of these can be mapped to multiple synsets in the ILI (English WN)

• Attempt to automatically identify the EuroWordNet synset appropriate to the query using WSD

• Adapt Resnik’s algorithm for disambiguating groups of nouns:

– Treats EuroWordNet as a hierarchy and identifies most likely synsets based on distance in WordNet and corpus information

– Query is treated as a “bag of words”
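The core idea behind Resnik's noun-group disambiguation can be sketched as follows. This is a minimal illustration, not the adaptation used in these experiments: the hierarchy, the information-content (IC) values, the sense names and the function names are all hypothetical toy data. For each pair of words the sketch finds the most informative common ancestor of any two senses, then credits that ancestor's IC to every sense it subsumes.

```python
from itertools import combinations

# Toy hypernym hierarchy (child -> parent). Hypothetical, not EuroWordNet.
PARENT = {
    "entity": None,
    "legal_event": "entity",
    "linguistic_category": "entity",
    "attempt": "entity",
    "case_lawsuit": "legal_event",
    "trial_hearing": "legal_event",
    "case_grammatical": "linguistic_category",
    "trial_experiment": "attempt",
}

# Hypothetical information content; in practice estimated from corpus counts.
IC = {
    "entity": 0.0, "legal_event": 5.0, "linguistic_category": 6.0,
    "attempt": 4.0, "case_lawsuit": 8.0, "trial_hearing": 8.0,
    "case_grammatical": 8.0, "trial_experiment": 8.0,
}

# Candidate senses for each word in the "bag of words".
SENSES = {
    "case": ["case_lawsuit", "case_grammatical"],
    "trial": ["trial_hearing", "trial_experiment"],
}


def ancestors(concept):
    """Return the concept itself plus all of its hypernyms."""
    found = set()
    while concept is not None:
        found.add(concept)
        concept = PARENT[concept]
    return found


def most_informative_subsumer(word_a, word_b):
    """Highest-IC concept subsuming some sense of each word."""
    best, best_ic = None, -1.0
    for sense_a in SENSES[word_a]:
        for sense_b in SENSES[word_b]:
            for concept in ancestors(sense_a) & ancestors(sense_b):
                if IC[concept] > best_ic:
                    best, best_ic = concept, IC[concept]
    return best, best_ic


def disambiguate(words):
    """Score each sense of each word by the support it gets from the group."""
    support = {w: {s: 0.0 for s in SENSES[w]} for w in words}
    norm = {w: 0.0 for w in words}
    for word_a, word_b in combinations(words, 2):
        subsumer, ic = most_informative_subsumer(word_a, word_b)
        if subsumer is None:
            continue
        for word in (word_a, word_b):
            for sense in SENSES[word]:
                if subsumer in ancestors(sense):
                    support[word][sense] += ic
            norm[word] += ic
    return {
        w: {s: (support[w][s] / norm[w] if norm[w] > 0 else 0.0)
            for s in SENSES[w]}
        for w in words
    }


scores = disambiguate(["case", "trial"])
```

In this toy example the legal senses of both words share the informative ancestor `legal_event`, so they receive all of the support. With only one or two nouns in a query there are few pairs to draw evidence from, which is one way to picture why short queries are hard to disambiguate.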


Experimental setup

TREC 6 collection (242,918 documents and 25 queries)

Spanish used for CL retrieval and English as monolingual baseline

Query translation process:

• term identification → term translation (EWN) → retrieval

EWN transformed into a kind of MRD for translation

Focused on translation of nouns and adjectives

Synset selection – manually, first, all or WSD algorithm

Synset member selection – head (first) or all

Experimented with short (title) and longer queries (title + description)
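The synset selection and synset member selection options above can be sketched as a small translation function. This is an illustration under stated assumptions: the lexicon holds only the three "caso" synsets from the example slide, the function and parameter names are hypothetical, and passing untranslatable terms (such as proper names) through unchanged and dropping repeated words are assumptions inferred from the example output.

```python
# Toy EWN-style lexicon: Spanish lemma -> ordered list of English synsets.
# Only "caso" is included, following the example slide.
LEXICON = {
    "caso": [
        ["case", "grammatical case"],
        ["case", "instance"],
        ["case", "event"],
    ],
}


def translate(query_terms, synset_strategy, member_strategy, chosen=None):
    """Translate query terms with the given selection strategies.

    synset_strategy: "first", "all", or "chosen" (manual or WSD output,
                     given as {lemma: [synset indices]} in `chosen`).
    member_strategy: "head" (first synset member) or "all".
    """
    translated = []
    for term in query_terms:
        synsets = LEXICON.get(term.lower())
        if synsets is None:
            # Assumption: terms missing from the lexicon are kept as they are.
            translated.append(term)
            continue
        if synset_strategy == "first":
            selected = synsets[:1]
        elif synset_strategy == "all":
            selected = synsets
        elif synset_strategy == "chosen":
            indices = (chosen or {}).get(term.lower(), [])
            selected = [synsets[i] for i in indices]
        else:
            raise ValueError("unknown synset strategy: " + synset_strategy)
        for synset in selected:
            members = synset[:1] if member_strategy == "head" else synset
            for word in members:
                if word not in translated:  # assumption: drop repeats
                    translated.append(word)
    return translated


query = ["Caso", "Waldenheim"]
first_head = translate(query, "first", "head")
all_all = translate(query, "all", "all")
manual = translate(query, "chosen", "all", chosen={"caso": [2]})
```

Here `first_head` is `["case", "Waldenheim"]`, `all_all` is `["case", "grammatical case", "instance", "event", "Waldenheim"]`, and `manual` is `["case", "event", "Waldenheim"]`.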


Example translation

<num> Number: CL1

Source query: <S-title> Caso Waldenheim

Disambiguation needed?

EuroWordNet

caso#1 --> [case#9:grammatical case#1:](4167794)

"nouns or pronouns or adjectives (often marked by inflection) related in some way to other words in a sentence"

caso#2 --> [case#12:instance#2:](4704301)

"an occurrence of something; "it was a case of bad judgment""

caso#3 --> [case#16:event#2:](8533655)

"a special set of circumstances; "in that event, the first possibility is excluded""

Target query by selection strategy:

• 1st sense, head: case, Waldenheim

• 1st sense, all words: case, grammatical case, Waldenheim

• all senses, head: case, Waldenheim

• all senses, all words: case, grammatical case, instance, event, Waldenheim

CLIR evaluation (title & description)

Measured MAP and relevant retrieved using trec_eval

Baseline: map = 0.3512, relevant retrieved = 979
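For readers unfamiliar with the measure, mean average precision (MAP) can be sketched as below. This is a simplified illustration, not a reimplementation of trec_eval: it ignores details such as ranking cutoffs and tie handling, and the document identifiers and function names are hypothetical.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average of the precision values at each rank where a relevant
    document appears, divided by the total number of relevant documents."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranking, relevant set) pairs,
    one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two hypothetical queries.
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6"], {"d6"}),                    # AP = (1/2) / 1
]
map_score = mean_average_precision(runs)
```

"Relevant retrieved" is the simpler count of relevant documents that appear anywhere in the returned rankings, summed over all queries.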

Synset selection   Synset members   Relevant retrieved   MAP
GOLD               All              890                  0.2823   (80% of monolingual)
GOLD               1st              676                  0.2459
All                All              760                  0.2203
All                1st              698                  0.2215
1st                All              707                  0.2158
1st                1st              550                  0.1994
WSD                All              765                  0.2534   (highest of the automatic methods; 72% of monolingual)
WSD                1st              579                  0.2073


CLIR evaluation (title only)

Baseline: map = 0.3355, relevant retrieved = 977

Synset selection   Synset members   Relevant retrieved   MAP
GOLD               All              890                  0.2823   (84% of monolingual)
GOLD               1st              676                  0.2459
All                All              760                  0.2203
All                1st              698                  0.2215
1st                All              707                  0.2158
1st                1st              550                  0.1994
WSD                All              765                  0.2534   (highest of the automatic methods; 76% of monolingual)
WSD                1st              579                  0.2073
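The "% monolingual" figures on the two CLIR evaluation slides appear to be each run's MAP divided by the monolingual baseline MAP. The short check below reproduces them from the reported values; the dictionary and function names are illustrative only.

```python
# MAP values as reported on the two CLIR evaluation slides.
BASELINE_MAP = {"title+description": 0.3512, "title": 0.3355}
RUN_MAP = {"GOLD/All": 0.2823, "WSD/All": 0.2534}


def percent_of_monolingual(run_map, baseline_map):
    """Cross-language MAP as a rounded percentage of monolingual MAP."""
    return round(100 * run_map / baseline_map)


percentages = {
    (query_type, run): percent_of_monolingual(run_map, baseline)
    for query_type, baseline in BASELINE_MAP.items()
    for run, run_map in RUN_MAP.items()
}
```

The results are 80 and 72 for title and description queries, and 84 and 76 for title-only queries, matching the annotations on the slides.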


WSD evaluation

• Manual annotation identifies a single correct sense for each noun; the WSD algorithm can return multiple senses

• Calculated two evaluation metrics:

– Relaxed: score 1 if the correct sense is identified; corresponds to the proportion of words for which the correct sense is included

– Strict: score 1/m if the correct sense is included among the m senses returned; indicates how many incorrect senses are returned

• “Choose first synset” used as naïve baseline
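The two metrics can be sketched as follows. The per-word scores follow the definitions above; averaging over all annotated words is an assumption based on the "proportion of words" wording, and the example words, sense numbers and function names are hypothetical.

```python
def score_word(returned_senses, correct_sense):
    """Return (strict, relaxed) scores for a single word."""
    if correct_sense in returned_senses:
        # Strict divides the credit among the m senses returned.
        return 1.0 / len(returned_senses), 1.0
    return 0.0, 0.0


def evaluate(predictions, gold):
    """Average strict and relaxed scores over all annotated words."""
    strict_total = 0.0
    relaxed_total = 0.0
    for word, correct_sense in gold.items():
        strict, relaxed = score_word(predictions.get(word, []), correct_sense)
        strict_total += strict
        relaxed_total += relaxed
    count = len(gold)
    return strict_total / count, relaxed_total / count


# Hypothetical annotations: word -> correct sense number.
gold = {"caso": 3, "banco": 1}
# Hypothetical WSD output: word -> list of returned sense numbers.
wsd_output = {"caso": [2, 3], "banco": [2]}
strict_score, relaxed_score = evaluate(wsd_output, gold)
```

Here the strict score is 0.25 and the relaxed score is 0.5. A method that always returns exactly one sense, such as the first-synset baseline, gets identical strict and relaxed scores, which is consistent with the baseline figures reported in the results.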


WSD evaluation

Language   Method   Strict   Relaxed
English    WSD      0.41     0.55
English    first    0.47     0.47
Spanish    WSD      0.44     0.55
Spanish    first    0.48     0.48

• WSD results are disappointing compared to state-of-the-art

• Limited context of queries seems to make disambiguation difficult

• BUT does not seem to affect CLIR results!


Discussion and conclusions

Disagreement over the usefulness of WSD for monolingual retrieval

WSD algorithms have to be accurate to be useful for retrieval; the IR algorithm performs a kind of disambiguation anyway

Our results suggest some WSD is better than none for CLIR using EWN as the translation resource, even with poor WSD performance

The WSD algorithm is well suited to CLIR, as it selects senses only when there is sufficient context

Experiments highlight limitations of EWN for CLIR: many types of useful semantic information are missing, and lexical coverage is limited


Future work

Experiment with different languages supported by EWN to see if results generalise

Experiment with different datasets (e.g. CLEF) and further bilingual pairs, e.g. English → Spanish.

Use advanced query construction techniques, e.g. the “synonym” operator to combine synset members
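One way to picture the synonym operator idea is to wrap the members of each selected synset in a single operator so that the retrieval system treats them as one term. The sketch below builds a query string in the style of the INQUERY/Indri structured query language (`#syn`, `#combine`, `#1`). The slides do not name a retrieval system, so the choice of syntax is an assumption, and the helper names are hypothetical.

```python
def format_term(word):
    """Single words are used as-is; multiword terms become exact phrases."""
    parts = word.split()
    if len(parts) == 1:
        return parts[0]
    return "#1( " + " ".join(parts) + " )"


def synonym_group(words):
    """Wrap alternative translations of one source term in a synonym operator."""
    if len(words) == 1:
        return format_term(words[0])
    return "#syn( " + " ".join(format_term(w) for w in words) + " )"


def build_query(groups):
    """Combine one synonym group per source-language query term."""
    return "#combine( " + " ".join(synonym_group(g) for g in groups) + " )"


# Translations of "Caso Waldenheim" using all members of two synsets.
structured_query = build_query([["case", "instance", "event"], ["Waldenheim"]])
```

This yields `#combine( #syn( case instance event ) Waldenheim )`. Compared with adding every translation as a separate term, grouping keeps a source term with many translations from dominating the query.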

Combine various WSD algorithms to improve their individual effectiveness

Improve the translation process based on EWN, e.g. identify phrases
