Applying Genetic Algorithms to Pronoun Resolution

From: AAAI-99 Proceedings. Copyright © 1999, AAAI. All rights reserved.
Donna K. Byron and James F. Allen
University of Rochester Department of Computer Science
P.O. Box 270226, Rochester NY 14627, U.S.A.
dbyron/[email protected]
Introduction
Many pronoun resolution algorithms work by calculating the
most salient candidate antecedent. However, many factors
affect salience, such as being the syntactic subject or the
most frequently mentioned item, and these factors must be
combined into an aggregate salience score. One technique is
to assign each factor a weight representing the amount by
which that factor affects overall salience, and to select the
candidate antecedent that accumulates the most weight.
Previous authors assigned weights heuristically (cf.
Mitkov 1998). By using a genetic algorithm to select the
weights, our program beats baseline techniques and can be
customized for each language domain.¹
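The weighted combination described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the factor names, the 0/1 factor values, the weights, and the toy candidates are all hypothetical.

```python
from typing import Dict

# Hypothetical weights: how much each salience factor contributes.
WEIGHTS: Dict[str, float] = {
    "is_subject": 2.0,
    "most_frequent": 1.5,
    "most_recent": 1.0,
}


def salience(factors: Dict[str, float], weights: Dict[str, float]) -> float:
    """Aggregate salience: weighted sum of one candidate's factor values."""
    return sum(weights[name] * value for name, value in factors.items())


def resolve(candidates: Dict[str, Dict[str, float]],
            weights: Dict[str, float]) -> str:
    """Return the candidate antecedent that accumulates the most weight."""
    return max(candidates, key=lambda c: salience(candidates[c], weights))


# Toy candidates for one pronoun (assumed values, not corpus data).
candidates = {
    "the manager": {"is_subject": 1, "most_frequent": 1, "most_recent": 0},
    "the client": {"is_subject": 0, "most_frequent": 0, "most_recent": 1},
}
best = resolve(candidates, WEIGHTS)
```

Changing the entries in WEIGHTS changes which candidate wins, which is why the choice of weights matters.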
General outline of the algorithm
For this study, each salience factor was implemented as an
independent module. The modules developed at this time
were inspired by a number of previous studies:
• Increase salience of candidate selected by Hobbs’ naive algorithm (Hobbs 1986)²
• Decrease salience of quoted speech (Kameyama 1998)
• Decrease salience of indefinite NPs (Mitkov 1998)
• Increase salience of first NP in sentence (Mitkov 1998)
• Decrease if in relative clause (Kennedy & Boguraev 1996)
• Decrease if in prepositional phrase (Mitkov 1998)
• Increase salience of subjects
• Increase salience of most recent candidate
Input to the program is the weight assigned to each module
and the vector of candidate antecedents. The vector of
weights is generated by the genetic algorithm, using random
numbers for the first generation and then standard mutate,
crossover, and replicate operations for subsequent
generations. Each individual’s fitness is the percentage of
pronouns resolved correctly. The initial population size is
fifteen; after each generation the five fittest individuals are
allowed to reproduce, and the run halts after twenty generations.
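The search loop described above might look like the sketch below. The population size, number of parents, and number of generations come from the text; everything else is an assumption made for illustration, including the initial weight range, the mutation scale, single-point crossover, the 50/50 choice between operators, and the toy data, in which each pronoun is a list of per-candidate module outputs plus the index of the correct antecedent.

```python
import random
from typing import Callable, List, Sequence, Tuple

Weights = List[float]
Example = Tuple[List[List[float]], int]  # (module outputs per candidate, gold index)

POP_SIZE = 15       # initial population size (from the text)
N_PARENTS = 5       # fittest individuals allowed to reproduce (from the text)
N_GENERATIONS = 20  # halting point (from the text)


def accuracy(w: Weights, data: Sequence[Example]) -> float:
    """Fitness: fraction of pronouns whose top-scoring candidate is correct."""
    correct = 0
    for candidates, gold in data:
        scores = [sum(wi * m for wi, m in zip(w, cand)) for cand in candidates]
        correct += scores.index(max(scores)) == gold
    return correct / len(data)


def mutate(w: Weights, rng: random.Random, scale: float = 0.5) -> Weights:
    """Perturb one randomly chosen weight (the scale is an assumption)."""
    child = list(w)
    child[rng.randrange(len(child))] += rng.uniform(-scale, scale)
    return child


def crossover(a: Weights, b: Weights, rng: random.Random) -> Weights:
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(fitness: Callable[[Weights], float], n_modules: int,
           seed: int = 0) -> Weights:
    rng = random.Random(seed)
    # First generation: random weights (the range is an assumption).
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_modules)]
                  for _ in range(POP_SIZE)]
    for _ in range(N_GENERATIONS):
        parents = sorted(population, key=fitness, reverse=True)[:N_PARENTS]
        population = [list(p) for p in parents]  # replicate the parents
        while len(population) < POP_SIZE:
            if rng.random() < 0.5:
                population.append(mutate(rng.choice(parents), rng))
            else:
                a, b = rng.sample(parents, 2)
                population.append(crossover(a, b, rng))
    return max(population, key=fitness)


# Toy data with two hypothetical modules; module 0 is the informative one.
data: List[Example] = [
    ([[1, 0], [0, 1]], 0),
    ([[0, 1], [1, 0]], 1),
    ([[1, 1], [0, 1]], 0),
    ([[0, 0], [1, 0]], 1),
]
best = evolve(lambda w: accuracy(w, data), n_modules=2)
```

Because the parents are copied unchanged into the next generation, the best fitness found so far never decreases in this sketch; whether the original system kept parents in this way is not stated in the text.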
¹ Copyright © 1999 American Association for Artificial Intelligence, all rights reserved. This material is based on work supported by USAF/Rome Labs contract F30602-95-1-0025, ONR grant N00014-95-1-1088, and Columbia Univ. grant OPG:1307. A more detailed version of this paper is available as URCS-TR 713.
² Hobbs’ algorithm was slightly modified to allow for the syntactic structure of Treebank trees (see Ge, Hale, & Charniak 1998).
Table 1: Pronoun resolution accuracy on the test corpus
Method                   Accuracy
Most recent candidate    47%
Modified Hobbs           67.8%
Genetic algorithm        69.1%
Experimental Results
Our evaluation corpus is 3900 sentences of Treebank text
(Marcus, Santorini, & Marcinkiewicz 1993) for which antecedents of definite pronouns were annotated (Ge, Hale, &
Charniak 1998). 70% of the corpus was used to train the
genetic algorithm; the remaining 30% was the test corpus.
Table 1 shows pronoun resolution accuracy for our three
experiments. The ‘most-recent-candidate’ module on its
own correctly resolved only 47% of pronouns. Hobbs’ algorithm,
which uses syntactic structure, raised accuracy to 67.8% and
performed best of all the modules when run in isolation.
The genetic algorithm correctly resolved 69.1%, a
slight improvement over Hobbs.
Using the same evaluation corpus, Ge et al. (1998) developed a probabilistic model that resolved 84.2% of singular,
third-person pronouns correctly. Two powerful predictors
from their study, mention counts and selectional restrictions,
were not included in our system. We plan to integrate those
factors as well as additional salience modules and calculations of non-coreference in future experiments. We also plan
to use a more sophisticated method of combining salience
weights into an overall score, using one of the many techniques available in the machine learning literature.
References
Ge, N.; Hale, J.; and Charniak, E. 1998. A statistical approach to anaphora resolution. In Proceedings of the Sixth
Workshop on Very Large Corpora.
Hobbs, J. 1986. Resolving pronoun reference. In Readings
in Natural Language Processing. Morgan Kaufmann.
Kameyama, M. 1998. Intrasentential centering: A case
study. In Walker, M.; Joshi, A.; and Prince, E., eds., Centering Theory in Discourse, 89–112. Clarendon, Oxford.
Kennedy, C., and Boguraev, B. 1996. Anaphora in a wider
context: Tracking discourse referents. In ECAI-96.
Marcus, M.; Santorini, B.; and Marcinkiewicz, M. 1993.
Building a large annotated corpus of English: The Penn
Treebank. Computational Linguistics 19(2):313–330.
Mitkov, R. 1998. Robust pronoun resolution with limited
knowledge. In Proceedings of ACL ’98, 869–875.