How to evaluate referring expressions from a hearer’s point of view?

Judith Masthoff and Kees van Deemter
University of Aberdeen

Evaluation of algorithms that generate referring expressions has usually involved assessing how closely the generated expressions resemble speakers’ utterances. However, recent research in psycholinguistics has emphasised that what speakers say is not necessarily optimal for hearers. It therefore seems worthwhile (e.g., in connection with many kinds of practical applications) to investigate how the generation of referring expressions (GRE) might be optimised for hearers. We have started doing this kind of research, and have conducted an experiment showing that controlled over-specification can benefit hearers by making the referent of a description easier to find.

This kind of research raises interesting questions. Firstly, what does it mean for a referring expression to be optimal for a hearer? In our experiment, we looked at efficiency, that is, how quickly a hearer could find the intended referent. However, we could have made other choices. For instance, we could have considered the hearer’s confidence in their resolution, their pleasure, or the personality they attributed to the speaker (e.g., how tedious, how trustworthy, how caring, how boring). Which aspect is most important depends on the communicative setting: in some settings accuracy is vital, in others efficiency matters more, and in still others users’ pleasure prevails. Secondly, for each aspect of optimality, how do we measure whether one referring expression is better than another?

References

Paraboni, I., van Deemter, K. and Masthoff, J. (2007). Generating Referring Expressions: Making Referents Easy to Identify. To appear in Computational Linguistics 33(2), June 2007.

Paraboni, I., Masthoff, J. and van Deemter, K. (2006). Overspecified reference in hierarchical domains: measuring the benefits for readers.
In 4th International Conference on Natural Language Generation (Sydney), Coling/ACL, pp. 55-62.