Information Processing & Management Vol. 34, No. 2/3, pp. 219-236, 1998
© 1998 Elsevier Science Ltd. All rights reserved. Printed in Great Britain
0306-4573/98 $19.00 + 0.00
PII: S0306-4573(97)00078-2

USERS' CRITERIA FOR RELEVANCE EVALUATION: A CROSS-SITUATIONAL COMPARISON

CAROL L. BARRY 1* and LINDA SCHAMBER 2

1 School of Library and Information Science, Louisiana State University, 267 Coates Hall, Baton Rouge, LA 70803, USA
2 School of Library and Information Sciences, University of North Texas, P.O. Box 311068, Denton, TX 76203, USA

(Received 1 May 1997; accepted 1 October 1997)

Abstract--This article takes a cognitive approach toward understanding the behaviors of end-users by focusing on the values or criteria they employ in making relevance judgments, or decisions about whether to obtain and use information. It compares and contrasts the results of two empirical studies in which criteria were elicited directly from individuals who were seeking information to resolve their own information problems. In one study, respondents were faculty and students in an academic environment examining print documents from traditional text-based information retrieval systems. In the other study, respondents were occupational users of weather-related information in a multimedia environment in which sources included interpersonal communication, mass media, weather instruments, and computerized weather systems. The results of the studies, taken together, provide evidence that a finite range of criteria exists and that these criteria are applied consistently across types of information users, problem situations, and source environments. © 1998 Elsevier Science Ltd. All rights reserved

1. INTRODUCTION

It has long been recognized that a wide variety of factors influence human information seeking and use behaviors in general and relevance judgments in particular. Although the most prominent factor typically suggested as affecting relevance judgments has been topical appropriateness of information, many others have been identified, including factors relating to characteristics of relevance judges, information representations, and information systems. It has also been recognized that many of these factors are reflected in users' own criteria for making relevance judgments. In recent years, several researchers have conducted empirical studies that attempt to help explain relevance evaluation behavior by describing criteria elicited directly from users. Two user criteria studies, by Barry (1993, 1994) and Schamber (1991a,b), resulted in detailed taxonomies of user criteria that are readily comparable. The methodologies of the two studies are quite similar, based on open-ended interviewing techniques and content analyses of the resulting data. However, the types of users, information formats and sources, and information use environments differ greatly between the two studies. In the Barry study, respondents were faculty and students in an academic environment examining printed, textual information. In the Schamber study, respondents were occupational users of weather-related information in a multimedia environment in which information sources included interpersonal communication, mass media, computer systems, and weather instruments.

*To whom all correspondence should be addressed. Tel.: 504-388-1468; Fax: 504-388-4581; e-mail: lsbary@unix1.sncc.lsu.edu.
One goal of the current research on user-defined relevance is to determine the extent to which there is a core set of user criteria that encompasses the many human, system, and situational factors that have been suggested as dimensions of relevance. The intent of this article is to synthesize the findings of these two studies as a first step toward identifying the criteria that seem to span information environments, as well as the criteria that seem to be more situationally specific.

2. LITERATURE REVIEW

Throughout the history of information science, various writers have expressed a call for research that focuses on understanding end-users as the ultimate assessors of the quality of information and of the systems and services that provide information. The central concept in these discussions has been relevance, manifested in a judgment of the quality of the relationship between a user's information problem and the information itself, or between representations of problems and information (e.g., requests and documents). Among factors that have been suggested as affecting relevance judgments are the knowledge level, cognitive state, perceptions and beliefs of the user; qualities of information such as topical appropriateness, recency, accuracy, and clarity; and situational factors such as time constraints and the effort and cost involved in obtaining information (Boyce, 1982; Cooper, 1971, 1973, 1978; Cuadra & Katter, 1967; Harter, 1996; MacMullin & Taylor, 1984; Marcus, Kugel & Benenfeld, 1978; Rees & Saracevic, 1966; Rees & Schultz, 1967; Saracevic, 1975, 1996b; Schamber, 1994; Schamber, Eisenberg & Nilan, 1990; Swanson, 1977, 1986, 1988; Wilson, 1973, 1978). Generally, the discipline has seen a shift away from a systems or mechanical term-matching view of relevance, to a view of relevance as a cognitive and dynamic process that involves all of the knowledge and perceptions that the user brings to the information problem situation.

Some authors have presented models of information behavior that emphasize the cognitive and dynamic aspects of relevance judgments. Taylor (1962, 1968, 1985, 1988) was one of the first authors to address this area, describing the information seeking process in terms of the user's state of readiness to receive information. Taylor suggests that factors such as educational background, familiarity with the subject area, and the user's intuitive sense of analogy all affect the user's state of readiness. In later years, Taylor developed a value-added model of information, which emphasizes the user's perceptions of the utility and value of information. Belkin (1980) and Belkin, Oddy and Brooks (1982) present a view of users' information needs as anomalous states of knowledge (ASK). They argue that information needs are, to some degree, nonspecifiable by users and that this inability to express precise needs resides in the cognitive states of users. Dervin (1983) developed the sense-making approach to information needs, which concentrates on how people bridge cognitive gaps or uncertainties in order to make sense of their world. Within the sense-making approach, users are seen as active participants in the relevance judgment process and it is assumed that all aspects of information seeking behavior are influenced by situational factors, which include users' knowledge levels, cognitive states, and perceptions of the world.
Harter (1992) presents a psychological theory of relevance, which focuses on the cognitive states of users and the dynamic nature of cognition. Cognitive approaches such as these have served as a conceptual foundation for the development of dozens of user-centered information seeking and use models that have been proposed in recent years. The newer models incorporate various aspects of cognitive perceptions, dynamic changes in behaviors during certain stages of information seeking and searching interactions, levels of context in information problem situations, and multiple information sources and formats (see Allen, 1996; Saracevic, 1996b; Schamber, 1994; Harter & Hert, 1997). It can be said that relevance assessment is implicit if not explicit in all such models, insofar as information seekers must make judgments in order to predict or determine whether information at hand will help resolve their information problems.

Some models place particular emphasis on the role of relevance in user behaviors. For example, in Park's (1992, 1993) model, relevance assessments involve multiple layers of interpretation within three contexts: internal (subject area knowledge, searching experience), external (stage of research, search goal, perceptions of search quality) and problem (document descriptions of the same or different problems that were useful for definitions, methodology, framework, etc.). Wang (1994) takes a decision-making approach in her model of the document selection process in which the user processes information from document elements (title, author), arrives at values of a number of criteria (subject area, novelty), combines criterion values to assess document value (functional, emotional), and weighs document values to arrive at a decision (accept, uncertain, reject). Saracevic (1996a) proposes a system of interdependent relevances within a dynamic interaction framework consisting of multiple strata or levels. In this model, user and computer interact in a sequence of actions at a surface level; additionally, the user interacts at cognitive, situational, and affective levels, and the computer interacts at engineering, content/input, and processing levels. These few models, based on distinctly different approaches, serve as examples to underscore the number and complexity of factors that can influence relevance judgments.

A small number of researchers have explored this complexity by describing relevance criteria elicited directly from users. For example, Park (1992, 1993) developed her relevance context model based on users' descriptions of 22 factors contributing to their selection of citations and the contexts in which they made their selections. Su (1991, 1993), as part of a larger study testing IR performance measurement, identified 26 success dimensions in academic users' explanations of their ratings of the overall success of a search. The study by Schamber (1991a,b), which focused on criteria for evaluating multiple sources of weather-related information, yielded 10 summary and 22 detail categories of criteria. Barry (1993, 1994) elicited criteria from faculty and students in an academic environment and identified 23 criteria in seven categories. A study by Cool, Belkin and Kantor (1993), also in an academic environment, identified at least 60 factors underlying users' evaluations.
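As an aside for readers who think in procedural terms, Wang's (1994) selection model described above has the shape of a simple pipeline: document elements are read, values are assigned to criteria, criterion values are combined into a document value, and document values are weighed into a decision. The short Python sketch below is purely illustrative; the element names, criterion judgments, combination rule, and thresholds are invented placeholders, not Wang's actual instrument.

    from dataclasses import dataclass, field
    from typing import Dict

    # Illustrative pipeline: elements -> criterion values -> document value -> decision.
    # All names, rules, and thresholds here are hypothetical.
    @dataclass
    class Document:
        elements: Dict[str, str] = field(default_factory=dict)  # e.g., title, author

    def criterion_values(doc: Document, topic: str) -> Dict[str, float]:
        """Read document elements and assign a value to each criterion."""
        title = doc.elements.get("title", "").lower()
        return {
            "subject_area": 1.0 if topic.lower() in title else 0.0,
            "novelty": 0.5,  # placeholder for the user's novelty judgment
        }

    def document_value(values: Dict[str, float]) -> float:
        """Combine criterion values into an overall document value."""
        return sum(values.values()) / len(values)

    def decide(value: float) -> str:
        """Weigh the document value to reach accept/uncertain/reject."""
        if value >= 0.7:
            return "accept"
        return "uncertain" if value >= 0.4 else "reject"

    doc = Document(elements={"title": "Criteria for document selection", "author": "A. Author"})
    print(decide(document_value(criterion_values(doc, "document selection"))))  # accept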
These and other user criteria studies (see Schamber, 1994) are notable for at least four reasons. First, although the researchers use a variety of terms for relevance and associated concepts, they seem to share a common view of end-users as the ultimate judges of quality, and of users' evaluation behavior as a cognitive phenomenon. In this article, we use the term relevance in its broadest sense, including any or all individual perceptions of internal and external reality related to the information problem situation. We assume that relevance is (1) cognitive and subjective, depending on users' knowledge and perceptions; (2) situational, relating to users' information problems; (3) complex and multidimensional, influenced by many factors; (4) dynamic, constantly changing over time; and yet (5) systematic, observable and measurable at a single point in time.

Second, the researchers seem to share a fundamental dissatisfaction with traditional approaches to relevance-based evaluation of information systems and services. They and most of the authors cited above have criticized previous studies that relied on a priori relevance judgments made by nonusers and that relied only on simple (relevant/nonrelevant; accept/reject) relevance judgments. The user criteria studies are based on the ideas that relevance judgments should be made by users who are motivated by their own information problem situations and that relevance judgments should take into account a variety of factors, including nontopical factors, that underlie simple accept/reject decisions. It should be noted that here we are concerned only with users' criteria, and not with all criteria used in information system design and evaluation (see Lancaster & Warner, 1993).

Third, despite wide variations in the types of users and information environments studied, the pool of criteria identified covers all major factors inherent in general models of information seeking and use, and further exhibits a remarkable overlap of criterion concepts from one study to the next. This redundancy of results from diverse and independent efforts strongly supports and helps validate the existence of a finite set of criteria that exists across types of users and information environments.

Fourth, the overlap in criterion results occurred despite important methodological differences among the studies. Although all the studies were qualitative and descriptive in approach, there was considerable variety in (1) the primary goals of the studies (i.e., not all primarily to describe user criteria); (2) the open-ended questions asked with respect to these goals; and (3) specific techniques for asking questions and analyzing responses. These differences seem to support reliability, as well as validity, in the collective results.

The remainder of this article compares and contrasts the results of the studies by Barry and Schamber. We limit our discussion to these two studies, in part, for the practical reason that the results are highly comparable. Given that both studies focused exclusively on eliciting and identifying user criteria and took a similar approach to content analysis, the results are presented in extensive and well-organized taxonomies with detailed definitions that can be examined side-by-side. In addition, the types of users and information use environments studied demonstrate extremes in contrast that we feel help clarify how such factors may have affected the criteria mentioned by respondents.
At the same time, the overlaps in criteria and collective range of criteria identified resemble the results of other user criteria studies to the extent that we feel they are largely representative of user criteria as a whole.

3. METHODOLOGIES AND RESULTS

In this comparison, we first present the two major assumptions that affected our methodological approaches, followed by the overall criterion frequency results. We then describe the methodological approach of each study and compare the results criterion by criterion.

One assumption on which both studies were based is that motivated users evaluating information within the context of a current information need situation will base their evaluations on factors beyond the topical appropriateness of information. This assumption is supported in both studies simply by the identification of the criteria mentioned by respondents, and by the fact that every respondent mentioned criteria beyond the topical appropriateness of information. Another assumption was that there is a finite range of relevance criteria that is shared across users and situations; that is, each individual does not possess a unique set of criteria for making relevance judgments. The intent of these studies was to identify a full range of criteria mentioned by respondents. The only means of determining that a full range had been obtained was to examine the redundancy of responses, or the point at which no new criteria were mentioned. In every possible ordering of respondents in Barry's study, redundancy for all criterion categories was reached after the ninth respondent had been interviewed. Redundancy in Schamber's study was similar. This is generally consistent with the findings of previous studies, in which redundancy of criterion mentions was achieved through interviews with fewer than 10 respondents (see Fletcher, 1988; Nilan & Fletcher, 1987).

Table 1 lists criterion categories and frequencies of mention by respondents in each study. It should be noted that these frequencies do not necessarily represent the relative importance of specific criteria; respondents were only asked to describe criteria, not to rate or rank the criteria in any way.

Table 1. Frequency of criterion category mentions

The Barry study: 448 mentions of criterion categories by 18 respondents

Category                           Number of Mentions
Depth/Scope                        64
Accuracy/Validity (Obj., Subj.)    60
Content Novelty                    53
Tangibility                        29
Affectiveness                      25
Recency                            25
Availability within Environment    21
Consensus                          20
External Verification              19
Background/Experience              19
Source Reputation                  18
Effectiveness                      16
Access (Obtain., Cost)             14
Source Quality                     14
Source Novelty                     10
Clarity                            9
Ability to Understand              9
Relationship with Author           7
Time Constraints                   6
Personal Availability              5
Document Novelty                   5

The Schamber study: 811 mentions of criterion categories by 30 respondents

Category                Number of Mentions
Presentation Quality    115
Currency                114
Reliability             107
Verifiability           103
Geographic Proximity    96
Specificity             84
Dynamism                63
Accessibility           52
Accuracy                43
Clarity                 34

Note: Schamber study data include 22 criterion subcategories, not shown.
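To make the redundancy check described above concrete, the following is a minimal Python sketch, with invented respondent codings rather than either study's actual transcripts, of how the saturation point of an interview ordering can be computed. Because enumerating every ordering directly is impractical for 18 or 30 respondents, the sketch estimates the worst case from randomly sampled orderings.

    import random

    # `coded` maps each respondent to the set of criterion categories coded
    # in his or her transcript; these data are invented for illustration.
    coded = {
        "R01": {"Depth/Scope", "Recency", "Clarity"},
        "R02": {"Depth/Scope", "Tangibility"},
        "R03": {"Recency", "Source Quality"},
        "R04": {"Clarity", "Tangibility"},
    }

    def saturation_point(order):
        """1-based index of the last respondent in this ordering who
        contributes a criterion category not seen earlier."""
        seen, last_new = set(), 0
        for i, respondent in enumerate(order, start=1):
            if coded[respondent] - seen:
                last_new = i
            seen |= coded[respondent]
        return last_new

    respondents = list(coded)
    worst = max(
        saturation_point(random.sample(respondents, len(respondents)))
        for _ in range(10_000)
    )
    print(f"In sampled orderings, no new categories after respondent {worst}")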
3.1. The Barry study

The intent of the research design for this study was to create, as nearly as possible, an environment in which motivated users could evaluate information as it applied to real and current information need situations. Respondents were 18 faculty and students at Louisiana State University. Each respondent had submitted a request for an online search. A search was conducted for each respondent and a set of documents was randomly selected to serve as stimulus documents for each respondent. Respondents were presented with various document representations (i.e., bibliographic citations, abstracts, notes, and indexing terms) and with the full text of documents. Respondents were instructed to examine these materials and to mark any portion of the materials that indicated something the respondent would or would not pursue. Within this study, relevance was conceptualized as any connection that existed between the information contained within documents and the users' information need situations. The relevance judgment was operationalized as respondents' decisions to pursue or not pursue information. The notion of having respondents mark portions of the stimulus materials was suggested by the signaled stopping technique developed by Carter as a means of monitoring communicative activity, specifically reading behavior (Carter et al., 1973).

In an open-ended interview situation, each respondent discussed each item that had been marked in the stimulus materials. The primary advantage of the open-ended interview technique was that respondents could discuss any aspect of the information presented and any aspect of their situations. There were no pre-defined categories or questions that would inherently limit responses. Given that the intent of the study was to identify and describe a full range of criteria, this non-restrictive approach seemed appropriate. In addition, the interview environment allowed the researcher to probe for depth and detail, and to immediately clarify ambiguous or confusing responses. The interviews were audiotaped and the tapes transcribed to create a data set for each respondent. A response was defined as anything said about one marked item. The 18 data sets contained a total of 989 responses to 242 documents. A content analytic technique was then used to inductively identify and describe the categories of criteria mentioned by respondents. (For an in-depth explanation of content analytic techniques, see Krippendorff, 1980.)

The content analysis identified 23 categories of relevance criteria mentioned by these 18 respondents. The 23 criterion categories were then grouped into seven broad classes. The classes identify criterion categories that pertain primarily to: the information content of documents; the sources of documents; the document as a physical entity; other information or sources within the environment; the user's situation; the user's beliefs and preferences; and the user's previous experience and background. Table 2 presents the criterion categories within these classes. (For more detailed explanations of the criterion categories, see Barry, 1993, 1994.)

Table 2. Categories of relevance criteria from the Barry study

Criteria Pertaining to the Information Content of Documents
- Depth/Scope: the extent to which information is in-depth or focused
- Objective Accuracy/Validity: the extent to which information is accurate, correct, or valid
- Clarity: the extent to which information is presented in a clear or readable manner
- Recency: the extent to which information is recent, current, up-to-date
- Tangibility: the extent to which information relates to real, tangible issues; the extent to which definite, proven information is provided
- Effectiveness: the extent to which a technique or procedure that is presented is effective or successful

Criteria Pertaining to Sources of Documents
- Source Quality: the extent to which general standards of quality can be assumed based on a source of the document (i.e., author, editor, journal, sponsoring agency, etc.)
- Source Reputation/Visibility: the extent to which a source of the document is well-known or reputable

Criteria Pertaining to the Document as a Physical Entity
- Obtainability: the extent to which some effort will be required to obtain a document
- Cost: the extent to which some cost will be involved to obtain a document

Criteria Pertaining to Other Information or Sources within the Environment
- Consensus within the Field: the extent to which there is consensus within the field relating to the information within the document
- External Verification: the extent to which information within the document is supported by other sources of information
- Availability within the Environment: the extent to which information like that within the document is available elsewhere
- Personal Availability: the extent to which the user already has information like that within the document

Criteria Pertaining to the User's Situation
- Time Constraints: the extent to which time constraints or deadlines are a factor within the situation
- Relationship with Author: the extent to which the user has a personal or professional relationship with the author of a document

Criteria Pertaining to the User's Beliefs and Preferences
- Subjective Accuracy/Validity: the extent to which the user agrees with information presented within the document, or the extent to which the information within the document supports the user's point of view
- Affectiveness: the extent to which the user exhibits an affective or emotional response to any aspect of the information or document

Criteria Pertaining to the User's Previous Experience or Background
- Background/Experience: the degree of knowledge with which the user approaches information, as indicated by mentions of background or experience
- Ability to Understand: the user's judgment that he/she will be able to understand or follow the information presented
- Content Novelty: the extent to which the information presented is novel to the user
- Source Novelty: the extent to which a source of the document (i.e., author, journal) is novel to the user
- Document Novelty: the extent to which the document itself is novel to the user

3.2. The Schamber study

The intent of the research design for this study was to elicit users' criteria in real-life information seeking and use situations that, unlike nearly all previous relevance research, involved multiple types of information sources and information display or presentation formats. The context of weather information suited this intent. Respondents were 30 users of weather information in three occupational fields: 10 each in construction, electric power utilities, and aviation. Their situations involved weather-related planning decisions: the protection of workers and materials during winter construction projects, the scheduling of electric power line maintenance and repairs, or the scheduling and routing of airplane flights.
They consulted seven types of weather information sources: Self (often witnessing actual weather conditions); Other Person; Weather Information System, including public-access (e.g., telephone recording) and specialized (e.g., computerized aviation) systems; Television; Radio; Newspaper; and Weather Instrument (from airport windsock to sophisticated radar-based system).

Each respondent was asked to describe events in a decision-making situation that depended on information about the weather. The researcher created a time-line by noting the events on index cards and laying the cards out sequentially to form visual reference points. The interview focus then narrowed to three critical information-seeking events and, within those events, weather questions, weather information sources, and presentation formats. Respondents were asked to evaluate each type of source they consulted and its mode of presentation. Criteria were operationalized as ways in which sources or presentations made a difference to respondents in their situations. The technique of structured time-line interviewing was adapted from work by Dervin (1983). This technique was useful for orienting respondents to their situations and facilitating recall of their perceptions. The questionnaire was a flexible instrument that allowed description of situations with a wide variety of events, questions, and source types. The open-ended, neutrally worded items and probes yielded richly detailed data for content analysis.

Each of the 30 interviews was audiotaped, transcribed, and subjected to inductive content analysis in order to identify and describe criteria. A response was defined as anything said in answer to one questionnaire item. The interview transcripts were so long that it was necessary to limit content analysis to the 365 responses made to only four questionnaire items. These responses yielded 811 mentions of criteria. Within the three critical events that were the focus of the interviews, respondents reported consulting weather information sources 189 times, or more than six times per respondent on average. Each respondent consulted one to seven different types of sources and presentations, or a mean of nearly three types each.

The content analysis identified 10 summary and 22 detail categories of criteria mentioned by the 30 respondents. Regardless of questionnaire item (e.g., evaluating source or presentation), respondents mentioned a full range of criteria pertaining to information, source and presentation qualities. The 10 summary-level categories were Accuracy, Currency, Specificity, Geographic Proximity, Reliability, Accessibility, Verifiability, Clarity, Dynamism, and Presentation Quality. Table 3 describes all 32 criterion categories. (For more detailed explanations, see Schamber, 1991b.)

Table 3. Categories of relevance criteria from the Schamber study

Accuracy: Information is accurate
Currency: Information is up-to-date or timely
  Time Frame: Information covers a specific time frame
Specificity: Information is specific to user's need; has sufficient detail or depth
  Summary/Interpretation: A summary, interpretation, or explanation is available
  Variety/Volume: There is a sufficient variety or volume of specific information, or just 'a lot' of information
Geographic Proximity: Information covers a certain geographic area
Reliability: Respondent trusts, has confidence in source; source is reputable
  Expertise: Source is expert, professional, or experienced
  Directly Observed: Human source observes or experiences actual weather conditions
  Source Confidence: Human has confidence in own information
  Consistency: Source delivers information with the same quality, often accuracy, over time
Accessibility: Source is both available and easy to use; generally convenient. Little effort or cost required for access/operation
  Availability: Source is readily available when needed, or where needed, or just always available
  Usability: Source is easy to use; requires little effort to operate or learn to operate; there are no technical problems
  Affordability: Information service is free or the cost reasonable
Verifiability: Other sources of information are consulted or available
  Source Agreement: Information from this source is consistent with that from other sources
Clarity: Information is presented clearly; little effort to read or understand
  Verbal Clarity: Written or spoken language is clear and well-organized
  Visual Clarity: Visual display is clear, easy to follow, well-organized
Dynamism: Presentation of information is dynamic, active, or live
  Interactivity: User can engage in two-way interaction with source that allows him to manipulate the presentation
  Tracking/Projection: User can track or follow movement of weather in real time or over a period of time
  Zooming: User can see more than one spatial view
Presentation Quality: Source presents information in a certain format or style, or offers output in a way that is helpful, desirable, or preferable
  Human Quality: Refers to characteristics of a human source
  Nonweather Information: Source presents information that does not pertain to weather, or information in addition to weather
  Permanence: Information is presented in permanent or stable form (e.g., hard copy)
  Presentation Preference: User prefers source primarily because of way it presents information
  Entertainment Value: Presentation gives user pleasure; user enjoys it; it has interest or entertainment value
  Choice of Format: Source provides a choice of presentation format or output

4. COMPARISON OF USERS' CRITERIA FOR RELEVANCE EVALUATION

The following discussion compares the results of these two studies. We begin by discussing the criterion categories common to both studies; that is, we examine the extent to which these two groups of users, examining very different types of information and sources of information, for very different purposes, mentioned the same criteria as factors affecting their relevance evaluations. Again, this is a first step toward determining the extent to which there is a core of relevance criteria that spans such factors as information need situations, user environments, and types of information.
We then identify those criterion categories that were unique to one study, and explore the possible reasons for these differences between the findings of the two studies.

4.1. Criterion categories common to both studies

Table 4 presents a summary of the criterion categories common to both studies. For each common category, the specific categories from the Barry study and the Schamber study are identified, and a definition based on both Barry's and Schamber's definitions is provided.

Table 4. Criterion categories common to both studies

Depth/Scope/Specificity
  Barry: Depth/Scope
  Schamber: Specificity; Summary/Interpretation; Variety/Volume
  The extent to which information is in-depth or focused; is specific to the user's needs; has sufficient detail or depth; provides a summary, interpretation, or explanation; provides a sufficient variety or volume

Accuracy/Validity
  Barry: Objective Accuracy/Validity
  Schamber: Accuracy
  The extent to which information is accurate, correct or valid

Clarity
  Barry: Clarity
  Schamber: Clarity; Verbal Clarity; Visual Clarity
  The extent to which information is presented in a clear and well-organized manner

Currency
  Barry: Recency
  Schamber: Currency
  The extent to which information is current, recent, timely, up-to-date

Tangibility
  Barry: Tangibility
  Schamber: Specificity
  The extent to which information relates to real, tangible issues; definite, proven information is provided; hard data or actual numbers are provided

Quality of Sources
  Barry: Source Quality; Source Reputation/Visibility
  Schamber: Reliability; Expertise; Directly Observed; Source Confidence; Consistency
  The extent to which general standards of quality or specific qualities can be assumed based on the source providing the information; source is reputable, trusted, expert

Accessibility
  Barry: Obtainability; Cost
  Schamber: Accessibility; Availability; Usability; Affordability
  The extent to which some effort is required to obtain information; some cost is required to obtain information

Availability of Information/Sources of Information
  Barry: Availability within the Environment; Personal Availability
  Schamber: Verifiability
  The extent to which information or sources of information are available

Verification
  Barry: External Verification; Subjective Accuracy/Validity
  Schamber: Source Agreement
  The extent to which information is consistent with or supported by other information within the field; the extent to which the user agrees with information presented, or the information presented supports the user's point of view

Affectiveness
  Barry: Affectiveness
  Schamber: Entertainment Value
  The extent to which the user exhibits an affective or emotional response to information or sources of information; information or sources of information provide the user with pleasure, enjoyment or entertainment

4.1.1. Depth/scope/specificity. Respondents in both studies were evaluating information in terms of the depth or scope or specificity of the information. Barry simply defined this as one category: Depth/Scope.
Schamber identified one criterion category (Specificity) and then further identified two subcategories: Summary/Interpretation and Variety/Volume. However, in the coding rules for Barry's study, any responses that included mentions of such characteristics as the extent to which information was summarized, or the sheer volume of information, were in fact coded for Depth/Scope.

4.1.2. Accuracy/validity. In this instance, there seems to be an exact match in categories between the two studies. Barry's Objective Accuracy/Validity and Schamber's Accuracy both refer to the extent to which users judged information to be accurate, correct or valid.

4.1.3. Clarity. Respondents in both studies were evaluating information in terms of the clarity of the information. The slight differences in the categories defined by the two studies seem to be an attribute of the types of materials being examined by respondents. Barry defines clarity, in part, as the readability of the information. Given that respondents were only examining printed, textual materials, this is an appropriate definition of clarity. Schamber identified two subcategories of clarity: Verbal Clarity, referring to written or spoken language, and Visual Clarity, referring to visual displays. Given that respondents in this study were evaluating interpersonal communications, weather instruments and computerized systems, in addition to printed materials, such distinctions are appropriate. In the broadest sense, however, respondents in both studies were evaluating the extent to which information was clearly presented and easily understood, regardless of the formats of the information.

4.1.4. Currency. This is another instance in which there seems to be an exact match between categories defined by both studies. Barry's Recency and Schamber's Currency both refer to the extent to which users judged information to be current, recent, up-to-date, or timely.

4.1.5. Tangibility. From the definitions of criterion categories provided, the match between Barry's Tangibility and Schamber's Specificity may not be obvious. However, responses in the Schamber study were coded for Specificity if the respondent mentioned such characteristics as "hard data" or "actual numbers." Such responses would have been coded for Tangibility in Barry's study. This is a situation in which respondents in both studies mentioned the same types of criteria, but the researchers used different levels of coding categories to describe those criteria.

4.1.6. Quality of sources. It is evident that respondents in both studies mentioned the sources of information as one factor affecting their evaluations of information. The different types of sources being examined by respondents in the two studies affected the actual coding categories and definitions devised by the researchers. Given that respondents in the Barry study were only examining published scholarly works, there were a limited number of sources that respondents could evaluate: authors or editors, the affiliations of authors or editors, publications in which documents were appearing, or the sponsoring research agencies. Barry defined two criterion categories to reflect reactions to these sources. Source Quality, the extent to which general standards of quality could be assumed, was coded for those responses in which respondents predicted the quality of information based on their previous, personal experience with information from the source. Source Reputation/Visibility, the extent to which the source is well-known or reputable, was coded for responses in which respondents mentioned the public reputation or visibility of the source, regardless of the respondent's previous personal experience with information from the source. Considering that sources in the Schamber study ranged from oneself to mass media to weather instruments (sources widely differing from the authors and publications in the Barry study), the criteria were impressively similar.
Schamber's summary category, Reliability, and the first subcategory, Expertise, seem to contain elements of both criterion categories defined by Barry: the respondent trusted and had confidence in the source; the source was reputable; the source was expert, professional, or experienced. The next two subcategories specifically address situations in which the source of the information was the respondent or another human communicating with the respondent: Directly Observed, in which a human source observed or experienced actual weather conditions, and Source Confidence, in which a human had confidence in his own information. On the one hand, we can say that such criteria simply do not apply to the situation of Barry's respondents, in which the information being evaluated was published materials only. On the other hand, we could argue that there are similarities between a human having confidence in his own information, and a respondent's prediction of the quality of information from a specific source based on the respondent's previous exposure to information from that source: Barry's Source Quality. There also seems to be some overlap between Schamber's Directly Observed and Barry's Tangibility. In the Barry study, if a respondent had mentioned confidence in the information provided because a human had actually observed some event, that response would have been coded for Tangibility: the extent to which definite, proven information is provided. Finally, Schamber included a subcategory that identifies one very specific quality of information from a source: Consistency, the extent to which the source delivered information with the same quality, often accuracy, over time. In the Barry study, if a respondent had mentioned that information from a particular source was of consistently high quality, and consistently accurate, the response would have been coded for two categories: Source Quality and Objective Accuracy/Validity.

The differences between the two studies in terms of the criterion categories for sources of information are clearly an indication of the extent to which the process of inductively defining categories from the responses of individuals examining information from different types of sources affected the resulting categories and definitions. Barry's categories are closely tied to the environment of published scholarly materials, while Schamber's categories allow for other sources of information, such as personal observations of weather conditions. It does seem reasonable to conclude, however, that respondents in both studies were judging the quality of information based on the sources of information, and that respondents were relying on both personal experiences and the public reputation of sources to make those judgments.

4.1.7. Accessibility. Respondents in both studies discussed the extent to which some effort or some cost would be involved in obtaining information. Again, the slight differences between the criterion categories for the two studies seem to be a result of the different sources from which respondents could be obtaining information. Respondents in the Barry study were evaluating the accessibility of printed documents. Barry identified two criterion categories relating to the effort and cost involved in obtaining printed documents.
Obtainability, the extent to which some effort would be required to obtain a document, was typically a response indicating that a document would not be readily available on campus and that some type of interlibrary loan or document delivery procedure would be involved. Cost, the extent to which some cost would be required to obtain a document, was typically a response indicating that the ordering of the document would involve a fee. Schamber's categories of Accessibility, Availability, and Affordability are closely related to Barry's categories; that is, the effort and cost required to obtain information from a source. Again, the difference seems to be that respondents in Schamber's study were also discussing the accessibility of information from sources such as weather instruments and computerized systems. Hence, Schamber has included a subcategory that does not appear in Barry's findings: Usability, the extent to which the source was easy to use, required little effort to operate or learn to operate, and presented no technical difficulties. It simply does not seem reasonable to assume that faculty and students examining printed documents would ever mention the difficulty in using or learning to use a printed document. In other words, respondents in the Schamber study were discussing situations in which they often had to manipulate some source in order to obtain information. Respondents in the Barry study were simply presented with the information to be evaluated, and would thus not discuss the ease or difficulty of manipulating any type of source of information. Certainly technical ease of use is a major concern in other research focusing on users who interact directly with information retrieval systems; however, respondents in the Barry study were not asked to interact with such systems.

4.1.8. Availability of information/sources of information. Respondents in both studies were evaluating specific pieces of information within the broader context of the availability of information or sources of information within the environment. Barry defined two categories of availability. Availability within the Environment refers to the extent to which information like that within a document is available elsewhere. An example of a response coded for this category is: "I'm really not going to write about Noel Coward, but so little is written about him that I would grab this now that I've found it." The respondent's decision to pursue the information was influenced by the extent to which information is available within the information environment as a whole. The second criterion category, Personal Availability, refers to the extent to which the respondent already possessed information like that in the document. An example of a response coded for this category is: "This is about church history and I already have several articles on that. I don't need more." The respondent's decision to not pursue the information was based more on the respondent's personal collection of information than on the general availability of information within the environment. One could argue that a category like Personal Availability did not appear in Schamber's findings because her respondents would not typically possess personal collections of information about current weather conditions. In the multiple-source environment of the Schamber study, the concept of availability took a somewhat different slant.
To these respondents, who moved around in the physical environment, sources had to be available when or where respondents were located; for example, "If I happened to be out in the truck, I had it [the radio] on." Availability was also closely related to Verifiability in the sense that other sources had to be available for comparison, as in: "Sometimes I get the weather from somewhere else, and compare."

4.1.9. Verification. Respondents in both studies discussed the extent to which the information being evaluated was consistent with or supported by other information in the environment. There seems to be an exact match between Barry's External Verification and Schamber's Source Agreement. Verification was extremely important to Schamber's respondents, whose situations changed constantly with changes in the weather. Several respondents said they monitored the weather every waking moment, on and off the job, using whatever source was available. Barry also included a separate category for Subjective Accuracy/Validity. This category was coded when respondents mentioned the extent to which they agreed with information being presented, or the extent to which the information supported the respondents' beliefs or points of view. Such responses in the Schamber study would have been coded for Source Agreement; the information from the source was consistent with that from other sources, including the respondent's own observations and information. In other words, the respondent agreed or disagreed with the information being evaluated. This is yet another instance in which respondents in the two studies seem to be discussing the same criteria, although the researchers devised different coding categories for those responses.

4.1.10. Affectiveness. Respondents in both studies discussed the extent to which information or sources of information provided them with pleasure, enjoyment or entertainment. In such instances, respondents were exhibiting affective reactions. An example of such a response from the Barry study is: "The footraces are really not part of my current research, but I just love reading these articles. I shouldn't admit it, but I'll probably get this one first and read it, just for the sheer fun of it. Then I'll get back to work." This seems to be a direct match for Schamber's category of Entertainment Value. For example, one respondent talking about the worldwide forecast on the cable Weather Channel said, "That's kind of fun to see in the morning."

4.2. Criterion categories identified only by Barry

4.2.1. Effectiveness. One criterion category identified by Barry that does not appear in Schamber's findings is Effectiveness: the extent to which a technique or procedure that is presented is effective or successful. Respondents who mentioned this criterion were typically, as part of their information need situation, trying to determine how to do something: how to design a research study or methodology, or how to measure a particular phenomenon, for example. Under these circumstances, respondents seemed to be particularly interested in documents that presented evidence that a methodology or technique or procedure had been used successfully, and so might be used successfully by the respondent as well. One can argue that respondents in Schamber's study would not mention this criterion because they were not in fact exploring how to measure or predict the weather; they were simply interested in utilizing data about weather conditions.
One could also argue that respondents in Schamber's study might very well have been incorporating aspects of this criterion when discussing the reliability of sources of information; that a reliable source had devised some successful technique for measuring and/or predicting weather conditions. However, even under those circumstances, her respondents would not typically discuss this in terms of a technique or process that the respondent could then use, which was the focus of respondents in the Barry study. The presence of this particular category in one study, but not the other, seems to be an attribute of the different use situations of the two groups of respondents; that is, respondents attempting to devise a methodology or measurement technique to be used in their own research versus respondents attempting to utilize data about current and future weather conditions in order to make job-related decisions.

4.2.2. Consensus within the field. This criterion category refers to the extent to which there is or is not consensus, or agreement, within a field relating to the information being evaluated. One example of a response coded for this category is: "That really isn't part of the dissertation, I can't use that, but there's this whole huge argument about this. People have been having these heated debates about whether that's true or not. So I'd probably look at this, just to see what's going on with that debate." Another example is: "I'll only be discussing areas in which there is still some disagreement about the effectiveness of allocating federal monies to groups, and it is pretty much agreed that money should be distributed to the handicapped, so I wouldn't want that. That's just not a question that needs to be addressed." Here the extent to which there is or is not consensus within the field on a particular theory or question seemed to be affecting respondents' decisions to pursue or not pursue the information. One can argue that respondents in Schamber's study would not mention this type of consensus or debate surrounding a certain theory or issue simply because such theories or issues would not appear in data about weather conditions. In any event, any disagreements would be resolved when a weather condition actually occurred. This is very different from the types of information examined by Barry's respondents, in which an intellectual theory could be supported, but not proved.

4.2.3. Time constraints. This category was coded whenever a respondent mentioned that time constraints or deadlines were a factor in the decision to pursue or not pursue information. Typically, a respondent would indicate that a particular document would not be pursued because it could not be obtained in time for whatever deadline was in place. Although the Schamber study does not include a coding category for this criterion, it is actually inherent within the situations of all of her respondents. That is, each respondent needed to obtain information about weather conditions, often urgently, before and during decision-making in order to complete a task (flying an airplane, repairing electric power lines). One can argue that the factor of time constraints was held constant in the Schamber study (i.e., every situation was time-driven) and therefore that it was not mentioned per se, but rather assumed by respondents. Time constraints were implicit, instead, in the fact that Currency was the criterion mentioned most often by the most respondents.
4.2.4. Relationship with author. Several respondents in the Barry study mentioned their personal or professional relationships with authors of documents as influencing their decisions to pursue documents. For example, respondents would mention that the author was a colleague or friend, and that the respondent would pursue the document for that reason, regardless of the extent to which the information within the document actually addressed the situation for which the respondent was seeking information. It seems obvious that respondents in the Schamber study would not mention this factor, simply because the types of information being evaluated would not involve this type of authorship.

4.2.5. Background/experience and ability to understand. It is clear that almost every criterion category in both studies is somehow linked to the user's background and experience. The ability of respondents to predict the quality of information from certain sources, to agree or disagree with information presented, to evaluate the accuracy or validity of information, are all dependent on the respondents' knowledge and experience. This category within Barry's study was typically used to code responses in which respondents indicated that they were lacking some background or experience. Respondents then typically expressed concerns about their ability to understand the information presented. For example: "This is coming out of linguistics, and I just don't have that background, that's not my field. So I probably couldn't understand this anyway." It seems probable that respondents in the Schamber study would not discuss their backgrounds or experience in this way, simply because all respondents were examining weather-related information and all respondents were experienced in evaluating this type of information. Under those circumstances, it would be unlikely for respondents to mention a lack of experience or concerns about their ability to understand the information presented. However, a few did comment on having to learn the jargon. This was especially true in aviation, where one respondent said: "I've been doing this for so long it's clear. Someone new to the system, it takes a good couple years before people feel comfortable reading these. It's like starting out in flying, you need to know what they're saying, what the symbology means, because you do have a lot of acronyms."

4.2.6. Novelty. Barry identified three criterion categories that were related to the extent to which information being examined was new or novel to the user. Document Novelty refers very specifically to situations in which the stimulus document being examined was or was not novel to the respondent. This category was most often coded for situations in which the respondent indicated that the information content of a document was appropriate, but the document would not be pursued because the respondent already had a copy or had already read the document. It seems clear that this criterion category is so closely tied to the research setting of the Barry study, in which respondents were presented with stimulus documents, that it would not apply to respondents in Schamber's study. Source Novelty refers to the extent to which a source of information (i.e., an author, a journal) was or was not novel to the respondent. There were two typical types of responses coded for this category.
First, there were situations in which a respondent's familiarity with a source of information, such as an author, allowed the respondent to predict certain aspects of the information that would be presented. Second, there were situations in which an unfamiliar source was seen as a path to additional information. For example, a respondent who was unaware of a journal in a particular field indicated that he would now examine that journal for other articles of interest; it was a potential source of information that was previously unknown to the respondent. Respondents in the Schamber study were describing sources of information that had been used in the past, and sources of information that were fairly standard within their work environments. Under those circumstances, respondents were simply not presented with new or novel sources of information to which they could react. Content Novelty refers to the extent to which the information being examined was or was not novel to the respondent. Examples of responses coded for this category include: "It's not that it doesn't apply to what I'm doing, it's just that I already know all of that" and "This is a statement that I've never seen anyone make before, so I really want to follow up on that, see what kind of evidence he has for that." Why would this criterion not be mentioned by respondents in the Schamber study? It does seem as if users evaluating weather information would in fact be influenced by the extent to which new information was being presented. One could argue that, when the information being evaluated is restricted to information about weather conditions, the timeliness or currency of the information is a reflection of novelty. That is, if weather conditions have changed in the past eight hours and the user receives the most current information about those conditions, then the information is by definition new and novel to that user. In that sense, respondents in the Schamber study would be likely to discuss information only in terms of currency and timeliness, not the extent to which the information was something they had never seen before.

4.3. Criterion categories identified only by Schamber

4.3.1. Geographic proximity. This category was coded in Schamber's study when respondents mentioned that weather information covered a certain geographic location or area. It followed only one other category, Currency, in being mentioned most often by the most respondents. Clearly it pertained to the topic of weather, but as the topic of weather was held constant (i.e., every situation was weather-related), it can be argued that the topic was assumed and therefore not mentioned by respondents. There was no criterion category for topic in Schamber's study; instead, Geographic Proximity might be considered a greatly expanded subset of the criterion Specificity for the importance of topical detail. Respondents in the Barry study did in fact mention the geographic locations or areas discussed in documents; for example, that a study was done in the same province in China as the respondent's study, or that the section of the Mississippi River being discussed was outside the geographic area of the respondent's research. In the Barry study, any responses that pertained to the subjects of documents (i.e., the document is about something) were not coded as mentions of relevance criteria. Such responses were coded for mentions of Information Content only.
Given that there was no attempt in Barry's study to develop categories for specific types of information content, there would be no category to correspond to Schamber's Geographic Proximity.

4.3.2. Dynamism and all subcategories. Schamber's summary category of Dynamism and its subcategories (Interactivity, Tracking/Projection, and Zooming) all pertain to the extent to which the presentation of information was dynamic or live, and the extent to which respondents could manipulate the presentation of information. Again, it seems obvious that such factors would not be mentioned by respondents in Barry's study, simply because they were restricted to the examination of printed documents that were not interactive and that respondents could not manipulate to change the presentation of information.

4.3.3. Presentation quality and all subcategories except entertainment value. As discussed earlier, Schamber's Entertainment Value and Barry's Affectiveness seem to be comparable categories. However, Schamber's summary category of Presentation Quality and the remaining subcategories (Human Quality, Nonweather Information, Permanence, Presentation Preference, and Choice of Format) are not reflected in Barry's categories. Again, the explanation for these variations seems to relate directly to the fact that Schamber's respondents were using sources that could be manipulated in some way and sources that varied greatly in terms of information presentation. Presentation Quality refers to the extent to which the source presents information in a certain format or style. One can imagine that the format of information presented by a radio station weather report and by a computerized weather information system would vary greatly, and that this variation might influence the respondent's perception of the utility of the information. Two subcategories, Presentation Preference and Choice of Format, are closely related to this concept that respondents could manipulate the presentation of formats and could prefer some formats over others. On the other hand, one could argue that the format of published scholarly works is actually quite standardized; the format does not vary greatly from one source to another. For this reason, respondents in Barry's study would not be likely to mention variations in formats or presentations of information. Human Quality refers to the characteristics of a human source. Examples of responses coded for this category are: "I think the guy has a personality" and "He's got his own style and he goes against the grain sometimes." One can simply argue that respondents in Barry's study were not exposed to the types of interpersonal communications that would result in such responses. Nonweather Information refers to the extent to which the source presents information that does not pertain to the weather, or information in addition to the weather. This criterion category is so closely tied to the evaluation of weather-related information that it could not apply to Barry's respondents, none of whom were evaluating weather-related information. Permanence refers to the extent to which information is presented in a permanent or stable form. Examples of responses coded for this category include: "I get a printout and post it so the men can see it" and "I'd rather have this in the form it's in because I can actually hold onto it and look at it."
The very existence of such a category in the Schamber study is an indication that some of the sources being evaluated did not provide information in a permanent or stable format. Such criteria would not be mentioned by respondents in Barry's study because all of the information being evaluated was in a permanent and stable format: printed documents.

5. CONCLUSIONS

The results of this comparison allow us to reach conclusions about both similarities and differences in the criteria mentioned by respondents in the Barry and Schamber studies. First, there is a high degree of overlap among the criterion categories mentioned by respondents in both studies. This is especially interesting considering the marked differences in types of users, information formats and sources, and information use environments between the two studies. In one study the users were faculty and students evaluating scholarly publications in a variety of intellectual disciplines; in the other, users were aviation, construction, and utility workers evaluating multiple sources of weather information. The similarities among criteria mentioned by these diverse users seem to provide evidence for the existence of a finite range of criteria that are applied across types of users, information problem situations, and information sources.

Second, there are a few criterion categories that do not overlap; that is, categories that are not common to both studies. It seems reasonable to conclude that this divergence is not due to inherent differences in the evaluation behaviors of respondents in the two groups. Rather, it appears to be due to differences in situational contexts and research task requirements: specifically, control for source type in the Barry study and control for topic in the Schamber study. For example, if respondents in the Barry study had evaluated the sources of documents (e.g., online retrieval systems, document delivery systems, human intermediaries), it is quite likely that they would have mentioned criteria such as system usability and interactivity. On the other hand, if respondents in the Schamber study had evaluated only print documents and not their sources, it is highly unlikely that they would have mentioned criteria referring to characteristics of various formats and presentations, and the ability to control and manipulate sources. Thus the criteria that do not appear in both studies seem to represent shifts in users' selection of criteria, and definitional refinements of criteria, according to the type of information problem situation, task requirement, and source environment. Different criteria mentioned under contrasting conditions thus seem to provide evidence for the existence of a few criterion categories that do respond to situational factors, including criteria that may exist within (or hierarchically below) the broader common categories.

Generally, the findings of both studies confirm the contention that users' relevance evaluations depend on their individual perceptions of their problem situations and the information environment as a whole, and that their perceptions encompass many factors beyond information content. Again, this comparison is only a first step; we anticipate further validation of these criterion categories in future studies involving other types of users and use environments.
We feel that continued progress in refining the discipline's understanding of a core set of relevance concepts can benefit both basic and applied research. For example, the core user criteria can be incorporated in the ever more complex and multilayered behavioral models that are currently evolving, as well as in new measurement instruments for evaluating user-centered aspects of information system performance. Only through such studies, and the synthesis of their findings, can information science approach a greater understanding of the factors that influence users' relevance evaluation processes.

REFERENCES

Allen, B. L. (1996). Information tasks: Toward a user-centered approach to information systems. New York: Academic Press.
Barry, C. L. (1993). The identification of user relevance criteria and document characteristics: Beyond the topical approach to information retrieval, Unpublished doctoral dissertation, Syracuse University, Syracuse, NY.
Barry, C. L. (1994). User-defined relevance criteria: an exploratory study. Journal of the American Society for Information Science, 45(3), 149-159.
Belkin, N. J. (1980). The problem of 'matching' in information retrieval. In O. Harbo & L. Kajberg (Eds.), Theory and application of information research (pp. 187-197). London: Mansell.
Belkin, N. J., Oddy, R. N., & Brooks, H. M. (1982). ASK for information retrieval. Journal of Documentation, 38(2), 145-164.
Boyce, B. (1982). Beyond topicality: a two stage view of relevance and the retrieval process. Information Processing and Management, 18(3), 105-109.
Carter, R. F., Ruggels, W. L., Jackson, K. M., & Heffner, M. B. (1973). Application of signaled stopping technique to communication research. In P. Clarke (Ed.), New models for mass communication (pp. 15-43). Beverly Hills, CA: Sage.
Cool, C., Belkin, N. J., & Kantor, P. B. (1993). Characteristics of texts affecting relevance judgments. In M. E. Williams (Ed.), Proceedings of the 14th National Online Meeting (pp. 77-84). Medford, NJ: Learned Information.
Cooper, W. S. (1971). A definition of relevance for information retrieval. Information Storage and Retrieval, 7(1), 19-37.
Cooper, W. S. (1973). On selecting a measure of retrieval effectiveness. Journal of the American Society for Information Science, 24(2), 87-100.
Cooper, W. S. (1978). Indexing documents by gedanken experimentation. Journal of the American Society for Information Science, 29(3), 107-119.
Cuadra, C. A., & Katter, R. V. (1967). Experimental studies of relevance judgments: Final report, Vol. 1: Project summary (NSF Report No. TM-3520/001/00). Santa Monica, CA: System Development Corp.
Dervin, B. (1983). An overview of sense-making research: Concepts, methods and results to date, Paper presented to the International Communication Association, Dallas, TX.
Fletcher, P. T. (1988). An exploration of situational dimensions in the information behaviors of general managers in state government, Unpublished doctoral dissertation, Syracuse University, Syracuse, NY.
Harter, S. P. (1992). Psychological relevance and information science. Journal of the American Society for Information Science, 43(9), 602-615.
Harter, S. P. (1996). Variations in relevance assessments and the measurement of retrieval effectiveness. Journal of the American Society for Information Science, 47(1), 37-49.
Harter, S. P., & Hert, C. A. (1997). Evaluation of information retrieval systems. Annual Review of Information Science and Technology, 32, 3-94.
Krippendorf, K. (1980). Content analysis: An introduction to its methodology. Beverly Hills, CA: Sage.
Lancaster, F. W., & Warner, A. J. (1993). Information retrieval today. Arlington, VA: Information Resources Press.
MacMullin, S. E., & Taylor, R. S. (1984). Problem dimensions and information traits. Information Society, 3(1), 91-111.
Marcus, R. S., Kugel, P., & Benenfeld, A. R. (1978). Catalog information and text as indicators of relevance. Journal of the American Society for Information Science, 29(1), 15-30.
Nilan, M. S., & Fletcher, P. T. (1987). Information behaviors in the preparation of research proposals: A user study. In C. Chen (Ed.), Proceedings of the 50th Annual Meeting of the American Society for Information Science Vol. 24 (pp. 186-192). Medford, NJ: Learned Information.
Park, T. K. (1992). The nature of relevance in information retrieval: An empirical study, Unpublished doctoral dissertation, Indiana University, Bloomington, IN.
Park, T. K. (1993). The nature of relevance in information retrieval: An empirical study. Library Quarterly, 63(3), 318-351.
Rees, A. M., & Saracevic, T. (1966). The measurability of relevance. Proceedings of the American Documentation Institute, 3, 225-234.
Rees, A. M., & Schultz, D. G. (1967). A field experimental approach to the study of relevance assessments in relation to document searching, Vol. 1: Final report (NSF Contract No. C-423). Cleveland: Case Western Reserve University.
Saracevic, T. (1975). Relevance: a review of and a framework for the thinking on the notion in information science. Journal of the American Society for Information Science, 26(6), 321-343.
Saracevic, T. (1996a). Modeling interaction in information retrieval (IR): a review and proposal. In S. Hardin (Ed.), Proceedings of the 59th Annual Meeting of the American Society for Information Science Vol. 33 (pp. 3-9). Medford, NJ: Information Today.
Saracevic, T. (1996b). Relevance reconsidered '96. In P. Ingwersen & N. O. Pors (Eds.), CoLIS 2: Second International Conference on Conceptions of Library and Information Science (pp. 201-218). Copenhagen, Denmark: Royal School of Librarianship.
Schamber, L. (1991a). Users' criteria for evaluation in a multimedia environment. In J.-M. Griffiths (Ed.), Proceedings of the 54th Annual Meeting of the American Society for Information Science Vol. 28 (pp. 126-133). Medford, NJ: Learned Information.
Schamber, L. (1991b). Users' criteria for evaluation in multimedia information seeking and use situations, Unpublished doctoral dissertation, Syracuse University, Syracuse, NY.
Schamber, L. (1994). Relevance and information behavior. In M. E. Williams (Ed.), Annual Review of Information Science and Technology Vol. 29 (pp. 33-48). Medford, NJ: Learned Information.
Schamber, L., Eisenberg, M. B., & Nilan, M. S. (1990). A re-examination of relevance: toward a dynamic, situational definition. Information Processing and Management, 26(6), 755-776.
Su, L. T. (1991). An investigation to find appropriate measures for evaluating interactive information retrieval, Unpublished doctoral dissertation, Rutgers, The State University of New Jersey, New Brunswick, NJ.
Su, L. T. (1993). Is relevance an adequate criterion for retrieval system evaluation: an empirical study into the user's evaluation. In S. Bonzi (Ed.), Proceedings of the 56th Annual Meeting of the American Society for Information Science Vol. 30 (pp. 93-103). Medford, NJ: Learned Information.
Swanson, D. R. (1977). Information retrieval as a trial-and-error process. Library Quarterly, 47(2), 128-148.
Swanson, D. R. (1986). Subjective versus objective relevance in bibliographic retrieval systems. Library Quarterly, 56(4), 389-398.
Swanson, D. R. (1988). Historical note: Information retrieval and the future of an illusion. Journal of the American Society for Information Science, 39(2), 92-98.
Taylor, R. S. (1962). The process of asking questions. American Documentation, 13(4), 391-396.
Taylor, R. S. (1968). Question-negotiation and information seeking in libraries. College and Research Libraries, 29(3), 178-194.
Taylor, R. S. (1985). Information values in decision contexts. Information Management Review, 1(1), 47-55.
Taylor, R. S. (1988). Value-added processes in information systems. Norwood, NJ: Ablex.
Wang, P. (1994). A cognitive model of document selection of real users of IR systems, Unpublished doctoral dissertation, University of Maryland, College Park, MD.
Wilson, P. (1973). Situational relevance. Information Storage and Retrieval, 9(8), 457-471.
Wilson, P. (1978). Some fundamental concepts of information retrieval. Drexel Library Quarterly, 14(2), 10-24.