Panel on Reading the Information Infrastructure

FRBR as an interdisciplinary high-middle-range theory for information science: a theoretical perspective

Allen H. Renear
Graduate School of Library and Information Science
University of Illinois at Urbana Champaign
Champaign, IL 61820
+1 (217) 265-5216
renear@uiuc.edu

Dave Dubin
Graduate School of Library and Information Science
University of Illinois at Urbana Champaign
Champaign, IL 61820
+1 (217) 244-3275
ddubin@uiuc.edu

ABSTRACT
We suggest that IFLA’s Functional Requirements for Bibliographic Records is an interesting, if unexpected, example of Merton’s “theories of the middle range” and show how theoretical analysis and refinement of such theories can illuminate the deep interdisciplinarity of information science.

Topics
Cultural information systems, Information management, Information organization, Nature and scope of iSchools and iResearch

Keywords
FRBR, information science, R. K. Merton, theory, conceptual modeling, ontology.

Copyright and Disclaimer Information
The copyright of this document remains with the authors and/or their institutions. By submitting their papers to the iSchools Conference 2008 web site, the authors hereby grant a non-exclusive license for the iSchools to post and disseminate their papers on its web site and any other electronic media. Contact the authors directly for any use outside of downloading and referencing this paper. Neither the iSchools nor any of its associated universities endorse this work. The authors are solely responsible for their paper’s content. Our thanks to the Association for Computing Machinery for permission to adapt and use their template for the iSchools 2008 Conference.

1. INTRODUCTION
Science proceeds through the criticism, empirical and theoretical, of competing explanatory theories and hypotheses. This picture may be an oversimplification, but it is a common enough scenario nonetheless. Within contemporary information science, however, a sense of evolving scientific explanation, with theories generating hypotheses and undergoing both empirical and conceptual revision, is still not as routine as one might like. Too often “brute empiricism”[19] seems to oscillate with vague generalities.

R. K. Merton has suggested that social science focus on “theories of the middle range”, rather than, on the one hand, mere hypotheses with little explanatory power, or, on the other hand, high-level all-encompassing theories that can be neither clearly defined nor empirically confirmed.[11] Although there are many promising middle-range theories in information science, there are not, we think, enough: the golden mean of Merton's middle range is apparently a hard target to hit when the problem space is interdisciplinary.

For our part in this panel we will discuss a surprising candidate for a middle-range theory in information science, IFLA’s Functional Requirements for Bibliographic Records (FRBR).[8] Given its origins and objectives it may seem odd to describe FRBR as a theory in information science, but we have found doing so illuminating and have come to feel that, regardless of original intention, it is indeed a theory, and a good one. Although guiding empirical research is the principal characteristic feature of middle-range theories, our discussion at this panel will take a theoretical rather than empirical perspective. This is an aspect of middle-range theories that is generally neglected, and it is one that we think nicely exhibits the deep interdisciplinarity of information science.

2. MIDDLE RANGE THEORIES
According to Merton, theories of the middle range¹

…lie between the minor but necessary working hypotheses that evolve in abundance during day-to-day research and the all-inclusive systematic efforts to develop a unified theory that will explain all the observed uniformities of social behavior, social organization, and social change. [11] (p. 39)

¹ Merton cites among his historical antecedents in commending middle-range theories Bacon (axiomata media) and Mill (“middle principles”), and indicates Durkheim’s Suicide and Max Weber’s The Protestant Ethic and the Spirit of Capitalism as examples of middle-range theories in social science.

It is characteristic of middle-range theories that they are not directly inferred from experience but rather themselves generate inferences about experience:

Each of these theories provides an image that gives rise to inferences. To take but one case: if the atmosphere is thought of as a sea of air, then, as Pascal inferred, there should be less air pressure on a mountain top than at its base. The initial idea thus suggests specific hypotheses which are tested by seeing whether the inferences from them are empirically confirmed. The idea itself is tested for its fruitfulness by noting the range of theoretical problems and hypotheses that allow one to identify new characteristics of atmospheric pressure. [11] (p. 40)

These inferences “guide empirical inquiry”:

Middle-range theory is principally used in sociology to guide empirical inquiry… it is intermediate to general theories of social systems which are too remote from particular classes of social behavior, organization, and change to account for what is observed, and to those detailed orderly descriptions of particulars that are not generalized at all. [11] (p. 39)

Finally, middle-range theories are limited in scope:

Middle-range theories involve abstractions, of course, but they are close enough to observed data to be incorporated in propositions that permit empirical testing.
Middle-range theories deal with delimited aspects of social phenomena … One speaks of a theory of reference groups, of social mobility, or role-conflict and of the formation of social norms just as one speaks of a theory of prices, a germ theory of disease, or a kinetic theory of gases. [11] (p. 39)

Although middle-range theories are not, at least in the usual circumstances, derived from more general theories, Merton does note that they may have logical relationships to those broader theories.

3. FRBR
The Functional Requirements for Bibliographic Records (FRBR) is a “conceptual model of the bibliographic universe” developed by the International Federation of Library Associations and Institutions to provide “a generalized view” of bibliographic entities and relationships.[8] FRBR has as its immediate objective guiding the design of systems for creating and managing bibliographic records in order to better support the fundamental user tasks of discovery and use. It is not intended as a radical revision of existing practice or theory, but as an articulation of current best practice and an emerging consensus, with new terminology and refinements as needed.

FRBR has been very influential. Over the last few years the FRBR framework has been found natural and compelling and is increasingly reflected in cataloging practices and technology development in libraries and elsewhere: international bibliographic databases and software systems are being “FRBRized”, and the working group for the next revision of the bible of library cataloging, the Anglo-American Cataloging Rules (now Resource Description and Access), refers to FRBR as part of the “conceptual foundation” for that revision.[9]

FRBR divides bibliographic entities into three groups: Group 1 (the “products of intellectual and artistic endeavor”), Group 2 (their creators), and Group 3 (their subjects).
We describe Group 1 in more detail, partly to give a sense of how FRBR is structured, but also because our example focuses on this group. FRBR uses generic entity-relationship modeling techniques to express the formal features of the framework.

The FRBR Group 1 entity types are works, expressions, manifestations, and items. A work is defined as “a distinct intellectual or artistic creation”, an expression is “the intellectual or artistic realization of a work in the form of alphanumeric, musical, or choreographic notation, sound, image, object, movement…”, a manifestation is “the physical embodiment of an expression of a work”, and an item is “a single exemplar of a manifestation”. Using printed books as an example (which we will do throughout), these concepts correspond roughly to the common notions of work, text, edition, and physical copy, respectively. Each entity type is associated with characteristic attributes: for instance, works have form (novel, play, poem, etc.), expressions may be in a particular language, manifestations may have a typeface, and items may have a condition. A particular work may be realized in any number of expressions (such as different translations or textual variants); an expression may be embodied in any number of different manifestations (such as different editions with different page design or carrier); and a particular manifestation may have any number of individual physical instances. Works, expressions, and manifestations are abstract objects, and items are concrete physical objects.²

4. THEORETICAL ASPECTS OF MIDDLE-RANGE THEORIES
At the very heart of the notion of a middle range theory is the view that such theories guide empirical research by providing hypotheses for exploration, and by explaining empirically observed phenomena. A full account of FRBR as a middle range theory would therefore naturally focus on these hypotheses, the resulting research, and the effectiveness of the theory in explaining empirical observations.
However, this topic, as important and as timely as it is, will not be taken up here. We focus on a different, and somewhat neglected, aspect of middle-range theories: the role of theoretical analysis and refinement in their conceptual evolution.

Figure 1. ER Diagram of FRBR Group 1 Entities (diagram from IFLA, 1998)

² For a short overview of FRBR see Tillett.[21]

Merton has relatively little to say about the role of theoretical refinement in the function and evolution of middle-range theories. He does note that good middle range theories pose theoretical problems as well as guide research, and he remarks that while middle-range theories are not derived from upper range theories they may be consistent (and so also, presumably, inconsistent) with upper-range theories. He says little beyond that.

But it would seem that in the case of at least some theories, let’s call them upper middle-range theories, theoretical analysis and refinement is in fact a major force in their evolution, playing a distinctive role in how those theories provide scientific value and, in particular, in how they are integrated with other theories at higher levels, at the same level, and, especially, in other fields.

So as interesting and important as the ongoing empirical studies of FRBR are, we will for now ignore them entirely and focus here on theoretical and formal analysis. We hope to show, by an example, how such formal analysis establishes critical relationships with other upper and middle range theories both within information science, and across disciplinary boundaries. These sorts of integrating relationships among theories (vertical, lateral, and interdisciplinary) have often been identified as a source of enhanced explanatory power and warrant for scientific theories. And we think our example confirms this.

5. TYPES, ROLES, AND CONTEXT: AN EXAMPLE OF THEORETICAL REFINEMENT
There has in fact already been much illuminating analysis of theoretical issues in FRBR, analysis which, as described above, surfaces rich connections with successful mature theories in other fields. We hope to survey this work, which has been valuable to us, at a later time. Here we discuss, from our own experience, just one specific case of interdisciplinary theoretical refinement.

In 2002 we noticed an interesting entity assignment puzzle: is an XML Document a FRBR manifestation or a FRBR expression?[14] There were good arguments on both sides. We concluded that the assignment depended on context of use, but it was unclear how either to reconcile this context-dependency with the FRBR ER model or to revise that model to accommodate it.

We soon realized that this problem was similar to a previous puzzle about the taxonomy of descriptive markup.[12] That puzzle had been resolved by using the notion of illocutionary force (similar to grammatical mood), from linguistics (pragmatics in particular), to clarify an ambiguity: orthographically and semantically identical XML markup could vary in illocutionary force depending on context of use. The context-dependency of manifestation/expression assignments described above seemed to be a partial generalization of this observation about XML markup.

Later another related puzzle was noticed: some XML markup seemed to simultaneously refer to both the textual string which it had as content and also to the referent of that string.[15] We found that this puzzle was also illuminated by a notion from pragmatics, this time presupposition (the distinction between what is asserted, and what is presumed as a semantic condition of an assertion). But these insights from linguistics, even in combination with other insightful work on theoretical problems with FRBR,[2][3] still did not help us reconcile context-dependency with the FRBR model.

The critical clue came from computer scientists doing “applied ontology”. Guarino and Welty have developed a method for evaluating modeling decisions which requires that properties designating entity types be rigid, in a sense defined using contemporary symbolic modal logic but meaning, roughly, that the property in question is possessed permanently and essentially, not contingently.[5][6][7] Properties that fail this test should not be considered types of entities, but other sorts of things, such as roles which entities enter into in particular circumstances. Originally we were interested in how these ontology evaluation rules might help with the Bechamel XML semantics project.[13] However, it soon occurred to us to apply them to FRBR, and when we did we noticed that manifestation and expression both seemed to fail the rigidity test. This suggests that manifestation and expression are not types of entities, strictly speaking. And on further reflection they in fact do seem more like roles that some types of entities may have in particular circumstances.[17]

Additional corroboration of this reinterpretation then came from aesthetics and the philosophy of social science: Levinson’s analysis of musical works as natural (though abstract) objects in a specific social context,[10] and Searle’s theory of social objects as natural objects in some specific social context.[18]
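As an informal illustration only, the reading of expression and manifestation as roles rather than types can be sketched in a few lines of Python. This is not part of FRBR or of Guarino and Welty's method. `Entity` is a hypothetical placeholder for whatever the true rigid types turn out to be, the context labels are invented, and `held_in_all_recorded_contexts` is only a crude, finite stand-in for the modal rigidity condition.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Tuple


class Role(Enum):
    """FRBR notions treated as roles (an assumption of this sketch)."""
    EXPRESSION = "expression"
    MANIFESTATION = "manifestation"


@dataclass(frozen=True)
class Entity:
    """Hypothetical placeholder for a rigid type; the true types are left open."""
    label: str


@dataclass
class RoleAssignments:
    """Records which role an entity plays in which context of use."""
    by_context: Dict[Tuple[Entity, str], Role] = field(default_factory=dict)

    def assign(self, entity: Entity, context: str, role: Role) -> None:
        self.by_context[(entity, context)] = role

    def role_of(self, entity: Entity, context: str) -> Optional[Role]:
        # Returns None when nothing has been recorded for this context.
        return self.by_context.get((entity, context))

    def held_in_all_recorded_contexts(self, entity: Entity, role: Role) -> bool:
        # Finite approximation only: true rigidity quantifies over all
        # possible circumstances, not just the ones recorded here.
        roles = [r for (e, _), r in self.by_context.items() if e == entity]
        return bool(roles) and all(r is role for r in roles)


# Invented contexts: the same entity plays different roles in each.
assignments = RoleAssignments()
xml_doc = Entity("an XML document")
assignments.assign(xml_doc, "context A", Role.EXPRESSION)
assignments.assign(xml_doc, "context B", Role.MANIFESTATION)

print(assignments.role_of(xml_doc, "context A"))
print(assignments.held_in_all_recorded_contexts(xml_doc, Role.EXPRESSION))
```

In this sketch the entity itself never changes; only its recorded role does, which is the sense in which a role, unlike a type, is held contingently.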
We found both of these influential views compelling and felt they supported our emerging sense that the entity types identified by FRBR were not true types, but rather roles that other entities had in particular social circumstances.[17]

In light of these converging accounts, from multiple disciplines, we felt confident in conjecturing that a more conceptually precise version of the FRBR model would have FRBR entities as roles, not types.[17] This leads naturally to a number of new questions, such as what the (true) entity types are that take on these different roles, and what the specific features of social contexts are in virtue of which they initiate and sustain these roles. We suspect that answering such questions will have us, again, drawing on work by linguists, computer scientists, philosophers, and others, as well as librarians and information scientists.

6. CONCLUSION
The preceding case illustrates by example how the theoretical refinement of a middle-range theory, such as FRBR, not only reveals and improves that theory’s explanatory power, but often does so by integrating perspectives from several disciplines.

7. ACKNOWLEDGMENTS
As usual we acknowledge the contributions of members of three GSLIS research groups: the Electronic Publishing Research Group, the Research Writers Group, and the Metadata Roundtable, as well as the GSLIS Center for Informatics Research in Science and Scholarship (CIRSS), directed by Carole Palmer. Participating in our discussions of FRBR during this period were Yunseon Choi, Thomas Dousa, Ingbert Floyd, Jin Ha Lee, Pat Lawton, Karen Medina, Christopher Phillippe, Sara Schmidt, Richard Urban, Xin Xiang, Karen Wickett, Oksana Zavalina, and other members of the FRBR community. We also thank Michael Sperberg-McQueen (W3C/MIT) and Claus Huitfeldt (Bergen) for important early criticisms.
Finally, we thank members of the Balisage markup community, and especially Ann Wrightson, for suggesting that the work of Guarino and Welty could help us with our formalization of XML document semantics. The usual disclaimers apply.

8. REFERENCES
[1] Carlyle, A. 2006. Understanding FRBR as a conceptual model: FRBR and the bibliographic universe. Library Resources & Technical Services 50, 4.
[2] Doerr, M., Hunter, J., and Lagoze, C. 2003. Towards a core ontology for information integration. Journal of Digital Information 4, 1.
[3] Doerr, M. and LeBoeuf, P. (eds). 2007. FRBR: Object-oriented definition and mapping to FRBR-ER. Version 0.8. International Working Group on FRBR and CIDOC CRM Harmonisation.
[4] Floyd, I. and Renear, A. H. 2007. What in the digital world is a FRBR item? Poster at The Annual Meeting of the American Society for Information Science, October.
[5] Guarino, N. and Welty, C. A. 2000. A formal ontology of properties. In Proceedings of the 12th European Workshop on Knowledge Acquisition, Modeling and Management. Lecture Notes In Computer Science. Springer-Verlag.
[6] Guarino, N. and Welty, C. 2002. Evaluating ontological decisions with OntoClean. Communications of the ACM 45, 2.
[7] Guarino, N. and Welty, C. 2004. An overview of OntoClean. In Staab, S. and Studer, R. (eds), The Handbook on Ontologies.
[8] International Federation of Library Associations. 1998. Functional Requirements for Bibliographic Records: Final Report. UBCIM Publications-New Series. Vol. 19, München: K.G.Saur.
[9] Joint Steering Committee for Revision of Anglo-American Cataloguing Rules. 2006. RDA: Resource Description and Access. Frequently Asked Questions. Retrieved on June 20, 2006 from: http://www.collectionscanada.ca/jsc/rdafaq.html
[10] Levinson, J. 1980. What a musical work is. The Journal of Philosophy 77, 5-28.
[11] Merton, R. K. 1968. On sociological theories of the middle range. In Social Theory and Social Structure. The Free Press.
[12] Renear, A. H. 2001. The descriptive/procedural distinction is flawed. Markup Languages: Theory and Practice 2, 4.
[13] Renear, A., Dubin, D., and Sperberg-McQueen, C. M. 2002. Towards a semantics for XML markup. In Proceedings of the 2002 ACM Symposium on Document Engineering (McLean, Virginia, USA). ACM, New York, NY.
[14] Renear, A. H., Phillippe, C., Lawton, P., and Dubin, D. 2003. An XML document corresponds to which FRBR Group 1 entity? In: Proceedings of Extreme Markup Languages 2003. Montreal, Canada, August.
[15] Renear, A. H., Lee, J. H., Choi, Y., and Xiang, X. 2005. Exhibition: a problem for conceptual modeling in the humanities. In: ACH/ALLC 2005. Conference Abstracts (2005) 2nd Edition. Victoria: Humanities Computing and Media Centre, University of Victoria. 176–179.
[16] Renear, A. H. and Choi, Y. 2006. Modeling our understanding, understanding our models - the case of inheritance in FRBR. In: Proceedings of the 69th ASIS&T Annual Meeting, Austin, Texas.
[17] Renear, A. H. and Dubin, D. 2007. Three of the four FRBR Group 1 entity types are roles, not types. In: Proceedings of the 70th ASIS&T Annual Meeting, Milwaukee WI.
[18] Searle, J. R. 1995. The Construction of Social Reality. New York: The Free Press.
[19] Sutton, R. I. and Staw, B. M. 1995. What theory is not. Administrative Science Quarterly 40.
[20] Svenonius, E. 2001. The Intellectual Foundation of Information Organization. MIT Press.
[21] Tillett, B. B. 2004. What is FRBR? A Conceptual Model for the Bibliographic Universe. Washington: Library of Congress, Cataloging Distribution Service. Online: http://www.loc.gov/cds/downloads/FRBR.PDF