Dr. Sherry Vellucci
Information Organization

“In the colossal labor, which exhausts both body and soul, of making into an alphabetical catalog a multitude of books gathered from every corner of the earth there are many intricate and difficult problems that torture the mind.”
Thomas Hyde, Catalogue for the Bodleian Library, 1674.

Why Organize Information?
User has information need
Document exists that meets the information need

What is Information Organization?
The process of creating, arranging, and maintaining systems for bibliographic information retrieval
Organization of the materials and information that we collect or provide access to in libraries, museums, archives, and information centers
Information Organization differs depending on environment

Functions of Information Organization
Primary:
  – Provide access to recorded information for the purpose of retrieval
      Bring together related documents
      Distinguish between similar documents
Secondary:
  – Keep inventory of what we have and where it is located
  – Keep recorded information usable for posterity

Subsets of Information Organization
Cataloging & metadata
Classification
Indexing and abstracting
Database design
Information architecture
Content management
Knowledge management

Trends in Catalog Creation
Ancient times - Simple lists
Middle Ages - Inventories
Sixteenth & Seventeenth Century - Finding lists
Eighteenth Century - Codification begins
Nineteenth Century - Collocating devices
Twentieth Century - Expanded codification & mechanization
Twenty-first Century - ?

What Is a Catalog?
“A retrieval tool that provides access to individual items within collections of information packages” (Taylor, 1999)
“An organized set of bibliographic records that represent the holdings of a particular collection.” (Wynar)

Bibliographic (Metadata) Records
Surrogates for information packages in the collection
Include standardized descriptions
Form a catalog when arranged or accessed systematically
(Also called bibliographic records, catalog records, entries)

Access Points
Any term in a metadata record that may be used to locate that record
Controlled access point
  – An authorized (preferred) form of access point
  – Constructed with information in a certain order
  – Maintained under authority control

Types of Bibliographic Control
Control of a Body of Literature
  – Indexes (& Abstracts)
  – Bibliographies
Control of Collections
  – Catalogs
  – Finding Aids
  – Museum Registers
Control of Knowledge
  – Knowledge Management

Levels of Access
Macro level access
  – Broad in scope
      Entire book
      Complete serial
      Complete archival collection
  – Macro level tools
      Catalogs
Micro level access
  – Narrower in scope of description
      Chapter in book
      Article in serial
      Individual items in archive or museum
  – Micro level tools
      Indexes
      Abstracting services
      Databases

Cutter’s Objects of the Catalog
1) To enable a person to find a book when one of the following is known:
  – The author
  – The title
  – The subject
2) To show what the library has:
  – By a given author
  – On a given subject
  – In a given kind of literature
3) To assist in the choice of a book
  – As to the edition (bibliographically)
  – As to its character (literary or topical)
From Rules for a Dictionary Catalog, 1876, 4th ed., 1904
In short: 1. Find 2. Collocate 3. Evaluate

FRBR User Tasks
Find (locate)
Relate/Navigate (Collocate [Svenonius])
Identify
Select
Obtain
Other possible tasks:
  – Attribute royalties to
  – Preserve

Assumptions
Objective 1: User can express the information need and translate it into the language of the system
Objective 2: User's need requires looking at related sets of information (all documents by a given author, on a given subject, in a certain genre)
Objective 3: User finds multiple manifestations of a work and needs to evaluate the surrogates in order to select the appropriate document

Problems
How do we operationalize open-ended objectives?
Success of an objective must be measurable
To be measurable, an objective must be specific

Intellectual Issues
Representation: concise depiction of complex information
  – Document surrogates
  – Describe attributes of the document
Classification: a scheme for organizing information packages or concepts

Problem: What are We Organizing?
Recorded information: meaningful symbols (letters, numbers, etc.), sounds or images created or collected to convey a message
  – Why do we use the term “recorded information” instead of just information?
Document
  – An information package
  – Often associated with text printed on paper
  – Broader context includes videos, sound recordings, graphics, computer files, etc.
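Returning to the find and collocate objectives listed earlier: one way to make an open-ended objective specific enough to measure is to state it as an operation on a set of surrogate records. The following is a minimal sketch, assuming a hypothetical in-memory catalog; the field names and the `find` and `collocate` helpers are illustrative assumptions, not part of any cataloging standard.

```python
from collections import defaultdict

# Hypothetical surrogate records; field names are assumptions for illustration.
catalog = [
    {"author": "Tolkien, J.R.R.", "title": "The Lord of the Rings", "year": "1954-55"},
    {"author": "Tolkien, J.R.R.", "title": "The Lord of the Rings", "year": "1998"},
    {"author": "Beard, Henry N.", "title": "Bored of the Rings", "year": "1969"},
]


def find(records, field, value):
    """Find: return the records whose access point `field` equals `value`."""
    return [r for r in records if r[field] == value]


def collocate(records, field):
    """Collocate: group records that share the same value for `field`."""
    groups = defaultdict(list)
    for r in records:
        groups[r[field]].append(r)
    return dict(groups)


# A specific, checkable statement of each objective.
by_author = collocate(catalog, "author")
print(len(find(catalog, "title", "Bored of the Rings")))
print(len(by_author["Tolkien, J.R.R."]))
```

Stated this way, each objective has a result that can be counted and compared against what the collection actually holds, which is the sense of "measurable" used above.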
Functional Requirements for Bibliographic Records
What FRBR is:
  – a logical framework
  – a conceptual model
  – a "generalized" view of the bibliographic universe
  – Available at http://www.ifla.org/VII/s13/frbr/frbr.htm
What FRBR is not:
  – a data model
  – an implementation model
  – a conceptual model for authority records
  – a conceptual model for subjects

FRBR Functions
Specifically identify what is being described
Improve catalog displays
Provide common conceptual model & language

Entity-Relationship Model
[Diagram: Entity 1 and Entity 2, each with attributes (Title, Creator, Subject), linked by a Relationship]
Group 1 Entities: Products of intellectual or artistic endeavour
Group 2 Entities: Those responsible for the intellectual & artistic content, physical production, or custodianship
Group 3 Entities: Entities that serve as subjects of intellectual or artistic endeavour

Group 1 Entities & Their Relationships
A Work “is realized through” an Expression; an Expression “realizes” a Work
An Expression “is embodied in” a Manifestation; a Manifestation “embodies” an Expression
A Manifestation “is exemplified by” an Item; an Item “exemplifies” a Manifestation

LS vs. IS Terminology Comparison
FRBR Terms       I.S. Terms
Work             Message
Expression       Text
Manifestation    Document
Item             Instantiation

W1 Tolkien, The Lord of the Rings
  E1 English text: The Lord of the Rings
    M1 English: London, Allen & Unwin, 1954-55, 3 v.
    M2 English: New York, Facsimile Reprints, 1965
      I1 VUW Library, copy 1, signed by the author
    M3 English: London, Harper Collins, 1998, 3 v.
  E2 German text: Der Herr der Ringe, translated by Margaret Carroux
    M1 German: Stuttgart, Ernst Klett, 1968, 3 v.

Work/Expression/Manifestation/Item Relationships
  E3 Spoken word performance: The Lord of the Rings, read by Ian Holm
    M1 Sound recording: BBC Audiobooks, 2003, 13 compact discs

Bibliographic Relationships
Equivalent
Derivative
Descriptive
Whole-part
Sequential
Accompanying
Shared characteristics
Barbara Tillett, Richard Smiraglia, Sherry Vellucci, Allyson Carlyle
Barbara B. Tillett, “Bibliographic Relationships.” In Relationships in the Organization of Knowledge, edited by Carol A. Bean and Rebecca Green, 19-35. Dordrecht: Kluwer Academic Publishers, 2001.

[Figure: Family of Works, ranging from Same Expression through New Expression to New Work (B. Tillett, Dec. 2001)]

Equivalent Relationships
Multiple manifestations with identical content
W1 The Lord of the Rings
  E1 English language text
    M1 Allen & Unwin, 1954-55.
    M2 Facsimile Reprints, Inc., 1965.
    M3 Harper Collins, 1998.
Tolkien, J.R.R. (John Ronald Reuel), 1892-1973. The Lord of the Rings
  – Books--English
      + London: Allen & Unwin, 1954-55.
      + New York: Facsimile Reprints, Inc., 1965.
      + London: Harper Collins, 1998.

Derivative Relationships: Same work
Editions
Translations
Performances
Tolkien, J.R.R. (John Ronald Reuel), 1892-1973. The Lord of the Rings
  – E1 Books--German
      + M1 Trans. by Margaret Carroux. Stuttgart: Ernst Klett, 1968.
  – E2 Spoken word recording--English
      + M1 London: BBC Audiobooks, 2003.

Derivative Relationships: New works
Parodies
Adaptations
Beard, Henry N. Bored of the Rings: a Parody of J.R.R. Tolkien’s the Lord of the Rings. New York: New American Library, 1969.
Strachey, Barbara. Journeys with Frodo: an Atlas of J.R.R. Tolkien’s The Lord of the Rings. London: Grafton, 1992.
The Lord of the Rings. Screenplay by Fran Walsh, Philippa Boyens and Peter Jackson based on the books by J.R.R. Tolkien; produced by Barrie M. Osborne, Peter Jackson, Fran Walsh, Tim Sanders; directed by Peter Jackson. [London?]: New Line Cinema, 2002.
Knizia, Reiner. The Lord of the Rings Board Game. Illustrations by John Howe. Cambridge: Sophisticated Games, 2001.

Whole-Part Relationships
Components
Aggregates
The Lord of the Rings = aggregate work = work of works
  The Fellowship of the Ring = component part = work
  The Two Towers = component part = work
  The Return of the King = component part = work
The Lord of the Rings Game contains 2 books, 2 map sheets, 9 character sheets, rules, contents sheets, 4 red dice, cardboard counters, map errata

Sequential Relationships
Part to part (or chronological) relationship
  Part 1: The Fellowship of the Ring
  Part 2: The Two Towers
  Part 3: The Return of the King
The Lord of the Rings Official Fan Club Magazine
  Vol. 1, no. 1; vol. 1, no. 2 …

Accompanying Relationships
Manifestation is accompanied by additional material
Shore, Howard. The Lord of the Rings: the Motion Picture Trilogy: Instrumental Solos. Music arranged for trombone by Tod Edmonsen. Miami: Warner Bros, 2004. 1 part (25 p.) + 1 sound disc (4 ¾ in.)
The Lord of the Rings. Extended edition includes 4 DVDs: 1: Part One; 2: Part Two; 3: Appendices Part One: From Book to Vision; 4: Appendices Part Two: From Vision to Reality. + 1 booklet with explanation of the extended edition; documentary appendices on the making of the movie; complete listing of scenes, with new scenes and extended scenes identified; and diagrams detailing how the book was transformed into visual form.

Descriptive Relationships: New works
Commentaries
Evaluations
Criticisms
Reviews
• Simpson, Dale. Modernized Myth: Beowulf, J.R.R. Tolkien and the Lord of the Rings.
• Miesel, Sandra. Myth, Symbol and Religion in the Lord of the Rings.
• Smith, Jim E. The Lord of the Rings: The Films, the Books, the Radio Series.
• Fisher, Jude. The Lord of the Rings Location Guidebook.
• Astin, Sean. There and Back Again: Behind-the-Scenes on the Lord of the Rings.
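The Work, Expression, Manifestation, and Item chain used in the Lord of the Rings examples above can be pictured as nested records. The following is a minimal sketch, assuming hypothetical class and field names; FRBR is a conceptual model rather than a data model, so this is only one illustrative reading of it, and placing the signed copy under the 1965 manifestation is likewise an assumption about the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    label: str  # a single copy that "exemplifies" a manifestation


@dataclass
class Manifestation:
    publisher: str
    date: str
    items: List[Item] = field(default_factory=list)  # "is exemplified by"


@dataclass
class Expression:
    form: str  # e.g. "English text"
    manifestations: List[Manifestation] = field(default_factory=list)  # "is embodied in"


@dataclass
class Work:
    creator: str
    title: str
    expressions: List[Expression] = field(default_factory=list)  # "is realized through"


def equivalent_manifestations(work: Work, form: str) -> List[str]:
    """List the manifestations that embody one expression of a work.

    Manifestations sharing an expression stand in an equivalent relationship.
    """
    return [
        f"{m.publisher}, {m.date}"
        for e in work.expressions
        if e.form == form
        for m in e.manifestations
    ]


lotr = Work(
    creator="Tolkien, J.R.R.",
    title="The Lord of the Rings",
    expressions=[
        Expression("English text", [
            Manifestation("Allen & Unwin", "1954-55"),
            Manifestation("Facsimile Reprints", "1965",
                          [Item("VUW Library copy 1, signed by the author")]),
            Manifestation("Harper Collins", "1998"),
        ]),
        Expression("German text", [Manifestation("Ernst Klett", "1968")]),
    ],
)

print(equivalent_manifestations(lotr, "English text"))
```

Nesting each level inside the one above keeps the "is realized through", "is embodied in", and "is exemplified by" relationships explicit, so a display can collapse or expand a work in the way the catalog listings above do.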
FRBR Group 2 Entities
“The Group 2 entities represent those responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship of the entities in the first group” (FRBR, p. 13)
Group 2 entities include:
  – Persons
  – Corporate bodies

[Figure: Relationships of FRBR Group 1 Entities to FRBR Group 2 Entities (FRBR, p. 14)]

Group 1 : Group 2 Relationships
w1 The Lord of the Rings “created by” p1 J.R.R. Tolkien
e1 The Lord of the Rings [spoken word recording] “performed by” p2 Ian Holm
m1 The Lord of the Rings [motion picture, 2002] “distributed by” cb1 New Line Cinema Home Entertainment
i1 The Lord of the Rings [published English text, 1965] “owned by” cb2 Victoria University Library

FRBR Group 3 Entities
The Group 3 entities serve as the subjects of works
The group includes
  – concept (an abstract notion or idea)
  – object (a material thing)
  – event (an action or occurrence)
  – place (a location)
In addition, all entities in Groups 1 and 2 can serve as subjects for a work

[Figure: FRBR relationships of a Work to entities that can serve as the subject of a work (FRBR, p. 15)]

Group 1 : Group 3 Relationships
c1 Mythology
w1 J.R.R. Tolkien, The Lord of the Rings
“is the subject of”
w2 The Lord of the Rings: An Examination of Mythical Elements, by M.C. Stone
(FRBR, p. 63)

Information Representation
Organized by a special purpose language (ontologies & taxonomies)
  – Many such languages exist
      Linnaeus’ taxonomy of living things
      Educational resources thesaurus
  – Bibliographic language
      Subject language
      Document language

Information Organization in Libraries
Traditional processes:
  – Organize items on shelf by classification
  – Create & maintain catalog that provides access to information resources (surrogate records)
  – Create indexes & databases
  – Create bibliographies
New processes:
  – Create library portals
  – Provide access to variety of resources through unified interface
      Catalog, databases, resource links, archives, digital libraries, etc.
  – Customize for personal information (my library)
  – Create and organize digital libraries

Information Organization in Archives
Organize & arrange in groups by provenance (originator) and original order (closed stacks)
Create accession record (information about collection source and physical content) & finding aid (contents of collection)

Information Organization in Museums
Organize & describe objects in collection
Create accession/field records (information about the source of the object) and register (similar to catalog)
  – Description of visual objects is more complex than text
May also have libraries (include textual material) and archives in museums

Information Organization on the Internet
Libraries
  – Web bibliographies (Subject, Classification)
  – Metadata (MARC, Dublin Core)
Non-Libraries
  – Search engines
  – Subject directories
  – Automatic indexing & classification
  – Visual organization
      Concept maps
      Ontologies
      Taxonomies

Information Organization for Digital Libraries
Provides digitized resources with architecture and retrieval service
Design of retrieval & description system is part of creating the digital library
Increasing demand with distance education

Information Organization with Library Portals
Provide access to variety of resources through unified interface
  – Catalog, databases, resource links
Customizable for personal information

Information Architecture
“Process of designing, implementing and evaluating information spaces that are humanly and socially acceptable to the intended stakeholders” (Andrew Dillon)
  – Determine information needs of users
  – Create structural patterns for finding information
  – Develop user interface for information retrieval and display
  – Evaluate success of architecture for retrieval and display

Records Management
Originally involved keeping, filing, maintaining paper records
Computer files on individual PCs created organizational problems
Various systems used across organization (payroll, general ledger, accounts payable, inventories)
Data modeling used to create conceptual model of records management activities (directories, files, programs, database field values)

Knowledge Management
Identifying who knows what in an organization and capturing that knowledge using technology
Expanded into managing the information explosion in organizations
Tacit knowledge vs. explicit knowledge
Software used to create knowledge repositories, improve knowledge access, enhance the knowledge environment, manage knowledge as an asset

Metadata
Data about data
Structured data that describes the attributes of a resource, characterizes its relationships, supports its discovery, management, and effective use, and exists in an electronic environment

The Structure of Information
Unstructured data:
  7:13
Structured data:
  Q7 Timetable: Manhattan to Queens. Weekends only.
  Departs Times Square    Departs Queens Plaza    Arrives Jamaica Station
  6:58                    7:13                    7:32
  7:15                    7:30                    7:49
Data has context & description

Model of an Information Retrieval System (Lancaster)
Major function of an IR system is to “act as an interface between a particular population of users and the universe of information resources in printed or other form.”
Activities of IR systems:
1. Acquire & store documents (or surrogates)
2. Organize & control documents (or surrogates)
3. Distribute documents (or surrogates)

Subject Analysis Is . . .
The part of indexing or cataloging that deals with, first, the conceptual analysis of an information package … [and] with translating the conceptual analysis into the conceptual framework of the classification or subject heading system (Taylor, p. 132)

Step 1: Conceptual Analysis
Determining what the information package is “about” and/or determining what an item “is”
An indexer experienced with a controlled vocabulary may think of aboutness in the terms available

Problems in Determining Subject
Deciding aboutness is subjective
  – Predominance?
  – Frequency?
Deciding aboutness may depend on culture, background and knowledge of cataloger
  – Behaviorally: private
  – Socially: common ideas
  – Grammatically: different terms; concepts
Deciding interpretive, thematic, or iconographic significance for non-textual material requires a specialist

Determining Form
Form data are terms and phrases that designate specific kinds of genres or materials (Taylor, p. 255)
Types of form
  – Physical character: Videocassettes, photographs, maps
  – Type of data contained: Text, visual, audio, numeric
  – Arrangement of information contained: Encyclopedias, dictionaries, indexes, diaries, outlines
  – Style, technique, purpose or intended audience: Drama, romance, cartoons, algebra text

Exhaustivity
The number of terms that will be assigned by the cataloger/indexer
Determined by local policy & desired level of bibliographic control

Dimensions of Exhaustivity
Summarization level: Describes the overall subject content of the work as a whole, i.e., the dominant subject
  – Cataloging is at summarization level
  – Assign fewer & more general terms
  – Document retrieval
Depth level: Describes all main concepts of subject, including smaller units of information, i.e., chapters, articles, etc.
  – Indexing is at depth level
  – Assign more & specific terms
  – Information retrieval

Specificity
The level of subject analysis provided for by a particular controlled vocabulary
The closeness of fit between the meaning of an index term and the document’s themes and/or subthemes
“The Care & Feeding of Siamese Cats”
  Low specificity: Felines
  High specificity: Siamese cats

Classification
Oldest form of information organization (Aristotle)
  – Based on thought process
  – Mental models
      Classify
      Associate
      Bring like things together
      Differentiate among things
Primary types: hierarchical, faceted
Often associated with coding of some type
  – Symbols (numbers, letters, punctuation…)

Theories of Categories
Classical theory of categories based on commonalities
20th Century theories
  – Family resemblance (Wittgenstein; Austin)
  – Fuzzy Set Theory (Zadeh)
  – Distinct categories/cultural and linguistic differences (Lounsbury; Berlin & Kay)
  – Basic-level Categories (Brown)
  – Universal level of human naming (Berlin)
  – Prototype Theory (Rosch)
Example: musical instruments

Bibliographic Classifications Differ from Taxonomic Groupings
Documents are complex
  – Have combinations of topics, not just mutually exclusive, generic relationships
Documents classified based on literary warrant
Document arrangement can only be one-dimensional (linear order), i.e., show one kind of relationship
  – Need catalogue to supplement shelf-order

For Whom are We Organizing Information?
Users: people who have an information need
Users vary:
Experts
  – librarians, information professionals, researchers
  – people who know a domain and have some idea of vocabulary and the kind of information that’s likely to be available
Novices
  – people who never learned to use retrieval tools
  – people who only have a vague idea of what they’re looking for, e.g., a student assigned a research topic or a person who just found out that their relative has an obscure disease

Problems with Information Organization
Catalogers focus on bibliographic and authority control and languages
Accurate description does not always lead to successful query results
Cataloging process is not linked with the knowledge base of information retrieval

Understanding Users’ Perspectives
Move from system-centered to user-centered views of information systems
Designed for the user based on user input … bottom up rather than top down
Needs research into user needs, user modelling, and catalog information-seeking behavior

Broadened Perspective
Metadata has brought information organization onto center stage
  – Provides information that goes beyond description (administrative, structural, etc.)
  – Focuses primarily on digital information
  – Adopts/integrates use of search engines
  – Objectives can be operationalized, connected and measured
      Representation
      Visualization
      Searching
      Interface usability
Metadata has become important to businesses
  – Part of knowledge management
  – Often used in proprietary systems
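
The point that metadata provides information beyond description (administrative, structural, etc.) can be illustrated with a small record. The following is a minimal sketch, assuming hypothetical field names and placeholder values; it does not follow MARC, Dublin Core, or any other schema, and the `access_points` helper is an illustrative assumption.

```python
# Hypothetical metadata record; every field name and value is illustrative only.
record = {
    "descriptive": {  # what the resource is and is about
        "title": "The Lord of the Rings",
        "creator": "Tolkien, J.R.R.",
        "subject": ["Fantasy fiction"],  # assumed subject term
    },
    "administrative": {  # how the resource is managed
        "rights": "placeholder rights statement",
        "record_source": "placeholder cataloging agency",
    },
    "structural": {  # how the parts of the resource fit together
        "parts": [
            "The Fellowship of the Ring",
            "The Two Towers",
            "The Return of the King",
        ],
    },
}


def access_points(rec):
    """Collect every descriptive value that could be used to locate the record."""
    terms = []
    for value in rec["descriptive"].values():
        if isinstance(value, list):
            terms.extend(value)
        else:
            terms.append(value)
    return terms


print(access_points(record))
print(len(record["structural"]["parts"]))
```

Keeping the three kinds of metadata in separate sections makes it clear that only the descriptive part serves retrieval directly, while the administrative and structural parts support management and display.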