Toward an
International Sharing and Use of
Subject Authority Data
Marcia Lei Zeng
Athena Salaba
Kent State University
FRBR Workshop, OCLC, 2005
Outline
1. Background information
2. Current State
3. Authority Data
4. Sharing Authority Data
1. Background
1.1 Subject access
Seeking information on a topic is still the predominant user task
Subject access includes:
Subject searching
Keyword searching
Subject browsing
It is still very problematic for the majority of searchers
1.2 Functions of a catalog regarding subject access (1)
Cutter (1897)
To find a book if the subject is known
To show what a library has on a given subject (collocate)
To assist in the choice of a book as to its character (identify)
1.2 Functions of a catalog regarding subject access (2)
FRBR (1998)
To find entities of Group 1 that have entities from Groups 1, 2, or 3 as their subject
To identify
To select
To obtain
1.3 What is a subject?
FRBR – Functional Requirements for Bibliographic Records
Group 1
Work
Expression
Manifestation
Item
Group 2
Persons
Families
Corporate bodies
Group 3
Concepts
Objects
Place
Event
Revisiting Group 3?
Time
Process
Event is a combination of place and time
Concrete vs. abstract concept
Ranganathan
Personality
Matter
Energy
Space
Time
2. Current State
Subject Authority Data
2.1 Structure (heterogeneous)
2.2 Existing Knowledge Organization Systems/Structures/Schemas (KOS)
2.3 Rules and guidelines
2.4 Communication/Encoding
2.1 Structures
Relationship Groups:
Ontologies
Semantic networks
Thesauri
Classification & Categorization:
Classification schemes
Taxonomies
Categorization schemes
Subject Headings
Term Lists:
Synonym Rings
Authority Files
Glossaries/Dictionaries
Gazetteers
Pick lists
Natural language ... Controlled language
Structures: Coordination
Precoordination ... Post-coordination
Precoordination, e.g. subject headings: LCSH
Post-coordination, e.g. thesauri: AAT, INSPEC
Also on the spectrum: MeSH, FAST, UMLS
2.2 Existing KOS (1)
Library of Congress Subject Headings (LCSH)
Medical Subject Headings (MeSH)
ERIC Thesaurus (ERIC)
Inspec Thesaurus
Inspec Classification
Dewey Decimal Classification (DDC)
Library of Congress Classification (LCC)
Universal Decimal Classification (UDC)
HEREIN Thesaurus
Alexandria Digital Library (ADL) Gazetteer and Thesaurus
Schlagwortnormdatei (SWD)
Regensburger Verbundklassifikation (RVK)
RAMEAU: Répertoire d'autorité-matière encyclopédique et alphabétique unifié
Art and Architecture Thesaurus (AAT)
National Agricultural Library Subject Headings
… …
2.2 Existing KOS (2)
Structure
Verbal based: AAT, LCSH, RAMEAU, INSPEC Thesaurus, MeSH
Code based: UDC, RVK, DDC, LCC
Integrated: INSPEC, MeSH Hierarchy
[Figure: KOS in a global environment, with axes Structure (V, C) and Language]
2.3 Rules of KOS Construction
Different rules and guidelines
AACR2, Z39.19, RAK (Regeln für die alphabetische Katalogisierung), ISO 5964, ISO 2788, IFLA Principles Underlying Subject Heading Languages (SHLs) …
No rules
Indirect/Inherent use of rules (by example)
2.4 Communication/Encoding for authority data
MARC
MARC21 (1xx, 2xx, etc.)
UNIMARC (1xx, 2xx, etc., with different definitions)
etc.
Guidelines for Authority Records and References (GARR) (>, <, >>, <<)
NISO Z39.19 (BT, NT, RT, etc.)
XML-based: OWL Web Ontology Language, RDF Schema, Voc-ML, etc.
3. Authority Data
3.1 Use of authority data
Direct use of authority data
Index
Identify/Verify
Search & Browse the authority data
Indirect use of authority data
Searching bibliographic file
Browsing bibliographic file
Users
Information professionals
Searcher/end-user
3.2 Common Authority Data
Authorized/established term
Variations
Related terms
Notes
Linked/Parallel terms
Numbering, International numbering?
Other: language, rules, links to external resources, roles, etc.
Do we need one authorized term?
Keep USER in mind!
Preference, language, script
Trends: all are preferred
Synonym rings (included in NISO Z39.19 now)
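The "all are preferred" trend can be illustrated with a synonym ring, where no single term is singled out as the authorized form and any member retrieves the whole ring. The sketch below is only an illustration under stated assumptions: the class name `SynonymRing`, the ring identifier `c1`, and the sample labels are hypothetical and are not taken from any actual authority file.

```python
def _key(term):
    """Normalize a label for lookup: case-insensitive, single-spaced."""
    return " ".join(term.casefold().split())


class SynonymRing:
    """Hypothetical sketch: every label in a ring is equally 'preferred'."""

    def __init__(self):
        self._members = {}  # ring id -> set of labels as entered
        self._ring_of = {}  # normalized label -> ring id

    def add_ring(self, ring_id, labels):
        members = self._members.setdefault(ring_id, set())
        for label in labels:
            members.add(label)
            self._ring_of[_key(label)] = ring_id

    def expand(self, term):
        """Return all labels in the ring that contains `term`.

        A term that belongs to no ring is returned unchanged.
        """
        ring_id = self._ring_of.get(_key(term))
        if ring_id is None:
            return [term]
        return sorted(self._members[ring_id])


# Illustrative labels only (English and French variants of one concept).
ring = SynonymRing()
ring.add_ring("c1", ["Automobiles", "Cars", "Autos", "Voitures"])
expanded = ring.expand("cars")
```

Here a search on any variant, in any of the included languages, is expanded to the full ring, so the display form can follow the user's preference, language, or script.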
3.3 Common Semantic Relationships in Authority Data
Semantic relationships
Broad categories
Equivalence (Use, Used For, UF, See)
Hierarchical (BT, NT, see also)
Associative (RT, see also)
More specific relationships, such as:
Is part of
Is instance of
Agent/process
Process/product
Need for other types of relationships?
ADL, such as:
Overlap; administrativePartOf; SubFeatureOf
UMLS, such as:
Like; Parent; Child; Sibling
WordNet, such as:
Familiarity; derivationally related
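As a rough illustration of the three broad categories (equivalence, hierarchical, associative), the following sketch stores USE/UF, BT/NT, and RT links and keeps the reciprocal side of each link in step. The class `Thesaurus`, its method names, and the sample terms are assumptions made for this example; they do not reproduce any particular system's data model.

```python
from collections import defaultdict


class Thesaurus:
    """Hypothetical in-memory store for the three broad relationship types."""

    def __init__(self):
        self.use = {}                # entry term -> preferred term (USE)
        self.uf = defaultdict(set)   # preferred term -> entry terms (UF)
        self.bt = defaultdict(set)   # term -> broader terms (BT)
        self.nt = defaultdict(set)   # term -> narrower terms (NT)
        self.rt = defaultdict(set)   # term -> related terms (RT)

    def add_equivalence(self, preferred, variant):
        # USE and UF are reciprocal: record both directions.
        self.use[variant] = preferred
        self.uf[preferred].add(variant)

    def add_hierarchy(self, broader, narrower):
        # BT and NT are reciprocal: record both directions.
        self.bt[narrower].add(broader)
        self.nt[broader].add(narrower)

    def add_association(self, a, b):
        # RT is symmetric.
        self.rt[a].add(b)
        self.rt[b].add(a)

    def resolve(self, term):
        """Follow a USE reference to the preferred term, if there is one."""
        return self.use.get(term, term)

    def ancestors(self, term):
        """Collect all broader terms, following BT links transitively."""
        seen = set()
        stack = [self.resolve(term)]
        while stack:
            for broader in self.bt.get(stack.pop(), ()):
                if broader not in seen:
                    seen.add(broader)
                    stack.append(broader)
        return seen


# Illustrative terms only.
t = Thesaurus()
t.add_hierarchy("Vehicles", "Automobiles")
t.add_hierarchy("Automobiles", "Sports cars")
t.add_equivalence("Automobiles", "Cars")
t.add_association("Automobiles", "Roads")
```

More specific relationships (is part of, is instance of, agent/process, and so on) could be modeled in the same way by adding further typed link tables, which is one reason the question of additional relationship types affects the record structure.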
Unanswered Question
What authority data currently exist in an authority record? or
What authority data should be included in an authority record?
4. Sharing Authority Data in a Global Environment
4.1. Challenges
Structures
Languages and scripts
Rules
Encoding
[Figure: KOS in a global environment, with axes Structure (V, C) and Language]
4.2. Projects Specifically for Subject Authority Data Sharing
Construction (not to be discussed here)
Implementation
Projects based on different types of structures
Projects involving multiple languages
KOS Types
KOS types covered: thesaurus; classification scheme; subject heading list; controlled term list (UMLS also includes coding systems)
Projects based on different structural types of KOS (languages involved)
UMLS: multiple languages
HILT: multiple languages
UC Berkeley DARPA Unfamiliar Metadata Project: English, French, German, Russian, Spanish
Polish Project: English, Polish
Megathesaurus, H.W. Wilson: English
Classification Web: English
WebDewey: English
CARMEN: German, English
Finnish Project: Finnish
Projects based on similar structural types of KOS (languages involved)
Renardus: multiple languages
MACS: English, French, German
Merimee: English, French
HEREIN: Spanish, French, English
LCSH/MeSH: English
MSC/DDC: English
SAB/DDC: Swedish, English
CAMed: English, French
[Figure: levels at which sharing can occur: KOS, vocabularies, authority files, bibliographic files]
Sharing at Vocabulary Level
Vocabularies: adaptation, extension, extraction, translation, etc.
1. Direct mapping
National database "Mérimée" about the French Heritage
The Thesaurus of Architecture (Le thésaurus de l'architecture) was created and mapped to the Art and Architecture Thesaurus (AAT) and the English Heritage Thesaurus (NMR)
2. Using a switching system
Renardus project
“a cross-browsing feature based on the DDC and improved subject searching across distributed and heterogeneous European subject gateways.”
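The switching idea can be sketched as follows: each local vocabulary is mapped once to a shared switching language (a classification such as the DDC in the Renardus case), and term-to-term translation is derived through the shared class. This is a minimal sketch under stated assumptions, not a description of how Renardus is implemented; the class `SwitchingSystem`, the gateway names, the sample terms, and the class number used are all illustrative.

```python
from collections import defaultdict


class SwitchingSystem:
    """Hypothetical sketch: local terms are linked only via a shared code."""

    def __init__(self):
        # (vocabulary, term) -> switching code
        self._to_switch = {}
        # switching code -> vocabulary -> set of terms
        self._from_switch = defaultdict(lambda: defaultdict(set))

    def map_term(self, vocab, term, code):
        """Record that `term` in `vocab` corresponds to switching `code`."""
        self._to_switch[(vocab, term)] = code
        self._from_switch[code][vocab].add(term)

    def translate(self, source_vocab, term, target_vocab):
        """Find target-vocabulary terms that share the source term's code."""
        code = self._to_switch.get((source_vocab, term))
        if code is None:
            return []
        return sorted(self._from_switch[code].get(target_vocab, ()))


# Illustrative data: two gateways mapped to one class number.
switch = SwitchingSystem()
switch.map_term("gateway_a", "Architecture", "720")
switch.map_term("gateway_b", "Architektur", "720")
switch.map_term("gateway_b", "Baukunst", "720")
result = switch.translate("gateway_a", "Architecture", "gateway_b")
```

The design choice is about maintenance cost: with n vocabularies, direct pairwise mapping needs on the order of n(n-1)/2 mapping sets, while a switching language needs only n, at the price of any precision lost when terms are reduced to a shared class.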
3. Creating a superstructure
UMLS® Metathesaurus®
Over 1,000,000 concepts and 4.3 million concept names from more than 100 controlled vocabularies, some in multiple languages
4. Creating a superstructure (an index)
UCB Unfamiliar Metadata Vocabularies
Accepts a query in the searcher's vocabulary and responds with a ranked list of terms from the system’s entry vocabulary, which is an index to five controlled vocabularies.
5. Creating a superstructure (a virtual index)
CAMed cross-thesaurus searching
Terms are linked in a temporary union list generated by the software in response to a query.
6. Linking through a thesaurus server protocol
UCSB Alexandria Digital Library
The Thesaurus Protocol is based on the ANSI/NISO Z39.19 (1993, R2003) thesaurus model and supports downloading, querying, and navigating thesauri.
Sharing at Subject Authority File Level
Direct mapping
MACS (Multilingual Access to Subjects)
LCSH and MeSH Mapping Project: sample authority records, Northwestern University Library
[Figure: metadata records carrying terms from thesaurus 1 and terms from thesaurus 2; labels S1 and S2]
Co-occurrence mapping works at the application level, i.e., in metadata records: subject terms from different vocabularies that are assigned to the same records can be treated as loosely mapped terms.
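A minimal sketch of co-occurrence mapping, assuming each metadata record carries one list of terms from thesaurus 1 and one from thesaurus 2. The function name `cooccurrence_mappings`, the `min_count` threshold, the scoring rule (share of a term's records in which the pair co-occurs), and the sample records are all assumptions chosen for illustration; actual projects may weight and filter candidates quite differently.

```python
from collections import Counter
from itertools import product


def cooccurrence_mappings(records, min_count=2):
    """Suggest loose mappings from thesaurus-1 terms to thesaurus-2 terms.

    `records` is an iterable of (terms_from_thesaurus_1, terms_from_thesaurus_2)
    pairs, one pair per metadata record. A candidate mapping is kept when the
    two terms appear together in at least `min_count` records.
    """
    pair_counts = Counter()  # (term1, term2) -> records containing both
    t1_counts = Counter()    # term1 -> records containing it
    for terms1, terms2 in records:
        set1, set2 = set(terms1), set(terms2)
        t1_counts.update(set1)
        pair_counts.update(product(set1, set2))

    mappings = {}
    for (term1, term2), n in pair_counts.items():
        if n >= min_count:
            score = n / t1_counts[term1]
            mappings.setdefault(term1, []).append((term2, score))
    for candidates in mappings.values():
        # Highest score first; ties broken alphabetically.
        candidates.sort(key=lambda item: (-item[1], item[0]))
    return mappings


# Illustrative records only.
records = [
    (["Heart attack"], ["Myocardial infarction"]),
    (["Heart attack"], ["Myocardial infarction", "Cardiology"]),
    (["Heart attack", "Diet"], ["Nutrition"]),
    (["Diet"], ["Nutrition"]),
]
mappings = cooccurrence_mappings(records)
```

Because the links come from indexing practice rather than from an intellectual comparison of the vocabularies, the resulting pairs are best read as ranked suggestions, which is why the mapping is described as loose.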
So far, Functional Requirements for Authority Records (FRAR) covers:
Names for persons, families, corporate bodies (Group 2)
Titles (Group 1)
Projects for Authority Data Sharing focus mainly on names:
ONE Shared Authority Control (ONESAC)
Virtual International Authority File (VIAF)
Linking and Exploring Authority Files (LEAF)
Hong Kong Chinese Authority (Name) (HKCAN)
FRSAR:
Functional Requirements for Subject Authority Data
Scope: focus on FRBR’s Group 3 entities
FRSAR Working Group contact: Marcia Zeng mzeng@kent.edu
Maja Zumer
Athena Salaba asalaba@kent.edu
Build a conceptual model of Group 3 entities within the FRBR framework (entities in Group 1 and Group 2 can be used as the subjects of works, but further inclusion of them will depend on the outcomes of the work of the FRANAR Working Group)
Provide a clearly defined, structured frame of reference for relating the data that are recorded in subject authority records to the needs of the users of those records
Assist in an assessment of the potential for international sharing and use of subject authority data both within the library sector and beyond