Toward an International Sharing and Use of Subject Authority Data

Marcia Lei Zeng

Athena Salaba

Kent State University

FRBR Workshop, OCLC, 2005

Outline

1. Background information
2. Current State
3. Authority Data
4. Sharing Authority Data

1. Background

1.1 Subject access

- Seeking information on a topic is still the predominant user task
- Subject access includes:
  - Subject searching
  - Keyword searching
  - Subject browsing
- It is still very problematic for the majority of searchers

1.2 Functions of a catalog regarding subject access (1)

Cutter (1897)

- To find a book if the subject is known
- To show what a library has on a given subject (collocate)
- To assist in the choice as to its character (identify)

1.2 Functions of a catalog regarding subject access (2)

FRBR (1998)

- To find entities of Group 1 that have entities from Group 1, 2, 3 as their subject
- To identify
- To select
- To obtain

1.3 What is a subject?

FRBR: Functional Requirements for Bibliographic Records

- Group 1: Work, Expression, Manifestation, Item
- Group 2: Persons, Families, Corporate bodies
- Group 3: Concepts, Objects, Place, Event
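The three entity groups and the "has as subject" relationship can be sketched as a small data model. This is only an illustration under stated assumptions: the names `FRBR_GROUPS`, `group_of`, `Entity`, and `Work`, and the sample work and its subjects, are hypothetical and not part of FRBR itself.

```python
from dataclasses import dataclass, field

# Entity types per FRBR group, as listed on the slide.
FRBR_GROUPS = {
    1: ("Work", "Expression", "Manifestation", "Item"),
    2: ("Persons", "Families", "Corporate bodies"),
    3: ("Concepts", "Objects", "Place", "Event"),
}


def group_of(entity_type: str) -> int:
    """Return the FRBR group number that an entity type belongs to."""
    for number, types in FRBR_GROUPS.items():
        if entity_type in types:
            return number
    raise ValueError(f"unknown FRBR entity type: {entity_type}")


@dataclass(frozen=True)
class Entity:
    label: str
    entity_type: str  # one of the type names in FRBR_GROUPS


@dataclass
class Work:
    title: str
    # Entities from Group 1, 2, or 3 may all serve as subjects of a work.
    subjects: list = field(default_factory=list)

    def subject_groups(self) -> set:
        """Group numbers represented among this work's subjects."""
        return {group_of(s.entity_type) for s in self.subjects}


# Hypothetical sample data, for illustration only.
work = Work("A history of a university (made-up title)")
work.subjects.append(Entity("Kent State University", "Corporate bodies"))
work.subjects.append(Entity("Higher education", "Concepts"))
work.subjects.append(Entity("Ohio", "Place"))

print(sorted(work.subject_groups()))  # [2, 3]
```

A subject search then amounts to finding Group 1 entities whose `subjects` list contains a given entity, whichever group that entity comes from.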

Revisiting Group 3?

- Time
- Process
- Event is a combination of place and time
- Concrete vs. abstract concept
- Ranganathan: Personality, Matter, Energy, Space, Time

2. Current State

Subject Authority Data

2.1 Structure (heterogeneous)
2.2 Existing Knowledge Organization Systems/Structures/Schemas (KOS)
2.3 Rules and guidelines
2.4 Communication/Encoding

2.1 Structures

- Relationship Groups: Ontologies, Semantic networks, Thesauri
- Classification & Categorization: Classification schemes, Taxonomies, Categorization schemes, Subject Headings
- Term Lists: Synonym Rings, Authority Files, Glossaries/Dictionaries, Gazetteers, Pick lists

[Diagram axis: Natural language to Controlled language]

Structures: Coordination

Precoordination ... Post-coordination

- Precoordination, e.g. subject headings: LCSH
- Post-coordination, e.g. thesauri: AAT, INSPEC
- Also shown on this spectrum: MeSH, FAST, UMLS

2.2 Existing KOS (1)

- Library of Congress Subject Headings (LCSH)
- Medical Subject Headings (MeSH)
- ERIC Thesaurus (ERIC)
- Inspec Thesaurus
- Inspec Classification
- Dewey Decimal Classification (DDC)
- Library of Congress Classification (LCC)
- Universal Decimal Classification (UDC)
- HEREIN Thesaurus
- Alexandria Digital Library (ADL) Gazetteer and Thesaurus
- Schlagwortnormdatei (SWD)
- Regensburger Verbundklassifikation (RVK)
- RAMEAU: Répertoire d'autorité-matière encyclopédique et alphabétique unifié
- Art and Architecture Thesaurus (AAT)
- National Agricultural Library Subject Headings
- …

2.2 Existing KOS (2)

Structure

- Verbal based: AAT, LCSH, RAMEAU, INSPEC Thesaurus, MeSH
- Code based: UDC, RVK, DDC, LCC
- Integrated: INSPEC, MeSH Hierarchy

[Figure: KOS in a global environment, positioned by structure (V, C) and language]

2.3 Rules of KOS Construction

- Different rules and guidelines
  - AACR2, Z39.19, RAK (Regeln für die alphabetische Katalogisierung), ISO 5964, ISO 2788, IFLA Principles Underlying Subject Heading Languages (SHLs) …
- No rules
- Indirect/Inherent use of rules (by example)

2.4 Communication/Encoding for authority data

- MARC
  - MARC 21 (1xx, 2xx, etc.)
  - UNIMARC (1xx, 2xx, etc., with different definitions)
  - etc.
- Guidelines for Authority Records and References (GARR) (>, <, >>, <<)
- NISO Z39.19 (BT, NT, RT, etc.)
- XML-based: OWL Web Ontology Language, RDF Schema, Voc-ML, etc.

3. Authority Data

3.1 Use of authority data

- Direct use of authority data
  - Index
  - Identify/Verify
  - Search & Browse the authority data
- Indirect use of authority data
  - Searching bibliographic file
  - Browsing bibliographic file
- Users
  - Information professionals
  - Searcher/end-user

3.2 Common Authority Data

- Authorized/established term
- Variations
- Related terms
- Notes
- Linked/Parallel terms
- Numbering, International numbering?
- Other: language, rules, links to external resources, roles, etc.
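The data elements listed above can be pictured as fields of a single record. The sketch below is an assumption-laden illustration, not a MARC or other standard layout: the class `SubjectAuthorityRecord`, its field names, and the sample values (including the record number) are all made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SubjectAuthorityRecord:
    """Hypothetical container for the common authority data elements."""
    authorized_term: str
    variations: list = field(default_factory=list)      # non-preferred forms
    related_terms: list = field(default_factory=list)
    notes: list = field(default_factory=list)
    parallel_terms: dict = field(default_factory=dict)  # language code -> term
    record_number: str = ""                             # local or international number
    language: str = ""
    rules: str = ""                                     # rules used to establish the term

    def entry_terms(self) -> list:
        """All terms that should lead a searcher to this record."""
        return [self.authorized_term, *self.variations]


# Illustrative values only; not taken from any real authority file.
record = SubjectAuthorityRecord(
    authorized_term="Cats",
    variations=["Domestic cats", "Felis catus"],
    related_terms=["Pets"],
    parallel_terms={"fr": "Chats", "de": "Katzen"},
    record_number="example-0001",
    language="en",
)

print(record.entry_terms())  # ['Cats', 'Domestic cats', 'Felis catus']
```

Keeping variations and parallel terms on the same record is what lets a system serve both direct use (verifying a term) and indirect use (searching the bibliographic file from any entry term).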

Do we need one authorized term?

- Keep USER in mind!
  - Preference, language, script
- Trends: all are preferred
  - Synonym rings (included in NISO Z39.19 now)
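A synonym ring treats every member as equally preferred, so a query on any one term is expanded to the whole ring. The following is a minimal sketch of that idea; the class name `SynonymRing`, the case-folding choice, and the sample terms are assumptions made for illustration.

```python
class SynonymRing:
    """Groups of equivalent terms with no single authorized form."""

    def __init__(self, rings):
        # Map each term (case-folded) to the full set of its equivalents.
        self._ring_of = {}
        for ring in rings:
            members = frozenset(term.lower() for term in ring)
            for term in members:
                self._ring_of[term] = members

    def expand(self, term):
        """Return every term equivalent to `term`, including itself."""
        key = term.lower()
        return sorted(self._ring_of.get(key, {key}))


# Illustrative ring; not taken from a published vocabulary.
rings = SynonymRing([{"cancer", "neoplasms", "tumors"}])

print(rings.expand("Tumors"))  # ['cancer', 'neoplasms', 'tumors']
print(rings.expand("asthma"))  # ['asthma']
```

Because no member is singled out, the user's own preference, language, or script can decide which form is displayed.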

3.3 Common Semantic Relationships in Authority Data

- Semantic relationships
  - Broad categories
    - Equivalence (Use, Used For, UF, See)
    - Hierarchical (BT, NT, see also)
    - Associative (RT, see also)
  - More specific relationships, such as:
    - Is part of
    - Is instance of
    - Agent/process
    - Process/product
- Need for other types of relationships?
  - ADL, such as: Overlap; administrativePartOf; SubFeatureOf
  - UMLS, such as: Like; Parent; Child; Sibling
  - WordNet, such as: Familiarity; derivationally related
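The broad relationship categories (equivalence, hierarchical, associative) are reciprocal: recording BT in one direction implies NT in the other. A minimal sketch of that bookkeeping follows, using the relationship codes named on the slide. The `Thesaurus` class, its method names, and the sample terms are assumptions for illustration and do not reproduce any particular system.

```python
from collections import defaultdict

# Each relationship code paired with its reciprocal.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    def __init__(self):
        # term -> relationship code -> set of target terms
        self._links = defaultdict(lambda: defaultdict(set))

    def relate(self, source, code, target):
        """Record a relationship and its reciprocal in one step."""
        self._links[source][code].add(target)
        self._links[target][RECIPROCAL[code]].add(source)

    def related(self, term, code):
        return sorted(self._links[term][code])

    def ancestors(self, term):
        """Follow BT links transitively to collect all broader terms."""
        seen, stack = set(), [term]
        while stack:
            for broader in self._links[stack.pop()]["BT"]:
                if broader not in seen:
                    seen.add(broader)
                    stack.append(broader)
        return sorted(seen)


# Illustrative terms only.
thesaurus = Thesaurus()
thesaurus.relate("Siamese cat", "BT", "Cats")
thesaurus.relate("Cats", "BT", "Mammals")
thesaurus.relate("Cats", "RT", "Pets")
thesaurus.relate("Felines", "USE", "Cats")

print(thesaurus.related("Cats", "NT"))     # ['Siamese cat']
print(thesaurus.related("Cats", "UF"))     # ['Felines']
print(thesaurus.ancestors("Siamese cat"))  # ['Cats', 'Mammals']
```

More specific relationships (is part of, is instance of, and so on) could be added as further codes in `RECIPROCAL`, each with its own inverse.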

Unanswered Question

What authority data currently exist in an authority record? or

What authority data should be included in an authority record?

4. Sharing Authority Data in a Global Environment

4.1. Challenges

- Structures
- Languages and scripts
- Rules
- Encoding

[Figure: KOS in a global environment, positioned by structure (V, C) and language]

4.2. Projects Specifically for Subject Authority Data Sharing

- Construction (not to be discussed here)
- Implementation
  - Projects based on different types of structures
  - Projects involving multiple languages

Projects by KOS Types and Languages

KOS types compared: thesaurus; classification scheme; subject heading list; controlled term list
[Per-project "x" marks under each KOS type are omitted: column alignment lost]

Projects based on different structural types of KOS

Project                                          Languages involved
UMLS                                             multiple languages
HILT                                             multiple languages
UC Berkeley DARPA Unfamiliar Metadata Project    English, French, German, Russian, Spanish
Polish Project                                   English, Polish
Megathesaurus, H.W.Wilson                        English
Classification Web                               English
WebDewey                                         English
CARMEN                                           German, English
Finnish Project                                  Finnish

Projects based on similar structural types of KOS

Project                                          Languages involved
Renardus                                         multiple languages
MACS                                             English, French, German
Merimee                                          English, French
HEREIN                                           Spanish, French, English
LCSH/MeSH                                        English
MSC/DDC                                          English
SAB/DDC                                          Swedish, English
CAMed                                            English, French

[Diagram: levels at which KOS can be shared: vocabularies, authority files, bibliographic files]

Sharing at Vocabulary Level

KOS to KOS, at the vocabulary level: adaptation, extension, extraction, translation, etc.

Sharing at Vocabulary Level

1. Direct mapping

National database "Merimee" about the French Heritage: the Thesaurus of Architecture (Le thésaurus de l'architecture) was created and mapped to the Art and Architecture Thesaurus (AAT) and the English Heritage Thesaurus (NMR).
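Direct mapping links each term in one vocabulary straight to its counterpart in another. The sketch below shows the general idea only: the vocabulary names `thesaurus_en_a` and `thesaurus_en_b`, the match types, and the term pairs are hypothetical and are not entries from the actual thesauri mentioned in this talk.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    source: str
    target: str
    target_vocabulary: str
    match: str  # "exact" or "close"; these match types are an assumption


# Made-up term pairs from a French source vocabulary to two English ones.
MAPPINGS = [
    Mapping("église", "churches", "thesaurus_en_a", "exact"),
    Mapping("église", "church", "thesaurus_en_b", "exact"),
    Mapping("moulin", "mills", "thesaurus_en_a", "close"),
]


def map_term(term, target_vocabulary, mappings=MAPPINGS):
    """Return the mapped terms for `term` in the requested vocabulary."""
    return [
        m.target
        for m in mappings
        if m.source == term and m.target_vocabulary == target_vocabulary
    ]


print(map_term("église", "thesaurus_en_a"))  # ['churches']
print(map_term("moulin", "thesaurus_en_b"))  # []
```

Each pair of vocabularies needs its own set of mappings, which is why this approach becomes costly as the number of vocabularies grows.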

Sharing at Vocabulary Level

2. Using a switching system

Renardus project: “a cross-browsing feature based on the DDC and improved subject searching across distributed and heterogeneous European subject gateways.”
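In a switching system, each vocabulary is mapped once to a shared intermediary (a classification such as the DDC in the case above), and terms are translated by passing through it. The following is a minimal sketch under stated assumptions: the class `SwitchingSystem`, the gateway names, and the sample registrations are hypothetical and do not describe the Renardus implementation.

```python
class SwitchingSystem:
    """Translate terms between vocabularies via a shared hub notation."""

    def __init__(self):
        self._to_hub = {}    # (vocabulary, term) -> hub notation
        self._from_hub = {}  # (vocabulary, hub notation) -> list of terms

    def register(self, vocabulary, term, hub_notation):
        """Map one local term to one notation in the hub scheme."""
        self._to_hub[(vocabulary, term)] = hub_notation
        self._from_hub.setdefault((vocabulary, hub_notation), []).append(term)

    def switch(self, term, source, target):
        """Find terms in `target` that share a hub notation with `term`."""
        hub = self._to_hub.get((source, term))
        if hub is None:
            return []
        return list(self._from_hub.get((target, hub), []))


# Hypothetical gateways, each mapped to DDC class 520 (astronomy).
switcher = SwitchingSystem()
switcher.register("gateway_de", "Astronomie", "520")
switcher.register("gateway_en", "Astronomy", "520")

print(switcher.switch("Astronomie", "gateway_de", "gateway_en"))  # ['Astronomy']
```

With a hub, N vocabularies need only N mappings, rather than a separate mapping for every pair.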

Sharing at Vocabulary Level

3. Creating a superstructure

UMLS® Metathesaurus®: over 1,000,000 concepts and 4.3 million concept names from more than 100 controlled vocabularies, some in multiple languages.

Sharing at Vocabulary Level

4. Creating a superstructure (an index)

UCB Unfamiliar Metadata Vocabularies: accepts query vocabularies and responds with a ranked list of the system’s entry vocabularies, which serve as an index to five controlled vocabularies.

Sharing at Vocabulary Level

5. Creating a superstructure (a virtual index)

CAMed cross-thesaurus searching: terms are linked in a temporary union list generated by the software in response to a query.

Sharing at Vocabulary Level

6. Linking through a thesaurus server protocol

UCSB Alexandria Digital Library: the Thesaurus Protocol is based on the ANSI/NISO Z39.19 (1993, R2003) thesaurus model and supports downloading, querying, and navigating thesauri.

Sharing at Subject Authority File Level

Direct mapping:

- MACS (Multilingual Access to Subjects)
- LCSH and MeSH mapping project sample authority records, Northwestern University Library

[Diagram: metadata records labeled S1 and S2, carrying terms from thesaurus 1 and thesaurus 2]

Co-occurrence mapping works at the application level, i.e., in metadata records: subject terms from different thesauri that are assigned to the same records end up loosely mapped to one another.
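Co-occurrence mapping can be sketched as counting how often a term from one thesaurus appears on the same metadata record as a term from another. This is an illustrative sketch, not the method of any project named in this talk: the function name `cooccurrence_map`, the `min_count` threshold, and the sample records are assumptions.

```python
from collections import Counter
from itertools import product


def cooccurrence_map(records, min_count=2):
    """Derive loose term mappings from doubly indexed metadata records.

    `records` is an iterable of (terms_from_thesaurus_1, terms_from_thesaurus_2)
    pairs, one pair per metadata record.
    """
    pair_counts = Counter()
    for terms_1, terms_2 in records:
        # Every cross-thesaurus pair on the same record counts once.
        pair_counts.update(product(set(terms_1), set(terms_2)))

    mapping = {}
    for (term_1, term_2), count in pair_counts.items():
        if count >= min_count:
            mapping.setdefault(term_1, []).append((term_2, count))

    # Strongest associations first; ties broken alphabetically.
    for candidates in mapping.values():
        candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return mapping


# Made-up records for illustration.
records = [
    (["Neoplasms"], ["Cancer", "Oncology"]),
    (["Neoplasms"], ["Cancer"]),
    (["Neoplasms", "Lung"], ["Cancer", "Lungs"]),
    (["Lung"], ["Lungs"]),
]

print(cooccurrence_map(records))
# {'Neoplasms': [('Cancer', 3)], 'Lung': [('Lungs', 2)]}
```

The resulting links are statistical rather than intellectual, which is why the mapping is described as loose: raising `min_count` trades coverage for reliability.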

So far,

Functional Requirements for Authority Records (FRAR) covers:

- Names for persons, families, corporate bodies (Group 2)
- Titles (Group 1)

Projects for authority data sharing focus mainly on names:

- ONE Shared Authority Control (ONESAC)
- Virtual International Authority File (VIAF)
- Linking and Exploring Authority Files (LEAF)
- Hong Kong Chinese Authority (Name) (HKCAN)

FRSAR:

Functional Requirements for Subject Authority Data

Scope: focus on FRBR’s Group 3 entities

FRSAR Working Group contacts:

- Marcia Zeng, mzeng@kent.edu
- Maja Zumer
- Athena Salaba, asalaba@kent.edu

FRSAR terms of reference

- build a conceptual model of Group 3 entities within the FRBR framework (Entities in Group 1 and Group 2 can be used as the subjects of works; but further inclusion of them will depend on the outcomes of the work of the FRANAR Working Group);
- provide a clearly defined, structured frame of reference for relating the data that are recorded in subject authority records to the needs of the users of those records; and
- assist in an assessment of the potential for international sharing and use of subject authority data both within the library sector and beyond.
