NCSU - OCLC

OCLC Cluster Service
Leiden March 28 2007
Discussion Session With KB & UVA
Janifer Gatenby, Strategic Research
Agenda
• Welcome and Introductions
• Presentation
– Clustering
– Audience Level
– Copyright / Rareness
– FAST subject headings
• Discussion
• Lunch
Some slides from NCSU’s Endeca Test Catalog using OCLC work identifiers for Clustering
Some slides from PiCarta (Netherlands) Test Catalog using OCLC work identifiers for Clustering
Without clustering
With Clustering
Consolidation of Holdings
The above example shows two holdings, one per bibliographic record.
Consolidating holdings permits reservations (holds) and requests at work level.
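As a rough illustration of what work-level consolidation makes possible, the sketch below pools the holdings of all bibliographic records that share a work identifier, so that a hold can be satisfied by any copy of the work. The record layout, the identifiers, and the function names are all hypothetical and are not taken from the OCLC service.

```python
from collections import defaultdict

# Hypothetical data: each bibliographic record carries a work ID and its holdings.
bib_records = [
    {"bib_id": "bib-1", "work_id": "work-A", "holdings": ["Library X"]},
    {"bib_id": "bib-2", "work_id": "work-A", "holdings": ["Library Y"]},
    {"bib_id": "bib-3", "work_id": "work-B", "holdings": ["Library X"]},
]


def consolidate_holdings(records):
    """Map each work ID to the (bib_id, holding) pairs of all its bib records."""
    by_work = defaultdict(list)
    for rec in records:
        for holding in rec["holdings"]:
            by_work[rec["work_id"]].append((rec["bib_id"], holding))
    return dict(by_work)


def place_hold(work_id, consolidated, unavailable=frozenset()):
    """Return the first available copy of the work, whichever bib record it hangs on."""
    for copy in consolidated.get(work_id, []):
        if copy not in unavailable:
            return copy
    return None


consolidated = consolidate_holdings(bib_records)
```

With these sample records, a hold on "work-A" can fall through to the copy on the second bibliographic record when the first copy is unavailable, which a hold placed on a single bibliographic record could not do.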
Dutch
• 6.7 million work identifiers / 7.7 million bib records
• Collapse rate of 13%
– Av. 1.15 bibliographic records per work record
• Software adaptation less than 1 week
NCSU
• 1.64 million work identifiers / 1.7 million bib records
• Collapse rate of 3%
– Av. 1.03 bibliographic records per work record
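The figures above can be approximated with a short calculation. The slide does not define "collapse rate"; the sketch assumes it is the share of bibliographic records absorbed into an existing work, (bib records - work identifiers) / bib records, which reproduces the Dutch figures. The function name is hypothetical.

```python
def collapse_stats(bib_records, work_ids):
    """Return (collapse rate, average bib records per work) under the assumed definition."""
    collapse_rate = (bib_records - work_ids) / bib_records
    avg_bibs_per_work = bib_records / work_ids
    return collapse_rate, avg_bibs_per_work


# Inputs are the rounded counts quoted on the slide.
dutch_rate, dutch_avg = collapse_stats(7.7e6, 6.7e6)
ncsu_rate, ncsu_avg = collapse_stats(1.7e6, 1.64e6)
```

Because the slide quotes rounded counts, the NCSU result lands near 3.5% and 1.04 rather than exactly the 3% and 1.03 shown; the Dutch result rounds to 13% and 1.15.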
Method
OCLC #     OCLC Work ID   Title
65647794   20842726       Goldene vliess
27921612   30369321       Goldene vliess
5773235    19885466       Goldene vliess
36638149   12019603       Goldene vliess
36638149   12019603       Goldene vliess
36638149   12019603       Goldene vliess
Method
PPN         OCLC #     OCLC Work ID   Title            Comments
80637760    65647794   20842726       Goldene vliess   not in main group
124594883   27921612   30369321       Goldene vliess   not in main group
36330531    5773235    19885466       Goldene vliess   not in main group
80626203    36638149   12019603       Goldene vliess   in main group
18113649x   36638149   12019603       Goldene vliess   in main group
80540333    36638149   12019603       Goldene vliess   in main group
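The grouping shown in the table can be sketched as follows, using the PPN, OCLC number, and work ID values from the slide. Treating the largest cluster as the "main group" is an assumption made for illustration; the slide does not say how the main group is chosen, and the function name is hypothetical.

```python
from collections import defaultdict

# (PPN, OCLC number, OCLC work ID) rows as shown on the slide.
rows = [
    ("80637760", "65647794", "20842726"),
    ("124594883", "27921612", "30369321"),
    ("36330531", "5773235", "19885466"),
    ("80626203", "36638149", "12019603"),
    ("18113649x", "36638149", "12019603"),
    ("80540333", "36638149", "12019603"),
]


def cluster_by_work_id(rows):
    """Group local record numbers (PPNs) under their OCLC work ID."""
    clusters = defaultdict(list)
    for ppn, _oclc_number, work_id in rows:
        clusters[work_id].append(ppn)
    return dict(clusters)


clusters = cluster_by_work_id(rows)

# Assumption: the "main group" is simply the cluster with the most records.
main_work_id = max(clusters, key=lambda work_id: len(clusters[work_id]))
not_in_main_group = [
    ppn
    for work_id, ppns in clusters.items()
    if work_id != main_work_id
    for ppn in ppns
]
```

The three records left outside the main group share a title with it but carry different work IDs, which is the mismatch the following slides set out to fix.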
Fixing Mismatches
• Alternatives
– Fix data at source
– Apply name / title authority records
– Enhance algorithm
• Eliminate foreign articles
• Convert “fünf”, “vijf”, “cinq” to “5” etc.
• At OCLC
– Quality control
– Office of Research
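The two algorithm enhancements listed above (dropping leading articles and converting number words to digits) could look roughly like this. It is a minimal sketch under stated assumptions: the article and number-word tables are tiny illustrative samples, the function name is hypothetical, and nothing here reflects the actual OCLC matching algorithm.

```python
import re
import unicodedata

# Illustrative samples only; a real table would be per-language and far larger.
LEADING_ARTICLES = {"the", "a", "an", "das", "der", "die", "de", "het", "een", "le", "la", "les"}
NUMBER_WORDS = {"five": "5", "fünf": "5", "vijf": "5", "cinq": "5"}


def normalize_title(title: str) -> str:
    """Build a comparison key: lowercase, drop a leading article, digitize number words."""
    # NFC keeps accented letters as single characters so they match the tables.
    text = unicodedata.normalize("NFC", title).lower()
    words = re.findall(r"\w+", text)
    if len(words) > 1 and words[0] in LEADING_ARTICLES:
        words = words[1:]
    return " ".join(NUMBER_WORDS.get(word, word) for word in words)
```

Because the article list is applied without knowing the language of the title, an English title beginning with "Die" would lose its first word; a production version would need the record's language code to choose the right list.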
Authorities Ensure Matching
• Foreign union catalogue data
– Non AACR2, not native MARC21, other language of cataloguing, non standard uniform titles
– Requesting 1,000 name / title authority records per union catalogue
A bib record for a translation without a uniform title will match if there is a comprehensive author / title authority record
Bib
100 …Rowling, J.K.
245 …La chambre secrète
…………….
Authority
Rowling, J.K.
The secret chamber
De geheime kamer
La chambre secrète
Die geheime kammer
……………
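The matching idea in this example can be sketched as a lookup: every title variant in the authority record is indexed under the author, so a translation without a uniform title still resolves to the same authority. The author and titles are the ones on the slide; the dictionary layout, the "auth-1" identifier, and the function names are hypothetical, and MARC indicators and subfields are ignored.

```python
def key(text):
    """Case-insensitive, whitespace-normalized comparison key."""
    return " ".join(text.lower().split())


# Hypothetical authority store; "auth-1" is an invented identifier.
authorities = {
    "auth-1": {
        "author": "Rowling, J.K.",
        "titles": [
            "The secret chamber",
            "De geheime kamer",
            "La chambre secrète",
            "Die geheime kammer",
        ],
    },
}


def build_index(authorities):
    """Index every (author, title variant) pair to its authority record ID."""
    index = {}
    for auth_id, auth in authorities.items():
        for title in auth["titles"]:
            index[(key(auth["author"]), key(title))] = auth_id
    return index


def match_authority(bib, index):
    """Look up a bib record by its 100 (author) and 245 (title) values."""
    return index.get((key(bib["100"]), key(bib["245"])))


index = build_index(authorities)
french_bib = {"100": "Rowling, J.K.", "245": "La chambre secrète"}
dutch_bib = {"100": "Rowling, J.K.", "245": "De geheime kamer"}
```

Both sample bib records resolve to the same authority record even though neither carries a uniform title, which is the behaviour the slide describes.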
FRBR – Divide and conquer
• Creation of works (38 million)
• Algorithm
• Authority records
• Cleaning bibliographic records where necessary
• No manual links created
• Improved user interfaces

• Harvesting
• Loading IDs & records
• Authority records
• Improved user interfaces
• Suggestions for the improvement of the algorithm and records
ALA Midwinter Meeting
• Representatives of 19 libraries with substantial holdings in WorldCat
• Clear Requirements
– XML cluster record service
– Minimum of daily update
Discussion
Phase 2
• Phase 1 – table
• Phase 2 – work record with enriched data
– Audience level
– Rareness
– Copyright
– FAST headings for faceted search
Audience Level and Rareness
OpenURL
Request Transfer Message
Faceted Search
FAST headings
• Fully formed concepts
• Suitable for faceted search
– LCSH “sentences” – breaking into concepts is tricky
http://www.oclc.org/research/projects/fast/
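To show why fully formed, typed headings suit faceted search, the sketch below counts headings per facet and filters records by one facet value. The records, facet names, and headings are invented for illustration and are not real FAST data; the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records, each with (facet type, heading) pairs.
records = [
    {"id": "r1", "headings": [("topical", "Wizards"), ("geographic", "England"), ("form", "Fiction")]},
    {"id": "r2", "headings": [("topical", "Wizards"), ("form", "Fiction")]},
    {"id": "r3", "headings": [("topical", "Cataloging"), ("geographic", "Netherlands")]},
]


def facet_counts(records):
    """Count, per facet type, how many records carry each heading."""
    counts = defaultdict(Counter)
    for rec in records:
        # set() avoids counting a heading twice if a record repeats it.
        for facet, heading in set(rec["headings"]):
            counts[facet][heading] += 1
    return counts


def filter_by(records, facet, heading):
    """Keep only the records that carry the given heading in the given facet."""
    return [rec for rec in records if (facet, heading) in rec["headings"]]


counts = facet_counts(records)
fiction = filter_by(records, "form", "Fiction")
```

Each heading stands alone as a facet value, so no string splitting is needed before counting or filtering.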
Discussion
Cluster
[Diagram of the cluster record. Labels: Cluster Identifier; Instance/s; Description; Related Works; Type; Identifier/s + type; WC Cluster Identifier; Value; Copyright estimate; Instance/s; Holdings count (rarity); Relationship (sequel etc.); OCLC Number]
Cluster
[Diagram of the cluster record. Labels: WC Cluster Identifier; Instances; Author; Description; Title; Related works; About; Audience; WC identity ID; Language; Heading + type; Display version; Alternative Title/s; Classification + type; Holdings (rarity); Language; Type]
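One way to picture the cluster record is as a small nested data structure. The sketch below borrows field names from the diagram labels, but the nesting, the types, and the idea of summing holdings across instances are assumptions; the slides do not specify the record schema. The identifiers come from the earlier Method table, and the holdings counts are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Instance:
    """One bibliographic record (manifestation) belonging to the cluster."""
    oclc_number: str
    language: Optional[str] = None
    holdings_count: int = 0


@dataclass
class RelatedWork:
    wc_cluster_id: str
    relationship: str  # e.g. "sequel"


@dataclass
class ClusterRecord:
    wc_cluster_id: str
    title: str
    author: Optional[str] = None
    alternative_titles: List[str] = field(default_factory=list)
    audience: Optional[str] = None
    copyright_estimate: Optional[str] = None
    instances: List[Instance] = field(default_factory=list)
    related_works: List[RelatedWork] = field(default_factory=list)

    def total_holdings(self) -> int:
        """Sum holdings over all instances; a low total suggests a rare work."""
        return sum(instance.holdings_count for instance in self.instances)


# Identifiers from the Method table; holdings counts are made up.
cluster = ClusterRecord(
    wc_cluster_id="12019603",
    title="Goldene vliess",
    instances=[
        Instance(oclc_number="36638149", holdings_count=3),
        Instance(oclc_number="5773235", holdings_count=2),
    ],
)
```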
Deployment
• CBS 3.2 ++ incorporating cluster record in test due Easter
• Installation in LBS
• OCLC Distribution service – development to start in April
• PSI modifications to use cluster record
• Looking for testing partners