Reconciling OCLC and Orbis: Managing a Bibliographic and Holdings Synchronization Between Yale University Library and WorldCat
Melissa A. Wisner

Purpose of Presentation
- Describing the process: what is involved? Staffing required, timeframe, programming required
- Are we done yet? No!

Why do you want to come to this talk?
- For a collection of any size, a reconciliation is a detail-oriented project: planning, pre-processing, OCLC processing, dealing with the returned data, maintaining the data
- Why do this? Living with your own standards, good or bad
- What is your database of record?

YUL Background
- Voyager ILS since 2002
- Approximately 8.5 million bibliographic records
- Member of (former) RLIN
- OCLC participant: adds PCC records, creates IR records in WorldCat, weekly holdings update, some cataloging directly in Connexion, ILL lender
- In the early 2000s YUL did retrospective conversion with OCLC

Standard Workflow between Voyager and OCLC
- Weekly export to OCLC (staff flag records to send as needed)
- Sporadic OCLC batch matches over the years
- Local program identifies "candidate" records by encoding level and "UNCAT" status; these are sent to OCLC as a separate project, and any returned records with encoding level 1, 4, or 7 are filtered, reloaded, and overlay the originals
- LC Match run once a month: a similar process run against a local copy of the LCDB

Arcadia Grant
- Cultural Knowledge grant, March 2009 to March 2013
- $5 million total, $1 million per year
- Cambodian newspapers, Khmer Rouge genocide documentation, African-language materials, and more
- Layoffs and re-staffing

What Records to Send or Exclude?
- Divided up by locations for staff review
- Uncovered data problems we knew about and didn't know about, e.g. locations with no holdings in them, and locations that still had holdings we thought had been migrated to new locations
- Most significant: outdated MARC tags, outdated format codes, practice different from OCLC's, dual-script records

What Records to Send or Exclude? (cont.)
- Sending approximately 6.7 million of 8.5 million bibliographic records as UTF-8
- Excluding: MARCIVE e-resource records; suppressed bibs; unsuppressed bibs with suppressed holdings records; In Process/On Order records; UNCAT records**

Tracking our Records
- MySQL database created: bib IDs, exclude flag, project ID (local tracking), OCLC project IDs, reload dates

Tracking our Records (cont.)
- Used this to QA the results of the queries run to identify all potential records
- Used this to push out files of bib IDs by OCLC project ID, used later to extract the correct records to send to OCLC
- Tracking was/is a big part of the reconciliation effort!

Tracking our Records (cont.)
- As records are prepared for loading back into Voyager, this MySQL database will be updated with those date(s)
- OCLC will produce crossref reports and other processing reports for each file, but these are not concatenated into any form of relational database

Building an 079 Index in Voyager
- Ex Libris contracted to generate and update the Voyager indexes
- Created in both Production and Test environments; took less than a day each time; downtime required; $ for the service
- Added 079|a and 079|z left-anchored indexes

Building an 079 Index in Voyager (cont.)
- Updated the SYSN composite index to include the new 079 indexes: 019|a, 035|a|z, 079|a|z
- Indexes were mostly to assist staff in searching, but also for bulk import profiles for ongoing loads
- Exploring how to use the new indexes in ongoing EOD or e-resource loads from vendors

OCLC Pre-Processing
- OCLC IBM mainframe limitations: sending records within a 100 MB/90,000-record limit per file, and only 15 files per day
- Separating records with 880s from those without
- Additionally, OCLC is splitting out PCC records from the YUS files

OCLC Pre-Processing (cont.)
- Each set of files sent as a "project" with a unique ID
- Creating label files, tracking via spreadsheet
- Suspended weekly exports to OCLC (9/5/2010-12/20/2010**)

OCLC Pre-Processing (cont.)
- Deleting YUL IR records in WorldCat. Why? Easier matching?
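The tracking database described in the "Tracking our Records" slides is a small relational table driving the whole project. A minimal sketch of what such a table might look like, using SQLite purely for a self-contained example (the actual project used MySQL, and every table and column name below is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # illustration only; the real project used MySQL
conn.execute("""
    CREATE TABLE recon_tracking (
        bib_id        INTEGER PRIMARY KEY,  -- Voyager bibliographic record ID
        exclude_flag  INTEGER DEFAULT 0,    -- 1 = excluded from the send
        local_project TEXT,                 -- local tracking project ID
        oclc_project  TEXT,                 -- OCLC project ID of the sent file
        reload_date   TEXT                  -- date the record was reloaded into Voyager
    )
""")
conn.executemany(
    "INSERT INTO recon_tracking (bib_id, exclude_flag, oclc_project) VALUES (?, ?, ?)",
    [(1001, 0, "YUS-01"), (1002, 1, None), (1003, 0, "YUS-01")],
)
# Push out a file of bib IDs for one OCLC project, to drive the record extract
ids = [row[0] for row in conn.execute(
    "SELECT bib_id FROM recon_tracking WHERE oclc_project = ? AND exclude_flag = 0",
    ("YUS-01",),
)]
print(ids)  # -> [1001, 1003]
```

A table like this supports both uses the slides mention: QA of the candidate-record queries, and emitting per-project bib ID files for extraction.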
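The mainframe limits in the "OCLC Pre-Processing" slides (90,000 records and 100 MB per file, at most 15 files per day) translate directly into a file-planning calculation. A hypothetical sketch, ignoring the separate 880/non-880 and PCC streams:

```python
# OCLC IBM mainframe limits, as stated in the talk
MAX_RECORDS_PER_FILE = 90_000
MAX_FILES_PER_DAY = 15

def plan_files(total_records: int) -> tuple:
    """Return (number of files, minimum number of send days)."""
    files = -(-total_records // MAX_RECORDS_PER_FILE)  # ceiling division
    days = -(-files // MAX_FILES_PER_DAY)
    return files, days

files, days = plan_files(6_700_000)
print(files, days)  # 75 files; at least 5 send days
```

In practice the 100 MB cap, the 880 split, and daily scheduling push the real file count and calendar time higher than this lower bound.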
- 5.7 million removed in total
- EBScan software process
- Match routines set, for example: match on this field and that or ….

Cross Ref Reports and Stats
- Sample
- Adding in prefixes of ocm and ocn
- Other statistical reports

Loading OCLC Numbers back into Orbis
Basic process:
- Retrieve the crossref report to be used as input
- Script to de-dupe the crossref reports by name*
- Extract the MARC record using the Voyager API BIB_RETRIEVE, with the crossref report as input
- MARC4Java (open source) to parse and update the MARC record*
- Remove any version of *OCoLC*, *ocm*, or *ocn* in 035|a
- Insert the new IR number from the crossref report with a prefix of (OCoLC)

Loading OCLC Numbers back into Orbis (cont.)
Basic process:
- Compare 079|a to the crossref report: if the same, move on; if new, just add it and move on; if different, update with the new one and report out the old one
- Remove any 079|z and report it out
- Prepare a new file of MARC records for bulk import
- Report out a log summary of the process, errors encountered, and discrepancies in 079|a
- See handouts!

Loading OCLC Numbers back into Orbis (cont.)
- This will also be our new permanent workflow post-reconciliation: maintenance of these control numbers!
- Cornell, Columbia, and Stanford all used similar processes
- Original hope was to load 250,000 records per day, 4 days a week = an estimated 6 weeks to reload everything back into Orbis

Loading OCLC Numbers back into Orbis (cont.)
- It all depends on timing: OCLC processes 80K records every 1-2 days, so for 6.7 million bibs that is 1.2 to 2.4 million per month, or roughly 3 to 6 months total to process our data!
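The 035/079 handling in the "Basic Process" slides above can be sketched as follows. This is a simplified stand-in, not the MARC4Java program the talk describes: records here are plain dicts rather than real MARC objects, and the sample control numbers are invented.

```python
import re

# Matches existing OCLC-style numbers in 035|a: (OCoLC), ocm, or ocn forms
OCLC_PAT = re.compile(r"\(OCoLC\)|^ocm|^ocn", re.IGNORECASE)

def apply_crossref(record, oclc_num):
    """Update 035/079 control numbers in place; return report lines."""
    report = []
    # Drop every 035|a that already carries an OCLC-style number...
    record["035"] = [f for f in record.get("035", [])
                     if not OCLC_PAT.search(f.get("a", ""))]
    # ...then insert the new IR number with an (OCoLC) prefix.
    record["035"].append({"a": "(OCoLC)" + oclc_num})
    # 079|a: if the same, move on; if missing, add; if different, update and report.
    fields_079 = record.setdefault("079", [])
    if not fields_079:
        fields_079.append({})
    f079 = fields_079[0]
    old = f079.get("a")
    if old is None:
        f079["a"] = oclc_num
    elif old != oclc_num:
        report.append("079|a changed: %s -> %s" % (old, oclc_num))
        f079["a"] = oclc_num
    # Remove any 079|z and report it out.
    if "z" in f079:
        report.append("079|z removed: %s" % f079.pop("z"))
    return report

# Hypothetical record: one old-style ocm number, one non-OCLC 035, a stale 079
rec = {"035": [{"a": "ocm00012345"}, {"a": "(CStRLIN)CTYG90-B1"}],
       "079": [{"a": "12345", "z": "99999"}]}
log = apply_crossref(rec, "444555666")
print(rec["035"])  # -> [{'a': '(CStRLIN)CTYG90-B1'}, {'a': '(OCoLC)444555666'}]
print(log)         # two report lines: 079|a change and 079|z removal
```

The reporting-out of old 079|a values and removed 079|z subfields is what feeds the log summary and discrepancy reports mentioned above.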
- We can keep pace with loading the updated MARC data, but waiting 6 months is a big deal
- Need to keep 1 day a week free for all other load activity in Orbis

Loading OCLC Numbers back into Orbis (cont.)
- Run a keyword regen once a week, even though the keyword index is not being updated by the no-key loads
- The program to extract and update the MARC records can process 80K records in 15 minutes
- A bulk import run no-key takes 2 hours to load 80K records
- Minimize the loss of any staff changes

Handling Errors
- Reports from OCLC of no-match records (validation errors)
- Correcting anything in OCLC?
- Correcting records in Voyager, then resubmitting post-reclamation?
- See handouts!

Processing a "Gap" File
- Suspended weekly exports to OCLC on 9/5/2010
- Extracted a version of the bib records between 9/8/10 and 9/10/2010
- Identify and extract all changed and new records from 9/8/10 onward that have an 079|a and whose last operator in History is not OCLCRECON
- Send to OCLC as another one-off project

What Staff Will Do During Reconciliation
- No processing of holdings in OCLC
- ILL is OK
- Will not create IR records, so as not to affect matching
- Work in Orbis as normal otherwise

Modifications Needed to Resume Weekly Exports to OCLC
- Two file streams needed: one for archival materials and one for everything else
- PCC records will be split off once at OCLC
- YUM records will be split off once at OCLC
- New process/program created

Lessons Learned So Far
- Consistent application of standards across cataloging units (Suppressed, Suppressed!!!, In Process records, etc.)
- What is your database of record?
- How much time to spend on fixing records so they can be sent?
- Maintenance of the control numbers long term

Questions?
Thank you! melissa.wisner@yale.edu