Final LD4P Meeting Proposal

Linked Data for Technical Services Production (LD4P)

Planning Meeting Proposal

Rationale

Philip Schreur

2/23/2015

Stanford University Libraries

Introduction

The library community’s initial transition to a linked-data infrastructure has focused primarily on the conversion of its masses of data in the MARC formats to a linked-data model. However, even as academic libraries across the United States strive to make this bibliographic data available as linked data, complications persist that critically impact their effective transition to this new infrastructure. To address these issues, Stanford University Libraries requests $41,133 in funding from the Andrew W. Mellon Foundation to convene a Linked Data for Technical Services Production (LD4P) group for two meetings. The seven libraries identified to participate (Stanford, Cornell, Columbia, Harvard, Princeton, the Library of Congress, and the University of Maryland) will be tasked with identifying the resources needed to bypass this flawed conversion process and develop a new technical services infrastructure that works natively with linked data, making conversion unnecessary.

Statement of need

Libraries worldwide rely upon Machine-Readable Cataloging (MARC)-based systems for communication of their bibliographic data. MARC, however, is an inarticulate communication format developed in the 1960s to represent the bibliographic data recorded on catalog cards. Connections between the various data elements within a single catalog record are not expressed, because it is assumed that a human being will examine the record as a whole and make the associations between the elements for themselves.

For example, many relationships, such as those between subjects and specific works, performers and performances, or geographic area and event, were never expressed in MARC. As libraries shift to a linked-data based architecture that is reliant upon machine linking of individual data elements, this former reliance on human interpretation at the record level to make correct associations between individual data elements becomes a critical flaw. These associations between data elements can only be added as an additional, manual process once MARC data has been converted to linked data. The amount of work needed to accomplish this final, critical step is immense and, as a result, it is rarely, if ever, completed.
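The gap described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the record fields, the `ex:` identifiers, and the predicate names are invented for illustration and are not drawn from MARC, BIBFRAME, or any LD4P design. The flat record lists performers and works side by side, so only a human reader can pair them, whereas the triples state each association explicitly and a program can follow them.

```python
# Hypothetical illustration only: field names, "ex:" identifiers, and
# predicates are invented and do not come from MARC or BIBFRAME.

# A flat, record-style description: performers and works are listed,
# but nothing states which performer performed which work.
flat_record = {
    "title": "Concert recording",
    "performers": ["Performer A", "Performer B"],
    "works": ["Work 1", "Work 2"],
}

# The same information as subject-predicate-object triples, where each
# association between data elements is stated explicitly.
triples = [
    ("ex:performance1", "ex:performanceOf", "ex:work1"),
    ("ex:performance1", "ex:performer", "ex:performerA"),
    ("ex:performance2", "ex:performanceOf", "ex:work2"),
    ("ex:performance2", "ex:performer", "ex:performerB"),
]


def performers_of(work, graph):
    """Return the performers linked to a work through its performances."""
    performances = {
        s for s, p, o in graph if p == "ex:performanceOf" and o == work
    }
    return sorted(
        o for s, p, o in graph if p == "ex:performer" and s in performances
    )


print(performers_of("ex:work1", triples))
```

No comparable function can be written against `flat_record` without guessing, which is the manual association step the paragraph above describes.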

Therefore, libraries have only made a partial step into their linked-data future and are left with masses of converted data in need of evaluation and upgrading. The backlog of MARC data that requires conversion is only increasing with each day of routine processing. Focused conversation around the transition to a natively-based linked-data architecture that would obviate the need for the creation of MARC metadata is needed to create a permanent solution to this flawed, unsupported cycle of MARC metadata creation, transformation, and upgrading.

Partner Organizations

Stanford, Cornell, Columbia, Harvard, Princeton, the University of Maryland, and the Library of Congress are well positioned to approach the challenge. The Library of Congress is spearheading the development of BIBFRAME, a linked-data based communication format designed to replace MARC, and will be experimenting with cataloging based in BIBFRAME this summer. Stanford has a long-standing interest in linked-data applications and is currently a partner with Cornell and Harvard in Linked Data for Libraries, a Mellon-funded grant focused on linked-data based discovery. Princeton has been an early BIBFRAME experimenter, and the University of Maryland brings with it a connection to OLE as a possible linked-data friendly open source ILS.

Methodology & Outcomes

The endless cycle of MARC record creation, conversion, and upgrading needs to be stopped. LD4P plans to accomplish this by beginning the transformation of basic technical services metadata production to a model natively based in linked data. Technical services departments do not work in a closed system, however. They have numerous connections with various vendors (authority, shelf-ready, cataloging), the ILS, and OCLC, not to mention interdependencies among themselves.

The seven participating libraries initiated conversations on this topic at ALA Midwinter in Chicago. The proposed planning meetings for LD4P would build upon that conversation, including potential domains and workflows to address (copy cataloging, complex copy cataloging, original cataloging, vendor cataloging, maps, art works, original scripts) and initial vendors to contact (Casalini, Harrassowitz, MarcNOW, Backstage Library Works, OCLC, Zepheira).

The group also identified initial areas for development, including a cloud space in which the members could work as a community, a triple store for all of our work on which developers could create tools, connections from the cloud to local environments, and specific tool development.

Stanford Libraries expects the goals for the first meeting to be:

Confirmation of institutional commitments to LD4P

Definition of goals for the project, both individual and inter-institutional, and articulation of the assistance that may be needed to meet those goals

Identification of critical outside partners

Identification and drafting of the elements needed for a proposal to Mellon to support LD4P

The second meeting will focus on the following outcomes:

Refinement of the Mellon proposal based on feedback from Mellon on the first draft

Discussion of responses from critical outside partners identified in the first meeting

Inter-institutional architecture needed to interconnect the individual projects

Project Budget

Please refer to the Mellon Budget Form included as Attachment 1 and the Budget Narrative at the close of this document.

Project Summary

The planning grant for LD4P will be used to support two face-to-face meetings of the identified partner institutions in order to define and draft a proposal to Mellon for the support of the project itself. The project will create and promote a new infrastructure based on linked data for traditional technical services processing, to bypass a library’s dependency on the MARC formats for descriptive metadata creation. I would be happy to address any further questions and concerns based on the proposal as it is described here.

Agenda for Meeting 1

Day 1

Introductions

Specification of the goals of LD4P

Define metrics for assessment of LD4P

Individual institutional interests

Local Institutional Infrastructure needs to support LD4P

Identification of critical partners (e.g., Vendors) for Phase 1

Intersection of LD4P and the Library of Congress’s LD plans

Day 2

Overall architecture

Functional needs of the shared cloud space

Identification of tools in need of development/upgrading

Exploration of possible link to OLE

Sustainability and expansion of LD4P

Review of Mellon proposal guidelines

Drafting of key points for the proposal (to meet the mid-May deadline)

Budget Narrative for Meeting 1

The budget is planned to cover transportation and room/board costs for the 15 members not from the Library of Congress (3 members from Columbia, Cornell, Harvard, Princeton and Stanford, and one from the University of Maryland to represent OLE). The group chose three members per institution in order to represent someone in authority from technical services who could make commitment decisions, one metadata expert, and one technical expert. The Library of Congress will be providing meeting space and is not asking for funds to cover food for its staff. As the proposed first meeting needs to happen quickly to give the partners time to complete their proposal to Mellon by the mid-May deadline, only a limited number of hotels were available with an appropriate block of rooms. Food is calculated at government per diem rates, with the addition of coffee and snacks to be supplied by the Library of Congress for the two meeting days.

Transportation

Columbia

Amtrak: $170 x 3 people = $510

Cornell

Airfare: $420 x 3 people = $1,260

Harvard

Airfare: $300 x 2 people = $600

Princeton

Amtrak: $140 x 3 people = $420

Stanford

Airfare: $400 x 3 people = $1,200

Shuttle: $50 x 3 people = $150

University of Maryland

Metro fare: $13.60

TOTAL TRANSPORTATION = $4,154

Board

Capitol Hill Suites ($350 a night tax inclusive for 3 nights)

14 members @ $1,050 = $14,700

TOTAL BOARD = $14,700

Food

Government per diem ($71 a day) for 3 days

15 members @ $213 = $3,195

Coffee and break supplies for 2 days (estimate from the Library of Congress)

$1,500

TOTAL FOOD = $4,695

GRAND TOTAL = $23,549
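The Meeting 1 arithmetic can be checked with a short sketch. The dollar figures are taken from the line items above; the variable names and the choice to round transportation to the nearest whole dollar are assumptions made for illustration.

```python
# Meeting 1 budget check. Amounts come from the line items above;
# names and the rounding rule are assumptions for illustration.

# Transportation in cents, to avoid floating-point error on the $13.60 fare.
transport_cents = {
    "Columbia (Amtrak)": 170_00 * 3,
    "Cornell (airfare)": 420_00 * 3,
    "Harvard (airfare)": 300_00 * 2,
    "Princeton (Amtrak)": 140_00 * 3,
    "Stanford (airfare)": 400_00 * 3,
    "Stanford (shuttle)": 50_00 * 3,
    "University of Maryland (Metro)": 13_60,
}


def to_whole_dollars(cents):
    """Round an amount in cents to the nearest whole dollar."""
    return (cents + 50) // 100


transportation = to_whole_dollars(sum(transport_cents.values()))
board = 14 * 350 * 3          # 14 members, $350 a night, 3 nights
food = 15 * 71 * 3 + 1500     # 15 members at per diem, plus coffee and breaks
grand_total = transportation + board + food

print(transportation, board, food, grand_total)
```

Under these assumptions the computed figures agree with the stated totals of $4,154, $14,700, $4,695, and $23,549.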

Agenda for Meeting 2

Day 1

Review of Mellon’s response to the first draft proposal

Revision of LD4P’s goals and outcomes

Gaps and solutions to local Institutional Infrastructure needs to support LD4P

Response of critical partners (e.g., Vendors) for Phase 1 and modification of LD4P to take advantage of their assistance

Intersection of overall architecture and local need

Day 2

Tool specification

Identification, assessment, and selection of resources/vendors to supply needed infrastructure and tools

Revision of Mellon proposal based on Mellon’s feedback and meeting outcomes

Budget Narrative for Meeting 2

The budget is planned to cover transportation and room/board costs for the 15 members not from the Library of Congress (3 members from Columbia, Cornell, Harvard, Princeton and Stanford, and one from the University of Maryland to represent OLE). The group chose three members per institution in order to represent someone in authority from technical services who could make commitment decisions, one metadata expert, and one technical expert. We will limit the meeting to a day and a half so members can leave the evening of the second day. The Library of Congress will be providing meeting space and is not asking for funds to cover food for its staff. As we have time to plan for the second meeting, we will look for a hotel at the government per diem. Food is calculated at government per diem rates, with the addition of coffee and snacks to be supplied by the Library of Congress for the two meeting days.

Transportation

Columbia

Amtrak: $170 x 3 people = $510

Cornell

Airfare: $420 x 3 people = $1,260

Harvard

Airfare: $300 x 2 people = $600

Princeton

Amtrak: $140 x 3 people = $420

Stanford

Airfare: $400 x 3 people = $1,200

Shuttle: $50 x 3 people = $150

University of Maryland

Metro fare: $13.60

TOTAL TRANSPORTATION = $4,154

Board

Capitol Hill Suites ($350 a night tax inclusive for 2 nights)

14 members @ $700 = $9,800

TOTAL BOARD = $9,800

Food

Government per diem ($71 a day) for 2 days

15 members @ $142 = $2,130

Coffee and break supplies for 2 days (estimate from the Library of Congress)

$1,500

TOTAL FOOD = $3,630

GRAND TOTAL = $17,584
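A similar sketch checks the Meeting 2 figures and the combined request. It assumes two hotel nights at $350, which is the figure implied by the $700 per-member amount, and reuses the stated transportation total and the stated Meeting 1 grand total rather than recomputing them. The variable names are illustrative.

```python
# Meeting 2 budget check and combined request. Amounts come from the
# budget narratives; names and the two-night assumption are illustrative.

transportation = 4154         # stated total, same line items as Meeting 1
nights = 2                    # assumption implied by $700 per member at $350 a night
board = 14 * 350 * nights     # 14 members needing a hotel room
food = 15 * 71 * 2 + 1500     # 15 members at per diem for 2 days, plus breaks
meeting_2_total = transportation + board + food

meeting_1_total = 23_549      # stated Meeting 1 grand total
requested = 41_133            # amount requested in the Introduction

print(board, food, meeting_2_total, meeting_1_total + meeting_2_total)
```

Under these assumptions the two meeting totals sum to the $41,133 requested from the Mellon Foundation.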

Participants

Stanford University

Joshua Greben (Systems Programmer and Analyst)

Nancy Lorimer (Head, Metadata Department)

Philip Schreur (Assistant University Librarian for Technical and Access Services)

Columbia

Kate Harcourt (Director, Original and Special Materials Cataloging)

Melanie Wacker (Metadata Coordinator, Original and Special Materials Cataloging)

Bob Wolven (AUL for Bibliographic Services and Collection Development)

Cornell

Chiat Naun Chew (Director, Cataloging & Metadata Services)

Jason Kovari (Head of Metadata Services and Web Archivist)

Jim LeBlanc (Director, Library Technical Services)

Harvard

Marc McGee (Geospatial Metadata Librarian)

Scott Wicks (Associate Librarian for Information and Technical Services)

Princeton

Jennifer Baxmeyer (Leader, Electronic Resources Team)

Joyce Bell (Cataloging and Metadata Services Director)

Tim Thompson (Metadata Librarian)

Library of Congress

Sally McCallum (Chief, Network Development and MARC Standards Office)

Nate Trail (Digital Project Coordinator)

Beacher Wiggins (Director for Acquisitions & Bibliographic Access)

University of Maryland

Carlen Ruschoff (Director of Technical Services)
