How ProQuest Handles Original Data Provided by the Publishers, and Presents It in the Full-Text Aggregator’s Database
Steven A. Knowlton, MLIS
ALCTS CRS Committee on Holdings Information
Holdings Update Forum, 7/11/09

Introduction
• Where or from whom does ProQuest obtain coverage data for each title in our databases?
• In what format does ProQuest receive raw holdings data?
• What are the data elements in these source data? (For instance, are enumeration and chronology both present in the title lists we receive from the source, or does ProQuest supply the data? If ProQuest supplies the data, how much research is involved, and how difficult is it to provide holdings data for content coverage?)
• How does ProQuest pass these coverage (holdings) data to Electronic Resources Access and Management Services like Serials Solutions, and in what format during data transfer?
• Is ProQuest implementing ONIX SOH standards in its coverage data for ejournals? What are the issues in implementation, if any?

About ProQuest
Based in Ann Arbor
Since 1938
ProQuest is an information partner, creating indispensable research solutions that connect people and information. Through innovative, user-centered technology, ProQuest offers a depth and breadth of global content that includes historical newspapers, dissertations, and uniquely relevant resources for researchers of any age and sophistication, including content not likely to be digitized by others. Inspired by its customers and end users, ProQuest is working toward a future that blends information accessibility with community to further enhance learning and encourage lifelong enrichment.
PRODUCTS: Microforms, Dissertations, Databases

ProQuest and Serials Holdings
Examples of ProQuest databases with serials content:
• ProQuest Science Journals
• ABI/Inform
• ProQuest Central

Accessing Serials Holdings in ProQuest
Within a database:

Accessing Serials Holdings in ProQuest
At the ProQuest website:

Accessing Serials Holdings in ProQuest
“Local Administrator” module:
- Holdings information customized for each subscriber
- Provides the holdings data available within each and all of the ProQuest databases to which a customer subscribes

How ProQuest gathers holdings data
Content acquisition: ProQuest acquires content from publishers through various techniques
1. Hard copies: chronology & enumeration captured by hand and embedded in the file
2. Born-digital content: chronology & enumeration embedded in the file by the content provider

How ProQuest gathers holdings data
Manufacturing: preparing the content for access in the ProQuest platform
- ProQuest has filters to extract holdings data from the electronic content
- ProQuest populates only the holdings data supplied by providers
- Holdings data, along with other information, are populated into the database fields

Example of raw data files ProQuest receives from content providers
XML:
<XMD-entity REPOSITORY_FORMAT="1.0" PRXML_VER="2.2" ENTITY_TYPE="Article" PAGE_NO="1" ID="Ar00100"
    BOX="362 296 1177 764" LANGUAGE="English" CONTINUATION_TO="Ar00803"
    SNP="Ar00100S.png" SNP_WIDTH="350" SNP_HEIGHT="118">
<Meta NAME="Mother, son killed in one-car accident" DESCRIPTION="" SUBTYPE=""
    BASE_HREF="ISJ/2009/05/27" SOURCE_TYPE="PDF" PUBLICATION="ISJ" SECTION="Front Page"
    ISSUE_DATE="27/05/2009" WORDCNT="180" RELEASE_NO="ISJ20090527_0_0_0_Resized"
    PAGE_ID="ISJ20090527A01.PDF" PAGE_TYPE="Single" PAGE_WIDTH="399" PAGE_HEIGHT="824"
    DEFAULT_IMG_EXT="png" PDF_DESTINATION_MAPPED="OLV0_Entity_0001_0001"

Example of raw data files ProQuest
receives from content providers
SGML:
<!DOCTYPE ARTICLE SYSTEM "MCB.DTD">
<ARTICLE AID="2680070401.sgm" PDFID="2680070401.pdf" DOI="10.1108/14720700710820443" COPYRIGHT="M" ARTTY1="Research paper">
<FM>
<PUBFM>
<JTI>Vexillology Today</JTI>
<VOL>7</VOL><ISS>4</ISS><PBD>2007</PBD><PPF>355</PPF><PPL>369</PPL><ISSN>1472&hyphen;0701</ISSN>
<NAME>Gilbert Smith, Frank Jones, and Peter McCartney</NAME>
</PUBFM>
<ATL>Integrating corporate responsibility principles and stakeholder approaches into mainstream strategy&colon; a stakeholder&hyphen;oriented and integrative strategic management framework</ATL>
<AUG>

Example of raw data files ProQuest receives from content providers
Tagged ASCII:
020209990629VUTVS01 00000136B32 no09 0001 090627 N S 0906270129 00007681{IT}N{SOURCETAG}0906270129{ACCESSION}000000{PUBLICATION}THE NEWS AND OBSERVER {DATE}090627{TDATE}Saturday, June 27, 2009{EDITION}FINAL{SECTION}NEWS{PAGE}A1

Converting Holdings Data
ProQuest does not add any holdings data
- It only converts what is supplied by the publisher
Formatting according to known publication frequency:
- Tag contains “20071201”
- Known to be a monthly serial
- ProQuest will display “December 2007”

Fields in ProQuest databases
Unique identifiers for:
• the serial title
• the volume/issue/date of the serial
• the article
Other fields as shown on next slide

Fields in ProQuest databases
ProQuest document ID
- Unique identifier for the article (e.g., 26615328)
- Links to the unique identifiers assigned to the volume, issue, or date of the serial, and to the unique identifier assigned to the serial as a whole

ProQuest Title Lists
• Content is available in databases as soon as it is manufactured
• Title lists appear shortly afterward
• Content Control staff run a report to verify that the information we present in the Title List is accurate
• The report is double-checked by a second set of staffers

How ProQuest provides data to ERAMS
Electronic Resources Access and Management Services (ERAMS): organizations that work
to convey holdings data from publishers or aggregators to library systems (“linking partners”)
ProQuest provides our holdings data in the SOH 1.0 format for ERAMS partners

Serials Online Holdings (SOH) standard
• Created by NISO in 2005 as part of the ONIX standards
• Intended for “communicating information about the holdings or coverage of online serial resources from a party that holds or supplies the resources to a party that needs this information in its systems”
• Coded in XML

Serials Online Holdings (SOH) standard
DATA ELEMENTS INCLUDED BY PROQUEST:
• Publisher
• Journal Title
• Journal Issue
• Format (HTML, PDF, etc.)
• Embargo period

Serials Online Holdings (SOH) standard
<HoldingsRecord>
  <RecordReference>PQPMID:83</RecordReference>
  <NotificationType>00</NotificationType>
  <SerialVersion>
    <SerialVersionIdentifier>
      <SerialVersionIDType>07</SerialVersionIDType>
      <IDValue>03841294</IDValue>
    </SerialVersionIdentifier>
    <SerialVersionIdentifier>
      <SerialVersionIDType>01</SerialVersionIDType>
      <IDTypeName>PMID</IDTypeName>
      <IDValue>83</IDValue>
    </SerialVersionIdentifier>
    <Title>
      <TitleType>02</TitleType>
      <TitleText>The Gazette</TitleText>
    </Title>
    <Publisher>
      <PublishingRole>01</PublishingRole>
      <PublisherName>CanWest Digital Media </PublisherName>
    </Publisher>
    <OnlinePackage>
      <OnlineServiceName>ProQuest</OnlineServiceName>
      <Website>
        <WebsiteRole>03</WebsiteRole>
        <WebsiteLink>http://proquest.umi.com/pqdweb</WebsiteLink>
      </Website>
      <HoldingsDetail>
        <JournalIssue>
          <JournalIssueRole>04</JournalIssueRole>
          <JournalIssueDate>
            <DateFormat>00</DateFormat>
            <Date>19850102</Date>
          </JournalIssueDate>
        </JournalIssue>
        <JournalIssue>
          <JournalIssueRole>06</JournalIssueRole>
          <JournalIssueDate>
            <DateFormat>00</DateFormat>
            <Date>20090626</Date>
          </JournalIssueDate>
        </JournalIssue>
        <EpubFormat>10</EpubFormat>
      </HoldingsDetail>
      <Embargo>
        <EmbargoType>02</EmbargoType>
        <EmbargoValue>2</EmbargoValue>
      </Embargo>
    </OnlinePackage>
  </SerialVersion>
</HoldingsRecord>

Serials Online Holdings (SOH) standard
<ProQuestDatabases>
  <DatabaseID>3</DatabaseID>
  <DatabaseName>ABI/INFORM Global</DatabaseName>
  <DatabaseDesc>Most scholarly and comprehensive way to explore and understand business research topics. Search nearly 3000 worldwide business periodicals for in-depth coverage of business and economic conditions, management techniques, theory, and practice of business, advertising, marketing, economics, human resources, finance, taxation, computers, and more. Expanded international coverage. Fast access to information on 60,000 + companies with business and executive profiles. Now includes The Wall Street Journal.</DatabaseDesc>
  <Titles>
    <Title><RecordReference>PQPMID:6</RecordReference></Title>
    <Title><RecordReference>PQPMID:8</RecordReference></Title>
    <Title><RecordReference>PQPMID:7510</RecordReference></Title>
    <Title><RecordReference>PQPMID:7539</RecordReference></Title>
    <Title><RecordReference>PQPMID:7896</RecordReference></Title>
    <Title><RecordReference>PQPMID:7921</RecordReference></Title>
    <Title><RecordReference>PQPMID:7940</RecordReference></Title>
    <Title><RecordReference>PQPMID:7976</RecordReference></Title>

How ProQuest customers can receive holdings information updates
Database Content Mailing List

Thank You
• Gregg Zajic
• Jessica Lehr
• Reed Lenz
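
Illustration: formatting a date tag by publication frequency
The conversion described on the “Converting Holdings Data” slide (a tag containing “20071201” for a serial known to be monthly is displayed as “December 2007”) can be sketched roughly as below. This is only an illustrative sketch and not ProQuest's actual filter: the function name, the frequency labels, and the annual and daily display formats are assumptions introduced here; only the monthly case comes from the slide.

```python
from datetime import datetime

# English month names are spelled out so the output does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def display_issue_date(raw: str, frequency: str) -> str:
    """Format a YYYYMMDD tag value for display, given the serial's known frequency.

    Hypothetical helper: the frequency labels and every format other than
    the monthly one are assumptions made for this sketch.
    """
    parsed = datetime.strptime(raw, "%Y%m%d")  # raises ValueError on malformed input
    month = MONTHS[parsed.month - 1]
    if frequency == "annual":
        return str(parsed.year)
    if frequency == "monthly":
        # The case given on the slide: the day component is dropped.
        return f"{month} {parsed.year}"
    if frequency == "daily":
        return f"{month} {parsed.day}, {parsed.year}"
    raise ValueError(f"unsupported frequency: {frequency!r}")


print(display_issue_date("20071201", "monthly"))  # December 2007
```

Note that the sketch only reshapes the value supplied in the tag; it adds no holdings information of its own, in keeping with the point that ProQuest converts only what the publisher provides.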