Contrails Digital Initiative from a Documents Librarian point of view

Presented at: NIDL Spring 2015 Meeting
5/15/15
IIT – Galvin Library

Timeframe
• Germination (1996-1998)
• Initialization (1999-2000)
• Obstruction (2001-2002)
• Redesign & Reboot (2003-2004)
• Rebrand & On Demand (2005-2008)
• Capitulation & Celebration (2009-)

Germination (1996-1998)

Patron Requests
• Patrons brought citations to government technical reports with WADC & WADD designations to the Reference Desk
• A group of unprocessed reports, most with yellow covers, lived in the basement
• No shelf list or index of any kind

WADC & WADD?
• Became clear that the acronyms stand for Wright Air Development Center and Wright Air Development Division
• Bureaucratic umbrella organizations that provided centralized administration of research efforts for many laboratories at Wright-Patterson Air Force Base near Dayton, Ohio

Let’s find call numbers!
[Images: Andriot]

Added call numbers
• We used D 301.45/6:yr-num, where “yr” is the year in 2 digits, and “num” is the number of the report for that year.
• The right stem was actually D 301.45/5:
• The D 301.45/6: stem was for a group of reports with report numbers “AF TR num”

[Images: Andriot]

Not published by GPO: Monthly Catalog
[Image labels: MoCat Key; “Black Dot” Titles; Item Numbers Indicated; WADC Titles; SuDocs Numbers; PB Number Indicated; Issuing Office; AD Numbers; Notices; Qualified Requesters; For sale to the general public; Printed “AD Number”; Stamped “AD Number”; Front Cover Statement; Handwritten “AD Number”; Stamped “PB Number”; NTIS Document]

Recap of “Germination”
• Lack of information from the Monthly Catalog led to:
  – Assignment of the incorrect SuDoc number to hundreds of reports
  – Assumption that each and every report we possessed was distributed to depository libraries

Reality
• Even though SuDocs numbers could be found for pretty much all the reports, that fact did not indicate public availability
• Some reports we possessed were for sale to the public; others were never released for public sale

How did we get so many?
• The reports were in paper format, but both ASTIA/DTIC and OTS/NTIS holdings are in microfiche
• Distributed to research organizations that had DoD contracts, depending on their research subject
• Armour/IIT did research in materials science

Later reports
• Reports from the 1960s received through donation sometimes had distribution lists
• It became clear that very few paper copies of most reports were distributed
• Recipients only received a small fraction of the overall series

Trouble Brewing
• We possessed materials we thought were fully public
• We created a database with metadata for the reports we possessed
• A plan to digitize and post the reports to a public website was hatched
• Wright Air Development Center Digital Collection was on its way!
Initialization (1999-2000)

Website Features
• IIT connection
  – The Illinois Institute of Technology, Armour Research Foundation and IIT Research Institute were all involved in research that generated WADC Technical Reports, so we created a section to highlight the role of the institutions

Website Features
• Space Race
  – This feature highlighted the role of the laboratories at Wright-Patterson AFB with regard to spaceflight research prior to the Apollo missions
  – Before NASA was created, there was a possibility that the Air Force could have taken the lead in spaceflight

Website Features
• Roswell
  – Many of the government explanations for the supposed alien activity near Roswell, New Mexico had to do with the activities of WADC researchers
  – Digitized both the relevant WADC reports and the two government monographs on Roswell

By 2001
• Digitized and posted approximately 400 reports, with about 800 more to go
• Had a fully fleshed out website, one focused on hits and then page views
• Digitization equipment was difficult to work with and unreliable, and corners were cut to boost numbers

Obstruction (2001-2002)

DoD Comes a Knocking
• Actually, it was emails and phone calls
• We were informed that many, actually most, of the reports we had posted had never been cleared for public release
• Could we please remove those reports
• Would have devastated our efforts

Worked with DTIC
• They searched their Private STINET database (now DTIC Online Access Controlled) to try to help us keep as much posted as possible
• Ended up removing many, many reports, many of which still reside in an offline network folder named “Removed”

Problem overstated
• Private STINET was a very dirty database
• Additionally, report numbers were erroneous on a global scale
• WADC TR & WADD TR converted to ASD TR
• Led to low return on searches

Incorrect Distributions
• Additionally, there were many reports that were listed as “Limited Distribution” that had actually been released to the public.
• Evidence for this included cover statements, index entries and PB Numbers

Evidence of Public Availability
• We used evidence of public availability to justify re-posting many reports
• We informed DTIC that we felt their database was inaccurate and that we would repost reports that had evidence of public availability
• I guess they shrugged

More information
• The information just covered was the focus of the presentations from 2005-06, so I won’t go into great detail
  – See:
    • http://hdl.handle.net/10560/1326
    • http://hdl.handle.net/10560/1327

Redesign & Reboot (2003-2004)

Website Redesign
• During 2003-2004 an effort was made to update the website
• Aesthetic and architecture of the new site followed the main library website
• Added a search engine function
• Created a Report Number matrix
• Received first major donation

Website Features
• Added and removed features from the “Historical Overview”
  – Removed “IIT Connection”
  – Added “Pearl Harbor”
  – Added “Image Gallery”
  – Added “Feature Report”
  – Digitized SuDocs from the “D 301.2:” stem

Report Digitization
• Gained access to Private STINET
  – Able to determine that many reports actually were public, just had report number or spelling errors that made records hard to find
  – Posted only those that DTIC agreed were public
  – Created list of “limited” reports with evidence of public availability

Rebrand and On-Demand (2005-2008)

Rebranding
• Since the Wright Air Development Center was defunct, we were its only web presence
• Gave the impression that WADC was a unit of IIT
• Decided to rebrand to Contrails

Benefits to rebranding
• Removed confusion about WADC
• Better reflected our broader digitization efforts
  – Donations of reports from other laboratories
  – Many, many report numbers
• Domain name: contrails.iit.edu

Drawbacks to rebranding
• Contrails not unique enough
  – There are now many websites on the topic
    • Wikipedia
    • NASA, FAA & NOAA
    • UIUC, Wisconsin-Madison
    • PBS
    • Company names
    • “Chemtrails” conspiracy
    • Currently Page 4 of Google results

On-demand Scanning
• Always receiving emails about reports that people desired
• Sometimes we had them
• Sometimes we didn’t
• Only online resource aside from DTIC Online, which exists to serve DoD and contractors, not the public

On-demand Scanning
• Created dynamic links
  – If we had scanned the report, it would create a download link
  – If the report was Limited Distribution, it would indicate that fact
  – If we didn’t possess the report, it would make that clear
  – If we had yet to scan the report, it would create a “mailto” link to request scanning

On-demand Scanning
• Very successful initially
• Many scanning requests received and filled
• Depended on Google’s search engine algorithm, which followed and indexed our dynamic links
• Felt we could depend on this model

Capitulation & Celebration (2009 forward)

On-demand Scanning
• Requests began to drop off
• Google had OCRed and indexed all of our PDFs
• Users weren’t being driven to our site, and weren’t seeing our scanning request dynamic links
• Turns out, Google stopped indexing them

Fight Google?!
• Wanted to increase traffic to Contrails
  – Users bypassing site to download PDFs straight from Google search
  – Missing dynamic links to request scanning of reports we possessed but hadn’t digitized
  – How to drive people to site?
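
Aside: the four dynamic-link cases listed under On-demand Scanning can be sketched roughly as below. This is only an illustration in Python, not the site's actual ASP code; the `Report` fields, the `dynamic_link` function, the example report number, and the `scans@example.edu` address are all hypothetical, and the order in which the cases are checked is an assumption.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional
from urllib.parse import quote


@dataclass
class Report:
    """Hypothetical record for one report (field names are assumptions)."""
    number: str                    # report number, e.g. a made-up "WADC TR 55-123"
    held: bool                     # do we possess a paper copy?
    limited: bool = False          # listed as Limited Distribution?
    pdf_url: Optional[str] = None  # set once the report is scanned and posted


def dynamic_link(report: Report, request_address: str = "scans@example.edu") -> str:
    """Return the HTML snippet shown next to a report in a listing."""
    label = escape(report.number)
    if report.pdf_url:
        # Already scanned: offer the download.
        return f'<a href="{escape(report.pdf_url)}">Download {label}</a>'
    if report.limited:
        # Limited Distribution: say so instead of linking.
        return f"{label}: Limited Distribution, not available for download"
    if not report.held:
        # Not in the collection: make that clear.
        return f"{label}: not held in our collection"
    # Held but not yet scanned: build a mailto link to request scanning.
    subject = quote(f"Scan request: {report.number}")
    return (f'<a href="mailto:{request_address}?subject={subject}">'
            f"Request scan of {label}</a>")
```

For a held but unscanned report, `dynamic_link(Report("WADC TR 55-123", held=True))` yields a mailto link whose subject line already names the report, so a request can be filled without further correspondence.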

Can’t fight Google
• Best bet was to change model
  – Return to heavier digitization rather than relying on scanning requests
  – Once reports were posted, and indexed by Google, accessibility to users was great
  – Who cares whether the user is driven to our site, if they get what they need
  – Anecdotal evidence of the value of our efforts

[Chart: Contrails: Scanning Requests and Citations, 2005-08, 2009-14. Series: Cumulative Requests (Relevancy Period); Cumulative Requests (Immediacy Period); Cumulative Cites (Relevancy Period); Cumulative Cites (Immediacy Period). Horizontal axis: 2005 to 2015.]

Celebration
• Anecdotal evidence of value
  – Contrails.iit.edu showing up in bibliographies
    • 4 Patents
    • 2 Books
    • 6 journal articles
    • 2 dissertations
    • 1 Conference paper

[Figure: citing works plotted by Contemporary Research Pub. Date against Cited Reference Pub. Date ('50s to '10s). Key: CP - Conference Paper; D - Dissertation; JA - Journal Article; M - Monograph; P - US Patent. Works shown: CP1 ('09); D1 ('11); D2 ('12); JA1 ('08); JA2 ('11); JA3 ('13); JA4 ('14); JA5 ('14); JA6 ('14); M1 ('09); M2 ('11); M3 ('12); P1 ('12); P2 ('13); P3 ('14); P4 ('14).]

Soapbox Time

Threats to NTIS
• 1988: Discussion of privatization
  – Y 4.En 2/3:100-170
  – Y 4.Sci 2:100/5
  – Y 4.Sci 2:100-36
  – Y 4.Sci 2:100/84
• 1999: Plan to close NTIS
  – Y 4.Sci 2:106-37

Threats to NTIS
• 2012: NTIS’ Dissemination of Technical Reports needs congressional attention
  – GA 1.13:GAO-13-99
• 2014: NTIS’ Dissemination of Technical Reports needs attention
  – GA 1.13:GAO-14-781T

GA 1.13:GAO-14-781T
• 3 Major findings
  – NTIS’s fee based model is losing money
    • Outdated funding model
  – Reports are available elsewhere
    • Only 74% were available elsewhere
  – Demand is higher for newer reports
    • 62% of 21st century additions to repository are from the 20th century

Future role for NTIS
• Centralized shallow web indexing of all Federal technical report collections.
  NTIS should be the Google Scholar for technical reports
• Not all agency technical report sites are shallow web, for instance NASA

Increased indexing
• From the looks of it, there are 350,000 conference proceedings held by NTIS
• The papers from these conferences haven’t been indexed
• Contrails indexing of conferences at the paper/presentation level has been wildly successful
• There could be close to 10,000,000 unindexed papers held by NTIS

Information Advocate
• NTIS could take the lead in ensuring public availability of federally funded information
  – Work with DTIC to remedy their suppression of technical reports that have been announced as publicly available
  – Identify similar problems at other agencies and work with them as well

Information Advocate
• Work to release technical information from the various “Sensitive but Unclassified” distribution limitations invented to get around EO 10901
• Time to step back down off the Soap Box

Contrails Value Added
• Comparing Contrails to DTIC Online
  – Public can request digitization
  – Scans from paper instead of microfiche
    • Photographs in grayscale
  – Improved handling of foldout pages
  – Better resolution coupled with lower file size
  – Paper level indexing of conferences

Moving Forward
• Issues
  – May need to migrate website from an ASP model hosted on a Windows server
  – Would like to integrate the aesthetic with the new library and university website redesign
  – Lost access to DTIC Online Access Controlled

Moving Forward
• Opportunities
  – Split off historical resources from technical reports
  – Streamline indexing of reports and allow full-text web searching to augment our website
  – Accelerate digitization efforts

Thank you report donors!
• Lockheed Martin Missiles & Fire Control
• Air Force Research Laboratory
• University of Cincinnati
• Embry-Riddle Aeronautical University
• Royal Air Force Centre for Aerospace Medicine
• Bombardier