Contrails Digital Initiative
from a Documents Librarian point of view
Presented at:
NIDL Spring 2015 Meeting
5/15/15
IIT – Galvin Library
Timeframe
• Germination (1996-1998)
• Initialization (1999-2000)
• Obstruction (2001-2002)
• Redesign & Reboot (2003-2004)
• Rebrand & On Demand (2005-2008)
• Capitulation & Celebration (2009-)
Germination (1996-1998)
Patron Requests
• Citations indicating government
technical reports with WADC & WADD
designations brought to Reference Desk
• A group of unprocessed reports, most
with yellow covers, lived in the
basement
• No shelf list or index of any kind
WADC & WADD?
• Became clear that the acronyms stand
for Wright Air Development Center and
Wright Air Development Division
• Bureaucratic umbrella organizations
that provided a centralized
administration of research efforts for
many laboratories at Wright-Patterson
Air Force Base near Dayton, Ohio
Let’s find call numbers!
Andriot:
Added call numbers
• We used D 301.45/6:yr-num, where “yr”
is the two-digit year and “num” is the
number of the report for that year.
• The correct stem was actually D 301.45/5:
• The D 301.45/6: stem was for a group
of reports with report numbers “AF TR
num”
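A minimal sketch of the call number pattern described above, assuming a report is identified only by its year and its sequence number within that year. The function name `sudoc_number` and the sample report (year 1955, number 123) are hypothetical and used only for illustration; the two stems are the ones named on this slide.

```python
def sudoc_number(year: int, report_num: int, stem: str = "D 301.45/5:") -> str:
    """Build a call number of the form <stem><yr>-<num>.

    The default stem is the one the slide identifies as correct;
    pass stem="D 301.45/6:" to reproduce the stem used by mistake.
    """
    yr = year % 100  # keep only the last two digits of the year
    return f"{stem}{yr:02d}-{report_num}"


# Hypothetical report: number 123 of 1955.
print(sudoc_number(1955, 123))                      # D 301.45/5:55-123
print(sudoc_number(1955, 123, stem="D 301.45/6:"))  # D 301.45/6:55-123
```

Keeping the stem as a parameter means a batch of numbers assigned under the wrong stem could be regenerated without retyping the year and report number.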
Andriot
Not published by GPO:
Monthly Catalog
MoCat Key
“Black Dot” Titles
Item Numbers Indicated
WADC Titles
SuDocs Numbers
PB Number Indicated
Issuing Office
AD Numbers
Notices
Qualified Requesters
For sale to the general public
Printed “AD Number”
Stamped “AD Number”
Front Cover Statement
Handwritten “AD Number”
Stamped “PB Number”
NTIS Document
Recap of “Germination”
• Lack of information from the Monthly
Catalog led to:
– Assignment of the incorrect SuDoc number
to hundreds of reports
– Assumption that each and every report we
possessed was distributed to depository
libraries
Reality
• Even though SuDocs numbers could be
found for pretty much all the reports,
that fact did not indicate public
availability
• Some reports we possessed were for
sale to the public and some weren’t
released to be sold to the public
How did we get so many?
• The reports were in paper format, but
both ASTIA/DTIC and OTS/NTIS
holdings are in microfiche
• Distributed to research organizations
who had DoD contracts depending on
their research subject
• Armour/IIT did research in materials
science
Later reports
• Reports from the 1960s, received
through donation, sometimes had
distribution lists
• It became clear that very few paper
copies of most reports were distributed
• Recipients only received a small fraction
of the overall series
Trouble Brewing
• We possessed materials we thought
were fully public
• We created a database with metadata
for the reports we possessed
• A plan to digitize and post the reports to
a public website was hatched
• Wright Air Development Center Digital
Collection was on its way!
Initialization (1999-2000)
Website Features
• IIT connection
– The Illinois Institute of Technology, Armour
Research Foundation and IIT Research
Institute were all involved in research that
generated WADC Technical Reports, so
we created a section to highlight the role of
the institutions
Website Features
• Space Race
– This feature highlighted the role of the
laboratories at Wright-Patterson AFB with
regard to spaceflight research prior to the
Apollo missions
– Before NASA was created, there was a
possibility that the Air Force could have
taken the lead in spaceflight
Website Features
• Roswell
– Many of the government explanations for
the supposed alien activity near Roswell,
New Mexico had to do with the activities of
WADC researchers
– Digitized both the relevant WADC reports
and the two government monographs on
Roswell
By 2001
• Digitized and posted approximately 400
reports, with about 800 more to go
• Had a fully fleshed out website, one
focused on hits and then page views
• Digitization equipment difficult to work
with and unreliable, and corners cut to
boost numbers
Obstruction (2001-2002)
DoD Comes a Knocking
• Actually, it was emails and phone calls
• We were informed that many, actually
most, of the reports we had posted had
never been cleared for public release
• Could we please remove those reports?
• Would have devastated our efforts
Worked with DTIC
• They searched their Private STINET
database (now DTIC Online Access
Controlled) to try to help us keep as
much posted as possible
• Ended up removing many, many
reports, many of which still reside in an
offline network folder named “Removed”
Problem overstated
• Private STINET was a very dirty
database.
• Additionally, report numbers were
erroneous on a global scale
• WADC TR & WADD TR converted to
ASD TR
• Led to low return on searches
Incorrect Distributions
• Additionally, there were many reports
that were listed as “Limited Distribution”
that had actually been released to the
public.
• Evidence for this included cover
statements, index entries and PB
Numbers
Evidence of Public Availability
• We used evidence of public availability
to justify re-posting many reports
• We informed DTIC of the fact that we
felt their database was inaccurate and
that we would repost reports that had
evidence of public availability
• I guess they shrugged
More information
• The information just covered was the
focus of the presentations from 2005-06,
so I won’t go into great detail
– See:
• http://hdl.handle.net/10560/1326
• http://hdl.handle.net/10560/1327
Redesign & Reboot (2003-2004)
Website Redesign
• During 2003-2004 an effort was made
to update the website
• Aesthetic and architecture of the new
site followed the main library website
• Added a search engine function
• Created a Report Number matrix
• Received first major donation
Website Features
• Added and removed features from the
“Historical Overview”
– Removed “IIT Connection”
– Added “Pearl Harbor”
– Added “Image Gallery”
– Added “Feature Report”
– Digitized SuDocs from the “D 301.2:” stem
Report Digitization
• Gained access to Private STINET
– Able to determine that many reports
actually were public, just had report
number or spelling errors that made
records hard to find
– Posted only those that DTIC agreed were
public
– Created list of “limited” reports with
evidence of public availability
Rebrand & On Demand (2005-2008)
Rebranding
• Since the Wright Air Development
Center was defunct, we were the only
web presence
• Gave the impression that WADC was a
unit of IIT
• Decided to rebrand to Contrails
Benefits to rebranding
• Remove confusion about WADC
• Better reflected our broader digitization
efforts
– Donations of reports from other
laboratories
– Many, many report numbers
• Domain name: contrails.iit.edu
Drawbacks to rebranding
• Contrails not unique enough
– There are now many websites on the topic
• Wikipedia
• NASA, FAA & NOAA
• UIUC, Wisconsin-Madison
• PBS
• Company names
• “Chemtrails” conspiracy
• Currently Page 4 of Google results
On-demand Scanning
• Always receiving emails about reports
that people desired
• Sometimes we had them
• Sometimes we didn’t
• Only resource online aside from DTIC
Online, which is there to serve DoD and
contractors, not the public
On-demand Scanning
• Created dynamic links
– If we had scanned the report, it would
create a download link
– If the report was Limited Distribution, it
would indicate that fact
– If we didn’t possess the report, it would
make that clear
– If we had yet to scan the report, it would
create a “mailto” link to request scanning
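The four cases above can be sketched as a single decision function. This is an illustration only, not the site's actual code (the site ran on ASP). The `Report` fields, the function name `dynamic_link`, the message wording and the request address are all assumptions made for the example.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional
from urllib.parse import quote

# Placeholder address, not a real mailbox.
REQUEST_ADDRESS = "scan-requests@example.edu"


@dataclass
class Report:
    number: str                    # report number as shown to the user
    held: bool                     # do we possess a paper copy?
    limited: bool = False          # marked Limited Distribution?
    pdf_url: Optional[str] = None  # set once the report has been scanned


def dynamic_link(report: Report) -> str:
    """Return the HTML fragment shown for one report, following the four cases."""
    label = escape(report.number)
    if report.pdf_url:                      # already scanned: offer the download
        return f'<a href="{escape(report.pdf_url)}">Download {label}</a>'
    if report.limited:                      # cannot be posted: say so
        return f"{label}: Limited Distribution"
    if not report.held:                     # nothing to scan: say so
        return f"{label}: not in our collection"
    # Held but not yet scanned: build a mailto link with a pre-filled subject.
    subject = quote(f"Scan request: {report.number}")
    return (f'<a href="mailto:{REQUEST_ADDRESS}?subject={subject}">'
            f"Request scanning of {label}</a>")


print(dynamic_link(Report("WADC TR 55-123", held=True)))
```

The order of the checks mirrors the order of the bullets; a different order would change what is shown for a report that is both scanned and marked limited.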
On-demand Scanning
• Very successful initially
• Many scanning requests received and
filled
• Depended on Google’s search engine
algorithm, which followed and indexed
our dynamic links
• Felt we could depend on this model
Capitulation & Celebration (2009 forward)
On-demand Scanning
• Requests began to drop off
• Google had OCRed and indexed all of
our pdfs
• Users weren’t being driven to our site,
and weren’t seeing our scanning
request dynamic links
• Turns out, Google stopped indexing
them
Fight Google?!
• Wanted to increase traffic to Contrails
– Users bypassing site to download pdfs
straight from Google search
– Missing dynamic links to request scanning
of reports we possessed but hadn’t
digitized
– How to drive people to site?
Can’t fight Google
• Best bet was to change model
– Return to heavier digitization rather than
relying on scanning requests
– Once reports were posted, and indexed by
Google, accessibility to users was great
– Who cares whether the user is driven to
our site, if they get what they need?
– Anecdotal evidence of the value of our
efforts
Contrails: Scanning Requests and Citations, 2005-08, 2009-14
(Chart, 2005-2015, two vertical scales: 0-180 and 0-18. Series: Cumulative Requests (Relevancy Period), Cumulative Requests (Immediacy Period), Cumulative Cites (Relevancy Period), Cumulative Cites (Immediacy Period). Data labels surviving from the chart: 153, 129, 125, 26, 15, 3, 1.)
Celebration
• Anecdotal evidence of value
– Contrails.iit.edu showing up in
bibliographies
• 4 Patents
• 2 Books
• 6 journal articles
• 2 dissertations
• 1 Conference paper
(Chart: citing works plotted by Contemporary Research Pub. Date against Cited Reference Pub. Date, '50s through '10s.)
Key: CP - Conference Paper; D - Dissertation; JA - Journal Article; M - Monograph; P - US Patent
Citing works: CP1 ('09); D1 ('11), D2 ('12); JA1 ('08), JA2 ('11), JA3 ('13), JA4 ('14), JA5 ('14), JA6 ('14); M1 ('09), M2 ('11), M3 ('12); P1 ('12), P2 ('13), P3 ('14), P4 ('14)
Soapbox Time
Threats to NTIS
• 1988: Discussion of privatization
– Y 4.En 2/3:100-170
– Y 4.Sci 2:100/5
– Y 4.Sci 2:100-36
– Y 4.Sci 2:100/84
• 1999: Plan to close NTIS
– Y 4.Sci 2:106-37
Threats to NTIS
• 2012: NTIS’ Dissemination of Technical
Reports needs congressional attention
– GA 1.13:GAO-13-99
• 2014: NTIS’ Dissemination of Technical
Reports needs attention
– GA 1.13:GAO-14-781T
GA 1.13:GAO-14-781T
• 3 Major findings
– NTIS’s fee-based model is losing money
• Outdated funding model
– Reports are available elsewhere
• Only 74% were available elsewhere
– Demand is higher for newer reports
• 62% of 21st century additions to repository are
from the 20th century
Future role for NTIS
• Centralized shallow web indexing of all
Federal technical report collections.
NTIS should be the Google Scholar for
technical reports
• Not all agency technical report sites are
shallow web, for instance NASA
Increased indexing
• From the looks of it, there are 350,000
conference proceedings held by NTIS
• The papers from these conferences
haven’t been indexed
• Contrails indexing of conferences at the
paper/presentation level has been wildly
successful
• There could be close to 10,000,000
unindexed papers held by NTIS
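The slide does not show how the 10,000,000 figure was reached. As a rough check, the arithmetic below shows the average number of papers per proceedings that the two figures imply. Only the two totals come from the slide; the variable names are illustrative, and the resulting average is an implication of those totals, not a figure reported by NTIS.

```python
proceedings = 350_000          # conference proceedings held by NTIS (from the slide)
estimated_papers = 10_000_000  # upper estimate of unindexed papers (from the slide)

# Average papers per proceedings implied by the two figures above.
papers_per_proceedings = estimated_papers / proceedings
print(f"{papers_per_proceedings:.1f} papers per proceedings")  # 28.6 papers per proceedings
```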
Information Advocate
• NTIS could take the lead in ensuring
public availability of federally funded
information
– Work with DTIC to remedy their
suppression of technical reports that have
been announced as publicly available
– Identify similar problems at other agencies
and work with them as well
Information Advocate
• Work to release technical information
from the various “Sensitive but
Unclassified” distribution limitations
invented to get around EO 10901
• Time to step back down off the Soap
Box
Contrails Value Added
• Comparing Contrails to DTIC Online
– Public can request digitization
– Scans from paper instead of microfiche
• Photographs in grayscale
– Improved handling of foldout pages
– Better resolution coupled with lower file
size
– Paper level indexing of conferences
Moving Forward
• Issues
– May need to migrate website from an ASP
model hosted on a Windows server
– Would like to integrate the aesthetic with
the new library and university website
redesign
– Lost access to DTIC Online Access
Controlled
Moving Forward
• Opportunities
– Split off historical resources from technical
reports
– Streamline indexing of reports and allow
full-text web searching to augment our
website
– Accelerate digitization efforts
Thank you report donors!
• Lockheed Martin
Missiles & Fire
Control
• Air Force Research
Laboratory
• University of
Cincinnati
• Embry-Riddle
Aeronautical
University
• Royal Air Force
Centre for
Aerospace Medicine
• Bombardier