Texas Digital Newspaper Program Data

What we gather, and how we use it.
By Ana Krahmer & Mark Phillips
University of North Texas Libraries
4 February 2014
Start Spreading the News!
Overview
I. Target Audiences
II. Data Collection
III. Data Use
IV. Questions
Target Audiences of the Texas Digital Newspaper Program
• K-12 Students & Educators
• Higher Education Researchers, including undergraduate, graduate, and faculty researchers
• Librarians
• Genealogists
• Lay-Historians
• Lifelong-Learners
Current Classroom Use
• Teaching with Primary Sources: National History Day; Texas Junior Historians; Texas History, grades 4 & 7.
• Texas Tech University: Dr. Ann Hawkins’ Texas Manuscript Cultures Online; undergraduate and graduate Book History and Research Methods courses.
• University of North Texas: Dr. Andrew Torget’s History of Texas and American History courses.
• Austin College: Dr. Light Cummins’ Research Methods.
Library Use
• Texas Digital Newspaper Program Feedback utility
• Partner institution pages
• Newspaper Program Traveling Banner
• Links to share on social networking feeds
Genealogists
• Full-text searchability
• Search content highlighting
• Full metadata records
• Faceted links
Lay-Historians
• Annual Digital Frontiers Conference
• Free and open access to newspaper content
• Zoomable views
• Permissions to use in research, with citations
Lifelong-Learners
• Emeritus College students
• Partner public library patrons
• Local civic interest groups
Data Collection
Where did it all come from?
• Qualitative: surveys, feedback responses, grant report comments, and partner communications
• Quantitative: analysis of collection usage, geographic origin, contributing partners, and annual additions to the collection
Data Collection (Qualitative)
• Grant-funded projects require final reports from TDNP partner
institutions.
• In 2012, Kathleen Murray and Dreanna Belden launched an impact
survey for Portal to Texas History users.
• The Portal feedback database offers years of user questions.
Data Collection (Qualitative)
“The newspaper digitization project places our library in position to reach out toward the future. By taking a piece of the past and bringing it with us, we are sure to grow and learn, appreciate and respect what was, what is and what will be.”
- Whitaker, L. (2013). Richard S. & Leah Morris Memorial Library Final Grant Project Report to the Tocker Foundation.
Data Collection (Qualitative)
• The Portal to Texas History Impact Survey was launched in 2012 by
Murray and Belden.
• Of 573 respondents, 36% self-identified as genealogists, 19% as
lifelong-learners, 19% as historians, 6% as librarians, 5% as
students, and 15% as “other.”
• 93 individual comments within this survey cited the newspapers as
being of especial value.
Data Collection (Quantitative)
• Descriptive metadata for the Texas Digital Newspaper Program collection
is available via OAI-PMH in multiple formats.
• Metadata was collected from the TDNP OAI repository endpoint with a Python-based harvester, pyoaiharvester.
• The harvest returned 165,298 metadata records across all years.
• Phillips then wrote a Python script to parse the records and extract the relevant information.
• The script takes the pyoaiharvester output as input and returns a tab-delimited file with one newspaper issue per row.
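The harvest-then-parse step described above can be sketched roughly as follows. This is a minimal illustration, not the script listed under External References: it assumes the harvested file holds OAI-PMH records in the standard oai_dc format, the function name `records_to_rows` and the three columns it emits are hypothetical, and the sample record is made up for the example.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH and Dublin Core XML namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative record only; real harvested records will carry more fields.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>ark:/67531/metapth16320</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>The Tattler</dc:title>
          <dc:date>1934</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def records_to_rows(xml_text):
    """Yield one tab-delimited line (identifier, title, date) per record."""
    root = ET.fromstring(xml_text)
    for record in root.iter("{http://www.openarchives.org/OAI/2.0/}record"):
        identifier = record.findtext(".//oai:identifier", default="", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        date = record.findtext(".//dc:date", default="", namespaces=NS)
        yield "\t".join([identifier, title, date])


for row in records_to_rows(SAMPLE):
    print(row)
```

Yielding rows one at a time keeps memory use flat, which matters when a harvest holds well over a hundred thousand records.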
The fields* in the tab-delimited file are:

Field        Field Description                              Example Data
ARK          ARK identifier for issue                       ark:/67531/metapth16320
Partner      Contributing partner code                      BDPL
Year Online  Year issue went online (prefixed with “od:”)   od:2006
Year         Year of newspaper issue                        1934
Decade       Decade of newspaper issue                      1930
County       County of newspaper issue                      Palo Pinto County
Community    Community of newspaper issue                   Mineral Wells
Title        Title of newspaper issue                       The Tattler

*All values in the Partner field can be resolved from the controlled vocabulary:
http://digital2.library.unt.edu/vocabularies/institutions/
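To make the field layout concrete, the sketch below reads one row of such a file into a named record. It assumes the columns appear in the order listed above; `IssueRow` and `parse_row` are illustrative names, not part of the TDNP script, and the sample row simply strings together the example values from the field list, which may not all describe the same issue.

```python
from typing import NamedTuple


class IssueRow(NamedTuple):
    ark: str
    partner: str
    year_online: int
    year: int
    decade: int
    county: str
    community: str
    title: str


def parse_row(line):
    """Split one tab-delimited line into an IssueRow (assumed column order)."""
    ark, partner, year_online, year, decade, county, community, title = (
        line.rstrip("\n").split("\t")
    )
    # The Year Online field carries an "od:" prefix, e.g. "od:2006".
    if year_online.startswith("od:"):
        year_online = year_online[len("od:"):]
    return IssueRow(ark, partner, int(year_online), int(year), int(decade),
                    county, community, title)


sample = "\t".join([
    "ark:/67531/metapth16320", "BDPL", "od:2006", "1934", "1930",
    "Palo Pinto County", "Mineral Wells", "The Tattler",
])
row = parse_row(sample)
print(row.year_online, row.decade, row.community)
```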
Number of issues added per year (n=165,149)
Year    # of Issues    Cumulative    % of Issues    Cumulative %
2006             54            54          0.03%           0.03%
2007              0            54          0.00%           0.03%
2008             44            98          0.03%           0.06%
2009          7,263         7,361          4.40%           4.46%
2010         44,788        52,149         27.10%          31.56%
2011         32,626        84,775         19.74%          51.30%
2012         30,836       115,611         18.65%          69.95%
2013         49,538       165,149         29.97%          99.92%
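The cumulative and percentage columns follow from the per-year counts. A minimal sketch, assuming the percentages are taken against the 165,298 harvested records (an inference from the figures, since the 2013 cumulative share is 99.92% rather than 100%); `yearly_summary` is a hypothetical helper, and values rounded this way may differ from the slide in the last digit.

```python
from itertools import accumulate

# Per-year issue counts as reported in this presentation.
ISSUES_BY_YEAR = {
    2006: 54, 2007: 0, 2008: 44, 2009: 7263,
    2010: 44788, 2011: 32626, 2012: 30836, 2013: 49538,
}
TOTAL_RECORDS = 165298  # assumed denominator: all harvested metadata records


def yearly_summary(counts, total):
    """Return (year, count, cumulative count, percent of total) per year."""
    years = sorted(counts)
    running = accumulate(counts[y] for y in years)
    return [
        (year, counts[year], cumulative, round(100 * counts[year] / total, 2))
        for year, cumulative in zip(years, running)
    ]


summary = yearly_summary(ISSUES_BY_YEAR, TOTAL_RECORDS)
for year, count, cumulative, percent in summary:
    print(year, count, cumulative, f"{percent:.2f}%")
```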
Number of counties added per year (n=109)
Year    # of Titles Added    # of New Titles Added    Cumulative # of Titles
2006                    2                        2                         2
2007                    0                        0                         2
2008                    2                        2                         4
2009                   26                       25                        29
2010                  121                      114                       143
2011                  269                      252                       395
2012                  163                      145                       540
2013                   98                       79                       619
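One plausible reading of the three count columns: "added" is the number of distinct values (titles, communities, or partner codes) with at least one issue going online in a year, "new" is the number of those appearing for the first time, and the cumulative column is the running total of "new". This reading is inferred from the numbers, not stated in the slides. The sketch below computes all three from (year online, value) pairs; `added_new_cumulative` and the toy data are hypothetical.

```python
def added_new_cumulative(pairs):
    """pairs: iterable of (year_online, value), one pair per issue.

    Returns (year, distinct added, first-time new, cumulative) per year.
    Years with no issues do not appear in the result.
    """
    by_year = {}
    for year, value in pairs:
        by_year.setdefault(year, set()).add(value)

    seen = set()
    result = []
    for year in sorted(by_year):
        values = by_year[year]
        new = values - seen          # values never seen in an earlier year
        seen |= new
        result.append((year, len(values), len(new), len(seen)))
    return result


# Toy data: title "A" receives issues in two different years.
toy = [(2006, "A"), (2006, "B"), (2008, "A"), (2008, "C"), (2008, "C")]
counts = added_new_cumulative(toy)
print(counts)
```

Under this reading, a title that gains issues in several years counts as "added" each time but as "new" only once, which would explain why the "added" column can exceed the "new" column.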
Number of communities added per year (n=142)
Year    # of Communities Added    # of New Communities Added    Cumulative # of Communities
2006                         2                             2                              2
2007                         0                             0                              2
2008                         2                             2                              4
2009                        17                            16                             20
2010                        35                            22                             42
2011                        62                            42                             84
2012                        66                            37                            121
2013                        47                            21                            142
Counties currently represented in the TDNP Collection
Number of partners added per year (n=48)
Year    # of Partners Added    # of New Partners Added    Cumulative # of Partners
2006                      2                          2                           2
2007                      0                          0                           2
2008                      2                          2                           4
2009                      6                          5                           9
2010                     10                          6                          15
2011                     16                         12                          27
2012                     20                         10                          37
2013                     24                         10                          47
2014                      3                          1                          48
TDNP Partner Institutions
Partner Type                        # of Partners by Type    # of Issues
Public Libraries                                       27         66,320
Academic Libraries & Archives*                         13         41,308
Genealogical/Historical Societies                       4          5,065
Museums                                                 2          2,544

* “UNT Archives” and “UNT Libraries” are two additional partner institutions, bringing the total to 48 partners. This table covers only content whose digitization was funded by external partners; the 49,805 issues funded internally by UNT are excluded.
Google Fusion Tables map, derived from TDNP Newspaper Locations:
https://www.google.com/fusiontables/embedviz?q=select+col1+from+17utAXOiLgXhaEXlHgsIfE2DJ_2OlQyD-XIRLZU&viz=MAP&h=false&lat=28.271046172964198&lng=103.64894140625&t=1&z=6&l=col1&y=2&tmplt=2&hml=GEOCODABLE
External References
• Python metadata extraction script: https://github.com/vphill/untl_metadata_extraction/blob/master/tdnp_dataset.py
• Texas Digital Newspaper Program OAI API: http://texashistory.unt.edu/explore/collections/TDNP/oai/
• pyoaiharvester script: https://github.com/vphill/pyoaiharvester
• Belden, Dreanna & Murray, Kathleen R. Where do users find value? UNT Digital Library. http://digital.library.unt.edu/ark:/67531/metadc185793/. Accessed January 27, 2014.
Questions?
Ana Krahmer: Ana.Krahmer@unt.edu
Mark Phillips: Mark.Phillips@unt.edu