Texas Digital Newspaper Program Data
What we gather, and how we use it.

By Ana Krahmer & Mark Phillips
University of North Texas Libraries
4 February 2014
Start Spreading the News!

Overview
I. Target Audiences
II. Data Collection
III. Data Use
IV. Questions

Target Audiences
Of the Texas Digital Newspaper Program
• K-12 Students & Educators
• Higher Education Researchers, including undergraduate, graduate, and faculty researchers
• Librarians
• Genealogists
• Lay-Historians
• Lifelong-Learners

Current Classroom Use
• Teaching with Primary Sources: National History Day; Texas Junior Historians; Texas History, grades 4 & 7.
• Texas Tech University, Dr. Ann Hawkins’ Texas Manuscript Cultures Online, undergraduate and graduate Book History and Research Methods courses.
• University of North Texas, Dr. Andrew Torget’s History of Texas and American History courses.
• Austin College, Dr. Light Cummins’ Research Methods.

Library Use
• Texas Digital Newspaper Program Feedback utility
• Partner institution pages
• Newspaper Program Traveling Banner
• Links to share on social networking feeds

Genealogists
• Full-text searchability
• Search content highlighting
• Full metadata records
• Faceted links

Lay-Historians
• Annual Digital Frontiers Conference
• Free and open access to newspaper content
• Zoomable views
• Permissions to use in research, with citations

Lifelong-Learners
• Emeritus College students
• Partner public library patrons
• Local civic interest groups

Data Collection
Where did it all come from?
• Qualitative: Surveys, Feedback responses, grant report comments, partner communications
• Quantitative: Analysis of collection usage, geographic origin, contributing partners, annual additions to collection

Data Collection (Qualitative)
• Grant-funded projects require final reports from TDNP partner institutions.
• In 2012, Kathleen Murray and Dreanna Belden launched an impact survey for Portal to Texas History users.
• The Portal feedback database offers years of user questions.

Data Collection (Qualitative)
“The newspaper digitization project places our library in position to reach out toward the future. By taking a piece of the past and bringing it with us, we are sure to grow and learn, appreciate and respect what was, what is and what will be.”
-Whitaker, L. (2013). Richard S. & Leah Morris Memorial Library Final Grant Project Report to the Tocker Foundation.

Data Collection (Qualitative)
• The Portal to Texas History Impact Survey was launched in 2012 by Murray and Belden.
• Of 573 respondents, 36% self-identified as genealogists, 19% as lifelong-learners, 19% as historians, 6% as librarians, 5% as students, and 15% as “other.”
• 93 individual comments within this survey cited the newspapers as being of especial value.

Data Collection (Quantitative)
• Descriptive metadata for the Texas Digital Newspaper Program collection is available via OAI-PMH in multiple formats.
• A Python-based harvester, pyoaiharvester, was used to collect metadata from the TDNP OAI repository endpoint.
• All years combined total 165,298 metadata records.
• Phillips then prepared a Python script to parse the records and extract relevant information.
• This script takes the output of the pyoaiharvester tool as input and returns a tab-delimited file with one newspaper issue per row.
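
The parse-and-flatten step described above can be illustrated with a minimal sketch. This is not the script used for the TDNP dataset (that one is linked under External References); it assumes the harvest was saved in the generic oai_dc format, and the sample record, the column order, and the function names are hypothetical, loosely modeled on the example values shown for the tab-delimited file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-made harvest output containing a single record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>hypothetical:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>The Tattler</dc:title>
          <dc:date>1934</dc:date>
          <dc:identifier>ark:/67531/metapth16320</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def first_text(parent, path):
    """Return the stripped text of the first match, or '' if absent."""
    node = parent.find(path, NS)
    return node.text.strip() if node is not None and node.text else ""


def records_to_rows(xml_text):
    """Turn harvested oai_dc records into one row per newspaper issue."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iterfind(".//oai:record", NS):
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:  # e.g. deleted records carry no metadata
            continue
        year = first_text(dc, "dc:date")[:4]
        # Assumption: the decade is the year with its last digit zeroed.
        decade = year[:3] + "0" if len(year) == 4 and year.isdigit() else ""
        rows.append([
            first_text(dc, "dc:identifier"),
            year,
            decade,
            first_text(dc, "dc:title"),
        ])
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-delimited text, one issue per line."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_tsv(records_to_rows(SAMPLE)), end="")
```

Only four columns are produced here because plain Dublin Core does not carry partner codes or the year an issue went online in a predictable place; those fields would have to come from a richer metadata format.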
The fields* in the tab-delimited file are:

Field         Field Description                               Example Data
ARK           ARK Identifier for issue                        ark:/67531/metapth16320
Partner       Contributing Partner Code                       BDPL
Year Online   Year issue went online (prefixed with “od:”)    od:2006
Year          Year of newspaper issue                         1934
Decade        Decade of newspaper issue                       1930
County        County of newspaper issue                       Palo Pinto County
Community     Community of newspaper issue                    Mineral Wells
Title         Title of newspaper issue                        The Tattler

*All values in the Partner field can be resolved from the controlled vocabulary: http://digital2.library.unt.edu/vocabularies/institutions/

Number of issues added per year (n=165,149)

Year   # of Issues   Cumulative # of Issues   % of Issues   Cumulative %
2006   54            54                       0.03%         0.03%
2007   0             54                       0%            0.03%
2008   44            98                       0.03%         0.06%
2009   7,263         7,361                    4.40%         4.46%
2010   44,788        52,149                   27.10%        31.56%
2011   32,626        84,775                   19.74%        51.30%
2012   30,836        115,611                  18.65%        69.95%
2013   49,538        165,149                  29.97%        99.92%

Number of counties added per year (n=109)

Year   # of Titles Added   # of New Titles Added   Cumulative # of Titles
2006   2                   2                       2
2007   0                   0                       2
2008   2                   2                       4
2009   26                  25                      29
2010   121                 114                     143
2011   269                 252                     395
2012   163                 145                     540
2013   98                  79                      619

Number of communities added per year (n=142)

Year   # of Communities Added   # of New Communities Added   Cumulative # of Communities
2006   2                        2                            2
2007   0                        0                            2
2008   2                        2                            4
2009   17                       16                           20
2010   35                       22                           42
2011   62                       42                           84
2012   66                       37                           121
2013   47                       21                           142

Counties currently represented in the TDNP Collection

Number of partners added per year (n=48)

Year   # of Partners Added   # of New Partners Added   Cumulative # of Partners
2006   2                     2                         2
2007   0                     0                         2
2008   2                     2                         4
2009   6                     5                         9
2010   10                    6                         15
2011   16                    12                        27
2012   20                    10                        37
2013   24                    10                        47
2014   3                     1                         48

TDNP Partner Institutions

Partner Type                        # of Partners by Type   # of Issues
Public Libraries                    27                      66,320
Academic Libraries & Archives*      13                      41,308
Genealogical/Historical Societies   4                       5,065
Museums                             2                       2,544

* “UNT Archives” and “UNT Libraries” are two partner institutions, thus actually totaling 48 partners.
This table indicates content whose digitization was funded by external partners. Content funded internally by UNT totals 49,805 and has been removed from this table.

Google Fusion Tables map, derived from TDNP Newspaper Locations:
https://www.google.com/fusiontables/embedviz?q=select+col1+from+17utAXOiLgXhaEXlHgsIfE2DJ_2OlQyD-XIRLZU&viz=MAP&h=false&lat=28.271046172964198&lng=103.64894140625&t=1&z=6&l=col1&y=2&tmplt=2&hml=GEOCODABLE

External References
• Python Metadata Extraction Script: https://github.com/vphill/untl_metadata_extraction/blob/master/tdnp_dataset.py
• Texas Digital Newspaper Program OAI API: http://texashistory.unt.edu/explore/collections/TDNP/oai/
• pyoaiharvester Script: https://github.com/vphill/pyoaiharvester
• Belden, Dreanna & Murray, Kathleen R. Where do users find value? UNT Digital Library. http://digital.library.unt.edu/ark:/67531/metadc185793/. Accessed January 27, 2014.

Questions?
Ana Krahmer, Ana.Krahmer@unt.edu
Mark Phillips, Mark.Phillips@unt.edu