TEI Projects and Small Libraries
Examining TEI Markup Decisions and Procedures

Richard Wisneski, Head, Bibliographic and Metadata Services
Virginia Dressler, Digital Librarian
Stephanie Pasadyn, Technical Services Librarian
Kelvin Smith Library
November 2009

Introduction

The Project:
• Digitizing and encoding Kelvin Smith Library’s (KSL) Books on Cleveland, Ohio and the Western Reserve Digital Text Collection
• Using “Book Viewer” in KSL’s “Digital Case” (institutional digital repository) to display each text’s PDF, page images, and TEI
• Have applied for an NEH Humanities Collections and Reference Resources Grant to fund the project
• Will collaborate with neighboring institutions to incorporate their texts on the history of Cleveland and the Western Reserve into our collection

Why Do This Project?
• Availability of texts is limited: see spreadsheet
• No other institution in Northeast Ohio has a project akin to this
• Interest in Cleveland and Western Reserve history among historians and scholars. Cleveland…

Why TEI?
• To allow researchers to have access to an electronic text that does not require special-purpose software or hardware
• To analyze information – provide a standard text-encoding scheme and metadata language that accommodates searching, retrieval, etc.
• To share information – have a standard format for data interchange in humanities research
• Texts are being encoded at Level 3 (structural)
• To create a stand-alone electronic text with its hierarchy identified
• Emphasis on divisions within the text, tables, lists, notes, front and back matter

Current Project Practices

Workflow

Project Log
Currently kept on Google Docs as a shared MS Excel file.

DIGITIZATION PROCESS
• Step 1: Review and assess digital images

Review digital content
Organize and assess

Image assessment
Key points in assessment
• Complete, uncorrupted files
• Ascertain image quality against current practices and standards
• Check legibility of text for the OCR process
• Compare illustrations and photos with the original source if needed
• Rescan if needed

Optical Character Recognition
• Step 2: Sidekick 1400u

Image conversion
• Processing TIFF files for the book viewer

Book viewer demo

Text Clean-Up
Student workers and volunteers do this work in OpenOffice and oXygen

TEI Headers
• Professional catalogers create TEI headers:

<?xml version="1.0" encoding="UTF-8"?>
<?oxygen RNGSchema="http://digitalcase.case.edu:9000/fedora/get/ksl:p5schema/tei_all.rng" type="xml"?>
<TEI xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title type="main">Report on the preliminary surveys for the Cleveland, Painesville and Ashtabula Rail Road Company</title>
        <title type="sub">An electronic version</title>
        <author>
          <persName>Harbach, Frederick, 1817-1851</persName>
        </author>
        <respStmt>
          <name xml:id="ksl">Kelvin Smith Library, Case Western Reserve University</name>
          <resp>Publisher of TEI-conformant electronic version.</resp>
        </respStmt>
        <respStmt>
          <name xml:id="mxb">Mary Burns</name>
          <resp>TEI Header creator</resp>
        </respStmt>
        <respStmt>
          <name xml:id="rlw">Richard Wisneski</name>
          <resp>encoder</resp>
        </respStmt>
      </titleStmt>
      <extent>1.448 MB</extent>
      <publicationStmt>
        <publisher>Digital Case, Kelvin Smith Library,
        Case Western Reserve University</publisher>
        <pubPlace>Cleveland, Ohio</pubPlace>
        <distributor n="collection">KSL Digital Book Collection</distributor>
        <availability>
          <p>This work is in the public domain and may be freely downloaded for personal or academic use.</p>
        </availability>
        <idno>http://hdl.handle.net/2186/ksl:harrep00</idno>
        <date when-iso="2009-09-01" />
      </publicationStmt>
ETC.

TEI Structural Mark-up
• Text encoders mark up text following TEI P5, Level 3

<body>
  <div type="section" xml:id="section1" n="1">
    <pb n="5" facs="clecle00-00003.jp2" />
    <head>HISTORY</head>
    <p>The first settlers of Cleveland were from Connecticut; and, according to tradition,
    as soon as three families had established themselves — it was about the beginning of the
    present century — they set up a school for their <hi rend="ital">five children.</hi>
    The population had increased to <hi rend="ital">fifty-seven</hi> in 1810, and the oldest
    inhabitants think there was a school taught in that year. It is certain, however, that it
    could not have been very large. The earliest school mentioned in any record was kept by a
    Mr. Capman in 1814. But it was not till 1836, the year of organization under the City
    Charter, that any system of <hi rend="ital">public instruction</hi> was adopted.
    Previous to this year, the schools, of whatever grade or character, were supported
    mainly by private enterprise.</p>
CONTINUED >>

TEI Structural Mark-up (continued)

<table rend="boxed" cols="3" rows="4" xml:id="Table2">
  <head>TABLE OF CURVATURE.</head>
  <row>
    <cell>&#32;</cell>
    <cell role="label">SOUTH ROUTE.</cell>
    <cell role="label">NORTH ROUTE.</cell>
  </row>
  <row>
    <cell>Deflections to Right</cell>
    <cell>323°20</cell>
    <cell>236° </cell>
  </row>
  <row>
    <cell>Deflections to Left</cell>
    <cell>402</cell>
    <cell>213°45'</cell>
  </row>
…AND SO ON

Text Encoding

Learning TEI
• Practical Application
• Internal Documentation
• CaseLearns

Learning TEI (continued)
• One-on-one overview
• Creating master outline
• Coding page by page
• Referring to and updating documentation

Issues
• Human Error
• Evolution of Institutional Practice
• Minimal Time Allotment
• Limited Opportunity for Continuing Education

To Be Done
• Re-scan some of the books
• Continue to encode
• Hold half-day or full-day workshops on text encoding for full-time staff
• Create MODS, MARC-XML, and METS records
• Re-examine “Book Viewer”

Discussion Questions
• Ways to expedite text encoding
• Ways to scan texts – outsourcing?
• Funding challenges (outsourcing, scanning, equipment)
• Book viewer – effective? Ineffective?
• Text-Encoding Level – change?
• Learning TEI – in-house classes and documentation, TEI-C documentation. Webinars? Online tutorials? Certificate program?

Contact
Richard Wisneski: rlw54@case.edu
Virginia Dressler: vad17@case.edu
Stephanie Pasadyn: sap68@case.edu

Links and references
• Digital Case homepage
• Digital Case Book Viewer collection
• Women Writers Online, Brown University: http://textbase.wwp.brown.edu/WWO/index.html
• Poetess Archive, Miami University, Ohio: http://unixgen.muohio.edu/~poetess/collections/index.php
• Victorian Women Writers Project, Indiana University: http://www.indiana.edu/~letrs/vwwp/index.html
• Swinburne Project, Indiana University: http://swinburnearchive.indiana.edu/swinburne/www/swinburne/
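
Appendix sketch: checking encoded pages for well-formedness

The "Human Error" issue and the page-by-page coding workflow suggest that a quick automated check on each encoded file could help. The following is a minimal sketch, not the project's actual tooling: it uses only Python's standard library, the function names `check_well_formed` and `summarize` are hypothetical, and the sample is an abbreviated version of the HISTORY section with `<TEI>` and `<text>` wrappers added so that it parses on its own. It tests well-formedness only; it does not validate against the tei_all.rng schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # expanded name of xml:id

# Abbreviated sample based on the structural mark-up slide (assumption:
# the <TEI> and <text> wrappers are added here for a self-contained file).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="section" xml:id="section1" n="1">
        <pb n="5" facs="clecle00-00003.jp2"/>
        <head>HISTORY</head>
        <p>The first settlers of Cleveland were from Connecticut.</p>
      </div>
    </body>
  </text>
</TEI>"""


def check_well_formed(xml_text):
    """Return (True, root element) if xml_text parses, else (False, error text)."""
    try:
        return True, ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)


def summarize(root):
    """Collect div ids, page-break numbers, and headings from a parsed TEI tree."""
    ns = {"tei": TEI_NS}
    return {
        "divs": [d.get(XML_ID) for d in root.iterfind(".//tei:div", ns)],
        "pages": [pb.get("n") for pb in root.iterfind(".//tei:pb", ns)],
        "heads": [h.text for h in root.iterfind(".//tei:head", ns)],
    }


if __name__ == "__main__":
    ok, result = check_well_formed(SAMPLE)
    if ok:
        print(summarize(result))
    else:
        print("Not well-formed:", result)
```

One practical use: typographic ("curly") quotation marks pasted into an attribute value, a common side effect of word-processor clean-up, make a file fail this check, so it can catch that class of error before a file goes to the book viewer.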