and the Illinois Harvest Portal
A Presentation to the UIUC Library Faculty
September 20, 2006
Betsy Kruger and Tim Cole

UIUC Digitization Efforts Get Big Boost!!
• $200,000: Paula, in consultation with ULs from Chicago and Springfield, requested funding from Steve Rugg (UI Comptroller) to help libraries move into larger digitization projects.
• $200,000: Provost’s Office agreed to supply matching funds.
• $500,000: Rep. Naomi Jakobsson expressed interest in obtaining state funding for UI projects. Paula sent a short proposal regarding our interest in a mass digitization project with the Open Content Alliance.
TOTAL = $900,000 for FY2007 (must be spent by June 30, 2007)

Mass Digitization Working Group
• Tim Cole
• Nuala Koetter
• Betsy Kruger, Chair
• Michael Norman
• Chris Prom
• Beth Sandore
• Sarah Shreeves
• Mary Stuart
• Tom Teper
• David Vess

Working Group’s Goals
• Ensure funds are spent by June 30! Successfully and purposefully!
• Coordinate Library’s participation in the Open Content Alliance
• Explore and begin developing integrated access to UIUC owned/created digital content via a web portal
• Develop selection criteria for digitization
• Document costs related to mass digitization
• Develop in-house expertise
• Attract future funding!

Budget Breakdown

DIGITIZATION SERVICES
  Open Content Alliance scanners/staff/services               $200,000
  Illinois-related content digitization (non-OCA projects)    $218,000
OTHER ALLOCATIONS
  Illinois Harvest Portal development staff                   $125,000
  Data storage                                                 $80,000
  Metadata librarian                                           $42,000
  Oak Street 3rd floor upgrades                                $65,000
  Material preparation (wages)                                 $80,000
  Workstations                                                  $6,000
  Contingency                                                  $84,000
TOTAL                                                         $900,000

In November 2005, Paula Kaufman asked Karen Schmidt to pull together a small group of Library faculty to recommend whether or not the Library should join the Open Content Alliance, a program of the Internet Archive. We said YES!
Karen Schmidt, Betsy Kruger, Beth Sandore, Mary Stuart, Nuala Koetter, Tom Teper

Internet Archive: a nonprofit organization founded by Brewster Kahle in 1996 to build an “Internet library” offering permanent access for researchers, historians, and scholars to historical collections that exist in digital format.
http://www.archive.org/index.php

The Open Content Alliance: a program of the Internet Archive, started in early 2005. It is a group of cultural, technology, nonprofit, and governmental organizations from around the world that will help build a permanent archive of multilingual digitized text and multimedia content.
http://www.opencontentalliance.org/

OCA Goal
To bring digital and newly digitized material online under principles of openness.

OCA Principles
• Contributed content is free to all for reading, viewing, listening to, downloading, sharing, crawling, indexing
• Rehost at discretion of contributor
• Open for research and computation
• Services can be built by both commercial and non-commercial parties (e.g., navigation services, print-on-demand, etc.)

A Few of the 60+ OCA Participants
• University of California Libraries
• University of Toronto Library
• Johns Hopkins University Libraries
• UNC Chapel Hill
• National Archives (United Kingdom)
• National Library of Australia
• Yahoo
• RLG
• MSN
• Microsoft

Contributions can be:
• Content
• Facilities
• Services
• Tools
• Funding

Mass Digitization at UIUC
OCA’s Responsibilities:
• Install two “Scribe” scanning systems at Oak Street
• Hire and train staff
• Keep track of our materials
• Fetch descriptive metadata from our OPAC via a Z39.50 connection
• Digitization: creating content files (archival and access copies, PDFs) and structural/administrative metadata
• OCR
• Quality control measures
• Provide access to digital content via the Internet Archive (IA) website
• Long-term management of content on IA website

SCRIBE Scanning System
• Non-destructive: books are not disbound for scanning
• Utilizes digital cameras rather than flatbed scanners
• Book is held face up in a cradle, open at a 90-degree angle, as operator turns pages (snore…)
• Pages held flat by a glass platen that is raised and lowered
• Scanning cost is 10¢ per page
• Up to 500 pages per hour
• Our production will be around 200 books per week for first year.

Mass Digitization at UIUC
UIUC Responsibilities:
• Infrastructure improvements to Oak Street 3rd floor
• Selection of materials for digitization
• Daily or weekly retrieval of books to be sent for scanning
• Charging out materials, delivering materials to scanning center, returning material to shelves post-scanning
• Validation and ingestion of metadata and content files for preservation storage
• Linking from Voyager record to digital content files
• Possibly some level of quality control beyond that performed by OCA

Your Input Needed on Selection for OCA!
• We anticipate digitizing 8,500 to 9,500 volumes this first year.
• Must be in public domain or UIUC must own rights (e.g., some microfilm)
• Uniqueness: we want to avoid duplication with the Michigan Google Project.
• “Collections” vs. hodge-podge selection
• Faculty support/interest and curricular tie-ins particularly attractive.
• We need your suggestions NOW!
• Suggestions accompanied by a little sweat equity are particularly attractive!

Illinois Harvest Portal
“A website combined with search, aggregation, and discovery services that will provide organized and thematic access to digitized and born-digital collections of public interest from the University of Illinois.”
• In conjunction with the Illinois Harvest Portal project, we will also digitize numerous smaller collections, through outsourcing and some in-house digitization.
• Will involve various formats (books, maps, audio, video)
• Most will focus on content about Illinois.
• We welcome suggestions of additional projects.

Illinois Projects Under Consideration
• Ilios
• Engineering Experiment Station Bulletins
• Bronze Tablets
• Chicago Foreign Language Press Survey
• Illinois counties surveys/maps
• University of Illinois Press
• Illinois Chemist
• UI Board of Trustees proceedings (1927-)
• WILL audio/video content
• Illinois county atlases
• INHS Technical Reports
• Illinois newspapers
• UI Historic Built Environment
• Library speeches and guest lecturers

Visit the MassDigiWiki!
http://massdigiwiki.pbwiki.com/FrontPage
• Meeting agendas
• Minutes
• Project documents