Biodiversity Heritage Library Work Plan and Budget Justification

Building a Digital Open Access Library for Biodiversity Literature

In February 2004, biologists, librarians, and information specialists gathered for a four-day workshop sponsored by the Pinhead Institute in Telluride, Colorado, to assess the feasibility of assembling a web-based Encyclopedia of Life (EOL). Among the outcomes of this workshop was a recommendation to digitize the literature of biodiversity. In 2005, an international symposium, Library and Laboratory: the marriage of research, data, and taxonomic literature, funded by the Alfred P. Sloan Foundation, was held at the Natural History Museum in London. It was attended by over 80 biologists, librarians, and computer scientists. Participants identified the lack of easy access to the published literature of biodiversity as one of the principal obstacles to efficient and productive research. Few scientific disciplines are as dependent on the historical literature of their field.

In the spring of 2005, representatives of ten major natural history museum libraries, botanical libraries, and research institutions joined in a collaborative effort to develop strategies to digitize the literature in an open access manner. From this partnership grew the Biodiversity Heritage Library (BHL) project. The partners envision that a research scientist or student with access to the Internet, located anywhere in the world, will be able to search for specific information in all of the literature relevant to biodiversity and transparently link the documentation to relevant taxonomic, geographic, or other useful databases. Such a tool would eliminate much of the expensive, labor-intensive work of library research and speed the production of research results many times over.
The BHL partnership is essential because, while natural history museum and botanical garden libraries have collected biodiversity materials comprehensively, including many specialized and rare materials, no single library holds the complete corpus of legacy literature. The partners’ collections represent a uniquely comprehensive assemblage of this literature. Within two years of the start of this project, the BHL will provide approximately 25 million digitized pages of literature to support multiple bioinformatics initiatives and research. For the first time in history, the core of our natural history museum and botanical garden libraries will be available to a truly global audience. This proposal seeks $6,053,000 to carry out the first two-year phase of this ambitious and far-reaching project and an additional $3,611,000 for the remaining three years of the project.

The Partnership

The participating institutions are:
American Museum of Natural History (New York, NY)
The Field Museum (Chicago, IL)
Harvard University Botany Libraries (Cambridge, MA)
Harvard University, Ernst Mayr Library of the Museum of Comparative Zoology (Cambridge, MA)
Marine Biological Laboratory / Woods Hole Oceanographic Institution (Woods Hole, MA)
Missouri Botanical Garden (St. Louis, MO)
Natural History Museum (London, UK)
The New York Botanical Garden (New York, NY)
Royal Botanic Gardens, Kew (Richmond, UK)
Smithsonian Institution (Washington, DC)

The BHL members have formed a Board of Directors with elected officers and have signed agreements committing them to the project. Other participants may be sought at a later time.

Methodology: BHL member institutions will scan (or have already scanned) their own scientific publications to contribute to the BHL.
Members will also use the Internet Archive, a non-profit partner, to scan other volumes from their collections that are not covered by copyright or for which permissions have been obtained, quickly adding large segments of literature to the corpus. Ongoing negotiations with commercial publishers and learned society publishers are promising: newer literature may be made available with permissions. As a community-based partnership, the BHL should provide a trusted body for negotiating with copyright owners. Small society publishers need help scanning and storing their publications, and if we provide this service, some are pleased to have their content accessible through the BHL portal. A set of clearly defined working relationships will be established as part of the project. User input to the process will be solicited regularly.

The BHL selected the Internet Archive (IA), an organization with demonstrated technical qualifications for mass scanning and long-term digital content management, to scan the bulk of the literature through scanning centers. A scanning center consists of multiple high-speed, state-of-the-art digital book scanners, with staff for two shifts daily, each able to handle large numbers of volumes daily. The IA will perform imaging, OCR, association of standard metadata (derived from MARC records provided from the libraries’ catalogs) with the digitized files, file arrangement, security, maintenance of optimal book conservation practices, and delivery of the completed scans in conformance with agreed project standards. Article-level access to journals will be provided.

BHL Portal with Taxonomic Intelligence: The digitized literature will be served to users from the Biodiversity Heritage Library Portal, to be hosted initially by the Missouri Botanical Garden.
The BHL Portal will create an innovative research environment that will vastly accelerate research in life sciences and conservation: a freely accessible Web service formed by coupling existing databases with digitized, searchable images and OCR text of heritage literature. The BHL Portal will use existing informatics tools to identify strengths and overlap across the participating institutions' libraries and to help solve the problems associated with the naming of organisms over time.

Taxonomic Intelligence: The digitization of a major corpus of biodiversity literature will advance world biodiversity initiatives significantly, but only to the extent that users can find relevant content. Names of organisms annotate content about species. However, the use of names for information retrieval is impeded because names are neither stable nor consistent. One organism may have more than one name. This prevents simple automated indexing services from bringing together complementary data. Moreover, about 1% of names change each year, so the many-names-for-one-organism (synonym) problem accumulates over time and will be particularly severe in heritage literature. Visitors to traditional library scanning projects who know organisms by their colloquial (common) names may be unable to find content unless they know the names used in the source documents. Without the added tools planned for the BHL Portal, these issues will reduce the utility of the millions of pages of primary biodiversity information to be generated by the BHL. The uBio team from the Marine Biological Laboratory / Woods Hole Oceanographic Institution (MBLWHOI) library has assembled an array of taxonomically intelligent services designed to overcome these problems.
Selection of Materials to be Scanned

Because BHL requires moving physical objects (books) and contracting for expensive scanning machines that are built to order and require extensive setup in specific locations, it is necessary that selection of locations and material be done with clear priorities and reasons. BHL involves 10 separate institutions with different collection strengths. Prioritization of material will follow a multi-pronged approach based on thematic areas, date of publication, and quantity of material. BHL Directors have organized a “BHL Collections Working Group,” which will refine and further articulate the thematic areas in March and April. The initial titles to be scanned will either be in the public domain or have copyright permission. Input from the EoL Secretariat will be the initial basis for the themes assigned.

In addition to the thematic focus provided by the EoL Secretariat, the BHL will also analyze such major indexes as Index Kewensis, Sherborn’s Index Animalium, and Neave’s Nomenclator Zoologicus. BHL is in discussions to obtain permissions to mine Zoological Record to determine those journals that have been most cited in the literature of species identification and description. This will provide a priority list of journals and monographs. Scanning from a prioritized list created through citation analysis ensures that the BHL contains the most critical works for our audience of scientists and scholars. Additional thematic focus will be possible post-scanning, after taxonomic intelligence has been applied to draw out all the species names. A filter can then be run with, for example, the species names from the Ocean Biogeographic Information System (OBIS) to tag those articles or monographs that contain marine data. Other name sources can be employed in a similar manner. BHL will also remain flexible enough to take timely advantage of offers from significant learned society journals to digitize back holdings in an open access manner.
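The post-scanning filter described above can be illustrated with a minimal sketch. It assumes that name extraction has already produced a list of name strings per document; the function names, document identifiers, and sample names below are hypothetical and do not represent any BHL, uBio, or OBIS interface.

```python
from typing import Dict, Iterable, List


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def tag_documents(doc_names: Dict[str, Iterable[str]],
                  source_names: Iterable[str],
                  tag: str) -> Dict[str, List[str]]:
    """Attach `tag` to every document that mentions at least one source name."""
    wanted = {normalize(n) for n in source_names}
    tags: Dict[str, List[str]] = {}
    for doc_id, names in doc_names.items():
        found = {normalize(n) for n in names}
        # A non-empty intersection means the document mentions a listed name.
        tags[doc_id] = [tag] if found & wanted else []
    return tags


# Hypothetical sample data: names extracted from two scanned articles,
# and a short name list standing in for an external source such as OBIS.
extracted = {
    "article-001": ["Delphinus delphis", "Quercus alba"],
    "article-002": ["Quercus alba"],
}
marine_names = ["delphinus  delphis", "Orcinus orca"]

marine_tags = tag_documents(extracted, marine_names, "marine")
print(marine_tags)
```

The same function could be run with a different name list and tag to apply another thematic focus.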
As the EoL Secretariat establishes thematic areas of emphasis for the EoL, the BHL libraries will select texts that best support the thematic area. Initial thematic assignments for 2007-2008 are indicated under the initial thematic area in Appendix A, spreadsheet bhlfiveyeargrant.xls. For 2009 and following years, the acronym EoLS indicates that thematic assignments will be based on guidance mediated by the EoL Secretariat. If that guidance is available in 2008, work can be redirected towards that area.

Deliverables

Deliverable 1: Metadata Repository and Analysis

The BHL has received $50,000 from the Richard P. Lounsbery Foundation, which has been partially applied to the creation of the BHL Metadata Repository and initial collection analysis. Members are now creating a union list of biodiversity serials to serve as the basis for distributing responsibility for scanning, since serial titles contain hundreds of volumes and often reach over 100,000 pages. Digitizing serials will be a cost-effective method for mounting a large amount of the most useful scientific material online in the shortest amount of time.

An early collection analysis indicated considerable overlap among participant libraries’ holdings. A new collection analysis will help establish a productive division of labor among members and further inform digitization priorities. This activity will be completed by the end of the first year. BHL will contract with the Online Computer Library Center (OCLC) for the WorldCat Collection Analysis tool (using Lounsbery Foundation and member-contributed funds), which allows consortia to analyze their collections and compare them to other collections. In this way, members will be able to compare group holdings by specific subject areas and through time, i.e., pre-1923.
The tool includes a web interface that allows users from any library to view collections and choose from multiple parameters, including subject, titles and title counts, publication date, language, format, and audience. The collection analysis will allow the BHL to further identify overlap and uniqueness of all its member library collections.

Deliverable 2: Digitized literature.

The volumes will be scanned with a unique digitization system called the Scribe, developed by the Internet Archive (IA), which facilitates the scanning and description of the object, as well as depositing it for storage, serving, and further reuse. Each digitized volume will be encoded with authoritative metadata from the contributing library’s catalog (such as Library of Congress Subject Headings and other descriptive metadata about the title) and will include the overall structure of the volume and the pages within. The digitized pages, derivative images and files, and all associated metadata will be deposited at the IA for further analysis and redistribution via the BHL Portal. The IA will host and serve the image files for searches from the BHL Portal. The text files, enhanced by taxonomic intelligence (see below), will be hosted initially at the Missouri Botanical Garden (MBG) and, as the design evolves, may continue to be hosted there or may move to the MBLWHOI, depending on architecture decisions in the first year of design. These materials will be referenced by persistent globally unique identifiers (GUIDs) at the title, volume, and page level, so that they can be integrated across existing bibliographic and taxonomic citation databases, like Tropicos, NameBank, and ZooRecord. By the end of year two we expect to have more than 25 million scanned pages online, and a total of at least 40 million pages online by the end of year five.
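As a rough check on these page targets, the sketch below totals the three estimated page outputs listed in the Deliverables and Schedule section and converts each to a volume range using the 350-page and 200-page per-volume estimates given in the Budget section. It is a worked calculation only; the variable and function names are illustrative, and it assumes the first two output figures fall within the first two years, as the 25-million-page target suggests.

```python
from itertools import accumulate

# Estimated page outputs, in the order they appear in the schedule.
PAGE_OUTPUTS = [5_990_000, 20_508_000, 17_676_000]

# Pages-per-volume estimates from the Budget section.
PAGES_PER_VOLUME = (350, 200)


def volume_range(pages: int) -> tuple:
    """Return (low, high) volume counts for a given number of pages."""
    low = round(pages / max(PAGES_PER_VOLUME))   # thicker volumes, fewer of them
    high = round(pages / min(PAGES_PER_VOLUME))  # thinner volumes, more of them
    return low, high


running_totals = list(accumulate(PAGE_OUTPUTS))

for pages, total in zip(PAGE_OUTPUTS, running_totals):
    low, high = volume_range(pages)
    print(f"{pages:,} pages -> {low:,} to {high:,} volumes; running total {total:,}")
```

Under these assumptions the running totals are 26,498,000 pages after the second figure and 44,174,000 after the third, which is consistent with the targets of more than 25 million and at least 40 million pages.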
As digitized objects are added to the IA, existing automated processes will complete two major workflow steps (text conversion and keyword indexing) that transform a digital image of a page into a text-based XML document, on which scientific names and other keywords will be annotated using semi-automated natural language processing tools that will be developed. These annotations and relationships between scanned images and keywords will be identified, recorded in the BHL Portal database, and then exposed within additional XML files.

Deliverable 3: Integration of taxonomic intelligence.

Building on existing tools and services developed at uBio to index organism names (NameBank, which contains approximately 9.5 million name strings) and their associated hierarchies (ClassificationBank, which contains over 80 classifications), taxonomic intelligence will be integrated into the documents immediately as they are digitized using an established named entity recognition tool, TaxonFinder. (See the EoL Informatics Work Plan for further details.) The integration of taxonomic intelligence, via links from name strings located within each XML file generated (Deliverable 2), will enable linkages to other relevant indexed content in EoL and other web-accessible name-based sources. In cases where a name string cannot be resolved to an existing NameBank string or reconciliation group (a set of strings that objectively represent the same organism), it will be passed to other complementary services (e.g., EoL WorkBench; see EoL Technical Documents). The types of organisms that are associated with each digitized document will be characterized using the taxonomic groupings reflected in ClassificationBank (including the proposed EoL Union taxonomy). This will include the generation of descriptive statistics pertaining to organisms relative to other comparative axes (e.g., temporal or geographic).
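The resolution step described above (match a name string to a reconciliation group, otherwise pass it on) can be sketched as follows. This is a minimal illustration under stated assumptions: the in-memory dictionary stands in for a name index such as NameBank, the group identifier and function names are hypothetical, and nothing here reflects the actual uBio or TaxonFinder interfaces.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical reconciliation group: strings that refer to the same organism.
RECONCILIATION_GROUPS: Dict[str, List[str]] = {
    "group-1": ["Felis concolor", "Puma concolor", "cougar"],
}


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace before lookup."""
    return " ".join(name.lower().split())


def build_index(groups: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every normalized name string to the identifier of its group."""
    index: Dict[str, str] = {}
    for group_id, names in groups.items():
        for name in names:
            index[normalize(name)] = group_id
    return index


def resolve(names: Iterable[str],
            index: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Split names into resolved (name -> group) and unresolved lists.

    Unresolved strings are the ones that would be handed to a
    complementary service for further work.
    """
    resolved: Dict[str, str] = {}
    unresolved: List[str] = []
    for name in names:
        group_id = index.get(normalize(name))
        if group_id is None:
            unresolved.append(name)
        else:
            resolved[name] = group_id
    return resolved, unresolved


index = build_index(RECONCILIATION_GROUPS)
resolved, unresolved = resolve(
    ["Felis concolor", "cougar", "Examplea nonexistens"], index
)
print(resolved)
print(unresolved)
```

Because both the historical name and the common name map to the same group, a search on any member of the group can retrieve pages that use any other member.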
A complete list of name strings as they appear in each digitized document, the reconciled contemporary form of each name string, and any other relevant metadata will be incorporated into the XML index files stored initially at the Missouri Botanical Garden.

Deliverable 4: Development of a robust community portal with open, distributed architecture.

Users today expect sophisticated presentation and collaborative, interactive Web resources. To make this Web portal operative, BHL will program an intelligent, customizable interface into all parts of the repository. The interface will enable users to conduct a search for particular terms in the BHL repository and find pages where these terms occur throughout the entire collection of digitized literature. Users will be able to obtain a bibliography of the literature containing the keyword(s) or view the individual scanned pages (or the indexed text if they prefer) and see all of the keywords that appear on the page.

In addition to the internal search capabilities and outward external links, the BHL Portal will provide the mechanism necessary to accept queries from external databases, libraries, or individual Web users, returning appropriate images or bibliographic references. Additional authoritative content available on the Internet will be linked via the incorporation of tools like LinkIT and uBioRSS, which respectively create dynamic hyperlinks to authoritative Web sites and enable access to recently published knowledge (e.g., literature, scientific news, blogs, pictures of the day, distribution data, and specimen collections). Both the LinkIT and uBioRSS tools will be modified to link resources that can meet the specific objectives set forth by the overall EoL vision. This project will be the first, through automated markup of the digitized literature and integration of taxonomic intelligence, to provide access to both the literature and supportive natural history data in an interactive, Web-based environment.
The combined search options and broad interactive capabilities are unique to this project and will be the focus of initial development through the end of year two. However, the scanning can begin immediately, since post-processing of the files will enable these features. By the end of year five, advanced search functionality and end-user tools determined through ongoing usability analysis and feedback will be developed and deployed.

Testing of the system will be critical to determine whether the BHL Portal reaches its intended audience and will include internal review by the BHL partners and reviews by other scientists and IT professionals. We will undertake testing for all components, with subsequent modification and retesting as needed throughout the project. The BHL Portal will undergo intensive testing, including post-deployment surveys and online comment and suggestion options.

The BHL Portal will communicate with the IA and other service providers assisting the project using web services, a mechanism by which data can be shared among disparate data sets using standard protocols and XML. Building web services on top of the BHL Portal will allow us to manage the data separately, while still providing a way to communicate the relationships between the data sets and the records within. These services can be “published” as entry points for other communities to query and interact with the BHL data. For example, a conservation organization may be interested in finding literature on a given scientific name but require the literature to be displayed in its own application. The organization’s application can address BHL’s Web services and determine what digitized literature is available for the scientific name in question. The results will provide links to the BHL digitized literature from the organization’s existing application. In this way, Web services will meet the specific needs of this usage, while continuing to make the raw data available to others.
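The conservation-organization scenario above can be sketched from the client side. The endpoint URL, the query parameter, and the XML response layout shown here are all hypothetical, invented only to illustrate the request-and-parse pattern; the proposal does not specify the actual service interface, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint; the real service address is not defined in this plan.
BASE_URL = "https://example.org/bhl/name-search"


def build_query(scientific_name: str) -> str:
    """Build the request URL for a scientific-name lookup."""
    return f"{BASE_URL}?{urlencode({'name': scientific_name})}"


# Hypothetical XML response that such a service might return.
SAMPLE_RESPONSE = """
<results name="Puma concolor">
  <page title="Example Journal of Zoology" volume="3"
        url="https://example.org/bhl/page/0001"/>
  <page title="Example Bulletin" volume="12"
        url="https://example.org/bhl/page/0002"/>
</results>
"""


def parse_links(xml_text: str) -> list:
    """Extract (title, volume, url) tuples from the XML response."""
    root = ET.fromstring(xml_text.strip())
    return [
        (page.get("title"), page.get("volume"), page.get("url"))
        for page in root.findall("page")
    ]


query_url = build_query("Puma concolor")
links = parse_links(SAMPLE_RESPONSE)

print(query_url)
for title, volume, url in links:
    print(f"{title}, vol. {volume}: {url}")
```

The external application keeps its own display logic and uses only the returned links, which is the separation between data and presentation that the web-service approach is meant to provide.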
Deliverable 5: Permissions from Publishers.

The BHL will negotiate permissions from non-profit biodiversity learned societies to digitize their back issues that are still covered by copyright, to aggregate the digitized files with other BHL content, and to provide free copies to the learned societies. The BHL will also seek permissions from commercial publishers.

Deliverable 6: Digital Curation

Digital curation is a critical part of making the BHL a sustainable project that will ensure persistent access to the BHL content for centuries, using the best available technology and administrative structures for preservation of digital files. A plan for extending collaboration from the original core group of institutions to include other natural history and botanical libraries will be pursued as a secondary step in the project's evolution.

The metadata, image files, digital derivatives, and text files generated during this project will create a significant resource that will require ongoing stewardship beyond the timetable outlined in the proposal. The BHL Board of Directors will develop a plan for archival storage and appropriate migration of data within the second year. The plan may involve multiple dark archives and the participation of primary publishers of content, such as commercial publishers and society publishers, as well as content aggregators such as OCLC.

Working with the EoL Secretariat, their host institutions, and major stakeholders, BHL Directors will develop a plan for long-term administrative/corporate structures to ensure that the BHL digital assets remain openly available to users in the future. Whether this will require incorporating as a 501(c)(3) or a UK Trust or some other arrangement will be determined by a study in the third and fourth years. Refer to the Deliverables and Schedule section that follows for the implementation schedule.
Deliverables and Schedule

Schedule periods: 5/2006 - 6/2007; 7/2007 - 12/2007; 1/2008 - 12/2008; 1/2009 - 12/2009; 1/2010 - 12/2010; 1/2011 onward. Output figures are estimates.

(Deliverable 1) Metadata Repository and Analysis
First analysis; broad collection strengths; initial allocation of scanning priorities
Clean metadata repository prototype; union list of serials; OCLC analysis; refinement of selection process based on thematic direction from EoL Secretariat
Harvesting, ingest, and export of BHL metadata to/from major sources, e.g. OCLC
Updated as new collections are acquired and as volumes are scanned (entry appears three times in the schedule)

(Deliverable 2) Digitize literature
Negotiate agreements with Internet Archive; install scanners in selected locations; negotiate agreements with regional centers to use their facilities where possible
Hire library technicians for pulling and moving volumes. Output 5,990,000 digitized pages, ~17,114 - 29,950 volumes
Output 20,508,000 digitized pages, ~58,594 - 102,540 volumes (more if fund raising is successful); content resolvable to the title, volume, and page level
Output 17,676,000 digitized pages, ~50,503 - 88,380 volumes (more if fund raising is successful)
Output dependent on fund raising (entry appears twice in the schedule)

(Deliverables 3 & 4) BHL Portal with open distributed architecture and integration of Taxonomic Intelligence (TI)
Prototype development; requirements definition
Import of OCR files from IA; production testing of first iteration of TI. Searching; simple public search interface; full text searching
Sophisticated public search interface with TI to collocate taxon names in text; ability to include variant and vernacular names; ingest, harvesting, and export of files fully functional; integration with existing indices
Ingest, harvesting, and export of files fully functional; prototype open community interface; resolve citations using indices and matching algorithms; ability to limit searches to article titles, taxonomic treatments
Ongoing maintenance and refinement. If additional funding is obtained, operationalize full open community functionality (entry appears twice in the schedule)

(Deliverable 5) Learned societies and publishers
3 prototype permissions documents signed; presentations at conferences; legal strategy determined
20 permissions obtained; working agreement with BioOne obtained
Working with EoL Secretariat, develop global marketing plan for reaching as many learned society publishers as possible; more permissions
Additional permissions obtained. Implement global marketing plan (entry appears three times in the schedule)

(Deliverable 6) Digital Curation
Implement plan
Planning for BHL administrative structure to ensure long-term 100-year plan for management of BHL digital assets (entry appears three times in the schedule)
Implement new administrative structure, e.g. incorporation as 501(c)(3) or UK Trust

Fundraising
Prepare funding proposals for additional projects, e.g. 100 year + stewardship planning and licensed content
Prepare funding proposals for additional projects, e.g. rare book collections
Prepare funding proposals for additional projects, e.g. underserved taxa
Prepare funding proposals for additional projects; paleobiology (entry appears twice in the schedule)

BHL Member Contributions and Fundraising

From 2002 through 2007, BHL member institutions will have contributed more than $1,200,000 to the BHL, covering such costs as digitizing 1,500,000 pages of biodiversity literature, holding planning meetings, and proportionate salary expenses for staff directly involved in BHL work. These funds have come from operating budgets, internal special purpose funds, and external development efforts. Additional local and coordinated fund raising efforts will be a major focus of the BHL.

BHL member institutions will also redirect existing staff to BHL support work, such as library technicians to move materials, catalogers, analysts, and others. The budget documents in Appendix A contain a spreadsheet, Estimates of BHL Institution Contributions, that reflects redeployment of existing staff, potential augmentation from EoL Cornerstone institution fundraising, and the target fundraising goal for the BHL libraries. As the mass scanning starts, the Contributions spreadsheet will be refined.

Outcomes of Scanning the Biodiversity Literature

Scanning the biodiversity literature and providing the text through a common portal with sophisticated search tools will produce a number of long-term benefits for the global biology community.
These outcomes can be grouped under a number of headings:
Improving the efficiency of research in the biology domain
Improving access to information for non-museum biologists
Repatriation of information about developing world species
Capacity building in the developing world
Preservation of rare and fragile materials

These outcomes may be implicit or explicit. BHL will put in place Web-based tools to survey the users of the BHL Portal to determine who they are, what they are looking for, and how they approach our content. It is anticipated that a brief online questionnaire in the Web interface will pop up for a certain percentage of users. The questionnaire will ask about the users’ purpose in visiting the BHL Portal site, what benefits they expect to gain, and what savings they may achieve. These are mainly ‘soft’ measures, but will give us a sense of the changes in the customer base and their usage of materials.

However, BHL members believe that some measurable financial outcomes are also achievable. For instance, the Smithsonian Institution Libraries hosts over 600 non-U.S. researchers, many from developing countries lacking libraries with significant biodiversity collections, who travel simply to use the volumes in its collection. Assuming an average two-day visit cost of $1,000, including flights and lodging in Washington, D.C., an estimated 50% drop in such visits would save the research community $300,000 annually. If 30% of the 15,000 visitors to the Natural History Museum in London from the UK and other countries were able to save two days of their visit by consulting the literature at their home base, this would save the visitors $1 million per annum. The Library of the Royal Botanic Gardens at Kew hosts over 170 visitor days annually from researchers from developing countries, to whom they provide 4,000 sheets of copying paper gratis. See Appendix B for a sample of the thanks our libraries have already started receiving for our modest efforts to date.
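The savings estimates above can be reproduced with a short calculation. The sketch below uses only the figures stated in the text; the per-visitor amount for the Natural History Museum is not given in the proposal and is derived here from the stated $1 million total, so it should be read as an implied value rather than a sourced one. The function and variable names are illustrative.

```python
def annual_savings(visitors: int, percent_avoided: int, cost_per_visit: int) -> int:
    """Savings when a percentage of visits, each with a fixed cost, is avoided."""
    return visitors * percent_avoided * cost_per_visit // 100


# Smithsonian Institution Libraries: 600 researchers, $1,000 per two-day visit,
# and an estimated 50% drop in visits.
smithsonian_savings = annual_savings(600, 50, 1_000)

# Natural History Museum, London: 30% of 15,000 visitors save two days each.
nhm_visitors_affected = 15_000 * 30 // 100

# The proposal states a $1 million total; the per-visitor saving is implied.
nhm_stated_total = 1_000_000
implied_saving_per_visitor = nhm_stated_total / nhm_visitors_affected

print(f"Smithsonian: ${smithsonian_savings:,} per year")
print(f"NHM visitors affected: {nhm_visitors_affected:,}")
print(f"Implied saving per NHM visitor: ${implied_saving_per_visitor:,.2f}")
```

Because the text says the Smithsonian hosts "over 600" researchers, the $300,000 figure is best treated as a lower bound under the stated assumptions.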
In order to measure these potential outcomes, BHL will spend some time at the beginning of the project ensuring that our approach to counting and surveying customers is consistent across the BHL partners. This will enable us to set an early baseline from which to measure the impacts we expect our outcomes to achieve. To ensure consistency of approach, BHL will coordinate the methodology with the other EoL partners.

Key Outcomes

Improving the efficiency of research in the biology domain:
BHL expects that the web availability of the majority of the biodiversity literature will have a long-term impact on the way that taxonomic science is done. The ability to bring together all the literature on a given taxon or group at an individual’s desktop will increase efficiency and speed up the process of taxonomic revision. Large paper collections of individual articles will become a thing of the past. The BHL will lead to an acceleration of the taxonomic process in both developed and developing countries. This supports the wider objectives of the EoL and enables the greater integration of taxonomic effort globally. Full-text searching and taxonomic intelligence will ‘unearth’ inaccessible information from older material. This will allow new analysis and data mining by bringing together material from different institutions to provide a new synthesis.
Output Measures: BHL expects the online questionnaire to show changes in the efficiency and effectiveness of individual scientists (e.g., Appendix B: 4, 5, 6).

Improving access to information for non-museum biologists:
The BHL will expose the biodiversity literature to other biological science disciplines (ecology, forestry, land planning, environmental assessment, etc.) and a broad range of other potential users in medicine, history, and social sciences. The BHL will also provide improved access for non-taxonomists to original descriptions and identifications.
Such links could provide powerful new tools to medical researchers or environmental and ecological monitoring organizations, where precise species identification is critical for their work.
Output Measures: The online questionnaire will seek to obtain sector information about users and ask them about the application of the BHL materials in their sector (e.g., Appendix B: 7).

Repatriation of information about developing world species:
Full access to the published literature will effectively repatriate biodiversity information to the country of origin of the described organisms. In many cases in the developing world, local taxonomists and para-taxonomists do not have access to any of the literature on their local biodiversity.
Output Measures: The online questionnaire will seek information on the country of origin of the scientist using the BHL, and the territory in which the scientist intends to use the information (e.g., Appendix B: 1, 2, 3, 4, 5, 6).

Capacity building in the developing world:
Access to the BHL content will support the curricula for training new taxonomists in developing countries. This can help mitigate the “Taxonomic Crisis” (http://www.actionbioscience.org/biodiversity/page.html). The number of out-of-town visitors who need to visit our library collections should fall significantly. As indicated above, savings of at least $1 million per annum are achievable. This ‘saving’ will persist and should help to boost local capacity.
Output Measures: The online questionnaire will seek information on the country of origin of the scientist, and on the benefits which will accrue locally (e.g., Appendix B: 2).

Preservation of rare and fragile materials:
The BHL partners will be able to conserve their rare and original volumes by minimizing handling. Where appropriate, BHL will have the ability to print new facsimile titles on demand (for instance, for teaching or fieldwork).
BHL partners may be able to achieve cost savings by the use of remote storage facilities for the print materials that have been digitized. The BHL will provide a low-cost “not-for-profit” mechanism for small learned and professional societies to make the backfiles of their journals digitally available.
Output Measures: The portal will enable us to track the use of rare and fragile materials, and this will give us a measure of increased usage while minimizing handling. We will track the learned societies with whom we have agreements as part of the wider engagement of the EoL with the taxonomic community.

Budget

The Biodiversity Heritage Library Annual Operating Budget (Appendix A, attached spreadsheet bhlfiveyeargrant.xls) reflects all costs to be funded by the John D. and Catherine T. MacArthur Foundation and the Alfred P. Sloan Foundation over five years. The budget does not distinguish between costs funded by the two foundations, leaving that decision to representatives of the respective foundations.

The Annual Page Output (Appendix A, attached spreadsheet bhlfiveyeargrant.xls) reflects our best estimate of the number of digitized pages that will result from the BHL project. The number of volumes reflects either a 350-page or a 200-page per-volume estimate, as indicated. More volumes can be scanned if local supplementary fundraising is successful.

BHL Project Director: As with any large, collaborative project involving multiple institutions, the BHL requires high-level coordination and management.
The BHL Director will negotiate agreements with learned societies for digitization rights; draft contracts/agreements with major partners such as the IA; manage, track, report, and disburse funds for the performance of the BHL work; create enduring partnerships with peer initiatives; prepare fundraising proposals; oversee the BHL Portal Development Team; liaise with the EoL Secretariat to assure congruence of efforts; and assure that BHL libraries and the IA deliver expected output. The BHL Project Director will need to perform onsite visits to assist in implementation and evaluation of the scanning operations and to engage in BHL planning efforts. The BHL Director will be housed at the Smithsonian Institution Libraries in Washington, D.C.

Local Salaries/Library Technicians

Finding volumes and checking them for scanning suitability, tracking them through local circulation systems, delivering them to the scanning location or pick-up point, retrieving the volumes, checking for damage, and reshelving are all simple but very time-consuming tasks that become enormous in the quantities required for the BHL project. In many libraries, the high production levels required by the project cannot be completely absorbed by existing staff.

Metadata Repository: The metadata repository will be developed using existing funds, but ongoing simple maintenance will be necessary.

BHL Portal: Funds will cover 50% of a BHL Technical Manager at the Missouri Botanical Garden and two full-time programmers for two years working under his direction. In prototypes to date, the Missouri Botanical Garden staff have demonstrated the ability to manage development of this complexity and deliver results as required.
Direct Scanning Costs

These are the payments to our scanning and hosting service provider, the Internet Archive (IA), a non-profit organization dedicated to “Universal Access to Human Knowledge.” The IA has a proven track record of delivering and serving high-quality digital pages and their text versions at extremely low cost. Their cost model benefits from concentrating as large a mass of materials in as short a period of time as possible. IA is able to offer such compelling prices based on a model in which the total number of Scribe scanners is kept maximally occupied for a minimum of two years. Any slack in delivery drives the average per-page cost up. The budget plans for funds to be allocated to specific locations, so that the respective libraries can do extensive local planning and setup, and so that the appropriate number of scanning stations to be delivered and staffed can be estimated. However, before the actual agreements for each separate location are finalized, the BHL must reserve the flexibility to change the provisional assignment of IA scanners at the locations reflected in the budget, based on last-minute information and changes in local library situations. Any such changes will be made only with the approval of the EoL Secretariat. The differing per-page charges reflect such factors as the number of staffed scanning machines, currency conversion rates, whether the scanning machine is part of a wider regional shared facility, etc.

Transport Costs

Many BHL Libraries can achieve significantly lower per-page scanning costs if they use IA facilities that are shared with other libraries and are not on site. However, they may incur substantial transport costs to deliver and retrieve their materials.

Meetings

A project of this scope requires extensive networking and sustained meeting time to review progress and plan next steps. As the BHL ramps up, a forum will be necessary for involving new partners.
Budget Notes

The funds requested will create the infrastructure and base for the BHL and create a large body of digitized literature as soon as possible. Funds in addition to those requested can be easily and immediately applied to increase the number of volumes scanned without significantly increasing other costs except, in some cases, transport or local library technicians. Roughly, every additional forty dollars will add another volume. Appendix A is the spreadsheet bhlgrantfiveyear.xls.

Appendix B

Researchers' Comments Concerning BHL Digitization

1) First of all congratulations for the Botanicus.org webpage, it's a very useful project specially if you are located at countries far away from decent libraries (as I am in Costa Rica). I like the PDF version of the books, as it can be downloaded; it is easy to read parts of the books off-line. Best regards Walter Schug

2) Thank you so much for not only your quick response to my ILL request, but even more for your attaching the item as a .PDF file so that Prof. Newton received it almost instantly across the ether. I know he emailed you a much more timely thank you. He wrote me that he is extremely excited about your digitization project. At the moment he and his graduate botany students in Kenya have access to very few resources. He spends his summer terms at Kew doing his research for the next year's teaching and writing, but he tells me that now, because of what is already on your site, he will not have to carry so much back to Kenya for his research and his students but can download and work with your resources right there. I am cutting this note short as I myself now am heading off for my summer break, a family reunion back in Ohio, but I would like to send you a copy of Prof. Newton's original article, and you can then see how much you have helped his research. Emilie Pulver University of Kenya

3) I am absolutely amazed of this tool!
I think it is fantastic what we can offer to the botanical community and beyond to have at our finger tips. Congratulations to you and the staff involved. Thanks for all your good work. Carmen Ulloa

4) In reference to: Bulletin des Séances de la Société Entomologique de France. [Paris] : Société entomologique de France, [1873-1884] My deepest gratitude for allowing me access to the digital version of the very rare "Bulletin des Séances de la Société Entomologique de France". It has been very important for my work on the database of the names of the butterflies of the world to be able to consult at leisure this series, which is held by extremely few libraries in the world. I cannot stress enough the importance of having access to electronic versions of the literature, especially to us researchers who cannot benefit from well-endowed institutional libraries. The Smithsonian Libraries are doing a great service to science by making openly accessible such crucial works as the "Biologia Centrali-Americana", and now the above-mentioned "Bulletin". I only wish that there were many more such electronic resources. Please keep up the excellent work! Dr Gerardo Lamas Museo de Historia Natural Universidad Nacional Mayor de San Marcos

5) In reference to: Frederick Ducane Godman and Osbert Salvin, eds. Biologia Centrali-Americana. [London: Pub. for the editors by R. H. Porter], 1879-1915. I have to my position the collection of Coleoptera of the Faculty of Superior Studies of the Independent National University of Mexico and daily I consult volumes of BCA for the identification of specimens of Coleoptera, because it is a wonderful work and until the moment does not exist another source that supports in the identification of the Mexican species of several families of this group. Ma. Magdalena Ordóñez Reséndiz Museo de Zoología, FES Zaragoza, UNAM

6) In reference to: Walter Rothschild.
The Avifauna of Laysan and the neighbouring islands with a complete history to date of the birds of the Hawaiian possession. London: R H Porter, 1893-1900. Aloha. I live on The Big Island of Hawai'i, a $300.00 plane ride away from Honolulu and the Bishop Museum. Even when I can make it to the Museum (where I study the Hawaiian Bird Skins), they do not have every single bird (moho apicalis, the Oahu moho is missing)… I have been looking for this text for over TWENTY YEARS. Mahalo nui loa for all your hard work. Reading these pages mean so much to me and many others. I hope they show there appreciation as well. It truly is very important. I cannot thank you enough, nor stress the importance of your website enough. Thank you for putting these items on the web, and in such a findable manner. Aloha Gwendolyn O'Connor

7) In reference to: Howard Jones. Illustrations of the nests and eggs of birds of Ohio. Circleville, Ohio, 1886. Virginia Hunt, a doctoral candidate at LSU, will be using this site as part of her dissertation on ornithological illustration; commented Ms. Hunt: “My study concerns surveying a large number of ornithological narrative paintings, in particular historical examples, in order to determine how these may be used by high school and college biology teachers to teach certain key concepts in ecology while doing 'double duty' in illustrating subtle aspects of the history and nature of science.”

Bruce Shelvey, Ph.D.; Chair, Department of Geography, History, and Political and International Studies and Associate Professor of History and Political Studies, Trinity Western University (Canada).