What is an institutional repository?
A university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution.
Clifford Lynch, Executive Director, Coalition for Networked Information

Building an Institutional Repository
Sarah L. Shreeves
September 24, 2007
Illinois Digital Environment for Access to Learning and Scholarship
© 2007, IDEALS. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/

What is IDEALS?
Illinois Digital Environment for Access to Learning and Scholarship
Institutional repository for the scholarship and research in digital form of the faculty, students, and staff of the University of Illinois at Urbana-Champaign.
Supported by the Office of the Provost, CITES and the University Library.
• Dissemination
• Preservation
• Persistent and reliable access
http://ideals.uiuc.edu/

Benefits for our faculty, students, and staff?
• Increased dissemination of research
• Persistent URLs
• Preservation
• Promotion of research
• Full text searching of textual material
• Control over copyright

What is in scope for IDEALS?
Services
• Preservation
• Facilitating deposit of materials through consulting, training, and batch loading
• Consultation around copyright issues and IDEALS
• Providing as many access and dissemination points as possible for deposited material
• Providing additional services for end users and depositors as appropriate
What type of materials?
• Published research and scholarship
• Unpublished research and scholarship in a ‘final’ state
• In the future: digital art, complex data sets….
• Publications
• Presentations
• Grey literature
• Pre-prints
• Raw and processed research data
• Management & organization of digital output
• Journal articles, books, etc. by faculty
• Theses / Dissertations
• Manuscripts
• Some scholarly web sites

What’s out of scope?
Collections
• Administrative/electronic records
• Everyday curriculum material
• Published material where publisher policy does not allow deposit
Services
• E-Portfolio for students
• Journal publishing
• Digitization of materials
• Shared collaborative space for groups

Roles in a digital repository?
• Project manager
• Collections specialists
• Programmer / Technology specialists
• Metadata specialists
• Digitization specialists
• Legal specialists
• Public relations specialists

Why should the library be responsible?
• Expertise in large scale collection management, description, and access
• Usually have a preservation component
• Long term commitment
• Is the library’s mission!
But… Library should partner with others when needed
• Libraries
• ICT units
• Consortia
• Granting agencies

Demo of IDEALS
Production site: http://ideals.uiuc.edu/
Test site: http://loki.grainger.uiuc.edu/ideals/

Why Now for IDEALS?
• Management & organization of digital output
• Influence direction of scholarly communication
• Open Access
• Preservation of digital output
• Dissemination of scholarship and research
See original proposal at: http://www.ideals.uiuc.edu/handle/2142/3

Influence Direction of Scholarly Communication
• Educate faculty on copyright issues
• IDEALS highly encourages open access to deposited material
• Funders beginning to mandate open access to research
• Provides multiple dissemination routes

The Fundamental Issue
Scholarly Literature is Different from Commercial Publication
• Not written for direct compensation
• Freely given to publishers
• Research and writing are supported through public funds
• Access is intended to be as wide as possible
(From “Trends in Scholarly Communications” by Richard Fyffe)

Barriers to Broad Access
• High Costs
• Restrictive Licensing Terms
• Slow Speed of Publication
• Too Much Information
(From “Trends in Scholarly Communications” by Richard Fyffe)

High Costs
• Erosion of Subscription-based Access to Journals
• Decline in Book Purchases and Erosion of Scholarly Monograph Publishing
(From “Trends in Scholarly Communications” by Richard Fyffe)

Current Model of Scholarly Communications
• The Academy: Published Research is a Contribution
• The Commercial Publisher: Published Research is a Commodity
(From “Anatomy of a Crisis: Dysfunction in the Scholarly Communications System” by Lee C. Van Orsdel)

Restrictive Licenses
Contract Terms Supersede Copyright Law and “Fair Use.”
Contracts May (and Do) Restrict:
• Who may use the journal
• Permissible uses or kinds of research
• Classroom use
• Scholarly sharing
(From “Trends in Scholarly Communications” by Richard Fyffe)

Author Rights
• Typically copyright is transferred from the author to the publisher
• An author can request to keep copyright, but there is no guarantee that the publisher will grant it
• Author addendums retain certain rights but not full copyright.
See Hirtle at http://www.dlib.org/dlib/november06/hirtle/11hirtle.html

Open Access
“free availability of the results of research mainly in the form of scholarly articles”
(From “Open Access Publishing: A developing country view” by Papin-Ramcharan and Dawe in First Monday, http://www.firstmonday.org/)
Two roads:
• Open access journals
• Archiving (self, institutional, discipline)

Open Access Journals
• With internet access, articles are free to read, download, copy, distribute, and print
• Can also have a fee-based print version
• Costs of the journal (including access and dissemination) are paid for by author-side fees (sometimes supplied by the author, institution, or granting agency) or by sponsorship

Issues with OA Journals
• Sustainability for publisher?
• Preservation issues?
• For user, accessibility? Requires internet and broadband access because most articles are PDF (exceptions are Bioline International, First Monday, Ariadne, Journal of Digital Information, and a handful of others that publish in HTML)

Self Archiving
• Collect, describe, preserve, and provide access to digital output in a personal, institutional, or disciplinary repository
• Cost is generally paid for by the organization maintaining the repository

Issues with Self Archiving
• Sustainability for organization
• Copyright issues
• Preservation issues
• Take up by faculty / researchers

Federal Research Public Access Act of 2006 (Cornyn-Lieberman)
• Publications from federally-funded research must be deposited in an agency repository;
• Agency ensures the manuscript is preserved in a stable, digital repository;
• Free, online access to each taxpayer-funded manuscript available no later than six months after its publication in a peer-reviewed journal.
Benefits of Open Access
• Free and open access to research for all who have ability to access it
• Higher citation impact for open access articles
• Pressure on commercial publishers for pricing
• Forcing changes in the scholarly communication lifecycle

Digital Preservation
• Not just about back-ups and storage
• Technology, organization, and resources
• Looking forward towards certification as a “Trusted Digital Repository”
• Only 42% of journal publishers have established formal arrangements for the long-term preservation of their journals
Image from Cornell University Library

What is Digital Preservation Management (DPM)?
Process that requires the use of the best available technology as well as carefully thought out administrative policies and procedures.
Consists of:
• Organizational concerns
• Technological development
• Resource management

Building a “New” Library

Format Support
Less Preservable
• proprietary
• supported by only one software platform
• has low use
• lossy data compression
• contains embedded files or programs/scripts
More Preservable
• openly documented
• supported by a range of software platforms
• is widely adopted
• lossless data compression
• does not contain embedded files, programs, or scripts
[Figure: TIFF is marked on the slide as an example format for each criterion]

Trusted Digital Repositories
RLG/NARA Digital Preservation Repository Certification Task Force Audit Checklist
Objectives: Produce certification requirements (for both self and external assessment), delineate a process for certifications, and identify a certifying body (or bodies) that can implement the process.
http://www.crl.edu/content.asp?l1=13&l2=58&l3=162&l4=91
Principles:
• External to the digital archives (cannot consist solely of self-assessment)
• Managed/performed by recognized authorities
• Well-documented with comprehensive and explicit policies, procedures, and practices
• Sustainable and monitorable over time
• Replicable

Dissemination of Scholarship and Research
• Open to Google Scholar and other spiders
• Provide harvesting through the Open Archives Initiative Protocol for Metadata Harvesting
• Most hits to IRs come from the outside in
• http://www.oaister.org/ - OAIster at the University of Michigan

Challenges to Establishing Digital Repositories
• Gathering content from faculty
• Digital preservation
• Metadata
• Copyright issues
• E-Science, digital art, and other data coming our way…

Faculty reluctance
• Deposit is out of their ordinary workflow
• Concerns about copyright issues
• Want to keep research data private
• Don’t see value in the digital repository

Copyright Issues
• Deposit of pre-print and post-print materials
• 83% of journal publishers require authors to transfer copyright in their articles to the publisher
• Generally publishers do not allow deposit of the publisher PDF copy
• Will authors want the penultimate copy deposited?
• Resistance to open access
• Generally digital repositories require a non-exclusive dissemination and preservation license
• Sherpa/Romeo Publisher copyright policies: http://www.sherpa.ac.uk/romeo.php?all=yes

4 Approaches to Building Content
1) Working directly with faculty and scholarly units
2) Identifying publications from faculty that are able to be deposited
3) Inserting ourselves into the process of disseminating grey literature such as technical reports and working papers
4) Working directly with publishers

Metadata
• Generally author-supplied
• For full text objects, is minimal metadata enough?
• How many resources should be spent on metadata enhancement?
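
The harvesting route named under Dissemination (the Open Archives Initiative Protocol for Metadata Harvesting) and the question of minimal, author-supplied metadata can be illustrated together. The sketch below is only an illustration under stated assumptions: the base URL, the sample record, and the choice of "required" fields are all hypothetical and are not taken from IDEALS. It builds a ListRecords request for unqualified Dublin Core (oai_dc), parses a canned response of the shape a harvester would receive, and reports which fields a record lacks.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint; a real repository publishes its own OAI-PMH base URL.
BASE_URL = "https://repository.example.edu/oai/request"

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a harvester would request for a ListRecords call."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Invented sample response, used instead of a network call.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.edu:1234/5</identifier>
        <datestamp>2007-09-24</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Technical Report</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:date>2007</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per record: its identifier and its Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        fields = {}
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is not None:
            for child in dc:
                name = child.tag.split("}", 1)[-1]  # strip the namespace part
                fields.setdefault(name, []).append((child.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


def missing_fields(record, required=("title", "creator", "date")):
    """List required fields that are absent or empty (the list is an assumption)."""
    fields = record["fields"]
    return [f for f in required if not any(fields.get(f, []))]


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    for record in parse_records(SAMPLE_RESPONSE):
        print(record["identifier"], "missing:", missing_fields(record))
```

A harvester such as OAIster would issue requests of this kind against many repositories; how much metadata counts as "enough" is a local policy choice, which is why the required fields are a parameter here rather than a fixed rule.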
Different types of data
• Datasets (E-Science and other disciplines)
• Digital art
• Preservation issues
• Complex objects
• Massive datasets with many types of supporting documentation
• Preserving relationships between pieces is difficult
• Many research questions!

Technical Infrastructure
• DSpace 1.4.1
• MIT Libraries & Hewlett-Packard
• http://www.dspace.org
DSpace Community
• 130+ sites, 45+ countries
• Communicate via listservs, wiki, conferences
• Advisory Board – 13 institutions

Technical Infrastructure
UIUC Services Integration
• CITES Bluestem / LDAP
• NetFiles
• Illinois Compass
• UI Portal?
• Library Online Catalog
• Online Research Resources

Conclusions
• What do you want to do and what does your institution need?
• Better dissemination of local research?
• Better management and organization of locally produced materials?
• Digital preservation?
• Can you phase these in?
• Who is going to be responsible?

Resources
IDEALS: http://www.ideals.uiuc.edu/
IDEALS About Pages: http://ideals.uiuc.edu/about/aboutus.html
IDEALS Initiative Wiki: https://www.ideals.uiuc.edu/wiki/
Sarah Shreeves, Coordinator, sshreeve@uiuc.edu
Tim Donohue, Programmer, tdonohue@uiuc.edu

Copyright Notice
Parts of this file are based on the works “Anatomy of a Crisis: Dysfunction in the Scholarly Communications System” by Lee C. Van Orsdel, Dean of Libraries, Eastern Kentucky University, and “Trends in Scholarly Communications” by Richard Fyffe, Assistant Dean for Scholarly Communication, University of Kansas Libraries.
This file is distributed under a Creative Commons license: Attribution-NonCommercial-ShareAlike 3.0
You are free to copy, distribute, display, and perform the work and to make derivative works under the following conditions:
• Attribution. You must give the original author credit.
• Noncommercial. You may not use this work for commercial purposes.
• Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
For any reuse or distribution, you must make clear to others the license terms of this work. Any of these conditions can be waived if you get permission from the copyright holder. See www.creativecommons.org