Technical Framework
Charl Roberts
University of the Witwatersrand
Charl.roberts@wits.ac.za
Source: Repositories Support Project (JISC) http://www.rsp.ac.uk

Technical Setup
In order to create an effective digital repository it is important that the technical set-up process is planned in detail.

Considerations
• Defining requirements. Without a requirements specification, informed decisions cannot be made about the choice of repository platform and environment.
• Installing a repository platform, which may require the purchase of hardware and software, or could involve negotiating a hosting contract.
• Integrating the repository with other systems such as local authentication systems.
• Testing the repository to ensure that it works as expected and fulfils all the criteria set out in your requirements specification.
• Creating technical policies for long-term aspects such as metadata, workflows and file formats.
• Technical promotion of the repository. This is important to ensure that other systems such as external search engines index or harvest content properly.

Requirements and Specifications
Define the requirements:
• What is the repository for?
• Who are the stakeholders (those people with a vested interest in how the repository represents the institution, and themselves, to the world), and what do they want from the repository? In the case of an institutional repository, stakeholders will include senior institutional managers, departmental leaders, and those who are expected to contribute content.
This approach is likely to reveal a series of questions:
• What is the target content of the repository?
• Are all content types to be managed in a single repository, or more than one?
• What other systems and services might the repository be required to share information with? This is often referred to as 'interoperability'.
• Is there appropriate budget and staffing to support the requirements?
• What will the repository store? For a higher education institution, repository content could include research papers and data, electronic theses, as well as teaching and learning resources, perhaps including some audio-visual content.

Lower level requirements
• Interoperability (getting data in and out)
  - Open Archives Initiative (OAI)
  - Deposit protocols (SWORD)
  - Web 2.0
  - Standards (OAI, Dublin Core, W3C)
• Software options
  - Open Source Software (OSS)
    - Free to download
    - Staff requirements for support
    - Community
  - Hosted solutions
    - Better support
    - Comes at a cost
• Platform requirements
  - Depends on your software options; usually OSS requires other open source software (web server, database, etc.)
• Programming requirements
  - What skills will be required?
• Quality
  - What standards do you need / want?
• Performance
  - How many items will you store? Growth forecasting

Installation
• IT staff required: campus IT, web developers, programmers
  - HTML, SQL, CSS, Java / PHP
• Supporting software
  - Database, web server
• Hardware questions
  - Technical requirements of the software
  - Growth of the system
  - Preservation (physical & access)
• Backup
• Maintenance
• Customization

Integrating the repository
Three types of integration:
• Integration with external systems to get items into a repository: While repositories are often populated with items that have been submitted using the repository software, there are many cases where the information can be gathered from external systems. A common use case is to populate the repository from an institutional publications database.
Some institutions are investigating ways to populate their repository with learning and teaching materials from their VLE, in an effort to make them more accessible both within the institution and, in some cases, more openly to the wider community. Another way of working is to provide depositors with desktop-based smart deposit tools that integrate with their working environment to help capture their work as it is created. The most widely adopted standard for depositing items into a repository is SWORD.
• Integration with systems to get items out of a repository: Once a repository contains a useful corpus of items it can be integrated with other systems that want to use that data. These may be local systems such as institutional search engines or researcher web pages, national systems such as EThOS, or international systems such as Google Scholar or OAIster. One of the most common methods for extracting the structured metadata of the items in repositories is 'harvesting', with the standard protocol being the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). RSS feeds are another standard mechanism that allows repositories to provide information to other systems; in this case, RSS feed readers.
• Integration with systems that provide services to a repository: Repository software specialises in storing items and metadata, but can often work more effectively if it makes use of services provided by other systems. Repositories are commonly configured to work with local authentication systems such as Shibboleth, LDAP or Active Directory. These services allow the repository to look up usernames, passwords, and user details (name, email, telephone number, etc.) from a centrally managed system.
Other systems that may provide services to a repository could include file format validation (JHOVE), virus scanning of ingested files, or external cloud storage of files.

Testing
• Pilot System Testing / Functional Testing
• Pilot User Acceptance Testing
• Production System Testing and User Acceptance Testing

Creation of technical policies for long-term aspects
• Metadata (Metadata-reuse http://www.opendoar.org/find.php?format=charts)
• Workflows
• File Formats

Wits Example
• Wits hosts our own Institutional Repository (IR), called WIREDSPACE
• We currently use DSpace as our software platform
• On the IT side we have one web developer (as the manager of the unit I assist with development work)
• Our Technical Services Department work on our metadata creation and enrichment
• Our DSpace instance runs on a virtual server on the Wits cloud infrastructure. It is Solaris based, although we are considering moving to Ubuntu Linux

Questions
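
Backup: harvesting sketch
The harvesting integration described under "Integrating the repository" can be illustrated with a minimal sketch. It builds an OAI-PMH ListRecords request URL and pulls identifiers and titles out of a Dublin Core (oai_dc) response. The base URL, the sample record and the function names are hypothetical and chosen only for illustration. The verb, the metadataPrefix parameter and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications. No network call is made; a real harvester would also need to fetch the URL and follow resumption tokens.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai/request"

# Hand-written sample response (an assumption, not real repository output).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234/5</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        records.append((identifier, title))
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    for identifier, title in parse_records(SAMPLE_RESPONSE):
        print(identifier, "|", title)
```

Parsing is kept separate from fetching so that the same function can be exercised against saved responses during pilot testing, before any production endpoint is harvested.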