ePrints.FRI - case study

ePrints.FRI – a case study
Open Access Repositories with EPrints
EIFL-FOSS and EIFL-OA free online workshop
23 May 2011
Miha.Peternel@fri.uni-lj.si
Overview
• Context
• Why EPrints?
• Installation & configuration
• Case Study: Organization & service implementation @ FRI
• Case Study: Submission policy
• Measuring success
• Conclusion
ePrints.FRI
• ePrints.FRI is the publications database of the Faculty of Computer
and Information Science, University of Ljubljana
• It is based on open source ePrints with modifications
• It was started in 2002 using ePrints 2 and now runs ePrints 3
• It is integrated into the Open Archives Initiative (OAI) indexer network
• It was the first OAI archive in the wider region
• It is integrated into the existing faculty web infrastructure
• In 2008 it became the official digital repository for all student
theses and dissertations
• It currently hosts 995 student works, 155 scientific works, plus
some books and other teaching materials
Context
• ePrints.FRI is an effort of the Faculty of Computer and Information
Science
• There is a separate University Library
• Most of the material is also catalogued in COBISS (metadata only)
• UL: University of Ljubljana
• FRI: Faculty of Computer and Information Science
• COBISS: National catalog
• DIKUL: University Library
• ePrints.FRI
Goal
• Initial goal (volunteer effort, 2002)
• Provide a simple self-archiving tool for the laboratory,
supporting the Open Archives Initiative (prof. Franc Solina)
• Try to deploy it Faculty-wide
• Revised goal (institutional effort, 2008)
• Fulfill national directive on thesis publishing
• Revise policies for digital publishing
• Try to provide a single point of metadata entry
Why did we choose ePrints 2 in 2002?
• Open Archives Initiative (OAI)
• Web publishing of primarily scientific output and teaching materials
• Web based interface
• Open source on standard Linux servers (RedHat)
• Highly customizable metadata
• Multilingual metadata
• The other OAI alternative in 2002 was not as customizable
Why did we choose ePrints 3 again in 2008?
• Mandatory publishing of student output (theses, dissertations)
• Improved metadata customization
• Improved workflow customization
• Modernized web user interfaces
• CSS, help, auto-complete, preview…
• Fulfilled all our customization requirements
• Alternatives did not offer any advantages for our needs
• No security or performance issues with ePrints 2
• Easy import of metadata (XML) and documents (PDF, DOC)
• Support from ePrints wiki and mailing list
Installing ePrints
Test installation:
• Pick your favourite OS (we prefer Debian Linux)
• Install Apache web server, MySQL, Perl
• Download & install EPrints package for your OS
• Add a few more Perl packages that EPrints requires
• Set up a test archive by running the configuration scripts
• Optionally install some more tools that EPrints can use
Pre-production installation:
• Configure and finalize metadata before you start adding documents
• Configure your server
• Virtual server has benefits
Customizing ePrints
• All the source code is available
• A collage of Perl, XML, XHTML and structured text files
• The code base is modular
• The Perl code stubs that are expected to be modified are exposed
in archive configuration directories
• Most other configuration is done by modifying XML/XHTML
• Some hacking of base Perl code or additions to code may be
required for minor fixes (custom import/export, OAI language
preferences…) or other special needs
• Each new version is more customisable out of the box
• Wiki documentation is extensive but not always up to date
Customizing ePrints 3 in practice
• Code and libraries – Perl
• Metadata definition – Perl
• Subject hierarchy and departments – text or XML
• Apache web server – conf files
• Workflow – XML
• Interface language – XML and some Perl
• OAI export – Perl, text
• Automation scripts – Linux crontab, some PHP
• Autocomplete – text, PHP
• Custom views – Perl, XML, XHTML
• Custom references – XHTML, some Perl
OAI configuration
• Enter policy information
• (Re)configure metadata mapping
• Enable & test (via web interface)
• Language prioritization is an issue with multilingual metadata
• We rewrote some code
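The language-prioritization issue can be illustrated with a minimal Python sketch. It assumes a harvested `oai_dc` record in which each `dc:title` carries an `xml:lang` attribute. The sample record, the `pick_title` helper and the preference order are hypothetical and are not taken from the ePrints.FRI code (which is Perl).

```python
import xml.etree.ElementTree as ET

# Namespaces for Dublin Core elements and for the xml:lang attribute.
DC = "{http://purl.org/dc/elements/1.1/}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record with one title per language (illustration only).
SAMPLE_RECORD = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title xml:lang="sl">Primer naslova</dc:title>
  <dc:title xml:lang="en">Example title</dc:title>
</oai_dc:dc>
"""


def pick_title(record_xml, preferred=("en", "sl")):
    """Return the title in the first preferred language that is present."""
    root = ET.fromstring(record_xml)
    titles = {
        t.get(XML_LANG): (t.text or "").strip()
        for t in root.iter(DC + "title")
    }
    for lang in preferred:
        if lang in titles:
            return titles[lang]
    # Fall back to any available title when no preferred language matches.
    return next(iter(titles.values()), None)


print(pick_title(SAMPLE_RECORD))
print(pick_title(SAMPLE_RECORD, preferred=("sl", "en")))
```

A harvester that ignores `xml:lang` simply takes the first title it sees, which is why the order in which the repository emits languages matters.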
Developmental phases
• ePrints 2 – 1 person effort
• Basic customization – less than 1 month
• Internal testing – 1 laboratory
• Dedicated server & multilingual debugging – 1 month
• ePrints 3 – institutional effort
• Organized process
Developmental phases – ePrints 3
• Institutional planning
• Metadata definition
• Customization
• Translation
• Staff education
• Testing
• Migration of existing publications from ePrints 2
• Initial deployment
• Workflow facilitation
• Statistics
Workgroup staff
• Workgroup manager: prof. Mira Trebar
• Software engineers (2)
• IT department representative
• Student office representative
• Library representative
• Linguist
• Plus occasional institutional representatives
• More student office & library personnel involved in final testing
Chart: Institutional departments involved
• Workgroup staff dispersed over several departments
• University of Ljubljana
• FRI: Faculty of Computer and Information Science
• Units: Student office, IT department, Library, Labs
• Roles: Manager, ePrints engineer, System engineer, Support engineer,
Script engineer, Representative (2), Staff (4), Staff (2)
Workgroup organization chart
• Faculty senate & commissions
• Workgroup manager
• ePrints engineer
• System engineer
• Library representative
• Student office representative
• Linguist
• Automation script engineer
Developmental milestones
• Institutional commitment: December 2007
• First workgroup meeting: January 2008
• Test installation: April 2008
• Metadata testing: May 2008
• Institutional presentation: June 2008
• Metadata migration, Testing: August 2008
• Institutional deployment, Testing: September 2008
• Public deployment: October 2008
ePrints.FRI – 2008 revision
• Bilingual interfaces
• Multilingual metadata and documents
Hosting
• IT department, Faculty of Computer and Information Science
• Platform:
• IBM server
• VMWare hosting multiple virtual servers
• Virtual Debian Linux server
• Backup
• Backup virtual server images
• Provided by IT department
Service sustainability
• Printed instructions
• 4 student office staff educated
• 2 library office staff educated
• One ePrints administrator plus one support engineer
• One system administrator plus one support engineer
• Virtual server with full-system backup
• System, metadata and publications all backed up
• Technical support
Technical support
• 1st level: IT department
• System engineer
• Support engineer
• 2nd level: involved technical staff
• ePrints software engineer
• Automation script engineer
• Network administrator
Policy formulation and licensing
• Policy formulated to respect national laws and university workflow
• Two-track licensing
• Strict requirements for student theses
• Scientific papers self-published and checked on best-effort basis
• Legal paperwork prepared for students
• Students must sign papers granting rights for electronic
publishing
Student submission
• Thesis work in printed form
• Thesis work in electronic form (PDF/DOC on CD)
• Metadata in electronic form (TXT/DOC on CD)
• Signed legal paperwork
All of the above is submitted to the student office before the oral defense,
so that announcement and publication proceed automatically
Thesis submission workflow
Workflow facilitation (1)
• Auto-complete names and titles from a database
• Avoids tedious ID lookups and related errors
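A minimal Python sketch of prefix auto-completion over a list of names. It assumes the names have already been loaded from the database; the sample names and the `build_index` and `autocomplete` helpers are hypothetical, and the actual ePrints.FRI auto-complete uses text files and PHP.

```python
from bisect import bisect_left


def build_index(names):
    """Sort names case-insensitively so that prefix matches are contiguous."""
    return sorted((name.casefold(), name) for name in names)


def autocomplete(index, prefix, limit=5):
    """Return up to `limit` names that start with `prefix` (case-insensitive)."""
    key = prefix.casefold()
    # Binary search for the first entry that could match the prefix.
    start = bisect_left(index, (key, ""))
    matches = []
    for folded, original in index[start:]:
        if not folded.startswith(key):
            break  # sorted order: no later entry can match
        matches.append(original)
        if len(matches) == limit:
            break
    return matches


# Hypothetical sample data (illustration only).
SAMPLE_NAMES = ["Marko Horvat", "Ana Novak", "Maja Zupan", "Anja Kovac"]
index = build_index(SAMPLE_NAMES)
print(autocomplete(index, "an"))
```

Returning the stored spelling rather than the typed text is what removes the manual ID lookup: the operator picks an existing entry instead of retyping it.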
Workflow facilitation (2)
• Fill in standard fields
• Prepare links and fix them later
Scientific submission
• Self-archived by author
• Electronic document in PDF form preferred
• Metadata in web forms
• Self-published by employed author (policy since 2010)
• Metadata and document validity periodically checked by
administrator on a best-effort basis
• Returned to author in case of invalid metadata
• Removed from public archive in case of a serious problem
• No legal policy at the moment
Publication HTML page
System integration
• Web integration
• Thesis defense announcements
• Thesis details
• Personal publication lists
• Laboratory publication lists
• Links to content hosted in ePrints
• Information system integration
• Morning mails include thesis defense announcements
• Mentoring and committee participation statistics automated
Web integration – defense announcements
• Generated by automation script from ePrints 3 XML
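A minimal Python sketch of turning an XML export into defense announcements. The XML layout below is a simplified stand-in, not the actual ePrints 3 XML schema; the element names (`author`, `defense_date`), the sample records and the `upcoming_defenses` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Simplified, hypothetical export (illustration only).
SAMPLE_EXPORT = """\
<eprints>
  <eprint>
    <title>Example thesis A</title>
    <author>Ana Novak</author>
    <defense_date>2011-06-01</defense_date>
  </eprint>
  <eprint>
    <title>Example thesis B</title>
    <author>Marko Horvat</author>
    <defense_date>2011-05-20</defense_date>
  </eprint>
</eprints>
"""


def upcoming_defenses(export_xml, today):
    """Return announcement lines for defenses on or after `today`, soonest first."""
    rows = []
    for ep in ET.fromstring(export_xml).iter("eprint"):
        when = date.fromisoformat(ep.findtext("defense_date"))
        if when >= today:
            rows.append((when, ep.findtext("author"), ep.findtext("title")))
    rows.sort()
    return [f"{when.isoformat()}: {author}, {title}" for when, author, title in rows]


for line in upcoming_defenses(SAMPLE_EXPORT, today=date(2011, 5, 23)):
    print(line)
```

Run from a scheduled job, a script of this shape keeps the announcement page in step with the repository without a second round of data entry.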
Web integration – publication lists
• Generated by PHP script from ePrints 3 export
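A minimal sketch of building per-author publication lists from exported records. It is written in Python for illustration, whereas the ePrints.FRI script is PHP; the record fields (`authors`, `year`, `title`), the sample data and the `publication_lists` helper are hypothetical.

```python
from collections import defaultdict
from html import escape

# Hypothetical records, as if already parsed from an export (illustration only).
SAMPLE_RECORDS = [
    {"authors": ["Ana Novak", "Marko Horvat"], "year": 2009, "title": "Paper <A>"},
    {"authors": ["Ana Novak"], "year": 2010, "title": "Paper B"},
]


def publication_lists(records):
    """Map each author to an HTML list of their publications, newest first."""
    by_author = defaultdict(list)
    for rec in records:
        for author in rec["authors"]:
            by_author[author].append(rec)
    pages = {}
    for author, recs in by_author.items():
        recs.sort(key=lambda r: r["year"], reverse=True)
        # Escape titles so that markup characters cannot break the page.
        items = "".join(
            f"<li>{escape(r['title'])} ({r['year']})</li>" for r in recs
        )
        pages[author] = f"<ul>{items}</ul>"
    return pages


pages = publication_lists(SAMPLE_RECORDS)
print(pages["Ana Novak"])
```

The same grouping step works for laboratory lists if records carry a laboratory field instead of, or in addition to, author names.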
Measuring and demonstrating success
• The first open archive in the wider region, quickly picked up by OAI
indexers and big search engines
• Relatively quick deployment with NO serious glitches
• Attracted interest from other open-access projects (DRIVER) and
faculties
• Access statistics:
• AWStats and Webalizer – general web access
• IRStats – repository specific
• Increased ability to monitor INTEREST and ORIGIN OF INTEREST
for publications and subjects of publications
Google ranking
Visitor statistics (IRStats)
Document downloads (IRStats)
Visitor statistics summary
• Publication dissemination PER MONTH is about triple the number of
enrolled students
• Greatly increased promotion and dissemination of student theses –
on average 5 downloads per thesis per month
• Elevated practical status of thesis as a reference
• Most visitors arrive by search engines looking for general keywords
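A figure such as "downloads per thesis per month" can be derived from raw download events roughly as follows. This is a minimal Python sketch under stated assumptions, not how IRStats computes its numbers; the event format, the sample data and the `downloads_per_item_per_month` helper are hypothetical.

```python
from collections import Counter


def downloads_per_item_per_month(events, item_count):
    """Average downloads per item per month.

    events: iterable of (item_id, 'YYYY-MM') download records.
    item_count: number of items hosted in the repository.
    """
    per_month = Counter(month for _item, month in events)
    if not per_month or item_count == 0:
        return 0.0
    # Mean monthly total, then divided by the number of hosted items.
    avg_monthly_total = sum(per_month.values()) / len(per_month)
    return avg_monthly_total / item_count


# Hypothetical sample: 6 downloads over 2 months for 2 items.
SAMPLE_EVENTS = [
    (1, "2011-03"), (1, "2011-03"), (2, "2011-03"),
    (2, "2011-04"), (1, "2011-04"), (1, "2011-04"),
]
print(downloads_per_item_per_month(SAMPLE_EVENTS, item_count=2))
```

Months with no downloads at all do not appear in the events, so a real report should fix the reporting period explicitly rather than infer it from the data.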
Key challenges faced
• Translation and multi-language specific issues
• Terminology
• Missing ePrints language flexibility
• Missing OAI multilingual support
• Overcoming resistance to change
• Single point of data entry for the student office
• Facilitators for data entry (auto-complete, workflow)
• System integration with existing web software: Ažur, Moodle
• Metadata set changed from ePrints 2 to ePrints 3
• Minor problems with indexing and automatic data transfer
Important unresolved issues
• Legal policy for scientific publications
• Centralized archive provides little incentive for self-archiving
• If Google can find it, who cares about the repository?
• Automated integration with current and future national archives
• Goal: Enter metadata once, publish many times
Conclusion
• EPrints is relatively easy to install and customize
• Most things can be customized within provided Perl stubs, XML and
XHTML
• Anything can be changed using basic Perl skills (and time)
• Google ranks ePrints archives highly
• Setting up OAI will boost your rankings (OAI indexers will link back
to your archive)
• We did not experience any serious security or performance issues
in ePrints code
• Based on this I can recommend ePrints for your archive
Thank you
• Any questions?
Miha.Peternel@fri.uni-lj.si