Implementing
The HKUST
Institutional
Repository
Diana Chan
Head of Reference
HKUST Library
Nov, 2005
2005 Library Conference: Balancing the
External and Traditional Libraries at the
Tamkang University, Taiwan
Library and Online Resources Technologies
2005 Conference at Xiamen University, PRC
Contents
1. Open Access and Institutional Repositories
2. HKUST IR
3. Software Selection
4. Planning and Policies
5. Strategies in Acquiring Content
6. Challenges
2
HKUST
 Opened in 1991
 4 schools (SSCI,
SENG, SBM, HSS)
 450 faculty, 5,500
UGs, 2,800 PGs
 Ranks 42 among the
top 200 universities
(2004 The Times
Higher Education
Supplement)
 Library: 22 librarians,
75 support staff
3
1. Open Access and
Institutional Repositories
 Technological and social trends that lead to the
Open Access Movement
 Fruits of Open Access
 What is an Institutional Repository?
 Why create one?
4
Technological Trends
 Increasing ease of sharing documents via FTP and Web (HTTP)
 Enables researchers to “publish” their research results (working
papers, pre-prints, etc) in subject-specific, web-based open archives
for faster and wider dissemination
 Individual scholars or institutions post abstracts and full-text
 Social Science Research Network (SSRN)
 IDEAS – Working papers in Economics
 The success of such collections led to the Open Archives Initiative
(OAI) which promotes author self-archiving & interoperable standards
for file sharing
 Major outcome: Open Archives Initiative Protocol for Metadata
Harvesting (OAI-PMH)
5
Social Trends
“Serials Crisis”
Journal titles Increasing
+
Prices rising
+
Library budgets cut
= Market dysfunction
(since the 1980’s)
Source ARL Statistics: Monographs and Serials Costs in ARL Libraries, 1986-2003
6
Open Access Movement Example
 Scholarly Publishing and Academic Resources
Coalition (SPARC)
 Sponsored by Association of Research Libraries
 Endorsed by many different groups: Assoc. of American Universities,
Assoc. of Universities and Colleges of Canada, Australian Vice-Chancellors' Committee, etc.
 Founded in 1997 to correct market dysfunction in
scholarly publishing
 “Expand competition & support Open Access
to address high & rising journal costs”
7
A Fruit of OA Movement:
Open Access Journals
 Refereed or peer reviewed
 Emerging Infectious
Diseases
 Journal of Machine
Learning Research
More in Directory of Open Access Journals (DOAJ)
8
A Fruit of OA Movement: OAIster
 One searchable interface for open archives from 536 academic institutions
 5.9 million documents: articles from Open Access journals; working
papers, discussion papers, & conference papers; dissertations & theses
+ All of the above & more from Institutional Repositories
9
A Fruit of OA Movement:
Institutional Repositories
 Development of IRs gained momentum with
the release of two open source systems:
Eprints (U of Southampton)
DSpace (MIT)
 Examples of Individual IRs
Australian National University Eprint
Repository
eScholarship Repository (U of California)
CalTech CODA
 Institutional Archives Registry (468 as of Oct 5,
2005)
10
What is an Institutional Repository
(IR)?
 A “digital collection capturing and preserving the
intellectual output of a single or multi-university
community”.
- Adopted from “The case for institutional repositories: a SPARC
position paper” prepared by Raym Crow.
<http://www.arl.org/sparc/IR/ir.html>
11
Why Create the IR?
 Budapest Open Access Initiative
http://www.soros.org/openaccess/index.shtml
 Recommends 2 Strategies:
1. Self-archiving in Open Electronic Archives
2. Open Access Journals
12
Dual Open-Access Strategy
 BOAI-2 ("gold"): Publish your article in a suitable
open-access journal whenever one exists.
 BOAI-1 ("green"): Otherwise, publish your article
in a suitable toll-access journal and also self-archive it.
13
Must Satisfy Two Conditions
 The author…grants to all users a free …right of
access to, and a license to copy, use,
distribute, transmit and display the work
publicly …
 A complete version of the work is deposited
in…at least one online repository
- From the Berlin Declaration
14
Why We Created an IR at HKUST
 To create a permanent record of the scholarly
output of HKUST
 To make available and disseminate the scholarly
output of HKUST in a free and interoperable
digital format
 To help the international Open Access effort.
Because the mission of disseminating
knowledge is only half complete if it is not widely
and readily available to society.
- Adapted from the Berlin Declaration
15
2. HKUST Institutional Repository

 Collects, disseminates, and preserves in digital format the scholarly output of the HKUST community
 Uses DSpace software, OAI-PMH compliant, supports Chinese
 Easily discovered by Internet search engines and indexing tools
http://library.ust.hk/repository/
16
Total Number of Documents
Collection                                                        Size   %
Conference Papers                                                  579  26
Working Papers, Technical Reports, Research Reports, Pre-prints    534  25
Journal Articles                                                   493  23
Doctoral Theses                                                    394  18
Patents                                                             58   3
Presentations                                                       56   2
Book Chapters                                                       37   2
Miscellaneous                                                        8   1
Total (as of Oct 5, 2005)                 2,159 (incl. 100 duplicates)
17
Contributors by Department
(as of Oct 5, 2005)
HSS&SOSC: 6%
OTHER: 12%
COMP: 21%
SBM: 13%
ELEC: 13%
OTHER SCI: 9%
PHY: 5%
MATH: 6%
OTHER ENG: 8%
MECH: 7%
18
Home Page of the HKUST
Institutional Repository
19
Browsing by Communities and Collections
20
Communities in HKUST IR
 Accounting
 Advanced Engineering
Materials Facility
 Applied Technology Center
 Atmospheric, Marine and
Coastal Environment Program
 Biochemistry Biology
 Center for Enhanced Learning
and Teaching
 Centre for Display Research
 Chemical Engineering
 Chemistry
 Civil Engineering
 Computer Science
 Economics
 Electrical and Electronic
Engineering
 Finance
 Humanities
 Industrial Engineering and
Engineering Management
 Information and System
Management
 Institute of Nano Science and
Technology
 Language Center
 Library
 Management of Organizations
 Marketing
 Mathematics
 Mechanical Engineering
 Physics
 Social Science
21
To Find Papers by Authors
kwok y
22
23
The View of an IR Record
Click to
see full
text
24
Full Text in pdf Format
25
To Search in IR
Fill in keywords and
click Search
26
27
To Submit A Paper
Put in your
UST account
name and
password
28
Fill in the form, click
the “Submit” button
at the bottom of the
page
29
30
You will receive a
confirmation email
31
Access Data
32
3. Software Selection
 The July/August 2004 issue of Library Technology Reports on IR systems and functional requirements
 We evaluated 2 IR systems: EPrints and DSpace
 We followed CalTech’s model and based our IR on open source software with an OAI-PMH interface.
33
DSpace
 Jointly developed by MIT Libraries
and Hewlett-Packard Company
 Open source software
 Released on Sourceforge during
our system evaluation period in
late December 2002
 Written in Java, with PostgreSQL
database, Lucene search engine,
and a Tomcat web servlet
container
34
DSpace
 We chose DSpace in 2003 because:
DSpace began the development with the
experience gained from EPrints - the first and
most popular open source IR software at that
time
EPrints did not have full support for Unicode
and was not Java- and servlet-based
Both EPrints and DSpace are open source
software, fulfill our functional requirements,
and follow state-of-the-art library standards
35
Current Configuration
of HKUST IR
As of Oct 5, 2005:
Home URL: http://repository.ust.hk/
IR Software: DSpace Version 1.2.1
System Software: Fedora Core 2 Linux; Tomcat 5.0.28; JDK1.4.2_05
Server: Intel Pentium 4 2.4GHz, 2GB RAM
Content: 2,059 documents from 40 communities
Usage: Documents were accessed 5,792 times in September 2005
36
Major Features
Data structure
Document submission form
Add item form
CJK support
OAI data provider
SRW/U interface
37
Data Structure
 Document Types
 journal articles, theses, etc,
 Document Formats
 Mainly PDF files; also contains PowerPoint files
 DSpace data model
 Communities (and sub-communities)
 Collections
 Items
 Metadata
 Bundles of bitstreams
 HKUST implementation: Items are grouped by
 Departments (i.e. communities)
 then by Document Types (i.e. collections).
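The hierarchy on this slide can be pictured as nested containers. The following is a minimal Python sketch of that idea only; it is not DSpace's actual Java classes or database schema, and the class names, fields, and sample titles are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bitstream:
    name: str        # file name, e.g. a PDF
    mime_type: str


@dataclass
class Bundle:
    name: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Item:
    metadata: Dict[str, List[str]]             # Dublin Core-style element -> values
    bundles: List[Bundle] = field(default_factory=list)


@dataclass
class Collection:
    name: str                                  # at HKUST: a document type
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    name: str                                  # at HKUST: a department
    collections: List[Collection] = field(default_factory=list)
    sub_communities: List["Community"] = field(default_factory=list)


def count_items(community: Community) -> int:
    """Count items in a community, including its sub-communities."""
    own = sum(len(c.items) for c in community.collections)
    return own + sum(count_items(s) for s in community.sub_communities)


# Hypothetical sample data: one department with two document-type collections.
paper = Item(
    metadata={"title": ["A hypothetical conference paper"]},
    bundles=[Bundle("ORIGINAL", [Bitstream("paper.pdf", "application/pdf")])],
)
thesis = Item(metadata={"title": ["A hypothetical doctoral thesis"]})
cs = Community(
    name="Computer Science",
    collections=[
        Collection("Conference Papers", [paper]),
        Collection("Doctoral Theses", [thesis]),
    ],
)
total = count_items(cs)
```

Grouping by department first and document type second means a browse by community always lands on a department, with the document types as its collections.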
38
Document Submission Form
Faculty are not willing to do self-submission
DSpace’s submission and workflow functions
are too lengthy
A simple and effortless submission form was needed,
as a quick medium for submitting documents
The local submission form is written in Perl
Submitted data are stored in DSpace “Simple
Archive Format”
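As a rough illustration of what storing a submission in Simple Archive Format involves, the sketch below writes one item directory containing a `dublin_core.xml` file, a `contents` file listing the bitstreams, and the files themselves. It is written in Python for readability and is not the Perl form described on the slide; the function name, metadata values, and file names are hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_saf_item(item_dir, metadata, files):
    """Write one item directory in DSpace Simple Archive Format.

    metadata: list of (element, qualifier, value) tuples
    files:    dict mapping file name -> file content (bytes)
    """
    item_dir = Path(item_dir)
    item_dir.mkdir(parents=True, exist_ok=True)

    # dublin_core.xml holds one <dcvalue> per metadata value.
    root = ET.Element("dublin_core")
    for element, qualifier, value in metadata:
        dcvalue = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    # The bitstreams themselves, plus a "contents" file naming them one per line.
    for name, data in files.items():
        (item_dir / name).write_bytes(data)
    (item_dir / "contents").write_text("\n".join(files) + "\n", encoding="utf-8")
    return item_dir


# Hypothetical submission, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    item = write_saf_item(
        Path(tmp) / "item_000",
        [("title", "none", "A hypothetical working paper"),
         ("contributor", "author", "Author, Example")],
        {"paper.pdf": b"%PDF-1.4 placeholder"},
    )
    listed_files = (item / "contents").read_text(encoding="utf-8").split()
    parsed = ET.parse(str(item / "dublin_core.xml")).getroot()
    titles = [e.text for e in parsed if e.get("element") == "title"]
```

A directory of such item folders is the input that DSpace's batch import tool expects, which is what lets a lightweight form sit in front of the repository.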
39
Add Item Form
Is a locally developed JSP application to add
items to DSpace by library staff
Allows staff to:
Create new item from scratch
Enhance the metadata from faculty
submission and then add the item to
DSpace
40
41
CJK Support
 CJK (Chinese, Japanese, Korean) Support
 DSpace supports Unicode
 Problem - Lucene search engine is unable to search
by CJK characters
Solved by replacing DSpace’s Tokenizer with a
CJKTokenizer - but has an interesting side effect
 Problem - URL of query containing CJK characters is
not properly encoded
Solved by setting Tomcat URIEncoding="UTF8"
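To make the two fixes concrete, here is a simplified Python sketch. The first part imitates the general approach of a CJK bigram tokenizer, which indexes runs of CJK characters as overlapping two-character tokens; it is not Lucene's Java implementation, and the character ranges are a simplifying assumption. The second part shows a CJK query percent-encoded as UTF-8, which the servlet container then has to decode with the same charset.

```python
from typing import List
from urllib.parse import quote


def is_cjk(ch: str) -> bool:
    """Rough CJK test (simplified ranges: ideographs, kana, hangul syllables)."""
    return (
        "\u4e00" <= ch <= "\u9fff"
        or "\u3040" <= ch <= "\u30ff"
        or "\uac00" <= ch <= "\ud7af"
    )


def bigram_tokenize(text: str) -> List[str]:
    """Split text into lowercased words and overlapping CJK bigrams."""
    tokens: List[str] = []
    run = ""
    run_is_cjk = False

    def flush() -> None:
        nonlocal run
        if not run:
            return
        if run_is_cjk and len(run) > 1:
            # Overlapping pairs: ABCD -> AB, BC, CD
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        elif run_is_cjk:
            tokens.append(run)
        else:
            tokens.append(run.lower())
        run = ""

    for ch in text:
        if is_cjk(ch):
            kind = True
        elif ch.isalnum():
            kind = False
        else:
            flush()          # whitespace or punctuation ends the current run
            continue
        if run and kind != run_is_cjk:
            flush()          # switching between CJK and non-CJK text
        run_is_cjk = kind
        run += ch
    flush()
    return tokens


tokens = bigram_tokenize("DSpace 香港科技")
encoded_query = quote("香港")   # percent-encoded UTF-8 bytes
```

Because a query string is tokenized the same way as the indexed text, a two-character CJK query becomes a single bigram that can match inside longer indexed phrases.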
42
43
44
OAI Data Provider
DSpace is OAI-compliant
This means that OAI harvesters can easily
collect the metadata (in Dublin Core format)
from various IRs (including HKUST’s) for their
added-value indexing/searching services.
For example: OAIster
OAI Path to IR at HKUST:
http://repository.ust.hk/dspace-oai/request?
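A harvester talks to this path by appending an OAI-PMH verb and its arguments. The Python sketch below only builds such request URLs against the base path given above; it sends nothing over the network, the helper name is an assumption, and the record identifier is a made-up placeholder rather than a real HKUST identifier.

```python
from urllib.parse import urlencode

# Base path as given on the slide.
OAI_BASE = "http://repository.ust.hk/dspace-oai/request"


def oai_request_url(verb, **arguments):
    """Return an OAI-PMH request URL for a verb and its arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return OAI_BASE + "?" + urlencode(query)


# Describe the repository.
identify_url = oai_request_url("Identify")

# Harvest all records as unqualified Dublin Core.
list_records_url = oai_request_url("ListRecords", metadataPrefix="oai_dc")

# Fetch one record; the identifier here is hypothetical.
get_record_url = oai_request_url(
    "GetRecord", identifier="oai:example.org:123/456", metadataPrefix="oai_dc"
)
```

The `oai_dc` metadata prefix is the one every OAI-PMH data provider must support, which is why harvesters such as OAIster can collect from many repositories with the same request.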
45
http://repository.ust.hk/dspace-oai/request?verb=GetRecord& ... 1783.1/1805
46
SRW/U Interface
Search and Retrieval for the Web (or by URL)
Retain core functionality of Z39.50 but in the
form of web services
This means search service providers can
broadcast a search to various IRs and deliver
the search results in their own GUI interface
SRW/U Interface for the IR at HKUST
Based on OCLC’s SRW/U software
URL: http://repository.ust.hk/SRW/
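The "by URL" variant (SRU) expresses a search as ordinary query parameters. The sketch below builds a `searchRetrieve` request in Python using standard SRU 1.1 parameter names. Whether requests go directly to the URL above or to a database path beneath it is not stated on the slide, so the base URL is an assumption, as is the `dc.title` index name in the sample CQL query.

```python
from urllib.parse import urlencode

# Assumption: the slide's URL is used as the request base as-is.
SRU_BASE = "http://repository.ust.hk/SRW/"


def sru_search_url(cql_query, max_records=10, base=SRU_BASE):
    """Return an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return base + "?" + urlencode(params)


# Hypothetical CQL query on a title index.
search_url = sru_search_url('dc.title = "institutional repository"')
```

A search portal can send the same URL pattern to several repositories and merge the XML responses, which is the broadcast-search use described above.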
47
The result of a SRW/U search, with XSLT transformation
48
Enhancements to DSpace
Document submission form
CJK searching problem
Subscript and superscript problem
Number of items displayed
Access data
Top 20
Recommend an item link
Faculty & staff link
49
4. Planning and Policies
 Task Force – software, scope, policies, database
structure, problems, action plans
 Information Services Committee – guidelines on
publications, publishers’ policies, data formats,
faculty concerns.
 Library Administrative Committee – problems,
issues, final decision, strategies.
50
Work Team – Subject Librarians
[Workflow diagram: Dr. Samson Soong & subject librarians. Boxes shown: Liaise With Faculty; Check Pub List; Ascertain Pubs’ Policies; Harvest Document; Verify Document Version; Index Document; To Data Entry Staff. Branch labels: Correct Version; Incorrect Version.]
51
Work Team – Data Entry Staff
Verify and Convert
PDF Documents
Final Review
Input Metadata
Using Submission Form
Add Items to Repository
Set PDF Document Security & Properties. Add Watermark for Pre-published Version
Proof-Read
52
53
Guidelines on Different Publications
Type
Copyright
Action
Book chapter
Book
Conf paper
Conf proceed.
US Patent
Publisher
Need permission
Publisher, 50 years
Need permission
Author
Can archive
Publisher
Need permission
Public Domain
Author
Can archive
US Patents
Working Paper,
Technical Report
Author
Can archive
Presentation
Standard
Author
Can archive
Issuing Organization
No
54
SHERPA Summary of Publishers' Policies
55
Guidelines on Journal Articles
Rows: version available on hand. Columns: publisher’s policy.

Version on hand         No Arch.  Pub’s  Pre-Ref’ed  Post-Ref’ed  Both         All  Not Specified
Pre-Refereed Version    No        Yes    Yes         Yes          Yes          Yes  Ask Pub
Post-Refereed Version   No        Yes    No          Yes          Yes          Yes  Ask Pub
Publisher’s Version     No        Yes    No          Ask Faculty  Ask Faculty  Yes  Ask Pub
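The matrix on this slide works as a lookup: given the version on hand and the publisher's policy, it returns an action. The Python sketch below encodes one reading of the slide, with columns taken in the order No Arch., Pub's, Pre-Ref'ed, Post-Ref'ed, Both, All, Not Specified. That column order and the key names are assumptions, so treat the table as illustrative rather than as the Library's definitive rule set.

```python
# Publisher's policy categories, in the column order assumed above.
POLICIES = [
    "no_archiving",
    "publishers_version",
    "pre_refereed",
    "post_refereed",
    "both",            # pre- and post-refereed
    "all",
    "not_specified",
]

# One row per version available on hand.
ACTIONS = {
    "pre_refereed": ["No", "Yes", "Yes", "Yes", "Yes", "Yes", "Ask publisher"],
    "post_refereed": ["No", "Yes", "No", "Yes", "Yes", "Yes", "Ask publisher"],
    "publishers_version": ["No", "Yes", "No", "Ask faculty", "Ask faculty", "Yes", "Ask publisher"],
}


def archiving_action(version_on_hand, publisher_policy):
    """Look up the action for a version on hand under a publisher's policy."""
    column = POLICIES.index(publisher_policy)
    return ACTIONS[version_on_hand][column]


example = archiving_action("post_refereed", "pre_refereed")
```

For instance, a post-refereed copy cannot be archived when the publisher permits only pre-refereed versions, while holding only the publisher's PDF under a post-refereed policy means asking the faculty member for their own copy.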
56
Guidelines on Publishers’ Policies
 Studied publishers’ copyright & self-archiving
policies (SHERPA/RoMEO, Stevan Harnad’s
and publishers’ websites)
 Constructed our own table for reference
 Printed out publishers’ copyright statements and
date-stamped them
 Noted their acknowledgement or credit
requirements
57
Credit to Publisher
 In the Rights field of a record:
APS copyright statement:
"[Journal title] © copyright (year) American
Physical Society. The Journal's web site is
located at http://....."
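Filling the Rights field consistently is easy to script. The following Python sketch formats the statement pattern quoted above; the function name is an assumption, and the journal URL in the example is a placeholder because the slide elides the real one.

```python
def aps_rights_statement(journal_title, year, journal_url):
    """Format the credit line for the Rights field of an APS article record."""
    return (
        f"{journal_title} © copyright ({year}) American Physical Society. "
        f"The Journal's web site is located at {journal_url}"
    )


# Placeholder URL for illustration only.
statement = aps_rights_statement(
    "Physical Review Letters", 2005, "http://journal.example.org/"
)
```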
58
59
Other Policies
Withdrawal
Replacing Versions
Cooperation with User Groups
Authority Control
Indexing
Rights and Acknowledgement
60
5. Strategies in Acquiring Content
Our logic
How to Acquire by Type of Document?
How to Use Different Channels?
Sustainable Growth
61
Logic Behind Our Strategies
 The research output is the University’s
intellectual property
 Create a critical mass of papers
 Copyright and self-archiving rights are our
concerns
 Ascertain publishers’ policies
 Ask permission from authors and publishers
 Deal with publications which are easier to
obtain and sources which are more accessible
 Those posted on the web
 Those from publishers allowing published
versions
62
How to Acquire by
Type of Document?
1. Working Papers, Technical Reports, Research
Reports
2. Conference Papers
3. Conference Presentations
4. Theses
5. Book Chapters
6. Peer-reviewed Journal Articles
7. Open Access Journal Articles
63
Sources of Scholarly Content
Library
Collection
Researchers
Web
Scholarly
Content
Publishers
Journals
64
Copyright vs. Self-Archiving Rights
[Diagram relating copyrighted and non-copyrighted documents to archivable and non-archivable status. Document types shown: journal articles, book chapters, conference proceedings, presentations, working papers, theses, technical reports. Ownership labels: University Owned; Author Owned; Publisher Owned. Permission labels: Author’s Permission; Department’s Permission; Publisher’s & Author’s Permission. Non-archivable: selected items to ask for author’s & publisher’s permission.]
65
Journal Articles
Journal Article -> Check Author’s Archiving Rights:
No or Unclear -> Ask Publisher
Yes, Publisher’s Version -> Harvest from the Web
Yes, Pre-refereed or Post-refereed Version -> Ask Author
All branches -> Deposit Into IR
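The journal-article flow above is a three-way branch on the author's archiving rights. A minimal Python sketch of that branch follows; the enum and function names are illustrative assumptions, and in practice the deposit step follows only once the chosen step has produced a usable copy.

```python
from enum import Enum


class ArchivingRights(Enum):
    NO_OR_UNCLEAR = "no or unclear"
    YES_PUBLISHERS_VERSION = "yes, publisher's version"
    YES_PRE_OR_POST_REFEREED = "yes, pre- or post-refereed version"


# Next step for each outcome of checking the author's archiving rights.
NEXT_STEP = {
    ArchivingRights.NO_OR_UNCLEAR: "Ask publisher",
    ArchivingRights.YES_PUBLISHERS_VERSION: "Harvest from the web",
    ArchivingRights.YES_PRE_OR_POST_REFEREED: "Ask author",
}


def next_step(rights: ArchivingRights) -> str:
    """Return the step that precedes deposit into the IR."""
    return NEXT_STEP[rights]


step = next_step(ArchivingRights.YES_PUBLISHERS_VERSION)
```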
66
How to Use Different Channels?
1. Self Submission
2. Harvest from Websites (departmental, faculty, research centers)
3. Library Collection
 Conference proceedings
 Theses and dissertations
 University Archives
4. Harvest from the Source (databases, E-journals, Open Access publications)
5. Publishers
6. Liaisons with Faculty, departments, research centers
7. Public Relations
67
Electronic Thesis Approval Form
Student Agreement:
I hereby grant to the Hong Kong University of
Science and Technology Library the non-exclusive right to archive my thesis in digital
format, and make it freely accessible, such as
over the Internet.
Signed:
Date:
68
Publisher’s policy: Emerald
Emerald’s Principles on Copyright
Emerald seeks to retain copyright of the articles it
publishes, without the authors giving up their rights to
use their own material. Authors are not required to seek
permission to re-use their own work. As an author you
can use your paper in part or in full,…in another article
written for us or another publisher, on your website, or
any other use, without asking us first.
http://ninetta.emeraldinsight.com/pdfs/jarform.pdf
69
Collection Growth Milestones
[Chart: No. of Documents (y-axis, 0 to 1,800) over time (x-axis, May 2003 to Sep 2004). Milestones annotated on the chart:]
83 Research Centers
79 Univ. Archives
50 IOP papers
142 conference papers
35 papers with publishers' permission
96 CS papers
110 theses + 211 working papers
53 patents
116 papers from faculty websites
105 CS technical reports
70
Towards Sustainability for the
HKUST Institutional Repository
 How to make the submission to IR part of the
publication process?
Seeking permission from faculty to archive
papers supported by RGC grants
Making use of the OCGA Research Output
report process: a checkbox was added to the
report form to denote agreement to archiving
in the IR. 100+ papers were received in
summer 2005.
71
6. Challenges - Faculty
 Low awareness of Open Access
 Concern over copyright issues
 Apathy in self submission
 Lack of willingness to negotiate on non-exclusive rights or self-archiving rights
 Lack of willingness to provide the right versions
of documents (pre- or post-refereed)
 Only a small % of their scholarly work can be
archived
72
Example of a Faculty
Retaining Self Archiving Rights
73
Challenges - Institution
 Needs to make a commitment to deposit all
research output with the Institutional Repository
 Needs to give financial support to faculty who
submit papers to open access journals
 Needs to give financial support to the Library for
archiving work
74
Challenges - Publishers
 In SHERPA project, 73 out of 107 publishers
(68%) allow some sort of archiving, as of Nov’04
 Many have no policy (Camford, Genetics Society
of America)
 Many have an unclear policy
 Need to include self-archiving into license
agreements with publishers
75
Challenges – Library
 Provide support for university research self-archiving
 Promote the IR
 Educate users and faculty about the IR
 Showcase the IR
 Find champions and partners
 Seek institutional commitment and support
 Harvest documents
 Make self submission a part of faculty’s
publication reporting system
76
Challenges - Librarians
 System Evaluation
 Formulating and interpreting policies
 Internal and publishers’ policies
 Content Recruitment
 Advocacy
 Education
 Advisory
 Perceived benefits
 Public relations
 Use Assistance
77
References and Additional Resources

Chan, Diana L.H. (2004) “Managing the challenges: acquiring content for the HKUST Institutional Repository” International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries and the Hong Kong University of Science and Technology Library, available at http://hdl.handle.net/1783.1/1973 (accessed September 24, 2005)

Chan, Diana L.H. (2004) “Strategies for acquiring content: experiences at HKUST” International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries and the Hong Kong University of Science and Technology Library, available at http://hdl.handle.net/1783.1/1974 (accessed September 24, 2005)

Chan, Diana L.H., Kwok, Catherine S.Y., Yip, Stephen K.F. (2005) “Changing roles of reference librarians: the case of HKUST Institutional Repository” Reference Services Review, Vol. 33, No. 3, pp. 268-282, available at http://hdl.handle.net/1783.1/2039 (accessed September 24, 2005)

Crow, Raym. (2002) “SPARC Institutional repository checklist and resource guide” The Scholarly Publishing & Academic Resources Coalition, November.

Crow, Raym. (2002) “The case for institutional repositories: a SPARC position paper”, available at http://www.arl.org/sparc/IR/ir.html (accessed September 24, 2005)

Gibbons, Susan. (2004) “Establishing an institutional repository” Library Technology Reports, July/August, Vol. 40, No. 4, pp. 5-67.

Lam, Ki-Tat. (2004) “DSpace in action: implementing the HKUST Institutional Repository system” International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries and the Hong Kong University of Science and Technology Library, available at http://hdl.handle.net/1783.1/2023 (accessed September 24, 2005)

Special issue on reference librarians and institutional repositories (2005). Reference Services Review, Vol. 33, No. 3, pp. 259-346.
78