An Issues Paper - HKU Libraries

An Institutional Repository for the University of Hong Kong:
An Issues Paper
16 January 2004.
The Institutional Repositories Taskforce:
A Taskforce of the Knowledge Team
A  The Brief.
This taskforce was convened to study the Institutional IT/Knowledge Repositories
issue. Specifically, the taskforce will look at LEARNet and other similar projects to
see if the University of Hong Kong should, like many institutions elsewhere
(including HKUST), take advantage of the open source software platform
phenomenon that enables institutions to: (1) capture and describe digital works using
a submission workflow module, (2) distribute an institution's digital works over the
web through a search and retrieval system, and (3) preserve digital works over the
long term. Following its first meeting and a report to the Knowledge Team, the
Taskforce was asked to provide an issues paper outlining some of the major issues
surrounding the introduction of an institutional repository at the University.
B  What is an Institutional Repository?
Because the institutional repository is a recent, emerging concept, it is not easy to
define precisely what constitutes one. In what is arguably the most significant article
dealing with institutional repositories to date, Clifford Lynch (2003) provides a broad
definition of what he believes an institutional repository to be. In particular, he states,
they provide:
…for the management and dissemination of digital materials created by the
institution…an organizational commitment to the stewardship of these digital
materials, including long-term preservation…as well as organization and access
or distribution…the management of technological changes, and the migration of
digital content from one set of technologies to the next… a mature and fully
realized institutional repository will contain the intellectual works of faculty and
students—both research and teaching materials—and also documentation of the
activities of the institution itself in the form of records of events and
performance and of the ongoing intellectual life of the institution. (Lynch, 2003,
p. 328).
The Ohio State University’s (OSU) Knowledge Bank provides another interesting
perspective on what an institutional repository might be. While the OSU Knowledge
Bank’s original purpose was “to collect, to index, and to preserve digital content
produced by faculty” (Rogers, 2003, p. 126), this definition was broadened to include
“the full array of digital assets and information services available to or being created
by OSU faculty, staff and students” (emphasis added; Rogers, 2003, p. 126). The
introduction of digital content available to the institution in the latter definition raises
a host of complex issues that, it appears, other institutions with these repositories have
chosen not to tackle. The OSU Knowledge Bank is still in the planning stages,
insofar as only a central listing of OSU digital projects is being collated (Ohio State
University Library, 2003).
Essentially, an institutional repository is defined by the institution itself. The
material to be included in the repository will vary among institutions, but it
must be digital, searchable and wide ranging in nature. The institution must also
commit to preserving the material and to providing perpetual access,
ensuring that material is not deleted after a certain amount of time but continues to be
built upon.
C  The Situation at the University of Hong Kong.
LEARNet.
The University has contributed descriptions of learning objects to the Learning
Resource Catalogue (LRC), created by the University of New South Wales as a
Universitas21 initiative, for several years. This catalogue of learning objects is shared
among U21 member institutions, which are in most cases free to reuse relevant objects
that they locate through the catalogue. As an extension of this programme, the
University, through its CAUT, sought to extend this resource sharing to all UGC-funded
institutions. With UGC funding, the LEARNet project was established.
LEARNet enables the University, through the use of the U21 software, to share its
learning objects with other U21 institutions as well as other UGC-funded
institutions. However, other U21 institutions are blocked from viewing
other UGC records, and other Hong Kong universities are blocked from accessing U21
institution records; only the University of Hong Kong shares
with both. Setting up this environment has been a complex task made possible
through the UGC grant which will expire at the end of 2004. The University of Hong
Kong has made a significant investment in this project and remains committed to its
success.
How does LEARNet differ from an institutional repository? Firstly the LEARNet
catalogue (LRC) is just that, a catalogue. It is not a repository. The objects are
described using learning object metadata (LOM) and a link is usually provided to
where the item can be retrieved. Learning objects themselves are described as
building blocks for teaching and learning purposes that are self-contained, shareable,
searchable, reusable and can be updated. So while it might be attractive to use the
LEARNet project as a springboard for an institutional repository, the
focus of LEARNet is too narrow for that purpose: using it as an
institutional repository would limit its applicability to other digital materials.
If an institutional repository is adopted at the University, however, a high
degree of interoperability or record duplication across the two systems would be
necessary, as these learning materials should form a significant part of such a repository.
Registry’s Research and Scholarship Database.
This database, collated annually, highlights significant research activities conducted
by Faculty as well as providing an overview of each Faculty's research directions
(HKU Registry, http://www.hku.hk/rss/rs2002/index.html). Each Faculty Profile also
provides entries on individual researchers in each Department under the Faculty.
These entries describe information about research projects and research outputs as
well as titles of theses produced in that academic year by research postgraduate
students. In 2002 the Libraries worked with the Registry and the Computer Centre to
provide links between the database and the electronic journals to which the Libraries
subscribe.
This database, while only providing descriptions and in some cases a link to the
published material, provides a significant variety of University output which, if
available in full text, could form the basis of an institutional repository.
The Libraries’ Hong Kong University Theses Online.
This collection comprises more than 8,000 titles of theses and dissertations submitted
for higher degrees to the University of Hong Kong since 1941. Of these titles only a
small number are available in full text (59 at the time of writing), with the remainder
containing metadata descriptions as well as contents and abstracts. Once again the
content of this database would make a major contribution to a University institutional
repository.
Other resources in faculties, departments, administrative divisions?
This is the great unknown. It is believed that some University Departments currently
digitise their departmental records but it is uncertain which records are digitised, how
they are stored and whether they are retained indefinitely or destroyed after a certain
number of years. The Libraries’ Administration Department, for example, digitises its
administrative records, letters and so on and utilises the Document Imaging System
developed by the Computer Centre and available through the Staff Intranet for storage
and retrieval.
D  Why Introduce an Institutional Repository?
‘“Superarchives” could hold all scholarly output’ (Young, 2002) was how the
Chronicle of Higher Education chose to announce the rise of institutional repositories.
While ostensibly positive in its headline, Young’s article also raises a degree of
scepticism concerning the likely success of such repositories, citing earlier failed
attempts at ‘widespread reform of academic publishing’ (Young, 2002, p. A30).
The changing scholarly publishing environment.
The Scholarly Publishing & Academic Resources Coalition (SPARC, 2002, p. 5)
believes that the impact of several coinciding factors makes this the right time for a
new scholarly publishing environment. Among these are:
• Technological change in digital publishing and networking has driven the
demand for more robust digital presentation
• Marked increases in research output (particularly in the sciences) are straining
the capacity of the print publishing model
• Increasing dissatisfaction with traditional print and electronic journal pricing
and marketing models
• Increasing uncertainty and concern over the preservation of digital
scholarly material.
Institutional repositories capture the intellectual output of a university. They also
have the potential to enable a university to form part of the rapidly growing global
network of interoperable repositories, thus providing a new
disaggregated model of scholarly publishing.
Institutional visibility and prestige.
All universities pride themselves on their intellectual output. With its significant
output in both quantity and quality, the University of Hong Kong is no different. By
aggregating its intellectual output in an institutional repository, the University is in a
position to more readily demonstrate its prestige through its quality output and in turn
its value to society. Such demonstration has the opportunity to translate into real
benefits including funding from both private and public sources.
Preservation.
As the nature of scholarly communication changes, universities are seeing academics
and other researchers developing research and teaching materials in increasingly
complex digital formats. Collecting, storing, arranging and disseminating this
material is a complex task, and one that runs the risk of significant duplication, and
therefore cost, if not conducted at an institutional level with institutional commitment.
But digital preservation is a complex matter and one that most institutional
repositories have not yet dealt with in any satisfactory manner. The SPARC
institutional repository checklist & resource guide (SPARC, 2002a) highlights this
when it says that ‘many of the early institutional repository implementations have
deferred decisions about long-term digital preservation’ and that this is in anticipation
of progress being made ‘in terms of developing standards for digital preservation’ (p.
38).
Other (prestigious) universities are doing it.
While participating simply because others do is not without flaws, it is worth
highlighting some of the universities now involved in institutional repositories or
some derivation of them. Perhaps the best known is the Massachusetts Institute of
Technology (MIT) Library, which partnered with Hewlett-Packard to develop the
DSpace software now implemented there and being tested, adopted or adapted in
other institutions including Cambridge University, Ohio State University, Columbia
University, Cornell University, the University of Rochester, the University of Toronto,
and the University of Washington at Seattle. In Hong Kong the University of Science
and Technology has also implemented DSpace, albeit with a limited number of items
currently available (279 titles at the time of writing).
E  Problems with Introducing an Institutional Repository.
Several issues need to be addressed in order to successfully implement an institutional
repository at the University of Hong Kong.
What to include – the need for a local definition.
A prerequisite to any discussion of implementing an institutional repository is
an institutional definition of that repository, in particular of what types of digital
material it will contain. As we have seen, Clifford Lynch
provides a fairly broad definition that encompasses intellectual works of faculty and
students, both research and teaching materials as well as documentation of the
activities of the institution. The taskforce believes that should a HKU institutional
repository be established, it should only hold material of scholarly value, and not
administrative documentation such as departmental minutes. The Ohio State
University definition includes not only digital material created by the institution but
also digital material available to the institution. Through its publications, SPARC
tends to emphasize the research output of the university, and the repository as an
alternative to more traditional scholarly publishing methods.
Another aspect of the definition that must be considered early is whether the
institution is willing to share its resources beyond its immediate members. While part
of the spirit of the institutional repository is to enable greater scholarly
communication, this does not prohibit the institution from restricting access to certain
kinds of material housed in the repository. As an example, MIT has blocked
external full-text access to its MIT Press publications.
Faculty participation.
Needless to say, faculty involvement is critical to the success of an institutional
repository. In particular, if faculty are asked to use the repository as an alternative
means of publishing their scholarly output, they need to be convinced that this
alternative is viable, promotes scholarship in their disciplines and indeed brings them
the prestige afforded by publishing in refereed established journals. If the repository
is to be used for traditionally unpublished material, this is less of a concern but faculty
still need to understand the benefits of sharing these materials.
Non-faculty participation.
The successful institutional repository will not only enjoy faculty participation but
also non-faculty participation. Once again institutional requirements will dictate
which non-faculty departments are required to be involved. Certainly the Libraries
and the Computer Centre and most likely the Press, the Registry and the Museum will
need to be involved. Potentially all administrative and service departments will make
some contribution. Engaging such an extensive range of players will require a strong
commitment from the University administration. Coordinating such a group is also
not without its problems.
How to implement.
Extensive implementation does not yet appear to be a reality in any of the institutions
with institutional repositories. Even MIT, as the leader of DSpace, has a limited number
of departments contributing to the repository. Its strategy may be to roll out the
software to only a select number of departments in order to identify further issues and
problems and, when these have been dealt with, to hold up the successful
implementation as a model to other departments. This is most likely a suitable strategy for the
University of Hong Kong to adopt.
Costs.
While institutional repository software is open source and freely available, this does
not mean that there are no costs involved in such an implementation. These are
predominantly in the form of staffing and hardware. MIT estimates annual cost to
maintain DSpace at US$285,000 ($225,000 staffing; $25,000 operating expenses;
$35,000 system equipment).
Timing – should we wait and see?
Institutional repositories have attempted to set themselves apart from earlier failed
attempts at alternative academic publishing. To some extent they have
succeeded: they can be distinguished from those earlier attempts by the new
institutions participating and by their attempt to collect a wider range of materials. Yet
the question remains whether or not the scholarly community is ready to embrace the
institutional repository as an alternative to traditional academic publishing. Extensive
adoption of this software, and the participating institutions’ commitment to sharing the
content contained within, will determine the success of institutional repositories as an
alternative publishing means. If the University chooses to implement a repository that is
a restricted, internal, central repository of the University’s digital output, then
success will rely largely upon a University-wide commitment to the project.
F  Joe Branin’s Visit
In early December taskforce members attended presentations by Joe Branin, Director
of Libraries at the Ohio State University, where the Knowledge
Bank, an extensive repository and referatory, is currently under development and
receiving much coverage. The taskforce also met with Joe Branin on Friday 5th
December, when we were able to discuss many of the issues raised in this paper and
gain a first-hand account of the evolution of the Ohio State University Knowledge
Bank.
The Ohio State University Knowledge Bank
The Ohio State Knowledge Bank is far more than an institutional repository and, as
Branin himself admitted, might be more ambitious than their ability to deliver. The
Bank will consist of a number of components, namely:
• Online Published Material
  – E-books, e-journals, government documents, handbooks
• Online Reference Tools
  – Catalogs, indexes, dictionaries, encyclopedias, directories
• Online Information Services
  – Scholar’s portal, alumni portal, chat reference, online tutorials, e-reserves, e-course packs, technology help center
• Electronic Records Management
• Administrative Data Warehouse
• Digital Publishing Assistance
  – Pre-print services
  – E-books, e-journal support
  – Web site development and maintenance
• Faculty Research Directory
• Digital Institutional Repository
  – Digital special collections
  – Rich media (multimedia)
  – Data sets and files
  – Theses/dissertations
  – Faculty publications, pre-publications, working papers
  – Educational materials
    • Learning objects
    • Course reserves/E-course pack materials
    • Course Web sites
• Information Policy
• Research/Development in Digital Information Services
  – User needs studies
  – Applying best practice
  – Assistance with Technology Transfer
Such an undertaking is indeed ambitious and it would be difficult to imagine a similar
undertaking being successfully introduced here at the University of Hong Kong.
Implementation at Ohio State University
The mandate for the OSU Knowledge Bank was received from the Office of the
President where a senior member of the University has championed what they believe
is a worthy project. As Director of the Library, Joe Branin was asked to make this
happen.
Branin indicated that the starting point for OSU would be the establishment of a
Faculty Research Directory and that considerable interest was generated at their
institution for the development of this expertise database. Branin also reported that
the Knowledge Bank will not include traditionally published scholarly literature, nor
its pre-prints. The OSU faculty were not interested in this idea, and some were
adamantly opposed.
While OSU are adopting MIT’s open source DSpace, there are still considerable
start-up costs involved in doing this. Apart from the necessary hardware and technical
support, OSU have employed a full-time project manager whose role is to 1) gain
university-wide acceptance, indeed commitment, for the Knowledge Bank and 2)
identify appropriate resources to be contained within the Bank and seek the relevant
approval for doing so.
There are obvious difficulties in HKU following the OSU model of implementation.
G  The Options for HKU
Taskforce members agree there is great merit in introducing an institutional repository
at the University of Hong Kong. In the interests of 1) making accessible material that
is hidden away or accessible to only a few, 2) contributing to open scholarly
communication, and 3) establishing a commitment to long term preservation of
resources, the taskforce commends to the Knowledge Team four possible options for
its consideration:
1  Undertake a full implementation of DSpace (or other similar package)
with a full approach, similar to the OSU model.
This option provides for a fuller implementation of a repository than option 2, along
similar lines to the OSU model as espoused by Joe Branin. This model would require
a major commitment from the University in terms of both resources and support for
the project. There is obviously substantial risk in adopting this option, and the
likelihood of success in such a full-scale implementation is
limited, as a considerable outlay of resources, and a perpetual commitment to renewing
those resources, would be necessary.
2  Implement DSpace (or other similar package) with a ‘soft’ approach.
This option suggests that DSpace (or other similar package) be implemented and that
a department or departments be asked to contribute working papers and other
unpublished material into the repository. If this is successful it can be held as a model
and used to encourage others to contribute. It should be stressed that the taskforce did
not undertake any significant assessment of DSpace or any of the other repository
packages and that such an assessment should be done prior to any implementation.
3  Develop an effective institutional search engine.
Wide-scale implementation of a repository would dictate that the resources mentioned in
section C, above, be included in the repository. Each of these resources is
unique and can currently be accessed independently to meet a particular need. The
benefit of incorporating these into a single repository is that a single search can be
undertaken across all at once. In doing so, however, these sets of data may lose their
individuality and become part of a larger set of potentially incongruous data. The
development of a HKU institutional search engine, as opposed to a “fixed”
institutional repository, would enable users to choose the sets of data that they wish
to search and then to conduct a single search across those datasets. It would serve the same
purpose as a repository whose principal function is to provide integrated searching.
One example of such a search engine is the MetaFind service recently implemented
by HKU Libraries
<http://metafind.iii.com/muse/servlet/MusePeer?action=logon&userID=uhk&userPwd=uhk&templateFile=search/search.html&pageId=asearch>.
This search engine will
search across several discrete databases or repositories, such as ScienceDirect,
LexisNexis and Inspec. It could also be made to search across HKU's local databases
in conjunction with the aforementioned repositories, or in isolation from them. Another
example is provided by the Open Archives Initiative (OAI) service providers. Metadata could be
harvested from the several HKU repositories, and included in one or several of them,
to enable searching across HKU repositories in conjunction with other non-HKU
repositories, or in isolation from them. One OAI service provider is ARC at Old
Dominion University, <http://arc.cs.odu.edu:8080/oai/advanced_search.jsp>. It is
worth noting that the HKU Theses Online (HKUTO) is already searchable in ARC.
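The harvest-then-search idea described in this option can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how MetaFind or ARC work: the source names ("theses", "research"), the identifiers and the titles are all invented, and the XML is built locally to stand in for the response a repository would normally return to an OAI-PMH ListRecords request with metadataPrefix=oai_dc. The sketch parses Dublin Core titles from each source, pools them into one index, and lets the user search either all sources or a chosen subset.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def make_response(records):
    """Build a ListRecords-style document from (identifier, title) pairs.

    Hypothetical helper: it stands in for the XML a real repository
    would return over HTTP, so the sketch runs without a network.
    """
    items = "".join(
        f"<record><header><identifier>{ident}</identifier></header>"
        f"<metadata><oai_dc:dc><dc:title>{title}</dc:title></oai_dc:dc>"
        f"</metadata></record>"
        for ident, title in records
    )
    return (
        f'<OAI-PMH xmlns="{NS["oai"]}" xmlns:oai_dc="{NS["oai_dc"]}" '
        f'xmlns:dc="{NS["dc"]}"><ListRecords>{items}</ListRecords></OAI-PMH>'
    )


def harvest(source, xml_text):
    """Extract identifier and title from each record, tagged with its source."""
    root = ET.fromstring(xml_text)
    harvested = []
    for rec in root.iterfind(".//oai:record", NS):
        ident = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        harvested.append({"source": source, "id": ident, "title": title})
    return harvested


def search(index, term, sources=None):
    """Case-insensitive title search, optionally limited to chosen sources."""
    term = term.lower()
    return [
        rec for rec in index
        if (sources is None or rec["source"] in sources)
        and term in rec["title"].lower()
    ]


# Invented sample data for two hypothetical local sources.
SAMPLE = {
    "theses": make_response(
        [("oai:example:theses/1", "Water quality in Victoria Harbour")]
    ),
    "research": make_response(
        [
            ("oai:example:research/7", "Harbour reclamation and urban planning"),
            ("oai:example:research/8", "Teaching statistics online"),
        ]
    ),
}

# One pooled index; each record keeps its source, so sources stay distinct.
index = [rec for name, xml_text in SAMPLE.items() for rec in harvest(name, xml_text)]

if __name__ == "__main__":
    for hit in search(index, "harbour"):
        print(hit["source"], hit["id"], hit["title"])
```

Because each harvested record keeps its source label, the underlying databases retain their separate identities while still being searchable together, which is the property this option is meant to preserve.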
4  Adopt a wait and see approach.
The concept of an institutional repository is a relatively recent phenomenon.
Tracking further developments in this area, in particular which models of
implementation, which hosting software, and which definition of IR content are most
successful, will provide the University with a greater degree of certainty in any future
implementation.
H  Conclusion
Members concurred that any institutional repository for HKU should, at least in the
first instance, only hold material that is of scholarly value. With this precondition,
taskforce members agree that option 1, based on the OSU model, is inappropriate.
Furthermore this option presents a high financial risk, particularly as we lack the
clarity of commitment from the wider University community for such an undertaking.
Option 2 provides a more realistic approach for the HKU situation and in fact seems
to be quite representative of most current models of implementation, with the obvious
exception of that at OSU. This option is also not without risk, albeit on a smaller
scale than the previous option. Testing the DSpace platform, identifying a relevant
department with appropriate material, purchasing hardware and dedicating technical
staff must all be considered.
Option 3 provides for a technology that has been tested and proven effective. It will
allow present HKU initiatives to continue on their present course. These include the
aforementioned Research & Scholarship database linking, HKUTO, and others such
as the Library’s ExamBase, Sun Yat-sen in Hong Kong, etc. It will allow these
existing databases and repositories, and future ones, to maintain their separate and
unique identities, as well as give the user the opportunity to search, at one go, across
all of them. Option 3, compared to Options 1 and 2, is much less labour and cost
intensive. It is also an option that could be a stepping stone for us; i.e., it would allow us to
offer meta-searching across many HKU repositories now, while leaving open the opportunity in
the future to consider options 1 and 2 once again, perhaps after more defining
developments have occurred and the field comes into better focus.
Option 4 provides the University with the time to witness the relative success of the
various models of implementation, software platform and content definition and to
base our implementation on the most successful of these.
Perhaps underpinning any final decision is the need to determine first our own
rationale for desiring an institutional repository for the University of Hong Kong.
Having identified this rationale, the choice of options and the definition of material to
be included may be made with greater confidence.
I  Summary of Options: Pros and Cons
Option 1: Undertake a full implementation of DSpace (or other similar package) with a full approach, similar to the OSU model.
Pros:
• Greatest possible single point of access to university digital information.
• We already have several components that could be included.
Cons:
• Much material may be of minimal value and not warrant the effort.
• These unique components lose their identity.
• Requires ongoing commitment from many areas of the University – uncertainty that such commitment will be forthcoming.
• High cost and therefore financial risk.
Option 2: Implement DSpace (or other similar package) with a ‘soft’ approach.
Pros:
• Need only identify a single department to contribute materials.
• Provides a test-bed environment with only minor risk.
• Most institutions implementing IRs are adopting this method.
Cons:
• Few other implementations of this kind to learn from.
• Need to identify that department.
• Some financial risk (purchase of hardware and dedicating technical staff must all be considered), but lower than option 1.
Option 3: Develop an effective institutional search engine.
Pros:
• Proven technology.
• Low cost and maintenance.
• Allows existing databases and repositories, and future ones, to maintain their separate and unique identities.
Cons:
• University not seen at the forefront of the IR movement.
Option 4: Adopt a wait and see approach.
Pros:
• Requires little immediate effort.
• Enables an analysis of models of implementation, hosting software, and definition of IR content that are most successful.
Cons:
• University not seen at the forefront of the IR movement.
Taskforce members
Colin Day, Press
Clara Ho, Press
MC Pong, Computer Centre
David Palmer, Libraries
Tina Yee-wan Pang, Museum and Art Gallery
Peter Sidorko (Chair), Libraries.
References
Lynch, C. A., 2003, Institutional repositories: Essential infrastructure for scholarship
in the digital age. portal: Libraries and the Academy, 3(2) pp. 327-336.
Massachusetts Institute of Technology (MIT) Library, 2002, DSpace durable digital
repository: definition <http://dspace.org/what/definition.html>.
Ohio State University Library, 2003, Digital projects at the Ohio State University,
<http://dlib.lib.ohio-state.edu/DISC/academics.php>.
Rogers, S.A., 2003, Developing an institutional knowledge bank at Ohio State
University: From concept to action plan. portal: Libraries and the Academy,
3(1) pp. 125-136.
Scholarly Publishing & Academic Resources Coalition (SPARC), 2002, The Case for
institutional repositories: a SPARC position paper, prepared by R Crow,
SPARC, Washington, DC, available at
<http://www.arl.org/sparc/IR/IR_Final_Release_102.pdf>.
Scholarly Publishing & Academic Resources Coalition (SPARC), 2002a, SPARC
institutional repository checklist & resource guide, prepared by R Crow,
SPARC, Washington, DC, available at
<http://www.arl.org/sparc/IR/IR_Guide_v1.pdf>.
Young, J. R., 2002, 'Superarchives' could hold all scholarly output. Chronicle of
Higher Education, 48(43) pp. A29-A30.