presentation to abc - National Library of Australia wiki

NSLA OPEN BORDERS PROJECT
USER AUTHENTICATION FOR E-RESOURCES WHICH WILL BE
ACCESSED VIA TROVE: A DRAFT MODEL
Working Draft: 2 December 2009
Introduction
The Open Borders Project is one of the “Reimagining Libraries” Projects sponsored by
National & State Libraries Australasia (NSLA).
The underlying objective of this Project is to allow Australian library users to have improved
access to e-resources, especially those e-resources subscribed to by NSLA member libraries
and Australian public libraries.
The Project aims to ensure that these e-resources are used to the maximum extent possible. It
will support a user-centric approach to the discovery and access of these e-resources, whereby
users can link to those articles which they are entitled to access by virtue of their library
memberships and their libraries’ licences.
This access and linking framework will be based on the National Library of Australia’s new
discovery service, known as Trove.
To support this user-centric approach, the National Library will develop a set of partnerships
with e-resource vendors, whereby:

article-level metadata including the vendor’s article URL, and in some cases full-text
articles for indexing, will be provided by the vendor to the NLA for inclusion in Trove;

the vendor will also supply data about which articles are in which products and which
products are licensed by which libraries;

users of Trove will be encouraged to register with Trove and to provide information to
Trove about which libraries they are affiliated with;

Trove will index the article-level metadata (and full text where available) and will use
the subscription and affiliation data to give “Available online” status to those articles
which the user is entitled to click through to and read;

Trove will facilitate the process of authenticating the user; and

Trove will refer the user to the vendor’s site, where any remaining authentication and
access to the full text would be managed.
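
As an illustration of how the subscription and affiliation data described above could be combined, the following is a minimal sketch rather than a description of Trove's implementation. The class, field and function names are hypothetical, as are the example library codes and URLs; the only inputs assumed are those listed above (which articles are in which products, which products are licensed by which libraries, and which libraries the user is affiliated with).

```python
from dataclasses import dataclass, field


@dataclass
class SubscriptionData:
    """Vendor-supplied data, as described in the partnership model (names are hypothetical)."""
    # article URL -> names of the vendor products that contain the article
    products_by_article: dict[str, set[str]] = field(default_factory=dict)
    # product name -> codes of the libraries licensed for that product
    libraries_by_product: dict[str, set[str]] = field(default_factory=dict)


def entitling_libraries(article_url, user_libraries, data):
    """Return the user's affiliated libraries that license a product containing the article."""
    entitled = set()
    for product in data.products_by_article.get(article_url, set()):
        licensed = data.libraries_by_product.get(product, set())
        entitled |= licensed & set(user_libraries)
    return entitled


def is_available_online(article_url, user_libraries, data):
    """True if at least one affiliated library entitles the user to the article."""
    return bool(entitling_libraries(article_url, user_libraries, data))


# Example with made-up values
data = SubscriptionData(
    products_by_article={"https://vendor.example/articles/1": {"Product A"}},
    libraries_by_product={"Product A": {"LIB1", "LIB2"}},
)
```

A user affiliated with LIB2 would see the article marked “Available online”; a user with no recorded affiliations would not.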
To date two vendors (Cengage Gale and RMIT Publishing) have agreed to work with the
National Library to expose their e-resource content in Trove in accordance with the scenario
above.
The remainder of this paper presents a draft model for how the linking and authentication
process could operate. This model is offered in order to gain the feedback of the Open
Borders Project Group. Comments and suggestions for improving the model are welcomed.
In this model, five use cases have been identified: these are described below.
Please note that the National Library has planned to undertake these extensions to Trove
during the second half of 2010. However, at this stage the Library is not able to make a firm
commitment to implementing all five of the cases below by the end of 2010.
CASE 1. Trove has no information about what library, if any, the user is affiliated with.
Linking process

An article of interest is discovered by the Trove user after applying the facet “Online –
Access conditions”. The user clicks on the article details in the result set

Trove informs the user that if they registered with Trove and established a profile
identifying their affiliated libraries, Trove may be able to offer them free access courtesy
of those libraries. [Trove may do this via a mouse-over text note, or a “help” icon next
to the link which pops up a box explaining this, or an intermediate screen which explains
the situation and a “continue” button which leads to the vendor’s pay-per-view page]

If the user supplies such details – see Cases 2-5 below

If not, Trove refers the user to the vendor site’s “pay per view” page, passing the URL of
the article as a parameter. Access to the PDF of the article will be provided after the
user supplies valid credit card details.
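
The final step above, passing the article URL as a parameter to the vendor's pay-per-view page, could look something like the sketch below. The parameter name `article` and the example URLs are assumptions made for illustration; the actual format would need to be agreed with each vendor.

```python
from urllib.parse import urlencode


def pay_per_view_link(ppv_page_url: str, article_url: str) -> str:
    """Build a referral link to a vendor pay-per-view page.

    The query parameter name "article" is hypothetical; the article URL is
    percent-encoded so that it survives as a single parameter value.
    """
    return f"{ppv_page_url}?{urlencode({'article': article_url})}"


link = pay_per_view_link(
    "https://vendor.example/ppv",
    "https://vendor.example/articles/123",
)
```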
Caveats

It is assumed that the e-resource vendor has a “pay per view” option (as does RMIT
Publishing).
CASE 2. Trove has information about the user’s library affiliations, but none of the
affiliated libraries subscribe to a product containing the article
The linking process for this case is identical to that for Case 1, except that Trove will not
inform the user about the benefits of providing the affiliation information.
Caveats

In some cases Trove’s knowledge of the vendor’s subscribing libraries will be out of
date, or its knowledge of IP address ranges within libraries will be out of date. In some
cases, therefore, the user may get free access instead of pay-per-view access.
CASE 3. Trove has information about the user’s library affiliations, at least one of these
libraries subscribes to a product containing the article, and the user can be IP
authenticated as having onsite access privileges
Linking process

An article of interest is discovered by the Trove user after applying the facet “Online –
Freely available”. The user clicks on the article details in the result set

Trove provides an intermediate page informing the user of which of their affiliated
libraries can provide access to this article, and asks the user to select a library

Trove refers the user to the vendor site, where the IP address of the library will be
verified, and the user will be given access to the PDF of the article without further
authentication.
Caveats

Some users who are onsite may not be affiliated with the library (they may be “walk-in”
users). In some cases these users may be entitled to access the article.

Some universities require users to log on with a student or staff id and password in order
to access e-resources, whether they are onsite or not. Such cases will be handled as per
Case 4 below.

In some cases it will not be possible to determine with certainty that the user is onsite:
o To infer onsite status, Trove will need to keep track of IP address ranges for
  Australian libraries and individual library branches and campuses. Some of this
  information may be incorrect or out of date.
o For some IP address ranges, Trove can do name lookups to convert the IP address
  into a name and extract the domain name. For example, any IP address which
  resolves to a domain name ending “.anu.edu.au” may be automatically (regardless
  of user preferences) associated with the Australian National University library.
o For some IP addresses which do not resolve to names, Trove can see who owns the
  network containing them, and could similarly infer the associated library.
o For some libraries, especially public libraries, the ISP arrangement may be such as
  to make it impossible to determine a domain name that reflects the library.
o For libraries with an OPAC, the library server will usually have a “proper” and
  permanent name, but the “in library” public-use network may be unrelated to this.
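
The onsite inference described in these caveats could be sketched roughly as follows: first check the address against recorded library IP ranges, then fall back to a reverse name lookup and a domain-suffix match. This is an illustration only. The registry contents are invented (the address range is a reserved documentation range, not a real ANU allocation), the function names are hypothetical, and the network-ownership lookup mentioned above is not covered.

```python
import ipaddress
import socket

# Hypothetical registry: library code -> recorded IP ranges (documentation range used here)
LIBRARY_IP_RANGES = {
    "ANU": [ipaddress.ip_network("192.0.2.0/24")],
}

# Hypothetical registry: domain-name suffix -> library code
LIBRARY_DOMAIN_SUFFIXES = {
    ".anu.edu.au": "ANU",
}


def reverse_lookup(ip):
    """Return the host name an IP address resolves to, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def infer_onsite_library(ip, lookup=reverse_lookup):
    """Guess which library (if any) the request comes from; None means 'cannot tell'."""
    addr = ipaddress.ip_address(ip)
    # 1. Recorded IP ranges for libraries, branches and campuses
    for code, networks in LIBRARY_IP_RANGES.items():
        if any(addr in net for net in networks):
            return code
    # 2. Reverse name lookup, then match on the domain-name suffix
    hostname = lookup(ip)
    if hostname:
        hostname = hostname.lower().rstrip(".")
        for suffix, code in LIBRARY_DOMAIN_SUFFIXES.items():
            if hostname.endswith(suffix):
                return code
    return None
```

The `lookup` argument is there so that the name resolution can be replaced, for example when the registry is being tested without network access. A result of None does not show that the user is offsite, only that neither data source could place them.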
CASE 4. Trove has information about the user’s library affiliations, at least one of these
libraries subscribes to a product containing the article, the user is not onsite at that
library, but the library has an EZproxy server.
Linking process

An article of interest is discovered by the Trove user after applying the facet “Online –
Freely available”. The user clicks on the article details in the result set

Trove checks its directory databases and finds that at least one of the user’s affiliated
libraries subscribes to a product containing the article, and has an EZproxy server

Trove provides an intermediate page informing the user of these affiliated libraries, and
asks the user to select a library

Trove creates a link to the relevant EZproxy server, passing the article URL as a
parameter

The user enters their credentials and the EZproxy server authenticates the user

The EZproxy server redirects the user to the article URL

The vendor site trusts the referrer, given that it can verify the address of the EZproxy
server, and the user will be given access to the PDF of the article.
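
The link that Trove creates in this process might be built as in the sketch below. It assumes the library's EZproxy server accepts the commonly used `login?url=` starting-point form, with the article URL appended unencoded; individual servers may be configured differently, so the real format would have to come from per-library configuration data. The host names are made up.

```python
def ezproxy_link(ezproxy_base: str, article_url: str) -> str:
    """Build a link that sends the user to a library's EZproxy login with the article as target.

    Assumes the "login?url=" starting-point form; this is not guaranteed for every server.
    """
    return f"{ezproxy_base.rstrip('/')}/login?url={article_url}"


link = ezproxy_link(
    "https://ezproxy.library.example.au/",
    "https://vendor.example/articles/123",
)
```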
Caveats

Some libraries, rather than using EZproxy, use a “simple authentication page” which
redirects the user with a referrer. This case implies that the vendor is willing to trust the
referrer header as an indicator that the user really does come from the customer-library.
This alternative within Case 4 is not recommended because it is not secure (ie, the
referrer header can be “spoofed”).
CASE 5. Trove has information about the user’s library affiliations, at least one of these
libraries subscribes to a product containing the article, the user is not onsite at that
library, and the library does not have an EZproxy server.
Linking process

An article of interest is discovered by the Trove user after applying the facet “Online –
Freely available”. The user clicks on the article details in the result set.

Trove checks its directory databases and finds that at least one of the user’s affiliated
libraries subscribes to a product containing the article, that none of these libraries has an
EZproxy server, but that some of these libraries are listed in the directory of library login
pages

Trove provides an intermediate page informing the user of which of their affiliated
libraries can provide access to this article, and asks the user to select a library

The user is presented with a Trove login screen which requests local login information
for that library. This screen will include:
o the library name and perhaps its logo (to help trigger context)
o a sample picture of the borrower card this library issues
o instructions for completing the credentials, which may vary, eg:
    o user name, password
    o borrower-id, pin
    o barcode, surname

The user enters their credentials

Trove attempts to validate these details by “pretending to be a human being” and
entering the login credentials at the real library login page

If this login is successful (ie the library website isn’t down and the credentials are
accepted), one of the following three actions occurs, depending on the arrangements
agreed with the vendor:
o Behind the scenes (ie hidden from the user) Trove connects to the vendor’s site,
  using the URL representing the article, and providing the user’s library
  information as the vendor’s customer library code, which Trove has derived from
  its own library code. (There will be a new session for every request. The vendor
  will have previously agreed to permit and trust sessions that originate from Trove,
  and will expect such sessions to be started in such a way as to tell the vendor
  which customer the request is being made on behalf of.) The PDF of the article
  is obtained in this “behind the scenes” process, and it is then returned to the user
  by Trove.
o Trove analyses the page that is obtained from the vendor’s site to find the link to
  the PDF, and then issues another HTTP request to obtain the PDF and return it to
  the user.
o The user is referred to the vendor web site, starting a session for the user’s
  library-customer. (In this case, the vendor must trust Trove as a referrer, and the
  referring URL must contain information which determines which library-customer this
  session is to be run on behalf of.)
Caveats

The user’s access to the article will be dependent on the existence of a database of
Australian library login web addresses and screens. Each entry in this database will give
a “category” to the web page which appears following the login – this category usually
relates to the type of library ILMS

The vendor site may experience timeouts on its sessions, which may leave the user
“stranded”

Case 4 will always be preferred to Case 5. Case 4 gives control to the library and means
that Trove does not have to handle userids and passwords for other organisations.
However, Case 5 may be all that is available for most public libraries

There are two security issues with this process: (a) the National Library is handling third
party authentication credentials, and (b) in the case of the third alternative above where
the NLA referrer URL is trusted, it can be easily spoofed or forged
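
Taken together, the five cases amount to a decision procedure. The sketch below is one possible reading of it, not part of the draft model itself: the names are hypothetical, and for simplicity it picks the most preferred route across all of the user's affiliated libraries, whereas the model asks the user to select a library on an intermediate page. The final fallback, for a subscribing library with neither an EZproxy server nor a listed login page, is not covered by the paper and is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Case(Enum):
    NO_AFFILIATION = 1   # Case 1: no affiliation known; offer registration, else pay per view
    NOT_SUBSCRIBED = 2   # Case 2: affiliations known, none subscribes; pay per view
    ONSITE_IP = 3        # Case 3: user appears to be onsite at a subscribing library
    EZPROXY = 4          # Case 4: offsite, subscribing library has an EZproxy server
    LIBRARY_LOGIN = 5    # Case 5: offsite, no EZproxy, library login page is listed


@dataclass
class LibraryOption:
    """What Trove knows about one affiliated library for one article (hypothetical fields)."""
    code: str
    subscribes: bool
    user_onsite: bool = False
    has_ezproxy: bool = False
    has_login_page: bool = False


def select_case(affiliations: list[LibraryOption]) -> Case:
    if not affiliations:
        return Case.NO_AFFILIATION
    subscribing = [lib for lib in affiliations if lib.subscribes]
    if not subscribing:
        return Case.NOT_SUBSCRIBED
    if any(lib.user_onsite for lib in subscribing):
        return Case.ONSITE_IP
    if any(lib.has_ezproxy for lib in subscribing):
        return Case.EZPROXY          # always preferred to Case 5
    if any(lib.has_login_page for lib in subscribing):
        return Case.LIBRARY_LOGIN
    # Assumption: the paper does not cover this situation; fall back to pay per view.
    return Case.NOT_SUBSCRIBED
```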
Summary of data development tasks
The analysis above has revealed that, contrary to some earlier expectations, a significant effort
will be required by the National Library to assist with user authentication. In particular, the
Library will need to:

Create a database of all Australian library EZproxy server addresses with sufficient
configuration information to enable Trove to format article URLs so they will be
correctly handled by the EZproxy server, and also record local library IP allocations to
assist Trove in determining if a user is onsite in a library

Create a database of “short library names”, to help Trove users recognize and select their
library by name

Obtain lists from RMIT Publishing and Gale of all of their Australian library customers
and their customer codes

For all libraries without EZproxy servers, create a database of Australian library login
web addresses and associated information

Make mappings from Trove library codes to the vendor library codes

Work with the Open Borders Project Group to identify mechanisms to allow the above
information, if possible, to be collected and maintained in an efficient and timely manner
for state and public libraries.
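
To make the scope of these tasks more concrete, the sketch below shows one way the information listed above could be held in a single directory record per library. It is an assumption about a possible layout, not an agreed design; all field names and example values (including the vendor customer code) are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LibraryDirectoryEntry:
    """One library's record in a hypothetical Trove authentication directory."""
    trove_code: str                                       # Trove library code
    short_name: str                                       # name shown when users select a library
    ezproxy_url: Optional[str] = None                     # EZproxy server address, if any
    ip_ranges: list[str] = field(default_factory=list)    # local IP allocations (CIDR notation)
    login_url: Optional[str] = None                       # library login page (no EZproxy)
    login_category: Optional[str] = None                  # category of page shown after login
    credential_fields: tuple[str, ...] = ()               # eg ("barcode", "surname")
    vendor_codes: dict[str, str] = field(default_factory=dict)  # vendor name -> customer code


def vendor_customer_code(entry: LibraryDirectoryEntry, vendor: str) -> Optional[str]:
    """Map a Trove library record to the vendor's customer code, if one is recorded."""
    return entry.vendor_codes.get(vendor)


entry = LibraryDirectoryEntry(
    trove_code="XEXL",
    short_name="Example City Library",
    login_url="https://library.example.au/login",
    credential_fields=("barcode", "surname"),
    vendor_codes={"Gale": "gale-0001"},
)
```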