Evaluating the Usage of Networked Electronic Resources
Terry Plum
Assistant Dean, Simmons GSLIS
Library Assessment
Technological Educational Institution
Thessaloniki, Greece
June 15, 2005

Why Evaluate Usage of Digital Resources?
• Data-driven decisions
• Justification to patron groups
• Budget justification to external funding sources
• Collection development decisions
• Outputs for performance assessment
• Assessment of service quality
• Outcomes assessment
• Strategic planning

Cost
• Association of Research Libraries (ARL) members paid 215% more per serial unit in 2003 than they did in 1986.
• In 2003, average expenditures for serial subscriptions (all serials, not just scholarly journals) in ARL academic libraries were $5.46 million.
• From 1984 to 2002, business and economics journals increased in price by 423.7%, chemistry and physics journals by 664%, and journals in medicine by 628.7%.

Cost (figure)

Vendor Supplied Data
• Problems
  1. Vendor reports do not provide sufficiently detailed information.
  2. Vendor reports are inconsistent in their application of the definitions of variables.
  3. Vendor reports are not commensurable with each other.
  4. Some vendors do not report anything.
• Practical solutions
  1. Number of logins (sessions) to networked electronic resources
  2. Number of queries (searches) in networked electronic resources
  3. Number of items requested in networked electronic resources
  4. Turnaways (requests that exceed the simultaneous-use limit)
  5. Monthly
  6. Level of effort, both by the vendor and by the library

Vendor Supplied Data
• Project COUNTER - Counting Online Usage of Networked Electronic Resources
  – http://www.projectcounter.org/
• ICOLC – International Coalition of Library Consortia
  – http://www.library.yale.edu/consortia/
• ISO – International Organization for Standardization
  – ISO 11620 Library Performance Indicators
  – http://www.iso.org/
• NISO – National Information Standards Organization
  – NISO Z39.7 Library Statistics
  – http://www.niso.org/

ARL E-Metrics
• ARL E-Metrics, as summarized by Blixrud and Kyrillidou (2003), asks ARL libraries for the following data for measuring use of networked electronic resources, data which most libraries can only provide by collecting and analyzing vendor-supplied transaction data:
• Number of logins (sessions) to networked electronic resources
• Number of queries (searches) in networked electronic resources
• Number of items requested in networked electronic resources

Web Statistics
• Web server log files
  – Transaction - client/server
  – Technical representation of tasks performed by server
• Log files (common)
  – IP address of requesting computer
  – Remote host: name of computer accessing the web server
  – Name of remote user (usually blank)
  – Login of remote user (usually blank)
  – Date

Log Files
• Referrer Log File
  – URL requested from, or referring page
• Agent Log File
  – Browser
  – Operating system
  – Name of spiders or robots used to probe your web site
  – IP address of requesting computer
• Example
  127.0.0.1 - frank [10/Oct/2004:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"

Log files generated by library proxy servers
• Proxy servers, passthrough (clickthrough) servers, and firewalls are based to some degree on an examination of headers.
• A proxy server can examine all requests that pass through it, so it is starting to make sense to put a proxy server in front of all library databases and ejournals.
• Proxy servers are increasingly used as a data collection point for commensurable or comparable data.

What do log files tell us?
• Nothing if they are not analyzed.
  – What pages are requested on your site
  – IP addresses of computers making requests
  – Date and time of requests
  – Success of file transfer
  – Last page a requester visited before coming to your site
  – Search terms which led someone to your site

More log files
• Logs and reports from locally implemented journal article services
• Logs and reports from locally implemented digital library projects
• ILS log files and reports
  – Becoming more interesting with metasearch engines
  – OPAC

ILS Log Files
• OPAC search statistics
  – Number of searches attempted
    • By fields
    • Search terms
    • Null results
  – Print statistics such as items checked out, holds placed, etc.
  – Difficult to track usage of 856 links.

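As a minimal sketch of the analysis step, the example entry shown on the Log Files slide could be split into its fields as follows. The regular expression, the field names, and the `parse_line` helper are illustrative assumptions for the Apache combined log format; they are not taken from any of the log analysis packages named in this talk.

```python
import re
from datetime import datetime

# Assumed pattern for one entry in Apache "combined" log format:
# host ident user [time] "request" status size "referrer" "agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the timestamp, status code, and byte count to typed values.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A size of "-" means no response body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# The example entry from the Log Files slide.
sample = ('127.0.0.1 - frank [10/Oct/2004:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" '
          '"Mozilla/4.08 [en] (Win98; I ;Nav)"')
entry = parse_line(sample)
```

Once entries are parsed into fields like these, counting requests per page, per referring page, or per day is a matter of grouping on the relevant key.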
Log Analysis Software
• Analog
  – http://www.analog.cx/
  – Example
    • http://www.statslab.cam.ac.uk/webstats/stats.html
• http-Analyze
  – http://www.netstore.de/Supply/http-analyze/
• WebTrends
  – http://www.netiq.com/webtrends/default.asp

Log Analysis Software (figure)

Issues with web surveys
• Non-probability
  – Entertainment surveys
  – Self-selected surveys
  – Volunteer panels
• Probability
  – Intercept (every nth)
  – Surveys that obtain respondents from an e-mail request
  – Mixed-mode surveys where one of the options is a Web survey
  – Pre-recruited panels of a particular population as a probability sample

Issues with web surveys
• Research design
  – Coverage error
    • Unequal access to the Internet
    • Internet users differ from non-users
  – Response rate
    • Response representativeness
  – Random sampling and inference
  – Non-respondents

Issues with web surveys
• Mistrust of web surveys
• Vendor data is a census; a web survey is a sample.
• Web surveys are typically associated with user data, not usage data.
• Even when they address usage, web surveys often collect predicted, intended, or remembered usage, not actual usage.
• Web survey forms may appear differently in different browsers.

Networked electronic resources and services - assessment environment
• Resources are accessible from many different web pages and web servers.
• Bookmarks
• The survey data must be collected and commensurable for all networked electronic resources.
• Different authentication methods have to be accommodated, whether the institution uses IP, password, referring URL, or an authentication and access gateway.
• Remote usage has to be measured, regardless of the channel of communication, whether locally implemented proxy server, modem pool, or other institutional service.

MINES strategy
• A representative sampling plan, including sample size, is determined at the outset.
  Typically, there are 48 hours of surveying over 12 months at a medical library and 24 hours a year at a main library.
• Random moment/web-based surveys are employed at each site.
• Participation is usually mandatory, negating non-respondent bias, and is based on actual use in real time.
• Libraries with database-to-web gateways or proxy re-writers offer the most comprehensive networking solution for surveying all networked services users during survey periods.

Web Survey Design Guidelines
• Web survey design guidelines that MINES followed:
  – Presentation
    • Simple text for different browsers: no graphics
      – Different browsers render web pages differently
    • Few questions per screen, or simply few questions
    • Easy to navigate
    • Short and plain
    • No scrolling
    • Clear and encouraging error or warning messages
    • Every question answered in a similar way - consistent
      – Radio buttons, drop downs
    • Introduction page or paragraph
    • Easy to read
      – Must see definitions of sponsored research.
    • Can present questions in response to answers
      – For example, if sponsored research was chosen, could present another survey

How to implement web surveys on library web sites
• Because of the point-of-use requirement, libraries that had a virtual gateway in their web architecture succeeded best.
• Rewriting proxy server
• Database-to-web solutions
• Serials Solutions
• Interestingly, OpenURL solutions are a gateway.

Library web architecture (figure)

Digital Libraries (figure)

Pre-print and post-print servers (figure)

Open Access Journals (figure)

Library web architecture (figure)

What is the future of assessment of networked electronic services?
• The library is responsible for many heterogeneous resources, not just subscriptions.
• A library gateway could position the library to constantly assess usage of its resources.
• This tool will be just one of many, along with LibQUAL+™ and other initiatives.
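
The sampling plan on the MINES strategy slide (24 survey hours a year at a main library, 48 at a medical library, spread over 12 months) could be drawn along the following lines. This is only a sketch under stated assumptions: the two-hour session length, the 8:00 to 22:00 service day, the rule of at most one session per day, and the `draw_sessions` helper are all hypothetical and are not taken from the MINES protocol itself.

```python
import calendar
import random
from datetime import datetime

def draw_sessions(year, hours_per_year, session_hours=2,
                  open_hour=8, close_hour=22, seed=None):
    """Randomly pick survey session start times, spread evenly over 12 months."""
    rng = random.Random(seed)
    total_sessions = hours_per_year // session_hours
    per_month, remainder = divmod(total_sessions, 12)
    if remainder:
        raise ValueError("sessions must divide evenly across 12 months")
    sessions = []
    for month in range(1, 13):
        days_in_month = calendar.monthrange(year, month)[1]
        # Choose distinct days so that sessions in a month never overlap.
        days = rng.sample(range(1, days_in_month + 1), per_month)
        for day in days:
            # Latest start still lets the session end by closing time.
            hour = rng.randrange(open_hour, close_hour - session_hours + 1)
            sessions.append(datetime(year, month, day, hour))
    return sorted(sessions)

# Assumed plans: 24 hours a year (main library), 48 hours a year (medical library).
main_library = draw_sessions(2005, hours_per_year=24, seed=1)
medical_library = draw_sessions(2005, hours_per_year=48, seed=1)
```

Fixing the seed makes the schedule reproducible, so the plan can be determined and documented at the outset, as the slide requires.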