Analytics @ Lancaster University Library
IGeLU 2014
John Krug, Systems and Analytics Manager, Lancaster University Library
http://www.slideshare.net/jhkrug/igelu-analytics-2014

Lancaster University, the Library and Alma
• We are in Lancaster in the UK North West.
• ~12,000 FTE students, ~2,300 FTE staff
• Library has 55 FTE staff, building refurbishment in progress
• University aims to be 10, 100 – Research, Teaching, Engagement
• Global outlook with partnerships in Malaysia, India, Pakistan and a new Ghana campus
• Alma implemented January 2013 as an early adopter.
• I am Systems and Analytics Manager, at LUL since 2002 (joined to implement Aleph) – systems background, not library
• How can library analytics help?

Alma Analytics reporting and dashboards
• Following implementation of Alma, analytics dashboards were rapidly developed for common reporting tasks
• Ongoing work in this area, refining existing reports and developing new ones

Results

Fun with BLISS
B Floor 9AZ (B)
347 lines of this!

Projects & Challenges
• LDIV – Library Data, Information & Visualisation
• ETL experiments done using PostgreSQL and Python
• Data from Aleph, Alma, Ezproxy, etc.
• Smaller projects:
  • e.g. Re-shelving performance – required combining Alma Analytics returns data with the number of trolleys re-shelved daily.
• Challenges – infrastructure, skills, time
• Lots of new skills/knowledge needed for analytics. For us: Alma Analytics (OBIEE), Python, Django, PostgreSQL, Tableau, nginx, OpenResty, Lua, JSON, XML, XSL, statistics, data preparation, ETL, etc.

Alma analytics data extraction
• Requires using a SOAP API (thankfully a RESTful API is now available for Analytics)
• SOAP support in Python is not very good, much better with REST. Currently using the suds Python library with a few bug fixes for compression, ‘&’ encoding, etc.
• A script, get_analytics, invokes the required report, manages collection of multiple ‘gets’ if the data is large and produces a single XML file as the result.
• Needs porting from SOAP to REST.
• Data extraction from Alma Analytics is straightforward, especially with REST

Data from other places
• Ezproxy logs
• Enquiry/exit desk query statistics
• Re-shelving performance data
• Shibboleth logs, hopefully soon. We are dependent on central IT services
• Library building usage counts
• Library PC usage statistics
• JUSP & USTAT aggregate usage data
• University faculty and department data
• Social networking
• New Alma Analytics subject areas, especially uResolver data

Gaps in the electronic resource picture
• Currently we have aggregate data from JUSP, USTAT
• Partial off-campus picture from ezproxy, but web orientated rather than resource orientated
• Really want the data from Shibboleth and uResolver
• Why the demand for such low-level data about individuals?

The library and learner analytics
• Learner analytics is a growth field
• Driven by a mass of data from VLEs and MOOCs … and libraries
• Student satisfaction & retention
• Intervention(?)
  • if low(library borrowing) & low(eresource access) & high(rate of near late or late submissions) & low_to_middling(grades) then do_something()
• The library can’t do all that, but the university could/can
• Library can provide data

The library as data provider
• LAMP – Library Analytics & Metrics Project from JISC
  • http://jisclamp.mimas.ac.uk
  • We will be exporting loan and anonymised student data for use by LAMP.
  • They are experimenting with dashboards and applications
  • Prototype application later this year.
  • Overlap with our own project LDIV
• The Library API
  • For use by analytics projects within the university
  • Planning office, Student Services and others

The Library API
• Built using openresty, nginx, lua
• REST-like API interface
• e.g. Retrieve physical loans for a patron
• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml (or json)

    <?xml version="1.0" encoding="UTF-8"?>
    <response>
      <record>
        <call_no>AZKF.S75 (H)</call_no>
        <loan_date>2014-07-10 15:44:00</loan_date>
        <num_renewals>0</num_renewals>
        <bor_status>03</bor_status>
        <rowid>3212</rowid>
        <returned_date>2014-08-15 10:16:00</returned_date>
        <collection>MAIN</collection>
        <rownum>1</rownum>
        <material>BOOK</material>
        <patron>b3ea5253dd4877c94fa9fac9</patron>
        <item_status>01</item_status>
        <call_no_2>B Floor Red Zone</call_no_2>
        <bor_type>34</bor_type>
        <key>000473908000010-200208151016173</key>
        <due_date>2015-06-19 19:00:00</due_date>
      </record>
    </response>

    [{
      "rownum": 1,
      "key": "000473908000010-200208151016173",
      "patron": "b3ea5253dd4877c94fa9fac9",
      "loan_date": "2014-07-10 15:44:00",
      "due_date": "2015-06-19 19:00:00",
      "returned_date": "2014-08-15 10:16:00",
      "item_status": "01",
      "num_renewals": 0,
      "material": "BOOK",
      "bor_status": "03",
      "bor_type": "34",
      "call_no": "AZKF.S75 (H)",
      "call_no_2": "B Floor Red Zone",
      "collection": "MAIN",
      "rowid": 3212
    }]

How does it work?
• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml
• Nginx configuration maps REST url to database query

    # Public location: capture the patron id, then rewrite to the paged location
    location ~ /ploans/(?<patron>\w+) {
        ## collect and/or set default parameters
        rewrite ^ /ploans_paged/$patron:$start:$nrows.$fmt;
    }

    # Paged location: run the query and return the rows as JSON
    location ~ /ploans_paged/(?<patron>\w+):(?<start>\d+):(?<nrows>\d+)\.json {
        postgres_pass database;
        rds_json on;
        postgres_query HEAD GET "
            select * from ploans
            where patron = $patron
            and row >= $start and row < $start + $nrows";
    }

Proxy for making Alma Analytics API requests
• e.g. Analytics report which produces [output not preserved in this text]
• nginx configuration

    location /aa/patron_count {
        set $b "api-na.hosted.exlibri … lytics/reports";
        set $p "path=%2Fshared%2FLancas … tron_count";
        set $k "apikey=l7xx6c0b1f6188514e388cb361dea3795e73";
        proxy_pass https://$b?$p&$k;
    }

• So users of our API can get data directly from Alma Analytics, while we manage the interface they use and shield them from any API changes at Ex Libris.

Re-thinking approaches
• Requirements workshops
• Application development
• Data provider via API interfaces
• RDF/SPARQL capability
• LDIV – Library Data, Information and Visualisation
• Still experimenting
• Imported data from ezproxy logs, GeoIP databases, student data, Primo logs, a small amount of Alma data
• Really need Shibboleth and uResolver data
• Tableau as the dashboard to these data sets

Preliminary results
More at http://public.tableausoftware.com/profile/john.krug#!/
• First UK Analytics SIG meeting Oct 14 following EPUG-UKI AGM
• Questions?
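
Appendix sketch: consuming the Library API from Python
A consumer of the /ploans endpoint described in "The Library API" might look roughly like the sketch below. It is only an illustration, not the code used at Lancaster. The helper names (ploans_url, loan_days, summarise), the default paging values and the handling of a missing returned_date are all assumptions; the host, path, query parameters and the sample record are taken from the slides. No network request is made, so the sample JSON stands in for a real response.

```python
import json
from datetime import datetime
from urllib.parse import urlencode

# Host and port as shown on the slides; treated here as an example only.
BASE_URL = "http://lib-ldiv.lancs.ac.uk:8080"
# Timestamp layout seen in the sample response (loan_date, returned_date, ...).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def ploans_url(patron, start=1, number=10, fmt="json"):
    """Build a /ploans URL. The default start/number values are assumptions."""
    query = urlencode({"start": start, "number": number, "format": fmt})
    return f"{BASE_URL}/ploans/{patron}?{query}"


def loan_days(record):
    """Whole days between loan_date and returned_date.

    Returns None when returned_date is missing or empty; how the real API
    represents an item still on loan is an assumption.
    """
    if not record.get("returned_date"):
        return None
    loaned = datetime.strptime(record["loan_date"], TIMESTAMP_FORMAT)
    returned = datetime.strptime(record["returned_date"], TIMESTAMP_FORMAT)
    return (returned - loaned).days


def summarise(body):
    """Turn a JSON response body into (call_no, material, days_on_loan) tuples."""
    return [(r["call_no"], r["material"], loan_days(r)) for r in json.loads(body)]


# Sample body copied from the JSON response shown on the slides.
SAMPLE = """[{
  "rownum": 1, "key": "000473908000010-200208151016173",
  "patron": "b3ea5253dd4877c94fa9fac9",
  "loan_date": "2014-07-10 15:44:00", "due_date": "2015-06-19 19:00:00",
  "returned_date": "2014-08-15 10:16:00", "item_status": "01",
  "num_renewals": 0, "material": "BOOK", "bor_status": "03",
  "bor_type": "34", "call_no": "AZKF.S75 (H)",
  "call_no_2": "B Floor Red Zone", "collection": "MAIN", "rowid": 3212
}]"""

if __name__ == "__main__":
    print(ploans_url("0010215", start=45, number=1))
    print(summarise(SAMPLE))
```

In real use the body would come from an HTTP GET on the URL built by ploans_url (for example with urllib.request.urlopen) rather than from SAMPLE. Keeping URL building and parsing separate from the fetch makes the sketch testable without access to the service.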