Some thoughts about the access models and authentication practices used by the MIT Library to control availability of licensed content.

01-21-2015
Rich Wenger, MIT Library

I. Context

The Library’s infrastructure for managing access to licensed e-resources is anachronistic, built on assumptions and practices that are far out of date. Access to the majority of our e-resources is based on IP address filtering, i.e. on the assumption that users are physically present on the MIT campus. MIT’s IP address ranges are sent to vendors, who configure their servers to accept HTTP connections from any IP address in these ranges. While the assumption about the geographical location of users may have been largely true at some time in the past, it has become less and less true over time. Legitimate MIT users are now scattered worldwide in all time zones.

The solution to the dispersal of legitimate users beyond the campus network was to implement a proxy server. This aggregates all off-campus network traffic behind one on-campus IP address, giving users a presence within the range of IP addresses that vendors accept.

Several points should be noted about the infrastructure just described.

a. It is a model based on ideas that are so out of date that they nearly cross into parody. These anachronistic ideas are:
   - a user’s geographical location can be inferred from his or her IP address.
   - authentication and authorization decisions can accurately be determined from this derived geographical location.

b. Because the assumptions in (a) are so wildly off the mark, we have fostered a small industry of intentional deception, i.e. proxy servers and VPNs. Both proxy servers and VPNs pretend that users who are globally dispersed are in fact located at some privileged geographical location. We have attached sophisticated authentication and authorization tools to these systems.
Aside from the evident absurdities, these systems have one salient feature: they require users to enter our portals first before they may proceed to the journal or database in question.

c. All “off-campus” network traffic to a vendor’s site is aggregated behind a single IP address (the proxy server) and is “seen” by the vendors’ servers as a single user. If the vendor detects inappropriate use from this IP address and disables it, the entire campus is excluded from access until the matter is resolved.

Meanwhile, the world has changed.

a. Assumptions about users’ geographical location no longer make sense.

b. Users are discovering electronic resources via direct access to search engines and discovery systems. They don’t necessarily enter our portals first.

c. We are licensing a growing amount of streaming content, which is not well suited to handling through a proxy server.

d. Collaboration between universities has increased markedly, and will continue to grow, requiring users from one institution to have an authenticated presence at the collaborating institution. This increasing collaboration will accelerate the need for federated ID management, a practice that we should extend to vendors as well.

e. The exploitation of compromised machines, stolen credentials, and Tor exit nodes as vectors for illegitimate use of licensed resources is increasing sharply. When vendors detect inappropriate use, they block the offending IP address and inform us of the problem. The important point is that the IP address the vendor blocks is our proxy server, interrupting service to thousands of patrons.

II. What we need

a. A much simpler access path that neither assumes the physical presence of users on the MIT campus, nor requires their entrance through our portals as a pre-condition for access to licensed resources.
Users should be able to navigate directly to a vendor’s site, and be referred by the vendor to the appropriate institution for authentication if and when authentication is needed.

b. An access model that enables session-based control instead of IP-based aggregation.

c. An ability to impose authorization requirements on access to licensed resources where needed. A flexible, federated approach to identity management accomplishes this.

III. Questions for discussion

a. Federated identity management tools like Shibboleth have been available for years and have reached a high level of stability. Why have academic libraries been so reluctant to embrace and implement these tools?

b. Proxy server aggregation and IP filtering are dead-end practices that have changed little in two decades. Why are so many of us content to plod along with them? What is preventing us from updating our practices to something more appropriate to the electronic environment in which we find ourselves?

c. Are there other solutions in addition to federated ID management that should be considered?

d. One small step we could take to simplify the landscape a bit would be to make unconditional authentication (everybody has to log in, including on-campus people) the default setup. It appears that very few of us do this. Why?

e. How can university libraries and IT departments collaborate to standardize and promote updated practices?
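
Illustration

The contrast drawn in sections I and II between IP-based aggregation and session-based control can be made concrete with a small sketch. It is an illustration only, not a description of any vendor's actual system. The class names, the user names, and the addresses are all assumptions made for the example; the address range comes from the block reserved for documentation and is not one of MIT's real ranges.

```python
import ipaddress

# Hypothetical campus range (a reserved documentation block, not a real MIT range).
CAMPUS_RANGES = [ipaddress.ip_network("192.0.2.0/24")]
# Hypothetical on-campus address of the library proxy server.
PROXY_IP = "192.0.2.10"


class IpFilteringVendor:
    """Vendor that admits any request whose source IP is in a licensed range."""

    def __init__(self, allowed_ranges):
        self.allowed_ranges = list(allowed_ranges)
        self.blocked_ips = set()

    def allows(self, source_ip):
        ip = ipaddress.ip_address(source_ip)
        if ip in self.blocked_ips:
            return False
        return any(ip in net for net in self.allowed_ranges)

    def block(self, source_ip):
        self.blocked_ips.add(ipaddress.ip_address(source_ip))


class SessionVendor:
    """Vendor that admits requests by authenticated user session, not source IP."""

    def __init__(self):
        self.blocked_users = set()

    def allows(self, user_id):
        return user_id not in self.blocked_users

    def block(self, user_id):
        self.blocked_users.add(user_id)


def via_proxy(_user_id):
    """Every off-campus user reaches the vendor from the same proxy address."""
    return PROXY_IP


off_campus_users = ["alice", "bob", "mallory"]

# IP model: misuse by one user leads the vendor to block the proxy address.
ip_vendor = IpFilteringVendor(CAMPUS_RANGES)
ip_vendor.block(via_proxy("mallory"))
ip_results = {u: ip_vendor.allows(via_proxy(u)) for u in off_campus_users}

# Session model: the vendor can block only the offending user.
session_vendor = SessionVendor()
session_vendor.block("mallory")
session_results = {u: session_vendor.allows(u) for u in off_campus_users}

print(ip_results)
print(session_results)
```

Under the IP model all three off-campus users lose access once the proxy address is blocked, because the vendor cannot tell them apart. Under the session model only the offending user is excluded. A federated setup would supply the authenticated identity that the session model assumes here.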