OCLC Strategy & Vision: Building Web-Scale for Libraries

Robin Murray - Vice-President, Global Product Management, OCLC
Robin.Murray@oclc.org

Introduction

This document presents a brief overview of the OCLC strategy and vision for library services, described in the context of ‘Web-scale’ services and their implications for libraries. The introductory sections give a brief overview of what we mean by ‘Web-scale’ and how it relates to libraries. This is followed by an overview of OCLC’s strategy and tactical activities for helping libraries deliver a Web-scale presence – an objective we feel is critical for the future of library services.

Within this document, the specific questions raised in the invitation are covered implicitly rather than explicitly, as follows:

How will data be gathered and how will data be aggregated? – covered in the section on the strategy for incrementally building Web-scale: the gravitational pull of the data aggregation motivates greater aggregation (the data load graph gives some evidence of this).
What services will need to be built on top of the aggregation? – the services OCLC are building on top of the aggregation are summarised in the section on the strategy for incrementally building Web-scale.
What services will this enable for end users (librarians, information managers, researchers, teachers and learners)? – covered very briefly in the strategy section.
How will the aggregation relate to the rest of the web? – covered very briefly in the section on ‘Provide the most compelling web-scale presence for Libraries’.
How will this work with existing services at the local and national level? – covered very briefly in terms of linking to local systems for transactional support and to national services as ‘metadata hubs’.
Will value added services for local provision be provided? – covered briefly in the strategy section and in the description of web-services and APIs.
What are the business models and sustainability issues for data and services? – not covered explicitly within this document.

In the interests of (almost!) meeting the page limit for this document, all these sections are extremely brief; far more detail can be provided on request.

Libraries and Web-Scale

What is Web-Scale?

Scale matters in the Web environment. Think of the services we use most: Google, Amazon, eBay, Facebook. They are massive concentrations of data and connections, and they mobilise large user populations. They serve consumer needs (for search, connection, purchase). They also serve provider needs (through e-commerce, storefront facilities, advertising …). At the same time, we are seeing businesses emerge (Salesforce.com is a good example) that use the Web to deliver infrastructure and enterprise solutions to other businesses on an ‘on-demand’ or ‘utility’ basis.

In these examples, the Web allows capacities to be concentrated to create scale, and to deliver the benefits of scale to many users. Successful organisations and communities on the web have created a virtuous circle by aggregating supply and, through that, aggregating demand. This is at the heart of Web scale. The principle is simple: the Web allows organisations to create scale and to deliver the benefits of that scale to large numbers of other users.

Characteristics of Web-Scale Services

Web-Scale services operate at a “Network Level” (or on a Software-as-a-Service basis) and deliver the typical benefits of cost savings and greater efficiencies (by removing redundant local provision and by creating improved system-wide capacity). However, Web-Scale services go further than this and should provide greater impact through the following characteristics.

Network effects. These are a powerful driver for many services, producing a ‘rich get richer’ dynamic.
The more buyers there are on eBay, the more attractive it is to sellers; the more sellers there are on eBay, the more attractive it is to buyers. The more people use Google, the more sites want to be indexed by it and the more advertisers want to use it. The more people buy from Amazon, the better its recommendations become and the more people want to use it. This dynamic will also become increasingly visible in Web-scale process/workflow providers as they share, build and leverage integrated data and workflows.

Data-driven. Sites collect data about usage behaviour and use this to improve the service. They use shared purchasing patterns, navigation options, recommendations and so on to develop a stronger relationship with individual customers and to refine their offerings.

The long tail: efficient matching of supply and demand at the network level. Web-scale services make the matching of supply and demand more efficient. Consider an online bookstore compared to a physical one. In a physical store, stock is limited by available shelf space, and has to justify its position on the shelves by sales. The potential customer base is limited by the demographics of a geographically proximate population. An online bookstore can aggregate supply from many sources and provide integrated search and delivery over a very large inventory. A successful site can also mobilise a large online population. In this way supply and demand are better matched: less common materials have a better chance of finding a buyer, and specialist buyers have a better chance of finding materials of interest. Sales and satisfaction increase. This is at the heart of the long tail dynamic that finds audiences for niche materials.

Short install/refresh cycles. Software install and refresh cycles are very short, which ensures rapid enhancement of the platform and allows it to stay current in the web environment.
This differs dramatically from the deployed software environment, where users’ applications are typically many years behind current technology.

Web services access. To maximise use of, and value-added enhancements to, the platform, access is provided via Application Program Interfaces (APIs) and Web service interfaces alongside direct user interfaces.

Web-Scale and Libraries

Whilst the descriptions above have used examples of Web-Scale services from the general internet environment, the applicability and benefits of Web-Scale concepts to libraries are clear. It is our contention that libraries NEED to operate at Web-Scale to be successful, and that this is fundamental to their future in the internet age.

However, libraries currently do not scale well in Web terms. They have a very fragmented presence on the Web: their impact is low. Their cost of doing business is high, as they have a complicated array of systems across print, licensed and digital materials workflows. Libraries need to invest time in putting their systems to work, not in trying to get their systems to work. If libraries are to create a compelling Web presence and system-wide management efficiencies, they will have to move to a Web-scale model. They cannot do this individually.

OCLC Strategy: Building Web-Scale for Libraries

OCLC’s Unique Positioning

Given its reach, scale, capacities, and mission, OCLC is uniquely placed to help libraries create Web-scale. WorldCat (and associated services such as the Registry) aggregates data about libraries, about their collections and about their services. WorldCat is distinctive in its scale: as an aggregate it is unsurpassed, and the bibliographic data it collects can be mined to improve the service and create new knowledge. However, WorldCat is not simply a collection of data: WorldCat allows us to configure the library network to support Web-scale services.
It makes connections between bibliographic data and libraries, and between libraries and the services they offer. This increasingly is its distinctive value. It makes the library network work.

Given our belief that Web-Scale is fundamental to the future of libraries; that OCLC is uniquely positioned to help libraries achieve Web-Scale; and the historic mission, vision and values of OCLC, building Web-Scale for libraries is at the heart of OCLC’s vision and strategy.

Incrementally Building Web-Scale for Libraries: The Fly-Wheel

Delivering Web-Scale for libraries is critical, but far from simple; it cannot be achieved in a single step, by a single organisation, or with a single product. Our strategy for helping libraries achieve Web-Scale aims to generate a ‘fly-wheel’ effect (depicted here): a set of mutually reinforcing initiatives that incrementally build Web scale. These initiatives are:

Provide the most compelling web-scale presence for Libraries – a large-scale, compelling presence on the web that drives traffic toward local library services.
Leverage the value of web-scale into end-user solutions for libraries – provide local library views of this web-scale presence that surface and give access to local services.
Syndicate Grid services through partner programs – provide web-service access to the web-scale platform to maximise usage and to encourage an eco-system of development partners and library developers to provide value-added services.
Reduce the cost of Library Management with Grid-enabled systems and services – leverage the economies of scale of network level services to dramatically reduce costs for libraries.
Enrich the WorldCat Grid through greater service and data coverage – by demonstrating the value of web-scale services, generate gravitational pull for greater data and service aggregation, thus adding momentum to the fly-wheel.

To execute this strategy we aim to drive each of these initiatives continually, increasing momentum toward the web-scale presence for libraries.

Executing the Strategy

The following section briefly describes some of the activities being undertaken at each of these points on the fly-wheel:

Provide the most compelling web-scale presence for Libraries – Our strategy is to continue to enhance WorldCat.org as a web-scale destination for library services, but more critically as a global switch into local library services. During 2009 we expect to deliver ~8M click-throughs to library services from the open web. The vast majority of these originate from linking sites such as Google, Yahoo!, Google Scholar and a myriad of partner sites, rather than from users entering the site directly to look for library services. Search-engine optimisation and library-affiliate partnering and linking are therefore key features of the development strategy. Through building momentum on the fly-wheel we aim to achieve 50%-100% p.a. growth in traffic on WorldCat.org.

Leverage the value of web-scale into Local & Regional end-user solutions for libraries – Efficiently delivering users from the open web to local library services requires a smooth transition of user experience; local library OPACs are typically not good at this and present barriers to the transaction flow. WorldCat Local is OCLC’s offering that aims to surface the full range of local library services on the web, tightly linked to and leveraging the value of WorldCat.org. It aims to provide a smooth transition over the ‘last mile’ from the open web to the local library service. WorldCat Local can be tailored and branded for local and/or regional/national service provision. WorldCat Local provides a localized view of the library catalogue and links to the local circulation system for real-time availability and transaction management.
It leverages ‘Web-Scale’ by providing contextualized access to the whole of WorldCat and by benefiting from global aggregation for services such as reviews, ratings and community-building features. 2009 will see the inclusion of licensed database metadata within the WorldCat index, along with integrated meta-search, in order to get closer to surfacing the full range of library services. The launch of WorldCat Local ‘quick start’, freely available as part of the OCLC FirstSearch subscription, is a strategic drive to maximise uptake, building momentum in the fly-wheel in order to incrementally achieve web-scale for libraries.

Syndicate Grid services through partner programs – The strategy for OCLC services is that even those libraries not using OCLC applications should be able to gain the benefits of the web-scale platform through the use of APIs and Web services. Those libraries that do use OCLC applications can also use these interfaces to derive value-added enhancements. The launch of the WorldCat API and related services, along with the OCLC Developer Network, marks the start of the strategy to broaden the use of the platform beyond OCLC-derived applications.

Reduce the cost of Library Management with Grid-enabled systems and services – A full suite of management applications is being developed that operate at the network level, leverage network effects, and dramatically reduce costs. As above, libraries will be able to use the native applications or interface with a set of APIs.

Enrich WorldCat through greater service and data coverage – As the scale and richness of the WorldCat services increase, the gravitational pull of the aggregation increases, and thus more libraries wish to benefit from that scale. The chart shows the annual volume of data loading into WorldCat for the last 10 years. The dramatic growth gives some evidence of the fly-wheel effect of incrementally building web-scale.
As data volumes increase, the logistics of data loading, data quality control and ongoing synchronisation become an ever-growing challenge. The technological capabilities of libraries vary so dramatically that many different tools and techniques are used for data loading and synchronisation. Increasingly, regional and national union catalogues are playing the role of ‘metadata hubs’, which synchronise with local catalogues and perform regional/national quality control. These are then synchronised (either periodically or in real-time) upstream to WorldCat. The WorldCat database is then indexed directly by the Internet search engines.
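
The ‘metadata hub’ arrangement described above can be pictured as a two-stage synchronisation: local catalogues feed a regional or national hub, which applies quality control and then feeds the global aggregate. The following is a minimal sketch of that flow in Python, offered for illustration only. It is not OCLC’s implementation: the `Catalogue` class, the `synchronise` function and the title-based quality check are all hypothetical names and rules invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Catalogue:
    """A catalogue at any tier: local library, regional/national hub, or global aggregate."""
    name: str
    records: dict[str, dict] = field(default_factory=dict)  # record id -> metadata

    def upsert(self, record_id: str, metadata: dict) -> None:
        """Insert a record, or overwrite it if the id already exists."""
        self.records[record_id] = dict(metadata)


def passes_quality_control(metadata: dict) -> bool:
    """Hypothetical hub-level rule: a record must carry a non-empty title."""
    return bool(str(metadata.get("title") or "").strip())


def synchronise(
    source: Catalogue,
    target: Catalogue,
    check: Callable[[dict], bool] = lambda metadata: True,
) -> int:
    """Push new or changed records upstream; return the number of records written."""
    written = 0
    for record_id, metadata in source.records.items():
        if not check(metadata):
            continue  # rejected by quality control, never reaches the target
        if target.records.get(record_id) == metadata:
            continue  # target already holds an identical copy
        target.upsert(record_id, metadata)
        written += 1
    return written


if __name__ == "__main__":
    library = Catalogue("Local library")
    library.upsert("rec-1", {"title": "Moby Dick"})
    library.upsert("rec-2", {"title": ""})  # will fail the hub's quality check

    hub = Catalogue("National metadata hub")
    aggregate = Catalogue("Global aggregate")

    # Stage 1: local catalogue to hub, with quality control applied at the hub.
    print("written to hub:", synchronise(library, hub, passes_quality_control))
    # Stage 2: hub to the global aggregate, run periodically or on each change.
    print("written upstream:", synchronise(hub, aggregate))
```

Running `synchronise` a second time without changes writes nothing, which is the property that lets the same routine serve both periodic and near-real-time synchronisation in this sketch.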