Program Director/Principal Investigator (Last, First, Middle): Conlon, Michael

VIVO: Enabling National Networking of Scientists

A. A Semantic Approach to Research Networking

This application proposes a solution to facilitate research networking and collaboration among basic, clinical, and translational researchers, including investigators, students, technical staff and others. The Semantic Web/Linked Data approach we envision makes it possible to implement locally controlled researcher network installations that interoperate to create a flexible and scalable multi-institutional network. Although we focus solely on the researcher network for this project, our platform has the capacity to transparently include and interrelate resource listings and other relevant information. Our technology choice allows us to easily consume, integrate and expose data hosted by partners who have other research network or resource discovery platforms in place.

B. Rationale and Approach

B.1. Rationale

We propose an open, Semantic Web-based network of local ontology-driven databases called VIVO to enable national networking via information sharing about researchers and their activities. VIVO will draw on, as well as contribute to, other web-accessible services and tools. The Semantic Web [1] enables automated and human navigation to represent and mine digital data, and it supports interoperability and integration of data from a variety of sources [2]. Recently, many of the larger goals of the Semantic Web have started to be realized, particularly in the new Linked Data [3] effort.

Figure 1. Discover Cornell VIVO Interface

We have 5 years of experience with VIVO (see Figure 1), a real-world Semantic Web application developed at Cornell University in Ithaca (Cornell), and currently in use at Cornell and as GatorScholar [4] at the University of Florida (UF).
VIVO facilitates research discovery and networking and demonstrates that Semantic Web technology is ready to serve as the foundation for enabling national networking of scientists, providing significant benefits for describing inter-linked data in flexible and openly accessible ways.

A significant portion of ongoing and proposed technical innovation related to biomedical research revolves around the goal of facilitating the sharing of data and other information and resources while enhancing collaboration among researchers across a variety of disciplines. For many researchers the geographical and organizational confines of a department, college, or even a single university bear very little relevance to the scope of their research or the pool of colleagues they may seek for collaboration. Researchers are often left to find their own paths to discover current activities and active researchers in their field and beyond, usually through a combination of personal connection, disciplinary knowledge, and fortuitous discovery through search engines. This leaves those who have yet to develop their own network of personal contacts at a significant disadvantage.

PHS 398/2590 (Rev. 11/07) Page 1 Continuation Format Page

A number of social networking tools attempt to facilitate interpersonal connection by providing a local, national or even global platform to post and link profiles, pictures, ideas and comments. Most of these platforms are closed worlds that often do not support direct interaction with other systems. Technology and marketplace transitions dictate that services and data available today may no longer be freely available tomorrow. No single tool or service is likely to maintain a consistent leadership position, so the collective investment in information held by any one of them risks being lost in favor of the next popular interface or feature set.
Biomedical and translational institutions and programs face challenges similar to those of researchers in presenting a clear picture of their biomedical teaching and research capabilities, both internally and to the outside world. These institutions seek to encourage cross-disciplinary collaborations but rarely provide any venue to support discovery and nurture person-to-person connections. There is often a disconnect between functional areas, with most resources allocated to defining administrative, instructional and research computing needs rather than to the evolving nature of research. The problem is even more severe when looking beyond one institution to understand patterns or trends or to identify specific expertise; scientific information is rarely provided with any consistency except within narrow disciplinary confines. We must be able to communicate diverse activities, expertise, outcomes, and resources in ways that can be understood nationally and even globally, not just in a local context.

Figure 2. Local VIVOweb instances interlinked with each other and the Semantic Web

In this fluid landscape, the key element is how to combine authoritative information from its most local context into a coherent, large-scale picture that will meet the needs of research teams, institutions, and cross-institutional views. VIVO enables a web (VIVOweb) of researcher data that will catalyze and accelerate the creation of connections between researchers to meet these needs. VIVOweb will empower researchers to find information about people of professional interest and extend their research communities not just via prior knowledge or serendipity, but through recommendation or suggestion networks based on commonalities in the profile data.
The most fruitful way of promoting researcher networking and discovery at the individual, institutional and national levels is to provide authoritative data from and about researchers themselves, and about other related institutional resources, in an open and consistent format. This is what we currently support with VIVO, and are proposing for VIVOweb. These data will be described using explicit semantic relationships, and published on the Semantic Web according to accepted Linked Data standards. We will also make these data available on the human web through locally managed institutional portals that allow researchers to directly browse and search these data within or across institutions. VIVOweb does not attempt to re-invent collaborative tools such as wikis and blogs, or to require that any single one be adopted globally, acknowledging the plethora of established and emerging popular platforms. Instead, it focuses on enabling users to discover each other via networks based on common interests and other direct or indirect connections, incorporating and sharing structured data with other tools as appropriate.

As of May 2009 the Linked Open Data initiative offers nearly five billion data element links represented as “triples” of the form (object, relationship, object), for example (person, co-authored with, person) or (person, published, paper). Many of these triples represent biomedically relevant genome, gene expression, protein and pathway data [5]. This number continues to grow, and VIVOweb’s semantic approach allows it to easily consume these data to enrich researcher profiles, while also interoperating with these and other data sources and making content available for immediate consumption.
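The triple form described above can be illustrated with a minimal Python sketch. The identifiers (`person:a`, `paper:1`) and the helper names `objects_of` and `shared_papers` are hypothetical and chosen only for illustration; they are not drawn from VIVO or from the Linked Open Data cloud, and a real installation would hold such statements in an RDF store rather than in in-memory tuples.

```python
from collections import defaultdict

# Hypothetical triples of the form (object, relationship, object).
# The identifiers are illustrative only.
TRIPLES = [
    ("person:a", "co-authored with", "person:b"),
    ("person:a", "published", "paper:1"),
    ("person:b", "published", "paper:1"),
    ("person:c", "published", "paper:2"),
]


def objects_of(triples, subject, relationship):
    """Return every object linked to `subject` by `relationship`."""
    return [o for s, r, o in triples if s == subject and r == relationship]


def shared_papers(triples):
    """Group authors by paper to surface implicit co-authorship."""
    authors = defaultdict(set)
    for s, r, o in triples:
        if r == "published":
            authors[o].add(s)
    # Keep only papers with more than one author.
    return {paper: sorted(people)
            for paper, people in authors.items() if len(people) > 1}


if __name__ == "__main__":
    print(objects_of(TRIPLES, "person:a", "published"))
    print(shared_papers(TRIPLES))
```

Because every statement has the same three-part shape, new relationship types can be added as data without changing the query helpers, which is the property the semantic approach relies on.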
A number of publications also suggest that there has been great innovation and interest in Semantic Web applications to facilitate research in a variety of areas within the biomedical and life sciences communities, including genetic and drug efficacy analyses and clinical and molecular dataset management, particularly from the perspective of strengthening translational research, a key goal of the National Institutes of Health Roadmap for Medical Research [6,7,8,9,10]. The richness of the literature on the utility of Semantic Web technologies to further biomedical research leaves little doubt that the application we propose here to enable research networking is not only timely, but the most assured path to long-term utility and participation by individuals and institutions.

Interoperable, enhanced VIVO installations, managed locally but offering cross-institutional searching, browsing, and other capabilities, will form VIVOweb, which will grow as a natural extension of Cornell’s VIVO application from a single, multi-campus research discovery tool to a distributed network. Proof of concept for VIVO’s value in a variety of settings and languages is provided by active installations at UF (GatorScholar [11]), the University of Melbourne (Find an Expert [12]), and the Chinese Academy of Sciences (Southwest China Biodiversity portal [13]).

VIVO was initially developed to enable the discovery of researchers and resources in the life sciences across Cornell’s complicated administrative landscape of disciplines, departments, centers, colleges and campuses. VIVO integrates public content about researchers from a variety of authoritative databases at the university and also allows individuals to log in using their Cornell net ID and password to modify their own profiles. VIVO also promotes resource discovery across Cornell, including facilities, equipment and research-related services such as databases and sample collections, workshops and seminars.
All data in VIVO are available for easy consumption by other web pages or services.

To ensure adoption, usage, maintenance, and post-funding sustenance of VIVOweb at the individual, institutional, and national levels, we propose technical innovation and support coupled with a community-focused approach that provides a high-value product to institutions through local installation and control of each VIVO platform. The involvement of appropriately skilled information management specialists from libraries, as well as researchers, administrators and IT personnel from all partner institutions, including recipients of Clinical and Translational Science Awards, will also contribute to the success of VIVOweb. Finally, the governance structure we envision will contribute further oversight and direction. It includes Scientific, Technical, and Executive Advisory Boards, the last consisting of personnel and researchers from the NIH as well as other major research universities.

B.2. Approach

The VIVO research networking platform currently installed at Cornell and UF will be extended and enhanced to address needs at the individual, institutional, and national levels. Modifications will create a more complete institutional research discovery tool with a variety of new capabilities, including the creation of active personal and team networks through the application of social networking tools, and the production of semantically rich data to integrate, analyze, visualize and distribute at the national level.

B.2.a. VIVO Platform at Cornell and Florida

Figure 3. Drupal test portal driven by VIVO content

VIVO was developed by the Cornell University Library beginning in 2003 to meet individual and institutional research discovery needs and already addresses many areas of importance to researchers.
VIVO supports ontology as well as content editing, and is also a simple content management system that enables the representation of the resulting structured information in web pages. It uses the standard Resource Description Framework (RDF) Semantic Web data model and Web Ontology Language (OWL) schemas that identify distinct types of data and define properties to connect these data with consistent, bi-directional relationships. For example, the profile of a person includes simple text attributes such as name, title and statements of research or teaching interests, but extends much further to include affiliation, activity and outcome relationships to departments, research grants, publications, talks, courses, research areas and geographic areas. Each of these entities is a defined type in its own right that may in turn have its own relationships to other people or to a funding agency, event sponsor, research center or topic.

The VIVO installations at Cornell and UF aim not only to automate the harvesting of information from a variety of authoritative sources into a common institutional resource, but also to make data or profiles available for consumption or display by web sites and services across the university. While the accumulation of content, both entities and the relationships between them, initially depended largely on manual entry of information by librarian and staff editors, currency and accuracy concerns have prompted integration into the institutional information technology framework via automated data ingest procedures that are already utilized at Cornell, and soon to be implemented at UF.

Figure 4. Graduate Programs in the Life Sciences, powered by dynamic queries from VIVO
Data currently ingested at Cornell include: active personnel, titles, affiliations, and courses from PeopleSoft databases; grants from a custom Oracle database; publications from PubMed; and public information reported by faculty via a reporting system called Activity Insight, newly adopted by the majority of Cornell’s colleges. The Cornell and UF installations also feature an editing component that ties in with local authentication systems so that personnel can easily manage and update their own pre-populated VIVO profiles. This “self-editing” service has been utilized successfully by researchers at Cornell for over a year. Two additional portals illustrate VIVO’s ability to deliver filtered semantic data for the realm of data sharing: a test portal developed in the Drupal content management system [14] (see Figure 3) and one showcasing Graduate Programs in the Life Sciences for prospective graduate students, powered by life sciences content queried dynamically from VIVO [15] (see Figure 4).

B.2.b. Proposed Multi-Institutional Researcher Network

This project will extend VIVO from a single institutional installation to a multi-institutional, distributed model: VIVOweb. No central portal will be created; local installations will facilitate access to both local and national-level information in all installations. VIVOweb will offer the functionality already provided by VIVO, as well as new features and services tailored to the local context, including but not limited to analysis and visualization tools to promote new paths to discovery, improved data ingest, streamlined ontology editing, an increased number of authentication options, and a decentralized indexing capability to enable cross-institutional browsing and searching. VIVOweb will also include the ability to provide data as email lists or in a variety of formats for social networking tools, for the automatic generation of NIH and other biosketches, and for faculty reporting purposes.
VIVOweb’s flexible and extensible data model will allow it to present a simple structure of people and their activities within and across institutions, featuring links among them and connections to other people as well as their professional information. There are many ways a person’s expertise may be discovered: through grants, presentations, courses and news releases, as well as through research statements or publications listed on their profile. The result is the creation of implicit groups or networks of people based on a number of pre-identified, shared characteristics. We will extend the VIVO ontology to support personal work groups and associated properties to represent the informal relationships evolving around collaboration, and allow individuals and groups the option to limit the visibility of these more informal and dynamic relationships, or “active groups”. New types and properties can be added without writing additional code or altering the database structure of the application, and selected portions of a personal network can be managed as an independent graph for export to social networking tools. New plug-ins already in development within Dr. Katy Börner’s group at Indiana University will also allow easy and effective visualization of the various relationships possible at the institutional or national level.

Cornell’s VIVO currently spans the Ithaca campus and the Weill Cornell Medical College within a single software installation. Rather than combining multiple institutions into a single, large, central database, we propose separate installations capable of supporting direct cross-institutional references using Linked Data standards. Each entry in VIVOweb will have a stable URI from which its constituent and immediately related data can be requested. Data harvesters can request RDF and web browsers can request HTML from the same URI. This allows seamless linking between one installation and another across VIVOweb.
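Serving RDF to harvesters and HTML to browsers from one stable URI is commonly done with HTTP content negotiation on the request's `Accept` header. The proposal does not specify the mechanism, so the following Python sketch is only an illustration under that assumption. The function names are hypothetical, `application/rdf+xml` is assumed as the RDF serialization, and media-type wildcards other than `*/*` are ignored for brevity.

```python
# Representations this hypothetical endpoint can serve for one URI.
SUPPORTED = ("text/html", "application/rdf+xml")


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for f in fields[1:]:
            if f.startswith("q="):
                try:
                    q = float(f[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda p: -p[1])


def choose_representation(accept_header, default="text/html"):
    """Pick the media type to serve for a request to a stable URI."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    return default


if __name__ == "__main__":
    # A harvester asking for RDF and a browser asking for HTML.
    print(choose_representation("application/rdf+xml"))
    print(choose_representation("text/html,application/xhtml+xml,*/*;q=0.8"))
```

Falling back to HTML when nothing matches keeps the URI usable in an ordinary browser, while a harvester that asks for RDF explicitly receives machine-readable data from the same address.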
If researchers move from one institution to the next, their persistent URLs can be ‘forwarded’. Linking one VIVO to another where a connection is known to exist addresses one component of the need for a national network. VIVO’s distributed indexing capability will enable individuals to search across institutions and find collaborators where they have no known connections, and to discover the existence and patterns of collaboration across multiple institutions and ultimately at the national level.

Development effort will support indexing distributed content at all participating institutions from the beginning. Cornell and UF will host indexing services. Additional participating institutions can choose to replicate the index to optimize local performance for cross-institutional searching. Indexing sites will harvest data from each independent VIVOweb site based on the common core ontology, which identifies a level of granularity for harvesting people, expertise, topics, research activities, and other data across all the sites. The development and refinement of this ontology will be the subject of investigation by Dr. Ying Ding at Indiana University, in close collaboration with the core development and facilitation teams. Searches initiated from any local VIVO node will then have the option of extending to the multi-site index.

The first step towards this is the local installation of the VIVO platform at partner institutions. Our technology is capable of integrating seamlessly with other researcher networking platforms via workflows that first convert data from these systems into semantic form, using templates to be provided with the VIVO installation package or commercially licensed tools (see Section C.2 for details).
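The harvest-and-index flow described above can be sketched in a few lines of Python. This is an illustration only, not the proposed design: the site URLs, the record fields (`uri`, `type`, `topics`) and the function names are hypothetical stand-ins for whatever the common core ontology ends up defining, and real harvesting would fetch RDF from each site rather than read an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical records, as if already harvested from two independent sites.
SITE_RECORDS = {
    "https://vivo.example-a.edu": [
        {"uri": "https://vivo.example-a.edu/individual/p1",
         "type": "person", "topics": ["genomics", "proteomics"]},
    ],
    "https://vivo.example-b.edu": [
        {"uri": "https://vivo.example-b.edu/individual/p7",
         "type": "person", "topics": ["Genomics"]},
    ],
}


def build_index(site_records):
    """Build a topic -> set-of-URIs index across all harvested sites."""
    index = defaultdict(set)
    for records in site_records.values():
        for record in records:
            for topic in record["topics"]:
                # Normalize case so the same topic matches across sites.
                index[topic.lower()].add(record["uri"])
    return index


def search(index, term, local_site=None):
    """Search the multi-site index, or only one site if `local_site` is set."""
    hits = sorted(index.get(term.lower(), ()))
    if local_site is not None:
        hits = [uri for uri in hits if uri.startswith(local_site)]
    return hits


if __name__ == "__main__":
    idx = build_index(SITE_RECORDS)
    print(search(idx, "genomics"))  # extended, cross-institutional search
    print(search(idx, "genomics", local_site="https://vivo.example-a.edu"))
```

Because the index stores only URIs, a participating institution could replicate it locally while the authoritative profile data stays at the site that owns it.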
VIVOweb’s Semantic Web principles and open, flexible structure represent a research networking solution that will appropriately and efficiently allow integration of the application with varied institutional infrastructures. They will allow VIVOweb to scale in size and scope while adapting to new purposes and unforeseen content, providing an evolving, dynamic, virtual community for the biomedical sciences, and beyond, at every institution. Data from local systems, whether based on the VIVO platform or not, will be linked and shared across institutional platforms, but visible locally through institutional portals such as VIVO and GatorScholar to facilitate the networking and discovery of people. The visibility and unique functionality of these portals will stimulate the further evolution of this virtual community across institutional boundaries.

Specific functionalities and services proposed for VIVOweb may be summarized as follows:

• Ability to search and browse locally and nationally to “find people like me”, most searched, or topic-related expertise: By keyword or MeSH term, location, department or institution, grants, geographic area, publication, authorship, types of papers or journals commonly published in, and more.
• Profile modification using institutional authentication system: Add to or change information ingested from institutional authoritative sources; display or hide sections from national view.
• Ability to ingest data from authoritative sources: Including human resources, grants, and course databases, faculty reporting systems, personal citation management tools and web pages.
• Easy modification of core ontology: Using improved ontology editor capabilities to more accurately reflect local needs that might deviate from the Semantic Web for Research Communities ontology bundled with each local VIVO installation.
• Delivery of data to consuming services and mobile devices: Including specialized topical or unit portals, web social networking or collaborative tool APIs, reporting tools and biosketches.
• Networking: Create and share public and private groups, add or remove investigators to and from designated groups or contact lists, suggest useful additions to others, navigate across successive connection paths.
• Communication: Dynamically create and manage list-servs through external list management tools, such as Lyris Listmanager [16]; query based on affiliation or topical affiliation to create email lists for a variety of purposes.
• Analytical capabilities and spatial mapping: Using multi-dimensional network analysis tools and visualization techniques to analyze small team, departmental, institutional, or national groupings by publications, grants, funding agencies, and expertise as determined by keywords and concepts conveyed in publications, grants, self-designation and more.

B.2.c. Sustainability through Building Community

The primary goal of our approach to enabling national networking of researchers is to offer a wide-ranging perspective on multiple aspects of biomedical and translational research across multiple institutions, not just to researchers, but also to students, administrative and service officials, prospective faculty and students, donors, funding agencies, and the public, and to empower them to contribute, each in their own way. We are advocating for the creation of asset-based, rather than need-based, virtual communities at the individual, institutional, and national levels, with the focus on making previously “invisible” human assets visible at all levels.
According to research conducted at Northwestern University, while a need-based community focuses on “needs, deficiencies and problems”, an asset-based community begins with a commitment to uncovering the community’s capabilities and assets [17]. This and other work has demonstrated that investment in asset-based models is the most effective way of solving problems, as long as a need can be rapidly and accurately linked to an asset [18,19].

It is critical to recognize that any technology or tool designed to create a network of human assets within and between academic institutions will be adopted, used and maintained only if the individuals (the assets in this case) and the institutions perceive value in it. Value to the individual is most likely to be assessed by responses to questions such as:

• What does this tool do to advance my research and academic standing?
• Can I use it to reliably find potential collaborators and other people of interest to me within my institution and beyond?
• Can I use it to create inter-institutional groups or networks based on common areas of research or interest?
• Is it current, accurate, immediate?
• Will it expedite the reporting and communication of my work?
• How much effort will it take to maintain my information, and how easy is it to use and update?

Value to the institution is likely to be assessed primarily by administrators, based on responses to questions such as:

• What does this tool do to advance the standing of my institution?
• Does it enhance recruitment and retention of top-notch faculty and students?
• Does it foster collaboration, particularly across traditional boundaries?
• Does it help improve the fraction of successfully funded grant proposals?
• Can it increase efficiencies associated with the management and dissemination of information about people and resources?
• What does it cost to sustain and improve?
• How easily does the technology or platform interoperate with others and how agile is it?
Finally, value to NIH and other federal agencies, professional biomedical societies and organizations can be evaluated by such factors as easier identification of experts and potential reviewers, more effective use of grant dollars through improved collaboration, and possible synergies with services already offered by NIH, such as eRA Commons, PubMed and others.

The technical sections of this proposal will make it clear that the capabilities suggested by these questions are indeed functionalities that the VIVO platform will enable for individuals and the institution. However, our work with the platform at Cornell and UF also demonstrates that delivering technical capability alone is not sufficient to ensure adoption, usage and maintenance by either the individual or the institution. Technical innovation for a networking tool such as this must be backstopped by human facilitation: in this instance, by information specialists from institutional libraries, or by other informatics professionals, wherever possible. We anticipate that researcher engagement and outreach by information specialists will promote adoption, usage, and maintenance of VIVOweb by the research communities in their institutions, thereby fostering the creation of virtual communities of biomedical researchers at all three levels described above, accessible through local VIVO installations.

That the asset-based community approach employed for the initial development of VIVO at Cornell and UF, and the library-based outreach efforts associated with it, are valued and successful is evidenced by this small sample of feedback from researchers and administrators at both institutions:

“VIVO provides unparalleled access to information about the life sciences at Cornell in a user-friendly way.
This will be of particular benefit not only to those researchers and students already at Cornell, but potential faculty and students as well, by offering a much-needed, integrated view of the life sciences community at Cornell.”

“VIVO saved my life as a new faculty member at Cornell; I used it all the time to find facilities and people I might work with.”

“First, [GatorScholar] allows individuals…to quickly find faculty with specific research interest. Undoubtedly, this raises the visibility of the life sciences faculty among potential granting agencies, students, and policymakers. Second, it facilitates interactions between life science faculty with divergent backgrounds. This facilitation increases the likelihood of grant funding by drawing on synergisms within the college.”

“[The] up-to-date awareness [provided by GatorScholar] is vital for researchers to make timely contacts, find potential collaborative partners, access literature searches, and locate other resources necessary for their work. It will be particularly useful because of the growing interdisciplinary among the sciences.”

B.2.d. Support through the Libraries

Our proposal posits that the engagement of a neutral and trusted campus entity with information management expertise will greatly facilitate adoption, usage, and maintenance of such a tool. Evidence from Cornell and UF suggests that the academic library admirably performs this role, in its capacity as a generally impartial and trustworthy organization with a clear understanding of the needs of the research community and the proven capability of engaging with it, expertise in information management and dissemination, and an established liaison function.
Further, medical, science, and engineering libraries have traditionally provided information resources and technology in support of educational, research, and patient care objectives, and are taking on an increasing role in fostering and supporting collaborative efforts on campus to shorten the gap between bench and bedside. Recent advancements in translational medicine have prompted libraries to develop information solutions that support dissemination and facilitate a fluid exchange of data in the increasingly cross-disciplinary research setting. Over the last few years, a number of medical libraries have responded to changing information needs by expanding their services to offer visionary programs that enhance the flow of information and promote collaborative opportunities in the translational research environment. The stalwart engagement and stewardship provided by the NIH’s National Library of Medicine (NLM) in support of biomedical research has provided a valuable model, and programs and services offered by these libraries are frequently developed and coordinated by PhD-level specialists trained and certified by the NIH.

That libraries have successfully met these needs provides a foundation for a library-based community support network for VIVO. While support of both user and development communities will be challenging, a library-based model best addresses many of the issues that may arise during this process. Librarians, including several with PhDs and/or bioinformatics expertise, NIH training, and experience as NCBI course developers and instructors, have been included to facilitate intra- and inter-institutional adoption, usage, and maintenance of VIVOweb.
Through the technology advancements mentioned in this proposal, as well as evaluation and further development by the seven adoption partners, VIVOweb will grow a community-based support network. As adopters become developers, they will help develop the critical components of this network. Based on the Cornell and UF experience, librarians and domain specialists will be particularly valuable in:

• Establishing virtual environments that facilitate communication and collaboration, such as wikis, for both the outreach and development teams that will serve all members of the VIVOweb consortium. This includes, but is not limited to, a listserv, development and outreach wikis, news items and publications.
• Providing in-person and e-mail “help desk” support in the use of VIVOweb.
• Developing support documentation, including an FAQ, quick-start guide and manuals on the VIVO application, suggested and proven outreach and support strategies, and guidelines for development of new modules.
• Creating and supporting a comprehensive suite of educational materials for VIVO users and implementation and support teams, including both text-based and video tutorials that range in complexity from basic needs to more complicated or innovative uses of the application.
• Facilitating inter-institutional collaboration on the development of a common ontology.
• Engaging with researchers and administrators in the local setting to educate and engender buy-in and ensure institutional support.
• Serving as a link between researchers and central technology teams by regularly providing feedback on usability: problems encountered, what works well, and what is missing but essential for a successful product.

Training materials and support documentation will be modeled after widely used materials provided by the NLM for applications, databases and services such as PubMed, NCBI, MyNCBI and others [20].
We anticipate that personnel outside the Library will increasingly assist with this task as VIVOweb becomes accepted and increasingly integrated into the administrative and communications mainstream. However, it represents a considerable technological and cultural shift from current practice for most institutions, just as any new campus-wide initiative faces many challenges in achieving clarity in mission and consistency in execution. For an ambitious e-community building endeavor such as this to truly succeed (that is, to be adopted, used, and maintained), technical innovation as well as careful and engaged stewardship by institutional libraries will be essential.

B.2.e. Engagement of Recipients of Clinical and Translational Science Awards (CTSAs)
Any project to support biomedical researchers will clearly need support from recipients of CTSAs, which represent a primary community of practice. The CTSA institutions21 have considerable interest in national networking and have formed a workgroup to facilitate consortium-wide collaboration. One of the functions of the consortium is to support researcher networking across institutions. VIVO is designed specifically to address this need. Members of the CTSA consortium will be asked to serve on VIVO governance bodies (Executive, Scientific and Technical) and actively participate in facilitated discussions of the needs of this important group of research institutions.

B.2.f. Support for all Institutions
It is important to note that many other National Center for Research Resources (NCRR) and NIH-funded centers and programs also include researchers making very significant contributions to advance biomedical research. Our consortium of institutions therefore includes CTSA recipients as well as other NCRR awardees to ensure the broadest possible interpretation of biomedical researchers.
This approach will ensure that our ontology applies across a wide variety of disciplinary types and is therefore more easily scalable and extensible beyond the funding cycle of this grant. As schools choose to adopt VIVO, the community-based mechanisms for support scale to national levels and are sustainable in supporting networking of researchers.

The creation and distribution of support materials, both educational and promotional, will be an essential means of facilitating institutional awareness and adoption of VIVO. Materials will be designed and created for the institutional and national audiences at the University of Florida, under the direction of Dr. George Hack. Among the materials to be developed is a comprehensive suite of educational resources for VIVO users and implementation and support teams. These resources will include a variety of tools to help facilitate integration of VIVO at the institutional and national level. Educational support is an important component in the library-based support model. For successful implementation of VIVO to occur, researchers at implementation sites need to feel comfortable navigating VIVO, allowing the resource to promote serendipitous discovery of collaborative opportunities.
Informative web-based support materials, such as an FAQ, a “quick-start” guide, and links to documentation and published papers about the application, will be available from the VIVO project website. A robust collection of online tutorials will be developed, offering just-in-time support to researchers who wish to utilize the power of VIVO. This immediate response will be further supported by providing access to podcasts and videocasts of VIVO-related events. Strong educational support of VIVO is best served by combining this rich online VIVO presence with a strong in-person support component at implementation sites. To accomplish this, a robust series of instructional materials, including PowerPoint slides and handouts, will be developed for use in delivering in-person instruction and presentations. These instructional materials will be developed in a series of stand-alone modules, such as: a basic VIVO overview and training; institutional discovery: using VIVO for new investigators, students, and staff; managing VIVO profiles by proxy: support for administrative support staff; VIVO for the institutional administrator; and advanced VIVO applications. The modules will be easily interchangeable and present an à la carte approach to standardized instructional design.

A comprehensive suite of marketing materials to be used on a national basis will also be created by the University of Florida. Such materials will be created in a variety of formats including web, print, graphics, audio, video, and animation technologies to support curriculum offerings and promote VIVO. The marketing/communications coordinator at UF will work closely with institutional outreach teams during adoption, use best practices to identify change agents, promote and market the characteristics of VIVO as a new innovation, and establish the key elements of a change process that will facilitate adoption.

Both educational and promotional materials will offer a standardized look and feel, but still offer institutions opportunities for customization with their own logos. The VIVO logo and color scheme will be featured prominently in materials related to the application to ensure that VIVO becomes a brand recognized nationwide, from locally hosted resource workshops to national-level scientific meetings. Support materials will include images and logos, PowerPoint templates, and code for incorporation on implementation site websites, all for use by the VIVO consortium members. While these support materials will be designed and created at the University of Florida, all modules could be easily customized with institutional logos and further customized with real-world examples from any specific institution, with the assistance of a librarian. This approach will be convenient and can be scaled up or back, depending upon the institutional needs. As schools choose to adopt VIVO, the community-based mechanisms for support scale up to a national and sustainable activity in support of networking researchers.

B.2.g. Support through Professional Societies
Professional societies of researchers will be engaged to adopt VIVO. Professional societies have a natural role to play in facilitating the networking of their members. By adopting VIVO, they make themselves visible in the national network. By participating in community-based support, they provide increased visibility for their services as well as additional support for their members. By ensuring that the Semantic Web recognizes and facilitates the identification of members, the societies leverage VIVO in support of their goals, helping to build the national network.
Professional societies can promote VIVO through their own communication channels, reaching large numbers of researchers. Researchers who are members of professional societies can highlight this membership in a rich manner through their own VIVO profile as well. Implementation of VIVO for a professional society is a straightforward process. The VIVO software is designed to be easy to host. Creating profiles for professional society staff members is simple and allows these people to be found through the Semantic Web and support research activities.

C. Project Plan

C.1. Governance
The development and support of VIVOweb will be governed by three national advisory bodies: an Executive Advisory Board, a Scientific Advisory Board and a Technical Advisory Board. These groups ensure that VIVOweb meets the needs of researchers, institutions and the NIH.

C.1.a. Executive Advisory Board
The Executive Advisory Board (EAB) sets the direction for VIVOweb development and support activities and ensures full coordination with the implementation of resource discovery, resulting in a seamless Semantic Web of both scientists and resources. Constituted from a cross-section of the research community, with NCRR representation, and with representation by the implementers of the network for resource discovery, the EAB advises the Principal Investigator and the project teams on all matters related to the creation of national networking of researchers. Quarterly evaluation reports are provided to EAB members. The group will meet twice per year. Members will receive travel support. One meeting per year will be held at NIH in Bethesda. One meeting per year will be held in conjunction with a national meeting such as the CTSA Consortium meeting.

Table 1 VIVOweb Executive Advisory Board
Member                        Affiliation
TBA                           TBA
TBA                           TBA, PI Resource Discovery
Julianne Imperato-McGinley    Weill Cornell Medical College, PI CTSA
TBA                           TBA
TBA                           CTSA Consortium Representative
TBA                           NCRR Representative
Gloria Thomas, PhD            Xavier University
Peter Stacpoole, MD, PhD      University of Florida, PI CTSA
Michael Conlon, PhD           University of Florida, Ex officio, PI Research Networking

C.1.b. Scientific Advisory Board
The Scientific Advisory Board will consist of a spectrum of biomedical researchers who will provide direct input regarding the support activities and the needs for features and ontology components to support their work. Members will be recruited nationally by the EAB members and by members of the project team. Support systems, including a web site and wiki, will be provided to facilitate the gathering of input from the Scientific Advisory Board. Bi-monthly conference calls and gatherings at national meetings will be used to solicit further input.

C.1.c. Technical Advisory Board
The Technical Advisory Board (TAB) will guide all aspects of the technical development of VIVOweb, including ensuring that: 1) content from local installations can be picked up by any national network; 2) the VIVOweb installations can use community-sourced data such as Linked Open Data; 3) VIVOweb is fully interoperable with the resource discovery network; and 4) interfaces to and from VIVOweb to other tools meet the needs of the research community.

Table 2 VIVOweb Technical Advisory Board
Member             Affiliation
John Wilbanks      Creative Commons
York Sure          University of Koblenz, Germany. Scientific Director of the Leibniz Institute for Social Science
Neil Smalheiser    University of Illinois, Chicago
Barend Mons        University of Rotterdam, The Netherlands
Kei Cheung         Yale University
Chris Bizer        Free University of Berlin, Linked Open Data
Steffen Staab      University of Koblenz, Germany
Abel L. Packer     BIREME/OPS/OMS, Director, Brazil
Stefan Decker      Director of DERI Galway, Ireland
Carole Goble       University of Manchester, UK, co-director of e-Science NorthWest
Dean Krafft        Cornell University

C.1.d. Project Organization
Figure 5 shows the project organization. The EAB oversees the project. Evaluation (cyan), Project Operations (orange) and Project Governance bodies (blue) report to the EAB. Project Operations is organized into three activities: Development, coordinated by Jonathan Corson-Rikert; National Activities, coordinated by Medha Devare; and Site Implementations, coordinated by Valrie Davis.

C.1.e. Development Teams
The project will support three development clusters, at Cornell, UF and Indiana University. The Cornell group will focus on extensions to the current core VIVO functionality and access controls to better support individual and team networking, improve scalability, and support workflow for data ingest and export. This group will also develop the distributed search indexing capability and Linked Data functionality. Any architectural changes necessary to support a more modular architecture for ingest and export, or to allow plug-in extensions for visualization or other purposes, will be coordinated with the UF and Indiana University teams.

The Indiana University developers will work in two teams under the leadership of Katy Börner and Ying Ding. Börner’s Cyberinfrastructure for Network Science Center will implement advanced data mining and visualization in support of social networking, metrics, and presentation. Ding will lead efforts pertaining to the development and maintenance of ontologies used by the Semantic Web to represent scientists and investigators.
UF developers, under the direction of Christopher Barnett, will focus on interfaces to software in the institutional setting, and packaging of VIVO for deployment. Interfaces will be built for 1) PeopleSoft22, to provide authoritative data to VIVO regarding people in the institution; 2) Drupal23, to enable the use of VIVOweb from within research team Drupal implementations; 3) Shibboleth, to provide federated identity management; and 4) Sakai24, to provide access to research networking from within the popular open source course and content management platform. We anticipate the need for additional interfaces as determined by the VIVOweb governance processes.

Figure 5 VIVOweb project organization

C.1.f. Media Support Team
Dr. Devare will direct the efforts of the team at UF led by Dr. George Hack in the development of instructional support and other media materials for VIVOweb. Instructional videos, promotional material, web sites, conference materials, collateral for exhibits and other material will be developed by Dr. Hack’s team.

C.1.g. Adoption Support
Dr. Devare will coordinate effort related to the national adoption of VIVOweb. This includes development of promotional materials and web sites, and presentations at professional societies and conferences. In this effort she will be supported by all members of the project team.

C.1.h. Implementation Teams
Each of the seven participating institutions has an implementation team that will deploy VIVO during the first year of the project and then implement VIVOweb during the second year. Each implementation team participates in the evaluation led by Dr. Leslie McIntosh of Washington University.
Table 3 Implementation Team Leads at each of the participating institutions
Participating Institution        Implementation Lead
Cornell University               Medha Devare
University of Florida            Sara Gonzalez
Indiana University               William K. Barnett
Scripps Research Institute       Gerald Joyce
Ponce Medical School             Richard Noel
Washington University            Rakesh Nagarajan
Weill Cornell Medical College    Curtis Cole

Valrie Davis at UF will coordinate the implementations and provide support to the implementation teams. Implementation teams provide input to the evaluation team. The evaluation team prepares quarterly summaries regarding the implementation for the advisory boards.

C.1.i. Researcher Support Teams
The libraries of each institution will provide support for researchers using VIVOweb. Librarian contributions to creating support for the adoption, usage and maintenance of VIVOweb may be summarized as follows:

Organizational/Workflow and Training Responsibilities. The information specialists in the institutional libraries will facilitate growth and maintenance of local VIVO instances by:
• Hiring, training, coordinating and supervising staff who will initially enter data for individual profiles and related pages in VIVO,
• Ensuring that relevant content and content types related to biomedical research are entered in VIVO and that the relationships between individuals and pieces of information (the entities) are accurately and consistently represented,
• Integrating project support resources into the institutional culture, including in-person training events and just-in-time online instructional and support resources, and
• Organizing and implementing usability testing for both self-editing of individual profiles and those of related academic units.

Outreach Responsibilities.
As academic appointees, the information specialists provide outreach to departments, programs, centers and individual researchers, with whom they have enduring professional relationships and to whom they provide assistance to facilitate research. Through their liaison roles, these information specialists:
• Bring to the project an understanding of both news-worthy and day-to-day activities and issues of importance that inform data element and design decisions. Examples of these might include research areas, collaborative initiatives and committees that are used to pre-populate pick lists that researchers can use while editing their profiles,
• Collaborate with groups across departments and administrative units to add content streams and improve efficiency,
• Demonstrate VIVO and its self- and proxy-editing capability at departments, institutes, centers and researchers’ offices to inform individuals, provide feedback from users and increase support for the VIVOweb initiative. Their experience as instructors of digital information resources endows them with a unique awareness of user behavior in a digital climate.
• Have developed strong and trusted professional relationships with their research clients, and will be able to use these connections to facilitate all tasks performed in relation to this project.

Navigating VIVOweb’s Technological Underpinnings. VIVO is an ontology-based tool to integrate diverse information through simple, consistent categorization by types and relationships. Librarians are trained to understand, develop and encode ontological relationships and apply them pragmatically to keep VIVO straightforward and simple to use.

Membership in the Institutional Research Community. Research is the primary subject of a librarian’s work.
Information specialists are well positioned, trusted arbiters within an institution’s research community, capable of efficiently clustering information for VIVOweb that reflects important nuances at a level appropriate for a general higher-education audience.

C.1.j. Evaluation Team
Dr. Leslie McIntosh of Washington University will lead the evaluation efforts related to the project. Dr. McIntosh will be assisted by a biomedical informatics specialist, who will conduct assessment tasks, acquire data, and analyze these data sets in collaboration with Dr. McIntosh. Quarterly reports will be prepared and made available to the advisory boards.

C.2. Technical Design
VIVOweb is based solidly on Semantic Web technologies recommended by the World Wide Web Consortium (W3C). The core is the Resource Description Framework (RDF)25, in which items being described are assigned globally unique identifiers (URIs, or Uniform Resource Identifiers) and their relationships and attributes are described in discrete pieces called “triples” or “statements.” A collection of triples forms a graph of data that may be stored in a single file or distributed across the entire web. Another W3C standard, SPARQL (SPARQL Protocol and RDF Query Language)26, makes it possible to query Semantic Web data using SQL-like syntax. RDF relationships may also be embedded into standard web pages using RDFa (Resource Description Framework in Attributes), which allows browsers or search engines to extract structured data. Google recently announced that it would begin harvesting RDFa data27, following on the heels of other search engines such as Yahoo28. Semantic Web standards, such as RDF Schema (RDFS29) and the Web Ontology Language (OWL30), make it possible to exchange ontologies, which specify the semantics of the terminology and relationships used in RDF descriptions. Ontologies also enable reasoning, or inference of new triples based on existing data.
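As a rough illustration of the triple model, pattern-based querying and inference described above, the following Python sketch stores triples as plain tuples, matches them against a pattern in the spirit of a SPARQL basic graph pattern, and derives new type statements from subclass relationships. It is a minimal sketch and not VIVO code: the "ex:" identifiers, class names and helper functions are hypothetical, and a real installation would rely on an RDF library and a SPARQL engine rather than this toy matcher.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical example data; the "ex:" identifiers are illustrative only.
graph: Set[Triple] = {
    ("ex:person/jdoe", RDF_TYPE, "ex:FacultyMember"),
    ("ex:person/jdoe", "ex:memberOf", "ex:dept/genetics"),
    ("ex:FacultyMember", RDFS_SUBCLASS_OF, "ex:Researcher"),
    ("ex:Researcher", RDFS_SUBCLASS_OF, "ex:Person"),
}


def match(triples: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def infer_types(triples: Set[Triple]) -> Set[Triple]:
    """Apply one RDFS-style rule until nothing new is derived:
    if x has type C and C is a subclass of D, then x has type D."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for x, p1, c in list(closed):
            if p1 != RDF_TYPE:
                continue
            for c2, p2, d in list(closed):
                if p2 == RDFS_SUBCLASS_OF and c2 == c:
                    new = (x, RDF_TYPE, d)
                    if new not in closed:
                        closed.add(new)
                        changed = True
    return closed


if __name__ == "__main__":
    inferred = infer_types(graph)
    for triple in match(inferred, s="ex:person/jdoe", p=RDF_TYPE):
        print(triple)
```

Only one type statement about the example person is asserted directly; the other two are inferred from the subclass statements, which is the sense in which an ontology lets new triples be derived from existing data.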
VIVO takes advantage of RDF’s triple-based structure and OWL’s constructs for defining types of resources and their relationships to build a flexible, extensible knowledge base describing academic researchers and their activities. VIVO takes an additional step beyond the use of Semantic Web technologies at the local application level by embracing the principles of Linked Data, which is a concept articulated by World Wide Web inventor Tim Berners-Lee in a 2006 design note31. Linked Data promotes a web of data on the scale of today’s human-readable web, where interconnections between datasets are created as easily as HTML hyperlinks. With Linked Data, RDF resources are assigned URIs that are dereferenceable; that is, a request for the URI will direct humans or machines to useful data describing the resource. These data should include additional URIs to allow the web of data to be browsed or crawled seamlessly.

The Linked Data community estimates that 142 million links between Semantic Web datasets have been created.32 Links between institutional VIVO datasets will allow seamless browsing across institutions. VIVOweb does not require coordination between installations when describing new people, organizations, topics, or other entities. Different URIs representing the same resource can be cross-referenced through OWL sameAs properties. In the cases where this approach is not possible due to differences in ontology semantics between datasets, we will follow best practices emerging from ongoing research in the field (Glaser et al., 2009).

C.2.a. The VIVO Platform
VIVO is unique in offering three major functional components in one package: ontology editing to create or modify a data model, intuitive user editing for data and relationships, and a simple content management system to present an attractive web presence.
This integration was designed and developed from the ground up to support a researcher networking application in the institutional environment. Unlike relational database-driven systems, VIVO requires no fixed data model with tables and fields internally defining the data elements supported in the system. VIVO instead provides an administrative editing interface to define types of data and relationships among these data types; a common core ontology data structure will be supplied with the VIVO installation package, but institutions will be free to extend the model further as required for local needs without additional coding.

Institutions may choose the extent to which they integrate VIVO into local IT infrastructure for authentication, to allow modification of profiles by individual researchers or their proxies, and for data ingest. This integration generates additional startup cost but lowers ongoing operational costs: data are entered only once into the appropriate system of record and are pushed to VIVO through interfaces. Data quality is improved through use of normal university data management processes, and changes to core institutional data can continue to happen in the appropriate database of record.

VIVO is also capable of disseminating data to other institutional web sites as well as harvesting from them. VIVO provides generic RDF/XML output that can be customized or filtered within VIVO or transformed into desired reports outside of VIVO according to local requirements. By providing incoming and outgoing data paths through both human interaction and machine processes, VIVO is capable of integrating well into institutional enterprise architectures.

Under the hood, VIVO is a Java servlet application using Java Server Pages for page rendering; existing installations use the open-source Apache Tomcat servlet container and the Apache web server. VIVO’s search function employs the Lucene library33.
RDF data are managed through HP’s Jena Semantic Web library,34 which allows direct access to a variety of triple store implementations, including those based on familiar relational database systems. Existing VIVO installations use MySQL35, which, like all of the libraries used by VIVO, is freely available and open source.

VIVO’s default configuration caches RDF data in memory to support very fast queries and web page rendering. This technique scales to an institution the size of Cornell or Florida; in cases where much larger RDF data sets are involved, VIVO may use any RDF triple store that implements Jena’s graph service provider interface and supports the SPARQL query language. These include third-party commercial RDF stores such as AllegroGraph36 and OpenLink Virtuoso37, as well as a number of open-source stores provided by HP. Several of these systems have been demonstrated to store more than one billion RDF triples successfully.38 The VIVOweb technical development process will include further testing and optimization in order to deploy highly scalable triple stores for large data sets, including modification if necessary to integrate triple stores that do not provide a direct Jena interface, such as the Sesame39 native store. A cluster of Sesame stores is used in SemaPlorer,40 which took first prize in the Billion Triples Track of the 2008 Semantic Web Challenge.41

C.2.b. Ontologies
VIVO’s flexible and extensible data model will allow it to present a simple structure of people and their activities across a university, featuring links among them and connections to other people as well as their professional information, using a network graph structure to most naturally represent a real-world network of relationships (see Figure 6).
There are many ways a person’s expertise may be discoverable, including talks, courses, and news releases as well as through research statements or publications listed on their profile, resulting in the creation of implicit groups or networks of people based on a number of pre-identified, shared characteristics.42

An ontology is an important means of modeling knowledge so as to improve information organization, sharing and understanding. It plays a crucial role in enabling content-based access, interoperability and communication, and in providing qualitatively new levels of service on the next-generation web. VIVO is powered by ontological approaches to digest the main assets of information and knowledge derived from and requested by research networks43. It re-organizes existing authorized information from faculty annual reports, institutional scholarly databases, funding records and teaching materials in an ontological manner so that this information can be re-packaged and re-presented to researchers to facilitate their networking.44,45

Figure 6 Sample entity structure for a faculty member showing common internal data properties as well as object property relationships with other entities

The ontology work to date at Cornell and UF informs (but does not wholly determine) the course of ontology development for this project, to be conducted as a close collaboration between the community and technical teams under the overall direction of Professor Ying Ding of Indiana University. Goals include optimal alignment with existing ontologies in wide use, extensibility for local needs and provision for ontology-level local controls over what information is shared nationally. Mapping different localized VIVO ontologies for VIVOweb’s multi-institutional scope can be realized through community efforts to reach agreement on specific mappings.
The extension and maintenance of VIVO ontologies should reflect biomedical community needs and facilitate the visualization, semantic analysis and networking capabilities being developed by Börner, also at Indiana University. Ontology documentation will include information about the ontology’s design principles and guidelines for local extensions. The ontology team will prepare a set of best practices for training potential users and facilitating adoption of our technologies and approaches. Maintaining a modular ontology structure facilitates ontology re-use, ontology mapping and data integration.

The core ontology for VIVO installations will be based on the Semantic Web Research Community (SWRC) ontology developed by the large European Funded Network of Excellence KnowledgeWeb46. The SWRC ontology models the major entities of research communities, such as persons, organizations and publications, and their relationships. We will also implement mappings where possible to enable VIVOweb data to be queried locally and nationally using a number of different widely-adopted social ontologies including FOAF (Friend of a Friend)47, SKOS (Simple Knowledge Organization System)48, DOAP (Description of a Project)49, SIOC (Semantically-Interlinked Online Communities)50, Dublin Core51, and GEO (Geographic Names)52. These will ensure interoperability with other data and systems publishing data to the Semantic Web.53 Throughout this project we aim to further enhance this ontology to better reflect the requirements coming from research networks in the biomedical domain, especially through the testing of VIVO in our partners’ institutions and universities.
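The mapping of a local ontology onto widely adopted vocabularies such as FOAF, mentioned above, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the project's mapping: the "local:" term names and the sample data are hypothetical, while foaf:Person, foaf:name and foaf:mbox are terms defined by the FOAF vocabulary.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical mapping from local (SWRC-style) terms to FOAF terms.
# The "local:" names are invented for illustration only.
TERM_MAP: Dict[str, str] = {
    "local:FacultyMember": "foaf:Person",
    "local:fullName": "foaf:name",
    "local:email": "foaf:mbox",
}

# Hypothetical local data for one researcher.
local_data: List[Triple] = [
    ("ex:jdoe", "rdf:type", "local:FacultyMember"),
    ("ex:jdoe", "local:fullName", "Jane Doe"),
    ("ex:jdoe", "local:email", "mailto:jdoe@example.edu"),
    ("ex:jdoe", "local:officeHours", "Tue 2-4"),  # no FOAF equivalent
]


def to_foaf(triples: Iterable[Triple]) -> List[Triple]:
    """Return a FOAF view of the data: classes and properties with a
    mapping are renamed; statements without a mapping are left out."""
    out: List[Triple] = []
    for s, p, o in triples:
        if p == "rdf:type" and o in TERM_MAP:
            out.append((s, p, TERM_MAP[o]))
        elif p in TERM_MAP:
            out.append((s, TERM_MAP[p], o))
    return out


if __name__ == "__main__":
    for triple in to_foaf(local_data):
        print(triple)
```

Because statements without an agreed mapping are simply omitted from the shared view, a mapping table of this kind also gives each site a simple point of control over what is exposed through the common vocabulary.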
We will extend the VIVO ontology to support personal work groups and associated properties to represent the informal relationships evolving around collaboration, and to allow individuals and groups the option to limit the visibility of these more informal and dynamic networks and manage them as an independent graph for export to social networking or collaborative tool APIs.

C.2.c. VIVO in the Institutional Context
Scalability through multiple independently administered installations is a major strength of this proposal. During the scope of this project, VIVO can provide a customized and extensible presence at the diverse participating institutions and provide convincing and varied models for propagation under full local institutional control in the national context. Institutions without broad IT support services will be able to utilize a more basic version, while larger institutions with more technologically integrated resources will be able to add content modules and more fully integrate the application to consume existing data sources at that institution and serve as an integrated source of data for other applications.

The VIVO approach as demonstrated at Cornell is designed to transcend the administrative and organizational constraints of any one institution. If Cornell and UF are at all typical of research institutions, an integrated view of people, affiliations, grants, publications, courses, talks, research interests and international activities across internal organizational units fills a rather glaring void in university data federation and data presentation for internal and external communications, especially at the level of detail the VIVO platform affords. VIVO offers a solution to appropriately and efficiently integrate with varied institutional infrastructures. For most institutions there will be tangible benefits to justify the initial overhead of closer integration of VIVO into Systems of Record (SOR).
VIVO offers ample potential for synergies deriving from its data integration capabilities, effectiveness as a public web application, and ability to disseminate filtered data to other services and web sites. The investment required for an institution to interface VIVO’s authentication framework or to adapt VIVO’s tools to integrate core data will be repaid by improved data consistency and a higher public visibility for researchers, students and staff; individual buy-in will be improved by reducing data entry time, and through the VIVOweb network, authoritative and consistent data will be propagated to the national level. In the institutional setting, the VIVO installation is interfaced to SOR that indicate who is a faculty member or researcher and provide basic authoritative information regarding department affiliations, previous positions, degrees earned and administrative roles. At Cornell, these SOR currently include human resources, grants, courses, annual faculty reports and the LDAP directory. As the SOR update, interfaces keep VIVO up-to-date. Established university processes are used to maintain data in the SOR, while faculty and their proxies continue to maintain data local only to VIVO, such as research interests, international focus, and professional or service activities. Cornell has also successfully addressed issues of data stewardship through a clear separation between “faculty as employee” data (often private) from “faculty as academic” data (largely public) in its faculty annual reporting. VIVO provides a coherent public outlet for academic and research-focused data that individual faculty have marked as publicly visible in their annual reporting. Authentication to VIVO is required in the local setting to gain authorized access to edit information for a researcher. Researchers or their designated proxies may update information in the researcher’s profile. 
Local authentication (a VIVO username and password) is supported, as well as use of institutional authentication methods such as LDAP/Active Directory and Kerberos. For use cases involving cross-institutional access to privileged information, federated authentication via Shibboleth54 will be supported. Shibboleth enables researchers to access privileged information in VIVOweb implementations other than the one at their home institution using credentials from their home institution, provided they are authorized to access the information. The VIVO platform will run independently at each institution and offer a local search as currently configured at Cornell (Ithaca and Weill) and UF; any changes in local content are automatically reflected in the local index as they are saved. Local installations can, through annotations on the ontology, limit the range of data elements considered for public viewing or export, in concert with appropriate administrative staff and institutional policy. The VIVOweb ontology team will focus cross-site data indexing at a level appropriate for cross-institutional and national discovery, exposing data through common vocabularies such as FOAF55 as well as the native SWRC-derived internal ontology. Institutional VIVO portals will make researchers and their multiple interconnections more visible on the web through standard search indexing, shortly to be enhanced through Google and Yahoo’s recently announced plans to harvest special tags embedded in web pages in order to improve relevance ranking algorithms and enhance search results. VIVO will support RDFa56, a set of attributes for embedding relationships defined in published ontologies within the HTML tags of standard web pages. C.2.d.
VIVO in the Internet Context VIVO is ideally positioned to ingest data from Internet sources such as PubMed and other publications databases. Publications are perhaps the leading data source for research networking but are poorly exploited by institutional data sources, although some institutions, such as North Carolina State University, maintain an institution-wide citation database in connection with an institutional repository57 or have licensed special access to bibliometric tools through commercial databases. The distributed technical and content development teams working across partners during the grant period will collaborate to streamline the acquisition of each institution’s publication citations from national and international databases, focusing initially on PubMed. The teams will develop an improved workflow using web service APIs when available, concentrating on known challenges such as author disambiguation, where some combination of automated processing and interactive review will be required. Initiatives for unique identifiers such as the ID.LOC.GOV project (currently limited to addressing Library of Congress subject headings) and PubMed Unique Identifiers (PMIDs) offer promise that this problem may become less burdensome in the future. Although several proprietary systems for unique author identifiers are also being developed, we do not expect these private systems will be openly available at the scale of entire institutions. VIVO can easily store any number of identifiers to help disambiguate authors and investigators, but it will use an institution-based URI as the primary identifier for any individual entry, following Linked Data standards that rely only on standard HTTP requests and responses and institutional domain name registration rather than any special resolution services. If a person moves from one institution to another, a standard HTTP redirect can be returned and the redirect accomplished without a user’s intervention or even knowledge.
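The redirect behavior for institution-based URIs can be illustrated with a minimal sketch. It assumes two in-memory tables standing in for the institutions' web servers; the URIs, the `REDIRECTS` and `PROFILES` tables, and the `resolve` function are hypothetical names chosen for illustration and are not part of VIVO.

```python
# Hypothetical sketch: following HTTP-style redirects for institution-based URIs.
# All URIs and table contents below are invented for illustration only.

# Old URI -> URI it now redirects to (what an HTTP redirect response would carry).
REDIRECTS = {
    "http://vivo.example-a.edu/individual/jsmith":
        "http://vivo.example-b.edu/individual/jsmith",
}

# URIs that currently serve a researcher profile directly.
PROFILES = {
    "http://vivo.example-b.edu/individual/jsmith":
        {"name": "J. Smith", "institution": "Example B"},
}


def resolve(uri, max_hops=5):
    """Follow redirects until a profile is found; return (final_uri, profile)."""
    for _ in range(max_hops + 1):
        if uri in PROFILES:
            return uri, PROFILES[uri]
        if uri not in REDIRECTS:
            raise KeyError(f"no profile or redirect for {uri}")
        uri = REDIRECTS[uri]  # follow one redirect hop
    raise RuntimeError("too many redirects")


final_uri, profile = resolve("http://vivo.example-a.edu/individual/jsmith")
print(final_uri, profile["institution"])
```

In this sketch a client that still holds the old URI reaches the researcher's new profile without knowing that a move took place; the hop limit guards against redirect loops.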
Initiatives such as the Indiana University Scholarly Database58 with 22 million paper, patent, and grant records, as well as the growing datasets available as Linked Data, offer largely untapped sources of additional information to enhance local institutional data about researchers. The Bio2RDF project59 provides Linked Data for dozens of data sets, including microarray, pathway analysis, human genome and protein data. The ability to make RDF connections as appropriate to biomedical data and to reuse existing efforts, ranging from RDF versions of MeSH to authoritative databases for referencing species and geographic places, makes it possible for VIVO to augment researcher social networks with rich descriptions of the content of their research. C.2.e. VIVO in the Semantic Web While Cornell’s VIVO links researchers across four physical campuses and numerous disciplines and departments within one software installation, multiple independent instances of VIVO will be interlinked as VIVOweb to support cross-institutional discovery and networking. This active networking across a national body of diverse institutions will be promoted by a cross-site search engine as well as through exploration tools employing network analysis and visualization. Knowledge and expertise navigation, management and utilization will be supported through network analysis and visualization services. The cross-institutional search will allow a VIVO system at one institution to query across all relationships in the national network.
An example of this would be a researcher querying a local VIVO system with a question such as, “Who are all the people in New York who are working on astroviridae infections?” VIVO will run independently at each institution and offer local search and editing services, but information relevant to a search will often reside in a VIVO system at another institution. To provide results to queries across institutions, the data from institutions will be aggregated by a distributed system that will operate as part of VIVOweb. Each institutional system will be able to query against the distributed system containing the aggregated data indexes. The VIVO instance at each institution will not only provide a web-based front end for querying and browsing, but it will also contribute a node to a clustered system for the support of a distributed national search index. This clustered system will aggregate all of the information from the local VIVO systems and process it into a full-text index and an RDF index. The RDF index will service queries based on the relations between entities in the system, and the full-text index will allow unstructured term-based searching. The clustered component of the system will be built using Apache Hadoop60 and HBase61. The Hadoop framework transparently provides the execution of parallelized jobs such as aggregation of data from local VIVO systems, construction of indexes and processing-intensive visualization jobs. Hadoop also provides transparent distributed data storage, which will be critical to scaling when managing aggregated datasets. HBase, a database built on top of Hadoop, will be used to store the aggregated RDF and to service relation-based queries. The national index will be updated daily with changes from the local VIVO systems by a job that runs on the cluster and pulls data from the local systems.
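The distinction between the two indexes can be made concrete with a small sketch. It is a single-process, in-memory stand-in for the clustered system, not a Hadoop or HBase implementation; the predicates (`locatedIn`, `researchArea`), the site prefixes and all data are assumptions invented for illustration.

```python
# Hypothetical sketch: a relation-based (RDF-style) index and a full-text index
# built over triples aggregated from several local installations.
from collections import defaultdict

# (subject, predicate, object) statements harvested from two imaginary sites.
TRIPLES = [
    ("siteA:p1", "name", "A. Rivera"),
    ("siteA:p1", "locatedIn", "New York"),
    ("siteA:p1", "researchArea", "astroviridae infections"),
    ("siteB:p2", "name", "B. Chen"),
    ("siteB:p2", "locatedIn", "Florida"),
    ("siteB:p2", "researchArea", "astroviridae infections"),
    ("siteB:p3", "name", "C. Okafor"),
    ("siteB:p3", "locatedIn", "New York"),
    ("siteB:p3", "researchArea", "protein folding"),
]


def build_indexes(triples):
    """Return (relation_index, text_index) built from the triples."""
    relation_index = defaultdict(set)  # (predicate, object) -> subjects
    text_index = defaultdict(set)      # lowercase term -> subjects
    for subject, predicate, obj in triples:
        relation_index[(predicate, obj)].add(subject)
        for term in obj.lower().split():
            text_index[term].add(subject)
    return relation_index, text_index


def people_in_area(relation_index, place, area):
    """Structured query: subjects located in `place` AND working on `area`."""
    return (relation_index[("locatedIn", place)]
            & relation_index[("researchArea", area)])


relation_index, text_index = build_indexes(TRIPLES)
# Relation-based query corresponding to the example question in the text.
print(sorted(people_in_area(relation_index, "New York", "astroviridae infections")))
# Unstructured term search over the same data.
print(sorted(text_index["astroviridae"]))
```

The structured query returns only the researcher who satisfies both relations, whereas the term search returns everyone whose data mentions the word, regardless of location.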
Running a separate index for the national network will enable local control over what is exposed for indexing and allow the national index to filter content at the level of the ontology appropriate for national-level discovery and networking. The analysis and visualization tools developed by the Indiana University team will also access the distributed VIVO instances for source data in the form of RDF triples. If computationally intensive processing is required by the analysis tools, it will be executed using the Hadoop cluster. No central hub will be needed to support national indexing, nor will the full content of any local installation be pulled into a central index or database. The VIVO software will be modified to allow local users to extend searches to the national index when so desired. Initiation of queries and display of search results will be supported through REST-style web services returning common data formats such as HTML, JSON, or XML. C.2.f. Network Analysis and Visualization While many cross-institutional relationships can be mapped directly through co-authorships, shared service on professional committees, joint grant projects, and similar direct linkages, more extensive relationships can be discovered or prospectively suggested through network-enabled analysis of text content, linkages to common keywords and evolving patterns of relationships that indicate common experience or research interests62. The Indiana University development team will investigate analysis-driven enhancements, query tools, and visualization tools to build pathways for discovery across multiple VIVO instances and evaluate the potential of such techniques at the scale of a national network.
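As one illustration of a direct linkage, the sketch below derives a weighted co-authorship network from publication author lists and keeps the cross-institutional pairs. The publication data, the `site:person` identifier convention and the function names are assumptions made for this example only; they do not describe the Indiana University tools.

```python
# Hypothetical sketch: weighted co-authorship edges from publication author lists.
from collections import Counter
from itertools import combinations

# Made-up publications; author identifiers carry an invented "site:" prefix.
PUBLICATIONS = {
    "paper1": ["siteA:p1", "siteB:p2"],
    "paper2": ["siteA:p1", "siteB:p2", "siteB:p3"],
    "paper3": ["siteB:p3"],
}


def coauthor_edges(publications):
    """Count how many papers each unordered pair of authors shares."""
    edges = Counter()
    for authors in publications.values():
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


def cross_site(edges):
    """Keep only pairs whose identifiers carry different site prefixes."""
    return {pair: weight for pair, weight in edges.items()
            if pair[0].split(":")[0] != pair[1].split(":")[0]}


edges = coauthor_edges(PUBLICATIONS)
for pair, weight in sorted(cross_site(edges).items()):
    print(pair, weight)
```

Edge weights of this kind are a typical input for network layout and analysis algorithms; the less direct relationships mentioned above (shared keywords, text similarity) would need additional data and are not covered by this sketch.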
At all three levels (individual, institutional, and national), multiple techniques can be applied to identify trends, patterns and outliers in support of insight and easy interpretation, including temporal, geospatial, topical (semantic text mining) and network analysis techniques63. Exactly which analysis and visualization techniques are most appropriate depends very much on the final set of supported user needs, the available data and the delivery mechanism. In many ways, the most directly communicable forms of analysis based on transparent linkages will be most effective. Börner’s team at Indiana University has developed scholarly knowledge management tools over the past four years and has actively been using them for three years64. Samples are available.65,66 Diverse approaches to analyze and visualize scholarly data have been developed and tested. Among them are tools for the visualization of evolving co-authorship networks67, geospatial visualization of conference attendances, co-investigator networks (Figure 7, left) or advisor-funding-student networks (Figure 7, right). To reduce the cognitive load associated with the learning of new network layouts or ‘reference systems’, static base maps such as geospatial maps or maps of science can be used. An exemplary visual interface to neuroscience jobs in the U.S. is given in Figure 8, left. Co-authorship patterns or other linkages can be overlaid over geospatial or topical reference systems as well (see Figure 8, right).
Figure 7 Exemplary project-investigator network (left) and Advisor-Project-Student network with faculty member in center (right)
Figure 8 Interactive on-line browser interface to neuroscience jobs in the United States (left) and overlaid geospatial world map (right)
Figure 9 shows the UCSD Map of Science68 covering all sciences as well as the arts and humanities: 23,748 journals indexed by Scopus and Reuters/Thomson Scientific (ISI SCI, SSCI, and A&H Indexes). Each of the 13 main scientific disciplines is labeled and color coded in a metaphorical way, e.g., Medicine is blood red and Earth Sciences are brown as soil. Circle size denotes the number of papers and multiple graphs can be prepared and animated over time. In this manner, VIVO usage per science area can be identified based on the journals in which researchers in VIVO publish. The map can be used to communicate the ‘intellectual footprint’ or ‘trajectory’ over the landscape of science for one individual researcher based on papers he or she cites and/or publishes. It has also been used to communicate the (evolving) ‘expertise profiles’ of institutions and even countries. The map can also be used to communicate the very different temporal dynamics of scientific disciplines, bursts of activity, or emergent research frontiers.
Figure 9 UCSD Map of Science with sample data overlays of expertise profiles
This part of the project will benefit from the interdisciplinary, multi-institution Network Workbench69,70 (NWB) tool development project led by Börner. The NWB tool supports the large-scale analysis of scholarly data including publication, citation and joint investigator relationships.
It provides access to more than 110 algorithms relevant for the study of social networks, and can be used to quickly test and refine analysis workflows and visualizations in support of effective research networking. C.2.g. VIVO Networking for Researchers and Groups The root object of interest in a research networking infrastructure is the individual researcher node. Researchers, whether in the role of author, investigator, faculty member, inventor or trainee, can have multiple attributes and linkages depending on these roles as reflected in the ontology structure. We anticipate there might be upwards of 1,000,000 researcher nodes in the distributed VIVOweb system. Researchers form themselves into groups: formally constituted research teams, institutes and centers, informal project staff and networks of common interest. A researcher might be a member of several dozen groups. Groups will vary in size from a few people to hundreds of researchers, and we expect to support the tasks of group formation, management and productivity. We envision that creating a group or research network proceeds in a very similar manner to friending people on Facebook: simply find a person, ask whether he or she wants to join a group and, upon confirmation, both researchers are connected to a ‘group’ node. Formal groups will be populated by systems of record and authorized individuals. Team formation typically requires understanding the expertise, resources and network connections of each participating researcher. It also benefits from seeing what a new member adds to an existing team through new connections based on subject area, research activities, affiliations and other forms of linkages. Group management benefits from a local view of the triples (person, member of, team) that make up the researchers/students in a group.
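The invite-and-confirm pattern and the (person, member of, team) triples can be sketched as follows. The `GroupGraph` class, the `memberOf` predicate and the example names are hypothetical and serve only to illustrate the idea; they are not the VIVO ontology or its API.

```python
# Hypothetical sketch: group membership stored as (person, "memberOf", group)
# triples, added only after the invited researcher confirms.

class GroupGraph:
    def __init__(self):
        self.triples = set()   # confirmed (person, "memberOf", group) statements
        self.pending = set()   # (person, group) invitations awaiting confirmation

    def create_group(self, founder, group):
        """The founder is a member of the new group from the start."""
        self.triples.add((founder, "memberOf", group))

    def invite(self, person, group):
        """Record an invitation; no membership triple exists yet."""
        self.pending.add((person, group))

    def confirm(self, person, group):
        """Turn a pending invitation into a membership triple."""
        if (person, group) not in self.pending:
            raise ValueError("no pending invitation")
        self.pending.remove((person, group))
        self.triples.add((person, "memberOf", group))

    def members(self, group):
        """Local view of the triples that make up one group."""
        return sorted(person for person, relation, g in self.triples
                      if relation == "memberOf" and g == group)


graph = GroupGraph()
graph.create_group("alice", "virology-club")
graph.invite("bob", "virology-club")
graph.confirm("bob", "virology-club")
print(graph.members("virology-club"))
```

Keeping pending invitations separate from confirmed triples means that a connection becomes visible only once both parties have agreed to it.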
Group productivity requires effective exploitation of strong and weak linkages of researchers, effective communication of intermediate and final results and evaluation of researchers’ contributions as input to future group formation. VIVOweb will provide bilateral data exchange that can be used to interface VIVOweb installations to group productivity tools. Support for groups will be a key area of new development in the ontology, so that groups can be linked not only to multiple investigators, but also to publications, grants and facilities. Access controls leveraging the ontology structure will provide fine-grained, contextual control over viewing and editing, an important feature for individuals wishing to use VIVOweb for personal and team-based networking, where some connections may be speculative or private, especially when just forming. C.3. Implementation Each of the seven schools will implement VIVOweb and join a prototype national research network. Implementing VIVOweb involves hosting the VIVO platform, populating VIVOweb with information regarding the researchers at the institution, and creating a community of practice around support and maintenance of the platform and its data. C.3.a. Release 1 Implementation To achieve the goal of prototyping a national network for research networking, we propose to deploy a local VIVO application at each partner institution. VIVO will be installed and configured by the local institution based on the availability of institutional data sources and configured for interactive editing in accordance with the institution’s authentication systems. VIVO can provide a self-contained, local research networking solution featuring a public web display portal for researcher interests and accomplishments (see Figure 1).
VIVO can consume RDF from any source and ontologies created using any standard editor such as Protégé.71 Phase 1 will prioritize connections with local authentication systems to ensure that data can also be modified by researchers who log in with their institutional credentials to use the self-editing component in VIVO. Local modifications to the VIVOweb ontology will also be possible through interactive editing screens. Administrative editing roles will typically be assigned to librarians or research support professionals, with student labor to capture data from CVs or existing web sites. This model enables small biomedical research institutes or any institution without assured central IT support to provide an attractive research networking system “out of the box,” capable of serving as a proof of concept to elicit commitments of scarce institutional IT resources for localized authentication and tapping into data sources of record. Local VIVO installations will be sustainable only if data are current and accurate. Researchers have little time to maintain information in their profiles. As a result, data ingest will be a critical part of the technical innovation for institutional adopters of the VIVO platform. The early focus will be on implementing an accepted common ontology (such as the Semantic Web for Research Communities ontology adopted by UF’s instance, GatorScholar), and on setting up data feeds from institutional sources for authoritative human resource information (active personnel, titles, affiliations), grants and publications from PubMed and other databases such as Web of Science or Scopus, depending on local licenses. Ingest procedures will be implemented in year 2 to harvest information from faculty reporting systems in use at partner institutions.
VIVO at Cornell is populated primarily by data feeds from the PeopleSoft human resources database, from an Oracle grants database and from a PeopleSoft student records system that provides course information. XML web services from a new, externally-hosted faculty reporting system will provide very granular information directly from annual updates by faculty from several Cornell colleges, using workflow tools that identify both additions and deletions. The campus LDAP server provides updates to contact information, and feeds to a new university events calendar and news service are underway, following the model of leveraging any and all appropriate existing institutional data sources to assure information currency and to allow maintenance for each data element to happen in the database of record. Development at UF will improve the data ingest workflow from SOR, which most frequently involves converting data from relational databases into the statement-based Semantic Web data model through the vehicles of CSV or tab-delimited text files, direct database views, XML data files or web services. Workflow templates to keep VIVO updated from SOR will be included in the distribution package for implementation as automated or semi-automated processes depending on local situations, allowing VIVO to be updated using established institutional policies and procedures. Each institution’s VIVO will become part of a distributed computing cluster that will harvest data from each local node for a cross-institutional search index and for network analysis and visualization. The local VIVO application will be used to edit and display profiles of researchers at an institution in the full context of their affiliations, activities and accomplishments. VIVO has already been independently installed at UF and at two international locations. 
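The conversion of tabular data from a system of record into the statement-based data model, as described for the ingest workflow, can be sketched in a few lines. The CSV columns, the `BASE` namespace and the use of column names as predicates are assumptions made for illustration; a production workflow would map columns to ontology properties and emit a standard RDF syntax.

```python
# Hypothetical sketch: turning rows of a CSV export into
# (subject, predicate, object) statements.
import csv
import io

# Made-up export from a human resources system of record.
HR_CSV = """person_id,name,title,department
1001,Jane Doe,Associate Professor,Microbiology
1002,Raj Patel,Research Associate,Biostatistics
"""

BASE = "http://vivo.example.edu/individual/"  # invented institutional namespace


def rows_to_triples(csv_text, id_column="person_id"):
    """Convert each CSV row into one statement per non-empty column."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + row[id_column]
        for column, value in row.items():
            if column != id_column and value:
                # The column name stands in for an ontology property here.
                triples.append((subject, column, value))
    return triples


for triple in rows_to_triples(HR_CSV):
    print(triple)
```

Because each row yields independent statements keyed by a stable subject URI, a later feed from another source (grants, courses) can add statements about the same person without restructuring what is already stored.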
The partners in this proposal have committed to begin working with local VIVO installations from the beginning of the grant period to allow maximum opportunity for formative evaluation and feedback within the first year. Phase 1 will include providing documentation and quick start guides sufficient to allow installation of “VIVO in a box” by information technology professionals with no previous experience with the constituent open source tools. The VIVO platform will be further documented and installation packages developed during phase 1 to facilitate deployment at additional institutions (Section C.2.a). The ontology group will also enhance the core VIVO ontology to improve direct interoperability with ontologies in the research project, temporal, geographic, publication and biomedical domains. C.3.b. Release 2 Implementation Research and development at Indiana University initiated during phase 1 will add more direct support for personal and institutional-level networking and for reporting and query tools designed to support prospective discovery of collaborators to complement relationships discoverable through co-authorship and other more direct common affiliations presently visible in VIVO (Section C.2.c). These features are also discussed in more detail as use cases in Section B.1 above. Cornell and UF will add features previously described. The resulting release will fully implement networking of researchers among the seven schools. Each school will upgrade its VIVO release 1 system to release 2. Upgrades are intended to be straightforward. Initial seeding of the databases and maintenance of data will have taken place during the implementation of the first release; implementation of release 2 is intended to demonstrate the long-term viability of locally supported VIVOweb maintenance.
During release 2, data from these institutionally-hosted VIVO systems will be made available for local harvesting and repurposing using standard RDF syntaxes such as RDF/XML72. National networking capability will be fully realized during this phase, enabling more institutions to join the network. Institutions will be encouraged to join the network as the first seven complete their release 2 implementations. C.3.c. Release 3 Implementation Release 3 will be developed over the last six months of the two-year grant period. Features and improvements will be driven by the evaluation of the first two releases and the governance process. Release 3 will be available prior to the end of the grant period and marks the transition of VIVOweb to community-supported open source. The seven participating schools are not expected to implement release 3 as part of their work on this proposal. A positive outcome would be for schools to accept responsibility for the care and maintenance of their VIVOweb systems based on the utility they have observed during the grant period. By release 3 we anticipate and welcome adoption by schools outside the initial group of seven. By broadening the community we begin the path to true research networking. C.4. Dissemination The dissemination and adoption of VIVOweb by institutions will be fueled by outreach efforts coupled with a strong technical and community support model. A researcher network platform may be technically very sound, but it will only be used and maintained if it is of value at multiple levels, as has already been mentioned, and if the stakeholders are well supported and completely understand and appreciate the value of participating in the network.
While a large part of this value is provided by technical innovation, experience with the VIVO platform at Cornell and UF has indicated that sustained outreach efforts targeted at administrators and researchers alike to publicize the tool and its value can have immense and lasting positive ramifications. Dissemination efforts will also have to take into account the likelihood that national dissemination will differ from adoption by participating institutions, the “early adopters” characterized by author Geoffrey Moore73 as those “…who have the insight to match an emerging technology to a strategic opportunity…”. Although Moore’s book relates primarily to commercial products, many of its principles apply and will be employed in the VIVOweb dissemination plan. C.4.a. Dissemination within Participating Institutions Proactive and “apolitical” service and support, and “marketing” VIVO to researchers and administrators, have been key to dissemination and adoption of the VIVO platform at Cornell and UF, and will be heavily relied upon as VIVOweb is disseminated within participating institutions. Based on experience accumulated through these implementations, support for participating institutions will primarily focus on short one-on-one or group presentations that highlight the benefits to researchers and the institution of participating in VIVOweb, and inform users and administrators of the ease of maintaining personal information. Technical backstopping by personnel on local and national development teams, and building collaborations with institutional information technology, will be a key support element.
However, the most effective provision of the liaison and outreach activities that these goals presuppose will require community support, which will be provided by information specialist facilitators in the institutional library system. Personnel with important roles in this effort will ideally exhibit good understanding of individual and institutional research programs, activities, and needs, have the ability to effectively navigate the administrative and political landscape, be able to communicate confidently and knowledgeably with their stakeholders, and be capable of conveying essential information about VIVOweb without resort to technical jargon and details. They will liaise not only with institutional stakeholders, but also with personnel within the national coordination and implementation teams, to implement a locally viable process and workflow to promote dissemination. They will also work with the metadata and other librarians on the project team (such as those with expertise in MeSH, CTSC activities, metadata, and ontologies) to ensure that the project is responsive to the need for local modifications (to the core ontology, for instance). Responsiveness to user feedback is critical to ensuring the successful dissemination of VIVOweb. A well-conceived and implemented dissemination effort for participating institutions will earn a good reputation for VIVOweb, an essential feature of the national dissemination endeavor. C.4.b. National Dissemination The national dissemination effort will concentrate on promoting adoption of VIVOweb by new institutions that are not early adopters. Moore divides this group into two, the early majority (or “pragmatists”) and the late majority, and identifies the pragmatists as the major barrier to successful dissemination of a technology product. According to Moore: “Overall, to market to pragmatists, you must be patient. You need to be conversant with the issues that dominate their particular business.
You need to show up at the industry-specific conferences and trade shows they attend. You need to be mentioned in articles that run in magazines they read. You need to be installed in other companies in their industry. You need to have developed applications that are specific to their industry. You need to have partnerships and alliances with the other vendors who serve their industry. You need to have earned a reputation for quality and service.” Based on this analogy, our pragmatists are likely to be encountered when the national dissemination effort begins, and success at fostering adoption of VIVOweb at this scale will likely require patience, familiarity with current issues in biomedical research, and a presence at biomedical events and venues and, if possible, in the literature. That VIVOweb will be an application that serves needs in biomedical research is a given, as is the fact that it will collaborate with and draw information from other biomedical service providers such as PubMed and other data and resource discovery sources as much as possible. Movement towards national marketing and dissemination will begin immediately after funding begins, even though the early dissemination focus will be on participating institutions. Given the essential role that these early adopters will play in the development and implementation of VIVOweb, libraries are clearly a constituency to which VIVOweb should be marketed as one way to promote adoption. The Medical Library Association (MLA) is the primary library association for librarians who serve biomedical researchers, and its annual conference would be a logical place to introduce VIVOweb. However, with funding beginning in September, and MLA not meeting on a national basis until May 2010, initial marketing to this audience will begin at the regional level with presentation and/or exhibition at MLA chapter conferences. In 2009, 8 chapter meetings will be held between 21 September and 1 November (with a ninth meeting in January 2010) covering all regions of the United States.
Some other appropriate scientific, library, and informatics-related conferences that will be covered include the American Association for the Advancement of Science Annual Meeting (February 2010 in San Diego); the American Society for Microbiology Annual Meeting (May 2010 in San Diego); the American Medical Informatics Association’s Annual Symposium (AMIA; November 2009 in San Francisco, November 2010 in Washington, D.C.); AMIA’s Summit on Translational Bioinformatics (Spring 2010 in San Francisco); AMIA’s Spring Conference (May 2010 in Phoenix); the Special Libraries Association Annual Conference (June 2010 in New Orleans); and the Association of Research Libraries, which meets as part of the American Library Association’s annual conference (June 2010 in Washington, D.C.). Another way to introduce VIVOweb to library decision-makers is to present at the Association of Academic Health Libraries, a group made up of library directors and associate directors that will meet in Boston in November of 2009. It is imperative that VIVOweb also be introduced to potential end users (researchers) through presentation and exhibition at their conferences. The American Association for the Advancement of Science Annual Meeting is a natural choice for exhibition. Other ways to advertise VIVOweb to potential end users include correspondence with associations such as the National Academy of Sciences, the Institute of Medicine, and the National Academy of Engineering. An obvious group to whom to advertise is the CTSA Consortium. Additional methods of dissemination will include the use of association email lists, demonstrations on YouTube, and an advertising webpage dedicated to VIVOweb. C.5.
Evaluation The end goal of this project is to have a tool for researchers to facilitate research networking, collaborations, and data sharing to improve scientific dissemination. To provide evidence that the VIVO project meets this goal, we will employ various evaluation techniques to assess a minimum of six objectives related to VIVO support, implementation, and dissemination (see Table 4). The Washington University (WU) team will lead the evaluation of VIVO at each site, gathering information using data-mining, surveys, observational analysis, and personal interviews. We will focus on usability and outcome evaluations. The usability evaluation will be designed to document and analyze the implementation of VIVO at the adoption sites, assessing the completeness of implementation, consistency among institutions, and gaps between design and delivery. Outcome evaluations will be designed to assess the impact, benefits, and changes at each institution. The evaluation will be guided by the milestones outlined in Table 7 and Table 8 in section E.5. As outlined by the National Institute of Standards and Technology, five key attributes will be assessed through the usability evaluation: learnability, efficiency, memorability, errors, and user satisfaction74, using evaluation methods such as data-mining, surveys, task analyses, and focus groups. We will employ data-mining metrics using WU-developed code or through a pre-packaged tool such as Morae75 to conduct the usability evaluation. Through data-mining we will monitor VIVO usage, gathering data such as page views, number of visits and unique views to analyze the usage of VIVO by volume of participants and quantity of information viewed. Additional measures may include referring and referral websites, successful and failed search results, and path analysis. While website monitoring will be continuous, data will be collected on a quarterly basis.
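As an illustration only, the quarterly usage measures named above (page views, visits, and unique visitors) could be derived from web request logs along the following lines. This is a minimal sketch, not the WU-developed code or the Morae tool: the record layout, the function names, and the 30-minute inactivity cutoff used to separate one visit from the next are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a "visit" ends after 30 minutes without a request.
SESSION_GAP = timedelta(minutes=30)


def quarter_of(ts):
    """Return a label such as '2009-Q4' for a datetime."""
    return f"{ts.year}-Q{(ts.month - 1) // 3 + 1}"


def usage_by_quarter(records):
    """Summarize page views, visits, and unique visitors per quarter.

    `records` is a hypothetical list of (visitor_id, iso_timestamp, page)
    tuples, e.g. parsed from a web server access log.
    """
    stats = defaultdict(lambda: {"page_views": 0, "visits": 0, "visitors": set()})
    last_seen = {}  # visitor_id -> time of that visitor's previous request

    # Process requests in chronological order.
    parsed = sorted((datetime.fromisoformat(stamp), visitor)
                    for visitor, stamp, _page in records)
    for ts, visitor in parsed:
        q = stats[quarter_of(ts)]
        q["page_views"] += 1
        q["visitors"].add(visitor)
        # A new visit starts on first sight or after a long inactive gap.
        previous = last_seen.get(visitor)
        if previous is None or ts - previous > SESSION_GAP:
            q["visits"] += 1
        last_seen[visitor] = ts

    return {
        label: {
            "page_views": q["page_views"],
            "visits": q["visits"],
            "unique_visitors": len(q["visitors"]),
        }
        for label, q in stats.items()
    }
```

Calling `usage_by_quarter` on one quarter's log records yields the three counts for that quarter; measures such as referring sites or failed searches would need additional fields in each record.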
User testing will be incorporated into the evaluation where trained evaluators and software developers will observe and record end-users interacting with VIVO in both general and specific tasks. This will identify specific uses and difficulties with VIVO for reporting and further action. To understand how end-users utilize VIVO, we will conduct a task analysis and compare the results with the goals of VIVO’s creators, and we will conduct focus groups to gather key information such as barriers to usage and suggestions for improvements. A fixed evaluation tool such as the MIT Usability Guidelines76 will provide measurement consistency across evaluators over the two-year project period. The guidelines include assessing navigation, functionality, user control, language and content, online help and user guides, system and user feedback, web accessibility, consistency, error prevention and correction, and architectural and visual clarity. Adopting institutions will be given these guidelines at the beginning of the program, and two assessments will take place for each institution, with a report with recommendations returned to the institution within two months after evaluation completion. Items to be assessed and analyzed for the outcomes evaluation include inputs, activities conducted, outputs, outcomes, and outcome indicators. At the initiation of the program a survey will be designed and administered to assess the expectations for VIVO program implementation and usage by adopting institutions. Follow-up evaluations of the same persons will be conducted at the end of the first and second years, incorporating dissemination and adoption practices. All surveys will be conducted on-line by sending a link to each potential respondent’s e-mail address.
Personal interviews with researchers, site representatives, IT implementers, and other key personnel will be conducted by Dr. Leslie McIntosh and a junior bioinformatics specialist. Additionally, we will conduct web searches for VIVO references in presentations, papers, and other documentation sources. When necessary, both quantitative and qualitative questions will be designed and delivered through structured and unstructured interviews. Through the methods described, we will answer the following questions: 1. How well does the software meet the needs of investigators for finding appropriate people for collaboration and research? 2. How well does the software meet the needs of institutions for learning about their own activities? 3. How much effort is involved in implementing, hosting and maintaining the system and the data stored in the system? 4. How has the 2-year grant program addressed the issues of sustainability of development and support for the software? 5. How accurate and timely are the data at each institution? Is the accuracy related to the techniques used to implement and support the software? How? What recommendations can be made for improvement? We will apply for IRB exemption or submit the necessary paperwork to satisfy IRB requirements at implementation sites for the evaluation. The evaluation team at Washington University will prepare quarterly reports, which will be made available to the executive advisory board. The project plan addresses governance, technical design, implementation, dissemination and evaluation based on the development of a community supported by the libraries, and focused technical activity to extend VIVO’s capabilities for national networking of researchers.
Table 4 Primary Objectives to be evaluated in VIVO assessment
Columns: Objective, Assessment, Evaluation method. Each row below gives the objective, followed by its assessment and evaluation-method entries in their original order.
Support network will be in place: Identify individuals tasked with outreach and support in the network; Evaluate communication/interaction among persons; Discussions with VIVO Wiki and Advisory Board; On-line survey incorporating social network analysis; Telephone interviews
VIVO implemented at adopting institutions: Visit VIVO websites at adopting institutions; Web-based surveys to key personnel at implementation sites; Trained evaluators will employ usability guidelines; Report the number of institutions using VIVO and descriptive statistics of website use
VIVO support services and training meet the needs of users at VIVO implementation sites: Evaluate success of training (both in-person instruction as well as “just-in-time” web-based tutorials); Web-based surveys upon completion of training modules; Follow-up survey within two months after training
VIVO disseminated beyond initial adopters: Collect evidence demonstrating presentations given to promote VIVO; Educational outreach activities; Web search of presentations (e.g. use Google Scholar); On-line survey to key personnel at implementation sites
VIVO accessed and used by diverse user community: Visit VIVO websites at adopting institutions; Data-mine usage of VIVO sites including incoming IP addresses
VIVO community support developed beyond initial implementers: Monitor on-line VIVO forums; Data-mine forum content, robustness, and end-user usage
D. Role of the Participating Institutions and Staffing of the Project D.1. Participating Institutions Seven schools will serve as early adopters of the VIVO system (see Table 5). These schools represent significant diversity in terms of size, geography, student population and NIH activity. All seven schools have NCRR centers.
Five of the schools have CTSA awards – UF, Weill Cornell Medical College, Indiana University, Scripps Research Institute and WU. Three schools – UF, Cornell University and Indiana University – will participate in the technical activity required to develop subsequent versions of VIVO. All seven schools will implement two versions of VIVO, the current version and the version to be developed under this proposal (see Project Deliverables and Timeline). As part of the implementation, all seven schools will participate in the evaluation of VIVO and its use by researchers.
Table 5 Participating Institutions
School: Role
University of Florida: NCRR, CTSA, Development, Implementation, Indexing, Lead
Cornell University, Ithaca: NCRR, Development, Implementation, Indexing
Indiana University: NCRR, CTSA, Development, Implementation
Washington University: NCRR, CTSA, Implementation, Evaluation
Weill Cornell Medical College: NCRR, CTSA, Implementation
Scripps Research Institute: NCRR, CTSA, Implementation
Ponce Medical School, Puerto Rico: NCRR, Implementation
The University of Florida (UF) will serve as the lead institution and developer of interfaces and packaging for rapid deployment of VIVO at other institutions. UF is the fourth largest university in the United States, with over 51,000 students on its Gainesville campus. As a land grant university, UF operates in all 67 counties across the state of Florida. Research awards to UF faculty account for $576M of external support annually. Through its Clinical and Translational Science Institute UF is affiliated with the Moffitt Cancer Center77, Shands HealthCare78, the Malcom Randall Veterans Affairs Medical Center of the North Florida/South Georgia Veterans Health System79, the largest in the country, and the Burnham Research Institute80 in Orlando. UF is the lead institution for the project.
Efforts include overall project direction, facilitation of governance processes and structures, local implementation and support, participation in evaluation, development of VIVOweb interfaces to SOR and other platforms, packaging of VIVO software for rapid deployment, identity management support, instructional media and design, and coordination of site implementations. Cornell University, Ithaca will serve as the lead institution for the extension of VIVO for national networking. Founded in 1865 by Ezra Cornell and Andrew Dickson White, Cornell is the federal land-grant institution of New York State, a private endowed university, a member of the Ivy League/Ancient Eight, and a partner of the State University of New York. It consists of fourteen colleges and schools: seven undergraduate units and four graduate and professional units in Ithaca, two medical graduate and professional units in New York City, and one in Doha, Qatar. The Ithaca campus includes 1,627 faculty, 13,562 undergraduate students, and 6,077 graduate and professional students. Life Sciences research at Cornell cuts across most of the colleges and schools with 44 graduate fields from animal breeding to zoology. It is a particular focus of both Cornell’s College of Agriculture and Life Sciences and its College of Veterinary Medicine. The Cornell University Library (CUL) is one of the twelve largest academic research libraries in the United States, with a long history of research and development in the area of digital information services. The Albert R. Mann Library, part of CUL and the home of VIVO, offers one of the country’s best library collections in agriculture, life sciences and human ecology, as well as providing extensive computing facilities, a broad suite of digital media technology, tools for GIS, hands-on workshops, customized reference consultations and a range of other services.
Mann Library is internationally known for its digital library efforts including VIVO, TEEAL (The Essential Electronic Agricultural Library) and the USDA Economics, Statistics, and Market Information System. Efforts at Cornell include leadership of the Technology Advisory Committee; technical development of enhancements to VIVO including distributed, multi-institutional indexing, scalability, and support for individual and team networking; and national coordination of the VIVOweb outreach efforts. Indiana University will lead development in social networking and ontologies. Indiana University was founded in 1820 and is one of the state’s leading research and educational institutions. General information about Indiana University, including an overview of physical facilities, is available online81. Indiana University includes two main research campuses and six regional (primarily teaching) campuses. The Indiana University Bloomington campus includes 2,309 full- and part-time faculty, 5,201 professional staff, 8,596 graduate and professional students and 30,394 undergraduate students. The Indiana University-Purdue University Indianapolis (IUPUI) campus is operated by Indiana University and includes schools from Indiana University and Purdue. The IUPUI campus includes 3,161 full- and part-time faculty, 4,645 professional staff, 8,652 graduate and professional students and 21,202 undergraduate students. The key Indiana University schools located at IUPUI include the Indiana University Schools of Medicine, Informatics and Business. The Office of the Vice President for Information Technology (OVPIT) and University Information Technology Services (UITS) are responsible for delivery of core information technology and cyberinfrastructure services and support.
OVPIT and UITS collectively have a budget of more than $110,000,000 annually, employing more than 700 full time staff members. Through its Clinical and Translational Science Institute, Indiana is affiliated with sixteen commercial and university entities82 including Purdue University and the University of Notre Dame. Washington University School of Medicine (WUSM) Adoption Team will perform evaluation of implementation and integration of the VIVO application at all partner institutions and will serve as an implementation site for VIVO. Washington University (WU) has a rich tradition of academic, research, and clinical excellence. WU includes the School of Medicine located at the Medical Center Campus and six other schools (Arts and Sciences, Business, Design and Visual Arts, Engineering and Applied Science, Law, and Social Work) located at the Danforth Campus two miles away. The two campuses are connected by a regular shuttle service and the public light rail service. WU has 105 academic departments with 11,158 full time students. WU has a history of distinguished faculty: 30 are currently members of the National Academy of Sciences, 26 are members of the Institute of Medicine, 19 hold MERIT awards from NIH, and six are Howard Hughes Medical Institute investigators. Twenty-two Nobel laureates have been associated with WU as faculty or students, 17 from the School of Medicine. The WUSM is organized into 20 departments, four teaching and research divisions, and seven graduate training divisions with a total of 1,727 faculty, 594 medical students, and 638 graduate students. The Division of Biology and Biomedical Sciences oversees an array of graduate training programs, including the largest Medical Scientist Training Program (MD/PhD) in the country. More than 90% of the Medical Scientist Training Program graduates are actively involved in research. 
In FY07, NIH grants awarded to WUSM faculty totaled $365.986 million, placing WUSM among the top NIH-funded medical schools in the country. WU also has outstanding patient care programs through its affiliation with BJC Healthcare, a 13-hospital integrated health care delivery network in the Midwest, which is anchored by two nationally ranked teaching hospitals, Barnes-Jewish Hospital and St. Louis Children’s Hospital. These resources make WU well-suited to act as a key participant in the VIVO project as both an implementation site and the lead for evaluation efforts of the VIVO consortium. WU has a keen understanding of the institutional, collaborative, cultural, and regulatory challenges that slow the process of transferring basic and clinical scientific discoveries into improvements in human health, and looks forward to participating in this important effort, which will facilitate academic collaborations and thus ultimately speed the implementation of scientific discoveries. Weill Cornell Medical College joined with 6 partners (Cornell University, Ithaca; Cornell University Cooperative Extension, New York City; Hospital for Special Surgery; Hunter Center for Study of Gene Structure and Function; Hunter-Bellevue School of Nursing; and Memorial Sloan-Kettering Cancer Center) to devise a strategy for creating a Clinical and Translational Science Center. The mission of this Center is to nurture and promote a research environment that would accelerate the clinical application of basic science discoveries. We are shaping programs to integrate clinical and translational science across multiple departments, schools, clinical and research institutes and hospitals. We developed mechanisms to foster the creation of multidisciplinary research teams and incubators to develop innovative research tools and information technologies, which ultimately would advance the application of new knowledge and techniques to good clinical practice in patient care.
Weill Cornell is already a participant in VIVO through Cornell Ithaca. We will extend our implementation with the new features and semantics and extend the VIVO functionality to our partner CTSC institutions. The Scripps Research Institute (TSRI) will be an early adopter of VIVO. One of the world's largest independent, non-profit biomedical research organizations, The Scripps Research Institute operates two campuses, with headquarters in La Jolla, California, and a new campus focused on basic biomedical science, drug discovery, and technology development in Jupiter, Florida. TSRI is internationally recognized for its discoveries in immunology, molecular and cellular biology, chemistry, neurosciences, autoimmune, cardiovascular, and infectious diseases, and synthetic vaccine development. Established in its current configuration in 1961, it employs approximately 3,000 scientists, postdoctoral fellows, scientific and other technicians, doctoral degree graduate students, and administrative and technical support personnel. Ponce School of Medicine, Puerto Rico will be an early adopter of VIVO. Founded in 1977, Ponce now offers nationally accredited graduate programs in the disciplines of Medicine, Clinical Psychology, and Biomedical Sciences, and a Master's degree in Public Health. The Ponce School of Medicine Partnership with the Moffitt Cancer Center83 addresses the cancer problem in Puerto Rico, focusing on basic research, cancer education and training, outreach and tissue procurement. Ponce is a member of the Alliance for Advancement in Biomedical Research in Puerto Rico84, an NCRR-funded center. D.2. Staffing of the Project Dr. Michael Conlon will serve as principal investigator and project director. Dr. Conlon has extensive experience in large-scale software development and deployment and in biomedical research. Dr.
Conlon led the development of the software used in the INVEST clinical trial, collecting and processing data from over 850 physicians' offices in 14 countries. Dr. Conlon led the technical implementation of the UF PeopleSoft System (myUFL85) and built and managed a team that implemented infrastructure, system interfaces, data conversions and system configuration for 18 modules in 18 months. The $29M project was delivered on time and on budget. Dr. Conlon led the design and implementation of the UF Directory86, an identity management system containing records of over 1.7 million current and former faculty, staff and students of UF and supporting federated identity through Shibboleth87 for 170,000 current credential holders. Dr. Conlon led the efforts to create UF’s Active Directory88 system supporting servers in 50 locations across the State of Florida, a BizTalk89 system providing service-oriented architecture services to enterprise applications, and UF Exchange90 – UF’s implementation of Microsoft Exchange and Microsoft Office Communications Server. Dr. Conlon is Associate CIO for IT Architecture at UF as well as Associate Director of the UF Clinical and Translational Science Institute91 and Interim Director of Biomedical Informatics in the College of Medicine. A former CIO of the UF Health Science Center and Research Associate Professor of Biostatistics, Dr. Conlon is a frequent presenter on identity and access management and serves on the InCommon92 Research Administration working group. Valrie Davis will serve as site liaison, and coordinate implementation of VIVO at all seven institutions. Davis has co-led the local VIVO implementation called GatorScholar at UF. She is a member of the UF Libraries’ Emerging Technologies Committee, leads local exploration of ontologies, and assists in the dissemination of technologies across the campus community.
She coordinates library-based services for off-campus users including more than 995 faculty and staff located at 13 Research & Education Centers and 67 County Extension offices throughout the State of Florida. She also supports a variety of on-campus agricultural and life science departments. As a library instructor, she presents specialized face-to-face training sessions and develops specialized training tutorials using software such as Camtasia. A member of the Born Digital Initiative Working Group, Policy Development and Grant Writing Sub-Committee, she assisted in the identification of preservation and access issues related to a national interface for born-digital and reborn digital agricultural resources. She is an active member in many national organizations where she provides expertise in the agricultural sciences, information sharing technologies, and electronic resource development. Ms. Davis will serve as site liaison, coordinating implementation of VIVO at participating schools. Dr. Sara Russell Gonzalez will lead the local UF implementation, expanding the current implementation to all of UF. She co-led UF’s initial test implementation of the VIVO database. She is the Physical Sciences librarian at the Marston Science Library at UF, providing research assistance and instruction in the subjects of Physics, Astronomy, and Geology. Through her liaison work to these departments, Dr. Russell Gonzalez has developed an expertise in harvesting and retrieval of scientific publications. Her research interests include applying bibliometrics to understanding the publishing behavior of scientists. She was recently a consultant on a NASA grant with members of the UF Astronomy department to acquire and set up an Astrowall for display of 3-D astronomical data for educational purposes. Prior to joining UF, Dr.
Russell Gonzalez was a research seismologist with Weston Geophysical Corporation investigating discrimination and location of nuclear explosions. Dr. George Hack will serve as the lead in the development of instructional support and media for VIVOweb. George Hack has been on faculty at the University of Florida since 1997 serving in the Institute of Food and Agricultural Sciences as coordinator of extension education programs, teaching graduate and undergraduate technology courses in the College of Education, and as Assistant Director for Instruction and Information systems in the Health Science Center Libraries. Dr. Hack has a doctorate in Educational Technology and has designed online and face-to-face instruction at the University of Florida and other universities. Recently he has collaborated on the Compendium for Children’s Health with a team of international physicians, setting up an online environment for Pediatricians to receive instruction in Community Pediatrics. His current research investigations include human-computer interactions as they relate to information resources and information seeking behaviors. He plans to use the findings from this research to better inform interface development, bibliographic instruction, physical and technology spaces within the library, and web design. Christopher Barnes manages software development for the Clinical and Translational Informatics Program at UF. Mr. Barnes has led the development of hundreds of research systems including those supporting the Florida Brain Tumor Registry, the Emerging Pathogens Institute, the Claude Pepper Center for Aging, the Texas Medicare data repository, and the portal of the UF Clinical and Translational Science Institute. He has significant experience with Drupal, Shibboleth, and research software development. Mr. 
Barnes will lead the UF development teams responsible for VIVO packaging, incorporation of federated identity management using Shibboleth, and the construction of interfaces for systems of record and interfaces for Sakai and Drupal. Dr. Michele R. Tennant is the Bioinformatics Librarian at the UF Health Science Center Libraries and UF Genetics Institute. Dr. Tennant has provided reference and liaison services at the library since 1995. Since 2001 she has served as embedded librarian in the UF Genetics Institute, providing consultations and extensive instruction in the use of bioinformatics and more traditional library resources. As liaison and embedded librarian, she has forged strong professional relationships with UF biomedical researchers, particularly those whose work has genetic-, molecular- or bioinformatics-related components. She currently serves as contact for implementation of GatorScholar at the UF Health Science Center. Dr. Tennant was part of two teams of information professionals from throughout the country that created online educational materials for the National Center for Biotechnology Information (NCBI), and taught courses onsite and on a regional basis for the NCBI. Dr. Tennant’s research interests include how scientists use bioinformatics-related databases, in particular those developed by the NCBI, and attitudes of researchers and librarians to library-based bioinformatics support. She is active at the national level in the Medical Library Association and the Special Libraries Association, and is currently a member of the National Library of Medicine’s Biomedical Library and Informatics Review Committee. Dr. Tennant’s work on the proposed grant is fourfold: 1. She will serve as researcher support at the University of Florida; 2. She will coordinate the UF liaison librarians’ outreach efforts (marketing, instruction, and communication to the technical team of researchers’ needs related to VIVOweb) and will perform these same functions with her research clients; 3. She will serve on the team coordinating national activities (efforts to recruit additional libraries, present and exhibit at national and regional conferences, etc.); 4. She will assist UF’s ontology team. Dr. Dean Krafft is the Chief Technology Strategist at the Cornell University Library and a Senior Research Associate in Information Science. Dr. Krafft will lead the Cornell effort, overseeing the VIVOweb technical development at Cornell. He will also chair the project’s Technical Advisory Board, working with technical experts from across the country. As the former Director of IT for Computing and Information Science at Cornell and the former Principal Investigator on the National Science Digital Library (NSDL) project93, he has extensive experience in managing large software development projects, in IT support and production, and in working in large, complex virtual organizations. While with NSDL, he led the effort to create Ncore94, an open-source technical infrastructure for digital libraries, to support the thirteen NSDL Pathways partners and the over 130 collections that comprised NSDL. Dr. Medha Devare will serve as national coordinator for VIVO. Dr. Devare is a bioinformaticist based in Cornell’s Albert R. Mann Library, and has coordinated the implementation and outreach efforts for VIVO across Cornell University's 11 colleges and 3 U.S. campus locations. She has also developed relationships with and built interest in the VIVO platform at a number of institutions through liaison and outreach activities and conference presentations. Apart from coordinating the VIVO project at Cornell, Dr.
Devare has taught bioinformatics workshops at Cornell (Ithaca) and at Weill Cornell Medical College, and organized and taught a genomics course and co-taught a cropping systems course at Cornell. She is currently working with faculty to recreate the introductory biology laboratory at Cornell. Dr. Devare remains involved with research on agricultural biotechnology, with several reports and publications out and in review on this topic. As national coordinator for VIVO, she will promote the project, the VIVOweb platform and the library-based support model, and coordinate outreach efforts with information specialists and other personnel at all seven schools, and at the national scale. Instructional media and promotional materials will be developed under Dr. Devare's coordination with the team from UF. Jonathan Corson-Rikert will lead development teams at Cornell University to extend VIVO’s capabilities as described in this proposal. Mr. Corson-Rikert has been a programmer and project leader in Information Technology Services at Cornell’s Albert R. Mann Library since 2001, working on projects including the VIVO virtual life sciences library95, the Cornell University Geospatial Information Repository96, and e-Clips97, Cornell’s collection of digital video clips on entrepreneurship. Prior to joining Mann Library, he worked as research administrator for the Program of Computer Graphics at Cornell, programmed geographic software at the Harvard Lab for Computer Graphics and Spatial Analysis, and developed early digital cartography applications at the Dane County Regional Planning Commission in Wisconsin. Dr. Katy Börner will direct efforts related to social networking, metrics and presentation. Börner is the Victor H.
Yngve Associate Professor of Information Science at the School of Library and Information Science, Adjunct Associate Professor in the School of Informatics, Core Faculty of Cognitive Science, Research Affiliate of the Biocomplexity Institute, Fellow of the Center for Research on Learning and Technology, Member of the Advanced Visualization Laboratory, and Founding Director of the Cyberinfrastructure for Network Science Center98 at Indiana University. She is a curator of the Places & Spaces: Mapping Science exhibit99. Her research focuses on the development of data analysis and visualization techniques for information access, understanding, and management. She is particularly interested in the study of the structure and evolution of scientific disciplines; the analysis and visualization of online activity; and the development of cyberinfrastructures for large scale scientific collaboration and computation. She is the co-editor of the Springer book on ‘Visual Interfaces to Digital Libraries’ and of a special issue of PNAS on ‘Mapping Knowledge Domains’ (2004). Her new book ‘Atlas of Science’ published by MIT Press will become available in 2010. Dr. Ying Ding will lead efforts pertaining to the development and maintenance of ontologies used by the Semantic Web to represent scientists and investigators. Dr. Ying Ding is an Assistant Professor in the School of Library and Information Science, Indiana University. She previously worked as an Assistant Professor at the University of Innsbruck, Austria and as a researcher at the Division of Mathematics and Computer Science at the Free University of Amsterdam, the Netherlands. She has more than eight years of experience and a strong research track record in the Semantic Web area.
She was involved in the early development of the DAML+OIL language which evolved into OWL, the current W3C standard for ontology definition. She has been involved in various European-Union funded projects in the Semantic Web area (KnowledgeWeb, Ontoweb, EASAIER, OntoKnowledge, IBROW, SWWS, COG, Htechsight, Esperonto, SEKT, DIP, Triple Space Computing). She was one of the major organizers and initiators for the International Semantic Web Conference, the European Semantic Web Conference and the Asian Semantic Web Conference. She has published more than 70 papers in journals, conferences and workshops and has served as a program committee member for more than 80 international conferences and workshops. She is co-author of the book, “Intelligent Information Integration in B2B Electronic Commerce,” published by Kluwer Academic Publishers. She is also co-author of book chapters in “Spinning the Semantic Web,” published by MIT Press, and “Towards the Semantic Web: Ontology-driven Knowledge Management,” published by Wiley. Her current interest areas include Semantic Web, Webometrics, citation analysis, information retrieval, knowledge management and application of Web Technology.

Dr. William K. Barnett will be responsible for coordinating all aspects of the implementation of the Cornell VIVO at the Indiana CTSI, including oversight of all grant personnel, coordination with Indiana CTSI programs and institutions, and coordination with technical implementation. Barnett oversees life sciences and biomedical research technologies at Indiana University and the Indiana University School of Medicine (IUSM). As the Senior Manager of Life Sciences, he oversees the development and implementation of research technology programs for biological research including high performance computing (HPC) applications, analytical pipelines, and genomics research.
As the Director of the Advanced IT Core at the IUSM, he oversees the development and management of biomedical applications, including HPC and applications development in support of health care research. As the Director of Information Architectures for the Indiana CTSI, he oversees the development of collaborative technologies for the Indiana Clinical and Translational Sciences Institute. Dr. Barnett previously served as the Vice President and Chief Information Officer at the Field Museum of Natural History, where he oversaw the development of collections-based digital library initiatives and genomics research technologies. Previously at the American Museum of Natural History in New York, he oversaw the development of the institutional infrastructure and the growth of core analytical imaging facilities. Dr. Barnett has a BA in Anthropology (College of William and Mary) and an MA and Ph.D. in Archaeology (Boston University).

Dr. Anurag Shankar will be responsible for technical coordination among Cornell technical staff installing VIVO, the Indiana CTSI HUB technical staff, and the Purdue University systems administrators responsible for software installation and maintenance on the HUB servers. Shankar serves as a project manager and is the primary customer liaison for the Advanced IT Core, a partnership between the IU School of Medicine and Life Sciences. A computational astrophysicist by training, Shankar has held various positions in the past with UITS, including Manager of Unix support, Distributed Storage Services (now Research Storage), and the national TeraGrid project. He was also responsible for overseeing the Advanced IT Core for the Indiana Genomics Initiative. Shankar holds degrees in physics, mathematics, and astronomy.

Dr. Rakesh Nagarajan will lead the Washington University effort. Dr.
Nagarajan is the Biomedical Informatics Program director of the WU CTSA, termed the Institute of Clinical and Translational Sciences (ICTS), one of whose sub-aims is to implement research networking solutions. Through the ICTS and other initiatives, he leads the biomedical informatics infrastructure development effort at Washington University as director of the WU Center for Biomedical Informatics (CBMI). Dr. Nagarajan and his team are implementing a common informatics infrastructure to support the diverse needs of physician-scientists and bench researchers.

Dr. Kristi Holmes is a bioinformaticist based in Becker Medical Library and will lead the outreach efforts at WU, including promotion of VIVO and training in its use at WU, and assistance with ontology development. At WU, she is tasked with the development and presentation of bioinformatics resource workshops for the university community, integration of molecular biology information resources into medical school and graduate-level curricula, and application of bioinformatics resources to research problems through individualized consultations and collaborative relationships. She has also served as a course developer and instructor for the NCBI Advanced Workshop for Bioinformatics Information Specialists offered by the National Center for Biotechnology Information (NCBI). Dr. Holmes is well suited for leading outreach efforts at WU, given her active role in investigating collaboration and faculty profiling applications, her involvement in assessing issues related to research impact, and her efforts to provide instruction, training resources and support materials to researchers.

Dr. Leslie McIntosh of Washington University will serve as implementation lead and will also coordinate project evaluation activities for participating institutions. Dr.
McIntosh will be able to serve in this position as she has an extensive background in database, web site, and on-line survey development for both educational and private institutions. In addition, Dr. McIntosh has experience performing evaluations in other projects including designing, conducting, and analyzing quantitative and qualitative evaluations of Evidence-Based Public Health trainings within the Missouri Public Health departments, national public health department, and the World Health Organization; conducting evaluations of table-top exercises to assess public health disaster preparedness; and evaluating youth attitudes, opinions, and beliefs from on-line forums using text analysis. She has also been a consultant with the FBI to assist in survey techniques for collecting social network data.

Dr. Curtis Cole will lead the implementation of VIVOweb at the Weill Cornell Medical College. Dr. Cole, board certified in internal medicine, is Chief Medical Information Officer of Weill and Acting co-director of the Weill Clinical and Translational Science Center Biomedical Informatics Program.

Dr. Gerald Joyce of The Scripps Research Institute is dean of the faculty and Co-Program Director and Director for Translational Science, Scripps Translational Science Institute, NIH Clinical and Translational Science Award (CTSA) Consortium. Dr. Joyce will lead the implementation of VIVOweb at the Scripps Research Institute.

Paula King of the Scripps Research Institute is the Director of the Kresge Library and will lead the researcher support processes for TSRI.

Dr. Richard Noel is Associate Professor of Biochemistry at Ponce Medical School and internal advisor for the Ponce Medical School Moffitt Cancer Center Partnership. Dr. Noel will serve as institutional liaison for Ponce Medical School and will lead the implementation and support efforts.

E.
Project Deliverables and Timeline

VIVO work will proceed along three major activities: product development, community support development and governance. Product development will be driven by three releases. Release 1 is focused on institutional setting deployments. Release 2 implements national networking of scientists. Release 3 integrates resource discovery and features originating from community development. Complementary community support is developed along adoption, implementation and use processes. The goal of the community support effort during the project is to create a sustainable support activity for VIVO after the end of the project. The governance processes provide a means for accepting community input and determining the future direction of VIVO. An open, participatory governance process will drive adoption of the national network and create value for all participating scientists and institutions.

Table 6 VIVO Project Deliverables

Deliverable                     Description
VIVO Release 1                  Scientist discovery in the institutional setting
VIVO Release 2                  National Networking of Scientists
VIVO Release 3                  National Networking of Scientists and Resource Discovery
Community Support Process       On-going support for adoption, implementation and use
Product Development Process     On-going support for software development and maintenance
Governance Process              On-going processes for community input and decision-making
Final Report                    Summary, Accomplishments, Challenges, Lessons Learned, Results of Evaluation, Next Steps

E.1. VIVO Release 1

VIVO Release 1 is a refinement of the existing VIVO platform. It is focused on researcher discovery within an institution. Data will be exposed via the Semantic Web and interoperable with Linked Data.
Release 1 will include additional features for interfacing to systems of record, as well as improvements in packaging: the scripts and procedures used to install the software, as well as instructional materials for installation.

E.2. VIVO Release 2

Release 2 includes all social networking features as well as visualization. Release 2 includes support for federated identity management as well as support for groups. Indexing features will be provided for semantic query across institutions. Release 2 constitutes the full national networking capability described in this proposal.

E.3. VIVO Release 3

Release 3 includes features identified by evaluation and vetted by governance processes throughout the grant period. Release 3 includes integration with the resource discovery platform described in the U24 Request for Application.

E.4. Community Support Process

A critical component of the project plan is the development of on-going, community-based support for VIVO. As previously described, the libraries constitute a natural foundation for this support. Throughout the project, the libraries will develop and provide support for adoption, implementation and use of VIVO. They will lay the foundation for on-going sustainability. Technical, support and governance activities for VIVO must be sustained. Support sustainability is in the best interests of the institutions which have adopted VIVO. As the VIVO community grows, the resources to support VIVO on-line grow as well. Each institution will need to commit to some support of their faculty and scientists in the on-going use of VIVO. In this way, ongoing support activity is created during implementation and use of VIVO.

E.5. Product Development Process

During the grant period, the VIVOweb team will develop a sustainable product development process, ensuring the long-term viability of the VIVOweb technical platform.
Based on existing open source models, the VIVOweb development process will provide on-going enhancements and maintenance of the VIVOweb software. Technical sustainability will be achieved by creating an open source community around VIVO in much the same way as communities have developed in support of Sakai, Drupal, and Kuali100. The support for technical sustainability will be generated during the project by the libraries. All activities are oriented to the ultimate goal of a self-sustaining, community-based technical activity. Participants share ideas and code for the purpose of supporting and improving VIVO.

E.6. Governance Process

On-going governance of VIVO will be developed over the course of the project. The CTSAs and other interested groups of schools, as well as the NIH, have a strong vested interest in the continuation of VIVO governance.

E.7. Final Report

The project will produce a final report summarizing the work of the two-year period and laying out the next steps for continued development and support of VIVO in the open source community. The final report will include summary evaluation by Dr. Leslie McIntosh as well as lessons learned, challenges and how they were addressed and any remaining challenges with proposals for addressing them. The final report will be prepared with input from all elements of the project: the project teams, the principal investigator, the advisory groups and the evaluation team.

E.8. Timeline

The project timelines for year 1 and year 2 are shown in Table 7 and Table 8. Major goals for the first year include: 1) establishing all governance, support and development teams, structures and processes; 2) finalizing release 1 and implementing it at participating institutions; and 3) establishing adoption, implementation, use and sustain support activities.
Table 7 VIVO Project Timeline, Year 1
[Schedule grid columns: Tasks, Qtr 1, Qtr 2, Qtr 3, Qtr 4; per-task quarter marks not recoverable]

Governance
  Establish governance groups and support structures
  Executive Advisory meetings
  Scientific and Technical Advisory processes
  Evaluation activities and reporting
Support
  Staff support teams
  Community outreach efforts for release 1
Development
  Establish development coordination
  Staff development teams
  Complete release 1
  Develop release 2
Adoption
  Facilitate adoption of release 1 beyond initial participants
Implementation
  Consortium schools implement release 1
  Feedback from release 1
Use
  Support release 1
Sustain
  Establish web sites, download APIs for module development
  Support community development
  Establish community input process

By the completion of year 1, the participating schools will have completed their implementations in stand-alone and institutional settings. A community of practice will be in place to drive the adoption of VIVO at institutions beyond those participating in the project. A support network for the use of VIVO will be in place. Development of release 2 will be more than 50% complete. The executive and governance processes will be constituted and functioning. An open source community will be established to support the continuing development and technical support of VIVO.

Year 2 focuses on completion of release 2 with its deployment and support activities, and the activities needed to create an on-going support system for VIVO.
Table 8 VIVO Project Timeline, Year 2
[Schedule grid columns: Tasks, Qtr 1, Qtr 2, Qtr 3, Qtr 4; per-task quarter marks not recoverable]

Governance
  Technical and Scientific Advisory processes continue
  Executive Advisory meetings
  Evaluation activities and reporting
  Final report
Support
  Staff support teams
  Community outreach efforts for release 1
Development
  Staff development teams
  Complete release 2
  Develop release 3
  Transition to community development
Adoption
  Presentations at conferences
  Adoption by professional societies
Implementation
  Consortium schools implement release 2
  Feedback from release 2
Use
  Support release 1
  Support release 2
Sustain
  Develop community of adoption
  Develop community of implementation support
  Develop community of usage support
  Maintain community development

The VIVO implementation plan focuses on the development of a strong product and the development of a strong community. The product and community generate support for adoption, implementation and use. Activities throughout the project are oriented to the purpose of creating an on-going, sustainable community of technical and support resources for the national networking of scientists supported by VIVO.

F. Data and Software Sharing Plan

All software from this project will be made available freely to researchers and their institutions for educational, research and non-profit purposes.

Licensing. All software developed under this project will be freely available for educational, research and non-profit purposes under the terms of the VIVO software license to be developed in conjunction with the NIH. The VIVO software license will be an appropriately modified version of non-commercial open source licenses that will permit the use of the software at participating institutions.
No claims of suitability will be made and no warranty of any kind will be made. VIVO software can be modified, but under no conditions can anyone assert ownership over the code or its modifications.

Availability. The software will be freely available under the terms of the VIVO license to all biomedical researchers, educators and institutions in the non-profit sector such as education, research institutions and government laboratories. Software will be available for free unrestricted download under the terms of the license, from sites at participating institutions, including UF and Cornell University.

Open Source Community. While the source code for the VIVO software is already publicly available, the VIVOweb project will cultivate an active open-source development community by providing extensive developer documentation and a plug-in architecture enabling others to contribute new functionality to VIVO. Because VIVO is built on the popular Jena Semantic Web library, RDF tools developed in other contexts should in many cases be easy to integrate into the VIVO environment with minimal modification.

Timeline. Three releases of the software will be made by the project. Release 1 will be completed within three months of project start date. This release will include VIVO, and its required software environment for deployment in a stand-alone setting. Release 2 will be completed eighteen months from the project start date. This release will contain all features described in this proposal as well as support for all use cases described in this proposal. All consortium members will upgrade to release 2 during the course of the project. All schools will provide evaluation and feedback regarding release 2. Release 3 will be completed before the end of the project. It will include all features recommended and approved by the governance process. (See Section E for details of project plan and timeline.)

Enhancements.
A community of practice will develop around VIVO to support it after the proposal period. Community activity includes the submission of enhancements for inclusion in future releases. The R Project for Statistical Computing101 is an example of a vibrant open source community supporting a complex software system for statistical and data analysis. The VIVO community will operate in a similar fashion, establishing an archive and providing mirror sites for downloads, as well as on-line technical support through a blog and wiki.

Commercialization. The VIVO software license permits, under appropriate terms, the use of the software in commercial settings, as well as modification of the software by commercial entities and inclusion of it and/or subsets of it in other software packages. Under no conditions will software provided to the commercial entities under the terms of the VIVO license become the property of a commercial entity.

Required Components. VIVO requires the use of other open source components. No commercial software is required to run or host VIVO. Specifically, VIVO requires the use of Apache Tomcat102. Shibboleth103 is required for support of federated identity use cases. VIVO and its required components can be run on a wide variety of operating systems, both open source and commercial. VIVO and its required components can be run on a wide range of commercially available hardware. It is strongly recommended that VIVO be deployed in accord with all institutional information security and privacy requirements.

Data. All data residing in VIVO systems remains the property of the institutions hosting VIVO. Institutions control the release of data residing in VIVO to the Semantic Web for the purpose of enabling national networking of researchers. No other use of the data is implied. Data may reside in indexing systems as part of the operation of the Semantic Web. Data in indexing systems remains the property of the host institutions. Host institutions providing data to indexing systems can terminate or alter their release policies at any time.

Data and software sharing for VIVO will support the goals of the NIH in enabling national networking of scientists. The community approach, supporting adoption, implementation, use and sustainability through an open process facilitated by libraries, coupled with the simplicity and power of a completely semantic-based approach to data and social networking, enables simple and compelling discovery for scientists and institutions.

Bibliography and References Cited

1 Berners-Lee, T. 1998. Semantic Web Road Map. Available: http://www.w3.org/DesignIssues/Semantic.html
2 Berners-Lee, T., J. Hendler and O. Lassila, 2001 “The Semantic Web: A New Form of Web Content That Is Meaningful to Computers Will Unleash a Revolution of New Possibilities”, Scientific American, May 17, 2001
3 Linked Data Home Page. http://linkeddata.org. Accessed June 8, 2009.
4 Gator Scholar Home Page. http://gatorscholar.uflib.ufl.edu. Accessed May 27, 2009.
5 SWEO Community Project: Linking Open Data on the Semantic Web: Statistics on Data Sets. http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics. Accessed June 10, 2009.
6 NIH Roadmap for Medical Research. http://nihroadmap.nih.gov. Accessed June 10, 2009.
7 Mukherjea, S. 2005. Information retrieval and knowledge discovery utilizing a biomedical Semantic Web. Briefings in Bioinformatics 6(3): 252-262.
8 Ruttenberg, A., Clark, T., Bug, W., Samwald, M., Bodenreider, O., Chen, H., Doherty, D., Forsberg, K., Gao, Y., Kashyap, V., Kinoshita, J., Luciano, J., Marshall, M., Ogbuji, C., Rees, J., Stephens, S., Wong, G., Wu, E., Zaccagnini, D., Hongsermeier, T., Neumann, E., Herman, I., Cheung, K. 2007. Advancing translational research with the Semantic Web. BMC Bioinformatics 8 (Suppl 3): S2.
9 Deus, H.F., Stanislaus, R., Veiga, D.F., Behrens, C., Wistuba, I.I., Minna, J.D., Garner, H.R., Swisher, S.G., Roth, J.A., Correa, A.M., Broom, B., Coombes, K., Chang, A., Vogel, L.H., Almeida, J.S. 2008. A Semantic Web Management Model for Integrative Biomedical Informatics. PLoS ONE. 2008; 3(8): e2946. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2491554. Accessed June 10, 2009.
10 Chen, H., Ding, L., Wu, Z., Yu, T., Dhanapalan, L., Chen, J.Y. 2009 Semantic Web for Integrated Network Analysis in Biomedicine. Briefings in Bioinformatics Advance 10(2): 177-192.
11 GatorScholar Home Page. http://gatorscholar.uflib.ufl.edu/. Accessed June 1, 2009.
12 Find an Expert: Network Proof of Concept. http://vitrofe.esrc.unimelb.edu.au:8333/vitrofe/. Accessed June 10, 2009.
13 Southwest Biodiversity Knowledge Environment http://168.160.50.68/vitro/index.jsp?home=1&primary=1. Accessed June 10, 2009.
14 CALS Experts Guide http://cals-experts.mannlib.cornell.edu/. Accessed June 10, 2009.
15 Cornell University: Graduate Programs in the Life Sciences http://gradeducation.lifesciences.cornell.edu/. Accessed June 10, 2009.
16 Lyris Home Page. http://www.lyris.com. Accessed June 6, 2009.
17 Kretzmann, J.P., McKnight, J.L. 1993. Building Communities from the Inside Out: A Path Toward Finding and Mobilizing a Community's Assets. ACTA Publications, Chicago, IL.
18 Pinkett, R. 2003. Community Technology and Community Building: Early Results from the Creating Community Connections Project. The Information Society 19(5): 365-379.
19 Gaved, M., Anderson, B. 2006.
The impact of local ICT initiatives on social capital and quality of life. Chimera Working Paper 2006-6, Colchester, University of Essex.
20 See http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helppubmed, http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpcollect and http://www.ncbi.nlm.nih.gov/bookshelf for a complete listing of materials.
21 CTSA Clinical and Translational Science Awards Home Page. http://www.ctsaweb.org. Accessed June 6, 2009.
22 Oracle and PeopleSoft http://www.oracle.com/peoplesoft. Accessed June 5, 2009.
23 Drupal Home Page. http://drupal.org. Accessed June 5, 2009.
24 Sakai Collaboration and Learning http://www.sakaiproject.org/portal. Accessed June 5, 2009.
25 Resource Description Framework (RDF) http://www.w3.org/RDF/. Accessed June 10, 2009.
26 SPARQL Query Language for RDF http://www.w3.org/TR/rdf-sparql-query/. Accessed June 10, 2009.
27 Marking Up Structured Data http://www.google.com/support/webmasters/bin/answer.py?answer=99170. Accessed June 10, 2009.
28 The Yahoo! Search Open EcoSystem. http://www.ysearchblog.com/2008/03/13/the-yahoo-searchopen-ecosystem/. Accessed June 10, 2009.
29 RDF Vocabulary Description Language 1.0: RDF Schema http://www.w3.org/TR/rdf-schema/. Accessed June 10, 2009.
30 OWL Web Ontology Language http://www.w3.org/TR/owl-features/. Accessed June 10, 2009.
31 Berners-Lee, Tim. Linked Data http://www.w3.org/DesignIssues/LinkedData.html. Accessed June 10, 2009.
32 SWEO Community Project: Linking Open Data on the Semantic Web Statistics on links between Data sets http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/LinkStatistics. Accessed June 10, 2009.
33 Apache Lucene Overview http://lucene.apache.org/java/docs/. Accessed June 5, 2009.
34 Jena – A Semantic Web Framework for Java. http://jena.sourceforge.net. Accessed June 5, 2009.
35 MySQL 5.4 Home Page. http://dev.mysql.com. Accessed June 5, 2009.
36 AllegroGraph RDFStore. http://www.franz.com/agraph/allegrograph. Accessed June 5, 2009.
37 OpenLink Virtuoso Home Page. http://www.openlinksw.com/virtuoso. Accessed June 5, 2009.
38 LargeTripleStores. http://esw.w3.org/topic/LargeTripleStores. Accessed June 8, 2009.
39 Open RDF Home Page. http://www.openrdf.org/. Accessed June 8, 2009.
40 SemaPlorer Home Page. http://www.uni-koblenz-landau.de/koblenz/fb4/institute/IFI/AGStaab/Research/systeme/semap. Accessed June 9, 2009.
41 Semantic Web Challenge. http://challenge.semanticweb.org/. Accessed June 8, 2009.
42 Gruber T. (1994). Toward principles for the design of ontologies used for knowledge sharing, International Journal of Human and Computer Studies, 43(5/6): 907–928.
43 Y. Ding and D. Fensel (2001). Ontology Library Systems: The key for Successful Ontology Reuse. In I. Cruz, S. Decker, J. Euzenat and D. McGuinness (eds). Proceedings of SWWS'01, The First Semantic Web Working Symposium, page: 93-112, Stanford University, California, USA, July 29th-August 1st, 2001 http://info.slis.indiana.edu/~dingying/Publication/SWWI2001.pdf Accessed June 10, 2009.
44 Y. Ding and S. Foo (2002). Ontology Research and Development: Part 1 - A Review of Ontology Generation. Journal of Information Science, 28(2): 123-136. http://info.slis.indiana.edu/~dingying/Publication/OntologySurvey-Part1.pdf. Accessed June 10, 2009.
45 Y. Ding and S. Foo (2002). Ontology Research and Development: Part 2 - A Review of Ontology Mapping and Evolving. Journal of Information Science, 28(5): 383-396. http://info.slis.indiana.edu/~dingying/Publication/JIS_28%285%29_383.396_YING_DING.pdf. Accessed June 10, 2009.
46 Knowledge Web. http://knowledgeweb.semanticweb.org. Accessed June 3, 2009.
47 The Friend of a Friend (FOAF) Project http://www.foaf-project.org/. Accessed June 3, 2009.
48 SKOS – Simple Knowledge Organization System http://www.w3.org/2004/02/skos/. Accessed June 3, 2009.
49 DOAP – Description of a Project http://trac.usefulinc.com/doap. Accessed June 3, 2009.
50 SIOC – Semantically Interlinked On-line Communities. http://sioc-project.org/. Accessed June 3, 2009.
51 Dublin Core Metadata Initiative http://dublincore.org/. Accessed June 3, 2009.
52 Geonames Ontology http://www.geonames.org/ontology/. Accessed June 3, 2009.
53 Hugh Glaser, Afraz Jaffri, Ian Millard. Managing Co-reference on the Semantic Web. Presented at Linked Data on the Web (LDOW 2009), Madrid, Spain, April 2009. http://events.linkeddata.org/ldow2009/papers/ldow2009_paper11.pdf. Accessed June 10, 2009.
54 Shibboleth Home Page. http://shibboleth.internet2.edu. Accessed June 5, 2009.
55 Friend of a Friend (FOAF) Project http://www.foaf-project.org/. Accessed June 5, 2009.
56 RDFa Primer: Bridging the Human and Data Webs. http://www.w3.org/TR/xhtml-rdfa-primer/. Accessed June 5, 2009.
57 About the Scholarly Publications Repository. http://www.lib.ncsu.edu/repository/spr/about.html. Accessed June 8, 2009.
58 Scholarly Database http://sdb.slis.indiana.edu/. Accessed June 5, 2009.
59 Bio2RDF.org: Semantic Web Atlas of Postgenomic Knowledge. http://bio2rdf.org/. Accessed June 5, 2009.
60 Apache Hadoop Home Page. http://hadoop.apache.org/. Accessed June 8, 2009.
61 Apache Hbase Home Page. http://hadoop.apache.org/hbase/. Accessed June 8, 2009.
62 Börner, Katy, Chen, Chaomei & Boyack, Kevin W. (2003). Visualizing Knowledge Domains. In Cronin, Blaise (Eds.), Annual Review of Information Science & Technology (Vol. 37, pp. 179-255), chapter 5, American Society for Information Science and Technology, Medford, NJ. http://ivl.slis.indiana.edu/km/pub/2003-borner-arist.pdf. Accessed June 10, 2009.
63 Börner, Katy, Sanyal, Soma & Vespignani, Alessandro. (2007). Network Science.
In Cronin, Blaise (Eds.), Annual Review of Information Science & Technology (Vol. 41, pp. 537-607), chapter 12, Medford, NJ: Information Today, Inc./American Society for Information Science and Technology. http://ivl.slis.indiana.edu/km/pub/2007-borner-arist.pdf. Accessed June 10, 2009.
64 Neirynck, Thomas & Börner, Katy. (2007). Representing, Analyzing, and Visualizing Scholarly Data in Support of Research Management. Proceedings of the 11th Annual Information Visualization International Conference, Zürich, Switzerland, July 4-6, IEEE Computer Society Conference Publishing Services, pp. 124-129. http://ivl.slis.indiana.edu/km/pub/2007neirynck-ivl.pdf. Accessed June 10, 2009.
65 Information Visualization Laboratory. http://ivl.slis.indiana.edu/. Accessed June 10, 2009.
66 Cyberinfrastructure for Network Science http://cns.slis.indiana.edu/. Accessed June 10, 2009.
67 http://iv.slis.indiana.edu/ref/iv04contest/Ke-Borner-Viswanath.gif. Accessed June 5, 2009.
68 Klavans, Richard, Kevin W. Boyack. (2007). Is There a Convergent Structure to Science? In Daniel Torres-Salinas & Henk F. Moed (Eds.), Proceedings of the 11th International Conference of the International Society for Scientometrics and Informetrics. Pages 437-448. Madrid: CSIC.
69 NWB Team. (2006). Network Workbench Tool. Indiana University, Northeastern University, University of Michigan. http://nwb.slis.indiana.edu. Accessed on March 10, 2009.
70 Cyberinfrastructure for Network Science Center. (2009). Network Workbench Tool: User Manual, 1.0.0 beta, http://nwb.slis.indiana.edu/Docs/NWB-manual-1.0.0beta.pdf. Accessed on April 13, 2009.
71 Protégé Home Page. http://protege.stanford.edu/. Accessed June 5, 2009.
72 RDF/XML Syntax Specification (Revised) http://www.w3.org/TR/rdf-syntax-grammar/. Accessed June 10, 2009.
73 Moore, Geoffrey, 1999, Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers, Harper Business. 256 pages.
74 Scholtz, Jean. Usability Evaluation.
http://www.itl.nist.gov/iad/IADpapers/2004/Usability%20Evaluation_rev1.pdf. Accessed June 10, 2009.
75 TechSmith Morae Home Page. http://www.techsmith.com/morae.asp. Accessed June 10, 2009.
76 MIT IS Usability Guidelines. http://web.mit.edu/is/usability/usability-guidelines.html. Accessed June 10, 2009.
77 Moffitt Cancer Center Home Page. http://www.moffitt.org. Accessed May 24, 2009.
78 Shands HealthCare Home Page. http://www.shands.org. Accessed May 24, 2009.
79 Malcom Randall Veterans Affairs Medical Center. http://www2.va.gov/directory/GUIDE/facility.asp?id=54. Accessed June 7, 2009.
80 Burnham Institute for Medical Research Home Page. http://www.burnham.org. Accessed May 24, 2009.
81 Indiana University Fact Book. http://factbook.indiana.edu/index.shtml. Accessed June 7, 2009.
82 Indiana Clinical and Translational Science Institute Home Page. http://www.indianactsi.org/. Accessed May 24, 2009.
83 Ponce School of Medicine – Moffitt Cancer Center Partnership. http://www.moffitt.org/mccpsmpartnership. Accessed June 7, 2009.
84 AABRE Home Page. http://aabre.hpcf.upr.edu/. Accessed June 7, 2009.
85 MyUFL Home Page. http://my.ufl.edu. Accessed May 23, 2009.
86 UF Directory Home Page. http://www.bridges.ufl.edu/directory. Accessed May 23, 2009.
87 Shibboleth Home Page. http://shibboleth.internet2.edu. Accessed May 23, 2009.
88 UF Active Directory Home Page. http://www.ad.ufl.edu. Accessed May 23, 2009.
89 Microsoft BizTalk Home Page. http://www.microsoft.com/biztalk. Accessed May 23, 2009.
90 UF Exchange Home Page. http://www.mail.ufl.edu. Accessed May 23, 2009.
91 UF Clinical and Translational Science Institute Portal. http://www.ctsi.ufl.edu. Accessed May 23, 2009.
92 Incommon Federation Home Page. http://www.incommonfederation.org. Accessed May 23, 2009.
93 The National Science Digital Library. http://nsdl.org.
Accessed June 10, 2009.
94 NSDL: NCore Platform. http://ncore.nsdl.org. Accessed June 10, 2009.
95 VIVO at the Cornell University Library. http://VIVO.library.cornell.edu. Accessed May 23, 2009.
96 Cornell University Geospatial Information Repository Home Page. http://cugir.mannlib.cornell.edu. Accessed May 23, 2009.
97 E-Clips Home Page. http://eclips.cornell.edu. Accessed May 24, 2009.
98 Cyberinfrastructure for Network Science Center. http://cns.slis.indiana.edu. Accessed June 10, 2009.
99 Places and Spaces: Mapping Science. http://scimaps.org. Accessed June 10, 2009.
100 Kuali Foundation Home Page. http://www.kuali.org. Accessed May 30, 2009.
101 R Software Project Home Page. http://www.r-project.org. Accessed May 30, 2009.
102 Apache Tomcat Home Page. http://tomcat.apache.org. Accessed May 30, 2009.
103 Shibboleth Home Page. http://shibboleth.internet2.edu. Accessed May 30, 2009.