VIVO: Enabling National Networking of Scientists

Program Director/Principal Investigator (Last, First, Middle):
Conlon, Michael
A. A Semantic Approach to Research Networking
This application proposes a solution to facilitate research networking and collaboration of basic,
clinical, and translational researchers including investigators, students, technical staff and others. The
Semantic Web/Linked Data approach we envision confers the ability to implement locally controlled
researcher network installations that interoperate to create a flexible and scalable multi-institutional
network. Although we focus solely on the researcher network for this project, our platform has the
capacity to transparently include and interrelate resource listings and other relevant information. Our
technology choice allows us to easily consume, integrate and expose data hosted by partners who
have other research network or resource discovery platforms in place.
B. Rationale and Approach
B.1. Rationale
We propose an open, Semantic Web-based network of local ontology-driven databases called VIVO to
enable national networking via information sharing about researchers and their activities. VIVO will
draw on, as well as contribute to, other web-accessible services and tools. The Semantic Web1 enables
automated and human navigation to represent and mine digital data, and it supports interoperability
and integration of data from a variety of sources2. Recently, many of the larger goals of the Semantic
Web are starting to be realized, particularly in the new Linked Data3 effort. We have 5 years of
experience with VIVO (see Figure 1), a real-world Semantic Web application developed at Cornell
University in Ithaca (Cornell), and currently in use at Cornell and as GatorScholar4 at the University of
Florida (UF). VIVO facilitates research discovery and networking and demonstrates that Semantic
Web technology is ready to serve as the foundation for enabling national networking of scientists,
providing significant benefits for describing inter-linked data in flexible and openly accessible ways.
Figure 1 Discover Cornell VIVO Interface
A significant portion of ongoing and proposed technical innovation related to biomedical research
revolves around the goal of facilitating the sharing of data and other sorts of information and
resources while enhancing collaboration among researchers across a variety of disciplines. For many
researchers the geographical and organizational confines of a department, college, or even a single
university bear very little relevance to the scope of their research or the pool of colleagues they may
seek for collaboration. Researchers are often left to find their own paths to discover current activities
and active researchers in their field and beyond, usually by a combination of personal connection,
disciplinary knowledge, and fortuitous discovery through search engines, leaving those who have yet
to develop their own network of personal contacts at a significant disadvantage.
PHS 398/2590 (Rev. 11/07)
Continuation Format Page
A number of social networking tools attempt to facilitate interpersonal connection by providing a local,
national or even global platform to post and link profiles, pictures, ideas and comments. Most of these
platforms are closed worlds that often do not support direct interaction with other systems.
Technology and marketplace transitions dictate that services and data available today may no longer
be freely available tomorrow; no single tool or service is likely to successfully maintain a consistent
leadership position, so the collective investment in information risks being lost in favor of the next
popular interface or feature set.
Biomedical and translational institutions and programs face challenges similar to those of researchers in
presenting a clear picture of their biomedical teaching and research capabilities internally and to the
outside world. These institutions seek to encourage cross-disciplinary collaborations but rarely provide
any venue to support discovery and nurture person-to-person connections. There is often a disconnect
between functional areas, with most resources allocated to defining administrative, instructional and
research computing needs, rather than the evolving nature of research. The problem is even more
severe when looking beyond one institution to understand patterns or trends or identify specific
expertise; scientific information is rarely provided with any consistency except within narrow
disciplinary confines. We must be able to communicate
diverse activities, expertise, outcomes, and resources in ways that can be understood nationally and
even globally, not just in a local context. In this fluid landscape, the key element is how to combine
authoritative information from its most local context into a coherent, large-scale picture that will meet
the needs of research teams, institutions, and cross-institutional views. VIVO enables a web
(VIVOweb) of researcher data that will catalyze and accelerate the creation of connections between
researchers to meet these needs. VIVOweb will empower researchers to find information about
people of professional interest and extend their research communities not just via prior knowledge or
serendipity, but through recommendation or suggestion networks based on commonalities in the
profile data.
Figure 2 Local VIVOweb instances interlinked with each other and the Semantic Web
The most fruitful way of promoting researcher networking and discovery at the individual/personal,
institutional and national level is to provide authoritative data from and about researchers themselves
and about other related institutional resources in an open and consistent format. This is what we
currently support with VIVO, and are proposing for VIVOweb. These data will be described using
explicit semantic relationships, and published on the Semantic Web according to accepted Linked
Data standards. We will also make these data available on the human web through locally-managed
institutional portals that allow researchers to directly browse and search these data within or across
institutions. VIVOweb does not attempt to re-invent
collaborative tools such as wikis and blogs or to require that any one of them be globally accepted,
acknowledging the plethora of established and emerging popular platforms. Instead, it focuses on
enabling users to discover each other via networks based on common interests and other direct or
indirect connections, incorporating and sharing structured data with other tools as appropriate.
As of May 2009 the Linked Open Data initiative offers nearly five billion data element links
represented as “triples” of the form (object, relationship, object), for example (person, co-authored
with, person) or (person, published, paper). Many of these triples represent biomedically-relevant
genome, gene expression, protein and pathway data5. This number continues to grow, and
VIVOweb’s semantic approach allows it to easily consume these data to enrich researcher profiles,
while also interoperating with these and other data sources and making content available for
immediate consumption. A number of publications also suggest that there has been great innovation
and interest in Semantic Web applications to facilitate research in a variety of areas within the
biomedical and life sciences communities, including genetic and drug efficacy analyses and clinical
and molecular dataset management, particularly from the perspective of strengthening translational
research, a key goal of the National Institutes of Health Roadmap for Medical Research6,7,8,9,10. The
richness of the literature on the utility of Semantic Web technologies to further biomedical research
leaves little doubt that the application we propose here to enable research networking is not only
timely, but the most assured path to long-term utility and participation by individuals and institutions.
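The triple form described above can be illustrated with a minimal sketch in Python. The identifiers, the sample data, and the `coauthors_of` helper are hypothetical and serve only to show how statements such as (person, published, paper) can be stored and how a (person, co-authored with, person) link can be derived from them. They are not drawn from VIVO or from any Linked Open Data source.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement of the form (subject, relationship, object)."""
    subject: str
    relationship: str
    obj: str


# Hypothetical example data; not taken from any real dataset.
TRIPLES = [
    Triple("person:alice", "published", "paper:p1"),
    Triple("person:bob", "published", "paper:p1"),
    Triple("person:carol", "published", "paper:p2"),
    Triple("person:alice", "published", "paper:p2"),
]


def coauthors_of(person: str, triples: list[Triple]) -> set[str]:
    """Derive (person, co-authored with, person) links from 'published' triples."""
    # Papers that the given person has published.
    papers = {t.obj for t in triples
              if t.subject == person and t.relationship == "published"}
    # Everyone else who published one of those papers.
    return {t.subject for t in triples
            if t.relationship == "published"
            and t.obj in papers
            and t.subject != person}
```

Because every fact has the same three-part shape, new kinds of relationships can be added as data without changing the storage structure.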
Interoperable enhanced VIVO installations managed locally, but offering cross-institutional searching,
browsing, and other capabilities will form VIVOweb, which will grow as a natural extension of Cornell’s
VIVO application from a single, multi-campus research discovery tool to a distributed network. Proof
of concept for VIVO’s value in a variety of settings and languages is provided by active installations at
UF (GatorScholar11), the University of Melbourne (Find an Expert12), and the Chinese Academy of
Sciences (Southwest China Biodiversity portal13). VIVO was initially developed to enable the
discovery of researchers and resources in the life sciences across Cornell’s complicated
administrative landscape of disciplines, departments, centers, colleges and campuses. VIVO
integrates public content about researchers from a variety of authoritative databases at the university
and also allows individuals to log in using their Cornell net ID and password to modify their own
profiles. VIVO also promotes resource discovery across Cornell—including facilities, equipment and
research-related services, such as databases and sample collections, workshops and seminars. All
data in VIVO are available for easy consumption by other web pages or services.
To ensure adoption, usage, maintenance, and post-funding sustenance of VIVOweb at the individual,
institutional, and national levels we propose technical innovation and support coupled with a
community-focused approach that provides a high-value product to institutions through local
installation and control of each VIVO platform. The involvement of appropriately skilled information
management specialists from libraries, as well as researchers, administrators and IT personnel from
all partner institutions, including recipients of Clinical and Translational Science Awards, will also
contribute to the success of VIVOweb. Finally, the governance structure we envision will contribute
further oversight and direction, and includes Scientific, Technical, and Executive Advisory Boards, the
last consisting of personnel and researchers from the NIH, as well as other major research
universities.
B.2. Approach
The VIVO research networking platform currently installed at Cornell and UF will be extended and
enhanced to address needs at the individual, institutional, and national levels, with modifications to
create a more complete institutional research discovery tool with a variety of new capabilities,
including the creation of active personal and team networks through the application of social
networking tools, and the production of semantically-rich data to integrate, analyze, visualize and
distribute at the national level.
B.2.a. VIVO Platform at Cornell and Florida
VIVO was developed by the Cornell University Library beginning in 2003 to meet individual and
institutional research discovery needs and already addresses many areas of importance to
researchers. VIVO supports ontology as well as content editing, and is
also a simple content management system that enables the representation of the resulting structured
information in web pages. It uses the standard Resource Description Framework (RDF) Semantic
Web data model and Web Ontology Language (OWL) schemas that identify distinct types of data and
define properties to connect these data with consistent, bi-directional relationships. For example, the
profile of a person includes simple text attributes including name, title and statements of research or
teaching interests, but extends much farther to include affiliation, activity and outcome relationships to
departments, research grants, publications, talks, courses, research areas and geographic areas.
Each of these entities is a defined type in its own right that may in turn have its own relationships to
other people or to a funding agency, event sponsor, research center or topic.
Figure 3 Drupal test portal driven by VIVO content
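As a rough illustration of consistent, bi-directional relationships, the following Python sketch records the inverse of every statement it is given, similar in spirit to declaring inverse properties in OWL. The property names (`authorOf`, `hasAuthor`, `memberOf`, `hasMember`), the identifiers, and the `Graph` class are assumptions made for this example and are not terms from the VIVO ontology or its code.

```python
from collections import defaultdict

# Hypothetical property pairs; each property names its inverse.
INVERSE_OF = {
    "authorOf": "hasAuthor",
    "hasAuthor": "authorOf",
    "memberOf": "hasMember",
    "hasMember": "memberOf",
}


class Graph:
    """Tiny in-memory store that records every statement in both directions."""

    def __init__(self) -> None:
        self._edges = defaultdict(set)  # (subject, property) -> {objects}

    def add(self, subject: str, prop: str, obj: str) -> None:
        self._edges[(subject, prop)].add(obj)
        inverse = INVERSE_OF.get(prop)
        if inverse is not None:
            # Assert the inverse statement so navigation works from either end.
            self._edges[(obj, inverse)].add(subject)

    def objects(self, subject: str, prop: str) -> set:
        # Return a copy so callers cannot mutate the stored edges.
        return set(self._edges.get((subject, prop), set()))


g = Graph()
g.add("person:alice", "authorOf", "paper:p1")
g.add("dept:biology", "hasMember", "person:alice")
```

With this arrangement a publication page can list its authors and a person page can list that person's departments, even though each fact was entered only once.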
The VIVO installations at Cornell and UF aim not only to automate the harvesting of information from a
variety of authoritative sources into a common institutional resource, but also to make data or profiles
available for consumption or display by web sites and services across the university. While the
accumulation of content, both entities and the relationships between them, initially depended largely
on manual entry of information by librarian and staff editors, currency and accuracy concerns have
prompted integration into the information technology framework via automated data ingest procedures
that are already utilized at Cornell, and soon to be implemented at UF. Data currently ingested at Cornell include: active
personnel, titles, affiliations, and courses from PeopleSoft databases, grants from a custom Oracle
database, publications from PubMed and public information reported by faculty via a reporting system
called Activity Insight newly adopted by the majority of Cornell’s colleges. The Cornell and UF
installations also feature an editing component that ties in with local authentication systems to enable
personnel to manage and update their own pre-populated VIVO profiles very easily. This “self-editing”
service has been utilized successfully by researchers at Cornell for over a year. Two additional portals
illustrate VIVO’s ability to deliver filtered semantic data for the realm of data sharing: a test portal
developed in the Drupal content management system14 (see Figure 3) and one showcasing Graduate
Programs in the Life Sciences for prospective graduate students, and powered by life sciences
content queried dynamically from VIVO15 (see Figure 4).
Figure 4 Graduate Programs in the Life Sciences Powered by dynamic queries from VIVO
B.2.b. Proposed Multi-Institutional Researcher Network
This project will extend VIVO from a single institutional installation to a multi-institutional, distributed
model that is VIVOweb. No central portal will be created; local installations will facilitate access to
both local and national-level information in all installations. VIVOweb will offer the functionality already
provided by VIVO, as well as new features and services tailored to the local context, including but not
limited to analysis and visualization tools to promote new paths to discovery, improved data ingest,
streamlined ontology editing, an increased number of authentication options, and a decentralized
indexing capability to enable cross-institutional browsing and searching. VIVOweb will also include the
ability to provide data as email lists or in a variety of formats for social networking tools, for the
automatic generation of NIH and other biosketches, and for faculty reporting purposes. VIVOweb’s
flexible and extensible data model will allow it to present a simple structure of people and their
activities within and across institutions, featuring links among them and connections to other people
as well as their professional information.
There are many ways a person’s expertise may be discovered, through grants, presentations, courses
and news releases, as well as through research statements or publications listed on their profile—
resulting in the creation of implicit groups or networks of people based on a number of pre-identified,
shared characteristics. We will extend the VIVO ontology to support personal work groups and
associated properties to represent the informal relationships evolving around collaboration, and allow
individuals and groups the option to limit the visibility of these more informal and dynamic
relationships, or “active groups”. New types and properties can be added without writing additional
code or altering the database structure of the application, and selected portions of a personal network
can be managed as an independent graph for export to social networking tools. New plug-ins already
in development within Dr. Katy Börner’s group at Indiana University will also allow easy and effective
visualization of the various relationships possible at the institutional or national level.
Cornell’s VIVO currently spans the Ithaca campus and the Weill Cornell Medical College within a
single software installation. Rather than combining multiple institutions into a single, large, central
database, we propose to install separate versions capable of supporting direct cross-institutional
references using Linked Data standards. Each entry in VIVOweb will have a stable URI from which its
constituent and immediately related data can be requested. RDF can be requested for data
harvesters and HTML can be requested for web browsers. This allows seamless linking between one
installation and another across VIVOweb. If researchers move from one institution to the next, their
persistent URLs can be ‘forwarded’. Linking one VIVO to another where a connection is known to
exist addresses one component of the need for a national network. VIVO’s distributed indexing
capability will enable individuals to search across institutions and find collaborators where they have
no known connections, and to discover the existence and patterns of collaboration across multiple
institutions and ultimately at the national level.
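The behavior of a stable URI described above can be sketched as follows. This is a simplified illustration, not the VIVO implementation: the URIs, the `MOVED` table, and the `respond` function are hypothetical, the Accept header is matched by substring only (no quality values), and the choice of a 301 status for forwarding is an assumption.

```python
# Hypothetical forwarding table: old URI -> URI at the researcher's new institution.
MOVED = {
    "http://vivo.example-a.edu/individual/n123":
        "http://vivo.example-b.edu/individual/n456",
}


def respond(uri: str, accept: str) -> tuple[int, dict[str, str]]:
    """Return an (HTTP status, headers) pair for a request to a stable URI."""
    if uri in MOVED:
        # The researcher has moved: forward the old URI to the new installation.
        return 301, {"Location": MOVED[uri]}
    # Data harvesters ask for RDF; any other client gets the HTML page.
    if "application/rdf+xml" in accept:
        return 200, {"Content-Type": "application/rdf+xml"}
    return 200, {"Content-Type": "text/html"}
```

The same URI therefore serves both audiences, and a link stored at another institution keeps working after a researcher changes affiliation.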
Development effort will support indexing distributed content at all participating institutions from the
beginning. Cornell and UF will host indexing services. Additional participating institutions can choose
to replicate the index to optimize local performance for cross-institutional searching. Indexing sites will
harvest data from each independent VIVOweb site based on the common core ontology that identifies
a level of granularity for harvesting people, expertise, topics, research activities, and other data across
all the sites. The development and refinement of this ontology will be the subject of investigation by
Dr. Ying Ding at Indiana University, in close collaboration with the core development and facilitation
teams. Searches initiated from any local VIVO node will then have the option of extending to the multi-site index. The first step towards this is the local installation of the VIVO platform at partner
institutions. Our technology is capable of integrating seamlessly with other researcher networking
platforms via workflows that first convert data from these systems into semantic form using templates
to be provided with the VIVO installation package or commercially licensed tools (see Section C.2 for
details).
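One way to picture the multi-site index is as a merged lookup table built from records harvested at each installation. The sketch below is only an illustration under stated assumptions: the record layout, the site names used as keys, the example URIs, and the `build_index` and `search` functions are hypothetical, and a real index would follow the common core ontology rather than a flat list of terms.

```python
from collections import defaultdict

# Hypothetical harvested records: site -> list of (person URI, expertise terms).
HARVESTED = {
    "cornell": [("http://vivo.example-a.edu/individual/n1", ["genomics", "maize"])],
    "uf": [("http://vivo.example-b.edu/individual/n2", ["genomics", "citrus"])],
}


def build_index(harvested):
    """Merge records from every site into one term -> {person URI} index."""
    index = defaultdict(set)
    for records in harvested.values():
        for uri, terms in records:
            for term in terms:
                index[term.lower()].add(uri)
    return index


def search(index, term, local_prefix=None):
    """Look up a term; restrict results to one installation if local_prefix is given."""
    hits = index.get(term.lower(), set())
    if local_prefix is not None:
        hits = {uri for uri in hits if uri.startswith(local_prefix)}
    return sorted(hits)


INDEX = build_index(HARVESTED)
```

A search started at a local node could call `search` with that node's URI prefix first and then repeat the call without a prefix to extend the query to all sites.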
VIVOweb’s Semantic Web principles and open, flexible structure represent a research networking
solution that will appropriately and efficiently allow integration of the application with varied
institutional infrastructures. They will allow VIVOweb to scale in size and scope while adapting to new
purposes and unforeseen content, providing an evolving, dynamic, virtual community for the
biomedical sciences, and beyond, at every institution. Data from local systems, whether based on
the VIVO platform or not, will be linked and shared across institutional platforms, but visible locally
through institutional portals such as VIVO and GatorScholar to facilitate the networking and discovery
of people. The visibility and unique functionality of these portals will stimulate the further evolution of
this virtual community across institutional boundaries. Specific functionalities and services proposed
for VIVOweb may be summarized as follows:
• Ability to search and browse locally and nationally to “find people like me”, most
searched, or topic-related expertise: By keyword or MeSH term, location, department or
institution, grants, geographic area, publication, authorship, types of papers or journals
commonly published in, and more.
• Profile modification using institutional authentication system: Add to or change
information ingested from institutional authoritative sources; display or hide sections from
national view.
• Ability to ingest data from authoritative sources: Including human resources, grants, and
course databases, faculty reporting systems, personal citation management tools and web
pages.
• Easy modification of core ontology: Using improved ontology editor capabilities to more
accurately reflect local needs that might deviate from the Semantic Web for Research
Communities ontology bundled with each local VIVO installation.
• Delivery of data to consuming services and mobile devices: Including specialized topical
or unit portals, web social networking or collaborative tool APIs, reporting tools and
biosketches.
• Networking: Create and share public and private groups, adding or removing investigators to
and from designated groups or contact lists, suggest useful additions to others, navigate
across successive connection paths.
• Communication: Dynamically create and manage list-servs through external list management
tools, such as Lyris Listmanager16, query based on affiliation or topical affiliation to create
email lists for a variety of purposes.
• Analytical capabilities and spatial mapping: Using multi-dimensional network analysis tools
and visualization techniques to analyze small team, departmental, institutional, or national
groupings by publications, grants, funding agencies, and expertise as determined by keywords
and concepts conveyed in publications, grants, self-designation and more.
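The “find people like me” capability in the list above could be approximated in many ways. The sketch below uses Jaccard similarity over profile keywords purely as an illustrative stand-in; the proposal does not specify a ranking method, and the profile data and the `jaccard` and `people_like` functions are hypothetical.

```python
# Hypothetical profiles: person -> set of keywords or MeSH-style terms.
PROFILES = {
    "alice": {"genomics", "maize", "drought"},
    "bob": {"genomics", "maize", "rice"},
    "carol": {"oncology", "imaging"},
}


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def people_like(person: str, profiles: dict, top_n: int = 3) -> list:
    """Rank other people by keyword overlap with `person`, dropping zero scores."""
    me = profiles[person]
    scored = [(jaccard(me, terms), name)
              for name, terms in profiles.items() if name != person]
    scored = [(score, name) for score, name in scored if score > 0]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]
```

The same scoring could be applied to grants, publications, or MeSH terms instead of free-text keywords, since each is just another set of shared characteristics.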
B.2.c. Sustainability through Building Community
The primary goal of our approach to enable national networking of researchers is to offer a
wide-ranging perspective on multiple aspects of biomedical and translational research across multiple
institutions not just to researchers, but also to students, administrative and service officials,
prospective faculty and students, donors, funding agencies, and the public, and to empower them to
contribute, each in their own way.
We are advocating for the creation of asset-based, rather than need-based, virtual communities at the
individual, institutional, and national levels, with the focus on making previously “invisible” human assets
visible at all levels. According to research conducted at Northwestern University, while a need-based
community focuses on “needs, deficiencies and problems”, an asset-based community begins with a
commitment to uncovering the community’s capabilities and assets17. This and other work has
demonstrated that investment in asset-based models is the most effective way of solving problems, as
long as a need can be rapidly and accurately linked to an asset18,19.
It is critical to recognize that any technology or tool designed to create a network of human assets
within and between academic institutions will be adopted, used and maintained only if the individuals
– the assets in this case – and the institutions perceive value in it.
Value to the individual is most likely to be assessed by responses to questions such as:
• What does this tool do to advance my research and academic standing?
• Can I use it to reliably find potential collaborators and other people of interest to me within my
institution and beyond?
• Can I use it to create inter-institutional groups or networks based on common areas of
research or interest?
• Is it current, accurate, immediate?
• Will it expedite the reporting and communication of my work?
• How much effort will it take to maintain my information—how easy is it to use and update?
Value to the institution is likely to be assessed primarily by administrators, based on responses to
questions such as:
• What does this tool do to advance the standing of my institution?
• Does it enhance recruitment and retention of top-notch faculty and students?
• Does it foster collaboration, particularly across traditional boundaries?
• Does it help improve the fraction of successfully funded grant proposals?
• Can it increase efficiencies associated with the management and dissemination of information
about people and resources?
• What does it cost to sustain and improve?
• How easily does the technology or platform interoperate with others and how agile is it?
Finally, value to NIH and other federal agencies, professional biomedical societies and organizations
can be evaluated by such factors as easier identification of experts and potential reviewers, more
effective use of grant dollars through improved collaboration, and possible synergies with services
already offered by NIH, such as eRA Commons, PubMed and others.
The technical sections of this proposal will make it clear that the capabilities suggested by these
questions are indeed functionalities that the VIVO platform will enable for individuals and the
institution. However, our work with the platform at Cornell and UF also demonstrates that delivering
technical capability alone is not sufficient to ensure adoption, usage and maintenance by either the
individual or the institution. Technical innovation for a networking tool such as this must be
backstopped by human facilitation—in this instance, by information specialists from institutional
libraries, or by other informatics professionals, wherever possible. We anticipate that researcher
engagement and outreach by information specialists will promote adoption, usage, and maintenance
of VIVOweb by the research communities in their institutions, thereby fostering the creation of virtual
communities of biomedical researchers at all three levels above, accessible through local VIVO
installations.
That the asset-based community approach employed for the initial development of VIVO at Cornell
and UF, and the library-based outreach efforts associated with it are valued and successful, is
evidenced by this small sample of feedback from researchers and administrators at both institutions:
“VIVO provides unparalleled access to information about the life sciences at Cornell
in a user-friendly way. This will be of particular benefit not only to those researchers
and students already at Cornell, but potential faculty and students as well, by
offering a much-needed, integrated view of the life sciences community at Cornell.”
“VIVO saved my life as a new faculty member at Cornell; I used it all the time to find
facilities and people I might work with.”
“First, <GatorScholar> allows individuals…to quickly find faculty with specific
research interest. Undoubtedly, this raises the visibility of the life sciences faculty
among potential granting agencies, students, and policymakers. Second, it
facilitates interactions between life science faculty with divergent backgrounds. This
facilitation increases the likelihood of grant funding by drawing on synergisms
within the college.”
“<The> up-to-date awareness <provided by GatorScholar> is vital for researchers
to make timely contacts, find potential collaborative partners, access literature
searches, and locate other resources necessary for their work. It will be particularly
useful because of the growing interdisciplinary among the sciences.”
B.2.d. Support through the Libraries
Our proposal posits that the engagement of a neutral and trusted campus entity with information
management expertise will greatly facilitate adoption, usage, and maintenance of such a tool.
Evidence from Cornell and UF suggests that the academic library—in its capacity as a generally
impartial and trustworthy organization with a clear understanding of the needs of the research
community and the proven capability of engaging with it, expertise in information management and
dissemination, and an established liaison function—admirably performs this role. Further, medical and
science and engineering libraries have traditionally provided information resources and technology in
support of educational, research, and patient care objectives, and are taking on an increasing role in
fostering and supporting collaborative efforts on campus to shorten the gap between bench and
bedside. Recent advancements in translational medicine have prompted libraries to develop
information solutions which support dissemination and facilitate a fluid exchange of data in the
increasingly cross-disciplinary research setting. Over the last few years, a number of medical libraries
have responded to changing information needs by expanding their services to offer visionary
programs which enhance the flow of information and promote collaborative opportunities in the
translational research environment. The stalwart engagement and stewardship provided by the NIH’s
National Library of Medicine (NLM) in support of biomedical research has provided a valuable model,
and many programs and services offered by these libraries are frequently developed and coordinated
by PhD-level specialists trained and certified by the NIH.
That libraries have successfully met these needs provides a foundation for a library-based community
support network for VIVO. While support of both user and development communities will be
challenging, a library-based model best addresses many of the issues which may arise during this
process. Librarians, including several with PhDs and/or bioinformatics expertise, NIH training, and
experience as NCBI course developers and instructors, have been included to facilitate intra- and
inter-institutional adoption, usage, and maintenance of VIVOweb. Through technology advancements
mentioned in this proposal, as well as evaluation and further development by the seven adoption
partners, VIVOweb will grow a community-based support network. As adopters become developers,
the support network will work to develop the critical components for building a community-based
support network. Based on the Cornell and UF experience, librarians and domain specialists will be
particularly valuable in:
• Establishing virtual environments which facilitate communication and collaborations, such as
wikis, for both the outreach and development teams which will serve all members of the
VIVOweb consortium. This includes, but is not limited to, a listserv, development and outreach
wikis, news items and publications.
• Providing in-person and e-mail “help desk” support in the use of VIVOweb.
• Developing support documentation, including an FAQ, quick-start guide and manuals on the
VIVO application, suggested and proven outreach and support strategies, and guidelines for
development of new modules.
• Creating and supporting a comprehensive suite of educational materials for VIVO users and
implementation and support teams, including both text-based and video tutorials which range
in complexity from basic needs to more complicated or innovative uses of the application.
• Facilitating inter-institutional collaboration on the development of a common ontology.
• Engaging with researchers and administrators in the local setting to educate and engender
buy-in and ensure institutional support.
• Serving as a link between researchers and central technology teams by regularly providing
feedback on usability: problems encountered, what works well and what is missing but
essential for a successful product.
Training materials and support documentation will be modeled after widely used materials provided by
the NLM for applications, databases and services such as PubMed, NCBI, MyNCBI and others20.
We anticipate that personnel outside the Library will increasingly assist with this task as VIVOweb
becomes accepted and increasingly integrated into the administrative and communications
mainstream. However, it represents a considerable technological and cultural shift from current
practice for most institutions, just as any new campus-wide initiative faces many challenges in
achieving clarity in mission and consistency in execution. For an ambitious e-community building
endeavor such as this to truly succeed—that is, to be adopted, used, and maintained— technical
innovation as well as careful and engaged stewardship by institutional libraries will be essential.
B.2.e. Engagement of Recipients of Clinical and Translational Science Awards (CTSAs)
Any project to support biomedical researchers will clearly need support from recipients of CTSAs,
which represent a primary community of practice. The CTSA institutions21 have considerable interest
in national networking and have formed a workgroup to facilitate consortium-wide collaboration. One
of the functions of the consortium is to support researcher networking across institutions. VIVO is
designed specifically to address this need. Members of the CTSA consortium will be asked to serve
on VIVO governance bodies – Executive, Scientific and Technical – and actively participate in
facilitated discussions of the needs of this important group of research institutions.
B.2.f. Support for all Institutions
It is important to note that many other National Center for Research Resources (NCRR) and
NIH-funded centers and programs also include researchers making very significant contributions to
advance biomedical research. Our consortium of institutions therefore includes CTSA recipients as
well as other NCRR awardees to ensure the broadest possible interpretation of biomedical
researchers. This approach will ensure that our ontology applies across a wide variety of
disciplinary types and is therefore more easily scalable and extensible beyond the funding cycle of
this grant.
As schools choose to adopt VIVO, the community-based mechanisms for support scale to national
levels and are sustainable in supporting networking of researchers.
The creation and distribution of support materials, both educational and promotional, will be an
essential means of facilitating institutional awareness and adoption of VIVO.
Materials will be
designed and created for the institutional and national audiences at the University of Florida, under
the direction of Dr. George Hack.
Among the materials to be developed is a comprehensive suite of educational resources for VIVO
users and implementation and support teams. These resources will include a variety of tools to help
facilitate integration of VIVO at the institutional and national level. Educational support is an important
component in the library-based support model. For successful implementation of VIVO to occur,
researchers at implementation sites need to feel comfortable navigating VIVO, allowing the resource
to promote serendipitous discovery of collaborative opportunities. Informative web-based support
materials will be available from the VIVO project website such as a FAQ, a “quick-start” guide, links to
documentation and published papers about the application. A robust collection of online tutorials will
be developed, offering just-in-time support to researchers who wish to utilize the power of VIVO. This
immediate response will be further supported by providing access to podcasts and videocasts of
VIVO-related events. Strong educational support of VIVO is best served by combining this rich online
VIVO presence with a strong in-person support component at implementation sites. To accomplish
this, a robust series of instructional materials, including PowerPoint slides and handouts, will be
developed for use to deliver in-person instruction and presentations. These instructional materials will
be developed in a series of stand-alone modules, such as: a basic VIVO overview and training;
institutional discovery: using VIVO for new investigators, students, and staff; managing VIVO profiles
by proxy: support for administrative support staff; VIVO for the institutional administrator, and
advanced VIVO applications. The modules will be easily interchangeable and present an à la carte
approach to standardized instructional design.
A comprehensive suite of marketing materials to be used on a national basis will also be created by
the University of Florida. Such materials will be created in a variety of formats including web, print,
graphics, audio, video, and animation technologies to support curriculum offerings and promote VIVO.
The marketing/communications coordinator at UF will work closely with institutional outreach teams
during adoption, use best practices to identify change agents, promote and market the characteristics
of VIVO as a new innovation, and establish the key elements of a change process that will facilitate
adoption.
Both educational and promotional materials will offer a standardized look and feel, but still offer
institutions opportunities for customization with their own logos. The VIVO logo and color scheme will
be featured prominently to build the VIVO presence in materials related to the application to ensure
that VIVO is a brand that becomes recognized nationwide – from locally-hosted resource workshops
to national-level scientific meetings. Support materials will include images and logos, PowerPoint
templates, and code for incorporation on implementation site websites – all for use by the VIVO
consortium members.
While these support materials will be designed and created at the University of Florida, all modules
could be easily customized with institutional logos and further customized with real-world examples
from any specific institution, with the assistance of a librarian. This approach will be convenient and
can be scaled up or back, depending upon the institutional needs. As schools choose to adopt VIVO,
the community-based mechanisms for support scale up to a national and sustainable activity in
support of networking researchers.
B.2.g. Support through Professional Societies
Professional societies of researchers will be engaged to adopt VIVO. Professional societies have a
natural role to play in facilitating the networking of their members. By adopting VIVO, they make
themselves visible in the national network. By participating in community-based support, they provide
increased visibility for their services as well as additional support for their members. By ensuring that
the Semantic Web recognizes and facilitates the identification of members, the societies leverage
VIVO in support of their goals, helping to build the national network. Professional societies can
promote VIVO through their own communication channels, reaching large numbers of researchers.
Researchers who are members of professional societies can highlight this membership in a rich
manner through their own VIVO profile as well.
Implementation of VIVO for a professional society is a straightforward process. The VIVO software is
designed to be easy to host. Creating profiles for professional society staff members is simple and
allows these people to be found through the Semantic Web and support research activities.
C. Project Plan
C.1.
Governance
The development and support of VIVOweb will be governed by three national advisories – an
Executive Advisory Board, a Scientific Advisory Board and a Technical Advisory Board. These
groups ensure that VIVOweb meets the needs of researchers, institutions and the NIH.
C.1.a. Executive Advisory Board
The Executive Advisory Board (EAB) sets the direction for VIVOweb development and support
activities and ensures full coordination with the implementation of resource discovery resulting in a
seamless Semantic Web of both scientists and resources. Constituted from a cross-section of the
research community, with NCRR representation, and with representation by the implementers of the
network for resource discovery, the EAB advises the Principal Investigator and the project teams on
all matters related to the creation of national networking of researchers. Quarterly evaluation reports
are provided to EAB members.
The group will meet twice per year. Members will receive travel support. One meeting per year will
be held at NIH in Bethesda. One meeting per year will be held in conjunction with a national meeting
such as the CTSA Consortium meeting.
Table 1 VIVOweb Executive Advisory Board
Member                      Affiliation
TBA                         TBA
TBA                         TBA, PI Resource Discovery
Julianne Imperato-McGinley  Weill Cornell Medical College, PI CTSA
TBA                         TBA
TBA                         CTSA Consortium Representative
TBA                         NCRR Representative
Gloria Thomas, PhD          Xavier University
Peter Stacpoole, MD, PhD    University of Florida, PI CTSA
Michael Conlon, PhD         University of Florida, Ex officio, PI Research Networking
C.1.b. Scientific Advisory Board
The Scientific Advisory Board will consist of a spectrum of biomedical researchers who will provide
direct input regarding the support activities and the needs for features and ontology components to
support their work. Members will be recruited nationally by the EAB members and by members of the
project team. Support systems, including a web site and wiki, will be provided to facilitate the gathering
of input from the Scientific Advisory Board. Bi-monthly conference calls and gatherings at national
meetings will be used to solicit further input.
C.1.c. Technical Advisory Board
The Technical Advisory Board (TAB) will guide all aspects of the technical development of VIVOweb,
including ensuring that: 1) content from local installations can be picked up by any national network;
2) the VIVOweb installations can use community-sourced data such as Linked Open Data; 3)
VIVOweb is fully interoperable with the resource discovery network; and 4) interfaces to and from
VIVOweb to other tools meet the needs of the research community.
Table 2 VIVOweb Technical Advisory Board
Member           Affiliation
John Wilbanks    Creative Commons
York Sure        University of Koblenz, Germany. Scientific Director of the Leibniz Institute for Social Science
Neil Smalheiser  University of Illinois, Chicago
Barend Mons      University of Rotterdam, The Netherlands
Kei Cheung       Yale University
Chris Bizer      Free University of Berlin, Linked Open Data
Steffen Staab    University of Koblenz, Germany
Abel L. Packer   BIREME/OPS/OMS, Director, Brazil
Stefan Decker    Director of DERI Galway, Ireland
Carole Goble     University of Manchester, UK, co-director of e-Science NorthWest
Dean Krafft      Cornell University
C.1.d. Project Organization
Figure 5 shows the project organization. The EAB oversees the project. Evaluation (cyan), Project
Operations (orange) and Project Governance bodies (blue) report to the EAB. Project Operations is
organized into three activities – Development, coordinated by Jonathan Corson-Rikert; National
Activities, coordinated by Medha Devare; and Site Implementations, coordinated by Valrie Davis.
C.1.e. Development Teams
The project will support three development clusters, at Cornell, UF and Indiana University.
The Cornell group will focus on extensions to the current core VIVO functionality and access controls
to better support individual and team networking, improve scalability, and support workflow for data
ingest and export. This group will also develop the distributed search indexing capability and Linked
Data functionality. Any architectural changes necessary to support a more modular architecture for
ingest, export, or to allow plug-in extensions for visualization or other purposes will be coordinated
with UF and Indiana University teams.
The Indiana University developers will work in two teams under the leadership of Katy Börner and
Ying Ding. Börner’s Cyberinfrastructure for Network Science Center will implement advanced data
mining and visualization in support of social networking, metrics, and presentation. Ding will lead
efforts pertaining to the development and maintenance of ontologies used by the Semantic Web to
represent scientists and investigators.
Figure 5 VIVOweb project organization
UF developers, under the direction of Christopher Barnett, will focus on interfaces to software in the
institutional setting, and packaging of VIVO for deployment. Interfaces will be built for 1) PeopleSoft22,
to provide authoritative data to VIVO regarding people in the institution; 2) Drupal23, to enable the use
of VIVOweb from within research team Drupal implementations; 3) Shibboleth, to provide federated
identity management, and 4) Sakai24, to provide access to research networking from within the popular
open source course and content management platform. We anticipate the need for additional interfaces
as determined by the VIVOweb governance processes.
C.1.f. Media Support Team
Dr. Devare will direct the efforts of the team at UF led by Dr. George Hack in the development of
instructional support and other media materials for VIVOweb. Instructional videos, promotional
material, web sites, conference materials, collateral for exhibits and other material will be developed
by Dr. Hack’s team.
C.1.g. Adoption Support
Dr. Devare will coordinate effort related to the national adoption of VIVOweb. This includes
development of promotional materials and web sites, presentations at professional societies and
conferences. In this effort she will be supported by all members of the project team.
C.1.h. Implementation Teams
Each of the seven participating institutions has an implementation team that will deploy VIVO during
the first year of the project and then implement VIVOweb during the second year. Each
implementation team participates in the evaluation led by Dr. Leslie McIntosh of Washington
University.
Table 3 Implementation Team Leads at each of the participating institutions
Participating Institution      Implementation Lead
Cornell University             Medha Devare
University of Florida          Sara Gonzalez
Indiana University             William K. Barnett
Scripps Research Institute     Gerald Joyce
Ponce Medical School           Richard Noel
Washington University          Rakesh Nagarajan
Weill Cornell Medical College  Curtis Cole
Valrie Davis at UF will coordinate the implementations and provide support to the implementation
teams. Implementation teams provide input to the evaluation team. The evaluation team prepares
quarterly summaries regarding the implementation for the advisory boards.
C.1.i. Researcher Support Teams
The libraries of each institution will provide support for researchers using VIVOweb. Librarian
contributions to creating support for the adoption, usage and maintenance of VIVOweb may be
summarized as follows:
Organizational/Workflow and Training Responsibilities. The information specialists in the
institutional libraries will facilitate growth and maintenance of local VIVO instances by:
• Hiring, training, coordinating and supervising staff who will initially enter data for individual
profiles and related pages in VIVO,
• Ensuring that relevant content and content types related to biomedical research are entered in
VIVO and the relationships between individuals and pieces of information—the entities—are
accurately and consistently represented,
• Integrating project support resources into the institutional culture, including in-person training
events and just-in-time online instructional and support resources, and
• Organizing and implementing usability testing for both self-editing of individual profiles and
those of related academic units.
Outreach Responsibilities. As academic appointees, the information specialists provide outreach to
departments, programs, centers and individual researchers, with whom they have enduring
professional relationships and to whom they provide assistance to facilitate research. Through their
liaison roles, these information specialists:
• Bring to the project an understanding of both news-worthy and day-to-day activities and issues
of importance that inform data element and design decisions. Examples of these might include
research areas, collaborative initiatives and committees that are used to pre-populate pick lists
that researchers can use while editing their profiles,
• Collaborate with groups across departments and administrative units to add content streams
and improve efficiency,
• Demonstrate VIVO and its self- and proxy-editing capability at departments, institutes, centers
and researchers’ offices to inform individuals, provide feedback from users and increase
support for the VIVOweb initiative. Their experience as instructors of digital information
resources endows them with a unique awareness of user behavior in a digital climate.
• Have developed strong and trusted professional relationships with their research clients, and
will be able to use these connections to facilitate all tasks performed in relation to this project.
Navigating VIVOweb’s Technological Underpinnings. VIVO is an ontology-based tool to integrate
diverse information through simple, consistent categorization by types and relationships. Librarians
are trained to understand, develop and encode ontological relationships and apply them pragmatically
to keep VIVO straightforward and simple to use.
Membership in the Institutional Research Community. Research is the primary subject of a
librarian’s work. Information specialists are well positioned, trusted arbiters within an institution’s
research community, capable of efficiently clustering information for VIVOweb that reflects important
nuances at a level appropriate for a general higher-education audience.
C.1.j. Evaluation Team
Dr. Leslie McIntosh of Washington University will lead the evaluation efforts related to the project. A
biomedical informatics specialist will assist Dr. McIntosh in conducting assessment tasks, acquiring
data, and analyzing the resulting data sets. Quarterly reports will be
prepared and made available to the advisory boards.
C.2.
Technical Design
VIVOweb is based solidly on Semantic Web technologies recommended by the World Wide Web
Consortium (W3C). The core is RDF25 (the Resource Description Framework), in which items being
described are assigned globally unique
identifiers (URIs, or Uniform Resource Identifiers) and their relationships and attributes are described
in discrete pieces called “triples” or “statements.” A collection of triples forms a graph of data that may
be stored in a single file or distributed across the entire web. Another W3C standard, SPARQL
(SPARQL Protocol and RDF Query Language26), makes it possible to query Semantic Web data using
SQL-like syntax. RDF relationships may also be embedded into standard web pages using RDFa
(Resource Description Framework in Attributes), which allows browsers or search engines to extract
structured data. Google recently announced that it would begin harvesting RDFa data27, following on
the heels of other search engines such as Yahoo28.
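The triple model and pattern-based querying described above can be illustrated with a minimal sketch. It uses plain Python tuples rather than a real RDF library or SPARQL engine; the example.org URIs, the names, and the helper functions match and names_of_people are hypothetical and are not part of VIVO.

```python
# Each statement is a (subject, predicate, object) tuple. For brevity this
# sketch does not distinguish URI objects from literal objects.
EX = "http://vivo.example.org/individual/"  # hypothetical namespace
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

GRAPH = {
    (EX + "n1001", RDF_TYPE, FOAF + "Person"),
    (EX + "n1001", FOAF + "name", "Jane Researcher"),
    (EX + "n1002", RDF_TYPE, FOAF + "Person"),
    (EX + "n1002", FOAF + "name", "John Scientist"),
    (EX + "n1001", FOAF + "knows", EX + "n1002"),
}


def match(graph, s=None, p=None, o=None):
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in graph:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def names_of_people(graph):
    """Rough analogue of:
    SELECT ?name WHERE { ?x a foaf:Person . ?x foaf:name ?name }
    """
    people = {s for s, _, _ in match(graph, p=RDF_TYPE, o=FOAF + "Person")}
    return sorted(o for s, _, o in match(graph, p=FOAF + "name") if s in people)
```

Calling names_of_people(GRAPH) joins the two patterns on the shared subject, which is the same kind of join a SPARQL basic graph pattern expresses declaratively.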
Semantic Web standards, such as RDF Schema (RDFS29) and the Web Ontology Language (OWL30),
make it possible to exchange ontologies, which specify the semantics of the terminology and
relationships used in RDF descriptions. Ontologies also enable reasoning, or inference of new triples
based on existing data. VIVO takes advantage of RDF’s triple-based structure and OWL’s constructs
for defining types of resources and their relationships to build a flexible, extensible knowledge base
describing academic researchers and their activities.
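As a rough illustration of how an ontology supports inference of new triples, the sketch below applies a single RDFS entailment rule (an instance of a class is also an instance of its superclasses) until no new statements appear. The class names and the function infer_types are assumptions made for this example; a production system would rely on a reasoner such as those available through Jena.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical ontology fragment plus one instance statement.
DATA = {
    ("ex:AssistantProfessor", SUBCLASS_OF, "ex:FacultyMember"),
    ("ex:FacultyMember", SUBCLASS_OF, "foaf:Person"),
    ("ex:n1001", RDF_TYPE, "ex:AssistantProfessor"),
}


def infer_types(triples):
    """Return the input plus all rdf:type triples entailed by rdfs:subClassOf.

    The rule is re-applied until a pass adds nothing new (a fixpoint),
    so chains of subclasses are followed to any depth.
    """
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in closed if p == SUBCLASS_OF]
        typed = [(s, o) for s, p, o in closed if p == RDF_TYPE]
        for instance, cls in typed:
            for sub, sup in subclass_pairs:
                new_triple = (instance, RDF_TYPE, sup)
                if cls == sub and new_triple not in closed:
                    closed.add(new_triple)
                    changed = True
    return closed
```

With the sample data, the person typed as an assistant professor is also inferred to be a faculty member and a person, without either statement having been entered by hand.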
VIVO takes an additional step beyond the use of Semantic Web technologies at the local application
level by embracing the principles of Linked Data, which is a concept articulated by World Wide Web
inventor Tim Berners-Lee in a 2006 design note31. Linked Data promotes a web of data on the scale
of today’s human-readable web, where interconnections between datasets are created as easily as
HTML hyperlinks. With Linked Data, RDF resources are assigned URIs that are dereferenceable, that
is, a request for the URI will direct humans or machines to useful data describing the resource. These
data should include additional URIs to allow the web of data to be browsed or crawled seamlessly.
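The crawling behavior that dereferenceable URIs make possible can be sketched as a breadth-first traversal. To keep the example self-contained, HTTP dereferencing is replaced by a lookup in an in-memory dictionary; the URIs, predicates, and the functions dereference and crawl are hypothetical.

```python
from collections import deque

A = "http://cornell.example.org/individual/a"
B = "http://ufl.example.org/individual/b"
DEPT = "http://ufl.example.org/individual/dept"

# Stand-in for the web: what each URI would return when dereferenced.
WEB = {
    A: [(A, "ex:coAuthorWith", B)],
    B: [(B, "ex:memberOf", DEPT)],
    DEPT: [(DEPT, "ex:label", "Department of Example Studies")],
}


def dereference(uri):
    """Assumed replacement for an HTTP GET that returns RDF about the URI."""
    return WEB.get(uri, [])


def crawl(start, max_resources=100):
    """Follow URIs found in object position, breadth first, collecting triples."""
    seen = set()
    queue = deque([start])
    collected = []
    while queue and len(seen) < max_resources:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.add(uri)
        for s, p, o in dereference(uri):
            collected.append((s, p, o))
            # Only objects that look like URIs can be followed further.
            if o.startswith("http://") and o not in seen:
                queue.append(o)
    return collected
```

Starting from the hypothetical Cornell URI, the crawl reaches data hosted under a different institution's namespace without any prior coordination between the two sites, which is the property the Linked Data principles are meant to guarantee.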
The Linked Data community estimates that 142 million links between Semantic Web datasets have
been created.32 Links between institutional VIVO datasets will allow seamless browsing across
institutions. VIVOweb does not require coordination between installations when describing new
people, organizations, topics, or other entities. Different URIs representing the same resource can be
cross-referenced through OWL sameAs (owl:sameAs) properties. In the cases where this approach is not possible
due to differences in ontology semantics between datasets, we will follow best practices emerging
from ongoing research in the field (Glaser et al., 2009).
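One way to picture cross-referencing through owl:sameAs is to group equivalent URIs and rewrite all statements to a single representative. The sketch below does this with a small union-find structure; it is an illustration under assumed data (the two institutional URIs and the ex: predicates are hypothetical), not a description of how VIVOweb will merge records.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

CORNELL = "http://cornell.example.org/individual/jdoe"
UF = "http://ufl.example.org/individual/n42"

TRIPLES = [
    (CORNELL, "ex:name", "J. Doe"),
    (UF, "ex:teaches", "ex:course101"),
    (UF, OWL_SAME_AS, CORNELL),
]


def canonicalize(triples):
    """Rewrite subjects and objects to one representative per sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smallest URI as the representative
            # so the result does not depend on statement order.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            union(s, o)

    return {(find(s), p, find(o)) for s, p, o in triples if p != OWL_SAME_AS}
```

After canonicalization, statements made at either institution about the same person attach to one identifier, so a cross-institution query sees a single merged description.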
C.2.a. The VIVO Platform
VIVO is unique in offering three major functional components in one package: ontology editing to
create or modify a data model, intuitive user editing for data and relationships and a simple content
management system to present an attractive web presence. This integration was designed and
developed from the ground up to support a researcher networking application in the institutional
environment. Unlike relational database-driven systems, VIVO requires no fixed data model with
tables and fields internally defining the data elements supported in the system. VIVO instead provides
an administrative editing interface to define types of data and relationships among these data types; a
common core ontology data structure will be supplied with
the VIVO installation package, but institutions will be free to extend the model further as required for
local needs without additional coding.
Institutions may choose the extent to which they integrate VIVO into local IT infrastructure for
authentication to allow modification of profiles by individual researchers or their proxies and for data
ingest. This integration generates additional startup cost but lowers ongoing operational costs – data
is only entered once into the appropriate system of record and is pushed to VIVO through interfaces.
Data quality is improved through use of normal university data management processes and changes
to core institutional data can continue to happen in the appropriate database of record.
VIVO is also capable of disseminating data to other institutional web sites as well as harvesting from
them. VIVO provides generic RDF/XML output that can be customized or filtered within VIVO or
transformed into desired reports outside of VIVO according to local requirements. By providing
incoming and outgoing data paths through both human interaction and machine processes, VIVO is
capable of integrating well into institutional enterprise architectures.
Under the hood, VIVO is a Java servlet application using Java Server Pages for page rendering;
existing installations use the open-source Apache Tomcat servlet container and the Apache web
server. VIVO’s search function employs the Lucene library33. RDF data are managed through HP’s
Jena Semantic Web library,34 which allows direct access to a variety of triple store implementations,
including those based on familiar relational database systems. Existing VIVO installations use
MySQL35, which, like all of the libraries used by VIVO, is freely available and open source. VIVO’s
default configuration caches RDF data in memory to support very fast queries and web page
rendering. This technique scales to an institution the size of Cornell or Florida; in cases where much
larger RDF data sets are involved, VIVO may use any RDF triple store that implements Jena’s graph
service provider interface and supports the SPARQL query language. These include third-party
commercial RDF stores such as AllegroGraph36 and OpenLink Virtuoso37, as well as a number of
open-source stores provided by HP. Several of these systems have been demonstrated to store more
than one billion RDF triples successfully.38 The VIVOweb technical development process will include
further testing and optimization in order to deploy highly scalable triple stores for large data sets,
including modification if necessary to integrate triple stores that do not provide a direct Jena interface,
such as the Sesame39 native store. A cluster of Sesame stores is used in SemaPlorer,40 which took
first prize in the Billion Triples Track of the 2008 Semantic Web Challenge.41
C.2.b. Ontologies
VIVO’s flexible and extensible data model will allow it to present a simple structure of people and their
activities across a university, featuring links among them and connections to other people as well as
their professional information – using a network graph structure to most naturally represent a real-world network of relationships (see Figure 6). There are many ways a
person’s expertise may be discoverable, including talks, courses, and news releases as well as
through research statements or publications listed on their profile—resulting in the creation of implicit
groups or networks of people based on a number of pre-identified, shared characteristics. 42
Ontologies model knowledge in order to improve the organization, sharing and understanding of
information. They play a crucial role in enabling content-based access, interoperability and
communication, and in providing qualitatively new levels of service on the next-generation web. VIVO
uses ontological approaches to capture the main information and knowledge assets derived from
and requested by research networks43. It reorganizes existing authorized information from
faculty annual reports, institutional scholarly databases, funding records and teaching materials in an
ontological manner so that this information can be repackaged and re-presented to researchers
to facilitate their networking.44,45
The ontology work to date at Cornell and UF informs (but does not wholly determine) the course of
ontology development for this project, to be conducted as a close collaboration between the
community and technical teams under the overall direction of Professor Ying Ding of Indiana
University. Goals include optimal alignment with existing ontologies in wide use, extensibility for local
needs and provision for ontology-level local controls over what information is shared nationally.
Mapping different localized VIVO ontologies for VIVOweb’s multi-institutional scope can be realized
through community efforts to reach agreement on specific mappings. Extensions to and maintenance
of the VIVO ontologies should reflect biomedical community needs and facilitate the visualization,
semantic analysis and networking capabilities developed by Börner, also at Indiana University.
Figure 6 Sample entity structure for a faculty member showing common internal data properties as
well as object property relationships with other entities
Ontology documentation will include information about the ontology’s design principles and guidelines
for local extensions. The ontology team will prepare a set of best practices for training potential users
and facilitating adoption of our technologies and approaches.
Maintaining a modular ontology structure facilitates ontology re-use, ontology mapping and data
integration. The core ontology for VIVO installations will be based on the Semantic Web Research
Community (SWRC) ontology developed by the large European Funded Network of Excellence
KnowledgeWeb46. The SWRC ontology models the major entities of research communities, such as
persons, organizations and publications, and their relationships.
We will also implement mappings where possible to enable VIVOweb data to be queried locally and
nationally using a number of different widely-adopted social ontologies including FOAF (Friend of a
Friend)47, SKOS (Simple Knowledge Organization System)48, DOAP (Description of a Project)49, SIOC
(Semantically-Interlinked Online Communities)50, Dublin Core51, and GEO (Geographic Names)52.
These will ensure interoperability with other data and systems publishing data to the Semantic Web.53
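A mapping between a local ontology and widely adopted vocabularies can be sketched as a table from local predicates to shared ones. In the example below the local namespace, its property names, and the function apply_mapping are hypothetical; the FOAF properties name and homepage are real terms in that vocabulary.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
LOCAL = "http://vivo.example.org/ontology#"  # hypothetical local ontology
PERSON = "http://vivo.example.org/individual/n1001"

# Assumed mapping from local predicates to shared vocabulary terms.
MAPPING = {
    LOCAL + "fullName": FOAF + "name",
    LOCAL + "homePage": FOAF + "homepage",
}

LOCAL_TRIPLES = {
    (PERSON, LOCAL + "fullName", "Jane Researcher"),
    (PERSON, LOCAL + "homePage", "http://www.example.org/~jane"),
    (PERSON, LOCAL + "officeNumber", "B-12"),
}


def apply_mapping(triples, mapping):
    """Return the local triples plus a shared-vocabulary copy of each mapped one.

    Unmapped local statements are kept unchanged, so no local detail is lost.
    """
    exported = set(triples)
    for s, p, o in triples:
        if p in mapping:
            exported.add((s, mapping[p], o))
    return exported
```

Keeping the local statements and adding the mapped ones means that a client which only understands FOAF can still find names and home pages, while local tools continue to see the richer local model.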
Throughout this project we aim to further enhance this ontology to better reflect the requirements
coming from research networks in the biomedical domain, especially through the testing of VIVO in
our partners’ institutions and universities. We will extend the VIVO ontology to support personal work
groups and associated properties to represent the informal relationships evolving around
collaboration, and to allow individuals and groups the option to limit the visibility of these more
informal and dynamic networks and manage them as an independent graph for export to social
networking or collaborative tool APIs.
C.2.c. VIVO in the Institutional Context
Scalability through multiple independently administered installations is a major strength of this
proposal. During the scope of this project, VIVO can provide a customized and extensible presence at
the diverse participating institutions and provide convincing and varied models for propagation under
full local institutional control in the national context. Institutions without broad IT support services will
be able to utilize a more basic version, while larger institutions with more technologically integrated
resources will be able to add additional content modules and more fully integrate the application to
consume existing data sources at that institution and serve as an integrated source of data for other
applications. The VIVO approach as demonstrated at Cornell is designed to transcend the
administrative and organizational constraints of any one institution.
If Cornell and UF are at all typical of research institutions, an integrated view of people, affiliations,
grants, publications, courses, talks, research interests and international activities across internal
organizational units fills a rather glaring void in university data federation and data presentation for
internal and external communications, especially at the level of detail the VIVO platform affords. VIVO
offers a solution to appropriately and efficiently integrate with varied institutional infrastructures.
For most institutions there will be tangible benefits to justify the initial overhead of closer integration of
VIVO into Systems of Record (SOR). VIVO offers ample potential for synergies deriving from its data
integration capabilities, effectiveness as a public web application, and ability to disseminate filtered
data to other services and web sites. The investment required for an institution to interface VIVO’s
authentication framework or to adapt VIVO’s tools to integrate core data will be repaid by improved
data consistency and a higher public visibility for researchers, students and staff; individual buy-in will
be improved by reducing data entry time, and through the VIVOweb network, authoritative and
consistent data will be propagated to the national level.
In the institutional setting, the VIVO installation is interfaced to SOR that indicate who is a faculty
member or researcher and provide basic authoritative information regarding department affiliations,
previous positions, degrees earned and administrative roles. At Cornell, these SOR currently include
human resources, grants, courses, annual faculty reports and the LDAP directory. As the SOR
update, interfaces keep VIVO up-to-date. Established university processes are used to maintain data
in the SOR, while faculty and their proxies continue to maintain data local only to VIVO, such as
research interests, international focus, and professional or service activities. Cornell has also
successfully addressed issues of data stewardship through a clear separation of “faculty as
employee” data (often private) from “faculty as academic” data (largely public) in its faculty annual
reporting. VIVO provides a coherent public outlet for academic and research-focused data that
individual faculty have marked as publicly visible in their annual reporting.
Authentication to VIVO is required in the local setting to gain authorized access to edit information for
a researcher. Researchers or their designated proxies may update information in the researcher’s
profile. Local authentication (a VIVO username and password) is supported, as well as use of
institutional authentication methods such as LDAP/Active Directory and Kerberos. For use cases
involving cross-institutional access to privileged information, federated authentication via Shibboleth54
will be supported. Shibboleth enables researchers to access privileged information in VIVOweb
implementations other than the one at their home institution using credentials from their home
institution, provided they are authorized to access the information.
The VIVO platform will run independently at each institution and offer a local search as currently
configured at Cornell (Ithaca and Weill) and UF; any changes in local content are automatically
reflected in the local index as they are saved. Local installations can, through annotations on the
ontology, limit the range of data elements considered for public viewing or export, in concert with
appropriate administrative staff and institutional policy. The VIVOweb ontology team will focus cross-site
data indexing at a level appropriate for cross-institutional and national discovery, exposing data
through common vocabularies such as FOAF55 as well as the native SWRC-derived internal ontology.
Institutional VIVO portals will make researchers and their multiple interconnections more visible on the
web through standard search indexing, shortly to be enhanced through Google and Yahoo’s recent
announcement that special tags embedded in web pages will be harvested to improve relevance
ranking algorithms and enhance search results. VIVO will support RDFa56, an extensible vocabulary
for referencing relationships via published ontologies within HTML tags on standard web pages.
C.2.d. VIVO in the Internet Context
VIVO is ideally positioned to ingest data from Internet sources such as PubMed and other publications
databases. Some institutions, such as North Carolina State University, maintain an institution-wide
citation database in connection with an institutional repository57 or have licensed special access
to bibliometric tools through commercial databases. Even so, publications, perhaps the leading data
source for research networking, remain poorly exploited by institutional data sources.
The distributed technical and content development teams working across partners during the grant
period will collaborate to streamline the acquisition of each institution’s publication citations from
national and international databases, focusing initially on PubMed. The teams will develop an
improved workflow using web service APIs when available, concentrating on known challenges such
as author disambiguation, where some combination of automated processing and interactive review
will be required. Initiatives for unique identifiers such as the ID.LOC.GOV project (currently limited to
addressing Library of Congress subject headings) and PubMed Unique Identifiers (PMIDs) offer
promise that this problem may become less burdensome in the future. Although several proprietary
systems for unique author identifiers are also being developed, we do not expect these private
systems will be openly available at the scale of entire institutions. VIVO can easily store any number
of identifiers to help disambiguate authors and investigators, but it will use an institution-based URI as
the primary identifier for any individual entry following Linked Data standards that rely only on
standard HTTP requests and responses and institutional domain name registration rather than any
special resolution services. If a person moves from one institution to another, a standard HTTP
redirect can be returned and the redirect accomplished without a user’s intervention or even
knowledge.
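The redirect behavior described above can be illustrated with a minimal sketch. It assumes a hypothetical
lookup table (MOVED) that maps the path of an individual's former institutional URI to the new URI; the
host name, the paths and the choice of a 301 status code are illustrative assumptions, not part of VIVO.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical table of individuals who have moved to another institution.
MOVED = {
    "/individual/jsmith": "http://vivo.institution-b.example/individual/jsmith",
}


def resolve(path):
    """Return (status, location) for a requested individual URI path."""
    if path in MOVED:
        # Assumption: a permanent redirect (301) signals the move.
        return 301, MOVED[path]
    return 200, None


class IndividualHandler(BaseHTTPRequestHandler):
    """Answers GET requests using only standard HTTP status codes and headers."""

    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location:
            # The client follows the Location header without user intervention.
            self.send_header("Location", location)
        self.end_headers()
```

Because the redirect is an ordinary HTTP response, any standard client or crawler can follow it without
a special resolution service.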
Initiatives such as the Indiana University Scholarly Database58 with 22 million paper, patent, and grant
records, as well as the growing datasets available as Linked Data, offer largely untapped sources of
additional information to enhance local institutional data about researchers. The Bio2RDF project59
provides Linked Data for dozens of data sets, including microarray, pathway analysis, human genome
and protein data. The ability to make RDF connections as appropriate to biomedical data and to
reuse existing efforts ranging from RDF versions of MeSH to authoritative databases for referencing
species and geographic places makes it possible for VIVO to augment researcher social networks
with rich descriptions of the content of their research.
C.2.e. VIVO in the Semantic Web
While Cornell’s VIVO links researchers across four physical campuses and numerous disciplines and
departments within one software installation, multiple independent instances of VIVO will be
interlinked as VIVOweb to support cross-institutional discovery and networking. This active networking
across a national body of diverse institutions will be promoted by a cross-site search engine as well as
through exploration tools employing network analysis and visualization. Knowledge and expertise
navigation, management and utilization will be supported through network analysis and visualization
services.
The cross-institutional search will allow a VIVO system at one institution to query across all
relationships in the national network. An example of this would be a researcher querying a local VIVO
system with a question such as, “Who are all the people in New York who are working on astroviridae
infections?” VIVO will run independently at each institution and offer local search and editing services,
but many times there will be information relevant to a search at a VIVO system at a non-local
institution. To provide results to queries across institutions the data from institutions will be
aggregated by a distributed system which will operate as part of VIVOweb. Each institutional system
will be able to query against the distributed system containing the aggregated data indexes.
The VIVO instance at each institution will not only provide a web-based front end for querying and
browsing, but it will also contribute a node to a clustered system for the support of a distributed
national search index. This clustered system will aggregate all of the information from the local VIVO
systems and process it into a full-text index and an RDF index. The RDF index will service queries
based on the relations between entities in the system, and the full-text index will allow unstructured
term-based searching.
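The division of labor between the two indexes can be sketched as follows. This is a toy in-memory
stand-in, not the proposed clustered implementation; the class name, the property names (locatedIn,
researchInterest) and the person identifiers are hypothetical.

```python
from collections import defaultdict


class NationalIndex:
    """Toy stand-in for the aggregated RDF index and full-text index."""

    def __init__(self):
        self.by_predicate_object = defaultdict(set)  # (predicate, object) -> subjects
        self.by_term = defaultdict(set)              # lower-cased term -> subjects

    def add(self, subject, predicate, obj):
        # Relation index: supports queries on how entities are connected.
        self.by_predicate_object[(predicate, obj)].add(subject)
        # Full-text index: every word of the object value becomes a search term.
        for term in str(obj).lower().split():
            self.by_term[term].add(subject)

    def related(self, predicate, obj):
        """Subjects that have the given relation to the given object."""
        return set(self.by_predicate_object.get((predicate, obj), set()))

    def search(self, text):
        """Subjects matching every term of an unstructured query."""
        terms = text.lower().split()
        if not terms:
            return set()
        return set.intersection(*(self.by_term.get(t, set()) for t in terms))


index = NationalIndex()
index.add("person:1", "locatedIn", "New York")
index.add("person:1", "researchInterest", "astroviridae infections")
index.add("person:2", "locatedIn", "Florida")
index.add("person:2", "researchInterest", "astroviridae infections")

# "Who are all the people in New York who are working on astroviridae infections?"
matches = index.related("locatedIn", "New York") & index.search("astroviridae infections")
print(sorted(matches))  # ['person:1']
```

The relation lookup answers the structured part of the question (location), while the term lookup
answers the unstructured part (topic); the result is the intersection of the two.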
The clustered component of the system will be built using Apache Hadoop60 and HBase61. The
Hadoop framework transparently provides the execution of parallelized jobs such as aggregation of
data from local VIVO systems, construction of indexes and processing-intensive visualization jobs.
Hadoop also provides for transparent distributed data storage, which will be critical to scaling when
managing aggregated datasets. HBase, a database built on top of Hadoop, will be used to store the
aggregated RDF and to service relation-based queries.
The national index will be updated daily with changes from the local VIVO systems by a job which
runs on the cluster and pulls data from the local systems. Running a separate index for the national
network will enable local control over what is exposed for indexing and allow the national index to filter
content at the level of the ontology appropriate for national-level discovery and networking.
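A minimal sketch of such a daily pull, under stated assumptions: each local installation exposes its
statements as (subject, predicate, object) tuples, and a hypothetical set EXPORTABLE stands in for the
ontology annotations that mark which properties may leave the institution. The vivo: and hr: property
names are invented for illustration.

```python
# Hypothetical annotation: properties a local installation allows to be exported.
EXPORTABLE = {"foaf:name", "vivo:memberOf", "vivo:researchInterest"}


def harvest(local_triples, exportable=EXPORTABLE):
    """Keep only statements whose property is marked exportable locally."""
    return [t for t in local_triples if t[1] in exportable]


def merge_daily(national, site, harvested):
    """Return a copy of the national index with one site's slice replaced."""
    updated = dict(national)
    updated[site] = list(harvested)
    return updated


local = [
    ("person:1", "foaf:name", "Jane Doe"),
    ("person:1", "hr:salaryGrade", "12"),  # never leaves the institution
]
national = merge_daily({}, "cornell", harvest(local))
```

Filtering before the merge keeps the decision about what is exposed with the local installation rather
than with the national index.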
The analysis and visualization tools developed by the Indiana University team will also access the
distributed VIVO instances for source data in the form of RDF triples. If computationally intensive
processing is required by the analysis tools it will be executed using the Hadoop cluster.
No central hub will be needed to support national indexing, nor will the full content of any local
installation be pulled into a central index or database. The VIVO software will be modified to allow
local users to extend searches to the national index, when so desired. Initiation of queries and display
of search results will be supported through REST style web services returning common data formats
such as HTML, JSON, or XML.
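As an illustration of returning the same search results in more than one common format, the sketch below
serializes a result list as JSON or XML using only the Python standard library. The function name, the
result fields (uri, label) and the element names are assumptions made for this example; the actual
VIVOweb service interface is not specified here.

```python
import json
import xml.etree.ElementTree as ET


def render_results(results, fmt="json"):
    """Serialize a list of {'uri': ..., 'label': ...} dicts as JSON or XML text."""
    if fmt == "json":
        return json.dumps({"results": results}, sort_keys=True)
    if fmt == "xml":
        root = ET.Element("results")
        for r in results:
            item = ET.SubElement(root, "result", uri=r["uri"])
            item.text = r["label"]
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported format: {fmt}")


rows = [{"uri": "http://vivo.institution-a.example/individual/n1", "label": "Jane Doe"}]
json_body = render_results(rows, "json")
xml_body = render_results(rows, "xml")
```

A REST-style endpoint would typically choose the format from the request (for example a query parameter
or an Accept header) and pass it to a function of this kind.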
C.2.f. Network Analysis and Visualization
While many cross-institutional relationships can be mapped directly through co-authorships, shared service
on professional committees, joint grant projects, and similar direct linkages, more extensive
relationships can be discovered or prospectively suggested through network-enabled analysis of text
content, linkages to common keywords and evolving patterns of relationships that indicate common
experience or research interests62. The Indiana University development team will investigate
analysis-driven enhancements, query tools, and visualization tools to build pathways for discovery across
multiple VIVO instances and evaluate the potential of such techniques at the scale of a national
network.
At all three levels—the individual, institutional, and national—multiple techniques can be applied to
identify trends, patterns and outliers in support of insight and easy interpretation, including temporal,
geospatial, topical (semantic text mining) and network analysis techniques63. Exactly what analysis
and visualization techniques are most appropriate depends very much on the final set of supported
user needs, the available data and the delivery mechanism. In many ways, the most directly
communicable forms of analysis based on transparent linkages will be most effective.
Börner’s team at Indiana University has developed scholarly knowledge management tools over the
past four years and has actively been using them for three years64. Samples are available.65,66
Diverse approaches to analyze and visualize scholarly data have been developed and tested. Among
them are tools for the visualization of evolving co-authorship networks67 such as those shown at
geospatial visualization of conference attendances, co-investigator networks (Figure 7, left) or
advisor-funding-student networks (Figure 7, right).
Figure 7 Exemplary project-investigator network (left) and Advisor-Project-Student network with
faculty member in center (right)
To reduce the cognitive load associated with the learning of new network layouts or ‘reference
systems’, static base maps such as geospatial maps or maps of science can be used. An exemplary
visual interface to neuroscience jobs in the U.S. is given in Figure 8, left. Co-authorship patterns or
other linkages can be overlaid over geospatial or topical reference systems as well (see Figure 8,
right).
Figure 8 Interactive on-line browser interface to neuroscience jobs in the United States (left) and
overlaid geospatial world map (right)
Figure 9 UCSD Map of Science with sample data overlays of expertise profiles
Figure 9 shows the UCSD Map of Science68 covering all sciences as well as the arts and humanities –
23,748 journals indexed by Scopus and Reuters/Thomson Scientific (ISI SCI, SSCI, and A&H
Indexes). Each of the 13 main scientific disciplines is labeled and color coded in a metaphorical way,
e.g., Medicine is blood red and Earth Sciences are brown as soil. Circle size denotes the number of
papers and multiple graphs can be prepared and animated over time. In this manner, VIVO usage
per science area can be identified based on the journals in which researchers in VIVO publish.
The map can be used to communicate the ‘intellectual footprint’ or ‘trajectory’ over the landscape of
science for one individual researcher based on papers s/he cites and/or publishes. It has also been
used to communicate the (evolving) ‘expertise profiles’ of institutions and even countries. The map
can also be used to communicate the very different temporal dynamics of scientific disciplines, bursts
of activity, or emergent research frontiers.
This part of the project will benefit from the interdisciplinary, multi-institution Network Workbench69,70
(NWB) tool development project led by Börner. The NWB tool supports the large-scale analysis of
scholarly data including publication, citation and joint investigator relationships. It provides access to
more than 110 algorithms relevant for the study of social networks, and can be used to quickly test
and refine analysis workflows and visualizations in support of effective research networking.
C.2.g. VIVO Networking for Researchers and Groups
The root object of interest in a research networking infrastructure is the individual researcher node.
Researchers, whether in the role of author, investigator, faculty member, inventor or trainee, can have
multiple attributes and linkages depending on these roles as reflected in the ontology structure. We
anticipate there might be upwards of 1,000,000 researcher nodes in the distributed VIVOweb system.
Researchers form themselves into groups: formally constituted research teams, institutes and centers,
informal project staff and networks of common interest. A researcher might be a member of several
dozen groups. Groups will vary in size from a few people to hundreds of researchers, and we expect
to support the tasks of group formation, management and productivity. We envision that creating a
group or research network proceeds in a very similar manner to friending people in Facebook: simply
find a person, ask whether he/she wants to join a group and, upon confirmation, both researchers are
connected to a ‘group’ node. Formal groups will be populated by systems of record and authorized
individuals.
Team formation typically requires understanding the expertise, resources and network connections of
each participating researcher. It also benefits from seeing what a new member adds to an existing
team through new connections based on subject area, research activities, affiliations and other forms
of linkages.
Group management benefits from a local view of the triples (person, member of, team) that make up
the researchers/students in a group.
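The local view of membership triples can be sketched as a small set of (person, member of, team)
statements with two lookups and a confirmation step. The property name and the identifiers are
hypothetical, and the confirmation flag stands in for whatever invitation workflow is eventually built.

```python
MEMBER_OF = "memberOf"  # hypothetical property name


def members(triples, team):
    """All persons connected to the given group node."""
    return sorted(s for s, p, o in triples if p == MEMBER_OF and o == team)


def groups_of(triples, person):
    """All group nodes the given person is connected to."""
    return sorted(o for s, p, o in triples if p == MEMBER_OF and s == person)


def join(triples, person, team, confirmed):
    """Add a membership statement only after the invited person confirms."""
    if confirmed:
        return triples | {(person, MEMBER_OF, team)}
    return triples


graph = {("person:1", MEMBER_OF, "team:virology")}
graph = join(graph, "person:2", "team:virology", confirmed=True)
graph = join(graph, "person:3", "team:virology", confirmed=False)
```

After these calls, the virology team contains person:1 and person:2; person:3 declined and is not
connected to the group node.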
Group productivity requires effective exploitation of strong and weak linkages of researchers, effective
communication of intermediate and final results and evaluation of researchers’ contributions as input
to future group formation. VIVOweb will provide bi-lateral data exchange that can be used to interface
VIVOweb installations to group productivity tools.
Support for groups will be a key area of new development in the ontology, so that groups can be
linked not only to multiple investigators, but also to publications, grants and facilities. Access controls
leveraging the ontology structure will provide fine-grained, contextual control over viewing and editing,
an important feature for individuals wishing to use VIVOweb for personal and team-based networking,
where some connections may be speculative or private, especially when just forming.
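One way to picture ontology-driven access control is a per-property visibility annotation that is
compared with the viewer's relationship to the profile. The sketch below is an assumption-laden
illustration: the three levels, the property names and the default of treating unannotated properties
as private are all invented for this example.

```python
# Hypothetical visibility levels, from least to most privileged viewer.
LEVELS = {"public": 0, "group": 1, "self": 2}

# Hypothetical per-property annotations; unlisted properties default to "self".
VISIBILITY = {
    "name": "public",
    "publication": "public",
    "prospectiveCollaborator": "group",
}


def visible(triples, viewer_level):
    """Return the statements a viewer at the given level is allowed to see."""
    rank = LEVELS[viewer_level]
    return [t for t in triples if LEVELS[VISIBILITY.get(t[1], "self")] <= rank]


profile = [
    ("person:1", "name", "Jane Doe"),
    ("person:1", "prospectiveCollaborator", "person:7"),
    ("person:1", "privateNote", "draft aims"),
]
```

Here a public viewer would see only the name, a group member would also see the speculative
collaboration, and only the researcher would see the private note.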
C.3.
Implementation
Each of the seven schools will implement VIVOweb and join a prototype national research network.
Implementing VIVOweb involves hosting the VIVO platform, populating VIVOweb with information
regarding the researchers at the institution, and creating a community of practice around support and
maintenance of the platform and its data.
C.3.a. Release 1 Implementation
To achieve the goal of prototyping a national network for research networking, we propose to deploy a
local VIVO application at each partner institution. VIVO will be installed and configured by the local
institution based on the availability of institutional data sources and configured for interactive editing in
accordance with the institution’s authentication systems.
VIVO can provide a self-contained, local research networking solution featuring a public web display
portal for researcher interests and accomplishments (see Figure 1). VIVO can consume RDF from
any source and ontologies created using any standard editor such as Protégé.71 Phase 1 will prioritize
connections with local authentication systems to ensure that data can also be modified by researchers
who log in with their institutional credentials to use the self-editing component in VIVO. Local
modifications to the VIVOweb ontology will also be possible through interactive editing screens.
Administrative editing roles will typically be assigned to librarians or research support professionals,
with student labor to capture data from CVs or existing web sites. This model enables small
biomedical research institutes or any institution without assured central IT support to provide an
attractive research networking system “out of the box,” capable of serving as a proof of concept to
elicit commitments of scarce institutional IT resources for localized authentication and tapping into
data sources of record.
Local VIVO installations will be sustainable only if data are current and accurate. Researchers have
little time to maintain information in their profiles. As a result, data ingest will be a critical part of the
technical innovation for institutional adopters of the VIVO platform. The early focus will be on
implementing an accepted common ontology (such as the Semantic Web for Research Communities
ontology adopted by UF’s instance, GatorScholar), and on setting up data feeds from institutional
sources for authoritative human resource information (active personnel, titles, affiliations), grants and
publications from PubMed and other databases such as Web of Science or Scopus, depending on
local licenses. Ingest procedures will be implemented in year 2 to harvest information from faculty
reporting systems in use at partner institutions.
VIVO at Cornell is populated primarily by data feeds from the PeopleSoft human resources database,
from an Oracle grants database and from a PeopleSoft student records system that provides course
information. XML web services from a new, externally-hosted faculty reporting system will provide
very granular information directly from annual updates by faculty from several Cornell colleges, using
workflow tools that identify both additions and deletions. The campus LDAP server provides updates
to contact information, and feeds to a new university events calendar and news service are underway,
following the model of leveraging any and all appropriate existing institutional data sources to assure
information currency and to allow maintenance for each data element to happen in the database of
record.
Development at UF will improve the data ingest workflow from SOR, which most frequently involves
converting data from relational databases into the statement-based Semantic Web data model
through the vehicles of CSV or tab-delimited text files, direct database views, XML data files or web
services. Workflow templates to keep VIVO updated from SOR will be included in the distribution
package for implementation as automated or semi-automated processes depending on local
situations, allowing VIVO to be updated using established institutional policies and procedures.
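The conversion from a relational export to the statement-based model can be sketched for the CSV case.
This is a minimal illustration, not one of the planned workflow templates: the URI base, the column
names and the use of column headers as property names are assumptions.

```python
import csv
import io

BASE = "http://vivo.institution-a.example/individual/"  # hypothetical URI base


def rows_to_triples(csv_text, id_column="id"):
    """Turn each non-empty, non-key cell of a CSV export into one statement."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + row[id_column]
        for column, value in row.items():
            # The key column names the subject; empty cells produce no statement.
            if column != id_column and value:
                triples.append((subject, column, value))
    return triples


export = "id,name,department\nn1,Jane Doe,Pathology\nn2,John Roe,\n"
statements = rows_to_triples(export)
```

Each row becomes one subject and each populated column becomes one statement about it, so a record with
a missing department simply yields fewer statements rather than an empty value.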
Each institution’s VIVO will become part of a distributed computing cluster that will harvest data from
each local node for a cross-institutional search index and for network analysis and visualization. The
local VIVO application will be used to edit and display profiles of researchers at an institution in the full
context of their affiliations, activities and accomplishments. VIVO has already been independently
installed at UF and at two international locations. The partners in this proposal have committed to
begin working with local VIVO installations from the beginning of the grant period to allow maximum
opportunity for formative evaluation and feedback within the first year.
Phase 1 will include providing a level of documentation and quick start guides to allow installation of
“VIVO in a box” by information technology professionals with no previous experience with the
constituent open source tools. The VIVO platform will be further documented and installation
packages developed during phase 1 to facilitate deployment at additional institutions (Section C.2.a).
The ontology group will also enhance the core VIVO ontology to improve direct interoperability with
ontologies in the research project, temporal, geographic, publication and biomedical domains.
C.3.b. Release 2 Implementation
Research and development at Indiana University initiated during phase 1 will add more direct support
for personal and institutional-level networking and for reporting and query tools designed to support
prospective discovery of collaborators to complement relationships discoverable through
co-authorship and other more direct common affiliations presently visible in VIVO (Section C.2.c). These
features are also discussed in more detail as use cases in Section B.1 above. Cornell and UF will add
features previously described. The resulting release will fully implement networking of researchers
among the seven schools. Each school will upgrade their VIVO release 1 system to release 2.
Upgrades are intended to be straightforward. Initial seeding of the databases and maintenance of
data will have taken place during the implementation of the first release – implementation of release 2
is intended to demonstrate the long-term viability of locally supported VIVOweb maintenance.
During release 2, data from these institutionally-hosted VIVO systems will be made available for local
harvesting and repurposing using standard RDF syntaxes such as RDF/XML72. National networking
capability will be fully realized during this phase, and enable more institutions to join the network.
Institutions will be encouraged to join the network as the first seven complete their release 2
implementations.
C.3.c. Release 3 Implementation
Release 3 will be developed over the last six months of the two-year grant period. Features and
improvements will be driven by the evaluation of the first two releases and the governance process.
Release 3 will be available prior to the end of the grant period and marks the transition of VIVOweb to
community-supported open source. The seven participating schools are not expected to implement
the system as part of their work on this proposal. A positive outcome would be for schools to accept
responsibility for the care and maintenance of their VIVOweb systems based on the utility they have
observed during the grant period.
By release 3 we anticipate and welcome adoption by schools outside the initial group of seven. By
broadening the community we begin the path to true research networking.
C.4.
Dissemination
The dissemination and adoption of VIVOweb by institutions will be fueled by outreach efforts coupled
with a strong technical and community support model. A researcher network platform may be
technically very sound, but will only be used and maintained if it is of value at multiple levels as has
already been mentioned—and if the stakeholders are well-supported, and completely understand and
appreciate the value of participating in the network. While a large part of this value is provided by
technical innovation, experience with the VIVO platform at Cornell and UF has indicated that
sustained outreach efforts targeted at administrators and researchers alike to publicize the tool and its
value can have immense and lasting positive ramifications. Dissemination efforts will also have to take
into account the likelihood that national dissemination will differ from adoption by participating
institutions—the “early adopters”—characterized by author Geoffrey Moore73 as those “…who have
the insight to match an emerging technology to a strategic opportunity…”. Although Moore’s book
relates primarily to commercial products, many of its principles apply and will be employed in the
VIVOweb dissemination plan.
C.4.a. Dissemination within Participating Institutions
Proactive and “apolitical” service and support, together with “marketing” VIVO to researchers and
administrators, have been key to dissemination and adoption of the VIVO platform at Cornell and UF,
and will be heavily relied upon as VIVOweb is disseminated within participating institutions. Based on
experience accumulated through these implementations, support for participating institutions will
primarily focus on short one-on-one or group presentations that highlight the benefits to researchers
and the institution of participating in VIVOweb, and inform users and administrators of the ease in
maintaining personal information. Technical backstopping by personnel on local and national
development teams, and building collaborations with institutional information technology will be a key
support element. However, the most effective provision of the liaison and outreach activities that these
goals presuppose will require community support, which will be provided by information specialist
facilitators in the institutional library system.
Personnel with important roles in this effort will ideally exhibit good understanding of individual and
institutional research programs, activities, and needs, have the ability to effectively navigate the
administrative and political landscape, be able to communicate confidently and knowledgeably with
their stakeholders, and be capable of conveying essential information about VIVOweb without resort
to technical jargon and details. They will liaise not only with institutional stakeholders, but also with
personnel within the national coordination and implementation teams, to implement a locally viable
process and workflow to promote dissemination. They will also work with the metadata and other
librarians on the project team (such as those with expertise in MeSH, CTSC activities, metadata, and
ontologies) to ensure that the project is responsive to the need for local modifications—to the core
ontology, for instance. Responsiveness to user feedback is critical to ensuring the successful
dissemination of VIVOweb.
A well-conceived and implemented dissemination effort for participating institutions will earn a good
reputation for VIVOweb, an essential feature of the national dissemination endeavor.
C.4.b. National Dissemination
The national dissemination effort will concentrate on promoting adoption of VIVOweb by new
institutions that are not early adopters. Moore divides this group in two, the early majority (or
“pragmatists”) and the late majority, and identifies the pragmatists as the major barrier to successful
dissemination of a technology product. According to Moore:
“Overall, to market to pragmatists, you must be patient. You need to be conversant
with the issues that dominate their particular business. You need to show up at the
industry-specific conferences and trade shows they attend. You need to be
mentioned in articles that run in magazines they read. You need to be installed in
other companies in their industry. You need to have developed applications that are
specific to their industry. You need to have partnerships and alliances with the
other vendors who serve their industry. You need to have earned a reputation for
quality and service.”
Based on this analogy, our pragmatists are likely to be encountered when the national dissemination
effort begins, and success at fostering adoption of VIVOweb at this scale will likely require patience,
familiarity with current issues in biomedical research, and a presence at biomedical events and venues
and, if possible, in the literature. That VIVOweb will be an application that serves needs in biomedical
research is a given, as is the fact that it will collaborate with and draw information from other
biomedical service providers such as PubMed and other data and resource discovery sources as
much as possible.
Movement towards national marketing and dissemination will begin immediately after funding begins,
even though the early dissemination focus will be on participating institutions. It is clear that with the
essential role that these early adopters will play in the development and implementation of VIVOweb,
libraries are a constituency to which to market VIVOweb as one way to promote adoption. The
Medical Library Association (MLA) is the primary library association for librarians who serve
biomedical researchers, and its annual conference would be a logical place to introduce
VIVOweb. However, with funding beginning in September, and MLA not meeting on a national basis
until May 2010, initial marketing to this audience will begin at the regional level with presentation
and/or exhibition at MLA chapter conferences. In 2009, 8 chapter meetings will be held between 21
September and 1 November (with a ninth meeting in January 2010) covering all regions of the United
States.
Some other appropriate scientific, library, and informatics related conferences that will be covered
include the American Association for the Advancement of Science Annual Meeting (February 2010 in
San Diego); the American Society for Microbiology Annual Meeting (May 2010 in San Diego); the
American Medical Informatics Association’s Annual Symposium (AMIA; November 2009 in San
Francisco, November 2010 in Washington, D.C.); AMIA’s Summit on Translational Bioinformatics
(Spring 2010 in San Francisco); AMIA’s Spring Conference (May 2010 in Phoenix); the Special Libraries
Association Annual Conference (June 2010 in New Orleans); and the Association of Research
Libraries, which meets as part of the American Library Association’s annual conference (June 2010 in
Washington, D.C.).
Another way to introduce VIVOweb to library decision-makers is to present at the Association of
Academic Health Libraries, a group made up of library directors and associate directors that will meet
in Boston in November of 2009. It is imperative that VIVOweb also be introduced to potential end-users
(researchers) through presentation and exhibition at their conferences. The American
Association for the Advancement of Science Annual Meeting is a natural choice for exhibition. Another
way to advertise VIVOweb to potential end-users is through correspondence with associations such
as the National Academy of Sciences, the Institute of Medicine, and the National Academy of
Engineering. An obvious group to whom to advertise is the CTSA Consortium. Additional methods of
dissemination will include the use of association email lists, demonstrations on YouTube, and an
advertising webpage dedicated to VIVOweb.
C.5.
Evaluation
The end goal of this project is to have a tool for researchers to facilitate research networking,
collaborations, and data sharing to improve scientific dissemination. To provide evidence that the
VIVO project meets this goal, we will employ various evaluation techniques to assess a minimum of
six objectives related to VIVO support, implementation, and dissemination (see Table 4). The Washington
University (WU) team will lead the evaluation of VIVO at each site, gathering information using
data-mining, surveys, observational analysis, and personal interviews. We will focus on usability and
outcome evaluations. The usability evaluation will be designed to document and analyze the
implementation of VIVO at the adoption sites, assessing implementation completion, consistency
among institutions, and gaps between design and delivery. Outcome evaluations will
be designed to assess the impact, benefits, and changes at each institution. The evaluation will be
guided by the milestones outlined in Table 7 and Table 8 in section E.5.
As outlined by the National Institute of Standards and Technology, five key attributes will be assessed
through the usability evaluation: learnability, efficiency, memorability, errors, and user satisfaction74
using evaluation methods such as data-mining, surveys, task analyses, and focus groups. We will
employ data-mining metrics using WU-developed code or through a pre-packaged tool such as
Morae75 to conduct the usability evaluation. Through data-mining we will monitor VIVO usage,
gathering data such as page views, number of visits, and unique views to analyze the
usage of VIVO by volume of participants and quantity of information viewed. Additional measures may
include referring and referral websites, successful and failed search results, and path analysis.
While website monitoring will be continuous, data will be collected on a quarterly basis. User testing
will be incorporated into the evaluation where trained evaluators and software developers will observe
and record end-users interacting with VIVO in both general and specific tasks. This will identify
specific uses and difficulties with VIVO for reporting and further action. To understand how end-users
utilize VIVO, we will conduct a task analysis and compare this with the goals of VIVO creators and
conduct focus groups to understand key information such as barriers to usage and suggestions for
improvements.
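The usage measures described above (page views, visits, and unique visitors, reported quarterly) can be sketched as follows. This is a minimal illustration under stated assumptions, not the WU-developed code: the record layout (timestamp, visitor ID, page), the 30-minute inactivity threshold used to separate visits, and the function names are all hypothetical choices made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed rule: a gap of more than 30 minutes between requests starts a new visit.
VISIT_GAP = timedelta(minutes=30)


def quarter_of(ts):
    """Return a label such as '2009-Q4' for a timestamp."""
    return f"{ts.year}-Q{(ts.month - 1) // 3 + 1}"


def summarize_usage(records):
    """Aggregate (timestamp, visitor_id, page) records by calendar quarter.

    Returns a dict mapping each quarter label to its page views,
    visits, and unique visitors.
    """
    page_views = defaultdict(int)
    visits = defaultdict(int)
    visitors = defaultdict(set)
    last_seen = {}  # visitor_id -> timestamp of that visitor's previous request

    # Process requests in chronological order so visit gaps are measured correctly.
    for ts, visitor_id, _page in sorted(records):
        quarter = quarter_of(ts)
        page_views[quarter] += 1
        visitors[quarter].add(visitor_id)
        previous = last_seen.get(visitor_id)
        if previous is None or ts - previous > VISIT_GAP:
            visits[quarter] += 1
        last_seen[visitor_id] = ts

    return {
        quarter: {
            "page_views": page_views[quarter],
            "visits": visits[quarter],
            "unique_visitors": len(visitors[quarter]),
        }
        for quarter in page_views
    }
```

Continuous monitoring would feed records into such a function as they accumulate, with the per-quarter summaries supplying the figures for the quarterly reports.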
A set evaluation tool such as the MIT Usability Guidelines76 will provide measurement consistency across
evaluators over the two-year project period. The guidelines include assessing navigation, functionality,
user control, language and content, online help and user guides, system and user feedback, web
accessibility, consistency, error prevention and correction, and architectural and visual clarity.
Adopting institutions will be given these guidelines at the beginning of the program, and two
assessments will take place for each institution with a report with recommendations returned to the
institutions within two months after evaluation completion.
Items to be assessed and analyzed for the outcomes evaluation include inputs, activities conducted,
outputs, outcomes, and outcome indicators. At the initiation of the program a survey will be designed
and administered to assess the expectations for VIVO program implementation and usage by
adopting institutions. Follow-up evaluations of the same persons will be conducted at the end of the
first and second years, incorporating dissemination and adoption practices. All surveys will be
conducted online, with a link sent to each potential respondent’s e-mail address. Personal interviews with
researchers, site representatives, IT implementers, and other key personnel will be conducted by Dr.
Leslie McIntosh and a junior bioinformatics specialist. Additionally, we will conduct web searches for
VIVO references in presentations, papers, and other documentation sources. When necessary, both
quantitative and qualitative questions will be designed and delivered through structured and
unstructured interviews.
Through the methods described, we will answer the following questions:
1. How well does the software meet the needs of investigators for finding appropriate people for
collaboration and research?
2. How well does the software meet the needs of institutions for learning about their own
activities?
3. How much effort is involved in implementing, hosting and maintaining the system and the data
stored in the system?
4. How has the 2-year grant program addressed the issues of sustainability of development and
support for the software?
5. How accurate and timely are the data at each institution? Is the accuracy related to the
techniques used to implement and support the software? How? What recommendations can
be made for improvement?
We will apply for IRB exemption or submit the necessary paperwork to satisfy IRB requirements at
implementation sites for the evaluation. The evaluation team at Washington University will prepare
quarterly reports, which will be made available to the executive advisory board.
The project plan addresses governance, technical design, implementation, dissemination and
evaluation based on the development of a community supported by the libraries, and focused technical
activity to extend VIVO’s capabilities for national networking of researchers.
Table 4 Primary Objectives to be evaluated in VIVO assessment
Objective | Assessment | Evaluation method
Support network will be in place | Identify individuals tasked with outreach and support in the network; evaluate communication/interaction among persons | Discussions with VIVO Wiki and Advisory Board; on-line survey incorporating social network analysis; telephone interviews
VIVO implemented at adopting institutions | Visit VIVO websites at adopting institutions; web-based surveys to key personnel at implementation sites | Trained evaluators will employ usability guidelines; report the number of institutions using VIVO and descriptive statistics of website use
VIVO support services and training meets the needs of users at VIVO implementation sites | Evaluate success of training (both in-person instruction as well as “just-in-time” web-based tutorials) | Web-based surveys upon completion of training modules; follow-up survey within two months after training
VIVO disseminated beyond initial adopters | Collect evidence demonstrating presentations given to promote VIVO; educational outreach activities | Web search of presentations (e.g. use Google Scholar); on-line survey to key personnel at implementation sites
VIVO accessed and used by diverse user community | Visit VIVO websites at adopting institutions | Data-mine usage of VIVO sites including incoming IP addresses
VIVO community support developed beyond initial implementers | Monitor on-line VIVO forums | Data-mine forum content, robustness, and end-user usage
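One way the analysis of incoming IP addresses listed in Table 4 might be approached is to classify each request by where it originates. The sketch below is illustrative only: the site names and address blocks are hypothetical (they use reserved documentation ranges, not real campus networks), and the function names are assumptions introduced for the example.

```python
import ipaddress
from collections import Counter

# Hypothetical campus address blocks (reserved documentation ranges,
# standing in for the real networks of adopting institutions).
CAMPUS_NETWORKS = {
    "Site A": ipaddress.ip_network("192.0.2.0/24"),
    "Site B": ipaddress.ip_network("198.51.100.0/24"),
}


def classify(address):
    """Return the site whose network contains the address, else 'external'."""
    ip = ipaddress.ip_address(address)
    for site, network in CAMPUS_NETWORKS.items():
        if ip in network:
            return site
    return "external"


def origin_counts(addresses):
    """Count requests per originating site for a list of IP address strings."""
    return Counter(classify(address) for address in addresses)
```

A high share of "external" requests, or requests spread across several sites, would be one rough indicator that VIVO is reaching users beyond the institution hosting it.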
D. Role of the Participating Institutions and Staffing of the Project
D.1.
Participating Institutions
Seven schools will serve as early adopters of the VIVO system (see Table 5). These schools
represent significant diversity in terms of size, geography, student population and NIH activity. All
seven schools have NCRR centers. Five of the schools have CTSA awards – UF, Weill Cornell
Medical College, Indiana University, Scripps Research Institute and WU. Three schools – UF, Cornell
University and Indiana University – will participate in the technical activity required to develop
subsequent versions of VIVO. All seven schools will implement two versions of VIVO, the current
version and the version to be developed under this proposal (see Project Deliverables and Timeline).
As part of the implementation, all seven schools will participate in the evaluation of VIVO and its use
by researchers.
Table 5 Participating Institutions
School | Role
University of Florida | NCRR, CTSA, Development, Implementation, Indexing, Lead
Cornell University, Ithaca | NCRR, Development, Implementation, Indexing
Indiana University | NCRR, CTSA, Development, Implementation
Washington University | NCRR, CTSA, Implementation, Evaluation
Weill Cornell Medical College | NCRR, CTSA, Implementation
Scripps Research Institute | NCRR, CTSA, Implementation
Ponce Medical School, Puerto Rico | NCRR, Implementation
The University of Florida (UF) will serve as the lead institution and developer of interfaces and
packaging for rapid deployment of VIVO at other institutions. UF is the fourth largest university in the
United States, with over 51,000 students on its Gainesville campus. As a land grant university, UF
operates in all 67 counties across the state of Florida. Research awards to UF faculty account for
$576M of external support annually. Through its Clinical and Translational Science Institute, UF is
affiliated with the Moffitt Cancer Center77, Shands HealthCare78, the Malcom Randall Veterans
Affairs Medical Center of the North Florida/South Georgia Veterans Health System79, the largest
in the country, and the Burnham Research Institute80 in Orlando. UF is the lead institution for the
project. Efforts include overall project direction, facilitation of governance processes and structures,
local implementation and support, participation in evaluation, development of VIVOweb interfaces to
SOR and other platforms, packaging of VIVO software for rapid deployment, identity management
support, instructional media and design, and coordination of site implementations.
Cornell University, Ithaca will serve as the lead institution for the extension of VIVO for national
networking. Founded in 1865 by Ezra Cornell and Andrew Dickson White, Cornell is the federal land-grant
institution of New York State, a private endowed university, a member of the Ivy League/Ancient
Eight, and a partner of the State University of New York. It consists of fourteen colleges and schools:
seven undergraduate units and four graduate and professional units in Ithaca, two medical graduate
and professional units in New York City, and one in Doha, Qatar. The Ithaca campus includes 1,627
faculty, 13,562 undergraduate students, and 6,077 graduate and professional students. Life Sciences
research at Cornell cuts across most of the colleges and schools with 44 graduate fields from animal
breeding to zoology. It is a particular focus of both Cornell’s College of Agriculture and Life Sciences
and its College of Veterinary Medicine. The Cornell University Library is one of the twelve largest
academic research libraries in the United States, with a long history of research and development in
the area of digital information services. The Albert R. Mann Library, part of CUL and the home of
VIVO, offers one of the country’s best library collections in agriculture, life sciences and human
ecology, as well as providing extensive computing facilities, a broad suite of digital media technology,
tools for GIS, hands-on workshops, customized reference consultations and a range of other services.
Mann Library is internationally known for its digital library efforts including VIVO, TEEAL (The
Essential Electronic Agricultural Library) and the USDA Economics, Statistics, and Market Information
System. Efforts at Cornell include leadership of the Technology Advisory Committee; technical
development of enhancements to VIVO including distributed, multi-institutional indexing, scalability,
and support for individual and team networking; and national coordination of the VIVOweb outreach
efforts.
Indiana University will lead development in social networking and ontologies. Indiana University was
founded in 1820 and is one of the state’s leading research and educational institutions. General
information about Indiana University, including an overview of physical facilities, is available online81.
Indiana University includes two main research campuses and six regional (primarily teaching)
campuses. The Indiana University Bloomington campus includes 2,309 full- and part-time faculty,
5,201 professional staff, 8,596 graduate and professional students and 30,394 undergraduate
students. The Indiana University—Purdue University Indianapolis (IUPUI) campus is operated by
Indiana University and includes schools from Indiana University and Purdue. The IUPUI campus
includes 3,161 full- and part-time faculty, 4,645 professional staff, 8,652 graduate and professional
students and 21,202 undergraduate students. The key Indiana University schools located at IUPUI
include the Indiana University Schools of Medicine, Informatics and Business. The Office of the Vice
President for Information Technology (OVPIT) and University Information Technology Services (UITS)
are responsible for delivery of core information technology and cyberinfrastructure services and
support. OVPIT and UITS collectively have a budget of more than $110,000,000 annually and employ
more than 700 full-time staff members. Through its Clinical and Translational Science Institute,
Indiana is affiliated with sixteen commercial and university entities82 including Purdue University and
the University of Notre Dame.
Washington University School of Medicine (WUSM) Adoption Team will perform evaluation of
implementation and integration of the VIVO application at all partner institutions and will serve as an
implementation site for VIVO. Washington University (WU) has a rich tradition of academic, research,
and clinical excellence. WU includes the School of Medicine located at the Medical Center Campus
and six other schools (Arts and Sciences, Business, Design and Visual Arts, Engineering and Applied
Science, Law, and Social Work) located at the Danforth Campus two miles away. The two campuses
are connected by a regular shuttle service and the public light rail service. WU has 105 academic
departments with 11,158 full time students. WU has a history of distinguished faculty: 30 are currently
members of the National Academy of Sciences, 26 are members of the Institute of Medicine, 19 hold
MERIT awards from NIH, and six are Howard Hughes Medical Institute investigators. Twenty-two
Nobel laureates have been associated with WU as faculty or students, 17 from the School of
Medicine. The WUSM is organized into 20 departments, four teaching and research divisions, and
seven graduate training divisions with a total of 1,727 faculty, 594 medical students, and 638 graduate
students. The Division of Biology and Biomedical Sciences oversees an array of graduate training
programs, including the largest Medical Scientist Training Program (MD/PhD) in the country. More
than 90% of the Medical Scientist Training Program graduates are actively involved in research. In
FY07, NIH grants awarded to WUSM faculty totaled $365.986 million, ranking it among the top NIH-funded
medical schools in the country. WU also has outstanding patient care programs through its
affiliation with BJC Healthcare, a 13-hospital integrated health care delivery network in the Midwest,
which is anchored by two nationally ranked teaching hospitals, Barnes-Jewish Hospital and St. Louis
Children’s Hospital. These resources make WU well-suited to act as a key participant in the VIVO
project both as an implementation site and as the lead for evaluation efforts of the VIVO
consortium. WU has a keen understanding of institutional, collaborative, cultural, and regulatory
challenges that slow the process of transferring basic and clinical scientific discoveries into
improvements in human health and looks forward to participating in this important effort which will
facilitate academic collaborations, thus ultimately speeding the implementation of scientific
discoveries.
Weill Cornell Medical College joined with 6 partners (Cornell University, Ithaca; Cornell University
Cooperative Extension, New York City; Hospital for Special Surgery; Hunter Center for Study of Gene
Structure and Function; Hunter-Bellevue School of Nursing; and Memorial Sloan-Kettering Cancer
Center) to devise a strategy for creating a Clinical and Translational Science Center. The mission of
this Center is to nurture and promote a research environment that would accelerate the clinical
application of basic science discoveries. We are shaping programs to integrate clinical and
translational science across multiple departments, schools, clinical and research institutes and
hospitals. We developed mechanisms to foster the creation of multidisciplinary research teams,
incubators to develop innovative research tools and information technologies, which ultimately would
advance the application of new knowledge and techniques to good clinical practice in patient care.
Weill Cornell is already a participant in VIVO through Cornell Ithaca. We will extend our
implementation with the new features and semantics and extend the VIVO functionality to our partner
CTSC institutions.
The Scripps Research Institute (TSRI) will be an early adopter of VIVO. One of the world's largest
independent, non-profit biomedical research organizations, The Scripps Research Institute operates
two campuses with headquarters in La Jolla, California, and a new campus focused on basic
biomedical science, drug discovery, and technology development in Jupiter, Florida. TSRI is
internationally recognized for its discoveries in immunology, molecular and cellular biology, chemistry,
neurosciences, autoimmune, cardiovascular, and infectious diseases, and synthetic vaccine
development. Established in its current configuration in 1961, it employs approximately 3,000
scientists, postdoctoral fellows, scientific and other technicians, doctoral degree graduate students,
and administrative and technical support personnel.
Ponce School of Medicine, Puerto Rico will be an early adopter of VIVO. Founded in 1977, Ponce
now holds nationally accredited graduate programs in the disciplines of Medicine, Clinical Psychology,
and Biomedical Sciences, and a Master’s degree in Public Health. The Ponce School of Medicine
Partnership with the Moffitt Cancer Center83 addresses the cancer problem in Puerto Rico focusing on
basic research, cancer education and training, outreach and tissue procurement. Ponce is a member
of the Alliance for Advancement in Biomedical Research in Puerto Rico84, an NCRR funded center.
D.2.
Staffing of the Project
Dr. Michael Conlon will serve as principal investigator and project director. Dr. Conlon has extensive
experience in large-scale software development and deployment and in biomedical research. Dr.
Conlon led the development of the software used in the INVEST clinical trial, collecting and
processing data from over 850 physicians’ offices in 14 countries. Dr. Conlon led the technical
implementation of the UF PeopleSoft System (myUFL85) and built and managed a team that
implemented infrastructure, system interfaces, data conversions and system configuration for 18
modules in 18 months. The $29M project was delivered on-time and on-budget. Dr. Conlon led the
design and implementation of the UF Directory86, an identity management system containing records
of over 1.7 million current and former faculty, staff and students of UF and supporting federated
identity through Shibboleth87 for 170,000 current credential holders. Dr. Conlon led the efforts to
create UF’s Active Directory88 system supporting servers in 50 locations across the State of Florida, a
BizTalk89 system providing service oriented architecture services to enterprise applications, and UF
Exchange90 – UF’s implementation of Microsoft Exchange and Microsoft Office Communications
Server. Dr. Conlon is Associate CIO for IT Architecture at UF as well as Associate Director of the UF
Clinical and Translational Science Institute91 and Interim Director of Biomedical Informatics in the
College of Medicine. A former CIO of the UF Health Science Center and Research Associate
Professor of Biostatistics, Dr. Conlon is a frequent presenter on identity and access management and
serves on the InCommon92 Research Administration working group.
Valrie Davis will serve as site liaison, and coordinate implementation of VIVO at all seven institutions.
Davis has co-led the local VIVO implementation called GatorScholar at UF. She is a member of the
UF Libraries’ Emerging Technologies Committee and leads local exploration of ontologies and
assists in the dissemination of technologies across the campus community. She coordinates library-based
services for off-campus users including more than 995 faculty and staff located at 13 Research
& Education Centers and 67 County Extension offices throughout the State of Florida. She also
supports a variety of on-campus agricultural and life science departments. As a library instructor, she
presents specialized face-to-face training sessions and develops specialized training tutorials using
software such as Camtasia. A member of the Born Digital Initiative Working Group, Policy
Development and Grant Writing Sub-Committee, she assisted in the identification of preservation and
access issues related to a national interface for born-digital and reborn digital agricultural resources.
She is an active member in many national organizations where she provides expertise in the
agricultural sciences, information sharing technologies, and electronic resource development. Ms.
Davis will serve as site liaison, coordinating implementation of VIVO at participating schools.
Dr. Sara Russell Gonzalez will lead the local UF implementation, expanding the current
implementation to all of UF. She co-led UF’s initial test implementation of the VIVO database. She is
the Physical Sciences librarian at the Marston Science Library at UF, providing research assistance
and instruction in the subjects of Physics, Astronomy, and Geology. Through her liaison work to
these departments, Dr. Russell Gonzalez has developed an expertise in harvesting and retrieval of
scientific publications. Her research interests include applying bibliometrics to understanding the
publishing behavior of scientists. She was recently a consultant on a NASA grant with members of
the UF Astronomy department to acquire and set up an Astrowall for display of 3-D astronomical data
for educational purposes. Prior to joining UF, Dr. Russell Gonzalez was a research seismologist with
Weston Geophysical Corporation investigating discrimination and location of nuclear explosions.
Dr. George Hack will serve as the lead in the development of instructional support and media for
VIVOweb. George Hack has been on faculty at the University of Florida since 1997 serving in the
Institute of Food and Agricultural Sciences as coordinator of extension education programs, teaching
graduate and undergraduate technology courses in the College of Education, and as Assistant
Director for Instruction and Information systems in the Health Science Center Libraries. Dr. Hack has
a doctorate in Educational Technology and has designed online and face-to-face instruction at the
University of Florida and other universities. Recently he has collaborated on the Compendium for
Children’s Health with a team of international physicians, setting up an online environment for
Pediatricians to receive instruction in Community Pediatrics. His current research investigations
include human-computer interactions as they relate to information resources and information seeking
behaviors. He plans to use the findings from this research to better inform interface development,
bibliographic instruction, physical and technology spaces within the library, and web design.
Christopher Barnes manages software development for the Clinical and Translational Informatics
Program at UF. Mr. Barnes has led the development of hundreds of research systems including
those supporting the Florida Brain Tumor Registry, the Emerging Pathogens Institute, the Claude
Pepper Center for Aging, the Texas Medicare data repository, and the portal of the UF Clinical and
Translational Science Institute. He has significant experience with Drupal, Shibboleth, and research
software development. Mr. Barnes will lead the UF development teams responsible for VIVO
packaging, incorporation of federated identity management using Shibboleth, the construction of
interfaces for systems of record and interfaces for Sakai and Drupal.
Dr. Michele R. Tennant is the Bioinformatics Librarian at the UF Health Science Center Libraries and
UF Genetics Institute. Dr. Tennant has provided reference and liaison services at the library since
1995. Since 2001 she has served as embedded librarian in the UF Genetics Institute, providing
consultations and extensive instruction in the use of bioinformatics and more traditional library
resources. As liaison and embedded librarian, she has forged strong professional relationships with
UF biomedical researchers, particularly those whose work has genetic-, molecular- or bioinformatics-related
components. She currently serves as contact for implementation of GatorScholar at the UF
Health Sciences Center. Dr. Tennant was part of two teams of information professionals from
throughout the country that created online educational materials for National Center for Biotechnology
Resources, and taught courses onsite and on a regional basis for the NCBI. Dr. Tennant’s research
interests include how scientists use bioinformatics-related databases, in particular those developed by
the NCBI, and attitudes of researchers and librarians to library-based bioinformatics support. She is
active at the national level in the Medical Library Association and the Special Libraries Association,
and is currently a member of the National Library of Medicine’s Biomedical Library and Informatics
Review Committee. Dr. Tennant’s work on the proposed grant is fourfold: 1. She will serve as
researcher support at the University of Florida; 2. She will coordinate the UF liaison librarians’
outreach efforts (marketing, instruction, and communication to the technical team of researchers’
needs related to VIVOweb) and will perform these same functions with her research clients; 3. She
will serve on the team coordinating national activities (efforts to recruit additional libraries, present and
exhibit at national and regional conferences, etc.); 4. She will assist UF’s ontology team.
Dr. Dean Krafft is the Chief Technology Strategist at the Cornell University Library and a Senior
Research Associate in Information Science. Dr. Krafft will lead the Cornell effort, overseeing the
VIVOweb technical development at Cornell. He will also chair the project’s Technical Advisory Board,
working with technical experts from across the country. As the former Director of IT for Computing and
Information Science at Cornell and the former Principal Investigator on the National Science Digital
Library (NSDL) project93, he has extensive experience in managing large software development
projects, in IT support and production, and in working in large, complex virtual organizations. While
with NSDL, he led the effort to create Ncore94, an open-source technical infrastructure for digital
libraries, to support the thirteen NSDL Pathways partners and the over 130 collections that comprised
NSDL.
Dr. Medha Devare will serve as national coordinator for VIVO. Dr. Devare is a bioinformaticist based
in Cornell’s Albert R. Mann Library, and has coordinated the implementation and outreach efforts for
VIVO across Cornell University's 11 colleges and 3 U.S. campus locations. She has also developed
relationships with and built interest in the VIVO platform at a number of institutions through liaison and
outreach activities and conference presentations. Apart from coordinating the VIVO project at Cornell,
Dr. Devare has taught bioinformatics workshops at Cornell (Ithaca) and at Weill Cornell Medical
College, and organized and taught a genomics course and co-taught a cropping systems course at
Cornell. She is currently working with faculty to recreate the introductory biology laboratory at Cornell.
Dr. Devare remains involved with research on agricultural biotechnology, with several reports and
publications out and in review on this topic. As national coordinator for VIVO, she will promote the
project, the VIVOweb platform and the library-based support model, and coordinate outreach efforts
with information specialists and other personnel at all seven schools, and at the national scale.
Instructional media and promotional materials will be developed under Dr. Devare's coordination with
the team from UF.
Jonathan Corson-Rikert will lead development teams at Cornell University to extend VIVO’s
capabilities as described in this proposal. Mr. Corson-Rikert has been a programmer and project
leader in Information Technology Services at Cornell’s Albert R. Mann Library since 2001, working on
projects including the VIVO virtual life sciences library95, the Cornell University Geospatial Information
Repository96, and e-Clips97, Cornell’s collection of digital video clips on entrepreneurship. Prior to
joining Mann Library, he worked as research administrator for the Program of Computer Graphics at
Cornell, programmed geographic software at the Harvard Lab for Computer Graphics and Spatial
Analysis, and developed early digital cartography applications at the Dane County Regional Planning
Commission in Wisconsin.
Dr. Katy Börner will direct efforts related to social networking, metrics and presentation. Börner is the
Victor H. Yngve Associate Professor of Information Science at the School of Library and Information
Science, Adjunct Associate Professor in the School of Informatics, Core Faculty of Cognitive Science,
Research Affiliate of the Biocomplexity Institute, Fellow of the Center for Research on Learning and
Technology, Member of the Advanced Visualization Laboratory, and Founding Director of the
Cyberinfrastructure for Network Science Center98 at Indiana University. She is a curator of the Places
& Spaces: Mapping Science exhibit99. Her research focuses on the development of data analysis and
visualization techniques for information access, understanding, and management. She is particularly
interested in the study of the structure and evolution of scientific disciplines; the analysis and
visualization of online activity; and the development of cyberinfrastructures for large scale scientific
collaboration and computation. She is the co-editor of the Springer book on ‘Visual Interfaces to
Digital Libraries’ and of a special issue of PNAS on ‘Mapping Knowledge Domains’ (2004). Her new
book ‘Atlas of Science’ published by MIT Press will become available in 2010.
Dr. Ying Ding will lead efforts pertaining to the development and maintenance of ontologies used by
the Semantic Web to represent scientists and investigators. Dr. Ding is an Assistant Professor in the
School of Library and Information Science, Indiana University. She previously worked as an
Assistant Professor at the University of Innsbruck, Austria and as a researcher at the Division of
Mathematics and Computer Science at the Free University of Amsterdam, the Netherlands. She has
more than eight years of experience and a strong research track record in the Semantic Web area. She was
involved in the early development of the DAML+OIL language, which evolved into OWL, the current
W3C standard for ontology definition. She has been involved in various European Union-funded
projects in the Semantic Web area (KnowledgeWeb, Ontoweb, EASAIER, OntoKnowledge, IBROW,
SWWS, COG, Htechsight, Esperonto, SEKT, DIP, Triple Space Computing). She was one of the
major organizers and initiators of the International Semantic Web Conference, the European Semantic Web
Conference and the Asian Semantic Web Conference. She has published more than 70 papers in
journals, conferences and workshops and has served as a program committee member for more than
80 international conferences and workshops. She is co-author of the book, “Intelligent Information
Integration in B2B Electronic Commerce,” published by Kluwer Academic Publishers. She is also
co-author of book chapters in “Spinning the Semantic Web,” published by MIT Press, and “Towards the
Semantic Web: Ontology-driven Knowledge Management,” published by Wiley. Her current interest
areas include Semantic Web, Webometrics, citation analysis, information retrieval, knowledge
management and application of Web Technology.
Dr. William K. Barnett will be responsible for coordinating all aspects of the implementation of the
Cornell VIVO at the Indiana CTSI, including oversight of all grant personnel, coordination with Indiana
CTSI programs and institutions, and coordination with technical implementation. Barnett oversees life
sciences and biomedical research technologies at Indiana University and the Indiana University
School of Medicine (IUSM). As the Senior Manager of Life Sciences, he oversees the development
and implementation of research technology programs for biological research including high
performance computing (HPC) applications, analytical pipelines, and genomics research. As the
Director of the Advanced IT Core at the IUSM, he oversees the development and management of
biomedical applications, including HPC and applications development in support of health care
research. As the Director of Information Architectures for the Indiana CTSI, he oversees the
development of collaborative technologies for the Indiana Clinical and Translational Sciences Institute.
Dr. Barnett previously served as the Vice President and Chief Information Officer at the Field Museum
of Natural History, where he oversaw the development of collections-based digital library initiatives
and genomics research technologies. Previously at the American Museum of Natural History in New
York, he oversaw the development of the institutional infrastructure and the growth of core analytical
imaging facilities. Dr. Barnett has a BA in Anthropology (College of William and Mary) and an MA and
Ph.D. in Archaeology (Boston University).
Dr. Anurag Shankar will be responsible for technical coordination among Cornell technical staff
installing VIVO, the Indiana CTSI HUB technical staff, and the Purdue University systems
administrators responsible for software installation and maintenance on the HUB servers. Shankar
serves as a project manager and is the primary customer liaison for the Advanced IT Core, a
partnership between the IU School of Medicine and Life Sciences. A computational astrophysicist by
training, Shankar has held various positions in the past with UITS, including Manager of Unix support,
Distributed Storage Services (now Research Storage), and the national TeraGrid project. He was also
responsible for overseeing the Advanced IT Core for the Indiana Genomics Initiative. Shankar holds
degrees in physics, mathematics, and astronomy.
Dr. Rakesh Nagarajan will lead the Washington University effort. Dr. Nagarajan is the Biomedical
Informatics Program director of the WU CTSA, termed the Institute of Clinical and Translational
Sciences (ICTS), which has as one of its sub-aims the implementation of research networking solutions. Through
the ICTS and other initiatives, he leads the biomedical informatics infrastructure development effort at
Washington University as director of the WU Center for Biomedical Informatics (CBMI). Dr. Nagarajan
and his team are implementing a common informatics infrastructure to support the diverse needs of
physician-scientists and bench researchers.
Dr. Kristi Holmes is a bioinformaticist based in Becker Medical Library and will lead the outreach
efforts at WU, including promotion and training of VIVO at WU and assistance with ontology
development. At WU, she is tasked with the development and presentation of bioinformatics resource
workshops for the university community, integration of molecular biology information resources into
medical school and graduate-level curricula, and application of bioinformatics resources to research
problems through individualized consultations and collaborative relationships. She has also served as
a course developer and instructor for the NCBI Advanced Workshop for Bioinformatics Information
Specialists offered by the National Center for Biotechnology Information (NCBI). Dr. Holmes is
well-suited for leading outreach efforts at WU, given her active role in investigating collaboration and
faculty profiling applications, her involvement in assessing issues related to research impact, and her
efforts to provide instruction, training resources and support materials to researchers.
Dr. Leslie McIntosh of Washington University will serve as implementation lead and will also
coordinate project evaluation activities for participating institutions. Dr. McIntosh is well qualified
for this position, as she has an extensive background in database, web site, and on-line survey
development for both educational and private institutions. In addition, Dr. McIntosh has experience
performing evaluations in other projects, including designing, conducting, and analyzing quantitative
and qualitative evaluations of Evidence-Based Public Health trainings within the Missouri Public
Health departments, national public health department, and the World Health Organization;
conducting evaluations of table-top exercises to assess public health disaster preparedness; and
evaluating youth attitudes, opinions, and beliefs from on-line forums using text analysis. She has also
been a consultant with the FBI to assist in survey techniques for collecting social network data.
Dr. Curtis Cole will lead the implementation of VIVOweb at the Weill Cornell Medical College. Dr.
Cole, board certified in internal medicine, is Chief Medical Information Officer of Weill and Acting
co-director of the Weill Clinical and Translational Science Center Biomedical Informatics Program.
Dr. Gerald Joyce of The Scripps Research Institute is dean of the faculty and Co-Program Director
and Director for Translational Science, Scripps Translational Science Institute, NIH Clinical and
Translational Science Award (CTSA) Consortium. Dr. Joyce will lead the implementation of VIVOweb
at the Scripps Research Institute.
Paula King of the Scripps Research Institute is the Director of the Kresge Library and will lead the
researcher support processes for TSRI.
Dr. Richard Noel is Associate Professor of Biochemistry at Ponce Medical School and internal
advisor for the Ponce Medical School Moffitt Cancer Center Partnership. Dr. Noel will serve as
institutional liaison for Ponce Medical School and will lead the implementation and support efforts.
E. Project Deliverables and Timeline
VIVO work will proceed along three major activities – product development, community support
development and governance. Product development will be driven by three releases. Release 1 is
focused on deployments in the institutional setting. Release 2 implements national networking of scientists.
Release 3 integrates resource discovery and features originating from community development.
Complementary community support will be developed around the adoption, implementation and use processes.
The goal of the community support effort during the project is to create a sustainable support activity
for VIVO that continues after the end of the project. The governance processes provide a means for accepting
community input and determining the future direction of VIVO. An open, participatory governance
process will drive adoption of the national network and create value for all participating scientists and
institutions.
Table 6 VIVO Project Deliverables
Deliverable                   Description
VIVO Release 1                Scientist discovery in the institutional setting
VIVO Release 2                National Networking of Scientists
VIVO Release 3                National Networking of Scientists and Resource Discovery
Community Support Process     On-going support for adoption, implementation and use
Product Development Process   On-going support for software development and maintenance
Governance Process            On-going processes for community input and decision-making
Final Report                  Summary, Accomplishments, Challenges, Lessons Learned,
                              Results of Evaluation, Next Steps
E.1.
VIVO Release 1
VIVO Release 1 is a refinement of the existing VIVO platform. It is focused on researcher discovery
within an institution. Data will be exposed via the Semantic Web and will be interoperable with Linked Data.
Release 1 will include additional features for interfacing to systems of record, as well as
improvements in packaging – the scripts and procedures used to install the software, as well as
instructional materials for installation.
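As a rough illustration of what exposing researcher data as Linked Data can look like, the following minimal Python sketch serializes one researcher record as N-Triples using the FOAF vocabulary. The base URI, the record fields and the function names are hypothetical and are not taken from VIVO; only the RDF and FOAF term URIs are standard.

```python
# Illustrative sketch only: not VIVO's actual export code.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
BASE = "http://vivo.example.edu/individual/"  # hypothetical base URI


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def researcher_to_ntriples(local_id, name, homepage=None):
    """Return N-Triples lines describing one researcher as a foaf:Person."""
    subject = f"<{BASE}{local_id}>"
    lines = [
        f"{subject} <{RDF_TYPE}> <{FOAF}Person> .",
        f'{subject} <{FOAF}name> "{escape_literal(name)}" .',
    ]
    if homepage:
        lines.append(f"{subject} <{FOAF}homepage> <{homepage}> .")
    return lines


if __name__ == "__main__":
    for line in researcher_to_ntriples("n1234", "Jane Q. Researcher",
                                       "http://example.edu/~jqr"):
        print(line)
```

Because each line is a self-contained statement about a URI, records produced this way at one institution could in principle be merged with records from another without a shared database schema.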
E.2.
VIVO Release 2
Release 2 includes all social networking features as well as visualization. Release 2 includes support
for federated identity management as well as support for groups. Indexing features will be provided
for semantic query across institutions. Release 2 constitutes the full national networking capability
described in this proposal.
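The idea of indexing for semantic query across institutions can be sketched with a toy example. The Python below merges subject-predicate-object triples from several institutions into one in-memory index and looks up which researchers, at which institutions, share a given property value. The institution labels, URIs and the researchArea predicate are hypothetical, and this is an assumption about one possible shape of such an index, not a description of the Release 2 design.

```python
# Illustrative sketch only: a toy in-memory index, not VIVO's indexing design.
from collections import defaultdict

# Hypothetical predicate URI used only for this example.
RESEARCH_AREA = "http://example.org/vocab#researchArea"


def build_index(institution_triples):
    """Merge (subject, predicate, object) triples from several institutions.

    institution_triples maps an institution label to a list of triples.
    The index maps (predicate, object) to the set of
    (institution, subject) pairs that assert it.
    """
    index = defaultdict(set)
    for institution, triples in institution_triples.items():
        for subject, predicate, obj in triples:
            index[(predicate, obj)].add((institution, subject))
    return index


def find(index, predicate, obj):
    """Return sorted (institution, subject) pairs matching predicate/object."""
    return sorted(index.get((predicate, obj), set()))


if __name__ == "__main__":
    data = {
        "institution-a": [
            ("http://a.example.edu/n1", RESEARCH_AREA, "genomics"),
        ],
        "institution-b": [
            ("http://b.example.edu/n7", RESEARCH_AREA, "genomics"),
            ("http://b.example.edu/n8", RESEARCH_AREA, "ecology"),
        ],
    }
    index = build_index(data)
    for institution, subject in find(index, RESEARCH_AREA, "genomics"):
        print(institution, subject)
```

Each result keeps the originating institution alongside the researcher URI, so a match can always be traced back to the local installation that published it.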
E.3.
VIVO Release 3
Release 3 includes features identified by evaluation and vetted by governance processes throughout
the grant period. Release 3 includes integration with the resource discovery platform described in the
U24 Request for Application.
E.4.
Community Support Process
A critical component of the project plan is the development of on-going, community-based support for
VIVO. As previously described, the libraries constitute a natural foundation for this support.
Throughout the project, the libraries will develop and provide support for adoption, implementation
and use of VIVO. They will lay the foundation for on-going sustainability. The technical, support and
governance activities of VIVO must all be sustained.
Support sustainability is in the best interests of the institutions which have adopted VIVO. As the
VIVO community grows, the resources to support VIVO on-line grow. Each institution will need to
commit to some support of their faculty and scientists in the on-going use of VIVO. In this way,
on-going support activity is created during implementation and use of VIVO.
E.5.
Product Development Process
During the grant period, the VIVOweb team will develop a sustainable product development process,
ensuring the long-term viability of the VIVOweb technical platform. Based on existing open source
models, the VIVOweb development process will provide on-going enhancements and maintenance of
the VIVOweb software. Technical sustainability will be achieved by creating an open source
community around VIVO in much the same way as communities have developed in support of Sakai,
Drupal, and Kuali100. The support for technical sustainability will be generated during the project by the
libraries. All activities are oriented to the ultimate goal of a self-sustaining, community-based
technical activity. Participants share ideas and code for the purpose of supporting and improving
VIVO.
E.6.
Governance Process
On-going governance of VIVO will be developed over the course of the project. The CTSAs and other
interested groups of schools, as well as the NIH, have a strong vested interest in the continuation of
VIVO governance.
E.7.
Final Report
The project will produce a final report summarizing the work of the two-year period and laying out the
next steps for continued development and support of VIVO in the open source community. The final
report will include a summary evaluation by Dr. Leslie McIntosh, lessons learned, challenges
and how they were addressed, and any remaining challenges with proposals for addressing them.
The final report will be prepared with input from all elements of the project – the project teams, the
principal investigator, the advisory groups and the evaluation team.
E.8.
Timeline
The project timelines for year 1 and year 2 are shown in Table 7 and Table 8. Major goals for the first
year are to 1) establish all governance, support and development teams, structures and processes;
2) finalize release 1 and implement it at participating institutions; and 3) establish adoption,
implementation, use and sustainability support activities.
Table 7 VIVO Project Timeline, Year 1
(Schedule grid with columns Qtr 1, Qtr 2, Qtr 3 and Qtr 4; the per-task quarter marks are not reproduced here.)
Tasks
Governance
  Establish governance groups and support structures
  Executive Advisory meetings
  Scientific and Technical Advisory processes
  Evaluation activities and reporting
Support
  Staff support teams
  Community outreach efforts for release 1
Development
  Establish development coordination
  Staff development teams
  Complete release 1
  Develop release 2
Adoption
  Facilitate adoption of release 1 beyond initial participants
Implementation
  Consortium schools implement release 1
  Feedback from release 1
Use
  Support release 1
Sustain
  Establish web sites, download
  APIs for module development
  Support community development
  Establish community input process
By the completion of year 1, the participating schools will have completed their implementations in
stand-alone and institutional settings. A community of practice will be in place to drive the adoption of
VIVO at institutions beyond those participating in the project. A support network for the use of VIVO
will be in place. Development of release 2 will be more than 50% complete. The executive and
governance processes will be constituted and functioning. An open source community will be
established to support the continuing development and technical support of VIVO.
Year 2 focuses on completion of release 2 with its deployment and support activities, and the activities
needed to create an on-going support system for VIVO.
Table 8 VIVO Project Timeline, Year 2
(Schedule grid with columns Qtr 1, Qtr 2, Qtr 3 and Qtr 4; the per-task quarter marks are not reproduced here.)
Tasks
Governance
  Technical and Scientific Advisory processes continue
  Executive Advisory meetings
  Evaluation activities and reporting
  Final report
Support
  Staff support teams
  Community outreach efforts for release 1
Development
  Staff development teams
  Complete release 2
  Develop release 3
  Transition to community development
Adoption
  Presentations at conferences
  Adoption by professional societies
Implementation
  Consortium schools implement release 2
  Feedback from release 2
Use
  Support release 1
  Support release 2
Sustain
  Develop community of adoption
  Develop community of implementation support
  Develop community of usage support
  Maintain community development
The VIVO implementation plan focuses on the development of a strong product and the development
of a strong community. The product and community generate support for adoption, implementation
and use. Activities throughout the project are oriented to the purpose of creating an on-going,
sustainable community of technical and support resources for the national networking of scientists
supported by VIVO.
F. Data and Software Sharing Plan
All software from this project will be made available freely to researchers and their institutions for
educational, research and non-profit purposes.
Licensing. All software developed under this project will be freely available for educational, research
and non-profit purposes under the terms of the VIVO software license to be developed in conjunction
with the NIH. The VIVO software license will be an appropriately modified version of non-commercial
open source licenses that will permit the use of the software at participating institutions. No claims of
suitability will be made and no warranty of any kind will be made. VIVO software can be modified, but
under no conditions can anyone assert ownership over the code or its modifications.
Availability. The software will be freely available under the terms of the VIVO license to all
biomedical researchers, educators and institutions in the non-profit sector such as education,
research institutions and government laboratories. Software will be available for free unrestricted
download under the terms of the license, from sites at participating institutions, including UF and
Cornell University.
Open Source Community. While the source code for the VIVO software is already publicly
available, the VIVOweb project will cultivate an active open-source development community by
providing extensive developer documentation and a plug-in architecture enabling others to contribute
new functionality to VIVO. Because VIVO is built on the popular Jena Semantic Web library, RDF
tools developed in other contexts should in many cases be easy to integrate into the VIVO
environment with minimal modification.
Timeline. Three releases of the software will be made by the project. Release 1 will be completed
within three months of the project start date. This release will include VIVO and its required software
environment for deployment in a stand-alone setting. Release 2 will be completed eighteen months
from the project start date. This release will contain all features described in this proposal as well as
support for all use cases described in this proposal. All consortium members will upgrade to release 2
during the course of the project. All schools will provide evaluation and feedback regarding release 2.
Release 3 will be completed before the end of the project. It will include all features recommended
and approved by the governance process. (See Section E for details of project plan and timeline.)
Enhancements. A community of practice will develop around VIVO to support it after the proposal
period. Community activity includes the submission of enhancements for inclusion in future releases.
The R Project for Statistical Computing101 is an example of a vibrant open source community
supporting a complex software system for statistical and data analysis. The VIVO community will
operate in a similar fashion, establishing an archive and providing mirror sites for downloads, as well
as on-line technical support through a blog and wiki.
Commercialization. The VIVO software license permits, under appropriate terms, the use of the
software in commercial settings, as well as modification of the software by commercial entities and
inclusion of it and/or subsets of it in other software packages. Under no conditions will software
provided to commercial entities under the terms of the VIVO license become the property of a
commercial entity.
Required Components. VIVO requires the use of other open source components. No commercial
software is required to run or host VIVO. Specifically, VIVO requires the use of Apache Tomcat102.
Shibboleth103 is required for support of federated identity use cases. VIVO and its required
components can be run on a wide variety of operating systems, both open source and commercial.
VIVO and its required components can be run on a wide range of commercially available hardware. It
is strongly recommended that VIVO be deployed in accord with all institutional information security
and privacy requirements.
Data. All data residing in VIVO systems remains the property of the institutions hosting VIVO.
Institutions control the release of data residing in VIVO to the Semantic Web for the purpose of
enabling national networking of researchers. No other use of the data is implied. Data may reside in
indexing systems as part of the operation of the Semantic Web. Data in indexing systems remains
the property of the host institutions. Host institutions providing data to indexing systems can terminate
or alter their release policies at any time.
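Institutional control over data release can be pictured as a filter applied before any data leaves the local installation. The Python sketch below assumes, purely for illustration, that a release policy is a set of predicates the institution has chosen to publish. This policy model, the sample URIs and the function name are assumptions, not part of the VIVO design described here.

```python
# Illustrative sketch only: the policy model is an assumption, not VIVO's design.
NAME = "http://xmlns.com/foaf/0.1/name"
MBOX = "http://xmlns.com/foaf/0.1/mbox"


def releasable(triples, released_predicates):
    """Return only the triples whose predicate the institution releases."""
    allowed = frozenset(released_predicates)
    return [triple for triple in triples if triple[1] in allowed]


if __name__ == "__main__":
    local_data = [
        ("http://vivo.example.edu/individual/n1", NAME, "Jane Q. Researcher"),
        ("http://vivo.example.edu/individual/n1", MBOX, "mailto:jqr@example.edu"),
    ]
    # Policy today: release names only.
    print(releasable(local_data, {NAME}))
    # The institution can change or withdraw the policy at any time.
    print(releasable(local_data, set()))
```

Keeping the policy as data that is separate from the records means it can be altered or withdrawn without touching the underlying institutional data.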
Data and software sharing for VIVO will support the goals of the NIH in enabling national
networking of scientists. The community approach, which supports adoption, implementation, use and
sustainability through an open process facilitated by libraries, combines with the simplicity and power of
a completely semantic-based approach to data and social networking to enable simple and compelling
discovery for scientists and institutions.
Bibliography and References Cited
1
Berners-Lee, T. 1998. Semantic Web Road Map. Available:
http://www.w3.org/DesignIssues/Semantic.html
2
Berners-Lee, T., J. Hendler and O. Lassila, 2001 “The Semantic Web: A New Form of Web Content
That Is Meaningful to Computers Will Unleash a Revolution of New Possibilities”, Scientific
American, May 17, 2001
3
Linked Data Home Page. http://linkeddata.org. Accessed June 8, 2009.
4
Gator Scholar Home Page. http://gatorscholar.uflib.ufl.edu. Accessed May 27, 2009.
5
SWEO Community Project: Linking Open Data on the Semantic Web: Statistics on Data Sets.
http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics.
Accessed June 10, 2009.
6
NIH Roadmap for Medical Research. http://nihroadmap.nih.gov. Accessed June 10, 2009.
7
Mukherjea, S. 2005. Information retrieval and knowledge discovery utilizing a biomedical Semantic
Web. Briefings in Bioinformatics 6(3): 252-262.
8
Ruttenberg, A., Clark, T., Bug, W., Samwald, M., Bodenreider, O., Chen, H., Doherty, D., Forsberg,
K., Gao, Y., Kashyap, V., Kinoshita, J., Luciano, J., Marshall, M., Ogbuji, C., Rees, J.,
Stephens, S., Wong, G., Wu, E., Zaccagnini, D., Hongsermeier, T., Neumann, E., Herman, I.,
Cheung, K. 2007. Advancing translational research with the Semantic Web. BMC
Bioinformatics 8 (Suppl 3): S2.
9
Deus, H.F., Stanislaus, R., Veiga, D.F., Behrens, C., Wistuba, I.I., Minna, J.D., Garner, H.R.,
Swisher, S.G., Roth, J.A., Correa, A.M., Broom, B., Coombes, K., Chang, A., Vogel, L.H.,
Almeida, J.S. 2008. A Semantic Web Management Model for Integrative Biomedical
Informatics. PLoS ONE. 2008; 3(8): e2946.
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2491554. Accessed June 10, 2009.
10
Chen, H., Ding, L., Wu, Z., Yu, T., Dhanapalan, L., Chen, J.Y. 2009. Semantic Web for Integrated
Network Analysis in Biomedicine. Briefings in Bioinformatics Advance 10(2): 177-192.
11
GatorScholar Home Page. http://gatorscholar.uflib.ufl.edu/. Accessed June 1, 2009.
12
Find an Expert: Network Proof of Concept. http://vitrofe.esrc.unimelb.edu.au:8333/vitrofe/.
Accessed June 10, 2009.
13
Southwest Biodiversity Knowledge Environment
http://168.160.50.68/vitro/index.jsp?home=1&primary=1. Accessed June 10, 2009.
14
CALS Experts Guide http://cals-experts.mannlib.cornell.edu/. Accessed June 10, 2009.
15
Cornell University: Graduate Programs in the Life Sciences
http://gradeducation.lifesciences.cornell.edu/. Accessed June 10, 2009.
16
Lyris Home Page. http://www.lyris.com. Accessed June 6, 2009.
17
Kretzmann, J.P., McKnight, J.L. 1993. Building Communities from the Inside Out: A Path Toward
Finding and Mobilizing a Community's Assets. ACTA Publications, Chicago, IL.
18
Pinkett, R. 2003. Community Technology and Community Building: Early Results from the Creating
Community Connections Project. The Information Society 19(5): 365-379.
19
Gaved, M., Anderson, B. 2006. The impact of local ICT initiatives on social capital and quality of life.
Chimera Working Paper 2006-6, Colchester, University of Essex.
20
See http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helppubmed,
http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpcollect and
http://www.ncbi.nlm.nih.gov/bookshelf for a complete listing of materials.
21
CTSA Clinical and Translational Science Awards Home Page. http://www.ctsaweb.org. Accessed
June 6, 2009.
22
Oracle and PeopleSoft http://www.oracle.com/peoplesoft. Accessed June 5, 2009.
23
Drupal Home Page. http://drupal.org. Accessed June 5, 2009.
24
Sakai Collaboration and Learning http://www.sakaiproject.org/portal. Accessed June 5, 2009.
25
Resource Description Framework (RDF) http://www.w3.org/RDF/. Accessed June 10, 2009.
26
SPARQL Query Language for RDF http://www.w3.org/TR/rdf-sparql-query/. Accessed June 10,
2009.
27
Marking Up Structured Data
http://www.google.com/support/webmasters/bin/answer.py?answer=99170. Accessed June
10, 2009.
28
The Yahoo! Search Open EcoSystem.
http://www.ysearchblog.com/2008/03/13/the-yahoo-search-open-ecosystem/. Accessed June 10, 2009.
29
RDF Vocabulary Description Language 1.0: RDF Schema http://www.w3.org/TR/rdf-schema/.
Accessed June 10, 2009.
30
OWL Web Ontology Language http://www.w3.org/TR/owl-features/. Accessed June 10, 2009.
31
Berners-Lee, Tim. Linked Data http://www.w3.org/DesignIssues/LinkedData.html. Accessed June
10, 2009.
32
SWEO Community Project: Linking Open Data on the Semantic Web Statistics on links between
Data sets
http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/LinkStatistics.
Accessed June 10, 2009.
33
Apache Lucene Overview http://lucene.apache.org/java/docs/. Accessed June 5, 2009.
34
Jena – A Semantic Web Framework for Java. http://jena.sourceforge.net. Accessed June 5, 2009.
35
MySQL 5.4 Home Page. http://dev.mysql.com. Accessed June 5, 2009.
36
AllegroGraph RDFStore. http://www.franz.com/agraph/allegrograph. Accessed June 5, 2009.
37
OpenLink Virtuoso Home Page. http://www.openlinksw.com/virtuoso. Accessed June 5, 2009.
38
LargeTripleStores. http://esw.w3.org/topic/LargeTripleStores. Accessed June 8, 2009.
39
Open RDF Home Page. http://www.openrdf.org/. Accessed June 8, 2009.
40
SemaPlorer Home Page.
http://www.uni-koblenz-landau.de/koblenz/fb4/institute/IFI/AGStaab/Research/systeme/semap. Accessed June 9,
2009.
41
Semantic Web Challenge. http://challenge.semanticweb.org/. Accessed June 8, 2009.
42
Gruber T. (1994). Toward principles for the design of ontologies used for knowledge sharing,
International Journal of Human and Computer Studies, 43(5/6): 907–928.
43
Y. Ding and D. Fensel (2001). Ontology Library Systems: The key for Successful Ontology Reuse.
In I. Cruz, S. Decker, J. Euzenat and D. McGuinness (eds). Proceedings of SWWS'01, The
First Semantic Web Working Symposium, page: 93-112, Stanford University, California, USA,
July 29th-August 1st, 2001 http://info.slis.indiana.edu/~dingying/Publication/SWWI2001.pdf
Accessed June 10, 2009.
44
Y. Ding and S. Foo (2002). Ontology Research and Development: Part 1 - A Review of Ontology
Generation. Journal of Information Science, 28(2): 123-136.
http://info.slis.indiana.edu/~dingying/Publication/OntologySurvey-Part1.pdf. Accessed June
10, 2009.
45
Y. Ding and S. Foo (2002). Ontology Research and Development: Part 2 - A Review of Ontology
Mapping and Evolving. Journal of Information Science, 28(5): 383-396.
http://info.slis.indiana.edu/~dingying/Publication/JIS_28%285%29_383.396_YING_DING.pdf.
Accessed June 10, 2009.
46
Knowledge Web. http://knowledgeweb.semanticweb.org. Accessed June 3, 2009.
47
The Friend of a Friend (FOAF) Project http://www.foaf-project.org/. Accessed June 3, 2009.
48
SKOS – Simple Knowledge Organization System http://www.w3.org/2004/02/skos/. Accessed
June 3, 2009.
49
DOAP – Description of a Project http://trac.usefulinc.com/doap. Accessed June 3, 2009.
50
SIOC – Semantically Interlinked On-line Communities. http://sioc-project.org/. Accessed June 3,
2009.
51
Dublin Core Metadata Initiative http://dublincore.org/. Accessed June 3, 2009.
52
Geonames Ontology http://www.geonames.org/ontology/. Accessed June 3, 2009.
53
Hugh Glaser, Afraz Jaffri, Ian Millard. Managing Co-reference on the Semantic Web. Presented at
Linked Data on the Web (LDOW 2009), Madrid, Spain, April 2009.
http://events.linkeddata.org/ldow2009/papers/ldow2009_paper11.pdf. Accessed June 10, 2009.
54
Shibboleth Home Page. http://shibboleth.internet2.edu. Accessed June 5, 2009.
55
Friend of a Friend (FOAF) Project http://www.foaf-project.org/. Accessed June 5, 2009.
56
RDFa Primer: Bridging the Human and Data Webs. http://www.w3.org/TR/xhtml-rdfa-primer/.
Accessed June 5, 2009.
57
About the Scholarly Publications Repository. http://www.lib.ncsu.edu/repository/spr/about.html.
Accessed June 8, 2009.
58
Scholarly Database http://sdb.slis.indiana.edu/. Accessed June 5, 2009.
59
Bio2RDF.org: Semantic Web Atlas of Postgenomic Knowledge. http://bio2rdf.org/. Accessed June
5, 2009.
60
Apache Hadoop Home Page. http://hadoop.apache.org/. Accessed June 8, 2009.
61
Apache Hbase Home Page. http://hadoop.apache.org/hbase/. Accessed June 8, 2009.
62
Börner, Katy, Chen, Chaomei & Boyack, Kevin W. (2003). Visualizing Knowledge Domains. In
Cronin, Blaise (Eds.), Annual Review of Information Science & Technology (Vol. 37, pp. 179-255),
chapter 5, American Society for Information Science and Technology, Medford, NJ.
http://ivl.slis.indiana.edu/km/pub/2003-borner-arist.pdf. Accessed June 10, 2009.
63
Börner, Katy, Sanyal, Soma & Vespignani, Alessandro. (2007). Network Science. In Cronin, Blaise
(Ed.), Annual Review of Information Science & Technology (Vol. 41, pp. 537-607), chapter
12, Medford, NJ: Information Today, Inc./American Society for Information Science and
Technology. http://ivl.slis.indiana.edu/km/pub/2007-borner-arist.pdf. Accessed June 10, 2009.
64
Neirynck, Thomas & Börner, Katy. (2007). Representing, Analyzing, and Visualizing Scholarly Data
in Support of Research Management. Proceedings of the 11th Annual Information
Visualization International Conference, Zürich, Switzerland, July 4-6, IEEE Computer Society
Conference Publishing Services, pp. 124-129.
http://ivl.slis.indiana.edu/km/pub/2007neirynck-ivl.pdf. Accessed June 10, 2009.
65
Information Visualization Laboratory. http://ivl.slis.indiana.edu/. Accessed June 10, 2009.
66
Cyberinfrastructure for Network Science http://cns.slis.indiana.edu/. Accessed June 10, 2009.
67
http://iv.slis.indiana.edu/ref/iv04contest/Ke-Borner-Viswanath.gif. Accessed June 5, 2009.
68
Klavans, Richard & Boyack, Kevin W. (2007). Is There a Convergent Structure to Science? In Daniel
Torres-Salinas & Henk F. Moed (Eds.), Proceedings of the 11th International Conference of the
International Society for Scientometrics and Informetrics. Pages 437-448. Madrid: CSIC.
69
NWB Team. (2006). Network Workbench Tool. Indiana University, Northeastern University,
University of Michigan. http://nwb.slis.indiana.edu. Accessed on March 10, 2009.
70
Cyberinfrastructure for Network Science Center. (2009). Network Workbench Tool: User Manual,
1.0.0 beta, http://nwb.slis.indiana.edu/Docs/NWB-manual-1.0.0beta.pdf. Accessed on April
13, 2009.
71
Protégé Home Page. http://protege.stanford.edu/. Accessed June 5, 2009.
72
RDF/XML Syntax Specification (Revised) http://www.w3.org/TR/rdf-syntax-grammar/. Accessed
June 10, 2009.
73
Moore, Geoffrey, 1999, Crossing the Chasm: Marketing and Selling High-Tech Products to
Mainstream Customers, Harper Business. 256 pages.
74
Scholtz, Jean. Usability Evaluation.
http://www.itl.nist.gov/iad/IADpapers/2004/Usability%20Evaluation_rev1.pdf. Accessed June
10, 2009.
75
TechSmith Morae Home Page. http://www.techsmith.com/morae.asp. Accessed June 10, 2009.
76
MIT IS Usability Guidelines http://web.mit.edu/is/usability/usability-guidelines.html. Accessed June
10, 2009.
77
Moffitt Cancer Center Home Page. http://www.moffitt.org. Accessed May 24, 2009.
78
Shands HealthCare Home Page. http://www.shands.org. Accessed May 24, 2009.
79
Malcom Randall Veterans Affairs Medical Center.
http://www2.va.gov/directory/GUIDE/facility.asp?id=54. Accessed June 7, 2009.
80
Burnham Institute for Medical Research Home Page. http://www.burnham.org. Accessed May 24,
2009.
81
Indiana University Fact Book. http://factbook.indiana.edu/index.shtml. Accessed June 7, 2009.
82
Indiana Clinical and Translational Science Institute Home Page. http://www.indianactsi.org/.
Accessed May 24, 2009.
83
Ponce School of Medicine – Moffitt Cancer Center Partnership. http://www.moffitt.org/mccpsmpartnership. Accessed June 7, 2009.
84
AABRE Home Page. http://aabre.hpcf.upr.edu/. Accessed June 7, 2009.
85
MyUFL Home Page. http://my.ufl.edu. Accessed May 23, 2009.
86
UF Directory Home Page. http://www.bridges.ufl.edu/directory. Accessed May 23, 2009.
87
Shibboleth Home Page. http://shibboleth.internet2.edu. Accessed May 23, 2009.
88
UF Active Directory Home Page. http://www.ad.ufl.edu. Accessed May 23, 2009.
89
Microsoft BizTalk Home Page. http://www.microsoft.com/biztalk. Accessed May 23, 2009.
90
UF Exchange Home Page. http://www.mail.ufl.edu. Accessed May 23, 2009.
91
UF Clinical and Translational Science Institute Portal. http://www.ctsi.ufl.edu. Accessed May 23,
2009.
92
InCommon Federation Home Page. http://www.incommonfederation.org. Accessed May 23, 2009.
93
The National Science Digital Library. http://nsdl.org. Accessed June 10, 2009.
94
NSDL: NCore Platform http://ncore.nsdl.org. Accessed June 10, 2009.
95
VIVO at the Cornell University Library. http://VIVO.library.cornell.edu. Accessed May 23, 2009.
96
Cornell University Geospatial Information Repository Home Page http://cugir.mannlib.cornell.edu.
Accessed May 23, 2009.
97
E-Clips Home Page. http://eclips.cornell.edu. Accessed May 24, 2009.
98
Cyberinfrastructure for Network Science Center. http://cns.slis.indiana.edu. Accessed June 10,
2009.
99
Places and Spaces: Mapping Science. http://scimaps.org. Accessed June 10, 2009.
100
Kuali Foundation Home Page. http://www.kuali.org. Accessed May 30, 2009.
101
R Software Project Home Page. http://www.r-project.org. Accessed May 30, 2009.
102
Apache Tomcat Home Page. http://tomcat.apache.org. Accessed May 30, 2009.
103
Shibboleth Home Page. http://shibboleth.internet2.edu. Accessed May 30, 2009.