Profiles Research Networking Software Users Group Meeting

http://profiles.catalyst.harvard.edu
July 20, 2012
Agenda
• Welcome to New Members
• Upcoming Events
• EAGER-Profiles
• Profiles RNS 1.0.1
Profiles Users Group Members
UCSF
Fred Hutchinson CRC
Oregon Health Sci U
UC Davis (CBST)
Touro University
U Southern California
UC San Diego
Charles Drew
U Hawaii
Arizona State
Montana State
U Colorado Denver
U Nebraska-Lincoln
UW Madison
U Illinois
U Chicago
Baylor College Med
UT Southwestern
UT Houston
Jackson State (RTRN)
Ohio State
Cincinnati Children’s
Case Western
U Kentucky
Vanderbilt Stem Cell
U Arkansas Little Rock
U Alabama Birmingham
Symplectic Limited (UK)
McGill University (Canada)
University of Cambridge (UK)
Makerere University (Uganda)
University of Leuven (Belgium)
South-Valley University (Egypt)
Elysium, Geneva (Switzerland)
Beijing Normal University (China)
University of the South Pacific (Fiji)
Velammal Engineering College (India)
Nati Sci Lib, Chinese Acad of Sci (China)
Clinical & Biomedical Computing Ltd (UK)
Ministério da Ciência e Tecnologia e Inovação (Brazil)
Jonkoping University Engineering School (Sweden)
Universidad Nacional Autonoma de Mexico (Mexico)
Centre Interdisciplinaire de Nanoscience de Marseille (France)
Harvard
Univ Minnesota
Dartmouth
Univ Mass
Boston Univ
Tufts Univ
Boston VA
Rensselaer
Univ Connecticut
Univ Rochester
NYU Med Ctr
Mount Sinai Sch of Med
MedMeme
Thomas Jefferson
UPenn
Johns Hopkins
USUHS-CNRM
NIH
George Wash U
Penn State
Childrens Nat Med Ctr
Wake Forest
Leadership in Med
HSSC
Georgia Tech
Piedmont Healthcare
Emory University
University Spotlights
Harvard University: http://connects.catalyst.harvard.edu/profiles
UCSF: http://profiles.ucsf.edu
University of Minnesota: http://profiles.ahc.umn.edu
South Carolina: http://profiles.healthsciencessc.org
UConn Health Center: http://profiles.uconn.edu
Penn State: http://profiles.psu.edu
Wake Forest Medicine: http://profiles.tsi.wakehealth.edu
RTRN (18 RCMI Institutions): http://rtrnprofiles.rtrn.net/profilesweb
Boston University: http://profiles.bumc.bu.edu
Upcoming Events
• 3rd Annual VIVO Conference, Miami, FL, Aug 22-24, 2012
  • “OpenSocial Workshop.” Workshop. Eric Meeks, Leslie Yuan, Anirvan Chatterjee
  • “Building better teams: innovative approaches to the design and deployment
    of researcher recommendation systems.” Panel. Christopher Kelleher, Griffin
    Weber, Melissa Haendel, Jeff Horon and Noshir Contractor
  • “Linking Disciplines: Expanding Harvard Catalyst Profiles to Discover
    Connections across an Entire University.” Podium Presentation. Griffin
    Weber and Amy Brand
• NSF Science of Science and Innovation Policy (SciSIP) PI Meeting,
  Washington DC, Sep 20-21, 2012
  • “EAGER-Profiles: Using researcher profiles to demonstrate the impact of
    investments in science.” Poster & Demonstration. Griffin Weber
• AMIA Annual Symposium, Chicago, IL, Nov 3-7, 2012
  • “Harvard Catalyst Profiles: Finding collaborators outside biomedicine.”
    Poster. Griffin Weber
EAGER-Profiles
• NSF #1238469. Science of Science and Innovation Policy (SciSIP)
• “EAGER-Profiles: Using researcher profiles to demonstrate the
impact of investments in science”
• Prototype of national research networking website (SciENCV)
• Profiles of computer scientists at Harvard (Profiles RNS), UCSF
(Profiles RNS), U Chicago (Profiles RNS), U Florida (VIVO), U
Cambridge UK (Symplectic)
• Illustrate connections between research inputs (e.g., grants &
contracts) and research outputs (e.g., publications & patents)
• Computer scientists are funded by many different agencies, their
research outputs take different forms (pubs, software, data, etc.),
and they collaborate across many disciplines
Profiles RNS 1.0.1
• The names of many web code files and database components were
changed to make them more consistent throughout the software.
• The documentation, particularly the Architecture Guide, was
significantly expanded. ReadMeFirst and ReleaseNotes documents
were created.
• Database performance enhancements were made, which result in
RDF data being returned faster, especially for profiles containing
large numbers of triples.
• Default editing modules for DataType and ObjectType properties
were added.
• A custom editing module was created for email address.
• The Search API and SPARQL API were converted to SVC files and
XSD files were created for each API.
Profiles, Networks, Connections
Website Framework
Applications
Name
Description
Profile
Returns the RDF document for a URI.
Display
Renders a URI as HTML.
Search
Search identifies all RDF nodes that have a property whose value matches a search
string. It displays a list of those nodes and links to their URIs. Faceting allows users to
narrow the search results by type (class group) or subtype (class). Any property can be
used to sort search results. Search incorporates stemming (to match different parts of
speech), removal of stop words (e.g., “the”, “of”), and term expansion through the use
of a thesaurus (e.g., “cancer” -> “neoplasm”).
About
Displays general information about the Profiles RNS website.
SPARQL
This is an interface to test the Profiles RNS SemWeb SPARQL engine. Users can enter
an arbitrary SPARQL query and view the results. By default, this front-end tool is only
available to administrators, though the ability to pass SPARQL queries to the SemWeb
web services can remain open to the public.
Edit
This application allows users to manage the content on their profiles.
Direct
Direct2Experts is a federated search tool that locates experts across multiple
institutions using Profiles RNS and other research networking products.
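The Search description above names three text-processing steps: stop-word removal, stemming, and thesaurus expansion. The following is a minimal Python sketch of how such a pipeline might fit together. The stop-word list, the suffix-stripping stemmer, and the one-entry thesaurus are illustrative assumptions, not the data or algorithms Profiles RNS actually uses.

```python
import re

# Illustrative data only (assumption): the slides do not give the real
# stop-word list, stemmer, or thesaurus used by Profiles RNS.
STOP_WORDS = {"the", "of", "a", "an", "and", "in"}
THESAURUS = {"cancer": ["neoplasm"]}
SUFFIXES = ("ing", "es", "s")


def stem(word):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_query(query):
    """Lowercase, drop stop words, stem, then expand terms via the thesaurus."""
    terms = []
    for token in re.findall(r"[a-z0-9]+", query.lower()):
        if token in STOP_WORDS:
            continue
        root = stem(token)
        for term in [root] + THESAURUS.get(root, []):
            if term not in terms:
                terms.append(term)
    return terms


print(normalize_query("The treatment of cancers"))
```

Stemming before the thesaurus lookup lets a plural such as "cancers" still pick up the "neoplasm" expansion.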
Database Schematic
[Figure: data flow through the database, arranged by schema complexity. Labels: Core Objects (Ontology; Linked Data: Nodes, Triples); Extended Objects (Derived Data: Co-Authors, Social Network Analysis; Disambiguated Data: Faculty Publications; External Data: Medline, ISI Web of Knowledge, DSpace, Administrative Databases).]
Database Schemas & Tables
[Profile.Cache].[SNA.Coauthor.Distance]
Schema: [Profile.Cache]
Table: [SNA.Coauthor.Distance]
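Because both the schema and the table name can contain dots, a qualified name such as the one above cannot be split on "." alone. A small Python sketch of splitting on the bracket boundaries instead follows; the function name `split_qualified_name` is hypothetical and not part of Profiles RNS.

```python
import re

# Matches "[schema].[table]", where either part may itself contain dots.
QUALIFIED_NAME = re.compile(r"^\[(?P<schema>[^\]]+)\]\.\[(?P<table>[^\]]+)\]$")


def split_qualified_name(name):
    """Return (schema, table) for a bracketed two-part name."""
    match = QUALIFIED_NAME.match(name.strip())
    if match is None:
        raise ValueError("not a [schema].[table] name: %r" % name)
    return match.group("schema"), match.group("table")


print(split_qualified_name("[Profile.Cache].[SNA.Coauthor.Distance]"))
```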
Core Schemas
Schema
Description
[Framework.]
Handles global functions, such as resolving RESTful URLs and managing scheduled jobs.
[Ontology.]
Contains the semantic web ontology used by the website.
[Ontology.Import]
Contains tools to import and process OWL files.
[Ontology.Presentation]
Contains the "presentation" ontology, which describes how content should be displayed on
the website.
[RDF.]
Contains the RDF nodes and triples specific to an instance of Profiles.
[RDF.Security]
Contains information about who can access secure/private nodes and triples.
[RDF.SemWeb]
Used to format [RDF.] data so that it can be used by the SemWeb SPARQL engine.
[RDF.Stage]
Used by the bulk data loading process to store temporary data before it is loaded into the
[RDF.] tables.
[User.Account]
Contains information about authorized users of the website.
[User.Session]
Contains information about website sessions. A public user of the website will have a
session even if she has not logged in and linked the session to a specific user account.
[Utility.Application]
Contains functions and procedures that are used in a variety of contexts.
[Utility.Math]
Contains mathematical lookup tables and functions.
[Utility.NLP]
Contains lookup tables and functions that support natural language processing for
search and other features.
Extended Schemas
Schema
Description
[Direct.*]
Supports Direct2Experts functionality--federated search across multiple institutions using
Profiles and other research networking products.
[Edit.*]
Allows users to edit profile content.
[Login.*]
Allows users to log in to the website.
[Profile.Cache]
Contains the results of bibliometric and social network analyses.
[Profile.Data]
Stores copies of certain types of RDF data in relational tables to help with data loads or to
improve performance of particular kinds of queries.
[Profile.Framework]
Used by the Profile application to interact with the Framework.
[Profile.Import]
Used to place person and other types of data during an initial load of Profiles RNS and in
subsequent updates.
[Search.]
Provides basic search functionality for Profiles RNS.
[Search.Cache]
Improves the performance of the Profiles RNS search tool by pre-processing the RDF data
through scheduled jobs.
[Search.Framework]
Used by the Search application to interact with the Framework.
Security Groups
SecurityGroupID
Label
Description
-50
Admins
Limited to a restricted set of site administrators with
special access permissions to configure the website.
-40
Curators
Limited to a small number of users whose job is to
manage content on the website.
-30
Harvesters
Limited to authorized automated processes that
synch data between this website and other systems.
-20
Users
Limited to people who have logged into the website.
-10
No Search
Open to the general public, but blocked to certain
(but not all) search engines such as Google.
-1
Public
Open to the general public and may be indexed by
search engines.
0
Undefined
Cannot be accessed by any users.
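The numeric ordering of the security groups suggests a simple range comparison for access checks. The Python sketch below encodes the group IDs from the table and a hypothetical `can_view` rule. The rule itself is an assumption, not something the slides state: a viewer may see an item when the item's group lies between the viewer's own group and -1 (Public), and nothing in group 0 is ever visible.

```python
from enum import IntEnum


class SecurityGroup(IntEnum):
    """Group IDs as listed in the Security Groups table."""
    ADMINS = -50
    CURATORS = -40
    HARVESTERS = -30
    USERS = -20
    NO_SEARCH = -10
    PUBLIC = -1
    UNDEFINED = 0


def can_view(viewer, item):
    """ASSUMED rule (not from the slides): an item is visible when its
    group falls in the range [viewer, PUBLIC]. UNDEFINED is never visible."""
    return viewer <= item <= SecurityGroup.PUBLIC


# Under this assumption, an anonymous human visitor would be treated as
# NO_SEARCH and a search-engine crawler as PUBLIC.
print(can_view(SecurityGroup.NO_SEARCH, SecurityGroup.PUBLIC))  # visitor sees public items
print(can_view(SecurityGroup.PUBLIC, SecurityGroup.NO_SEARCH))  # crawler is blocked
```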
Node and Triple Tables
[RDF.].[Node]
Field                Type
NodeID               BIGINT
ValueHash            BINARY(20)
Language             NVARCHAR(255)
DataType             NVARCHAR(255)
Value                NVARCHAR(MAX)
InternalNodeMapID    INT
ObjectType           BIT
ViewSecurityGroup    BIGINT
EditSecurityGroup    BIGINT

[RDF.].[Triple]
Field                Type
TripleID             BIGINT
Subject              BIGINT
Predicate            BIGINT
Object               BIGINT
TripleHash           BINARY(20)
Weight               FLOAT
Reitification        BIGINT
ObjectType           BIT
SortOrder            INT
ViewSecurityGroup    BIGINT
Graph                BIGINT
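The Node and Triple tables store each value once as a node and express every statement as three node IDs, with fixed-width hash columns presumably available for fast lookup. The Python sketch below mimics that shape in memory. SHA-1 is used only because its 20-byte digest matches the BINARY(20) width. The hash algorithm, the strings being hashed, and the `TinyStore` class are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


def sha1_bytes(text):
    """Return a 20-byte SHA-1 digest, the same width as a BINARY(20) column."""
    return hashlib.sha1(text.encode("utf-8")).digest()


@dataclass(frozen=True)
class Node:
    node_id: int
    value_hash: bytes
    value: str


@dataclass(frozen=True)
class Triple:
    triple_id: int
    subject: int      # NodeID of the subject
    predicate: int    # NodeID of the predicate
    object: int       # NodeID of the object
    triple_hash: bytes


class TinyStore:
    """Hypothetical in-memory stand-in for the two tables."""

    def __init__(self):
        self.nodes = {}    # value_hash -> Node
        self.triples = {}  # triple_hash -> Triple

    def node_for(self, value):
        key = sha1_bytes(value)  # ASSUMPTION: the hash covers the value only
        if key not in self.nodes:
            self.nodes[key] = Node(len(self.nodes) + 1, key, value)
        return self.nodes[key]

    def add_triple(self, subject, predicate, obj):
        s, p, o = (self.node_for(v) for v in (subject, predicate, obj))
        key = sha1_bytes("%d|%d|%d" % (s.node_id, p.node_id, o.node_id))
        if key not in self.triples:
            self.triples[key] = Triple(
                len(self.triples) + 1, s.node_id, p.node_id, o.node_id, key
            )
        return self.triples[key]


store = TinyStore()
triple = store.add_triple(
    "http://example.org/person/1", "http://xmlns.com/foaf/0.1/name", "Ada"
)
print(triple.subject, triple.predicate, triple.object)
```

Adding the same statement a second time returns the existing triple rather than creating a duplicate, which is the kind of check a hash column makes cheap.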
Ontology Tables
Table
Description
[Ontology.].[ClassGroup]
Lists top-level Class Groups for search and browse.
[Ontology.].[ClassGroupClass]
Maps Class Groups to individual RDF Classes.
[Ontology.].[ClassProperty]
Defines which RDF properties should be returned
and expanded when data is requested.
[Ontology.].[ClassTreeDepth]
Contains the class hierarchy. Used by Search.
[Ontology.].[DataMap]
Maps extended schema data to the ontology.
[Ontology.].[Namespace]
Lists namespaces and their prefixes.
[Ontology.].[PropertyGroup]
Lists the broad groups of related properties.
[Ontology.].[PropertyGroupProperty]
Lists the properties within each group.
Data Flow
Loading person data from an external (e.g., HR) source
[Profile.Import] -> [Profile.Data] -> [Profile.Cache] -> [RDF.]
Loading user account data from an external source
[Profile.Import] -> [User.Account] -> [RDF.]
Creating RDF data from an extended data table
[Profile.Data] -> [RDF.]
Loading data as triples
[RDF.Stage] -> [RDF.]
Adding new classes or properties to the ontology
[Ontology.Import] -> [Ontology.] -> [RDF.Stage] -> [RDF.]
Presenting the RDF data in a format that can be used by SemWeb (SPARQL)
[RDF.] -> [RDF.SemWeb]
Populating the search cache based on the RDF data
[RDF.] -> [Search.Cache]
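The flows above can be written down as ordered lists of schemas, which makes it easy to ask which tasks end in a given schema or which schemas feed it. The Python sketch below is only a restatement of the list as data. The helper names `flows_into` and `upstream_of` are hypothetical.

```python
# Each task maps to the ordered schemas its data passes through (from the list above).
DATA_FLOWS = {
    "Loading person data from an external (e.g., HR) source":
        ["[Profile.Import]", "[Profile.Data]", "[Profile.Cache]", "[RDF.]"],
    "Loading user account data from an external source":
        ["[Profile.Import]", "[User.Account]", "[RDF.]"],
    "Creating RDF data from an extended data table":
        ["[Profile.Data]", "[RDF.]"],
    "Loading data as triples":
        ["[RDF.Stage]", "[RDF.]"],
    "Adding new classes or properties to the ontology":
        ["[Ontology.Import]", "[Ontology.]", "[RDF.Stage]", "[RDF.]"],
    "Presenting the RDF data in a format that can be used by SemWeb (SPARQL)":
        ["[RDF.]", "[RDF.SemWeb]"],
    "Populating the search cache based on the RDF data":
        ["[RDF.]", "[Search.Cache]"],
}


def flows_into(schema):
    """Tasks whose final stage is the given schema."""
    return [task for task, stages in DATA_FLOWS.items() if stages[-1] == schema]


def upstream_of(schema):
    """Schemas that appear before the given schema within any single flow."""
    found = set()
    for stages in DATA_FLOWS.values():
        if schema in stages:
            found.update(stages[: stages.index(schema)])
    return found


print(len(flows_into("[RDF.]")))
print(sorted(upstream_of("[Search.Cache]")))
```

Five of the seven flows end in [RDF.], and the remaining two start from it.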
Extending Profiles RNS
1) Extend the ontology
a) Define a namespace
b) Define the new class in that namespace
c) Define the new properties in that namespace
2) Import the data feed to an extended schema table
a) Create a new extended schema table (i.e., [Profile.Data].*)
b) Load the feed into the new table
3) Create a mapping from the new table to the ontology
4) Run ProcessDataMap to generate RDF
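As a rough illustration of steps 1 through 4, the Python sketch below maps rows of a made-up extended table to triples. Everything in it is hypothetical: the example.org namespace, the Award class, the column names, the mapping format, and the `process_data_map` function, which only borrows its name from ProcessDataMap and does not reproduce how that procedure works.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Step 1 (hypothetical): a namespace, one class, and two properties.
NAMESPACE = "http://example.org/ontology/awards#"
AWARD_CLASS = NAMESPACE + "Award"

# Step 2 (hypothetical): rows loaded into a new extended schema table.
award_rows = [
    {"AwardID": 1, "Title": "Teaching Prize", "Year": 2011},
    {"AwardID": 2, "Title": "Young Investigator Award", "Year": None},
]

# Step 3 (hypothetical): a mapping from table columns to ontology properties.
DATA_MAP = {
    "class": AWARD_CLASS,
    "subject_template": "http://example.org/award/{AwardID}",
    "properties": {"Title": NAMESPACE + "title", "Year": NAMESPACE + "year"},
}


def process_data_map(rows, data_map):
    """Step 4 (stand-in): emit (subject, predicate, object) triples per row."""
    triples = []
    for row in rows:
        subject = data_map["subject_template"].format(**row)
        triples.append((subject, RDF_TYPE, data_map["class"]))
        for column, predicate in data_map["properties"].items():
            value = row.get(column)
            if value is not None:  # skip missing values instead of emitting "None"
                triples.append((subject, predicate, str(value)))
    return triples


for triple in process_data_map(award_rows, DATA_MAP):
    print(triple)
```

Keeping the mapping as data rather than code means a new feed needs only a new table and a new mapping entry, which appears to be the intent of the DataMap approach described above.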