Digital Enterprise Research Institute, www.deri.ie
Enabling Networked Knowledge

HADA – An Access Controlled Application for Publishing and Discovering Linked Government Data
IESD 2012 - EKAW 2012, Galway, Ireland
Tuesday 9th October 2012
Owen Sacco, owen.sacco@deri.org

US Government’s principal agency for:
• Protecting the Health of all Americans
• Providing all essential Human Services

HEALTH AND HUMAN SERVICES DOMAIN IT PROGRAM MANAGEMENT OFFICE
Promote the advancement of the Health, Safety, and Well-Being of the American People
HHS IT Asset Discovery Application (HADA)

Currently, data about HHS IT Investments exists:
• In different systems
• In different data models
• With different levels of access

HADA aims to provide intelligent:
• Aggregation of this data to support information discovery
• Interoperability amongst the different systems
• Fine-grained Access Control
Using Semantic Web principles

[Diagram: how HADA serves a request]
• Data sources: Public Data (WWW) and Enterprise Repositories (Docs, Data), including IT Investment Cost, EPLC and other docs
• IT asset information is pre-aggregated from multiple data sources, which are stored in a database (Semantic Database)
• Access rules are checked to grant or restrict access to the IT Investment Cost
• Data Access Rules: Who can see what?
• Web Application: She searches for a specific IT Investment cost. If she has access, she can view the Investment cost.

[Diagram: HADA architecture, top layer to bottom layer]
• Presentation Layer: Presentation and Navigation of Content
• Privacy Layer: Enforcement of Privacy Policies (Privacy Preference Manager, Privacy Preferences, Repositories)
• Semantic Layer: Semantic Transformation and Synthesis, Semantic Model Transformation, Semantic Database, Existing Ontologies (e.g. FEA)
• Content Extraction Layer: System Content Extraction, Metadata Extraction and Manual Clarification; extracted instance data in XML format
• Data Layer: CPIC Repositories, EA Repositories, and Code, Documentation, Etc. Repositories (code, instance data, docs, etc.)

Publishing Linked Data using the Linked Data API
• A RESTful API over RDF graphs
• Acts as a proxy over SPARQL endpoints
• Easy-to-process representations of resources
Indexing and searching RDF data using SIREn
“A Lucene plugin to efficiently index and query RDF, as well as any textual document with an arbitrary amount of metadata fields”
Storing RDF data using Sesame and ARC over MySQL

Attribute-based and fine-grained access
Rules based on…
• Where the data comes from (Context)
• What the data is about (Subject)
• What the data is describing (Predicate)
• Properties of the data itself (Object)
• Any combination of the above
More than one rule can be applied to each data element

Subject   Predicate      Object                                                     Context
HADA      hasName        “HHS IT Asset Discovery Application”                       HEAR
HADA      hasAcronym     “HADA”                                                     HEAR
HADA      hasCost        $12345                                                     CPIC
HADA      hasIPAddress   107.20.137.210                                             HEAR
HADA      belongsTo      HHS                                                        HEAR
HADA      hasLabel       “Health and Human Services Asset Discovery Application”    ITDashboard
HADA      hasAcronym     “HADA”                                                     ITDashboard

Privacy Preference Ontology
Namespace: http://vocab.deri.ie/ppo#
[Diagram: ppo:PrivacyPreference and its properties, in four groups]
• Applies To: ppo:appliesToResource, ppo:appliesToStatement, ppo:appliesToNamedGraph, ppo:appliesToDataset, ppo:appliesToContext
• Conditions: ppo:hasCondition, ppo:hasConditionOperator, ppo:hasChildConditionOperator, ppo:conditionOperatorOf, ppo:hasLogicalOperator, ppo:resourceAsSubject, ppo:resourceAsObject, ppo:classAsSubject, ppo:classAsObject, ppo:hasLiteral, ppo:hasProperty
• Access Test Queries: ppo:hasAccessSpace, ppo:hasAccessQuery, ppo:hasAccessAgent
• Access Control Privileges: ppo:hasAccess, ppo:hasNoAccess
• Also shown: ppo:hasPriority
• Classes shown: ppo:PrivacyPreference, ppo:Condition, ppo:ConditionOperator, ppo:Operator, ppo:AccessSpace, acl:Access, foaf:Agent, rdf:Statement, trix:Graph, void:Dataset, wo:Weight, rdfs:Resource, rdfs:Class, rdfs:Literal, rdf:Property

Privacy Preference Ontology: example

    @prefix ppo: <http://vocab.deri.ie/ppo#> .
    @prefix acl: <http://www.w3.org/ns/auth/acl#> .
    @prefix hada: <http://hprod.dyndns.org/> .

    hada:pp1 a ppo:PrivacyPreference ;
        ppo:appliesToResource <http://hprod.dyndns.org/hada/Investment/90000001> ;
        ppo:hasAccess acl:Read ;
        ppo:hasAccessSpace [
            ppo:hasAccessQuery "ASK {?x foaf:topic_interest <http://hprod.dyndns.org/hada/vocab/Asset>}"
        ] .
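The rule model and the privacy preference above can be sketched in a few lines of Python. This is a minimal illustration, not the HADA or Privacy Preference Manager implementation: the `Preference` class, `interested_in`, `visible_quads`, the profiles for John and Alex, and the `hada:Asset` shorthand are all hypothetical. It also assumes that data stays hidden unless an applicable preference grants access, and that one satisfied preference is enough; the slides state neither policy, and ppo:hasNoAccess and ppo:hasPriority are left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

# A quad is (subject, predicate, object, context), as in the example table.
Quad = Tuple[str, str, str, str]
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Preference:
    """Hypothetical, simplified stand-in for a ppo:PrivacyPreference."""
    # Plays the role of ppo:hasAccessQuery: a test run on the requester's profile.
    access_test: Callable[[Set[Triple]], bool]
    # Each pattern field restricts the rule; None means "match anything".
    subject: Optional[str] = None
    predicate: Optional[str] = None
    obj: Optional[str] = None
    context: Optional[str] = None

    def applies_to(self, quad: Quad) -> bool:
        wanted = (self.subject, self.predicate, self.obj, self.context)
        return all(w is None or w == v for w, v in zip(wanted, quad))


def interested_in(topic: str) -> Callable[[Set[Triple]], bool]:
    """Mimics ASK {?x foaf:topic_interest <topic>} over a profile."""
    def test(profile: Set[Triple]) -> bool:
        return any(p == "foaf:topic_interest" and o == topic
                   for _, p, o in profile)
    return test


def visible_quads(quads: List[Quad], preferences: List[Preference],
                  profile: Set[Triple]) -> List[Quad]:
    """Keep a quad only if some applicable preference's test passes
    (assumed default-deny policy)."""
    return [q for q in quads
            if any(p.applies_to(q) and p.access_test(profile)
                   for p in preferences)]


quads: List[Quad] = [
    ("HADA", "hasName", "HHS IT Asset Discovery Application", "HEAR"),
    ("HADA", "hasAcronym", "HADA", "HEAR"),
    ("HADA", "hasCost", "$12345", "CPIC"),
    ("HADA", "hasAcronym", "HADA", "ITDashboard"),
]

# Illustrative rules only; the slides do not say who may see which data.
preferences = [
    # Rule on context alone: ITDashboard data is readable by anyone.
    Preference(access_test=lambda profile: True, context="ITDashboard"),
    # Rule on predicate and context: cost from CPIC needs an interest in Asset.
    Preference(access_test=interested_in("hada:Asset"),
               predicate="hasCost", context="CPIC"),
]

john = {("john", "foaf:topic_interest", "hada:Asset")}
alex = {("alex", "foaf:name", "Alex")}

for name, profile in (("John", john), ("Alex", alex)):
    print(name, visible_quads(quads, preferences, profile))
```

In a real deployment the access test would be the SPARQL ASK query itself, evaluated against the requester's RDF profile by a SPARQL engine, rather than a Python callable.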
[Diagram: the example Privacy Preference]
• ppo:appliesToResource: 90000001
• ppo:hasAccess: acl:Read
• ppo:hasAccessQuery: Who is interested in Asset

Privacy Preference Manager
Privacy Preference Manager provides:
• Creating privacy preferences
• Enforcing privacy preferences
[Diagram: User, Privacy Preference Manager, Privacy Preferences, Repositories (SPARQL Endpoint, RDF Documents)]

Enforcing Privacy Policies
[Diagram: enforcement flow]
• John logs in; John’s RDF Profile is requested
• Repositories: SPARQL Endpoint, RDF Documents
• Privacy Preference Manager components: RDF Data Retriever & Parser, Privacy Preferences Enforcer, Privacy Preferences Creator, Privacy Preferences
• Messages: Request, Query, RDF Data, Access Query, Result, Filtered RDF Data

Towards Patient Controlled Privacy
HHS is exploring the use, on healthdata.gov, of:
• The Linked Data API for publishing Linked Data
• The Privacy Preference Framework to let patients control third-party access to their health data
[Diagram: John and Alex each have an Interface, a Privacy Preference Manager, Privacy Preferences, a SPARQL Endpoint and RDF Documents]

Links
HADA: http://hprod.dyndns.org/
Linked Data API: http://code.google.com/p/linked-data-api/
SIREn: http://siren.sindice.com/
Sesame: http://www.openrdf.org/
PPO Namespace URI: http://vocab.deri.ie/ppo#
PPM Screencasts:
• Creating Privacy Preferences: http://bit.ly/p0N1Vi
• Viewing Filtered Triples: http://bit.ly/qiAdxT
Email: owen.sacco@deri.org