Emerging Technologies
Semantic Web and Data Integration
This meeting will start at 5 min past the hour
As a reminder, please place your phone on mute unless you are speaking
3 May 2013

Meeting Agenda
• Update - Discussion with related initiatives
  – CDISC Collaboration
  – OpenCDISC validation checks in RDF
  – NCI-EVS publication of Controlled Terminology in RDF
  – Folding the CDISC2RDF work into FDA/PhUSE ST Project
• Moving Forward
  – Formation of sub-teams
  – Focus of our next meeting (10 May 2013)
• Presentation - Marc Andersen (StatGroup)
  – A use case and short technical examples Python, SAS, RDFa

Formation of sub-teams
• Propose to focus on the development of use cases
  – CDASH version 1.1
  – SDTM Version v1.3/IG v3.1.3, TA Supplements
  – Expand RDF representation of SDTM v1.3/IG v3.1.2
  – ADaM
• Need to identify leads
• Consider which area you would want to focus on
  – Respond to discussion thread on wiki by 16 May 2013

Questions
• Use of the wiki for communication – any questions?
• Are we ready to move forward?
• Feedback on meetings to date?

A use case and short technical examples Python, SAS, RDFa
Marc Andersen
mja@statgroup.dk
03-may-2013

Use Case
Reviewer creates a table by copy-paste of output with RDFa markup. Hovering over a cell with, say, N=42 provides the definition for count as a popup. In the popup, clicking on the patients link opens a window showing the data listing for the corresponding 42 patients. Reviewer activates "get data", and the data are shown in a grid for further processing.

I learned a lot from reading and trying the examples in: "Programming the Semantic Web" by Toby Segaran, Colin Evans, and Jamie Taylor.
http://www.oreilly.com/catalog/9780596153816
RDFa and Python

Creating RDFa using SAS
Approach:
• Extend SAS html tagset to create RDFa using content and value properties in span tag
• Use SAS PROC report to make the output

SAS Generated output with RDFa
Google Chrome extension - RDFa Triples Lister
https://chrome.google.com/webstore/detail/rdfa-triples-lister/lmojbfnaigeibgkhacnebnpbhddpnoam

Roundtripping: Get the data using SPARQL using RDFlib in Python

    from rdflib import Graph, Namespace

    g = Graph()
    # change url to your server
    url = "http://s107:8000/rdfaclass.html"
    # requires an rdflib installation that provides an RDFa parser
    g.parse(location=url, format="rdfa")

    # Each data point carries ds:Row, ds:Column and ds:Value.
    # Join the name, sex and age data points that share the same row.
    qres = g.query(
        """SELECT DISTINCT ?row ?nameVal ?sexVal ?ageVal
           WHERE {
             ?dpName ds:Row ?row .
             ?dpSex  ds:Row ?row .
             ?dpAge  ds:Row ?row .
             ?dpName ds:Column "name"@en .
             ?dpSex  ds:Column "sex"@en .
             ?dpAge  ds:Column "age"@en .
             ?dpName ds:Value ?nameVal .
             ?dpSex  ds:Value ?sexVal .
             ?dpAge  ds:Value ?ageVal .
           }""",
        initNs=dict(ds=Namespace("datapoint-rdf.xml/")),
    )

    # print one line per result row
    for row in qres:
        print("%s %s %s %s" % tuple(row))

Result

    1 Alfred M 14
    2 Alice F 13

SPARQL endpoint accessed using SAS
SPARQL queries are performed over HTTP. The query can be made using SAS PROC HTTP. The results in XML format can be transformed into a SAS data set using the SAS XML libname. The program enclosed shows how it can be done – but is not ready for production.

R: example
http://linkedscience.org/tools/sparql-package-for-r/linked-open-piracy-tutorial/

RDFa Content Editor
http://rdface.aksw.org/test/tinymce/examples/rdfaDemo.html

Ontologies
SKOS - Simple Knowledge Organization System RDF Schema
http://www.w3.org/2004/02/skos/
http://www.w3.org/TR/2009/REC-skos-reference-20090818/
The RDF Data Cube Vocabulary
http://www.w3.org/TR/2013/WD-vocab-data-cube-20130312/

Looking forward
• Make/identify SAS tools?
  – And/or use other tools?
• Select ontology to present results
  – BRIDG?
• For the use case
  – browser based or dedicated application?
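
Sketch: reading SPARQL XML results in Python
The slide on accessing a SPARQL endpoint from SAS describes turning the XML results into a data set. As a rough illustration of that step only, the sketch below parses a response in the W3C SPARQL Query Results XML Format into a list of rows using the Python standard library. This is not the enclosed SAS program. The sample response and the function name parse_sparql_xml are assumptions made for illustration; the variable names and values are borrowed from the roundtripping example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Illustrative response (an assumption, not captured from a real endpoint);
# the values reuse the two rows shown in the roundtripping example.
SAMPLE_XML = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="row"/>
    <variable name="nameVal"/>
    <variable name="sexVal"/>
    <variable name="ageVal"/>
  </head>
  <results>
    <result>
      <binding name="row"><literal>1</literal></binding>
      <binding name="nameVal"><literal xml:lang="en">Alfred</literal></binding>
      <binding name="sexVal"><literal xml:lang="en">M</literal></binding>
      <binding name="ageVal"><literal>14</literal></binding>
    </result>
    <result>
      <binding name="row"><literal>2</literal></binding>
      <binding name="nameVal"><literal xml:lang="en">Alice</literal></binding>
      <binding name="sexVal"><literal xml:lang="en">F</literal></binding>
      <binding name="ageVal"><literal>13</literal></binding>
    </result>
  </results>
</sparql>
"""


def parse_sparql_xml(text):
    """Hypothetical helper: return (variable names, list of row dicts)."""
    root = ET.fromstring(text)
    # Column order comes from the <head> section.
    names = [v.get("name") for v in root.findall("sr:head/sr:variable", SPARQL_NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", SPARQL_NS):
        row = dict.fromkeys(names)  # variables without a binding stay None
        for binding in result.findall("sr:binding", SPARQL_NS):
            term = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = term.text
        rows.append(row)
    return names, rows


if __name__ == "__main__":
    names, rows = parse_sparql_xml(SAMPLE_XML)
    for row in rows:
        print(" ".join(row[name] for name in names))
```

Each row comes back as a plain dictionary keyed by variable name, so the same structure could be written out as CSV or loaded into a grid; language tags and datatypes on literals are ignored in this sketch.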