Emerging Technologies
Semantic Web and Data Integration
This meeting will start at 5 min past the hour
As a reminder, please place your phone on mute unless you are speaking
3 May 2013
Meeting Agenda
• Update - Discussion with related initiatives
– CDISC Collaboration
– OpenCDISC validation checks in RDF
– NCI-EVS publication of Controlled Terminology in RDF
– Folding the CDISC2RDF work into FDA/PhUSE ST Project
• Moving Forward – Formation of sub-teams
– Focus of our next meeting (10 May 2013)
• Presentation - Marc Andersen (StatGroup)
– A use case and short technical examples Python, SAS, RDFa
Formation of sub-teams
• Propose to focus on the development of use cases
– CDASH version 1.1
– SDTM Version v1.3/IG v3.1.3, TA Supplements – Expand
RDF representation of SDTM v1.3/IG v3.1.2
– ADaM
• Need to identify leads
• Consider which area you would want to focus on
– Respond to discussion thread on wiki by 16 May 2013
Questions
• Use of the wiki for communication – any
questions?
• Are we ready to move forward?
• Feedback on meetings to date?
A use case and
short technical examples Python,
SAS, RDFa
Marc Andersen
mja@statgroup.dk
03-may-2013
Use Case
Reviewer creates table by copy-paste of output with RDFa markup.
Hovering over a cell with, say, N=42 provides the definition for count as a popup.
In the popup, clicking on the patients link opens a window showing the data listing for the corresponding 42 patients.
Reviewer activates "get data", and the data are shown in a grid for further processing.
RDFa and Python
I learned a lot from reading and trying the examples in:
"Programming the Semantic Web" by Toby Segaran, Colin Evans, and Jamie Taylor.
http://www.oreilly.com/catalog/9780596153816
Creating RDFa using SAS
Approach:
• Extend SAS html tagset to create RDFa using content and value properties in span tag
• Use SAS PROC report to make the output
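The slides do not show the generated markup, so the following is only a minimal sketch, in Python rather than SAS, of what a span-based RDFa cell could look like. The subject naming scheme (`#dp-1-name`) and the helper name `cell_span` are assumptions made for illustration; the `ds:` prefix and the Row, Column and Value properties are taken from the SPARQL query shown later in the deck. Whether a particular RDFa parser yields exactly the triples that query expects has not been verified here.

```python
from html import escape


def cell_span(row, column, value):
    """Return one table cell as nested spans carrying RDFa attributes.

    The subject id pattern "#dp-<row>-<column>" is hypothetical.
    """
    subject = escape("#dp-%d-%s" % (row, column), quote=True)
    col = escape(column, quote=True)
    val = escape(str(value), quote=True)
    return (
        '<span about="%s">'
        '<span property="ds:Row" content="%d"></span>'
        '<span property="ds:Column" content="%s" lang="en"></span>'
        '<span property="ds:Value" content="%s">%s</span>'
        "</span>"
    ) % (subject, row, col, val, val)


# One cell of the example class listing: row 1, column "name".
cell = cell_span(1, "name", "Alfred")
print(cell)
```

The visible text and the `content` attribute carry the same value here; a report could instead show a formatted value in the text and keep the raw value in `content`.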
SAS Generated output with RDFa
Google Chrome extension - RDFa Triples Lister
https://chrome.google.com/webstore/detail/rdfa-tripleslister/lmojbfnaigeibgkhacnebnpbhddpnoam
Roundtripping: Get the data using SPARQL using RDFlib in Python
import rdflib
from rdflib import plugin
from rdflib.namespace import Namespace
from rdflib.graph import Graph

g = Graph()
# change url to your server
url = "http://s107:8000/rdfaclass.html"
# format="rdfa" needs an rdflib installation that includes an RDFa parser
g.parse(location=url, format="rdfa")

# Each data point has a row, a column name and a value;
# joining on ?row turns the data points back into table rows.
qres = g.query(
    """SELECT DISTINCT ?row ?nameVal ?sexVal ?ageVal
    WHERE {
        ?dpName ds:Row ?row .
        ?dpSex ds:Row ?row .
        ?dpAge ds:Row ?row .
        ?dpName ds:Column "name"@en .
        ?dpSex ds:Column "sex"@en .
        ?dpAge ds:Column "age"@en .
        ?dpName ds:Value ?nameVal .
        ?dpSex ds:Value ?sexVal .
        ?dpAge ds:Value ?ageVal .
    }""",
    initNs=dict(ds=Namespace("datapoint-rdf.xml/")),
)

# print one line per row: row number, name, sex, age
for row in qres:
    print("%s %s %s %s" % tuple(row))

Result
1 Alfred M 14
2 Alice F 13
SPARQL endpoint accessed using SAS
SPARQL queries are performed over http.
The query can be made using SAS PROC HTTP.
The results in xml format can be transformed into SAS data set using SAS XML libname.
The program enclosed shows how it can be done, but is not ready for production.
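The SAS program itself is not reproduced in these slides. As a rough illustration of the same two steps (send the query over http, then turn the XML results into rows), here is a sketch in Python using only the standard library. The endpoint URL and the function names are assumptions; the sample document follows the W3C SPARQL Query Results XML Format and is parsed locally, so no server is needed to try it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def query_url(endpoint, query):
    """Build the GET URL for a SPARQL query (endpoint is hypothetical)."""
    return endpoint + "?" + urlencode({"query": query})


def parse_results(xml_text):
    """Turn a SPARQL XML result document into (variables, list of dicts)."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name")
                 for v in root.findall("sr:head/sr:variable", SPARQL_NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", SPARQL_NS):
        row = dict.fromkeys(variables)  # unbound variables stay None
        for binding in result.findall("sr:binding", SPARQL_NS):
            term = binding[0]  # the uri, literal or bnode element
            row[binding.get("name")] = term.text
        rows.append(row)
    return variables, rows


SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="row"/><variable name="nameVal"/></head>
  <results>
    <result>
      <binding name="row"><literal>1</literal></binding>
      <binding name="nameVal"><literal xml:lang="en">Alfred</literal></binding>
    </result>
  </results>
</sparql>"""

url = query_url("http://example.org/sparql", "SELECT * WHERE { ?s ?p ?o }")
variables, rows = parse_results(SAMPLE)
print(variables, rows)
```

Each dictionary in `rows` corresponds to one observation in the data set that the SAS XML libname step would produce.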
R: example
http://linkedscience.org/tools/sparql-package-for-r/linked-open-piracy-tutorial/
RDFa Content Editor
http://rdface.aksw.org/test/tinymce/examples/rdfaDemo.html
Ontologies
SKOS - Simple Knowledge Organization System RDF Schema
http://www.w3.org/2004/02/skos/
http://www.w3.org/TR/2009/REC-skos-reference-20090818/
The RDF Data Cube Vocabulary
http://www.w3.org/TR/2013/WD-vocab-data-cube-20130312/
Looking forward
• Make/identify SAS tools?
– And/or use other tools?
• Select ontology to present results
– BRIDG?
• For the use case
– browser based or dedicated application?