A report from the ‘Star Trek’ Crew
Chris Mavergames
Web Operations Manager/Information Architect
Cochrane Collaboration Web Team
Intro to linked data and what it means for Cochrane
"Star Trek" stream of work so far
What's possible now and in the future
* Acknowledgements to Lorne Becker and the entire Star Trek crew. Their input was invaluable in the preparation of this talk.
Structure of this talk
There are problems that limit their use by some people
◦ Difficult to wade through all of the text
◦ Difficult to understand the figures, terminology, and other bits of the Review
◦ Hard to compare interventions without reading multiple Reviews
◦ Can be difficult to find the Review you seek
Cochrane Reviews are fantastic
Search for “Prozac” – no reviews
Search for “fluoxetine” – 25 reviews
Searching The Cochrane Library
Beginning to do this now:
◦ Summaries.Cochrane.org for consumers
◦ Cochrane Clinical for clinicians
BUT
◦ Reformulating Reviews takes a lot of work, and authors, CRGs, etc. are busy
Wouldn’t it be nice if we could automate or partially automate this?
Ideally we’d restructure our content for different users
How did Bing read 3 different weather sites & bring me the data I need?
Could we do similar magic with our Cochrane reviews?
If so, what might we be able to accomplish?
What is linked data?
Current web = Web of documents
Documents are linked, not the data within them
Data on the web is meant for human consumption
Machines need the data to be structured
Once structured, information can be more easily shared within datasets and across web pages
Machines aren’t good at reading web pages
XML
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<COCHRANE_REVIEW DESCRIPTION="For publication"
DOI="10.1002/14651858.CD008440" GROUP_ID="HIV"
ID="589309120202025823" MERGED_FROM=""
MODIFIED="2011-05-06 12:29:46 +0100"
MODIFIED_BY="Rachel Marshall" REVIEW_NO=""
REVMAN_SUB_VERSION="5.1.1"
REVMAN_VERSION="5" SPLIT_FROM="" STAGE="R"
STATUS="A" TYPE="INTERVENTION"
VERSION_NO="2.0">........
Cochrane Reviews
Fortunately, Cochrane Reviews are structured – but we still need to teach the machines how to read them, where to find data within them and how the data is related.
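As a rough illustration of what "teaching the machine where to find data" can mean, the sketch below reads a few attributes from the root element shown on the XML slide. It assumes the root element is closed immediately (the slide elides the body of the review), and the helper name review_metadata is made up for this example.

```python
import xml.etree.ElementTree as ET

# Root element attributes copied from the slide. The elided review body is
# left out and the element is self-closed so the fragment parses on its own.
REVIEW_XML = """<COCHRANE_REVIEW DESCRIPTION="For publication"
 DOI="10.1002/14651858.CD008440" GROUP_ID="HIV"
 MODIFIED="2011-05-06 12:29:46 +0100"
 STAGE="R" STATUS="A" TYPE="INTERVENTION"
 VERSION_NO="2.0"/>"""


def review_metadata(xml_text):
    """Return a few attributes of the review's root element as a dict."""
    root = ET.fromstring(xml_text)
    wanted = ("DOI", "GROUP_ID", "TYPE", "VERSION_NO")
    return {name: root.get(name) for name in wanted}


metadata = review_metadata(REVIEW_XML)
print(metadata["DOI"], metadata["GROUP_ID"])
```

Because the attributes are named, the program never has to guess which piece of text is the DOI or the review group.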
[Figure: a Cochrane Review with its individual data points labeled]
Cochrane Reviews
Cochrane Register of Studies
The lack of unique study IDs is a real problem
CRS solves this by providing a unique ID for all studies that can be referenced
Enables better linking of data about trials, and opens up linking to external sources such as PubMed (example later)
Links to the CRS
OWL (Web Ontology Language)
RDF (Resource Description Framework)
SPARQL (RDF query language)
Model Cochrane Reviews in OWL
Transform them into RDF and add to triple store
Query them with SPARQL
OR, simply...
Linked data technologies
Use the gears!
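The model / transform / query steps can be imitated in a few lines of plain Python. This is only a stand-in for the idea: it is not RDF, OWL or SPARQL, and the property names (hasGroup, hasType) and the second review identifier are invented for illustration. Only the CD008440 / HIV pairing comes from the XML slide.

```python
# Hypothetical triples about two reviews. Property names and "EXAMPLE-2" are
# invented; they do not come from a real Cochrane ontology.
TRIPLES = {
    ("review:CD008440", "hasGroup", "HIV"),
    ("review:CD008440", "hasType", "INTERVENTION"),
    ("review:EXAMPLE-2", "hasGroup", "Dementia"),
    ("review:EXAMPLE-2", "hasType", "INTERVENTION"),
}


def match(triples, subject=None, prop=None, obj=None):
    """Return the triples that fit the pattern, sorted.

    None acts as a wildcard, much like a variable in a SPARQL triple pattern.
    """
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (prop is None or t[1] == prop)
        and (obj is None or t[2] == obj)
    )


# Roughly what a query like: SELECT ?review WHERE { ?review hasGroup "HIV" }
# would ask of a real triple store.
hiv_reviews = [s for s, _, _ in match(TRIPLES, prop="hasGroup", obj="HIV")]
print(hiv_reviews)
```

A real triple store does the same kind of pattern matching, but over shared vocabularies and at a much larger scale.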
Subject -> Property -> Object
<Gerd Antes> has-role <Director German Ctr>
<Director German Ctr> works-in <Freiburg, Germany>
<Gerd Antes> works-in <Freiburg, Germany>
Triple store = Way we think!
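The three triples above can be written down directly, and the third can be derived from the first two. The sketch below assumes one simple rule (if X has-role R and R works-in P, then X works-in P); that rule is an illustration and not something taken from an actual ontology.

```python
# The first two triples from the slide, as (subject, property, object) tuples.
TRIPLES = {
    ("Gerd Antes", "has-role", "Director German Ctr"),
    ("Director German Ctr", "works-in", "Freiburg, Germany"),
}


def infer_works_in(triples):
    """Apply the assumed rule: X has-role R and R works-in P => X works-in P."""
    inferred = set(triples)
    for person, prop, role in triples:
        if prop != "has-role":
            continue
        for subject, prop2, place in triples:
            if subject == role and prop2 == "works-in":
                inferred.add((person, "works-in", place))
    return inferred


result = infer_works_in(TRIPLES)
print(("Gerd Antes", "works-in", "Freiburg, Germany") in result)
```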
All Reviews in Archie
Standard tools have been developed to facilitate this process
A copy of the Review XML + a model of the data -> using “the gears” -> a machine-readable “triple store”
A question + a machine-readable “triple store” -> using “the gears” -> a machine-generated answer
Insert witty Star Trek reference here!
Cochrane Review ontology
Lots of work still needed from people with a deep understanding of Cochrane content in order to get the data model and ontology right
Findings ontology from Lorne
A question + a machine-readable “triple store” -> a machine-generated answer
What sorts of things could we do with this?
Gears!
Ask questions that use data from several different reviews
Enhance the experience of our users by including data from the triple stores of others
Improve search
Make it easier for people to find Cochrane Reviews
We can…
Ask questions that use data from several different reviews
Enhancing the User Experience
I’ve done a search for trials on a particular intervention for dementia.
I want to know which of the trials have been included in a Cochrane Review.
A question using multiple reviews
Search for the relevant Reviews
Read the reference lists to find included trials
Compare with my trial search
Eliminate the new references that are additional publications from trials already included in a Review.
Finding the answer the old way
My list of trials
A “studified” list from the CRS
The Cochrane Review “triple store”
The “Star Trek” Way
A machine-generated list of trials not yet included in a review
Links to the relevant Review for those trials that were included
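A minimal sketch of this kind of cross-Review question, assuming each trial already has a unique study ID (as the CRS is meant to provide) and that the triple store records which Review includes which study. All IDs and the includes-study property name are invented for illustration.

```python
# Hypothetical "studified" search results: one ID per study.
my_trials = ["STUDY-001", "STUDY-002", "STUDY-003", "STUDY-004"]

# Hypothetical (review, "includes-study", study) triples.
TRIPLES = [
    ("REVIEW-A", "includes-study", "STUDY-001"),
    ("REVIEW-B", "includes-study", "STUDY-003"),
    ("REVIEW-B", "includes-study", "STUDY-009"),
]


def split_by_inclusion(trials, triples):
    """Split trials into those included in a Review and those that are not.

    Returns (included, not_included): a dict mapping study ID to the Reviews
    that include it, and a list of study IDs found in no Review.
    """
    included_in = {}
    for review, prop, study in triples:
        if prop == "includes-study":
            included_in.setdefault(study, []).append(review)
    included = {t: included_in[t] for t in trials if t in included_in}
    not_included = [t for t in trials if t not in included_in]
    return included, not_included


included, not_included = split_by_inclusion(my_trials, TRIPLES)
print("Not yet in a Review:", not_included)
print("Already included:", included)
```

Matching on study IDs rather than on references is what removes the manual step of eliminating additional publications from the same trial.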
What are the risks of bias for the entire set of trials assessing the effectiveness of a particular intervention?
Another question using multiple Reviews
Search for the relevant reviews (there may be more than one)
Read the tables of included studies to find risk of bias assessments for each trial
Combine them
* (in some cases review authors may have done this for all of the trials in a single review)
Finding the answer the old way
The Cochrane Review “triple store”
The “Star Trek” Way
A machine-generated summary of the Risk of Bias assessments for the relevant trials
RoB Summary for Cochrane Reviews on dementia
These figures summarize Risks of Bias from the trials included in the reviews in your search
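Combining Risk of Bias assessments across Reviews amounts to counting judgements per domain once the assessments are available as data. The sketch below assumes each assessment is a (study, domain, judgement) record; the study IDs and the specific judgements are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical assessments pulled from several Reviews.
ASSESSMENTS = [
    ("STUDY-001", "Allocation concealment", "Low risk"),
    ("STUDY-002", "Allocation concealment", "Unclear risk"),
    ("STUDY-003", "Allocation concealment", "Low risk"),
    ("STUDY-001", "Blinding", "High risk"),
    ("STUDY-002", "Blinding", "Low risk"),
]


def summarize_rob(assessments):
    """Count how many trials received each judgement, per bias domain."""
    summary = defaultdict(Counter)
    for _study, domain, judgement in assessments:
        summary[domain][judgement] += 1
    return {domain: dict(counts) for domain, counts in summary.items()}


rob_summary = summarize_rob(ASSESSMENTS)
for domain, counts in rob_summary.items():
    print(domain, counts)
```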
Cochrane Reviews
Make search work better
Enhancing the User Experience
Or, one could say any of these:
Abenol (CA), Acephen, Anadin Paracetamol (UK), Apo-Acetaminophen (CA), Aspirin Free Anacin, Atasol (CA), Calpol (UK), Cetaphen, Children's Tylenol Soft Chews, Disprol (UK), Exdol (CA), Feverall, Galpamol (UK), Genapap, Genebs, Infant's Pain Reliever, Mandanol (UK), Nortemp, Pain Eze, Panadol (UK), Robigesic (CA), Silapap, Tycolene, Tylenol 8 Hour, Tylenol, Tylenol Arthritis, Uni-Ace, Valorin
You Say “Paracetamol”
I Say “Acetaminophen”
LinkedLifeData.com
DrugBank
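One way a linked drug dataset could improve search is by expanding a query to every known name for the same drug before matching. The sketch below hard-codes a tiny synonym table using names from these slides; in practice the table might be drawn from a source such as DrugBank, which is an assumption here. The review IDs and their indexed terms are invented.

```python
# Tiny, hand-written synonym table (generic name -> other names).
SYNONYMS = {
    "acetaminophen": {"paracetamol", "tylenol", "panadol", "calpol"},
    "fluoxetine": {"prozac"},
}

# Hypothetical index: review ID -> drug terms found in that review.
INDEX = {
    "REVIEW-A": {"fluoxetine"},
    "REVIEW-B": {"paracetamol"},
}


def expand(query):
    """Return every known name for the queried drug, lowercased."""
    q = query.lower()
    for generic, names in SYNONYMS.items():
        if q == generic or q in names:
            return {generic} | names
    return {q}


def search(query, index):
    """Return IDs of reviews indexed under any name for the queried drug."""
    terms = expand(query)
    return sorted(rid for rid, indexed in index.items() if indexed & terms)


print(search("Prozac", INDEX))
print(search("Acetaminophen", INDEX))
```

With the expansion in place, a search for a brand name reaches reviews that only mention the generic name, and the other way round.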
Cochrane Reviews
Make it easier for people to find Cochrane Reviews
Enhancing the User Experience
Enhancing news content
Cochrane Reviews carrying semantic markup can be linked to news publishers' content
For example, BBC Health writers could be shown related Cochrane evidence for the story they are writing
They could then include a link to primary source material such as a Cochrane Review
This would drive traffic to our Reviews
How applicable is this Review in my part of the world?
Super Star Trek
A list of the drugs compared in Reviews on malaria, and the geographic extent of their effectiveness
Geographical relevance
Map of Artemisinin Resistance
Structured and linked data can help make our content “nimble”
Nimble content can:
• Travel Freely
• Retain Context and Meaning
• Create New Products
- R. Lovinger, Razorfish
Making our content nimble
"Structured data allows you to preserve your value proposition over a longer distance to a much wider audience."
- Martin Hepp, creator of the GoodRelations ontology
Structured data
Implementing semantic and linked data technologies should be:
• Non-invasive
• Agile
• Low impact (on staff – hopefully, high impact on users!)
Incremental development
What would Cochrane data “look like” outside of its container, the Review?
Looking to the future
For example: someone who is looking at a study in PubMed might be interested in seeing Cochrane’s Risk of Bias assessment of this study, regardless of whether they are interested in the overall Cochrane Review that includes that study.
Risk of Bias in PubMed
RoB assessment in PubMed
Linked Data or Web 3.0 is here
How can we leverage these tools to further our mission?
Requires that we think differently about the “container” of the Review
Our data needs to become “nimble” to meet future user needs
We should proceed slowly, incrementally
What are the “quick wins”? Links to CRS? Across-Review queries? Links to external datasets?
Summary
EbHC Semantic Platform: CDSR, CRS/CENTRAL, DARE, CMR, HTAs
EbHC Semantic Platform: CDSR, CRS/CENTRAL, DARE, CMR, HTAs, UMLS, DrugBank, Diseasome, Symptom Ontology, * BBC Health Ontology
* Not yet created
Cochrane and EbHC ontology?
Will Cochrane have a bubble here someday?