HCLS$$HCLSIG_DemoHomePage_HCLSIG_Demo

advertisement
Harnessing the Semantic Web to
Answer Scientific Questions:
A Health Care and Life Sciences Interest Group demo
Susie Stephens, Principal Research Scientist, Lilly
Agenda
• Health Care and Life Sciences Interest Group
• Scientific Use Case
• Technological Approach
• Demonstration
• Benefits of the Semantic web
Health Care & Life Sciences Interest Group
HCLSIG is chartered to develop and support the use of
Semantic Web technologies and practices to improve
collaboration, research and development, and innovation
adoption in the Health Care and Life Science domains
More details on HCLS are available at:
http://www.w3.org/2001/sw/hcls/
Benefits of Semantic Web Technologies
• Fusion of data across many scientific disciplines
• Easier recombination of data
• Querying of data at different levels of granularity
• Capture provenance of data through annotation
• Perform inference across data sets
• Machine processable approach
• Data can be assessed for inconsistencies
Scientific Use Case
• Use case focuses on Alzheimer’s Disease
• AD is a devastating illness that impacts 26.6 million
people worldwide
• Prevalence is predicted to quadruple to 106.8 million
by 2050
• Many different types of evidence need to be
integrated
• An active Web community exists for AD research
Scientific Data Sets
• Integration and analysis of heterogeneous data sets
•
Hypothesis, Genome, Pathways, Molecular Properties, Disease, etc.
PDSPki
Gene
Ontology
NeuronDB
Reactome
BAMS
Antibodies
NC
Annotations
Entrez
Gene
Allen Brain
Atlas
BrainPharm
MESH
Mammalian
Phenotype
SWAN
AlzGene
PubChem
Homologene
Publications
Scientific Hypothesis & Research Questions
Scientific Hypothesis:
- Amyloid beta peptide may impair memory by
inhibiting long-term potentiation (LTP)
Research Questions:
- By what mechanism does amyloid beta inhibit LTP?
- Can we identify a novel therapeutic target based on
this mechanism?
- How can we validate the therapeutic target?
Technological Approach
• Careful modeling that reflect biology to enable integration of
data sources
• All bio-entities were assigned URIs
• Most data translated to RDF and managed in a triple store
• Other data maintained in original store and mapped to RDF
• Using a reasoner to infer triples to increase expressiveness
of queries
• Query data with SPARQL and visualization tools
Conclusions
• Semantic Web provides ability to query across many
disparate data sources to discover new insights
• Potential to identify patterns and insights across many
data sources
• Data needs to be carefully modeled
• Flexible re-use of data, which is important in a
discipline where knowledge is frequently updated
Acknowledgements
HCLS Demo Contributors
HCLS Demo Contributors
• John Barkley (NIST)
• Susie Stephens (Eli Lilly)
• Olivier Bodenreider (NLM, NIH)
• Mike Travers (
• Bill Bug (Drexel University College of Medicine)
• Gwen Wong (SWAN)
• Huajun Chen (Zhejiang University)
• Elizabeth Wu (SWAN)
• Paolo Ciccarese (SWAN)
• Kei Cheung (SenseLab, Yale)
Data Providers
•Tim Clark (SWAN)
• Judith Blake (MGD.)
• Don Doherty (Brainstage Research Inc.)
• Mikail Bota (BAMS)
• Kerstin Forsberg (AstraZeneca)
• David Hill (MGD)
• Ray Hookaway (HP)
• Oliver Hoffman (CL)
• Vipul Kashyap (Partners Healthcare)
• Minna Lehvaslaiho (CL)
• June Kinoshita (AlzForum)
• Colin Knep (Alzforum)
• Joanne Luciano (Harvard Medical School)
• Maryanne Martone (CCDB)
• Scott Marshall (University of Amsterdam)
• Susan McClatchy (MGD)
• Chris Mungall (NCBO)
• Simon Twigger (RGD)
• Eric Neumann (Teranode)
• Allen Brain Institute
• Eric Prud’hommeaux (W3C)
• Jonathan Rees (Science Commons)
• Alan Ruttenberg (Science Commons)
• Matthias Samwald (Medical University of Vienna)
Vendor Support
• OpenLink - Kingsley Idehen, Ivan Mikhailov, Orri Erling, Mitko Iliev
• HP - Ray Hookaway, Jeannine Crockford
NeuronDB
PDSPki
GO
Reactome
Genes/proteins
Interactions
Cellular location
Processes (GO)
Molecular function
Cell components
Biological process
Annotation gene
PubMedID
Antibodies
Genes
Antibodies
NC
Annotations
Genes/Proteins
Processes
Cells (maybe)
PubMed ID
PubMedID
Hypothesis
Questions
Evidence
Genes
SWAN
Proteins
Chemicals
Neurotransmitters
Entrez
Gene
Genes
Protein
GO
pubmedID
Interaction (g/p)
Chromosome
C. location
Allen Brain
Atlas
Genes
Brain images
Gross anatomy ->
neuroanatomy
BAMS
Protein
Neuroanatomy
Cells
Metabolites
(channels)
PubmedID
MESH
Drugs
Anatomy
Phenotypes
Compounds
Chemicals
PubMedID
PubChem
Genes
Phenotypes
Disease
PubMedID
Genes
Mammalian
Species
Phenotype
Gene
Orthologies
Polymorphism Proofs
Population
Alz Diagnosis
Homologene
AlzGene
Protein (channels/receptors)
Neurotransmitters
Neuroanatomy
Cell
Compartments
Currents
BrainPharm
Drug
Drug effect
Pathological agent
Phenotype
Receptors
Channels
Cell types
pubMedID
Disease
Name
Structure
Properties
Mesh term
PubChem
Download