Where do workflows fit in?
• Advanced queries incorporating other DBs
– Linking genes with diseases (OMIM)
– Genetic pathways (KEGG)
• Mouse-human interoperability
– Using anatomical terms
– Using direct 3D to 3D model mapping
– Using spatial-temporal ontologies
• Data mining and processes
– Hierarchical Clustering
– Association rules
Jano van Hemert
www.dgemap.org
Mouse-human interoperability
Hierarchical clustering
‘McMahon’ Data TS17
Myt1l
Dlx5
Let biologists cluster data
Clustering: viewing the output
What are association rules?
• Based on a set of transactions
• We want to derive rules of the form X => Y
• Meaning, if X happens then Y happens
• X and Y are sets of items appearing in the
transactions
• The rules come with numbers to express their
quality with respect to the set of transactions
(most common: support and confidence)
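The two quality measures named above can be sketched in a few lines of Python. This is a minimal illustration using the standard definitions (support as the fraction of transactions containing an itemset, confidence of X => Y as support(X and Y together) divided by support(X)). The function names and the toy transactions with items A, B and C are assumptions made for the example, not data from the talk.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[Transaction]) -> float:
    """Confidence of the rule X => Y: support(X | Y) / support(X)."""
    x_set, y_set = frozenset(x), frozenset(y)
    support_x = support(x_set, transactions)
    if support_x == 0:
        raise ValueError("X never occurs, so the confidence is undefined")
    return support(x_set | y_set, transactions) / support_x


# Hypothetical toy transactions, for illustration only.
transactions = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B", "C"}),
    frozenset({"A", "B", "C"}),
]

rule_support = support({"A", "B", "C"}, transactions)
rule_confidence = confidence({"A", "B"}, {"C"}, transactions)
```

Here the rule A, B => C holds in 2 of the 5 transactions, and in 2 of the 3 transactions that contain both A and B.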
Association Rules
• In the context of gene expression:
if Gene1 and Gene2 then Gene3
where a transaction equals a set of
genes expressing together at the
same time in the same anatomical
component
• Alternative: if Component1 then
Component2 and Component3
where a transaction equals a
number of components expressing
the same gene at the same time
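The two ways of defining a transaction described above can be sketched as a grouping step over expression records. This is only an assumed representation: the (gene, component, stage) tuple layout, the function names and the sample records are hypothetical and do not reflect the actual format of the underlying database.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical record layout: (gene, anatomical component, stage).
Record = Tuple[str, str, int]

records: List[Record] = [
    ("GeneA", "component1", 17),
    ("GeneB", "component1", 17),
    ("GeneA", "component2", 17),
    ("GeneC", "component2", 17),
    ("GeneA", "component1", 18),
]


def gene_transactions(recs: List[Record]) -> Dict[Tuple[str, int], Set[str]]:
    """One transaction per (component, stage): the genes expressed there."""
    groups: Dict[Tuple[str, int], Set[str]] = defaultdict(set)
    for gene, component, stage in recs:
        groups[(component, stage)].add(gene)
    return dict(groups)


def component_transactions(recs: List[Record]) -> Dict[Tuple[str, int], Set[str]]:
    """One transaction per (gene, stage): the components expressing that gene."""
    groups: Dict[Tuple[str, int], Set[str]] = defaultdict(set)
    for gene, component, stage in recs:
        groups[(gene, stage)].add(component)
    return dict(groups)


by_component = gene_transactions(records)
by_gene = component_transactions(records)
```

The first grouping yields rules between genes (if Gene1 and Gene2 then Gene3), the second yields rules between components (if Component1 then Component2 and Component3).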
Association Rules Results
• Transaction: genes expressing in the same
anatomical component in the same Theiler stage
• Association rules with a minimum confidence of 90%

Rule                  Support  Confidence
Wnt1, Bmp4 => Shh     0.053    0.91
Vcam1 => Kdr          0.057    0.93
Emx2 => Otx2          0.054    0.95
Otx1, Pax6 => Otx2    0.051    0.92

• Source: the EMAGE database, using the editorial
spatial annotations extracted on 2006/08/28
• Techno-fact: extracted using web services called
from a Perl script…
Perl script
Main issues while using Taverna
• Need for more data mangling functions
• Need for more data formatting controls
• Pipelining and memory concerns
• Library of useful translation services
• Interaction Plug-in Architecture…?
• What about Axis version 2?
Thanks for your attention
Susan Lindsay
Demetrius Vouyiouklis
Marie-Laure Muiras
Xunxian Wang
Mark Scott
Alina Andras
Jano van Hemert
Malcolm Atkinson
Jano van Hemert
Yin Chen
Richard Baldock
Simon Woods
Ken Taylor