Knowledge Capture Exercise: Representing Relationships between Mapping Data

Trevor Paterson 31 May 2016
Representing relationships between mapping data
1. Vertical Data Representation:
• How we accurately and unambiguously represent mapping data in individual datasets.
• The framework allowing us to integrate the various types and sources of mapping data.
• Important, but should be solvable. Less problematic and contentious? A technological problem of architecture and data integration.
2. Horizontal Data Integration and Comparison:
• How we establish and represent links between data, particularly across species boundaries.
• The types, meanings and reliability of the relationships between data.
• Crucial for the scientific rationale of ComparaGRID. A scientific problem of data interpretation, discovery and inference. But [2] relies on [1].
Vertical Issues
• What types of data are we interested in?
• What is a map?
• What are the types of maps?
• How are the types of maps related?
• What is the relationship between a sequence and a map?
• What is a marker?
• What types of things can be used/represented as markers?
• How do we represent evidence?
Horizontal Issues
• What types of relationships between data are we interested in?
• Which types of data can be related in these ways? Which types of concepts (markers?)
• Which sources of data can be related? Which types of maps?
• How do we define these relationships?
• How do we establish these relationships?
• How do we store these relationships? Permanently? With ownership and provenance?
• How ‘good’ are these relationships? How reliable? How accurate? How reproducible? How stable?
• How do we navigate/join across these relationships?
• What inferences can we draw from relationships? How do we represent the quality and reliability of these relationships?
Exercise: What are the Relationships
SCENARIO: Compare:
My species vs Your species vs Model species
Or
My species
What types of maps do we have data for?
My species vs Your species
Meiotic linkage maps
Physical map of BAC fingerprints
HAPPY Map
….
…
Partial Sequence Map
What types of things are represented as markers on these maps?
Map | Marker
Meiotic linkage maps | Microsatellites, Phenotypes/QTLs, Deletions, …
Physical map of BAC fingerprints | Restriction sites, SNPs, …
…. | …
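One way to make the map/marker pairing above concrete is a small data model. The following is only a minimal sketch under stated assumptions: the class names `Marker` and `Map`, the field names, and the example marker names are all hypothetical and do not come from ComparaGRID; the only domain fact relied on is that meiotic linkage maps are conventionally measured in centimorgans (cM).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Marker:
    """A thing placed on a map (hypothetical model, not a ComparaGRID type)."""
    name: str
    marker_type: str  # e.g. "Microsatellite", "QTL", "RestrictionSite", "SNP"


@dataclass
class Map:
    """A map of one type for one species, holding marker positions."""
    name: str
    species: str
    map_type: str  # e.g. "MeioticLinkage", "BACFingerprint"
    unit: str      # assumed free-text unit label, e.g. "cM" for linkage maps
    positions: dict = field(default_factory=dict)  # Marker -> numeric position

    def place(self, marker: Marker, position: float) -> None:
        """Record where a marker sits on this map."""
        self.positions[marker] = position

    def ordered_markers(self) -> list:
        """Return the markers sorted by their position on this map."""
        return [m for m, _ in sorted(self.positions.items(), key=lambda kv: kv[1])]


# Illustrative data only: the marker names and positions are invented.
linkage = Map("linkage-1", "My species", "MeioticLinkage", "cM")
linkage.place(Marker("msat_B", "Microsatellite"), 42.5)
linkage.place(Marker("msat_A", "Microsatellite"), 10.0)
linkage.place(Marker("qtl_X", "QTL"), 27.3)
marker_order = [m.name for m in linkage.ordered_markers()]
```

Keeping the marker type separate from the map type leaves room for the same kind of marker to appear on more than one kind of map, which is the situation the horizontal questions are concerned with.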
What relationships can be drawn between (which types of) markers?
- On the same map
- On different maps in the same species
- Across Species
Some examples
Homology
Homoeology
Orthology
Paralogy
Xenology
Similarity
Identity
Synonymy
BiologicalGroup
HomologyGroup
OrthologyGroup
ParalogyGroup
GeneFamily
ProteinFamily
RelatedPhenotypes
PhenotypicAssociation
NegativeAssociation
CausalRelationship
SequenceSimilarity
BestMatch
SyntenyWithConservedGeneOrder
SyntenyWithoutConservedGeneOrder
ReciprocalBestMatch
Synteny
Order
Linkage
Colinearity
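The three contexts listed above (same map, different maps in the same species, across species) can be derived from where each marker is placed, rather than stored by hand. The sketch below illustrates that idea only; `MappedMarker`, `Relationship`, `Scope`, the `evidence` and `asserted_by` fields, and all example values are hypothetical names chosen for illustration, not part of any existing schema.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    SAME_MAP = "on the same map"
    SAME_SPECIES = "on different maps in the same species"
    CROSS_SPECIES = "across species"


@dataclass(frozen=True)
class MappedMarker:
    """A marker as it appears on one particular map (hypothetical model)."""
    marker: str
    map_name: str
    species: str


@dataclass(frozen=True)
class Relationship:
    """A typed link between two mapped markers, with simple provenance."""
    kind: str         # one of the example names, e.g. "Orthology"
    source: MappedMarker
    target: MappedMarker
    evidence: str     # assumed free-text description of the supporting evidence
    asserted_by: str  # assumed owner of the assertion

    def scope(self) -> Scope:
        """Classify the link by comparing species first, then map."""
        if self.source.species != self.target.species:
            return Scope.CROSS_SPECIES
        if self.source.map_name != self.target.map_name:
            return Scope.SAME_SPECIES
        return Scope.SAME_MAP


# Illustrative data only: every name below is invented.
a = MappedMarker("gene_1", "linkage-1", "My species")
b = MappedMarker("gene_1", "bac-1", "My species")
c = MappedMarker("gene_1_like", "linkage-9", "Model species")

same_species_link = Relationship("Identity", a, b, "shared marker name", "curator")
cross_species_link = Relationship("Orthology", a, c, "reciprocal best match", "curator")
```

Carrying `evidence` and `asserted_by` on every record is one possible answer to the earlier questions about ownership and provenance; a real design would likely need richer evidence types than free text.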
Do some relationships imply others or exclude others?
SequenceSimilarity => Homology, Orthology, Identity?
RelatedFunction => Orthology, Homology, Paralogy
ReciprocalBestMatch => Orthology
PhenotypicAssociation => CausalRelationship
Synonymy => Similarity
Linkage+Order => SyntenyWithConservedGeneOrder
Which relationships provide better evidence?
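The candidate implications above can be written down as a lookup table, which makes it easier to discuss which ones hold. This is a sketch under assumptions: the table simply transcribes the arrows as posed (several are open questions, not established rules), the function name `suggested` is hypothetical, and the lookup is a single pass that does not chain implications or model exclusions.

```python
# Candidate implications transcribed from the list above. They are questions
# for discussion, so the output is a set of suggestions, not conclusions.
CANDIDATE_IMPLICATIONS = {
    "SequenceSimilarity": {"Homology", "Orthology", "Identity"},
    "RelatedFunction": {"Orthology", "Homology", "Paralogy"},
    "ReciprocalBestMatch": {"Orthology"},
    "PhenotypicAssociation": {"CausalRelationship"},
    "Synonymy": {"Similarity"},
}

# "Linkage+Order" needs both premises to be present at once.
COMPOUND_IMPLICATIONS = {
    frozenset({"Linkage", "Order"}): {"SyntenyWithConservedGeneOrder"},
}


def suggested(observed):
    """Return relationship types suggested by the observed ones (one pass)."""
    observed = set(observed)
    result = set()
    for premise, conclusions in CANDIDATE_IMPLICATIONS.items():
        if premise in observed:
            result |= conclusions
    for premises, conclusions in COMPOUND_IMPLICATIONS.items():
        if premises <= observed:  # every premise must be observed
            result |= conclusions
    return result - observed  # do not re-suggest what is already observed


from_rbm = suggested({"ReciprocalBestMatch"})
from_linkage_and_order = suggested({"Linkage", "Order"})
from_linkage_only = suggested({"Linkage"})
```

A table like this says nothing about which relationships provide better evidence; attaching a reliability weight to each arrow would be one possible extension.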