Knowledge Capture Exercise: Representing Relationships between Mapping Data

Trevor Paterson 31 May 2016

Representing relationships between mapping data

1. Vertical Data Representation:
• How we accurately and unambiguously represent mapping data in individual datasets.
• The framework allowing us to integrate the various different types and sources of mapping data.
• Important but should be solvable. Less problematic and contentious?
A technological problem of architecture and data integration.

2. Horizontal Data Integration and Comparison:
• How do we establish and represent links between data, particularly across species boundaries?
• What are the types, meanings and reliability of the relationships between data?
• Crucial for the scientific rationale of ComparaGRID.
A scientific problem of data interpretation, discovery and inference.

But [2] relies on [1].

Vertical Issues
• What types of data are we interested in?
• What is a map?
• What are the types of maps?
• How are the types of maps related?
• What is the relationship between a sequence and a map?
• What is a Marker?
• What types of things can be used/represented as markers?
• How do we represent evidence?

Horizontal Issues
• What types of relationships between data are we interested in?
• Which types of data can be related in these ways? Which types of concepts (Markers?)
• Which sources of data can be related? Which types of maps?
• How do we define these relationships?
• How do we establish these relationships?
• How do we store these relationships? Permanently? With ownership and provenance?
• How ‘good’ are these relationships? How reliable? How accurate? How reproducible? How stable?
• How do we navigate/join across these relationships?
• What inferences can we draw from relationships? How do we represent the quality and reliability of these relationships?
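
To make the storage questions above more concrete, here is a minimal sketch of how a relationship between two markers might be recorded together with ownership, evidence and a reliability score. Every class name, field name and example value below is hypothetical and chosen only for illustration; none of it is a ComparaGRID schema or an agreed answer to the questions on the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Marker:
    """A feature placed on a map (hypothetical model, not a real schema)."""
    name: str
    marker_type: str                  # e.g. "Microsatellite", "SNP"
    species: str
    map_name: str
    position: Optional[float] = None  # units depend on the type of map


@dataclass
class Relationship:
    """A typed link between two markers, stored with its provenance."""
    kind: str                         # e.g. "Orthology", "Synonymy"
    source: Marker
    target: Marker
    asserted_by: str                  # ownership: who made the assertion
    evidence: List[str] = field(default_factory=list)
    confidence: Optional[float] = None  # None means "not yet assessed"

    def crosses_species(self) -> bool:
        """True when the link is horizontal across a species boundary."""
        return self.source.species != self.target.species


# Placeholder values only; no real data is implied.
a = Marker("marker_A", "Microsatellite", "my_species", "meiotic_linkage_map", 12.5)
b = Marker("marker_B", "Microsatellite", "your_species", "meiotic_linkage_map")
link = Relationship("Orthology", a, b, asserted_by="curator_1",
                    evidence=["ReciprocalBestMatch"])
print(link.crosses_species())  # True
```

Keeping `asserted_by`, `evidence` and `confidence` on the relationship itself, rather than on the markers, is one way to let different owners assert different links between the same pair of markers.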

Exercise: What are the Relationships?

SCENARIO:
Compare: My species vs Your species vs Model species
Or My species

What types of maps do we have data for?
My species vs Your species
• Meiotic linkage maps
• Physical map of BAC fingerprints
• HAPPY Map
• …
• …
• Partial Sequence Map

What types of things are represented as markers on these maps?

Map                                 Marker
Meiotic linkage maps                Microsatellites, Phenotypes/QTLs, Deletions, …
Physical map of BAC fingerprints    Restriction sites, SNPs, …
…                                   …

What relationships can be drawn between (which types of) markers?
- On the same map
- On different maps in the same species
- Across species

Some examples:
Homology, Homoeology, Orthology, Paralogy, Xenology,
Similarity, Identity, Synonymy,
BiologicalGroup, HomologyGroup, OrthologyGroup, ParalogyGroup, GeneFamily, ProteinFamily,
RelatedPhenotypes, PhenotypicAssociation, NegativeAssociation, CausalRelationship,
SequenceSimilarity, BestMatch, SyntenyWithConservedGeneOrder, SyntenyWithoutConservedGeneOrder, ReciprocalBestMatch,
Synteny, Order, Linkage, Colinearity

Do some relationships imply others or exclude others?
SequenceSimilarity => Homology, Orthology, Identity?
RelatedFunction => Orthology, Homology, Paralogy
ReciprocalBestMatch => Orthology
PhenotypicAssociation => CausalRelationship
Synonymy => Similarity
Linkage+Order => SyntenyWithConservedGeneOrder

Which relationships provide better evidence?
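
As a starting point for discussion, the candidate implications listed above could be written down as simple rules and applied mechanically. The sketch below is only an illustration: the rule table copies the slide's examples as they stand, the names `CANDIDATE_RULES` and `candidate_inferences` are hypothetical, and each right-hand side is treated as a set of relationships worth investigating, not as an established conclusion.

```python
from typing import FrozenSet, List, Set, Tuple

# (premises, candidate conclusions). All premises must be observed
# before a rule fires, which models "Linkage+Order".
Rule = Tuple[FrozenSet[str], FrozenSet[str]]

# Copied from the slide; these are open questions, not accepted rules.
CANDIDATE_RULES: List[Rule] = [
    (frozenset({"SequenceSimilarity"}),
     frozenset({"Homology", "Orthology", "Identity"})),
    (frozenset({"RelatedFunction"}),
     frozenset({"Orthology", "Homology", "Paralogy"})),
    (frozenset({"ReciprocalBestMatch"}), frozenset({"Orthology"})),
    (frozenset({"PhenotypicAssociation"}), frozenset({"CausalRelationship"})),
    (frozenset({"Synonymy"}), frozenset({"Similarity"})),
    (frozenset({"Linkage", "Order"}),
     frozenset({"SyntenyWithConservedGeneOrder"})),
]


def candidate_inferences(observed: Set[str]) -> Set[str]:
    """Return relationships suggested by the rules but not directly observed.

    Rules are applied repeatedly until nothing new is added, so rules
    whose premises are themselves conclusions would chain if any existed.
    """
    known = set(observed)
    changed = True
    while changed:
        changed = False
        for premises, conclusions in CANDIDATE_RULES:
            if premises <= known and not conclusions <= known:
                known |= conclusions
                changed = True
    return known - set(observed)


print(candidate_inferences({"Linkage", "Order"}))
print(candidate_inferences({"Linkage"}))
```

With both `Linkage` and `Order` observed, the only suggestion is `SyntenyWithConservedGeneOrder`; with `Linkage` alone, nothing is suggested. Exclusions ("A rules out B") are not modelled here and would need a separate table.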