Andy Law Farm Animal Genomics

Farm Animal Genomics
Andy Law
Biological background
• An organism’s genetic blueprint is contained within DNA
• DNA is packaged into linear ‘strings’ or chromosomes
• The number of chromosomes is a characteristic of the species
Genomics and Bioinformatics
Biological background
• Over time, (random) changes occur
  – Mutations in DNA sequence (polymorphisms)
  – Duplications/deletions of regions
  – Fusion/fission/rearrangement at chromosome level

Biological background
• Polymorphisms (mutations) within a species can be used to ‘map’ the genome of that species
• Associations between locations on those maps and phenotypes provide clues to the presence of ‘important’ genes (QTL)

Biological background
• Evolution preserves functional regions (coding regions/regulatory elements)
• We can identify pieces of DNA that share a common ancestor through the similarity of their sequence

Biological background
• Links from one species to another can be used to infer missing data

Species A Gene Map
[Figure: Species A map with genes in the order A, B, C, D, E]

Species B Gene Map
[Figure: Species B map with genes in the order A, D, B, E, C]

Comparative Gene Mapping
[Figure: Species A and Species B maps side by side, with genes A, B, C, D and E labelled on both]

Identification of QTL gene
[Figure: comparative map of Species A and Species B, genes A to E labelled on both; a bracket on the Species A map is labelled ‘QTL is in here somewhere’]

Identification of QTL gene
[Figure: the same comparative map, with the Species A bracket labelled ‘QTL is in here somewhere’; the Species B map additionally carries Gene 1, Gene 2, Gene 3 and Gene 4, and a bracket on the Species B side is labelled ‘These are potential candidate genes’]

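The candidate-gene idea in the two slides above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular tool. The Species B gene order (including where Gene 1 to Gene 4 sit) is an assumed layout, the function name `candidate_genes` is hypothetical, and the approach assumes gene order is conserved between the two flanking markers.

```python
# Assumed layout of the Species B map; the exact positions of Gene 1 to Gene 4
# are not recoverable from the slide and are chosen here for illustration.
SPECIES_B_MAP = ["A", "Gene 1", "Gene 2", "B", "C", "D", "Gene 3", "Gene 4", "E"]


def candidate_genes(flank_left, flank_right, other_species_map):
    """Return the genes lying strictly between two flanking markers.

    flank_left and flank_right are markers that bracket the QTL in the
    species of interest and that also appear on the other species' map.
    """
    i = other_species_map.index(flank_left)
    j = other_species_map.index(flank_right)
    lo, hi = sorted((i, j))
    return other_species_map[lo + 1:hi]


# A QTL bracketed by markers A and B in Species A points at these
# Species B genes as candidates.
print(candidate_genes("A", "B", SPECIES_B_MAP))
```

If a rearrangement has occurred between the flanking markers, the interval in the second species no longer corresponds to the QTL region, so a real analysis would need to check conservation of order first.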
Integration
• There are other data types
• Links can be made indirectly across multiple species

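One way to picture indirect links is as a walk over a graph of direct links. The sketch below is illustrative only: the records in `LINKS` are invented, and the function name `linked_records` is hypothetical.

```python
from collections import deque

# Invented direct links between (species, identifier) records.
LINKS = {
    ("pig", "locus1"): [("human", "GENE1")],
    ("human", "GENE1"): [("pig", "locus1"), ("chicken", "EST1")],
    ("chicken", "EST1"): [("human", "GENE1")],
}


def linked_records(start, links):
    """Return every record reachable from start, directly or indirectly."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in links.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


# The pig locus reaches the chicken EST only through the human gene.
print(linked_records(("pig", "locus1"), LINKS))
```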
Pig Fat QTL
Linkage and RH maps
[Figure: panels labelled Fat trait location, Linkage Map, Radiation Hybrid Map]

Human homology
[Figure: Pig panels labelled Fat trait location, Linkage Map, Radiation Hybrid Map, Cytogenetic Map]

Physical clones
[Figure: Pig and Human; BAC1, BAC2 and BAC3 added; panels labelled Fat trait location, Linkage Map, Radiation Hybrid Map, Cytogenetic Map, Physical Mapping]

Chicken EST homologues
[Figure: Pig, Chicken and Human; BAC1, BAC2, BAC3, EST1 and EST2; panels labelled Fat trait location, Linkage Map, Radiation Hybrid Map, Cytogenetic Map, Physical Mapping]

Expression data
[Figure: Pig, Chicken and Human; BAC1, BAC2, BAC3, EST1 and EST2; panels labelled Fat trait location, Linkage Map, Radiation Hybrid Map, Cytogenetic Map, Physical Mapping, Expression Analysis]

Supporting literature
[Figure: Pig, Chicken and Human; BAC1, BAC2, BAC3, EST1 and EST2; panels labelled Fat trait location, Linkage Map, Radiation Hybrid Map, Linked References, Cytogenetic Map, Physical Mapping, Expression Analysis]

Making the links
• Different name, same thing…
  – TGF-B1, TGFB1, Tgfb1, Transforming Growth Factor Beta 1, TGF β1
  – TGF-B1, TGF-B4, TGF-B5

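The spelling variants in the first list can be handled partly by string clean-up and partly by a curated synonym table. The sketch below shows that split. The choice of `TGFB1` as the canonical form and the names `symbol_key`, `SYNONYMS` and `canonical` are assumptions made for illustration.

```python
import re


def symbol_key(name):
    """Reduce a symbol to upper-case letters and digits, mapping beta to B."""
    cleaned = name.replace("β", "B").replace("Β", "B").upper()
    return re.sub(r"[^A-Z0-9]", "", cleaned)


# Curated table for names that no amount of string clean-up will match.
SYNONYMS = {
    symbol_key("Transforming Growth Factor Beta 1"): "TGFB1",
}


def canonical(name):
    """Return the canonical symbol for a name, falling back to its cleaned key."""
    key = symbol_key(name)
    return SYNONYMS.get(key, key)


names = ["TGF-B1", "TGFB1", "Tgfb1", "Transforming Growth Factor Beta 1", "TGF β1"]
print({name: canonical(name) for name in names})
```

String clean-up only helps with punctuation, case and spacing. A case like the second list, where differently numbered symbols are grouped together, would need explicit curated entries rather than normalisation.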
Making the links
• Same name, different thing…
  – There are at least 6 different markers recorded as ‘GH’ within ARKdb-pig
  – Some primer pairs amplify multiple loci, so the same anonymous symbol has been assigned to multiple chromosomal locations

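A simple check for this problem is to group marker records by symbol and flag any symbol that appears at more than one location. The records below are invented and are not taken from ARKdb-pig; `ambiguous_symbols` is a hypothetical helper name.

```python
from collections import defaultdict

# Invented (symbol, chromosome) records, for illustration only.
MARKERS = [("GH", "12"), ("GH", "3"), ("MARKER_X", "1"), ("GH", "12")]


def ambiguous_symbols(records):
    """Map each symbol seen on more than one chromosome to its chromosomes."""
    locations = defaultdict(set)
    for symbol, chromosome in records:
        locations[symbol].add(chromosome)
    return {s: sorted(c) for s, c in locations.items() if len(c) > 1}


print(ambiguous_symbols(MARKERS))
```

A symbol flagged this way cannot serve as a unique key on its own; each record would need its own identifier, with the symbol kept as a label.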
Making the links
• Gene families
  – TGF-B1, TGF-B2, TGF-B3, TGF-B4, TGF-B5
  – Chicken and human have 3, Xenopus has 2

Making the links
• Fat QTLs
  – Abdominal fat pad, shoulder, back, interstitial (marbling)
• Other phenotypes
  – Are chicken wings equivalent to arms or limbs in general?
  – What about Drosophila wings?

Making the links
• Ontologies
  – Graphs of controlled vocabularies
  – Not perfect
  – Current debate in MGED moving towards references to ontologies and collections of ontology-ontology mappings

Making the links
• Ontologies provide a means to define hierarchies of attributes and functions
• We need a way to define relationships between instances of physical ‘things’ rather than their functions or attributes

Making the links
• Need a way to assert…
  – A ‘is an alias of’ B
  – C ‘is contained by’ D
    • Ergo D ‘contains’ C
  – E ‘is homologous/orthologous to’ F
  – G ‘differs from’ G1

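Assertions of this kind can be modelled as subject/relation/object triples, with inverse relations derived automatically. The sketch below is one possible shape, not an existing system. The class name `AssertionStore` is hypothetical, and treating the alias, homology and ‘differs from’ relations as symmetric is an assumption.

```python
# Relations whose inverse has a different name.
INVERSE = {"is contained by": "contains", "contains": "is contained by"}
# Relations assumed here to hold in both directions.
SYMMETRIC = {"is an alias of", "is homologous/orthologous to", "differs from"}


class AssertionStore:
    """Holds (subject, relation, object) triples and derives their inverses."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, relation, obj):
        self.triples.add((subject, relation, obj))
        if relation in INVERSE:
            self.triples.add((obj, INVERSE[relation], subject))
        elif relation in SYMMETRIC:
            self.triples.add((obj, relation, subject))

    def about(self, subject):
        """Return all (relation, object) pairs recorded for a subject."""
        return sorted((r, o) for s, r, o in self.triples if s == subject)


store = AssertionStore()
store.add("A", "is an alias of", "B")
store.add("C", "is contained by", "D")
print(store.about("D"))  # includes the derived 'contains' assertion
```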
Making the links
• More importantly, need to define flexible external data references
  – A ‘has a sequence accession of’ AC012345
  – B ‘is defined at’ http://whatever.com

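An external reference can use the same subject/relation/target shape, with the target being either an accession or a URL. The sketch below is illustrative: the class `ExternalReference` and its `is_url` method are hypothetical names, and the two records simply restate the examples on the slide.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ExternalReference:
    subject: str
    relation: str
    target: str

    def is_url(self):
        """True when the target parses as an http(s) URL with a host."""
        parts = urlparse(self.target)
        return parts.scheme in ("http", "https") and bool(parts.netloc)


refs = [
    ExternalReference("A", "has a sequence accession of", "AC012345"),
    ExternalReference("B", "is defined at", "http://whatever.com"),
]
for ref in refs:
    kind = "URL" if ref.is_url() else "accession"
    print(ref.subject, ref.relation, ref.target, f"({kind})")
```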
Integration
• Technical issues…
  – Systems developed stand-alone
    • Fine for ‘point-and-click’
    • Less good for automated/bulk analysis

Farm Animal Genomics
• Ultimate goal is to identify causative genes
• Comparative genomics/data integration will play a large part
• Need to focus on infrastructure

Integration
• Re-engineer systems
  – Define Application Programming Interfaces (APIs)
  – Define Structured Data Interchange Formats
• Use APIs to integrate data from different systems

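The three steps above can be sketched as a shared interface, a common record format, and a function that merges records from several sources. Everything here is an assumption for illustration: the `MapSource` interface, the `InMemorySource` stand-in, the record fields and the use of JSON as the interchange format are not taken from any of the databases named in the slides.

```python
import json
from typing import Protocol


class MapSource(Protocol):
    """Hypothetical common interface that each database would expose."""

    name: str

    def loci(self, species: str) -> list[dict]:
        ...


class InMemorySource:
    """Stand-in for a real database; holds a few records in memory."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def loci(self, species):
        return [r for r in self._records if r["species"] == species]


def integrate(sources, species):
    """Collect records from every source and emit one JSON document."""
    merged = []
    for source in sources:
        for record in source.loci(species):
            merged.append({**record, "source": source.name})
    return json.dumps(merged, sort_keys=True)


# Invented records, for illustration only.
linkage = InMemorySource(
    "linkage_stub", [{"species": "pig", "locus": "locus1", "map": "linkage"}]
)
rh = InMemorySource(
    "rh_stub",
    [
        {"species": "pig", "locus": "locus1", "map": "radiation hybrid"},
        {"species": "chicken", "locus": "locus2", "map": "radiation hybrid"},
    ],
)
print(integrate([linkage, rh], "pig"))
```

Because the caller only depends on the `loci` method, a new resource can be added by writing one adapter rather than changing the analysis code.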
[Figure: a User and four separate resources: resSpecies, ARKdb, Radiation Hybrid Database, Diversity Databases]

[Figure: Novel Analyses and the User, with the four resources: resSpecies, ARKdb, Radiation Hybrid Database, Diversity Databases]

[Figure: the User and the four resources (resSpecies, ARKdb, Radiation Hybrid Database, Diversity Databases), each with its own Interface]
[Figure: an Application Programming Interface layer over resSpecies, ARKdb, Radiation Hybrid Database and Diversity Databases]

[Figure: User and Novel Analyses, with an Application Programming Interface layer over resSpecies, ARKdb, Radiation Hybrid Database and Diversity Databases]

[Figure: User and Novel Analyses, with an Application Programming Interface layer over resSpecies, ARKdb, Radiation Hybrid Database, Diversity Databases, Array Expression Data, and Sequence & Homology]

[Figure: ‘The GRID!’ shown with a question mark, with the Application Programming Interface layer over resSpecies, ARKdb, Radiation Hybrid Database, Diversity Databases, Array Expression Data, and Sequence & Homology]

?