MRC Slide Format

Questions
• Are we ‘just’ E. coli, except more so?
• Where do new genes come from?
• Do all genes evolve at the same rate?
• Do all tissues & organs evolve at the same rate?
• Where do we fit in the tree of life?
• What specifies the differences between us and rodents, or us and chimps?
• What specifies the elevated complexity of us versus other animals?
• Can we understand sequence variation among humans?
• How can gene function contribute to behaviour?
Where do new genes come from?
‘New Domains’
23 of 94 InterPro families:
Defense and Immunity
e.g. IL-1, interferons, defensins
17 of 94 InterPro families:
Peripheral nervous system
e.g. Leptin, prion, ependymin
4 of 94 InterPro families:
Bone and cartilage
GLA, LINK, Calcitonin, osteopontin
3 of 94 InterPro families:
Lactation
Caseins (α, β, κ), somatotropin
2 of 94 InterPro families:
Vascular homeostasis
Natriuretic peptide, endothelin
5 of 94 InterPro families:
Dietary homeostasis
Glucagon, bombesin, colipase, gastrin, IGF-BP
18 of 94 InterPro families:
Other plasma factors
Uteroglobin, FN2, RNase A, GM-CSF etc.
Stepping through structure and sequence space: the FGF / IL-1 beta-trefoil story
[Figure: FGFs, interleukin-1s and beta-trefoils, with relationships labelled ‘Structure & Sequence’ and ‘Sequence’.]
J Mol Biol. 2000 Oct 6;302(5):1041-7.
EXTRACELLULAR (CELL-CELL SIGNALLING):
FGF: VERT., INVERT.
IL-1a: VERT.
INTRACELLULAR (ACTIN-BINDING PROTEINS):
Fascin: VERT., INVERT., FUNGI
Hisactophilin: Dictyostelium
J.Mol.Biol. 302, 1041-1047
Gene Genesis
• Positive selection often leads to the
erosion of sequence similarity
• If this erosion is extensive, homology
cannot be inferred from database search
strategies.
• If, concomitantly, there is positive selection
for duplication of these genes, this gives
the appearance of a new gene/domain
family that lacks antecedents.
Copley, Goodstadt, Ponting
Current Opinion in Genetics & Development
Volume 13, December 2003, Pages 623-628
Conservation and Selection over Time
[Figure: conservation (% identity) and % of orthologs found in fugu, plotted against time of divergence (Myr, 0 to 450) for the mouse-rat, human-mouse and human-fugu comparisons.]
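The slide does not say how % identity was calculated. As a minimal sketch, assuming two sequences that are already aligned and ignoring any column containing a gap (one common convention among several), it could be computed as below. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes seq_a and seq_b are rows of the same pairwise alignment,
    so they must have equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns entirely
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments, for illustration only.
    print(percent_identity("MKTAYIAK", "MKTAHIAK"))
```

Other conventions divide by the full alignment length or by the shorter sequence length, so values from different sources are only comparable when the denominator is stated.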
Do all tissues & organs evolve at the same rate?
[Figure: percentage of sequences (0% to 100%) against KA/KS (0.00 to 0.40) for cytoplasmic, nuclear and secreted domains.]
Need to investigate expression of
tissue-specific genes.
PNAS | April 2, 2002 | vol. 99 | no. 7 | 4465-4470
Genetics
Large-scale analysis of the human and
mouse transcriptomes
Andrew I. Su et al.
http://expression.gnf.org
• Tissue Specificity of a Gene: TS
• A gene's fractional expression in a tissue relative
to the sum of its expression in all tissues
• maxTS: an indicator of Tissue Specificity.
• Divide data into 5 sets:
  (1) maxTS ≤ 0.1;
  (2) 0.1 < maxTS ≤ 0.2;
  (3) 0.2 < maxTS ≤ 0.3;
  (4) 0.3 < maxTS ≤ 0.4;
  (5) maxTS > 0.4
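The TS definition and the five maxTS sets can be sketched in a few lines. This is an illustration under assumptions, not the analysis code: expression values are taken to be non-negative numbers per tissue, and the function names (`tissue_specificity`, `max_ts`, `max_ts_set`) and example tissues are hypothetical.

```python
from typing import Dict, Mapping


def tissue_specificity(expression: Mapping[str, float]) -> Dict[str, float]:
    """TS per tissue: expression in that tissue / summed expression in all tissues."""
    total = sum(expression.values())
    if total <= 0:
        raise ValueError("gene has no measurable expression")
    return {tissue: value / total for tissue, value in expression.items()}


def max_ts(expression: Mapping[str, float]) -> float:
    """maxTS: the largest TS value of the gene across tissues."""
    return max(tissue_specificity(expression).values())


def max_ts_set(value: float) -> int:
    """Return the set number (1 to 5) using the boundaries listed on the slide."""
    upper_bounds = (0.1, 0.2, 0.3, 0.4)
    for set_number, bound in enumerate(upper_bounds, start=1):
        if value <= bound:
            return set_number
    return 5


if __name__ == "__main__":
    # Hypothetical expression levels, for illustration only.
    liver_gene = {"liver": 80.0, "brain": 10.0, "kidney": 10.0}
    print(max_ts(liver_gene), max_ts_set(max_ts(liver_gene)))
```

A gene expressed evenly across many tissues lands in set 1, while a gene dominated by a single tissue lands in set 5.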
Protein secretion accounts for much of the elevation in KA/KS for Tissue-Specific genes.
[Figure: tissues ranked by evolutionary rate, from slow (KA/KS = 0.04) to fast (KA/KS = 0.13), and by protein secretion (%), from low (12.2%) to high (50%), for the gene groups All, Non-secreted, Secreted, Non-disease and Disease. Tissue labels: thymus, trachea, blood, brain, liver, testis, kidney. Credit: Eitan Winter.]
Housekeeping genes are under-represented among disease genes
[Figure: tissues ranked by human disease (%), from low (5.0%) to high (39%), for the gene groups All, Non-secreted, Secreted, Non-disease and Disease. Tissue labels: trachea, blood, brain, liver, testis, kidney. Credit: Eitan Winter.]
Tissue-specific genes’ Ks
[Figure: median Ks-value (0 to 0.8) of tissue-specific genes, by tissue.]
Winter et al. Genome Research 14:54-61, 2004
Tissue/Organ Evolution
• Mammalian tissues & organs are evolving at different
rates, according to the genes that are specifically
expressed in them.
• Perhaps this is not too surprising since there are
mammalian-specific tissues & organs!
• Tissue-specific genes are ‘mutating’ at different rates,
possibly due to transcription-coupled repair in the
germline.
• Mendelian disease acts non-uniformly among genes and
tissues.
Human-Mouse Orthologues’ Expression Profile Correlations
[Figure: histogram of the percentage of pairs (0% to 18%) against Pearson correlation (-1 to 1), comparing orthologue pairs with random pairs. Credit: Eitan Winter.]
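The comparison on this slide, orthologue pairs against random pairs, can be sketched as follows. This is an illustration only. It assumes each gene has an expression profile measured over the same tissues in the same order in both species; the gene names, the profile values and the function names are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def orthologue_correlations(
    human: Dict[str, List[float]],
    mouse: Dict[str, List[float]],
    orthologues: Sequence[Tuple[str, str]],
) -> List[float]:
    """Correlation of tissue expression profiles for each (human, mouse) orthologue pair."""
    return [pearson(human[h], mouse[m]) for h, m in orthologues]


def random_pair_correlations(
    human: Dict[str, List[float]],
    mouse: Dict[str, List[float]],
    n_pairs: int,
    seed: int = 0,
) -> List[float]:
    """Background distribution: correlations for randomly paired human and mouse genes."""
    rng = random.Random(seed)
    human_ids = sorted(human)
    mouse_ids = sorted(mouse)
    return [
        pearson(human[rng.choice(human_ids)], mouse[rng.choice(mouse_ids)])
        for _ in range(n_pairs)
    ]


if __name__ == "__main__":
    # Hypothetical profiles over four shared tissues, for illustration only.
    human = {"hGeneA": [9.0, 1.0, 1.0, 2.0], "hGeneB": [1.0, 8.0, 2.0, 1.0]}
    mouse = {"mGeneA": [8.0, 2.0, 1.0, 1.0], "mGeneB": [2.0, 9.0, 1.0, 2.0]}
    pairs = [("hGeneA", "mGeneA"), ("hGeneB", "mGeneB")]
    print(orthologue_correlations(human, mouse, pairs))
    print(random_pair_correlations(human, mouse, n_pairs=5))
```

Binning both lists of coefficients into intervals of 0.1 and plotting the percentage of pairs per bin would give a histogram of the kind described on the slide.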
Pan troglodytes genome
• 4X coverage
• average nucleotide divergence of just
1.2%
How do the 2 gene
complements differ?
• Gene duplications observed in the human
genome.
• Lack of N-glycolylneuraminic acid (Neu5Gc) in
humans due to mutation in CMP-sialic acid
hydroxylase (Chou et al. PNAS 95(20):11751-6.)
• Mutation in a Siglec (sialic acid receptor)
(Angata et al. JBC 276:40282-7)
How do the Great Apes differ from us?
• Rare HIV progression to AIDS
• Resistant to malarial infection
• Menopause rare
• Coronary atherosclerosis rare
• Epithelial cancers rare
• Alzheimer’s disease pathology incomplete
FOXP2
• A point mutation in FOXP2 co-segregates with a disorder in a family in which half of the members have impaired linguistic and grammatical abilities
• Human FOXP2 contains missense mutations and a pattern of nucleotide polymorphism, which strongly suggest that this gene has been the target of selection during recent human evolution. Enard et al. Nature 418, 869 - 872
Figure 2 Silent and replacement nucleotide substitutions mapped on a phylogeny of primates. Bars represent nucleotide changes. Grey bars indicate amino acid changes. (Figure annotation: P < 0.001)
Loss of Olfactory Receptor Genes Coincides with the Acquisition of Full
Trichromatic Vision in Primates.
PLoS Biol. 2004 Jan;2(1):E5. Epub 2004 Jan 20 Gilad et al.
Figure 2. The Proportion of OR Pseudogenes in 20 Species
Table 1. Biological processes showing the strongest evidence for positive selection. The top panel includes the categories showing the greatest acceleration in human lineage, and the bottom panel includes categories with the greatest acceleration in the chimp lineage.
Clark et al. Inferring Nonneutral Evolution from Human-Chimp-Mouse Orthologous Gene Trios. Science (2003) 302: 1960-1963
Biological process                                    Number of genes*   PMW (human/Model 2)*   PMW (chimp/Model 2)*
Categories showing the greatest acceleration in human lineage
Olfaction                                             48                 0                      0.9184
Sensory perception                                    146 (98)           0 (0.026)              0.9691 (0.9079)
Cell surface receptor-mediated signal transduction    505 (464)          0 (0.0386)             0.199 (0.0864)
Chemosensory perception                               54 (6)             0 (0.1157)             0.9365 (0.7289)
Nuclear transport                                     26                 0.0003                 0.2001
G protein-mediated signaling                          252 (211)          0.0003 (0.1205)        0.2526 (0.0773)
Signal transduction                                   1030 (989)         0.0004 (0.0255)        0.0276 (0.0092)
Cell adhesion                                         132                0.0136                 0.3718
Ion transport                                         237                0.0247                 0.8025
Intracellular protein traffic                         278                0.0257                 0.8099
Transport                                             391                0.0326                 0.7199
Metabolism of cyclic nucleotides                      20                 0.0408                 0.1324
Amino acid metabolism                                 78                 0.0454                 0.0075
Cation transport                                      179                0.0458                 0.8486
Developmental processes                               542                0.0493                 0.2322
Hearing                                               21                 0.0494                 0.9634
Categories with the greatest acceleration in the chimp lineage
Signal transduction                                   1030 (989)         0.0004 (0.0255)        0.0276 (0.0092)
Amino acid metabolism                                 78                 0.0454                 0.0075
Amino acid transport                                  23                 0.1015                 0.0102
Cell proliferation and differentiation                82                 0.3116                 0.0182
Cell structure                                        174                0.2633                 0.0233
Oncogenesis                                           201                0.3132                 0.0267
Cell structure and motility                           239                0.2208                 0.0299
Purine metabolism                                     35                 0.9127                 0.0423
Skeletal development                                  44                 0.2876                 0.0438
Mesoderm development                                  168                0.5813                 0.0439
Other oncogenesis                                     39                 0.2777                 0.0469
DNA repair                                            49                 0.9363                 0.0477
* The number of genes and the PMW values excluding olfactory receptor genes are shown in parentheses.
Table 2. Molecular functions showing the strongest evidence for positive selection. The table includes only human-accelerated categories, because the only categories accelerated in the chimp lineage are chaperones (P = 0.0124), cell adhesion molecules (P = 0.0220), and extracellular matrix (P = 0.0333).
Molecular function                         Number of genes*   PMW (human/Model 2)*   PMW (chimp/Model 2)*
G protein coupled receptor                 199 (153)          0 (0.2533)             0.8689 (0.6776)
G protein modulator                        62                 0.0008                 0.3776
Receptor                                   448                0.0030                 0.9798
Ion channel                                134                0.0043                 0.8993
Extracellular matrix                       97 (95)            0.0120 (0.0178)        0.1482 (0.1593)
Other G protein modulator                  32                 0.0149                 0.4441
Extracellular matrix glycoprotein          44 (42)            0.0178 (0.0269)        0.1579 (0.1765)
Voltage-gated ion channel                  62                 0.0219                 0.6692
Other hydrolase                            95                 0.0260                 0.4823
Oxygenase                                  46                 0.0303                 0.4792
Protein kinase receptor                    37                 0.0314                 0.6911
Transporter                                214                0.0338                 0.1836
Ligand-gated ion channel                   45                 0.0405                 0.9503
Microtubule binding motor protein          22                 0.0421                 0.6385
Microtubule family cytoskeletal protein    54                 0.0467                 0.2815
* The number of genes and the PMW values excluding olfactory receptor genes are shown in parentheses.
• “Smell, Hearing Genes Differ between
Chimps and Humans”
Genome News Network January 9 2004
• “The 2.5Gb mouse genome sequence
reveals about 30,000 genes, with 99%
having direct counterparts in humans.”
Nature editorial 5 December 2002.
Questions
• Are we ‘just’ E. coli, except more so? Not at all.
• Where do new genes come from? Old genes!
• Do all genes evolve at the same rate? No.
• Do all tissues & organs evolve at the same rate? No.
• Where do we fit in the tree of life? Primates!
• What specifies the differences between us and rodents, or us and chimps? Jury is out. Duplicates?
• What specifies the elevated complexity of us versus other animals? Jury is out.
• Can we understand sequence variation among humans? Not yet – Lon’s lecture?
• How can gene function contribute to behaviour? Seen in rodents, but not yet in primates.
Near Future
Genome Sequencing Capacity (NHGRI)
YEAR    7X genome (3 Gb)    1X genome (3 Gb)
2003    2.5 genomes         18 genomes
2004    4.9 genomes         34 genomes
2005    6.2 genomes         43 genomes
2006    8.4 genomes         59 genomes
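The two capacity columns on this slide are related by simple arithmetic: a fixed amount of raw sequence covers seven times as many 3 Gb genomes at 1X as at 7X. A minimal sketch, using the 7X figures from the slide and hypothetical function names:

```python
GENOME_SIZE_GB = 3.0  # genome size stated on the slide

# Genomes per year at 7X coverage, as listed on the slide.
CAPACITY_7X = {2003: 2.5, 2004: 4.9, 2005: 6.2, 2006: 8.4}


def raw_sequence_gb(genomes: float, coverage: float,
                    genome_size_gb: float = GENOME_SIZE_GB) -> float:
    """Raw sequence (Gb) needed to cover `genomes` genomes at the given coverage."""
    return genomes * coverage * genome_size_gb


def equivalent_genomes(genomes: float, from_coverage: float,
                       to_coverage: float) -> float:
    """Number of genomes the same raw sequence would cover at a different depth."""
    return genomes * from_coverage / to_coverage


if __name__ == "__main__":
    for year, genomes in sorted(CAPACITY_7X.items()):
        at_1x = equivalent_genomes(genomes, from_coverage=7, to_coverage=1)
        print(year, raw_sequence_gb(genomes, coverage=7), at_1x)
```

The computed 1X equivalents agree with the slide's 1X column to within rounding.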
Sampling the placental mammal phylogeny
(Murphy et al. Science 2001 294: 2348-51)
MRC Functional Genetics Unit, Oxford
Leo Goodstadt
Richard Emes
Eitan Winter
Steve Rice
Scott Beatson
Nick Dickens
Caleb Webber
Michael Elkaim
Jose Duarte
Zoe Birtle
Tania Oh
Ensembl (Ewan Birney, Michele Clamp, Abel Ureta-Vidal);
Richard Copley (WTCHG, Oxford); Ziheng Yang (UCL);
The Human, Mouse and Rat Genome Sequencing Consortia; UCSC
Bibliography
Human Genome Papers:
Lander et al. Nature (2001) 409, 860-921
Venter et al. Science (2001) 291, 1304-1351.
Mouse Genome Paper:
Waterston et al. Nature (2002) 420, 520-62.
Rat Genome Paper: submitted.
Comparative genomics & evolutionary rates:
Hardison et al. Genome Res. (2003) 13, 13-26.
Adaptive evolution of genomes:
Emes et al. Hum Mol Genet. (2003) 12, 701-9
Wolfe & Li Nat Genet. (2003) 33 Suppl: 255-65