Human Evolutionary Genomics:
Lessons from DUF1220 Protein Domains, Cognitive
Disease and Human Brain Evolution
James M. Sikela, Ph.D.
Department of Biochemistry & Molecular Genetics
Human Medical Genetics and Neuroscience Programs
University of Colorado School of Medicine
Advanced Genome Analysis Course
University of Colorado School of Medicine
March 5, 2015
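Primate Evolution
Approximate divergence times (MYA = million years ago):
• B/C (Bonobo / Chimpanzee) = ~2 MYA
• C/H (Chimpanzee / Human) = ~5 MYA
• HC/G (Human-Chimpanzee / Gorilla) = ~8 MYA
• HCG/O (Human-Chimpanzee-Gorilla / Orangutan) = ~13 MYA
• HCG/O/Gib (great apes / Gibbons) = ~20 MYA
• Hom/OWM (Hominoids / Old World Monkeys) = ~25 MYA
• HomOWM/NW (Hominoids and Old World Monkeys / New World Monkeys) = ~40 MYA
Lineages shown: Human, Gorilla, Orangutan, Gibbons, Old World Monkeys (e.g. baboon, rhesus, etc.), New World Monkeys (e.g. squirrel monkey, spider monkey)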
Chimpanzee
Gorilla
Bonobo
Orangutan
More Primates!
---- some things have changed!
Human Characteristics
• Body shape and thorax
• Cranial properties (brain case and face)
• Small canine teeth
• Skull balanced upright on vertebral column
• Reduced hair cover
• Enhanced sweating
• Dimensions of the pelvis
• Elongated thumb and shortened fingers
• Relative limb length
• Neocortex expansion
• Enhanced language & cognition
• Advanced tool making
modified from S. Carroll, Nature, 2005
Reports of “human-specific” genes
• FOXP2
– Mutated in family with language disability
• ASPM/MCPH
– Mutated in individuals with microcephaly
• HAR1F
– Gene sequence highly changed in humans
• SRGAP2 (neuronal migration?)
– Partial human-specific gene duplication
• DUF1220 protein domains
– Highly increased in copy number in humans; expressed in important brain regions
HAR1F Gene
Marques-Bonet, et al Ann Rev Genomics 2009
Molecular mechanisms driving genome evolution
• Single nucleotide substitutions
- change gene expression
- change gene structure
• Genome rearrangement
• Gene/segmental duplication
- copy number change
- value of redundancy
Gene Duplication & Evolutionary Change
• “There is now ample evidence that gene duplication is the most important mechanism for generating new genes and new biochemical processes that have facilitated the evolution of complex organisms from primitive ones.”
- W. H. Li in Molecular Evolution, 1997
• “Exceptional duplicated regions underlie exceptional biology”
- Evan Eichler, Genome Research, 2001
Interhominoid cDNA Array-Based Comparative Genomic Hybridization (arrayCGH)
Fig 1. Measuring genomic DNA copy number alteration using cDNA microarrays (array CGH). Fluorescence ratios are depicted in a pseudocolor scale, such that red indicates increased, and green decreased, gene copy number in the test (right) compared to reference sample (left).
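The test/reference comparison described in the Fig 1 caption can be sketched in Python. This is a minimal illustration rather than the analysis pipeline used in the cited studies: the function name `classify_ratio`, the signal values, and the cutoffs of 0.5 and 2 are assumptions chosen for the example.

```python
import math


def classify_ratio(test_signal, reference_signal, loss_cutoff=0.5, gain_cutoff=2.0):
    """Return (ratio, log2 ratio, call) for one array CGH spot.

    The cutoffs are illustrative assumptions, not values taken from a protocol.
    """
    if test_signal <= 0 or reference_signal <= 0:
        raise ValueError("signals must be positive")
    ratio = test_signal / reference_signal
    log2_ratio = math.log2(ratio)
    if ratio >= gain_cutoff:
        call = "gain"       # would be drawn toward red
    elif ratio <= loss_cutoff:
        call = "loss"       # would be drawn toward green
    else:
        call = "no change"
    return ratio, log2_ratio, call


# Hypothetical fluorescence intensities (test, reference) for three spots.
for test, ref in [(400.0, 100.0), (100.0, 100.0), (50.0, 200.0)]:
    print(classify_ratio(test, ref))
```

Working on the log2 scale makes gains and losses symmetric around zero: a twofold gain is +1 and a twofold loss is -1.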
Human & Great Ape Genes Showing Lineage-Specific Copy Number Gain/Loss
Fortna, et al, PLoS Biol. 2004
BAC-FISH with clone containing SLC35F5 gene
[Figure panels: Human (H), Bonobo (B), Chimp (C), Gorilla (G), Orang (O). Clones labeled: IMAGE:814107, IMAGE:261219, IMAGE:665496.]
PLA2G4B/SPTBN5 gene copy number increases in
African great apes
[Figure: genome-wide map of genes showing lineage-specific copy number gain or loss, plotted by position (Mb) and cytogenetic band along the human chromosomes, with selected genes labeled. Species: Human (Homo sapiens), Bonobo (Pan paniscus), Chimpanzee (Pan troglodytes), Gorilla (Gorilla gorilla), Orangutan (Pongo pygmaeus). Color scale, Test/Reference ratio: ≤0.5, 1, ≥2.]
Human lineage-specific amplification of AQP7
[Figure: aCGH profile along human chromosome 9 (bands 9p22 and 9q22 labeled; AQP7 marked) for Human, Bonobo, Chimpanzee, Gorilla, Orangutan, Gibbon, Macaque, Baboon, Marmoset, Lemur. Color scale, Test/Reference Ratio: < 0.4, 1, > 2.5.]
[Figure: AQP7 by species, comparing aCGH log2 Fluorescent Ratio (aCGH) with Quantitative Real Time PCR Copy Number (Q-PCR); r2=0.9532.]
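The r2 value reported for the aCGH versus Q-PCR comparison is a squared correlation coefficient. A minimal sketch of that calculation follows. The function name `r_squared` is hypothetical and the input numbers are invented placeholders, not the measurements behind the figure.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


# Invented placeholder values: one aCGH log2 ratio and one Q-PCR estimate per species.
acgh_log2 = [1.2, -0.1, 0.0, -0.3, -0.9]
qpcr_value = [5.0, 2.1, 2.0, 1.8, 1.1]
print(round(r_squared(acgh_log2, qpcr_value), 4))
```

A value near 1 means the two measurement methods rank and scale the samples almost identically.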
SMA: Chr5q13
Williams Beuren: Chr7q11.2
Prader-Willi: Chr15q11.1
DiGeorge: Chr22q11
BLAT-Predicted Intronless vs. Intron-Containing HLS Gene Copies in Human, Chimp, and Macaque Genomes
[Figure: bar chart. Y axis: Number of BLAT Hits (0 to 50). X axis: IMAGE Clone. Series: Human intron-containing, Chimp intron-containing, Macaque intron-containing, Intronless.]
DUF1220 Repeat Unit
Popesco, et al, Science 2006
Synonymous and Nonsynonymous Differences Between Aligned Sequences
Sequence 1: ACT (Thr)  TTT (Phe)
Sequence 2: ACC (Thr)  GTT (Val)
Ks = Average number of synonymous changes per synonymous site
Ka = Average number of nonsynonymous changes per nonsynonymous site
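The aligned-codon comparison on this slide can be sketched in Python. This is a minimal illustration, not the method used in the cited work: it only labels each differing codon pair as synonymous or nonsynonymous under the standard genetic code, and it does not compute Ka or Ks, which also need site counts and a correction for multiple substitutions. The function name `classify_differences` is hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_differences(seq1, seq2):
    """List (codon1, codon2, kind) for every codon that differs between two aligned coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    calls = []
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if c1 == c2:
            continue
        same_amino_acid = CODON_TABLE[c1] == CODON_TABLE[c2]
        calls.append((c1, c2, "synonymous" if same_amino_acid else "nonsynonymous"))
    return calls


# The two codon pairs from the slide: ACT/ACC (Thr/Thr) and TTT/GTT (Phe/Val).
print(classify_differences("ACTTTT", "ACCGTT"))
# [('ACT', 'ACC', 'synonymous'), ('TTT', 'GTT', 'nonsynonymous')]
```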
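Nonsynonymous and Synonymous Sites in Codons
Thr (ACT): position 1 = N, position 2 = N, position 3 = S
Phe (TTT): position 1 = N, position 2 = N, position 3 = 1/3 S, 2/3 N
(S = synonymous site, N = nonsynonymous site)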
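The per-position site counts for Thr (ACT) and Phe (TTT) can be reproduced with a short Python sketch. It assumes the standard genetic code and a simple rule: at each codon position, the synonymous fraction is the share of the three possible single-base changes that leave the amino acid unchanged. Changes to stop codons are counted as nonsynonymous here, a simplification that some published methods handle differently. The names `synonymous_fraction` and `site_counts` are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_fraction(codon, position):
    """Fraction of the 3 possible base changes at `position` (0-2) that keep the amino acid."""
    original = CODON_TABLE[codon]
    unchanged = 0
    for base in BASES:
        if base == codon[position]:
            continue
        mutant = codon[:position] + base + codon[position + 1:]
        if CODON_TABLE[mutant] == original:
            unchanged += 1
    return unchanged / 3


def site_counts(codon):
    """Return (synonymous sites S, nonsynonymous sites N) for one codon; S + N = 3."""
    s = sum(synonymous_fraction(codon, pos) for pos in range(3))
    return s, 3 - s


for codon in ("ACT", "TTT"):
    per_position = [synonymous_fraction(codon, pos) for pos in range(3)]
    print(codon, per_position, site_counts(codon))
```

For ACT every third-position change still encodes Thr, so the third position is fully synonymous. For TTT only the change to TTC keeps Phe, which gives the 1/3 S and 2/3 N split shown on the slide.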
What will be the Ka/Ks values for most proteins?
Ka/Ks Distribution
[Figure: histogram. X axis: Ka/Ks value (0.00 to 2.00). Y axis: Number of genes per bin. Series: Intra-primate comparison, mean: 0.91; Rodent-primate comparison, mean: 0.61.]
Genome        PDE4DIP   Total DUF1220   NBPF Genes
Human         2         272             23
Chimp         3         125             19
Gorilla       3         99              15
Orangutan     4         92              11
Gibbon        3         53              10
Macaque       1         35              10
Marmoset      1         31              11
Mouse Lemur   1         2               1
Bushbaby      1         3               2
Tarsier       1         1               0
Rabbit        1         8               3
Pika          1         1               0
Mouse         1         1               0
Rat           1         1               0
Guinea Pig    1         1               1
Squirrel      1         1               1
Tree Shrew    1         4               3
Cow           1         7               3
Dolphin       1         4               1
Pig           1         3               1
Horse         1         8               3
Dog           1         3               1
Panda         1         2               1
Cat           1         3               2
Megabat       1         1               0
Microbat      1         1               0
Hedgehog      1         1               0
Shrew         1         1               0
• DUF1220 shows the greatest human-specific copy number expansion of any protein coding sequence in the human genome
• Shows signs of positive selection
• Human increase is primarily due to domain amplification (rather than gene duplication)
O’Bleness et al. Evolutionary History and Genome Organization of DUF1220 Protein Domains. G3 (Bethesda). Sept (2012).
A Chronology of DUF1220 Domain Evolution
* Branch points in millions of years.
O’Bleness, et al, G3: Genes, Genomes, Genetics, 2012
Consensus Tree of Evolutionary Relationships of 429
Primate DUF1220 Sequences
DUF1220 Duplication and Protein Domain Classifications
Ancestral DUF1220 found in human PDE4DIP
NBPF-type DUF1220 Domains: CON1, CON2, HLS1, HLS2, HLS3, CON3
Clades CON1-3 are conserved DUF1220 sequences among primates
Clades HLS1-3 refer to a three-DUF1220 domain unit that has expanded only in the human lineage
NBPF12: CON1 CON2 [HLS1 HLS2 HLS3] [HLS1 HLS2 HLS3] CON3
(each bracketed group is one DUF1220 triplet)
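The NBPF12 domain order on this slide can be treated as a simple list of clade labels, and the HLS1-HLS2-HLS3 triplets counted directly. The sketch below is only an illustration of that idea: the function name `count_triplets` is hypothetical, and real domain annotation requires sequence-based classification rather than label matching.

```python
def count_triplets(domains, triplet=("HLS1", "HLS2", "HLS3")):
    """Count non-overlapping, in-order occurrences of `triplet` in a list of domain labels."""
    count = 0
    i = 0
    size = len(triplet)
    while i + size <= len(domains):
        if tuple(domains[i:i + size]) == triplet:
            count += 1
            i += size   # skip past the matched triplet
        else:
            i += 1
    return count


# Domain order for NBPF12 as drawn on the slide.
NBPF12 = ["CON1", "CON2", "HLS1", "HLS2", "HLS3", "HLS1", "HLS2", "HLS3", "CON3"]
print(count_triplets(NBPF12))  # 2
```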
DUF1220/NBPF Genome Organization in Chimp & Human
Chimpanzee
Human
O’Bleness, et al, G3: Genes, Genomes, Genetics, 2012
[Figure: Western blot, panels A and B. Size markers: 50, 37.5, 25. Labels: 36kDa, GAPDH.]
Western analysis of Normal Adult Human Brain regions with DUF1220 antibody: Total protein lysates (50 µg) from normal adult human brain regions (male and female; ages ranging from 22-82 yrs) were electrophoresed on 4-20% denaturing SDS-PAGE gels and blotted with: A) DUF1220 affinity purified antibody B) GAPDH.
Popesco, et al Science 2006
DUF1220 Protein Expression in Adult Human Brain
[Figure: panels A-F. Labels: ml, P, den, igl.]
DUF1220 antibody staining in the human cerebellum (77yr old white female). A) DUF1220 affinity
purified antibody; B) Double labeling with DUF1220 affinity purified antibody and Neurofilament
160kDa; C) same as B-higher magnification; D) Double labeling with DUF1220 affinity purified antibody
and GFAP; E) DUF1220 preimmune and GFAP; F) DUF1220 Adsorption control. Blue labeling represents
DAPI for nuclear staining.
Popesco et al Science 2006
[Figure: Hippocampus, CA regions (30yr old female). Panels: DUF1220 affinity purified antibody + GFAP + DAPI; GFAP; DUF1220 affinity purified antibody; DAPI.]
[Figure: Cortical regions, Hippocampus (30yr old female). Panels: DUF1220 affinity purified antibody + GFAP + DAPI; GFAP; DUF1220 affinity purified antibody; DAPI.]
Noteworthy DUF1220 Copy Number Totals
                                                             DUF1220 Copies
Total in Human Genome                                        272
Total in Chimp Genome (CLS)                                  125 (23)
Total in Last Common Ancestor of Homo/Pan                    102
Total of Newly Added Copies in Human Lineage                 167
Total Human-Specific Copies Added via Domain Amplification   146
Total Human-Specific Copies Added via Gene Duplication       21
Avg. Number Added to Human Lineage Every Million Years       28
O’Bleness, et al, G3: Genes, Genomes, Genetics, 2012
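A few of these totals can be cross-checked with simple arithmetic. The Python sketch below uses only the numbers on this slide. The "implied span" is just the quotient of two slide values and is an inference made here, not a figure stated in the source.

```python
# Values copied from the slide.
human_total = 272
chimp_total = 125
added_by_domain_amplification = 146
added_by_gene_duplication = 21
added_per_million_years = 28

# Newly added human-lineage copies are the sum of the two mechanisms.
newly_added = added_by_domain_amplification + added_by_gene_duplication
print(newly_added)  # 167

# Share of the human-specific increase that comes from domain amplification.
amplification_share = added_by_domain_amplification / newly_added
print(round(amplification_share, 2))  # 0.87

# Time span implied by the stated average rate (an inference, not a source value).
implied_span_my = newly_added / added_per_million_years
print(round(implied_span_my, 1))  # 6.0

# Human-to-chimp ratio of total DUF1220 copies.
print(round(human_total / chimp_total, 2))  # 2.18
```

The roughly 87% share is what supports the statement that the human increase comes mainly from domain amplification rather than gene duplication.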
Sequences Encoding DUF1220 Domains
• Show the largest human lineage-specific increase in copy number of any protein coding region in the genome (160 HLS; >270 total in haploid genome)
• Show signs of positive selection, especially in primates
• In brain, are expressed only in neurons
• Are highly amplified in human, reduced in great apes, further reduced in monkeys, single- or low-copy in prosimians and non-primate mammals, and absent in non-mammals
• Have increased in human primarily by domain hyperamplification involving the DUF1220 triplet
Key Human-Specific Evolutionary Features of 1q21.1 Region
O’Bleness, et al, Nat Rev Genet, 2012
1q21.1 Deletions linked to Microcephaly*
1q21.1 Duplications linked to Macrocephaly*
• Recurrent Reciprocal 1q21.1 Deletions and Duplications Associated with Microcephaly or Macrocephaly and Developmental and Behavioral Disorders
Brunetti-Pierri, et al, Nature Genetics 2008
• Recurrent Rearrangements of Chromosome 1q21.1 and Variable Pediatric Phenotypes
Mefford, et al, N. Engl. J. Med. 2008
• *Implies the copy number (dosage) of one or more genes in this region is influencing brain size in a dose-dependent manner
• These CNVs encompass or are immediately flanked by DUF1220 sequences (Dumas & Sikela, Cold Spring Harbor Symposium Quant. Biol., 2009)
DUF1220/NBPF Sequences & Recurrent Disease-associated 1q21.1 CNVs
Human Evolutionary Genomics:
Relevant Reviews
Sikela, J.M. (2006). The Jewels of Our Genome: The Search for the Genomic Changes Underlying the Evolutionarily Unique Capacities of the Human Brain. PLoS Genet. 2, e80.
O’Bleness, M.S., Searles, V., Varki, A., Gagneux, P., and Sikela, J.M. (2012). Evolution of genetic and genomic features unique to the human lineage. Nat. Rev. Genet., 13, 853-866.