Known sequence or Node – Pop1

Human
Mouse
D. melanogaster
C. elegans
C. briggsae
A. nidulans
S. pombe
S. cerevisiae
C. albicans
C. parvum (partial)
P. falciparum (partial)
Node A
Node B
Node C
Node D
Node E
Node F
Node G
S. pombe (PAML) S. pombe (FastML)
Score E-value Score E-value
84 2e-16
59 4e-10
72 9e-13
70 3e-13
81 1e-15
60 3e-10
75 8e-14
46 3e-06
64 3e-10
46 3e-06
175 2e-41
89 4e-19
1362 0.0
171 7e-44
162 6e-40
81 1e-16
168 5e-42
75 7e-15
62 3e-10
53 2e-08
54 1e-07
94 2e-19
70 3e-13
117 3e-26
74 1e-14
132 7e-31
47 2e-06
140 4e-33
95 9e-21
177 2e-44
118 6e-28
206 4e-53
120 2e-28
123 4e-28
91 1e-19
Target Genomes
Microsporidia
Score E-value
61 3e-10
58 3e-09
50 9e-07
36 0.010
38 0.003
33 0.099
34 0.051
34 0.022
46 5e-06
38 0.006
39 0.002
41 7e-04
46 1e-05
41 7e-04
41 7e-04
37 0.010
Entamoeba
Score E-value
40 5e-04
-
Giardia
Score E-value
35 0.13
37 0.045
39 0.002
38 0.008
43 8e-04
45 3e-04
44 4e-04
44 3e-04
41 0.008
Supplementary Table 1: Results from BLAST searches of the genome databases of S. pombe, Entamoeba,
Microsporidia and Giardia, using known Pop1 sequences and ancestral sequences as queries. Node names are as indicated in Figure 1.
'-' indicates no hits (E-values above 1.0 are reported as no hits).
Known sequence or Node – Pop4
Human
Mouse
D. melanogaster
C. elegans
A. thaliana
S. pombe
S. cerevisiae
Node A
Node B
Node C
Node H
Node F
S. pombe
Score E-value
81 3e-16
74 3e-14
32 0.25
371 e-104
49 1e-06
72 3e-13
87 6e-18
75 4e-14
94 9e-20
172 2e-43
Target Genomes
Microsporidia
Entamoeba
Score E-value Score E-value
27 0.88
36 0.002
31 0.058
Giardia
Score E-value
48 6e-06
47 8e-06
41 8e-04
46 2e-05
51 5e-07
57 9e-09
Supplementary Table 2: Results from BLAST searches of the target databases of S. pombe, Giardia,
Entamoeba and Microsporidia, using known and ancestral Pop4 sequences as queries. Node names are as
indicated in Figure 1. E-values above 1.0 are reported as “–” (i.e. no hits). No sequences were recovered with the
archaeal Pop4 protein sequences or any ancestral sequence derived from them.
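The reporting rule used in these captions (an E-value above 1.0 counts as no hit) can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build the tables: the function names `parse_evalue` and `report_hit` are hypothetical, and the handling of the abbreviated form `e-104` assumes it stands for 1e-104, as in standard BLAST output.

```python
def parse_evalue(text: str) -> float:
    """Parse an E-value string such as '3e-16', '0.25', '0.0' or 'e-104'."""
    text = text.strip()
    # Assumption: a bare exponent like 'e-104' is shorthand for '1e-104'.
    if text.startswith("e"):
        text = "1" + text
    return float(text)


def report_hit(score: int, evalue_text: str, cutoff: float = 1.0) -> str:
    """Return the table cell for a hit, or '-' when the E-value exceeds the cutoff."""
    if parse_evalue(evalue_text) > cutoff:
        return "-"
    return f"{score} {evalue_text}"


if __name__ == "__main__":
    print(report_hit(371, "e-104"))  # kept: well below the cutoff
    print(report_hit(27, "0.88"))    # kept: below the cutoff
    print(report_hit(20, "2.0"))     # hypothetical hit above the cutoff
```

The cutoff is a parameter so that the same helper could be reused with a stricter threshold.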
Known sequence or Node – Rpp21    S. pombe
                                  Score    E-value
Human                             54       3e-08
Mouse                             67       2e-12
S. pombe                          163      6e-42
S. cerevisiae                     46       5e-06
Node A                            51       2e-07
Node D*                           103      3e-23
Target Genomes
Microsporidia
Entamoeba
Score E-value Score E-value
37 8e-04
39 1e-04
25 0.97
47 2e-07
22 5.4
28 0.32
27 0.31
39 3e-04
48 7e-07
-
Giardia
Score E-value
36 0.006
46 4e-06
50 1e-07
32 0.069
40 6e-04
45 2e-05
Supplementary Table 3: Results from BLAST searches of the target databases of S. pombe, Giardia, Entamoeba
and Microsporidia, using known and ancestral Rpp21 sequences as queries. Node names are as indicated in Figure 1.
E-values above 1.0 are reported as “–” (i.e. no hits).
* This node is the “root” node for the tree of the above sequences.
HMM model                                                S. pombe
                                                         Score     E-value
Pop1 PAML sequences                                      162.82    4.3e-41
Pop1 PAML and ancestral sequences                        199.93    1.2e-53
Pop1 PAML ancestral sequences only                       163.88    5.5e-44
Pop1 FastML sequences                                    167.25    1.5e-35
Pop1 FastML and ancestral sequences                      156.31    1e-34
Pop1 FastML ancestral sequences only                     130.59    1e-30
Pop4 Eukaryotic sequences                                97.16     5.2e-22
Pop4 Eukaryotic and ancestral sequences                  108.44    3.9e-30
Pop4 Eukaryotic ancestral sequences only                 70.53     1.1e-14
Pop4 Eukaryotic and Archaeal sequences                   70.56     1.5e-17
Pop4 Eukaryotic, Archaeal and ancestral sequences        85.77     3.6e-19
Pop4 Eukaryotic and Archaeal ancestral sequences only    50.63     4.7e-13
Pop5 Eukaryotic sequences                                111.88    5.2e-24
Pop5 Eukaryotic and ancestral sequences                  151.60    5.1e-37
Pop5 Eukaryotic ancestral sequences only                 145.07    8.3e-42
Pop5 Eukaryotic and Archaeal sequences                   110.03    6.8e-28
Pop5 Eukaryotic, Archaeal and ancestral sequences        142.84    4.1e-37
Pop5 Eukaryotic and Archaeal ancestral sequences only    131.35    6.0e-35
Giardia
Score E-value
27.84 2.9e-03
27.84 3.3e-03
25.65 0.16
18.73 0.014
21.03 0.039
17.07 0.11
50.53 1.7e-07
49.40 1.7e-08
34.98 1.3e-04
38.77 2e-05
49.78 9e-10
36.13 2e-06
24.53 0.012
10.60 20
- -
Supplementary Table 4: HMM model results for Pop1 and Pop4 protein sequence alignments used in searches of
the S. pombe and Giardia genome databases. ‘–’ indicates that the candidate sequence was not found.