XXX¹, Zhang Zhang¹,*

¹ CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
² School of Computer Science and Technology, XXX
* Corresponding author:
Zhang Zhang (zhangzhang@big.ac.cn): CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No. 7 Beitucheng West Road, Building G, Chaoyang District, Beijing 100029, China, Tel & Fax: +86-10-82995427
Running head: Parallel construction of multiple protein-coding DNA alignments
Abstract
Constructing multiple homologous alignments for protein-coding DNA sequences is crucial for a variety of bioinformatic analyses but remains computationally challenging. XXX

Key words: parallel, alignment, back-translation, homolog, protein-coding DNA alignment
Introduction
Alignments of homologous sequences within and among species are of utmost importance for comparative genomics, molecular evolution and phylogenetic XXX.
Here we present XXX.
Material and methods
Algorithm
XXX
$$AA = \frac{x_1 y_2 z_3}{\sum_{abc} w_{abc}\, a_1 b_2 c_3}, \tag{1}$$
where
$$w_{abc} = \begin{cases} 1, & \text{if } abc \text{ is a sense codon} \\ 0, & \text{otherwise} \end{cases}$$
and $a, b, c \in \{A, T, G, C\}$.

Estimating XXX
XXX
$$A_i = (1 - S_i) R_i, \quad T_i = (1 - S_i)(1 - R_i), \quad G_i = S_i R_i, \quad C_i = S_i (1 - R_i) \tag{2}$$
XXX
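As an illustration only, the two expressions above can be sketched in Python. The sketch rests on several assumptions that the text does not state: that S_i and R_i are the GC and purine contents at codon position i (consistent with G_i + C_i = S_i and A_i + G_i = R_i), that sense codons are defined by the standard genetic code, and that the product x_1 y_2 z_3 is normalized by the weighted sum over all codons. The function names are hypothetical and are not taken from any released software.

```python
from itertools import product

BASES = "ATGC"
# Assumption: standard genetic code, so these three codons are non-sense.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def position_frequencies(s, r):
    """Nucleotide frequencies at one codon position, following Eq. (2).

    s and r are assumed to be the GC and purine contents at that position.
    """
    return {
        "A": (1 - s) * r,
        "T": (1 - s) * (1 - r),
        "G": s * r,
        "C": s * (1 - r),
    }


def sense_weight(codon):
    """w_abc: 1 if the codon is a sense codon, 0 otherwise."""
    return 0 if codon in STOP_CODONS else 1


def expected_codon_usage(s_values, r_values):
    """Normalized product of position-specific frequencies for each sense codon.

    s_values and r_values each hold three numbers, one per codon position.
    """
    freqs = [position_frequencies(s, r) for s, r in zip(s_values, r_values)]
    raw = {}
    for a, b, c in product(BASES, repeat=3):
        codon = a + b + c
        raw[codon] = sense_weight(codon) * freqs[0][a] * freqs[1][b] * freqs[2][c]
    total = sum(raw.values())
    if total == 0:
        raise ValueError("all probability mass falls on non-sense codons")
    return {codon: value / total for codon, value in raw.items() if sense_weight(codon)}


if __name__ == "__main__":
    usage = expected_codon_usage([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    print(len(usage), usage["ATG"])
```

With uniform composition (S_i = R_i = 0.5 at every position), each nucleotide has frequency 0.25 and the 61 sense codons share the probability mass equally, which gives a quick sanity check on the normalization.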
Data collection
XXXX
Section XXX
XXXX
Results
Section 1
XXX
Section 2
XXX.
Section 3
XXX.
Discussion
Section 1
XXX
Section 2
XXX.
Section 3
XXX.
Acknowledgments
We thank XXX for YYY. This work was supported by the “100-Talent Program” of Chinese Academy of Sciences (Y1SLXb1365; ZZ).
References
[1] W.-H. Li, Molecular Evolution, Sinauer Associates, Sunderland, Massachusetts, 1997.
[2] B. Rannala, Z. Yang, Phylogenetic inference using whole genomes, Annu Rev Genomics Hum Genet 9 (2008) 217-231.
[3] Z. Yang, Inference of selection from multiple species alignments, Curr Opin Genet Dev 12 (2002) 688-694.
[4] J.P. Townsend, F. Lopez-Giraldez, R. Friedman, The phylogenetic informativeness of nucleotide and amino acid sequences for reconstructing the vertebrate tree, J Mol Evol 67 (2008) 437-447.
Tables
Table 1

Table 2

Figure legends
Fig. 1 Parallelization scheme of ParaAT.
Fig. 2 Speedup (dotted lines) and running time (solid lines) for constructing protein-coding DNA alignments using 1–16 CPUs.
Figures
(Font: Arial Narrow; Size: >=12pt; Format: EPS)
Fig. 1

Fig. 2