CBB-Manuscript


Title is here

XXX¹, Zhang Zhang¹,*

¹ CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China

² School of Computer Science and Technology, XXX

* Corresponding authors:

Zhang Zhang (zhangzhang@big.ac.cn): CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No.7 Beitucheng West Road, Building G, Chaoyang District, Beijing 100029, China, Tel & Fax: +86-10-82995427

Running head: Parallel construction of multiple protein-coding DNA alignments

Abstract

Constructing multiple homologous alignments for protein-coding DNA sequences is crucial for a variety of bioinformatic analyses but remains computationally challenging. XXX

Key words: parallel, alignment, back-translation, homolog, protein-coding DNA alignment

Introduction

Alignments of homologous sequences within and among species are of utmost importance for comparative genomics, molecular evolution and phylogenetic reconstruction [1,2,3,4]. XXX.

Here we present XXX.

Material and methods

Algorithm

XXX

\[
AA = \frac{x_1 y_2 z_3}{\sum_{abc} w_{abc}\, a_1 b_2 c_3}, \tag{1}
\]

where

\[
w_{abc} =
\begin{cases}
1, & \text{if } abc \text{ is a sense codon} \\
0, & \text{otherwise}
\end{cases}
\]

and $a, b, c \in \{A, T, G, C\}$.
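As a rough illustration of the codon expression above, the following Python sketch reads it as a ratio: the product of position-specific nucleotide frequencies for one codon, normalised by the same product summed over all sense codons. Several things here are assumptions rather than statements from the text: the left-hand side is taken to be an expected codon frequency, the standard genetic code (stop codons TAA, TAG, TGA) defines which codons are sense codons, and the names `sense_weight` and `expected_codon_frequencies` are hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ATGC"
# Assumption: standard genetic code, whose non-sense (stop) codons are TAA, TAG and TGA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def sense_weight(codon):
    """w_abc: 1 if the codon is a sense codon, 0 otherwise."""
    return 0 if codon in STOP_CODONS else 1


def expected_codon_frequencies(position_freqs):
    """Normalise position-wise products over sense codons.

    position_freqs[i][n] is the frequency of nucleotide n at codon position i + 1.
    """
    raw = {}
    for a, b, c in product(NUCLEOTIDES, repeat=3):
        codon = a + b + c
        raw[codon] = (sense_weight(codon)
                      * position_freqs[0][a]
                      * position_freqs[1][b]
                      * position_freqs[2][c])
    total = sum(raw.values())  # sum over abc of w_abc * a_1 * b_2 * c_3
    return {codon: value / total
            for codon, value in raw.items() if sense_weight(codon)}


# Usage: with uniform nucleotide frequencies every sense codon gets 1/61.
uniform = {n: 0.25 for n in NUCLEOTIDES}
freqs = expected_codon_frequencies([uniform, uniform, uniform])
print(len(freqs), round(freqs["ATG"], 6))  # prints: 61 0.016393
```

The indicator weight removes stop codons from the normalising sum, so the resulting frequencies add up to 1 over the 61 sense codons.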

Estimating XXX

XXX

\[
A_i = (1 - S_i) R_i, \quad T_i = (1 - S_i)(1 - R_i), \quad G_i = S_i R_i, \quad C_i = S_i (1 - R_i) \tag{2}
\]

XXX
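The four nucleotide expressions above can be sketched in a few lines of Python. This section does not define S_i and R_i, so the sketch only assumes that both lie between 0 and 1; the function name `nucleotide_contents` is hypothetical. Under these expressions G_i + C_i equals S_i and A_i + G_i equals R_i, so the two parameters behave like a GC content and a purine content at codon position i.

```python
def nucleotide_contents(s, r):
    """Return A, T, G and C contents at one codon position.

    s stands for S_i and r for R_i; both are assumed to lie in [0, 1].
    """
    if not (0.0 <= s <= 1.0 and 0.0 <= r <= 1.0):
        raise ValueError("s and r must lie in [0, 1]")
    return {
        "A": (1 - s) * r,
        "T": (1 - s) * (1 - r),
        "G": s * r,
        "C": s * (1 - r),
    }


# Usage with illustrative values (not taken from the manuscript).
contents = nucleotide_contents(0.6, 0.5)
print(contents)
```

Because the four terms are the products of (S_i or 1 - S_i) with (R_i or 1 - R_i), they always sum to 1 for any valid pair of inputs.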

Data collection

XXXX

Section XXX

XXXX

Results

Section 1

XXX

Section 2

XXX.

Section 3

XXX.

Discussion

Section 1

XXX

Section 2

XXX.

Section 3

XXX.

Acknowledgments

We thank XXX for YYY. This work was supported by the “100-Talent Program” of the Chinese Academy of Sciences (Y1SLXb1365; ZZ).

References

[1] W.-H. Li, Molecular Evolution, Sinauer Associates, Sunderland, Massachusetts, 1997.

[2] B. Rannala, Z. Yang, Phylogenetic inference using whole genomes, Annu Rev Genomics Hum Genet 9 (2008) 217-231.

[3] Z. Yang, Inference of selection from multiple species alignments, Curr Opin Genet Dev 12 (2002) 688-694.

[4] J.P. Townsend, F. Lopez-Giraldez, R. Friedman, The phylogenetic informativeness of nucleotide and amino acid sequences for reconstructing the vertebrate tree, J Mol Evol 67 (2008) 437-447.

Tables

Table 1

Table 2

Figure legends

Fig. 1 Parallelization scheme of ParaAT.

Fig. 2 Speedup (dotted lines) and running time (solid lines) for constructing protein-coding DNA alignments using 1–16 CPUs.

Figures

(Font: Arial Narrow; Size: >=12pt; Format: EPS;)

Fig. 1

Fig. 2