Genome Sequence of Bacillus sp. Strain FJAT-14515

Liu Guohong,a Liu Bo,a Tang Weiqi,b Che Jianmei,a Lin Yingzhi,a Zhu Yujing,a Su Mingxing,a Tang Jianyang a

Agricultural Bio-resource Institute, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian 350003, China (a); Fujian Agriculture and Forestry University, Fuzhou, Fujian 350003, China (b)

E-mail: Liu Guohong, liuguohong624@163.com; Liu Bo, fzliubo@163.com; Tang Weiqi, tweiqi@163.com; Che Jianmei, chejm2002@163.com; Lin Yingzhi, lynnet@163.com; Zhu Yujing, zyjingfz@163.com; Su Mingxing, starsu717@163.com; Tang Jianyang, tjy836@163.com

ABSTRACT
We report here the draft genome sequence of Bacillus sp. strain FJAT-14515. The genome is 5.44 Mb in length and has a G+C content of 37.06%. It contains 5,263 genes with an average length of 791 bp, 67 tRNAs, 31 sRNAs, and 5 rRNA loci.

GENOME ANNOUNCEMENT
Bacillus sp. strain FJAT-14515 (16S rRNA GenBank accession number JX262264), a mesophilic, endospore-forming bacterium, was isolated from a soil sample collected in Taiwan. It grows optimally on nutrient agar (NA) with 0% NaCl (range, 0 to 5%) at 30 °C (range, 10 to 40 °C) and pH 7.0 (range, 6 to 9). According to the EzTaxon-e database (http://eztaxon-e.ezbiocloud.net/) (1), the 16S rRNA gene sequence similarities between FJAT-14515 and its closest relatives, B. muralis DSM 16288T and B. simplex DSM 30646T, were less than 98%. Devereux et al. (2) and Fry et al. (3) have proposed that a 16S rRNA sequence similarity of less than 98% should be considered evidence for separate species. The genome of Bacillus sp. FJAT-14515 was therefore sequenced to determine whether the strain represents a novel species of the genus Bacillus.
The genome sequence was determined using Illumina Solexa technology at the Beijing Genomics Institute (BGI) (Shenzhen, China). Assembly was performed using SOAPdenovo v2.04 (4).
The genome assembly of Bacillus sp.
FJAT-14515 (G+C content of 37.06%) has approximately 98-fold coverage. It contains 28 scaffolds totaling 5,443,019 bp (largest, 2,476,485 bp; smallest, 598 bp). The scaffolds consist of 43 contigs totaling 5,436,942 bp (largest, 793,370 bp; smallest, 204 bp). The scaffold N50 is 793,370 bp and the contig N50 is 435,981 bp. All assembly data were deposited in the DDBJ/EMBL/GenBank nucleotide sequence database.
Coding sequences (CDSs) were predicted using Glimmer 3.02 (5) and further annotated by BLASTP searches against the UniProt, NCBI nr, COG, and KEGG databases. tRNAs, rRNAs, and sRNAs were identified using tRNAscan-SE (6), RNAmmer (7), and Rfam (8), respectively.
The genome contains 5,263 CDSs with an average length of 789 bp, which together represent 76.5% of the whole genome. Annotation showed that only 837 genes (15.9%) did not match any known protein in the current public protein databases. Of the 5,263 predicted proteins, 2,735 and 2,812 were assigned to COG functional categories and to KEGG, respectively. Additionally, 67 tRNAs, 31 sRNAs, 5 rRNAs, and 7 CRISPRs were identified in the genome.
Nucleotide sequence accession number. This Whole Genome Shotgun project of Bacillus sp. strain FJAT-14515 has been deposited at DDBJ/EMBL/GenBank under the accession number AYSD00000000. The version described in this paper is version AYSD00000000.

ACKNOWLEDGMENT
This work was supported by the Agricultural Bio-resource Institute, Fujian Academy of Agricultural Sciences, PR China. It was financed by the 948 project (2011-G25) of the Chinese Ministry of Agriculture; by the 973 program earlier research project (2011CB111607), the agricultural science and technology achievement transformation project (2010GB2C400220), and the international cooperation project (2012DFA31120) of the Chinese Ministry of Science and Technology; and by the Natural Science Foundation of China (NSFC) (31370059).

REFERENCES
1. Kim OS, Cho YJ, Lee K, Yoon SH, Kim M, Na H, Park SC, Jeon YS, Lee JH, Yi H, Won S, Chun J. (2012). Introducing EzTaxon-e: a prokaryotic 16S rRNA gene sequence database with phylotypes that represent uncultured species. International Journal of Systematic and Evolutionary Microbiology. 62, 716-721.
2. Devereux R, He SH, Doyle CL, Orkand S, Stahl DA, LeGall J, Whitman WB. (1990). Diversity and origin of Desulfovibrio species: phylogenetic definition of a family. Journal of Bacteriology. 172, 3609-3619.
3. Fry NK, Warwick S, Saunders NA, Embley TM. (1991). The use of 16S ribosomal RNA analyses to investigate the phylogeny of the family Legionellaceae. Journal of General Microbiology. 137, 1215-1222.
4. Li RQ, Zhu HM, Ruan J, Qian WB, Fang XD, Shi ZB, Li YR, Li ST, Shan G, Kristiansen K, Li SG, Yang HM, Wang J. (2010). De novo assembly of human genomes with massively parallel short read sequencing. Genome Research. 20(2), 265-272.
5. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL. (1999). Improved microbial gene identification with GLIMMER. Nucleic Acids Research. 27(23), 4636-4641.
6. Lowe TM, Eddy SR. (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Research. 25, 955-964.
7. Lagesen K, Hallin P, Rødland EA, Staerfeldt HH, Rognes T, Ussery DW. (2007). RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Research. 35, 3100-3108.
8. Gardner PP, Daub J, Tate JG, Nawrocki EP, Kolbe DL, Lindgreen S, Wilkinson AC, Finn RD, Griffiths-Jones S, Eddy SR, Bateman A. (2009). Rfam: updates to the RNA families database. Nucleic Acids Research. 37(suppl 1), D136-D140.
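
SUPPLEMENTARY NOTE
The scaffold and contig N50 values reported in the genome announcement follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. The minimal Python sketch below illustrates that definition only. The function name n50 and the toy lengths are illustrative assumptions; they are not SOAPdenovo output and not the authors' pipeline, and the individual scaffold lengths of FJAT-14515 are not listed in this announcement.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L,
    taken from longest to shortest, cover at least half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, not data from this study.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

As a sanity check on the reported figures, the largest scaffold (2,476,485 bp) is shorter than half of the 5,443,019 bp scaffold total, so the scaffold N50 must be smaller than the largest scaffold, which agrees with the reported value of 793,370 bp.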