Comparative Sequence Analysis of Genes from Rice ssp. indica and japonica: A Case Study

Longjiang Fan1,2 (樊龙江), Jianbin Wang1 (王建斌), Yang Zhang1 (张扬)
(Bioinplant Lab, 1. Institute of Bioinformatics / 2. Institute of Seed Science, Zhejiang University, Hangzhou 310029; E-mail: fanlj@zju.edu.cn / bioinplant@zju.edu.cn)

Comparative sequence analysis of genes from Oryza sativa L. ssp. indica and japonica is an important part of rice genomics research. The release of the draft genome sequence of the indica rice cultivar 9311 and of about half of the genome sequence of the japonica rice cultivar Nipponbare makes such analysis possible. As a case study, we compared the sequences of the superoxide dismutase (SOD, EC 1.15.1.1) genes from the two subspecies, which offers some clues for large-scale comparative research between them.

Rice SOD gene sequences in public databases

SOD catalyzes the first step in the scavenging system of active oxygen species: the disproportionation of the superoxide anion radical to hydrogen peroxide and molecular oxygen. SOD has multiple isoforms, which are classified by their metal cofactor as copper/zinc (Cu/Zn), manganese (Mn) and iron (Fe) forms. These isoforms are distributed in different subcellular locations. In higher plants, Cu/Zn-SOD is localized mainly in plastids and the cytosol, Mn-SOD is localized predominantly in the mitochondrial matrix, and Fe-SOD is observed in chloroplasts. In rice, the cDNAs and genes corresponding to most SOD isoforms have been isolated and characterized (Kaminaka et al., 1999). GenBank contains a total of 13 rice SOD entries, most of them from japonica rice (Table 1). The genes from Nipponbare were searched against the Rice GD (http://btn.genomics.org.cn/rice) using BLAST (2002-3-20). All of them were found in contigs of the database (all E-values equal to 0).
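The hit-selection step of such a BLAST search can be sketched as follows. This is only an illustration, not the procedure used on the Rice GD: it assumes the hits are available in the common 12-column tab-separated BLAST layout (query, subject, % identity, alignment length, mismatches, gap openings, query start/end, subject start/end, E-value, bit score), and the function names, the E-value cutoff and the sample rows (geneA, contigX, ...) are hypothetical.

```python
# Illustrative sketch: choose the best contig hit per query gene from
# tab-separated BLAST output. Assumed column layout (12 columns):
# query, subject, %identity, length, mismatches, gap openings,
# qstart, qend, sstart, send, evalue, bitscore.
# The sample rows are invented; they are not Rice GD results.

def parse_hits(text):
    """Parse tab-separated BLAST rows into a list of dictionaries."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hits.append({
            "query": f[0],
            "subject": f[1],
            "identity": float(f[2]),
            "length": int(f[3]),
            "evalue": float(f[10]),
            "bitscore": float(f[11]),
        })
    return hits

def best_hits(hits, max_evalue=1e-10):
    """For each query, keep the highest-scoring hit that passes the E-value cutoff."""
    best = {}
    for h in hits:
        if h["evalue"] > max_evalue:
            continue  # not significant under the assumed cutoff
        current = best.get(h["query"])
        if current is None or h["bitscore"] > current["bitscore"]:
            best[h["query"]] = h
    return best

SAMPLE = "\n".join([
    "geneA\tcontigX\t100.00\t1500\t0\t0\t1\t1500\t200\t1699\t0.0\t2900",
    "geneA\tcontigY\t85.00\t120\t18\t0\t30\t149\t10\t129\t2e-20\t100",
    "geneB\tcontigZ\t92.00\t40\t3\t0\t1\t40\t5\t44\t0.5\t30",
])

if __name__ == "__main__":
    for query, hit in best_hits(parse_hits(SAMPLE)).items():
        print(query, hit["subject"], hit["identity"], hit["evalue"])
```

In this toy input, geneB has no hit below the cutoff and is therefore dropped, while geneA keeps only its full-length contig hit.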
Table 1. Entries of rice SOD mRNA/gene sequences in GenBank and the Rice GD

SOD isoform     japonica (Nipponbare)   indica (IR36)   japonica (Nipponbare)   indica (9311)
                mRNA                                    genomic DNA             genomic DNA
Cu/Zn-SOD
  Plastid       D85239                  -               AB026724                Contig20429
  CytosolA      D00999                  L36320          L19435                  Contig7075
  CytosolB      D01000                  -               L19434                  Contig11946
Mn-SOD
  Mn-SODA       L19436                  L34038          AB026725                Contig736
  Mn-SODB       -                       L34039          -                       -
Fe-SOD          AB014056                -               AP000399                Contig5035 and Contig13401

UniGene Clusters

UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene. All GenBank sequences in Table 1 fall into five UniGene clusters (Table 2). Sequences of the same SOD isoform from japonica and indica rice all fall into the same cluster (see Os.186 and Os.4610).

Table 2. UniGene clusters of rice SOD gene sequences

UniGene    mRNA/gene sequences               Description
Os.186     D00999/L19435/L36320              Oryza sativa mRNA for copper/zinc-superoxide dismutase, complete cds, clone:RSODA
Os.4169    D01000/L19434                     Oryza sativa mRNA for copper/zinc-superoxide dismutase, complete cds, clone:RSODB
Os.5522    D85239/AB026724                   Oryza sativa mRNA for plastidic copper/zinc-superoxide dismutase, complete cds
Os.4610    L19436/AB026725/L34038/L34039     Rice mitochondrial manganese-superoxide dismutase (sodAOs1) mRNA, complete cds
Os.3583    AB014056/AP000399                 Oryza sativa mRNA for iron-superoxide dismutase, complete cds

Differences at the genomic DNA level

The Rice GD BLAST results for all SOD genes from Nipponbare are summarized in Table 3. In general, once SNP density and sequencing errors are taken into account, the differences between the SOD gene sequences of japonica and indica rice are small at the genomic level (gene and promoter). The Mn-SOD gene may be an exception: there is a large gap (856 bp) between the genes from japonica and indica rice (see figure). Based on the mRNA sequence, this gap lies largely outside the exon regions of the Mn-SOD gene.
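The kind of summary reported for the genomic comparison (identity of matched regions, length of unmatched regions or gaps) can be illustrated with a minimal sketch. It assumes two sequences that are already aligned, with "-" marking gap positions; the function name and the toy sequences are hypothetical and are not SOD data.

```python
# Illustrative sketch: summarize a pairwise alignment of two sequences.
# Assumption: the sequences are already aligned and of equal length,
# with "-" marking gap positions. The toy sequences are invented.

def summarize_alignment(seq_a, seq_b, gap="-"):
    """Return identity (%) over non-gap columns, mismatch count, and gap run lengths."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    aligned = 0
    gap_runs = []
    run = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            run += 1  # extend the current unmatched/gap run
            continue
        if run:
            gap_runs.append(run)  # a gap run has just ended
            run = 0
        aligned += 1
        if a == b:
            matches += 1
    if run:
        gap_runs.append(run)  # alignment ended inside a gap run
    identity = 100.0 * matches / aligned if aligned else 0.0
    return {
        "identity": identity,
        "mismatches": aligned - matches,
        "gap_runs": gap_runs,
    }

if __name__ == "__main__":
    toy_a = "ATGGCCATTG----ACGTTAGC"
    toy_b = "ATGGCTATTGCCGGACGTTAGC"
    print(summarize_alignment(toy_a, toy_b))
```

Reporting identity only over the matched (non-gap) columns, and gaps separately, mirrors the way the comparison is tabulated: a single large gap does not lower the identity of the flanking matched regions.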
For the CytosolA (Cu/Zn-SOD) gene, the two matched regions within the contig overlap by 102 bp. This may reflect an assembly error in the Rice GD.

[Figure: Mn-SOD genes from indica (upper) and japonica (lower) rice. Horizontal axis: gene size (bp), 0 to 4500. Segments in order: M1, 337 bp; unM, 23 bp; M2, 154 bp; unM2, 856 bp; M3, 764 bp; unM3, 43 bp; M4, 2280 bp.]

Table 3. Differences between SOD gene sequences from japonica and indica rice at the genomic level*

                Gene (bp)                                                            Promoter (bp)
SOD isoform     Gene size   Identity (%) of matched region   Unmatched region/gap   Before ATG   Identity (%) of matched region
Cu/Zn-SOD
  CytosolA      1548        100/100                          102, overlapped        -605         99.7
  CytosolB      1172        100                              -                      -600         100
  Plastid       2626        99.7/98.7/100                    20,14/                 -610         99.1/100§
Mn-SOD
  Mn-SODA       3352        99.7/99.4/99.9/99.0              23/856,43              -602         98.3
Fe-SOD          1472        100/100                          /38, two contigs       -600         100

* Based on Rice GD BLAST searches with default parameter values.
§ There is a 30 bp unmatched region.

Differences at the mRNA and protein level

                mRNA sequence (bp)                  Protein sequence (aa)           G+C content (%)
SOD isoform     CDS size   Number of differences   Size   Number of differences   japonica/indica
Cu/Zn-SOD
  CytosolA      459        1                       152    0                       52.1/52.1
Mn-SOD
  Mn-SODA       696        2                       231    1                       58.0/58.0

Conclusions

The differences between genes from japonica and indica rice are small in this case (SOD genes), and they are particularly few at the mRNA and protein sequence levels. The more meaningful question is therefore what accounts for the differences between the two subspecies: do regulatory sequences or transcription factors play important roles?

SOD genes appear to be single-copy genes in the indica rice genome: the BLAST search of the Rice GD returned only one significant sequence (contig) for all SOD gene sequences. The Mn-SOD gene has been reported to be a single-copy gene in rice (Kaminaka et al., 1999).

Judging from this case study, the Rice GD provides comprehensive functional coverage of the rice genome.
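About the first author

Longjiang Fan (樊龙江), aged 37, Ph.D., is an associate professor at the Institute of Bioinformatics / Institute of Seed Science and Engineering, Zhejiang University. Current research focuses mainly on rice genome bioinformatics and quantitative genetics. Publications include 4 SCI-indexed papers and 10 papers in first-tier national journals. Awards: Second Prize for Scientific and Technological Progress of the State Education Commission (1991), and Third Prizes for Scientific and Technological Progress of the Ministry of Agriculture (1999) and of Zhejiang Province (2001).

Lab (Bioinplant Lab) homepage: http://www.cab.zju.edu.cn/depart/nx/Bioinplant/bioinplant_page.htm
E-mail: fanlj@zju.edu.cn or bioinplant@zju.edu.cn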