Materials and Methods

Human constitutively and alternatively spliced exons were obtained from the online
database ASAP (the Alternative Splicing Annotation Project (12)) at
http://www.bioinformatics.ucla.edu/ASAP/. By mapping the ASAP-provided
homolog table to the ASAP genomic data set or the corresponding mouse UniGene
EST sequences (ftp://ftp.ncbi.nih.gov/repository/UniGene/, March 2005), 4,670
human constitutively spliced exons and their orthologous mouse exonic sequences
were extracted. Because alternatively spliced exons (ASEs) conserved between human and mouse
were not available from ASAP, two strategies were applied to identify the orthologous
mouse counterparts of human ASEs: (1) Blastn alignment of each human alternative exon plus its two
flanking exons against the whole mouse UniGene EST database, or (2) Blastn alignment of the
same sequences against the mouse genome (NCBI Build 33, May 2004). Mouse exons with
≥70% identity over the full length of the human exon query were extracted.
In total, 800 human ASEs, including 523 major-form exons, 110
minor-form exons, and 264 undetermined-form exons, were paired with their mouse
orthologs. The classification into major-form (included in at least two thirds of the EST
counts), minor-form (skipped in at least two thirds of the EST counts), and
undetermined-form (in the intermediate case, or 5 ESTs in total) exons was retrieved
from ASAP (also defined in Ref. (1)). Orthologous human-rat exon
pairs, including constitutive exons and ASEs, were identified from Blastn alignments
between human exons and the rat UniGene database.
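The two-thirds rule above can be sketched in a few lines of Python. This is only an illustration, not the ASAP implementation: the function name `classify_exon` and its parameters are hypothetical, and because the source wording for the low-coverage case ("or 5 ESTs in total") is ambiguous, the cutoff `min_total_ests` and the direction of its comparison are assumptions that should be checked against ASAP.

```python
def classify_exon(included_ests, skipped_ests, min_total_ests=5):
    """Classify an alternatively spliced exon by its EST support.

    included_ests: number of ESTs that include the exon
    skipped_ests:  number of ESTs that skip the exon
    min_total_ests: ASSUMED coverage cutoff; exons supported by fewer ESTs
                    than this are called "undetermined" (the source text
                    does not state the comparison unambiguously).
    """
    total = included_ests + skipped_ests
    if total == 0 or total < min_total_ests:
        return "undetermined"
    # Integer arithmetic avoids floating-point issues at exactly two thirds.
    if 3 * included_ests >= 2 * total:
        return "major"
    if 3 * skipped_ests >= 2 * total:
        return "minor"
    return "undetermined"


if __name__ == "__main__":
    # Made-up EST counts, for illustration only.
    for included, skipped in [(8, 2), (2, 8), (5, 5), (3, 0)]:
        print(included, skipped, classify_exon(included, skipped))
```

Comparing `3 * included_ests` with `2 * total` keeps the "at least two thirds" boundary exact, so an exon included in 4 of 6 ESTs is still called major-form.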
The human constitutive introns, i.e., introns that do not include any ASEs or EST
matches, were extracted from the UCSC annotation and EST-to-genome alignment.
The human EST database (HGI Release 15) was provided by TIGR (The Institute for
Genomic Research) at http://www.tigr.org/tdb/tgi, and the EST-to-genome alignment
was performed with CRASA (2) at http://big.pcf.sinica.edu.tw/service/tools.php. A
constitutive intron was regarded as a human-mouse ortholog only if the intron and
its two flanking exons were all conserved in the mouse
genome. The human-mouse genomic
sequence alignments were extracted from the UCSC multiple alignments at
http://hgdownload.cse.ucsc.edu/goldenPath/hg17/multiz8way/.
For the KA/KS ratio analysis of orthologous exon pairs, the following procedures
were performed: (i) detecting the reading frames of human protein-coding exons by
Blastx-aligning these exons against the corresponding RefSeq protein sequences
(ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/protein/); (ii) calculating the numbers of
synonymous and non-synonymous sites, KA, KS, and KA/KS values, using the PAML
package (3, 4); (iii) creating two-way contingency tables with rows comprising
numbers of synonymous and non-synonymous sites and columns comprising numbers
of changed and unchanged sites; and (iv) testing the independence between the
numbers of changed synonymous and non-synonymous sites using Fisher’s exact test.
The substitution rates of human-mouse and mouse-rat orthologous constitutive introns
were estimated by maximum likelihood under the HKY (Hasegawa, Kishino, Yano) model of
evolution, also using the PAML package (3, 4).
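Steps (iii) and (iv) of the KA/KS procedure can be illustrated with a minimal, standard-library sketch of a two-sided Fisher's exact test on a 2x2 table. It is not the authors' code. The function name and the example counts are hypothetical, and since the site counts reported by PAML are generally fractional, rounding them to integers before building the table is an assumption of this sketch.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows:    synonymous sites, non-synonymous sites
    Columns: changed sites, unchanged sites
    All four cells must be non-negative integers.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed
    # one; the small tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Made-up counts for one exon pair, rounded to integers (assumption).
    changed_syn, unchanged_syn = 12, 38        # synonymous sites
    changed_nonsyn, unchanged_nonsyn = 9, 141  # non-synonymous sites
    p = fisher_exact_two_sided(changed_syn, unchanged_syn,
                               changed_nonsyn, unchanged_nonsyn)
    print(f"two-sided p-value: {p:.4g}")
```

A small p-value indicates that the proportion of changed sites differs between synonymous and non-synonymous sites for that exon pair.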
References
1. B. Modrek, C. J. Lee, Nat Genet 34, 177 (2003).
2. T. J. Chuang et al., Genome Res 13, 313 (2003).
3. Z. Yang, R. Nielsen, Mol Biol Evol 17, 32 (2000).
4. Z. Yang, Comput Appl Biosci 13, 555 (1997).