ADDITIONAL MATERIALS

Additional Materials 2: Arabidopsis 3’ UTR, 5’ UTR and Upstream ORF datasets (HTML).

We used the annotation of the NCBI 2005 assembly of Arabidopsis thaliana. From the annotation we extracted the positional information for the CDS and mRNA exons of each annotated gene. By comparing these two types of data (CDS and mRNA) we defined the lengths of the 3’ and 5’ UTRs and the positions of introns. In the HTML tables the UTR lengths are calculated for spliced UTRs. The position of an intron is given as the first base of the spliced UTR downstream of the intron.

The 3’ UTR dataset lists Arabidopsis genes according to the length of their 3’ UTR.

Number of annotated Arabidopsis genes: 26519
Number of annotated Arabidopsis genes having a 5’ UTR, a 3’ UTR, or both: 18494
Number of annotated genes that contain an intron in the 3’ UTR: 693 (81 of these contain more than one intron)
Number of annotated genes that have an intron more than 54 nt downstream of the stop codon: 257

Arabidopsis thaliana upstream ORF (uORF) dataset. A uORF is defined as an ORF at least 10 amino acids long located in the 5’ UTR. uORF positions are defined on spliced (intron-subtracted) 5’ UTR sequences.

Additional Materials 3: Distribution of introns in different eukaryotes.

Note that, except for the extreme 5’ and 3’ regions of the coding sequences, plant introns are distributed evenly within the coding region.