Proposed list of taxa for Comparative Analysis of Eukaryotic Promoter Architecture

R. Taylor Raborn** and Krishnakumar Sridharan**
May 9th, 2011

LIST OF PROPOSED TAXA (COMMON NAME)

Each taxon is followed by its references and the available TSS and transcript data (number of clones).

Animalia

H. sapiens (human)
  - Yamashita et al., 2011 [~140 million tags in 12 human tissues]
  - Mammalian Gene Collection (MGC; 29,818 full-length cDNAs) and dbEST (8,314,509 EST sequences)

M. musculus (house mouse)
  - Yamashita et al., 2010: 20,246,303 tags from cultured cells
  - MGC (27,285 full-length cDNAs) and dbEST (4,853,542 ESTs)

D. rerio (zebrafish)
  - Yamashita et al., 2006: 32,263 clones/15,198 TSSs
  - MGC (16,739 full-length cDNAs) and dbEST (1,488,275 ESTs)

D. melanogaster (fruit fly)
  - Hoskins et al., 2011: CAGE, 21 million tags
  - BDGP (full-length cDNAs) and dbEST (821,005 ESTs)

C. elegans (nematode worm)
  - WormBase: SAGE and other data
  - dbEST (396,687 ESTs)

P. falciparum (malarial protozoan parasite)
  - Yamashita et al., 2006: 10,236 clones/6,908 TSSs
  - Comparasite FTP
  - P. falciparum 3D7 dbEST (53,293); P. falciparum dbEST (40,620)

T. gondii (protozoan parasite; cause of toxoplasmosis)
  - Yamagishi et al., 2010: TSS-seq, 1.24 × 10^5 TSSs, clustered into 103 TSRs
  - Comparasite FTP: TSS tags (6,801,945)
  - dbEST (136,229)

Fungi

S. pombe (fission yeast)
  - Wilhelm et al., 2008: high-density tiling array giving TSS assignments
  - dbEST (109,202)

C. cinerea (gray shag mushroom)
  - 5' SAGE project completed by Tommy Chu (personal communication); publication status unknown
  - dbEST (15,777)

L. edodes (shiitake mushroom)**
  - WWY Chum et al., 2011: 454 GS 20; cDNAs pooled from sporeless (FB) and spore-bearing (FBS) fruiting bodies
  - dbEST (26,541)

Plantae

C. merolae (red alga)
  - Yamagishi et al., 2010: 22,923 clones/14,029 TSSs
  - U Tokyo link (see below)

A. thaliana (thale cress)
  - Yamamoto et al., 2009: CT-MPSS, 158,237 tags/38,311 TSSs
  - RIKEN Arabidopsis full-length (RAFL) cDNAs: 155,144 RAFL cDNAs clustered into 14,668 nonredundant cDNA groups (about 60% of predicted genes)
  - dbEST/PlantGDB/Brendel group (1,529,700 ESTs)

Z. mays (maize)
  - Li et al., 2010: 120 million reads (RNA-seq)
  - dbEST/PlantGDB/Brendel group (2,019,105 ESTs)
  - Maize full-length cDNA project (number of full-length cDNAs varies with strictness)

O. sativa (rice)
  - 28,469 full-length cDNA clones for the japonica strain
  - dbEST/PlantGDB/Brendel group (1,252,989 ESTs)

RESOURCES AND LINKS

1. Mammalian Gene Collection: ftp://ftp.ncbi.nih.gov/repository/MGC/MGC_project/MGC_project_data/
2. Full-length cDNA links: http://www.ncbi.nlm.nih.gov/genome/flcdna/
3. Berkeley Drosophila Genome Project: http://www.fruitfly.org/sequence/dlcDNA.shtml
4. WormBase for C. elegans: ftp://ftp.wormbase.org/pub/wormbase/genomes/c_elegans/
5. Full-length cDNA database for parasites: http://comparasite.hgc.jp/ (see lower left of webpage)
6. C. merolae genome resource: http://merolae.biol.s.u-tokyo.ac.jp/download/
7. RARGE (RIKEN Arabidopsis Genome Encyclopedia): http://rarge.psc.riken.jp/cdna/cdna.pl and http://rarge.psc.riken.jp/archives/rafl/
8. Maize full-length cDNA resource: http://www.maizecdna.org/download/
9. Rice (Oryza sativa, japonica strain) full-length cDNA resource: ftp://cdna01.dna.affrc.go.jp/pub/data/
10. Plant EST and genome information and sequences: http://grinch2.gdcb.iastate.edu/PlantGenome/plantGenomeTable.php

* Genome sequence currently unavailable; genome project underway at Chinese University of Hong Kong