The development of an RNA-sequencing pipeline based on Tuxedo tools
Martijn Derks, Masoed Ramuz, Nick Alberts, Rico Hagelaar

Index
• Dataset
• Pipeline 1 (Tophat_cuff)
• Pipeline 2 (Cuff_diff)
• Pipeline 3 (Summary)
• Conclusions
• Future prospects

Dataset
• Arabidopsis thaliana (advanced)
• Six conditions:
  • Cold stress
  • Drought stress
  • Heat stress
  • High light stress
  • Salt stress
  • Control
Gan et al. 2011. Multiple reference genomes and transcriptomes for Arabidopsis thaliana. Nature 477, pp. 419–423.

Tophat_cuff
Input data (FastQ) + Configuration file → Tophat (6x) → Bamfile → Cufflinks → Transcripts.gtf → Analysis
Analysis:
• Total intron length
• Transcript length
• Basis for plotting in R

Tophat_cuff results
                       Cold     Drought  Heat     Highlight  Salt     WT
mapped                 11.01M   10.63M   11.24M   10.96M     7.41M    20.11M
unmapped               23.90M   25.18M   21.82M   19.97M     24.30M   33.64M
percentage mapped (%)  31.5     29.7     34.0     35.4       23.4     37.4

Tophat_cuff results
Condition          # genes   FPKM > 1
Cold_stress        34029     20348
Drought_stress     35060     21044
Heat_stress        33615     19079
Highlight_stress   38480     22557
Salt_stress        33778     20111

Cuff_diff (1)
(5x) Control vs condition
transcript.gtf → Cuffmerge → Merged.gtf
Merged.gtf + Bamfile → Cuffdiff → DE-genes → Functions + enrichment

Cuff_diff (2)
• Uniprot: Get Functions
• David: Enrichment (5x)

Cuff_diff results (Uniprot)
XLOC_005119  XLOC_005119  Hsp70b  1:5502205-5504535  WT_control  heat_stress  OK  1.88554  4668.1  11.2736  -4.26394  2.00852e-05  0.00596568  yes
Q9S9N1  Heat shock 70 kDa protein 5 (Heat shock protein 70-5) (AtHsp70-5) (Heat shock protein 70b)
FUNCTION: In cooperation with other chaperones, Hsp70s stabilize preexistent proteins against aggregation and mediate the folding of newly translated polypeptides in the cytosol as well as within organelles. These chaperones participate in all these processes through their ability to recognize nonnative conformations of other proteins.
They bind extended peptide segments with a net hydrophobic character exposed by polypeptides during translation and membrane translocation, or following stress-induced damage (By similarity).
Cytoplasm.
ATP binding; cell wall; chloroplast; plasma membrane; response to heat; response to virus
GO:0005524; GO:0005618; GO:0009507; GO:0005886; GO:0009408; GO:0009615

Cuff_diff results
[Figure: DE genes/overlap]

Cuff_diff results (David)
[Figure panels: Drought, Cold, Heat, Salt, Highlight]

Summary
[Diagram: Tophat count, AT_codes → Csv maker → analyses]
• GC genes vs FPKM
• CV (CV = STDEV / Average; 0.15)
• Expr. intron
• Conservation
• Overlap matrix
• Clustering R: HC sample clustering, HC gene clustering

Summary
ID          Cold       Drought   Heat       Highlight  Salt      WT
AT1G01010   10.5501    12.0209   6.80685    6.44518    0         10.7992
AT1G01030   2.51058    2.60705   0.582286   3.71439    1.37225   2.46655
AT1G01046   0          0         0          6.40264    4.73081   0
AT1G01050   52.7297    75.5912   0          46.9862    0         46.5351
AT1G01070   13.6023    15.9691   0          7.52686    19.3891   23.0487
AT1G01073   0          0         0          0          0         0
AT1G01090   80.2276    70.2176   58.4497    67.0227    102.39    80.5032
AT1G01110   0.966456   1.307     0.564864   1.26781    1.88932   2.65862

Heatmap
[Figure: heatmap]

Clustering
• HC clusters (9)
• PAM clusters (10)

Transcription factors
Abscisic acid biosynthesis (stress conditions)
[Figures 1 and 2]
1. Cold, salinity and drought stresses: An overview. Shilpi Mahajan, Narendra Tuteja.
2. Cold stress regulation of gene expression in plants. Viswanathan Chinnusamy et al.

Conserved genes in Arabidopsis
• Abiotic stress genes which also occur in Arabidopsis were retrieved from Oryza sativa (Rabbani et al.).
• These genes were compared with the DE stress genes found in the results.
• Three genes were found in the salt, cold and drought conditions.
Rabbani, M. A., Maruyama, K., Abe, H., Khan, M. A., Katsura, K., Ito, Y., Yoshiwara, K., Seki, M., Shinozaki, K., Yamaguchi-Shinozaki, K. 2003. Monitoring Expression Profiles of Rice Genes under Cold, Drought, and High-Salinity Stresses and Abscisic Acid Application Using cDNA Microarray and RNA Gel-Blot Analyses. Plant Physiology, vol. 133, no. 4,
pp. 1755-1767.

Literature overlap
• Results of the GO enrichment are backed up by the literature, with the exception of high light stress.
• The crosstalk between drought, cold and salt stress was confirmed by the literature, with a greater emphasis on drought and salt stress.
Seki, M., Narusaka, M., Ishida, J., Nanjo, T., Fujita, M., Oono, Y., Kamiya, A., Nakajima, M., Enju, A., Sakurai, T., Satou, M., Akiyama, K., Taji, T., Yamaguchi-Shinozaki, K., Carninci, P., Kawai, J., Hayashizaki, Y., Shinozaki, K. 2002. Monitoring the expression profiles of 7000 Arabidopsis genes under drought, cold and high-salinity stresses using a full-length cDNA microarray. The Plant Journal, vol. 31, no. 3, pp. 279-292.
Baniwal, K. S., Bharti, K., Yu Chan, K., Fauth, M., Ganguli, A., Kotak, S., Mishra, S. K., Nover, L., Port, M., Scharf, K., Tripp, J., Weber, C., Zielinski, D., Koskull-Doring, P. 2004. Heat stress response in plants: a complex game with chaperones and more than twenty heat stress transcription factors. J Biosci, vol. 29, no. 4, pp. 471-487.
Bartels, D., Nelson, D. 1994. Approaches to improve stress tolerance using molecular genetics. Plant, Cell and Environment, vol. 17, pp. 659-667.
Wang, W., Vinocur, B., Shoseyov, O., Altman, A. 2004. Role of plant heat-shock proteins and molecular chaperones in the abiotic stress response. Trends in Plant Science, vol. 9, no. 5, pp. 244-252.

Conclusions
• Working pipeline for (paired + unpaired) RNA-seq analysis
• DE genes + gene enrichment detection
• Cluster analysis of CV genes
• Differentially expressed genes identified (stress conditions vs. WT)
• Correlation of transcript length with FPKM
• No such correlation found for intron length or GC percentage
• Clusters of co-expressed genes
• Assumption of co-regulated genes

Future perspectives
• Use different IDs (TAIR IDs are not suitable)
• Use transcription factors to cluster genes (similar regulatory elements?)
• Conservation in other plant species (synteny)
• Validation on a different dataset (organisms, paired end)

Questions
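
Backup: CV calculation
The Summary slide defines CV = STDEV / Average over the per-condition FPKM values of a gene. The sketch below shows one way this could be computed, using a few rows from the Summary table. It is not the pipeline's actual script (the slides mention R for the clustering). Several points are assumptions: the sample standard deviation (n - 1) is used, the 0.15 shown on the slide is treated as a cutoff, genes with a CV above the cutoff are the ones selected, and the helper name `genes_above_cutoff` is hypothetical.

```python
from statistics import mean, stdev

# FPKM values per gene, copied from rows of the Summary table
# (column order: Cold, Drought, Heat, Highlight, Salt, WT).
FPKM = {
    "AT1G01030": [2.51058, 2.60705, 0.582286, 3.71439, 1.37225, 2.46655],
    "AT1G01046": [0.0, 0.0, 0.0, 6.40264, 4.73081, 0.0],
    "AT1G01070": [13.6023, 15.9691, 0.0, 7.52686, 19.3891, 23.0487],
    "AT1G01073": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "AT1G01110": [0.966456, 1.307, 0.564864, 1.26781, 1.88932, 2.65862],
}

# Assumption: the 0.15 on the Summary slide is a CV cutoff.
CV_CUTOFF = 0.15


def coefficient_of_variation(values):
    """Return STDEV / Average, or None when the average is zero."""
    average = mean(values)
    if average == 0:
        return None  # CV is undefined for a gene with no expression at all
    # Assumption: sample standard deviation (n - 1), as R's sd() would give.
    return stdev(values) / average


def genes_above_cutoff(table, cutoff=CV_CUTOFF):
    """Hypothetical helper: IDs of genes whose CV exceeds the cutoff."""
    selected = []
    for gene_id, values in table.items():
        cv = coefficient_of_variation(values)
        if cv is not None and cv > cutoff:
            selected.append(gene_id)
    return selected


if __name__ == "__main__":
    for gene_id, values in FPKM.items():
        cv = coefficient_of_variation(values)
        label = "undefined" if cv is None else f"{cv:.3f}"
        print(f"{gene_id}\tCV = {label}")
```

Genes with an average FPKM of zero (such as AT1G01073) are returned as undefined rather than as zero, so they are not mistaken for stably expressed genes.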