Group (A)rabidopsis:
David Nieuwenhuijse
Matthew Price
Qianqian Zhang
Thijs Slijkhuis
Species:
C. elegans
Project:
Advanced (+Basic)
Dataset Preparation
Transcriptome Construction Pipeline
Differentially Expressed Genes
Co-expressed Genes Modules
Gene Expression (Basic Project)
Gene Function
Biological Explanation
Functional Description & Explanation
Module Conservation between Species
Relationship to Transcript Properties
Visualisation of Interaction Network
David Nieuwenhuijse
◦ GeneID and GO term extraction tool
◦ Cytoscape GO enrichment analysis
◦ Finding an automatic GO enrichment tool for the pipeline
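To make the GO enrichment step concrete, here is a minimal sketch of the one-sided hypergeometric test that enrichment tools commonly apply to each GO term. It is not necessarily the test used by the tool chosen for this pipeline; the function name and its parameters are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, hits):
    """Probability of seeing at least `hits` annotated genes by chance.

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    selected:   size of the gene list being tested (e.g. DE genes)
    hits:       genes in the list that carry the GO term
    """
    if not (0 <= annotated <= population and 0 <= selected <= population):
        raise ValueError("annotated and selected must lie within the population")
    max_hits = min(annotated, selected)
    if not 0 <= hits <= max_hits:
        raise ValueError("hits must lie between 0 and min(annotated, selected)")
    total = comb(population, selected)
    # Sum the upper tail P(X = i) for i = hits .. max_hits.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(hits, max_hits + 1)
    )
    return tail / total
```

A small p-value suggests the term is over-represented in the gene list. When many GO terms are tested, the p-values would still need a multiple-testing correction.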
Qianqian Zhang
◦ Created a shell script for running the Cuffdiff, Gffread and Samtools programs
◦ Obtained lists of the most differentially expressed genes and the most highly expressed genes
◦ Visualization of differentially expressed genes with the cummeRbund package: density plot, scatter plot, volcano plot, p-value distribution plot, MA plot, etc.
◦ Basic statistics of differentially expressed genes
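As an illustration of the basic statistics mentioned above, the following sketch counts significant up- and down-regulated genes. The tuple layout, the 0.05 cutoff and the gene names in the usage note are assumptions made for the example, not values taken from the project.

```python
def summarise_de(rows, alpha=0.05):
    """Count tested, significant, up- and down-regulated genes.

    rows:  iterable of (gene, log2_fold_change, q_value) tuples
    alpha: q-value cutoff (assumed here, adjust as needed)
    """
    tested = significant = up = down = 0
    for _gene, log2_fc, q_value in rows:
        tested += 1
        if q_value >= alpha:
            continue
        significant += 1
        # Assumed convention: positive fold change means higher
        # expression in the second condition.
        if log2_fc > 0:
            up += 1
        elif log2_fc < 0:
            down += 1
    return {"tested": tested, "significant": significant, "up": up, "down": down}
```

For example, `summarise_de([("geneA", 2.0, 0.01), ("geneB", -1.5, 0.03), ("geneC", 0.2, 0.6)])` reports three tested genes, two of them significant, one up and one down.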
Matthew Price
◦ Script for listing the top 100 expressed genes
◦ Script for determining GC-content, transcript & intron length
◦ Script for getting correlation between each transcript property and the expression level
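A minimal sketch of how GC-content and its correlation with expression could be computed. This is not the group's actual script: the function names are hypothetical, and the Pearson coefficient is an assumed choice of correlation measure.

```python
from math import sqrt


def gc_content(sequence):
    """Fraction of G and C among the A/C/G/T bases (other symbols are ignored)."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    at = sequence.count("A") + sequence.count("T")
    total = gc + at
    return gc / total if total else 0.0


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant list
    return cov / sqrt(var_x * var_y)
```

The same `pearson` call can be reused for transcript length or intron length by passing that property as `xs` and the expression level as `ys`.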
Thijs Slijkhuis
◦ Created a shell script that downloads the source files, converts the SRA files into FASTQ files, and runs bowtie2-build, tophat and cufflinks
◦ Programmed a script that sorts the cuffdiff output by p-value (significance of differential expression) and extracts the gene names from it
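The sorting step could look roughly like the sketch below. It assumes a tab-separated cuffdiff table with `gene`, `status` and `p_value` columns, as in `gene_exp.diff`. The function name, the choice to keep only rows with status `OK`, and the skipping of genes named `-` are assumptions for illustration.

```python
import csv
import io


def genes_by_p_value(handle, top_n=None):
    """Return gene names from a cuffdiff table, most significant first."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Keep only successfully tested rows that have a gene name (assumption).
    rows = [r for r in reader if r["status"] == "OK" and r["gene"] != "-"]
    rows.sort(key=lambda r: float(r["p_value"]))
    names = [r["gene"] for r in rows]
    return names if top_n is None else names[:top_n]


# Made-up table containing only the columns used above.
DEMO = (
    "gene\tstatus\tp_value\n"
    "geneA\tOK\t0.04\n"
    "geneB\tOK\t0.001\n"
    "geneC\tNOTEST\t1\n"
)
print(genes_by_p_value(io.StringIO(DEMO)))  # ['geneB', 'geneA']
```

On a real run the handle would come from `open("gene_exp.diff")` instead of the in-memory demo table.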
Co-expressed Genes Modules
◦ The WGCNA package was not usable in our case
◦ Used the cummeRbund package to get heatmaps
GO enrichment analysis
◦ Not many genes are annotated in the GO database.
◦ Gene IDs of the differentially expressed genes are not compatible with the NCBI database.
Transcript sequences
◦ Not all expressed transcripts in the .gtf file can be matched to their corresponding sequence in the fasta file.
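One way to list the affected transcripts is to compare the IDs in both files, as in this sketch. It assumes that the GTF attribute column contains `transcript_id "..."` and that the first word of each FASTA header is the transcript ID. The function names are hypothetical.

```python
import re

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def gtf_transcript_ids(lines):
    """Collect transcript IDs from the attribute column of GTF lines."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def fasta_ids(lines):
    """Collect the first word of every FASTA header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.add(words[0])
    return ids


def missing_transcripts(gtf_lines, fasta_lines):
    """Transcript IDs present in the GTF but absent from the FASTA, sorted."""
    return sorted(gtf_transcript_ids(gtf_lines) - fasta_ids(fasta_lines))
```

Both functions accept any iterable of lines, so open file handles can be passed directly.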