Gene expression data analysis using R: How to make sense out of your RNA-Seq/microarray data

June 22-26, 2015
Organised by MolMed
15th edition, Erasmus MC
vs 150615

Course organizers and website

Program: Dr. Judith Boer, Pediatric Oncology, Erasmus MC-Sophia Children’s Hospital, j.m.boer@erasmusmc.nl
Coordination: Dr. Frank van Vliet, MolMed, 010-70 43518 / 06-5474 6408, f.vanvliet@erasmusmc.nl
Website: www.molmed.nl
Course website: http://molmed-expression.erasmusmc.nl/

Speakers and moderators

Judith Boer and Alex Hoogkamer, Department of Pediatric Oncology, Erasmus MC-Sophia Children's Hospital, Rotterdam
Henk Buermans, Leiden Genome Technology Center, LUMC
Marcel Reinders and Erdogan Taskesen, Information and Communication Theory Group, TU Delft
Renée de Menezes, Department of Epidemiology and Biostatistics, VUmc, Amsterdam
Lodewyk Wessels and Jelle ten Hoeve, Bioinformatics and Statistics Group, Netherlands Cancer Institute, Amsterdam
Jelle Goeman, Department of Biostatistics, Radboud University Nijmegen
Maarten van Iterson, Department of Molecular Epidemiology, LUMC Leiden
Guido Jenster, Department of Urology, Erasmus MC
Job van Riet, Cancer Computational Biology Center and Department of Urology, Erasmus MC
Andrew Stubbs, Department of Bioinformatics, Erasmus MC, Rotterdam
Kristina Hettne and Eelke van der Horst, Biosemantics Group, LUMC Leiden
Peter van Baarlen, Nutrition, Metabolism & Genomics Group, Wageningen University

Course website: Sylvia de Does, Department of Bioinformatics, Erasmus MC

Location: Erasmus MC, Computer room 22, Onderwijscentrum.

Target group

The course is tailored for biological and clinical researchers whose research involves experiments that generate gene expression data using RNA sequencing or microarrays. The course focuses mostly on the analysis of expression data and explains general concepts such as experimental design, normalization, testing and interpretation.
We do not explain the technologies themselves and we do not cover the mapping of sequence reads. Dedicated courses on next-generation sequencing and RNA-seq that cover these topics are available (see http://biosb.nl). Some concepts may be applicable to other types of genomics data. Most of the speakers (and therefore most of the examples) have a biomedical background.

Pre-requisites for participants

Participants need to know what a microarray or RNA sequencing experiment is, and have their own expression profiling data. Preferably they have followed an introductory R course; alternatively, they have worked through the "Getting started in R" practical prior to the course. Basic statistical concepts including mean, variance, standard deviation, probability distributions, t-test, p-value, correlation, and linear regression are assumed to be known. These are typically covered in basic statistics courses.

Please answer the following questions on the online registration form (in the free text box at the bottom of the form):
Do you have basic R knowledge (yes/no)? If yes, please indicate how you acquired this knowledge: basic R course / other…
Do you have gene expression data to analyse (yes/no)? If yes, which platform?
Microarrays: Affymetrix / Illumina / Agilent / other: .....
RNA sequencing: tag / transcriptome / other: .....

Format

The course is intensive and covers the basic concepts and methods required for expression analysis. Presentations are followed by hands-on computer sessions in which participants directly apply the analysis methods and gain more insight into them. One afternoon is dedicated to the analysis of a new data set, allowing the students to refresh and extend their analysis skills. After the course, the presentations, practicals and test data will remain available for future reference. The software packages used are freeware, including the statistical software R, Bioconductor, Cytoscape and web tools.

Learning objectives

1. The participant has insight into the issues involved in good experimental design of microarray and next-generation sequencing experiments.
2. The participant knows and can perform the steps of expression data analysis, and can visually present and judge the results for: quality control and preprocessing, finding differentially expressed genes, cluster analysis, classification analysis, pathway testing.
3. The participant has insight into the different algorithms and options available to perform an analysis, and can make an informed choice.
4. The participant knows the pitfalls of existing analyses and is able to critically judge the statistical analysis of expression data performed by others.

Registration, deadline, admittance, sponsored places & related courses

The total number of participants is limited to 40. The deadline for registration is four weeks in advance, on Monday 25 May, 9 a.m. If more than 40 students register before this deadline, the organisers will make a selection and admit the students who have their own data and experience in R. To this end, you must answer the following questions on the online registration form:
Do you have basic R knowledge (yes/no)? If yes, please indicate how you acquired this knowledge: basic R course / other…
Do you have gene expression data to analyse (yes/no)? If yes, which platform?
Microarrays: Affymetrix / Illumina / Agilent / other: .....
RNA seq: tag / transcriptome / other: .....

LUMC organizes a basic course in R on 9-10 June 2015; www.boerhaavenet.nl.
MolMed (Erasmus MC) organizes a basic course on R from 26-29 May 2015; www.molmed.nl.
Programme

Day 1, Monday June 22: Design and Preprocessing
Rooms: computer room 22 (Onderwijscentrum)
Moderator: Judith Boer

9:15   Welcome coffee and registration
9:45   Short introduction to data sets and tools - Judith Boer
10:00  Introduction to microarray and RNA-seq technology - Henk Buermans
10:45  Coffee
11:00  Experimental design: Think before you start - Judith Boer
12:00  Lunch (in room Ae-406)
13:00  Normalization - Judith Boer
13:45  Introduction to R and Bioconductor - Judith Boer
14:00  Coffee
14:15  Practical: Normalization and quality control in R: platform comparison data Affymetrix, Agilent, Illumina arrays, Solexa (RNA-Seq) - Judith Boer, Alex Hoogkamer, Andrew Stubbs
17:00  End day 1

Day 2, Tuesday June 23: Gene testing and Clustering
Room: Computer room 22 (OWR)
Moderator: Renée de Menezes

8:45   Welcome coffee
9:00   Hierarchical and K-means clustering - Marcel Reinders
10:00  Coffee
10:15  Cluster validation and principal component analysis - Marcel Reinders
11:15  Practical: Clustering using R - Marcel Reinders with assistance
12:30  Lunch (in room Ae-406)
13:30  Finding differentially expressed genes - Renée de Menezes
14:45  Coffee
15:00  Practical: Finding differentially expressed genes in R using limma / edgeR - Renée de Menezes, Judith Boer, Alex Hoogkamer
17:00  End day 2

Day 3, Wednesday June 24: Classification and Gene set testing
Room: Computer room 22 (OWR)
Moderator: Lodewyk Wessels

8:45   Welcome coffee
9:00   Classification and PAM - Lodewyk Wessels
10:30  Coffee
10:45  Practical: Classification using PAM - Lodewyk Wessels, Jelle ten Hoeve
12:30  Lunch (in room Ae-406)
13:30  Testing groups of genes - Jelle Goeman
14:30  Coffee
14:45  Practical: Testing groups of genes - Jelle Goeman with assistance
17:00  End day 3

Day 4, Thursday June 25: Practical Issues and Practice
Room: Computer room 22 (OWR)
Moderator: Judith Boer

8:45   Welcome coffee
9:00   Gene annotation - Maarten van Iterson
9:45   Practical: Gene annotation - Maarten van Iterson, Judith Boer
10:30  Coffee
10:45  Batch effects - Judith Boer, Maarten van Iterson
11:15  Practical: Batch effects - Judith Boer, Maarten van Iterson, Alex Hoogkamer
12:00  Lunch (in room Ae-406)
13:00  Gene expression profiling: the cancer transcriptome - Guido Jenster
14:00  Coffee
14:15  Assignment: Data analysis of ALL samples - Judith Boer, Andrew Stubbs, Alex Hoogkamer, Job van Riet
15:45  Coffee
16:00  Assignment: Data analysis of ALL samples, continued - Judith Boer, Andrew Stubbs, Alex Hoogkamer, Job van Riet
17:00  End day 4

Day 5, Friday June 26: Databases and Pathways
Room: Computer room 22 (OWR)
Moderator: Andrew Stubbs

8:45   Welcome coffee
9:00   Databases and pathway analysis - Andrew Stubbs
9:45   Interpretation of gene lists - Kristina Hettne, Eelke van der Horst
10:30  Coffee (in room Ae-406)
10:45  Practical: Interpretation of gene lists with the Anni Web Service and DAVID; Databases and pathway analysis - Kristina Hettne, Eelke van der Horst
12:15  Lunch
13:15  Presentation Cytoscape - Peter van Baarlen
15:30  Coffee
15:45  Practical Cytoscape - Peter van Baarlen
17:00  End day 5: hand in evaluation form & badge!

Attendance fees

Course tuition for non-commercial participants is € 700. Discounts are handled as follows:
Participants from the postgraduate school MolMed get a discount of 100% (tuition = €0).
PhD students and Master’s students, regardless of institution, get a discount of 50% (tuition = €350).

The course is considered a single unit, and participants are encouraged to attend all parts of it. No discounts are given to participants who choose not to attend part of the course. If these financial requirements pose a problem, please contact Frank van Vliet, managing director of the Erasmus Postgraduate School MolMed, at: f.vanvliet@erasmusmc.nl.

Invoices

Fees should only be paid after receipt of an INVOICE. Shortly after your registration you will receive the INVOICE by mail.
Payment should be transferred to account: 43.47.01.408 / Erasmus MC (IBAN code bank: NL86ANBA0434701408; SWIFT code bank: ABNANL2A), with the invoice number noted. Late registrants may also pay in cash upon arrival.

Cancellations

Cancellation is possible up to one week before the start of the course. Later cancellations will not be accepted, but you are allowed to send a substitute.

Commercial participants & sponsors

Companies are invited to inquire about commercial participant tuition fees and about sponsoring.