Genetic Association Studies
Monday, January 28 – Friday, February 1, 2013, 9:00AM – 12:00PM
Sackler Room 514
Instructors: Peter Castaldi, MD, MSc and Jessica Paulus, ScD

This course introduces learners to the background concepts needed to conduct and interpret genetic association studies and genome-wide association studies (GWAS). Specific topics include quality control (QC) for genome-wide genotype data, linkage disequilibrium and SNP tagging, assessment of and adjustment for genetic ancestry, and an introduction to relevant statistical concepts and PLINK. The emphasis is on population-based genetic association studies.

Learning Objectives:
1.) Students will understand how to use publicly available databases to access relevant functional information about specific genetic variants.
2.) Students will understand how to conduct genetic association analysis, including genome-wide association analysis, using PLINK.
3.) Students will understand the most commonly used study designs and statistical analytic methods for testing genetic association.
4.) Students will understand basic issues related to cleaning and data quality checks for genome-wide genotype data.
5.) Students will understand linkage disequilibrium, how it is quantified, and how it relates to SNP tagging and genetic association studies.
6.) Students will understand how population stratification can lead to spurious association results, how to assess for it, and how to adjust for it.

Monday
  Concepts: Introductions and course overview; Case-control study designs; Basic statistics for genetic association and multiple testing; Overview of Computational Tools (Unix, R); Introduction to Web-Based Resources (UCSC Genome Browser, LocusZoom); PED file format
  Skills: Conduct simple association testing in PLINK; Plot local association results using LocusZoom; Make a simple Manhattan plot in R
  Instructors: Jessica Paulus, Pete Castaldi

Tuesday
  Concepts: Linkage Disequilibrium; Overview of genotyping chip design; MAP file format; Introduction to HapMap
  Skills: Use the Haploview program to construct an LD plot for the CFH gene; Use SNAP to identify proxy SNPs for rs380390 in HapMap genotype data; Locate the CFH gene in the UCSC Genome Browser
  Instructors: Jessica Paulus, Pete Castaldi

Wednesday
  Concepts: GWAS with quantitative traits; Linear and logistic regression; Confounding and Adjustment in Genetic Association Studies; Quantifying Genomic Inflation Factor; QQ plots
  Skills: Perform an eQTL analysis with SNPs on Chromosome 17 and expression of the ORMDL3 gene; Make a QQ plot in R; Visualize results in WGAViewer; Perform an eQTL analysis adjusted for gender
  Instructors: Jessica Paulus, Pete Castaldi

Thursday
  Concepts: Data cleaning for genome-wide genotype data; Introduction to Sequencing; Hardy-Weinberg equilibrium
  Skills: Perform quality control on genome-wide genotype data
  Instructors: Michael Cho (Channing Lab), Jessica Paulus, Pete Castaldi

Friday
  Concepts: Population Stratification
  Skills: Calculate principal components of population stratification with Eigenstrat
  Instructors: Jessica Paulus, Pete Castaldi

*This instructor will lead the class with another instructor walking around the class available for troubleshooting and assistance

Monday
  Time          Event
  9:00-9:10     Introductions
  9:10-9:30     Introduction to GWAS, Part 1
  9:30-9:50     Introductory Exercise: Unix, R, WinSCP
  9:50-10:00    Break
  10:00-10:10   Group 1 Introductions
  10:10-10:30   Basic GWAS Statistics and Multiple Comparisons
  10:30-10:50   Learning Exercise 1: Basic Association Testing and Results Visualization
  10:50-11:00   Break
  11:00-11:10   Group 2 Introductions
  11:10-11:30   Introduction to GWAS, Part 2
  11:30-12:00   Learning Exercise 1 cont.

Tuesday
  Time          Event
  9:00-9:20     Review and Group 3 Introductions
  9:20-9:50     Linkage Disequilibrium, HapMap and Chip Design
  9:50-10:00    Break
  10:00-10:10   Group 4 Introductions
  10:10-10:30   Learning Exercise 2: Linkage Disequilibrium and Proxy SNPs
  10:30-10:50   Quantifying Linkage Disequilibrium
  10:50-11:00   Break
  11:00-11:40   Learning Exercise 2 cont.
  11:40-12:00   Recap of Days 1 & 2

Wednesday
  Time          Event
  9:00-9:10     Review and Group 3 Introductions
  9:10-9:50     Learning Exercise 3: GWAS with a Quantitative Trait, Adjustment for Confounding in GWAS
  9:50-10:00    Break
  10:00-10:50   Study Designs for GWAS
  10:50-11:00   Break
  11:00-11:20   Introduction to Population Stratification
  11:20-11:50   Learning Exercise 3 cont.

Thursday
  Time          Event
  9:00-9:10     Review and Questions
  9:10-9:30     Data Cleaning and QC of GWAS Data
  9:35-9:50     Learning Exercise 4: GWAS Data Cleaning and QC
  9:50-10:00    Break
  10:00-10:20   Learning Exercise 4 cont.
  10:20-10:50   Introduction to Sequencing
  10:50-11:00   Break
  11:00-11:20   Hardy-Weinberg equilibrium
  11:20-12:00   Open

Friday
  Time          Event
  9:00-9:10     Review and Questions
  9:10-9:30     Potpourri: Hardy-Weinberg, Imputation, Meta-Analysis, Population Stratification
  9:30-9:50     Article Review
  9:50-10:00    Break
  10:00-10:30   Learning Exercise 6: Population Stratification
  10:30-10:50   Functional Follow-up of GWAS Hits
  10:50-11:00   Break
  11:00-11:30   Learning Exercise 6: Population Stratification
  11:30-12:00   Closing, Evaluations

Instructors, in the order listed for the sessions above: Jessica Paulus; Pete Castaldi; Pete Castaldi; Jessica Paulus; Pete Castaldi; Jessica Paulus; Pete Castaldi; Pete Castaldi; Jessica Paulus; Jessica Paulus; Pete Castaldi; Jessica Paulus; Jessica Paulus; Michael Cho (Channing Lab); Jessica Paulus; Pete Castaldi; Michael Cho (Channing Lab); Michael Cho (Channing Lab); Jessica Paulus; Jessica Paulus; Pete Castaldi; Pete Castaldi; Pete Castaldi