Genome-wide approaches to identify pharmacogenetic contributions to adverse drug reactions

Matthew R. Nelson, Silviu-Alin Bacanu, Li Li, Clive E. Bowman, Michael Mosteller, Allen D. Roses, Eric H. Lai, Margaret G. Ehm

Supplementary figure legends

Supplementary Figure 1: Simulation-based power estimates for association on a genome-wide scale for a range of ADR pharmacogenetic effects, given a fixed set of 200 clinical controls. The relationship between the number of ADR cases (sample size), risk allele frequency (pd), and statistical power (contours) is displayed for additive, dominant, and recessive genetic models (rows), three values of marker LD with the functional risk allele (r² = 1.0, 0.7, and 0.4; columns), and two test-wise significance levels (2 × 10⁻⁵ and 10⁻⁷). Power was simulated as described in Methods over the range of conditions outlined in Table 2. Contours corresponding to power of 0.2, 0.4, 0.6, 0.8, and 0.99 are displayed.

Supplementary Figure 2: Simulation-based power estimates for association on a genome-wide scale for a range of ADR pharmacogenetic effects, given a fixed set of 200 population controls. The relationship between the number of ADR cases (sample size), risk allele frequency (pd), and statistical power (contours) is displayed for additive, dominant, and recessive genetic models (rows), three values of marker LD with the functional risk allele (r² = 1.0, 0.7, and 0.4; columns), and two test-wise significance levels (2 × 10⁻⁵ and 10⁻⁷). Power was simulated as described in Methods over the range of conditions outlined in Table 2. Contours corresponding to power of 0.2, 0.4, 0.6, 0.8, and 0.99 are displayed.

Supplementary Figure 3: Allelic intensities for 39 markers with minimum p-values less than 2 × 10⁻⁵. Mean adjusted allelic intensities for all cases and controls are shown on one plot for each marker.
The figure legend on each plot indicates the colors corresponding to each genotype call, and the symbol type indicates the case-control status. The quality score index and minimum p-value are displayed below each plot.

Supplementary Figure 4: Test-wise critical values for a sequential genome-wide analysis for ADR association. Critical values (αP(τ)) were derived using the Lan and DeMets alpha spending function with a Pocock boundary, as described in Methods, for test-wise significance levels of 2 × 10⁻⁵ and 10⁻⁷. Estimates are presented for 1, 2, 3, 4, 5, 10, and 20 equally spaced analyses as a fraction of the total (maximum) number of cases to be collected.

Supplementary Figure 5: Simulation-based power estimates for a sequential genome-wide analysis for ADR association for a common (K = 0.05) and large pharmacogenetic effect (GRR2 = 30), given a fixed set of 200 clinical or population controls. The statistical power is displayed for a dominant genetic model and a functional risk allele with a frequency of 0.05, with cases analyzed sequentially after every five new cases. Power is shown for critical values derived from test-wise significance levels of 2 × 10⁻⁵ (red) and 10⁻⁷ (blue) and for 200 clinical (solid) and 200 population (dashed) controls, assuming a maximum of 50 cases available for analysis.

Supplementary Figure 6: Genome-wide association results analyzed sequentially, comparing allele and genotype frequencies between 5, 10, 15, 20, and 22 ABC HSR cases and 203 population controls. Allelic and genotypic exact test p-values are shown for each SNP tested where P < 0.01. P-values are shown on the y axis on a –log10 scale versus genomic position by chromosome on the x axis. The pink horizontal line corresponds to the 10⁻⁷ test-wise significance level derived from the Lan and DeMets alpha spending function with a Pocock boundary. The red arrow on the short arm of chromosome 6 corresponds to the position of the HLA-B gene.
Numbers above each graph correspond to the number of valid exact tests carried out and the number (percent) of results displayed above –log10(P) = 2.
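
The legends for Supplementary Figures 4 and 6 refer to the Lan and DeMets alpha spending function with a Pocock boundary. As a rough illustration of what such a function does, the sketch below evaluates the standard Pocock-type spending function, α(t) = α ln(1 + (e − 1)t), at equally spaced analyses and reports the cumulative and incremental type I error spent at each one. The function names are hypothetical and chosen only for this example. This is not the procedure used to produce the critical values in Supplementary Figure 4: converting spent alpha into a test-wise critical value at each analysis also requires accounting for the correlation between successive test statistics, which is omitted here.

```python
import math


def pocock_type_spending(alpha, t):
    """Cumulative type I error spent at information fraction t (0 <= t <= 1).

    Uses the Lan-DeMets Pocock-type form: alpha * ln(1 + (e - 1) * t).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("information fraction must lie in [0, 1]")
    return alpha * math.log(1.0 + (math.e - 1.0) * t)


def spending_schedule(alpha, n_looks):
    """Return (fraction, cumulative, incremental) alpha for equally spaced looks."""
    schedule = []
    previous = 0.0
    for k in range(1, n_looks + 1):
        t = k / n_looks  # fraction of the maximum number of cases collected
        cumulative = pocock_type_spending(alpha, t)
        schedule.append((t, cumulative, cumulative - previous))
        previous = cumulative
    return schedule


if __name__ == "__main__":
    # Test-wise significance levels and look counts taken from the legends.
    for alpha in (2e-5, 1e-7):
        for n_looks in (1, 5, 10):
            print(f"alpha = {alpha:g}, analyses = {n_looks}")
            for t, cumulative, incremental in spending_schedule(alpha, n_looks):
                print(f"  t = {t:.2f}  spent = {cumulative:.3e}  "
                      f"increment = {incremental:.3e}")
```

Because the spending function is concave in t, earlier analyses receive a larger share of the overall significance level than later ones, and the full level is spent once all cases have been collected (t = 1).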