Data source WTCCC Diseases Bipolar disorder Coronary artery disease Crohn’s disease Hypertension Rheumatoid Type 1 diabetes Ancestry Europe Europe Europe Europe Europe Europe (Cases, controls) (1817, 2928) (1878, 2928) (1729, 2928) (1934, 2928) (1894, 2928) (1939, 2928) Cross-validation 5-fold 5-fold 5-fold 5-fold 5-fold 5-fold Functional SNP sets for 2D PRS Europe (5937, 10862) 10-fold Asian (5510, 4544) 10-fold Europe (5066, 8807) 10-fold blood eSNPs1,2, CR-SNPs3, active histone marks H3K4me3 and H3K9-14Ac in HAEC4, active histone marks in bladder cell lines downloaded from the ROADMAP project, lung related functional SNPs (eSNPs5 and meSNPs6 in lung tissues, H3K4me and H3K-14Ac in HAEC4) blood eSNPs1,2, CR-SNPs3, eSNPs5 and meSNPs6 in lung tissues, active histone marks H3K4me3 and H3K9-14Ac in HAEC4, pleiotropic SNPs with p<0.01 (denoted as PT-0.01) or p<0.001 (denoted as PT-0.001) in at least one other trait. CR-SNPs3, eSNPs/meSNPs in adipose7,8, combined active histone mark (H3K4me3, H3K9-14Ac, H3K36me3, H3K4me1, H3K9ac and H3K9me3) SNPs in pancreatic islet cells and primary pancreatic cells downloaded from the ROADMAP project, PT-0.01 and PT-0.001 SNPs. Bladder cancer Three cancer GWAS with individual genotype data Lung cancer, Asian non-smoking females Pancreatic cancer blood eSNPs1,2, CR-SNPs3 Table S1A: Disease GWAS with individual genotype data used for evaluating risk prediction performance. Type 2 diabetes Validation sample Data sources (Cases, controls) GERA (1500,1500) Europe Discovery sample Data sources (Cases, controls) DIAGRAM (17,802, 105,109) GERA TRICL (11,300, 15,952) PLCO (1237,1330) Europe Europe PGC2 GECCO (31,560,42,951) (9,719, 10,937) MGS PLCO (2681,2653) (1000,2302) Europe African Japanese Latino PRACTICAL ELLIPSE (38,703, 40,796) Pegsus (4600,2941) Ancestry Europe Lung cancer Schizophrenia Colorectal cancer Prostate cancer Table S1B: Disease GWAS with independent validation samples for evaluating prediction performance. 
Functional SNP sets for 2D PRS:
Type 2 diabetes: CR-SNPs3, eSNPs/meSNPs in adipose7,8, histone mark SNPs in pancreatic islet cells.
Lung cancer: blood eSNPs1,2, eSNPs5 and meSNPs6 in lung tissues, CR-SNPs3, H3K4me3 in SAEC9, PT-0.01 and PT-0.001 SNPs.
Schizophrenia: blood eSNPs1,2, CR-SNPs3, PT-0.01 and PT-0.001 SNPs.
Colorectal cancer: blood eSNPs1,2, CR-SNPs3, PT-0.01 and PT-0.001 SNPs, histone mark SNPs in colon/rectal cells in the ROADMAP project.
Prostate cancer: blood eSNPs1,2, CR-SNPs3, PT-0.01 and PT-0.001 SNPs, TCF7L2/H3K27Ac (-DHT)/H3K27Ac (+DHT) in LNCaP cells10.

Columns within each metric give the winner's curse correction (NO, LASSO, MLE).

Disease and PRS         Prediction R2             Nagelkerke R2             AUC
                        NO      LASSO   MLE       NO      LASSO   MLE       NO      LASSO   MLE
Bipolar disorder
  1D                    5.59%   5.64%   5.62%     7.59%   7.65%   7.62%     0.635   0.636   0.635
  2D, blood eSNPs       5.75%   5.74%   5.72%     7.80%   7.79%   7.76%     0.637   0.636   0.636
  2D, CR SNPs           5.72%   5.75%   5.76%     7.76%   7.80%   7.81%     0.637   0.637   0.637
Coronary artery disease
  1D                    1.63%   1.58%   1.61%     2.22%   2.14%   2.18%     0.572   0.571   0.572
  2D, blood eSNPs       1.73%   1.67%   1.72%     2.34%   2.26%   2.34%     0.575   0.574   0.575
  2D, CR SNPs           1.79%   1.72%   1.70%     2.43%   2.33%   2.31%     0.578   0.574   0.572
Crohn’s disease
  1D                    6.65%   8.22%   7.60%     9.25%   11.32%  10.43%    0.646   0.660   0.656
  2D, blood eSNPs       7.71%   8.75%   8.40%     10.59%  12.10%  11.55%    0.658   0.667   0.663
  2D, CR SNPs           6.85%   8.25%   7.75%     9.34%   11.40%  10.61%    0.651   0.660   0.656
Hypertension
  1D                    3.04%   3.02%   3.07%     4.12%   4.09%   4.16%     0.597   0.597   0.598
  2D, blood eSNPs       3.33%   3.28%   3.27%     4.52%   4.45%   4.44%     0.601   0.600   0.600
  2D, CR SNPs           3.23%   3.15%   3.2       4.38%   4.27%   4.34%     0.600   0.600   0.600
Rheumatoid arthritis
  1D                    7.24%   8.60%   7.60%     9.77%   11.59%  10.30%    0.653   0.669   0.659
  2D, blood eSNPs       7.12%   8.69%   7.74%     9.63%   11.71%  10.44%    0.650   0.671   0.658
  2D, CR SNPs           7.50%   8.68%   7.84%     10.11%  11.69%  10.56%    0.657   0.670   0.661
Type 1 diabetes
  1D                    18.20%  18.50%  18.20%    26.09%  26.99%  26.03%    0.754   0.758   0.754
  2D, blood eSNPs       18.30%  18.70%  18.40%    26.31%  26.87%  26.54%    0.755   0.758   0.756
  2D, CR SNPs           18.50%  18.70%  18.50%    26.70%  27.20%  26.64%    0.757   0.758   0.756

Table S2: Prediction R2 (=cor(y,PRS)^2), Nagelkerke R2
and AUC in the WTCCC data, based on five-fold cross-validation. Disease PRS High priority SNPs for 2D PRS NO LASSO MLE 0.4 0.5 0.3 Blood eSNPs (0.3,0.4) (0.6,0.5) (0.3,0.4) CR SNPs (0.3,0.1) (0.4,0.2) (0.9,0.2) Blood eSNPs CR SNPs 0.6 (0.5,0.4) (0.5, 0.01) 0.7 (0.8,0.4) (0.8,0.2) 0.6 (0.8,0.5) (0.7,0.3) Blood eSNPs CR SNPs 0.0001 (0.001,0.0001) (0.0001,0.00005) 0.005 (0.005,0.005) (0.005,0.005) 0.005 (0.005, 0.0005) (0.001, 0.0005) 0.3 0.4 0.4 Blood eSNPs (0.4,28) (0.9,0.3) (0.6,0.3) CR SNPs (0.2,0.3) (0.4,0.5) (0.4,0.3) 1D Bipolar disorder Coronary artery disease 2D 1D 2D 1D Crohn’s disease 2D 1D Hypertension 2D 1D Rheumatoid 2D 0.000001 0.001 0.00005 Blood eSNPs (0.00005,0.00005) (0.005,0.001) (0.0001,0.00005) CR SNPs (0.00001, 0.00001) (0.001,0.005) (0.000000005,0.00005) Blood eSNPs CR SNPs 0.00001 (0.00001,0.000001) (0.000001,0.00005) 0.005 (0.01,0.005) (0.001,0.005) 0.0001 (0.0005,0.0001) (0.00001,0.00005) 0.1 0.2 0.2 Blood eSNPs (0.05,0.1) (0.3,0.2) (0.4,0.02) CR SNPs (0.7,0.1) (0.8,0.2) (0.6,0.1) 1D Type 1 diabetes 2D 1D Type 2 diabetes 2D Winner’s curse correction Table S3: Optimal P-value thresholds for including SNPs for 1D and 2D PRS for WTCCC data. This table corresponds to the results reported in Figure 3 and Supplemental Table S2. For each disease, we have performed five-fold cross-validation. For each cross-validation, we determined the optimal threshold for 1D PRS and a pair of thresholds for 2D PRS. The reported data were the median of the five cross-validation results. 
Prediction R2 Nagelkerke R2 AUC Winner’s curse correction Winner’s curse correction Winner’s curse correction NO LASSO MLE NO MLE NO 1D 2.20% 2.54% 2.41% 3.26% 3.68% 3.47% 0.587 0.594 0.590 2D, CR-SNPs 2.56% 2.88% 2.74% 3.81% 4.24% 4.07% 0.596 0.601 0.598 2D, histone SNPs, pancreatic islet 2.57% 2.86% 2.72% 3.79% 4.18% 3.99% 0.597 0.601 0.598 2D, histone SNPs, pancreatic 2.44% 2.86% 2.66% 3.57% 4.18% 3.88% 0.592 0.600 0.597 2D, PT-0.001 SNPs 2.50% 2.64% 2.60% 3.63% 3.83% 3.75% 0.594 0.597 0.595 2D, PT-0.01 SNPs 2.48% 2.68% 2.58% 3.61% 3.89% 3.77% 0.593 0.596 0.595 2D, eSNPs/meSNPs in adipose 2.59% 2.81% 2.73% PRS and high-priority SNPs for 2D PRS Disease Pancreatic cancer Asian lung LASSO MLE 3.75% 4.06% 3.95% 0.593 0.599 0.598 1D 2.35% 2.51% 2.50% 3.15% 3.36% 3.35% 0.586 0.591 0.590 2D, blood SNPs 2.58% 2.63% 2.62% 3.46% 3.51% 3.50% 0.592 0.593 0.592 2D, CR-SNPs 2.42% 2.58% 2.57% 3.24% 3.46% 3.43% 0.588 0.591 0.591 2D, PT-0.01 2.70% 2.72% 2.82% 3.61% 3.64% 3.77% 0.593 0.594 0.595 2D, PT-0.001 2.70% 2.69% 2.76% 3.61% 3.60% 3.69% 0.594 0.594 0.595 2D, H3kme3, HAEC 2.76% 2.74% 2.84% 4.09% 4.07% 4.20% 0.595 0.595 0.596 2D, H3K9-14Ac, HAEC 2.65% 2.69% 2.76% 4.02% 4.07% 4.15% 0.593 0.594 0.596 2.63% 1.29% 2.62% 1.22% 4.01% 3.98% 4.08% 0.591 0.592 0.592 1D 2.55% 1.12% 1.53% 1.78% 1.68% 0.561 0.565 0.563 2D, CR-SNPs 1.34% 1.33% 1.34% 1.84% 1.83% 1.84% 0.568 0.566 0.566 2D, blood eSNPs 1.30% 1.46% 1.36% 1.79% 2.00% 1.87% 0.566 0.569 0.568 2D, H3K4me3, HAEC 1.47% 1.61% 1.57% 2.03% 2.21% 2.17% 0.570 0.574 0.573 2D, H3K9-14Ac, HAEC 1.45% 1.55% 1.47% 1.99% 2.13% 2.02% 0.570 0.573 0.571 2D, histone SNPs, OADMAP bladder 1.46% 1.57% 1.53% 2D, functional SNPs in lung tissues 1.54% 1.64% 1.62% 2.01% 2.13% 2.15% 2.25% 2.11% 2.23% 0.571 0.572 0.572 0.575 0.571 0.573 2D, eSNPs and meSNPs in lung Bladder LASSO Table S4: Prediction R2 (=cor(y,PRS)2), Nagelkerke R2 and AUC in the three cancer GWAS data sets, based on 10-fold cross-validation. 
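The three metrics reported in these tables (prediction R2 = cor(y,PRS)^2, Nagelkerke R2 and AUC) can be computed along the following lines. This is a minimal sketch in plain Python and not the authors' code: it assumes a binary outcome y coded 0/1, a single PRS s with no covariates, a logistic model fitted by Newton-Raphson, and the standard Nagelkerke definition. All function names are illustrative.

```python
import math

def prediction_r2(y, s):
    """Squared Pearson correlation between outcome y and score s."""
    n = len(y)
    my, ms = sum(y) / n, sum(s) / n
    sxy = sum((a - my) * (b - ms) for a, b in zip(y, s))
    syy = sum((a - my) ** 2 for a in y)
    sss = sum((b - ms) ** 2 for b in s)
    return sxy * sxy / (syy * sss)

def auc(y, s):
    """Probability that a random case outscores a random control (ties count 1/2)."""
    cases = [b for a, b in zip(y, s) if a == 1]
    ctrls = [b for a, b in zip(y, s) if a == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in ctrls)
    return wins / (len(cases) * len(ctrls))

def fit_logistic(y, s, iters=50):
    """Newton-Raphson fit of logit P(y=1|s) = a + b*s; returns (a, b, log-likelihood)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, si in zip(y, s):
            p = 1.0 / (1.0 + math.exp(-(a + b * si)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * si     # gradient w.r.t. slope
            h00 += w                # entries of the information matrix
            h01 += w * si
            h11 += w * si * si
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    ll = sum(yi * (a + b * si) - math.log(1.0 + math.exp(a + b * si))
             for yi, si in zip(y, s))
    return a, b, ll

def nagelkerke_r2(y, s):
    """Cox-Snell R2 rescaled by its maximum attainable value."""
    n = len(y)
    p0 = sum(y) / n
    ll_null = n * (p0 * math.log(p0) + (1 - p0) * math.log(1 - p0))
    _, _, ll_model = fit_logistic(y, s)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    return cox_snell / (1.0 - math.exp(2.0 * ll_null / n))
```

The Newton-Raphson fit only converges when cases and controls are not perfectly separated by the score, which holds for any realistic PRS.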
PRS and high-priority SNPs for 2D PRS Disease Prediction R2 Winner’s curse correction NO 1D Pancreatic cancer Bladder cancer MLE 0.176 0.104 0.176 0.107 0.145 0.109 0.059 0.071 2D, histone SNPs, pancreatic 0.149 0.063 0.055 2D, PT-0.001 SNPs 0.076 0.125 0.023 2D, PT-0.01 SNPs 0.044 0.131 0.023 2D, eSNPs/meSNPs in adipose 0.132 0.144 0.135 0.220 0.124 2D, CR-SNPs 2D, histone SNPs, pancreatic islet 1D Asian lung LASSO 2D, blood SNPs 0.113 0.111 0.058 2D, CR-SNPs 0.250 0.140 0.052 2D, PT-0.01 0.112 0.092 0.028 2D, PT-0.001 0.078 0.082 0.025 2D, H3K4me3, HAEC 0.155 0.114 0.097 2D, H3K9-14Ac, HAEC 0.200 0.221 0.137 2D, eSNPs and meSNPs in lung 0.220 0.121 0.072 1D 0.50 0.15 0.13 2D, CR-SNPs 0.06 0.06 0.02 2D, blood eSNPs 0.15 0.12 0.13 2D, H3K4me3, HAEC 0.08 0.02 0.05 2D, H3K9-14Ac, HAEC 0.09 0.07 0.10 2D, histone SNPs, ROADMAP bladder 0.15 0.12 0.15 0.13 0.07 0.11 2D, functional SNPs in lung tissues

Table S5: P-values for testing whether a PRS statistically significantly improved the risk prediction for the three cancer GWAS. P-values were calculated based on a t-statistic with standard deviation estimated by 10-fold cross-validation.
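One plausible reading of the test behind Table S5 is sketched below. The source only states that the P-values come from a t-statistic whose standard deviation is estimated from the 10 folds. The pairing of folds, the one-sided alternative and the K-1 degrees of freedom are assumptions made here for illustration, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """Student-t CDF by Simpson's rule on the density (adequate for a sketch)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps  # steps must be even for Simpson's rule
    area = density(0.0) + density(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * density(i * h)
    area *= h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def improvement_p_value(r2_new, r2_base):
    """One-sided P-value that the new PRS beats the baseline across CV folds.

    r2_new and r2_base hold the per-fold prediction R2 of the two scores,
    in the same fold order (paired design; an assumption of this sketch).
    """
    diffs = [a - b for a, b in zip(r2_new, r2_base)]
    k = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(k))
    return 1.0 - t_cdf(t_stat, k - 1)
```

Because cross-validation folds share training data, the per-fold differences are not independent, so a P-value of this kind is best read as approximate.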
PRS and high-priority SNPs for 2D PRS Disease NO 5×10-6 LASSO 10-3 MLE 10-4 2D, CR-SNPs (10-6,5×10-6) (0.001, 5×10-4) (5×10-4, 10-5) 2D, histone SNPs, pancreatic islet (10-5,5×10-6) (0.002, 10-4) (5×10-4,5×10-6) 2D, histone SNPs, pancreatic (5×10-5, 10-5) (0.005, 5×10-5) (10-3, 5×10-5) 2D, PT-0.001 SNPs (10-5, 10-6) (10-3, 10-4) (5×10-4, 10-5) 2D, PT-0.01 SNPs (5×10-6,5×10-6) (10-3, 5×10-4) (5×10-5, 10-5) (10-5, 5×10-7) (0.002, 10-4) (10-4,5×10-6) 5×10-6 10-4 10-5 (5×10-6, 5×10-7) (5×10-4, 10-4) (10-5, 10-5) 2D, CR-SNPs (7,5×10-6) (5×10-5, 10-4) (10-5, 10-5) 2D, PT-0.01 (10-5, 5×10-7) (5×10-4, 5×10-5) (10-4, 10-5) 1D Pancreatic cancer 2D, eSNPs/meSNPs in adipose 1D 2D, blood SNPs Asian lung 2D, PT-0.001 (10 , 10 ) (10 , 5×10 ) (10-4, 10-5) 2D, H3kme3, HAEC (10-6, 10-6) (5×10-4, 5×10-5) (5×10-6, 10-5) 2D, H3K9-14Ac, HAEC (1,5×10-6) (10-5, 10-4) (5×10-5, 5×10-5) 1D (10-6, 5×10-7) 5×10-6 (5×10-4, 10-4) 10-4 (5×10-6, 10-5) 5×10-5 2D, CR-SNPs (5×10-6, 10-7) (5×10-4, 10-4) (5×10-5, 10-5) (5×10-6,5×10-6) (10-3, 10-4) (10-4, 0.00005) 2D, H3K4me3, HAEC (10-4,5×10-6) (0.005, 10-4) (0.002, 10-5) 2D, H3K9-14Ac, HAEC (10-5, 5×10-7) (0.005, 10-4) (10-3,5×10-6) (10-3, 10-5) (0.005, 5×10-4) (5×10-4, 10-4) 2D, eSNPs and meSNPs in lung 2D, blood eSNPs Bladder Winner’s curse correction 2D, histone SNPs, ROADMAP bladder -5 -7 -3 -5 (10-4, 10-6) (0.002, 10-4) (5×10-4, 10-5) 2D, functional SNPs in lung tissues Table S6: Optimal P-value thresholds for including SNPs for 1D and 2D PRS for three cancers GWAS. This table corresponds to the results reported in Figure 3 and Supplemental Table S4. For each disease, we have performed 10-fold cross-validation. For each cross-validation, we determined the optimal threshold for 1D PRS and a pair of thresholds for 2D PRS. The reported data were the median of the ten cross-validation results. 
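The threshold selection summarised in Tables S3, S6 and S9 can be sketched as a grid search per cross-validation fold followed by a median across folds. The sketch below is illustrative only: it assumes a SNP enters the score when its P-value is at or below the threshold of its own set, it omits the winner's curse correction of the effect sizes, and it takes the median of each threshold separately. The function names and the `evaluate` callback are hypothetical.

```python
from statistics import median

def prs_2d(genotypes, betas, pvals, high_priority, t_high, t_low):
    """2D PRS for one subject.

    A SNP is included if its discovery P-value passes the threshold of the
    set it belongs to: t_high for high-priority SNPs, t_low for the rest.
    Setting t_high == t_low gives the ordinary 1D PRS.
    """
    score = 0.0
    for j, g in enumerate(genotypes):
        cutoff = t_high if j in high_priority else t_low
        if pvals[j] <= cutoff:
            score += betas[j] * g
    return score

def best_threshold_pair(grid, evaluate):
    """Return the (t_high, t_low) pair on the grid that maximises evaluate()."""
    pairs = ((th, tl) for th in grid for tl in grid)
    return max(pairs, key=lambda pair: evaluate(*pair))

def median_pair(pairs):
    """Median of each threshold across folds, taken separately per component."""
    return median(p[0] for p in pairs), median(p[1] for p in pairs)
```

In use, `evaluate(t_high, t_low)` would return the prediction R2 obtained on the tuning part of a fold, and `median_pair` would be applied to the list of per-fold winners.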
Columns within each metric give the winner's curse correction (NO, LASSO, MLE).

PRS and high-priority SNPs for 2D PRS     Prediction R2             Nagelkerke R2             AUC
                                          NO      LASSO   MLE       NO      LASSO   MLE       NO      LASSO   MLE
T2D
  1D                                      2.29%   3.10%   2.67%     3.05%   4.13%   3.56%     0.582   0.597   0.590
  2D, CR-SNPs                             2.73%   3.32%   3.11%     3.64%   4.43%   4.15%     0.594   0.600   0.600
  2D, histone SNPs, pancreatic islet      2.58%   3.23%   2.81%     3.44%   4.32%   3.75%     0.590   0.600   0.594
  2D, eSNPs/meSNPs                        2.58%   3.28%   2.83%     3.44%   4.38%   3.78%     0.587   0.600   0.593
  2D, eSNPs/meSNPs and H3K4me3 in islet   2.90%   3.53%   3.13%     3.87%   4.71%   4.17%     0.598   0.605   0.598
  2D, eSNPs/meSNPs, CR-SNPs               2.92%   3.48%   3.30%     3.89%   4.65%   4.41%     0.594   0.602   0.601
EUR lung
  1D                                      1.13%   1.12%   1.12%     1.52%   1.48%   1.50%     0.564   0.563   0.563
  2D, CR-SNPs                             1.17%   1.23%   1.16%     1.55%   1.64%   1.55%     0.564   0.564   0.564
  2D, eSNPs and meSNPs in lung            1.14%   1.22%   1.13%     1.52%   1.63%   1.51%     0.563   0.566   0.563
  2D, eSNPs and meSNPs                    1.14%   1.31%   1.12%     1.52%   1.75%   1.49%     0.564   0.571   0.563
  2D, PT-0.01 SNPs                        1.14%   1.12%   1.14%     1.52%   1.49%   1.52%     0.564   0.563   0.563
  2D, PT-0.001 SNPs                       1.15%   1.21%   1.14%     1.54%   1.61%   1.52%     0.567   0.567   0.563
  2D, H3K4me3, SAEC                       1.13%   1.35%   1.21%     1.51%   1.80%   1.61%     0.560   0.569   0.565
  2D, eSNPs, meSNPs and H3K4me3 in SAEC   1.14%   1.65%   1.25%     1.52%   1.98%   1.67%     0.566   0.574   0.567
Prostate
  1D                                      6.94%   6.87%   6.98%     9.43%   9.35%   9.48%     0.654   0.652   0.654
  2D, blood eSNPs                         6.95%   6.93%   7.15%     9.44%   9.43%   9.72%     0.654   0.653   0.656
  2D, CR-SNPs                             6.95%   7.05%   6.98%     9.44%   9.58%   9.49%     0.654   0.653   0.654
  2D, PT-0.001                            6.94%   7.10%   6.98%     9.43%   9.67%   9.49%     0.654   0.655   0.654
  2D, PT-0.01                             6.94%   7.04%   7.02%     9.43%   9.58%   9.55%     0.654   0.654   0.654
  2D, H3K27Ac, -DHT                       7.02%   7.10%   7.10%     9.54%   9.65%   9.65%     0.655   0.655   0.655
  2D, H3K27Ac, +DHT                       6.95%   7.06%   6.98%     9.45%   9.60%   9.48%     0.654   0.654   0.653
  2D, TCF7L2                              6.96%   6.90%   7.00%     9.45%   9.38%   9.51%     0.654   0.652   0.654

Table S7: Prediction R2 (=cor(y,PRS)^2), Nagelkerke R2 and AUC for five large-scale GWAS summary statistics with independent validation data.
Columns within each metric give the winner's curse correction (NO, LASSO, MLE).

PRS and high-priority SNPs for 2D PRS     Prediction R2             Nagelkerke R2             AUC
                                          NO      LASSO   MLE       NO      LASSO   MLE       NO      LASSO   MLE
CRC
  1D                                      1.37%   1.33%   1.26%     1.93%   1.87%   1.78%     0.571   0.570   0.568
  2D, blood eSNPs                         1.40%   1.40%   1.41%     1.97%   1.96%   1.98%     0.570   0.571   0.572
  2D, CR-SNPs                             1.34%   1.33%   1.28%     1.92%   1.86%   1.78%     0.570   0.570   0.568
  2D, PT-0.001                            1.41%   1.39%   1.32%     1.93%   1.92%   1.81%     0.570   0.571   0.569
  2D, PT-0.01                             1.38%   1.35%   1.28%     1.97%   1.93%   1.84%     0.571   0.571   0.570
  2D, H3K27ac                             1.44%   1.47%   1.51%     2.04%   2.07%   2.11%     0.571   0.570   0.571
  2D, H3K36me3                            1.36%   1.32%   1.31%     1.93%   1.86%   1.84%     0.571   0.570   0.569
  2D, H3K4me1                             1.40%   1.38%   1.42%     1.98%   1.95%   2.00%     0.571   0.571   0.570
  2D, H3K4me3                             1.39%   1.33%   1.27%     1.96%   1.88%   1.80%     0.572   0.570   0.569
  2D, H3K9ac                              1.38%   1.37%   1.29%     1.96%   1.92%   1.82%     0.571   0.571   0.569
SCZ
  1D                                      14.01%  14.94%  14.89%    18.75%  19.99%  19.91%    0.717   0.724   0.724
  2D, blood eSNPs                         14.10%  14.94%  14.91%    18.88%  19.99%  19.94%    0.718   0.724   0.723
  2D, CR-SNPs                             14.25%  15.37%  15.15%    19.03%  20.56%  20.24%    0.718   0.727   0.725
  2D, PT-0.001 SNPs                       14.09%  15.00%  14.95%    18.83%  20.02%  20.00%    0.717   0.724   0.724
  2D, PT-0.01 SNPs                        14.07%  14.97%  14.95%    18.85%  19.99%  19.95%    0.718   0.724   0.724

Table S7 (continued): Prediction R2 (=cor(y,PRS)^2), Nagelkerke R2 and AUC for five large-scale GWAS summary statistics with independent validation data.
PRS and high-priority SNPs for 2D PRS     Winner’s curse correction
                                          NO        LASSO     MLE
CRC
  1D                                      -         0.5879    0.7412
  2D, blood eSNPs                         0.4152    0.4404    0.4245
  2D, CR-SNPs                             0.6179    0.5755    0.6736
  2D, PT-0.001                            0.3040    0.4621    0.5941
  2D, PT-0.01                             0.4638    0.5442    0.6736
  2D, H3K27ac                             0.3632    0.3556    0.2951
  2D, H3K36me3                            0.5362    0.6038    0.6239
  2D, H3K4me1                             0.4207    0.4790    0.3962
  2D, H3K4me3                             0.4207    0.5793    0.7007
  2D, H3K9ac                              0.4715    0.5000    0.6631
SCZ
  1D                                      -         1.5E-11   2.2E-09
  2D, blood eSNPs                         1.7E-01   1.5E-11   6.0E-08
  2D, CR-SNPs                             2.0E-01   3.2E-10   5.8E-06
  2D, PT-0.001 SNPs                       3.5E-02   3.1E-10   1.6E-08
  2D, PT-0.01 SNPs                        2.7E-01   9.9E-10   8.8E-08
T2D
  1D                                      -         0.00173   0.02748
  2D, CR-SNPs                             0.04529   0.00030   0.00313
  2D, histone SNPs, pancreatic islet      0.07353   0.00059   0.01513
  2D, eSNPs/meSNPs                        0.13234   0.00048   0.01890
  2D, eSNPs/meSNPs and H3K4me3 in islet   0.01468   0.00002   0.00256
  2D, eSNPs/meSNPs, CR-SNPs               0.01222   0.00004   0.00038
EUR lung
  1D                                      -         0.5285    0.5662
  2D, CR-SNPs                             0.4166    0.3446    0.4300
  2D, eSNPs and meSNPs in lung            0.4778    0.3478    0.5000
  2D, eSNPs and meSNPs                    0.4778    0.2169    0.5199
  2D, PT-0.01 SNPs                        0.4693    0.5222    0.4668
  2D, PT-0.001 SNPs                       0.4532    0.3085    0.4715
  2D, H3K4me3, SAEC                       0.5000    0.1694    0.3581
  2D, eSNPs, meSNPs and H3K4me3 in SAEC   0.4878    0.1399    0.3413
Prostate
  1D                                      -         0.6306    0.2866
  2D, blood eSNPs                         0.4602    0.5173    0.0401
  2D, CR-SNPs                             0.4327    0.3162    0.3581
  2D, PT-0.001                            0.5000    0.2767    0.3792
  2D, PT-0.01                             0.5000    0.3446    0.2692
  2D, H3K27Ac, -DHT                       0.2119    0.2611    0.1587
  2D, H3K27Ac, +DHT                       0.4594    0.3222    0.4070
  2D, TCF7L2                              0.3170    0.5721    0.2209

Table S8: P-values for testing whether a PRS statistically significantly improved the risk prediction for five large-scale GWAS summary statistics, based on bootstrap.
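A bootstrap comparison of the kind named in the caption of Table S8 might look as follows. The source does not describe the resampling scheme, so this is only a sketch under stated assumptions: validation subjects are resampled with replacement, the statistic is the difference in prediction R2 between a candidate PRS and a baseline PRS, and the one-sided P-value is the share of resamples in which the candidate is not better. The function names are hypothetical.

```python
import random

def prediction_r2(y, s):
    """Squared Pearson correlation between outcome y and score s."""
    n = len(y)
    my, ms = sum(y) / n, sum(s) / n
    sxy = sum((a - my) * (b - ms) for a, b in zip(y, s))
    syy = sum((a - my) ** 2 for a in y)
    sss = sum((b - ms) ** 2 for b in s)
    return sxy * sxy / (syy * sss)

def bootstrap_p_value(y, s_new, s_base, n_boot=1000, seed=0):
    """Share of bootstrap resamples where s_new does not improve R2 over s_base."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(y)
    not_better = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects
        yb = [y[i] for i in idx]
        r2_new = prediction_r2(yb, [s_new[i] for i in idx])
        r2_base = prediction_r2(yb, [s_base[i] for i in idx])
        if r2_new - r2_base <= 0:
            not_better += 1
    return not_better / n_boot
```

Resampling the same subjects for both scores keeps the comparison paired, which matters when the two scores are highly correlated.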
PRS and high-priority SNPs for 2D PRS Disease MLE 0.008 0.01 (0.002, 5×10 ) (0.02,0.005) (0.01, 0.00005) 2D, histone SNPs, pancreatic islet (0.1,0.002) (0.03,0.008) (0.02,0.005) 2D, eSNPs/meSNPs (0.02,0.002) (0.03,0.008) (0.02,0.002) -5 2D, CR-SNPs EUR lung -8 (0.03,0.005) (0.02,0.005) 10-4 (0.02, 0.002) (0.01,0.0001) 10-7 2D, CR-SNPs (10-10, 5×10-9) (0.02, 5×10-4) (10-10, 10-6) 2D, eSNPs and meSNPs in lung (5×10-9, 10-10) (10-4, 5×10-10) (10-7, 10-10) 2D, eSNPs and meSNPs (5×10-9, 10-10) (0.01, 5×10-6) (10-7, 10-10) 2D, PT-0.01 SNPs (10-10, 5×10-9) (5×10-5,5×10-6) (5×10-8, 10-6) 2D, PL-0.001 SNPs (0.001, 5×10-9) (0.002,5×10-6) (5×10-8, 10-6) 1D 2D, blood eSNPs -9 (0.002, 5×10 ) (0.008, 10 ) (0.005, 10-7) (0.001, 10-6) (0.008,5×10-6) (0.005,5×10-6) 5×10-6 0.002 5×10-5 (5×10-6,5×10-6) (0.03,0.005) (5×10-5,0.001) -6 (5×10 , 10 ) (0.02,0.001) (5×10-5, 10-5) 2D, PT-0.001 (5×10-6,5×10-6) (0.06,0.002) (5×10-5, 10-5) 2D, PT-0.01 (5×10-6,5×10-6) (0.02,0.002) (5×10-5, 10-5) 2D, H3K27Ac, -DHT (5×10-4,5×10-6) (0.07,0.002) (0.04, 5×10-5) (10 ,5×10 ) (0.08,0.005) (0.04, 5×10-5) (0.005,5×10-6) (0.002,0.02) (0.005, 5×10-5) 2D, TCF7L2 1D -6 -5 -5 2D, CR-SNPs 2D, H3K27Ac, +DHT -6 0.005 0.008 0.008 2D, blood eSNPs (0.008,0.005) (0.008,0.02) (0.008,0.03) 2D, CR-SNPs (0.008,0.005) (0.01,0.008) (0.008,0.005) 2D, PT-0.001 (0.008,0.005) (0.02,0.008) (0.03,0.008) 2D, PT-0.01 (0.005,0.005) (0.01,0.008) (0.03,0.008) 2D, H3K27ac (0.03,0.005) (0.04,0.008) (0.03,0.005) 2D, H3K36me3 (0.008,0.005) (0.01,0.008) (0.03,0.008) 2D, H3K4me1 (0.005,0.005) (0.01,0.008) (0.03,0.008) 2D, H3K4me3 (0.002,0.005) (0.008,0.008) (0.008,0.005) 2D, H3K9ac (0.005,0.005) (0.01,0.008) (0.01,0.008) 0.2 0.3 0.2 2D, blood eSNPs (0.04,0.2) (0.3,0.3) (0.09,0.2) 2D, CR-SNPs (0.5,0.05) (0.8,0.3) (0.5,0.1) 2D, PT-0.001 SNPs (0.4,0.2) (0.3,0.7) (0.9,0.3) 1D SCZ 0.002 (0.01, 5×10 ) (0.01, 5×10-5) 5×10-9 2D, eSNPs, meSNPs and H3K4me3 in SAEC CRC NO 2D, eSNPs/meSNPs and H3K4me3 in islet 2D, eSNPs/meSNPs, CR-NPs 1D 2D, H3K4me3, 
SAEC Prostate Winner’s curse correction LASSO 1D T2D P-value thresholds 2D, PT-0.01 SNPs (0.01,0.2) (0.3,0.3) (0.07,0.2)

Supplemental Table S9: Optimal P-value thresholds for including SNPs for 1D and 2D PRS for five diseases with large-scale discovery data and independent validation samples. This table corresponds to the results reported in Figure 3 and Table S7.

Method            Δ=2                     Δ=3                     Δ=4
1D                5×10^-5                 5×10^-5                 5×10^-5
1D-LASSO          5×10^-4                 5×10^-4                 5×10^-4
1D-MLE            5×10^-4                 5×10^-4                 5×10^-4
2D-random         (0.01, 10^-5)           (0.04, 5×10^-6)         (0.08, 5×10^-6)
2D-random-LASSO   (0.03, 1×10^-4)         (0.08, 5×10^-5)         (0.2, 10^-4)
2D-random-MLE     (0.04, 5×10^-5)         (0.08, 10^-5)           (0.2, 5×10^-5)
2D-CR             (5×10^-5, 5×10^-5)      (5×10^-4, 5×10^-5)      (10^-4, 10^-5)
2D-CR-LASSO       (5×10^-3, 10^-4)        (0.002, 10^-4)          (0.001, 10^-4)
2D-CR-MLE         (5×10^-3, 10^-4)        (0.002, 10^-4)          (0.001, 5×10^-5)

Table S10: Optimal P-value thresholds for including SNPs for 1D and 2D PRS in simulation studies. For each parameter setting, 50 simulations were performed, and the P-value thresholds reported in the table are the medians over the 50 simulations. This table corresponds to the results reported in Figure 4. For 2D PRS, the two P-value thresholds correspond to the high-priority SNP set and the low-priority SNP set. “1D” denotes 1D PRS without winner’s curse correction; “1D-LASSO(MLE)” denotes 1D PRS with LASSO-type (MLE) correction; “2D-random” indicates 2D PRS with functional SNP sets randomly selected from the LD-pruned SNPs in the genome; “2D-CR” indicates 2D PRS using SNPs in conserved regions as functional SNPs. Δ is the enrichment fold change for the high-priority SNPs.

A: WTCCC data. Reported values are based on the average of five-fold cross validation.
Winner’s curse correction BD CAD CD HT RA T1D NO 0.043 0.021 0.364 0.036 0.710 0.649 LASSO 0.071 0.026 1.111 0.052 1.204 0.867 MLE 0.051 0.024 0.495 0.043 0.905 T2D 0.303 1.255 0.637 0.355

B: For pancreatic cancer, Asian nonsmoking female lung cancer and bladder cancer, the reported values are based on the average of 10-fold cross validation. For the other five diseases, the values are based on independent validation samples.

Disease                Winner’s curse correction
                       NO      LASSO   MLE
Pancreatic cancer      0.67    1.83    0.69
Bladder cancer         0.75    1.82    0.66
Lung cancer, Asian     0.91    1.88    0.99
Lung cancer, EUR       0.66    0.96    0.67
T2D                    0.22    0.74    0.26
Schizophrenia          0.17    0.31    0.23
Prostate cancer        0.60    0.86    0.61
Colorectal cancer      0.17    0.86    0.28

Table S11: Calibration comparison for 1D PRS modeling with or without winner’s curse correction. Reported values are the coefficient of the PRS in the logistic regression. A value close to one represents a well-calibrated prediction model. The calibration results for 2D PRS are similar to 1D PRS and are not reported here. LASSO-type winner’s curse correction has the smallest bias overall.

(A) Type 2 diabetes: number of samples (out of 1500 validation samples) with k-fold of population-average risk

k    Standard 1D PRS             Best PRS                    Ratio (best PRS/standard PRS),
     Theoretical   Empirical     Theoretical   Empirical     theoretical calculation
2    10.4          10            30.4          31            2.93
3    0.1           0             1.3           2             12.52
4    0.0014        0             0.069         0             51.26
5    0.000026      0             0.005         0             190.61

(B) Lung cancer in European population: number of samples (out of 1333 validation samples) with k-fold of population-average risk

k    Standard 1D PRS             Best PRS                    Ratio (best PRS/standard PRS),
     Theoretical   Empirical     Theoretical   Empirical     theoretical calculation
2    0.62          0             2.17          4             3.29
3    1.7E-4        0             0.0029        0             17.37
4    6.3E-8        0             0.0000055     0             88.55
5    4.3E-11       0             0.000000017   0             405.3

Table S12: Implication of identifying high-risk subjects based on PRS.
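The theoretical and empirical counts in Table S12 follow the calculation described in the note to the table. A minimal sketch of that calculation is given here, assuming a positive calibration slope and using the closed form A = exp(β̂²σ²/2) for the population-average risk, which follows from the normal assumption on the score. The function names and the inputs `beta` and `sigma` are illustrative and are not values from the paper.

```python
import math
from statistics import NormalDist

def population_average_risk(beta, sigma):
    """A = E[exp(beta * s)] for s ~ N(0, sigma^2), in closed form."""
    return math.exp(0.5 * (beta * sigma) ** 2)

def high_risk_cutoff(k, beta, sigma):
    """Score s0 at which exp(beta * s0) equals k times the average risk (beta > 0)."""
    return math.log(k * population_average_risk(beta, sigma)) / beta

def theoretical_proportion(k, beta, sigma):
    """P(s >= s0) under the normal model for the score."""
    s0 = high_risk_cutoff(k, beta, sigma)
    return 1.0 - NormalDist(0.0, sigma).cdf(s0)

def theoretical_count(k, beta, sigma, n_samples):
    """Expected number of subjects above k-fold average risk among n_samples."""
    return n_samples * theoretical_proportion(k, beta, sigma)

def empirical_count(scores, k, beta, sigma):
    """Observed number of validation scores at or above the same cutoff."""
    s0 = high_risk_cutoff(k, beta, sigma)
    return sum(s >= s0 for s in scores)
```

The standardised cutoff is s0/σ = log(k)/(β̂σ) + β̂σ/2, so the proportion depends on the PRS only through the product β̂σ. This is why a modest gain in predictive strength multiplies the tail counts several-fold at larger k.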
Here, we calculate the proportion of samples in the general population that is identified as high-risk based on a given PRS distribution. For a given PRS, we assume that the risk score follows a centered normal distribution, i.e. $s \sim N(0, \sigma^2)$, with parameters estimated from the validation sample. We first perform calibration by fitting a logistic regression $\mathrm{logit}(y \mid s) = \alpha + \beta s$ to derive $\hat{\beta}$. The calibrated risk score is then $\exp(\hat{\beta} s)$. The average risk in the population is then $A = \int \exp(\hat{\beta} s)\, \phi(s; 0, \sigma)\, ds$. To identify samples with projected risk greater than k-fold of the population-average risk, we need to find a cut-off $s_0$ such that $\exp(\hat{\beta} s_0) = kA$, i.e., $s_0 = \log(kA)/\hat{\beta}$. The theoretical proportion of samples is then calculated as $P(s \ge s_0)$, assuming a normal distribution $N(0, \sigma^2)$.

Here, we use two datasets, the T2D and lung cancer GWAS data, to illustrate the calculation. The parameter $\sigma$ was estimated from the control samples in the validation sample. We calculated the number of samples with k-fold greater risk out of the validation samples using the above theoretical calculations. We also calculated this number empirically based on the PRS in the validation sample. For each disease, we compared our best PRS with the standard 1D PRS without winner’s curse correction or integration of functional data. For T2D, the best PRS is the 2D PRS with eSNPs/meSNPs and H3K4me3 SNPs in the pancreatic islet cell line. For lung cancer, the best PRS is the 2D PRS with eSNPs, meSNPs and H3K4me3 SNPs in SAEC.

Figure S1: Randomly selected SNPs and SNPs related to conserved genomic regions (CR-SNPs) have different local linkage disequilibrium (LD) patterns. For each given SNP (either randomly selected from LD-pruned SNPs or CR-SNPs after pruning), we counted the number of SNPs that were located less than 1 Mb from the SNP and had $r^2 \ge 0.8$. Shown are the histograms of the number of LD SNPs for the two SNP sets. Mean=6.4 and median=2 for randomly selected SNPs, while mean=22.4 and median=12 for CR-SNPs.
Thus, CR-SNPs have a much stronger local LD pattern than randomly selected SNPs.

Figure S2: The prediction R2 for four diseases with large-scale discovery samples. For each disease, the left panel reports the 1D PRS R2 with varying p-value threshold; the right panel reports the 2D PRS R2 for the HP SNP set achieving the highest prediction.

Additional acknowledgements

Funding for GECCO (Genetics and Epidemiology of Colorectal Cancer Consortium)

GECCO: National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services (U01 CA137088; R01 CA059045). ASTERISK: a Hospital Clinical Research Program (PHRC) and supported by the Regional Council of Pays de la Loire, the Groupement des Entreprises Françaises dans la Lutte contre le Cancer (GEFLUC), the Association Anne de Bretagne Génétique and the Ligue Régionale Contre le Cancer (LRCC). COLO2&3: National Institutes of Health (R01 CA60987). DACHS: German Research Council (Deutsche Forschungsgemeinschaft, BR 1704/6-1, BR 1704/6-3, BR 1704/6-4 and CH 117/1-1), and the German Federal Ministry of Education and Research (01KH0404 and 01ER0814). DALS: National Institutes of Health (R01 CA48998 to M. L. Slattery). HPFS is supported by the National Institutes of Health (P01 CA055075, UM1 CA167552, R01 CA137178, R01 CA151993 and P50 CA127003), NHS by the National Institutes of Health (UM1 CA186107, R01 CA137178, P01 CA87969, R01 CA151993 and P50 CA127003) and PHS by the National Institutes of Health (R01 CA042182). MEC: National Institutes of Health (R37 CA54281, P01 CA033619, and R01 CA63464). OFCCR: National Institutes of Health, through funding allocated to the Ontario Registry for Studies of Familial Colorectal Cancer (U01 CA074783); see CCFR section above.
Additional funding toward genetic analyses of OFCCR includes the Ontario Research Fund, the Canadian Institutes of Health Research, and the Ontario Institute for Cancer Research, through generous support from the Ontario Ministry of Research and Innovation. PMH: National Institutes of Health (R01 CA076366 to P.A. Newcomb). VITAL: National Institutes of Health (K05 CA154337). WHI: The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, and HHSN271201100004C. GECCO: The authors would like to thank all those at the GECCO Coordinating Center for helping bring together the data and people that made this project possible. The authors acknowledge Dave Duggan and team members at TGEN (Translational Genomics Research Institute), the Broad Institute, and the Génome Québec Innovation Center for genotyping DNA samples of cases and controls, and for scientific input for GECCO. ASTERISK: We are very grateful to Dr. Bruno Buecher without whom this project would not have existed. We also thank all those who agreed to participate in this study, including the patients and the healthy control persons, as well as all the physicians, technicians and students. DACHS: We thank all participants and cooperating clinicians, and Ute Handte-Daub, Utz Benscheid, Muhabbet Celik and Ursula Eilber for excellent technical assistance. HPFS, NHS and PHS: We would like to acknowledge Patrice Soule and Hardeep Ranu of the Dana Farber Harvard Cancer Center High-Throughput Polymorphism Core who assisted in the genotyping for NHS, HPFS, and PHS under the supervision of Dr. Immaculata Devivo and Dr. David Hunter, Qin (Carolyn) Guo and Lixue Zhu who assisted in programming for NHS and HPFS, and Haiyan Zhang who assisted in programming for the PHS. 
We would like to thank the participants and staff of the Nurses' Health Study and the Health Professionals Follow-Up Study, for their valuable contributions as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, WY. The authors assume full responsibility for analyses and interpretation of these data. PMH: The authors would like to thank the study participants and staff of the Hormones and Colon Cancer study. WHI: The authors thank the WHI investigators and staff for their dedication, and the study participants for making the program possible. A full listing of WHI investigators can be found at: http://www.whi.org/researchers/Documents%20%20Write%20a%20Paper/WHI%20Investigator%20Short%20List.pdf

PanScan I, II and III authors:

Brian M. Wolpin1, 2, Cosmeri Rizzato3, Peter Kraft4, 5, Charles Kooperberg6, Gloria M. Petersen7, Zhaoming Wang8, 9, Alan A. Arslan10, 11, 12, Laura Beane-Freeman8, Paige M. Bracci13, Julie Buring14, 15, Federico Canzian3, Eric J. Duell16, Steven Gallinger17, Graham G. Giles18, 19, 20, Gary E. Goodman6, Phyllis J. Goodman21, Eric J. Jacobs22, Aruna Kamineni23, Alison P. Klein24, 25, Laurence N. Kolonel26, Matthew H. Kulke1, Donghui Li27, Núria Malats28, Sara H. Olson29, Harvey A. Risch30, Howard D. Sesso4, 14, 15, Kala Visvanathan31, Emily White32, 33, Wei Zheng34, 35, Christian C. Abnet8, Demetrius Albanes8, Gabriella Andreotti8, Melissa A. Austin33, Richard Barfield5, Daniela Basso36, Sonja I. Berndt8, Marie-Christine Boutron-Ruault37, 38, 39, Michelle Brotzman40, Markus W. Büchler41, H. Bas Bueno-de-Mesquita42, 43, 44, Peter Bugert45, Laurie Burdette8, 9, Daniele Campa46, Neil E. Caporaso8, Gabriele Capurso47, Charles Chung8, 9, Michelle Cotterchio48, 49, Eithne Costello50, Joanne Elena51, Niccola Funel52, J. Michael Gaziano14, 15, 53, Nathalia A. Giese41, Edward L. 
Giovannucci4, 54, 55, Michael Goggins56, 57, 58, Megan J. Gorman1, Myron Gross59, Christopher A. Haiman60, Manal Hassan27, Kathy J. Helzlsouer61, Brian E. Henderson62, Elizabeth A. Holly13, Nan Hu8, David J. Hunter2, 63, 64, Federico Innocenti65, Mazda Jenab66, Rudolf Kaaks46, Timothy J. Key67, Kay-Tee Khaw68, Eric A. Klein69, Manolis Kogevinas70, 71, 72, Vittorio Krogh73, Juozas Kupcinskas74, Robert C. Kurtz75, Andrea LaCroix6, Maria T. Landi8, Stefano Landi76, Loic Le Marchand77, Andrea Mambrini78, Satu Mannisto79, Roger L. Milne18, 19, Yusuke Nakamura80, Ann L. Oberg81, Kouros Owzar82, Alpa V. Patel22, Petra H. M. Peeters83, 84, Ulrike Peters85, Raffaele Pezzilli86, Ada Piepoli87, Miquel Porta71, 88, 89, Francisco X. Real90, 91, Elio Riboli44, Nathaniel Rothman8, Aldo Scarpa92, Xiao-Ou Shu34, 35, Debra T. Silverman8, Pavel Soucek93, Malin Sund94, Renata Talar-Wojnarowska95, Philip R. Taylor8, George E. Theodoropoulos96, Mark Thornquist6, Anne Tjønneland97, Geoffrey S. Tobias8, Dimitrios Trichopoulos4, 98, 99, Pavel Vodicka100, Jean Wactawski-Wende101, Nicolas Wentzensen8, Chen Wu4, Herbert Yu77, Kai Yu8, Anne Zeleniuch-Jacquotte11, 12, Robert Hoover8, Patricia Hartge8, Charles Fuchs1, 54, Stephen J. Chanock8, 9, Rachael S. Stolzenberg-Solomon8, Laufey T. 
Amundadottir8

1 Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
2 Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
3 Genomic Epidemiology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
4 Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA
5 Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, USA
6 Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
7 Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
8 Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
9 Cancer Genomics Research Laboratory, National Cancer Institute, Division of Cancer Epidemiology and Genetics, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, USA
10 Department of Obstetrics and Gynecology, New York University School of Medicine, New York, New York, USA
11 Department of Environmental Medicine, New York University School of Medicine, New York, New York, USA
12 New York University Cancer Institute, New York, New York, USA
13 Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California, USA
14 Division of Preventive Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
15 Division of Aging, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
16 Unit of Nutrition, Environment and Cancer, Cancer Epidemiology Research Program, Bellvitge Biomedical Research Institute (IDIBELL), Catalan Institute of Oncology (ICO), Barcelona, Spain
17 Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
18 Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Victoria, Australia
19 Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Victoria, Australia
20 Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
21 Southwest Oncology Group Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
22 Epidemiology Research Program, American Cancer Society, Atlanta, Georgia, USA
23 Group Health Research Institute, Seattle, Washington, USA
24 Department of Oncology, the Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
25 Department of Epidemiology, the Bloomberg School of Public Health, Baltimore, Maryland, USA
26 The Cancer Research Center of Hawaii (retired), Honolulu, Hawaii, USA
27 Department of Gastrointestinal Medical Oncology, University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
28 Genetic and Molecular Epidemiology Group, CNIO-Spanish National Cancer Research Centre, Madrid, Spain
29 Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York, USA
30 Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, Connecticut, USA
31 Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
32 Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
33 Department of Epidemiology, University of Washington, Seattle, Washington, USA
34 Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
35 Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee, USA
36 Department of Laboratory Medicine, University Hospital of Padova, Padua, Italy
37 Inserm, Centre for Research in Epidemiology and Population Health (CESP), U1018, Nutrition, Hormones and Women’s Health Team, F-94805, Villejuif, France
38 University Paris Sud, UMRS 1018, F-94805, Villejuif, France
39 IGR, F-94805, Villejuif, France
40 Westat, Rockville, Maryland, USA
41 Department of General Surgery, University Hospital Heidelberg, Heidelberg, Germany
42 National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
43 Department of Gastroenterology and Hepatology, University Medical Centre Utrecht, Utrecht, The Netherlands
44 Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, United Kingdom
45 Institute of Transfusion Medicine and Immunology, Heidelberg University, Medical Faculty Mannheim, German Red Cross Blood Service Baden-Württemberg-Hessen, Mannheim, Germany
46 Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
47 Digestive and Liver Disease Unit, ‘Sapienza’ University of Rome, Rome, Italy
48 Cancer Care Ontario, University of Toronto, Toronto, Ontario, Canada
49 Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
50 National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, University of Liverpool, Liverpool, United Kingdom
51 Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
52 Department of Surgery, Unit of Experimental Surgical Pathology, University Hospital of Pisa, Pisa, Italy
53 Massachusetts Veteran’s Epidemiology, Research, and Information Center, Geriatric Research Education and Clinical Center, Veterans Affairs Boston Healthcare System, Boston, Massachusetts, USA
54 Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, and Harvard Medical School, Boston, Massachusetts, USA
55 Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts, USA
56 Department of Pathology, Sidney Kimmel Cancer Center and Johns Hopkins University, Baltimore, Maryland, USA
57 Department of Medicine, Sidney Kimmel Cancer 
Center and Johns Hopkins University, Baltimore, Maryland, USA 58 Department of Oncology, Sidney Kimmel Cancer Center and Johns Hopkins University, Baltimore, Maryland, USA 59 Laboratory of Medicine and Pathology, University of Minnesota, Minneapolis, Minnesota, USA 60 Preventive Medicine, University of Southern California, Los Angeles, California, USA 61 Prevention and Research Center, Mercy Medical Center, Baltimore, Maryland, USA 62 Cancer Prevention, University of Southern California, Los Angeles, California, USA 63 Harvard School of Public Health, Boston, Massachusetts, USA 64 Harvard Medical School, Boston, Massachusetts, USA 65 The University of North Carolina Eshelman School of Pharmacy, Center for Pharmacogenomics and Individualized Therapy, Lineberger Comprehensive Cancer Center, School of Medicine, Chapel Hill, North Carolina, USA 66 International Agency for Research on Cancer, Lyon, France 67 Cancer Epidemiology Unit, University of Oxford, Oxford, United Kingdom 68 School of Clinical Medicine, University of Cambridge, United Kingdom 69 Glickman Urological and Kidney Institute, Cleveland Clinic, Cleveland, OH, USA 70 Centre de Recerca en Epidemiologia Ambiental (CREAL), CIBER Epidemiología y Salud Pública (CIBERESP), Spain 71 Hospital del Mar Institute of Medical Research (IMIM), Barcelona, Spain 72 National School of Public Health, Athens, Greece 73 Epidemiology and Prevention Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy 74 Department of Gastroenterology, Lithuanian University of Health Sciences, Kaunas, Lithuania 75 Department of Medicine, Memorial Sloan-Kettering Cancer Center, New York, New York, USA 76 Department of Biology, University of Pisa, Pisa, Italy 77 Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA 78 Oncology Department, ASL1 Massa Carrara, Massa Carrara, Italy 79 National Institute for Health and Welfare, Department of Chronic Disease Prevention, Helsinki, Finland 80 Human Genome 
Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan 81 Alliance Statistics and Data Center, Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA 82 Alliance Statistics and Data Center, Department of Biostatistics and Bioinformatics, Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina, USA 83 Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands 84 Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, United Kingdom 85 Epidemiology, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA 86 Pancreas Unit, Department of Digestive Diseases and Internal Medicine, Sant’Orsola-Malpighi Hospital, Bologna, Italy Department of Gastroenterology, Scientific Institute and Regional General Hospital “Casa Sollievo della Sofferenza”, Opera di Padre Pio da Pietrelcina, San Giovanni Rotondo, Italy 88 School of Medicine, Universitat Autònoma de Barcelona, Spain 89 CIBER de Epidemiología y Salud Pública (CIBERESP), Spain 90 Epithelial Carcinogenesis Group, CNIO-Spanish National Cancer Research Centre, Madrid, Spain 91 Departament de Ciències i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain 92 ARC-NET: Centre for Applied Research on Cancer, University and Hospital Trust of Verona, Verona, Italy 93 Toxicogenomics Unit, Center for Toxicology and Safety, National Institute of Public Health, Prague, Czech Republic 94 Department of Surgical and Peroperative Sciences, Umeå University, Umeå, Sweden 95 Department of Digestive Tract Diseases, Medical University of Łodz, Łodz, Poland 96 1st Propaideutic Surgical Department, Hippocration University Hospital, Athens, Greece 97 Institute of Cancer Epidemiology, Danish Cancer Society, Copenhagen, Denmark 98 Bureau of Epidemiologic Research, Academy of Athens, Athens, Greece 99 Hellenic Health 
Foundation, Athens, Greece 100 Department of Molecular Biology of Cancer, Institute of Experimental Medicine, Academy of Sciences of the Czech Republic, Prague, Czech Republic 101 Department of Social and Preventive Medicine, University at Buffalo, Buffalo, New York, USA 87 MGS Consortium The Molecular Genetics of Schizophrenia Consortium includes P.V. Gejman, A.R. Sanders, J. Duan (North Shore University Health System and University of Chicago), C.R. Cloninger, D.M. Svrakic (Washington University, St. Louis), N.G. Buccola (Louisiana State University Health Sciences Center, New Orleans), D.F. Levinson, J. Shi (Stanford University, Stanford, Calif.; Dr. Shi is now at the National Cancer Institute), B.J. Mowry (Queensland Centre for Mental Health Research, Brisbane, and Queensland Brain Institute, University of Queensland, Brisbane), R. Freedman, A. Olincy (University of Colorado Denver), F. Amin (Atlanta Veterans Affairs Medical Center and Emory University, Atlanta), D.W. Black (University of Iowa Carver College of Medicine, Iowa City), J.M. Silverman (Mount Sinai School of Medicine, New York), and W.F. Byerley (University of California, San Francisco). 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Battle, A. et al. Characterizing the genetic basis of transcriptome diversity through RNAsequencing of 922 individuals. Genome Research 24, 14-24 (2014). Westra, H.J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nature Genetics 45, 1238-U195 (2013). Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476-482 (2011). Marconett, C., Zhou, B., Rieger, M., Selamat, S. & Mickael Dubourd, X.F., Sean K. Lynch, Kimberly D. Siegmund, Benjamin P. Berman, Zea Borok, Ite A. Laird-Offringa. Integrated transcriptomic and epigenomic analysis reveals novel pathways regulating distal lung epithelial cell differentiation. PlosGenet (2013). Hao, K. et al. 
Lung eQTLs to Help Reveal the Molecular Underpinnings of Asthma. Plos Genetics 8(2012). Shi, J. et al. Characterizing the genetic basis of methylome diversity in histologically normal human lung tissue. Nat Commun 5, 3365 (2014). Grundberg, E. et al. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat Genet 44, 1084-9 (2012). Grundberg, E. et al. Global analysis of DNA methylation variation in adipose tissue from twins reveals links to disease-associated variants in distal regulatory elements. Am J Hum Genet 93, 876-90 (2013). Consortium, T.E.P. A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol 9(2011). Hazelett, D.J. et al. Comprehensive Functional Annotation of 77 Prostate Cancer Risk Loci. Plos Genetics 10(2014).