Statistics in Biosciences: Statistical Methods for Big Data from Health Science
Organizer and Chair: Grace Yi (University of Waterloo)

XIHONG LIN, Harvard University
The Generalized Higher Criticism for Testing SNP-set Effects in Genetic Association Studies

We propose the Generalized Higher Criticism (GHC) to test for the association between a SNP set, e.g., a gene or a network, and a disease outcome in the presence of sparse alternatives. The proposed GHC overcomes the limitations of the Higher Criticism (HC) by allowing for arbitrary correlation structures among the SNPs in a SNP set, while performing accurate analytic p-value calculations for any finite number of SNPs in the SNP set. We obtain the detection boundary of the GHC test. Using simulations, we empirically compare the power of the GHC method with that of existing SNP-set tests over a range of genetic regions with varied correlation structures and signal sparsity. We apply the proposed methods to analyze the CGEM breast cancer genome-wide association study.

HONGZHE LI, University of Pennsylvania
Sparse Simultaneous Signal Detection and Its Applications in Genomics

The increasing availability of large-scale genomic data has made possible an integrative approach to studying disease. Such research seeks to uncover disease mechanisms by combining multiple types of genomic information, which may be collected on multiple sets of patients. I focus on a study that integrates GWAS and eQTL data collected from two different sets of subjects to find transcripts potentially functionally relevant to human heart failure. I formalize a model that defines important transcripts as those whose expression levels are associated with SNPs that are simultaneously associated with disease, and I propose a new test for detecting simultaneous signals. I show that the test statistic is asymptotically optimal under certain conditions. I present several applications and extensions.

CHARMAINE DEAN & MARK WOLTERS, Western University & Fudan University
Parameter Estimation in Autologistic Regression Models for Detection of Smoke in Satellite Images

Smoke from forest fires is a health hazard that is difficult to study through direct measurement. Images from earth-orbiting satellites provide a potentially valuable data source to catalogue smoke events over space and time. We are developing classifiers to segment satellite images into smoke and nonsmoke regions using the autologistic regression model, a Markov random field model with logistic regression as a special case. The large size of the images (both in terms of pixel count and number of image planes) introduces a variety of computational challenges when using this model. The talk will focus on parameter estimation, comparing alternative approaches and discussing how the goal of the study (predictive accuracy or parameter interpretation) might influence the choice of estimation method.
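
As an illustration of the statement that logistic regression is a special case of the autologistic model, the following is a minimal sketch under stated assumptions, not the speakers' implementation. It assumes 0/1 pixel labels, a 4-connected neighbourhood with zero padding at the image edges, and a single pairwise parameter `lam`; the function names and the parameterisation are hypothetical, and other codings of the model exist. Each pixel's conditional smoke probability depends on its image-plane features and on the labels of its neighbours, and setting `lam = 0` recovers ordinary per-pixel logistic regression.

```python
import numpy as np

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def neighbor_sum(labels):
    """Sum of the 4-connected neighbours' labels for every pixel.

    Pixels outside the image are treated as 0 (an assumption of this sketch).
    """
    padded = np.pad(labels, 1, mode="constant")
    return (padded[:-2, 1:-1] + padded[2:, 1:-1]      # above + below
            + padded[1:-1, :-2] + padded[1:-1, 2:])   # left + right

def conditional_smoke_prob(features, labels, beta, lam):
    """Conditional probability that each pixel is smoke, given its neighbours.

    features: (H, W, P) array of image planes
    labels:   (H, W) array of current 0/1 labels
    beta:     (P,) regression coefficients (hypothetical values)
    lam:      pairwise association parameter; lam = 0 gives logistic regression
    """
    linear = features @ beta                  # per-pixel linear predictor, shape (H, W)
    return sigmoid(linear + lam * neighbor_sum(labels))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 5, 3))     # toy 4x5 image with 3 planes
    labels = rng.integers(0, 2, size=(4, 5))
    beta = np.array([0.5, -1.0, 0.2])
    print(conditional_smoke_prob(features, labels, beta, lam=0.8).round(3))
```

Conditional probabilities of this form are the building block of pseudolikelihood-style estimators, which are one commonly used family of approaches for models of this type; the abstract does not state which estimators the talk compares.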