UP-STAT 2016 PROGRAM OUTLINE

Friday, April 22

8:00-9:00 Registration

9:00-11:00 Tutorials
    A Gentle Introduction to Statistical Learning Theory for Data Science
    Ernest Fokoué, Rochester Institute of Technology

    An Excursion into Modern Big Data Concepts and Computational Models for Topic Discovery
    Xingchen Yu, University of California Santa Cruz

1:00-3:00 Tutorial
    Kaggle Predictive Analytics with Random Forests and Boosted Trees
    Padraic Neville, SAS Institute

1:00-2:00 Tutorial
    Analyzing Gravitational Wave Data from the LIGO Open Science Center
    John Whelan, Rochester Institute of Technology

2:00-3:00 Tutorial
    Practical Natural Language Processing
    Emily Prud’hommeaux, Rochester Institute of Technology

3:15-4:15 Fr. Haus Memorial Mathematics Lecture
    Modeling the Effect of Age in Human Performance
    Richard De Veaux, C. Carlisle and Margaret Tippit Professor of Statistics
    Department of Mathematics and Statistics, Williams College

4:30-5:55 Panel Discussion
    The Multiple Facets of Data Science
    Ernest Fokoué, Rochester Institute of Technology (Moderator)
    Richard De Veaux, Williams College
    Reneta Barneva, SUNY Fredonia
    John Coles, CUBRC
    Beth Sardina, HealthNow
    H. David Sheets, Canisius College
    Brian Sullivan, M&T Bank

6:00-7:00 Poster Session

7:00-9:00 Dinner

Saturday, April 23

8:00-9:00 Breakfast/registration/orientation

9:00-9:20 Welcome

9:30-10:15 Parallel sessions
    Session 1A: Epidemiology / Experimental Design
    Session 1B: Advances in Clustering Methodology
    Session 1C: Undergraduate/Graduate Statistics Education
    Session 1D: Biostatistical Methods
    Session 1E: Machine Learning
    Session 1F: Geological Applications / Image Generation

10:00-12:00 Tutorial
    Tidy Data Analysis in R with dplyr, ggplot2, and broom
    David Robinson, Stack Overflow

10:25-11:10 Parallel sessions
    Session 2A: Multiple Outcomes in Biostatistics
    Session 2B: High-Dimensional Data Analysis with Dimension Reduction
    Session 2C: Infectious Disease Modeling
    Session 2D: Analytics in Sports
    Session 2E: Statistics in the Evaluation of Education
    Session 2F: Data Science / Computing

11:20-12:05 Parallel sessions
    Session 3A: Undergraduate Statistics Education
    Session 3B: Health Policy Statistics
    Session 3C: Statistical Applications in Physics
    Session 3D: Biostatistical Modeling
    Session 3E: Statistics in Music
    Session 3F: Statistical Applications in Education

12:15-1:15 Lunch / Poster Session

1:25-2:35 Session 4: Provost’s welcome and Keynote Lecture

2:40-3:20 Data competition session (three 10-minute presentations)

3:30-4:15 Parallel sessions
    Session 5A: Innovative Methods for Missing Data
    Session 5B: Use of Computing in Statistics Education
    Session 5C: Bioinformatics
    Session 5D: Methods to Enrich Statistics Education
    Session 5E: Extreme Values / Sparse Signal Recovery

4:25-5:00 Awards and wrap-up

UP-STAT 2016 PROGRAM FOR SATURDAY, APRIL 23

SESSION 1A Room XXX
Epidemiology / Experimental Design
Session Chair: ???

9:30-9:50
The statistics of suicides
Jennifer Bready
Division of Math and Information Technology, Mount Saint Mary College

9:55-10:15
Order-of-addition experiments
Joseph Voelkel
School of Mathematical Sciences, Rochester Institute of Technology

SESSION 1B Room XXX
Advances in Clustering Methodology Motivated by Mixed Data Type Applications
Session Organizer/Chair: Rebecca Nugent, Department of Statistics, Carnegie Mellon University

9:30-9:45
Hitting the wall: mixture models of long distance running strategies §
Joseph Pane
Department of Statistics, Carnegie Mellon University

9:45-10:00
Prediction via clusters of CPT codes for improving surgical outcomes §
Elizabeth Lorenzi
Department of Statistical Science, Duke University

10:00-10:15
Clustering text data for record linkage
Samuel Ventura
Department of Statistics, Carnegie Mellon University

SESSION 1C Room XXX
Undergraduate/Graduate Statistics Education
Session Chair: ???

9:30-9:50
Psychological statistics and the Stockholm syndrome
Susan Mason, Sarah Battaglia, and Sarah Ribble
Department of Psychology, Niagara University

9:55-10:15
Interdisciplinary professional science master’s program on data analytics between SUNY Buffalo State and SUNY Fredonia
Joaquin Carbonara and Valentin Brimkov
Department of Mathematics, SUNY Buffalo State
Reneta Barneva
Department of Applied Professional Studies, SUNY Fredonia

SESSION 1D Room XXX
Biostatistical Methods
Session Chair: ???

9:30-9:50
Detecting changes in matched case count data three ways: analysis of the impact of long-acting injectable antipsychotics on medical services in a Texas Medicaid population
John C. Handley
PARC, Inc.
Douglas Brink, Janelle Sheen, Anson Williams, and Lawrence Dent
Xerox Corporation
Robert Berringer
AllyAlign Health, Inc.

9:55-10:15
Bayesian approaches to missing data with known bounds: dental amalgams and the Seychelles Child Development Study §
Chang Liu and Sally Thurston
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

SESSION 1E Room XXX
Machine Learning
Session Chair: ???

9:30-9:50
An introduction to ensemble methods for machine learning §
Kenneth Tyler Wilcox
School of Mathematical Sciences, Rochester Institute of Technology

9:55-10:15
Generalization of training error bounds to test error bounds of the boosting algorithm §
Paige Houston and Ernest Fokoué
School of Mathematical Sciences, Rochester Institute of Technology

SESSION 1F Room XXX
Geological Applications / Image Generation
Session Chair: ???

9:30-9:50
A new approach to quantifying stratigraphic resolution: measurements of accuracy in geological sequences
H. David Sheets
Department of Physics, Canisius College

9:55-10:15
Image generation in the era of deep architectures
Ifeoma Nwogu
Department of Computer Science and Engineering, SUNY at Buffalo

SESSION 2A Room XXX
Strategies for Analyzing Multiple Outcomes in Biostatistics
Session Organizer/Chair: Amy LaLonde, Department of Biostatistics and Computational Biology, University of Rochester Medical Center

10:25-10:45
Clustering multiple outcomes via the Dirichlet process prior §
Amy LaLonde and Tanzy Love
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

10:50-11:10
Global tests for multiple outcomes in randomized trials §
Donald Hebert
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

SESSION 2B Room XXX
New Progress in High-Dimensional Data Analysis with Dimension Reduction
Session Organizer/Chair: Wei Qian, School of Mathematical Sciences, Rochester Institute of Technology

10:25-10:40
Consistency and convergence rate for the nearest subspace classifier
Yi Wang
Department of Mathematics, Syracuse University

10:40-10:55
A new approach to sparse sufficient dimension reduction with applications to high-dimensional data analysis
Wei Qian
School of Mathematical Sciences, Rochester Institute of Technology

10:55-11:10
Tensor sliced inverse regression with application to neuroimaging data analysis
Shanshan Ding
Department of Applied Economics and Statistics, University of Delaware

SESSION 2C Room XXX
Statistical Methods and Computational Tools for Modeling the Spread of Infectious Diseases
Session Organizer/Chair: Samuel Ventura, Carnegie Mellon University

10:25-10:45
Statistical and computational tools for generating synthetic ecosystems §
Lee Richardson
Department of Statistics, Carnegie Mellon University

10:50-11:10
An overview of statistical modeling of infectious diseases §
Shannon Gallagher
Department of Statistics, Carnegie Mellon University

SESSION 2D Room XXX
Analytics in Sports
Session Chair: ???

10:25-10:45
Statistics and data analytics in sports management
Reneta Barneva
Department of Applied Professional Studies, SUNY Fredonia
Valentin Brimkov
Department of Mathematics, SUNY Buffalo State
Patrick Hung and Kamen Kanev
Faculty of Business and Information Technology, University of Ontario Institute of Technology

10:50-11:10
Predictive analytics tools for identifying the keys to victory in professional tennis §
Shruti Jauhari and Aniket Morankar
Department of Computer Science, Rochester Institute of Technology
Ernest Fokoué
School of Mathematical Sciences, Rochester Institute of Technology

SESSION 2E Room XXX
Statistics in the Evaluation of Education
Session Chair: ???

10:25-10:45
An analysis of covariance challenge to the Educational Testing Service
Richard Escobales
Department of Mathematics and Statistics, Canisius College
Ronald Rothenberg
Department of Mathematics, CUNY, Queens College

10:50-11:10
Model, model, my dear model! Tell me who the most effective is
Yusuf K. Bilgic
Department of Mathematics, SUNY Geneseo

SESSION 2F Room XXX
Data Science / Computing
Session Chair: ???

10:25-10:45
An example of four data science curveballs §
Tamal Biswas and Kenneth Regan
Department of Computer Science and Engineering, SUNY at Buffalo

10:50-11:10
Initializing R objects using data frames: tips, tricks, and some helpful code to get you started
Donald Harrington
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

SESSION 3A Room XXX
Undergraduate Statistics Education
Session Organizer/Chair: Rebecca Nugent, Department of Statistics, Carnegie Mellon University

11:20-11:45
Undergraduate statistics: How do students get there? What happens when they leave? (and everything in between . . .)
Rebecca Nugent and Paige Houser
Department of Statistics, Carnegie Mellon University

11:45-12:05
Panel discussion

SESSION 3B Room XXX
Health Policy Statistics
Session Organizer/Chair: Xueya Cai, Department of Biostatistics and Computational Biology, University of Rochester Medical Center

11:20-11:35
Application of two stage residual inclusion (2SRI) model in testing the impact of length of stay on short-term readmission rate
Xueya Cai
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

11:35-11:50
Health Services and Outcomes Research Methodology (HSORM): an international peer-reviewed journal for statistics in health policy
Yue Li
Department of Public Health Sciences, University of Rochester Medical Center

11:50-12:05
Jackknife empirical likelihood inference for inequality and poverty measures
Dongliang Wang
Department of Public Health and Preventive Medicine, Upstate Medical University

SESSION 3C Room XXX
Statistical Applications in Physics
Session Chair: ???

11:20-11:40
Investigation of statistical and non-statistical fluctuations observed in relativistic heavy-ion collisions at AGS and CERN energies
Gurmukh Singh
Department of Computer and Information Sciences, SUNY Fredonia
Provash Mali and Amitabha Mukhopadhyay
Department of Physics, North Bengal University, India

11:45-12:05
Analysis of big data at the Jefferson Lab
Michael Wood
Department of Physics, Canisius College

SESSION 3D Room XXX
Biostatistical Modeling
Session Chair: ???

11:20-11:40
Techniques for estimating time-varying parameters in a cardiovascular disease dynamics model §
Jacob Goldberg
Department of Mathematics, SUNY Geneseo

11:45-12:05
Prediction of event times with a parametric modeling approach in clinical trials §
Chongshu Chen
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

SESSION 3E Room XXX
Statistics in Music
Session Chair: ???

11:20-11:40
Can statistics make me a millionaire music producer? §
Jessica Young
School of Mathematical Sciences, Rochester Institute of Technology

11:45-12:05
Automatic singer identification of popular music via singing voice separation §
Shiteng Yang
School of Mathematical Sciences, Rochester Institute of Technology

SESSION 3F Room XXX
Statistical Applications in Education
Session Chair: ???

11:20-11:40
Monte Carlo-based enrollment projections
Matthew Hertz
Department of Computer Science, Canisius College

11:45-12:05
Program for International Student Assessment analysis in R §
Kierra Shay
Department of Mathematics, SUNY Geneseo

SESSION 4 Room XXX
Keynote Lecture
Session Chair: Ernest Fokoué, Rochester Institute of Technology

1:25-2:35
From the classroom to the boardroom – how do we get there?
Richard De Veaux
Department of Mathematics and Statistics, Williams College

SESSION 5A Room XXX
Innovative Techniques to Address Missing Data in Biostatistics
Session Organizer/Chair: Valeriia Sherina, Department of Biostatistics and Computational Biology, University of Rochester Medical Center

3:30-3:45
Challenges in estimating model parameters in qPCR §
Valeriia Sherina, Matthew McCall, and Tanzy Love
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

3:45-4:00
Semiparametric inference concerning the geometric mean with detection limit §
Bokai Wang, Changyong Feng, Hongyue Wang, and Xin Tu
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

4:00-4:15
Two data analysis methods concerning missing data §
Lin Ge
Department of Biostatistics and Computational Biology, University of Rochester Medical Center

SESSION 5B Room XXX
Use of Computing in Statistics Education
Session Chair: ???

3:30-3:50
Teaching programming skills to finance students: how to design and teach a great course
Yuxing Yan
Department of Economics and Finance, Canisius College

3:55-4:15
The biomarker challenge R package: a two stage array/validation in-class exercise §
Luther Vucic, Dietrich Kuhlmann, and Daniel Gaile
Department of Biostatistics, SUNY at Buffalo
Jeremiah Grabowski
School of Public Health and Health Professions, SUNY at Buffalo

SESSION 5C Room XXX
Bioinformatics
Session Chair: ???

3:30-3:50
A short survey of computational structural proteomics research using the Protein Data Bank (PDB) as the main data set
Vicente M. Reyes
Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology

3:55-4:15
A novel and quick method to power pilot studies for the comparison of assay platforms under controlled specificity §
Zhuolin He, Ziqiang Chen, and Daniel Gaile
Department of Biostatistics, SUNY at Buffalo

SESSION 5D Room XXX
Methods to Enrich Statistics Education
Session Chair: ???

3:30-3:50
Enriching statistics classes in K-12 schools by adding practical ideas from the history of mathematics and statistics §
Mucahit Polat and Celal Aydar
Graduate School of Education, SUNY at Buffalo

3:55-4:15
Multiple linear regression with sports data §
Matthew D’Amico, Jake Ryder, and Yusuf Bilgic
Department of Mathematics, SUNY Geneseo

SESSION 5E Room XXX
Extreme Values / Sparse Signal Recovery
Session Chair: ???

3:30-3:50
EXTREME statistics: data analysis for disasters §
Nicholas LaVigne
Department of Mathematics, SUNY Geneseo

3:55-4:15
Minimax optimal sparse signal recovery with Poisson statistics
Mohamed Rohban
Broad Institute of Harvard and MIT

§ Indicates presentations that are eligible for the student presentation awards