UP-STAT 2016 Program

UP-STAT 2016
Fifth Annual Joint Conference of the Upstate Chapters of the
American Statistical Association
April 22-23, 2016, Canisius College, Buffalo, NY
Data Science, Statistical Practice, and Education
Organizing Committee
Dr. Michael McDermott, University of Rochester (Program Chair)
Dr. Ernest Fokoué, Rochester Institute of Technology (Data Competition Chair)
Dr. Leonid Khinkis, Canisius College
Dr. Yusuf Bilgic, SUNY Geneseo
Mr. Christopher Claeys, KJT, Inc. (Webmaster)
Mr. Padraic Neville, SAS Institute
Dr. John Handley, PARC, Inc.
Dr. Tanzy Love, University of Rochester
Local Organizing Committee
Dr. Leonid Khinkis, Canisius College
Dr. Mel Crotzer
Dr. Adina Oprisan
Dr. Jeff Miecznikowski
Dr. Bruce Sun
Dr. William Brady
Dr. James Huard
Dr. Debra Burhans
All other Canisius College and ASA Buffalo members
Conference Venue
Canisius College
2001 Main St
Buffalo, NY 14208
Most events will take place in the Science Hall building. Wi-Fi is available on campus.
On Friday from 9:00 to 11:00, Ernest Fokoué’s tutorial will be in the Horan O'Donnell Science Building.
Friday's cocktail hour, poster session, and dinner will take place in the Student Center.
Conference participants will be able to park in the Science Hall parking ramp.
Conference Sponsors
PARC, A Xerox Company – Data Competition Sponsor
Canisius College
Center for Quality and Applied Statistics – Rochester Institute of Technology (RIT)
SAS
State University of New York – College at Geneseo
Revolution Analytics - Microsoft
BlueCross BlueShield of Western New York
M&T Bank
iCitizen
Institute For Autism Research - Canisius College
American Statistical Association (ASA) – Rochester Chapter, Buffalo Chapter, Syracuse Chapter
Conference Website
http://www.up-stat.org/
PROGRAM OUTLINE
Friday, April 22
8:00-9:00
Registration
Science Hall Commons
9:00-11:00
Tutorials
A Gentle Introduction to Statistical Learning Theory for Data Science
Rm HO 109
Ernest Fokoué, Rochester Institute of Technology
Human Factors in Graph Design
Rm SH 1004
Esa M. Rantanen, Rochester Institute of Technology
1:00-3:00
Tutorial
Kaggle Predictive Analytics with Random Forests and Boosted Trees
Rm SH 1013 A
Padraic Neville, SAS Institute
1:00-2:00
Tutorial
Analyzing Gravitational Wave Data from the LIGO Open Science Center
Rm SH 1013 B
John Whelan, Rochester Institute of Technology
2:00-3:00
Tutorial
Practical Natural Language Processing
Rm SH 1013 B
Emily Prud’hommeaux, Rochester Institute of Technology
3:15-4:15
Fr. Haus Memorial Mathematics Lecture
Modeling the Effect of Age in Human Performance
Rm SH 1013 AB
Richard De Veaux, C. Carlisle and Margaret Tippit Professor of Statistics
Department of Mathematics and Statistics, Williams College
4:30-5:55
Panel Discussion
Rm SH 1013 AB
The Multiple Facets of Data Science
Ernest Fokoué, Rochester Institute of Technology (Moderator)
Richard De Veaux, Williams College
Reneta Barneva, SUNY Fredonia
John Coles, CUBRC
Beth Sardina, HealthNow
H. David Sheets, Canisius College
Brian Sullivan, M&T Bank
6:00-7:00
Poster Session
Grupp, 2nd floor of Student Center
7:00-9:00
Conference Dinner
Grupp, 2nd floor of Student Center
Saturday, April 23
8:00-9:00
Breakfast/registration/orientation
Science Hall Commons
9:00-9:20
Welcome
Science Hall Commons
9:30-10:15
Parallel sessions
Science Hall Classrooms
Session 1A: Epidemiology / Experimental Design
Rm SH 1013 A
Session 1B: Advances in Clustering Methodology
Rm SH 1004
Session 1C: Undergraduate/Graduate Statistics Education
Rm SH 1028
Session 1D: Biostatistical Methods
Rm SH 1036
Session 1E: Machine Learning
Rm SH 1017
Session 1F: Geological Applications / Image Generation
Rm SH 1053
10:00-12:00
Tutorial
Tidy Data Analysis in R with dplyr, ggplot2, and broom
Rm SH 1013 B
David Robinson, Stack Overflow
10:25-11:10
Parallel sessions
Session 2A: Multiple Outcomes in Biostatistics
Rm SH 1004
Session 2B: High-Dimensional Data Analysis with Dimension Reduction
Rm SH 1028
Session 2C: Infectious Disease Modeling
Rm SH 1053
Session 2D: Analytics in Sports
Rm SH 1013 A
Session 2E: Statistics in the Evaluation of Education
Rm SH 1017
Session 2F: Data Science / Computing
Rm SH 1036
11:20-12:05
Parallel sessions
Session 3A: Undergraduate Statistics Education
Rm SH 1053
Session 3B: Health Policy Statistics
Rm SH 1013 A
Session 3C: Statistical Applications in Physics
Rm SH 1004
Session 3D: Biostatistical Modeling
Rm SH 1036
Session 3E: Statistics in Music
Rm SH 1028
Session 3F: Statistical Applications in Education
Rm SH 1017
12:15-1:15
Lunch / Poster Session
1:25-2:35
Session 4: Provost’s welcome and Keynote Lecture
Science Hall Commons
2:40-3:20
Data competition session (three 10-minute presentations)
3:30-4:15
Parallel sessions
Session 5A: Innovative Methods for Missing Data
Rm SH 1004
Session 5B: Use of Computing in Statistics Education
Rm SH 1028
Session 5C: Bioinformatics
Rm SH 1053
Session 5D: Methods to Enrich Statistics Education
Rm SH 1013 A
Session 5E: Extreme Values / Sparse Signal Recovery
Rm SH 1013 B
Session 5F: Machine Learning Applications
Rm SH 1036
4:25-5:00
Awards and wrap-up
Science Hall Commons
PROGRAM FOR SATURDAY, APRIL 23
SESSION 1A
Room 1013A
Epidemiology / Experimental Design
Session Chair: Donald Harrington
9:30-9:50
The statistics of suicides
Jennifer Bready
Division of Math and Information Technology, Mount Saint Mary College
9:55-10:15
Order-of-addition experiments
Joseph Voelkel
School of Mathematical Sciences, Rochester Institute of Technology
SESSION 1B
Room 1004
Advances in Clustering Methodology Motivated by Mixed Data Type Applications
Session Organizer/Chair: Rebecca Nugent, Department of Statistics, Carnegie Mellon University
9:30-9:45
Hitting the wall: mixture models of long distance running strategies §
Joseph Pane
Department of Statistics, Carnegie Mellon University
9:45-10:00
Prediction via clusters of CPT codes for improving surgical outcomes §
Elizabeth Lorenzi
Department of Statistical Science, Duke University
10:00-10:15
Clustering text data for record linkage
Samuel Ventura
Department of Statistics, Carnegie Mellon University
SESSION 1C
Room 1028
Undergraduate/Graduate Statistics Education
Session Chair: Ernest Fokoué
9:30-9:50
Psychological statistics and the Stockholm syndrome
Susan Mason, Sarah Battaglia, and Sarah Ribble
Department of Psychology, Niagara University
9:55-10:15
Interdisciplinary professional science master’s program on data analytics between SUNY
Buffalo State and SUNY Fredonia
Joaquin Carbonara and Valentin Brimkov
Department of Mathematics, SUNY Buffalo State
Reneta Barneva, Department of Applied Professional Studies, SUNY Fredonia
SESSION 1D
Room 1036
Biostatistical Methods
Session Chair: Matt McCall
9:30-9:50
Detecting changes in matched case count data three ways: analysis of the impact of long-acting injectable antipsychotics on medical services in a Texas Medicaid population
John C. Handley
PARC, Inc.
Douglas Brink, Janelle Sheen, Anson Williams, and Lawrence Dent, Xerox Corporation
Robert Berringer, AllyAlign Health, Inc.
9:55-10:15
Bayesian approaches to missing data with known bounds: dental amalgams and the
Seychelles Child Development Study §
Chang Liu and Sally Thurston
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
SESSION 1E
Room 1017
Machine Learning
Session Chair: Michael McDermott
9:30-9:50
An introduction to ensemble methods for machine learning §
Kenneth Tyler Wilcox
School of Mathematical Sciences, Rochester Institute of Technology
9:55-10:15
Generalization of training error bounds to test error bounds of the boosting algorithm §
Paige Houston and Ernest Fokoué
School of Mathematical Sciences, Rochester Institute of Technology
SESSION 1F
Room 1053
Geological Applications / Image Generation
Session Chair: Tanzy Love
9:30-9:50
A new approach to quantifying stratigraphic resolution: measurements of accuracy in
geological sequences
H. David Sheets
Department of Physics, Canisius College
9:55-10:15
Image generation in the era of deep architectures
Ifeoma Nwogu
Department of Computer Science and Engineering, SUNY at Buffalo
SESSION 2A
Room 1004
Strategies for Analyzing Multiple Outcomes in Biostatistics
Session Organizer/Chair: Amy LaLonde, Department of Biostatistics and Computational Biology, University of Rochester Medical Center
10:25-10:45
Clustering multiple outcomes via the Dirichlet process prior §
Amy LaLonde and Tanzy Love
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
10:50-11:10
Global tests for multiple outcomes in randomized trials §
Donald Hebert
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
SESSION 2B
Room 1028
New Progress in High-Dimensional Data Analysis with Dimension Reduction
Session Organizer/Chair: Wei Qian, School of Mathematical Sciences, Rochester Institute of Technology
10:25-10:40
Consistency and convergence rate for the nearest subspace classifier
Yi Wang
Department of Mathematics, Syracuse University
10:40-10:55
A new approach to sparse sufficient dimension reduction with applications to high-dimensional data analysis
Wei Qian
School of Mathematical Sciences, Rochester Institute of Technology
10:55-11:10
Tensor sliced inverse regression with application to neuroimaging data analysis
Shanshan Ding
Department of Applied Economics and Statistics, University of Delaware
SESSION 2C
Room 1053
Statistical Methods and Computational Tools for Modeling the Spread of Infectious
Diseases
Session Organizer/Chair: Samuel Ventura, Carnegie Mellon University
10:25-10:45
Statistical and computational tools for generating synthetic ecosystems §
Lee Richardson
Department of Statistics, Carnegie Mellon University
10:50-11:10
An overview of statistical modeling of infectious diseases §
Shannon Gallagher
Department of Statistics, Carnegie Mellon University
SESSION 2D
Room 1013A
Analytics in Sports
Session Chair: Tanzy Love
10:25-10:45
Statistics and data analytics in sports management
Reneta Barneva
Department of Applied Professional Studies, SUNY Fredonia
Valentin Brimkov
Department of Mathematics, SUNY Buffalo State
Patrick Hung and Kamen Kanev
Faculty of Business and Information Technology, University of Ontario Institute of Technology
10:50-11:10
Predictive analytics tools for identifying the keys to victory in professional tennis §
Shruti Jauhari and Aniket Morankar
Department of Computer Science, Rochester Institute of Technology
Ernest Fokoué
School of Mathematical Sciences, Rochester Institute of Technology
SESSION 2E
Room 1017
Statistics in the Evaluation of Education
Session Chair: Ernest Fokoué
10:25-10:45
An analysis of covariance challenge to the Educational Testing Service
Richard Escobales
Department of Mathematics and Statistics, Canisius College
Ronald Rothenberg
Department of Mathematics, CUNY, Queens College
10:50-11:10
Model, model, my dear model! Tell me who the most effective is
Yusuf K. Bilgic
Department of Mathematics, SUNY Geneseo
SESSION 2F
Room 1036
Data Science / Computing
Session Chair: Padraic Neville
10:25-10:45
An example of four data science curveballs §
Tamal Biswas and Kenneth Regan
Department of Computer Science and Engineering, SUNY at Buffalo
10:50-11:10
Initializing R objects using data frames: tips, tricks, and some helpful code to get you
started
Donald Harrington
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
SESSION 3A
Room 1053
Undergraduate Statistics Education
Session Organizer/Chair: Rebecca Nugent, Department of Statistics, Carnegie Mellon University
11:20-11:45
Undergraduate statistics: How do students get there? What happens when they leave?
(and everything in between . . .)
Rebecca Nugent and Paige Houser
Department of Statistics, Carnegie Mellon University
11:45-12:05
Panel discussion
SESSION 3B
Room 1013A
Health Policy Statistics
Session Organizer/Chair: Xueya Cai, Department of Biostatistics and Computational Biology, University of Rochester Medical Center
11:20-11:35
Application of two stage residual inclusion (2SRI) model in testing the impact of length of
stay on short-term readmission rate
Xueya Cai
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
11:35-11:50
Health Services and Outcomes Research Methodology (HSORM): an international peer-reviewed journal for statistics in health policy
Yue Li
Department of Public Health Sciences, University of Rochester Medical Center
11:50-12:05
Jackknife empirical likelihood inference for inequality and poverty measures
Dongliang Wang
Department of Public Health and Preventive Medicine, Upstate Medical University
SESSION 3C
Room 1004
Statistical Applications in Physics
Session Chair: Tanzy Love
11:20-11:40
Investigation of statistical and non-statistical fluctuations observed in relativistic heavy-ion
collisions at AGS and CERN energies
Gurmukh Singh
Department of Computer and Information Sciences, SUNY Fredonia
Provash Mali and Amitabha Mukhopadhyay
Department of Physics, North Bengal University, India
11:45-12:05
Analysis of big data at the Jefferson Lab
Michael Wood
Department of Physics, Canisius College
SESSION 3D
Room 1036
Biostatistical Modeling
Session Chair: John C. Handley
11:20-11:40
Techniques for estimating time-varying parameters in a cardiovascular disease dynamics
model §
Jacob Goldberg
Department of Mathematics, SUNY Geneseo
11:45-12:05
Prediction of event times with a parametric modeling approach in clinical trials §
Chongshu Chen
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
SESSION 3E
Room 1028
Statistics in Music
Session Chair: Michael McDermott
11:20-11:40
Can statistics make me a millionaire music producer? §
Jessica Young
School of Mathematical Sciences, Rochester Institute of Technology
11:45-12:05
Automatic singer identification of popular music via singing voice separation §
Shiteng Yang
School of Mathematical Sciences, Rochester Institute of Technology
SESSION 3F
Room 1017
Statistical Applications in Education
Session Chair: Ernest Fokoué
11:20-11:40
Monte Carlo-based enrollment projections
Matthew Hertz
Department of Computer Science, Canisius College
11:45-12:05
Program for International Student Assessment analysis in R §
Kierra Shay
Department of Mathematics, SUNY Geneseo
SESSION 4
Science Hall Commons
Keynote Lecture
Session Chair: Ernest Fokoué, Rochester Institute of Technology
1:25-2:35
From the classroom to the boardroom – how do we get there?
Richard De Veaux
Department of Mathematics and Statistics, Williams College
SESSION 5A
Room 1004
Innovative Techniques to Address Missing Data in Biostatistics
Session Organizer/Chair: Valeriia Sherina, Department of Biostatistics and Computational Biology, University of Rochester Medical Center
3:30-3:45
Challenges in estimating model parameters in qPCR §
Valeriia Sherina, Matthew McCall, and Tanzy Love
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
3:45-4:00
Semiparametric inference concerning the geometric mean with detection limit §
Bokai Wang, Changyong Feng, Hongyue Wang, and Xin Tu
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
4:00-4:15
Two data analysis methods concerning missing data §
Lin Ge
Department of Biostatistics and Computational Biology, University of Rochester Medical Center
SESSION 5B
Room 1028
Use of Computing in Statistics Education
Session Chair: Ernest Fokoué
3:30-3:50
Teaching programming skills to finance students: how to design and teach a great course
Yuxing Yan
Department of Economics and Finance, Canisius College
3:55-4:15
The biomarker challenge R package: a two stage array/validation in-class exercise §
Luther Vucic, Dietrich Kuhlmann, and Daniel Gaile
Department of Biostatistics, SUNY at Buffalo
Jeremiah Grabowski
School of Public Health and Health Professions, SUNY at Buffalo
SESSION 5C
Room 1053
Bioinformatics
Session Chair: Matt McCall
3:30-3:50
A short survey of computational structural proteomics research using the Protein Data
Bank (PDB) as the main data set
Vicente M. Reyes
Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology
3:55-4:15
A novel and quick method to power pilot studies for the comparison of assay platforms
under controlled specificity §
Zhuolin He, Ziqiang Chen, and Daniel Gaile
Department of Biostatistics, SUNY at Buffalo
SESSION 5D
Room 1013A
Methods to Enrich Statistics Education
Session Chair: Tanzy Love
3:30-3:50
Enriching statistics classes in K-12 schools by adding practical ideas from the history of
mathematics and statistics §
Mucahit Polat and Celal Aydar
Graduate School of Education, SUNY at Buffalo
3:55-4:15
Multiple linear regression with sports data §
Matthew D’Amico, Jake Ryder, and Yusuf Bilgic
Department of Mathematics, SUNY Geneseo
SESSION 5E
Room 1013B
Extreme Values / Sparse Signal Recovery
Session Chair: John C. Handley
3:30-3:50
EXTREME statistics: data analysis for disasters §
Nicholas LaVigne
Department of Mathematics, SUNY Geneseo
3:55-4:15
Minimax optimal sparse signal recovery with Poisson statistics
Mohammad Rohban
Broad Institute of Harvard and MIT
SESSION 5F
Room 1036
Machine Learning Applications
Session Organizer/Chair: Gabriela Olinto, School of Mathematical Sciences, Rochester Institute of Technology, and Soleo Communications
3:30-3:50
Evolutionary weighting scheme for random subspace learning §
André Lobato Ramos
School of Mathematical Sciences, Rochester Institute of Technology
3:55-4:15
Natural language processing for automatic keyword extraction and topic summarization §
Gabriela Olinto
School of Mathematical Sciences, Rochester Institute of Technology, and Soleo
Communications
William Consagra
Soleo Communications
§ Indicates presentations that are eligible for the student presentation awards