Project 1 - Department of Statistics


PROJECT LIST FOR SUMMER SCHOLARSHIPS IN STATISTICS 2014/2015

Project 1: Face to face counselling overview and outcome evaluation for Lifeline Aotearoa

Background

Lifeline New Zealand is a leading provider of dedicated community helpline services, face to face counselling and suicide prevention education. The face to face counselling service has been running for over 40 years. There are around 20 well qualified volunteer counsellors at Lifeline, many of whom are also in private practice. Each month, an average of 80-90 sessions are provided to a broad cross section of the community, many of whom cannot afford counselling through other channels. Data have been collected on clients' demographics, number of sessions, reasons for counselling, expectations and level of risk presented. Since August 2013, counsellors have also been collecting clinical outcome assessments (the CDOI assessment tool, including the outcome rating scale and session rating scale).

However, the data have been kept in several different places (on paper, in Access databases and in other electronic systems), all in slightly different formats. To date, none of the data have been evaluated or analysed in a systematic way.

Aim

The project aims to evaluate the face to face counselling service, including client profile, presenting problems, level of risk and clinical outcomes, and to run a client satisfaction survey.

Method

The project will consist of 4 parts.

Part 1: Literature review on the effectiveness and outcome evaluation of face to face community counselling services

Part 2: Data analysis on client profile, service overview and presenting issues including comparisons between groups

Part 3: Data analysis of clinical outcome assessments (repeated measures at each session; around 30 clients with 2-15 assessments each)

Part 4: Online client survey

Requirements

The ideal student will have good knowledge of and skills in Excel and SAS/SPSS, some experience in literature reviews and/or running surveys, and a strong interest in working on a real-life project in a community health service.

Key skills/experiences that the student will gain:

Working at a real life community health service

A deep understanding of research project procedure and process

Survey design and analysis

Data management and manipulation skills using Excel and SAS

Repeated measures analysis

Contact:

Christine Yang Dong, Research and Clinical Engagement Manager, Lifeline / Honorary Research Fellow at the Psychological Medicine Department, University of Auckland (christineD@lifeline.org.nz)

Co-supervisor: David Scott (d.scott@auckland.ac.nz)

Project 2: Maps and graphics for animal populations

This project will investigate creative charts and maps for data from large-scale animal trapping programmes in New Zealand. Ideally, the student will contribute directly to programming in our new CatchIT software, helping to produce simple, attractive graphics and statistical analyses that will appeal to mums-and-dads conservation volunteers throughout the country.

Programming is a key component of this project, which is suitable for students who enjoy writing code and are highly competent in R and/or some other computer language. An interest in ecological applications is also useful.

Contact:

Rachel Fewster (r.fewster@auckland.ac.nz)

Project 3: Statistics Education Research

This summer research project involves exploring how the bootstrap and randomisation methods, including the dynamic visualisations developed for these methods, assist introductory statistics students in gaining access to the underlying concepts of statistical inference. The student would be required to classify assessment questions and undertake exploratory analyses.

Contact:

Dr Marie Fitch (m.fitch@auckland.ac.nz) or

Dr Stephanie Budgett (s.budgett@auckland.ac.nz)

Project 4: Construction of life-course variables for the New Zealand Longitudinal Census (NZLC)

The New Zealand Longitudinal Census has linked individuals across the 1981-2006 New Zealand Censuses. This enables the assessment of life-course associations with various outcomes; e.g., we are investigating the influence of socio-economic circumstances throughout life on mortality risk. Making best use of this dataset requires life-course variables to be derived by combining data across the six censuses. This project involves using SAS code to construct variables from the Census data. Access to the data is through remote access to the Statistics New Zealand data lab, which is available at the COMPASS offices at the University of Auckland. Students will need to be familiar with SAS, or be willing to learn it.

Contact:

Roy Lay-Yee (r.layyee@auckland.ac.nz)

Alan Lee (aj.lee@auckland.ac.nz)

Project 5: Developing bias weights for the New Zealand Longitudinal Census (NZLC)

The New Zealand Longitudinal Census (NZLC) has linked individuals across the 1981-2006 New Zealand Censuses. This enables the assessment of life-course associations with various outcomes; e.g., we are investigating the influence of socio-economic circumstances throughout life on mortality risk. However, as linkage across censuses is incomplete (i.e. some individuals can be linked while others cannot), there is the potential for bias if associations among those linked differ from associations in the full population. Early work has suggested that there is evidence of association biases, and early attempts at constructing weights to account for bias reduced but did not eliminate these biases. This project will extend this work to different cohorts within the NZLC, and possibly try some different approaches to the construction of weights. Access to the NZLC data is through remote access to the Statistics New Zealand data lab, which is available at the COMPASS offices at the University of Auckland. Students will need to be familiar with SAS, or be willing to learn it.
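To make the idea of bias weights concrete, here is a minimal sketch of one common approach, inverse-probability-of-linkage weighting. It is an illustration only, not the method used in the earlier NZLC work: the two strata, their sizes, linkage counts and outcome rates are all invented, and the project itself would be carried out in SAS rather than Python.

```python
# Hypothetical strata: (population size, number linked, outcome rate among linked).
# All numbers are made up for illustration.
strata = {
    "A": (600, 540, 0.10),  # 90% of this stratum is linked
    "B": (400, 200, 0.30),  # only 50% of this stratum is linked
}

def unweighted_rate(strata):
    """Outcome rate among linked records, ignoring differential linkage."""
    linked_total = sum(linked for _, linked, _ in strata.values())
    events = sum(linked * rate for _, linked, rate in strata.values())
    return events / linked_total

def weighted_rate(strata):
    """Outcome rate with each linked record weighted by 1 / linkage rate."""
    num = den = 0.0
    for population, linked, rate in strata.values():
        weight = population / linked  # inverse of the stratum's linkage rate
        num += weight * linked * rate
        den += weight * linked
    return num / den

naive = unweighted_rate(strata)    # over-represents the well-linked stratum A
adjusted = weighted_rate(strata)   # re-balances strata to their population sizes
```

In this toy setting the weighted estimate matches the population rate only because the outcome rate among linked people in each stratum is assumed to equal the rate in the whole stratum. When that assumption fails, weights of this kind reduce bias without eliminating it.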

Contact:

Barry Milne (b.milne@auckland.ac.nz)

Alan Lee (aj.lee@auckland.ac.nz)

Project 6: A Topic in Statistical Computing: Constrained Additive Ordination

The VGAM package for R implements several broad classes of general statistical regression models. Some of the classes are estimated using computationally expensive algorithms, so compiled code is needed for speed gains. This project is to implement the constrained additive ordination (CAO) class using Rcpp, which is an R package that provides R functions, as well as a C++ library, to facilitate the integration of R and C++. There are several secondary topics that need looking at too, e.g., the switch from LINPACK to LAPACK, and the backchat between optim() and C functions.

The project requires a student who has strong programming skills in R and C (C++ may be necessary), is meticulous, and is ideally familiar with Linux.

Contact:

Thomas Yee (t.yee@auckland.ac.nz)

Project 7: The openapi Project

The global movement towards Open Government Data has led to a wealth of data being made available to the general public, with huge potential benefits for social, economic, and political empowerment.

However, few individuals possess the knowledge and skills to make use of these data by themselves.

The openapi project is developing a flow-based framework that is primarily aimed at lowering the barriers to use of Open Data by the general public.

This student project will involve testing the software tools developed for the openapi project by using the tools to build visualisations of public data sets, providing feedback on the tools, and ideally helping to improve the tools.

The project will involve writing small scripts (e.g., in R, Python, or Perl) to tidy, process, and plot data, and writing small XML documents to integrate the scripts with the openapi framework.

Good grades in STATS 220 and STATS 380 are a must. A background in Computer Science would be a bonus.

Contact:

Paul Murrell (p.murrell@auckland.ac.nz)

Project 8: Player Lifetimes

I have data detailing the games played and times of first and last games played of all All Blacks players.

A question of interest is how long players in different positions can keep playing, given the number of games played and the number of years they have played. The intention is to make predictions as to when new players might be required for different positions.

The project will involve data analysis using R. The student undertaking the project should have a good knowledge of R and have taken STATS 330, STATS 210, and preferably STATS 310 also.

Contact:

David Scott (d.scott@auckland.ac.nz)

Project 9: Working with data from conservation monitoring schemes

The CatchIT project aims to help community conservation schemes to manage their data needs, from storing the data to producing outputs like graphics and reports. This involves a lot of hands-on work with data, which will be the focus of this project. Examples of activities will be: helping users to format data sheets; developing routines that can check for consistency and errors; creating pilot charts and maps so that users can check the GPS locations of their tracks and devices; designing simple but effective reports such as catalogues of recent volunteer activity and catch rates; and developing statistical surveillance tests to detect unusual results that can be highlighted to keep volunteers interested in the project - for example, traps that are 'running hot' (making more catches than expected), or surprising results for a particular species or location. The project is likely to involve corresponding directly with users, who are typically from the general public and will not always be very computer-literate or scientifically-minded, but they will be intelligent and engaged with their conservation work. This will require, and develop, highly transferable skills of communicating with laypeople about technical matters.

This project would suit someone with an interest in conservation and communicating statistics to the general public; an enjoyment of helping people with technical issues even if they are sometimes quite easy or routine; and a careful eye for detail and commitment to correctness. Most of the work will be done in R. Advanced expertise in R is not required, but the project can accommodate more or less programming according to the student's abilities and interests. The primary requirement is an interest in the project's aims and values, and in helping to bring some neat statistical analyses to a public audience while protecting them from the nitty-gritty details.
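As a rough illustration of the "running hot" surveillance idea described above, the sketch below flags a trap whose catch count is surprisingly high relative to an expected count. It assumes catches follow a Poisson distribution, which is a modelling assumption made here for illustration and not part of the CatchIT software; the function name and the example numbers are hypothetical, and the project itself would mostly be done in R.

```python
import math

def poisson_upper_tail(observed, expected):
    """P(X >= observed) when X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below

# Hypothetical traps, each with an expected count of 3 catches over the period.
p_hot = poisson_upper_tail(observed=9, expected=3.0)      # small tail probability
p_typical = poisson_upper_tail(observed=3, expected=3.0)  # unremarkable

# A trap could be highlighted to volunteers when its tail probability is small.
is_running_hot = p_hot < 0.01
```

With many traps checked at once, some small tail probabilities will arise by chance, so a real surveillance scheme would need a threshold that accounts for the number of traps being monitored.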

Contact:

Rachel Fewster (r.fewster@auckland.ac.nz)

Project 10: Testing data-model fitness in phylogenetics

A persistently elusive question in phylogenetic inference is how to test data-to-model fitness. While omnibus tests have been available for over 20 years, their applicability has been restricted by issues of power and computability of the test statistics. In recent work we have argued for assessing the fitness of each site (or observation) separately by fitting simultaneous confidence regions.

However, the more appropriate choice would be to do informed multiple testing on the sites and identify those sites for which the test finds evidence against model fitness. The student's tasks are to investigate a number of potential tests and to assess their strengths and weaknesses through simulation approaches.

The ideal candidate has some background knowledge in phylogenetics, and/or has worked with categorical data before. The student should have some knowledge of statistical computation and simulation strategies to assess the power and significance of a test.

Contact:

Steffen Klaere (s.klaere@auckland.ac.nz)

Project 11: Statistics Education Research - Literature Review

This summer research project would involve an extensive literature search for a chapter in an International Handbook of Research in Statistics Education. The chapter is on re-imagining curriculum approaches for the next two decades in statistics and probability. Working closely with the supervisor, the student would identify and summarise key research papers, identify key areas of research that illustrate the potential and power of the ideas behind new curriculum approaches, and extract key principles and implications.

The student should have the ability to read, comprehend, and write good syntheses of literature. The student should also have a good conceptual understanding of statistics from a teaching and learning perspective and an interest in the learning of statistics from the primary to the tertiary level.

Contact:

Maxine Pfannkuch (m.pfannkuch@auckland.ac.nz)

Project 12: Modelling Competition and Dispersal in a Statistical Phylogeographic Framework

The processes that govern the spatial distribution of species are complex. Traditional approaches in ecology generally rely on the hypothesis that adaptation to the environment is the main force driving this distribution. We propose an alternative explanation which assumes that species are found in certain places simply because they were the first to colonize these locations during the course of evolution. We have recently designed a stochastic model that explains the observed spatial distribution of species using a combination of dispersal events (i.e., species migrating to new territories) and competition between species [1]. The reliability with which the parameters of our model are estimated is essential from a biological standpoint. In this project, the candidate will run in silico experiments and analyze real data in order to validate the software Phyloland [2] that implements our dispersal-competition model.

[1] Louis Ranjard, David Welch, Marie Paturel and Stéphane Guindon. Modelling Competition and Dispersal in a Statistical Phylogeographic Framework. Syst Biol (2014) 63 (5): 743-752

[2] http://cran.r-project.org/web/packages/phyloland/index.html

Contact:

Stephane Guindon (s.guindon@auckland.ac.nz)

Louis Ranjard (l.ranjard@auckland.ac.nz)

Project 13: Accessible graphics for data on maps

This project will look at ways in which people display data about what is happening at different geographical locations, in order to inform scope and design decisions for a data-displayed-on-maps module for the iNZight package ( www.stat.auckland.ac.nz/~wild/iNZight/ ), and will build some prototypes. In addition to decisions about what gets displayed and how, thought is needed about which input-data formats should be catered for.

Requirements

STATS 220 and STATS 380. Some exposure to Geographic Information Systems would be useful background, but is not necessary.

Contact:

Chris Wild (c.wild@auckland.ac.nz)

Project 14: Multivariate procedures when p>n

Many multivariate procedures involve the inverse of an estimate of the covariance matrix. Inverting the MLE is not possible when the number of variables exceeds the sample size, as the estimated covariance matrix is singular in this case. Regularisation is a standard way of coping with this situation, but the properties of multivariate procedures using inverses of regularised estimates have not been widely studied. In this project we will explore the properties of some standard procedures that have been modified in this way.
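The singularity described above can be seen in a very small example. The sketch below uses made-up data with n = 2 observations on p = 3 variables and exact rational arithmetic; the ridge-type adjustment (adding a constant to the diagonal) is one simple form of regularisation chosen here for illustration, and the project may well consider other estimators. The project itself would be done in R.

```python
from fractions import Fraction

def mle_covariance(rows):
    """Maximum likelihood covariance estimate (divisor n) from a list of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(Fraction(r[j]) for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
         for j in range(p)]
        for i in range(p)
    ]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def ridge(m, lam):
    """Return m + lam * I, a simple regularised estimate."""
    size = len(m)
    return [[m[i][j] + (lam if i == j else 0) for j in range(size)]
            for i in range(size)]

# Hypothetical data: n = 2 observations on p = 3 variables, so p > n.
data = [[1, 2, 3], [2, 4, 7]]
S = mle_covariance(data)

det_S = det3(S)                              # zero: S is singular, no inverse
det_ridge = det3(ridge(S, Fraction(1, 10)))  # positive: regularised estimate is invertible
```

A zero determinant means the estimate has no inverse, while any positive value added to the diagonal makes the matrix positive definite and hence invertible. How procedures built on such inverses behave is the question the project addresses.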

Some familiarity with R is required.

Contact:

Alan Lee (aj.lee@auckland.ac.nz)

Project 15: Official statistics and issues in measurement of Maori health outcomes

This project will assess the impact of data quality issues in Maori health statistics, using a combination of cancer registry, mortality collection and Statistics NZ population data.

Background: Ideally, students should have done either STATS 340 or STATS 326.

Contact:

Andrew Sporle (a.sporle@auckland.ac.nz)
