SEMINAR
THURSDAY 1ST MAY 2014 14:30 – 15:30
Room OS6 – Oakfield House
Nuala A. Sheehan
Department of Health Sciences, University of Leicester
(Joint work with James Cussens & Mark Bartlett, York)
“Maximum Likelihood Pedigree Reconstruction from Genetic Marker Data”
Abstract
The problem of estimating relationships amongst a certain group of individuals from genetic marker
data (‘pedigree reconstruction’) is highly relevant to areas as diverse as evolution and conservation
research, forensic science identification problems, and epidemiological and genealogical research.
In theory, estimating the pedigree for a given set of individuals from genetic marker data simply
requires consideration of all possible relationships amongst them and then computing the likelihood
for each. Due to the large number of possible pedigrees, brute force enumeration rapidly becomes
impractical. When we consider the joint distribution of genotypes on a pedigree, the reconstruction
problem is one of Bayesian network (BN) learning or, more generally, graphical structure estimation, a
problem known to be NP-hard. The desired graph has structural constraints: it must be acyclic, each
node is known to have exactly two parent nodes (although either may be latent) and these parents
must be of opposite sexes. Maximum likelihood pedigree reconstruction can thus be viewed as a
constrained optimisation problem.
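As an illustration only, the structural constraints described above can be sketched as a validity check on a candidate pedigree. This is a minimal sketch and not the speakers' method: the function name `is_valid_pedigree`, the `(father, mother)` encoding with `None` for a latent parent, and the `"M"`/`"F"` sex labels are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding: parents[i] = (father, mother); None = latent parent.
Parents = Dict[str, Tuple[Optional[str], Optional[str]]]


def is_valid_pedigree(parents: Parents, sex: Dict[str, str]) -> bool:
    """Check the structural constraints on a candidate pedigree.

    Assumes every individual that appears in `parents` is also a key of `sex`.
    """
    # Two parents of opposite sexes: the father slot must hold a male,
    # the mother slot a female (either slot may be latent).
    for father, mother in parents.values():
        if father is not None and sex[father] != "M":
            return False
        if mother is not None and sex[mother] != "F":
            return False

    # Acyclicity: no individual may be their own ancestor (depth-first search).
    state: Dict[str, int] = {}  # 1 = on the current path, 2 = fully explored

    def has_cycle(node: str) -> bool:
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        for parent in parents.get(node, (None, None)):
            if parent is not None and has_cycle(parent):
                return True
        state[node] = 2
        return False

    return not any(has_cycle(individual) for individual in sex)
```

A reconstruction method would score only candidates that pass such a check; enumerating and scoring every valid candidate is the brute force approach that quickly becomes impractical.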
We propose an integer linear programming (ILP) optimisation approach to pedigree learning which is
adapted to find valid pedigrees by imposing appropriate constraints. Under the usual simplifying
conditions, our method, unlike others, is not restricted to small pedigrees and is guaranteed to return
a maximum likelihood pedigree. With additional constraints, we can also search for multiple high
probability pedigrees and thus account for the inherent uncertainty in any particular pedigree
reconstruction. The true pedigree is found very quickly by comparison with other methods when all
individuals are observed. The fact that we have a tool that solves large problems in the simplest
setting bodes well for extensions to smaller but more realistic problems when the complexity is due to
missing and correlated data rather than pedigree size and structure.
Biography
Nuala Sheehan obtained a Bachelor’s degree in Mathematics and French, a Higher Diploma in
Education and a Master’s degree in Mathematics at University College Dublin (Ireland). She then
moved to Seattle (USA) and studied for a Master’s degree and PhD in Statistics at the University of
Washington. She came to the UK in 1990 as a post-doctoral RA in the School of Mathematical
Sciences at the University of Bath. In January 1993, she moved to the School of Mathematics and
Statistics at the University of Sheffield as a Wellcome Trust Research Training Fellow in Mathematical
Biology. In January 1995, she was appointed to Lecturer in Statistics at the Department of
Mathematical Sciences, Loughborough University. She joined the University of Leicester in July 2000
as a Senior Research Fellow in the Department of Epidemiology and Public Health. In November
2003, she received a Value in People Award, funded by the Wellcome Trust, from the University of
Leicester and was appointed to Senior Lecturer in Statistical Genetics, jointly with the Departments of
Genetics and Health Sciences, in November 2004. She was promoted to Reader in April 2007,
awarded a Leverhulme Research Fellowship in October 2009 and promoted to Professor of Statistical
Genetics in April 2013. Her research is largely focussed on the development of statistical
methodology motivated by problems in genetics. Areas of interest include: graphical modelling of
genetic problems; probability and likelihood calculations on complex pedigrees; estimation of
relationships from genetic marker data; inferring causality from observational epidemiological data via
instrumental variables; and consideration of confidentiality issues for sharing genetic study data.
ALL WELCOME