Determining paternity and gene flow from cherry microsatellite data

T. Connolly1, J. E. Cottrell1, S. P. Vaughan2 and K. Russell3
1 Forest Research, Northern Research Station, Roslin, Midlothian, EH25 9SY,
UK. e-mail: thomas.connolly@forestry.gsi.gov.uk
2 Department of Tropical Plant and Soil Sciences, University of Hawai’i, 3190
Maile Way, Honolulu, HI 98622, USA.
3 Horticulture Research International, East Malling, UK.
Abstract
There are concerns that the increasing fragmentation of today’s woodlands may
result in restricted gene flow between populations. Such isolation could lead to the
loss of genetic diversity as well as to inbreeding depression. The extent of gene flow
and the risk of genetic isolation are therefore important considerations in the
development of appropriate conservation strategies for our increasingly fragmented
forest habitats (Smouse & Sork, 2004).
Gene flow can be estimated by both direct and indirect measures. Indirect
approaches provide an insight into past gene flow and colonisation processes but
often fail to provide information on the effects of recent changes such as landscape
fragmentation. However, the development of highly variable, codominant
microsatellite markers enables direct estimates of contemporary gene flow to be
obtained. Although these highly polymorphic markers make it technically possible to
study the extent of gene flow, the analysis of such data raises several statistical
issues.
Direct methods use genetic variation in the progeny to identify directly the parental
contribution and to calculate dispersal parameters of pollen or seed. The earliest and
conceptually simplest technique of parentage analysis is exclusion, where
incompatibilities between potential parents and offspring lead to rejection of particular
parent-offspring combinations. Although the allelic diversity obtained using several
microsatellite loci has been found to be large in early studies of paternity in trees
(Dow & Ashley, 1996; Streiff et al., 1999) the exclusion approach has certain
limitations. Under strict exclusion a single mismatch is enough to exclude a candidate
parent and, consequently, genotyping errors, null alleles and mutations can
contribute to false exclusions. These limitations suggest exclusion analysis is best
suited to situations in which there are few candidate parents.
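
The exclusion rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the genotype layout, the locus names, the tree names and the `max_mismatches` parameter are all assumptions made for the example. A candidate is treated as incompatible at a locus when it shares no allele with the offspring.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: locus name -> pair of allele sizes (codominant marker).
Genotype = Dict[str, Tuple[int, int]]


def count_mismatches(offspring: Genotype, candidate: Genotype) -> int:
    """Count loci at which the candidate shares no allele with the offspring."""
    mismatches = 0
    for locus, alleles in offspring.items():
        if locus not in candidate:
            continue  # locus not typed in the candidate: no evidence either way
        if not set(alleles) & set(candidate[locus]):
            mismatches += 1
    return mismatches


def non_excluded(offspring: Genotype,
                 candidates: Dict[str, Genotype],
                 max_mismatches: int = 0) -> List[str]:
    """Return candidates that survive exclusion.

    max_mismatches=0 is strict exclusion: one incompatible locus rejects
    the candidate. A larger value tolerates some genotyping error.
    """
    return [name for name, genotype in candidates.items()
            if count_mismatches(offspring, genotype) <= max_mismatches]


# Made-up genotypes for illustration only.
embryo = {"locus1": (120, 124), "locus2": (200, 206)}
trees = {
    "tree_A": {"locus1": (120, 122), "locus2": (206, 210)},
    "tree_B": {"locus1": (118, 122), "locus2": (200, 206)},
    "tree_C": {"locus1": (124, 124), "locus2": (202, 204)},
}

strict = non_excluded(embryo, trees)
relaxed = non_excluded(embryo, trees, max_mismatches=1)
```

Under strict exclusion only `tree_A` remains, whereas tolerating one mismatch leaves all three trees as candidates. This shows both why a single genotyping error can cause a false exclusion and why relaxing the rule quickly increases ambiguity.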
With several loci and many potential parents it becomes necessary to use more
sophisticated statistical analyses. The LOD (log-odds ratio) score is the logarithm of
the ratio of the likelihood that an individual is the parent of a given offspring to the
likelihood that the potential parent and offspring are unrelated. In an isolated
population, after an exhaustive evaluation of all genetically possible parents,
offspring are assigned to the candidate parent with the highest LOD score (Meagher
& Thompson, 1986; Slate et al., 2000). This approach has been further developed by
Gerber et al. (2000, 2003) in the software program FAMOZ, where parentage in a
non-isolated sub-population of adult trees can be analysed. The total gene flow is
sub-divided into two components: that from outside and that from inside the stand. In
situations in which the genotyped population forms part of a much larger population it
is likely that gene flow from outside the stand is underestimated because foreign and
local gametes are indistinguishable, thus generating an undetected ‘cryptic gene
flow’.
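
As a rough illustration of the LOD score defined above, the following Python sketch computes a single-parent LOD for codominant loci. It is not FAMOZ code. It assumes Hardy-Weinberg proportions, independent loci, natural logarithms and no mistyping error term, and the names `transmission_prob`, `genotype_prob` and `lod_score` are hypothetical.

```python
import math
from typing import Dict, Tuple

Genotype = Dict[str, Tuple[int, int]]        # locus -> (allele, allele)
Frequencies = Dict[str, Dict[int, float]]    # locus -> allele -> frequency


def transmission_prob(child: Tuple[int, int],
                      parent: Tuple[int, int],
                      freq: Dict[int, float]) -> float:
    """P(child genotype | one known parent), other allele drawn from the population."""
    a, b = child
    total = 0.0
    for allele in parent:            # each parental allele is passed on with prob. 0.5
        if allele == a:
            total += 0.5 * freq[b]   # the population gamete must supply b
        elif allele == b:
            total += 0.5 * freq[a]   # the population gamete must supply a
    return total


def genotype_prob(child: Tuple[int, int], freq: Dict[int, float]) -> float:
    """P(child genotype) for unrelated individuals under Hardy-Weinberg."""
    a, b = child
    return freq[a] ** 2 if a == b else 2.0 * freq[a] * freq[b]


def lod_score(offspring: Genotype, candidate: Genotype, freqs: Frequencies) -> float:
    """Sum over loci of log(P(child | parent) / P(child | unrelated))."""
    lod = 0.0
    for locus, child in offspring.items():
        t = transmission_prob(child, candidate[locus], freqs[locus])
        if t == 0.0:
            # Without an error term an incompatible locus excludes the candidate.
            return float("-inf")
        lod += math.log(t / genotype_prob(child, freqs[locus]))
    return lod


# Made-up frequencies and genotypes for illustration only.
freqs = {"locus1": {120: 0.5, 124: 0.5}}
embryo = {"locus1": (120, 120)}

lod_homozygote = lod_score(embryo, {"locus1": (120, 120)}, freqs)
lod_heterozygote = lod_score(embryo, {"locus1": (120, 124)}, freqs)
lod_incompatible = lod_score(embryo, {"locus1": (124, 124)}, freqs)
```

A candidate that always transmits the required allele scores higher than one that transmits it only half the time. An incompatible candidate scores minus infinity here because no error rate is modelled; a likelihood approach with a mistyping parameter would instead give such a candidate a low but finite score.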
Because the genotyped sub-population forms part of a much larger population, Gerber
et al. (2000) set out a rationale for deciding whether a given individual can be
considered a true parent. Two simulations based on a theoretical, large, random
mating population each generate a LOD frequency distribution curve for N
hypothetical embryos. In the first simulation a set of possible gametes is generated
based on the genotyped parents. In the second simulation the gametes are
generated based on alleles in the genotyped parents according to their frequencies in
the genotyped parent population. FAMOZ requires values for the following simulation
parameters: departure from Hardy-Weinberg equilibrium, calculation (mistyping) error
and simulation error. These parameters are varied until the best separation of the two
curves is found.
The LOD threshold is the value at which the two distributions intersect.
A parent is considered to come from inside the stand if it has the highest LOD score
above the threshold. Where no parent LOD score exceeds the threshold the embryo
is considered to be fathered from outside the stand.
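
The assignment rule just described can be written as a short Python sketch. The function name `assign_father` and the example scores are hypothetical. The threshold of 5.0 is used purely as an example value, and "above the threshold" is read here as a strict inequality, which is an assumption.

```python
from typing import Dict, Optional


def assign_father(lod_by_candidate: Dict[str, float],
                  threshold: float) -> Optional[str]:
    """Return the candidate with the highest LOD score if it exceeds the threshold.

    None means that no candidate inside the stand qualifies, so the embryo
    is attributed to pollen from outside the stand.
    """
    if not lod_by_candidate:
        return None
    # If two candidates tie for the highest score, the first one listed is returned.
    best = max(lod_by_candidate, key=lod_by_candidate.get)
    return best if lod_by_candidate[best] > threshold else None


# Made-up LOD scores for two embryos.
inside = assign_father({"tree_A": 7.2, "tree_B": 5.4, "tree_C": 1.1}, threshold=5.0)
outside = assign_father({"tree_A": 4.9, "tree_B": 2.0}, threshold=5.0)
```

In the first case `tree_A` is assigned even though `tree_B` also exceeds the threshold, because only the top-scoring candidate is considered. In the second case no score exceeds the threshold, so the result is `None` and the embryo would be counted as gene flow from outside the stand.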
The analysis presented here uses data collected from a native stand of wild cherry to
compare estimates of pollen dispersal obtained either by exclusion or from
likelihood-based LOD scores using the FAMOZ software. The 29 ha study site is situated near East
Malling, Kent, England and consists of 248 wild cherry trees (Prunus avium). A total
of 420 embryos collected from 10 mother trees were analysed. Both adult trees and
embryos were analysed using 13 microsatellite loci. The data were subjected to
paternity analysis by exclusion and by LOD threshold.
The results are based on a preliminary set of simulations (N = 1000). The parameter values
which resulted in the best separation of the two simulation curves were: departure
from Hardy-Weinberg equilibrium = 0, calculation error = 0.005 and simulation error =
0.010. The LOD threshold was 5, and the correct paternity assignment was
achieved in 82.9% of cases if the tree with the highest LOD score exceeding the
threshold was taken to be the actual father.
The LOD threshold approach is shown to make better use of the data by allowing the
identity of each father to be determined to a known level of accuracy. Using simple
exclusion it was only possible to identify the father unambiguously in 30% of
embryos, compared with 53% using the LOD threshold. Although simple
exclusion identified a further 29% of embryos as having multiple candidate fathers,
their paternity remained ambiguous.
References
Dow, B.D. and Ashley, M.V. (1996) Microsatellite analysis of seed dispersal and
parentage of saplings in bur oak, Quercus macrocarpa. Molecular Ecology 5, 615-627.
Gerber, S., Mariette, S., Bodénès, C. and Kremer, A. (2000) Comparison of
microsatellites and amplified fragment length polymorphism markers for parentage
analysis. Molecular Ecology 9, 1037-1048.
Gerber, S., Chabrier, P. and Kremer, A. (2003) FAMOZ: a software for parentage
analysis using dominant, codominant and uniparentally inherited markers. Molecular
Ecology Notes 3, 479-481.
Meagher, T.R. and Thompson, E. (1986) The relationship between single parent
and parent pair genetic likelihoods in genealogy reconstruction. Theoretical
Population Biology 29, 87-106.
Slate, J., Marshall, T. and Pemberton, J. (2000) A retrospective assessment of the
accuracy of the paternity inference program CERVUS. Molecular Ecology 9, 801-808.
Smouse, P.E. and Sork, V.L. (2004) Measuring pollen flow in forest trees: an
exposition of alternative approaches. Forest Ecology and Management 197, 21-38.
Streiff, R., Ducousso, A., Lexer, C., Steinkellner, H., Gloessl, J. and Kremer, A.
(1999) Pollen dispersal inferred from paternity analysis in a mixed oak stand of
Quercus robur L. and Quercus petraea (Matt.) Liebl. Molecular Ecology 8, 831-841.