Multivariate analysis of allozyme variation patterns in coastal Douglas-fir from
southwest Oregon1
S. A. MERKLE2
AND
W. T. ADAMS
Department of Forest Science, College of Forestry, Oregon State University, Corvallis, OR, U.S.A. 97331
AND
R. K. CAMPBELL
Pacific Northwest Forest and Range Experiment Station, United States Department of Agriculture, Forest
Service, Corvallis, OR, U.S.A. 97331
Received January 14, 1987
Accepted October 13, 1987
MERKLE, S. A., ADAMS, W. T., and CAMPBELL, R. K. 1988. Multivariate analysis of allozyme variation patterns in
coastal Douglas-fir from southwest Oregon. Can. J. For. Res. 18: 181-187.
Isozyme data collected from megagametophytes of coastal Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco var.
menziesii) parent trees, representing 22 southwest Oregon breeding zones, were analyzed by multivariate techniques
to describe the distribution of genotypic variation among and within breeding zones and to relate genotypic and
environmental variation. Data entered were mean haploid genotype scores obtained by averaging two haploid genotype
scores from each parent tree. Haploid genotype scores were created from 27-locus haploid genotypes of two
megagametophytes collected from each of 1230 parent trees. Although principal components analysis did not indicate the
presence of linkage disequilibria among loci, canonical discriminant analysis suggested that much more genotypic variation
may be accounted for by breeding-zone differences than was evident from single-locus techniques. The first two canonical
variables, which accounted for ~25% of the genotypic variation, appeared to separate breeding zones on the basis
of geographic and elevational differences among zones. Regressing canonical variable scores against location variables
failed to provide a model attributing >10% of genotypic variation to latitude, elevation, or distance from the ocean.
Although canonical correlation analysis of mean haploid genotype scores with the same location variables produced
two significant canonical variables accounting for 39% of the variation, little of the variation accounted for by the
canonical variables was related to location variables. Although these results may be due to the small geographic scale
of the study, the region covered is characterized by extreme environmental heterogeneity, to which variability in
seedling quantitative traits has been strongly correlated in a companion common garden study. In sum, multivariate
techniques were not markedly better than single-locus techniques in providing evidence that allozyme variation is adaptive
in the coastal Douglas-fir breeding zones studied. Consequently, multivariate techniques cannot be expected to improve
the use of allozymes for certifying seed or for designating breeding zones in this region.
1 Paper No. 2148, Forest Research Laboratory, Oregon State University, Corvallis.
2 Present address: School of Forest Resources, University of Georgia, Athens, GA, U.S.A. 30602.
Printed in Canada / Imprimé au Canada
CAN. J. FOR. RES. VOL. 18, 1988
Introduction
Studies of genetic diversity in coastal Douglas-fir
(Pseudotsuga menziesii (Mirb.) Franco var. menziesii) have
led to opposing conclusions depending on whether the
investigated characteristics were quantitative (seedling) traits
or single-locus (allozyme) markers. Common garden studies
have revealed strong associations between patterns of genetic
variation in quantitative traits and environmental variables,
including evidence of consistent clines over geographic
transects, which suggests that adaptation has influenced the
observed patterns (Irgens-Moller 1967; Griffin and Ching
1977; Hermann and Lavender 1968; Campbell and Sorensen
1978; Loopstra 1984). Allozyme studies, on the other hand,
although indicating high levels of genetic diversity in coastal
Douglas-fir, have shown only weak associations between
allele frequencies and environmental variables (Li 1986;
Yang et al. 1977; Yeh and O'Malley 1980). Furthermore,
<10% of the total genetic diversity as defined by Nei (1973)
in coastal Douglas-fir resides among populations and over
90% within populations (Yeh and O'Malley 1980;
El-Kassaby and Sziklai 1982; Merkle and Adams 1987).
Patterns of genetic variation within and among southwest
Oregon breeding zones have been investigated for both
quantitative traits and single loci. This region is characterized
by variable climate and topography. Climate changes
dramatically from the coast eastward, with decreasing
rainfall, decreasing winter temperatures, and increasing summer
temperatures (Franklin and Dyrness 1973). Although
quantitative traits showed strong clines over environmental
gradients and varied substantially among breeding zones
(Loopstra 1984), single loci did not (Merkle and Adams
1987). Lack of breeding-zone differences in allele
frequencies, however, does not necessarily mean that these loci are
neutral to selection pressure. In fact, Lewontin (1984) has
shown that differentiation is much more difficult to detect
statistically at the single-locus than at the quantitative-trait
level.
This inability to detect differences among populations at
the single-locus level has prompted application of
multivariate statistical techniques to the analysis of
polymorphic marker loci (Smouse et al. 1982). Whereas
conventional analyses restrict the researcher to viewing variation
at individual loci or their average over several loci,
multivariate techniques make it possible to examine variation
in multilocus sets. Furthermore, since genes do not act
independently of other loci in the genome, examining multilocus
sets may reveal differences among populations not reflected
at individual loci. The use of multivariate techniques may
even provide a crude method for screening coadapted gene
complexes that could then be examined further.
Recently, Conkle and Westfall (1984) have modified the
methods described by Smouse et al. (1982) to evaluate
allozyme differences among ponderosa pine (Pinus
ponderosa Dougl. ex Laws.) breeding zones in California.
Using canonical correlation analysis, the authors could
account for 49% of the genotypic variation with two
canonical variables associated with latitude, longitude, and
elevation. Similarly, in a canonical discriminant analysis of
allozyme data from 17 populations of lodgepole pine (Pinus
contorta Dougl. ex Loud.) in the Yukon and British
Columbia, Yeh et al. (1985) accounted for 38% of the
variation in 20 polymorphic loci with two significant canonical
functions associated with latitude and elevation. In this
paper, we examine patterns of allozyme variation among
and within breeding zones of coastal Douglas-fir in
southwest Oregon using multivariate techniques. The objectives
were to determine whether multivariate analyses would
uncover more genetic differentiation among zones than was
revealed when the same data were analyzed with
conventional single-locus techniques (Merkle and Adams 1987), and
whether multivariate patterns of allozyme variation are
associated with environmental variables. The ability to
genetically differentiate among Douglas-fir populations with
multivariate techniques would have practical applications
in seed certification and in refining seed- and breeding-zone
boundaries (Conkle and Westfall 1984).
Materials and methods
Wind-pollinated seeds of 31-72 (mean, 56) trees were collected
from each of 22 Douglas-fir breeding zones in southwest Oregon
(Table 1, Fig. 1). Breeding zones, established in regional
Douglas-fir tree improvement programs (Silen and Wheat 1979), are
elevation bands within geographically designated areas called breeding
units. Each breeding zone spans an altitudinal range of <300 m
(1000 ft) and is generally smaller than 60 000 ha (150 000 acres).
Subsamples of the 300+ parent trees initially selected for breeding
from wild stands in each zone were chosen for study. Among the
trees for which wind-pollinated seed was stored, individuals were
chosen so as to be distributed as uniformly as possible over the
entire breeding zone. Megagametophytes from two seeds sampled
from each parent tree were analyzed electrophoretically for
27 allozyme loci. Further information on the sampled zones and
trees and details of seed preparation, electrophoretic methods,
allozyme loci, and single-locus population analyses can be found
in Merkle and Adams (1987). In addition, location data compiled
for 1180 of the 1230 parent trees included latitude to the nearest
0.001°, distance inland from the Pacific Ocean (i.e., distance east
of 124°30' longitude) to the nearest 0.1 km, and elevation to the
nearest 5 m.
Scoring procedure
Allelic data for each of the megagametophytes were first
transformed to haploid genotype scores in a manner similar to that
employed for diploid genotypes by Smouse et al. (1982). With
diploid genotypes, the genotype score for an allele is 1, 1/2, or
0 depending on whether the allele is homozygous, heterozygous,
or absent in the individual, respectively. Since only haploid
genotype data were available, we modified the scoring procedure
with a method suggested by R.D. Westfall (Pacific Southwest
Forest and Range Experiment Station, United States Department
of Agriculture, Forest Service, personal communication, 1983) in
which a megagametophyte's genotype score for a given locus with
k alleles consists of k-1 positions coded with 0's and 1's. For the
first k-1 alleles, a 1 indicates that the allele is present, a 0 that it
is absent. Zeros in all k-1 positions indicate that the kth allele is
present. For example, in the case of a three-allele locus, the two
positions can be filled in one of three ways: 1 0, indicating allele
1; 0 1, indicating allele 2; or 0 0, indicating allele 3. Thus the
haploid genotype score for a given locus is a vector of zeros and
ones, and haploid genotype scores for all loci may be combined
into one multilocus vector for each megagametophyte. With this
procedure, we transformed each 27-locus haploid genotype into
a 78-variable haploid genotype score and performed subsequent
analyses using the resulting set of haploid scores.
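The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `haploid_score` and `multilocus_score` are hypothetical, alleles are assumed to be numbered 1 to k at each locus, and the three example loci are invented rather than taken from the study.

```python
def haploid_score(allele, k):
    """Score one locus with k alleles as k-1 indicator positions.

    Assumption of this sketch: alleles are numbered 1..k.
    Alleles 1..k-1 put a single 1 in their own position;
    allele k is coded as zeros in all k-1 positions.
    """
    if not 1 <= allele <= k:
        raise ValueError("allele must be between 1 and k")
    return [1 if allele == position else 0 for position in range(1, k)]


def multilocus_score(alleles, allele_counts):
    """Concatenate per-locus scores into one multilocus vector."""
    if len(alleles) != len(allele_counts):
        raise ValueError("need one allele and one allele count per locus")
    score = []
    for allele, k in zip(alleles, allele_counts):
        score.extend(haploid_score(allele, k))
    return score


# The three-allele example from the text: alleles 1, 2, and 3 in turn.
three_allele_codes = [haploid_score(a, 3) for a in (1, 2, 3)]

# Hypothetical megagametophyte typed at three loci with 3, 2, and 4 alleles.
example = multilocus_score([2, 1, 4], [3, 2, 4])
```

The length of the multilocus vector is the sum of k-1 over loci, which is how 27 loci yield the 78 variables reported in the text.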
As noted earlier, each parent tree was represented by only two
megagametophytes; thus, we could not obtain the true diploid
genotype of each parent tree. Although we might have drawn a
single megagametophyte per mother tree (the loss in precision
would have been minimal because the second megagametophyte
of each pair does not represent an independent gamete drawn from
the population), we decided to use both available gametophyte
genotypes in order to compare the results of this analysis with those
of the companion study (Merkle and Adams 1987), which applied
single-locus techniques to the same data set.
Following another suggestion by R.D. Westfall (personal
communication, 1985), we computed mean haploid genotype scores
for each maternal tree based on the two megagametophytes sampled
from each tree and performed a second set of analyses using the
mean haploid scores. Mean haploid scores based on large numbers
of megagametophytes (i.e., >7) sampled per mother tree would
closely approximate the diploid genotype scores employed by
Smouse et al. (1982). With samples of only two megagametophytes
per mother tree, however, mean haploid scores must be considered
crude approximations of diploid scores because the probability of
detecting both alleles at a heterozygous locus is only 0.50. Despite
limitations in interpreting mean haploid scores in this study, we
expected multivariate analyses based on these scores to more
effectively discriminate among breeding zones than analyses based
on individual megagametophyte scores. Eliminating the variation
between megagametophytes from the same tree was likely to reduce
genetic variation within breeding zones, relative to that among
zones, such that multivariate analyses would be better able to detect
among-zone variation.
TABLE 1. Coastal Douglas-fir seed sources, southwest Oregon, for
the multivariate allozyme analysis
Breeding unit          Elevation range    No. of parent
by breeding zone       (x 100 m)          trees
Butte Falls
  BF1                   7.6-10.7          63
  BF2                  10.7-13.7          57
  BF3                  13.7-16.8          49
North Gold Beach
  GBN1                  0-4.6             58
  GBN2                  4.6-7.6           57
South Gold Beach
  GBS1                  0-4.6             57
  GBS2                  4.6-7.6           31
Grants Pass
  GP1                   3.0-6.1           43
  GP2                   6.1-9.1           39
Jacksonville
  JV1                   4.6-7.6           69
  JV2                   7.6-10.7          48
  JV3                  10.7-13.7          49
North Umpqua
  NU1                   1.5-6.1           72
  NU2                   6.1-7.6           68
  NU3                   7.6-9.1           72
  NU4                   9.1-10.7          70
  NU5                  10.7-12.2          57
South Umpqua
  SU1                   3.0-6.1           51
  SU2                   6.1-7.6           49
  SU3                   7.6-9.1           67
  SU4                   9.1-10.7          62
  SU5                  10.7-13.7          42
Statistical analysis
The first multivariate procedure was a principal components
analysis (PRINCOMP; SAS Institute, Inc. 1982), applied to both
individual and mean haploid genotype scores without regard to
location or breeding-zone affiliation of parent trees, to determine
if alleles were "clustered" into groups within which the alleles
displayed similar patterns of variation. If any alleles were so
clustered, then variation in haploid scores could be summarized
by fewer principal components than the original 78 variables. We
conducted principal components analysis based on both correlation
and variance-covariance matrices. In four additional steps, we
examined how haploid scores might be associated with the location
of parent trees to establish (i) whether genotypes could be grouped
according to the breeding zones from which they originated, and
(ii) whether haploid scores were related to the environment
surrounding the parent trees.
To examine haploid scores with respect to breeding zones, we
used canonical discriminant analysis (CANDISC; SAS Institute,
Inc. 1982) to estimate the canonical correlations between haploid
scores and a set of dummy variables coded for breeding zones.
Because each resulting canonical variable accounts for a
given proportion of the variation in haploid scores, canonical
discriminant analysis provides an index of the proportion of
variation accounted for by differences among breeding zones, the
canonical R².
If the variation in haploid scores results from selection by the
local environment, the average canonical variable scores for
contiguous or environmentally similar breeding zones might be
expected to be more similar than those for noncontiguous or
environmentally different breeding zones. To examine the potential
similarities or differences, we first plotted the mean canonical
variable scores of each breeding zone for pairs of canonical
variables. Then we conducted a hierarchical cluster analysis of the
Mahalanobis distances (Rao 1973), generated by CANDISC,
between each pair of breeding zones. Three clustering algorithms
were tried: average linkage, complete linkage, and single linkage
(CLUSTER; SAS Institute, Inc. 1982).
Variation among breeding zones might also follow regional
gradients because environments in mountainous areas tend to vary
along gradients. We therefore regressed the canonical variable
scores for each significant (p < 0.05) canonical variable against
the location variables latitude (L), distance from the ocean (D),
and elevation (E). The preliminary model for each canonical
variable had nine terms (L, D, E, L², D², E², L x D, L x E,
D x E) and was modified by stepwise backward elimination
(STEPWISE; SAS Institute, Inc. 1982) to a final model retaining
only terms significant at p < 0.10. Each final model was tested
for lack of fit (Draper and Smith 1966, p. 26).
FIG. 1. Douglas-fir breeding units in southwest Oregon. Shaded
units were included in this study and are subdivided into two to
five breeding (elevation) zones (from Merkle and Adams 1987).
The canonical variable scores within each breeding zone used
in the above regression were adjusted by CANDISC to the
breeding-zone mean. Because the adjustment is based on a linear
combination chosen to maximize differences among breeding zones, it could
TABLE 2. Correlation coefficients generated from canonical discriminant
analysis of mean haploid genotype scores and breeding zones for the first four
significant (p < 0.0001) canonical variables
Canonical variable
Locus
Fest
Pgm1
Pgm2
Pgi1
Pgi2
Lap1
Lap2
Pep2
Got1
Got2
Got3
G6pd
Gly
Cat
Gdh
Sod
Mpi
Fhex
6pgd
Aco1
Aco2
Idh
Dia
Mdh1
Mdh2
Mdh3
Mdh4
Eigenvalue
Percent of variance
Canonical correlation
CAN1
CAN2
CAN3
CAN4
0.0419
0.0276
0.1310
0.0792
-0.0711
0.2137
-0.1045
0.1857
0.2067
0.2881
0.0376
0.1087
0.0751
0.3339
-0.1215
-0.1803
0.1430
-0.2236
0.1303
0.1152
0.0536
0.0281
0.0440
0.0053
0.0760
0.1392
0.0172
0.0212
0.0973
0.0796
0.1023
0.0106
-0.0448
0.0291
0.0734
0.1240
0.8425
-0.0599
0.0349
0.0383
0.0820
0.2888
15.50
0.4734
-0.0547
0.1515
0.2547
-0.1716
-0.0758
0.0753
0.0769
-0.1990
0.1835
0.0981
-0.1940
0.0710
-0.1076
0.2280
0.2277
-0.0599
-0.0735
0.0663
0.1154
0.1229
0.1896
0.1295
0.1483
0.0307
0.0707
0.1624
0.0725
0.2446
0.0584
-0.1083
0.2689
0.0399
0.1303
-0.1239
-0.0447
0.0299
0.1066
0.0037
-0.1360
-0.1079
0.0687
0.0842
0.1186
-0.0449
0.0979
0.1911
-0.0486
-0.1381
0.1089
0.1424
-0.2620
0.1925
0.0622
0.0572
-0.2313
0.1809
-0.1333
0.1886
10.12
0.1696
9.10
0.1405
7.54
0.3984
0.3808
0.3510
0.1498
0.1317
-0.0304
0.1392
0.2569
0.0762
0.1073
0.1077
0.0109
0.0856
NOTE: Only the largest correlation coefficient at each locus is shown.
either camouflage or accentuate the relationship between the
haploid scores and location variables, depending on whether those
maximum differences were more closely associated with geographic
or other factors. To check these possibilities, we compared results
from the CANDISC analysis with those from a canonical
correlation analysis (CANCORR; SAS Institute, Inc. 1982), in which the
nine terms formed from L, D, and E (the three variables, their
squares, and their pairwise products) composed the X matrix and the
haploid scores the Y matrix.
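As a rough illustration of how one row of the X matrix might be built, the sketch below expands a tree's latitude, distance from the ocean, and elevation into the nine location terms (linear, squared, and pairwise-product terms) named for the preliminary regression model. The function name `location_terms` and the example coordinates are hypothetical and are not values from the study.

```python
def location_terms(latitude, distance, elevation):
    """Expand L, D, and E into the nine location terms.

    Returns a dict so that each term keeps a readable label;
    insertion order follows the order the terms are listed in the text.
    """
    L, D, E = latitude, distance, elevation
    return {
        "L": L,
        "D": D,
        "E": E,
        "L^2": L * L,
        "D^2": D * D,
        "E^2": E * E,
        "LxD": L * D,
        "LxE": L * E,
        "DxE": D * E,
    }


# Hypothetical tree: 42.5 degrees N, 80.0 km inland, 900 m elevation.
row = location_terms(42.5, 80.0, 900.0)
x_row = list(row.values())  # one row of the X matrix
```

Stacking one such row per parent tree gives a matrix with nine columns, which is the shape of location input the canonical correlation analysis and the stepwise regressions both start from.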
Results
Results from the analyses based on the original haploid
genotype scores and on the mean haploid genotype scores
were very similar. However, as expected, slightly more
genetic variation could be attributed to breeding zones on
the basis of mean haploid scores. Therefore, we report here
results based on mean haploid scores, unless otherwise
indicated.
Examination of the eigenvectors generated by principal
components analysis of either the correlation or
variance-covariance matrix of haploid scores showed that none of
the principal components reflected variation in more than
two or three loci. For example, the first principal
component had relatively high coefficients (or "loadings") only
for alleles of the G6pd, Gly, and Cat loci. Evidently, few
clusters of alleles varied together in the same pattern. In the
analysis based on the correlation matrix, the first principal
component (which had the largest eigenvalue and thus
explained the largest proportion of the variation) accounted
for only 3% of the variation in haploid scores. The first
48 of the 78 principal components accounted for 90% of
the variation. In the analysis based on the
variance-covariance matrix, the first principal component accounted
for 10% of the variation and the first 20 principal
components for 90%.
The first 4 of the 21 canonical variables generated by
canonical discriminant analysis were highly significant (p <
0.0001), accounting for 42% of the variation in haploid
scores (Table 2). Canonical R² values indicated that the
breeding-zone contributions to explaining variation in these
scores ranged from 22% in the first canonical variable to
12% in the fourth canonical variable. The other 17 canonical
variables were not significant (p > 0.05). The first canonical
variable was dominated by the Dia locus (Table 2). However,
the second, third, and fourth canonical variables were
associated with several loci, each having correlation
coefficients of 0.20 or higher with alleles at 5, 4, and 4 loci,
respectively (Table 2). The same analysis performed with the
original haploid genotype scores resulted in four highly
significant (p < 0.0001) canonical variables accounting for
only 36% of the variation in scores. On the basis of
canonical R² values obtained from this analysis, we found
that the breeding-zone contributions to this variation ranged
from only 13% for the first canonical variable to 8% for
the fourth canonical variable (data not shown).
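The canonical R² values quoted above can be checked against the summary rows of Table 2. The sketch below assumes the usual convention for canonical discriminant analysis, in which the canonical R² is the squared canonical correlation r² and the eigenvalue equals r²/(1 - r²); whether CANDISC reported the values in exactly this way is an assumption here. It also assumes that 0.4734, 0.3984, 0.3808, and 0.3510 are the canonical correlations and 0.2888, 0.1886, 0.1696, and 0.1405 the eigenvalues for CAN1 to CAN4, as read from the table. The function names are illustrative.

```python
def canonical_r2(r):
    """Squared canonical correlation."""
    return r * r


def eigenvalue_from_correlation(r):
    """Eigenvalue implied by a canonical correlation: r^2 / (1 - r^2)."""
    r2 = canonical_r2(r)
    return r2 / (1.0 - r2)


# Values read from the summary rows of Table 2 (CAN1 to CAN4).
correlations = [0.4734, 0.3984, 0.3808, 0.3510]
table_eigenvalues = [0.2888, 0.1886, 0.1696, 0.1405]

# Canonical R^2 as whole percentages.
r2_percent = [round(100 * canonical_r2(r)) for r in correlations]

# Largest gap between the implied and the tabulated eigenvalues.
max_gap = max(
    abs(eigenvalue_from_correlation(r) - e)
    for r, e in zip(correlations, table_eigenvalues)
)
```

The first and last entries of `r2_percent` reproduce the 22% and 12% cited in the text, and the implied eigenvalues agree with the tabulated ones to within the rounding of the four-decimal correlations.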
Plotting the breeding-zone mean canonical scores for the
first and second canonical variables, which together
represented 25.6% of the variation in haploid scores, also
reflected the moderate relationship of genotype variation
with breeding zones (Fig. 2). The first canonical variable
(horizontal axis) appeared to partially separate the breeding
zones according to breeding unit. For example, the Grants
Pass zones (GP1, GP2) were widely segregated from all
others. The first canonical variable also separated breeding
units from the west side of the Cascade Mountains (North
Umpqua (NU), South Umpqua (SU), Grants Pass) in a
north-south fashion, with higher values for the southernmost
unit, Grants Pass, and lower values for the northernmost
unit, North Umpqua. The second canonical variable (vertical
axis) seemed more strongly associated with elevation zones.
For example, the three Jacksonville zones (JV1-JV3), which
were not separated at all by the first canonical variable, were
spread from the highest elevation zone to the lowest by the
second canonical variable, the highest zone (JV3) having the
lowest value. Breeding zones within the North Umpqua and
South Umpqua (SU1-SU5) units were separated by the
second canonical variable but showed no clinal pattern with
elevation (Fig. 2).
Average-linkage clustering, based on Mahalanobis
distances between pairs of breeding zones, gave little evidence
to relate geographic or environmental differences to
variation in haploid scores. For example, some Butte Falls (BF)
and Gold Beach (GB) zones, widely separated both
geographically and environmentally, tended to cluster together
relatively early (Fig. 3). However, the Grants Pass breeding
unit seemed separated genetically from the other units in
that its two zones clustered together last with the remaining
zones. One Jacksonville zone (JV3) also clustered with the
other zones very late. Complete-linkage (maximum) and
single-linkage (minimum) clustering resulted in dendrograms
similar to that of Fig. 3.
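For readers unfamiliar with the procedure, average-linkage clustering of a distance matrix can be sketched as follows. This is a minimal pure-Python illustration of the general technique, not the SAS CLUSTER implementation used in the study; the function name `average_linkage`, the four zone labels, and the distances are all hypothetical.

```python
from itertools import combinations


def average_linkage(labels, dist):
    """Agglomerative clustering with average linkage.

    labels: list of item names.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of labels.
    """
    clusters = [(label,) for label in labels]
    merges = []

    def cluster_distance(c1, c2):
        # Mean of all pairwise item distances between the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of current clusters and merge them.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


# Hypothetical distances among four zones (not values from the study).
zones = ["A", "B", "C", "D"]
distances = {
    frozenset(("A", "B")): 1.0,
    frozenset(("A", "C")): 4.0,
    frozenset(("A", "D")): 5.0,
    frozenset(("B", "C")): 3.0,
    frozenset(("B", "D")): 6.0,
    frozenset(("C", "D")): 2.0,
}
merge_order = average_linkage(zones, distances)
```

Replacing the mean in `cluster_distance` with the maximum or the minimum of the pairwise distances gives complete or single linkage, respectively. Zones that join only at the final, largest merge distance correspond to the late-clustering pattern described for the Grants Pass zones.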
Of the location variables, only distance from the ocean
was significantly associated with scores from all four
significant canonical variables (Table 3). In no case did the
regression models account for more than 11% of the variation
in canonical variable scores. Nevertheless, lack of fit (Draper
and Smith 1966) was significant (p < 0.05) only for the
regression model relating geographic variables to scores for
the first canonical variable.
FIG. 2. Mean canonical variable scores of 22 southwest Oregon
Douglas-fir breeding zones plotted for the first two canonical
variables. See Table 1 for breeding-zone abbreviations.
FIG. 3. Average-linkage clustering of 22 southwest Oregon
Douglas-fir breeding zones, based on a matrix of Mahalanobis
distances. See Table 1 for breeding-zone abbreviations.
TABLE 3. Outcome of regressing canonical variable scores for the
first four canonical variables against the location variables latitude
(L), distance from the ocean (D), and elevation (E)
Canonical   Location variable    Probability of   Canonical
variable    (significance)       lack of fit      R²
CAN1        L (0.0001)           <0.05            0.11
            D (0.0001)
            L² (0.0001)
            D² (0.0001)
CAN2        D (0.0063)           >0.25            0.10
            [?] (0.0064)
            [?] (0.0204)
            L x E (0.0005)
            D x E (0.0001)
CAN3        D (0.0001)           >0.05            0.07
            D² (0.0001)
            E² (0.0059)
            L x D (0.0001)
            L x E (0.0001)
            D x E (0.0004)
CAN4        D (0.0037)           >0.25            0.05
            L² (0.0005)
            D² (0.0001)
            [?] (0.0050)
            L x D (0.0001)
            D x E (0.0011)
Canonical correlation of haploid scores with location
variables produced two significant (p < 0.01) canonical
correlations. The canonical variables of these first two
correlations accounted for 39% of the variation in haploid
scores (Table 4). However, the canonical R² values
associated with the canonical variables indicated that only
16% and 13% of the variation accounted for by the first
and second canonical variables, respectively, could be
attributed to the effects of the location variables. Both
canonical variables were associated with latitude, its square,
and the interactions of latitude and distance from the ocean
and of latitude and elevation. As with the results of
canonical discriminant analysis, the first canonical variable
was dominated by the Dia locus; the second canonical
variable was correlated with alleles at three loci (Table 4).
TABLE 4. Correlation coefficients generated by canonical
correlation analysis of mean haploid genotype scores with location
variables for the first two significant canonical variables
                          CAN1      CAN2
Eigenvalue                0.1939    0.1558
Percent of variance       21.64     17.38
Canonical correlation     0.4030    0.3671
Significance              0.0001    0.0012
Locus: Fest, Pgm1, Pgm2, Pgi1, Pgi2, Lap1, Lap2, Pep2, Got1, Got2,
Got3, G6pd, Gly, Cat, Gdh, Sod, Mpi, Fhex, 6pgd, Aco1, Aco2, Idh, Dia,
Mdh1, Mdh2, Mdh3, Mdh4
Locus correlation coefficients (in extracted order):
0.0675, 0.0465, -0.1211, 0.2348, 0.0838, 0.0170,
-0.1464, -0.0418, 0.0797, 0.0783, -0.2119, -0.2664,
-0.0591, 0.0194, 0.1243, 0.0825, 0.1455, 0.0092,
0.0665, 0.1379, 0.1518, -0.1410, 0.1100, 0.1541,
0.1287, -0.0426, 0.0237, 0.1971, 0.1416, 0.5815,
0.0354, 0.0494, 0.1105, -0.0410, -0.2284, 0.1670,
0.1017, 0.1268, -0.2645, 0.1420, -0.1322, 0.0423,
-0.0321, -0.1967, 0.0265, 0.0852, -0.0802, 0.0969,
-0.1280, 0.2874, -0.0313, 0.1839, 0.1730, 0.1158
Location variable: Latitude (L), Distance to the ocean (D),
Elevation (E), L², D², E², L x D, L x E, D x E
Location-variable correlation coefficients (in extracted order):
0.2710, 0.1388, 0.0070, 0.4173, 0.2844, 0.0375,
0.3653, 0.3167, 0.1658, 0.6452, 0.2669, 0.3278,
0.5508, 0.6126, 0.4876, -0.0470, 0.2373, -0.3908
NOTE: Only the largest correlation coefficient at each locus is shown.
Discussion
The application of multivariate techniques to isozyme data
provides only a crude analysis of interlocus associations.
Nevertheless, the inability of individual principal
components to account for more than a few percent of the
variation in mean haploid genotype scores and the fact that the
principal components were dominated by alleles at a low
number of loci suggest that linkage disequilibria among loci
included in this study are weak or nonexistent.
However, this study provides evidence that canonical
discriminant analysis may be more sensitive than single-locus
methods to revealing allozyme variation associated with
geography. Canonical discriminant analysis assigned a larger
proportion of genotypic variation to breeding zones than
did single-locus techniques. When we applied Nei's (1973)
gene diversity analysis for subdivided populations, 0.73%
of the variation in allele frequencies could be attributed to
variation among breeding zones: 0.22% to breeding units and
0.51% to elevation bands within breeding units (Merkle and
Adams 1987). In our canonical discriminant analysis, on the
other hand, the first canonical variable accounted for over
15% of the variation in mean haploid scores, of which
about 22% could be attributed to breeding-zone differences
(Table 2). Thus, 3.3% (0.15 x 0.22 x 100%) of the total
genetic variation could be assigned to breeding-zone
differences in the first canonical variable alone.
Although it was possible through canonical discriminant
analysis to assign a larger proportion of allozyme variation
to differences among breeding zones, the patterns of
variation revealed were only weakly associated with geographic
variables. Cluster analysis based on Mahalanobis distances
between pairs of breeding zones gave results quite similar
to those found when clustering was based on a matrix of
Nei's (1978) unbiased genetic distances involving the same
loci (Merkle and Adams 1987). In both analyses, breeding
zones from the same unit did not generally cluster together,
and two zones from environmentally dissimilar breeding
units clustered relatively early. The Mahalanobis
distance-based clustering procedure improved upon that based on
Nei's distances only in the segregation of the Grants Pass
zones, as a pair, from the zones of the other breeding units.
Similarly, neither regression of canonical variable scores
against location variables nor canonical correlation analysis
related genotypic and environmental variation.
Thus, despite evidence of genotypic variation among
breeding zones of southwest Oregon Douglas-fir, the results
suggest that no more than a minor proportion of the
variation is of adaptive significance with regard to the geographic
variables tested. Furthermore, because multivariate analyses
could not clearly differentiate among breeding zones,
application of such techniques does little to improve the utility
of allozymes for certifying seed or designating breeding
zones in Douglas-fir.
Our results contrast quite strongly with those reporting
multivariate analyses of allozyme variation patterns in two
other conifer species. Using canonical correlation analysis
on ponderosa pine from 12 seed zones in California, Conkle
and Westfall (1984) found that 49% of the variation in
diploid genotype scores for 30 loci could be accounted for
by just two canonical variables, both of which were
associated with geographic variables. These authors concluded
that multivariate analysis could be a very effective tool for
correlating allozyme variation in ponderosa pine with
geographic origin of seed and, further, that geographic
variation in multilocus genotypes made it possible to assign
individual trees to correct seed zones with high precision.
Similarly, Yeh et al. (1985), working with 17 populations
of lodgepole pine in British Columbia and the Yukon,
reported that two significant canonical discriminant
functions accounted for 38% of the variation in 20 polymorphic
loci. The populations' contributions to the variance
accounted for by the first and second canonical variables
were 28% and 27%, respectively. Although we did find some
significant correlations between sets of allozyme and location
variables in southwest Oregon coastal Douglas-fir, our
canonical correlation and discriminant analyses could not
account for such large proportions of variation with so few
canonical variables as in the ponderosa and lodgepole pine
studies. Nor did our canonical discriminant analysis assign
such large proportions of the variation to breeding-zone
differences as could be assigned to population differences
in the lodgepole pine study.
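The comparison above turns on how much variation each canonical variable accounts for. As a rough consistency check on the Table 4 summary statistics, the sketch below recomputes the eigenvalues from the two reported canonical correlations (0.4030 and 0.3671). It assumes the eigenvalues follow the convention r^2 / (1 - r^2), as commonly reported by statistical packages; the paper does not state the formula, and the function name is hypothetical.

```python
# Canonical correlations for the first two canonical variables (Table 4).
canonical_correlations = [0.4030, 0.3671]


def eigenvalue_from_canonical_correlation(r):
    """Assumed convention (not stated in the paper): eigenvalue = r^2 / (1 - r^2)."""
    r_squared = r * r
    return r_squared / (1.0 - r_squared)


eigenvalues = [eigenvalue_from_canonical_correlation(r)
               for r in canonical_correlations]

for r, eig in zip(canonical_correlations, eigenvalues):
    # r^2 is the share of variance common to a pair of canonical variates.
    print(f"r = {r:.4f}  r^2 = {r * r:.4f}  eigenvalue = {eig:.4f}")
```

Under this assumption the computed eigenvalues round to 0.1939 and 0.1558, matching the tabled values. The squared canonical correlations are about 0.16 and 0.13, so each pair of canonical variates shares only a modest fraction of its variance.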
Several factors could have contributed to the disparity
between our results and those reported in the two pine
studies. For example, our study did not cover as large a
geographic area as the other investigations. The latitudinal
range of our study area was only 1.5°, whereas the ranges
for the ponderosa and lodgepole pine studies were 7 and 10°,
respectively. Thus, our study results could be interpreted as
evidence that the multivariate techniques we applied are not
useful on such a small geographic scale and therefore no
conclusions can be drawn regarding the adaptive significance
of the genetic variation. However, in a companion common
garden study using the same families from this small region
(Loopstra 1984), seedling quantitative traits showed strong
clines over the same environmental gradients as tested here.
Another difference between our study and the two pine
studies was the type of isozyme data analyzed. As noted
earlier, only haploid genotypes were available to us,
although we did have two haploid genotypes from each
mother tree. Thus we could not make use of the genetic scoring
method of Smouse et al. (1982), which is based on
diploid genotypes. Although we eliminated within-tree variation
by averaging the two haploid genotype scores, thereby
reducing the within-breeding-zone component of variance,
the resulting mean haploid scores were only crude approximations
of diploid scores. More precise and complete data
were available in the ponderosa and lodgepole pine analyses
because diploid genotypes were employed.
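A small enumeration may help show why a mean of two haploid scores only approximates a diploid score. The sketch assumes a simple indicator scoring (1 if a megagametophyte carries a given allele, 0 otherwise) and independent Mendelian segregation with probability 1/2 for each maternal allele; the scoring scheme and function names are illustrative assumptions, not the authors' exact procedure.

```python
from fractions import Fraction
from itertools import product


def diploid_score(genotype):
    """Score of the mother tree: fraction of its two alleles that are the scored allele."""
    return Fraction(sum(genotype), 2)


def mean_haploid_score_distribution(genotype):
    """Distribution of the mean score of two megagametophytes from one tree.

    genotype: tuple of two allele indicators (1 = scored allele, 0 = any other).
    Each megagametophyte is assumed to receive either maternal allele with
    probability 1/2, independently of the other megagametophyte.
    """
    distribution = {}
    for first, second in product(genotype, repeat=2):
        score = Fraction(first + second, 2)
        distribution[score] = distribution.get(score, Fraction(0)) + Fraction(1, 4)
    return distribution


heterozygote = (1, 0)
homozygote = (1, 1)

het_distribution = mean_haploid_score_distribution(heterozygote)
hom_distribution = mean_haploid_score_distribution(homozygote)

# Probability that the mean haploid score equals the true diploid score.
het_match = het_distribution.get(diploid_score(heterozygote), Fraction(0))
hom_match = hom_distribution.get(diploid_score(homozygote), Fraction(0))
print("heterozygote:", het_distribution, "match probability", het_match)
print("homozygote:", hom_distribution, "match probability", hom_match)
```

Under these assumptions a homozygous tree is always scored correctly, whereas a heterozygous tree yields the diploid score of 1/2 only half the time and otherwise looks like one of the two homozygotes.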
Finally, differences in the study results may actually reflect
basic differences in the genetic structure of the species at
the allozyme level. Allozyme variation may simply not be
as sensitive to geographic diversity in coastal Douglas-fir as
it appears to be in ponderosa and lodgepole pine. This
conclusion is supported by the results of a recent study on
range-wide patterns of allozyme variation in Douglas-fir (Li
1986). Although differences between the coastal and inland
(var. glauca) varieties were quite distinct at the allozyme
level, patterns of allozyme variation among populations
representing the entire range of the coastal variety correlated
poorly with geographic variables. Thus, it is not surprising
that our southwest Oregon study also failed to provide
evidence of the impact of selection on allozyme variation.
Acknowledgments
The authors thank Robert Westfall for sharing his expertise,
and Allan Doerksen, Carol Loopstra, Deborah Smith,
and Louise Cremo for technical assistance. Financial support
for this research was provided by the United States
Department of Interior, Bureau of Land Management, and
the United States Department of Agriculture, Forest Service,
under the auspices of the Southwest Oregon Forestry Intensified
Research (FIR) Program (grant No. PNW-80-85).
CAMPBELL, R.K., and SORENSEN, F.C. 1978. Effect of test
environment on expression of clines and on delimitation of seed
zones in Douglas-fir. Theor. Appl. Genet. 51: 233-246.
CONKLE, M.T., and WESTFALL, R.D. 1984. Evaluating breeding
zones for ponderosa pine in California. In Progeny Testing:
Proceedings of Servicewide Genetics Workshop, Charleston, SC,
December 5-9, 1983. USDA Forest Service, Washington, DC.
pp. 89-98.
DRAPER, N.R., and SMITH, H. 1966. Applied regression analysis.
John Wiley & Sons, New York.
EL-KASSABY, Y.A., and SZIKLAI, O. 1982. Genetic variation of
allozyme and quantitative traits in a selected Douglas-fir
(Pseudotsuga menziesii var. menziesii (Mirb.) Franco) population.
For. Ecol. Manage. 4: 115-126.
FRANKLIN, J.F., and DYRNESS, C.T. 1973. Natural vegetation
of Oregon and Washington. U.S. Dep. Agric. For. Serv. Gen.
Tech. Rep. PNW-8.
GRIFFIN, A.R., and CHING, K.K. 1977. Geographic variation
in Douglas-fir from the coastal ranges of California. I. Seed,
seedling growth and hardiness characteristics. Silvae Genet.
26: 149-156.
HERMANN, R.K., and LAVENDER, D.P. 1968. Early growth of
Douglas-fir from various altitudes and aspects in southern
Oregon. Silvae Genet. 17(4): 143-151.
IRGENS-MOLLER, H. 1967. Patterns of height-growth initiation
and cessation in Douglas-fir. Silvae Genet. 16: 56-58.
LEWONTIN, R.C. 1984. Detecting population differences in quantitative
characters as opposed to gene frequencies. Am. Nat.
123: 115-124.
LI, P. 1986. Range-wide patterns of allozyme variation in
Douglas-fir [Pseudotsuga menziesii (Mirb.) Franco]. M.S. thesis,
Oregon State University, Corvallis.
LOOPSTRA, C. 1984. Patterns of variation within and among
breeding zones of Douglas-fir in southwest Oregon. M.S. thesis,
Oregon State University, Corvallis.
MERKLE, S.A., and ADAMS, W.T. 1987. Patterns of allozyme
variation within and among Douglas-fir breeding zones in
southwest Oregon. Can. J. For. Res. 17: 402-407.
NEI, M. 1973. Analysis of gene diversity in subdivided populations.
Proc. Natl. Acad. Sci. U.S.A. 70: 3321-3323.
___ 1978. Estimation of average heterozygosity and genetic
distance from a small number of individuals. Genetics,
89: 583-590.
RAO, C.R. 1973. Linear statistical inference and its applications.
John Wiley & Sons, New York.
SAS INSTITUTE, INC. 1982. SAS user's guide: statistics, 1982
edition. SAS Institute, Inc., Cary, NC.
SILEN, R.R., and WHEAT, J.G. 1979. Progressive tree improvement
program in Douglas-fir. J. For. 77: 78-83.
SMOUSE, P.E., SPIELMAN, R.S., and PARK, M.H. 1982. Multiple-locus
allocation of individuals to groups as a function of the
genetic variation within and differences among human populations.
Am. Nat. 119: 445-463.
YANG, J.C., CHING, T.M., and CHING, K.K. 1977. Isoenzyme
variation of coastal Douglas-fir. I. A study of geographic variation
in three enzyme systems. Silvae Genet. 26: 10-18.
YEH, F.C., CHELIAK, W.M., DANCIK, B.P., ILLINGWORTH, K.,
TRUST, D.C., and PRYHITKA, B.A. 1985. Population differentiation
in lodgepole pine, Pinus contorta spp. latifolia: a discriminant
analysis of allozyme variation. Can. J. Genet. Cytol.
27: 210-218.
YEH, F.C., and O'MALLEY, D.M. 1980. Enzyme variation in
natural populations of Douglas-fir, Pseudotsuga menziesii
(Mirb.) Franco, from British Columbia. I. Genetic variation
patterns in coastal populations. Silvae Genet. 29: 83-92.