A Genetic Neural Network Model of

advertisement
1
2
3
4
A Genetic Neural Network Model of
5
Flowering Time Control in Arabidopsis thaliana
6
7
8
By
9
Stephen M. Welch*, Judith L. Roe, Zhanshan Dong
10
11
To be submitted to: Agronomy Journal
12
13
14
15
16
S. M. Welch, Dept. of Agronomy, Kansas State University, Manhattan, KS. 66506. J. Roe,
17
Division of Biology, Kansas State University, Manhattan, KS. 66506. Z. Dong, Department of
18
Agronomy, Manhattan, KS, 66506. Contribution number 00-xxxx of the Kansas Agricultural
19
Experiment Station (KAES), Kansas State University. This project was supported in part by
20
KAES Project 0507 at Kansas State University.
21
1
A Genetic Neural Network Model of
2
Flowering Time Control in Arabidopsis thaliana
3
ABSTRACT
4
Crop simulation models incorporate many physiological processes within sophisticated
5
mathematical frameworks. However, the control mechanisms for these processes tend to be ad
6
hoc, empirical, indirectly inferred from data, and may lack realistic plasticity. Using model
7
organisms like Arabidopsis thaliana, genomic scientists are rapidly disentangling the networks
8
of genes that exert physiological control. As yet, however, these networks are qualitative in
9
nature, depicting promotion and inhibition pathways but not supporting quantitative predictions
10
of overall integrated effects. We believe (1) that neural networks can provide the quantification
11
that current genetic networks lack, and (2) that taxonomic conservation of central genetic
12
mechanisms will make networks developed for model plants also useful for crops. This paper
13
presents preliminary evidence supporting point (1). We developed a neural network with eight
14
nodes corresponding to genes affecting the timing of the A. thaliana inflorescence transition.
15
The nodes were linked into photoperiod and autonomous pathways abstracted from an existing,
16
qualitative genetic network model. Growth chamber data on transition timing were collected at
17
16C and 24C for seven A. thaliana strains possessing loss-of-function mutations at the network
18
loci. An eighth strain served as a common wild type control. The neural network model
19
reproduced the time course of the transition at both temperatures for all eight genotypes. This
20
included tracking a novel, temperature-dependent exchange in transition order exhibited by two
21
mutants whose duplication would strain traditional crop modeling methods.
22
demonstrated that the ability to imitate the data appeared to have a desirable sensitivity to
23
assumed network structure.
2
We also
1
3
1
Introduction
2
The development of crop growth simulation models has been an intensely researched area for
3
many years (Hanks and Ritchie, 1991). At the present time simulation models with widely
4
varying degrees of predictive skill exist for all major grain crops. While these models differ
5
according to the idiosyncrasies of the taxonomic groups involved, there are distinct
6
commonalities among them as well. Model outputs are calculated by integrating daily or hourly
7
changes in the rates of key physiological processes including photosynthesis, carbon allocation
8
among plant parts, evapotranspiration, respiration, biomass accumulation and/or senescence, and
9
phonological development. These rates are modeled as functions of both the internal plant state
10
and also the external environment with the latter including time-varying temperature, solar
11
radiation, and soil water balance.
12
Varietal differences in these processes have been modeled by the introduction of genetic
13
coefficients, so named because they quantify effects assumed to be of genetic origin. Recently,
14
there has been a great deal of research on efficient means for estimating these parameters (Irmak
15
et al., 2000) including attempts to relate them to known plant genotypes (White et al., 1996). A
16
positive result of these efforts has been a valuable reduction in the number of quantities requiring
17
estimation and the work required to do so.
18
However, these efforts have had limited impact on the model algorithms used to control plant
19
processes. Currently modeled control mechanisms tend to be ad hoc, empirical, and inferred
20
indirectly (albeit with great ingenuity) from observations. Some feel that the current empiricism
21
limits the accuracy of augmented models that attempt to combine physiological models with
22
physical models of the canopy and soil environment (Welch et al., 2000).
23
in model validation studies to find overestimation of low yields and underestimation of high
4
It is not uncommon
1
yields (Heiniger, 1997; Paoli, 1997). While it is usually possible to find specific adjustments to
2
remove such effects, perhaps current approaches to process control modeling are less plastic than
3
real plants. As a result, it has been suggested that improved representation of process controls
4
should have priority in modeling efforts (Hammer, 1998).
5
It is our belief that crop simulation models have much to gain from the exploding body of
6
genomic science, which is unraveling the networks of interacting genes that actually control
7
important plant processes. Unfortunately, experimental practicalities such as annual (or longer)
8
generation times, physical size, etc. greatly complicate the elucidation of such networks in crop
9
plants. Also, the methods used to construct genetic networks (mutant screenings, epistasis
10
detection experiments, phenotype rescue treatments, etc.) yield qualitative relationships (e.g.,
11
promotion and inhibition – see below) rather than the quantitative mathematical formulas needed
12
in simulation models.
13
The research program we are developing to respond to these needs is based on two premises.
14
First, neural networks (Kasabov, 1996) can be used to supply the quantification that genetic
15
networks currently lack (Mendoza et al., 1998, 2000). Second, networks developed using model
16
plants will, with calibration, be directly applicable to crops because genetic mechanisms of
17
central importance are often widely conserved taxonomically.
18
In this paper we present preliminary results supporting the first premise. Our evidence is
19
based on attempts to model the control of the inflorescence transition in Arabidopsis thaliana.
20
We first review current information on the genetics of this process. Then we introduce neural
21
networks, including the particular one we have structured to approximate inflorescence control.
22
Following that, we detail a data collection experiment currently underway and our successful
23
efforts to mimic data are from two temperature treatments. Included with the latter are some
5
1
additional results of possible relevance to bioinformatics as well as to crop modeling. The final
2
section includes a general discussion and our conclusions to date.
3
Flowering Control in A. thaliana
4
Accurate mimicry of plant phenology and especially of flowering time has long been
5
understood as critical to crop simulation. Understanding what controls transition to flowering is
6
a fundamental concern in developmental biology. There is an enormous literature on the control
7
of flowering, but a unified physiology model has not emerged (McDaniel, 1996). The genetic
8
control of flowering has been extensively studied in A. thaliana due to its small genome, short
9
generation time, self-compatibility, amenability to stable transformation, and the availability of
10
numerous mutants. A. thaliana is a facultative long day plant, so the stimulus of a long day
11
coupled with input from the circadian clock will promote flowering. Under short days, plants
12
will flower, but do so much later. Flowering in this species entails two transitions, one to form
13
an inflorescence (or bolt) and the second to produce flowers. The two transition processes are
14
distinct and can be distinguished genetically (Howell, 1998). The inflorescence transition is
15
influenced by environmental signals, such as day length, cold temperature (vernalization), and
16
growth temperature. The transition from inflorescence to floral development requires no other
17
known environmental signals. Hence the former transition is the focus of our work.
18
The interplay between environmental and genetic factors has been assembled into qualitative
19
network models for the control of flowering in Arabidopsis (Blazquez, 2000a; Blazquez et al.,
20
2000b; Devlin et al., 2000; Simpson et al., 1999; Theissen and Saedler, 1999; Levy and Dean,
21
1998; Koornneef et al., 1998a, 1998b). It is quite important to understand that the genes in these
22
models control phenology directly. That is, observed temporal changes are not the byproducts of
23
alterations in plant growth or vigor (Reeves and Coupland, 2000). Our work is based on the
6
1
Simpson et al. (1999) model because the Blazquez (2000a) version is too recent to have affected
2
our research planning. Both of these models can be viewed on the web at the addresses given in
3
the citation list.
4
According to Simpson et al. (1999) there are four pathways that operate in parallel to control
5
the inflorescence transition.
The two major ones are the autonomous pathway and the
6
photoperiod pathway. The prevailing view is that the latter exerts the promotive effect of long
7
days, whereas the former is an endogenous photoperiod-independent flower-inhibiting pathway.
8
Autonomous path repression of flowering during vegetative growth must be overcome to convert
9
the meristem to a reproductive state. This can be accomplished by vernalization, which injects
10
repressive signals into the autonomous path at the FLC (Flowering Locus C) gene 1. Simpson et
11
al. (1999) accords vernalization the status of a separate pathway although Blazquez (2000a)
12
considers it a part of the autonomous path. Beyond this, gibberellin is required for flowering in
13
short days, and also participates in flowering promotion under long days. The genes mediating
14
this process constitute the fourth independent path. In all models, signal integration of the two
15
main paths feeds into a transduction cascade ultimately upregulating flowering genes and
16
stimulating conversion to the reproductive phase.
17
A Genetic Neural Network
18
Neural networks (Kasabov, 1996) were initially developed to model brain function with the
19
intent of enabling computers to mimic various cognitive feats common in the animal kingdom
20
(learning, pattern recognition, etc.).
21
abstract neuron function. Each node typically has a number of inputs and one output that serves
22
as an input to multiple succeeding nodes.
They consist of interlinked nodes that mathematically
Nodal outputs are usually calculated as some
In this paper, “GENE” denotes a particular locus, “GENE” is the corresponding neural network node, and “gene” is
a plant with loss-of-function mutant alleles at that locus.
1
7
1
sigmoidal function applied to an arithmetic combination of the inputs. Thus, like the genes in
2
existing qualitative network models, these abstract neurons are capable of both ON/OFF and
3
graded activity in response to the net balance of stimulatory and inhibitory inputs (Burstein,
4
1995; Mjolsness et al., 1991). All nodes in a neural network use the same sigmoidal formula,
5
which is called the nodal transfer function. Finally, the links between nodes typically modify the
6
values they transmit. Almost always, this modification is to multiply the value by some constant
7
weight. Thus, the function of a neural network is completely determined by four items: the
8
structure of its nodes and links; the method of combining nodal inputs for substitution into the
9
transfer function; the transfer function, itself; and the weights applied to each link. The network
10
designer determines the first three and the weights are found from experimental data via a
11
process called training.
12
If functioning genes can, in fact, be identified with nodes then removing nodes should
13
equate to loss-of-function mutation. To test this concept, a set of genes and mutant alleles was
14
identified according to several criteria. First, we wanted a representative selection of both the
15
photoperiod and autonomous pathways. Second, to achieve maximal loss of function, we wanted
16
genes with readily available mutant alleles that had strong phenotypic effects. Third, for proper
17
experimental control, we needed those alleles to be obtainable in a common genetic background.
18
Finally, we had to balance numbers of genes, numbers of plants per genotype, and available
19
growth chamber space.
20
Figure 1 shows the genes we selected and their connection into a neural network abstracted
21
from the Simpson et al. (1999) model. Both the photoperiod and autonomous pathways are
22
present. They feed into a single integration node whose zero-to-one output is interpreted as the
23
fraction of the transition completed.
Inputs to the system are photoperiod and days after
8
1
planting. The latter was used as a surrogate for growth, which is thought to be of relevance to
2
the autonomous pathway (Blazquez et al., 2000b). Initially, we were concerned that day length
3
might somehow influence the autonomous pathway or, conversely, that growth-related
4
information might “leak” onto the photoperiod by some as yet undiscovered genetic mechanism.
5
For this reason we added some extraneous inputs to each pathway with little justification
6
(beyond a desire not to be victimized by limited thinking). Later, we had cause to remove them
7
(see below).
8
Table 1 shows the specific mutant alleles and background used. All seed was obtained from
9
the Arabidopsis Biological Resource Center (Columbus, OH) whose stock numbers appear in the
10
first column.
The next two columns list the alleles, gene symbols, and map positions
11
(chromosome number and location2 in centimorgans). The gene symbol can be used to locate the
12
position of the gene in our simplified network. The final column qualitatively describes the
13
phenotype, gives a molecular characterization of the gene product if known, and relevant
14
literature citations. It should be understood that phenotype descriptions pertain to nominal
15
rearing conditions and may not reflect our results below.
16
All of the chosen mutant alleles were available in the Landsberg erecta (Ler) background. In
17
addition to the traits listed in the table, Ler has reduced functionality at FLC. Therefore, while
18
we did not include a FLC node in our network, we did assume that the effects of upstream genes
19
would propagate to the pathway integration node.
20
Data Collection Methods
21
Once a neural network has been structured, its further quantification requires data. For this
22
purpose we reared plants in a growth chamber at two temperatures, 16C and 24C. To insure
23
that genes on both pathways were active, data was collected under long-day conditions (16 h
9
1
light – 8 h dark). Seeds for all genotypes were first soaked in water for 30 minutes. For each
2
run, the eight genotypes are individually sown on the soil surface in square pots (8.9 cm x 7.2
3
cm, Hummert International catalog# 12-1350) with nine replications (72 pots in all). The potting
4
soil was Fafard Mix #2 from Hummert International (Earth City, MO). Within each pot, single
5
seeds were placed ca. 1 cm inside every corner. After germination, plants were thinned to two
6
per pot, almost always diagonal to each other to minimize inter-plant effects. Lacking a fully
7
functional FLC gene, Ler does not require vernalization.
8
germination, the pots were placed in darkness at 4C for 72 h.
However, to help synchronize
9
The growth chamber used in the study was a Conviron Model E8VH (Controlled
10
Environments Ltd, Winnipeg Manitoba, Canada). Lighting was provided by 20 fluorescent tubes
11
(Philips F48T12/CW/VHO, 110 w) and 2 incandescent bulbs (Philips Standard Frost
12
Incandescent Lamps, 60 w). Light intensity at plant level was 216.1 E.m-2.s-1 as measured with
13
a Quantum sensor (Li-Cor, Inc., Lincoln, NE). Chamber relative humidity was maintained
14
between 70 and 80 %.
15
The 72 pots were randomly assigned to one of four trays and placed in the chamber. To
16
reduce any systematic bias that might result from temperature or lighting gradients within the
17
chamber, the trays were moved daily in an orderly fashion. This ensured that each pot spent an
18
approximately equal time in each chamber location. In addition, thermistors monitored the
19
canopy-level temperature within each tray at hourly intervals and data was recorded by a CR 21
20
data logger from Campbell Scientific (Logan, UT). To provide the seedlings with an initial, high
21
humidity, transparent domes (Hummert International, Catalog No. 14-2568-1) were placed over
22
the trays from planting until 6-7 days after emergence. This is standard practice in work with A.
2
Look for more detailed information at http://www.arabidopsis.org/chromosomes/.
10
1
thaliana but resulted in a transient avg. temperature increase of 0.38-0.74oC. This increase is
2
included in all temperature accumulations, which yielded treatment-long avg. temperatures of
3
24.16C and 16.37C. Plants were watered at ca. three-day intervals.
4
The pots were observed daily and records made of emergence date, bolting date, flowering
5
date, leaf number (including rosette and inflorescence leaves), and plant height. Emergence date
6
was taken as the first visible green but this is not particularly reliable because of the small seed
7
size. Bolting date (when flower buds are first visible on the main meristem) marks the transition
8
to inflorescence, the focus of the current report. Flowering date is the opening of the first bud
9
and denotes the second transition discussed above.
10
Genetic Neural Network Weights
11
Neural network weights encode the system’s storehouse of knowledge. They are found
12
through a training process during which weight estimates are refined until the network can
13
produce correct, known outputs when supplied the corresponding inputs. Larger networks with
14
more weights are presumed to be necessary to retain knowledge of a more complicated nature. If
15
initial training efforts are not successful, the typical assumption is that the original network had
16
insufficient storage and so more nodes are added. Although this strategy is often successful, in
17
reality another principle may also be contributing.
18
Training a neural network is an example of a nonlinear optimization problem where the
19
objective function is a goodness-of-fit criterion. A common difficulty in such work is that initial
20
estimates of the solution may be so distant from the best answer that the optimizer may converge
21
prematurely to some inferior result. Because of network complexity and, perhaps, because of the
22
opacity of brain function, designers neither demand nor expect any clear relationship between the
23
values of the weights and the knowledge represented. Good initial estimates are, therefore,
11
1
considered unobtainable, and zeros are commonly used. Subsequent success with a larger
2
network may, therefore, have nothing to do with storage capacity. It may simply be that the
3
goodness-of-fit landscape has been altered so as to provide the optimizer with a more congenial
4
path from arbitrary initial estimates to an acceptable endpoint.
5
In our research, the strategy of adding nodes disappeared once we made our choice of a
6
particular gene subnet to work with. Therefore, when our first training attempts failed, we
7
employed several tactics to search more efficiently and to modify the goodness-of-fit landscape
8
directly.
9
dimensionality of the problem whenever possible, (3) increasing dimensionality in a controlled
10
fashion whenever unavoidable, (4) reparameterizing, and (5) expanding model scope
11
incrementally. Candidly, our exploitation of these strategies was informed by some science, by
12
some intuition, and by a great deal of error.
The latter included (1) alternative goodness-of-fit functions, (2) reducing the
13
To date, two different goodness-of-fit measures have been used in training: ordinary least
14
squares (LS), and a maximum likelihood method (ML) that parallels the Kaplan-Meier product-
15
limit estimator well known in survival analysis (Cox and Oates, 1984). Three search procedures
16
were used: simulated complex evolution, SCE, (Duan et al., 1992, 1994; Thyer et al., 1999); the
17
Marquardt-Levenberg algorithm, MQL, (Press et al., 1992); and the variant of Newton’s method
18
(Press et al., 1992) embodied in the Excel/Solver software (E/S). SCE is readily adaptable to a
19
parallel computing environment and therefore potentially useful for large-scale problems. The
20
MQL algorithm is the current method of choice for nonlinear least squares (Press et al., 1992).
21
In terms of execution speed, robustness, flexibility, and numerical charisma, E/S has limited
22
long-term potential compared to MQL and SCE. However, the ease of Excel programming has
23
permitted rapid testing of new ideas, the determinative success factor in our work to date. This
12
1
report therefore contains results obtained by combining LS and E/S using the methods of Wraith
2
and Or (1998), which we have adapted.
3
Model Specifics and Results
4
5
In what follows it will be useful to have a concise notation for a neural network model. We
interpret
a : GENE : expression
6
7
to mean that the expression on the right is computed yielding, say, u, which is substituted into the
8
transfer function for the GENE node.
9
1/(1  exp(u )) . The function result is then multiplied by the quantity a. The
We have used the common transfer function
operator (a
10
multiplication by 0 for loss-of-function mutants and by 1 otherwise) calculates the penultimate
11
nodal output. If we let L represent photoperiod and D be days after planting, our starting
12
network in Fig. 1 can be written as
1: CRY 2 : a1 L  a2 D  bCRY 2
1: PHYB : a7CRY 2  a8 L  a9 D  bPHYB
1: FPA : a3 L  a4 D  bFPA
13
1: GI : a10 D  a11CRY 2  a12 PHYB  a19 FPA  bGI
1: FVE : a5 D  bFVE
(1)
1: CO : a13GI  a14 D  bCO
1: FCA : a6 D  bFCA
 1: Int : a15CO  a16 FPA  a17 FVE  a18 FCA  bInt
14
where Int represents the integration pathway,  indicates the final network output, and the a’s
15
and b’s are weights to be found. Loosely speaking, b values control the time until the onset of
16
nodal switching while the a’s affect the time it takes to complete a transition, once started. Our
17
earliest training efforts were focused on this model and failed utterly.
18
Once we realized that the training process was going to be nontrivial, we adopted an
19
incremental approach to the modeling. First we attempted to train the autonomous pathway on
13
1
the 24C data alone. Once this was achieved, we applied the lessons learned to an enlarged, two-
2
pathway model, still for one temperature. Upon success, we moved on to a two-path model for
3
two temperatures. We now recapitulate the latter two steps.
4
The 24C Model
5
To reduce dimensionality we removed all extraneous D and L inputs eliminating a2, a3, a9,
6
a10, and a14. This is easily justified since it actually makes the neural network more similar to the
7
underlying Simpson et al. (1999) model. Also, given data on only one photoperiod, there is no
8
way to unambiguously determine the parameters a1 and bCRY2 (for CRY2) or a8 and bPHYB (for
9
PHYB). Therefore, we fixed the CRY2 output at 1 for long days and 0 for short days (although no
10
use has yet been made of the latter) and absorbed a8 into bPHYB, which we signify by b’PHYB.
11
The data indicate that all mutant populations can progress from 10 to 90 % transitioned in 3 to 6
12
d. Thus, as an approximation, we equated a4, a5, a6, a7, a11, a12, a13, and a19 to a single
13
parameter, a. However, this did not mimic co data very well so a13 was released to vary
14
independently. A second parameter, a’, was used for a15, a16, a17, a18 in the Int node. Another
15
change was quite ad hoc but helpful.
16
propagating along a path then the input to Int should be zero. However, with the given transfer
17
function, a zero input to an upstream node propagates a signal of 0.5 to the next downstream
18
node. We removed this effect by making bInt proportional to a’ and to -0.5 times the number of
19
upstream nodes. Together, these changes yielded the nine-parameter model
It seemed reasonable that if no information was
14
2 : CRY 2 : 0
1: PHYB : aCRY 2  b 'PHYB
1: FPA : aD  bFPA
1: GI : a(CRY 2  PHYB  FPA)  bGI
1
1: FVE : aD  bFVE
(2)
1: CO : a13GI  bCO
1: FCA : aD  bFCA
 1: Int : a '(CO  FPA  FVE  FCA  2)
2
This model was fit using Excel/Solver. Figure 2 plots predicted vs. observed values for those
3
observations falling in the actual transition interval. Except for fve, predictions are close to the
4
actual data. The overall correlation of all predicted and observed values used in training was
5
0.9845.
6
An important question is whether genomic structure actually played a part in these results or
7
whether they are only due to the well-known ability of neural networks to fit complex data. To
8
assess this, 300 random networks were synthesized and fit with Excel/Solver. To allow a fair
9
comparison, each of the networks had the same number of nodes, number of inputs per node, and
10
redundancy of coefficients as Eq. 2.
The data, search method, initial conditions, and
11
convergence criterion were the same for all 300 runs. Figure 3 shows the cumulative distribution
12
of final sums of squared errors (SSE)3. As the graph indicates, SSE values ranged from 6.50 to
13
31.13. By comparison, the SSE of the genomically constructed network is, at 0.59, slightly over
14
11 times better, strongly suggesting the value of genetic information.
15
A common situation in genomics is to know that a group of genes may form a path but not be
16
able to tell the order in which they fall. For example, we might be uncertain as to whether a
17
portion of the photoperiod pathway runs PHYBGICO or the reverse. Typically a set of
3
Since the model output is the fraction of plants completing the transition, all reported SSE and root mean square
errors (RMSE) values are unitless.
15
1
single and double mutant experiments will be performed to resolve such issues. Although we
2
performed no double mutant experiments, the distinction between the two alternatives was clear
3
in our data. The reversed, improper order yielded an SSE of 6.68.
4
The Two-Temperature Model
5
It has often been suggested that temperature effects on development may be related to
6
underlying, enzyme-catalyzed, biochemical reaction rates (Pradhan, 1946; Sharpe and
7
DeMichelle, 1977; Wagner et al., 1984; Bridges et al., 1989). Furthermore, the activity curves
8
of many regulatory enzymes are allosteric (i.e., S-shaped) with increasing substrate concentration
9
(Segel, 1975, p353ff) just as neural node outputs are sigmoidal with increasing weighted input
10
sums.
In combination, these ideas suggested a simple analogy in which the amplitude of nodal
11
outputs could be multiplied by temperature-dependent factors that we will denote qi.
12
Applied completely, this would add 16 parameters to the model (8 nodes times 2
13
temperatures). To limit this increase while maximizing consistency with Eq. 2, the temperature
14
factors for 24C were assumed to be unity. Furthermore, because the Int output must range from
15
zero to one, no temperature multiplier was used there. It was also found desirable to relax the
16
assumption of common a and a’ values. The weights ai were estimated by parameters of the
17
forms avi or a’vi where a and a’ are as above and vi (i = 11, 12, 15, 16, 17, 18, 19) are new values
18
with initial estimates of 1. This allowed the search engine to find the approximate solution via
19
a, which was then particularized by the offset parameters vi. The weights a4, a5, a6, a7, were
20
varied freely. Again, it should be emphasized that these reparameterizations were tactical and
21
intended to facilitate training; no scientific significance is attached to their specific form.
22
23
In what follows, we will use [r,s] to denote a vector of temperature multipliers for 24C and
16C, respectively. This gives the combined temperature model
16
2 1, qCRY 2  : CRY 2 : 0
1
1, qPHYB  : PHYB : a7CRY 2  b 'PHYB
1, qFPA  : FPA : a4 D  bFPA
1, qGI  : GI : a(v11CRY 2  v12 PHYB  v19 FPA)  bGI
1, qFVE  : FVE : a5 D  bFVE
1, qCO  : CO : a13GI  bCO
1, qFCA  : FCA : a6 D  bFCA
 1,1 : Int : a '(v15CO  v16 FPA  v17 FVE  v18 FCA  2)
(3)
2
Column I of Table 2 gives the resulting parameter values and the root mean square error
3
(RMSE). There is an immediately noteworthy pattern among the temperature factors. Four of
4
the seven nodes (i.e., all of the autonomous pathway plus the closely linked GI) have temperature
5
factors between zero and one, plausibly suggesting a general plant growth slowdown with
6
cooling. Perhaps not surprisingly, the remaining photoperiod sensory pathway nodes (CRY2,
7
PHYB, and CO) depart from this. A final analysis of the sensory path should await the addition
8
of short-day data to the training set. Even so, there are some comments about the light sensors
9
that can be made now.
10
Interestingly, the CRY2 factor is quite close to one. This gene is known to interact quite
11
closely with the diurnal clock genes (Simpson et al., 1999; Blazquez, 2000a) so it would not be
12
unreasonable to hypothesize some form of temperature stabilization in this part of the network.
13
With this possibility in mind, we retrained the network under the assumption that qCRY2 was
14
identically one. The very similar results are shown in column II of Table 2.
15
The blue-light receptor encoded by CRY2 is known to mediate the red-light-dependent
16
flowering-control actions of phytochrome B (Guo et al., 1998). The network was able to
17
reproduce this to the extent that, neglecting the temperature constant, the PHYB node exactly
18
tracked the CRY2 output. However, in the absence of specific training data under red and blue
17
1
lighting, the network was unable to resolve the precise antagonistic relationship that exists
2
between these receptors in reality. Under the white light in our experiment, phyB behaved much
3
like the wild type, preceding it by just one to two days regardless of temperature.
4
temperature-stabilized CRY2 was apparently a much more reliable indicator of training set
5
behavior than was PHYB. As a result, v12, the weight applied to PHYB outputs, diminished by
6
almost four orders of magnitude (Table 2, column II). For this reason, we deleted PHYB from
7
the network, excluded the corresponding mutant data, and repeated the training process (Table 2,
8
column III). Not only is the aggregate RMSE value better, but also the bulk of the improvement
9
is associated with the wild type, whose prediction errors were reduced by 33 % (from
10
The
RMSE=0.152 to 0.102).
11
Figure 4 plots predicted transition curves and observed data for both temperatures using the
12
values from Table 2, column III. The ability of the model to track multiple genotypes accurately
13
as temperature changes is evident. However, the figure also reveals an interesting feature,
14
namely that the order of inflorescence transition for fve and co differs between the two
15
temperatures (arrows).
16
transition times (estimated by linear interpolation), co switches from 7.7 d ahead of fve at 24C
17
to 9.3 d behind at 16C for a net change (which we shall term switching magnitude) of 17 days.
18
To our knowledge, this observation regarding co and fve is novel.
Moreover, the scale of the exchange is large.
Comparing median
19
Crop models traditionally use either degree-day (Hesketh et al., 1973) or photothermal day
20
(Grimm et al., 1993) systems to calculate flowering times. These approaches cannot produce
21
order switching unless base temperatures differ between varieties, an assumption not commonly
22
made in crop models. Therefore, the observed order switch was unexpected, its magnitude
23
startling, and its association with specific point mutations intriguing. It raised the question as to
18
1
the occurrence of order switching in actual crops. A side study, detailed in the Appendix,
2
provided statistical evidence that order switching does occur in soybean [Glycine max (L.)
3
Merr.]. Beyond statistics, Hoogenboom and White (2000) have reported a specific gene in
4
common bean (Phaseolus vulgaris L.) that confers photoperiod sensitivity in warmer
5
environments but not in cooler ones. Under appropriate circumstances this gene would also
6
produce order switching. They were able to incorporate its effect into a crop simulation model
7
by appropriate, gene-specific modifications. Genetic neural networks, however, can potentially
8
subsume a wide range of such unique behaviors within a single, general mechanism.
9
Discussion and Conclusions
10
Although preliminary, our results indicate that neural networking has potential for converting
11
qualitative genetic networks into quantitative predictive tools, the first of the two long-term
12
research objectives we presented in the Introduction. Not only can neural networks reproduce
13
complex genetic data but they can also do so parsimoniously. We only know of one other study
14
that attempted to predict flowering time with a neural network, that of (Elizondo et al., 1992)
15
who studied soybean. They were successful but, lacking genomic information, they used a
16
traditional three-layer network that required a minimum total of 25 nodes and 104 weights for
17
acceptable prediction. In contrast, our final model had seven nodes and 22 free parameters.
18
A natural desire of applied modelers is for models to be no more complex than necessary to
19
achieve specific goals. In this regard, estimates of overall genome sizes are quite intimidating.
20
However, we believe fears of impracticality to be groundless because small numbers of genes
21
often control broad developmental and physiological processes. In floral development, the genes
22
we are studying top a hierarchy (Theissen and Saedler, 1999) that ultimately generates all
23
reproductive organs.
More generally, mathematical analysis suggests that the accrual of
19
1
disproportionate influence to tiny gene minorities may be an inevitable consequence of network
2
self-organization over evolutionary time (Barabasi and Albert, 1999). If so, significant model
3
improvements may only require the simulation of relatively small numbers of genes.
4
Furthermore, applied models, including genetic neural networks, will continue to benefit
5
from sensitivity analysis and other traditional simplification techniques. Our ability to delete
6
PHYB under white light conditions is one illustration. In addition, it may not be necessary to
7
model the complete dynamics of all genes. White and Hoogenboom (1996) have had success
8
melding genetics with existing simulation models using simple regression equations based on the
9
presence or absence of selected dominant alleles.
10
Beyond applied issues, one exciting aspect of this work is that it immediately suggests
11
further researchable questions. Like all good models, neural networks seem capable of providing
12
guidance in data interpretation. An apparent sensitivity to assumed structure was demonstrated
13
in the reversed photoperiod pathway and randomized network examples above. It may also have
14
played a part in our inability to fit our original model with its extraneous inputs of day length and
15
plant age. If this result is verified, automated tools may, with appropriate human guidance, be
16
able to screen large numbers of network structures for compatibility with existing data.
17
To be realized, such tools will require powerful numerical estimation methods. One of the
18
most useful features of neural networks is the extensive body of research results on training
19
techniques (Fine, 1999; Grierson and Hajela, 1996). Still, it remains to be seen what will prove
20
the most efficacious ways to apply these methods to genomic data. It is obvious, however, that
21
training will need to utilize data from multiple experiments conducted in different laboratories.
22
This is also illustrated by our PHYB results, which, adopting a research perspective, would have
20
1
been improved by data from red and blue light environments. Clearly no single research group
2
will be able to perform all the treatments necessary to construct a complete data set.
3
Complicating factors are (1) the propensity of various Arabidopsis investigators to use
4
different sets of alleles in separate genetic backgrounds and (2) the possible impacts of subtle
5
inter-lab variations in rearing environments. While these will, no doubt, cause initial problems,
6
quantitative models (using neural networks or any other successful approach) can provide
7
common reference points against which effect magnitudes can be judged. This may be an area in
8
which the experience of crop modelers can be of direct benefit to genomic scientists.
9
Because genetic network construction involves a consideration of phenotypes, there are
10
multiple possible mechanisms underlying the resulting links. The relationships may be quite
11
immediate, as when one gene codes for a transcription factor directly controlling the expression
12
of another. CO, whose sequence includes the DNA-binding, zinc-finger motif (Lewin, 1997,
13
p850) is a probable example. Alternatively, the effects may be indirect and mediated by
14
significant physiological processes. Even so, the construction method still guarantees that the
15
genes responsible will appear at their proper place in the network diagram. The cryptochrome
16
and phytochrome receptors illustrate this situation. In spite of this diversity, we have seen that
17
the neural network approach can, to some degree, span these differences in mechanism.
18
Current crop models, in contrast, emphasize physiological mechanisms to the complete
19
exclusion of genetic ones. As noted earlier, this has led to a very empirical representation of
20
process control mechanisms that may, in the final analysis, lack plasticity. Even so, the extensive
21
utilization of crop models in areas as diverse as policy analysis (Rosenzweig et al., 1996;
22
Tubiello et al., 1999); research (Hanks and Ritchie, 1991); and, increasingly, production
21
1
management (Welch et al., 1999), shows that the physiological approach is not without its
2
strengths.
3
However, if taken to extremes, genetic neural networks could run the risk of committing the
4
reverse error, that of subsuming too much under the heading of genomics. Future plant models
5
will need to craft a careful balance between traditional methods and new genomic approaches
6
like, or derived from, the one presented here. Thus, the most productive way forward will
7
undoubtedly entail a synthesis of crop modeling and genomic methods rather than the mere
8
adoption of a new approach by one existing research community.
9
aspects of this work, this is the most exciting of all.
10
22
Among all the exciting
1
Appendix
2
Ten years (1987-96) of anthesis date (R1) data from soybean performance trials at Tifton,
3
GA (obtained from G. Hoogenboom) were examined for evidence of order switching. There
4
were early and late plantings in each year for a total of 20 different environments. Only irrigated
5
treatments were used to limit the possible effects of water stress. This yielded a total of 901
6
observations (i.e., treatment means) representing 141 varieties. These were screened for all pairs
7
of varieties that co-occurred in two or more environments. A subset of 137 varieties comprising
8
873 observations met this criterion.
9
We next defined a case as two specific varieties compared in two particular environments.
10
Among the 37 395 cases in the data were 3470 (9.3%) with apparent switches. However, the
11
effect cannot be declared real without a statistical evaluation. We applied reasoning developed
12
by Dr. T. Loughin (Dept. Statistics, Kansas State University, pers. comm., 2000). Consider the
13
distribution of differences in mean genotype (G) effects between pairs of varieties.
14
attention is paid to the order in which the subtractions are performed then this distribution will
15
span zero. In this situation, there are only two ways in which switches can fail to occur. The
16
first is if there are no genotype-by-environment (GE) interactions, a premise that is undeniably
17
false. The second is if the magnitudes of GE interactions become small when G differences are
18
small. Only in this way can pairs of varieties have G differences that do not fall on opposite
19
sides of the origin in diverse environments.
If no
20
There were 5201 pairs of varieties found in two or more environments. For each pair we
21
computed the mean and variance across environments of the inter-varietal differences in
22
treatment means. The variances estimate GE interaction effects and were converted to standard
23
deviations so as to have the same units as the means. Figure 5 is a plot of all 5201 [mean,
23
1
standard deviation] pairs. As is evident, there is some reduction in the standard deviation as the
2
means approach zero. This is consistent with the relative rarity of switching. However, there are
3
points quite near zero with standard deviations as large as 4 d or greater making it very difficult
4
to assert that all treatment differences actually have the same sign. These deviations also exceed
5
the one-day sampling interval (whose impact is clear along both axes), thus eliminating
6
quantizing effects as a source of the observed switching.
7
The observed soybean switches have smaller magnitudes (as defined in the main text) than
8
seen in Arabidopsis; 90% are 8 d or less. However, the consistent, 8C-separation between
9
chamber runs is a wider deviation than would be expected between two temporally distinct field
10
studies at a single site. It is not surprising, therefore, that the cumulative effect would be less in
11
soybean plots. Dr. William Schapaugh, an experienced breeder (Dept. Agronomy, Kansas State
12
University), has also indicated (pers. comm., 2000) that switches of this size are not unusual in
13
irrigated soybean plots. Nor is it remarkable, given rarity and small magnitude, that switching
14
has not been noticed in comparisons of actual data with the predictions of simulation models
15
completely incapable of the effect. Switching may have just been overlooked as noise.
16
24
1
Citations
2
Ahmad, M., and A.R. Cashmore. 1993. HY4 gene of A. thaliana encodes a protein with
3
4
5
6
7
8
9
characteristics of a blue-light photoreceptor. Nature 366:162-166
Barabasi, A., and R. Albert. 1999. Emergence of scaling in random networks. Science 286:509512
Blazquez, M.A. 2000a. Flower development pathways. J. of Cell Sci. 113:3547-3548. Available
at http://www.biologists.org/JCS/113/20/jcs8511.html
Blazquez, M.A., and D. Weigel. 2000b. Integration of floral inductive signals in Arabidopsis.
Nature 404:889-892
10
Bridges, D.C., H.I. Wu, P.J.H. Sharpe, and J.M. Chandler. 1989. Modeling distributions of crop
11
and weed seed germination time. Weed Sci. Weed Science Society of America,
12
Champaign, IL. 37:724-729
13
14
Burstein, Z. 1995. A network model of the developmental gene hierarchy. J. Theor. Biol. 174:111
15
Cox, D.R., and D. Oaks. 1984. Analysis of survival data. Chapman and Hall. London
16
Devlin, F.P., and S.A. Kay. 2000. Flower arranging in Arabidopsis. Science 288:1600-1602
17
Duan, Q., S. Sorooshian, and V. Gopta. 1992. Effective and efficient global optimization for
18
19
20
conceptual rainfall-runoff models. Water Resource Research 28:1015-1031
Duan, Q., S. Sorooshian, and V. Gopta. 1994. Optimal use of the SCE-UA global optimization
method for calibrating watershed models. J. Hydrology 158:265-284
21
Elizondo, D.A., R.W. McClondon, and G. Hoogenboom. 1992. Neural network models for
22
predicting crop phenology. Presented at 1992 International Winter Meeting, Paper No.
23
92-3596. ASAE, 2950 Niles Road, St. Joseph, MI 49085-9659 USA
25
1
2
Fine, T.L. 1999. Feedforward neural network methodology. Springer-Verlag New York, Inc.
New York
3
Fowler, S., K. Lee, H. Onouchi, A. Samach, K. Richardson, B. Morris, G. Coupland, and J.
4
Putterill. 1999. GIGANTEA: a circadian clock-controlled gene that regulates
5
photoperiodic flowering in Arabidopsis and encodes a protein with several possible
6
membrane-spanning domains. EMBO J. 18:4679-4688
7
Grierson, D.E., and P. Hajela (Ed). 1996. Emergent computing methods in engineering design:
8
Applications of genetic algorithms and neural networks (NATO Asi Series. Series F,
9
Computer and system). Springer-Verlag. New York
10
11
12
13
14
15
Grimm, S.S., J.W. Jones, K.J. Boote, and J.D. Hesketh. 1993. Parameter estimation for
predicting flowering date of soybean cultivars. Crop Sci. 33:137-144
Guo, H., H. Yang, T.C. Mockler, and C. Lin. 1998. Regulation of flowering time by Arabidopsis
photoreceptors. Sci. 279:1360-1363
Hammer, G.L. 1998. Crop modeling: Current status and opportunities to advance. Acta
Horticulturae 456:27-36
16
Hanks, J. J.T. Ritchie. 1991. Modeling plant and soil systems. ASA, CSSA, SSSA. Madison, WI
17
Heiniger, R.W., R.L. Vanderlip, S.M. Welch. 1997. Developing guidelines for replanting grain
18
sorghum: I. Validation and sensitivity analysis of the SORKAM sorghum growth model.
19
Agron. J. 89:75-83
20
21
Hesketh, J.D., D.L. Myhre, and C.R. Willey. 1973. Temperature control of time intervals
between vegetative and reproductive events in soybeans. Crop Sci. 13:250-254
26
1
Hoogenboom, G., J.W. White. 2000. Improving physiological assumptions of simulation models
2
by using gene-based approaches. Agronomy Abstracts, American Society of Agronomy.
3
p.21
4
5
Howell, S.H. 1998. Molecular genetics of plant development. Cambridge University Press.
Cambridge, UK
6
Irmak, A., J.W. Jones, T. Mavromatis, S.M. Welch, K.J. Boote, and G.G. Wilkerson. 2000.
7
Evaluating methods for simulating soybean cultivar responses using cross validation.
8
Agron. J. 92:1140-1149
9
10
11
12
Kasabov, N.K. 1996. Foundations of neural networks, fuzzy systems, and knowledge
engineering. MIT Press. Cambridge, USA
Koornneef, M., C. Alanso-Blanco, A.J.M. Peeters, and W. Soppe. 1998a. Genetic control of
flowering time in Arabidopsis. Ann. Rev. Plant Physiol. Plant Mol. Biol. 49:345-370
13
Koornneef, M., C. Alonso-Blanco, H. Blankestijn-de Vries, C.J. Hanhart, and A.J.M. Peeters.
14
1998b. Genetic interactions among late flowering mutants of Arabidopsis. Genetics
15
148:885-892
16
17
Koornneef, M., C.J. Hanhart, and J.H. Van der Veen. 1991. A genetic and physiological analysis
of late flowering mutants in Arabidopsis thaliana. Mol. Gen. Genet. 229:57-66
18
Levy, Y.Y., C. Dean. 1998. The transition to flowering. Plant Cell 10:1973-1989
19
Lewin, B. 1997. Genes VI. Oxford Univ. Press. Oxford, UK
20
Macknight, R., I. Bancroft, T. Page, C. Lister, R. Schmidt, K. Love, L. Westphal, G. Murphy, S.
21
Sherson, C. Cobbett, and C. Dean. 1997. FCA, a gene controlling flowering time in
22
Arabidopsis, encodes a protein containing RNA-binding domains. Cell 89:737-745
27
1
2
3
4
5
6
7
8
McDaniel, C.N. 1996. Developmental physiology of floral initiation in Nicotiana tabacum L. J.
Exp. Bot. 47: 465-475
Mendoza, L., E.R. Alvarez-Buylla. 1998. Dynamics of the genetic regulatory network for
Arabidopsis thaliana flower morphogenesis. J. Theor. Biol. 193:307-319
Mendoza, L., E.R. Alvarez-Buylla. 2000. Genetic regulation of root hair development in
Arabidopsis thaliana: A Network Model. J. Theor. Biol. 204:311-326
Mjolsness, E., D.A. Sharp, and J.A. Reinitz. 1991. A connectionist model of development. J.
Theor. Biol. 192: 429-453
9
Paoli, E.R. 1997. Maize performance in Kansas: A CERES-Maize simulation. Ph.D. dissertation.
10
Kansas State University. Dissertation Abstracts International 58-08B: AAI9804375
11
Park, D.H., D.E. Somers, Y.S. Kim, Y.H. Choy, H.K. Lim, M.S. Soh, H.J. Kim, S.A. Kay, and
12
H.G. Nam. 1999. Control of circadian rhythms and photoperiodic flowering by the
13
Arabidopsis GIGANTEA gene. Science 285:1579-1582
14
15
16
17
Pradhan, S. 1946. Insect population studies IV: Dynamics of temperature effect on insect
development. Proc. Nat. Inst. Sci. India 12:385-404
Press, W.H., S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. 1992. Numerical recipes in C:
the art of scientific computing. Cambridge University Press. Cambridge
18
Putterill, J., F. Robson, K. Lee, R. Simon, and G. Coupland. 1995. The CONSTANS gene of
19
Arabidopsis promotes flowering and encodes a protein showing similarities to zinc finger
20
transcription factors. Cell 80:847-857
21
Reed, J.W., P. Nagpal, D.S. Poole, M. Furuya, and J. Chory. 1993. Mutations in the gene for the
22
red/far-red light receptor phytochrome B alter cell elongation and physiological responses
23
throughout Arabidopsis development. Plant Cell 5:147-157
28
1
2
Reeves, P.H., G. Coupland. 2000. Response of plant development to environment: Control of
flowering by daylength and temperature. Plant Biol. 3:37-42
3
Rosenzweig, C., J. Philips, R. Goldberg, J. Carroll, and T. Hodges. 1996. Potential impacts of
4
climate change on citrus and potato production in the US. Agric. Syst. 52:455-479
5
Segel, I.H. 1975. Enzyme kinetics. John Wiley & Sons, New York
6
Sharpe, P.J.H., and D.W. DeMichelle. 1977. Reaction kinetics of poikilotherm development. J.
7
Theor. Biol. 64:649-670
8
Simpson, G.G., A.R. Gendall, and C. Dean. 1999. When to switch to flowering. Ann. Rev. Cell
9
Dev. Biol. 99:519-550. Available at http://www.jic.bbsrc.ac.uk/staff/caroline-dean/ft.htm
10
11
Theissen, G., and H. Saedler. 1999. The golden decade of molecular floral development (19901999): A cheerful obituary. Developmental Genetics 25:181-193
12
Thyer, M., F. Kuezera, and B.C. Batas. 1999. Probabilistic optimization of conceptual rainfall-
13
runoff models: A comparison of the shuffled complex evolution and simulated annealing
14
algorithms. Water Resources J. 35:767-773
15
Tubiello, F.N., C. Rosenzweig, B.A. Kimball, P.J.Pinter Jr., G.W. Wall, D.J. Hunsaker, R.L.
16
LaMorte, and R.L. Garcia. 1999. Testing CERES-wheat with free-air carbon dioxide
17
enrichment (FACE) experiment data: CO2 and water interactions. Agron. J. 91:247-255
18
Wagner, T.L., H.I. Wu, P.J.H. Sharpe, R.M. Schoolfield, and R.N. Coulson. 1984. Modeling
19
insect development rates: a literature review and application of a biophysical model. Ann.
20
Entomol. Soc. Am. 77:208-225
21
Welch, S.M., J.M. Ham, G.J. Kluitenberg, and N.H. Trick. 2000. Components of next generation
22
research-oriented crop models. Seminar, Hebei Agricultural University, China. Available
23
at http://www.oznet.ksu.edu/dp_agrn/FACULTY/WELCH
29
1
Welch, S.M., J.W. Jones, G. Reeder, M.W. Brennan, B.M. Jacobson. 1999. PCYield: Model-
2
based support for soybean production. In: Donatelli, M., C. Stockle, F. Villalobos, and J.
3
M. Villar Mir, eds., Proceedings of the International Symposium Modeling Cropping
4
Systems. Lleida, Spain. European Society of Agronomy Division of Agroclimatology and
5
Agronomic Modeling, and the University of Lleida. p273-274
6
7
8
9
White, J.W., and G. Hoogenboom, 1996. Simulating effects of genes for physiological traits in a
process-oriented crop model. Agron. J. 88:416-422
Wraith, J., and D. Or. 1998. Nonlinear parameter estimation using spreadsheet software. J. Nat.
Resource Life Science Education. 27:13-19
10
30
1
Figure Captions
2
1 Initial genetic neural network. The nodes correspond to flowering time control genes.
3
The inputs are day length, L, and days after planting, D. Inputs with underlines were
4
later dropped (see text).
5
6
2 Predicted vs. observed inflorescence transition data. To focus on the accuracy of the
7
transition, only those data are plotted for which either the predicted or actual values are
8
between 0.05 and 0.95.
9
10
3 Cumulative distribution of SSE values for 300 random networks.
Training for all
11
networks began with weights and biases generation an SSE of 31.81. Final SSE’s ranged
12
from 6.50 to 31.13. The genetically-based network had a final SSE of 0.59.
13
14
4 Predictions of final two-temperature model. To simplify the plot, each series is truncated
15
one day before the first observed transition and one day after the last transition. Markers
16
are observed data and lines are model predictions. Except for the closed diamond which
17
is the Ler wild type, open markers denote autonomous path genes while closed markers
18
represent the photoperiod path. Color codes are: dark blue = wild type, orange=CRY2,
19
brown=GI, green=CO, purple=FPA, light blue=FVE, and red=FCA. The black arrows
20
indicate the switch in the transition order of FVE and CO with cooling (see text).
21
31
1
5 Relation between GE interaction magnitude and differences in genotype means. The
2
shape of the scatter is consistent with both the rarity and existence of exchanges in
3
anthesis date order among pairs of soybean varieties grown in different environments (see
4
text).
5
32
1
L
D
L
PHYB
CRY2
D
L
D
GI
D
CO
D
FPA
D
D
FVE
FCA
Int
33
Pre dicte d Fraction of Plants
1 .0 0
0 .7 5
Ler
CRY2
P HY B
GI
0 .5 0
F PA
F VE
FCA
CO
0 .2 5
0 .0 0
0 .0 0
0 .2 5
0 .5 0
0 .7 5
O b s e r v e d F r a c t io n o f P la n t s
34
1 .0 0
Fits of Random Networks
Cumulative Freq.
1
N=300
0.8
0.6
0.4
Gene-based network
SSE = 0.59
0.2
Initial Estimate Error
SSE = 31.81
0
0
5
10
15
20
25
Sum of Squared Errors
35
30
35
1
0.9
Ler
Series2
24C Data
0.8
CRY2
Series4
0.7
GI
0.6
Series6
CO
Fraction of Plants Completing Transition
0.5
Series8
FPA
0.4
Series10
0.3
FVE
Series12
0.2
FCA
Series14
0.1
0
10
20
30
40
50
60
1
0.9
Ler
16C Data
0.8
Series2
CRY2
Series4
0.7
GI
0.6
Series6
CO
0.5
Series8
FPA
0.4
Series10
0.3
FVE
Series12
0.2
FCA
Series14
0.1
0
10
20
30
40
Days after Planting
36
50
60
14
Std. Dev. of Genotype Differences (d)
12
10
8
6
4
2
0
-25
-20
-15
-10
-5
0
5
10
Mean of Genotype Differences (d)
37
15
20
25
30
Table 1. Genotype details of experimental materials.
Stock
No.
CS172
Allele
Symbol,
Position
FCA,
4-32.0
fca-6
CS166
fpa-2
FPA,
2-67.0
CS174
fve-2
FVE,
2-32.0
CS179
co-6
CO,
5-13.0
CS108
fha-1
CRY2,
1-12.0
CS183
gi-6
GI,
1-33.0
CS6211 phyB-1
PHYB,
2-35.0
CS20
(Ler-0)
ER,
2- 48.0
er-1
Phenotype, Gene Product, and Citations
 Late flowering; vernalization affects flowering time under
long and short days; strong short-day influence on flowering
time; flowers ca. 19 d after wild type.
 RNA binding protein
 Macknight et al. (1997)
 Late flowering; vernalization affects flowering time under
long and short days; flowers ca. 13 d after wild type.
 Unknown
 Koornneef et al. (1991)
 Late flowering; vernalization affects flowering time under
long and short days; strong short-day influence on flowering
time; flowers ca. 26 d after wild type.
 Unknown
 Koornneef et al. (1991)
 Late flowering (long days only); no effect of vernalization;
incompletely dominant; flowers ca. 21 d after wild type.
 Zinc-finger transcription factor
 Putterill et al. (1995)
 Late flowering; unaffected by short days or vernalization;
increased number of rosette leaves.
 Cryptochrome 2, a blue-light photoreceptor
 Ahmad et al. (1993); Guo et al. (1998)
 Late flowering (long days only); no effect of vernalization;
flowers ca. 22 d after wild type.
 Transmembrane protein
 Park et al. (1999); Fowler et al. (1999)
 Early flowering under short and long days; long hypocotyl in
red and white but not in far-red light; visibly reduced
chlorophyll level; abnormally long petioles, stems and root
hairs; smaller leaves and fewer in rosette; greater apical
dominance (longer main inflorescence, fewer lateral
branches, stem termini, and main inflorescence siliques).
 Phytochrome B, a red-light photoreceptor
 Reed et al. (1994)
1 Common background for all mutants; blunt fruits, short
petioles, usually upright stems, compact inflorescence; Ler-0,
ecotype derived from Landsberg (La-0), but genetically
distinct; carries erecta gene; blunt siliques, upright stems,
bunched flowers; suitable wild type, provides control for
many mutants.
38
Table 2. Parameter values and goodness-of-fit for three variations of the two-temperature model.
Node
Symbol
Values by version
I
II
III
CRY2
qCRY2
1.116
1.000
1.000
PHYB
qPHYB
-34.993
-50.637
a7
-18.556
-28.312
b’PHYB
-27.682
-18.222
FPA
qFPA
0.036
0.033
0.121
a4
0.233
0.241
0.132
b’FPA
-8.833
-8.724
-5.390
GI
qGI
0.128
0.125
0.130
a
-1.625
-4.244
-1.408
v11
1.492
0.640
1.099
v12
10.986
.003
v19
142.686
46.855
44.580
bGI
-2.311
-2.285
-2.235
FVE
qFVE
0.415
0.410
0.382
a5
0.070
0.077
0.093
bFVE
-2.093
-2.192
-2.627
CO
qCO
54.288
88.176
86.032
a13
5.209
5.265
5.584
bCO
-4.778
-5.708
-5.495
FCA
qFCA
0.331
0.321
0.288
a6
0.060
0.061
0.066
bFCA
-3.726
-3.811
-4.046
Int
a’
23.777
23.426
21.280
v15
1.470
2.296
1.903
v16
-0.863
-0.758
-2.040
v17
2.272
2.621
2.860
v18
8.860
9.850
11.720
RMSE
7.3410-2 7.3310-2 6.5710-2
39
Download