Department of Biology
Courant Institute of Math & Computer Sciences
Gloria Coruzzi
Mike Chou
Andrew Kouranov
Laurence Lejay
Dennis Shasha
Bud Mishra
Marco Antoniotti
Marc Rejali
[Figure: nitrogen assimilation diagram. Labels: NH4+, Glu, Gln, Asp, Asn; Light and Carbon; Amino acids; GS2 (Gln, C:N = C5:N2) and AS1 (Asn, C:N = C4:N2).]
A Multi-factor Approach to C:N sensing in plants.
Identify how a combination of interactions of “inputs” (Light, Carbon, & Nitrogen) affects gene regulation using Combinatorial Design and Genome Chip analysis.
Identify Arabidopsis mutants defective in C:N sensing
Forward genetics: Selections for C:N sensing mutants
Reverse genetics: Mutants in candidate C:N signaling genes
Ultimate Goal: Virtual plant… (frankenfoods)
A Combinatorial Approach to discovering interactions
Inputs: *Light
*Starvation to Various Nutrients
*Carbon
*Inorganic N (NO3/NH4)
*Organic N (Glu)
*Organic N (Gln)
If inputs take binary values (first approximation):
6 binary (+/-) inputs = 2^6 or 64 input combinations (or treatments)
Use combinatorial design to reduce number of treatment combinations required to effectively cover the experimental space
ACTIVIST DATA MINING
Don’t study the experiments (only). Change them.
Combinatorial design generates a subset of the 64 treatments that gives a “good” approximation of the entire experimental space.
For every pair of “inputs”, all four combinations of binary variables are tested:
Example: NO3 and Carbon have four possible combinations:
+Carbon; +NO3
-Carbon; -NO3
+Carbon; -NO3
-Carbon; +NO3
Each pairwise combination of input values is present in at least one treatment of the experiments generated by combinatorial design
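A minimal sketch of how such a subset could be generated, assuming a simple greedy strategy (not necessarily the algorithm used for these experiments). The input names follow the six inputs listed earlier; the function names `pairwise_design` and `covers_all_pairs` are illustrative.

```python
from itertools import combinations, product

# Six binary inputs, following the list on the earlier slide.
INPUTS = ["Light", "Starve", "Carbon", "NO3NH4", "Glu", "Gln"]


def pairwise_design(n_inputs, levels=(0, 1)):
    """Greedily pick treatments until every pair of inputs has seen
    every combination of levels at least once."""
    all_treatments = list(product(levels, repeat=n_inputs))
    index_pairs = list(combinations(range(n_inputs), 2))
    # Every (input i, input j, level a, level b) that still needs coverage.
    uncovered = {
        (i, j, a, b)
        for i, j in index_pairs
        for a, b in product(levels, repeat=2)
    }
    design = []
    while uncovered:
        # Choose the treatment that covers the most still-uncovered pairs.
        best = max(
            all_treatments,
            key=lambda t: sum((i, j, t[i], t[j]) in uncovered for i, j in index_pairs),
        )
        design.append(best)
        uncovered -= {(i, j, best[i], best[j]) for i, j in index_pairs}
    return design


def covers_all_pairs(design, n_inputs, levels=(0, 1)):
    """Check that each pair of inputs takes all level combinations."""
    wanted = set(product(levels, repeat=2))
    return all(
        {(t[i], t[j]) for t in design} == wanted
        for i, j in combinations(range(n_inputs), 2)
    )


design = pairwise_design(len(INPUTS))
for treatment in design:
    print(dict(zip(INPUTS, treatment)))
print(len(design), "treatments instead of", 2 ** len(INPUTS))
```

The greedy choice is only one way to build a pairwise-covering subset; it does not guarantee the smallest possible design.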
“Combinatorial design” predicts 12 conditions to test the effect of Light in all combinations of Starvation, Carbon, and Nitrogen
EXPT 1 (PIVOT: LIGHT); values listed per column in slide order
LANE    6 7 8 9 10 11 12 1 4 5 2 3
LIGHT   LIGHT LIGHT LIGHT LIGHT LIGHT LIGHT DARK DARK DARK DARK DARK DARK
STARVE  N N N Y N N Y N N N Y Y
CARBON  L 0 L L L 0 0 0 L 0 0 L
NO3NH4  L 0 L 0 L 0 L 0 L 0 L 0
GLU     0 H 0 H H 0 0 0 H H 0 H
GLN     0 H 0 0 H 0 H 0 H H H 0
“Pivot” analysis of gene expression data from C:N treatments
Find “minimal pairs” of treatments that are the same except in one input (e.g. Light) to measure its effect on a dependent variable (gene) (e.g. AS1)
PIVOT: LIGHT
Dependent variable (gene): AS1
EFFECT: repress
Evidence (minimal pair treatments): 4_8
LITE  STARVE  CARBON  NO3  GLU
L_D   N       L       0    H
Analyze a series of minimal pair treatments using one input (e.g. Light) as a “pivot”, to determine the effect of light on a dependent variable (e.g. AS1) under a variety of carbon and nitrogen combinations. If the effect is consistent across these combinations, it is likely to hold in general.
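The pivot analysis can be sketched as follows, assuming each treatment is a mapping from input name to level and each gene has one expression value per treatment. The expression numbers and the context of pair 1_5 are made up for illustration. The function names are hypothetical, and comparing raw values is a simplification of real chip analysis.

```python
from itertools import combinations


def minimal_pairs(treatments, pivot):
    """Return (id_a, id_b) for treatments that differ only in `pivot`."""
    pairs = []
    for (id_a, a), (id_b, b) in combinations(treatments.items(), 2):
        differing = [name for name in a if a[name] != b[name]]
        if differing == [pivot]:
            pairs.append((id_a, id_b))
    return pairs


def pivot_effect(treatments, expression, pivot, on_value):
    """Label each minimal pair as induce / repress / none for the pivot."""
    effects = {}
    for id_a, id_b in minimal_pairs(treatments, pivot):
        # Put the treatment where the pivot is "on" first.
        if treatments[id_a][pivot] == on_value:
            on_id, off_id = id_a, id_b
        else:
            on_id, off_id = id_b, id_a
        diff = expression[on_id] - expression[off_id]
        label = "induce" if diff > 0 else "repress" if diff < 0 else "none"
        effects[f"{on_id}_{off_id}"] = label
    return effects


# Treatments 4 and 8 follow the 4_8 example; 1 and 5 are illustrative.
treatments = {
    4: {"LIGHT": "L", "STARVE": "N", "CARBON": "L", "NO3": "0", "GLU": "H"},
    8: {"LIGHT": "D", "STARVE": "N", "CARBON": "L", "NO3": "0", "GLU": "H"},
    1: {"LIGHT": "L", "STARVE": "N", "CARBON": "0", "NO3": "0", "GLU": "0"},
    5: {"LIGHT": "D", "STARVE": "N", "CARBON": "0", "NO3": "0", "GLU": "0"},
}
as1 = {4: 0.4, 8: 1.9, 1: 0.3, 5: 1.2}  # made-up expression levels

effects = pivot_effect(treatments, as1, pivot="LIGHT", on_value="L")
print(effects)
print("consistent:", len(set(effects.values())) == 1)
```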
LITE represses AS1 & induces GS2 under a variety of C:N conditions
PIVOT: LIGHT
Gene  Effect   Evidence (minimal pair treatments)
AS1   repress  1_5, 2_6, 3_7, 4_8, 10_14, 11_15, 12_16, 13_17
GS2   induce   1_5, 2_6, 3_7, 4_8, 10_14, 11_15, 12_16, 13_17

Context columns as extracted (16 entries each; row alignment with the pairs above is not preserved):
LITE: L_D in all 16
STARVE: N Y N N Y Y Y Y Y Y N Y Y N Y Y
CARBON, NO3/NH4 (two L/0 columns, order uncertain):
  L 0 0 0 L 0 0 0 L 0 0 0 L 0 0 0
  L L 0 L L L 0 L L L 0 L L L 0 L
GLU: 0 H 0 0 0 H 0 0 0 H 0 0 0 H 0 0
GLU induces AS1 & represses GS2 under a variety of conditions
PIVOT: GLU
Gene  Effect   Evidence (minimal pair treatments)
AS1   induce   2_4, 6_8, 15_17, 19_21, 23_25, 26_28, 30_32
GS2   repress  2_4, 6_8, 11_13, 15_17, 19_21, 20_22, 23_25, 30_32

Context columns as extracted (15 entries each; row alignment with the pairs above is not preserved):
LIGHT: L L D L L L D D L D D D D L L
STARVE: Y Y Y Y Y Y Y N N Y Y N N N Y
Carbon: L in all 15
NO3/NH4: 0 0 0 L 0 0 0 0 0 0 0 0 0 0 0
GLU: 0_H 0_H 0_L 0_L 0_H 0_H 0_H 0_H 0_L 0_L 0_L 0_H 0_H 0_H 0_H
Underlying Method: combinatorial design
Combinatorial design: Inspired by work in software testing by David Cohen, Siddhartha Dalal, Michael Fredman and Gardner Patton at Bellcore/Telcordia.
Their problem: how to test a good set of inputs to a program to discover whether there are any bugs.
Not program coverage, but input coverage.
Not all input combinations, but all combinations of every pair of input variables.
Hypothesis: every input combination should give same output: no error.
If true for designed subset, then program is ok.
Underlying Method: combinatorial design 2
Scientific question: does input X induce (resp. repress) the output?
If so, then, regardless of the other inputs, X should induce.
So, choose X = low and then a combinatorial design of the other inputs.
Then choose X = high and then the same combinatorial design of the other inputs.
If, for each context c in the design, (high, c) has more output than (low, c), where the two treatments form a minimal pair, then X is inductive.
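The procedure above can be sketched as follows, assuming the contexts c come from a combinatorial design of the other inputs and that one output value is measured per treatment. The contexts, the measurements, and the names `pivot_design` and `classify_pivot` are illustrative, not taken from the experiments.

```python
def pivot_design(contexts):
    """Cross X = low and X = high with the same design of the other inputs,
    giving one minimal pair per context c."""
    return [(("low", c), ("high", c)) for c in contexts]


def classify_pivot(output, contexts):
    """Compare (high, c) with (low, c) for every context c."""
    diffs = [output[("high", c)] - output[("low", c)] for c in contexts]
    if all(d > 0 for d in diffs):
        return "inductive"
    if all(d < 0 for d in diffs):
        return "repressive"
    return "mixed"


# Hypothetical contexts: levels of three other inputs.
contexts = [("N", "L", "0"), ("Y", "0", "H"), ("N", "0", "0")]

# Made-up measurements, one per treatment in the pivot design.
output = {
    ("low", contexts[0]): 1.0, ("high", contexts[0]): 2.5,
    ("low", contexts[1]): 0.8, ("high", contexts[1]): 1.1,
    ("low", contexts[2]): 1.4, ("high", contexts[2]): 3.0,
}

for low, high in pivot_design(contexts):
    print(low, "vs", high)
print("X is", classify_pivot(output, contexts))
```

A "mixed" result is the case where X is neither uniformly inductive nor uniformly repressive.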
Underlying Methods: adaptive design
What happens when X isn’t uniformly inductive or repressive?
Suppose X shows induction normally, but repression occasionally. That is, for most c values, (low, c) vs. (high, c) shows induction, but for one c’, (low, c’) vs. (high, c’) shows repression.
Then study the differences between c’ and the closest c values that show induction, and design experiments to reduce those differences.
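One way to find the "closest" contexts is sketched below, assuming closeness means the number of inputs on which two contexts differ (Hamming distance). The slides do not define the distance, so that choice is an assumption. The contexts and function names are illustrative.

```python
def hamming(c1, c2):
    """Number of inputs on which two contexts differ."""
    return sum(a != b for a, b in zip(c1, c2))


def nearest_normal_contexts(exception, normal_contexts):
    """Return the normal contexts closest to the exceptional one,
    each with the positions of the inputs that differ."""
    best = min(hamming(exception, c) for c in normal_contexts)
    return [
        (c, [i for i, (a, b) in enumerate(zip(exception, c)) if a != b])
        for c in normal_contexts
        if hamming(exception, c) == best
    ]


# Hypothetical contexts: (STARVE, CARBON, NO3) levels.
exception = ("Y", "0", "L")  # the c' where X represses
normal = [("N", "0", "L"), ("N", "L", "0"), ("Y", "L", "L")]  # X induces

for context, differing in nearest_normal_contexts(exception, normal):
    print(context, "differs from c' at input positions", differing)
```

The differing positions point to the inputs worth varying in follow-up experiments.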
Conclusions About Methodology
Design/don’t wait: Use the data you are given, sure, but don’t be shy to ask for more.
Combinatorial Design can help test a hypothesis: e.g. 10 three-valued variables require 59,049 experiments to cover the whole space. Combinatorial design can reduce this to 27.
Adaptation is easy: Study differences between normal cases and abnormal ones to discover fine structure.
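The arithmetic behind the 10-variable example can be checked with a short sketch. It computes the size of the full space and the number of pairwise value combinations a design has to cover. It does not reproduce the 27-experiment design itself, and the variable names are illustrative.

```python
from math import comb

n_vars, n_levels = 10, 3

# Every combination of all ten variables.
full_factorial = n_levels ** n_vars

# Pairwise targets: each pair of variables times each pair of levels.
pair_targets = comb(n_vars, 2) * n_levels ** 2

# A single experiment fixes one level per variable, so it covers
# exactly one level combination for each pair of variables.
pairs_per_experiment = comb(n_vars, 2)

print(full_factorial)        # 59049
print(pair_targets)          # 405
print(pairs_per_experiment)  # 45
```

Because one experiment covers 45 of the 405 pairwise targets at once, far fewer experiments than the full 59,049 can cover every pair.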