WhitePaper-20040629 - Computational Biology and Informatics

ISSUES RELATED TO EXPERIMENTAL DESIGN AND NORMALIZATION
RE: Mouse and Human PancChips
Elisabetta Manduchi (1) and Peter White (2)
(1) Computational Biology and Informatics Laboratory, Center for Bioinformatics, and
(2) Functional Genomics Core, Department of Genetics,
University of Pennsylvania, Philadelphia, PA 19104
Introduction
This document informally describes some of the issues that need to be evaluated when designing
a microarray experiment, with a specific focus on the human and mouse PancChips. There is no
hard and fast rule for the design of any particular study. Decisions typically must be made on a
case-by-case basis, taking into account the concerns listed below and the questions and samples
of interest.
1. Direct-Comparison vs. Reference Design
When using 2-channel arrays to compare different conditions, there are different options
regarding what to hybridize in each channel of each array. For example, when comparing
condition A with condition B, the following are two possibilities:
a. Carry out a series of direct comparisons, say n arrays, with a sample of type A in one
channel and one of type B in the other channel in each array (possibly with dye-swaps).
b. Select a common reference sample of type C and carry out a series of hybridizations
with a sample of type A in one channel and the common reference in the other and a
series of hybridizations with a sample of type B in one channel and the common
reference in the other. Then A and B would be compared by comparing the ratios A/C
to B/C.
Option (b) introduces more variability, because A and B are compared only indirectly through two
separate sets of hybridizations. If the only comparison of interest is between the two
conditions A and B, then design (a) is preferable. However, if one plans to extend the study to
involve additional conditions to be compared with each other and with A and B, then option
(b) would be preferable to a loop design. For additional background on this issue and pointers to
further references, the reader is referred to [1].
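The variability cost of option (b) can be illustrated with a small calculation. This is a minimal sketch under simplifying assumptions that are not stated in this document: every array yields a log-ratio with the same variance `sigma2`, arrays are independent, and the reference design splits its arrays evenly between A-versus-C and B-versus-C hybridizations. The function names are hypothetical.

```python
def direct_variance(sigma2: float, n_arrays: int) -> float:
    """Variance of the mean log2(A/B) over n direct A-versus-B arrays."""
    return sigma2 / n_arrays


def reference_variance(sigma2: float, n_arrays: int) -> float:
    """Variance of mean(log2(A/C)) - mean(log2(B/C)) when n arrays are
    split evenly between A-versus-C and B-versus-C hybridizations."""
    if n_arrays % 2:
        raise ValueError("n_arrays must be even to split between the two arms")
    per_arm = n_arrays // 2
    # The two arm means are independent, so their variances add.
    return sigma2 / per_arm + sigma2 / per_arm


sigma2 = 1.0  # hypothetical per-array variance of a log-ratio
n = 10        # total number of arrays available to either design
ratio = reference_variance(sigma2, n) / direct_variance(sigma2, n)
print(f"reference / direct variance ratio: {ratio:.1f}")
```

Under these assumptions the reference design has four times the variance of the direct design for the same number of arrays, which is the price paid for being able to add further conditions later.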
In the case of a design using a common reference, care must be taken as to what to use for such a
reference. To minimize variability, the reference should come from the same pool of RNA,
rather than from different biological replicates of the same type. The question then arises whether
this should be a pool of RNA prior to labeling, so that the reference is labeled
separately for each hybridization, or whether one should start with a single pool of labeled
RNA. If the experiments are done on the same day, the second option would be preferable, as it
would introduce less variability. However, if the hybridizations are done over the course of
several days, a concern is that the labeled RNA might not be stable for an extended period of
time.
Other important considerations regarding choosing a common reference include:
i. Good “expression coverage”: by this we mean that it would be useful to have a reference in
which a high proportion of the genes represented on the chip are expressed, to avoid spots with
zero denominators in the A/C and B/C ratios, which would force discarding those spots from
the analyses.
ii. If there are no suitable controls to be used for normalization and one is forced to use all
spots to compute the normalization function, then the reference should be chosen so
that the hypotheses for the applicability of such a normalization are satisfied. If an
intensity-dependent normalization is used, these hypotheses should be satisfied at all intensities. For
example, in the case of global lowess normalization, the underlying assumption is that
changes over the spots represented on the chip are roughly symmetric in the two samples (i.e.
the sample of interest and the reference sample) at all intensities, or that few genes change. If
the method used is print-tip lowess normalization, this hypothesis should be satisfied for the
set of spots of every print-tip group.
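Consideration (i) can be made concrete with a small sketch that computes per-spot log-ratios against the reference and flags the spots that must be discarded. It assumes background-corrected intensities held in plain lists; the function names are hypothetical, and treating a non-positive sample intensity as unusable as well is an added assumption, since its logarithm is undefined.

```python
import math


def log_ratios(sample, reference):
    """Return log2(sample / reference) per spot, or None where undefined."""
    ratios = []
    for s, r in zip(sample, reference):
        if r <= 0 or s <= 0:
            # Zero (or negative) intensity: the ratio or its log is
            # undefined, so the spot is dropped from the analysis.
            ratios.append(None)
        else:
            ratios.append(math.log2(s / r))
    return ratios


def expression_coverage(reference):
    """Fraction of spots with a usable (positive) reference intensity."""
    return sum(1 for r in reference if r > 0) / len(reference)
```

A reference with low coverage on the chip would leave many `None` entries, which is the situation consideration (i) aims to avoid.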
One reference whose use we have considered is Stratagene’s Universal Reference RNA. This
RNA is pooled total RNA derived from a number of different cell lines and as such offers broad
gene coverage on many microarrays. Thus use of this as a common reference fulfills
consideration (i), but raises significant issues in terms of consideration (ii). When carrying out a
study with the PancChip using samples of pancreatic origin, there will clearly be significant
differences in terms of gene expression if these samples are compared with Universal Reference
RNA. This situation would require an alternative method of normalization, utilizing appropriate
controls rather than all genes on the chip (see section 3) and/or utilizing dye-swap experimental
designs.
2. Replication issues.
When comparing conditions, it is crucial to get a sense of the variability within each condition,
especially in studies that look for differentially expressed genes. Typically, if the question is
comparison between two populations A and B, it is important to get a sense of the biological
variability, besides the technical variability. If the kind of replication performed is only technical,
for example n hybridizations involving the same mouse of type A versus n hybridizations
involving the same mouse of type B, then one would be able to make inferences about
differences between these two particular mice, but not necessarily between the populations
themselves. If one wants to make inferences about the populations then true biological replicates
should be used, e.g. different mice from each population. Needless to say, the number of replicates
per condition should be sufficiently large. For studies on the PancChip we recommend a
minimum of five biological replicates. Using only 2 or 3 replicates is simply too few and
will seriously limit the statistical analyses that can be done to this end.
When possible, it is also usually a good idea to have dye-swap technical replicates for each of
the biological replicates. This can expand the range of normalization methods applicable to the
assays at hand.
3. Normalization.
In the microarray jargon, “normalization” indicates the attempt to identify and remove
systematic sources of variation in the measured intensities due to separate reverse transcription
and labeling, different scanning parameters, print-tip differences, spatial effects, different dye
labeling efficiency, the quality of the microarray printing, the quality of the mRNA used to
synthesize cDNA, etc. Normalization is necessary in order to put the data on equal footing before
making intensity comparisons within or between slides. Reference [2] discusses some
normalization methods that can be used with 2-channel data, in particular normalization of the M
values, where M = log2(Cy5) - log2(Cy3) for each given spot. These methods include normalization
by a global constant, intensity-dependent lowess normalization, print-tip lowess normalization,
etc. Normalization involves two choices:
(i) the choice of which normalization function to use (e.g. lowess curves, etc.) to
normalize the values at each spot;
(ii) the choice of the spots to be used in order to compute such a normalization
function.
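The two choices can be sketched in code using the simplest normalization function mentioned above, a global constant. This is a minimal illustration, not the procedure of [2]: the constant is taken to be the median of M over the selected spots, the intensity measure A (the average of the two log intensities) is the usual companion of M but is not defined in this document, and the function and parameter names are hypothetical.

```python
import math
from statistics import median


def ma_values(cy5, cy3):
    """Per-spot M = log2(Cy5) - log2(Cy3) and A = mean of the two logs."""
    m = [math.log2(r) - math.log2(g) for r, g in zip(cy5, cy3)]
    a = [(math.log2(r) + math.log2(g)) / 2 for r, g in zip(cy5, cy3)]
    return m, a


def global_median_normalize(m, use=None):
    """Choice (i): shift all M values by one constant.
    Choice (ii): `use` lists the spot indices from which the constant is
    computed (all spots when None, or e.g. a set of control spots)."""
    indices = range(len(m)) if use is None else use
    shift = median(m[i] for i in indices)
    return [value - shift for value in m]


cy5 = [400.0, 200.0, 100.0, 800.0]  # hypothetical intensities
cy3 = [100.0, 100.0, 100.0, 100.0]
m, a = ma_values(cy5, cy3)
m_all = global_median_normalize(m)             # constant from all spots
m_ctrl = global_median_normalize(m, use=[2])   # constant from one control spot
```

The same split carries over to intensity-dependent methods: the function fitted in (i) changes from a constant to a curve in A, while (ii) still determines which spots the fit is based on.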
The latter could be all genes on the array or a suitable subset of these genes, such as a set of
appropriate controls. Whether or not a given method (with choices as in (i) and (ii)) is applicable
will depend on whether or not the samples hybridized to the two channels satisfy certain
assumptions relative to the spots used in (ii). For example, if all spots on the array are used in (ii)
to build a global lowess curve to be used to normalize the M values, then the assumption is that
changes over the spots represented on the chip are roughly symmetric in the two samples at all
intensities, or that few genes change. If the method used in (i) is print-tip lowess normalization, this
hypothesis should be satisfied for the set of spots of every print-tip group. Thus, if one hybridizes
to a PancChip two samples like pancreas and brain (or even Universal Reference RNA), these
hypotheses are unlikely to be satisfied. Even for pancreas and liver there is some concern. One
solution would be to use for (ii) not all genes on the array, but a suitable subset (appropriate
control genes) for which such hypotheses would be satisfied. One such set of controls is the
Multi-Sample Pool (MSP) controls, also suggested in [2]. However, there are some technical difficulties
and concerns regarding inserting these on the PancChip.
Another alternative set of controls is spiked controls. These have been used by some
researchers to this end. We have experimented with Stratagene’s SpotReport®-10
array validation system. To help address the concerns that arise in the evaluation of
microarray hybridization data, the SpotReport system provides positive and negative controls
that are printed onto the PancChip along with our set of test genes. The kit also provides 10
exogenous Arabidopsis thaliana mRNA Spikes that can be added to the labeling reaction along
with the experimental RNA. Our primary use of these Spikes has been to provide our labeling
reactions with an internal quality control, allowing us to rapidly identify problems in terms of
mRNA quality along with any potential issues that may have arisen due to an error in the
labeling experimental procedure.
When comparing samples for which normalization utilizing all genes on the arrays in step (ii)
above would be inappropriate, these mRNA spikes may provide an alternative set for
normalization. By spiking in each of the 10 mRNAs at varying concentrations we see that it is
possible to determine the expected dye ratios and to normalize the signal intensities due to the
differences in dye incorporation and quantum yield. To achieve a complete coverage of the
dynamic range of intensities we typically spike in a doubling range from 2 pg to 2000 pg
utilizing all ten mRNA spikes. The expression data for the genes of interest can then be
normalized in an intensity dependent manner based on the expression values for the A. thaliana
spikes. Our main concern with this type of control has been the observation that they are not
useful in situations where the quality or quantity of the RNA in the two samples of interest is
different. As such, these Spikes should be used for normalization only when the samples
being compared are not closely matched, and only after taking due care to ensure that the samples are
of the same purity and quantity using an Agilent Bioanalyzer.
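Spike-based, intensity-dependent normalization can be sketched as follows. This is an illustration under stated assumptions rather than the procedure we use: the dye bias at each spike is taken to be its observed M minus its expected M (zero by default, i.e. equal amounts spiked into both channels), the bias between spikes is obtained by linear interpolation over the intensity measure A (the mean of the two log2 intensities) instead of a lowess fit, and it is held constant outside the spike range. All names are hypothetical.

```python
from bisect import bisect_right


def spike_bias_curve(spike_a, spike_m, expected_m=None):
    """Return a function giving the estimated dye bias at intensity A,
    built from the spikes' observed (A, M) values."""
    if expected_m is None:
        expected_m = [0.0] * len(spike_m)  # assumption: equal spike amounts
    points = sorted((a, m - e) for a, m, e in zip(spike_a, spike_m, expected_m))
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def bias(a):
        # Outside the range covered by the spikes, hold the end values.
        if a <= xs[0]:
            return ys[0]
        if a >= xs[-1]:
            return ys[-1]
        j = bisect_right(xs, a)  # xs[j-1] <= a < xs[j]
        x0, x1, y0, y1 = xs[j - 1], xs[j], ys[j - 1], ys[j]
        return y0 + (y1 - y0) * (a - x0) / (x1 - x0)

    return bias


def normalize_with_spikes(m, a, bias):
    """Subtract the spike-derived bias at each spot's intensity A."""
    return [mi - bias(ai) for mi, ai in zip(m, a)]
```

Because the curve is defined only by the ten spikes, spreading their concentrations across the dynamic range matters: spots whose intensity falls outside the spike range receive an extrapolated, and therefore less trustworthy, correction.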
One more option to keep in mind is to employ pairs of technical replicates done in dye-swap and
utilize the paired-slides normalization described in [3]. Because of the assumptions underlying
this normalization method (see [3]), it is best to control image acquisition settings (e.g. maintain
the same settings for Cy3 over the replicates and the same settings for Cy5). In the case in which
no other normalization method is applicable as its assumptions are not satisfied, paired-slides
normalization might be applicable and might offer a valuable alternative. Moreover, even when
other normalization methods are applicable, they might be combined with paired-slides
normalization, if the latter is applicable too.
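The idea behind paired-slides normalization can be sketched in a few lines; see [3] for the actual method and its assumptions. The sketch assumes that each spot's dye bias is the same on the two slides of a dye-swap pair, so that the bias cancels when the two M values are combined. Function names are hypothetical.

```python
def paired_slides_normalize(m_forward, m_swapped):
    """Combine a dye-swap pair spot by spot.
    If M = signal + bias on the first slide and M' = -signal + bias on the
    dye-swapped slide, then (M - M') / 2 recovers the signal."""
    return [(m - ms) / 2 for m, ms in zip(m_forward, m_swapped)]


def estimated_dye_bias(m_forward, m_swapped):
    """Under the same assumption, (M + M') / 2 estimates the bias itself."""
    return [(m + ms) / 2 for m, ms in zip(m_forward, m_swapped)]
```

This is why keeping the scanner settings for Cy3 and for Cy5 fixed across the pair is important: if the bias differs between the two slides, it no longer cancels in the difference.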
4. Conclusion.
It is not possible to produce a set of hard and fast rules, or a simple recipe, for microarray design
and analysis, because some factors are heavily influenced by the fundamental questions
that the researcher hopes to answer. However, it is our hope that the recommendations,
issues and discussion contained in this manuscript will help guide the potential user of the
PancChip in making wise and appropriate choices during the microarray process, from initial
experimental design to final data analysis.
REFERENCES
1. Yang Y.H. and Speed T. (2002), Nature Reviews Genetics 3: 579-588.
2. Yang Y.H., Dudoit S., Luu P., Lin D.M., Peng V., Ngai J., Speed T.P. (2002), Nucleic Acids
Res 30: e15.
3. Yang Y.H., Dudoit S., Luu P., Speed T.P.
[http://www.stat.berkeley.edu/users/terry/zarray/TechReport/589.pdf]