The Use of Multiple Housekeeping Genes for
Normalization of Quantitative RT-PCR Data
Jatinderpreet Singh
Abstract - Due in part to its accuracy and reproducibility,
quantitative real-time polymerase chain reaction (qRT-PCR)
is currently the most widely used technique for
quantifying gene expression (i.e., mRNA) levels. Although
qRT-PCR encompasses a number of positive attributes, the
issue of using a single housekeeping (HK) gene for
normalization purposes continues to raise some concern. In
order to effectively normalize the data of a ‘target’ gene,
the expression of a HK gene should remain constant in any
environment. However, a number of studies have shown
some commonly used HK genes to be differentially
expressed in certain environments, thus resulting in
erroneous data analysis. To avoid this problem, this
paper, like several others, suggests using
multiple HK genes to obtain reliable results. The aim of
this paper is to develop a statistical model that
illustrates not only the problems associated with using a
single HK gene, but also the benefits of using multiple
genes for normalization.
Index Terms – data normalization, gene expression,
housekeeping gene, quantitative real-time PCR
Upon its discovery over 20 years ago, the polymerase
chain reaction (PCR) has revolutionized genetic research
and is considered to be one of the most significant
scientific discoveries of the past 100 years [1]. The PCR
concept of exponentially amplifying minute amounts of a ‘target’
nucleic acid in a step-wise cyclic manner has contributed to
significant discoveries in a broad range of scientific areas,
including biomedical and environmental research [1].
A specific extension of the PCR process that has led to
many of these discoveries is the quantification of the
target nucleic acid as it accumulates, using a DNA-binding
fluorescent molecule, in a process known as quantitative
real-time PCR (qRT-PCR). Since the amount of PCR product generated at
any cycle is directly proportional to the amount of starting
material, this powerful tool can be applied in a number of
different research areas [2]. One of the most important
applications of this technology is the ability to quantify mRNA
to study the gene expression behaviour of different cells.
Since most cellular activities associated with survival and
growth are a direct result of changes in gene expression, the
accurate quantification of these alterations in different
environments is crucial to the understanding of cellular
behaviour and the design of novel therapeutics [3]. Due in
part to its accuracy and reproducibility, qRT-PCR is currently
the most popular experimental technique for quantifying
mRNA levels [4].
Although qRT-PCR has gained this widespread attention in
the scientific community, there are still certain problems
associated with its use that must be addressed [4]. One such
concern continues to be associated with the use of a reference
gene for normalization purposes. When comparing the gene
expression of two samples, it is essential that the amount of
starting material loaded into the machine is equal between
the two samples, as even very small differences in loading may
appear as differential expression when no expression difference
actually exists [5]. To deal with this sample-to-sample
variation, an internal control or housekeeping (HK) gene is
simultaneously amplified with the gene of interest for
normalization purposes. A HK gene is one that codes for a
protein that is crucial to the cell’s survival, and as such, its
expression is assumed to remain constant in any environment [6]. In theory,
since its expression should remain stable, any fluctuation in the
measured level of a HK gene between two samples indicates
sample-to-sample variation (i.e., a difference in the starting
material loaded) that can be corrected for through
normalization.
Although a HK gene should theoretically remain constant in
all environments, numerous studies have shown them to vary
under certain experimental conditions [6]. For example, many
researchers utilize glyceraldehyde 3-phosphate dehydrogenase
and beta-actin as HK genes for their qRT-PCR work, even
though recent research has shown the transcription of both
genes to vary in different experimental settings [5]. If the
expression of a HK gene does vary, the results obtained may
lead to misguided conclusions [7].
Based on an examination of this problem, a number of
authors have suggested using more than one HK gene in order
to obtain reliable results [5]. Even though numerous
studies illustrate HK gene variation, a 1999 study conducted
by Suzuki et al. estimated that more than 90% of all
qRT-PCR studies use only one HK gene [5].
The aim of this paper is to develop a statistical model
illustrating the problem associated with using one HK gene
and to demonstrate the benefits associated with
utilizing multiple genes for normalization.
A. Quantification of qRT-PCR Data
When running a typical qRT-PCR experiment, the data
output is in the form of ‘cycle threshold’ (Ct) values. Ct values
represent the cycle at which fluorescence increases appreciably
above background noise [6]. As the PCR product continues to
double at each subsequent cycle, the fluorescence is monitored
in order to examine how much product has been amplified to
that point [3]. Early on in the PCR process, the number of
transcripts is so small that no significant signal can be
detected, and all that is observed is background noise.
However, once a cycle is reached where the fluorescence is
determined to be significantly higher than the background
noise, this cycle is recorded and is known as the Ct value [3].
The Ct value provides a means of comparing the gene
expression of different samples. The higher the amount
of initial mRNA (i.e., the higher the gene expression), the faster
the fluorescence increases, and thus the lower the Ct value.
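The inverse relationship between starting material and Ct can be illustrated with a minimal Python sketch. It assumes an idealized reaction in which the product multiplies by a fixed efficiency at every cycle and never plateaus; the starting amounts, the threshold, and the function name `cycle_threshold` are hypothetical values chosen only for illustration.

```python
def cycle_threshold(initial_amount, threshold, efficiency=2.0, max_cycles=40):
    """Return the first cycle at which the product exceeds the threshold.

    Idealized model: the product multiplies by `efficiency` every cycle.
    Returns None if the threshold is not reached within `max_cycles`.
    """
    amount = initial_amount
    for cycle in range(1, max_cycles + 1):
        amount *= efficiency
        if amount > threshold:
            return cycle
    return None


# Hypothetical starting amounts (arbitrary units) and detection threshold.
ct_low_expression = cycle_threshold(initial_amount=10, threshold=1e9)
ct_high_expression = cycle_threshold(initial_amount=1000, threshold=1e9)

print("Ct, low starting amount: ", ct_low_expression)
print("Ct, high starting amount:", ct_high_expression)
```

In this sketch the sample that starts with 100 times more template crosses the threshold several cycles earlier, which is the behaviour described above.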
Once the Ct values are collected for a particular run, one can
determine if there is a gene expression difference between the
two samples being compared. A typical qRT-PCR scenario
includes looking at the effects of a certain chemical on the
expression of a specific gene. The statistical model that will be
illustrated in this paper is a good example of this type of study.
Specifically, the effect of triethylene glycol (TEG) on the
differential gene expression of glucosyltransferase B (gtfB), a
gene that helps cariogenic Streptococcus mutans adhere to
teeth, was examined. In one particular experiment, S.
mutans was grown in the presence of either 1 mM (the sample)
or 0 mM of TEG (the control). Once all the mRNA was
isolated from each group of cells, qRT-PCR was used to
determine the Ct values of gtfB in both groups (i.e., the sample
and control). In addition to gtfB, the gene expression of gyrase
A (gyrA), a commonly used HK gene in S. mutans, was also
determined for both the sample and control in order to
normalize the gtfB data [8]. Figure 1 illustrates the
arrangement of this study and also reinforces the concept of a
Ct value. Once all these values were obtained, a difference in
gtfB expression between the control and sample was
determined using the following fold expression difference
(FED) equation:
$$\mathrm{FED} = \frac{(E_{gtfB})^{\Delta C_{t,\,gtfB}\,(\mathrm{Control} - \mathrm{Sample})}}{(E_{gyrA})^{\Delta C_{t,\,gyrA}\,(\mathrm{Control} - \mathrm{Sample})}} \qquad (1)$$
where ΔCt, gtfB (Control – Sample) represents the difference in
Ct values of gtfB between the control and sample, ΔCt, gyrA (Control –
Sample) represents the difference in Ct values of the HK gene,
gyrA, between the control and sample, and EgtfB and EgyrA represent
the PCR efficiencies of gtfB and gyrA respectively. Ideally,
after each PCR cycle the amount of genetic material should
double, thus resulting in a PCR efficiency of 2. However, this
is not always the case, and different primers have E values that
range from 1.6 to 2.1 [6].
Fig. 1. The cycle at which the amplified genetic material rises above the bold
black line (i.e., the threshold) represents the Ct value for each sample. Based
on the above diagram, the Ct values in each case are: gtfB (Control) – 30.3,
gtfB (Sample) – 28.6, gyrA (Sample) – 24.5, gyrA (Control) – 22.9.
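As a worked example of Equation 1, the following Python sketch plugs in the Ct values read from Fig. 1. The PCR efficiencies of gtfB and gyrA are not reported for this run, so both are assumed here to equal the ideal value of 2; the function name `fold_expression_difference` is likewise only illustrative.

```python
def fold_expression_difference(ct_target_control, ct_target_sample,
                               ct_hk_control, ct_hk_sample,
                               e_target=2.0, e_hk=2.0):
    """Fold expression difference (FED) of a target gene, normalized to one HK gene.

    Each exponent is a Ct difference taken as (control - sample).
    The default efficiencies of 2.0 are an assumption (ideal doubling).
    """
    target_term = e_target ** (ct_target_control - ct_target_sample)
    hk_term = e_hk ** (ct_hk_control - ct_hk_sample)
    return target_term / hk_term


# Ct values taken from Fig. 1.
fed = fold_expression_difference(
    ct_target_control=30.3, ct_target_sample=28.6,  # gtfB
    ct_hk_control=22.9, ct_hk_sample=24.5,          # gyrA
)
print(f"FED of gtfB = {fed:.2f}")  # about 9.85 when both efficiencies are 2
```

With these values the gtfB exponent is 1.7 and the gyrA exponent is -1.6, so the normalized result is 2 raised to the power 3.3. Different efficiencies would change the result, which is why the E terms appear explicitly in Equation 1.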
A more general form of Equation 1 is given below:
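$$\mathrm{FED} = \frac{T(X)}{H(X)} = \frac{(E_{target})^{\Delta C_{t,\,target}\,(\mathrm{Control} - \mathrm{Sample})}}{(E_{HK})^{\Delta C_{t,\,HK}\,(\mathrm{Control} - \mathrm{Sample})}} \qquad (2)$$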
The numerator of the above equation, T(X), represents the
un-normalized target gene expression difference between the
sample and control, while the denominator, H(X), acts to
normalize and correct for any sample-to-sample variations as
described earlier. Thus, a FED of one would represent a case
where exposure to TEG has no effect on the
expression of the target gene (i.e., gtfB).
B. Illustration of HK Gene Variability
In order to ensure that a HK gene’s expression remains
constant in the environment of interest, some researchers perform a
qRT-PCR run with two independent HK genes, treating one as the
‘target’ and the other as the HK gene in Equation 2. Theoretically,
the resulting FED should be one, as the
expression of both genes should remain constant under any
environmental condition.
In the case of the S. mutans model alluded to earlier, the
stability of gyrA expression was examined
using another commonly used S. mutans HK gene, 16s
ribosomal RNA. Using Equation 2, gyrA was considered the
‘target’ and 16s was the HK gene. qRT-PCR runs were done
for four different TEG concentrations (0.001, 0.01, 0.1, and 1
mM). All reactions were run in triplicate for every experiment
and reproduced four separate times (i.e., each from a different colony
of S. mutans). The uncertainty of each set of data is
represented by the standard error of the mean, and a Student’s
t-test was performed for each TEG treatment level to test the
null hypothesis that the FED of gyrA is equal to one (i.e., H0:
FED = 1). The alternative hypothesis in this case is that the
FED is significantly different from one, thus indicating
differential gene expression of the HK gene (i.e., H1: FED ≠ 1).
Thus, if p > 0.05, we fail to reject H0: the FED
between the two genes is not significantly different from 1,
which suggests that both gyrA and 16s are likely suitable HK
genes. On the other hand, if p < 0.05, we reject H0 and
conclude that the FED is significantly different from 1,
meaning that the gene expression of gyrA or 16s varies upon
exposure to that concentration of TEG. In that case, using either gyrA or
16s as a HK gene without further research may lead to
inaccurate results.
Results from this statistical analysis are presented in Figure 2.
Fig. 2. The differential gene expression of gyrA upon exposure to varying TEG
concentrations. Based on the statistical analysis above, it appears as if gyrA is
differentially expressed upon exposure to 0.001 mM of TEG, as p<0.05 and the
error bars do not overlap a FED of 1.
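The test of H0: FED = 1 described above can be sketched in Python as a one-sample Student's t-test. The FED values below are hypothetical and are not the measurements behind Fig. 2. The sketch assumes one mean FED per independent experiment (four experiments, hence 3 degrees of freedom) and compares the t statistic with the tabulated two-sided 5% critical value rather than computing an exact p-value.

```python
import math
import statistics

# Two-sided 5% critical value of Student's t for 3 degrees of freedom (n = 4).
T_CRIT_DF3 = 3.182


def one_sample_t(values, mu0=1.0):
    """Return (mean, standard error of the mean, t statistic) for H0: mean == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem, (mean - mu0) / sem


# Hypothetical mean FEDs of gyrA (normalized to 16s) from four experiments.
hypothetical_feds = {
    "unstable example": [1.45, 1.52, 1.38, 1.49],
    "stable example": [0.95, 1.08, 1.02, 0.97],
}

for label, feds in hypothetical_feds.items():
    mean, sem, t_stat = one_sample_t(feds)
    reject = abs(t_stat) > T_CRIT_DF3
    print(f"{label}: mean FED = {mean:.3f} +/- {sem:.3f} (SEM), "
          f"t = {t_stat:.2f}, reject H0 at the 5% level: {reject}")
```

Rejecting H0 in such a test only shows that the ratio of the two HK genes changed; it does not identify which of the two genes is responsible.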
Based on Figure 2, the mean FEDs of gyrA at 0.01, 0.1, and 1
mM are all close to one and have p-values greater than 0.05,
indicating that the use of gyrA as a HK gene in these cases is
suitable. However, at 0.001 mM, p < 0.05 and the error bars
do not overlap a FED of 1, indicating that the FED is
significantly different from 1 and that there is likely
variability in the expression of either gyrA or 16s. Based on
these results alone, neither 16s nor
gyrA can be used as a HK gene for gtfB analysis at 0.001 mM
of TEG, while both appear to be adequate
for normalization purposes at all other concentrations.
This finding is interesting considering what would have
happened if only one HK gene had been used in this study. For
example, if gyrA had been used to normalize gtfB expression at
0.001 mM without checking its stability against 16s under this
condition, the findings of this study could be flawed due to
the possible variability of gyrA expression. The word
‘possible’ is used here because it may instead be 16s, and not
gyrA, that is affected by 0.001 mM of TEG; there is
no way of telling without further studies. In closing, if gyrA
were indeed variable in this case and were used for normalization of
gtfB, the results would be inaccurate and of little value.
C. Benefits of using Multiple HK Genes
Unlike the results illustrated above, the following data were
not experimentally collected, but were instead randomly
generated to develop a statistical model that illustrates
some of the benefits of using multiple HK genes for qRT-PCR.
To keep things consistent with the S. mutans model
illustrated throughout this paper, this analysis will once again
look at the differential gene expression of gtfB upon exposure
to X mM of TEG. However, in this case, four independent HK
genes (G1, G2, G3, and G4) were used to normalize the data
obtained for gtfB.
Recall from Equation 2 that the FED is equal to T(X)/H(X).
Thus, in the case of T(X), 100
samples containing 10 replicates of random numbers from
0-0.5 were generated to represent the un-normalized gene
expression difference of gtfB between the control and sample
(i.e., $T(X) = (E_{gtfB})^{\Delta C_t\,(\mathrm{Control} - \mathrm{Sample})}$). In similar fashion, four
separate data sets (since there are four HK genes) of 100
samples containing 10 replicates of random numbers were
generated to represent H(XG1), H(XG2), H(XG3), and H(XG4).
For H(XG1), H(XG2), and H(XG3), random numbers from 0-1.0
were generated, while in the case of H(XG4), numbers from
0-0.5 were generated. Next, the four different sets of FEDs of
gtfB were calculated for each HK gene, using Equation 3:
$$\mathrm{FED}_n = \frac{T(X)}{H(X_n)} \qquad (3)$$
where n = G1, G2, G3, and G4.
Once the mean FEDs were calculated for each of the 100
samples, histograms illustrating the frequency of these means
were plotted. These graphs are presented in Figures 3a, b, c,
and d.
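The random-number model behind Equation 3 can be sketched in Python as follows. The paper does not state how the 10 replicates within a sample were reduced to a single FED, so this sketch assumes that each sample's FED is the mean of its T(X) replicates divided by the mean of its H(X) replicates. The seed and all function and variable names are hypothetical, and the numbers produced will not reproduce the values reported in Figure 3.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable
N_SAMPLES, N_REPS = 100, 10


def draw(upper):
    """Return N_SAMPLES samples, each with N_REPS uniform draws on [0, upper]."""
    return [[random.uniform(0, upper) for _ in range(N_REPS)]
            for _ in range(N_SAMPLES)]


# T(X): un-normalized target term; H(X_n): one data set per HK gene.
t_data = draw(0.5)
h_data = {"G1": draw(1.0), "G2": draw(1.0), "G3": draw(1.0), "G4": draw(0.5)}


def sample_feds(t_samples, h_samples):
    """FED per sample (assumption: ratio of the replicate means)."""
    return [statistics.mean(t_reps) / statistics.mean(h_reps)
            for t_reps, h_reps in zip(t_samples, h_samples)]


feds = {gene: sample_feds(t_data, h) for gene, h in h_data.items()}
mean_fed = {gene: statistics.mean(values) for gene, values in feds.items()}

for gene, value in mean_fed.items():
    print(f"{gene}: mean FED over {N_SAMPLES} samples = {value:.3f}")
```

Because H(XG4) is drawn from a narrower range than the other three data sets, its mean FED should come out clearly higher than those of G1, G2, and G3, which is the contrast the histograms are meant to show.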
The data for H(XG4) were deliberately drawn from a
different range in order to illustrate the
most obvious advantage of using multiple HK genes. The
population mean estimates (Xbar) for the FEDs of
G1, G2, and G3 (0.442, 0.501, and 0.588) are fairly close
together, while the mean for G4 (1.26) is noticeably different.
This result most likely indicates that G4 is being differentially
expressed due to the environmental change (i.e., the addition of
TEG), and thus is not a good option for a HK gene in this case.
This example also illustrates the inaccurate
conclusions that could have been drawn if G4 had been the only
HK gene used, as a FED of 1.26 would have been taken as
the actual result, when the true value is more likely to
be around 0.50. Thus, using multiple HK genes gives one
a better chance of pinpointing any gene that appears to be
differentially expressed, based on a simple comparison of each
gene’s expression. Based on these results, G4 should be
omitted and not used for normalization of gtfB data.
To further illustrate the power of using multiple HK genes, a
graph was constructed where the data obtained from G1, G2,
and G3 were combined and averaged for normalization
purposes. As mentioned above, since G4 is differentially
expressed, it was omitted and not included in this combination
study. The FEDs for the combination of the 3 HK genes
(3HKG) were calculated using Equation 4:
$$\mathrm{FED}_{3HKG} = \frac{T(X)}{\left[ H(X_{G1}) + H(X_{G2}) + H(X_{G3}) \right] / 3} \qquad (4)$$
Similar to the cases above, a histogram of the mean
FED3HKG’s was constructed and is presented in Figure 3e.
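The combined normalization can be sketched in Python by dividing T(X) by the arithmetic mean of the three HK terms and then summarizing each distribution. Two assumptions are made here that the paper does not spell out: each sample's T(X) and H(X) values are taken as the means of their 10 replicates, and the 95% CI of the mean uses the normal approximation (1.96 standard errors). All names and the seed are hypothetical.

```python
import math
import random
import statistics


def simulate(n_samples=100, n_reps=10, seed=0):
    """Return per-sample FEDs for G1, G2, G3 and for their combination (3HKG)."""
    rng = random.Random(seed)

    def sample_means(upper):
        # One value per sample: the mean of n_reps uniform draws on [0, upper].
        return [statistics.mean([rng.uniform(0, upper) for _ in range(n_reps)])
                for _ in range(n_samples)]

    t = sample_means(0.5)
    h = {gene: sample_means(1.0) for gene in ("G1", "G2", "G3")}
    feds = {gene: [t_i / h_i for t_i, h_i in zip(t, h[gene])] for gene in h}
    # Combined normalization: divide by the average of the three HK terms.
    feds["3HKG"] = [t_i / ((a + b + c) / 3)
                    for t_i, a, b, c in zip(t, h["G1"], h["G2"], h["G3"])]
    return feds


def summarize(values):
    """Mean, median, standard deviation, and approximate 95% CI of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    half_width = 1.96 * sd / math.sqrt(len(values))
    return {"mean": mean, "median": statistics.median(values), "sd": sd,
            "ci95": (mean - half_width, mean + half_width)}


for gene, values in simulate().items():
    s = summarize(values)
    print(f"{gene}: mean={s['mean']:.3f} median={s['median']:.3f} "
          f"sd={s['sd']:.3f} CI95=({s['ci95'][0]:.3f}, {s['ci95'][1]:.3f})")
```

Averaging three independent HK terms lowers the variance of the denominator, so the 3HKG distribution is expected to show a smaller standard deviation and a narrower confidence interval than any single-gene distribution. With only 100 samples the gap can be modest for a given seed; raising `n_samples` makes it easier to see.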
Upon comparing each single HK gene graph to that of the
combined analysis, several beneficial characteristics of the
latter are obvious. Firstly, the combination of the 3 HK genes
resulted in data with reduced dispersion, as is evident from the
narrowing of the histogram distribution and the decrease in
standard deviation. By increasing the number of HK genes used,
any outliers or errors associated with any individual data set
are averaged out in such a way that the overall ‘noise’
decreases, thus resulting in a reduction in data dispersion.
Furthermore, the 3HKG
graph has the smallest 95% confidence interval (CI). Since a
95% CI provides insight into the uncertainty of our estimate of
the population mean, this indicates that an increase in the
number of HK genes decreases the uncertainty of our estimate.
Secondly, the use of multiple HK genes resulted in a curve
that most resembled a normal distribution. Two key features of
a normal curve are its symmetrical bell shape and the
equality of the mean, mode, and median. Based on a
qualitative analysis of the graphs presented, graphs G1 and
3HKG seem to possess shapes most similar to a bell-shaped
curve, as G2 and G3 are clearly asymmetrical and skewed, and
graph G4 seems to have a triangular distribution.
As for the second criterion, the equality of the mean,
mode, and median, no graph perfectly exemplifies this;
however, an analysis of the data shows that the combined
graph meets this characteristic to the greatest degree. Although
in all cases the mode was essentially the same as the
population mean estimate (see Figure 3 for Xbar values), the
median (M) varied slightly in each case, with the 3HKG’s
mean being closest to its median (MG1 = 0.431, MG2 =
0.515, MG3 = 0.599; M3HKG = 0.513). It is
important for a data set to approach a normal distribution
because many statistical tests (e.g., Student’s t-test)
assume that the data set is normal. Thus, the more closely a data set
approaches a normal curve, the greater the accuracy and
applicability of the results obtained from these statistical tests,
which provides increased confidence and better insight into the
nature of one’s results.
In closing, HK genes often vary in different environmental
conditions and may lead to inaccurate conclusions. Using
multiple HK genes not only increases the chances of
identifying and discarding a variable HK gene; combining
them for normalization also decreases the
dispersion in the data and helps increase the accuracy of
various statistical tests, as the data more closely resemble a
normal distribution. Based on the results illustrated here, it is
recommended that multiple HK genes be used in order to
avoid erroneous conclusions.
Fig. 3. The differential expression of gtfB in S. mutans cells upon
exposure to X mM of TEG, using a) G1, b) G2, c) G3, d) G4, and e) 3HKG
(i.e., the average value of G1, G2, and G3) for normalization purposes.
1- Powledge TM, “The Polymerase Chain Reaction”, Adv Physiol Educ., vol.
28, pp 44-50, Dec. 2004
2- Ginzinger DG, “Gene quantification using real-time quantitative PCR: an
emerging technology hits the mainstream”, Exp Hematol., vol. 30, pp 503-12,
June 2002
3- Bustin SA, “Absolute quantification of mRNA using real-time reverse
transcription polymerase chain reaction assays”, J Mol Endocrinol., vol.
25, pp 169-193, Oct 2002
4- Bustin SA and Nolan T, “Pitfalls of quantitative real-time
reverse-transcription polymerase chain reaction”, J Biomol Tech., vol 15,
pp 155-66, Sept. 2004
5- Radonic A, Thulke S, Mackay IM, Landt O, Siegert W, and Nitsche A,
“Guideline to reference gene selection for quantitative real-time PCR”,
Biochem Biophys Res Commun., vol. 313, pp 856-862, Jan 2004.
6- Pfaffl MW, “A-Z of Quantitative PCR- Quantification strategies in
real-time RT-PCR”, July 2003
7- Dheda K, Hugget J, Chang J, Kim L, Bustin S, Johnson M, Rook G, and
Zumla A, “The implications of using an inappropriate reference gene for
real-time reverse transcription PCR data normalization”, Anal. Biochem, vol 34,
pp 141-43, Sept 2005
8- Chaussee M and Watson R, “Identification of RGG-regulated exoproteins of
Streptococcus pyogenes”, Infect Immun. Vol. 69, pp 822-31, 2002