The Use of Multiple Housekeeping Genes for Normalization of Quantitative RT-PCR Data

Jatinderpreet Singh

Abstract - Due in part to its accuracy and reproducibility, quantitative real-time polymerase chain reaction (qRT-PCR) is currently the most widely used technique for quantifying gene expression (i.e., mRNA) levels. Although qRT-PCR has a number of positive attributes, the practice of using a single housekeeping (HK) gene for normalization continues to raise concern. In order to effectively normalize the data of a ‘target’ gene, the expression of a HK gene should remain constant in any environment. However, a number of studies have shown some commonly used HK genes to be differentially expressed in certain environments, resulting in erroneous data analysis. To avoid this problem, this paper, like several others, suggests using multiple HK genes to obtain reliable results. The aim of this paper was to develop a statistical model that illustrates not only the problems associated with using a single HK gene, but also the benefits of using multiple genes for normalization.

Index Terms - data normalization, gene expression, housekeeping gene, quantitative real-time PCR

I. INTRODUCTION

Since its discovery over 20 years ago, the polymerase chain reaction (PCR) has revolutionized genetic research and is considered to be one of the most significant scientific discoveries of the past 100 years [1]. The PCR concept of exponentially amplifying minute amounts of a ‘target’ nucleic acid in a step-wise cyclic manner has contributed to significant discoveries in a broad range of scientific areas, including biomedical and environmental research [1]. A specific extension of the PCR process that has led to many of these discoveries is the quantification of the target nucleic acid as it accumulates, using a DNA-binding fluorescent molecule, in a process known as quantitative real-time PCR (qRT-PCR).
Since the amount of PCR product generated at any cycle is directly proportional to the amount of starting material, this powerful tool can be applied in a number of different research areas [2]. One of the most important applications of this technology is the quantification of mRNA to study the gene expression behaviour of different cells. Since most cellular activities associated with survival and growth are a direct result of changes in gene expression, the accurate quantification of these alterations in different environments is crucial to the understanding of cellular behaviour and the design of novel therapeutics [3]. Due in part to its accuracy and reproducibility, qRT-PCR is currently the most popular experimental technique for quantifying mRNA levels [4].

Although qRT-PCR has gained widespread attention in the scientific community, there are still certain problems associated with its use that must be addressed [4]. One such concern is the use of a reference gene for normalization purposes. When comparing the gene expression of two samples, it is essential that the amount of starting material loaded into the machine is equal for both samples, as even very small loading differences may suggest differential expression where no expression difference actually exists [5]. To deal with this sample-to-sample variation, an internal control or housekeeping (HK) gene is amplified simultaneously with the gene of interest for normalization purposes. A HK gene is one that codes for a protein that is crucial to the cell’s survival, and as such, its expression is assumed to remain constant in any environment [6]. In theory, since its expression should remain stable, any fluctuation in the measured expression of a HK gene between two samples indicates sample-to-sample variation (i.e., a difference in the starting material loaded) that can be corrected for through normalization.
Although a HK gene should theoretically remain constant in all environments, numerous studies have shown HK genes to vary under certain experimental conditions [6]. For example, many researchers utilize glyceraldehyde 3-phosphate dehydrogenase and beta-actin as HK genes for their qRT-PCR work, even though recent research has shown the transcription of both genes to vary in different experimental settings [5]. If the expression of a HK gene does vary, the results obtained may lead to misguided conclusions [7]. Based on an examination of this problem, a number of authors have suggested using more than one HK gene in order to obtain reliable results [5]. Even though there are numerous studies illustrating HK gene variation, a 1999 study by Suzuki et al. estimated that greater than 90% of all qRT-PCR studies use only one HK gene [5]. The aim of this paper is to develop a statistical model illustrating the problem associated with using one HK gene, and to demonstrate the benefits associated with utilizing multiple genes for normalization.

II. STATISTICAL MODELS

A. Quantification of qRT-PCR Data

When running a typical qRT-PCR experiment, the data output is in the form of ‘cycle threshold’ (Ct) values. Ct values represent the cycle at which fluorescence increases appreciably above background noise [6]. As the PCR product continues to double at each subsequent cycle, the fluorescence is monitored in order to examine how much product has been amplified to that point [3]. Early on in the PCR process, the number of transcripts is so small that no significant signal can be detected, and all that is observed is background noise. However, once a cycle is reached where the fluorescence is determined to be significantly higher than the background noise, this cycle is recorded and is known as the Ct value [3]. The Ct value provides a means of comparing the gene expression of different samples. Thus, the higher the amount of initial mRNA (i.e.,
the higher the gene expression), the faster the fluorescence increases, and the lower the Ct value. Once the Ct values are collected for a particular run, one can determine whether there is a gene expression difference between the two samples being compared.

A typical qRT-PCR scenario involves looking at the effects of a certain chemical on the expression of a specific gene. The statistical model illustrated in this paper is a good example of this type of study. Specifically, the effect of triethylene glycol (TEG) on the differential gene expression of glucosyltransferase B (gtfB), a gene that helps cariogenic Streptococcus mutans adhere to teeth, was examined. In one particular experiment, S. mutans was grown in the presence of 1 mM of TEG (the sample) and 0 mM of TEG (the control). Once all the mRNA was isolated from each group of cells, qRT-PCR was used to determine the Ct values of gtfB in both groups (i.e., the sample and control). In addition to gtfB, the gene expression of gyrase A (gyrA), a commonly used HK gene in S. mutans, was also determined for both the sample and control in order to normalize the gtfB data [8]. Figure 1 illustrates the arrangement of this study and also reinforces the concept of a Ct value. Once all these values were obtained, a difference in gtfB expression between the control and sample was determined using the following fold expression difference (FED) equation:

$$\mathrm{FED} = \frac{E_{gtfB}^{\,\Delta C_{t,gtfB}(\mathrm{Control} - \mathrm{Sample})}}{E_{gyrA}^{\,\Delta C_{t,gyrA}(\mathrm{Control} - \mathrm{Sample})}} \tag{1}$$

where $\Delta C_{t,gtfB}(\mathrm{Control} - \mathrm{Sample})$ represents the difference in Ct values of gtfB between the control and sample, $\Delta C_{t,gyrA}(\mathrm{Control} - \mathrm{Sample})$ represents the difference in Ct values of the HK gene, gyrA, between the control and sample, and $E_{gtfB}$ and $E_{gyrA}$ represent the PCR efficiencies of gtfB and gyrA respectively. Ideally, after each PCR cycle the amount of genetic material should double, resulting in a PCR efficiency of 2.
However, this is not always the case, and different primers have E values that vary from 1.6-2.1 [6].

Fig. 1. The cycle at which the amplified genetic material rises above the bold black line (i.e., the threshold) represents the Ct value for each sample. Based on the above diagram, the Ct values in each case are: gtfB (Control) – 30.3, gtfB (Sample) – 28.6, gyrA (Sample) – 24.5, gyrA (Control) – 22.9.

A more general form of Equation 1 is given below:

$$\mathrm{FED} = \frac{E_{target}^{\,\Delta C_{t,target}(\mathrm{Control} - \mathrm{Sample})}}{E_{HK}^{\,\Delta C_{t,HK}(\mathrm{Control} - \mathrm{Sample})}} = \frac{T(X)}{H(X)} \tag{2}$$

The numerator of the above equation, T(X), represents the un-normalized target gene expression difference between the sample and control, while the denominator, H(X), acts to normalize and correct for any sample-to-sample variations as described earlier. Thus, a FED of one would represent a case where the TEG concentration applied has no effect on the expression of the target gene (i.e., gtfB).

B. Illustration of HK Gene Variability

In order to ensure that a HK gene’s expression remains constant in the environment of interest, some researchers perform a qRT-PCR run with two independent HK genes, treating one as the ‘target’ and the other as the HK gene. Theoretically, the FED calculated from Equation 2 should then be one, as the expression of both genes should remain constant under any environmental condition. In the case of the S. mutans model alluded to earlier, the constancy of gyrA expression was examined using another commonly used S. mutans HK gene, 16s ribosomal RNA. Using Equation 2, gyrA was considered the ‘target’ and 16s was the HK gene. qRT-PCR runs were done for four different TEG concentrations (0.001, 0.01, 0.1, and 1 mM). All reactions were run in triplicate for every experiment and reproduced four separate times (i.e., each time from a different colony of S. mutans).
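As a minimal sketch of how Equation 2 can be evaluated, the snippet below plugs in the Ct values listed in the Fig. 1 caption. The function name `fold_expression_difference` and its parameters are illustrative choices, and both PCR efficiencies are assumed to equal the ideal value of 2, which the source does not state for these primers.

```python
def fold_expression_difference(ct_target_control, ct_target_sample,
                               ct_hk_control, ct_hk_sample,
                               e_target=2.0, e_hk=2.0):
    """Evaluate Equation 2: FED = T(X) / H(X).

    e_target and e_hk default to 2.0 (perfect doubling per cycle); this is
    an assumption, since real primer efficiencies vary.
    """
    # T(X): un-normalized expression difference of the target gene
    t_x = e_target ** (ct_target_control - ct_target_sample)
    # H(X): expression difference of the housekeeping gene
    h_x = e_hk ** (ct_hk_control - ct_hk_sample)
    return t_x / h_x


# Ct values read from the Fig. 1 caption (gtfB as target, gyrA as HK gene).
fed = fold_expression_difference(
    ct_target_control=30.3, ct_target_sample=28.6,
    ct_hk_control=22.9, ct_hk_sample=24.5,
)
print(f"FED of gtfB normalized to gyrA: {fed:.2f}")
```

The same function could be reused for the stability check described in this section by passing gyrA Ct values as the target and 16s Ct values as the HK gene; a result close to 1 would then be expected if both genes are stable.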
The uncertainty of each set of data is represented by the standard error of the mean, and a Student’s t-test was performed for each TEG treatment level to test the null hypothesis that the FED of gyrA is equal to one (i.e., H0: FED = 1). The alternative hypothesis in this case is that the FED is significantly different from one, indicating differential gene expression of the HK gene (i.e., H1: FED does not equal 1). Thus, if p > 0.05, we fail to reject H0: the FED between the two genes is not significantly different from 1, suggesting that both gyrA and 16s are likely suitable HK genes. On the other hand, if p < 0.05, we reject H0 and conclude that the FED is significantly different from 1, meaning that the gene expression of gyrA or 16s varies upon exposure to that concentration of TEG. In that case, using either gyrA or 16s as a HK gene without further research may lead to inaccurate results.

Results from this statistical analysis are presented in Figure 2.

Fig. 2. The differential gene expression of gyrA upon exposure to varying TEG concentrations. Based on the statistical analysis above, it appears as if gyrA is differentially expressed upon exposure to 0.001 mM of TEG, as p < 0.05 and the error bars do not overlap a FED of 1.

Based on Figure 2, the mean FEDs of gyrA at 0.01, 0.1, and 1 mM are all close to one and have p-values greater than 0.05, indicating that the use of gyrA as a HK gene in these cases is suitable. However, at 0.001 mM, p < 0.05 and the error bars do not overlap a FED of 1, indicating that there is likely variability in the expression of either gyrA or 16s, and that the FED is significantly different from 1. Based on these results alone, neither 16s nor gyrA can be used as a HK gene for gtfB analysis at 0.001 mM of TEG, while both appear to be adequate for normalization at all other concentrations.
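The hypothesis test described above can be sketched as a one-sample t-test of the per-run FED values against 1. The FED values below are hypothetical placeholders, not the measured data, and the function name `one_sample_t` is an illustrative choice. To stay within the standard library, the decision is made by comparing |t| with the tabulated two-sided critical value for 3 degrees of freedom (four independent runs) rather than by computing a p-value.

```python
import math
import statistics

# Two-sided critical value of Student's t for alpha = 0.05 and 3 degrees of
# freedom (four independent runs), taken from a standard t table.
T_CRIT_DF3 = 3.182


def one_sample_t(values, mu0=1.0):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == mu0."""
    n = len(values)
    mean = statistics.fmean(values)
    # Standard error of the mean; undefined (division by zero below) if all
    # values are identical.
    sem = statistics.stdev(values) / math.sqrt(n)
    return (mean - mu0) / sem, n - 1


# HYPOTHETICAL mean FEDs of gyrA (normalized to 16s) from four independent runs.
fed_runs = [0.50, 0.60, 0.55, 0.45]

t_stat, df = one_sample_t(fed_runs, mu0=1.0)
reject_h0 = abs(t_stat) > T_CRIT_DF3  # True means FED differs significantly from 1
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

If SciPy is available, `scipy.stats.ttest_1samp(fed_runs, popmean=1.0)` should give the same t statistic together with a p-value that can be compared directly with 0.05.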
This finding is interesting considering what would have happened had only one HK gene been used in this study. For example, if gyrA had been used to normalize gtfB expression at 0.001 mM without checking its stability against 16s under this condition, the findings of this study could be flawed due to the possible variability of gyrA expression. The word ‘possible’ is used here because it is also possible that 16s, and not gyrA, is affected by 0.001 mM of TEG; however, there is no way of telling without further studies. In closing, if gyrA was indeed variable in this case and was used for normalization of gtfB, the results would be inaccurate and uninformative.

C. Benefits of using Multiple HK Genes

Unlike the results illustrated above, the following data were not experimentally collected, but were instead randomly generated to develop a statistical model illustrating some of the benefits of using multiple HK genes for qRT-PCR analysis. To keep things consistent with the S. mutans model used throughout this paper, this analysis once again looks at the differential gene expression of gtfB upon exposure to X mM of TEG. However, in this case, four independent HK genes (G1, G2, G3, and G4) were used to normalize the data obtained for gtfB. Recall from Equation 2 that the FED is equal to T(X)/H(X). For T(X), 100 samples containing 10 replicates of random numbers from 0-0.5 were generated to represent the un-normalized gene expression difference of gtfB between the control and sample (i.e., $T(X) = E_{gtfB}^{\,\Delta C_{t,gtfB}(\mathrm{Control} - \mathrm{Sample})}$). In similar fashion, four separate data sets (since there are four HK genes) of 100 samples containing 10 replicates of random numbers were generated to represent H(XG1), H(XG2), H(XG3), and H(XG4). For H(XG1), H(XG2), and H(XG3), random numbers from 0-1.0 were generated, while in the case of H(XG4), numbers from 0-0.5 were generated.
Next, four different sets of FEDs of gtfB were calculated, one for each HK gene, using Equation 3:

$$\mathrm{FED}_n = \frac{T(X)}{H(X_n)} \tag{3}$$

where n = G1, G2, G3, and G4.

Once the mean FEDs were calculated for each of the 100 samples, histograms illustrating the frequency of these means were plotted. These graphs are presented in Figures 3a, b, c, and d. The data for H(XG4) were deliberately drawn from a different range in order to illustrate the most obvious advantage of using multiple HK genes. The population mean estimates (Xbar) for the FEDs of G1, G2, and G3 (0.442, 0.501, and 0.588) are fairly close together, while the mean for G4 (1.26) is noticeably different. This result most likely indicates that G4 is being differentially expressed due to the environmental change (i.e., the addition of TEG), and thus is not a good option for a HK gene in this case. This example also illustrates the inaccurate conclusions that could have been drawn if the only HK gene used was G4, as a FED of 1.26 would have been assumed to be close to the actual result, when it is more likely to be around 0.50. Thus, using multiple HK genes gives one a better chance of pinpointing any gene that appears to be differentially expressed, based on a simple comparison of each gene’s expression. Based on these results, G4 should be omitted and should not be used for normalization of gtfB data.

To further illustrate the power of using multiple HK genes, a graph was constructed in which the data obtained from G1, G2, and G3 were combined and averaged for normalization purposes. As mentioned above, since G4 is differentially expressed, it was not included in this combination study. The FEDs for the combination of the 3 HK genes (3HKG) were calculated using Equation 4:

$$\mathrm{FED}_{3HKG} = \frac{T(X)}{\frac{1}{3}\left[H(X_{G1}) + H(X_{G2}) + H(X_{G3})\right]} \tag{4}$$

Similar to the cases above, a histogram of the mean FED3HKG values was constructed and is presented in Figure 3e.
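The simulation behind Equations 3 and 4 can be sketched as follows. This is not the author's original procedure: the source does not say how the 10 replicates were combined into a per-sample FED, so this sketch assumes each sample's FED is the ratio of the replicate mean of T(X) to the replicate mean of H(X), which avoids dividing by individual values close to zero. The function name `simulate_fed`, the seed, and the use of uniform random numbers are likewise assumptions.

```python
import random
import statistics


def simulate_fed(n_samples=100, n_reps=10, seed=0):
    """Simulate per-sample FEDs for G1-G4 (Equation 3) and 3HKG (Equation 4)."""
    rng = random.Random(seed)
    # Upper limits of the random H(X) values; G4 deliberately differs.
    hk_upper = {"G1": 1.0, "G2": 1.0, "G3": 1.0, "G4": 0.5}
    feds = {name: [] for name in [*hk_upper, "3HKG"]}

    for _ in range(n_samples):
        # ASSUMPTION: replicates are averaged before the ratio is taken.
        t_mean = statistics.fmean([rng.uniform(0.0, 0.5) for _ in range(n_reps)])
        h_mean = {
            name: statistics.fmean([rng.uniform(0.0, upper) for _ in range(n_reps)])
            for name, upper in hk_upper.items()
        }
        for name in hk_upper:
            feds[name].append(t_mean / h_mean[name])          # Equation 3
        h_combined = (h_mean["G1"] + h_mean["G2"] + h_mean["G3"]) / 3
        feds["3HKG"].append(t_mean / h_combined)              # Equation 4
    return feds


if __name__ == "__main__":
    for name, values in simulate_fed().items():
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)
        # Normal-approximation 95% CI half-width for the mean FED.
        half_width = 1.96 * sd / len(values) ** 0.5
        print(f"{name}: mean={mean:.3f} sd={sd:.3f} 95% CI=+/-{half_width:.3f}")
```

Under these assumptions the G4 mean should sit well above the others, and the 3HKG standard deviation should tend to be smaller than that of any single gene, because averaging three denominators reduces the noise contributed by H(X). The exact numbers depend on the seed and will not reproduce the values reported in Figure 3.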
Upon comparing each single HK gene graph to that of the combined analysis, several beneficial characteristics of the latter are apparent. Firstly, the combination of the 3 HK genes resulted in data with reduced dispersion, as is evident from the narrowing of the histogram distribution and the decrease in standard deviation. By increasing the number of HK genes used, any outliers or errors associated with an individual data set are averaged out in such a way that the overall ‘noise’ decreases, resulting in a reduction in data dispersion. Furthermore, the 3HKG graph has the smallest 95% confidence interval (CI). Since a 95% CI provides insight into the uncertainty of our estimate of the population mean, this indicates that an increase in the number of HK genes decreases the uncertainty of our estimate.

Secondly, the use of multiple HK genes resulted in a curve that most resembled a normal distribution. Two key features of a normal curve are its symmetrical bell shape and the equality of the mean, mode, and median. Based on a qualitative analysis of the graphs presented, graphs G1 and 3HKG seem to possess shapes most similar to a bell-shaped curve, as G2 and G3 are clearly asymmetrical and skewed, and graph G4 seems to have a triangular distribution. As for the second criterion, the equality of the mean, mode, and median, no graph satisfies it perfectly; however, an analysis of the data shows that the combined graph meets this characteristic to the greatest degree. Although in all cases the mode was essentially the same as the population mean estimate (see Figure 3 for Xbar values), the median (M) varied slightly in each case, with the 3HKG’s mean being closest to its median (i.e., MG1 = 0.431, MG2 = 0.515, MG3 = 0.599; M3HKG = 0.513). It is important for a data set to approach a normal distribution because many statistical tests (e.g., Student’s t-test) assume that the data set is normal. Thus, the closer a data set approaches a normal curve, the greater the accuracy and applicability of the results obtained from these statistical tests, providing increased confidence and better insight into the nature of one’s results.

Fig. 3. The differential expression of gtfB in S. mutans cells upon exposure to X mM of TEG, using a) G1, b) G2, c) G3, d) G4, and e) 3HKG (i.e., the average value of G1, G2, and G3) for normalization purposes.

In closing, HK genes often vary in different environmental conditions and may lead to inaccurate conclusions. Using multiple HK genes not only increases the chances of picking out and disregarding a variable HK gene; combining them for normalization purposes also decreases the dispersion in the data and helps increase the accuracy of various statistical tests, as the data more closely resemble a normal distribution. Based on the results illustrated here, it is recommended that multiple HK genes be used in order to avoid erroneous conclusions.

REFERENCES

1- Powledge TM, “The Polymerase Chain Reaction”, Adv Physiol Educ., vol. 28, pp 44-50, Dec. 2004
2- Ginzinger DG, “Gene quantification using real-time quantitative PCR: an emerging technology hits the mainstream”, Exp Hematol., vol. 30, pp 503-12, June 2002
3- Bustin SA, “Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays”, J Mol Endocrinol., vol. 25, pp 169-193, Oct 2002
4- Bustin SA and Nolan T, “Pitfalls of quantitative real-time reverse-transcription polymerase chain reaction”, J Biomol Tech., vol. 15, pp 155-66, Sept. 2004
5- Radonic A, Thulke S, Mackay IM, Landt O, Siegert W, and Nitsche A, “Guideline to reference gene selection for quantitative real-time PCR”, Biochem Biophys Res Commun., vol. 313, pp 856-862, Jan 2004
6- Pfaffl MW, “A-Z of Quantitative PCR - Quantification strategies in real-time RT-PCR”, July 2003
7- Dheda K, Huggett J, Chang J, Kim L, Bustin S, Johnson M, Rook G, and Zumla A, “The implications of using an inappropriate reference gene for real-time reverse transcription PCR data normalization”, Anal. Biochem., vol. 34, pp 141-43, Sept 2005
8- Chaussee M and Watson R, “Identification of RGG-regulated exoproteins of Streptococcus pyogenes”, Infect Immun., vol. 69, pp 822-31, 2002