Oligonucleotide probe design and microarray construction

Supplementary text for: Zhigang Zhang, Ninad D. Pendse¶, Katherine N. Phillips, James B. Cotner, Arkady Khodursky. Gene expression patterns of sulfur deprivation in Synechocystis sp. PCC 6803. BMC Genomics, 2008.

To construct an oligonucleotide-based microarray of the Synechocystis protein coding genes, we chose an algorithm that allowed selection of representative long oligonucleotides with high specificity and consistency of hybridization across the set. In the context of microarray design, specificity refers to the inability of the probe to bind strongly to non-target sequences during hybridization and washing. High specificity can be achieved by avoiding probes with excessive sequence similarity to a non-target sequence that might be present in a complex pool of cellular targets. Hybridization consistency depends on the uniformity of probe melting temperatures across the entire collection of probes, which can be estimated by computational prediction methods [1].

A freely available program, ArrayOligoSelector [2], developed by Bozdech et al. [3], was used to design gene-specific oligonucleotide probes for the entire Synechocystis genome. This program had previously been used successfully to design genome-wide array elements for the human malarial parasitic protozoan Plasmodium falciparum [3], the GC-rich opportunistic pathogen Burkholderia cenocepacia [4] and the butanol biofuel producer Clostridium acetobutylicum [5]. Because signal intensity has been shown to correlate strongly with oligo length [6], 70-nt-long oligomers were designed for the Synechocystis genome in order to detect genes expressed at low levels. Frequently, one probe per ORF may be sufficient to detect changes in the abundance of a transcript [4, 6, 7]. Thus, one 70-mer per gene was designed to reduce the cost.
Oligomers for every ORF were selected on the basis of uniqueness within the genome (reduction of cross-hybridization potential), avoidance of significant self-binding (secondary structure), exclusion of low-complexity sequence (minimizing non-specific hybridization), and balanced base composition (consistency in melting temperature). The collection of all possible 70-mer oligonucleotides contained in the coding region of an ORF was used for oligonucleotide selection, which was executed by applying filters for uniqueness, self-binding and complexity in parallel. The intersection of the sets of oligonucleotides passing these filters was then screened for the desired GC content. We selected oligonucleotides that ranked among the top 5% of unique or almost unique 70-mers in the entire ORF and that were within 5 kcal/mol of the best candidate for the ORF. Next, the top-scoring 33% of 70-mers were passed by the self-binding and low-complexity filters. The GC content cut-off was set to 50%. These filters were applied to the entire set of candidate 70-mers to obtain individual outputs. If no common oligomers were identified, the self-binding and complexity filters were relaxed until an intersection appeared. The sequences of the oligo-probes are available in a supplementary file [see Additional file 5].

Following computational design and selection of the probes, we constructed the microarray platform in two stages. In the first stage, we produced a 171-probe pilot array and assessed the specificity of the spotted probes, along with negative control spots, by single-channel or dual-channel genomic DNA hybridizations. In the second stage, we produced an oligo-array with genome-wide coverage (representing 3064 out of 3168 protein coding genes) and evaluated its performance based on its ability to detect well-established transcriptional responses to environmental stress by two-color hybridization, using the hybridization conditions optimized in the first stage.
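The filter-and-intersect selection described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the ArrayOligoSelector implementation: all function names are hypothetical, the three scoring functions are supplied by the caller (lower score is taken to mean better), the 5 kcal/mol binding-energy criterion is omitted, and the 50% GC cut-off is interpreted here as a target value. The window length (70), the top 5% uniqueness fraction and the 33% starting fraction come from the text.

```python
import random
from typing import Callable, List, Optional, Set

Score = Callable[[str], float]  # assumption: lower score = better candidate


def candidate_windows(orf: str, k: int = 70) -> List[str]:
    """All k-mers contained in the ORF coding sequence (sliding window)."""
    return [orf[i:i + k] for i in range(len(orf) - k + 1)]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_run(seq: str) -> int:
    """Toy stand-in for a complexity score: longest homopolymer run."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def top_fraction(items: List[str], score: Score, fraction: float) -> Set[int]:
    """Indices of the best-scoring `fraction` of items (at least one)."""
    n_keep = max(1, int(len(items) * fraction))
    ranked = sorted(range(len(items)), key=lambda i: score(items[i]))
    return set(ranked[:n_keep])


def select_probe(orf: str, uniqueness: Score, self_binding: Score,
                 complexity: Score, k: int = 70, target_gc: float = 0.50,
                 unique_frac: float = 0.05, start_frac: float = 0.33,
                 relax_step: float = 0.10) -> Optional[str]:
    """Pick one probe: intersect the filter outputs, relaxing if necessary."""
    windows = candidate_windows(orf, k)
    if not windows:
        return None  # ORF shorter than the probe length
    unique = top_fraction(windows, uniqueness, unique_frac)
    frac = start_frac
    while True:
        passed = (unique
                  & top_fraction(windows, self_binding, frac)
                  & top_fraction(windows, complexity, frac))
        if passed or frac >= 1.0:
            break
        # No common oligomer: relax the self-binding and complexity filters.
        frac = min(1.0, frac + relax_step)
    # Among the survivors, take the window closest to the target GC content.
    best = min(passed, key=lambda i: (abs(gc_fraction(windows[i]) - target_gc), i))
    return windows[best]


if __name__ == "__main__":
    rng = random.Random(0)
    toy_orf = "".join(rng.choice("ACGT") for _ in range(300))
    # Placeholder scorers; real scores would come from BLAST and folding tools.
    print(select_probe(toy_orf, lambda s: 0.0, longest_run, longest_run))
```

Relaxing only the self-binding and complexity fractions, while holding the uniqueness set fixed, mirrors the order of priorities stated in the text: uniqueness is never traded away to obtain an intersection.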
Microarray design stage 1: Pilot array design

In stage one, 158 oligonucleotide probes (about 5% of the genome-wide set) representing well-annotated Synechocystis genes were designed. Thirteen Escherichia coli K-12 MG1655 oligos were also designed as negative controls. The E. coli genes used for designing control probes were selected by pairwise BLAST of all the E. coli ORFs against all Synechocystis genes. With an E-value cut-off of 10, thirteen E. coli ORFs (b0215, b0273, b0605, b0814, b1078, b1732, b1887, b2798, b2827, b3701, b3744, b4024 and b4101), each of which had a match in Synechocystis sequences of at most 19 nt, were selected for designing 70-mer oligos. These 171 oligomers were spotted on glass slides in duplicate to evaluate their specificity by hybridization with either Synechocystis or E. coli genomic DNA (gDNA). Initially, an E. coli cDNA array hybridization protocol was used [8], but there was significant non-specific cross-hybridization, as reflected in the ratio of average intensities of target to non-target probes (data not shown). Hence, the protocol was optimized by varying one parameter at a time (salt concentration, hybridization temperature or blocking solution) to minimize cross-hybridization, along with a slide pre-hybridization step, as described in Methods. This achieved a significant reduction in cross-hybridization: the specificity ratio rose from 1.5 to 8-10.

Table 1 Statistical tests of specificity of hybridization for pilot oligonucleotide arrays of Synechocystis.

(A) Two class t-test.

                E. coli DNA                                 Synechocystis DNA
Array ID        s-23       s-27       s-29       Average    s-24       s-26       s-28       Average
T statistic     4.44       5.29       6.605      6.308      -12.79     -15.1659   -9.399     -15.2287
P-value         4.07E-03   5.91E-05   6.05E-06   1.04E-05   4.23E-29   6.01E-34   5.58E-15   5.06E-31

(B) Mann-Whitney U test.

Organism        E. coli DNA   Synechocystis DNA
Z statistic     -4.12         -3.76
P-value         <0.0002       <0.0002

(C) Two class t-test on data generated by random sampling.

Organism        E. coli       Synechocystis
T statistic     -23.74        29.35
P-value         1.72E-49      2.33E-50

Subsequent to optimization of the hybridization protocol, we used several statistical tests to assess the performance of the designed oligos in gDNA hybridization experiments. The purpose was to determine whether the intensity of hybridization of target probes was significantly higher than that of non-target probes. First, 6 single-channel genomic DNA hybridizations were performed, 3 with fluorescently labelled E. coli DNA and 3 with fluorescently labelled Synechocystis DNA. Spots with signal-to-noise ratios less than two standard deviations away from the background mean were excluded from the analysis. The resulting intensities were subjected to a parametric two-class t-test (Table 1A) and a non-parametric Mann-Whitney U test (Table 1B). We concluded from both tests that the intensity of hybridization of target oligo-probes was significantly higher than that of non-target probes.

Because there was an order of magnitude difference in sample size between target and non-target oligos, 25 random samplings were performed to create two populations of equal sample size from the existing populations. The two-class t-test was carried out on this dataset based on the rank of oligo intensity (Table 1C). The average ranks of the two populations were different (both P-values were less than 1E-48); hence, the signals obtained by annealing to the probes from one population had a significantly higher intensity than those from the other. Figure 1 shows the distribution of mean ranks of intensities from a randomly sampled dataset for E. coli (A) and Synechocystis (B) DNA hybridization. The distributions of average ranks did not overlap; therefore, the oligos had high selectivity.

Finally, two-channel comparative hybridizations (one channel for E. coli DNA and the other for Synechocystis DNA) were done and the log2(ratio) for each oligo was calculated.
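The equal-size resampling used for the rank-based comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function names are hypothetical, ties are ignored when ranking, only a Welch t statistic is computed (no P-value), and because the text does not specify how the 25 samplings were pooled, the sketch simply compares the per-draw mean ranks of the two groups.

```python
import random
from statistics import mean, variance
from typing import List, Sequence, Tuple


def mean_ranks(group_a: Sequence[float],
               group_b: Sequence[float]) -> Tuple[float, float]:
    """Rank the pooled intensities (1 = lowest) and return each group's mean rank."""
    pooled = sorted([(x, "a") for x in group_a] + [(x, "b") for x in group_b])
    ranks = {"a": [], "b": []}
    for rank, (_, label) in enumerate(pooled, start=1):
        ranks[label].append(rank)
    return mean(ranks["a"]), mean(ranks["b"])


def balanced_mean_ranks(target: List[float], non_target: List[float],
                        n_draws: int = 25,
                        seed: int = 0) -> Tuple[List[float], List[float]]:
    """Repeatedly subsample both groups to the smaller size and rank them."""
    rng = random.Random(seed)
    n = min(len(target), len(non_target))
    target_means, non_target_means = [], []
    for _ in range(n_draws):
        t_mean, nt_mean = mean_ranks(rng.sample(target, n),
                                     rng.sample(non_target, n))
        target_means.append(t_mean)
        non_target_means.append(nt_mean)
    return target_means, non_target_means


def welch_t(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sample t statistic without assuming equal variances."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    if se == 0:
        raise ValueError("no variation across draws; t is undefined")
    return (mean(x) - mean(y)) / se


if __name__ == "__main__":
    # Simulated intensities only; group sizes follow the pilot array (158 vs 13).
    rng = random.Random(1)
    target = [rng.gauss(1000, 300) for _ in range(158)]
    non_target = [rng.gauss(600, 300) for _ in range(13)]
    t_means, nt_means = balanced_mean_ranks(target, non_target)
    print(welch_t(t_means, nt_means))
```

Subsampling to the smaller group size keeps the two mean ranks on the same scale: with n probes per group, the two mean ranks in every draw sum to 2n + 1, so a shift in one is mirrored in the other.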
In the two-channel comparison (Figure 1C), there was a clear segregation between E. coli and Synechocystis oligo-probes. Based on these results, the oligo design was deemed to be specific, with no or minimal cross-hybridization.

Figure 1 Pilot array design to assess probe specificity. Distribution of mean ranks from randomly generated datasets for single-channel hybridizations with E. coli (A) and Synechocystis (B) genomic DNA. (C) Two-channel DNA hybridization demonstrates clear separation of the distributions of signals coming from E. coli and Synechocystis probes based on log2(ratio).

Microarray design stage 2: Genome-wide array construction

Because the sub-genome arrays performed well in our pilot experiments, we concluded that the oligo selection criteria could be used for a large-scale design. We attempted to design 70-mers for all ORFs in the Synechocystis genome. Owing to a significant amount of sequence similarity between some ORFs, we were able to select representative oligonucleotides for 3064 out of 3168 protein coding genes (96.7% coverage). The excluded ORFs (104 in total) corresponded to duplicate genes, genes coding for transposases and some other hypothetical proteins [see Additional file 5]. A stringent criterion of no more than 30% identity with the non-target sequence was applied to all oligos. Oligo-probes were also designed for 3 rRNA genes in Synechocystis. The spots were laid out so that each of the 16 blocks of the microarray, spotted using an in-house robotic printer, contained 4 control spots (3 positive and 1 negative). A representative genome-wide microarray image is shown in Figure 2.

Figure 2 Genome-wide Synechocystis oligonucleotide (70-mer) DNA microarray. The artificial 16-bit TIFF image of a representative two-color hybridization.

Next, we assessed the performance of the genome-wide array by examining well-known transcriptional responses.
First, cells from the mid-exponential growth phase were heat shocked for 1 hour by increasing the incubator temperature from 26°C to 40°C. A temperature up-shift usually triggers a heat-shock response [9], which can be characterized, at least in part, by transcriptional induction of genes encoding molecular chaperones and proteases. By examining transcriptional responses in 3 independent biological cultures that were shifted to the higher temperature, we identified five known molecular chaperone genes (groEL, groEL-2, groES, hspA and htpG) among the top 20 induced genes (mean fold change > 1.5, t-test P < 0.05). Induction of those transcripts has also been reported in studies by other groups that used cDNA microarrays [10-12].

Second, we tested the array under salt stress by subjecting a bacterial culture for 35 min to NaCl at a final concentration of 0.5 M. The top 100 induced genes (mean fold change > 1.5, P < 0.05) included genes encoding heat shock proteins (groEL-2, hspA), proteases (htrA, clpB, ctpB, ftsH), glucosylglycerol synthetase (ggpS), ribosomal proteins (rps21, rpl3), a sigma factor (rpoD), and a high-light-inducible protein (hliA). All of them have been reported to be inducible by high salt. Eleven genes reported in common by three different studies [13-15] were also identified in this study. Because the genome-wide original intensity data reported in the literature are not accessible, more systematic comparisons (overall scatter plot correlation between platforms, overlap between the differentially expressed genes identified by both platforms, etc.) cannot be performed.

We also compared the magnitude of change of the top-ranked differentially expressed genes common to this study and the literature reports, as shown in Table 2. Literature reports using the cDNA microarray platform consistently produced larger values of transcriptional change, from 2 to 30 fold higher than our custom-designed long oligonucleotide microarray.
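As a rough check on this comparison, the salt-stress fold changes listed in Table 2B can be compared directly. The sketch below is illustrative only: it uses the "This study" and Kanesaki et al 2002 values from Table 2B, the function names are hypothetical, and the rank correlation is a plain Spearman coefficient that assumes no tied values (true for these two series).

```python
from typing import List, Sequence

# Fold changes from Table 2B, in the order the loci are listed there.
THIS_STUDY = [13.3, 7.3, 5.1, 3.7, 3.3, 3.0, 2.7, 2.4, 2.0, 1.9, 1.8]
KANESAKI_2002 = [52.7, 93.8, 56.2, 40.0, 10.7, 9.4, 13.4, 20.3, 16.0, 5.0, 11.8]


def ranks_descending(values: Sequence[float]) -> List[int]:
    """Rank 1 for the largest value; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((a - b) ** 2
                    for a, b in zip(ranks_descending(x), ranks_descending(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def fold_ratios(literature: Sequence[float], own: Sequence[float]) -> List[float]:
    """How many times larger each literature fold change is than ours."""
    return [lit / mine for lit, mine in zip(literature, own)]


if __name__ == "__main__":
    ratios = fold_ratios(KANESAKI_2002, THIS_STUDY)
    print(f"ratio range: {min(ratios):.1f} to {max(ratios):.1f}")  # 2.6 to 12.8
    print(f"Spearman rho: {spearman_rho(THIS_STUDY, KANESAKI_2002):.2f}")  # 0.70
```

For this pair of studies the literature values are roughly 2.6 to 12.8 times larger, while the ordering of the eleven genes is positively correlated (rho = 0.70), so the two platforms differ much more in magnitude than in rank.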
The experimental conditions did not differ substantially between this study and the literature reports. The differences in magnitude can be attributed mainly to differences in array cross-hybridization and data processing. cDNA microarrays use longer, double-stranded DNA fragments, which are more prone to non-specific hybridization and can cross-hybridize more easily to similar sequences. This may result in a higher background signal than that of oligonucleotide microarrays. In addition, those literature reports used local background-subtracted intensities for data normalization. However, there is no theoretical basis for background subtraction. Moreover, this procedure may remove from follow-up analysis some spots whose intensities are lower than the background, including those of significantly differentially expressed genes. Based on our own experience with cDNA microarray data analysis, we prefer to use non-background-subtracted intensity data for microarray data preprocessing. Although the magnitude of a particular expression ratio can differ significantly between our oligonucleotide microarray and the reported cDNA microarray platforms, the relative expression, in terms of rank or "direction" of expression change, appears to be well correlated (Table 2).

Table 2 Comparison of magnitude of induction folds of common genes between this study and literature reports. (A) Heat shock. Experimental conditions: this study, 26 to 40°C for 60 min; Li et al 2004, 35 to 45°C for 15 min; Suzuki et al 2005&2006, 34 to 44°C for 60 min. (B) Salt stress. Experimental conditions: this study, 0.5 M NaCl for 35 min; Kanesaki et al 2002, 0.5 M NaCl for 30 min; Marin et al 2004, 0.684 M NaCl for 0.2-24 hr; Shoumskaya et al 2005, 0.5 M NaCl for 20 min. Data are presented as means.
(A)

Locus (gene)         This study   Li et al 2004   Suzuki et al 2005&2006
slr2075 (groES)      4.4          8.7             15.5
slr2076 (groEL)      2.7          8.2             16.4
sll1514 (hspA)       1.7          5.1             21.6
sll0430 (htpG)       1.6          9.9             3.7
sll0416 (groEL-2)    1.5          11.6            6.5

(B)

Locus (gene)         This study   Kanesaki et al 2002   Marin et al 2004   Shoumskaya et al 2005
sll1863              13.3         52.7                  265.2              106.5
sll1862              7.3          93.8                  231.5              152.4
sll1514 (hspA)       5.1          56.2                  23.9               49.7
sll0528              3.7          40.0                  50.3               74.4
sll1566 (ggpS)       3.3          10.7                  7.6                13.2
slr1687              3.0          9.4                   3.1                16.0
ssr2595              2.7          13.4                  3.0                15.1
slr1544              2.4          20.3                  2.1                23.2
slr0967              2.0          16.0                  25.3               32.3
ssl2542 (hliA)       1.9          5.0                   2.1                9.8
sll1085 (glpD)       1.8          11.8                  3.5                9.7

Published reports disagree on how well DNA microarray data agree between laboratories and between platforms. Some studies suggest significant disagreement between platforms [16-21], and others show general agreement [22-35]. Systematic analyses indicated that all three platforms (cDNA, long or short oligonucleotide) can give similar and reproducible results if the criterion is the direction of change in gene expression (up-regulation or down-regulation) and minimal emphasis is placed on the magnitude of change [35]. This is also consistent with the notion that microarray experiments are better suited to identifying regulatory patterns in genome-wide gene expression than to giving the researcher quantitative expression values for individual genes [36].

In summary, the extent of concordance between the sets of genes identified in our study and in the earlier studies employing PCR-based arrays was consistent with what can be expected from microarray results obtained on different platforms and by different groups. Based on these results, we concluded that the genome-wide array was ready for studying transcriptional responses of the Synechocystis genome to poorly understood environmental challenges.

Methods

Probe design and microarray production

The 70-nt oligomers were designed using ArrayOligoSelector [2] as detailed in the text.
The oligos were synthesized by Invitrogen Corp. (Carlsbad, CA, USA), resuspended in 5 µl of 3×SSC to a final concentration of 66.7 pmol/µl and spotted onto poly-L-lysine coated microscope glass slides using an OmniGrid Microarray printer (GeneMachines, San Carlos, CA, USA), as described previously [37]. All oligo sequences are provided [see Additional file 5].

Genomic DNA preparation and fluorescent labelling

Genomic DNA was isolated from stationary phase cultures of E. coli K-12 MG1655 and Synechocystis sp. PCC 6803 using a standard phenol-chloroform extraction method [38, 39] with minor modifications. Sucrose was added at a final concentration of 0.5 M to 1×TE buffer (10 mM Tris, 1 mM EDTA, pH 8) to efficiently disrupt the Synechocystis cell envelope. Purified genomic DNA was sheared by sonication, yielding 300-1000 bp DNA fragments for direct labelling using the Klenow fragment of DNA polymerase I. The labelling reaction consisted of 2-5 µg of genomic DNA, 5 µg of random hexamer (pdN6), 1.5 µl of dNTP mix (0.5 mM dATP, 0.5 mM dCTP, 0.5 mM dGTP and 0.2 mM dTTP), 0.2 mM Cy3- or Cy5-dUTP (GE Healthcare, Piscataway, NJ, USA), 1× Klenow buffer and 6-10 units of Klenow. Reactions were incubated at 37°C for 2 hrs.

Heat and salt treatment experiment

Synechocystis sp. PCC 6803 was grown at 26°C in BG-11 medium supplemented with 8 mM NaHCO3 and buffered with 10 mM HEPES-NaOH to a final pH of 7.4. The cells were grown in 0.5 L conical flasks with continuous shaking, exposed to a full spectrum luminescent lamp with a photon flux density of 25 µmol photons m⁻² s⁻¹ in 14:10 light:dark cycles, and continuously supplied with sterile air containing 1% (v/v) CO2. For the heat shock experiment, 50 ml of mid-exponentially growing cells (OD730nm = 0.6) were collected as a control, and the heat shock response was then induced by increasing the incubator temperature to 40°C. Cells were harvested after 60 min at the elevated temperature.
For the salt stress experiment, 5 M NaCl was added to the mid-exponential growth phase cell culture to a final NaCl concentration of 0.5 M. The cell samples were taken just before (control) and 35 min after NaCl addition. All the experiments were done in biological triplicate. Total RNA preparation, cDNA fluorescent labelling and microarray hybridization were the same as presented in the Methods section of the main text. The microarray data were processed and analyzed as described in the supplementary text above.

References

1. SantaLucia J, Jr.: A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc Natl Acad Sci USA 1998, 95(4):1460-1465.
2. ArrayOligoSelector: [http://arrayoligosel.sourceforge.net/]
3. Bozdech Z, Zhu J, Joachimiak M, Cohen F, Pulliam B, DeRisi J: Expression profiling of the schizont and trophozoite stages of Plasmodium falciparum with a long-oligonucleotide microarray. Genome Biol 2003, 4(2):R9.
4. Leiske D, Karimpour-Fard A, Hume P, Fairbanks B, Gill R: A comparison of alternative 60-mer probe designs in an in-situ synthesized oligonucleotide microarray. BMC Genomics 2006, 7(1):72.
5. Paredes CJ, Senger RS, Spath IS, Borden JR, Sillers R, Papoutsakis ET: A general framework for designing and validating oligomer-based DNA microarrays and its application to Clostridium acetobutylicum. Appl Envir Microbiol 2007, 73:4631-4638.
6. Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, Shannon KW, Lefkowitz SM, Ziman M, Schelter JM, Meyer MR et al: Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol 2001, 19(4):342-347.
7. Relogio A, Schwager C, Richter A, Ansorge W, Valcarcel J: Optimization of oligonucleotide-based DNA microarrays. Nucleic Acids Res 2002, 30(11):e51.
8. Khodursky AB, Bernstein JA, Peter BJ, Rhodius V, Wendisch VF, Zimmer DP: Escherichia coli spotted double-strand DNA microarrays: RNA extraction, labeling, hybridization, quality control, and data management. Methods Mol Biol 2003, 224:61-78.
9. Segal GIL, Ron EZ: Regulation of heat-shock response in bacteria. Annals N Y Acad Sci 1998, 851(1):147-151.
10. Li H, Singh AK, McIntyre LM, Sherman LA: Differential gene expression in response to hydrogen peroxide and the putative PerR regulon of Synechocystis sp. strain PCC 6803. J Bacteriol 2004, 186(11):3331-3345.
11. Suzuki I, Kanesaki Y, Hayashi H, Hall JJ, Simon WJ, Slabas AR, Murata N: The histidine kinase Hik34 is involved in thermotolerance by regulating the expression of heat shock genes in Synechocystis. Plant Physiol 2005, 138(3):1409-1421.
12. Suzuki I, Simon WJ, Slabas AR: The heat shock response of Synechocystis sp. PCC 6803 analysed by transcriptomics and proteomics. J Exp Bot 2006, 57(7):1573-1578.
13. Kanesaki Y, Suzuki I, Allakhverdiev SI, Mikami K, Murata N: Salt stress and hyperosmotic stress regulate the expression of different sets of genes in Synechocystis sp. PCC 6803. Biochem Biophys Res Commun 2002, 290(1):339-348.
14. Marin K, Kanesaki Y, Los DA, Murata N, Suzuki I, Hagemann M: Gene expression profiling reflects physiological processes in salt acclimation of Synechocystis sp. strain PCC 6803. Plant Physiol 2004, 136(2):3290-3300.
15. Shoumskaya MA, Paithoonrangsarid K, Kanesaki Y, Los DA, Zinchenko VV, Tanticharoen M, Suzuki I, Murata N: Identical Hik-Rre systems are involved in perception and transduction of salt signals and hyperosmotic signals but regulate the expression of individual genes to different extents in Synechocystis. J Biol Chem 2005, 280(22):21531-21538.
16. Kothapalli R, Yoder S, Mane S, Loughran T: Microarray results: how accurate are they? BMC Bioinformatics 2002, 3(1):22.
17. Kuo WP, Jenssen T-K, Butte AJ, Ohno-Machado L, Kohane IS: Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics 2002, 18(3):405-412.
18. Li J, Pankratz M, Johnson JA: Differential gene expression patterns revealed by oligonucleotide versus long cDNA arrays. Toxicol Sci 2002, 69(2):383-390.
19. Lenburg ME, Liou LS, Gerry NP, Frampton GM, Cohen HT, Christman MF: Previously unidentified changes in renal cell carcinoma gene expression identified by parametric analysis of microarray data. BMC Cancer 2003, 3:31.
20. Tan PK, Downey TJ, Spitznagel EL, Jr., Xu P, Fu D, Dimitrov DS, Lempicki RA, Raaka BM, Cam MC: Evaluation of gene expression measurements from commercial microarray platforms. Nucleic Acids Res 2003, 31(19):5676-5684.
21. Mah N, Thelin A, Lu T, Nikolaus S, Kuhbacher T, Gurbuz Y, Eickhoff H, Kloppel G, Lehrach H, Mellgard B et al: A comparison of oligonucleotide and cDNA-based microarray systems. Physiol Genomics 2004, 16:361-370.
22. Kane MD, Jatkoe TA, Stumpf CR, Lu J, Thomas JD, Madore SJ: Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays. Nucleic Acids Res 2000, 28(22):4552-4557.
23. Taniguchi M, Miura K, Iwao H, Yamanaka S: Quantitative assessment of DNA microarrays - Comparison with Northern blot analyses. Genomics 2001, 71:34-39.
24. Guckenberger M, Kurz S, Aepinus C, Theiss S, Haller S, Leimbach T, Panzner U, Weber J, Paul H, Unkmeir A et al: Analysis of heat shock response of Neisseria meningitidis with cDNA- and oligonucleotide-based DNA microarrays. J Bacteriol 2002, 184:2546-2551.
25. Yuen T, Wurmbach E, Pfeffer RL, Ebersole BJ, Sealfon SC: Accuracy and calibration of commercial oligonucleotide and custom cDNA microarrays. Nucleic Acids Res 2002, 30:e48.
26. Barczak A, Rodriguez MW, Hanspers K, Koth LL, Tai YC, Bolstad BM, Speed TP, Erle DJ: Spotted long oligonucleotide arrays for human gene expression analysis. Genome Res 2003, 13:1775-1785.
27. Wang HY, Malek RL, Kwitek AE, Greene AS, Luu TV, Behbahani B, Frank B, Quackenbush J, Lee NH: Assessing unmodified 70-mer oligonucleotide probe performance on glass-slide microarrays. Genome Biol 2003, 4:R5.
28. Bloom G, Yang IV, Boulware D, Kwong KY, Coppola D, Eschrich S, Quackenbush J, Yeatman TJ: Multi-platform, multi-site, microarray-based human tumor classification. Am J Pathol 2004, 164:9-16.
29. Jarvinen AK, Hautaniemi S, Edgren H, Auvinen P, Saarela J, Kallioniemi OP, Monni O: Are data from different gene expression platforms comparable? Genomics 2004, 83:1164-1168.
30. Lee HS, Wang J, Tian L, Jiang H, Black MA, Madlung A, Watson B, Lukens L, Pires JC, Wang JJ et al: Sensitivity of 70-mer oligonucleotides and cDNAs for microarray analysis of gene expression in Arabidopsis and its related species. Plant Biotech J 2004, 2:45-57.
31. Parmigiani G, Garrett-Mayer ES, Anbazhagan, Gabrielson E: A cross-study comparison of gene expression studies for the molecular classification of lung cancer. Clin Cancer Res 2004, 10:2922-2927.
32. Thompson KL, Afshari CA, Amin RP, Bertram TA, Car B, Cunningham M, Kind C, Kramer JA, Lawton M, Mirsky M et al: Identification of platform-independent gene expression markers of cisplatin nephrotoxicity. Environ Health Perspect 2004, 112:488-494.
33. Ulrich RG, Rockett JC, Gibson GG, Pettit SD: Overview of an interlaboratory collaboration on evaluating the effects of model hepatotoxicants on hepatic gene expression. Environ Health Perspect 2004, 112:423-427.
34. Wang H, He X, Band M, Wilson C, Liu L: A study of inter-lab and inter-platform agreement of DNA microarray data. BMC Genomics 2005, 6(1):71.
35. Petersen D, Chandramouli GVR, Geoghegan J, Hilburn J, Paarlberg J, Kim C, Munroe D, Gangi L, Han J, Puri R et al: Three microarray platforms: an analysis of their concordance in profiling gene expression. BMC Genomics 2005, 6(1):63.
36. Dharmadi Y, Gonzalez R: DNA microarrays: experimental issues, data analysis, and application to bacterial systems. Biotechnol Prog 2004, 20(5):1309-1324.
37. Eisen MB, Brown PO: DNA arrays for analysis of gene expression. Methods Enzymol 1999, 303:179-205.
38. Golden SS, Brusslan J, Haselkorn R: Genetic engineering of the cyanobacterial chromosome. Methods Enzymol 1987, 153:215-231.
39. Sambrook J, Russell DW: Molecular Cloning: A Laboratory Manual, Third edn. Cold Spring Harbor, NY: CSHL Press; 2001.