Supplementary Methods - Word file (50 KB )

Supplementary Methods

TaqMan® assays

Primer/Probe design. TaqMan® Gene Expression Assays are designed against transcripts obtained from the NCBI Reference Sequence Project database (RefSeq) using the Applied Biosystems genome-aided primer and probe design pipeline (Applied Biosystems, 2004). To identify the optimal region for primer and probe design, ambiguous regions (repetitive and low-complexity sequences, SNPs) in each transcript are first masked; the transcripts are then aligned to the genome and the exon-exon junctions are marked. Primers and probes are designed for each gene and, where possible, the assay is designed such that the probe spans an exon-exon junction. Parameters such as %GC content, Tm, secondary structure, and amplicon length are optimized to ensure high amplification efficiency. To ensure specificity, each assay design undergoes an in silico QC process against the public genome and transcript databases (NCBI), in which BLAST is used to determine the degree of homology between the assay primer and probe sequences and other closely related transcripts, homologous genes and pseudogenes. Thus each assay is designed to a transcript that is gene specific. To address sequence updates, the assays are periodically remapped to the most recent RefSeq versions of the NCBI database.

Selection of Endogenous Control for Normalization. In QRT-PCR, an endogenous control gene is used to normalize data and to control for variability between samples as well as plate, instrument and pipetting differences. The ideal control should be sufficiently abundant, have constant RNA transcription levels across samples, and be unaffected by experimental treatments. To identify the best endogenous control for this study, we ran 11 samples containing titrations of UHR and Brain, ranging from 100% UHR to 100% Brain, on a TaqMan® Low Density Array Endogenous Control Panel (Applied Biosystems), a 384-well low density array containing 16 common housekeeping genes (Supplementary Fig. 5a).

From these data we identified 4 candidate endogenous control genes (Supplementary Fig. 5b) to use in the main MAQC study for normalization of the 1000 assays across samples A, B, C and D: 18S (Hs99999901_s1), UBC (Hs00824723_m1), HPRT (Hs99999909_m1) and POLR2A (Hs00172187_m1). The endogenous control genes were run in quadruplicate on each of 44 plates. Although all 4 genes showed very little change across samples, ANOVA across 3 factors (plates, instruments, samples) indicated that 18S and POLR2A showed the least variation across the samples. We decided to use POLR2A because its CT value was within the range of most of the genes in the study. Each replicate CT was normalized on a per-plate basis by subtracting the average CT of POLR2A on that plate from the replicate CT. The result is the ΔCT, which is equivalent to the log2 difference between endogenous control and target gene.

Relationship between SD and fold change confidence for TaqMan® assays. An SD of CT of 0.167 provides 99.72% confidence for measuring 2-fold discrimination with a single observation. A higher SD of CT may be the result of a low-expressing gene (stochastic effects) or of technical error (pipetting, mixing).

StaRT-PCR™

Detailed Procedure. Standardized reverse transcriptase polymerase chain reaction (StaRT-PCR™) is a modification of the competitive template (CT) RT-PCR method described by Gilliland et al. (1). StaRT-PCR™ enables numerical quantification of gene expression at the endpoint of PCR (35 cycles) for many genes simultaneously. An internal standard (IS) CT is prepared for each gene and cloned to generate enough for >10¹² assays. Internal standards for 96 genes are mixed together into a Standardized Mixture of Internal Standards (SMIS™).

In each measurement, variation in loading of cDNA is controlled by reference to ACTB, which has relatively constant expression among different samples. Because the same SMIS™ is used in each measurement, the normalization may be converted after data collection to any other single gene, or combination of genes, measured, through a simple algebraic conversion factor. In each measurement, the native template (NT) for both the target gene and ACTB is measured relative to a known quantity of its respective internal standard. The ratio of NT to IS must be greater than 1:10 and less than 10:1 for the measurement to be within assay range. Initial calibration of each cDNA to a known quantity of ACTB internal standard ensures that the ACTB NT/IS is within this range for each subsequent measurement. Next, 2 µl of the calibrated cDNA sample and 2 µl of SMIS™ are PCR-amplified in a 20 µl PCR reaction, with primers specific to a different gene in each reaction. As with the ACTB loading control gene, the target gene NT/IS must be greater than 1:10 and less than 10:1.

Because genes are expressed over more than six orders of magnitude in human tissues, the target gene internal standards in each 96-gene SMIS™ are 10-fold serially diluted relative to the loading control gene (ACTB) internal standard, in a System of six SMIS™, A–F. Thus, there are 600,000 transcript molecules of ACTB IS per µl of each SMIS™ (A–F), while the number of transcript molecules of each target gene IS ranges from 6,000,000 in SMIS™ A to 60 in SMIS™ F. For each System, a sufficient amount of A–F SMIS™ is prepared for more than 10¹² assays. Thus, the relative concentration of each IS within a SMIS™ is constant and stable and, when used by any lab according to recommended methods, will yield the same results when assessing the same samples. Thus far, Gene Express, Inc. has prepared Systems 1-8, comprising reagents for nearly 800 genes.

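The SMIS™ dilution series and the NT/IS range check described above can be illustrated with a minimal sketch. The function and variable names are hypothetical, and the final conversion to target molecules per 10⁶ ACTB molecules is an assumption about how the measured ratios combine, not the authors' software. Because cDNA and SMIS™ are loaded in fixed volumes, the per-µl internal standard counts are used directly.

```python
# Illustrative sketch only: names are hypothetical, numbers come from the text.

ACTB_IS_PER_UL = 600_000  # ACTB internal standard molecules per µl, same in SMIS A-F

# Target-gene IS per µl: 6,000,000 in SMIS A, 10-fold less in each later mixture.
SMIS_TARGET_IS_PER_UL = {
    name: 6_000_000 // 10**step for step, name in enumerate("ABCDEF")
}


def in_assay_range(nt_over_is: float) -> bool:
    """A measurement is in range when NT/IS is above 1:10 and below 10:1."""
    return 0.1 < nt_over_is < 10.0


def molecules_per_million_actb(target_nt_over_is: float,
                               actb_nt_over_is: float,
                               smis: str) -> float:
    """Convert two measured NT/IS ratios into target molecules per 10^6 ACTB.

    Assumes NT = (NT/IS) * IS for both genes, with both internal standards
    delivered in the same volume of the same SMIS.
    """
    if not in_assay_range(target_nt_over_is):
        raise ValueError("target NT/IS out of range; try a different SMIS")
    if not in_assay_range(actb_nt_over_is):
        raise ValueError("ACTB NT/IS out of range; recalibrate the cDNA")
    target_nt = target_nt_over_is * SMIS_TARGET_IS_PER_UL[smis]
    actb_nt = actb_nt_over_is * ACTB_IS_PER_UL
    return target_nt * 1e6 / actb_nt


if __name__ == "__main__":
    for name, count in SMIS_TARGET_IS_PER_UL.items():
        print(f"SMIS {name}: {count:>9,} target IS molecules per µl")
    print(molecules_per_million_actb(1.0, 1.0, "C"))
```

If a target gene falls outside the 1:10 to 10:1 window with one mixture, the same reaction would be repeated with a neighbouring SMIS™ in the A–F series, which is why the series spans six orders of magnitude.
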
When preparing an internal standard for each transcript, quality control is ensured through a 29-step GLP-compliant protocol. Specificity of StaRT-PCR™ reagents to a particular transcript is ensured by selection of primers through careful database analysis. Optimal primer efficiency, including a limit of detection (LOD) of less than 10 transcript molecules, and 100% signal-to-analyte response are ensured through serial limiting dilution of a known quantity of IS. In each measurement, quality control of gene specificity is ensured through SOP preparation of primers, SMIS™ and PCR reaction mixtures, and through examination of PCR products following electrophoretic separation to confirm that products of the expected size are observed.

Quality Control. An internal standard peak was observed in every measurement, which documents that no false negatives occurred. Absence of false positives was documented by the absence of peaks in reaction mixtures containing no cDNA or SMIS™.

Specificity. Some StaRT-PCR™ reagents were intentionally designed to assess more than one transcript at the request of customers. For these reagents, there was no increase in discordance compared to TAQ or QGN. Each of the StaRT-PCR™ reagents assessed transcripts representing a single gene. The design of the primers to cross introns ensured that genomic DNA PCR products, if present, would be detected as longer PCR products.

cDNA Consumption. For each MAQC sample, 2 µl of calibrated cDNA sample corresponded to 8 ng of original RNA. For each of the four samples (A, B, C, and D), 615 datapoints were obtained (205 genes in triplicate), and approximately 2.2 StaRT-PCR™ assays were done for each datapoint. In total, these assessments for cDNA calibration, range finding, and quantification of 205 genes consumed between 2,700 and 3,000 µl of cDNA, corresponding to approximately 13 µg of RNA.

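The consumption figures above can be checked with a little arithmetic. The sketch below assumes, since the text does not say so explicitly, that the 2,700 to 3,000 µl total is per sample and that every assay consumes exactly 2 µl of calibrated cDNA (8 ng of RNA); the variable names are illustrative. Under those assumptions the assays come to roughly 2,700 µl of cDNA, the lower end of the reported range, and roughly 11 µg of RNA, somewhat below the approximately 13 µg quoted.

```python
# Back-of-envelope check of per-sample cDNA consumption (assumptions in the lead-in).

GENES = 205
REPLICATES = 3
ASSAYS_PER_DATAPOINT = 2.2   # approximate value given in the text
CDNA_UL_PER_ASSAY = 2        # µl of calibrated cDNA per reaction
RNA_NG_PER_ASSAY = 8         # ng of original RNA represented by that 2 µl

datapoints = GENES * REPLICATES                 # datapoints per sample
assays = datapoints * ASSAYS_PER_DATAPOINT      # assays per sample
cdna_ul = assays * CDNA_UL_PER_ASSAY            # µl of cDNA per sample
rna_ug = assays * RNA_NG_PER_ASSAY / 1000       # µg of RNA per sample

if __name__ == "__main__":
    print(f"datapoints per sample: {datapoints}")
    print(f"assays per sample:     {assays:.0f}")
    print(f"cDNA consumed:         {cdna_ul:.0f} µl")
    print(f"RNA equivalent:        {rna_ug:.1f} µg")
```
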
Although far less cDNA could have been used in each measurement of genes expressed at higher levels, use of less cDNA to measure genes expressed at lower levels would have led to increased variation in measurement and/or lack of representation of those genes due to stochastic sampling error. Gene Express has SOP guidelines for loading 10-fold more cDNA to obtain better reproducibility for genes expressed at low levels, or as much as 100-fold less cDNA to conserve sample when measuring highly expressed genes.

In summary, in each measurement a known quantity of SMIS™ is combined with a cDNA sample. Loading of cDNA is controlled by reference to ACTB. Each target gene and loading control gene is simultaneously measured relative to a known number of its respective internal standard transcript molecules in the SMIS™. Thus, it is possible to report each gene expression measurement as a numerical value in units of target gene cDNA transcript molecules/10⁶ reference gene cDNA transcript molecules. Calculation of data in this format enables entry into a common databank, direct inter-experimental comparison, and combination of values into interactive transcript abundance indices. All data may be normalized to any measured gene other than ACTB through a simple correction factor.

StaRT-PCR™ Data Set 1 (Figs. 3-6)
All data were normalized to ACTB as loading control. It was assumed that mRNA/Total RNA was equal in Samples A and B. The difference in ACTB/mRNA between Sample A and Sample B was determined by identifying the A/B ratio of ACTB/mRNA that yielded the optimal average R² for Samples A, B, C and D across all genes measured. Based on this, ACTB/mRNA was calculated to be 2-fold higher in Sample A compared to Sample B. Data were corrected for this difference. This normalization effectively normalized the GEX data set to all other data sets, as is evident from the fold-difference graphs (Fig. 3 and Fig. 4e).

StaRT-PCR™ Data Set 2 (Fig. 2 and related Table 1 data, R² data in Table 1)
As with Data Set 1, all data were normalized to ACTB as loading control. Based on MAQC sample titration data from StaRT-PCR™ (MS-7) and from each microarray (MS-8), and on spike-in data from MS-12, it was concluded that mRNA/Total RNA is higher in Sample A compared to Sample B. To identify the optimal normalization with respect to the Sample A to Sample B difference, using no a priori assumptions regarding either mRNA/Total RNA or ACTB/mRNA, a 3-D surface plot was used to assess the effect of each of these parameters on the linear correlation (R²) across the four MAQC samples, averaged across all genes measured. It was discovered that there are two optimal R² peaks. One is achieved with the normalization assumptions used in Data Set 1 (i.e. A/B mRNA/Total RNA = 1.0, A/B ACTB/mRNA = 0.5). However, the assumption of A/B mRNA/Total RNA = 1.0 is not supported by multiple sets of data (MS-8, MS-12). Thus, the second R² optimum, achieved with A/B mRNA/Total RNA = 2.5 and A/B ACTB/mRNA = 0.5, is considered to be the correct one. These are the assumptions used in Data Set 2.

StaRT-PCR™ Data Set 3 (Fig. 1 and related Table 1 data)
Data are reported as the number of transcript molecules in the assay, with no normalization to ACTB or to any other gene.

QuantiGene®

Probe design. A probe set for a target gene consists of three types of oligonucleotide probes, capture extenders (CE), label extenders (LE) and blockers (BL), covering a contiguous region of the target. Together these allow capture of the target RNA to the surface of the plate well and hybridization with the branched DNA signal amplification molecule. For each target sequence, the software algorithm identifies regions that can serve as annealing templates for CEs (5-10 per gene), LEs (10-20 per gene), or BLs to fill the remaining space.

A more detailed description of the probe design has been published previously (2).

Scaling of data for TaqMan® assays, StaRT-PCR™, and QuantiGene®

The data for the three platforms were transformed to be on the same X-axis scale in the following manner. For StaRT-PCR™, 6000 transcript molecules was defined by a value of 6000, or log2(6000) = 12.55. For TaqMan® assays, the CT values were first transformed from a decreasing copy number scale to an increasing copy number scale. This was accomplished by taking the absolute value of the difference between every TaqMan® assay CT value and the CT value corresponding to the lowest measurable expression (40). This rescaling preserves the assay range measured by TaqMan® assays in log2 space. Given that a TaqMan® assay CT value of 35 corresponds to 5 transcript molecules, the extrapolated CT equivalent for 6,000 transcript molecules is approximately 24.78. This value on the transformed scale corresponds to |24.78 - 40|, or 15.22. In order to scale this to the StaRT-PCR™ value for 6,000 transcript molecules, a rescaling value of 2.66025 was applied to all values. This factor was calculated by taking the difference between the pre-scaling TaqMan® assay value that corresponds to 6000 transcript molecules (15.22) and the StaRT-PCR™ value that corresponds to 6000 transcript molecules (12.55). The same transformation was applied to QuantiGene® values, resulting in a rescaling factor of 13.55. This factor was generated with the estimate that 6000 transcript molecules is defined by 0.5 RLU, or -1.0 on a log2 scale. These transformations result in all platforms having a post-scaling value of 12.55 on a log2 scale for an analyte value of 6000 transcript molecules.

Supplementary References

1. Gilliland, G., Perrin, S., Blanchard, K. & Bunn, H.F. Analysis of cytokine mRNA and DNA: detection and quantitation by competitive polymerase chain reaction. Proc. Natl. Acad. Sci. USA 87, 2725–2729 (1990).
2. Bushnell, S. et al. ProbeDesigner: for the design of probesets for branched DNA (bDNA) signal amplification assays. Bioinformatics 15, 348–355 (1999).
