Supplementary material

Microarray experiment.
Microarrays were custom-designed for this study. A collection of genes was identified that had
been shown to be overexpressed in RCC tumors compared to normal kidney tissue (Gumz 2007).
Each reference sequence was divided into three segments of roughly equal length along the 5’-3’
axis. Individual probe design was conducted for each sequence segment. Nine thousand four
hundred transcripts produced probes with optimal characteristics in base composition for all
three segments (28,000 probes), 2,000 transcripts produced two out of three probes with
optimal design, 1,100 transcripts produced only one probe with optimal design, and 300
transcripts did not yield any optimal probes. Eighteen hundred transcripts produced all
three probes optimal for base composition, but 30% of the designed probes had potential for
cross-hybridization with other sequences. Probes from the group with optimal
characteristics and with probes available for all three segments were chosen and used to generate
spotted arrays for the experiments. Probes were generated as in situ synthesized oligos
(Agilent) in 44K format.
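The segmentation step described above can be sketched as follows. This is a minimal illustration, not the authors' design pipeline: the function names and the example sequence are hypothetical, and the criteria for an "optimal" probe (base composition, cross-hybridization potential) are not specified in the text, so they are represented here only as pre-computed true/false flags.

```python
def split_into_segments(sequence, n_segments=3):
    """Split a 5'->3' sequence into contiguous segments of roughly equal length."""
    base, extra = divmod(len(sequence), n_segments)
    segments = []
    start = 0
    for i in range(n_segments):
        # The first `extra` segments receive one additional base,
        # so segment lengths differ by at most one.
        end = start + base + (1 if i < extra else 0)
        segments.append(sequence[start:end])
        start = end
    return segments


def count_optimal_probes(optimal_flags):
    """Count how many segments yielded a probe flagged as optimal.

    The flags are assumed to come from a separate probe-design step
    that is not modeled here.
    """
    return sum(1 for is_optimal in optimal_flags if is_optimal)


# Hypothetical 24-nt sequence, not taken from the study.
example = "ATGGCCAAGTTTCCGGAATTCGCA"
five_prime, middle, three_prime = split_into_segments(example)
```

A transcript for which `count_optimal_probes` returns 3 would fall into the "all three segments" group described above; counts of 2, 1, and 0 correspond to the other groups.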
The RNA was used as the starting material for synthesis of the cDNA target using an oligo(dT)
primer, and a spike-in control was added to each sample prior to cDNA synthesis in order to
provide a consistent positive reference in subsequent hybridizations. Single-color, Cy3-labeled
cDNA was hybridized to the microarrays. Array production, cDNA synthesis, labeling, and
hybridization were conducted following standard Agilent procedures.
An initial analysis was conducted to test the system. Four replicate labeling reactions were
conducted on a total RNA sample and on RNA amplified from that total RNA sample in a standard
labeling reaction. Signal intensity distributions were examined and comparisons between sample
sets were conducted. The frequency distribution of signal intensities showed slight bimodality,
with consistent differences between total and amplified RNAs. Control probes spiked into each
RNA population produced signals with appropriate linearity of response, indicating that the
target labeling reaction was behaving normally. Labeled RNA generated from the same biological
total or amplified RNA was hybridized to separate arrays. The data indicate a linear correlation
between signals from both arrays, showing that technical replicates are highly reproducible.
Of the reporters, 96% were concordant in both samples, whereas 4% were absent in one and
present in the other; however, the signals of these reporters were near the limit of
detection. Total RNA and amplified RNA produced similar results in this test, indicating that the
system established is not biased by the type of sample. Reproducibility of results was similar to
that obtained in conventional gene array experiments. The preliminary testing of the system
demonstrated that technical variability is well controlled; therefore, the experimental design can
focus on biological replicates and on understanding differences between the methods used to
generate amplified RNA.
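The two replicate checks described above (linear correlation of signals, and concordance of present/absent calls) can be sketched as below. This is an illustrative sketch only: the signal values and the detection threshold are made up, and the text does not state how present/absent calls were actually made, so a simple intensity cutoff is assumed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length signal lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def concordance(signals_a, signals_b, threshold):
    """Fraction of reporters with the same present/absent call on both arrays.

    A reporter is called 'present' when its signal is at or above `threshold`
    (an assumed rule; the source does not specify the calling method).
    """
    calls_a = [s >= threshold for s in signals_a]
    calls_b = [s >= threshold for s in signals_b]
    agree = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    return agree / len(calls_a)


# Hypothetical signals for five reporters on two technical replicate arrays.
replicate_a = [5.0, 120.0, 800.0, 48.0, 2500.0]
replicate_b = [6.0, 110.0, 830.0, 52.0, 2400.0]

r = pearson_r(replicate_a, replicate_b)
fraction_concordant = concordance(replicate_a, replicate_b, threshold=50.0)
```

In this toy data the only discordant reporter (48.0 versus 52.0) sits right at the assumed cutoff, which mirrors the observation that discordant calls occurred near the limit of detection.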
When the total RNA of a sample was compared to the RNA amplified from that sample using the
old amplification method, and the data were analyzed across all three probes (the whole-transcript
level), there was a good correlation, with 94% of transcripts behaving similarly (either present or
absent) in both the amplified and the total RNA sample.
Of the discordant transcripts, 248 were present in total RNA but absent in amplified RNA (2.15%),
and 375 were present in amplified RNA but absent in total RNA (3.25%). Concordance varied by
probe type: the data were more concordant at the 3’ end probe and less concordant at the 5’ end probe.
This result is expected because the RNA amplification methods are anchored at the poly(A)
tail, near the probes located at the 3’ end of the transcript. Transcripts that were detected in the
total RNA but not in the amplified RNA may be absent due to truncation during the
amplification process: 5’ probe 13.9%, middle probe 12.2%, and 3’ probe 3.9%. This was the
more frequent result. Some transcripts were detected in amplified RNA but not in total RNA;
this could be due to high levels of amplification skewing detectability and therefore
relative representation: 5’ probe 4.9%, middle probe 3.3%, and 3’ probe 2.4%. This was the less
frequent result. In both cases, however, the differences were more pronounced for probes
located at the 5’ end, which were more likely to be differentially detected. This result suggests
that the effect arises from processivity of the target labeling reaction during the microarray
experiment and not from the RNA amplification process.
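The comparison above amounts to sorting each transcript (or probe) into one of three categories and reporting percentages. A minimal sketch of that tabulation follows; the function name and the toy calls are hypothetical. The last lines only check the arithmetic of the whole-transcript percentages reported above and introduce no new data.

```python
from collections import Counter


def tabulate_calls(calls):
    """Percentages of concordant, total-only, and amplified-only detections.

    `calls` is an iterable of (present_in_total, present_in_amplified) booleans,
    one pair per transcript or probe.
    """
    counts = Counter()
    for in_total, in_amplified in calls:
        if in_total == in_amplified:
            counts["concordant"] += 1
        elif in_total:
            counts["total_only"] += 1
        else:
            counts["amplified_only"] += 1
    n = sum(counts.values())
    categories = ("concordant", "total_only", "amplified_only")
    return {name: 100.0 * counts[name] / n for name in categories}


# Toy calls for ten hypothetical transcripts (not study data).
toy_calls = (
    [(True, True)] * 7
    + [(False, False)]
    + [(True, False)]
    + [(False, True)]
)
toy_percentages = tabulate_calls(toy_calls)

# Consistency check on the reported whole-transcript percentages.
reported_total_only = 2.15       # present in total RNA, absent in amplified RNA
reported_amplified_only = 3.25   # present in amplified RNA, absent in total RNA
implied_concordant = 100.0 - reported_total_only - reported_amplified_only
```

The implied concordance of about 94.6% agrees with the 94% figure quoted above. Running the same tabulation separately on the 5’, middle, and 3’ probe calls would give the per-probe percentages.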
To understand the differences between amplification protocols, RNA was amplified using
either the old protocol or the new protocol and compared to the total RNA using principal
component analysis (PCA) on the whole-transcript data with all three probes (Figure 2, panel C
of the manuscript). The PCA was repeated using intensity data separated by probe type: 5’ end,
middle, or 3’ end. When the 5’ probe data were analyzed by amplification method, the RNA
amplified using the new process was more closely related to the total RNA from which it was
derived than the RNA amplified using the old method (data not shown). Using data from the
3’ end probes, the overall distribution of samples appears very similar to that of the
whole-transcript data. This indicates that Probe Type 3 (the 3’ end probe) is the major
contributor to the overall distribution of the data in the PCA of whole-transcript data. When
analyzed by amplification method, the data obtained using the new amplification protocol are
still more closely related to the source samples.
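To illustrate how "more closely related" is read from a PCA, the sketch below estimates the first principal component by power iteration and compares sample scores along it. It is a simplified, standard-library-only illustration under stated assumptions: the intensity values are invented, only the first component is computed, and the study's actual software and settings are not described in the text. In practice a numerical library would normally be used.

```python
import math


def center_columns(matrix):
    """Subtract each column (probe) mean so that samples are compared around zero."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def first_principal_component(matrix, iterations=200):
    """Estimate the leading principal component by power iteration on X^T X.

    Rows are samples and columns are probes. Returns (loadings, scores).
    Assumes the starting vector is not orthogonal to the leading component.
    """
    x = center_columns(matrix)
    n_features = len(x[0])
    vec = [1.0 / math.sqrt(n_features)] * n_features
    for _ in range(iterations):
        # One multiplication by X^T X, done as X^T (X v).
        scores = [sum(a * b for a, b in zip(row, vec)) for row in x]
        new_vec = [sum(s * row[j] for s, row in zip(scores, x))
                   for j in range(n_features)]
        norm = math.sqrt(sum(c * c for c in new_vec))
        vec = [c / norm for c in new_vec]
    scores = [sum(a * b for a, b in zip(row, vec)) for row in x]
    return vec, scores


# Invented log-scale intensities for four probes (not study data).
samples = [
    [10.0, 8.0, 12.0, 6.0],   # total RNA
    [10.2, 8.1, 11.7, 6.3],   # amplified, hypothetical "new" protocol values
    [11.5, 9.0, 10.0, 8.0],   # amplified, hypothetical "old" protocol values
]
loadings, pc1_scores = first_principal_component(samples)
distance_new = abs(pc1_scores[1] - pc1_scores[0])
distance_old = abs(pc1_scores[2] - pc1_scores[0])
```

A smaller distance to the total RNA sample along the leading components is what "more closely related" means in the plots. The sign of a principal component is arbitrary, which is why absolute distances are compared.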