Experiment Design
We analysed gene expression profiles of different U937 cell clones
conditionally expressing PML/RAR, AML1/ETO, or PLZF/RAR. In these cells, the
different cDNAs are under the transcriptional control of the Zinc (Zn)-inducible
mouse metallothionein (Mt) promoter. All cell clones were treated with 100 µM
ZnSO4 for 8 hours before RNA extraction. We selected cell clones that showed
comparable levels of expression of the fusion proteins after induction (see figure
1A). A U937 bulk population containing the empty cloning vector (Mt), also
treated with 100M ZnSO4 for 8 hours, was used as reference for all samples.
Samples used, extract preparation and labelling:
Total RNA was extracted using TRIzol Reagent (Gibco), followed by clean
up on RNeasy mini/midi columns (RNeasy Mini/Midi Kit, Qiagen). For each of the
four U937 cell lines (PML/RAR, AML1/ETO, PLZF/RAR and Mt), three
independent vials were thawed, and the ZnSO4 induction and RNA extraction
were performed separately. Prior to RNA extraction, a small aliquot of cells was
lysed in Laemmli lysis buffer, and all experiments were controlled for fusion
protein expression by Western blotting.
For each cell line, an RNA pool was obtained by mixing equal quantities of
total RNA from each of the three independent RNA extractions. Biotin-labelled
cRNA targets were synthesized starting from 10 µg of pooled RNA. Double-stranded
cDNA synthesis was performed with the GIBCO SuperScript Custom cDNA
Synthesis Kit, and biotin-labelled antisense RNA was transcribed in vitro using
Ambion’s In Vitro Transcription System, including Bio-16-UTP and Bio-11-CTP
(Enzo) in the reaction. All steps of the labelling protocol were performed as
suggested by Affymetrix
(http://www.affymetrix.com/support/technical/manual/expression_manual.affx).
The size and the accuracy of quantitation of targets were checked by agarose gel
electrophoresis of 2 µg aliquots, prior to and after fragmentation (see below). After
fragmentation, targets were diluted in hybridisation buffer at a concentration of
150 µg/ml.
[Figure: agarose gel of cRNA targets before and after fragmentation. Lanes, left to right: MK, 1 to 8]
MK- Marker (9.49 kb, 7.46 kb, 4.4 kb, 2.37 kb, 1.35 kb, 0.24 kb)
1- U937-Mt
2- U937-Mt, fragmented
3- U937-PML/RAR
4- U937-PML/RAR, fragmented
5- U937-AML1/ETO
6- U937-AML1/ETO, fragmented
7- U937-PLZF/RAR
8- U937-PLZF/RAR, fragmented
Because the PML/RAR target performed less efficiently overall (see
“Report files.xls”), a second triplicate of total RNA samples was extracted from the
PML/RAR cell line under the same conditions described above. Biotinylated cRNA
targets were then synthesized from the new PML/RAR RNA pool and from the
previously obtained reference RNA pool. The entire procedure described below
was repeated for the new PML/RAR and reference targets thus obtained, and
the resulting data are referred to as 2PR and 2Mt in all analysis tables.
Hybridisation procedures and parameters
Hybridisation mix for target dilution (100 mM MES, 1 M [Na+], 20 mM
EDTA, 0.01% Tween 20) was prepared as indicated by Affymetrix, including pre-mixed biotin-labelled control oligo B2 and bioB, bioC, bioD and cre controls
(Affymetrix cat# 900299) at final concentrations of 50 pM, 1.5 pM, 5 pM, 25 pM
and 100 pM, respectively. Targets were diluted in hybridisation buffer at a
concentration of 150µg/ml and denatured at 99°C prior to introduction into the
GeneChip cartridge.
Targets were tested for quality by hybridisation to Affymetrix Test2 Arrays
(cat# 900271, now replaced by Test3 Arrays, cat# 900341). Two copies of the
complete HG-U95 chip set (HG-U95Av2, HG-U95B, HG-U95C, HG-U95D, HG-U95E; Affymetrix cat# 900303, 900305, 900307, 900309, 900311) were then
hybridised with each biotin-labelled target.
Hybridisations were performed for 14-16 hours at 45°C in a rotisserie oven.
GeneChip cartridges were washed and stained in the Affymetrix fluidics station
following the EukGE-WS2 standard protocol (including Antibody Amplification):
1. Wash 10 cycles of 2 mixes/cycle with Wash Buffer A (6X SSPE, 0.01%
Tween 20) at 25°C
2. Wash 4 cycles of 15 mixes/cycle with Wash Buffer B (100 mM MES, 0.1 M
[Na+], 0.01% Tween 20) at 50°C
3. Stain the probe array for 10 minutes in SAPE solution (10 µg/mL SAPE in
100 mM MES, 1 M [Na+], 0.05% Tween 20, 2 mg/mL BSA) at 25°C
4. Wash 10 cycles of 4 mixes/cycle with Wash Buffer A at 25°C
FIRST SCAN
5. Stain the probe array for 10 minutes in antibody solution (Normal Goat IgG
0.1 mg/mL, biotinylated antibody 3 µg/mL, 100 mM MES, 1 M [Na+], 0.05% Tween
20, 2 mg/mL BSA) at 25°C
6. Stain the probe array for 10 minutes in SAPE solution at 25°C
7. Final Wash 15 cycles of 4 mixes/cycle with Wash Buffer A at 30°C
SECOND SCAN
Images were scanned using an Affymetrix GeneArray Scanner, using
default parameters. Each chip was scanned twice, to obtain two different images:
the first scan was performed after the first SAPE staining procedure (between
steps 4 and 5 above), and the second scan was performed after antibody
amplification of the signal, at the end of the washing procedure. The resulting
images were analysed using Microarray Suite version 5 (MASv5), Affymetrix cat#
690018. Data from the two scans were processed independently and
merged for each sample only after all analysis steps had been completed.
Measurement data and specifications
“Absolute analysis” was performed for each chip with MASv5 software using
default parameters, scaling all images to a value of 500. Report files were
extracted for each chip, and performance of labelled targets was evaluated on
the basis of several values (scaling factor, background and noise values, %
present calls, average signal value, etc). Homogeneity within the experiment was
also taken into account; for this reason, the PML/RAR sample was repeated in a
second experiment, together with a newly synthesized reference (Mt) target. A
summary report file for all chips can be found in “Report files.xls”.
Results derived from PML/RAR, AML1/ETO, and PLZF/RAR targets
(samples) were also compared to results from the Mt target (reference) by
“comparative analysis”, using the reference chips as baseline. Each sample chip
was compared to both reference chips for identification of regulated genes.
Furthermore, duplicate sample and reference chips were compared to each other
for calculation of noise (see scheme below).
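[Scheme: each sample chip (U937-PML/RAR 1 and 2, U937-AML1/ETO 1 and 2,
U937-PLZF/RAR 1 and 2) is compared with each reference chip (U937-Mt 1 and
U937-Mt 2) to give the COMPARISON FILES; duplicate chips of the same target
are compared with each other to give the NOISE FILES.]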
This procedure yielded four comparison files for each sample under
analysis. All raw data deriving from absolute and comparative analyses of both
scans for each chip are available as text files in the “Absolute analysis-scan1.zip”, “Absolute analysis-scan2.zip”, “Comparative analysis-scan1.zip” and
“Comparative analysis-scan2.zip” directories, respectively.
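The pairing scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names and the tuple layout are hypothetical, and it assumes that each noise comparison is made once per target (one direction only), which the text does not specify.

```python
from itertools import combinations, product

SAMPLES = ["U937-PML/RAR", "U937-AML1/ETO", "U937-PLZF/RAR"]
REFERENCE = "U937-Mt"
REPLICATES = (1, 2)  # two chips were hybridised per target


def comparison_pairs(sample):
    """Pair every sample chip with every reference chip (reference = baseline)."""
    return [((sample, s), (REFERENCE, r))
            for s, r in product(REPLICATES, REPLICATES)]


def noise_pairs(target):
    """Pair duplicate chips of the same target with each other.

    Assumption: one comparison per pair of duplicates.
    """
    return [((target, a), (target, b))
            for a, b in combinations(REPLICATES, 2)]


for sample in SAMPLES:
    print(sample, len(comparison_pairs(sample)), "comparison files")
```

With two sample chips and two reference chips, each sample gets 2 x 2 = 4 comparison files, matching the count given in the text.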
The data thus obtained were then analysed further using two
sequential procedures: DCall-Fold Change analysis and t-test analysis.
DCall-Fold Change analysis is performed on Affymetrix comparison files. The
Affymetrix “Difference Call” (DCall) corresponds to the qualitative information
about the status of a Probe Set in the two conditions considered: it indicates if the
expression level of a Probe Set is decreased (D), mildly decreased (MD),
increased (I) or mildly increased (MI) in the sample as compared to the
reference. “Fold Change” gives the corresponding quantitative information: it is
calculated from the Signal Log Ratio (SLR) of Affymetrix comparison files:
\[
FC_i =
\begin{cases}
2^{SLR_i} & \text{if } SLR_i > 0 \\
-\dfrac{1}{2^{SLR_i}} & \text{if } SLR_i < 0
\end{cases}
\]
where SLR_i is the signal log ratio value for Probe Set i.
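As a minimal sketch, the conversion from signal log ratio to signed fold change can be written as follows. The function name is hypothetical, and the treatment of SLR = 0 (returned as a fold change of 1) is an assumption, since the definition above covers only the strictly positive and strictly negative cases.

```python
def fold_change(slr: float) -> float:
    """Convert a log2 signal log ratio into a signed fold change.

    SLR > 0 gives 2**SLR; SLR < 0 gives -1 / 2**SLR.
    Assumption: SLR == 0 is mapped to 1.0 (no change).
    """
    if slr >= 0:
        return 2.0 ** slr
    return -1.0 / (2.0 ** slr)


# A signal log ratio of -1 (halved expression) becomes a fold change of -2.
print(fold_change(1.0), fold_change(-1.0))
```

Under this convention no fold change falls between -1 and 1, which is why the cut-offs are expressed as a positive value for increases and a negative value for decreases.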
The expression of a gene represented by a specific Probe Set is considered as
decreased if, in each comparison file analysed, it has a DCall corresponding to
“D” or “MD” and its Fold Change value is lower than a fixed cut-off. Conversely,
the expression of a given gene is considered increased if its representative Probe
Set, in each comparison file analysed, has a DCall corresponding to “I” or “MI”
and its Fold Change value is higher than a fixed cut-off. For the purpose of
finding common regulated target genes, the analysis was performed at low
stringency, with the fold change cut-off set to 1.3 for increased genes and -1.3 for decreased genes.
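The decision rule above can be illustrated with a short Python sketch. The function and constant names are hypothetical, and the use of strict inequalities at the cut-off is an assumption based on the wording "higher than" and "lower than".

```python
DECREASED_CALLS = {"D", "MD"}
INCREASED_CALLS = {"I", "MI"}


def classify_probe_set(comparisons, cutoff=1.3):
    """Classify one probe set from its (DCall, fold change) pairs.

    `comparisons` holds one pair per comparison file analysed. The probe
    set counts as regulated only if every file agrees on the direction
    and every fold change passes the cut-off.
    """
    if not comparisons:
        return "unchanged"
    if all(call in INCREASED_CALLS and fc > cutoff for call, fc in comparisons):
        return "increased"
    if all(call in DECREASED_CALLS and fc < -cutoff for call, fc in comparisons):
        return "decreased"
    return "unchanged"


# Four comparison files for one probe set, all agreeing on an increase.
print(classify_probe_set([("I", 1.8), ("MI", 1.4), ("I", 2.1), ("I", 1.5)]))
```

A single comparison file with a "no change" call or a fold change inside the cut-off is enough to leave the probe set unclassified.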
The t-statistic is well suited for finding differentially expressed genes
because it allows the selection of an expression pattern that has maximal
difference in mean level of expression between two groups, and minimal variation
of expression within each group. A two-sided t-test was performed on the Signal
values generated by MASv5, assuming that group 1 and group 2 follow Gaussian
distributions N(μ1, s1) and N(μ2, s2), respectively. The parameters of the
distributions are unknown, and the two standard deviations are assumed to be equal.
\[
t = \frac{\bar{X}_2 - \bar{X}_1}{\sqrt{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}} \, \sqrt{\frac{n_1 n_2 f}{n_1 + n_2}}
\]
where:
- \(\bar{X}_1, \bar{X}_2\) are the means of the signal values for group 1 and group 2, respectively,
- \(n_1, n_2\) are the numbers of signals in group 1 and group 2, respectively,
- \(s_1^2, s_2^2\) are the variances of the signal values in group 1 and group 2, respectively,
- \(f\) is the number of degrees of freedom, \(f = n_1 + n_2 - 2\).
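For illustration, the statistic can be computed with the Python standard library as sketched below. This is not the authors' software (which is available on request); the function name is hypothetical, and it assumes that the variances are sample variances with an n - 1 denominator.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group1, group2):
    """Two-sample t statistic with a pooled (equal) variance.

    Each group needs at least two values. A zero pooled variance would
    raise ZeroDivisionError and is not handled in this sketch.
    """
    n1, n2 = len(group1), len(group2)
    f = n1 + n2 - 2  # degrees of freedom
    pooled = (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    return (mean(group2) - mean(group1)) / sqrt(pooled) * sqrt(n1 * n2 * f / (n1 + n2))


# Hypothetical signal values: two reference chips and two sample chips.
print(pooled_t([480.0, 520.0], [950.0, 1050.0]))
```

The sign of t follows the order of subtraction, so a positive value means a higher mean signal in group 2.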
The McNemar test was used to determine data quality and cut-off values (see
Abell, M.L., Braselton, J.P., and Rafter, J.A., 1999. Statistics with Mathematica.
Academic Press). The McNemar test is often used in clinical trials to assess the
efficacy of drug treatment versus placebo controls. The test compares lists of
“yes/no” values. A P value >0.01 indicates lack of efficacy (the lists are
equivalent), whereas P values <0.01 suggest the presence of a therapeutic
effect. For our data, the results of pair-wise chip comparisons were used
to obtain these lists, where yes means “gene regulated” and no means “gene not
regulated”. A gene was called regulated when both the DCall-Fold Change
analysis and the t-test were positive for that gene. Lists of regulated genes
resulting from chip comparisons between two test chips or between two control
chips were used to determine the noise level of the experiments (called noise
lists), whereas chip comparisons between a test and a control chip were used to
determine the signal (called signal lists or data lists). Three parameters were
evaluated:
1. The equivalence of samples and chip performances (noise list vs. noise
list, P > 0.01).
2. The presence of differences in transcript levels (noise list vs. signal list, P
< 0.01).
3. The reproducibility of measurement of such differences (signal list vs.
signal list, P > 0.01).
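A minimal sketch of a McNemar test on two paired yes/no lists is given below. The text does not state which variant was used, so this example assumes the chi-square form without continuity correction; the function name is hypothetical. The P value uses the identity that, for one degree of freedom, the chi-square survival function equals erfc(sqrt(x / 2)).

```python
from math import erfc, sqrt


def mcnemar_p(list_a, list_b):
    """P value of the McNemar chi-square test for paired boolean lists.

    Only discordant pairs matter: b counts genes regulated in list_a but
    not in list_b, c counts the opposite case.
    """
    if len(list_a) != len(list_b):
        raise ValueError("lists must have the same length")
    b = sum(1 for x, y in zip(list_a, list_b) if x and not y)
    c = sum(1 for x, y in zip(list_a, list_b) if y and not x)
    if b + c == 0:
        return 1.0  # no discordant pairs: the lists are identical
    chi2 = (b - c) ** 2 / (b + c)
    return erfc(sqrt(chi2 / 2.0))


# Hypothetical lists: ten genes called regulated in the signal list only.
signal = [True] * 10 + [False] * 10
noise = [False] * 20
print(mcnemar_p(noise, signal) < 0.01)
```

With this threshold, two lists are treated as equivalent when P > 0.01 and as different when P < 0.01, as in the three checks listed above.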
Comparative analysis lists that resulted from single-chip comparisons were
combined into two duplicate lists using the logical AND operator (meaning that a
gene was called regulated only when it was found to be regulated in each of the
composing sub-lists). Noise lists and randomised signal lists combined the same
way were used as controls. Randomisation of signal lists was carried out using a
pseudorandom number generator that reassigned a new position to each gene in
the list before combining them.
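The AND combination and the randomised control can be sketched as follows. The helper names are hypothetical, and the lists are assumed to be aligned boolean vectors with one entry per gene; Python's `random.Random.shuffle` stands in for the pseudorandom generator used by the authors, which is not described further.

```python
import random


def combine_and(list_a, list_b):
    """A gene is regulated in the combined list only if it is regulated in both."""
    return [a and b for a, b in zip(list_a, list_b)]


def randomised(calls, rng):
    """Return a copy of the list with every gene moved to a new random position."""
    shuffled = list(calls)
    rng.shuffle(shuffled)
    return shuffled


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
signal_1 = [True, True, False, True, False, False]
signal_2 = [True, False, False, True, False, True]

combined = combine_and(signal_1, signal_2)
control = combine_and(randomised(signal_1, rng), randomised(signal_2, rng))
print(sum(combined), sum(control))
```

Shuffling keeps the number of regulated calls in each list but breaks the link between lists, so the randomised combination estimates how much overlap would arise by chance.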
Special-purpose software for replica analysis and t-test analysis was developed
by Heiko Muller, and is available on request (muller@ifom-firc.it).
Elaboration of results
Lists of regulated genes resulting from the analysis procedure described
above were imported into Access Databases for further elaboration. First,
regulated probe sets from the 5 chips were combined into a single gene list for
each experimental sample. Next, the results deriving from the first and the
second scans of each chip (see “Hybridisation procedures and parameters”)
were combined into a single list. Both fold change values were maintained for
reference. Results of the two experiments with PML/RAR targets were then
combined to obtain a single list. For this sample, the fold changes considered
were those relative to the experiment where regulation was found; in the case of
genes found regulated in both experiments, or of genes that were not regulated
in either experiment, an average value was calculated.
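The rule for merging the fold changes of the two PML/RAR experiments can be sketched as below. The function name and argument layout are hypothetical, and the average is assumed to be a simple arithmetic mean, which the text does not specify.

```python
def merged_fold_change(fc_1, regulated_1, fc_2, regulated_2):
    """Merge the fold changes of one probe set from two experiments.

    If regulation was found in only one experiment, keep that value;
    if it was found in both or in neither, return the average.
    """
    if regulated_1 and not regulated_2:
        return fc_1
    if regulated_2 and not regulated_1:
        return fc_2
    return (fc_1 + fc_2) / 2.0


print(merged_fold_change(2.0, True, 1.1, False))  # regulated only in experiment 1
print(merged_fold_change(2.0, True, 3.0, True))   # regulated in both experiments
```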
Gene identity was assigned to Affymetrix probe sets using the “Automated
Chip Reannotation tool at IFOM”
(http://bio.ifom-firc.it/ARRAY_ANNOT/index.html), derived from UniGene release Hs.159. The
regulated probe sets thus annotated are visible in the table “1 Regulated
ProbeSets.xls”. These results were then converted into non-redundant regulated
genes, rather than regulated probe sets, using the UniGene ID as unique
identifier. All probe sets assigned to the same UniGene cluster were considered
as redundant, and are represented once in the table “2 Regulated Genes
Hs159.xls”. Probe sets that represent sequences not assigned to a UniGene
cluster were further grouped according to Gene Symbol (derived from EMBL or
dbEST) or to the Accession number itself. The fold change value indicated in the
table is the average fold change of all regulated probe sets representing the
gene. In the rare cases where probe sets representing the same gene showed
opposite regulation, the results were excluded from further analysis for the
purpose of this study. Of these, only the probe sets that were concordantly
regulated by the fusion proteins were retained in the final list (“3 Common
Targets.xls”, see below), but they were excluded from all lists based on gene
identity rather than probe set.
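The collapse from regulated probe sets to non-redundant genes can be illustrated with the sketch below. The input layout, the function name and the identifiers in the example are hypothetical; the gene key stands for the UniGene ID, or for the Gene Symbol or Accession number when no UniGene cluster is assigned.

```python
from collections import defaultdict


def collapse_to_genes(probe_sets):
    """Collapse (probe_set_id, gene_key, fold_change) records into genes.

    Probe sets sharing a gene key are redundant: the gene gets the
    average fold change, unless its probe sets disagree on direction,
    in which case the gene is dropped.
    """
    by_gene = defaultdict(list)
    for _probe_set_id, gene_key, fc in probe_sets:
        by_gene[gene_key].append(fc)

    genes = {}
    for gene_key, fcs in by_gene.items():
        if any(fc > 0 for fc in fcs) and any(fc < 0 for fc in fcs):
            continue  # opposite regulation: discarded from gene-based lists
        genes[gene_key] = sum(fcs) / len(fcs)
    return genes


# Hypothetical records, not taken from the data set.
records = [
    ("probe_a", "Hs.0001", 2.0),
    ("probe_b", "Hs.0001", 3.0),
    ("probe_c", "Hs.0002", 1.5),
    ("probe_d", "Hs.0002", -1.6),
]
print(collapse_to_genes(records))
```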
The table “3 Common Targets.xls”, starting point of the results discussed in
the manuscript, includes all those genes that are regulated concordantly by at
least two fusion proteins, even when the third fusion protein under analysis
regulates some of these in the opposite direction. Common target genes were
first identified by searching for concordantly regulated probe sets (1623 probe
sets, corresponding to 1409 non-redundant genes). An additional 146 common
target genes were identified by searching the table “2 Regulated Genes Hs159.xls”,
and therefore derive from the results of different probe sets for the same gene.
In particular, of the 1555 genes represented in the table:
- 50 are induced by all 3 fusion proteins
- 113 are repressed by all 3 fusion proteins
- 94 are induced only by PML/RAR and AML1/ETO; of these, 28 are repressed by PLZF/RAR
- 182 are repressed only by PML/RAR and AML1/ETO; of these, 71 are induced by PLZF/RAR
- 219 are induced only by PML/RAR and PLZF/RAR; of these, 23 are repressed by AML1/ETO
- 326 are repressed only by PML/RAR and PLZF/RAR; of these, 49 are induced by AML1/ETO
- 291 are induced only by PLZF/RAR and AML1/ETO; of these, 40 are repressed by PML/RAR
- 280 are repressed only by PLZF/RAR and AML1/ETO; of these, 34 are induced by PML/RAR.
In summary, of the 1555 genes considered as common targets of AML
fusion proteins, 245 (about 16%) were concordantly regulated by two fusion proteins
and discordantly regulated by the third.
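The category counts listed above can be tallied with a short script as a consistency check. The dictionary layout and variable names are illustrative only; the numbers are those given in the text.

```python
# (genes in the category, genes regulated in the opposite direction by the third protein)
CATEGORIES = {
    "induced by all 3": (50, 0),
    "repressed by all 3": (113, 0),
    "induced by PML/RAR and AML1/ETO": (94, 28),
    "repressed by PML/RAR and AML1/ETO": (182, 71),
    "induced by PML/RAR and PLZF/RAR": (219, 23),
    "repressed by PML/RAR and PLZF/RAR": (326, 49),
    "induced by PLZF/RAR and AML1/ETO": (291, 40),
    "repressed by PLZF/RAR and AML1/ETO": (280, 34),
}

total = sum(count for count, _ in CATEGORIES.values())
discordant = sum(opposite for _, opposite in CATEGORIES.values())
percent_discordant = 100.0 * discordant / total

print(total, discordant, round(percent_discordant, 1))
```

The eight categories add up to 1555 genes and the discordant subsets to 245, so the ratio 245/1555 evaluates to roughly 15.8%.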