Supplementary Figures and Tables
Supplementary Figure Legends
Figure S1
Overview of the experimental strategy
The transcriptomes of CD3-purified cells from the peripheral blood of five inv(14)-positive T-PLL cases
were compared with those of eight normal donor derived immunomagnetically purified peripheral blood
T cell samples. Mononuclear cells were isolated from fresh peripheral blood samples (Lymphoprep,
Invitrogen, Karlsruhe, Germany). CD3-positive T cells were enriched employing anti-CD3 magnetic
microbeads (MidiMacs, Miltenyi Biotec, Bergisch Gladbach, Germany) resulting in a purity of CD3+ T
cells of >90% by flow cytometry. Gene expression profiling was performed on RNA from 1-2 x 10^8
CD3+ cells (RNeasy midi kit, Qiagen, Hilden, Germany) using the Affymetrix U133A microarray
platform as recently described (1). Differentially expressed genes (Table S2) were identified using the
Mann-Whitney U non-parametric test in a supervised approach as described in the legend to Table
S2. The panel of differentially expressed genes was further functionally analyzed for enrichment in the
gene ontology (GO) categories “biological process” and “molecular function” employing the GOstat
tool (Table S3) (2). Furthermore, differentially expressed genes were tested for non-random
distribution to individual chromosome arms employing the hypergeometric distribution method (3) in
order to identify candidate regions for genomic aberrations (Table 1). Gene expression changes were
then correlated with chromosomal imbalances as detected by FISH and Affymetrix GeneChip 50K
SNP array analysis. The raw experimental data can be accessed through the internet
(http://www.ncbi.nlm.nih.gov/geo/).
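The test for non-random distribution of differentially expressed genes to a chromosome arm can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric tail probability, i.e. the chance of finding at least k differentially expressed genes on an arm when n genes are drawn at random from N measured genes, K of which map to that arm. The function name and all counts are hypothetical and are not taken from the study.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are located on the chromosome arm of interest."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Hypothetical counts, for illustration only (not study data)
N = 1000  # genes reliably measured on the array
K = 50    # measured genes located on the arm
n = 100   # differentially expressed genes
k = 12    # differentially expressed genes located on the arm

p = hypergeom_upper_tail(k, K, n, N)
print(f"expected on arm: {n * K / N:.1f}, observed: {k}, p = {p:.4g}")
```

An arm with a small tail probability carries more differentially expressed genes than expected by chance and would be a candidate region for a genomic aberration in the sense of the legend.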
Figure S2 and Table S5
Summary of the GeneChip analyses
GeneChip analyses were conducted as recently described (6). Genomic DNA (extracted with QIAamp
DNA Blood Midi Kit, Qiagen, Hilden, Germany) was subjected to Affymetrix GeneChip 50K SNP XbaI
mapping array analyses following the standard protocol for Affymetrix GeneChip Mapping 100K arrays
(Affymetrix Inc., Santa Clara, CA, USA). Arrays were evaluated using the Affymetrix software tools
(GDAS, v3.0; CNAT2.0) and according to criteria we established based on FISH-validated copy
number aberrations. The difficulty in establishing reliable cut-offs from SNP-Chip data is that the quality of
the individual SNPs, and consequently the single point analysis copy number (SPA_CN) values, fluctuates
throughout the genome. Therefore, the cut-off for copy number changes was calculated by correlating
SNP-Chip data with chromosomal regions containing FISH-proven imbalances using a stepwise
approach (Figure S2). As cut-offs are calculated as mean plus/minus three times the standard
deviation, the intrinsic heterogeneity of SPA_CN values leads to a size-dependent bias of the standard
deviation. Selecting regions that are too small or too large for cut-off determination leads to high or low
standard deviations and cut-offs, respectively. Therefore, the initial cut-offs were calculated using a region
spanning 150 consecutive SNPs. For deletions, an initial estimation of the cut-off was determined
using SPA_CN values from a first set of 150 consecutive SNPs, 75 upstream and 75 downstream of
the chromosomal location of the FISH-probe in 8p21.3, that was proven to be deleted in four T-PLL
cases by FISH. A deletion was defined as a region spanning at least 18 consecutive SNPs (to reach a
resolution of 1 Mb) with a mean value below 1.5. The performance of this estimate was tested on a
second set of 20 FISH-proven deletions. Mean SPA_CN values of 18 consecutive SNPs were
calculated for these regions and 15/20 showed mean SPA_CN values below 1.5. One deleted region
had a mean SPA_CN value above 1.5 and below 1.6 whereas 4 regions had mean SPA_CN values
above 1.75 (see details in Table S5). The final cut-off was therefore adjusted to 1.6. Using this
criterion, 4/20 FISH-proven deletions could not be detected, which is explained by the presence of low
quality SNPs in these particular regions.
The initial estimation of the cut-off for gains was calculated using 150 consecutive SNPs from three
cases with FISH-proven 6p21 chromosomal gains. The initial cut-off was 2.8. This estimated cut-off
was verified using 12 additional FISH-proven gained regions, and all of them contained mean values
above 2.8 for 18 consecutive SNPs. Therefore, the final cut-off for gains was set to 2.8. Additionally,
all 11 cases with FISH-proven balanced copy numbers on chromosome arm 9q were shown to have
mean values above 1.6 and below 2.8, which confirmed their balanced genomic status and
demonstrated the validity of the calculated cut-offs. In order to determine copy number alterations
throughout the SNP-Chip data, it was necessary to smooth the SPA_CN-data to compensate for
regions represented by low quality SNPs. Therefore, means of 18 consecutive copy numbers of single
point analyses were calculated and, to get a clearer overview, a colour code for gains and losses
was applied using the established cut-offs. We identified 39 deleted regions and 21 gained regions (see
Table 2). As the boundaries of gained and deleted regions are blurred by the smoothing algorithm,
their exact locations were taken from the raw SPA_CN data.
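The smoothing and calling step can be sketched as follows. The window size (18 SNPs) and the cut-offs (1.6 and 2.8) are taken from the legend; the function names and the synthetic SPA_CN profile are assumptions made for illustration and do not reproduce the Affymetrix software or the study data.

```python
from statistics import mean

DELETION_CUTOFF = 1.6  # final cut-off for deletions (from the legend)
GAIN_CUTOFF = 2.8      # final cut-off for gains (from the legend)
WINDOW = 18            # consecutive SNPs per window (from the legend)

def window_means(spa_cn, window=WINDOW):
    """Mean SPA_CN of every run of `window` consecutive SNPs."""
    return [mean(spa_cn[i:i + window]) for i in range(len(spa_cn) - window + 1)]

def call_window(value):
    """Classify one window mean using the two cut-offs."""
    if value < DELETION_CUTOFF:
        return "loss"
    if value > GAIN_CUTOFF:
        return "gain"
    return "balanced"

# Synthetic profile (illustrative values only): 30 balanced SNPs,
# 30 SNPs in a deleted segment, 30 SNPs in a gained segment.
profile = [2.0] * 30 + [1.2] * 30 + [3.4] * 30
calls = [call_window(m) for m in window_means(profile)]
print(calls[0], calls[30], calls[60])
```

Windows that straddle two segments mix values from both, which is why the boundaries of called regions are blurred and have to be read from the unsmoothed SPA_CN data.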
Figure S1
Figure S2
Table S1
Clinical and cytogenetic data of the T-PLL patients included in the study
T-PLL case | Age (in years) | Gender | Karyotype of the tumor cells
1 | 49 | m | 45,X,Y,i(6)(p10),del(7)(p15),der(8)t(8;8)(p21;q21),-10,der(11)del(11)(q22q23)dup(11)(q23q24),dic(12;15)(p12;p13),add(13)(p13),inv(14)(q11q32),add(15)(q24),der(21)t(10;21)(q11;p13),add(22)(p13),+der(?)t(?;8)(?;q21)[cp 21]
2 | 54 | f | 45,XX,der(5)t(5;11)(p14;q13),i(8)(q10),-11,t(12;14)(p12;q12~13),-14,inv(14)(q11q32),add(17)(p12),add(18)(p11),+mar [15]
3 | 72 | m | 44,X,-Y,t(4;9)(q22~24;q34),add(6)(q12),+8,der(8)t(8;8)(p21;q21)x2,-11,inv(14)(q11q32),del(17)(q24),add(19)(p13),der(21;22)(q10;q10)[3]
4 | 90 | f | 45,X,-X,+10,der(10;18)(q10;q10),inv(14)(q11q32)[1]/46,idem,der(9)t(9;14)(p21;q31),inv(14)(q11q32),der(14)inv(14)(q11q32),t(9;14)(p21;q31),+16[12]
5 | 74 | f | FISH only: TCRAD break, TCL1 break, nuc ish see table S5
6 | 59 | m | FISH only: TCRAD break, TCL1 break, nuc ish see table S5
7 | 70 | f | 46,XX,der(6)t(3;6)(p14;q21),inv(14)(q11.2;q32.1),inv(16) (pter->p13::q11->p13::q24->q11::q24->qter),der(21)t(21;21)(p11;q21)[6]/46,idem,der(8)t(8;8)(p21;q11)[17]
8 | 62 | m | 44,X,der(Y)t(Y;?1)(q12;q31),der(8)t(8;8)(p11;q22)hsr(8)(p11),der(9)t(9;9)(p23;q34),der(10)t(5;10)(q31;q26),del(10)(q23),-11,dup(11)(p12p14),dup(13)(q21q14),inv(14)(q11q32.1),dic(20;22)(p12;p11),-22[17]
9 | 51 | m | 46,XY,der(8)t(8;8)(p21;q21),add(12)(p13),+14,der(14)(14pter->14q10::14q22->14q11::14q32->14qter)x2,-18,add(19)(p13),der(22)t(?12;22)(q14;q11)[29]
10 | 76 | m | 46,XY,t(1;2)(p32;p21),t(3;20)(q27;q11),+8,t(8;20)(p12;q13),inv(14)(q11q32),-22[15]
11 | 68 | f | 44,X,add(X)(q25),del(3)(p11),der(4)t(3;4)(p12;p15),der(8)?t(X;8)(q25;p22),der(8)t(8;8)(p22;q23),-11,-13,inv(14)(q11q32),r(17)(p11q24)[cp 9]/44,XX,inv(1)(p12q25~31),-11,13,inv(14)(q11q32),add(16)(q11),r(17)(p11q24)[13]
12 | 55 | m | 43,X,-Y,t(1;3)(q22;p23),der(4)t(4;?22)(p15;q13),t(6;6)(p12;p22~24),i(8)(q10),-9,-10,der(11)add(11)(p12)add(11)(q23),+13,der(13;14)(q10;q10)x2,inv(14)(q11q32),add(17)(p12),add(17)(q24),22,+mar[25]
A, T-PLL without inv(14)/t(14;14) | 67 | m | 45,X,add(Y)(p11),add(5)(q34),i(8)(q10),del(11)(q23),dic(13;22)(p13;p13)
Further columns: Included in FISH analysis, Included in SNP-Chip analysis, Included in GEP (31 entries marked x in total; the assignment of the marks to individual cases is not preserved).
m= male, f= female, GEP= gene expression profiling
A: Results of this case were not included in the evaluations due to the lack of inv(14)/t(14;14) but
served for control purposes. Karyotypes of the tumour cells are described according to ISCN 1995 (4).
Breakpoints in the TCRAD and TCL1 loci were confirmed by FISH in all T-PLL samples (Table S4).
Table S2
Differentially expressed genes in T-PLL (N=5) vs. normal donor (ND, N=8) derived
CD3+ T-cells.
To eliminate genes with a low absolute expression intensity, only genes called present by the
Affymetrix algorithm in at least 50% of the T-PLL samples (for genes designated up-regulated) or
normal control samples (for genes defined as down-regulated) were selected (N=11028) and then
further filtered by comparing the median signal intensities of T-PLL vs. normal control samples and
defining a cut-off fold change of ±2.0 yielding N=1302 probe sets. This set of genes was further
analysed employing the Mann-Whitney U non-parametric test at a significance level of 0.05 resulting in
the identification of 830 differentially expressed genes (termed “subgroup distinction genes”, N=668
down-regulated and N=634 up-regulated probe sets, A). To correct for multiple testing the 1302
reliably measured probe sets with a fold change difference of ±2.0 between the two groups (see
above) were also analysed by multiclass supervised comparative analysis utilizing the significance
analysis of microarrays (SAM) method. Employing a false discovery rate of q ≤ 5%, N=1052
differentially expressed probe sets were identified, which showed a 99.8% overlap (828/830 probe
sets) with the set of subgroup distinction genes defined by the MWU test (see Venn diagram and raw
data in B).
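The fold-change filter and the Mann-Whitney U test applied to a single probe set can be sketched as below. The sketch assumes the present-call filter has already been applied, uses an exact two-sided p-value obtained by enumerating all rank assignments (feasible for 5 vs. 8 samples), and works on made-up signal intensities. Whether the original analysis used an exact or an approximate p-value is not stated, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import median

def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def exact_mwu_p(group_a, group_b):
    """Two-sided exact p-value: fraction of rank assignments to group A whose
    rank sum deviates from its expectation at least as much as observed."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    expected = n_a * sum(ranks) / len(ranks)
    observed = abs(sum(ranks[:n_a]) - expected)
    sums = [sum(c) for c in combinations(ranks, n_a)]
    extreme = sum(1 for s in sums if abs(s - expected) >= observed - 1e-9)
    return extreme / len(sums)

def fold_change(tpll, normal):
    """Signed fold change of group medians (positive: higher in T-PLL)."""
    m_t, m_n = median(tpll), median(normal)
    return m_t / m_n if m_t >= m_n else -m_n / m_t

# Made-up signal intensities for one probe set (5 T-PLL, 8 normal donors)
tpll = [820.0, 760.0, 910.0, 1005.0, 690.0]
normal = [210.0, 305.0, 180.0, 260.0, 240.0, 330.0, 150.0, 275.0]

fc = fold_change(tpll, normal)
p = exact_mwu_p(tpll, normal)
keep = abs(fc) >= 2.0 and p < 0.05
print(f"fold change {fc:.2f}, exact p = {p:.5f}, selected: {keep}")
```

With group sizes of 5 and 8 there are 1287 possible rank assignments, so the smallest exact two-sided p-value is 2/1287 (about 0.0016), reached when the two groups do not overlap at all.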
Table S3
Statistically over-represented Gene Ontologies (GO) within the genes differentially
expressed between T-PLL and normal peripheral blood derived CD3+ T-cells
Annotated GO term | Genes | No. of annotated genes in target gene list (388 in total) | No. of annotated genes in reference list (5132 in total) | p-value
GO Biological Process
GO:0007166: Cell surface receptor linked signal transduction | OPN3 TLE2 CALM3 GPR65 BLR1 ENPP2 CD2 KIAA1128 JAG2 CELSR2 GPR56 DOCK1 TSHB GPR171 TGFBR3 CXCL1 ACVR2B GNG11 CD8A F2R PPAP2A KLRB1 P2RY5 DKK1 TNFRSF1B CD160 HRMT1L2 CD59 TCF7L2 GPR27 IFNGR2 ERBB3 FZD6 PIG8 ADRB2 RRH KLRF1 CD3D P2RY10 ITGA10 INHA | 41 | 246 | 0.0012
GO:0050874: Organismal physiological process | OPN3 SLC1A1 LST1 CST7 MBP GPR65 BLR1 MYBPC1 CTSW CD2 JAG2 ALPL GCH1 IGJ RIPK2 ICOS NR3C2 SNTA1 IL2RB CXCL1 CLECSF2 PITPNA GBP2 KLRC3 XCL2 SORD CD8A F2R SIX6 KLRB1 CHST4 PDE7B CD244 CLCN5 CD160 CKMT2 CCL5 CD59 SCN10A CCL4 CTLA4 PPBP PLUNC GLMN GNLY LGALS3B | 53 | 396 | 0.0231
GO:0006952: Defense response | LST1 CST7 MBP GPR65 BLR1 CTSW CD2 JAG2 IGJ RIPK2 ICOS CD48 IL2RB CXCL1 CLECSF2 GBP2 KLRC3 XCL2 CD8A KLRB1 CHST4 CD244 CD160 CCL5 HRMT1L2 CD59 CCL4 CTLA4 PPBP PLUNC GLMN GNLY LGALS3BP KCNN4 IK KLRF1 CD3D INHA GZMA | 39 | 275 | 0.0427
GO:0007186: G-protein coupled receptor protein signaling pathway | OPN3 CALM3 GPR65 P2RY5 BLR1 ENPP2 KIAA1128 CELSR2 GPR56 TSHB GPR171 CXCL1 GPR27 FZD6 ADRB2 GNG11 RRH F2R P2RY10 PPAP2A | 20 | 108 | 0.0427
GO:0006955: Immune response | LST1 CST7 MBP GPR65 BLR1 CTSW CD2 JAG2 IGJ RIPK2 ICOS IL2RB CXCL1 CLECSF2 GBP2 XCL2 KLRC3 CD8A KLRB1 CHST4 CD244 CD160 CCL5 CD59 CCL4 CTLA4 PPBP PLUNC GLMN GNLY LGALS3BP IK KLRF1 CD3D GZMA INHA | 36 | 250 | 0.0427
GO Molecular Function
GO:0004872: Receptor activity | OPN3 NEO1 GPR65 BLR1 CD2 KIAA1128 LAIR2 GRM6 CELSR2 GPR56 NR3C2 IL2RB GPR171 TGFBR3 IL18R1 ACVR2B KLRC3 NR2C1 CR1 CD8A F2R KLRB1 FKBP1A P2RY5 CD244 TNFRSF1B CUL5 CD160 PLA2R1 EPS15 RORA PTPRM TRIP13 GPR27 IFNGR2 ERBB3 FZD6 LGALS3BP ADRB2 CRSP6 RRH KLRF1 C | 45 | 316 | 0.0222
GO:0004888: Transmembrane receptor activity | OPN3 NEO1 GPR65 P2RY5 BLR1 TNFRSF1B KIAA1128 GRM6 PTPRM CELSR2 GPR56 IL2RB GPR171 ACVR2B IFNGR2 GPR27 KLRC3 FZD6 ERBB3 LGALS3BP ADRB2 RRH KLRF1 CD3D P2RY10 F2R KLRB1 | 27 | 163 | 0.0351
The 830 probeset identifiers found to be significantly deregulated in T-PLL (Table S2) were annotated
and analyzed for the presence of overrepresented “biological processes” and “molecular functions”
using the GOstat tool (2). As the reference, we used a list of 11028 probesets of the HG-U133A array that showed
at least 50% P (present) detection calls in T-PLL (n=5) and/or normal T-cell (n=8) array
analyses. Significant GO terms are indicated together with the associated genes, the number of
associated genes in the T-PLL target gene list and the number of associated genes in the reference
list. Computed p-values were corrected for multiple testing using the Benjamini and Hochberg method
(5).
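The multiple-testing correction can be illustrated with a short sketch. It assumes that the correction meant here is the widely used Benjamini-Hochberg step-up adjustment of p-values; the raw p-values are hypothetical and are not taken from Table S3.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for five GO terms (illustration only)
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
print([round(p, 5) for p in benjamini_hochberg(raw)])
```

Because the adjusted values are forced to be monotone, neighbouring terms can receive identical corrected p-values, as happens for the third and fourth entries in this example.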
Table S4
Summary of the applied probes and FISH results
Table S5
Overview of the FISH-proven regions used to define the cut-offs for chromosomal
deletions and gains in the GeneChip analysis
Deletions
Chromosomal region | Number of cases with deletion as detected by FISH | Number of cases with deletion as detected by GeneChip | Median SPA_CN value of cases
6q21 | 3 | 3 | 1.50, 1.39, 1.30
7q35 | 1 | 0 | 2.12
10p15~14 | 2 | 2 | 1.37, 1.24
11q22~23 | 4 | 4 | 1.35, 1.34, 1.36, 1.36
18p11.32~22 | 3 | 3 | 1.03, 0.94, 1.54
21q22 | 1 | 0 | 3.00
22q11.21 | 3 | 2 | 1.29, 1.41, 1.79
22q11.23 | 3 | 2 | 1.37, 1.34, 1.76
Gains
Chromosomal region | Number of cases with gain as detected by FISH | Number of cases with gain as detected by GeneChip | Median SPA_CN value of cases
5p15 | 2 | 2 | 5.30, 3.17
8p11.21 | 7 | 7 | 3.66, 5.34, 3.38, 3.71, 3.26, 3.13, 3.48
14q32.1 | 2 | 2 | 3.35, 3.50
17p13 | 1 | 1 | 2.87
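As a consistency check, the final cut-offs (1.6 for deletions, 2.8 for gains) can be applied to the SPA_CN values listed in Table S5. The sketch below assumes that the values belong to the chromosomal regions in the order in which both are listed; the variable names are illustrative.

```python
DELETION_CUTOFF = 1.6  # final cut-off for deletions
GAIN_CUTOFF = 2.8      # final cut-off for gains

# SPA_CN values per FISH-proven region, assigned in listing order (assumption)
deletions = {
    "6q21": [1.50, 1.39, 1.30],
    "7q35": [2.12],
    "10p15~14": [1.37, 1.24],
    "11q22~23": [1.35, 1.34, 1.36, 1.36],
    "18p11.32~22": [1.03, 0.94, 1.54],
    "21q22": [3.00],
    "22q11.21": [1.29, 1.41, 1.79],
    "22q11.23": [1.37, 1.34, 1.76],
}
gains = {
    "5p15": [5.30, 3.17],
    "8p11.21": [3.66, 5.34, 3.38, 3.71, 3.26, 3.13, 3.48],
    "14q32.1": [3.35, 3.50],
    "17p13": [2.87],
}

# Count the cases per region whose value passes the respective cut-off
detected_del = {r: sum(v < DELETION_CUTOFF for v in vals) for r, vals in deletions.items()}
detected_gain = {r: sum(v > GAIN_CUTOFF for v in vals) for r, vals in gains.items()}

total_del = sum(len(v) for v in deletions.values())
total_gain = sum(len(v) for v in gains.values())
print(sum(detected_del.values()), "of", total_del, "deletions detected")
print(sum(detected_gain.values()), "of", total_gain, "gains detected")
```

Under this assumption the per-region counts agree with the GeneChip column of the table: 16 of 20 FISH-proven deletions and all 12 FISH-proven gains pass the cut-offs, in line with the 4/20 undetected deletions reported in the legend to Figure S2.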
Reference List
(1) Schroers R, Griesinger F, Trümper L, Haase D, Kulle B, Klein-Hitpass L, Sellmann L, Dührsen
U, Dürig J. Combined analysis of ZAP-70 and CD38 expression as a predictor of disease
progression in B-cell chronic lymphocytic leukemia. Leukemia 2005; 19(5):750-758.
(2) Beissbarth T, Speed TP. GOstat: find statistically overrepresented Gene Ontologies within a
group of genes. Bioinformatics 2004; 20(9):1464-1465.
(3) Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM. Systematic determination of
genetic network architecture. Nature Genetics 1999; 22(3):281-285.
(4) Mitelman F. ISCN: An International System for Human Cytogenetic Nomenclature. Basel:
Karger, 1995;94-104.
(5) Hochberg Y, Benjamini Y. More powerful procedures for multiple significance testing. Stat
Med 1990; 9(7):811-818.
(6) Matsuzaki H, Dong S, Loi H, Di X, Liu G, Hubbell E, Law J, Berntsen T, Chadha M, Hui H,
Yang G, Kennedy GC, Webster TA, Cawley S, Walsh PS, Jones KW, Fodor SP, Mei R.
Genotyping over 100,000 SNPs on a pair of oligonucleotide arrays. Nat Methods 2004;
1(2):109-111.