Supplementary Information for Manuscript:

Gene expression-based classification of malignant gliomas
correlates better with survival than histological classification
Catherine L. Nutt, D. R. Mani, Rebecca A. Betensky, Pablo Tamayo, J. Gregory Cairncross,
Christine Ladd, Ute Pohl, Christian Hartmann, Margaret E. McLaughlin, Tracy T. Batchelor,
Peter M. Black, Andreas von Deimling, Scott L. Pomeroy, Todd R. Golub and David N. Louis
Table of Contents:
High Grade Glioma Dataset
High Grade Glioma Class Markers
Features of the 20-feature k-NN Class Prediction Model
Features Used During Building of the Class Prediction Model
Summary of Training Sample Set Class Predictions
Summary of Test Sample Set Class Predictions
Survival Statistics for the High Grade Glioma Dataset
Survival curves - all glioblastomas and anaplastic oligodendrogliomas
High Grade Glioma Dataset
Dataset:
50 high grade gliomas
- 28/50 glioblastomas
  - 14/28 classic glioblastomas
  - 14/28 non-classic glioblastomas
- 22/50 anaplastic oligodendrogliomas
  - 7/22 classic anaplastic oligodendrogliomas
  - 15/22 non-classic anaplastic oligodendrogliomas
Sample Number  Sample Name  Sample Type
1              Brain_CG_1   Classic glioblastoma
2              Brain_CG_2   Classic glioblastoma
3              Brain_CG_3   Classic glioblastoma
4              Brain_CG_4   Classic glioblastoma
5              Brain_CG_5   Classic glioblastoma
6              Brain_CG_6   Classic glioblastoma
7              Brain_CG_7   Classic glioblastoma
8              Brain_CG_8   Classic glioblastoma
9              Brain_CG_9   Classic glioblastoma
10             Brain_CG_10  Classic glioblastoma
11             Brain_CG_11  Classic glioblastoma
12             Brain_CG_12  Classic glioblastoma
13             Brain_CG_13  Classic glioblastoma
14             Brain_CG_14  Classic glioblastoma
15             Brain_NG_1   Non-classic glioblastoma
16             Brain_NG_2   Non-classic glioblastoma
17             Brain_NG_3   Non-classic glioblastoma
18             Brain_NG_4   Non-classic glioblastoma
19             Brain_NG_5   Non-classic glioblastoma
20             Brain_NG_6   Non-classic glioblastoma
21             Brain_NG_7   Non-classic glioblastoma
22             Brain_NG_8   Non-classic glioblastoma
23             Brain_NG_9   Non-classic glioblastoma
24             Brain_NG_10  Non-classic glioblastoma
25             Brain_NG_11  Non-classic glioblastoma
26             Brain_NG_12  Non-classic glioblastoma
27             Brain_NG_13  Non-classic glioblastoma
28             Brain_NG_14  Non-classic glioblastoma
29             Brain_CO_1   Classic anaplastic oligodendroglioma
30             Brain_CO_2   Classic anaplastic oligodendroglioma
31             Brain_CO_3   Classic anaplastic oligodendroglioma
32             Brain_CO_4   Classic anaplastic oligodendroglioma
33             Brain_CO_5   Classic anaplastic oligodendroglioma
34             Brain_CO_6   Classic anaplastic oligodendroglioma
35             Brain_CO_7   Classic anaplastic oligodendroglioma
36             Brain_NO_1   Non-classic anaplastic oligodendroglioma
37             Brain_NO_2   Non-classic anaplastic oligodendroglioma
38             Brain_NO_3   Non-classic anaplastic oligodendroglioma
39             Brain_NO_4   Non-classic anaplastic oligodendroglioma
40             Brain_NO_5   Non-classic anaplastic oligodendroglioma
41             Brain_NO_6   Non-classic anaplastic oligodendroglioma
42             Brain_NO_7   Non-classic anaplastic oligodendroglioma
43             Brain_NO_8   Non-classic anaplastic oligodendroglioma
44             Brain_NO_9   Non-classic anaplastic oligodendroglioma
45             Brain_NO_10  Non-classic anaplastic oligodendroglioma
46             Brain_NO_11  Non-classic anaplastic oligodendroglioma
47             Brain_NO_12  Non-classic anaplastic oligodendroglioma
48             Brain_NO_13  Non-classic anaplastic oligodendroglioma
49             Brain_NO_14  Non-classic anaplastic oligodendroglioma
50             Brain_NO_15  Non-classic anaplastic oligodendroglioma
High Grade Glioma Class Markers
The table below lists the top 50 marker genes for each tumor class, together with the
permutation test values. Genes were selected using the signal-to-noise metric.
Variation filter: max/min > 3 (3-fold), max-min > 100 absolute units.
GBM, glioblastoma; AO, anaplastic oligodendroglioma.
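The selection steps named above (variation filter, then ranking by signal-to-noise) can be sketched as follows. This is a minimal illustration, not the authors' code. The supplement names the metric without giving a formula, so the sketch assumes the common form, the difference of class means divided by the sum of class standard deviations. The function names, the use of the sample standard deviation, and the toy values are all assumptions.

```python
import math
from statistics import mean, stdev


def passes_variation_filter(values, min_fold=3.0, min_delta=100.0):
    """Apply the stated filter: max/min > 3 and max - min > 100."""
    lo, hi = min(values), max(values)
    # Assumption: expression values are positive, so the ratio is defined.
    return lo > 0 and hi / lo > min_fold and hi - lo > min_delta


def signal_to_noise(in_class, out_class):
    """Assumed form of the metric: mean difference over summed std devs."""
    diff = mean(in_class) - mean(out_class)
    noise = stdev(in_class) + stdev(out_class)
    if noise == 0:
        # Guard for constant classes; not something the source discusses.
        return math.copysign(math.inf, diff) if diff else 0.0
    return diff / noise


def rank_markers(expression, labels, target):
    """Return (feature, score) pairs for `target`, best marker first.

    expression: dict mapping a feature ID to one value per sample.
    labels: class name per sample, in the same order.
    """
    scores = {}
    for feature, values in expression.items():
        if not passes_variation_filter(values):
            continue
        in_class = [v for v, lab in zip(values, labels) if lab == target]
        out_class = [v for v, lab in zip(values, labels) if lab != target]
        scores[feature] = signal_to_noise(in_class, out_class)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


labels = ["GBM", "GBM", "GBM", "AO", "AO", "AO"]
toy_expression = {  # made-up values, not taken from the dataset
    "up_in_gbm": [900, 1000, 1100, 200, 300, 400],
    "flat": [500, 510, 520, 505, 515, 495],
    "up_in_ao": [100, 150, 200, 700, 800, 900],
}
gbm_markers = rank_markers(toy_expression, labels, "GBM")
```

In the toy input, the "flat" feature fails the 3-fold requirement and is dropped before ranking, and "up_in_gbm" ranks first for the GBM class.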
             Permutation Test                      Marker Genes
Distinction  Distance  Perm 1%  Perm 5%  Perm 10%  Feature                      Description
GBM          1.3750    1.4928   1.3180   1.2629    34091_s_at                   VIM: vimentin
GBM          1.1982    1.3190   1.1916   1.1067    630_at                       DCTD: dCMP deaminase
GBM          1.1633    1.2929   1.1591   1.0328    631_g_at                     DCTD: dCMP deaminase
GBM          1.0315    1.2698   1.1338   1.0064    39691_at                     SH3GLB1: SH3-domain GRB2-like endophilin B1
GBM          0.9587    1.2638   1.0978   0.9705    160039_at                    MAPK4: mitogen-activated protein kinase 4
GBM          0.9581    1.2557   1.0540   0.9446    35016_at                     CD74: CD74 antigen (invariant polypeptide of major histocompatibility complex, class II antigen-associated)
GBM          0.9041    1.2353   1.0474   0.9378    38791_at                     DDOST: dolichyl-diphosphooligosaccharide protein glycosyltransferase
GBM          0.9021    1.2041   1.0121   0.9331    1395_at                      ARHC: ras homolog gene family, member C
GBM          0.8941    1.1966   0.9997   0.9047    37542_at                     LHFPL2: lipoma HMGIC fusion partner-like 2
GBM          0.8838    1.1726   0.9956   0.8854    935_at                       CAP: adenylyl cyclase-associated protein
GBM          0.8798    1.1645   0.9917   0.8842    34768_at                     TXNDC: thioredoxin domain-containing
GBM          0.8716    1.1414   0.9758   0.8803    32749_s_at                   DKFZp586K1720 protein
GBM          0.8617    1.1310   0.9428   0.8780    36678_at                     TAGLN2: transgelin 2
GBM          0.8524    1.1295   0.9353   0.8489    40793_s_at                   AQP4: aquaporin 4
GBM          0.8523    1.1208   0.9303   0.8415    37421_f_at                   Human DNA sequence from clone RP3-377H14 on chromosome 6p21.32-22.1
GBM          0.8492    1.1185   0.9088   0.8316    1318_at                      RBBP4: retinoblastoma-binding protein 4
GBM          0.8309    1.0880   0.9006   0.8216    37012_at                     CAPZB: capping protein (actin filament) muscle Z-line, beta
GBM          0.8237    1.0874   0.8904   0.8212    388_at                       PIK3R2: phosphoinositide-3-kinase, regulatory subunit, polypeptide 2 (p85 beta)
GBM          0.8128    1.0801   0.8858   0.8194    41624_r_at                   FZR1: Fzr1 protein
GBM          0.8127    1.0709   0.8852   0.8165    34193_at                     CHL1: cell adhesion molecule with homology to L1CAM (close homologue of L1)
GBM          0.8096    1.0586   0.8835   0.8112    40807_at                     MUF1: MUF1 protein
GBM          0.7946    1.0563   0.8834   0.8098    31444_s_at                   ANXA2P3: annexin A2 pseudogene 3
GBM          0.7882    1.0533   0.8722   0.8083    1860_at                      TP53BP2: tumor protein p53-binding protein, 2
GBM          0.7871    1.0390   0.8692   0.7893    36150_at                     KIAA0842 protein
GBM          0.7857    1.0325   0.8631   0.7843    40771_at                     Human DNA sequence from clone 376D21 on chromosome Xq11.1-12
GBM          0.7828    1.0272   0.8607   0.7797    31342_at                     GALNT2: UDP-N-acetyl-alpha-D galactosamine:polypeptide N acetylgalactosaminyltransferase 2 (GalNAc-T2)
GBM          0.7820    1.0209   0.8536   0.7783    39122_at                     GPI: glucose phosphate isomerase
GBM          0.7762    1.0189   0.8527   0.7781    34822_at                     TP53BP2: tumor protein p53-binding protein, 2
GBM          0.7691    1.0137   0.8508   0.7766    36921_at                     TCTE1L: t-complex-associated-testis expressed 1-like
GBM          0.7524    1.0128   0.8438   0.7756    406_at                       ITGB4: integrin, beta 4
GBM          0.7470    1.0051   0.8434   0.7741    36138_at                     CAPNS1: calpain, small subunit 1
GBM          0.7452    1.0037   0.8432   0.7735    41485_at                     LDHA: lactate dehydrogenase A
GBM          0.7383    0.9999   0.8421   0.7731    39694_at                     Hypothetical protein MGC5508
GBM          0.7328    0.9931   0.8390   0.7628    36131_at                     Homo sapiens genes encoding RNCC protein, DDAH protein, Ly6-C protein, Ly6-D protein and immunoglobulin receptor
GBM          0.7263    0.9888   0.8387   0.7599    769_s_at                     ANXA2: annexin A2
GBM          0.7257    0.9825   0.8299   0.7598    33891_at                     DKFZp564H182 protein
GBM          0.7207    0.9786   0.8262   0.7587    41549_s_at                   AP1S2: adaptor-related protein complex 1, sigma 2 subunit
GBM          0.7152    0.9775   0.8239   0.7547    37759_at                     LAPTM5: lysosomal-associated multispanning membrane protein-5
GBM          0.7150    0.9751   0.8225   0.7539    AFFX-HUMISGF3A/M97935_MA_at  STAT1: signal transducer and activator of transcription 1, 91kD

             Permutation Test                      Marker Genes
Distinction  Distance  Perm 1%  Perm 5%  Perm 10%  Feature     Description
GBM          0.7121    0.9636   0.8217   0.7531    38650_at    IGFBP5: insulin-like growth factor binding protein 5
GBM          0.7112    0.9600   0.8166   0.7501    36950_at    HSGP25L2G: gp25L2 protein
GBM          0.7076    0.9569   0.8142   0.7499    40817_at    NUCB1: nucleobindin 1
GBM          0.7058    0.9542   0.8102   0.7467    38253_at    AGL: amylo-1,6-glucosidase, 4-alpha glucanotransferase (glycogen debranching enzyme, glycogen storage disease type III)
GBM          0.7022    0.9520   0.8067   0.7428    38812_at    LAMB2: laminin, beta 2 (laminin S)
GBM          0.7018    0.9365   0.8051   0.7307    34224_at    FADS3: fatty acid desaturase 3
GBM          0.6975    0.9359   0.8048   0.7290    39376_at    KIAA0630 protein
GBM          0.6937    0.9251   0.7995   0.7235    37714_at    GAP43: growth associated protein 43
GBM          0.6930    0.9240   0.7991   0.7219    37628_at    MAOB: monoamine oxidase B
GBM          0.6910    0.9228   0.7899   0.7216    1649_at     Human putative cyclin G1 interacting protein
GBM          0.6867    0.9210   0.7897   0.7204    38760_f_at  BTN3A2: butyrophilin, subfamily 3, member A2
AO           1.8499    1.6556   1.3782   1.2758    33619_at    RPS13: ribosomal protein S13
AO           1.6403    1.2785   1.2232   1.1896    34679_at    BCR: breakpoint cluster region
AO           1.4822    1.2658   1.1734   1.1376    37573_at    ANGPTL2: angiopoietin-like 2
AO           1.4652    1.2568   1.1395   1.0980    33677_at    RPL24: ribosomal protein L24
AO           1.4567    1.2146   1.1241   1.0762    326_i_at    RPS20: Ribosomal protein S20
AO           1.4044    1.1938   1.0952   1.0535    41325_at    KCNK3: potassium channel, subfamily K, member 3 (TASK-1)
AO           1.4022    1.1925   1.0676   1.0309    38681_at    EIF3S6: eukaryotic translation initiation factor 3, subunit 6 (48kD)
AO           1.3203    1.1910   1.0460   0.9988    41792_at    ABCC8: ATP-binding cassette, sub-family C (CFTR/MRP), member 8
AO           1.3163    1.1745   1.0286   0.9905    37249_at    PDE8B: phosphodiesterase 8B
AO           1.2909    1.1718   1.0260   0.9804    37953_s_at  ACCN2: amiloride-sensitive cation channel 2, neuronal
AO           1.2866    1.1641   0.9905   0.9720    35125_at    RPS6: Ribosomal protein S6
AO           1.2755    1.1622   0.9871   0.9388    40235_at    ACK1: activated p21cdc42Hs kinase
AO           1.2648    1.1595   0.9773   0.9360    41016_at    KIAA0510 protein
AO           1.2501    1.1584   0.9735   0.9222    40840_at    PPIF: peptidylprolyl isomerase F (cyclophilin F)
AO           1.2405    1.1535   0.9676   0.9109    34531_at    FLRT1: fibronectin leucine rich transmembrane protein 1
AO           1.2402    1.1335   0.9614   0.9060    37578_at    Homo sapiens clone-RES4-4
AO           1.2377    1.1315   0.9448   0.9014    1134_at     ACK1: activated p21cdc42Hs kinase
AO           1.2341    1.1073   0.9343   0.8900    41749_at    C21orf33: chromosome 21 open reading frame 33
AO           1.2237    1.1071   0.9339   0.8848    38340_at    KIAA0655 protein: huntingtin interacting protein-1-related
AO           1.2166    1.0978   0.9261   0.8840    36196_at    PFKM: phosphofructokinase, muscle
AO           1.1963    1.0825   0.9002   0.8708    39427_at    UQCRB: ubiquinol-cytochrome c reductase binding protein
AO           1.1878    1.0824   0.8999   0.8691    32341_f_at  RPL23A: ribosomal protein L23a
AO           1.1741    1.0807   0.8949   0.8678    36164_at    PDX1: pyruvate dehydrogenase complex, lipoyl containing component X; E3-binding protein
AO           1.1702    1.0749   0.8908   0.8666    39856_at    RPL36A: ribosomal protein L36a
AO           1.1691    1.0660   0.8810   0.8590    36617_at    ID1: inhibitor of DNA binding 1, dominant negative helix-loop-helix protein
AO           1.1661    1.0413   0.8809   0.8532    41250_at    JTV1: JTV1 gene
AO           1.1607    1.0398   0.8794   0.8489    32436_at    RPL27A: ribosomal protein L27a
AO           1.1570    1.0388   0.8718   0.8461    39572_at    GRIK2: glutamate receptor, ionotropic, kainate 2
AO           1.1488    1.0378   0.8656   0.8422    35852_at    CRY2: cryptochrome 2 (photolyase-like)
AO           1.1401    1.0196   0.8579   0.8375    36358_at    RPL9: ribosomal protein L9
AO           1.1154    1.0150   0.8528   0.8374    36027_at    POLR2F: polymerase (RNA) II (DNA directed) polypeptide F
AO           1.1128    1.0137   0.8467   0.8284    39864_at    CIRBP: cold inducible RNA-binding protein
AO           1.1057    1.0121   0.8441   0.8273    34184_at    APCL: adenomatous polyposis coli like
AO           1.1035    1.0097   0.8439   0.8253    32791_at    MAC30: hypothetical protein
AO           1.1016    1.0085   0.8367   0.8222    36618_g_at  ID1: inhibitor of DNA binding 1, dominant negative helix-loop-helix protein
AO           1.0957    1.0056   0.8361   0.8124    33485_at    RPL4: ribosomal protein L4

             Permutation Test                      Marker Genes
Distinction  Distance  Perm 1%  Perm 5%  Perm 10%  Feature     Description
AO           1.0949    1.0028   0.8355   0.8099    32576_at    EIF3S5: eukaryotic translation initiation factor 3, subunit 5 (epsilon, 47kD)
AO           1.0877    1.0024   0.8306   0.8093    537_f_at    Human breakpoint cluster region (BCR) gene
AO           1.0870    1.0013   0.8268   0.8040    327_f_at    RPS20: Ribosomal protein S20
AO           1.0854    0.9997   0.8222   0.8038    34345_at    TOM: putative mitochondrial outer membrane protein import receptor
AO           1.0740    0.9934   0.8172   0.8035    31708_at    RPL30: ribosomal protein L30
AO           1.0713    0.9926   0.8150   0.7994    41264_at    DKFZp586F1322 protein
AO           1.0620    0.9919   0.8101   0.7979    41269_r_at  API5L1: API5-like 1
AO           1.0609    0.9833   0.8047   0.7948    35848_at    DKFZp586J231 protein
AO           1.0593    0.9778   0.8034   0.7752    841_at      OLIG2: oligodendrocyte lineage transcription factor 2
AO           1.0567    0.9745   0.8012   0.7726    35633_at    ELMO1: engulfment and cell motility 1 (ced 12 homolog, C. elegans)
AO           1.0413    0.9717   0.8011   0.7627    32487_s_at  KPNA4: karyopherin alpha 4 (importin alpha 3)
AO           1.0364    0.9648   0.7998   0.7624    41289_at    NCAM1: neural cell adhesion molecule 1
AO           1.0355    0.9579   0.7992   0.7584    37697_s_at  Homo sapiens porin (por) mRNA
AO           1.0346    0.9550   0.7952   0.7558    35326_at    54TM: putative transmembrane protein; homolog of yeast Golgi membrane protein Yif1p (Yip1p-interacting factor)
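The Perm 1%, Perm 5% and Perm 10% columns of the class marker table are consistent with rank-wise thresholds from a label permutation test, although the supplement does not spell out the procedure. The sketch below shows one plausible reading: shuffle the class labels many times, record the score of the k-th best marker in each shuffle, and report the score reached by only the top 1%, 5% or 10% of shuffles at each rank. The function names, the number of permutations, the percentile indexing and the signal-to-noise formula are all assumptions.

```python
import random
from statistics import mean, stdev


def s2n(in_class, out_class):
    """Assumed signal-to-noise form: mean difference over summed std devs."""
    noise = stdev(in_class) + stdev(out_class)
    return (mean(in_class) - mean(out_class)) / noise if noise else 0.0


def ranked_scores(expression, labels, target):
    """Scores of all features for `target`, sorted best first."""
    scores = []
    for values in expression.values():
        in_class = [v for v, lab in zip(values, labels) if lab == target]
        out_class = [v for v, lab in zip(values, labels) if lab != target]
        scores.append(s2n(in_class, out_class))
    return sorted(scores, reverse=True)


def permutation_thresholds(expression, labels, target, n_perm=200,
                           levels=(0.01, 0.05, 0.10), seed=0):
    """For each rank k, return {level: null score} from permuted labels."""
    rng = random.Random(seed)
    by_rank = [[] for _ in expression]
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any real link between label and expression
        for rank, score in enumerate(ranked_scores(expression, shuffled, target)):
            by_rank[rank].append(score)
    thresholds = []
    for null_scores in by_rank:
        null_scores.sort(reverse=True)
        # Value at the top `level` fraction of the permuted scores for this rank.
        thresholds.append({
            level: null_scores[max(0, round(level * n_perm) - 1)]
            for level in levels
        })
    return thresholds


labels = ["GBM", "GBM", "GBM", "AO", "AO", "AO"]
toy_expression = {  # made-up values, not taken from the dataset
    "g1": [900, 1000, 1100, 200, 300, 400],
    "g2": [100, 150, 200, 700, 800, 900],
    "g3": [500, 510, 520, 505, 515, 495],
    "g4": [10, 400, 30, 420, 50, 380],
}
null_table = permutation_thresholds(toy_expression, labels, "GBM")
```

Under this reading, an observed distance above the Perm 1% value at the same rank would be unlikely to arise from random class labels. The thresholds shrink with rank and with looser significance levels, which matches the pattern in the table.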
Features of the 20-feature k-NN Class Prediction Model
The table below lists the feature numbers and gene identifications of the 20-feature k-NN
class prediction model.
Class Correlation  Feature Number  Accession Number  Gene Description
GBM                34091_s_at      Z19554            VIM: vimentin
GBM                630_at          L39874            DCTD: dCMP deaminase
GBM                631_g_at        L39874            DCTD: dCMP deaminase
GBM                39691_at        AB007960          SH3GLB1: SH3-domain GRB2-like endophilin B1
GBM                160039_at       NM_002747         MAPK4: mitogen-activated protein kinase 4
GBM                35016_at        M13560            CD74: CD74 antigen (invariant polypeptide of major histocompatibility complex, class II antigen-associated)
GBM                38791_at        D29643            DDOST: dolichyl-diphosphooligosaccharide protein glycosyltransferase
GBM                1395_at         L25081            ARHC: ras homolog gene family, member C
GBM                37542_at        D86961            LHFPL2: lipoma HMGIC fusion partner-like 2
GBM                935_at          L12168            CAP: adenylyl cyclase-associated protein
AO                 33619_at        L01124            RPS13: ribosomal protein S13
AO                 34679_at        X02596            BCR: breakpoint cluster region
AO                 37573_at        AF007150          ANGPTL2: angiopoietin-like 2
AO                 33677_at        M94314            RPL24: ribosomal protein L24
AO                 326_i_at        HG1800-HT1823     RPS20: Ribosomal Protein S20
AO                 41325_at        AF006823          KCNK3: potassium channel, subfamily K, member 3 (TASK-1)
AO                 38681_at        U62962            EIF3S6: eukaryotic translation initiation factor 3, subunit 6 (48kD)
AO                 41792_at        L78207            ABCC8: ATP-binding cassette, sub-family C (CFTR/MRP), member 8
AO                 37249_at        AF079529          PDE8B: phosphodiesterase 8B
AO                 37953_s_at      U78181            ACCN2: amiloride-sensitive cation channel 2, neuronal
Features Used During Building of the Class Prediction Model
The figure below shows every feature used to construct the 20-feature k-NN class prediction
model during leave-one-out cross validation, together with how often each feature was used.
The gene identifications of the feature numbers are given in the table that follows the figure.
[Figure: Fractional Feature Use (%), on a scale of 0 to 100, plotted for each of the 77 features used during leave-one-out cross validation. The features and their gene descriptions are listed in the table that follows.]
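The quantity plotted in this figure, fractional feature use, can be computed as below if it is read as the percentage of leave-one-out folds in which a feature was among those selected. This is a minimal sketch under that assumption. The function name and the fold contents are hypothetical; only the probe set IDs are borrowed from the supplement.

```python
from collections import Counter


def fractional_feature_use(features_per_fold):
    """Percentage of cross-validation folds in which each feature was selected."""
    n_folds = len(features_per_fold)
    # set() guards against a feature being listed twice within one fold.
    counts = Counter(f for fold in features_per_fold for f in set(fold))
    return {feature: 100.0 * n / n_folds for feature, n in counts.most_common()}


# Hypothetical selections for four folds (real folds would hold 20 features each).
folds = [
    ["34091_s_at", "630_at"],
    ["34091_s_at", "631_g_at"],
    ["34091_s_at", "630_at"],
    ["34091_s_at", "33619_at"],
]
usage = fractional_feature_use(folds)
```

A feature chosen in every fold scores 100%. Features that appear only when a particular sample is held out score close to 0%, which would explain why 77 distinct features appear for a 20-feature model.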
Feature Number  Gene Description
631_g_at        DCTD: dCMP deaminase
34679_at        BCR: breakpoint cluster region
33619_at        RPS13: ribosomal protein S13
630_at          DCTD: dCMP deaminase
34091_s_at      VIM: vimentin
35016_at        CD74: CD74 antigen (invariant polypeptide of major histocompatibility complex, class II antigen-associated)
37573_at        ANGPTL2: angiopoietin-like 2
160039_at       MAPK4: mitogen-activated protein kinase 4
33677_at        RPL24: ribosomal protein L24
38681_at        EIF3S6: eukaryotic translation initiation factor 3, subunit 6 (48kD)
41325_at        KCNK3: potassium channel, subfamily K, member 3 (TASK-1)
326_i_at        RPS20: Ribosomal protein S20
39691_at        SH3GLB1: SH3-domain GRB2-like endophilin B1
1395_at         ARHC: ras homolog gene family, member C
38791_at        DDOST: dolichyl-diphosphooligosaccharide protein glycosyltransferase
37249_at        PDE8B: phosphodiesterase 8B
37542_at        LHFPL2: lipoma HMGIC fusion partner-like 2
37953_s_at      ACCN2: amiloride-sensitive cation channel 2, neuronal
1318_at         RBBP4: retinoblastoma-binding protein 4
36678_at        TAGLN2: transgelin 2
41792_at        ABCC8: ATP-binding cassette, sub-family C (CFTR/MRP), member 8
37578_at        Homo sapiens clone-RES4-4
35125_at        RPS6: Ribosomal protein S6
41749_at        C21orf33: chromosome 21 open reading frame 33
38650_at        IGFBP5: insulin-like growth factor binding protein 5
935_at          CAP: adenylyl cyclase-associated protein
36358_at        RPL9: ribosomal protein L9
34768_at        TXNDC: thioredoxin domain-containing
41016_at        KIAA0510 protein
34531_at        FLRT1: fibronectin leucine rich transmembrane protein 1
32749_s_at      DKFZp586K1720 protein
388_at          PIK3R2: phosphoinositide-3-kinase, regulatory subunit, polypeptide 2 (p85 beta)
40235_at        ACK1: activated p21cdc42Hs kinase
40793_s_at      AQP4: aquaporin 4
37012_at        CAPZB: capping protein (actin filament) muscle Z-line, beta
34193_at        CHL1: cell adhesion molecule with homology to L1CAM (close homologue of L1)
38340_at        KIAA0655 protein: huntingtin interacting protein-1-related
36617_at        ID1: inhibitor of DNA binding 1, dominant negative helix-loop-helix protein
32576_at        EIF3S5: eukaryotic translation initiation factor 3, subunit 5 (epsilon, 47kD)
32819_at        H2BFA: H2B histone family, member A
36196_at        PFKM: phosphofructokinase, muscle
39856_at        RPL36A: ribosomal protein L36a
37680_at        AKAP12: A kinase (PRKA) anchor protein (gravin) 12
40840_at        PPIF: peptidylprolyl isomerase F (cyclophilin F)
31342_at        GALNT2: UDP-N-acetyl-alpha-D galactosamine:polypeptide N acetylgalactosaminyltransferase 2 (GalNAc-T2)
38391_at        CAPG: capping protein (actin filament), gelsolin-like
32436_at        RPL27A: ribosomal protein L27a
36927_at        GS3686: hypothetical protein, expressed in osteoblast
33485_at        RPL4: ribosomal protein L4
406_at          ITGB4: integrin, beta 4
32791_at        MAC30: hypothetical protein
39522_at        PFKFB3: 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3
38545_at        INHBB: inhibin, beta B (activin AB beta polypeptide)
841_at          OLIG2: oligodendrocyte lineage transcription factor 2
41485_at        LDHA: lactate dehydrogenase A
41551_at        RER1: similar to S. cerevisiae RER1
40807_at        MUF1: MUF1 protein
39572_at        GRIK2: glutamate receptor, ionotropic, kainate 2
41250_at        JTV1: JTV1 gene
537_f_at        Human breakpoint cluster region (BCR) gene
35852_at        CRY2: cryptochrome 2 (photolyase-like)
32297_s_at      KLRC2: killer cell lectin-like receptor subfamily C, member 2
1704_at         VAV2: vav 2 oncogene
38338_at        RRAS: related RAS viral (r-ras) oncogene homolog
39694_at        Hypothetical protein MGC5508
36150_at        KIAA0842 protein
1860_at         TP53BP2: tumor protein p53-binding protein, 2
32852_at        TXN2: thioredoxin, mitochondrial
41624_r_at      FZR1: Fzr1 protein
39112_at        USF2: upstream transcription factor 2, c-fos interacting
37421_f_at      Human DNA sequence from clone RP3-377H14 on chromosome 6p21.32-22.1
36027_at        POLR2F: polymerase (RNA) II (DNA directed) polypeptide F
31481_s_at      TMSB10: thymosin, beta 10
39427_at        UQCRB: ubiquinol-cytochrome c reductase binding protein
39338_at        S100A10: S100 calcium-binding protein A10 (annexin II ligand, calpactin I, light polypeptide (p11))
32487_s_at      KPNA4: karyopherin alpha 4 (importin alpha 3)
1134_at         ACK1: activated p21cdc42Hs kinase
Summary of Training Sample Set Class Predictions
The table below summarizes the class predictions of the training sample set. This set includes the
21 classic high grade gliomas. The “call” is the classification given by the 20-feature k-NN
model during leave-one-out cross validation and appears along with the confidence value.
“Errors” are those tumors whose classification differed from the pathological classification.
GBM, glioblastoma; AO, anaplastic oligodendroglioma.
Sample Name  Call  Confidence  Pathology  “Error”
GBM Calls
Brain_CG_8   GBM   0.677       GBM
Brain_CG_11  GBM   0.610       GBM
Brain_CG_3   GBM   0.558       GBM
Brain_CG_4   GBM   0.524       GBM
Brain_CG_14  GBM   0.455       GBM
Brain_CG_2   GBM   0.445       GBM
Brain_CO_4   GBM   0.224       AO         *
Brain_CG_1   GBM   0.182       GBM
Brain_CG_9   GBM   0.158       GBM
Brain_CG_6   GBM   0.101       GBM
Brain_CG_13  GBM   0.008       GBM
Brain_CG_12  GBM   0.006       GBM
Brain_CG_7   GBM   0.000       GBM
AO Calls
Brain_CO_5   AO    0.377       AO
Brain_CO_1   AO    0.234       AO
Brain_CO_6   AO    0.166       AO
Brain_CO_2   AO    0.143       AO
Brain_CO_7   AO    0.141       AO
Brain_CG_5   AO    0.028       GBM        *
Brain_CO_3   AO    0.023       AO
Brain_CG_10  AO    0.021       GBM        *
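The leave-one-out procedure behind these training-set calls can be sketched as follows: each sample is held out in turn, the k-NN model is built on the remaining samples, and the held-out sample is then classified. The supplement does not state the value of k, the distance measure, or how the confidence value is computed, so the sketch assumes k = 3, Euclidean distance and a simple majority vote, and it omits confidence entirely. It also skips the per-fold feature selection that the supplement's feature-use figure implies. All names and data points are illustrative.

```python
from collections import Counter
from math import dist


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training rows closest to `query`."""
    nearest = sorted(range(len(train_rows)),
                     key=lambda i: dist(train_rows[i], query))[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out(rows, labels, k=3):
    """Classify every sample with a model built from all the other samples."""
    calls = []
    for held_out in range(len(rows)):
        train_rows = rows[:held_out] + rows[held_out + 1:]
        train_labels = labels[:held_out] + labels[held_out + 1:]
        calls.append(knn_predict(train_rows, train_labels, rows[held_out], k))
    return calls


# Made-up two-feature profiles forming two well separated groups.
rows = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels = ["GBM", "GBM", "GBM", "AO", "AO", "AO"]
calls = leave_one_out(rows, labels)
errors = [i for i, (call, truth) in enumerate(zip(calls, labels)) if call != truth]
```

An odd k avoids tied votes when there are only two classes. Samples listed in `errors` correspond to the rows marked with an asterisk in the table above.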
Summary of Test Sample Set Class Predictions
The table below summarizes the class predictions of the test sample set. This test sample set
includes the 29 remaining high grade gliomas that were not used in the training set. The “call” is
the classification given by the 20-feature k-NN model and appears along with the confidence
value. “Errors” are those tumors whose classification differed from the pathological
classification. GBM, glioblastoma; AO, anaplastic oligodendroglioma.
Sample Name  Call  Confidence  Pathology  “Error”
GBM Calls
Brain_NO_8   GBM   0.6998      AO         *
Brain_NG_10  GBM   0.6852      GBM
Brain_NO_6   GBM   0.5952      AO         *
Brain_NO_7   GBM   0.5889      AO         *
Brain_NG_14  GBM   0.5635      GBM
Brain_NO_14  GBM   0.5497      AO         *
Brain_NG_3   GBM   0.4755      GBM
Brain_NG_12  GBM   0.4352      GBM
Brain_NG_9   GBM   0.4086      GBM
Brain_NG_6   GBM   0.3103      GBM
Brain_NG_5   GBM   0.3027      GBM
Brain_NG_7   GBM   0.3024      GBM
Brain_NO_4   GBM   0.3002      AO         *
Brain_NG_8   GBM   0.2978      GBM
Brain_NG_4   GBM   0.2770      GBM
Brain_NO_10  GBM   0.2261      AO         *
Brain_NO_3   GBM   0.1414      AO         *
Brain_NO_5   GBM   0.0875      AO         *
Brain_NG_11  GBM   0.0718      GBM
Brain_NG_13  GBM   0.0536      GBM
Brain_NO_13  GBM   0.0204      AO         *
Brain_NO_12  GBM   0.0000      AO         *
AO Calls
Brain_NO_2   AO    0.5160      AO
Brain_NO_11  AO    0.4445      AO
Brain_NO_9   AO    0.3126      AO
Brain_NO_15  AO    0.2984      AO
Brain_NG_2   AO    0.2578      GBM        *
Brain_NO_1   AO    0.1871      AO
Brain_NG_1   AO    0.0108      GBM        *
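The agreement between model calls and pathology in the test set can be tallied directly from the table above. The short script below is only a convenience for checking the counts. It relies on the sample naming used throughout the supplement (NG for non-classic glioblastoma, NO for non-classic anaplastic oligodendroglioma) to recover the pathological class. The variable and function names are illustrative.

```python
from collections import Counter

# Sample names (without the "Brain_" prefix) grouped by model call,
# copied from the test-set table.
GBM_CALLS = ["NO_8", "NG_10", "NO_6", "NO_7", "NG_14", "NO_14", "NG_3", "NG_12",
             "NG_9", "NG_6", "NG_5", "NG_7", "NO_4", "NG_8", "NG_4", "NO_10",
             "NO_3", "NO_5", "NG_11", "NG_13", "NO_13", "NO_12"]
AO_CALLS = ["NO_2", "NO_11", "NO_9", "NO_15", "NG_2", "NO_1", "NG_1"]


def pathology(sample):
    """Pathological class implied by the sample name prefix."""
    return "GBM" if sample.startswith("NG") else "AO"


def tally(gbm_calls, ao_calls):
    """Count samples for every (call, pathology) combination."""
    counts = Counter()
    for call, samples in (("GBM", gbm_calls), ("AO", ao_calls)):
        for sample in samples:
            counts[(call, pathology(sample))] += 1
    return counts


counts = tally(GBM_CALLS, AO_CALLS)
discordant = counts[("GBM", "AO")] + counts[("AO", "GBM")]
```

For the 29 test samples this gives 12 calls that differ from pathology: 10 pathological AOs called GBM and 2 pathological GBMs called AO. That matches the 12 asterisks in the table.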
Survival Statistics for the High Grade Glioma Dataset
The table below summarizes survival statistics for the entire high grade glioma dataset. Survival
from date of initial diagnosis is given for all patients. For living patients, survival is given to time
of last follow-up.
Sample Name  Vital Status  Survival from Date of Initial Diagnosis (Days)
Brain_CG_1   Dead          308
Brain_CG_2   Dead          281
Brain_CG_3   Dead          501
Brain_CG_4   Dead          670
Brain_CG_5   Alive         729
Brain_CG_6   Dead          21
Brain_CG_7   Alive         630
Brain_CG_8   Dead          263
Brain_CG_9   Dead          219
Brain_CG_10  Dead          408
Brain_CG_11  Dead          242
Brain_CG_12  Dead          323
Brain_CG_13  Dead          213
Brain_CG_14  Dead          97
Brain_NG_1   Dead          1375
Brain_NG_2   Alive         1644
Brain_NG_3   Dead          406
Brain_NG_4   Dead          308
Brain_NG_5   Dead          177
Brain_NG_6   Dead          103
Brain_NG_7   Alive         992
Brain_NG_8   Dead          41
Brain_NG_9   Alive         1354
Brain_NG_10  Dead          276
Brain_NG_11  Dead          519
Brain_NG_12  Dead          368
Brain_NG_13  Dead          157
Brain_NG_14  Dead          1162
Brain_CO_1   Alive         231
Brain_CO_2   Alive         1674
Brain_CO_3   Alive         1604
Brain_CO_4   Dead          215
Brain_CO_5   Alive         359
Brain_CO_6   Alive         171
Brain_CO_7   Dead          272
Brain_NO_1   Dead          63
Brain_NO_2   Alive         585
Brain_NO_3   Alive         1804
Brain_NO_4   Dead          916
Brain_NO_5   Dead          793
Brain_NO_6   Dead          803
Brain_NO_7   Dead          559
Brain_NO_8   Alive         1137
Brain_NO_9   Alive         1100
Brain_NO_10  Dead          498
Brain_NO_11  Alive         795
Brain_NO_12  Dead          790
Brain_NO_13  Dead          789
Brain_NO_14  Alive         439
Brain_NO_15  Alive         638
Survival curves - all glioblastomas and anaplastic oligodendrogliomas
The figure below shows a significant difference between the survival curves of all patients
with glioblastomas and those with anaplastic oligodendrogliomas (p=0.009). Survival curves were
plotted according to classifications based on traditional pathology. GBM, glioblastoma; AO,
anaplastic oligodendroglioma.
[Figure: survival curves for AO and GBM. Y-axis: Percent Survival (0 to 100). X-axis: Time (Months) (0 to 70).]
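Survival curves of this kind are commonly drawn as product-limit (Kaplan-Meier) estimates, in which patients still alive at last follow-up are treated as censored rather than as deaths. The supplement does not name the estimator or the test behind the p-value, so the sketch below is an assumption about the method, and it does not reproduce p=0.009. The data are the seven classic anaplastic oligodendroglioma rows from the survival statistics table. The function name is illustrative.

```python
def kaplan_meier(times, died):
    """Return [(time, survival probability)] at each distinct death time.

    times: survival in days from initial diagnosis.
    died:  True if the patient died at that time, False if still alive at
           last follow-up (censored).
    """
    death_times = sorted({t for t, d in zip(times, died) if d})
    curve, survival = [], 1.0
    for t in death_times:
        at_risk = sum(1 for x in times if x >= t)  # still under observation at t
        deaths = sum(1 for x, d in zip(times, died) if x == t and d)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Brain_CO_1 .. Brain_CO_7 from the survival statistics table.
co_days = [231, 1674, 1604, 215, 359, 171, 272]
co_died = [False, False, False, True, False, False, True]
co_curve = kaplan_meier(co_days, co_died)
```

For these seven patients the estimate falls to 5/6 at day 215 (six patients still at risk) and to 0.625 at day 272 (four at risk). The censored patients leave the risk set without lowering the curve.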