Supplementary methods

Liquid Chromatography Mass spectrometry (LCMS). The HLA peptide pools as obtained were separated according to their hydrophobicity by reversed-phase chromatography (nanoAcquity UPLC system, Waters), and the eluting peptides were analyzed in an LTQ-Orbitrap hybrid mass spectrometer (Thermo Fisher Scientific) equipped with an electrospray ionization (ESI) source. Eluted peptide pools were loaded directly onto the analytical fused-silica micro-capillary column (75 µm i.d. x 250 mm) packed with 1.7 µm C18 reversed-phase material (Waters) at a flow rate of 400 nl per minute. Subsequently, the peptides were separated using a two-step 180-minute binary gradient from 10% to 33% B at a flow rate of 300 nl per minute. The gradient was composed of solvent A (0.1% formic acid in water) and solvent B (0.1% formic acid in acetonitrile). A gold-coated glass capillary (PicoTip, New Objective) was used for introduction into the nanoESI source. The LTQ-Orbitrap mass spectrometer was operated in data-dependent mode using a TOP5 and a TOP3 strategy. In brief, a scan cycle was initiated with a full scan of high mass accuracy in the Orbitrap (R = 30,000 for TOP3, R = 60,000 for TOP5), which was followed by MS/MS scans either in the Orbitrap (R = 7,500) on the 5 most abundant precursor ions (TOP5) or in the LTQ on the 3 most abundant precursor ions (TOP3), in both cases with dynamic exclusion of previously selected ions.

Data analysis and peptide sequence identification. The LCMS data were processed by analyzing the LCMS survey data (parent mass of the unfragmented peptide) as well as the Tandem-MS (MS/MS) data (spectra of fragmented peptides containing sequence information). Data analysis was optimized and adapted for identification of HLA peptides using the following methods. Each new data set was integrated into a MySQL-based database.
Tandem-MS spectra were extracted using msn_extract (Thermo Fisher Scientific) and searched with the Sequest algorithm (http://fields.scripps.edu/sequest/index.html) against the IPI database. The protein database hits were subsequently validated by automated quality filtering using thresholds established on manually identified HLA peptidomics data. For increased identification sensitivity, an in-house developed spectral clustering algorithm was used to assign spectra to known peptide MS/MS clusters, which are collected in the fragment spectra library. Before being considered as peptide vaccine candidates, peptide sequences suggested by the described automated pipeline were confirmed by manual inspection. The identity of peptides was further assured by comparing the recorded fragmentation pattern of the natural peptide with that of a synthetic reference peptide of the sequence in question. These methods are also detailed in two patents available at www.google.com/patents/US20050221350 and www.google.com/patents/US20110257890.

Relative peptide quantification. LC-MS survey data (signals of intact, unfragmented peptides providing quantitative information) were analyzed independently of the Tandem-MS fragment spectra data (recorded in the same experiment and yielding peptide sequence information), making use of the high mass accuracy. To extract LC-MS signals as well as the signal areas (ion counting), the program SuperHirn (ETH Zürich) [Mueller et al. 2007] was used. Thus, each identified peptide could be associated with quantitative data, allowing relative quantification between samples and tissues. To account for variation between technical and biological replicates, a two-tier normalization scheme based on central tendency normalization was used. The normalization assumes that most measured signals result from housekeeping peptides and that the small fraction of over-presented peptides does not significantly influence the central tendency of the data.
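As an illustration of the two-tier central tendency normalization described in this section, the following Python sketch first aligns the replicate runs of each sample and then aligns the samples of one preparation antibody. It is not the in-house implementation. The function names, the direction of the factor (peptide mean divided by measured area), the requirement that a peptide pass the 50% coefficient-of-variation filter in every sample, and the use of complete, strictly positive area matrices are all illustrative assumptions.

```python
from statistics import mean, stdev


def column_factors(matrix):
    """Return one normalization factor per column of a peptide-by-column matrix.

    Assumption: the factor for a peptide in a column is the peptide's mean
    area across columns divided by its area in that column; the column
    factor is the average of these values over all peptides.
    Areas must be strictly positive.
    """
    per_peptide = []
    for row in matrix:
        row_mean = mean(row)
        per_peptide.append([row_mean / area for area in row])
    n_columns = len(matrix[0])
    return [mean([f[c] for f in per_peptide]) for c in range(n_columns)]


def apply_factors(matrix, factors):
    """Multiply every column of the matrix by its factor."""
    return [[area * f for area, f in zip(row, factors)] for row in matrix]


def coefficient_of_variation(values):
    """Sample standard deviation over mean; needs at least two values."""
    return stdev(values) / mean(values)


def two_tier_normalize(samples, max_cv=0.5):
    """Normalize replicate runs per sample, then samples against each other.

    samples: dict mapping a sample name to a matrix whose rows are peptides
    (same order in every sample) and whose columns are replicate LC-MS runs.
    Returns the sample names and a peptide-by-sample matrix of normalized
    mean areas.
    """
    names = list(samples)

    # Tier 1: run-wise factors within each replicate set.
    tier1 = {n: apply_factors(samples[n], column_factors(samples[n]))
             for n in names}

    # Mean presentation of each peptide in each sample after tier 1.
    n_peptides = len(tier1[names[0]])
    sample_means = [[mean(tier1[n][p]) for n in names]
                    for p in range(n_peptides)]

    # Only peptides with CV below the threshold in every sample (assumption)
    # contribute to the sample-wise factors.
    stable = [p for p in range(n_peptides)
              if all(coefficient_of_variation(tier1[n][p]) < max_cv
                     for n in names)]

    # Tier 2: sample-wise factors, applied to all peptides of each sample.
    factors = column_factors([sample_means[p] for p in stable])
    return names, apply_factors(sample_means, factors)
```

For two peptides measured in duplicate in two hypothetical samples, `two_tier_normalize({"A": [[100, 200], [10, 20]], "B": [[300, 300], [30, 30]]})` scales both samples to the same level per peptide, since here all differences between columns are purely systematic.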
In the first normalization step, replicates of the same sample were normalized by calculating the mean presentation of each peptide in the respective replicate set. This mean was used to compute normalization factors for each peptide and LCMS run. Averaging over all peptides resulted in run-wise normalization factors, which were applied to all peptides of the particular LCMS run. This approach removes systematic intra-sample variation, e.g. due to different injection volumes between replicate runs. Only peptides with a coefficient of variation below 50% between their replicate areas were considered for the calculation of further normalization factors. In the second step, the mean presentation of each peptide was again calculated, this time over all samples of a defined preparation antibody (e.g. BB7.2). The mean was used to compute normalization factors for each peptide and sample. Averaging over all peptides resulted in sample-wise normalization factors, which were applied to all peptides of the particular sample. Systematic bias due to different tissue weights or MHC expression levels was thereby removed. For each peptide, a presentation profile was calculated showing the mean sample presentation as well as replicate variations. The profile juxtaposes GBM samples to a baseline of normal tissue samples.

Identification and selection of HLA-A*02-restricted peptides. Assignment of HLA-A*02 restriction to a peptide sequence was based on the following criteria: (i) detection by MS/MS analysis from a sample immunoprecipitated with the HLA-A*02-specific antibody BB7.2 and (ii) SYFPEITHI score analysis (for 9-mers and 10-mers) or anchor residue criteria in cases where no SYFPEITHI matrix was available. HLA-A*02 restriction was experimentally confirmed by determination of the binding constant Kd for all ten selected peptides in an HLA refolding assay and by the fact that refolding of HLA-A*02 monomers for tetramer staining, as shown in Fig. 3, was successful for all ten peptides. Analysis of GBM samples was performed until the rate of newly identified sequences among the last 1,000 total identifications dropped below 15% (i.e. ≥85% of identifications resulted in already known sequences of the GBM peptidome). For the 309 pre-selected peptides, the median SYFPEITHI score is 25 and the 10th percentile is 17, a score common to several published A*02-binding peptides (www.syfpeithi.de). For 26 of the pre-selected peptides, a SYFPEITHI score was not available (not 9- or 10-mers), but they meet several characteristics of A*02 binding in terms of anchor residues.

Gene expression analysis. Gene expression analysis of all tumor and normal tissue RNA samples was performed using Affymetrix Human Genome (HG) U133A or HG-U133 Plus 2.0 oligonucleotide microarrays (Affymetrix). The same normal kidney sample was hybridized to both array types to achieve direct comparability of all samples. All steps were carried out according to the Affymetrix manual (http://media.affymetrix.com/support/downloads/manuals/expression_analysis_technical_manual.pdf). Briefly, double-stranded cDNA was synthesized from 5–8 µg of total RNA using SuperScript RTII (Invitrogen) and the oligo-dT-T7 primer (MWG Biotech) as described in the manual. In vitro transcription was performed with the BioArray High Yield RNA Transcript Labeling Kit (ENZO Diagnostics, Inc.) for the U133A arrays or with the GeneChip IVT Labeling Kit (Affymetrix) for the U133 Plus 2.0 arrays, followed by cRNA fragmentation, hybridization, and staining with streptavidin-phycoerythrin and biotinylated anti-streptavidin antibody (Molecular Probes). Images were scanned with the Agilent 2500A GeneArray Scanner (U133A) or the Affymetrix GeneChip Scanner 3000 (U133 Plus 2.0), and data were analyzed with the GCOS software (Affymetrix), using default settings for all parameters. Pairwise comparisons were calculated using the respective normal kidney array as baseline.
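The gene expression analysis in this section pre-filters genes with an empirical over-expression score S, the average of four tumor-versus-normal differences of signal log ratios, with a cut-off of S ≥ 1.8. A minimal Python sketch of that calculation follows. It reads each of the four terms as a tumor value minus a normal value; the function names and the rounding rule used to pick the top 40% of tumor samples are assumptions made here for illustration, not details given in the text.

```python
from statistics import mean


def overexpression_score(tumor, normal, top_fraction=0.4):
    """Empirical over-expression score S for one gene.

    tumor, normal: signal log ratios of the gene in tumor and normal samples.
    top_fraction: share of tumor samples with the highest expression
    (40% in the text). Rounding to a sample count is an assumption.
    """
    mean_tumor = mean(tumor)
    n_top = max(1, round(top_fraction * len(tumor)))
    mean_tumor_top = mean(sorted(tumor, reverse=True)[:n_top])
    mean_normal = mean(normal)
    max_normal = max(normal)
    return 0.25 * (
        (mean_tumor - mean_normal)
        + (mean_tumor - max_normal)
        + (mean_tumor_top - mean_normal)
        + (mean_tumor_top - max_normal)
    )


def is_overexpressed(score, cutoff=1.8):
    """Apply the empirical cut-off S >= 1.8."""
    return score >= cutoff
```

Because the score mixes the mean and the maximum of the normal tissues, a single normal tissue with high expression lowers S even when average normal expression is low.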
For normalization, 100 housekeeping genes provided by Affymetrix were used (http://www.affymetrix.com/support/technical/mask_files.affx). Relative expression values were calculated from the signal log ratios given by the software, and the normal brain sample was arbitrarily set to 1.0. An empirical mRNA over-expression score (S score) was calculated based on the signal log ratios for each gene:

$$S = 0.25 \times \left[ (\text{mean}_{\text{tumor}} - \text{mean}_{\text{normal}}) + (\text{mean}_{\text{tumor}} - \text{max}_{\text{normal}}) + (\text{mean}_{\text{tumor,top40\%}} - \text{mean}_{\text{normal}}) + (\text{mean}_{\text{tumor,top40\%}} - \text{max}_{\text{normal}}) \right]$$

S compares expression levels in the analyzed GBM samples (the average of all analyzed samples, $\text{mean}_{\text{tumor}}$, and the average of the 40% of samples with the highest expression, $\text{mean}_{\text{tumor,top40\%}}$) with the average and the highest expression in normal tissues ($\text{mean}_{\text{normal}}$ and $\text{max}_{\text{normal}}$). An empirical cut-off (S ≥ 1.8) was set in order to pre-filter for “overexpressed genes”, qualifying the HLA-A*02-derived peptides from these genes for more detailed analysis as potential targets for GBM immunotherapy.

Tissue Microarray, Immunohistochemistry and Immunofluorescent Stainings. The TMA consisted of 250 formalin-fixed, paraffin-embedded GBM and 4 normal brain tissue samples as described elsewhere [Campos et al. 2011]. Informed consent was obtained from each patient according to the research proposals approved by the Institutional Review Board at Heidelberg Medical Faculty. Primary antibodies used in our study were: anti-PTP-zeta (1:50), anti-NRCAM (1:200), anti-FABP7 (1:200), anti-IGF2BP3 (uv, all Abcam), anti-NLGN4X (1:100), anti-Chi3L2 (1:25), anti-Brevican (1:200; Sigma-Aldrich), and anti-CSPG4 (1:500, Chemicon). Prior to TMA staining, the specificity of primary antibodies was verified using corresponding isotype controls (all Acris) on glioma control tissues at the same concentrations as the primary antibodies and as indicated by the manufacturer. Antigen retrieval, incubation with primary and secondary antibodies, and detection with the Vectastain Elite ABC Kit (Vector Laboratories) were carried out as described [Campos et al. 2011].
Each tumor biopsy was evaluated at 20x magnification by two independent investigators blinded to all clinical data. Staining of TMA biopsies was semiquantitatively graded in an antigen-dependent manner according to the estimated percentage of positive cells covering the whole tissue spot. In case of inter-observer variability, staining frequency on individual biopsies was counted manually. Average staining patterns from all biopsies of an individual tumor were taken as the final staining result. Additional antibodies used for double immunofluorescence staining were: mouse monoclonal anti-human CD31 (1:100; both BD Pharmingen), anti-human GFAP (ready to use; PROGEN), anti-human CD68 (1:25; Caltag) as well as the secondary antibodies anti-mouse ALEXA488 and anti-rabbit ALEXA555 (1:500; both Invitrogen).

Peptides, recombinant MHC molecules, fluorescent tetramers and artificial APC. Peptides were synthesized using standard Fmoc chemistry. The amino acid positions and sequence of each peptide are given in Table 1. The Melan-A peptide used as a control was Melan-A 26-35 (EAAGIGILTV). Biotinylated recombinant A*02 molecules and fluorescent MHC tetramers were produced as described previously [Altman et al. 1996]. The costimulatory mouse IgG2a anti-human CD28 antibody 9.3 [Jung et al. 1987] was biotinylated using sulfo-N-hydroxysuccinimidobiotin as recommended by the manufacturer (Perbio Science). For generation of artificial APCs, 5.6-µm-diameter streptavidin-coated polystyrene particles with a binding capacity of 0.064 µg of biotin-FITC per mg of microsphere (Bangs Laboratories) were resuspended at 5 × 10^6 particles per milliliter in PBS, 0.5% BSA with biotinylated MHC (1 µg/ml) and anti-human CD28 antibody (3 µg/ml) and incubated at room temperature for 30 min under agitation [Walter et al. 2003].

In vitro immunogenicity experiments. CD8+ T cells were stimulated with artificial APC.
Briefly, CD8+ T cells were isolated by magnetic bead positive selection using a MACS device (Miltenyi) and plated in 96-well round-bottom plates in 100 µl IMDM containing 8% human serum (Laboratoires Jacques Boy), penicillin, streptomycin, non-essential amino acids, sodium pyruvate and Hepes (all from Invitrogen) (CTL medium) at a concentration of 1 × 10^6 cells/well. Artificial APC (100 µl) were added together with rhIL-12 (5 ng/ml, Bioconcept). Cells were incubated for 3 days, and medium was then replaced with addition of IL-2 (10 IU/ml) and IL-7 (2.5 ng/ml, Bioconcept). Cultures were restimulated similarly at days 7 and 14 and tested at day 21 by flow cytometry. Each peptide was tested in 4 to 6 healthy individuals (9-12 wells per peptide) and in 7 to 11 patients with GBM (2-5 wells per peptide). For analysis of naïve and memory T cell populations, PBMC were sorted for CD8+ T cells by magnetic bead selection. The CD8+ population was then stained with CD45RA and CCR7 antibodies (Beckman Coulter), and the CD45RA+ CCR7+ (representing the naïve T cell population) and CD45RA- CCR7+/- (representing the memory T cell population) fractions were sorted by FACS. Each population was then separately stimulated three times, as described above, with artificial APC incorporating either the BCA 478-486 or the Melan-A 26-35 control peptide. Cultures were stained with MHC/peptide tetramers incorporating either the cognate peptide or a control peptide.

Reference List

Altman JD, Moss PA, Goulder PJ et al. Phenotypic analysis of antigen-specific T lymphocytes. Science 1996; 274: 94-96.
Campos B, Bermejo JL, Han L et al. Expression of nuclear receptor corepressors and class I histone deacetylases in astrocytic gliomas. Cancer Sci 2011; 102: 387-392.
Jung G, Ledbetter JA, Muller-Eberhard HJ. Induction of cytotoxicity in resting human T lymphocytes bound to tumor cells by antibody heteroconjugates. Proc Natl Acad Sci U S A 1987; 84: 4611-4615.
Mueller LN, Rinner O, Schmidt A et al.
SuperHirn - a novel tool for high resolution LC-MS-based peptide/protein profiling. Proteomics 2007; 7: 3470-3480.
Walter S, Herrgen L, Schoor O et al. Cutting edge: predetermined avidity of human CD8 T cells expanded on calibrated MHC/anti-CD28-coated microspheres. J Immunol 2003; 171: 4974-4978.

Table S1. List of the GBM-associated proteins identified by peptidomics

Gene name | Accession number | Description
ACSBG1 | Q96GR2 | Long-chain-fatty-acid-CoA ligase ACSBG1
ACSL3 | O95573 | Long-chain-fatty-acid-CoA ligase 3
ADAM17 | P78536 | Disintegrin and metalloproteinase domain-containing protein 17
ADORA3 | P33765 | Adenosine receptor A3
AP3B2 | Q13367 | AP-3 complex subunit beta-2
APC | P25054 | Adenomatous polyposis coli protein
ARAP2 | Q8WZ64 | Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing protein 2
ARHGAP12 | Q8IWW6 | Rho GTPase-activating protein 12
ARHGEF26 | Q96DR7 | Rho guanine nucleotide exchange factor 26
ARMC8 | Q8IUR7 | Armadillo repeat-containing protein 8
ARNT2 | Q9HBZ2 | Aryl hydrocarbon receptor nuclear translocator 2
ASNS | P08243 | Asparagine synthetase
ATAT1 | Q5SQI0 | Alpha-tubulin N-acetyltransferase
ATP1A2 | P50993 | Sodium/potassium-transporting ATPase subunit alpha-2
ATP2B1 | P20020 | Plasma membrane calcium-transporting ATPase 1
ATR | Q13535 | Serine/threonine-protein kinase ATR
BCAN | Q96GW7 | Brevican core protein
BCAT1 | P54687 | Branched-chain-amino-acid aminotransferase
BCHE | P06276 | Cholinesterase
C1QB | P02746 | Complement C1q subcomponent subunit B
CACNA1A | O00555 | Voltage-dependent P/Q-type calcium channel subunit alpha-1A
CASK | O14936 | Peripheral plasma membrane protein CASK
CCDC88A | Q3V6T2 | Girdin
CCDC93 | Q567U6 | Coiled-coil domain-containing protein 93
CCNB1 | P14635 | G2/mitotic-specific cyclin-B1
CCND2 | P30279 | G1/S-specific cyclin-D2
CD163 | Q86VB7 | Scavenger receptor cysteine-rich type 1 protein M130
CENPF | P49454 | Centromere protein F
CEP170 | Q5SW79 | Centrosomal protein of 170 kDa
CHI3L1 | P36222 | Chitinase-3-like protein 1
CHI3L2 | Q15782 | Chitinase-3-like protein 2
CLIP2 | Q9UDT6 | CAP-Gly domain-containing linker protein 2
COG4 | Q9H9E3 | Conserved oligomeric Golgi complex subunit 4
CRMP1 | Q14194 | Dihydropyrimidinase-related protein 1
CSPG4 | Q6UVK1 | Chondroitin sulfate proteoglycan 4
CSRP2BP | Q9H8E8 | Cysteine-rich protein 2-binding protein
CYBB | P04839 | Cytochrome b-245 heavy chain
DCLK2 | Q8N568 | Serine/threonine-protein kinase DCLK2
DOCK10 | Q96BY6 | Dedicator of cytokinesis protein 10
DPYSL3 | Q14195 | Dihydropyrimidinase-related protein 3
DPYSL4 | O14531 | Dihydropyrimidinase-related protein 4
DTNA | Q9Y4J8 | Dystrobrevin alpha
ECM29 | Q5VYK3 | Proteasome-associated protein ECM29 homolog
EDNRB | P24530 | Endothelin B receptor
EGFR | P00533 | Epidermal growth factor receptor
EIF4G3 | O43432 | Eukaryotic translation initiation factor 4 gamma 3
ELOVL2 | Q9NXB9 | Elongation of very long chain fatty acids protein 2
ENC1 | O14682 | Ectoderm-neural cortex protein 1
EXOC1 | Q9NV70 | Exocyst complex component 1
FABP7 | O15540 | Fatty acid-binding protein, brain
FAM115A | Q9Y4C2 | Protein FAM115A
FEN1 | P39748 | Flap endonuclease 1
GABPA | Q06546 | GA-binding protein alpha chain
GFAP | P14136 | Glial fibrillary acidic protein
GFPT2 | O94808 | Glucosamine-fructose-6-phosphate aminotransferase 2
GLT25D2 | Q8IYK4 | Procollagen galactosyltransferase 2
GPM6B | Q13491 | Neuronal membrane glycoprotein M6-b
GPR56 | Q9Y653 | G-protein coupled receptor 56
GRIA2 | P42262 | Glutamate receptor 2
GRIA3 | P42263 | Glutamate receptor 3
H2AFY | O75367 | Core histone macro-H2A.1
HEATR6 | Q6AI08 | HEAT repeat-containing protein 6
HP1BP3 | Q5SSJ5 | Heterochromatin protein 1-binding protein 3
ID3 | Q02535 | DNA-binding protein inhibitor ID-3
IGF2BP3 | O00425 | Insulin-like growth factor 2 mRNA-binding protein 3
ITGB8 | P26012 | Integrin beta-8
KCTD3 | Q9Y597 | BTB/POZ domain-containing protein KCTD3
KIDINS220 | Q9ULH0 | Kinase D-interacting substrate of 220 kDa
KIF1A | Q12756 | Kinesin-like protein KIF1A
KLHL7 | Q8IXQ5 | Kelch-like protein 7
LANCL2 | Q9NS86 | LanC-like protein 2
LASS1 | P27544 | LAG1 longevity assurance homolog 1
LPPR4 | Q7Z2D5 | Lipid phosphate phosphatase-related protein type 4
MAGI2 | Q86UL8 | Membrane-associated guanylate kinase, WW and PDZ domain-containing protein 2
MAP1B | P46821 | Microtubule-associated protein 1B
MAP2 | P11137 | Microtubule-associated protein 2
MAP9 | Q49MG5 | Microtubule-associated protein 9
MAP4K4 | O95819 | Mitogen-activated protein kinase kinase kinase kinase 4
MBP | P02686 | Myelin basic protein
MCM4 | P33991 | DNA replication licensing factor MCM4
MED27 | Q6P2C8 | Mediator of RNA polymerase II transcription subunit 27
MLC1 | Q15049 | Membrane protein MLC1
MSH2 | P43246 | DNA mismatch repair protein Msh2
MSH6 | P52701 | DNA mismatch repair protein Msh6
NAV2 | Q8IVL1 | Neuron navigator 2
NCAN | O14594 | Neurocan core protein
NCAPG | Q9BPX3 | Condensin complex subunit 3
NCDN | Q9UBB6 | Neurochondrin
NDC80 | O14777 | Kinetochore protein NDC80 homolog
NES | P48681 | Nestin
NLGN4X | Q8N0W4 | Neuroligin-4, X-linked
NMD3 | Q96D46 | 60S ribosomal export protein NMD3
NPAS3 | Q8IXF0 | Neuronal PAS domain-containing protein 3
NR2E1 | Q9Y466 | Nuclear receptor subfamily 2 group E member 1
NRCAM | Q92823 | Neuronal cell adhesion molecule
PCDH17 | O14917 | Protocadherin-17
PCDHGC3 | Q9UN70 | Protocadherin gamma-C3
PDE4DIP | Q5VU43 | Myomegalin
PDPN | Q86YL7 | Podoplanin
PDS5A | Q29RF7 | Sister chromatid cohesion protein PDS5 homolog A
PGAP1 | Q75T13 | GPI inositol-deacylase
PHKG1 | Q16816 | Phosphorylase b kinase gamma catalytic chain, skeletal muscle isoform
PLEKHA4 | Q9H4M7 | Pleckstrin homology domain-containing family A member 4
PLIN2 | Q99541 | Perilipin-2
PLXNB3 | Q9ULL4 | Plexin-B3
PON2 | Q15165 | Serum paraoxonase/arylesterase 2
POSTN | Q15063 | Periostin
PRMT3 | O60678 | Protein arginine N-methyltransferase 3
PRUNE2 | Q8WUY3 | Protein prune homolog 2
PTPRZ1 | P23471 | Receptor-type tyrosine-protein phosphatase zeta
PUS7L | Q9H0K6 | Pseudouridylate synthase 7 homolog-like protein
PYGB | P11216 | Glycogen phosphorylase, brain form
QKI | Q96PU8 | Protein quaking
RB1 | P06400 | Retinoblastoma-associated protein
SACS | Q9NZJ4 | Sacsin
SAMSN1 | Q9NSI8 | SAM domain-containing protein SAMSN-1
SDC3 | O75056 | Syndecan-3
SEC31A | O94979 | Protein transport protein Sec31A
SEC61G | P60059 | Protein transport protein Sec61 subunit gamma
SERPINA3 | P01011 | Alpha-1-antichymotrypsin
SEZ6L | Q9BYH1 | Seizure 6-like protein
SFPQ | P23246 | Splicing factor, proline- and glutamine-rich
SLC1A3 | P43003 | Excitatory amino acid transporter 1
SLC1A4 | P43007 | Neutral amino acid transporter A
SLC4A4 | Q9Y6R1 | Electrogenic sodium bicarbonate cotransporter 1
SLCO1C1 | Q9NYB5 | Solute carrier organic anion transporter family member 1C1
SMARCA1 | P28370 | Probable global transcription activator SNF2L1
SMARCA5 | O60264 | SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 5
SMC2 | O95347 | Structural maintenance of chromosomes protein 2
SMC3 | Q9UQE7 | Structural maintenance of chromosomes protein 3
SMC6 | Q96SB8 | Structural maintenance of chromosomes protein 6
SOCS6 | O14544 | Suppressor of cytokine signaling 6
SPAG9 | O60271 | C-Jun-amino-terminal kinase-interacting protein 4
SRP72 | O76094 | Signal recognition particle 72 kDa protein
SRRT | Q9BXP5 | Serrate RNA effector molecule homolog
SSX2IP | Q9Y2D8 | Afadin- and alpha-actinin-binding protein
STK17A | Q9UEE5 | Serine/threonine-protein kinase 17A
TLR7 | Q9NYK1 | Toll-like receptor 7
TMEM144 | Q7Z5S9 | Transmembrane protein 144
TNC | P24821 | Tenascin
TOP2A | P11388 | DNA topoisomerase 2-alpha
TRIM23 | P36406 | E3 ubiquitin-protein ligase TRIM23
TRIM24 | O15164 | Transcription intermediary factor 1-alpha
TRIO | O75962 | Triple functional domain protein
TUBGCP5 | Q96RT8 | Gamma-tubulin complex component 5
UBA6 | A0AVT1 | Ubiquitin-like modifier-activating enzyme 6
VCAN | P13611 | Versican core protein
ZIC1 | Q15915 | Zinc finger protein ZIC 1

Table S2.
Characteristics of the newly isolated GBM-associated antigens

Antigen selection criteria | BCA 478-486 | CHI 10-18 | CSP 21-29 | FABP7 118-126 | IGF2BP3 552-560 | NLGN4X 131-139 | NRCAM 692-700 | PTP 195-203 | PTP 1347-1355 | TNC 3-11

Natural antigen presentation on GBM samples Natural presentation directly shown on tumor samples Antigen binding to HLA Demonstrated high-affinity binding to HLA-A2 X X X X X X X X X X X X X X X X X X X X 90 40 100 40 90 95 100 100 45 X X X X X X +++ +++ +++ ++ +++ ++ mRNA overexpression of source protein Over-expressed in GBM 80 samples (% samples)a Over-expression in GBM reported in literature Antigen immunogenicity In vitro immunogenicity demonstratedb X ++ +++ T cell responses against source protein described X Relevant cancer-associated functions of source proteins Oncofetal expression X pattern Expression by brain cancer stem cells X X X Pro-angiogenic effects/ Neovascularization X X X EGFR Wnt X X X X Biological properties of source protein Sub-cellular locationd ECM EC X Wnt, FGF2 X X Over-expression correlated with higher tumor grade Tumor-associated posttranslational modificationse X X Link of cancer-associated signaling pathwaysc Over-expression linked to decreased survival in GBM +++ X Roles in cell cycle progression and cell proliferation Involvement in tumor invasion, migration and metastasis ++ X X X X X CM TM TM X X X TM CY CY (NU) TM X ECM DG

a: number of samples with expression > expression on highest normal tissue
b: + = in vitro immunogenicity detected; ++ = >20% of tested wells and/or >=50% of tested donors positive; +++ = >50% of tested wells positive and/or >=80% of tested donors positive
c: EGFR = epidermal growth factor receptor; FGF2 = fibroblast growth factor 2; Wnt = Wnt / beta-catenin pathway (embryogenesis)
d: CY = cytoplasmic; EC = extracellular localization; ECM = extracellular matrix; NU = nuclear localization; TM = transmembrane protein
e: DG = deglycosylation
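The immunogenicity grades in footnote b of Table S2 can be read as a simple threshold rule. The Python sketch below is one possible reading, not the authors' scoring code: it treats "and/or" as a logical OR, checks the higher grade first, and takes "detected" to mean at least one positive well. The function name and its parameters are hypothetical.

```python
def immunogenicity_grade(positive_wells, tested_wells,
                         positive_donors, tested_donors):
    """Map well and donor counts to the grades +, ++, +++ of footnote b.

    Assumptions: "and/or" is read as OR, and "+" means at least one
    positive well. An empty string means no immunogenicity was detected.
    """
    if tested_wells <= 0 or tested_donors <= 0:
        raise ValueError("at least one well and one donor must be tested")

    well_fraction = positive_wells / tested_wells
    donor_fraction = positive_donors / tested_donors

    # +++ : >50% of tested wells or >=80% of tested donors positive
    if well_fraction > 0.5 or donor_fraction >= 0.8:
        return "+++"
    # ++ : >20% of tested wells or >=50% of tested donors positive
    if well_fraction > 0.2 or donor_fraction >= 0.5:
        return "++"
    # + : in vitro immunogenicity detected at all
    if positive_wells > 0:
        return "+"
    return ""
```

For example, a peptide with 3 positive wells out of 10 and 2 positive donors out of 5 would be graded "++" under this reading.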