Supplementary methods
Liquid chromatography mass spectrometry (LCMS). The HLA peptide pools as obtained were separated according to their hydrophobicity by reversed-phase chromatography (nanoAcquity UPLC system, Waters), and the eluting peptides were analyzed in an LTQ-Orbitrap hybrid mass spectrometer (Thermo Fisher Scientific) equipped with an electrospray ionization (ESI) source. Eluted peptide pools were loaded directly onto the analytical fused-silica micro-capillary column (75 µm i.d. x 250 mm) packed with 1.7 µm C18 reversed-phase material (Waters) at a flow rate of 400 nl per minute. Subsequently, the peptides were separated using a two-step, 180-minute binary gradient from 10% to 33% B at a flow rate of 300 nl per minute. The gradient was composed of solvent A (0.1% formic acid in water) and solvent B (0.1% formic acid in acetonitrile). A gold-coated glass capillary (PicoTip, New Objective) was used for introduction into the nanoESI source. The LTQ-Orbitrap mass spectrometer was operated in the data-dependent mode using a TOP5 and a TOP3 strategy. In brief, a scan cycle was initiated with a full scan of high mass accuracy in the Orbitrap (R = 30,000 for TOP3, R = 60,000 for TOP5), which was followed by MS/MS scans either in the Orbitrap (R = 7,500) on the 5 most abundant precursor ions with dynamic exclusion of previously selected ions (TOP5) or in the LTQ on the 3 most abundant precursor ions with dynamic exclusion of previously selected ions (TOP3).
Data analysis and peptide sequence identification. The LCMS data were processed by analyzing the LCMS survey data (parent mass of the unfragmented peptide) as well as the tandem-MS (MS/MS) data (spectra of fragmented peptides containing sequence information). Data analysis was optimized and adapted for identification of HLA peptides using the following methods. Each new data set is integrated into the MySQL-based database.
Tandem-MS spectra were extracted using msn_extract (Thermo Fisher Scientific) and searched with the Sequest algorithm against the IPI database. The protein database hits were subsequently validated by automated quality filtering using thresholds established on manually identified HLA peptidomics data. For increased identification sensitivity, an in-house developed spectral clustering algorithm was used to assign spectra to known peptide MS/MS clusters, which are collected in the fragment spectra library. Before being considered as peptide vaccine candidates, peptide sequences suggested by the described automated pipeline were confirmed by manual inspection. The identity of peptides was further assured by comparing the recorded fragmentation pattern of the natural peptide with that of a synthetic reference peptide with the sequence in question. These methods are also detailed in two patents.
Relative peptide quantification. LC-MS survey data (signals of intact, unfragmented peptides providing quantitative information) were analyzed independently of the tandem-MS fragment spectra data (recorded in the same experiment and yielding peptide sequence information), making use of the high mass accuracy. To extract LC-MS signals as well as the signal areas (ion counting), the program SuperHirn (ETH Zürich) [Mueller et al. 2007] was used. Thus each identified peptide could be associated with quantitative data, allowing relative quantification between samples and tissues. To account for variation between technical and biological replicates, a two-tier normalization scheme based on central tendency normalization was used. The normalization assumes that most measured signals result from house-keeping peptides and that the small fraction of over-presented peptides does not influence the central tendency of the data significantly.
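As an illustration only, the run-wise tier of this central tendency normalization (whose steps are detailed in the paragraph that follows) might be sketched in Python as below. The data layout, the function names, the restriction to peptides quantified in every replicate run, the choice of factor (replicate mean divided by observed area), and the use of the sample standard deviation for the coefficient of variation are assumptions of this sketch, not a description of the authors' implementation.

```python
from statistics import mean, stdev


def run_factors(areas):
    """Run-wise normalization factors for replicate runs of one sample.

    areas: dict mapping run id -> dict mapping peptide -> signal area.
    """
    runs = list(areas)
    # Assumption: only peptides quantified in every replicate run contribute.
    shared = set.intersection(*(set(areas[r]) for r in runs))
    # Mean presentation of each peptide across the replicate set.
    peptide_mean = {p: mean(areas[r][p] for r in runs) for p in shared}
    # Assumption: per-peptide factor = replicate mean / observed area;
    # averaging these over all peptides gives one factor per run.
    return {r: mean(peptide_mean[p] / areas[r][p] for p in shared) for r in runs}


def normalize_runs(areas):
    """Apply each run-wise factor to all peptides of that run."""
    factors = run_factors(areas)
    return {r: {p: a * factors[r] for p, a in peps.items()}
            for r, peps in areas.items()}


def coefficient_of_variation(values):
    """CV of replicate areas (sample standard deviation; an assumption)."""
    return stdev(values) / mean(values)
```

For two replicate runs in which the second yields exactly twice the signal of the first (for example, because of a doubled injection volume), the sketch assigns factors of 1.5 and 0.75, which bring both runs onto the same scale.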
In the first normalization step, replicates of the same sample are normalized by calculating the mean presentation for each peptide in the respective replicate set. This mean is used to compute normalization factors for each peptide and LCMS run. Averaging over all peptides results in run-wise normalization factors, which are applied to all peptides of the particular LCMS run. This approach ensures that systematic intra-sample variation is removed, e.g. due to different injection volumes between replicate runs. Only peptides with a coefficient of variation smaller than 50% between their replicate areas were considered for calculation of further normalization factors. In the second step, the mean presentation of each peptide was again calculated, this time for all samples of a defined preparation antibody (e.g. BB7.2). The mean was used to compute normalization factors for each peptide and sample. Averaging over all peptides resulted in sample-wise normalization factors, which were applied to all peptides of the particular sample. Systematic bias due to different tissue weights or MHC expression levels was thereby removed. For each peptide a presentation profile was calculated showing the mean sample presentation as well as replicate variations. The profile juxtaposes GBM samples to a baseline of normal tissue samples.
Identification and selection of HLA-A*02-restricted peptides. Assignment of HLA-A*02 restriction to a peptide sequence was based on the following criteria: (i) detection by MS/MS analysis from a sample immunoprecipitated with the HLA-A*02-specific antibody BB7.2 and (ii) SYFPEITHI score analysis (for 9-mers and 10-mers) or anchor residue criteria in case no SYFPEITHI matrix was available. HLA-A*02 restriction was experimentally confirmed by determination of the binding constant Kd for all ten selected peptides by an HLA refolding assay and by the fact that refolding of HLA-A*02 monomers for tetramer staining, as shown in Fig. 3, was successful for all ten peptides. Analysis of GBM samples was performed until the rate of newly identified sequences among the last 1,000 total identifications dropped below 15% (i.e. ≥85% of identifications resulted in already known sequences of the GBM peptidome). For the 309 pre-selected peptides, the median SYFPEITHI score is 25 and the 10th percentile is 17, a score common to several published A*02-binding peptides. For 26 of the pre-selected peptides a SYFPEITHI score was not available (no 9- or 10-mers), but they meet several characteristics of A*02 binding in terms of anchor residues.
Gene expression analysis. Gene expression analysis of all tumor and normal tissue RNA samples was performed using Affymetrix Human Genome (HG) U133A or HG-U133 Plus 2.0 oligonucleotide microarrays (Affymetrix). The same normal kidney sample was hybridized to both array types to achieve direct comparability of all samples. All steps were carried out according to the Affymetrix manual. Briefly, double-stranded cDNA was synthesized from 5–8 µg of total RNA, using SuperScript RTII (Invitrogen) and the oligo-dT-T7 primer (MWG Biotech) as described in the manual. In vitro transcription was performed with the BioArray High Yield RNA Transcript Labeling Kit (ENZO Diagnostics, Inc.) for the U133A arrays or with the GeneChip IVT Labeling Kit (Affymetrix) for the U133 Plus 2.0 arrays, followed by cRNA fragmentation, hybridization, and staining with streptavidin-phycoerythrin and biotinylated anti-streptavidin antibody (Molecular Probes). Images were scanned with the Agilent 2500A GeneArray Scanner (U133A) or the Affymetrix Gene-Chip Scanner 3000 (U133 Plus 2.0), and data were analyzed with the GCOS software (Affymetrix), using default settings for all parameters. Pairwise comparisons were calculated using the respective normal kidney array as baseline.
For normalization, 100 housekeeping genes provided by Affymetrix were used. Relative expression values were calculated from the signal log ratios given by the software and the normal brain sample was arbitrarily set to 1.0. An empirical mRNA over-expression score (S score) was calculated based on the signal log ratios for each gene: $S = 0.25 \times [(\mathrm{mean}_{\mathrm{tumor}} - \mathrm{mean}_{\mathrm{normal}}) + (\mathrm{mean}_{\mathrm{tumor}} - \mathrm{max}_{\mathrm{normal}}) + (\mathrm{mean}_{\mathrm{tumor,top40\%}} - \mathrm{mean}_{\mathrm{normal}}) + (\mathrm{mean}_{\mathrm{tumor,top40\%}} - \mathrm{max}_{\mathrm{normal}})]$. S considers expression levels in the analyzed GBM samples (average of all analyzed samples: $\mathrm{mean}_{\mathrm{tumor}}$) and the average in the 40% of samples with highest expression ($\mathrm{mean}_{\mathrm{tumor,top40\%}}$) vs. average and highest expression in normal tissues ($\mathrm{mean}_{\mathrm{normal}}$ and $\mathrm{max}_{\mathrm{normal}}$). An empirical cut-off (S ≥ 1.8) was set in order to pre-filter for “overexpressed genes”, qualifying the HLA-A*02-derived peptides from these genes for more detailed analysis as potential targets for GBM immunotherapy.
Tissue Microarray, Immunohistochemistry and Immunofluorescent Stainings. The TMA consisted of 250 formalin-fixed, paraffin-embedded GBM and 4 normal brain tissue samples as described elsewhere [Campos et al. 2011]. Informed consent was obtained from each patient according to the research proposals approved by the Institutional Review Board at Heidelberg Medical Faculty. Primary antibodies used in our study were: anti-PTP-zeta (1:50), anti-NRCAM (1:200), anti-FABP7 (1:200), anti-IGF2BP3 (uv, all Abcam), anti-NLGN4X (1:100), anti-Chi3L2 (1:25), anti-Brevican (1:200; Sigma-Aldrich), and anti-CSPG4 (1:500, Chemicon). Prior to TMA staining, specificity of primary antibodies was verified using corresponding isotype controls (all Acris) on glioma control tissues in equal concentrations as primary antibodies and as indicated by the manufacturer. Antigen retrieval, incubation with primary and secondary antibodies, as well as detection with the Vectastain Elite ABC Kit (Vector Laboratories) were carried out as described [Campos et al. 2011].
Each tumor biopsy was evaluated at 20x magnification by two independent investigators blinded to all clinical data. Staining of TMA biopsies was semiquantitatively graded in an antigen-dependent manner according to the estimated percentage of positive cells covering the whole tissue spot. In case of inter-observer variability, staining frequency on individual biopsies was counted manually. Average staining patterns from all biopsies of an individual tumor were taken as the final staining result. Additional antibodies used for double immunofluorescence staining were: mouse monoclonal anti-human CD31 (1:100; both BD Pharmingen), anti-human GFAP (ready to use; PROGEN), anti-human CD68 (1:25; Caltag) as well as secondary antibodies anti-mouse ALEXA488 and anti-rabbit ALEXA555 (1:500; both Invitrogen).
Peptides, recombinant MHC molecules, fluorescent tetramers and artificial APC. Peptides were synthesized using standard Fmoc chemistry. The amino acid position and sequences of each peptide are described in Table 1. The Melan-A peptide used as a control was Melan-A26-35 (EAAGIGILTV). Biotinylated recombinant A*02 molecules and fluorescent MHC tetramers were produced as described previously [Altman et al. 1996]. The costimulatory mouse IgG2a anti-human CD28 antibody 9.3 [Jung et al. 1987] was biotinylated using sulfo-N-hydroxysuccinimidobiotin as recommended by the manufacturer (Perbio Science). For generation of artificial APCs, 5.6-µm-diameter streptavidin-coated polystyrene particles with a binding capacity of 0.064 µg of biotin-FITC per mg of microsphere (Bangs Laboratories) were resuspended at 5 × 10^6 particles per milliliter in PBS, 0.5% BSA with biotinylated MHC (1 µg/ml) and anti-human CD28 antibody (3 µg/ml) and incubated at room temperature for 30 min under agitation [Walter et al. 2003].
In vitro immunogenicity experiments. CD8+ T cells were stimulated with artificial APC.
Briefly, CD8+ T cells were isolated by magnetic bead positive selection using a MACS device (Miltenyi) and plated in 96-well round-bottom plates in 100 µl IMDM containing 8% human serum (Laboratoires Jacques Boy), penicillin, streptomycin, non-essential amino acids, sodium pyruvate and Hepes (all from Invitrogen) (CTL medium) at a concentration of 1 × 10^6 cells/well. Artificial APC (100 µl) were added together with rhIL-12 (5 ng/ml, Bioconcept). Cells were incubated for 3 days, and medium was replaced with addition of IL-2 (10 IU/ml) and IL-7 (2.5 ng/ml, Bioconcept). Cultures were restimulated similarly at days 7 and 14 and tested at day 21 by flow cytometry. Each peptide was tested in 4 to 6 healthy individuals (9-12 wells per peptide) and in 7 to 11 patients with GBM (2-5 wells per peptide). For analysis of naïve and memory T cell populations, PBMC were sorted for CD8+ T cells by magnetic bead selection. The CD8+ population was then stained with CD45RA and CCR7 antibodies (Beckman Coulter), and the CD45RA+ CCR7+ (representing the naïve T cell population) and CD45RA- CCR7+/- (representing the memory T cell population) fractions were sorted by FACS. Each population was then separately stimulated three times, as described above, with artificial APC incorporating either the BCA478-486 or the Melan-A26-35 control peptide.
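Returning to the over-expression score S defined under gene expression analysis, the following Python sketch shows one way the score and the S ≥ 1.8 pre-filter could be computed from per-gene signal log ratios. The function names, the input layout, and the rounding used to pick the top 40% of tumor samples (rounded up, at least one sample) are assumptions of this sketch and are not taken from the original analysis.

```python
import math
from statistics import mean


def s_score(tumor, normal, top_fraction=0.4):
    """Empirical over-expression score S for one gene.

    tumor, normal: lists of signal log ratios for tumor and normal samples.
    """
    mean_tumor = mean(tumor)
    mean_normal = mean(normal)
    max_normal = max(normal)
    # Assumption: the "top 40%" subset size is rounded up, minimum one sample.
    k = max(1, math.ceil(top_fraction * len(tumor)))
    mean_top = mean(sorted(tumor, reverse=True)[:k])
    # Average of the four differences that make up S.
    return 0.25 * ((mean_tumor - mean_normal) + (mean_tumor - max_normal)
                   + (mean_top - mean_normal) + (mean_top - max_normal))


def passes_prefilter(tumor, normal, cutoff=1.8):
    """True if the gene meets the empirical cut-off S >= 1.8."""
    return s_score(tumor, normal) >= cutoff
```

For tumor log ratios 1, 2, 3, 4, 5 and normal log ratios 0 and 1, the four differences are 2.5, 2.0, 4.0 and 3.5, giving S = 3.0, so the gene would pass the pre-filter.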
Characteristics of the newly isolated GBM-associated antigens
Antigen selection criteria BCA478-486 CHI10-18 CSP21-29 FABP7118-126 IGF2BP3552-560 NLGN4X131-139 NRCAM692-700 PTP195-203 PTP1347-1355 TNC3-11
Natural antigen presentation on GBM samples Natural presentation directly shown on tumor samples Antigen binding to HLA Demonstrated high-affinity binding to HLA-A2 X X X X X X X X X X X X X X X X X X X X 90 40 100 40 90 95 100 100 45 X X X X X X +++ +++ +++ ++ +++ ++ mRNA overexpression of source protein Over-expressed in GBM 80 samples (% samples)a Over-expression in GBM reported in literature Antigen immunogenicity In vitro immunogenicity demonstratedb X ++ +++ T cell responses against source protein described X Relevant cancer-associated functions of source proteins Oncofetal expression X pattern Expression by brain cancer stem cells X X X Pro-angiogenic effects/ Neovascularization X X X EGFR Wnt X X X X Biological properties of source protein Sub-cellular locationd ECM EC X Wnt, FGF2 X X Over-expression correlated with higher tumor grade Tumor-associated posttranslational modificationse X X Link of cancer-associated signaling pathwaysc Over-expression linked to decreased survival in GBM +++ X Roles in cell cycle progression and cell proliferation Involvement in tumor invasion, migration and metastasis ++ X X X X X CM TM TM X X X TM CY CY (NU) TM X ECM DG
a: number of samples with expression > expression on highest normal tissue.
b: + = in vitro immunogenicity detected; ++ = >20% of tested wells and/or >=50% of tested donors positive; +++ = >50% of tested wells positive and/or >=80% of tested donors positive.
c: EGFR = epidermal growth factor receptor; FGF2 = fibroblast growth factor 2; Wnt = Wnt / beta-catenin pathway (embryogenesis).
d: CY = cytoplasmic; EC = extracellular localization; ECM = extracellular matrix; NU = nuclear localization; TM = transmembrane protein.
e: DG = deglycosylation.
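The immunogenicity grading in footnote b can be read as a small decision rule. The Python sketch below is illustrative only; the function name, its arguments, the reading of "and/or" as a logical or, and the empty string returned when no response is detected are assumptions.

```python
def immunogenicity_grade(positive_wells, tested_wells,
                         positive_donors, tested_donors):
    """Grade in vitro immunogenicity following the thresholds of footnote b."""
    well_fraction = positive_wells / tested_wells
    donor_fraction = positive_donors / tested_donors
    # +++ : >50% of tested wells positive and/or >=80% of tested donors positive
    if well_fraction > 0.5 or donor_fraction >= 0.8:
        return "+++"
    # ++ : >20% of tested wells and/or >=50% of tested donors positive
    if well_fraction > 0.2 or donor_fraction >= 0.5:
        return "++"
    # + : any in vitro immunogenicity detected
    if positive_wells > 0:
        return "+"
    # Assumption: no detected response is reported as an empty grade.
    return ""
```

Under this reading, a peptide with 3 of 12 wells positive in 1 of 4 donors would be graded ++, because the well fraction (25%) exceeds 20% although the donor fraction (25%) is below 50%.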