Bioinformatic Treatment of Human Metabolome Profile for Diagnostics
Dr. Petr Lokhov & Dr. Alexander Archakov
Institute of Biomedical Chemistry, RAMS

Human metabolome profile for diagnostics
• Human: biosample from a human (biofluid samples: blood plasma)
• Metabolome: set of small molecules (<1500 Da) in the biosample
• Profile: visual and/or numerical profile of the set of small molecules

Methods for metabolome profiling
Technique:
• LC (liquid chromatography)
• CE (capillary electrophoresis)
• GC
• NMR
• MS (mass spectrometry)
• LC-MS
• GC-MS
• CE-MS
• LC-NMR
• …
Protocols of sample preparation:
• Ultrafiltration
• Protein sedimentation with organic solvent
Detection:
• Type of ionization
• Mass spectrometry mode

Direct-infusion electrospray (ESI) mass spectrometry of blood plasma metabolites
Blood plasma (100 µl):
1. Add methanol
2. Centrifuge
3. Take supernatant
Then: mass spectrometry, giving the mass spectrometric metabolome profile.
• Soft method for protein precipitation (with methanol)
• Direct infusion of plasma metabolites into the ion source
• Electrospray ionization
• High-accuracy MS
A reproducible, rapid and cheap method for metabolome profiling.

Representative mass spectrum of blood plasma metabolites
[Figure: mass spectrum; x-axis in Da, intensity scale x10^6; regions labelled "diagnostic metabolites" and "lipidome"]
• ~2000 metabolite ions are detected
• ~2000 main metabolites in the human organism (Beecher C.W.W., in: Metabolic Profiling: Its Role in Biomarker Discovery and Gene Function Analysis, Springer, 2003, pp. 311–335)

Bioinformatic treatment of metabolome profile
1. Normalization (isn't required)
2. Baseline subtraction (isn't required)
3. Mass spectrometry peak alignment (common for mass spectrometry data processing)
4. Detection of ionic inconsistency in plasma samples
5. Dimensionality reduction of mass spectrometry data
6. Sample classification (diagnostics)

Ions in blood plasma samples that affect ESI mass spectra
Ion          Level in blood plasma sample   Remark
H+           pH ~2.8                        sample + formic acid
Na+          136–145 mM                     physiological conditions
K+           3.5–60 mM                      plasma level (3–5 mM) plus K+ leaked from erythrocytes (80–120 mM)
Other ions   -                              levels too low to influence ESI mass spectra
Potassium leaks from cells when plasma is not immediately separated from collected blood, when blood has been temporarily stored, or when plasma is handled roughly.

Detection of ionic inconsistency in plasma samples
[Figure: K2Cl+ peaks in mass spectrum; distribution of K+ in samples, with the "good" samples marked]

Dimensionality reduction
• A metabolite profile is a multivariable characteristic of an organism.
• To avoid overfitting, the rule of 10–15 samples per variable should be followed.
Therefore, the dimensionality of the mass spectrometry data should be reduced.

Dimensionality reduction by PCA
2000 variables (peak intensities) are reduced by PCA to principal components (PCs):
• PC1: disease pathogenesis
• PC3: risk factors
• PC2, PC4, PC5: marked X on the slide (factors such as nutrition, age, sex…)
[Figure: case and control samples plotted on PC1 vs PC3]
Only PCs useful for diagnostics should be used.

Sample classification (diagnostics)
[Figure: SVM separating case and control samples on PC1 vs PC3, with a testing sample]
Diagnostic parameters:
• Sensitivity: TP/(TP+FN)
• Specificity: TN/(TN+FP)
• Accuracy
A Support Vector Machine (SVM) can classify multidimensional data by constructing a hyperplane in a multidimensional space.

Biochemical context for diagnostics
2000 variables + identification give identified metabolites. Identification uses:
• accurate mass tag
• isotopic pattern
• MS/MS
• MRM
[Figure: PC1 vs PC3 plot with "+" and "-" directions marked]
The identified metabolites provide the biochemical context for metabolome-based diagnostics.

Ten Leading Cancer Types, 2010
[Figure; source: CA CANCER J CLIN 2010;60:277–300]

Example 1: Diagnostics of stage II prostate cancer
Method                          Sensitivity   Specificity   Accuracy
Metabolome-based diagnostics    95.0%         96.7%         95.7%
PSA-based diagnostics           35.0%         83.3%         51.4%
Lokhov, Archakov et al. Metabolic Fingerprinting of Blood Plasma from Patients with Prostate Cancer. Biochemistry (Moscow), 2010.
Metabolite profiling of blood plasma of patients with prostate cancer. Metabolomics. 2010.

Example 2: Diagnostics of lung cancer
Cancer stage       I       II      III     IV      I-IV
Sensitivity (%)    100.0   91.4    92.3    93.2    91.1
Selectivity (%) and Accuracy (%), values as extracted: 92.4 93.9 92.3 92.4 92.5 93.3
Lokhov, Archakov et al. Diagnosis of lung cancer based on direct-infusion electrospray mass spectrometry of blood plasma metabolites. International Journal of Mass Spectrometry. 2011.

Diagnostics of lung cancer (identified metabolites)
Identified metabolites (PC1), exposure to tobacco smoke:
• Biotin sulfone
• Creatinine
• R-benzene
• ethylbenzoic acid
• acetanisol
• dimethylbenzoic acid
• benzenepropionate
• Permethrin
• Halfenprox
• …
Metabolites reflecting the organism's exposure to tobacco smoke contribute to the diagnostics.

Risk of lung cancer development
Cigarette consumption:
• Smoker/non-smoker
• Cigarettes smoked per day
• Individual differences in how cigarettes are smoked
• Ranges of nicotine intake per cigarette
• Age
• Body weight
Result: averaged and inexactly calculated exposure to tobacco smoke.
Metabolome-based approach, levels of:
• Biotin sulfone
• Creatinine
• R-benzene
• Permethrin
• Halfenprox
• …
Result: objectively calculated exposure to tobacco smoke.
OR (odds ratio), the risk of disease development:
• OR (smokers/non-smokers) = 4
• OR (R-benzene) = 38

Conclusions
• A metabolome profile for diagnostics can be obtained by direct-infusion mass spectrometry of a blood plasma sample.
• The quality of the profile can be checked using data from the profile itself.
• Dimensionality reduction makes it possible to follow the rule of 10–15 samples per variable and to select groups of metabolites useful for diagnostics.
• The metabolome profile can be used for early diagnostics of lung and prostate cancers as well as for calculating the risk of lung cancer development.
• A single metabolome profile is sufficient for both diagnostics and prognosis of lung cancer.

Acknowledgements
Program “Proteomics for Medicine and Biotechnology” of the Russian Academy of Medical Sciences
Russian Foundation for Basic Research
Russian N.N. Blokhin Cancer Research Center
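
Worked sketch of the diagnostic statistics
The slides rely on three simple calculations: the diagnostic parameters (sensitivity, specificity, accuracy), the rule of 10–15 samples per variable, and the odds ratio (OR). The Python sketch below shows how these might be computed. It is an illustration only, not the authors' code: the function names are hypothetical, the accuracy formula (TP+TN)/(TP+FN+TN+FP) is the standard definition rather than one given on the slides, and all counts in the usage section are made-up numbers, not data from the cited studies. Denominators are assumed to be non-zero.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: cases classified correctly/incorrectly;
    tn/fp: controls classified correctly/incorrectly.
    Assumes each group contains at least one sample.
    """
    sensitivity = tp / (tp + fn)                 # TP/(TP+FN)
    specificity = tn / (tn + fp)                 # TN/(TN+FP)
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # standard definition
    return sensitivity, specificity, accuracy


def max_variables(n_samples, samples_per_variable=10):
    """Largest number of variables (e.g. PCs) allowed by the
    'N samples per variable' rule, rounded down."""
    return n_samples // samples_per_variable


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 exposure table.

    OR = (exposed_cases / unexposed_cases)
         / (exposed_controls / unexposed_controls)
    Assumes unexposed_cases and exposed_controls are non-zero.
    """
    return (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    sens, spec, acc = diagnostic_metrics(tp=45, fn=5, tn=40, fp=10)
    print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")

    # With a hypothetical 70 samples, the 10-15 samples-per-variable
    # rule leaves room for only a handful of variables.
    print("max variables (10 per variable):", max_variables(70, 10))
    print("max variables (15 per variable):", max_variables(70, 15))

    # Hypothetical 2x2 exposure table.
    print("odds ratio:", odds_ratio(40, 10, 20, 30))
```

The second function makes the motivation for dimensionality reduction concrete: with tens of samples, only a few variables can be used, far fewer than the ~2000 peak intensities in a raw profile.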