Title (99/100): Multiclass classification of FDG-PET scans: Parkinson’s disease vs. atypical parkinsonian syndromes Authors: Christophe Phillips, Jessica Schrouff, Erik Salmon, Gaëtan Garraux Abstract (3993/4000): Introduction, Methods, Results, and Conclusions Part of the difficulty in the early diagnosis of Parkinson’s disease (PD) is in differentiating it from atypical parkinsonian disorders (APS) that have a poorer prognosis such as multiple system atrophy (MSA), progressive supranuclear palsy (PSP) and corticobasal syndrome (CBS). 18flurodeoxyglucose (FDG) positron emission tomography (PET) has been recommended for the early differentiation between PD and APS [1]. Here, 120 FDG PET scans (42, 31, 26 and 21 for the PD, MSA, PSP and CBS resp.) were acquired on average 3.5 years after symptoms onset (because the initial clinical features were outside the prevailing perception for PD) to look, without any a priori assumption, for cerebral FDG uptake patterns that discriminate either between the PD and APS classes, or between the four PD, MSA, PSP and CBS classes. The diagnostic used to label the scans was defined by clinical criteria on average 4.5 years after PET assessment. Scans were first spatially preprocessed (normalisation, smoothing and rescaling) in SPM8 then fed into a linear kernel “Relevance Vector Machine classifier” (RVMc, [2]) along their class label. RVMc models were built to distinguish between PD and APS classes (binary classification), and between PD, MSA, PSP, CBS classes (multiclass classification). The multiclass classification was performed using a one-versus-one approach involving six binary RVMc’s: PD vs. MSA, PD vs. PSP, PD vs. CBS, MSA vs. PSP, MSA vs. CBS, and PSP vs. CBS. The probabilities obtained from these six classifiers were then recombined using an Error-Correcting Output Code approach (ECOC, [3]) to produce a multiclass classifier. Robustness and accuracy of these binary RVMc’s were estimated using bootstrap resampling [4] with 100 ‘baggings’, i.e. random sampling with replacement. The number of images per class in each bagging was fixed as the number of scans in the class with the fewest scans, i.e. 42 (resp. 21) scans for the binary (resp. multiclass) problem. The remaining scans formed the test set, which was used to test the built RVMc models. As a result of this division between learning and test sets, the learning set is balanced between classes while the test set shows similar proportions of each class as in the whole data set. Accuracy, specificity and sensitivity are estimated for each trained classifier, over the 100 baggings. Similarly maps of voxel relevance can be built and averaged, leading to a discriminant image. Figure 1 shows the discriminant image obtained for the binary classification. The “excess network” (hot) and “deficit network” (cold) are in agreement with regions observed when comparing PD, APS and healthy controls with univariate statistical methods [5]. Six similar images are obtained for the multiclass classification highlighting patterns specific to each class comparisons. Figure 2 shows the median classification accuracy over baggings, for the binary and multiclass problems, with their interquantile intervals. All accuracies are well above classification at chance level, 50% and 25% for the binary and multiclass case respectively. Since the clinical diagnoses were likely not 100% accurate as no pathological verification was available [6], some of the observed misclassification is in fact due to true medical misdiagnosis. In conclusion we show that an RVM analysis of FDG PET scans acquired on average 3.5 years after symptom onset can accurately predict the clinical diagnosis of PD or APS as compared with the final clinical diagnosis obtained on average 4.5 years after PET assessment. Furthermore, there is evidence that the proposed RVM multiclass classification method is suitable to discriminate between scans from PD, MSA, PSP and CBS patients at individual level. Thus the results emphasize the potential role of RVM-based (multiclass) classification of FDG-PET scans as an aid to the clinicians to make informed decisions in the early stages of degenerative parkinsonisms when the diagnosis of PD is felt uncertain. Figures (max 4): 1. Relevance map of the binary classifier 2. Two graphs with classification accuracies, for the binary and multiclass problem. References (max 10): [1] Varrone A., Asenbaum S., Vander Borght T., Booij J., Nobili F., Nagren K., Darcourt J., Kapucu O.L., Tatsch K., Bartenstein P., Van Laere K. (2009), ‘EANM procedure guidelines for PET brain imaging using [F-18]FDG, version 2’, European Journal of Nuclear Medicine and Molecular Imaging, vol. 36, pp. 2103-2110. [2] Tipping, M.E. (2001), ‘Sparse Bayesian Learning and the Relevance Vector Machine’, Journal of Machine Learning Research, vol. 1, pp. 211-244. [3] Dietterich T.G., Bakiri G.0 (1995), ‘Solving Multiclass Learning Problems via Error-Correcting Output Codes’, JAIR, vol. 2, pp. 263-286. [4] Efron B., Tibshirani R.J. (1993), ‘An Introduction to the Bootstrap’, Chapman & Hall/CRC [5] Garraux G., Salmon E., Peigneux P., Kreisler A., Degueldre C., Lemaire C., Destee A., Franck G. (2000), ‘Voxel-based distribution of metabolic impairment in corticobasal degeneration’, Mov. Disord., vol. 15, pp. 894-904. [6] Hughes A.J., Daniel S.E., Lees A.J. (2001), ‘Improved accuracy of clinical diagnosis of Lewy body Parkinson's disease’, Neurology, vol. 57, pp. 1497-1499.