The Quantile Method for Symbolic Data Analysis

Manabu Ichino
School of Science and Engineering, Tokyo Denki University
ichino@mail.dendai.ac.jp

Keywords: Quantiles, Monotonicity, Visualization, PCA, Clustering

Abstract

The quantile method transforms a given (N objects)×(d variables) symbolic data table into a standard {N×(m+1) sub-objects}×(d variables) numerical data table, where m is a preselected integer that controls the granularity with which symbolic objects are represented. Each symbolic object is therefore represented by a set of (m+1) d-dimensional numerical vectors, called the quantile vectors. Exploiting the monotonicity of the quantile vectors, we present the following three methods for symbolic data analysis.

1) Visualization: We visualize each symbolic object by m+1 parallel monotone line graphs [Ichino and Brito 2014]. Each line graph is composed of d-1 line segments that accumulate the d zero-one normalized variable values.

2) PCA: When the given symbolic objects have a monotone structure in the representation space, that structure confines the corresponding quantile vectors to a similar geometrical shape. We apply PCA to the quantile vectors based on rank order correlation coefficients. In the factor planes, we reproduce each symbolic object as a series of m arrows that connect consecutive quantile vectors, from the minimum quantile vector to the maximum quantile vector [Ichino 2011].

3) Clustering: We present a hierarchical conceptual clustering based on the quantile vectors. We define the concept size of the d-dimensional hyper-rectangles spanned by quantile vectors. The concept size serves both as the similarity measure between sub-objects, i.e., quantile vectors, and as the measure of cluster quality [Ichino and Brito 2015].

References

H-H. Bock and E. Diday (2000). Analysis of Symbolic Data - Exploratory Methods for Extracting Statistical Information from Complex Data. Heidelberg: Springer.
L. Billard and E. Diday (2007). Symbolic Data Analysis - Conceptual Statistics and Data Mining. Chichester: Wiley.
E. Diday and M. Noirhomme-Fraiture (2008). Symbolic Data Analysis and the SODAS Software. Chichester: Wiley.
M. Ichino and P. Brito (2014). The data accumulation graph (DAG) to visualize multi-dimensional symbolic data. Workshop in Symbolic Data Analysis. Taipei, Taiwan.
M. Ichino (2011). The quantile method for symbolic principal component analysis. Statistical Analysis and Data Mining, 4, 2, pp. 184-198.
M. Ichino and P. Brito (2015). A hierarchical conceptual clustering based on the quantile method for mixed feature-type data. (Submitted to the IEEE Trans. SMC).
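
Illustrative sketch

The transformation described in the abstract can be illustrated with a minimal sketch. It is restricted to interval-valued variables and assumes values are uniformly distributed inside each interval, so that the i/m quantile of [lo, hi] is lo + (hi - lo) * i / m. The function names quantile_vectors, quantile_table, and accumulate, as well as the sample data, are hypothetical and are not taken from the cited papers; other feature types (histograms, categorical data) are not covered.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def quantile_vectors(obj: List[Interval], m: int) -> List[List[float]]:
    """Return the m+1 quantile vectors of one interval-valued object.

    Assumption: values are uniform inside each interval, so the i/m
    quantile of [lo, hi] is lo + (hi - lo) * i / m.
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    return [[lo + (hi - lo) * i / m for (lo, hi) in obj] for i in range(m + 1)]


def quantile_table(objects: Dict[str, List[Interval]], m: int):
    """Stack the quantile vectors of N objects into N*(m+1) rows of d values."""
    rows = []
    for name, obj in objects.items():
        for i, vec in enumerate(quantile_vectors(obj, m)):
            rows.append((name, i, vec))  # (object, quantile index, d values)
    return rows


def accumulate(vec: List[float], mins: List[float], maxs: List[float]) -> List[float]:
    """Zero-one normalize each variable, then return the running sums.

    The d running sums are the vertices of one monotone line graph with
    d-1 segments. Assumes maxs[j] > mins[j] for every variable j.
    """
    total = 0.0
    points = []
    for x, lo, hi in zip(vec, mins, maxs):
        total += (x - lo) / (hi - lo)
        points.append(total)
    return points


# Hypothetical data: N = 2 objects, d = 2 interval variables, m = 4.
objects = {
    "A": [(0.0, 4.0), (10.0, 20.0)],
    "B": [(2.0, 6.0), (15.0, 35.0)],
}
table = quantile_table(objects, m=4)
for name, i, vec in table:
    print(name, i, vec)
```

Under the uniform assumption the quantile vectors of each object are monotone by construction: every component is non-decreasing in the quantile index, which is the property the three methods rely on.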