“Exploring High-D Spaces with Multiform Matrices and Small Multiples”
MacEachren, A., Dai, X., Hardisty, F., Guo, D., and Lengerich, G.
Proc. IEEE Symposium on Information Visualization (2003), 31–38.
http://www.geovista.psu.edu/

Mudit Agrawal
Nathaniel Ayewah

The Plan
  Motivation
  Contribution
  Analysis Methods
  GeoVISTA Studio
  Conclusions

Motivation
  Discover multivariate relationships
  Examine data from multiple perspectives
  From DATA to INFORMATION

Contribution
  Visual analysis of multivariate data
  Combinations of scatterplots, bivariate maps and space-filling displays
  Conditional entropy to identify interesting variables in a data set, and to order the variables so that the display shows more information
  Dynamic query/filtering, called Conditioning

Contribution
  Back-end: Design Box
    Building applications using visual programming tools
  Front-end: GUI Box
    Visualizing data using the developed designs
  Source: GeoVISTA Studio

Analysis Methods

Analysis Methods
  Sorting
    Nested sorting: sort a table on selected attributes, to understand the relationships between the sorted variables and the rest
    Permutation matrix: cell values are replaced by a graphical depiction of the value. Rows and columns can be sorted to search for related entities.
Analysis Methods
  Sorting
    Augmented seriation: organizing a set of objects along a single dimension using multimodal multimedia
    Correlation matrices
    Reorderable matrices: a simple interactive visualization artifact for tabular data
  Source: (Siirtola, 1999)

Analysis Methods
  Space-filling visualization
    Mosaic plot
    Sunburst methods
    Source: (Schedl, 2006)
    Source: (Young, 1999)
    Pixel-oriented methods
    Source: (Keim, 1996)

Analysis Methods
  Multiform Bivariate Small Multiple
    Small multiples: a set of juxtaposed data representations that together support understanding of multivariate information
  Source: (MacEachren, 2003)

Analysis Methods
  Multiform Bivariate Matrix
  Source: (MacEachren, 2003)

GeoVISTA Studio Demonstration
  Basic Demo
    Application construction
    Scatterplot, Geomap
    Dynamic linking, eccentric labeling, etc.

Dealing with High Dimensionality

High Dimensionality
  Interactive Feature Selection
  Guo, D., 2003. Coordinating Computational and Visualization Approaches for Interactive Feature Selection and Multivariate Clustering. Information Visualization 2(4): 232-246.

High Dimensionality
  “Goodness of Clustering”
    high coverage
    high density
    high dependence
  E.g. measures of dependence:
    Correlation: HIGH
    Chi-squared: HIGH
    Conditional Entropy: LOW

Conditional Entropy
  Discretize two dimensions into intervals
  Nested Means
  [Figure: nested-means discretization of a dimension into intervals 1 to 4]
  Source: (Guo, 2003)

Conditional Entropy
  Source: (Guo, 2003)

Ordering Dimensions
  Related dimensions should be close together
  Sort By: Conditional Entropy
  Sort Method: Minimum Spanning Tree
  [Figure: unsorted matrix and graph of pairwise conditional entropy values (4, 5, 9, 15, 16, 21) among dimensions A, B, C and D]
  Ordering: B A D C

Demonstration
  Advanced Demo
    Interactive Feature Selection
    PCP, SOM, Matrix
    Conditioning

Conclusions
  Strengths
    Dynamic linking of different representations
    Visualizing clusters of dimensions
    Rich and extensible toolbox
  Weaknesses
    Usability
    Arrangement of windows

References
Guo, D. (2003).
Coordinating Computational and Visualization Approaches for Interactive Feature Selection and Multivariate Clustering. Information Visualization 2(4): 232-246.
Keim, D. (1996). Pixel-oriented Visualization Techniques for Exploring Very Large Databases. Journal of Computational and Graphical Statistics.
Schedl, M. (2006). CoMIRVA: Collection of Music Information Retrieval and Visualization Applications. Website. http://www.cp.jku.at/people/schedl/Research/Development/CoMIRVA/webpage/CoMIRVA.html
Siirtola, H. (1999). Interaction with the Reorderable Matrix. In E. Banissi, F. Khosrowshahi, M. Sarfraz, E. Tatham, and A. Ursyn, editors, Information Visualization IV '99, pages 272-277. Proceedings International Conference on Information Visualization.
Young, F. (1999). Frequency Distribution Graphs (Visualizations) for Category Variables. Unpublished. http://forrest.psych.unc.edu/research/vista-frames/help/lecturenotes/lecture02/repvis4a.html
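
Worked sketch for the Conditional Entropy slides. The slides describe discretizing two dimensions into intervals with nested means and then scoring the pair by conditional entropy, where a low value suggests high dependence. The Python below is a minimal sketch of that idea and not the implementation from (Guo, 2003): it assumes a nested-means depth of 2 (four intervals), uses the textbook definition H(Y|X) = -sum over (x, y) of p(x, y) * log2 p(y|x), and all function names are hypothetical.

```python
import math
from collections import Counter


def nested_means_breaks(values, depth=2):
    """Return sorted break points found by recursively splitting at the mean.

    depth=2 gives up to three breaks (four intervals); this depth is an
    assumption made for illustration.
    """
    if depth == 0 or len(values) < 2:
        return []
    m = sum(values) / len(values)
    low = [v for v in values if v < m]
    high = [v for v in values if v >= m]
    if not low or not high:  # all values identical: nothing to split
        return []
    return (nested_means_breaks(low, depth - 1)
            + [m]
            + nested_means_breaks(high, depth - 1))


def discretize(values, breaks):
    """Map each value to the index of its interval (0 .. len(breaks))."""
    return [sum(v >= b for b in breaks) for v in values]


def conditional_entropy(x_bins, y_bins):
    """Textbook H(Y | X) in bits for two equally long sequences of bin labels."""
    n = len(x_bins)
    joint = Counter(zip(x_bins, y_bins))
    x_counts = Counter(x_bins)
    h = 0.0
    for (x, _y), count in joint.items():
        # p(x, y) = count / n and p(y | x) = count / x_counts[x]
        h -= (count / n) * math.log2(count / x_counts[x])
    return h


# Hypothetical data: y follows x exactly, z does not.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2 * v for v in x]
z = [1, 8, 2, 7, 3, 6, 4, 5]

x_bins = discretize(x, nested_means_breaks(x))
y_bins = discretize(y, nested_means_breaks(y))
z_bins = discretize(z, nested_means_breaks(z))

print("H(Y|X) =", conditional_entropy(x_bins, y_bins))
print("H(Z|X) =", conditional_entropy(x_bins, z_bins))
```

In this sketch the pair (x, y) scores lower than (x, z), which matches the slide's point that a strongly dependent pair of dimensions has LOW conditional entropy. Pairwise scores of this kind are what the Ordering Dimensions slide feeds into its minimum spanning tree.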