DEPARTMENT OF COMPUTER AND INFORMATION SCIENCES
DEPARTMENT OF MATHEMATICS
Dr. Juan R. Iglesias, Chair

PPOHA Grant Invited Speaker Series

Guest Speaker: Esteban Rangel
PhD Student at Center for Ultra-scale Computing and Information Security
EECS Department, Northwestern University

Wednesday, September 12, 2012
1:30-2:30 PM
SETB 3rd Floor Conference Room

Title: Clustering Massive Datasets: Applying Classical Problems in Probability Theory to the K-Medoids Problem

Abstract: K-medoid methods for clustering data have many desirable properties, such as robustness and the ability to use non-numerical values, but their typically high computational complexity has made them difficult to apply to large data sets. I will discuss AGORAS, a stochastic algorithm for the k-medoids problem that is especially well-suited to clustering massive data sets. The approach draws a sequence of uniform sample sets and uses a heuristic to determine the sample size and identify potential cluster medoids from the sampled items. As a result, computing the final solution only involves solving k trivial sub-problems of centrality, which can be done much more efficiently on large data sets than searching a combinatorial space for the optimal value of an objective function. The complexity of AGORAS is effectively independent of the full data size, and it can scale to arbitrarily large data sets. Parallel implementations for shared and distributed memory architectures will be discussed along with general optimizations.

Bio: I work in the field of high performance analytics. My research is focused on developing next generation data mining algorithms for the increasing size and complexity of data. To this end, my interests are in parallel computing for shared and distributed memory architectures and GPUs, and in approximation algorithms and stochastic methods.
In a broad sense, I am interested in discovering relationships between accuracy, power consumption, and computing time for data mining tasks. After completing my MS in Computer Science at the University of Texas at Brownsville, I joined the Center for Ultra-scale Computing and Information Security (CUCIS) at Northwestern University in the Electrical Engineering and Computer Science Department. In 2011, I was awarded a Northwestern University Fellowship.

80 Fort Brown • Brownsville, Texas 78520 • 956-882-6605 • Fax 956-882-6604 • utb.edu
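
Note on the abstract: the "sub-problems of centrality" it mentions can be read as finding the most central item (the medoid) within a small group of sampled items. The Python sketch below illustrates only that idea and is not the AGORAS algorithm itself. The function names `medoid` and `sample_medoid`, the fixed sample size, and the use of Euclidean distance are assumptions made for illustration; k-medoids methods can use any dissimilarity measure, and AGORAS chooses its sample size with a heuristic that this sketch does not reproduce.

```python
import math
import random


def medoid(points):
    """Return the point whose total distance to all other points is smallest.

    Euclidean distance is an assumption here; any dissimilarity would work.
    """
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


def sample_medoid(data, sample_size, seed=0):
    """Estimate a medoid from a uniform random sample instead of the full data.

    The cost depends on sample_size (quadratic in it), not on len(data).
    """
    rng = random.Random(seed)
    sample = rng.sample(data, min(sample_size, len(data)))
    return medoid(sample)


if __name__ == "__main__":
    cluster = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
    print(medoid(cluster))  # the middle point has the smallest total distance
```

Because the medoid is always one of the input items, the same approach applies to non-numerical data as long as a pairwise dissimilarity function is available.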