Artificial Intelligence
CSC462
Dr. Muhammad Humayoun
Assistant Professor, COMSATS Institute of Computer Science, Lahore
mhumayoun@ciitlahore.edu.pk
Course homepage: https://sites.google.com/a/ciitlahore.edu.pk/ai13/
Modified slides of the AI course on Udacity, etc.

Machine Learning (Chapter 18)

What we saw
• Bayes network: reasons with known models.
• Machine learning: learns models from data.
• Unsupervised learning (today's topic).
• Supervised learning (next topic).
• ML is a very large field with many different methods and many different applications.

Taxonomy: What?
• Parameters: like the probabilities of a Bayes network.
• Structure: like the arc structure of a Bayes network.
• Hidden concepts that make better sense of the data: for example, you might find that certain training examples form a hidden group.

Taxonomy: What from?
Every ML method is driven by some sort of target information that you care about.
• Supervised learning: we have target labels.
• Unsupervised learning: target labels are missing, and we use replacement principles to find, for example, hidden concepts.
• Reinforcement learning: an agent learns from feedback from the physical environment by interacting, trying actions, and receiving some sort of evaluation from the environment.

Taxonomy: What for?
• Prediction: what is going to happen in the future, in the stock market for example.
• Diagnostics
• Summarization
• …

Taxonomy: How to learn?
• Passive: the learning agent is just an observer and has no impact on the data itself.
• Active: otherwise, the agent is active.
• Online: learning occurs while the data is being generated.
• Offline: learning occurs after the data has been generated.

Taxonomy: Outputs?
• Classification: the output is binary or one of a fixed number of classes, e.g., something is either a chair or not.
• Regression: the output is continuous, e.g., tomorrow's temperature might be 13 degrees in our prediction.

Taxonomy: Internal details?
• Generative: seeks to model the data as generally as possible.
• Discriminative: seeks to distinguish the data. (This might sound like a superficial distinction, but it has enormous ramifications for the learning algorithm.)
It has taken us many years to fully learn all these words, so don't expect to pick them all up in one class.

Unsupervised Learning
• We just have a data matrix of M records with N features each.
• The task of unsupervised learning is to find structure in data of this type.

Warm-up Quiz
[Scatter plot of data items shown on slide.]
• Is there any structure in these data items? Yes: the data does not seem random.
• How many groups? 2

Another Quiz
• What is the dimensionality of the space? 2
• How many dimensions are needed, intuitively? 1
• This is an important technique: dimensionality reduction (a sketch follows below).
• Useful, for example, in image resolution reduction.
(Data assumed to be independently drawn and identically distributed, i.e., i.i.d.)
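The slides show this quiz only as a picture. As a minimal sketch of what dimensionality reduction can mean in code, the Python snippet below projects 2-D points that lie roughly along a line onto a single dimension using principal component analysis (PCA). The data and variable names here are made up for illustration; the slides themselves do not prescribe PCA.

    import numpy as np

    # Made-up 2-D data that really lives along one direction:
    # points near the line y = 2x, plus a little noise.
    rng = np.random.default_rng(0)
    t = rng.uniform(-5, 5, size=100)
    X = np.column_stack([t, 2 * t + rng.normal(0, 0.3, size=100)])

    # PCA via the covariance matrix: the top eigenvector is the
    # direction of maximum variance in the data.
    Xc = X - X.mean(axis=0)                 # center the data
    cov = Xc.T @ Xc / (len(Xc) - 1)         # 2x2 sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    direction = eigvecs[:, -1]              # eigenvector of the largest eigenvalue

    # Project every 2-D point onto that one direction: 2 dimensions -> 1.
    Z = Xc @ direction
    print("fraction of variance kept:", eigvals[-1] / eigvals.sum())  # close to 1

The quiz's intuition ("2 dimensions given, 1 needed") is exactly what the printed fraction measures: almost all of the variance survives the projection.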
Google Street View
• A huge photographic database of many, many streets in the world.
• Ground imagery of almost any location in the world.
• There are vast regularities in these images: homes, trees, cars, signboards, etc.
• So one of the fascinating, unsolved unsupervised learning tasks is: given the hundreds of billions of images that comprise the Street View data set, can we discover concepts such as trees, lane markers, stop signs, cars, and pedestrians?

K-means
• K-means (MacQueen, 1967) is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem.
• The procedure follows a simple and easy way to classify a given data set through a certain number of fixed clusters (assume k clusters).

The K-means Algorithm for Clustering

    kmeans(D, K)
      choose K initial means randomly (e.g., pick K points randomly from D)
      means_are_changing = 1
      while means_are_changing
        % assign each point to a cluster
        for i = 1:N
          membership[x(i)] = cluster with mean closest to x(i)
        end
        % update the means
        for k = 1:K
          mean_k = average of vectors x(i) assigned to cluster k
        end
        % check for convergence
        if (new means are the same as old means) then
          means_are_changing = 0
        else
          means_are_changing = 1
        end
      end

k-Means: Step-By-Step Example
• Consider the following data set, consisting of the scores of two variables, A and B, on each of seven individuals. [Data table shown on slide.]
• For K = 2, the data set is to be grouped into two clusters.
• As a first step in finding a sensible initial partition, let the A and B values of the two individuals furthest apart (using the Euclidean distance measure) define the initial cluster means.
• The remaining individuals are then examined in sequence and allocated to the cluster to which they are closest, in terms of Euclidean distance to the cluster mean. The mean vector is recalculated each time a new member is added.
• The initial partition has now changed, and the two clusters at this stage have new means. [Cluster characteristics shown on slide.]
• But we cannot yet be sure that each individual has been assigned to the right cluster. So we compare each individual's distance to its own cluster mean and to the mean of the opposite cluster.
• Only individual 3 is nearer to the mean of the opposite cluster (Cluster 2) than to its own (Cluster 1). In other words, each individual's distance to its own cluster mean should be smaller than its distance to the other cluster's mean, which is not the case for individual 3. Thus, individual 3 is relocated to Cluster 2, resulting in a new partition.
• Iterative relocation would now continue from this new partition until no more relocations occur. However, in this example each individual is already nearer to its own cluster mean than to that of the other cluster, so the iteration stops, and the latest partitioning is taken as the final cluster solution.
• It is also possible that k-means will not settle on a final solution. In that case it is a good idea to stop the algorithm after a pre-chosen maximum number of iterations.

Interactive applet: http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
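As a runnable companion to the pseudocode and the walk-through above, here is a Python sketch. The slide's data table is not reproduced in this text version, so the seven (A, B) points below are an assumption, taken from the classic tutorial example this walk-through appears to follow; the initial means are individuals 1 and 4, the pair furthest apart under that assumed data. Note that the sketch updates the means in batch after each full assignment pass, as the pseudocode does, rather than incrementally as in the step-by-step narration; both reach the same final partition here.

    import numpy as np

    # ASSUMED data: the slide's table of seven individuals is not reproduced
    # in this text version; these (A, B) values come from the classic
    # tutorial example the walk-through appears to follow.
    X = np.array([
        [1.0, 1.0],   # individual 1
        [1.5, 2.0],   # individual 2
        [3.0, 4.0],   # individual 3
        [5.0, 7.0],   # individual 4
        [3.5, 5.0],   # individual 5
        [4.5, 5.0],   # individual 6
        [3.5, 4.5],   # individual 7
    ])

    def kmeans(X, k, init_means):
        """Batch k-means, mirroring the pseudocode above."""
        means = np.array(init_means, dtype=float)
        while True:
            # assign each point to the cluster with the closest mean
            dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # update each mean to the average of the points assigned to it
            new_means = np.array([X[labels == j].mean(axis=0) for j in range(k)])
            # converged when the means stop changing
            if np.allclose(new_means, means):
                return labels, means
            means = new_means

    # Initial means: the two individuals furthest apart (1 and 4 here).
    labels, means = kmeans(X, 2, init_means=[X[0], X[3]])
    print("cluster of each individual:", labels + 1)  # [1 1 2 2 2 2 2]
    print("final cluster means:")
    print(means)  # [[1.25 1.5], [3.9 5.1]] under the assumed data

Under the assumed data, individual 3 ends up in Cluster 2 and every other individual keeps its initial assignment, matching the single relocation described on the slides.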