Machine Learning Lecture 1 Introduction G53MLE: Machine Learning: Guoping Qiu 1 Motivating Problems • Handwritten Character Recognition G53MLE: Machine Learning: Guoping Qiu 2 Motivating Problems • Fingerprint Recognition (e.g., border control) G53MLE: Machine Learning: Guoping Qiu 3 Motivating Problems • Face Recognition (security access to buildings etc) G53MLE: Machine Learning: Guoping Qiu 4 Can Machines Learn to Solve These Problems? Or, to be more precise – Can we program machines to learn to do these tasks? G53MLE: Machine Learning: Guoping Qiu 5 Definition of Learning • A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E (Mitchell, Machine Learning, McGraw-Hill, 1997) G53MLE: Machine Learning: Guoping Qiu 6 Definition of Learning • What does this mean exactly? – Handwriting recognition problem • Task T: Recognizing hand written characters • Performance measure P: percent of characters correctly classified • Training experience E: a database of handwritten characters with given classifications G53MLE: Machine Learning: Guoping Qiu 7 Design a Learning System • We shall use handwritten Character recognition as an example to illustrate the design issues and approaches G53MLE: Machine Learning: Guoping Qiu 8 Design a Learning System Step 0: – Lets treat the learning system as a black box Learning System G53MLE: Machine Learning: Guoping Qiu Z 9 Design a Learning System Step 1: Collect Training Examples (Experience). – Without examples, our system will not learn (so-called learning from examples) 2 3 6 7 8 9 G53MLE: Machine Learning: Guoping Qiu 10 Design a Learning System Step 2: Representing Experience – Choose a representation scheme for the experience/examples (1,1,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1, 1,1,0, …., 1) 64-d Vector (1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1, 1,1,0, …., 1) 64-d Vector • The sensor input represented by an n-d vector, called the feature vector, X = (x1, x2, x3, …, xn) G53MLE: Machine Learning: Guoping Qiu 11 Design a Learning System Step 2: Representing Experience – Choose a representation scheme for the experience/examples • The sensor input represented by an n-d vector, called the feature vector, X = (x1, x2, x3, …, xn) • To represent the experience, we need to know what X is. • So we need a corresponding vector D, which will record our knowledge (experience) about X • The experience E is a pair of vectors E = (X, D) G53MLE: Machine Learning: Guoping Qiu 12 Design a Learning System Step 2: Representing Experience – Choose a representation scheme for the experience/examples • – The experience E is a pair of vectors E = (X, D) So, what would D be like? There are many possibilities. G53MLE: Machine Learning: Guoping Qiu 13 Design a Learning System Step 2: Representing Experience – So, what would D be like? There are many possibilities. – Assuming our system is to recognise 10 digits only, then D can be a 10-d binary vector; each correspond to one of the digits D = (d0, d1, d2, d3, d4, d5, d6, d7, d8, d9) e.g, if X is digit 5, then d5=1; all others =0 If X is digit 9, then d9=1; all others =0 G53MLE: Machine Learning: Guoping Qiu 14 Design a Learning System Step 2: Representing Experience – So, what would D be like? There are many possibilities. – Assuming our system is to recognise 10 digits only, then D can be a 10-d binary vector; each correspond to one of the digits D = (d0, d1, d2, d3, d4, d5, d6, d7, d8, d9) X = (1,1,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1, 1,1,0, …., 1); 64-d Vector D= (0,0,0,0,0,1,0,0,0,0) X= (1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1, 1,1,0, …., 1); 64-d Vector D= (0,0,0,0,0,0,0,0,1,0) G53MLE: Machine Learning: Guoping Qiu 15 Design a Learning System Step 3: Choose a Representation for the Black Box – We need to choose a function F to approximate the block box. For a given X, the value of F will give the classification of X. There are considerable flexibilities in choosing F X Learning System F G53MLE: Machine Learning: Guoping Qiu F(X) 16 Design a Learning System Step 3: Choose a Representation for the Black Box – F will be a function of some adjustable parameters, or weights, W = (w1, w2, w3, …wN), which the learning algorithm can modify or learn X Learning System F(W,X) F(W) G53MLE: Machine Learning: Guoping Qiu 17 Design a Learning System Step 4: Learning/Adjusting the Weights – We need a learning algorithm to adjust the weights such that the experience/prior knowledge from the training data can be learned into the system: E=(X,D) F(W,X) = D G53MLE: Machine Learning: Guoping Qiu 18 Design a Learning System Step 4: Learning/Adjusting the Weights Adjust W X E=(X,D) Learning System F(W,X) D F(W) Error = D-F(W,X) G53MLE: Machine Learning: Guoping Qiu 19 Design a Learning System Step 5: Use/Test the System – Once learning is completed, all parameters are fixed. An unknown input X is presented to the system, the system computes its answer according to F(W,X) X Learning System F(W,X) F(W) G53MLE: Machine Learning: Guoping Qiu Answer 20 Revision of Some Basic Maths • Vector and Matrix – – – – – – • Row vector/column vector/vector transposition Vector length/norm Inner/dot product Matrix (vector) multiplication Linear algebra Euclidean space Basic Calculus – – – Partial derivatives Gradient Chain rule G53MLE: Machine Learning: Guoping Qiu 21 Revision of Some Basic Maths • Inner/dot product x = [x1, x1, …, xn ]T , y = [y1, y1, …, yn ]T Inner/dot product of x and y, xTy n x y x1 y 1 x 2 y 2 x n y n T x i yi i 1 • Matrix/Vector multiplication G53MLE: Machine Learning: Guoping Qiu 22 Revision of Some Basic Maths • Vector space/Euclidean space • A vector space V is a set that is closed under finite vector addition and scalar multiplication. • The basic example is n-dimensional Euclidean space, where every element is represented by a list of n real numbers • An n-dimensional real vector corresponds to a point in the Euclidean space. [1, 3] is a point in 2-dimensional space [2, 4, 6] is point in 3-dimensional space G53MLE: Machine Learning: Guoping Qiu 23 Revision of Some Basic Maths • Vector space/Euclidean space – Euclidean space (Euclidean distance) X Y – x1 2 2 2 Dot/inner product and Euclidean distance • Let x and y are two normalized n vectors, ||x||= 1, ||y||=1, we can write X Y • – y1 x 2 y 2 x n y n 2 X Y T X Y 2 2X Y T Minimization of Euclidean distance between two vectors corresponds to maximization of their inner product. Euclidean distance/inner product as similarity measure G53MLE: Machine Learning: Guoping Qiu 24 Revision of Some Basic Maths • Basic Calculus y ( x ) f ( x1 , x 2 , ..., x n ) – Multivariable function: – Partial derivative: gives the direction and speed of change of y, with respect to xi – Gradient – – f f f , ...... x x n 1 dy Chain rule: Let y = f (g(x)), u = g(x), then dx dz Let z = f(x, y), x = g(t), y = h(t), then G53MLE: Machine Learning: Guoping Qiu dt dy du du dx f dx x dt f dy y dt 25 Feature Space • Representing real world objects using feature vectors i 2 1 x1(i) 3 4 x2(i) 5 6 7 x1 X(i) =[x1(i), x2(i)] 10 9 Feature Vector x1(i) 11 12 13 14 15 Feature Space x2(i) x2 8 16 Elliptical blobs (objects) G53MLE: Machine Learning: Guoping Qiu 26 Feature Space From Objects to Feature Vectors to Points in the Feature Spaces x1 2 1 X(15) X(1) X(7) X(16) X(3) X(8) X(25) X(12) X(9) X(10)X(13)X(6) X(4) X(11) X(14) 3 4 5 6 7 10 9 11 12 13 14 15 8 16 Elliptical blobs (objects) x2 G53MLE: Machine Learning: Guoping Qiu 27 Representing General Objects Feature vectors of Faces Cars Fingerprints Gestures Emotions (a smiling face, a sad expression etc) • … • • • • • G53MLE: Machine Learning: Guoping Qiu 28 Further Reading • T. M. Mitchell, Machine Learning, McGraw-Hill International Edition, 1997 Chapter 1 G53MLE: Machine Learning: Guoping Qiu 29 Tutorial/Exercise Questions 1. Describe informally in one paragraph of English, the task of learning to recognize handwriting numerical digits. 2. Describe the various steps involved in designing a learning system to perform the task of question 1, give as much detail as possible the tasks that have to be performed in each step. 3. For the tasks of learning to recognize human faces and fingerprints respectively, redo questions 1 and 2. 4. In the lecture, we used a very long binary vector to represent the handwriting digits, can you think of other representation methods? G53MLE: Machine Learning: Guoping Qiu 30