Instance-Based Learning
Ata Kaban
The University of Birmingham

Today we learn:
– K-nearest neighbours
– Case-based reasoning
– Lazy and eager learning

Instance-based learning
One way of solving tasks of approximating discrete or real-valued target functions.
We have training examples $(x_n, f(x_n))$, $n = 1, \dots, N$.
Key idea:
– just store the training examples
– when a test example is given, find the closest matches

1-Nearest neighbour:
Given a query instance $x_q$,
• first locate the nearest training example $x_n$
• then $f(x_q) := f(x_n)$

K-Nearest neighbour:
Given a query instance $x_q$,
• first locate the k nearest training examples
• if the target function is discrete-valued, take a vote among the k nearest neighbours; if it is real-valued, take the mean of their f values:
$\hat{f}(x_q) := \frac{1}{k} \sum_{i=1}^{k} f(x_i)$
(A code sketch of both cases appears after the worked example below.)

The distance between examples
We need a measure of distance in order to know which examples are the neighbours.
Assume that we have T attributes for the learning problem. Then one example point x has elements $x_t$, $t = 1, \dots, T$.
The distance between two points $x_i$ and $x_j$ is often defined as the Euclidean distance:
$d(x_i, x_j) = \sqrt{\sum_{t=1}^{T} (x_{ti} - x_{tj})^2}$

Voronoi diagram
[Figure: Voronoi diagram of the training points. 1-NN partitions the input space into cells, one per training example; a query receives the label of the example whose cell it falls into.]

Characteristics of instance-based learning
An instance-based learner is a lazy learner: it does all the work when the test example is presented. This is opposed to so-called eager learners, which build a parameterised, compact model of the target.
It produces a local approximation to the target function (different for each test instance).

When to consider nearest-neighbour algorithms?
– Instances map to points in $\mathbb{R}^n$
– Not more than, say, 20 attributes per instance
– Lots of training data
Advantages:
– Training is very fast
– Can learn complex target functions
– Don't lose information
Disadvantages:
– ? (we will see them shortly…)

[Figure: seven example paintings labelled "one" to "seven", plus an eighth query painting marked "?". Is the eighth a Mondrian?]

Training data

Number  Lines  Line types  Rectangles  Colours  Mondrian?
1       6      1           10          4        No
2       4      2            8          5        No
3       5      2            7          4        Yes
4       5      1            8          4        Yes
5       5      1           10          5        No
6       6      1            8          6        Yes
7       7      1           14          5        No

Test instance

Number  Lines  Line types  Rectangles  Colours  Mondrian?
8       7      2            9          4        ?

Keep data in normalised form
One way to normalise an attribute value $x_t$ to $x_t'$ is
$x_t' = \frac{x_t - \bar{x}_t}{\sigma_t}$
where $\bar{x}_t$ is the mean and $\sigma_t$ the standard deviation of the t-th attribute.

Normalised training data

Number  Lines   Line types  Rectangles  Colours  Mondrian?
1        0.632  -0.632       0.327      -1.021   No
2       -1.581   1.581      -0.588       0.408   No
3       -0.474   1.581      -1.046      -1.021   Yes
4       -0.474  -0.632      -0.588      -1.021   Yes
5       -0.474  -0.632       0.327       0.408   No
6        0.632  -0.632      -0.588       1.837   Yes
7        1.739  -0.632       2.157       0.408   No

Test instance

Number  Lines   Line types  Rectangles  Colours  Mondrian?
8        1.739   1.581      -0.131      -1.021   ?

Distances of the test instance from the training data

Example  Distance from test  Mondrian?
1        2.517               No
2        3.644               No
3        2.395               Yes
4        3.164               Yes
5        3.472               No
6        3.808               Yes
7        3.490               No

Classification: 1-NN → Yes, 3-NN → Yes, 5-NN → No, 7-NN → No.
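To make the algorithm concrete, here is a minimal sketch in Python/NumPy. It is not part of the lecture: the function name knn_predict, its signature, and the array-based data layout are illustrative assumptions.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3, classify=True):
    """Predict the target at x_query from its k nearest training
    examples under Euclidean distance: majority vote for a discrete
    target, mean of the f values for a real-valued one."""
    dists = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]               # the k closest examples
    if classify:
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        return labels[np.argmax(counts)]          # majority vote
    return y_train[nearest].mean()                # mean of the f values
```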
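And here is the Mondrian worked example end-to-end, again as a sketch rather than lecture code. It assumes z-score normalisation with the population standard deviation (NumPy's default, ddof=0), which is what reproduces the normalised table above, and it scales the test instance with the training-set statistics; the printed distances should match the distance table up to rounding.

```python
import numpy as np

# Training data from the tables above: Lines, Line types, Rectangles, Colours.
X = np.array([[6, 1, 10, 4],
              [4, 2,  8, 5],
              [5, 2,  7, 4],
              [5, 1,  8, 4],
              [5, 1, 10, 5],
              [6, 1,  8, 6],
              [7, 1, 14, 5]], dtype=float)
y = np.array(["No", "No", "Yes", "Yes", "No", "Yes", "No"])
x_test = np.array([7, 2, 9, 4], dtype=float)

# Normalise with training-set mean and (population) standard deviation.
mean, std = X.mean(axis=0), X.std(axis=0)
Xn = (X - mean) / std
xq = (x_test - mean) / std

dists = np.sqrt(((Xn - xq) ** 2).sum(axis=1))
print(np.round(dists, 3))        # should match the distance table above

order = np.argsort(dists)        # neighbours from nearest to farthest
for k in (1, 3, 5, 7):
    votes = list(y[order[:k]])
    print(f"{k}-NN:", max(set(votes), key=votes.count))
```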
What if the target function is real-valued?
The k-nearest neighbour algorithm would then just output the mean of the f values of the k nearest neighbours.

Variant of kNN: distance-weighted kNN
We might want to weight nearer neighbours more heavily:
$\hat{f}(x_q) := \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$, where $w_i = \frac{1}{d(x_q, x_i)^2}$
Then it makes sense to use all training examples instead of just k (Shepard's method). (A code sketch appears at the end of these notes.)

Difficulties with k-nearest neighbour algorithms
– Have to calculate the distance of the test case from all training cases
– There may be irrelevant attributes amongst the attributes (curse of dimensionality)

Case-based reasoning (CBR)
CBR is an advanced form of instance-based learning, applied to more complex instance objects.
Objects may include complex structural descriptions of cases and adaptation rules.

CBR cannot use Euclidean distance measures; distance measures must be defined for those complex objects instead (e.g. on semantic nets).
CBR tries to model human problem-solving:
– it uses past experience (cases) to solve new problems
– it retains solutions to new problems
CBR is an ongoing area of machine learning research with many applications.

Applications of CBR
– Design: landscape, building, mechanical, conceptual design of aircraft sub-systems
– Planning: repair schedules
– Diagnosis: medical
– Adversarial reasoning: legal

CBR process
[Flow diagram: a new case is matched against the case base to retrieve the closest case; if adaptation is needed, knowledge and adaptation rules are used to reuse and revise it; the suggested solution is returned, and the solved case is retained (learned) in the case base.]

CBR example: property pricing

Case  Location code  Bedrooms  Recep rooms  Type      Floors  Condition  Price £
1     8              2         1            terraced  1       poor       20,500
2     8              2         2            terraced  1       fair       25,000
3     5              1         2            semi      2       good       48,000
4     5              1         2            terraced  2       good       41,000

Test instance

Case  Location code  Bedrooms  Recep rooms  Type  Floors  Condition  Price £
5     7              2         2            semi  1       poor       ???

How rules are generated
There is no unique way of doing it. Here is one possibility: examine cases and look for pairs that are almost identical.
– cases 1 and 2:
• R1: if recep-rooms changes from 2 to 1, then reduce the price by £5,000
– cases 3 and 4:
• R2: if type changes from semi to terraced, then reduce the price by £7,000

Matching
Compare the test instance with each stored case, counting matching attributes:
– matches(5,1) = 3
– matches(5,2) = 3
– matches(5,3) = 2
– matches(5,4) = 1
The estimated price of case 5 is £25,000, the price of case 2 (one of the two best-matching cases). (A code sketch of this matching and adaptation appears at the end of these notes.)

Adapting
Reverse rule R2:
– if type changes from terraced to semi, then increase the price by £7,000
Apply the reversed rule R2:
– the new estimate for the price of property 5 is £32,000

Learning
So far we have a new case and an estimated price; nothing has been added to the case base yet. If we later find that the house sold for £35,000, the case would be added, and we could also add a new rule:
• if location changes from 8 to 7, increase the price by £3,000

Problems with CBR
– How should cases be represented?
– How should cases be indexed for fast retrieval?
– How can good adaptation heuristics be developed?
– When should old cases be removed?

Advantages
– A local approximation is found for each test case
– Knowledge is in a form understandable to human beings
– Fast to train

Summary
– K-nearest neighbours
– Case-based reasoning
– Lazy and eager learning

Lazy and eager learning
Lazy: wait for the query before generalising
– k-nearest neighbour, case-based reasoning
Eager: generalise before seeing the query
– radial basis function networks, ID3, …
Does it matter?
– An eager learner must create a global approximation
– A lazy learner can create many local approximations
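Finally, the two sketches promised above. First, distance-weighted kNN for a real-valued target: a minimal sketch assuming NumPy arrays, where the function name weighted_knn and the eps guard against zero distances are my additions (for a discrete target one would instead sum the weights per class and pick the heaviest class).

```python
import numpy as np

def weighted_knn(X_train, y_train, x_query, k=None, eps=1e-12):
    """Distance-weighted k-NN for a real-valued target, with weights
    w_i = 1 / d(x_query, x_i)^2.  Passing k=None lets every training
    example contribute, i.e. Shepard's method."""
    d = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    y = y_train
    if k is not None:
        idx = np.argsort(d)[:k]          # keep only the k nearest
        d, y = d[idx], y[idx]
    w = 1.0 / (d ** 2 + eps)             # eps guards against a zero distance
    return float((w * y).sum() / w.sum())
```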
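Second, the property-pricing CBR example: a sketch of the attribute-matching count and the rule-based adaptation, under the assumption that the attribute names and the matches helper are my own naming rather than the lecture's. Cases 1 and 2 tie on three matches; following the lecture, case 2 is kept as the closest case and reversed rule R2 is then applied.

```python
# Stored cases from the property-pricing table above.
cases = {
    1: dict(location=8, bedrooms=2, recep=1, type="terraced",
            floors=1, condition="poor", price=20500),
    2: dict(location=8, bedrooms=2, recep=2, type="terraced",
            floors=1, condition="fair", price=25000),
    3: dict(location=5, bedrooms=1, recep=2, type="semi",
            floors=2, condition="good", price=48000),
    4: dict(location=5, bedrooms=1, recep=2, type="terraced",
            floors=2, condition="good", price=41000),
}
query = dict(location=7, bedrooms=2, recep=2, type="semi",
             floors=1, condition="poor")

def matches(query, case):
    """Count the attributes on which the query agrees with a stored case."""
    return sum(case[a] == v for a, v in query.items())

for cid, case in cases.items():
    print(cid, matches(query, case))     # prints 3, 3, 2, 1

# Retrieve: cases 1 and 2 tie; the lecture keeps case 2, so the first
# estimate is its price, 25,000.
best = cases[2]
estimate = best["price"]

# Adapt: the query is a 'semi' but case 2 is 'terraced', so apply rule
# R2 in reverse (terraced -> semi: increase the price by 7,000).
if query["type"] == "semi" and best["type"] == "terraced":
    estimate += 7000
print(estimate)                          # 32000
```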