Lecture Slides for INTRODUCTION TO Machine Learning
ETHEM ALPAYDIN © The MIT Press, 2004
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml

CHAPTER 12: Local Models

Introduction
- Divide the input space into local regions and learn simple (constant/linear) models in each patch
- Unsupervised: competitive, online clustering
- Supervised: radial-basis functions, mixture of experts

Competitive Learning
E(\{m_i\}_{i=1}^k \mid X) = \sum_t \sum_i b_i^t \, \|x^t - m_i\|
b_i^t = 1 if \|x^t - m_i\| = \min_l \|x^t - m_l\|, and 0 otherwise
Batch k-means: m_i = \frac{\sum_t b_i^t x^t}{\sum_t b_i^t}
Online k-means: \Delta m_{ij} = -\eta \frac{\partial E^t}{\partial m_{ij}} = \eta \, b_i^t (x_j^t - m_{ij})

Winner-take-all network
(figure)

Adaptive Resonance Theory
- Incremental; add a new cluster if the input is not covered; coverage is defined by the vigilance, ρ (Carpenter and Grossberg, 1988)
b_i^t = \|x^t - m_i\| = \min_{l=1}^{k} \|x^t - m_l\|
If b_i^t > \rho: m_{k+1} \leftarrow x^t (add a new cluster); otherwise: \Delta m_i = \eta (x^t - m_i)

Self-Organizing Maps
- Units have a defined neighborhood; m_i is "between" m_{i-1} and m_{i+1}, and all are updated together (Kohonen, 1990)
- One-dimensional map:
\Delta m_l = \eta \, e(l, i) \, (x^t - m_l)
e(l, i) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[-\frac{(l - i)^2}{2\sigma^2}\right]

Radial-Basis Functions
Locally-tuned units:
p_h^t = \exp\!\left[-\frac{\|x^t - m_h\|^2}{2 s_h^2}\right]
y^t = \sum_{h=1}^{H} w_h p_h^t + w_0

Local vs Distributed Representation
(figure)

Training RBF
- Hybrid learning:
  - First layer (centers and spreads): unsupervised k-means
  - Second layer (weights): supervised gradient descent
- Fully supervised
(Broomhead and Lowe, 1988; Moody and Darken, 1989)

Regression
E(\{m_h, s_h, w_{ih}\}_{i,h} \mid X) = \frac{1}{2} \sum_t \sum_i (r_i^t - y_i^t)^2
y_i^t = \sum_{h=1}^{H} w_{ih} p_h^t + w_{i0}
\Delta w_{ih} = \eta \sum_t (r_i^t - y_i^t) \, p_h^t
\Delta m_{hj} = \eta \sum_t \left[\sum_i (r_i^t - y_i^t) w_{ih}\right] p_h^t \, \frac{x_j^t - m_{hj}}{s_h^2}
\Delta s_h = \eta \sum_t \left[\sum_i (r_i^t - y_i^t) w_{ih}\right] p_h^t \, \frac{\|x^t - m_h\|^2}{s_h^3}

Classification
E(\{m_h, s_h, w_{ih}\}_{i,h} \mid X) = -\sum_t \sum_i r_i^t \log y_i^t
y_i^t = \frac{\exp\!\left[\sum_h w_{ih} p_h^t + w_{i0}\right]}{\sum_k \exp\!\left[\sum_h w_{kh} p_h^t + w_{k0}\right]}

Rules and Exceptions
y^t = \sum_{h=1}^{H} w_h p_h^t + v^T x^t + v_0
where the RBF units model the exceptions and the linear term v^T x^t + v_0 is the default rule.

Rule-Based Knowledge
IF (x_1 \approx a AND x_2 \approx b) OR x_3 \approx c THEN y = 0.1
p_1 = \exp\!\left[-\frac{(x_1 - a)^2}{2 s_1^2}\right] \exp\!\left[-\frac{(x_2 - b)^2}{2 s_2^2}\right] with w_1 = 0.1
p_2 = \exp\!\left[-\frac{(x_3 - c)^2}{2 s_3^2}\right] with w_2 = 0.1
- Incorporation of prior knowledge (before training)
- Rule extraction (after training) (Tresp et al., 1997)
- Fuzzy membership functions and fuzzy rules

Normalized Basis Functions
g_h^t = \frac{p_h^t}{\sum_l p_l^t} = \frac{\exp\!\left[-\|x^t - m_h\|^2 / (2 s_h^2)\right]}{\sum_l \exp\!\left[-\|x^t - m_l\|^2 / (2 s_l^2)\right]}
y_i^t = \sum_{h=1}^{H} w_{ih} g_h^t
\Delta w_{ih} = \eta \sum_t (r_i^t - y_i^t) \, g_h^t
\Delta m_{hj} = \eta \sum_t \sum_i (r_i^t - y_i^t)(w_{ih} - y_i^t) \, g_h^t \, \frac{x_j^t - m_{hj}}{s_h^2}

Competitive Basis Functions
Mixture model:
p(r^t \mid x^t) = \sum_{h=1}^{H} p(h \mid x^t) \, p(r^t \mid h, x^t)
p(h \mid x^t) = \frac{p(x^t \mid h) \, p(h)}{\sum_l p(x^t \mid l) \, p(l)}
g_h^t = \frac{a_h \exp\!\left[-\|x^t - m_h\|^2 / (2 s_h^2)\right]}{\sum_l a_l \exp\!\left[-\|x^t - m_l\|^2 / (2 s_l^2)\right]}
with priors a_h \equiv p(h).
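The hybrid recipe from the earlier "Training RBF" and "Regression" slides (unsupervised k-means for the first-layer centers and spreads, then supervised fitting of the second-layer weights) can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the book; the function names (rbf_design, fit_rbf, predict_rbf), setting each spread to the mean distance of a cluster's points to its center, and solving the second layer by least squares rather than gradient descent are all assumptions of the sketch.

```python
import numpy as np

def rbf_design(X, centers, spreads):
    # p_h(x) = exp(-||x - m_h||^2 / (2 s_h^2)) for every (sample, unit) pair
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * spreads ** 2))

def fit_rbf(X, r, H, n_iter=50, rng=np.random.default_rng(0)):
    # First layer (unsupervised): k-means for the centers m_h
    centers = X[rng.choice(len(X), H, replace=False)].copy()
    for _ in range(n_iter):
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        for h in range(H):
            if np.any(labels == h):
                centers[h] = X[labels == h].mean(axis=0)
    # Spreads s_h: mean distance of a cluster's points to its center (assumed heuristic)
    spreads = np.array([
        np.sqrt(((X[labels == h] - centers[h]) ** 2).sum(axis=1)).mean()
        if np.any(labels == h) else 1.0
        for h in range(H)
    ])
    # Second layer (supervised): weights w_h and bias w_0 by linear least squares
    P1 = np.hstack([rbf_design(X, centers, spreads), np.ones((len(X), 1))])
    w = np.linalg.lstsq(P1, r, rcond=None)[0]
    return centers, spreads, w

def predict_rbf(X, centers, spreads, w):
    P1 = np.hstack([rbf_design(X, centers, spreads), np.ones((len(X), 1))])
    return P1 @ w  # y = sum_h w_h p_h + w_0
```

With the same design matrix, the gradient update \Delta w_{ih} = \eta \sum_t (r_i^t - y_i^t) p_h^t from the Regression slide could be used in place of the least-squares solve.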
Regression (competitive basis functions)
L(\{m_h, s_h, w_{ih}\}_{i,h} \mid X) = \sum_t \log \sum_h g_h^t \exp\!\left[-\frac{1}{2} \sum_i (r_i^t - y_{ih}^t)^2\right]
where y_{ih}^t = w_{ih} is the constant fit of unit h.
Posterior responsibility of unit h:
f_h^t = p(h \mid r^t, x^t) = \frac{p(h \mid x^t) \, p(r^t \mid h, x^t)}{\sum_l p(l \mid x^t) \, p(r^t \mid l, x^t)} = \frac{g_h^t \exp\!\left[-\frac{1}{2} \sum_i (r_i^t - y_{ih}^t)^2\right]}{\sum_l g_l^t \exp\!\left[-\frac{1}{2} \sum_i (r_i^t - y_{il}^t)^2\right]}
\Delta w_{ih} = \eta \sum_t (r_i^t - y_{ih}^t) \, f_h^t
\Delta m_{hj} = \eta \sum_t (f_h^t - g_h^t) \, \frac{x_j^t - m_{hj}}{s_h^2}

Classification (competitive basis functions)
L(\{m_h, s_h, w_{ih}\}_{i,h} \mid X) = \sum_t \log \sum_h g_h^t \prod_i (y_{ih}^t)^{r_i^t} = \sum_t \log \sum_h g_h^t \exp\!\left[\sum_i r_i^t \log y_{ih}^t\right]
y_{ih}^t = \frac{\exp w_{ih}}{\sum_k \exp w_{kh}}
f_h^t = \frac{g_h^t \exp\!\left[\sum_i r_i^t \log y_{ih}^t\right]}{\sum_l g_l^t \exp\!\left[\sum_i r_i^t \log y_{il}^t\right]}

EM for RBF (Supervised EM)
E-step: f_h^t = p(h \mid r^t, x^t)
M-step:
m_h = \frac{\sum_t f_h^t x^t}{\sum_t f_h^t}
S_h = \frac{\sum_t f_h^t (x^t - m_h)(x^t - m_h)^T}{\sum_t f_h^t}
w_{ih} = \frac{\sum_t f_h^t r_i^t}{\sum_t f_h^t}

Learning Vector Quantization
- H units per class, prelabeled (Kohonen, 1990)
- Given x^t, m_i is the closest:
\Delta m_i = \eta (x^t - m_i) if x^t and m_i have the same class label
\Delta m_i = -\eta (x^t - m_i) otherwise

Mixture of Experts
- In RBF, each local fit is a constant, w_{ih}, the second-layer weight
- In MoE, each local fit is a linear function of x, a local expert:
w_{ih}^t = v_{ih}^T x^t
(Jacobs et al., 1991)

MoE as Models Combined
Radial gating:
g_h^t = \frac{\exp\!\left[-\|x^t - m_h\|^2 / (2 s_h^2)\right]}{\sum_l \exp\!\left[-\|x^t - m_l\|^2 / (2 s_l^2)\right]}
Softmax gating:
g_h^t = \frac{\exp\!\left[m_h^T x^t\right]}{\sum_l \exp\!\left[m_l^T x^t\right]}

Cooperative MoE
Regression:
E(\{m_h, s_h, w_{ih}\}_{i,h} \mid X) = \frac{1}{2} \sum_t \sum_i (r_i^t - y_i^t)^2
\Delta v_{ih} = \eta \sum_t (r_i^t - y_i^t) \, g_h^t \, x^t
\Delta m_{hj} = \eta \sum_t \sum_i (r_i^t - y_i^t)(y_{ih}^t - y_i^t) \, g_h^t \, x_j^t

Competitive MoE: Regression
L(\{m_h, s_h, w_{ih}\}_{i,h} \mid X) = \sum_t \log \sum_h g_h^t \exp\!\left[-\frac{1}{2} \sum_i (r_i^t - y_{ih}^t)^2\right]
y_{ih}^t = w_{ih}^t = v_{ih}^T x^t
\Delta v_{ih} = \eta \sum_t (r_i^t - y_{ih}^t) \, f_h^t \, x^t
\Delta m_h = \eta \sum_t (f_h^t - g_h^t) \, x^t

Competitive MoE: Classification
L(\{m_h, s_h, w_{ih}\}_{i,h} \mid X) = \sum_t \log \sum_h g_h^t \prod_i (y_{ih}^t)^{r_i^t} = \sum_t \log \sum_h g_h^t \exp\!\left[\sum_i r_i^t \log y_{ih}^t\right]
y_{ih}^t = \frac{\exp w_{ih}^t}{\sum_k \exp w_{kh}^t} = \frac{\exp\!\left[v_{ih}^T x^t\right]}{\sum_k \exp\!\left[v_{kh}^T x^t\right]}

Hierarchical Mixture of Experts
- Tree of MoEs where each MoE is an expert in a higher-level MoE
- Soft decision tree: takes a weighted (gating) average of all leaves (experts), as opposed to using a single path and a single leaf
- Can be trained with EM (Jordan and Jacobs, 1994)
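To make the mixture-of-experts equations concrete, here is a small NumPy sketch of a competitive MoE for single-output regression with linear experts and softmax gating, updated by per-sample gradient steps in the spirit of the "Competitive MoE: Regression" slide. It is a hedged sketch under those assumptions, not a reference implementation; the class name MoE, the learning rate, and the initialization are illustrative choices that do not appear on the slides.

```python
import numpy as np

class MoE:
    """Minimal mixture of experts: linear experts, softmax gating, one output."""

    def __init__(self, n_inputs, n_experts, eta=0.05, rng=np.random.default_rng(0)):
        self.v = rng.normal(scale=0.1, size=(n_experts, n_inputs + 1))  # expert weights v_h
        self.m = rng.normal(scale=0.1, size=(n_experts, n_inputs + 1))  # gating weights m_h
        self.eta = eta

    def _forward(self, x):
        x1 = np.append(x, 1.0)                     # augment with a bias term
        w = self.v @ x1                            # expert outputs w_h = v_h^T x
        a = self.m @ x1
        g = np.exp(a - a.max()); g /= g.sum()      # softmax gating g_h
        return x1, w, g

    def predict(self, x):
        _, w, g = self._forward(x)
        return g @ w                               # y = sum_h g_h w_h

    def partial_fit(self, x, r):
        # Competitive update: f_h proportional to g_h exp(-(r - w_h)^2 / 2),
        # then Delta v_h = eta (r - w_h) f_h x and Delta m_h = eta (f_h - g_h) x.
        x1, w, g = self._forward(x)
        e2 = (r - w) ** 2
        f = g * np.exp(-0.5 * (e2 - e2.min()))     # shift by the min for stability;
        f /= f.sum()                               # normalization is unchanged
        self.v += self.eta * ((r - w) * f)[:, None] * x1
        self.m += self.eta * (f - g)[:, None] * x1
```

Training would loop over the data several times, calling partial_fit on each (x^t, r^t) pair; swapping softmax gating for radial gating would change only _forward and the update of the gating parameters.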