Lecture 21 Bounds on the generalization error of voting classiﬁers. 18.465

Lecture 21 Bounds on the generalization error of voting classifiers. 18.465 We continue to prove the lemma from Lecture 20: �d Lemma 21.1. Let Fd = convd H = { i=1 λi hi , hi ∈ H} and fix δ ∈ (0, 1]. Then � �� dV log nδ t Eϕδ − ϕ¯δ ≥ 1 − e−t . ≤K + P ∀f ∈ Fd , √ n n Eϕδ Proof. We showed that log D(ϕδ (yFd ) , ε/2, dx,y ) ≤ KdV log 2 . εδ By the result of Lecture 16, n 1� k Eϕδ (yf (x)) − ϕδ (yi f (xi )) ≤ √ n i=1 n � √ Eϕδ � log 1/2 D(ϕδ (yFd (x)) , ε)dε + 0 tEϕδ n with probability at least 1 − e−t . We have � √Eϕδ � √Eϕδ � k k 2 1/2 √ log D(ϕδ (yFd (x)) , ε)dε ≤ √ dV log dε εδ n 0 n 0 � � √ k 2 δ Eϕδ /2 √ 1 =√ dV log dx x nδ 0 � k 2√ δ� 2 ≤√ dV 2 Eϕδ log √ 2 nδ δ Eϕδ = x, ε = 2x . Without loss of generality, assume Eϕδ ≥ 1/n. � δ Otherwise, we’re doing better than in Lemma: √EE ≤ logn n ⇒ E ≤ logn n . Hence, � √ � √Eϕδ k dV Eϕδ 2 n 1/2 √ log D(ϕδ (yFd (x)) , ε)dε ≤ K log n δ n 0 � dV Eϕδ n ≤K log n δ where we have made a change of variables 2 εδ So, with probability at least 1 − e−t , n 1� Eϕδ (yf (x)) − ϕδ (yi f (xi )) ≤ K n i=1 � dV Eϕδ (yf (x)) n log + n δ � tEϕδ (yf (x)) n which concludes the proof. � The above lemma gives a result for a fixed d ≥ 1 and δ ∈ (0, 1]. To obtain a uniform result, it’s enough to consider δ ∈ Δ = {2−k , k ≥ 1} and d ∈ {1, 2, . . .}. For a fixed δ and d, use the Lemma above with tδ,d defined by e−tδ,d = e−t d26πδ2 . Then � P ∀f ∈ Fd , . . . + and ⎛ P⎝ � d,δ ∀f ∈ Fd , . . . + � tδ,d n � � ≥ 1 − e−tδ,d = 1 − e−t 6δ d2 π 2 �⎞ � tδ,d ⎠ 6δ ≥1− e−t 2 2 = 1 − e−t . n d π δ,d 53 Lecture 21 Bounds on the generalization error of voting classifiers. Since tδ,d = t + log 18.465 d2 π 2 6δ , � ⎞ ⎛� 2 2 t + log d 6δπ dV log nδ Eϕδ − ϕ¯δ ⎠ ∀f ∈ Fd , √ + ≤K⎝ n n Eϕδ �� dV log nδ d2 π 2 t ≤K + log + n n 6δ �� n dV log t δ ≤ K� + n n � 2 2 dV log n δ since log d 6δπ , the penalty for union-bound, is much smaller than . n Recall the bound on the misclassification error n 1 � P (yf (x) ≤ 0) ≤ I(yi f (xi ) ≤ δ) + n i=1 If � � n 1� Eϕδ (yf (x)) − ϕδ (yi f (xi )) . n i=1 �n Eϕδ − n1 i=1 ϕδ √ ≤ ε, Eϕδ then n Eϕδ − ε � Eϕδ − 1� ϕδ ≤ 0. n i=1 Hence, � � �� n � ε ε 2 1� Eϕδ ≤ + � + ϕδ 2 2 n i=1 Eϕδ ≤ 2 � ε �2 2 n +2 1� ϕδ . n i=1 The bound becomes ⎛ ⎞ n ⎜ 1 � dV n t⎟ ⎜ ⎟ P (yf (x) ≤ 0) ≤ K ⎜ I(yi f (xi ) ≤ δ) + log + ⎟ n δ n ⎝ n i=1 ⎠ � �� (*) where K is a rough constant. (*) not satisfactory because in boosting the bound should get better when the number of functions grows. We prove a better bound in the next lecture. 54

Lecture 21 Bounds on the generalization error of voting classiﬁers. 18.465

Related documents

Products

Support

Lecture 21 Bounds on the generalization error of voting classiﬁers. 18.465

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib