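Lecture 31 Optimistic VC inequality for random classes of sets. 18.465

As in the previous lecture, let $\mathcal{F} = \{(w, \phi(x))_{\mathcal{H}} : \|w\| \le 1\}$, where $\phi(x) = (\sqrt{\lambda_i}\,\phi_i(x))_{i \ge 1}$ and $\mathcal{X} \subset \mathbb{R}^d$. Define
\[
d(f, g) = \|f - g\|_\infty = \sup_{x \in \mathcal{X}} |f(x) - g(x)|.
\]
The following theorem appears in Cucker & Smale:

Theorem 31.1. For all $h \ge d$,
\[
\log \mathcal{N}(\mathcal{F}, \varepsilon, d) \le \left(\frac{C_h}{\varepsilon}\right)^{2d/h},
\]
where $C_h$ is a constant.

Note that for any $x_1, \dots, x_n$,
\[
d_x(f, g) = \left(\frac{1}{n}\sum_{i=1}^n (f(x_i) - g(x_i))^2\right)^{1/2} \le d(f, g) = \sup_x |f(x) - g(x)|,
\]
so $d(f, g) \le \varepsilon$ implies $d_x(f, g) \le \varepsilon$. Hence,
\[
\mathcal{N}(\mathcal{F}, \varepsilon, d_x) \le \mathcal{N}(\mathcal{F}, \varepsilon, d).
\]

Assume the loss function is $L(y, f(x)) = (y - f(x))^2$. The loss class is defined as
\[
L(y, \mathcal{F}) = \{(y - f(x))^2 : f \in \mathcal{F}\}.
\]
Suppose $|y - f(x)| \le M$. Then, whenever $|f(x) - g(x)| \le \varepsilon/(2M)$,
\[
|(y - f(x))^2 - (y - g(x))^2| \le 2M |f(x) - g(x)| \le \varepsilon.
\]
So,
\[
\mathcal{N}(L(y, \mathcal{F}), \varepsilon, d_x) \le \mathcal{N}\left(\mathcal{F}, \frac{\varepsilon}{2M}, d_x\right)
\]
and
\[
\log \mathcal{N}(L(y, \mathcal{F}), \varepsilon, d_x) \le \left(\frac{2M C_h}{\varepsilon}\right)^{2d/h} = \left(\frac{2M C_h}{\varepsilon}\right)^{\alpha},
\]
where $\alpha = 2d/h < 2$ for $h > d$ (see Homework 2, problem 4).

Now, we would like to use the specific form of the solution for SVM: $f(x) = \sum_{i=1}^n \alpha_i K(x_i, x)$, i.e. $f$ belongs to a random subclass. We now prove a VC inequality for a random collection of sets.

Consider $\mathcal{C}(x_1, \dots, x_n) = \{C : C \subseteq \mathcal{X}\}$, a random collection of sets. Assume that $\mathcal{C}(x_1, \dots, x_n)$ satisfies:
(1) $\mathcal{C}(x_1, \dots, x_n) \subseteq \mathcal{C}(x_1, \dots, x_n, x_{n+1})$;
(2) $\mathcal{C}(\pi(x_1, \dots, x_n)) = \mathcal{C}(x_1, \dots, x_n)$ for any permutation $\pi$.

Let
\[
\Delta_{\mathcal{C}}(x_1, \dots, x_n) = \mathrm{card}\,\{C \cap \{x_1, \dots, x_n\} : C \in \mathcal{C}\}
\]
and
\[
G(n) = \mathbb{E}\,\Delta_{\mathcal{C}(x_1, \dots, x_n)}(x_1, \dots, x_n).
\]

Theorem 31.2.
\[
\mathbb{P}\left(\sup_{C \in \mathcal{C}(x_1, \dots, x_n)} \frac{P(C) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C)}{\sqrt{P(C)}} \ge t\right) \le 4\,G(2n)\,e^{-\frac{nt^2}{4}}.
\]

Consider the event
\[
A_x = \left\{x = (x_1, \dots, x_n) : \sup_{C \in \mathcal{C}(x_1, \dots, x_n)} \frac{P(C) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C)}{\sqrt{P(C)}} \ge t\right\}.
\]
So, for $x \in A_x$, there exists $C_x \in \mathcal{C}(x_1, \dots, x_n)$ such that
\[
\frac{P(C_x) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)}{\sqrt{P(C_x)}} \ge t.
\]
For $x'_1, \dots, x'_n$, an independent copy of $x$,
\[
\mathbb{P}_{x'}\left(P(C_x) \le \frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x)\right) \ge \frac{1}{4}
\]
if $P(C_x) \ge \frac{1}{n}$ (which we can assume without loss of generality).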
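The constant 1/4 used above is a statement about a binomial variable: n times the empirical frequency of C_x on the independent copy is Binomial(n, p) with p = P(C_x), and the claim is that it is at least its mean np with probability at least 1/4 whenever p ≥ 1/n. The notes use this without proof. As a sanity check on a finite grid (not a proof), the sketch below evaluates that probability exactly; the function names `prob_at_least_mean` and `worst_case` and the grid sizes are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction
from math import comb


def prob_at_least_mean(n, p):
    """Exact P(B >= n*p) for B ~ Binomial(n, p), with p given as a Fraction."""
    # ceil(n * p) computed in integer arithmetic to avoid rounding issues
    k_min = -((-n * p.numerator) // p.denominator)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def worst_case(max_n, grid):
    """Smallest P(B >= n*p) over n <= max_n and p = j/grid with p >= 1/n."""
    worst = Fraction(1)
    for n in range(1, max_n + 1):
        for j in range(1, grid + 1):
            p = Fraction(j, grid)
            if n * p >= 1:  # the assumption P(C_x) >= 1/n from the notes
                worst = min(worst, prob_at_least_mean(n, p))
    return worst


if __name__ == "__main__":
    print(float(worst_case(30, 20)))
```

For n = 2 and p slightly above 1/2 the probability equals p^2, which is close to 1/4, so the constant cannot be improved in general.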
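Together,
\[
P(C_x) \le \frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x)
\quad\text{and}\quad
\frac{P(C_x) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)}{\sqrt{P(C_x)}} \ge t
\]
imply
\[
\frac{\frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)}{\sqrt{\frac{1}{2n}\sum_{i=1}^n \left(I(x'_i \in C_x) + I(x_i \in C_x)\right)}} \ge t.
\]
Indeed,
\[
\begin{aligned}
0 < t &\le \frac{P(C_x) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)}{\sqrt{P(C_x)}} \\
&\le \frac{P(C_x) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)}{\sqrt{\frac{1}{2}\left(P(C_x) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)\right)}} \\
&\le \frac{\frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)}{\sqrt{\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)\right)}}.
\end{aligned}
\]
The second inequality holds because $t > 0$ forces $\frac{1}{n}\sum_{i=1}^n I(x_i \in C_x) < P(C_x)$, so the denominator only decreases. The third holds because $s \mapsto (s - a)/\sqrt{(s + a)/2}$ is increasing in $s$ for $s > a \ge 0$, and $P(C_x) \le \frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x)$.

Hence, multiplying by an indicator,
\[
\begin{aligned}
\frac{1}{4} \cdot I(x \in A_x)
&\le \mathbb{P}_{x'}\left(P(C_x) \le \frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x)\right) \cdot I(x \in A_x) \\
&\le \mathbb{P}_{x'}\left(\frac{\frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)}{\sqrt{\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^n I(x'_i \in C_x) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C_x)\right)}} \ge t\right) \\
&\le \mathbb{P}_{x'}\left(\sup_{C \in \mathcal{C}(x_1, \dots, x_n)} \frac{\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C)}{\sqrt{\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C)\right)}} \ge t\right).
\end{aligned}
\]

Taking expectation with respect to $x$ on both sides,
\[
\begin{aligned}
&\mathbb{P}\left(\sup_{C \in \mathcal{C}(x_1, \dots, x_n)} \frac{P(C) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C)}{\sqrt{P(C)}} \ge t\right) \\
&\quad\le 4\,\mathbb{P}\left(\sup_{C \in \mathcal{C}(x_1, \dots, x_n)} \frac{\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C)}{\sqrt{\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C)\right)}} \ge t\right) \\
&\quad\le 4\,\mathbb{P}\left(\sup_{C \in \mathcal{C}(x_1, \dots, x_n, x'_1, \dots, x'_n)} \frac{\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) - \frac{1}{n}\sum_{i=1}^n I(x_i \in C)}{\sqrt{\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C)\right)}} \ge t\right) \\
&\quad= 4\,\mathbb{P}\left(\sup_{C \in \mathcal{C}(x_1, \dots, x_n, x'_1, \dots, x'_n)} \frac{\frac{1}{n}\sum_{i=1}^n \varepsilon_i\left(I(x'_i \in C) - I(x_i \in C)\right)}{\sqrt{\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C)\right)}} \ge t\right) \\
&\quad= 4\,\mathbb{E}\,\mathbb{P}_\varepsilon\left(\sup_{C \in \mathcal{C}(x_1, \dots, x_n, x'_1, \dots, x'_n)} \frac{\frac{1}{n}\sum_{i=1}^n \varepsilon_i\left(I(x'_i \in C) - I(x_i \in C)\right)}{\sqrt{\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C)\right)}} \ge t\right).
\end{aligned}
\]
The second inequality uses property (1). In the first equality, $\varepsilon_1, \dots, \varepsilon_n$ are i.i.d. Rademacher signs independent of $x, x'$: swapping $x_i$ and $x'_i$ changes neither the joint distribution nor the denominator, and by property (2) it does not change the class $\mathcal{C}(x_1, \dots, x_n, x'_1, \dots, x'_n)$.

Conditionally on $x, x'$, the ratio depends on $C$ only through $C \cap \{x_1, \dots, x_n, x'_1, \dots, x'_n\}$, so the supremum is over at most $\Delta_{\mathcal{C}(x_1, \dots, x_n, x'_1, \dots, x'_n)}(x_1, \dots, x_n, x'_1, \dots, x'_n)$ distinct values. For each fixed $C$, by Hoeffding,
\[
\mathbb{P}_\varepsilon\left(\frac{\frac{1}{n}\sum_{i=1}^n \varepsilon_i\left(I(x'_i \in C) - I(x_i \in C)\right)}{\sqrt{\frac{1}{2n}\sum_{i=1}^n \left(I(x'_i \in C) + I(x_i \in C)\right)}} \ge t\right)
\le \exp\left(-\frac{t^2}{2\sum_{i=1}^n \left(\frac{\frac{1}{n}\left(I(x'_i \in C) - I(x_i \in C)\right)}{\sqrt{\frac{1}{2n}\sum_{j=1}^n \left(I(x'_j \in C) + I(x_j \in C)\right)}}\right)^2}\right)
\le e^{-\frac{nt^2}{4}},
\]
where the last step uses $\left(I(x'_i \in C) - I(x_i \in C)\right)^2 \le I(x'_i \in C) + I(x_i \in C)$, so that the sum in the exponent is at most $2/n$. By the union bound,
\[
\begin{aligned}
&4\,\mathbb{E}\,\mathbb{P}_\varepsilon\left(\sup_{C \in \mathcal{C}(x_1, \dots, x_n, x'_1, \dots, x'_n)} \frac{\frac{1}{n}\sum_{i=1}^n \varepsilon_i\left(I(x'_i \in C) - I(x_i \in C)\right)}{\sqrt{\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^n I(x'_i \in C) + \frac{1}{n}\sum_{i=1}^n I(x_i \in C)\right)}} \ge t\right) \\
&\quad\le 4\,\mathbb{E}\,\Delta_{\mathcal{C}(x_1, \dots, x_n, x'_1, \dots, x'_n)}(x_1, \dots, x_n, x'_1, \dots, x'_n) \cdot e^{-\frac{nt^2}{4}} \\
&\quad= 4\,G(2n)\,e^{-\frac{nt^2}{4}}.
\end{aligned}
\]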
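The final bound rests on a deterministic fact about indicators: writing a_i for the coefficient of ε_i in the normalized sum, the quantity Σ a_i² that appears in Hoeffding's exponent never exceeds 2/n, because the squared difference of two indicators is at most their sum. A minimal sketch with exact rational arithmetic checks this on random 0/1 patterns. The names `variance_proxy` and `max_scaled_proxy` are hypothetical, and the random patterns merely stand in for the traces of a set C on the two samples.

```python
import random
from fractions import Fraction


def variance_proxy(ind_x, ind_xp):
    """Sum of a_i^2, where a_i = (1/n)(I'_i - I_i) / sqrt((1/(2n)) * sum_j (I'_j + I_j)).

    ind_x[i] plays the role of I(x_i in C), ind_xp[i] the role of I(x'_i in C).
    """
    n = len(ind_x)
    total = sum(ind_x) + sum(ind_xp)
    if total == 0:
        return Fraction(0)  # all numerators vanish; treat the ratio as 0
    denom = Fraction(total, 2 * n)
    return sum(Fraction(b - a, n) ** 2 for a, b in zip(ind_x, ind_xp)) / denom


def max_scaled_proxy(n, trials, seed=0):
    """Largest value of (n/2) * variance_proxy over random 0/1 patterns."""
    rng = random.Random(seed)
    worst = Fraction(0)
    for _ in range(trials):
        ind_x = [rng.randint(0, 1) for _ in range(n)]
        ind_xp = [rng.randint(0, 1) for _ in range(n)]
        worst = max(worst, variance_proxy(ind_x, ind_xp) * n / 2)
    return worst


if __name__ == "__main__":
    # A value of at most 1 means the proxy never exceeded 2/n in these trials.
    print(max_scaled_proxy(10, 200))
```

A proxy of at most 2/n makes the exponent t²/(2 Σ a_i²) at least nt²/4, which is the rate in Theorem 31.2. Equality occurs when no index has both indicators equal to 1.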