18.657. Fall 2015. October 18, 2015. Rigollet

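Problem set #2 (due Wed., October 21)

Should be typed in LaTeX.

Problem 1. Rademacher Complexities and beyond

Let $\mathcal{F}$ be a class of functions from $\mathcal{X}$ to $\mathbb{R}$ and let $X_1, \dots, X_n$ be i.i.d. copies of a random variable $X \in \mathcal{X}$. Moreover, let $\sigma_1, \dots, \sigma_n$ be $n$ i.i.d. Rad(1/2) random variables and let $g_1, \dots, g_n$ be $n$ i.i.d. $N(0, 1)$ random variables. Assume that all these random variables are mutually independent.

1. Prove the desymmetrization inequality:
\[
\mathbb{E}\Big[\sup_{f \in \mathcal{F}} \Big| \frac{1}{n} \sum_{i=1}^n \sigma_i \big( f(X_i) - \mathbb{E}[f(X)] \big) \Big| \Big]
\le 2\, \mathbb{E}\Big[\sup_{f \in \mathcal{F}} \Big| \frac{1}{n} \sum_{i=1}^n f(X_i) - \mathbb{E}[f(X)] \Big| \Big]
\]

2. Prove the Rademacher/Gaussian process comparison inequality:
\[
\mathbb{E}\Big[\sup_{f \in \mathcal{F}} \Big| \sum_{i=1}^n \sigma_i f(X_i) \Big| \Big]
\le \sqrt{\frac{\pi}{2}}\; \mathbb{E}\Big[\sup_{f \in \mathcal{F}} \Big| \sum_{i=1}^n g_i f(X_i) \Big| \Big]
\]

Define
\[
R_n(\mathcal{F}) = \mathbb{E}\Big[\sup_{f \in \mathcal{F}} \Big| \frac{1}{n} \sum_{i=1}^n \sigma_i f(X_i) \Big| \Big].
\]
Let $\mathcal{F}$ and $\mathcal{G}$ be two sets of functions from $\mathcal{X}$ to $\mathbb{R}$ and recall that $\mathcal{F} + \mathcal{G} = \{f + g : f \in \mathcal{F},\ g \in \mathcal{G}\}$.

3. Let $h \in \mathbb{R}^{\mathcal{X}}$ be a given function and define $\mathcal{F} + h = \{f + h : f \in \mathcal{F}\}$. Show that
\[
R_n(\mathcal{F} + \{h\}) \le R_n(\mathcal{F}) + \frac{\|h\|_\infty}{\sqrt{n}},
\]
where $\|h\|_\infty = \sup_{x \in \mathcal{X}} |h(x)|$.

4. Let $\mathcal{F}_1, \dots, \mathcal{F}_k$ be $k$ sets of functions from $\mathcal{X}$ to $\mathbb{R}$. Show that
\[
R_n(\mathcal{F}_1 + \cdots + \mathcal{F}_k) \le \sum_{j=1}^k R_n(\mathcal{F}_j).
\]

5. Show that the inequality derived in 4. is in fact an equality when the $\mathcal{F}_j$'s are the same.

Problem 2. Covering and packing

Definition: A set $P \subset T$ is called an $\varepsilon$-packing of the metric space $(T, d)$ if $d(f, g) > \varepsilon$ for every $f, g \in P$, $f \neq g$. The largest cardinality of an $\varepsilon$-packing of $(T, d)$ is called the packing number of $(T, d)$:
\[
D(T, d, \varepsilon) = \sup\big\{ \mathrm{card}(P) : P \text{ is an } \varepsilon\text{-packing of } (T, d) \big\}.
\]
Recall that $N(T, d, \varepsilon)$ denotes the $\varepsilon$-covering number of $(T, d)$.

1. Show that
\[
D(T, d, 2\varepsilon) \le N(T, d, \varepsilon) \le D(T, d, \varepsilon).
\]

Let $M$ be an $n \times m$ random matrix whose entries are i.i.d. Rad(1/2). We are interested in its operator norm
\[
\|M\| = \sup_{\substack{u \in \mathbb{R}^n :\ |u|_2 \le 1 \\ v \in \mathbb{R}^m :\ |v|_2 \le 1}} u^\top M v.
\]

2. Show that
\[
\|M\| \le 2 \max_{\substack{u \in \mathcal{N}_n \\ v \in \mathcal{N}_m}} u^\top M v,
\]
where $\mathcal{N}_n$ and $\mathcal{N}_m$ are $\frac{1}{4}$-nets of the unit Euclidean balls of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively.

3. Conclude that
\[
\mathbb{E}\|M\| \le C\big(\sqrt{m} + \sqrt{n}\big).
\]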
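As a numerical sanity check for Problem 2, part 3 (not a proof), the sketch below simulates Rademacher matrices and compares an estimate of $\|M\|$ with $\sqrt{m} + \sqrt{n}$. It is a minimal illustration using only the Python standard library. The function names `rademacher_matrix`, `operator_norm` and `mean_norm_ratio` are hypothetical helpers introduced here, and the operator norm is approximated from below by power iteration on $M^\top M$, which is an assumption of this sketch rather than part of the problem.

```python
import math
import random


def rademacher_matrix(n, m, rng):
    """Return an n x m matrix with i.i.d. entries uniform on {-1, +1}."""
    return [[rng.choice((-1.0, 1.0)) for _ in range(m)] for _ in range(n)]


def operator_norm(M, iters=100, seed=0):
    """Approximate ||M|| = sup u^T M v over unit vectors u, v.

    Runs power iteration on M^T M and returns |M v|_2 for the final unit
    vector v, which is always a lower bound on the true operator norm.
    """
    rng = random.Random(seed)
    n, m = len(M), len(M[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(m)]
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(m)) for i in range(n)]  # u = M v
        w = [sum(M[i][j] * u[i] for i in range(n)) for j in range(m)]  # w = M^T u
        scale = math.sqrt(sum(x * x for x in w))
        if scale == 0.0:
            return 0.0
        v = [x / scale for x in w]  # renormalise to a unit vector
    Mv = [sum(M[i][j] * v[j] for j in range(m)) for i in range(n)]
    return math.sqrt(sum(x * x for x in Mv))


def mean_norm_ratio(n, m, trials=20, seed=0):
    """Monte Carlo average of ||M|| / (sqrt(m) + sqrt(n)) over random draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        M = rademacher_matrix(n, m, rng)
        total += operator_norm(M) / (math.sqrt(m) + math.sqrt(n))
    return total / trials


if __name__ == "__main__":
    for n, m in [(10, 10), (20, 30), (40, 15)]:
        print(n, m, round(mean_norm_ratio(n, m), 3))
```

If the bound in part 3 holds, the printed ratios should stay bounded as $n$ and $m$ vary; the simulation only suggests what a valid constant $C$ might look like and says nothing about the net argument itself.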
Problem 3. Chaining

Let $\mathcal{F}$ be the class of all nondecreasing functions from $[0, 1]$ to $[0, 1]$.

1. Show that for any $x = (x_1, \dots, x_n) \in [0, 1]^n$, the covering number of $(\mathcal{F}, d^x_\infty)$ satisfies
\[
N(\mathcal{F}, d^x_\infty, \varepsilon) \le n^{2/\varepsilon}.
\]

2. Using the chaining bound, show that
\[
R_n(\mathcal{F}) \le C \sqrt{\frac{\log n}{n}}.
\]

3. Show that there is indeed a strict improvement over the bound obtained using the theorem in Section 5.2.1.

Problem 4. Kernel ridge regression

Consider the regression model
\[
Y_i = f(x_i) + \xi_i, \qquad i = 1, \dots, n,
\]
where $x_1, \dots, x_n$ are fixed design points in $\mathbb{R}^d$, $\xi = (\xi_1, \dots, \xi_n)^\top \sim N(0, \Sigma)$ is a random vector in $\mathbb{R}^n$ with known covariance matrix $\Sigma \succ 0$, and $f : \mathbb{R}^d \to \mathbb{R}$ is an unknown regression function. Let $W$ be an RKHS on $\mathbb{R}^d$ with reproducing kernel $k$. Define $Y = (Y_1, \dots, Y_n)^\top$ and $\mathbf{g} = [g(x_1), \dots, g(x_n)]^\top$ for any function $g$. Define the estimator $\hat{f}$ of $f$ by
\[
\hat{f} = \operatorname*{argmin}_{g \in W} \big\{ \psi(Y - \mathbf{g}) + \mu \|g\|_W^2 \big\},
\]
where $\|\cdot\|_W$ denotes the Hilbert norm on $W$, $\psi(x) = x^\top \Sigma^{-1/2} x$, and $\mu > 0$ is a tuning parameter to be chosen later.

1. Prove the representer theorem, i.e., that there exists a vector $\theta \in \mathbb{R}^n$ such that
\[
\hat{f}(x) = \sum_{i=1}^n \theta_i k(x_i, x)
\]
for any $x \in \mathbb{R}^d$.

2. Prove that the vector $\hat{\mathbf{f}} = [\hat{f}(x_1), \dots, \hat{f}(x_n)]^\top$ satisfies
\[
(K \Sigma^{-1/2} + \mu I_n)\, \hat{\mathbf{f}} = K \Sigma^{-1/2} Y,
\]
where $I_n$ is the identity matrix of $\mathbb{R}^n$ and $K$ denotes the symmetric $n \times n$ matrix with elements $K_{i,j} = k(x_i, x_j)$.

3. Prove that the following inequality holds:
\[
\psi(\mathbf{f} - \hat{\mathbf{f}}) \le \inf_{g \in W} \big\{ \psi(\mathbf{f} - \mathbf{g}) + 2\mu \|g\|_W^2 \big\} + \frac{1}{\mu} \Big\| \sum_{i=1}^n Z_i k(x_i, \cdot) \Big\|_W^2,
\]
where $Z_1, \dots, Z_n$ are i.i.d. $N(0, 1)$.

4. Conclude that
\[
\mathbb{E}\, \psi(\mathbf{f} - \hat{\mathbf{f}}) \le \inf_{g \in W} \big\{ \psi(\mathbf{f} - \mathbf{g}) + 2\mu \|g\|_W^2 \big\} + \frac{1}{\mu} \mathrm{Tr}(K),
\]
where $\mathrm{Tr}(K)$ denotes the trace of $K$.

5. Assume now that $k$ is the Gaussian kernel:
\[
k(x, x') = e^{-|x - x'|_2^2}.
\]
Show that there exists a choice of $\mu$ for which
\[
\mathbb{E}\, \psi(\mathbf{f} - \hat{\mathbf{f}}) \le 2 \|f\|_W \sqrt{2n}.
\]

MIT OpenCourseWare
http://ocw.mit.edu

18.657 Mathematics of Machine Learning
Fall 2015

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
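For Problem 4, part 2, the linear system can be checked numerically on a toy example. The sketch below is a minimal illustration under a simplifying assumption that is not in the problem statement: it takes $\Sigma = I_n$, so the system reduces to $(K + \mu I_n)\hat{\mathbf{f}} = K Y$. The helper names `gaussian_kernel`, `solve` and `fitted_values` are hypothetical, and the data points in the test are made up for illustration only.

```python
import math


def gaussian_kernel(x, y):
    """k(x, y) = exp(-|x - y|_2^2) for points given as sequences of floats."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def fitted_values(xs, ys, mu):
    """Return (K, theta, f_hat) under the assumption Sigma = I_n.

    theta solves (K + mu I) theta = Y, and f_hat = K theta is the vector
    of fitted values [f_hat(x_1), ..., f_hat(x_n)].
    """
    n = len(xs)
    K = [[gaussian_kernel(xs[i], xs[j]) for j in range(n)] for i in range(n)]
    shifted = [[K[i][j] + (mu if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    theta = solve(shifted, ys)
    f_hat = [sum(K[i][j] * theta[j] for j in range(n)) for i in range(n)]
    return K, theta, f_hat
```

Because $K$ commutes with $K + \mu I_n$, the vector $K\theta$ computed this way satisfies the identity of part 2 in the $\Sigma = I_n$ case. With the Gaussian kernel every diagonal entry of $K$ equals 1, so $\mathrm{Tr}(K) = n$, which is the fact part 5 relies on.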
