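Two Examples of Arguments Establishing That a Function is a "Kernel" and an "Implicit Set of Features" Insight

On the first HW, the functions
$$K_1(x, z) = \exp\left(-\gamma \|x - z\|^2\right) \quad \text{and} \quad K_2(x, z) = \left(1 + \langle x, z \rangle\right)^d$$
were advertised to be legitimate kernel functions for any choice of $\gamma > 0$ and positive integer $d$. First consider using the facts on Slide 4 of Module 48 to establish this.

To begin,
$$\|x - z\|^2 = \langle x - z, x - z \rangle = \langle x, x \rangle - 2\langle x, z \rangle + \langle z, z \rangle$$
so
$$K_1(x, z) = \exp\left(-\gamma \langle x, x \rangle\right) \exp\left(2\gamma \langle x, z \rangle\right) \exp\left(-\gamma \langle z, z \rangle\right)$$
By Fact 8, $\langle x, z \rangle$ is a kernel function. Then, by Fact 1, so also is $2\gamma \langle x, z \rangle$. By Fact 4, $\exp\left(2\gamma \langle x, z \rangle\right)$ is a kernel function. Let $f(x) = \exp\left(-\gamma \langle x, x \rangle\right)$, so that $K_1(x, z) = f(x) \exp\left(2\gamma \langle x, z \rangle\right) f(z)$, and apply Fact 2 to conclude that $K_1(x, z)$ is a kernel function.

Second, both the constant function $1$ and $\langle x, z \rangle$ are kernel functions. Apply Fact 5 to conclude that $1 + \langle x, z \rangle$ is a kernel function. Then apply Fact 6 to conclude that $\left(1 + \langle x, z \rangle\right)^2$ is a kernel function, and induction on $d$ to conclude that $K_2(x, z)$ is a kernel function.

Then, a potentially helpful insight about kernels is that often they can be viewed as implicitly defining a (potentially infinite) set of (latent) features and using "regular" Euclidean inner products with those features. For a very concrete example, consider the simple $d = 2$ version of $K_2(x, z)$ for the case of $x, z$ belonging to $\mathbb{R}^2$. Define the function $\varphi : \mathbb{R}^2 \to \mathbb{R}^6$ by
$$\varphi(x) = \left(1, \sqrt{2}\, x_1, \sqrt{2}\, x_2, x_1^2, x_2^2, \sqrt{2}\, x_1 x_2\right)$$
Then it's easy to check that
$$\langle \varphi(x), \varphi(z) \rangle = 1 + 2 x_1 z_1 + 2 x_2 z_2 + x_1^2 z_1^2 + x_2^2 z_2^2 + 2 x_1 x_2 z_1 z_2 = \left(1 + \langle x, z \rangle\right)^2 = K_2(x, z)$$
and the kernel is a Euclidean inner product for a special set of 6 features derived from $x$.

Interestingly enough, a similar argument can be made for $K_1(x, z)$, but requiring the implicit definition of an infinite number of latent features. For concreteness, consider again $p = 2$.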
It's possible to argue (using the Taylor series expansion of the exponential function about 0 and a set of coordinate functions of a $\varphi : \mathbb{R}^2 \to \mathbb{R}^\infty$ that are multiples of all possible products of the form $x_1^l x_2^m$ for non-negative integers $l$ and $m$) that one can find a $\varphi$ such that $K_1(x, z)$ is a "regular inner product"
$$K_1(x, z) = \langle \varphi(x), \varphi(z) \rangle = \sum_{l=1}^{\infty} \varphi_l(x)\, \varphi_l(z)$$
That is, for both of the kernels of HW1, the "implicit transformation to latent features and use of a Euclidean inner product" interpretation is possible. As it turns out, a result called Mercer's Theorem essentially guarantees that this kind of interpretation is possible for a wide class of kernels including these.
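
The two "latent feature" claims in this note can be checked numerically. The sketch below is an illustration, not part of the HW or the module slides. All function names (`inner`, `k1`, `k2`, `phi_k2`, `phi_k1_truncated`) are hypothetical. The specific coordinate functions used for $K_1$, namely $\exp(-\gamma \langle x, x \rangle) \sqrt{(2\gamma)^{l+m} / (l!\, m!)}\; x_1^l x_2^m$, are one choice that follows from expanding $\exp(2\gamma \langle x, z \rangle)$ in its Taylor series and applying the binomial theorem. The cutoff `max_degree` is an assumption, needed because a computer can only hold finitely many of the infinitely many features.

```python
import math


def inner(x, z):
    """Ordinary Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, z))


def k1(x, z, gamma):
    """K1(x, z) = exp(-gamma * ||x - z||^2)."""
    diff = [a - b for a, b in zip(x, z)]
    return math.exp(-gamma * inner(diff, diff))


def k2(x, z, d=2):
    """K2(x, z) = (1 + <x, z>)^d."""
    return (1.0 + inner(x, z)) ** d


def phi_k2(x):
    """The 6 explicit features for the d = 2 version of K2 on R^2."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 ** 2, x2 ** 2, r2 * x1 * x2]


def phi_k1_truncated(x, gamma, max_degree):
    """Finite truncation of an infinite feature map for K1 on R^2.

    One coordinate per monomial x1^l * x2^m with l + m <= max_degree
    (an assumed cutoff; the exact map has no upper limit on l + m).
    """
    x1, x2 = x
    scale = math.exp(-gamma * inner(x, x))
    feats = []
    for l in range(max_degree + 1):
        for m in range(max_degree + 1 - l):
            coef = math.sqrt(
                (2.0 * gamma) ** (l + m) / (math.factorial(l) * math.factorial(m))
            )
            feats.append(scale * coef * x1 ** l * x2 ** m)
    return feats


if __name__ == "__main__":
    # Arbitrary illustrative inputs, not taken from the HW.
    x, z, gamma = (0.5, -1.0), (1.5, 0.25), 0.3

    print("K2 directly:        ", k2(x, z))
    print("K2 via 6 features:  ", inner(phi_k2(x), phi_k2(z)))

    print("K1 directly:        ", k1(x, z, gamma))
    for deg in (2, 5, 20):
        approx = inner(
            phi_k1_truncated(x, gamma, deg), phi_k1_truncated(z, gamma, deg)
        )
        print(f"K1 via degree <= {deg:2d}:", approx)
```

The $K_2$ check should agree to floating-point precision, since the 6-feature map is exact. The $K_1$ check only approaches the kernel value as `max_degree` grows, which is the finite-precision counterpart of needing infinitely many latent features.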