Two Examples of Arguments Establishing That a Function is a "Kernel" and an "Implicit
Set of Features" Insight
On the first HW, the functions

$$K_1(x, z) = \exp\left(-\gamma \|x - z\|^2\right)$$

and

$$K_2(x, z) = \left(1 + \langle x, z\rangle\right)^d$$

were advertised to be legitimate kernel functions for any choice of $\gamma > 0$ and positive integer $d$.
First consider using the facts on Slide 4 of Module 48 to establish this.
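To begin,

$$\|x - z\|^2 = \langle x - z, x - z\rangle = \langle x, x\rangle - 2\langle x, z\rangle + \langle z, z\rangle$$

so

$$K_1(x, z) = \exp\left(-\gamma\langle x, x\rangle\right)\exp\left(2\gamma\langle x, z\rangle\right)\exp\left(-\gamma\langle z, z\rangle\right)$$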

By Fact 8, $\langle x, z\rangle$ is a kernel function. Then, by Fact 1, so also is $2\gamma\langle x, z\rangle$. By Fact 4, $\exp\left(2\gamma\langle x, z\rangle\right)$ is a kernel function. Let

$$f(x) = \exp\left(-\gamma\langle x, x\rangle\right)$$

and apply Fact 2 to conclude that $K_1(x, z)$ is a kernel function.
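As a numerical sanity check on the factorization used in this argument, the following minimal Python sketch compares the Gaussian kernel computed directly with the three-factor product. The function names `inner`, `k1`, and `k1_factored`, the sample points, and the value of the positive constant `gamma` are illustrative assumptions, not part of the HW or the Module 48 slides.

```python
import math

def inner(x, z):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, z))

def k1(x, z, gamma):
    """Gaussian kernel exp(-gamma * ||x - z||^2), computed directly."""
    diff = [a - b for a, b in zip(x, z)]
    return math.exp(-gamma * inner(diff, diff))

def k1_factored(x, z, gamma):
    """Same kernel written as f(x) * exp(2 * gamma * <x, z>) * f(z)."""
    def f(v):
        return math.exp(-gamma * inner(v, v))
    return f(x) * math.exp(2.0 * gamma * inner(x, z)) * f(z)

# Arbitrary sample inputs (assumed for illustration only).
x, z, gamma = [0.5, -1.0], [1.5, 0.25], 0.7
print(k1(x, z, gamma), k1_factored(x, z, gamma))
```

The two printed values should agree up to floating-point rounding, which is all this check can show; the kernel property itself still rests on Facts 1, 2, 4, and 8.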
Second, both $1$ and $\langle x, z\rangle$ are kernel functions. Apply Fact 5 to conclude that $1 + \langle x, z\rangle$ is a kernel function. Then apply Fact 6 to conclude that $\left(1 + \langle x, z\rangle\right)^2$ is a kernel function, and induction to conclude that $K_2(x, z)$ is a kernel function.
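A kernel function must produce non-negative quadratic forms for every finite set of points. The sketch below spot-checks that necessary condition for the polynomial kernel on a handful of random points in the plane. It is only an empirical check under assumed inputs (the seed, the number of points, and the choice d = 3 are arbitrary), and the helper names `k2` and `quadratic_form` are hypothetical.

```python
import random

def inner(x, z):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, z))

def k2(x, z, d):
    """Polynomial kernel (1 + <x, z>)^d for a positive integer d."""
    return (1.0 + inner(x, z)) ** d

def quadratic_form(points, coeffs, d):
    """Compute sum_i sum_j c_i c_j K2(x_i, x_j) for the given points."""
    return sum(
        ci * cj * k2(xi, xj, d)
        for ci, xi in zip(coeffs, points)
        for cj, xj in zip(coeffs, points)
    )

rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(5)]

# Evaluate the quadratic form for many random coefficient vectors.
values = []
for _ in range(200):
    coeffs = [rng.uniform(-1, 1) for _ in points]
    values.append(quadratic_form(points, coeffs, d=3))

print(min(values))  # should not be meaningfully below zero
```

A negative minimum well beyond rounding error would contradict the kernel property; a non-negative one is consistent with it but proves nothing on its own.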
Then, a potentially helpful insight about kernels is that often they can be viewed as implicitly
defining a (potentially infinite) set of (latent) features and using "regular" Euclidean inner
products with those features. For a very concrete example, consider the simple $d = 2$ version of $K_2(x, z)$ for the case of $x, z$ belonging to $\mathbb{R}^2$. Define the function $\varphi : \mathbb{R}^2 \to \mathbb{R}^6$ by

$$\varphi(x) = \left(1, \sqrt{2}\,x_1, \sqrt{2}\,x_2, x_1^2, x_2^2, \sqrt{2}\,x_1 x_2\right)$$

Then expanding the square shows that

$$\langle \varphi(x), \varphi(z)\rangle = 1 + 2x_1 z_1 + 2x_2 z_2 + x_1^2 z_1^2 + x_2^2 z_2^2 + 2x_1 x_2 z_1 z_2 = \left(1 + x_1 z_1 + x_2 z_2\right)^2 = K_2(x, z)$$

and the kernel is a Euclidean inner product for a special set of 6 features derived from $x$.
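The identity between the 6-feature inner product and the d = 2 polynomial kernel can be checked numerically. The sketch below assumes the feature map carries a factor of sqrt(2) on the two linear terms and on the cross term, which is what expanding (1 + x1 z1 + x2 z2)^2 requires. The names `phi` and `k2_quadratic` and the sample points are illustrative assumptions.

```python
import math

def phi(x):
    """Explicit 6-dimensional feature map for (1 + <x, z>)^2 on R^2.

    Assumes sqrt(2) scaling on the linear and cross-product features.
    """
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]

def k2_quadratic(x, z):
    """Polynomial kernel (1 + <x, z>)^2 evaluated directly on R^2."""
    return (1.0 + x[0] * z[0] + x[1] * z[1]) ** 2

def inner(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

# Arbitrary sample inputs (assumed for illustration only).
x, z = [0.5, -1.0], [1.5, 0.25]
print(inner(phi(x), phi(z)), k2_quadratic(x, z))
```

The kernel evaluation needs only one inner product in the plane, while the explicit route builds six features first; that gap grows quickly with d and with the input dimension.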
Interestingly enough, a similar argument can be made for $K_1(x, z)$, but requiring the implicit definition of an infinite number of latent features. For concreteness, consider again $p = 2$. It's possible to argue (using the Taylor series expansion of the exponential function about 0 and a set of coordinate functions of a $\varphi : \mathbb{R}^2 \to \mathbb{R}^\infty$ that are multiples of all possible products of the form $x_1^l x_2^m$ for non-negative integers $l$ and $m$) that one can find a $\varphi$ such that $K_1(x, z)$ is a "regular $\mathbb{R}^\infty$ inner product"

$$K_1(x, z) = \langle \varphi(x), \varphi(z)\rangle = \sum_{l=1}^{\infty} \varphi_l(x)\,\varphi_l(z)$$
That is, for both of the kernels of HW1, the "implicit transformation to latent features and use of
a Euclidean inner product" interpretation is possible. As it turns out, a result called Mercer's
Theorem essentially guarantees that this kind of interpretation is possible for a wide class of
kernels including these.
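The infinite-feature view of the Gaussian kernel can be made tangible by truncating it. One possible construction, offered as a sketch rather than as the argument the notes have in mind, writes the kernel as exp(-gamma <x, x>) exp(-gamma <z, z>) times the Taylor series of exp(2 gamma <x, z>), and expands each power of the inner product with the binomial theorem. That suggests features of the form exp(-gamma <x, x>) sqrt((2 gamma)^(l+m) / (l! m!)) x1^l x2^m. The function name `phi_truncated`, the cutoff `max_degree`, and the sample values are assumptions made for illustration.

```python
import math

def inner(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def k1(x, z, gamma):
    """Gaussian kernel exp(-gamma * ||x - z||^2), computed directly."""
    diff = [a - b for a, b in zip(x, z)]
    return math.exp(-gamma * inner(diff, diff))

def phi_truncated(x, gamma, max_degree):
    """Finitely many of the latent features for the Gaussian kernel on R^2.

    Keeps every monomial x1^l * x2^m with l + m <= max_degree.
    """
    x1, x2 = x
    envelope = math.exp(-gamma * (x1 * x1 + x2 * x2))
    feats = []
    for n in range(max_degree + 1):
        for l in range(n + 1):
            m = n - l
            coef = math.sqrt(
                (2.0 * gamma) ** n / (math.factorial(l) * math.factorial(m))
            )
            feats.append(envelope * coef * x1 ** l * x2 ** m)
    return feats

# Arbitrary sample inputs (assumed for illustration only).
x, z, gamma = [0.5, -1.0], [1.5, 0.25], 0.7
for degree in (2, 5, 20):
    approx = inner(phi_truncated(x, gamma, degree), phi_truncated(z, gamma, degree))
    print(degree, approx, k1(x, z, gamma))
```

As the cutoff grows, the truncated inner product should approach the directly computed kernel value, which is the finite shadow of the infinite sum over latent features.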