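Back to calculus class:

Want to maximize $f(x) = -2x^2 + 4x - 3$ such that $x \ge 0$.
Take the derivative: $f'(x) = -4x + 4$.
Setting the derivative to 0 gives $x = 1$. Since $f''(x) = -4 < 0$ and $x = 1$ satisfies $x \ge 0$, this is the maximum.

Want to maximize $f(x) = -2x^2 - 4x - 3$ such that $x \ge 0$.
Take the derivative: $f'(x) = -4x - 4$.
Setting the derivative to 0 gives $x = -1$ as the unconstrained maximum. That point violates $x \ge 0$, and $f$ is decreasing for $x > -1$, so the maximum over non-negative $x$ occurs at $x = 0$.

Assume $f(x, y) = -2x^2 + 4x - 3 - y^2 + 4y - 1$.
Take partial derivatives:
$$\frac{\partial f}{\partial x} = -4x + 4, \qquad \frac{\partial f}{\partial y} = -2y + 4,$$
and set them to 0 to get the maximum at $x = 1$, $y = 2$.

But if $f(x, y) = -2x^2 + 4x - 3 - y^2 + 4y - 1 + 2xy$, the situation is not that simple because of the cross term:
$$\frac{\partial f}{\partial x} = -4x + 4 + 2y, \qquad \frac{\partial f}{\partial y} = -2y + 4 + 2x.$$
These derivatives, taken together as a vector, are called the gradient. We could go in the direction of the gradient. Thus if we were at $x = 2$, $y = 3$, we would go in the direction $(2, 2)$.

Recall Taylor's theorem:
$$g(x + a) = g(x) + g'(x)\,a + \frac{a^2}{2}\,g''(x) + (\text{cubic and higher-order terms in } a).$$
Keeping only the linear part and setting the approximation for $g(x + a)$ to 0 gives $a = -g(x)/g'(x)$, so $x + a$ might be close to a zero of $g$. This is called Newton's method. If $g$ were linear, then $g''(x) = 0$ and Newton's step would give us the zero of the function exactly.

In the support vector problem, $g$ is the gradient. It is linear, but it is a vector rather than a scalar quantity. Moreover, $g'$ is a matrix of second derivatives; for the example with the cross term,
$$g' = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[2ex] \dfrac{\partial^2 f}{\partial x\,\partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} -4 & 2 \\ 2 & -2 \end{pmatrix}.$$
So essentially Newton's method leads to solving a linear system.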
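To make the last point concrete, here is a minimal Python sketch of one Newton step for the example with the 2xy cross term, starting from x = 2, y = 3. The names `gradient`, `HESSIAN`, `solve_2x2`, and `newton_step` are illustrative choices, not taken from the slides, and the 2x2 solve uses Cramer's rule only because the system is tiny. It ignores the constraint x >= 0 and is a sketch under those assumptions, not the method used in an actual support vector solver.

```python
def gradient(x, y):
    """Gradient of f(x, y) = -2x^2 + 4x - 3 - y^2 + 4y - 1 + 2xy."""
    return (-4 * x + 4 + 2 * y, -2 * y + 4 + 2 * x)


# Matrix of second derivatives; constant because f is quadratic.
HESSIAN = ((-4.0, 2.0), (2.0, -2.0))


def solve_2x2(m, b):
    """Solve the 2x2 linear system m * a = b with Cramer's rule."""
    (p, q), (r, s) = m
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is singular")
    return ((b[0] * s - q * b[1]) / det, (p * b[1] - r * b[0]) / det)


def newton_step(x, y):
    """One Newton step: solve HESSIAN * a = -gradient, then move by a."""
    gx, gy = gradient(x, y)
    ax, ay = solve_2x2(HESSIAN, (-gx, -gy))
    return x + ax, y + ay


x_new, y_new = newton_step(2.0, 3.0)
print(x_new, y_new)            # point reached after one step
print(gradient(x_new, y_new))  # gradient at that point
```

Because the gradient is linear, this single step lands on the point (4, 6), where both partial derivatives are zero, which matches the remark that Newton's step is exact for a linear g.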