Appendix S1
We introduce a simple optimization procedure for the support vector machine as
described in Equation (5). Recall that the objective is
G(W) = \min_W \left[ \frac{1}{2} \|W\|^2 + C \sum_{i=1}^{n} \max(1 - y_i W^T X_i, 0) \right]
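As a concrete illustration, the objective above can be evaluated directly with NumPy. This is a minimal sketch, assuming X is an n x d feature matrix, y a vector of labels in {-1, +1}, and W a d-dimensional weight vector; the function name svm_objective is illustrative and not part of the paper.

import numpy as np

def svm_objective(W, X, y, C):
    """Evaluate G(W) = 0.5 * ||W||^2 + C * sum_i max(1 - y_i W^T X_i, 0)."""
    margins = y * (X @ W)                   # y_i * W^T X_i for every sample
    hinge = np.maximum(1.0 - margins, 0.0)  # hinge loss max(1 - y_i W^T X_i, 0)
    return 0.5 * np.dot(W, W) + C * hinge.sum()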
Because the hinge loss term is not differentiable everywhere (it has a kink where y_i W^T X_i = 1), we can calculate subgradient directions of the objective function as follows,
\frac{\partial G}{\partial W} = W + C \sum_{i} \begin{cases} 0, & -y_i W^T X_i < -1 \\ -\frac{y_i X_i}{2}, & -y_i W^T X_i = -1 \\ -y_i X_i, & -y_i W^T X_i > -1 \end{cases}
and iteratively optimize the support vector machine objective.
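A minimal sketch of this subgradient computation, under the same assumed array conventions as above; exactly on the margin (y_i W^T X_i = 1) it uses the midpoint value -y_i X_i / 2, matching the middle case of the equation.

import numpy as np

def svm_subgradient(W, X, y, C):
    """One subgradient of G at W, following the three cases above."""
    margins = y * (X @ W)                    # y_i * W^T X_i
    # Coefficient on -y_i X_i per sample: 0 outside the margin,
    # 1 inside the margin, 1/2 exactly on the margin (the kink).
    coef = np.where(margins > 1.0, 0.0,
           np.where(margins < 1.0, 1.0, 0.5))
    return W + C * (X.T @ (-(coef * y)))     # W + C * sum_i coef_i * (-y_i X_i)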
Algorithm 1: Subgradient descent optimization for SVM
Input: Features 𝑿, labels π’š, parameter 𝐢, precision πœ–
Output: Learned weight parameters π‘Š
1. Initialize the weight parameters π‘Š with random values ranging from 0 to 1.
πœ•πΊ
2. For all elements 𝑋𝑖 ∈ 𝑿, 𝑦𝑖 ∈ π’š calculate Δπ‘Š = πœ•π‘Š using the equation
above the algorithm.
3. Update π‘Šπ‘‘+1 = π‘Šπ‘‘ − πœ‚Δπ‘Š, where 𝑑 is the iteration index and πœ‚ is a step size parameter (e.g., a small value such as 0.01).
4. If the maximum absolute difference between π‘Šπ‘‘+1 and π‘Šπ‘‘ is smaller than the precision πœ–, terminate the algorithm and return the learned weights π‘Š. Otherwise, repeat steps 2-3 until convergence.
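Putting the four steps together, the loop below is a sketch of Algorithm 1 under the same assumptions as the earlier snippets; the max_iter cap and the random seed are added safeguards for reproducibility and are not part of the algorithm as stated.

import numpy as np

def svm_subgradient_descent(X, y, C=1.0, eta=0.01, eps=1e-4, max_iter=10000, seed=0):
    """Subgradient descent for the SVM objective (Algorithm 1 sketch)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(0.0, 1.0, size=X.shape[1])   # step 1: random init in [0, 1]
    for _ in range(max_iter):
        margins = y * (X @ W)
        coef = np.where(margins > 1.0, 0.0,
               np.where(margins < 1.0, 1.0, 0.5))
        dW = W + C * (X.T @ (-(coef * y)))       # step 2: subgradient dG/dW
        W_next = W - eta * dW                    # step 3: descent update
        if np.max(np.abs(W_next - W)) < eps:     # step 4: convergence check
            return W_next
        W = W_next
    return W

# Example usage (illustrative): W = svm_subgradient_descent(X, y, C=1.0)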