Appendix S1
We introduce a simple optimization procedure for the support vector machine as
described in Equation (5). Recall that the objective is
G(W) = \min_W \left[ \frac{1}{2} \lVert W \rVert^2 + C \sum_{i=1}^{n} \max(1 - y_i W^T X_i, 0) \right]
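For illustration, this objective can be evaluated with a few lines of NumPy; the function name svm_objective and the use of NumPy arrays are our own sketch and not part of the original procedure:

import numpy as np

def svm_objective(W, X, y, C):
    # G(W) = 0.5 * ||W||^2 + C * sum_i max(1 - y_i W^T X_i, 0)
    # X: (n, d) feature matrix, y: (n,) labels in {-1, +1}, W: (d,) weight vector
    margins = y * (X @ W)                    # y_i W^T X_i for every sample
    hinge = np.maximum(1.0 - margins, 0.0)   # per-sample hinge loss
    return 0.5 * np.dot(W, W) + C * hinge.sum()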
Because the hinge term in the objective is not differentiable everywhere, we calculate
subgradient directions of the objective function as follows,
0 βˆ’π‘¦π‘– π‘Š 𝑇 𝑋𝑖 < βˆ’1
πœ•πΊ
𝑦𝑖 𝑋𝑖
=π‘Š+βˆ‘
βˆ’
𝑦𝑖 π‘Š 𝑇 𝑋𝑖 = 1
πœ•π‘Š
2
𝑖
{ βˆ’π‘¦π‘– 𝑋𝑖 βˆ’π‘¦π‘– π‘Š 𝑇 𝑋𝑖 > βˆ’1
and iteratively optimize the support vector machine objective.
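Under the same assumptions (NumPy arrays X, y, W and an illustrative function name svm_subgradient), a minimal sketch of this subgradient computation is given below; the exact-margin case uses the midpoint value βˆ’y_i X_i / 2 from the equation above:

import numpy as np

def svm_subgradient(W, X, y, C):
    # Piecewise subgradient of G at W, following the three cases above.
    margins = y * (X @ W)                    # y_i W^T X_i for every sample
    coeff = np.zeros_like(margins)           # y_i W^T X_i > 1: contributes 0
    coeff[margins < 1.0] = -1.0              # y_i W^T X_i < 1: contributes -y_i X_i
    coeff[np.isclose(margins, 1.0)] = -0.5   # y_i W^T X_i = 1: contributes -y_i X_i / 2
    return W + C * (X.T @ (coeff * y))       # W + C * sum_i coeff_i * y_i * X_i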
Algorithm 1: Subgradient descent optimization for SVM
Input: Features 𝑿, labels π’š, parameter 𝐢, precision πœ–
Output: Learned weight parameters π‘Š
1. Initialize weight parameters π‘Š with random values ranging from 0 to 1.
πœ•πΊ
2. For all elements 𝑋𝑖 ∈ 𝑿, 𝑦𝑖 ∈ π’š calculate Ξ”π‘Š = πœ•π‘Š using the equation
above the algorithm.
3. Update π‘Šπ‘‘+1 = π‘Šπ‘‘ βˆ’ πœ‚Ξ”π‘Š, where 𝑑 is an index for the iteration and πœ‚ is a
step size parameter (e.g., a small number like 0.01).
4. If the maximum difference between π‘Šπ‘‘+1 and π‘Šπ‘‘ is smaller than the
precision πœ–, terminate the algorithm and return π‘Šπ‘‘+1. Otherwise, repeat
steps 2-3 until convergence.
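For completeness, a minimal end-to-end sketch of Algorithm 1 follows; the default values for 𝐢, πœ–, the step size πœ‚, the iteration cap max_iter, and the random seed are illustrative choices rather than part of the algorithm itself:

import numpy as np

def train_svm_subgradient(X, y, C=1.0, eps=1e-4, eta=0.01, max_iter=10000, seed=0):
    # Step 1: initialize W with random values in [0, 1).
    rng = np.random.default_rng(seed)
    W = rng.uniform(0.0, 1.0, size=X.shape[1])
    for _ in range(max_iter):
        # Step 2: subgradient dG/dW over all samples (same three cases as above).
        margins = y * (X @ W)
        coeff = np.zeros_like(margins)
        coeff[margins < 1.0] = -1.0
        coeff[np.isclose(margins, 1.0)] = -0.5
        dW = W + C * (X.T @ (coeff * y))
        # Step 3: descent update with step size eta.
        W_next = W - eta * dW
        # Step 4: stop when the largest coordinate change is below eps.
        if np.max(np.abs(W_next - W)) < eps:
            return W_next
        W = W_next
    return W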