
Appendix S1

We introduce a simple optimization procedure for the support vector machine as described in Equation (5). Recall that the objective is

$$G(\theta) = \frac{1}{2}\lVert \theta \rVert^{2} + C \sum_{i=1}^{n} \max\left(1 - y_i \theta^{T} x_i,\ 0\right),$$

which we minimize over $\theta$. Because the hinge loss is not differentiable at the margin boundary, we instead calculate subgradient directions of the objective function as follows,

$$\frac{\partial G}{\partial \theta} = \theta + C \sum_{i} \begin{cases} 0 & y_i \theta^{T} x_i > 1 \\ -\frac{1}{2} y_i x_i & y_i \theta^{T} x_i = 1 \\ -y_i x_i & y_i \theta^{T} x_i < 1 \end{cases}$$

and iteratively optimize the support vector machine objective.

Algorithm 1: Subgradient descent optimization for SVM
Input: Features $X$, labels $Y$, parameter $C$, precision $\epsilon$
Output: Learned weight parameters $\theta$
1. Initialize the weight parameters $\theta$ with random values ranging from 0 to 1.
2. For all elements $x_i \in X$, $y_i \in Y$, calculate $\Delta\theta = \frac{\partial G}{\partial \theta}$ using the equation above the algorithm.
3. Update $\theta_{t+1} = \theta_t - \eta \Delta\theta$, where $t$ is an index for the iteration and $\eta$ is a step-size parameter (i.e., a small number like 0.01).
4. If the maximum absolute difference between $\theta_{t+1}$ and $\theta_t$ is smaller than the precision $\epsilon$, terminate the algorithm and return the outputs. Otherwise, repeat steps 2-3 until convergence.
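Algorithm 1 can be sketched in Python with NumPy as follows. This is a minimal illustration, not the authors' implementation: the variable names (`X`, `Y`, `C`, `eta`, `eps`), the `max_iter` safeguard, and the toy data are all assumptions introduced here.

```python
import numpy as np

def svm_subgradient_descent(X, Y, C=1.0, eta=0.01, eps=1e-6, max_iter=10000):
    """Minimize 0.5*||theta||^2 + C * sum_i max(1 - y_i theta^T x_i, 0)."""
    rng = np.random.default_rng(0)
    theta = rng.uniform(0.0, 1.0, X.shape[1])  # step 1: random init in [0, 1]
    for _ in range(max_iter):
        # step 2: subgradient of the objective at the current theta
        grad = theta.copy()
        for xi, yi in zip(X, Y):
            m = yi * (xi @ theta)
            if m < 1:
                grad += C * (-yi * xi)      # hinge is active
            elif m == 1:
                grad += C * (-yi * xi / 2)  # boundary case
            # m > 1 contributes 0
        theta_new = theta - eta * grad      # step 3: descent update
        # step 4: stop when the maximum coordinate change is below eps
        if np.max(np.abs(theta_new - theta)) < eps:
            return theta_new
        theta = theta_new
    return theta

# Toy linearly separable data (illustrative only)
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -1.0]])
Y = np.array([1.0, 1.0, -1.0, -1.0])
theta = svm_subgradient_descent(X, Y, C=1.0)
```

With a fixed step size, plain subgradient descent may oscillate near the optimum rather than meet the precision criterion exactly, which is why the sketch also caps the number of iterations.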