3.12 Support vector machine
A soft-margin support vector machine (SVM) is the classic classification QP. Given labelled points (x_i, y_i) with y_i ∈ {−1, +1}, it fits a separating hyperplane wᵀx + b by solving
minimize ½‖w‖₂² + C·Σ ξ_i
subject to y_i (wᵀx_i + b) ≥ 1 − ξ_i, ξ_i ≥ 0
The slacks ξ_i (the hinge losses) let points fall inside or past the margin at a cost C. This example exposes (train-svm data #:C C), returning the fitted w and b.
(require racket/list scs)
(provide train-svm run-example)
3.12.1 Variables and standard form
Stack the unknowns as v = (w, b, ξ) with d weights, one intercept, and m slacks. The objective is quadratic only in w, so P is the identity on the w block (giving ½‖w‖₂²) and zero elsewhere, and c places C on each ξ. Both constraint families become positive-orthant rows once moved to Ax ≤ b form: the margin constraint y_i(wᵀx_i + b) + ξ_i ≥ 1 negates to
−y_i x_iᵀ w − y_i b − ξ_i ≤ −1
and ξ_i ≥ 0 is −ξ_i ≤ 0.
(define (sparse-row n entries) (for/list ([k (in-range n)]) (cond [(assoc k entries) => cdr] [else 0])))
(define (train-svm data #:C [C 1.0]) (define m (length data)) (define d (length (caar data))) ; feature dimension (define n (+ d 1 m)) ; (w, b, slacks) (define b-col d) ; column of the intercept b (define (slack-col i) (+ d 1 i)) ; column of slack i (define P (apply scs:sparse-matrix n n (for/list ([j (in-range d)]) (list j j 1.0)))) (define rows (append* (for/list ([pt (in-list data)] [i (in-naturals)]) (define xs (car pt)) (define y (cadr pt)) (define margin (sparse-row n (append (for/list ([f (in-range d)]) (cons f (* (- y) (list-ref xs f)))) (list (cons b-col (- y)) (cons (slack-col i) -1))))) (define nonneg (sparse-row n (list (cons (slack-col i) -1)))) (list margin nonneg)))) (define A (apply scs:matrix (* 2 m) n (append* rows))) (define rhs (append* (for/list ([_ (in-range m)]) (list -1.0 0.0)))) (define c (list->vector (append (make-list (+ d 1) 0.0) (make-list m C)))) (define result (solve #:A A #:b rhs #:c c #:P P #:cone (make-cone #:positive (* 2 m)) #:settings (make-settings #:eps-abs 1e-9 #:eps-rel 1e-9))) (define x (scs-result-x result)) (values (for/vector ([j (in-range d)]) (vector-ref x j)) (vector-ref x b-col)))
Running it.
On a small symmetric, linearly separable set the max-margin hyperplane is vertical: w = (0.5, 0), b = 0.
(define dataset '(((2.0 1.0) 1) ((2.0 -1.0) 1) ((-2.0 1.0) -1) ((-2.0 -1.0) -1))) (define (run-example) (define-values (w b) (train-svm dataset #:C 1.0)) (list w b))
(run-example) ; (list #(0.5 0.0) 0.0)