A non-exhaustive list of SVM solvers

Yutong Wang

2020/10/02

Libraries

Generic ML libraries that include SVM solvers

Shogun


Exact SVM solvers

This section lists solvers that asymptotically solve the SVM optimization problem (primal or dual) to optimality.

LIBSVM

paper

LIBSVM implements the “one-against-one” approach (Knerr et al., 1990) for multiclass classification. Some early works of applying this strategy to SVM include, for example, Kressel (1998). If \(k\) is the number of classes, then \(k(k − 1)/2\) classifiers are constructed and each one trains data from two classes.
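
As a concrete illustration of the scheme, here is a minimal sketch (not LIBSVM's actual implementation; the helper names are mine), assuming integer labels \(0, \dots, k-1\):

```python
# One-against-one multiclass SVM, sketched with scikit-learn's binary SVC.
# Trains k(k-1)/2 pairwise classifiers and predicts by majority vote.
from itertools import combinations
import numpy as np
from sklearn.svm import SVC

def fit_one_vs_one(X, y):
    models = {}
    for a, b in combinations(np.unique(y), 2):
        mask = (y == a) | (y == b)           # data from the two classes only
        models[(a, b)] = SVC(kernel="rbf").fit(X[mask], y[mask])
    return models

def predict_one_vs_one(models, X):
    votes = np.stack([m.predict(X) for m in models.values()])
    # each column holds the k(k-1)/2 pairwise votes for one test point
    return np.apply_along_axis(
        lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)
```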

LIBLINEAR

paper

For multi-class problems, we implement the one-vs-the-rest strategy and a method by Crammer and Singer. Details are in Keerthi et al. (2008).

Note that LIBLINEAR does not use the bias term b by default.

  • Though covered by Keerthi et al. (2008), the Weston-Watkins linear SVM is not implemented by LIBLINEAR.
  • BSVM implements the Weston-Watkins kernel SVM. See section on BSVM.
  • Shark implements the Weston-Watkins linear SVM. See code here.
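
A quick way to try LIBLINEAR's two multiclass strategies (one-vs-the-rest and Crammer-Singer) is scikit-learn's LinearSVC, which wraps LIBLINEAR. A minimal sketch (the parameter names below are the scikit-learn wrapper's, not LIBLINEAR's own flags):

```python
# Both multiclass strategies, via scikit-learn's LIBLINEAR wrapper.
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

ovr = LinearSVC(multi_class="ovr").fit(X, y)            # one-vs-the-rest (default)
cs = LinearSVC(multi_class="crammer_singer").fit(X, y)  # Crammer-Singer

# The command-line tool exposes the same choices, e.g.
#   train -s 4 data_file    # Crammer-Singer multiclass
#   train -B 1 data_file    # append a bias feature (off by default, -B -1)
```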

liquidSVM

paper

All currently available solvers are based on the design principles for the hinge loss solver described by Steinwart et al. (2011).

For our implementations [of multiclass classification] we used [one-versus-all] with the least-squares solver and no cell splitting.

Pegasos

paper
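
Pegasos solves the primal problem by stochastic subgradient descent on the hinge loss with step size \(1/(\lambda t)\). A minimal sketch of the update for binary labels \(y_i \in \{-1, +1\}\) (my code, not the authors'):

```python
# Pegasos: stochastic subgradient descent on the regularized hinge loss.
import numpy as np

def pegasos(X, y, lam=0.1, n_iters=10_000, rng=np.random.default_rng(0)):
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(n)
        eta = 1.0 / (lam * t)            # step size 1/(lambda * t)
        if y[i] * (X[i] @ w) < 1:        # margin violated: hinge is active
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:
            w = (1 - eta * lam) * w
        # optional projection onto the ball of radius 1/sqrt(lambda)
        w *= min(1.0, 1.0 / (np.sqrt(lam) * np.linalg.norm(w) + 1e-12))
    return w
```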

LASVM

paper website

LASVM is an approximate SVM solver that uses online approximation. It reaches accuracies similar to that of a real SVM after performing a single sequential pass through the training examples. Further benefits can be achieved using selective sampling techniques to choose which example should be considered next.

As shown in the graph, LASVM requires considerably less memory than a regular SVM solver. This becomes a considerable speed advantage for large training sets. In fact, LASVM has been used to train a 10 class SVM classifier with 8 million examples on a single processor.

SVMLight

website


Hierarchical solvers

ThunderSVM

paper

cuML SVM

article

cuML’s single GPU SVM package is 50x faster than ThunderSVM-CPU on 40 CPU cores. The reason is that the GPUs excel at the time-consuming kernel function calculation. The middle figure [in the article] zooms onto the curves that show GPU training time for cuML and ThunderSVM. The training time with cuML is 22% faster than ThunderSVM-GPU for this dataset.

The prediction speedup of cuML SVM is even more impressive than its training speedup. It is more than 4x faster than ThunderSVM on GPU. Compared to ThunderSVM CPU it is 88x faster and compared to scikit-learn using 100k samples, it is 1000x faster.

The cuML SVM package is still in development and it does not yet offer as wide a range of functionality as LIBSVM or ThunderSVM.
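
For reference, a minimal usage sketch (assumes the RAPIDS cuml package and an NVIDIA GPU; the interface mirrors scikit-learn's SVC):

```python
# Fit and predict with cuML's GPU SVM; the API mirrors sklearn.svm.SVC.
import numpy as np
from cuml.svm import SVC

X = np.random.rand(1000, 20).astype(np.float32)   # toy data
y = (X[:, 0] > 0.5).astype(np.float32)

clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)
pred = clf.predict(X)
```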

LPSVM

arXiv


Approximate SVM solvers

DC-SVM

Divide-and-conquer solver for kernel SVMs

paper
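
A simplified sketch of the divide-and-conquer idea (not the paper's exact algorithm, which clusters in kernel space and uses the subproblem solutions to initialize a global solve): partition the data, train one kernel SVM per partition, and route each test point to the model of its nearest cluster center.

```python
# Divide: cluster the data. Conquer: one kernel SVM per cluster.
# Predict with the model of the nearest cluster center ("early prediction").
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def dc_svm_fit(X, y, n_clusters=4):
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X)
    # assumes every cluster contains examples from at least two classes
    models = [SVC(kernel="rbf").fit(X[km.labels_ == c], y[km.labels_ == c])
              for c in range(n_clusters)]
    return km, models

def dc_svm_predict(km, models, X):
    cluster = km.predict(X)
    pred = np.empty(len(X), dtype=int)
    for c, m in enumerate(models):
        if np.any(cluster == c):
            pred[cluster == c] = m.predict(X[cluster == c])
    return pred
```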

EnsembleSVM

paper

BudgetedSVM

paper

We present BudgetedSVM, an open-source C++ toolbox comprising highly-optimized implementations of recently proposed algorithms for scalable training of Support Vector Machine (SVM) approximators: Adaptive Multi-hyperplane Machines, Low-rank Linearization SVM, and Budgeted Stochastic Gradient Descent. BudgetedSVM trains models with accuracy comparable to LibSVM in time comparable to LibLinear, solving non-linear problems with millions of high-dimensional examples within minutes on a regular computer. We provide command-line and Matlab interfaces to BudgetedSVM, an efficient API for handling large-scale, high-dimensional data sets, as well as detailed documentation to help developers use and further extend the toolbox.
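
Of the three algorithms, Budgeted Stochastic Gradient Descent is the simplest to sketch: a kernelized Pegasos-style update that caps the number of support vectors. The sketch below is my code, not the toolbox's (which also offers smarter budget maintenance such as merging); here the support vector with the smallest coefficient is dropped when the budget overflows.

```python
# Budgeted kernel SGD: kernelized Pegasos with a cap on support vectors.
import numpy as np

def rbf(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((x - z) ** 2))

def budgeted_sgd(X, y, lam=0.01, budget=50, n_iters=2000,
                 rng=np.random.default_rng(0)):
    sv, alpha = [], []                    # support vectors and coefficients
    for t in range(1, n_iters + 1):
        i = rng.integers(len(X))
        f = sum(a * rbf(v, X[i]) for v, a in zip(sv, alpha))
        alpha = [(1 - 1.0 / t) * a for a in alpha]   # Pegasos-style decay
        if y[i] * f < 1:                             # hinge loss active
            sv.append(X[i])
            alpha.append(y[i] / (lam * t))
            if len(sv) > budget:                     # enforce the budget
                j = int(np.argmin(np.abs(alpha)))
                sv.pop(j)
                alpha.pop(j)
    return sv, alpha
```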


GPU

Solvers designed specifically for SVMs on GPU architectures.

GTSVM

website

OHD-SVM

paper


Multiclass

Solvers for single-machine multiclass SVMs, e.g., Crammer-Singer and Weston-Watkins SVMs.
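
For reference, the two formulations in their standard linear form (my notation; \(k\) classes with per-class weight vectors \(w_1, \dots, w_k\)):

\[
\begin{aligned}
\text{Weston-Watkins:}\quad &\min_{w_1,\dots,w_k}\; \frac{1}{2}\sum_{j=1}^{k}\|w_j\|^2 + C\sum_{i}\sum_{j\neq y_i}\max\bigl(0,\, 1-(w_{y_i}-w_j)^\top x_i\bigr)\\
\text{Crammer-Singer:}\quad &\min_{w_1,\dots,w_k}\; \frac{1}{2}\sum_{j=1}^{k}\|w_j\|^2 + C\sum_{i}\max_{j\neq y_i}\max\bigl(0,\, 1-(w_{y_i}-w_j)^\top x_i\bigr)
\end{aligned}
\]

The only difference is whether the per-class hinge violations are summed (Weston-Watkins) or only the largest violation is penalized (Crammer-Singer).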

MSVMpack

website paper 1 paper 2

implements the Weston-Watkins and the Crammer-Singer SVMs.

BSVM

website paper

implements the Weston-Watkins SVM (Section 3 of paper) and the Crammer-Singer SVM (Section 4 of paper).

LaRank

The related LaSVM algorithm (Bordes et al., 2005) alternates steps exploiting a fresh random training example and steps exploiting current support vectors selected using the gradient. We now extend this idea to the multiclass formulation.

Experiments were carried out on four datasets briefly described in table 1. The LETTER and USPS datasets are available from the UCI repository. The MNIST dataset is a well known handwritten digit recognition benchmark. The INEX dataset contains scientific articles from 18 journals and proceedings of the IEEE.

GaLa: Improved Working Set Selection for LaRank

paper

Crammer-Singer SVM

paper code

The original Crammer-Singer SVM paper.