Class SVM<T>

java.lang.Object
smile.base.svm.KernelMachine<T>
smile.classification.SVM<T>
All Implemented Interfaces:
Serializable, ToDoubleFunction<T>, ToIntFunction<T>, Classifier<T>

public class SVM<T> extends KernelMachine<T> implements Classifier<T>
Support vector machines for classification. The basic support vector machine is a binary linear classifier which chooses the hyperplane that represents the largest separation, or margin, between the two classes. If such a hyperplane exists, it is known as the maximum-margin hyperplane and the linear classifier it defines is known as a maximum margin classifier.

If there exists no hyperplane that can perfectly split the positive and negative instances, the soft margin method will choose a hyperplane that splits the instances as cleanly as possible, while still maximizing the distance to the nearest cleanly split instances.

The soft margin parameter C trades off correct classification of training examples against maximization of the decision function's margin. For larger values of C, a smaller margin will be accepted if the decision function is better at classifying all training points correctly. A lower C will encourage a larger margin, therefore a simpler decision function, at the cost of training accuracy.

The nonlinear SVMs are created by applying the kernel trick to maximum-margin hyperplanes. The resulting algorithm is formally similar, except that every dot product is replaced by a nonlinear kernel function. This allows the algorithm to fit the maximum-margin hyperplane in a transformed feature space. The transformation may be nonlinear and the transformed space be high dimensional. For example, the feature space corresponding Gaussian kernel is a Hilbert space of infinite dimension. Thus though the classifier is a hyperplane in the high-dimensional feature space, it may be nonlinear in the original input space. Maximum margin classifiers are well regularized, so the infinite dimension does not spoil the results.

The effectiveness of SVM depends on the selection of kernel, the kernel's parameters, and the soft margin parameter C. Given a kernel, best combination of C and kernel's parameters is often selected by a grid-search with cross validation.

The dominant approach for creating multi-class SVMs is to reduce the single multi-class problem into multiple binary classification problems. Common methods for such reduction is to build binary classifiers which distinguish between (i) one of the labels to the rest (one-versus-all) or (ii) between every pair of classes (one-versus-one). Classification of new instances for one-versus-all case is done by a winner-takes-all strategy, in which the classifier with the highest output function assigns the class. For the one-versus-one approach, classification is done by a max-wins voting strategy, in which every classifier assigns the instance to one of the two classes, then the vote for the assigned class is increased by one vote, and finally the class with most votes determines the instance classification.

References

  1. Christopher J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery 2:121-167, 1998.
  2. John Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines.
  3. Rong-En Fan, Pai-Hsuen, and Chih-Jen Lin. Working Set Selection Using Second Order Information for Training Support Vector Machines. JMLR, 6:1889-1918, 2005.
  4. Antoine Bordes, Seyda Ertekin, Jason Weston and Leon Bottou. Fast Kernel Classifiers with Online and Active Learning, Journal of Machine Learning Research, 6:1579-1619, 2005.
  5. Tobias Glasmachers and Christian Igel. Second Order SMO Improves SVM Online and Active Learning.
  6. Chih-Chung Chang and Chih-Jen Lin. LIBSVM: a Library for Support Vector Machines.
See Also:
  • Constructor Details

    • SVM

      public SVM(MercerKernel<T> kernel, T[] vectors, double[] weight, double b)
      Constructor.
      Parameters:
      kernel - Kernel function.
      vectors - The support vectors.
      weight - The weights of instances.
      b - The intercept;
  • Method Details

    • numClasses

      public int numClasses()
      Description copied from interface: Classifier
      Returns the number of classes.
      Specified by:
      numClasses in interface Classifier<T>
      Returns:
      the number of classes.
    • classes

      public int[] classes()
      Description copied from interface: Classifier
      Returns the class labels.
      Specified by:
      classes in interface Classifier<T>
      Returns:
      the class labels.
    • predict

      public int predict(T x)
      Description copied from interface: Classifier
      Predicts the class label of an instance.
      Specified by:
      predict in interface Classifier<T>
      Parameters:
      x - the instance to be classified.
      Returns:
      the predicted class label.
    • fit

      public static Classifier<double[]> fit(double[][] x, int[] y, double C, double tol)
      Fits a binary linear SVM.
      Parameters:
      x - training samples.
      y - training labels of {-1, +1}.
      C - the soft margin penalty parameter.
      tol - the tolerance of convergence test.
      Returns:
      the model.
    • fit

      public static Classifier<double[]> fit(double[][] x, int[] y, double C, double tol, int epochs)
      Fits a binary linear SVM.
      Parameters:
      x - training samples.
      y - training labels of {-1, +1}.
      C - the soft margin penalty parameter.
      tol - the tolerance of convergence test.
      epochs - the number of epochs, usually 1 or 2 is sufficient.
      Returns:
      the model.
    • fit

      public static Classifier<int[]> fit(int[][] x, int[] y, int p, double C, double tol)
      Fits a binary linear SVM of binary sparse data.
      Parameters:
      x - training samples.
      y - training labels of {-1, +1}.
      p - the dimension of input vector.
      C - the soft margin penalty parameter.
      tol - the tolerance of convergence test.
      Returns:
      the model.
    • fit

      public static Classifier<int[]> fit(int[][] x, int[] y, int p, double C, double tol, int epochs)
      Fits a binary linear SVM of binary sparse data.
      Parameters:
      x - training samples.
      y - training labels of {-1, +1}.
      p - the dimension of input vector.
      C - the soft margin penalty parameter.
      tol - the tolerance of convergence test.
      epochs - the number of epochs, usually 1 or 2 is sufficient.
      Returns:
      the model.
    • fit

      public static Classifier<SparseArray> fit(SparseArray[] x, int[] y, int p, double C, double tol)
      Fits a binary linear SVM.
      Parameters:
      x - training samples.
      y - training labels of {-1, +1}.
      p - the dimension of input vector.
      C - the soft margin penalty parameter.
      tol - the tolerance of convergence test.
      Returns:
      the model.
    • fit

      public static Classifier<SparseArray> fit(SparseArray[] x, int[] y, int p, double C, double tol, int epochs)
      Fits a binary linear SVM.
      Parameters:
      x - training samples.
      y - training labels of {-1, +1}.
      p - the dimension of input vector.
      C - the soft margin penalty parameter.
      tol - the tolerance of convergence test.
      epochs - the number of epochs, usually 1 or 2 is sufficient.
      Returns:
      the model.
    • fit

      public static <T> SVM<T> fit(T[] x, int[] y, MercerKernel<T> kernel, double C, double tol)
      Fits a binary SVM.
      Type Parameters:
      T - the data type.
      Parameters:
      x - training samples.
      y - training labels of {-1, +1}.
      kernel - the kernel function.
      C - the soft margin penalty parameter.
      tol - the tolerance of convergence test.
      Returns:
      the model.
    • fit

      public static <T> SVM<T> fit(T[] x, int[] y, MercerKernel<T> kernel, double C, double tol, int epochs)
      Fits a binary SVM.
      Type Parameters:
      T - the data type.
      Parameters:
      x - training samples.
      y - training labels of {-1, +1}.
      kernel - the kernel function.
      C - the soft margin penalty parameter.
      tol - the tolerance of convergence test.
      epochs - the number of epochs, usually 1 or 2 is sufficient.
      Returns:
      the model.
    • fit

      public static Classifier<double[]> fit(double[][] x, int[] y, Properties params)
      Fits a binary or multiclass SVM.
      Parameters:
      x - training samples.
      y - training labels.
      params - the hyper-parameters.
      Returns:
      the model.