Class PlattScaling

java.lang.Object
smile.classification.PlattScaling
All Implemented Interfaces:
Serializable

public class PlattScaling extends Object implements Serializable
Platt scaling or Platt calibration is a way of transforming the outputs of a classification model into a probability distribution over classes. The method was invented by John Platt in the context of support vector machines, but can be applied to other classification models. Platt scaling works by fitting a logistic regression model to a classifier's scores.

Platt suggested using the Levenberg–Marquardt algorithm to optimize the parameters. A Newton algorithm that should be more numerically stable was proposed later, and that is the algorithm this class implements.
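To make the Newton approach concrete, the following is a minimal, self-contained sketch of a Newton iteration with backtracking line search for the two sigmoid parameters, in the style described by Lin, Lin, and Weng. It is not this class's source code. The class name `PlattNewtonSketch`, the tolerances, the smoothed targets, and the convention that any label greater than 0 counts as positive are all assumptions made for illustration.

```java
final class PlattNewtonSketch {
    /** Negative log-likelihood, arranged so exp() never sees a large positive argument. */
    static double objective(double[] f, double[] t, double a, double b) {
        double sum = 0.0;
        for (int i = 0; i < f.length; i++) {
            double z = a * f[i] + b;
            sum += z >= 0 ? t[i] * z + Math.log1p(Math.exp(-z))
                          : (t[i] - 1.0) * z + Math.log1p(Math.exp(z));
        }
        return sum;
    }

    /** Returns {alpha, beta} for P(y = 1 | x) = 1 / (1 + exp(alpha * f(x) + beta)). */
    static double[] fit(double[] f, int[] y, int maxIters) {
        int n = f.length;
        int pos = 0;
        for (int label : y) if (label > 0) pos++;   // assumption: label > 0 is positive
        int neg = n - pos;

        // Smoothed targets instead of hard 0/1 labels (assumed, following Platt).
        double hi = (pos + 1.0) / (pos + 2.0);
        double lo = 1.0 / (neg + 2.0);
        double[] t = new double[n];
        for (int i = 0; i < n; i++) t[i] = y[i] > 0 ? hi : lo;

        double a = 0.0;
        double b = Math.log((neg + 1.0) / (pos + 1.0));
        double fval = objective(f, t, a, b);

        for (int iter = 0; iter < maxIters; iter++) {
            // Gradient (g1, g2) and Hessian (h11, h21, h22); 1e-12 keeps the Hessian invertible.
            double h11 = 1e-12, h22 = 1e-12, h21 = 0.0, g1 = 0.0, g2 = 0.0;
            for (int i = 0; i < n; i++) {
                double z = a * f[i] + b;
                double p = z >= 0 ? Math.exp(-z) / (1.0 + Math.exp(-z))
                                  : 1.0 / (1.0 + Math.exp(z));
                double d2 = p * (1.0 - p);
                h11 += f[i] * f[i] * d2;
                h22 += d2;
                h21 += f[i] * d2;
                double d1 = t[i] - p;
                g1 += f[i] * d1;
                g2 += d1;
            }
            if (Math.abs(g1) < 1e-5 && Math.abs(g2) < 1e-5) break;  // converged

            // Newton direction: solve the 2x2 system H * d = -g.
            double det = h11 * h22 - h21 * h21;
            double da = -(h22 * g1 - h21 * g2) / det;
            double db = -(-h21 * g1 + h11 * g2) / det;
            double gd = g1 * da + g2 * db;

            // Backtracking line search: halve the step until the objective decreases enough.
            double step = 1.0;
            boolean accepted = false;
            while (step >= 1e-10) {
                double na = a + step * da;
                double nb = b + step * db;
                double nf = objective(f, t, na, nb);
                if (nf < fval + 1e-4 * step * gd) {
                    a = na; b = nb; fval = nf;
                    accepted = true;
                    break;
                }
                step /= 2.0;
            }
            if (!accepted) break;  // line search failed; keep the last accepted parameters
        }
        return new double[] {a, b};
    }
}
```

With the sign convention used on this page, a classifier whose positive examples receive higher scores should yield a negative `alpha`.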

References

  1. John Platt. Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods. Advances in Large Margin Classifiers, 10 (3): 61–74, 1999.
  • Constructor Details

    • PlattScaling

      public PlattScaling(double alpha, double beta)
      Constructor. P(y = 1 | x) = 1 / (1 + exp(alpha * f(x) + beta))
      Parameters:
      alpha - The slope applied to the classifier score f(x).
      beta - The intercept (offset) added to the scaled score.
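As an illustration of the constructor's formula, the sketch below evaluates 1 / (1 + exp(alpha * f(x) + beta)) directly. It is a standalone example rather than this class's implementation; the class name `PlattSigmoidSketch` and the method name `probability` are hypothetical. The branch on the sign of the exponent is an assumed refinement that keeps exp() from overflowing for extreme scores.

```java
final class PlattSigmoidSketch {
    /** Evaluates P(y = 1 | x) = 1 / (1 + exp(alpha * score + beta)). */
    static double probability(double alpha, double beta, double score) {
        double z = alpha * score + beta;
        if (z >= 0) {
            // Algebraically equal to 1 / (1 + exp(z)), but exp(-z) cannot overflow here.
            double e = Math.exp(-z);
            return e / (1.0 + e);
        }
        return 1.0 / (1.0 + Math.exp(z));
    }
}
```

Note the sign convention: because the exponent is alpha * f(x) + beta rather than its negation, higher scores map to higher probabilities only when alpha is negative.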
  • Method Details

    • scale

      public double scale(double y)
      Returns the posterior probability estimate P(y = 1 | x).
      Parameters:
      y - the raw output score f(x) of the binary classifier (not a class label).
      Returns:
      the estimated probability.
    • fit

      public static PlattScaling fit(double[] scores, int[] y)
      Fits the Platt scaling parameters to classifier scores and their labels.
      Parameters:
      scores - The predicted scores.
      y - The training labels.
      Returns:
      the model.
    • fit

      public static PlattScaling fit(double[] scores, int[] y, int maxIters)
      Fits the Platt scaling parameters to classifier scores and their labels, with a cap on the number of iterations.
      Parameters:
      scores - The predicted scores.
      y - The training labels.
      maxIters - The maximal number of iterations.
      Returns:
      the model.
    • fit

      public static <T> PlattScaling fit(Classifier<T> model, T[] x, int[] y)
      Fits Platt scaling to estimate posterior probabilities.
      Type Parameters:
      T - the data type.
      Parameters:
      model - the binary classifier whose outputs are to be calibrated.
      x - training samples.
      y - training labels.
      Returns:
      the model.
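One plausible reading of the model-based overload is that it first scores every training sample with the model and then fits on the resulting scores and labels. This page does not confirm that, so the sketch below shows only the score-collection step, under that assumption. `ScoreCollectionSketch` and the `scorer` function are hypothetical stand-ins and do not use the library's `Classifier` interface.

```java
import java.util.function.ToDoubleFunction;

final class ScoreCollectionSketch {
    /** Applies a scoring function to each sample and returns the raw scores in order. */
    static <T> double[] scores(ToDoubleFunction<T> scorer, T[] x) {
        double[] s = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            s[i] = scorer.applyAsDouble(x[i]);  // assumed stand-in for the model's score of x[i]
        }
        return s;
    }
}
```

The resulting array has one score per sample, in the same order as the labels, which is the shape the score-based fit methods take.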