Interface Classifier<T>

Type Parameters:
T - the type of model input object.
All Superinterfaces:
Serializable, ToDoubleFunction<T>, ToIntFunction<T>
All Known Subinterfaces:
DataFrameClassifier
All Known Implementing Classes:
AbstractClassifier, AdaBoost, DecisionTree, DiscreteNaiveBayes, FLD, GradientTreeBoost, KNN, LDA, LogisticRegression, LogisticRegression.Binomial, LogisticRegression.Multinomial, Maxent, Maxent.Binomial, Maxent.Multinomial, MLP, NaiveBayes, OneVersusOne, OneVersusRest, QDA, RandomForest, RBFNetwork, RDA, SparseLogisticRegression, SparseLogisticRegression.Binomial, SparseLogisticRegression.Multinomial, SVM

public interface Classifier<T> extends ToIntFunction<T>, ToDoubleFunction<T>, Serializable
A classifier assigns an input object to one of a given number of categories. The input object is formally termed an instance, and the categories are termed classes. The instance is usually described by a vector of features, which together constitute a description of all known characteristics of the instance.

Classification normally refers to a supervised procedure, i.e. a procedure that produces an inferred function to predict the output value of new instances based on a training set of pairs consisting of an input object and a desired output value. The inferred function is called a classifier if the output is discrete or a regression function if the output is continuous.
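As a rough illustration of the supervised procedure described above, the sketch below infers a function from a training set of (instance, label) pairs and uses it to predict the label of new instances. `SketchClassifier` and `NearestNeighborSketch` are hypothetical names invented for this example; they mirror only the `predict(T x)` method of this interface and are not part of the library.

```java
// Entry point: builds a toy training set and classifies two new instances.
final class ClassifierIntroDemo {
    public static void main(String[] args) {
        // Training set: pairs of (feature vector, desired class label).
        double[][] x = {{0.0, 0.0}, {0.2, 0.1}, {1.0, 1.0}, {0.9, 1.2}};
        int[] y = {0, 0, 1, 1};

        SketchClassifier<double[]> model = new NearestNeighborSketch(x, y);
        // The nearest training point to (0.1, 0.0) is (0.0, 0.0), labelled 0.
        System.out.println(model.predict(new double[] {0.1, 0.0}));
        // The nearest training point to (1.1, 0.9) is (1.0, 1.0), labelled 1.
        System.out.println(model.predict(new double[] {1.1, 0.9}));
    }
}

// Hypothetical stand-in: only the single-instance predict method.
interface SketchClassifier<T> {
    int predict(T x);
}

// 1-nearest-neighbour rule: the inferred function returns the label of the
// closest training instance (squared Euclidean distance).
final class NearestNeighborSketch implements SketchClassifier<double[]> {
    private final double[][] x;
    private final int[] y;

    NearestNeighborSketch(double[][] x, int[] y) {
        if (x.length == 0 || x.length != y.length) {
            throw new IllegalArgumentException("need one label per training instance");
        }
        this.x = x;
        this.y = y;
    }

    @Override
    public int predict(double[] q) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < x.length; i++) {
            double d = 0.0;
            for (int j = 0; j < q.length; j++) {
                double diff = x[i][j] - q[j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return y[best];
    }
}
```

The output is discrete (a class label), which is what makes the inferred function a classifier rather than a regression function.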

  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Interface
    Description
    static interface 
    The classifier trainer.
  • Method Summary

    Modifier and Type
    Method
    Description
    default double
    applyAsDouble(T x)
     
    default int
    applyAsInt(T x)
     
    int[]
    classes()
    Returns the class labels.
    static <T> Classifier<T>
    ensemble(Classifier<T>... models)
    Returns an ensemble of multiple base models to obtain better predictive performance.
    int
    numClasses()
    Returns the number of classes.
    default boolean
    online()
    Returns true if this is an online learner.
    default int[]
    predict(List<T> x)
    Predicts the class labels of a list of instances.
    default int[]
    predict(List<T> x, List<double[]> posteriori)
    Predicts the class labels of a list of instances.
    default int[]
    predict(Dataset<T,?> x)
    Predicts the class labels of a dataset.
    default int[]
    predict(Dataset<T,?> x, List<double[]> posteriori)
    Predicts the class labels of a dataset.
    int
    predict(T x)
    Predicts the class label of an instance.
    default int[]
    predict(T[] x)
    Predicts the class labels of an array of instances.
    default int[]
    predict(T[] x, double[][] posteriori)
    Predicts the class labels of an array of instances.
    default int
    predict(T x, double[] posteriori)
    Predicts the class label of an instance and also calculates the a posteriori probabilities.
    default double
    score(T x)
    The raw prediction score.
    default boolean
    soft()
    Returns true if this is a soft classifier that can estimate the posteriori probabilities of classification.
    default void
    update(Dataset<T,Integer> batch)
    Updates the model with a mini-batch of new samples.
    default void
    update(T[] x, int[] y)
    Updates the model with a mini-batch of new samples.
    default void
    update(T x, int y)
    Updates the classifier online with a new training instance.
  • Method Details

    • numClasses

      int numClasses()
      Returns the number of classes.
      Returns:
      the number of classes.
    • classes

      int[] classes()
      Returns the class labels.
      Returns:
      the class labels.
    • predict

      int predict(T x)
      Predicts the class label of an instance.
      Parameters:
      x - the instance to be classified.
      Returns:
      the predicted class label.
    • score

      default double score(T x)
      The raw prediction score.
      Parameters:
      x - the instance to be classified.
      Returns:
      the raw prediction score.
    • applyAsInt

      default int applyAsInt(T x)
      Specified by:
      applyAsInt in interface ToIntFunction<T>
    • applyAsDouble

      default double applyAsDouble(T x)
      Specified by:
      applyAsDouble in interface ToDoubleFunction<T>
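Because the interface extends `ToIntFunction<T>` and `ToDoubleFunction<T>`, a model can be passed wherever the JDK expects one of those functions, for example `Stream.mapToInt`. The sketch below assumes that `applyAsInt` forwards to `predict` and `applyAsDouble` forwards to `score`; this page does not state the delegation, so treat it as an assumption. `SketchClassifier` and `ThresholdSketch` are hypothetical stand-ins, not library types.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;
import java.util.function.ToIntFunction;

final class FunctionalViewDemo {
    // The model is used directly as a ToIntFunction.
    static int[] labels(SketchClassifier<Double> model, List<Double> xs) {
        return xs.stream().mapToInt(model).toArray();
    }

    // The model is used directly as a ToDoubleFunction.
    static double[] scores(SketchClassifier<Double> model, List<Double> xs) {
        return xs.stream().mapToDouble(model).toArray();
    }

    public static void main(String[] args) {
        SketchClassifier<Double> model = new ThresholdSketch();
        List<Double> xs = List.of(0.25, 0.5, 1.0);
        System.out.println(Arrays.toString(labels(model, xs))); // [0, 0, 1]
        System.out.println(Arrays.toString(scores(model, xs))); // [-0.25, 0.0, 0.5]
    }
}

// Hypothetical stand-in with the same superinterfaces as the documented interface.
interface SketchClassifier<T> extends ToIntFunction<T>, ToDoubleFunction<T>, Serializable {
    int predict(T x);

    default double score(T x) {
        throw new UnsupportedOperationException("no raw score available");
    }

    // Assumed delegation: label for the int view, raw score for the double view.
    @Override
    default int applyAsInt(T x) {
        return predict(x);
    }

    @Override
    default double applyAsDouble(T x) {
        return score(x);
    }
}

// Toy model: the raw score is the signed distance from a threshold of 0.5.
final class ThresholdSketch implements SketchClassifier<Double> {
    private static final long serialVersionUID = 1L;

    @Override
    public double score(Double x) {
        return x - 0.5;
    }

    @Override
    public int predict(Double x) {
        return score(x) > 0 ? 1 : 0;
    }
}
```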
    • predict

      default int[] predict(T[] x)
      Predicts the class labels of an array of instances.
      Parameters:
      x - the instances to be classified.
      Returns:
      the predicted class labels.
    • predict

      default int[] predict(List<T> x)
      Predicts the class labels of a list of instances.
      Parameters:
      x - the instances to be classified.
      Returns:
      the predicted class labels.
    • predict

      default int[] predict(Dataset<T,?> x)
      Predicts the class labels of a dataset.
      Parameters:
      x - the dataset to be classified.
      Returns:
      the predicted class labels.
    • soft

      default boolean soft()
      Returns true if this is a soft classifier that can estimate the posteriori probabilities of classification.
      Returns:
      true if soft classifier.
    • predict

      default int predict(T x, double[] posteriori)
      Predicts the class label of an instance and also calculates the a posteriori probabilities. A classifier may not support this method, since not every classification algorithm can calculate a posteriori probabilities.
      Parameters:
      x - an instance to be classified.
      posteriori - a posteriori probabilities on output.
      Returns:
      the predicted class label.
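A caller would typically check `soft()` before asking for a posteriori probabilities, and pass an output array with one slot per class. The sketch below illustrates that calling pattern with a hypothetical stand-in interface. Throwing `UnsupportedOperationException` from the default method is an assumption made for this example; the page only says the method may not be supported. `SketchClassifier` and `LogisticSketch` are invented names.

```java
import java.util.Arrays;

final class SoftPredictDemo {
    // Requests probabilities only when the model reports itself as soft.
    static String describe(SketchClassifier<Double> model, double x) {
        if (!model.soft()) {
            return "label=" + model.predict(x);
        }
        double[] posteriori = new double[model.numClasses()];
        int label = model.predict(x, posteriori);
        return "label=" + label + " posteriori=" + Arrays.toString(posteriori);
    }

    public static void main(String[] args) {
        System.out.println(describe(new LogisticSketch(), 2.0));
    }
}

// Hypothetical stand-in mirroring the methods documented on this page.
interface SketchClassifier<T> {
    int numClasses();

    int predict(T x);

    default boolean soft() {
        return false;
    }

    // Assumed behaviour for hard classifiers.
    default int predict(T x, double[] posteriori) {
        throw new UnsupportedOperationException("a posteriori probabilities are not available");
    }
}

// Toy soft classifier: fixed logistic model on a scalar feature.
final class LogisticSketch implements SketchClassifier<Double> {
    @Override
    public int numClasses() {
        return 2;
    }

    @Override
    public boolean soft() {
        return true;
    }

    @Override
    public int predict(Double x) {
        return predict(x, new double[2]);
    }

    @Override
    public int predict(Double x, double[] posteriori) {
        double p1 = 1.0 / (1.0 + Math.exp(-x));
        posteriori[0] = 1.0 - p1; // probability of class 0
        posteriori[1] = p1;       // probability of class 1
        return p1 > 0.5 ? 1 : 0;
    }
}
```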
    • predict

      default int[] predict(T[] x, double[][] posteriori)
      Predicts the class labels of an array of instances.
      Parameters:
      x - the instances to be classified.
      posteriori - a posteriori probabilities on output.
      Returns:
      the predicted class labels.
    • predict

      default int[] predict(List<T> x, List<double[]> posteriori)
      Predicts the class labels of a list of instances.
      Parameters:
      x - the instances to be classified.
      posteriori - an empty list to store a posteriori probabilities on output.
      Returns:
      the predicted class labels.
    • predict

      default int[] predict(Dataset<T,?> x, List<double[]> posteriori)
      Predicts the class labels of a dataset.
      Parameters:
      x - the dataset to be classified.
      posteriori - an empty list to store a posteriori probabilities on output.
      Returns:
      the predicted class labels.
    • online

      default boolean online()
      Returns true if this is an online learner.
      Returns:
      true if online learner.
    • update

      default void update(T x, int y)
      Updates the classifier online with a new training instance. In general, this method may not be thread-safe.
      Parameters:
      x - the training instance.
      y - the training label.
    • update

      default void update(T[] x, int[] y)
      Updates the model with a mini-batch of new samples.
      Parameters:
      x - the training instances.
      y - the training labels.
    • update

      default void update(Dataset<T,Integer> batch)
      Updates the model with a mini-batch of new samples.
      Parameters:
      batch - the training instances.
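The three `update` overloads can be pictured with a small online learner. The sketch below keeps a running mean per class and updates it one instance at a time. The mini-batch default simply loops over the single-instance method. That loop and the `UnsupportedOperationException` default are assumptions for illustration, not documented behaviour. `SketchClassifier` and `OnlineCentroidSketch` are hypothetical names.

```java
final class OnlineUpdateDemo {
    public static void main(String[] args) {
        OnlineCentroidSketch model = new OnlineCentroidSketch();
        if (model.online()) {
            // Mini-batch update followed by a single-instance update.
            model.update(new Double[] {0.0, 0.2, 1.0, 1.2}, new int[] {0, 0, 1, 1});
            model.update(0.1, 0);
        }
        System.out.println(model.predict(0.3)); // closer to the class 0 mean
        System.out.println(model.predict(0.9)); // closer to the class 1 mean
    }
}

// Hypothetical stand-in mirroring online(), update(T, int) and update(T[], int[]).
interface SketchClassifier<T> {
    int predict(T x);

    default boolean online() {
        return false;
    }

    // Assumed behaviour for models that cannot learn incrementally.
    default void update(T x, int y) {
        throw new UnsupportedOperationException("not an online learner");
    }

    // Assumed default: apply the single-instance update to each sample in turn.
    default void update(T[] x, int[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("need one label per instance");
        }
        for (int i = 0; i < x.length; i++) {
            update(x[i], y[i]);
        }
    }
}

// Two-class nearest-mean model on a scalar feature, updated incrementally.
// Not thread-safe: callers sharing one instance should synchronize externally.
final class OnlineCentroidSketch implements SketchClassifier<Double> {
    private final double[] mean = new double[2];
    private final int[] count = new int[2];

    @Override
    public boolean online() {
        return true;
    }

    @Override
    public void update(Double x, int y) {
        count[y]++;
        mean[y] += (x - mean[y]) / count[y]; // incremental mean
    }

    @Override
    public int predict(Double x) {
        return Math.abs(x - mean[1]) < Math.abs(x - mean[0]) ? 1 : 0;
    }
}
```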
    • ensemble

      @SafeVarargs static <T> Classifier<T> ensemble(Classifier<T>... models)
      Returns an ensemble of multiple base models to obtain better predictive performance.
      Type Parameters:
      T - the type of model input object.
      Parameters:
      models - the base models.
      Returns:
      the ensemble model.
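This page does not say how the ensemble combines its base models. One common choice is majority voting, sketched below under that assumption. The stand-in `SketchClassifier` interface and the tie-breaking rule (smallest label wins) are hypothetical and chosen only to keep the example deterministic.

```java
import java.util.Map;
import java.util.TreeMap;

final class EnsembleDemo {
    // Majority vote over the base models; ties go to the smallest label.
    @SafeVarargs
    static <T> SketchClassifier<T> ensemble(SketchClassifier<T>... models) {
        if (models.length == 0) {
            throw new IllegalArgumentException("at least one base model is required");
        }
        SketchClassifier<T>[] base = models.clone();
        return x -> {
            // TreeMap iterates labels in ascending order, which fixes the tie-break.
            Map<Integer, Integer> votes = new TreeMap<>();
            for (SketchClassifier<T> m : base) {
                votes.merge(m.predict(x), 1, Integer::sum);
            }
            int best = -1;
            int bestCount = 0;
            for (Map.Entry<Integer, Integer> e : votes.entrySet()) {
                if (e.getValue() > bestCount) {
                    bestCount = e.getValue();
                    best = e.getKey();
                }
            }
            return best;
        };
    }

    public static void main(String[] args) {
        // Three toy threshold models that disagree between 0.3 and 0.7.
        SketchClassifier<Double> a = x -> x > 0.3 ? 1 : 0;
        SketchClassifier<Double> b = x -> x > 0.5 ? 1 : 0;
        SketchClassifier<Double> c = x -> x > 0.7 ? 1 : 0;

        SketchClassifier<Double> vote = ensemble(a, b, c);
        System.out.println(vote.predict(0.6)); // two of three models vote 1
        System.out.println(vote.predict(0.4)); // two of three models vote 0
    }
}

// Hypothetical stand-in: a single abstract method so lambdas can act as models.
interface SketchClassifier<T> {
    int predict(T x);
}
```

The `@SafeVarargs` annotation mirrors the documented signature: the method only reads from the varargs array, so the generic array creation at the call site is harmless.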