Interface DataFrameClassifier

All Superinterfaces:
Classifier<Tuple>, Serializable, ToDoubleFunction<Tuple>, ToIntFunction<Tuple>
All Known Implementing Classes:
AdaBoost, DecisionTree, GradientTreeBoost, RandomForest

public interface DataFrameClassifier extends Classifier<Tuple>
A classifier that operates on the rows (Tuple) of a DataFrame.
  • Method Details

    • formula

      Formula formula()
      Returns the formula associated with the model.
      Returns:
      the formula associated with the model.
    • schema

      StructType schema()
      Returns the predictor schema.
      Returns:
      the predictor schema.
    • predict

      default int[] predict(DataFrame data)
      Predicts the class labels of a data frame.
      Parameters:
      data - the data frame.
      Returns:
      the predicted class labels.
    • predict

      default int[] predict(DataFrame data, List<double[]> posteriori)
      Predicts the class labels of a data frame and outputs the a posteriori probabilities.
      Parameters:
      data - the data frame.
      posteriori - an empty list to store a posteriori probabilities on output.
      Returns:
      the predicted class labels.
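
The output-parameter convention described above (the caller passes an empty list, and the method fills it with one probability vector per row) can be illustrated with a minimal, library-free sketch. The names `PosterioriSketch`, `SoftClassifier`, and `TOY` are hypothetical stand-ins and are not part of this API; plain `double[][]` rows stand in for a DataFrame.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PosterioriSketch {
    /** Hypothetical soft classifier: returns the label and writes class probabilities into posteriori. */
    interface SoftClassifier {
        int predict(double[] x, double[] posteriori);
    }

    /** Batch prediction that appends one probability vector per row to the caller-supplied list. */
    static int[] predict(SoftClassifier model, double[][] rows, int numClasses, List<double[]> posteriori) {
        int[] labels = new int[rows.length];
        for (int i = 0; i < rows.length; i++) {
            double[] prob = new double[numClasses];
            labels[i] = model.predict(rows[i], prob);
            posteriori.add(prob);
        }
        return labels;
    }

    /** Toy two-class model: P(class 1) is a logistic function of the first feature. */
    static final SoftClassifier TOY = (x, prob) -> {
        prob[1] = 1.0 / (1.0 + Math.exp(-x[0]));
        prob[0] = 1.0 - prob[1];
        return prob[1] > 0.5 ? 1 : 0;
    };

    public static void main(String[] args) {
        List<double[]> posteriori = new ArrayList<>(); // empty on input, filled on output
        int[] labels = predict(TOY, new double[][] {{-2.0}, {3.0}}, 2, posteriori);
        for (int i = 0; i < labels.length; i++) {
            System.out.println(labels[i] + " " + Arrays.toString(posteriori.get(i)));
        }
    }
}
```

After the call, `posteriori.get(i)` holds the probabilities for row `i`, in the same order as the returned labels.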
    • of

      static DataFrameClassifier of(Formula formula, DataFrame data, Properties params, Classifier.Trainer<double[],?> trainer)
      Fits a vector classifier on a data frame.
      Parameters:
      formula - a symbolic description of the model to be fitted.
      data - the data frame of the explanatory and response variables.
      params - the hyperparameters.
      trainer - the training lambda.
      Returns:
      the model.
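
As a rough illustration of the adapter idea behind this method (extract numeric predictors and labels from tabular data, train a classifier over `double[]` vectors, then wrap it so it accepts whole rows), here is a library-free sketch. Everything in it is an assumption made for illustration: `FitOnTableSketch`, `predictors`, `fit`, and `thresholdTrainer` are hypothetical names, a response column index stands in for the formula, and rows passed to the fitted model are assumed to have the same column layout as the training table.

```java
import java.util.function.BiFunction;
import java.util.function.ToIntFunction;

public class FitOnTableSketch {
    /** Copies every column except the response column into a feature vector. */
    static double[] predictors(double[] row, int responseCol) {
        double[] x = new double[row.length - 1];
        for (int j = 0, k = 0; j < row.length; j++) {
            if (j != responseCol) x[k++] = row[j];
        }
        return x;
    }

    /** Splits the table into (x, y), trains on vectors, and returns a model that accepts whole rows. */
    static ToIntFunction<double[]> fit(double[][] table, int responseCol,
            BiFunction<double[][], int[], ToIntFunction<double[]>> trainer) {
        double[][] x = new double[table.length][];
        int[] y = new int[table.length];
        for (int i = 0; i < table.length; i++) {
            x[i] = predictors(table[i], responseCol);
            y[i] = (int) table[i][responseCol];
        }
        ToIntFunction<double[]> model = trainer.apply(x, y);
        return row -> model.applyAsInt(predictors(row, responseCol));
    }

    /** Toy two-class trainer: thresholds the first feature midway between the class means. */
    static ToIntFunction<double[]> thresholdTrainer(double[][] x, int[] y) {
        double[] sum = new double[2];
        int[] count = new int[2];
        for (int i = 0; i < x.length; i++) {
            sum[y[i]] += x[i][0];
            count[y[i]]++;
        }
        double mean0 = sum[0] / count[0];
        double mean1 = sum[1] / count[1];
        double cut = (mean0 + mean1) / 2;
        return v -> (v[0] > cut) == (mean1 > mean0) ? 1 : 0;
    }

    public static void main(String[] args) {
        double[][] table = {{1.0, 0}, {2.0, 0}, {8.0, 1}, {9.0, 1}}; // column 1 is the label
        ToIntFunction<double[]> model = fit(table, 1, FitOnTableSketch::thresholdTrainer);
        System.out.println(model.applyAsInt(new double[] {7.0, 0}));
    }
}
```

The trainer only ever sees vectors; the returned wrapper is what knows how to turn a row into a vector, which is the role the formula plays in the signature above.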
    • ensemble

      static DataFrameClassifier ensemble(DataFrameClassifier... models)
      Returns an ensemble of multiple base models to obtain better predictive performance.
      Parameters:
      models - the base models.
      Returns:
      the ensemble model.
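
One common way to combine base classifiers is majority voting. The sketch below shows that idea in plain Java; whether this method uses exactly this rule is an assumption, not something stated on this page. `EnsembleSketch` and its tie-breaking choice (smallest label wins) are hypothetical, and `ToIntFunction<double[]>` stands in for a base model.

```java
import java.util.function.ToIntFunction;

public class EnsembleSketch {
    /** Combines base models by majority vote; ties go to the smallest label. */
    @SafeVarargs
    static ToIntFunction<double[]> ensemble(int numClasses, ToIntFunction<double[]>... models) {
        return x -> {
            int[] votes = new int[numClasses];
            for (ToIntFunction<double[]> model : models) {
                votes[model.applyAsInt(x)]++;
            }
            int best = 0;
            for (int c = 1; c < numClasses; c++) {
                if (votes[c] > votes[best]) best = c;
            }
            return best;
        };
    }

    public static void main(String[] args) {
        // Three toy base models that disagree on some inputs.
        ToIntFunction<double[]> a = x -> x[0] > 0 ? 1 : 0;
        ToIntFunction<double[]> b = x -> x[1] > 0 ? 1 : 0;
        ToIntFunction<double[]> c = x -> x[0] + x[1] > 0 ? 1 : 0;
        ToIntFunction<double[]> vote = ensemble(2, a, b, c);
        System.out.println(vote.applyAsInt(new double[] {1.0, -0.5}));
    }
}
```

For the input `{1.0, -0.5}`, models `a` and `c` vote for class 1 and `b` votes for class 0, so the vote returns 1.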