Interface Model

All Known Implementing Classes:
ClassificationModel, RegressionModel

public interface Model
Generic model interface.
  • Field Details

  • Method Details

    • algorithm

      String algorithm()
      Returns the algorithm name.
      Returns:
      the algorithm name.
    • schema

      StructType schema()
      Returns the schema of input data (without response variable).
      Returns:
      the schema of input data (without response variable).
    • formula

      Formula formula()
      Returns the model formula.
      Returns:
      the model formula.
    • tags

      Properties tags()
      Returns the model metadata tags.
      Returns:
      the model metadata tags.
    • getTag

      default String getTag(String key)
      Returns the value of the model metadata tag with the given key.
      Parameters:
      key - the tag key.
      Returns:
      the tag value.
    • getTag

      default String getTag(String key, String defaultValue)
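      Returns the value of the model metadata tag with the given key, or the default value if the tag is not set.
      Parameters:
      key - the tag key.
      defaultValue - the value to return if the tag is not set.
      Returns:
      the tag value, or defaultValue if the tag is not set.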
    • setProperty

      default void setProperty(String key, String value)
      Sets a model metadata tag.
      Parameters:
      key - the tag key.
      value - the tag value.
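
      The tag accessors above can be illustrated with a minimal, self-contained sketch. It assumes the default methods simply delegate to the Properties object returned by tags(); this page does not confirm that. TaggedModel and TagSketch are hypothetical names used only for illustration and are not part of the library.

```java
import java.util.Properties;

// Hypothetical stand-in for the tag-related part of the interface (not the library's source).
interface TaggedModel {
    Properties tags();

    // Assumption: lookup delegates to Properties.getProperty, so a missing key yields null.
    default String getTag(String key) {
        return tags().getProperty(key);
    }

    // Assumption: the default value is returned only when the key is absent.
    default String getTag(String key, String defaultValue) {
        return tags().getProperty(key, defaultValue);
    }

    // Named setProperty to mirror the method listed on this page.
    default void setProperty(String key, String value) {
        tags().setProperty(key, value);
    }
}

final class TagSketch {
    // Builds a model whose tags live in a fresh Properties instance.
    static TaggedModel newModel() {
        Properties tags = new Properties();
        return () -> tags;
    }

    public static void main(String[] args) {
        TaggedModel model = newModel();
        model.setProperty("version", "1.0");
        System.out.println(model.getTag("version"));          // prints "1.0"
        System.out.println(model.getTag("owner", "unknown")); // prints "unknown"
    }
}
```

      Under this assumption, any change made through setProperty is also visible through tags(), because both operate on the same Properties object.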
    • classification

      static ClassificationModel classification(String algorithm, Formula formula, DataFrame data, DataFrame test, Properties params, int kfold, int round, boolean ensemble)
      Trains a classification model by cross validation.
      Parameters:
      algorithm - the learning algorithm.
      formula - the model formula.
      data - the training data.
      test - the optional test data.
      params - the hyperparameters.
      kfold - the number of folds in k-fold cross validation.
      round - the number of rounds of repeated cross validation.
      ensemble - if true, creates an ensemble of the cross validation models.
      Returns:
      the classification model.
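
      To make the kfold and round parameters concrete, the sketch below shows one generic way to partition row indices for repeated k-fold cross validation. It is not the library's implementation: KFoldSketch, its method names, and the seeding scheme are assumptions made for illustration. Each round reshuffles the rows and deals them into kfold validation folds, so a full run would train kfold * round models.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper illustrating repeated k-fold splitting (not part of the library).
final class KFoldSketch {
    // Returns k lists of row indices; each list is the validation set of one fold.
    static List<List<Integer>> folds(int n, int k, long seed) {
        if (k < 2 || k > n) throw new IllegalArgumentException("need 2 <= k <= n");
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) rows.add(i);
        Collections.shuffle(rows, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) folds.add(new ArrayList<>());
        // Deal the shuffled rows round-robin so fold sizes differ by at most one.
        for (int i = 0; i < n; i++) folds.get(i % k).add(rows.get(i));
        return folds;
    }

    // Repeats the split 'round' times, reshuffling with a different seed each time.
    static List<List<List<Integer>>> repeated(int n, int k, int round, long seed) {
        List<List<List<Integer>>> rounds = new ArrayList<>();
        for (int r = 0; r < round; r++) rounds.add(folds(n, k, seed + r));
        return rounds;
    }

    public static void main(String[] args) {
        // 10 rows, 3 folds, 2 rounds: 6 train/validation splits in total.
        for (List<List<Integer>> split : repeated(10, 3, 2, 42L)) {
            System.out.println(split);
        }
    }
}
```

      In each split, the rows outside a fold's validation list form that fold's training set.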
    • classification

      static ClassificationModel classification(String algorithm, Formula formula, DataFrame data, DataFrame test, Properties params)
      Trains a classification model.
      Parameters:
      algorithm - the learning algorithm.
      formula - the model formula.
      data - the training data.
      test - the optional test data.
      params - the hyperparameters.
      Returns:
      the classification model.
    • classification

      static DataFrameClassifier classification(String algorithm, Formula formula, DataFrame data, Properties params)
      Trains a classification model.
      Parameters:
      algorithm - the learning algorithm.
      formula - the model formula.
      data - the training data.
      params - the hyperparameters.
      Returns:
      the classification model.
    • regression

      static RegressionModel regression(String algorithm, Formula formula, DataFrame data, DataFrame test, Properties params, int kfold, int round, boolean ensemble)
      Trains a regression model.
      Parameters:
      algorithm - the learning algorithm.
      formula - the model formula.
      data - the training data.
      test - the optional test data.
      params - the hyperparameters.
      kfold - the number of folds; k-fold cross validation is used if kfold > 1.
      round - the number of rounds of repeated cross validation.
      ensemble - if true, creates an ensemble of the cross validation models.
      Returns:
      the regression model.
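
      The ensemble parameter combines the models trained on the individual folds. This page does not say how predictions are combined; a common choice for regression is to average them, and the sketch below assumes that. EnsembleSketch and its methods are hypothetical names, and each fold model is represented as a plain function from a feature vector to a prediction.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration of an averaging ensemble (an assumption, not the library's code).
final class EnsembleSketch {
    // Wraps several regression models into one that returns the mean of their predictions.
    static ToDoubleFunction<double[]> average(List<ToDoubleFunction<double[]>> models) {
        if (models.isEmpty()) throw new IllegalArgumentException("no models");
        return x -> {
            double sum = 0.0;
            for (ToDoubleFunction<double[]> m : models) sum += m.applyAsDouble(x);
            return sum / models.size();
        };
    }

    // Three toy "fold models" that differ only in their intercept.
    static double demo() {
        List<ToDoubleFunction<double[]>> foldModels = List.<ToDoubleFunction<double[]>>of(
                x -> 2.0 * x[0],
                x -> 2.0 * x[0] + 1.0,
                x -> 2.0 * x[0] - 1.0);
        return average(foldModels).applyAsDouble(new double[] {3.0});
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 6.0, the mean of 6.0, 7.0 and 5.0
    }
}
```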
    • regression

      static RegressionModel regression(String algorithm, Formula formula, DataFrame data, DataFrame test, Properties params)
      Trains a regression model.
      Parameters:
      algorithm - the learning algorithm.
      formula - the model formula.
      data - the training data.
      test - the optional test data.
      params - the hyperparameters.
      Returns:
      the regression model.
    • regression

      static DataFrameRegression regression(String algorithm, Formula formula, DataFrame data, Properties params)
      Trains a regression model.
      Parameters:
      algorithm - the learning algorithm.
      formula - the model formula.
      data - the training data.
      params - the hyperparameters.
      Returns:
      the regression model.