Class Maxent

java.lang.Object
smile.classification.AbstractClassifier<int[]>
smile.classification.Maxent
All Implemented Interfaces:
Serializable, ToDoubleFunction<int[]>, ToIntFunction<int[]>, Classifier<int[]>
Direct Known Subclasses:
Maxent.Binomial, Maxent.Multinomial

public abstract class Maxent extends AbstractClassifier<int[]>
Maximum Entropy Classifier. Maximum entropy is a technique for learning probability distributions from data. In maximum entropy models, the observed data itself is assumed to be the testable information. Maximum entropy models assume nothing about the probability distribution beyond what has been observed, and always choose the most uniform distribution subject to the observed constraints.

Basically, the maximum entropy classifier is another name for multinomial logistic regression applied to categorical independent variables, which are converted to binary dummy variables. Maximum entropy models are widely used in natural language processing. Here, we provide an implementation that assumes binary features are stored in a sparse array whose entries are the indices of the nonzero features.
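
The sparse representation described above can be illustrated with a small sketch. The class `SparseBinaryFeatures` and its methods are hypothetical helpers written for this example and are not part of Smile. The weight layout (p feature weights followed by a bias term per class) is also an assumption made for illustration. The first method turns categorical values into indices of nonzero dummy variables; the second evaluates a softmax posterior by summing only the weights at those indices.

```java
/** Illustrative helper, not part of Smile. */
final class SparseBinaryFeatures {
    /**
     * Maps one categorical sample to the indices of its nonzero dummy variables.
     * cardinalities[j] is the number of levels of variable j, and values[j] is
     * the observed level of variable j, in [0, cardinalities[j]).
     */
    static int[] encode(int[] cardinalities, int[] values) {
        int[] indices = new int[values.length];
        int offset = 0;
        for (int j = 0; j < values.length; j++) {
            if (values[j] < 0 || values[j] >= cardinalities[j]) {
                throw new IllegalArgumentException("level out of range for variable " + j);
            }
            indices[j] = offset + values[j];
            offset += cardinalities[j];
        }
        return indices;
    }

    /**
     * Softmax posterior for one sparse binary sample.
     * Assumed layout: w[c] holds p feature weights followed by one bias term.
     */
    static double[] posterior(double[][] w, int[] x) {
        int k = w.length;
        double[] prob = new double[k];
        double max = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < k; c++) {
            double score = w[c][w[c].length - 1]; // bias term
            for (int i : x) {
                score += w[c][i]; // binary feature: just add its weight
            }
            prob[c] = score;
            max = Math.max(max, score);
        }
        double sum = 0.0;
        for (int c = 0; c < k; c++) {
            prob[c] = Math.exp(prob[c] - max); // subtract max for numerical stability
            sum += prob[c];
        }
        for (int c = 0; c < k; c++) {
            prob[c] /= sum;
        }
        return prob;
    }
}
```

With three categorical variables of 3, 2, and 4 levels, the feature space has dimension p = 9, and each sample is stored as exactly three indices rather than a dense vector of nine values.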

  • Constructor Details

    • Maxent

      public Maxent(int p, double L, double lambda, IntSet labels)
      Constructor.
      Parameters:
      p - the dimension of input data.
      L - the log-likelihood of the learned model.
      lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
      labels - the class label encoder.
  • Method Details

    • fit

      public static Maxent fit(int p, int[][] x, int[] y)
      Fits a maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      Returns:
      the model.
    • fit

      public static Maxent fit(int p, int[][] x, int[] y, Properties params)
      Fits a maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      params - the hyper-parameters.
      Returns:
      the model.
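
As a hedged illustration of passing hyper-parameters through a `Properties` object, the sketch below reads a regularization factor, a tolerance, and an iteration limit. The class `MaxentParams` is hypothetical. The key names and default values are assumptions made for this example and should be checked against the library source before use.

```java
import java.util.Properties;

/** Illustrative holder for hyper-parameters, not part of Smile. */
final class MaxentParams {
    // ASSUMPTION: these key names are not confirmed by this page.
    static final String LAMBDA_KEY = "smile.maxent.lambda";
    static final String TOL_KEY = "smile.maxent.tolerance";
    static final String MAX_ITER_KEY = "smile.maxent.iterations";

    final double lambda;
    final double tol;
    final int maxIter;

    private MaxentParams(double lambda, double tol, int maxIter) {
        this.lambda = lambda;
        this.tol = tol;
        this.maxIter = maxIter;
    }

    /** Parses the hyper-parameters, falling back to assumed defaults. */
    static MaxentParams from(Properties params) {
        // ASSUMPTION: the defaults 0.1, 1E-5, and 500 are placeholders.
        double lambda = Double.parseDouble(params.getProperty(LAMBDA_KEY, "0.1"));
        double tol = Double.parseDouble(params.getProperty(TOL_KEY, "1E-5"));
        int maxIter = Integer.parseInt(params.getProperty(MAX_ITER_KEY, "500"));
        if (lambda < 0.0) {
            throw new IllegalArgumentException("Invalid regularization factor: " + lambda);
        }
        if (tol <= 0.0) {
            throw new IllegalArgumentException("Invalid tolerance: " + tol);
        }
        if (maxIter <= 0) {
            throw new IllegalArgumentException("Invalid maximum number of iterations: " + maxIter);
        }
        return new MaxentParams(lambda, tol, maxIter);
    }
}
```

The parsed values correspond to the `lambda`, `tol`, and `maxIter` arguments of the explicit overload.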
    • fit

      public static Maxent fit(int p, int[][] x, int[] y, double lambda, double tol, int maxIter)
      Fits a maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
      tol - the tolerance for stopping iterations.
      maxIter - maximum number of iterations.
      Returns:
      the model.
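
To make the role of `lambda` concrete, here is a minimal sketch of a regularized objective for the two-class case. It assumes an L2 penalty of the form (lambda / 2) * ||w||^2 that excludes the bias term, which is a common choice but is not stated on this page. The class `PenalizedLogLikelihood` is hypothetical and not part of Smile.

```java
/** Illustrative objective function, not part of Smile. */
final class PenalizedLogLikelihood {
    /** Numerically stable log(1 + exp(z)). */
    static double log1pExp(double z) {
        return z > 0 ? z + Math.log1p(Math.exp(-z)) : Math.log1p(Math.exp(z));
    }

    /**
     * Negative log-likelihood of a two-class model plus an L2 penalty.
     * Assumed layout: w holds p feature weights followed by one bias term.
     * x[i] lists the indices of the nonzero features of sample i; y[i] is 0 or 1.
     */
    static double objective(double[] w, int[][] x, int[] y, double lambda) {
        int p = w.length - 1;
        double f = 0.0;
        for (int i = 0; i < x.length; i++) {
            double z = w[p]; // bias term
            for (int j : x[i]) {
                z += w[j];
            }
            // -log P(y | x) for the logistic model
            f += log1pExp(z) - y[i] * z;
        }
        double norm = 0.0;
        for (int j = 0; j < p; j++) {
            norm += w[j] * w[j]; // ASSUMPTION: the bias is not penalized
        }
        return f + 0.5 * lambda * norm;
    }
}
```

A larger `lambda` pulls the weights toward zero, which is why it tends to help when the feature space is high-dimensional relative to the number of samples. An iterative optimizer would minimize this objective until the improvement falls below `tol` or `maxIter` iterations have run.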
    • binomial

      public static Maxent.Binomial binomial(int p, int[][] x, int[] y)
      Fits a binomial maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      Returns:
      the model.
    • binomial

      public static Maxent.Binomial binomial(int p, int[][] x, int[] y, Properties params)
      Fits a binomial maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      params - the hyper-parameters.
      Returns:
      the model.
    • binomial

      public static Maxent.Binomial binomial(int p, int[][] x, int[] y, double lambda, double tol, int maxIter)
      Fits a binomial maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
      tol - the tolerance for stopping iterations.
      maxIter - maximum number of iterations.
      Returns:
      the model.
    • multinomial

      public static Maxent.Multinomial multinomial(int p, int[][] x, int[] y)
      Fits a multinomial maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      Returns:
      the model.
    • multinomial

      public static Maxent.Multinomial multinomial(int p, int[][] x, int[] y, Properties params)
      Fits a multinomial maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      params - the hyper-parameters.
      Returns:
      the model.
    • multinomial

      public static Maxent.Multinomial multinomial(int p, int[][] x, int[] y, double lambda, double tol, int maxIter)
      Fits a multinomial maximum entropy classifier.
      Parameters:
      p - the dimension of feature space.
      x - training samples. Each sample is represented by a set of sparse binary features, stored as an integer array whose entries are the indices of the nonzero features.
      y - training labels in [0, k), where k is the number of classes.
      lambda - lambda > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
      tol - the tolerance for stopping iterations.
      maxIter - maximum number of iterations.
      Returns:
      the model.
    • dimension

      public int dimension()
      Returns the dimension of input space.
      Returns:
      the dimension of input space.
    • soft

      public boolean soft()
      Description copied from interface: Classifier
      Returns true if this is a soft classifier that can estimate the posterior probabilities of classification.
      Returns:
      true if soft classifier.
    • online

      public boolean online()
      Description copied from interface: Classifier
      Returns true if this is an online learner.
      Returns:
      true if online learner.
    • setLearningRate

      public void setLearningRate(double rate)
      Sets the learning rate of stochastic gradient descent. It is good practice to adapt the learning rate to the data size. For example, it is typical to set the learning rate to eta/n, where eta is in [0.1, 0.3] and n is the size of the training data.
      Parameters:
      rate - the learning rate.
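
The eta/n rule of thumb above can be sketched as follows, together with one stochastic gradient step for a two-class model on a single sparse sample. The class `SgdSketch` and the weight layout (p feature weights followed by a bias) are assumptions for illustration. This is not the library's update code, and it omits any regularization term.

```java
/** Illustrative SGD helpers, not part of Smile. */
final class SgdSketch {
    /** Learning rate eta / n, with eta typically in [0.1, 0.3]. */
    static double learningRate(double eta, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive: " + n);
        }
        return eta / n;
    }

    /**
     * One gradient ascent step on the log-likelihood of a two-class model.
     * Assumed layout: w holds p feature weights followed by one bias term.
     * x lists the indices of the nonzero features; y is 0 or 1. Updates w in place.
     */
    static void update(double[] w, int[] x, int y, double rate) {
        int p = w.length - 1;
        double z = w[p];
        for (int j : x) {
            z += w[j];
        }
        // Derivative of the log-likelihood with respect to z: y - sigmoid(z)
        double err = y - 1.0 / (1.0 + Math.exp(-z));
        w[p] += rate * err;
        for (int j : x) {
            w[j] += rate * err; // only weights of active features change
        }
    }
}
```

Because the features are binary and sparse, each step touches only the bias and the weights of the features present in the sample.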
    • getLearningRate

      public double getLearningRate()
      Returns the learning rate of stochastic gradient descent.
      Returns:
      the learning rate of stochastic gradient descent.
    • loglikelihood

      public double loglikelihood()
      Returns the log-likelihood of the model.
      Returns:
      the log-likelihood of the model.
    • AIC

      public double AIC()
      Returns the AIC score.
      Returns:
      the AIC score.
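
For reference, the Akaike information criterion is conventionally defined as 2m - 2 ln(L), where ln(L) is the log-likelihood and m is the number of free parameters. The sketch below applies that definition. The class `AicSketch` is hypothetical, and the parameter count (k - 1) * (p + 1) for k classes and p features is an assumption about how a multinomial logistic model with a reference class is usually counted, not something this page states.

```java
/** Illustrative AIC computation, not part of Smile. */
final class AicSketch {
    /** AIC = 2 * m - 2 * logLikelihood, where m is the number of free parameters. */
    static double aic(double logLikelihood, int numParameters) {
        return 2.0 * (numParameters - logLikelihood);
    }

    /**
     * ASSUMPTION: one weight vector of length p + 1 (weights plus bias) for each
     * of the k - 1 non-reference classes.
     */
    static int numParameters(int k, int p) {
        return (k - 1) * (p + 1);
    }
}
```

Lower AIC values indicate a better trade-off between fit and model size when comparing models trained on the same data.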