Class ExponentialFamilyMixture

java.lang.Object
smile.stat.distribution.Mixture
smile.stat.distribution.ExponentialFamilyMixture
All Implemented Interfaces:
Serializable, Distribution
Direct Known Subclasses:
GaussianMixture

public class ExponentialFamilyMixture extends Mixture
A finite mixture of distributions from the exponential family. The EM algorithm can be used to learn the mixture model from data. EM is particularly useful when the likelihood belongs to an exponential family: the E-step reduces to a sum of expectations of sufficient statistics, and the M-step involves maximizing a linear function. In such cases, it is usually possible to derive closed-form updates for each step.
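The closed-form updates mentioned above can be illustrated with a minimal, standalone sketch for a one-dimensional Gaussian mixture. This is not the library's implementation: the class `GaussianEmSketch`, its `Result` holder, and the convergence rule (stop when the log-likelihood improves by less than `tol`) are assumptions made for illustration only. The E-step computes posterior responsibilities, and the M-step re-estimates each component from the weighted sufficient statistics (sums of r, r*x, and r*x^2).

```java
/** Illustrative EM for a 1-D Gaussian mixture; a sketch, not the library's code. */
final class GaussianEmSketch {
    /** Fitted weights, means, standard deviations, and the last evaluated log-likelihood. */
    static final class Result {
        final double[] weight, mean, sd;
        final double logLikelihood;
        Result(double[] weight, double[] mean, double[] sd, double logLikelihood) {
            this.weight = weight; this.mean = mean; this.sd = sd;
            this.logLikelihood = logLikelihood;
        }
    }

    /** Gaussian probability density function. */
    static double pdf(double x, double mu, double sd) {
        double z = (x - mu) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Runs EM from the given initial parameters (hypothetical signature). */
    static Result fit(double[] x, double[] w0, double[] mu0, double[] sd0,
                      int maxIter, double tol) {
        int n = x.length, k = w0.length;
        double[] w = w0.clone(), mu = mu0.clone(), sd = sd0.clone();
        double[][] r = new double[n][k];
        double logL = Double.NEGATIVE_INFINITY;

        for (int iter = 0; iter < maxIter; iter++) {
            // E-step: posterior responsibility of each component for each point,
            // accumulating the log-likelihood under the current parameters.
            double newLogL = 0.0;
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int j = 0; j < k; j++) {
                    r[i][j] = w[j] * pdf(x[i], mu[j], sd[j]);
                    sum += r[i][j];
                }
                for (int j = 0; j < k; j++) r[i][j] /= sum;
                newLogL += Math.log(sum);
            }
            boolean converged = newLogL - logL < tol;
            logL = newLogL;
            if (converged) break;

            // M-step: closed-form updates from weighted sufficient statistics.
            for (int j = 0; j < k; j++) {
                double s0 = 0.0, s1 = 0.0, s2 = 0.0;
                for (int i = 0; i < n; i++) {
                    s0 += r[i][j];
                    s1 += r[i][j] * x[i];
                    s2 += r[i][j] * x[i] * x[i];
                }
                w[j] = s0 / n;
                mu[j] = s1 / s0;
                double var = s2 / s0 - mu[j] * mu[j];
                sd[j] = Math.sqrt(Math.max(var, 1e-12)); // guard against collapse
            }
        }
        // logL is the value from the last E-step; it is one M-step stale
        // if the loop stopped because maxIter was reached.
        return new Result(w, mu, sd, logL);
    }
}
```

A call such as `GaussianEmSketch.fit(x, new double[]{0.5, 0.5}, new double[]{-1, 1}, new double[]{1, 1}, 200, 1e-6)` returns the estimated parameters. The sketch omits safeguards a production implementation would need, such as working in log space to avoid underflow for points far from every component.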
  • Field Details

    • L

      public final double L
      The log-likelihood when the distribution is fit on sample data.
    • bic

      public final double bic
      The BIC score when the distribution is fit on sample data.
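For orientation, the textbook definition of the Bayesian information criterion can be computed as in the sketch below. The class `BicSketch` and its method are hypothetical, and this page does not state which sign or scaling convention the `bic` field uses, so the value stored there may differ from the textbook form shown here (where lower is better).

```java
/** Textbook BIC; the convention used by the bic field is not stated in this page. */
final class BicSketch {
    /**
     * @param logLikelihood maximized log-likelihood of the fitted model
     * @param numParams     number of free parameters in the model
     * @param n             number of observations
     * @return numParams * ln(n) - 2 * logLikelihood
     */
    static double bic(double logLikelihood, int numParams, int n) {
        return numParams * Math.log(n) - 2.0 * logLikelihood;
    }
}
```

When comparing mixtures with different numbers of components fitted to the same data, the penalty term `numParams * ln(n)` grows with model size and offsets the gain in log-likelihood.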
  • Constructor Details

    • ExponentialFamilyMixture

      public ExponentialFamilyMixture(Mixture.Component... components)
      Constructor.
      Parameters:
      components - a list of exponential family distributions.
  • Method Details

    • fit

      public static ExponentialFamilyMixture fit(double[] x, Mixture.Component... components)
      Fits the mixture model with the EM algorithm.
      Parameters:
      x - the training data.
      components - the initial configuration of the mixture. Components may have different distribution forms.
      Returns:
      the fitted mixture distribution.
    • fit

      public static ExponentialFamilyMixture fit(double[] x, Mixture.Component[] components, double gamma, int maxIter, double tol)
      Fits the mixture model with the EM algorithm.
      Parameters:
      x - the training data.
      components - the initial configuration.
      gamma - the regularization parameter. Although regularization works well for high-dimensional data, it often reduces the model to too few components. For one-dimensional data, gamma should generally be 0.
      maxIter - the maximum number of iterations.
      tol - the tolerance of the convergence test.
      Returns:
      the fitted mixture distribution.