Class GMeans

java.lang.Object
smile.clustering.GMeans

public class GMeans extends Object
G-Means clustering algorithm, an extension of k-means that tries to determine the number of clusters automatically using a normality test. The algorithm is based on a statistical test of the hypothesis that a subset of the data follows a Gaussian distribution. G-Means runs k-means with increasing k in a hierarchical fashion until the test accepts the hypothesis that the data assigned to each k-means center are Gaussian.
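
The normality test that drives the splitting decision can be illustrated with a minimal, self-contained sketch. It is not Smile's implementation: the class NormalityTestSketch and its methods are hypothetical names introduced here for illustration. The sketch assumes the Anderson-Darling statistic with the small-sample correction described by Hamerly and Elkan, and uses the critical value 1.8692 that the paper reports for a significance level of 0.0001. Whether Smile uses the same statistic and threshold is an assumption.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Smile: a sketch of the normality check behind G-Means. */
final class NormalityTestSketch {
    /** Critical value for significance level 0.0001, as reported by Hamerly and Elkan (2003). */
    static final double CRITICAL_VALUE = 1.8692;

    /** Standard normal CDF, using the Abramowitz and Stegun 26.2.17 polynomial approximation. */
    static double normalCdf(double z) {
        double t = 1.0 / (1.0 + 0.2316419 * Math.abs(z));
        double poly = t * (0.319381530 + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
        double upperTail = Math.exp(-0.5 * z * z) / Math.sqrt(2.0 * Math.PI) * poly;
        return z >= 0 ? 1.0 - upperTail : upperTail;
    }

    /** Keeps probabilities strictly inside (0, 1) so the logarithms stay finite. */
    static double clamp(double p) {
        return Math.min(1.0 - 1e-12, Math.max(1e-12, p));
    }

    /**
     * Corrected Anderson-Darling statistic for a one-dimensional sample.
     * Assumes the sample holds at least two distinct values.
     */
    static double andersonDarling(double[] x) {
        int n = x.length;

        // Standardize with the sample mean and the unbiased sample variance.
        double mean = 0.0;
        for (double v : x) mean += v;
        mean /= n;
        double ss = 0.0;
        for (double v : x) ss += (v - mean) * (v - mean);
        double sd = Math.sqrt(ss / (n - 1));

        double[] z = new double[n];
        for (int i = 0; i < n; i++) z[i] = (x[i] - mean) / sd;
        Arrays.sort(z);

        // A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) [ln F(z_i) + ln(1 - F(z_{n+1-i}))]
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double lower = clamp(normalCdf(z[i]));
            double upper = clamp(normalCdf(z[n - 1 - i]));
            sum += (2 * i + 1) * (Math.log(lower) + Math.log(1.0 - upper));
        }
        double a2 = -n - sum / n;

        // Correction for a mean and variance estimated from the data.
        return a2 * (1.0 + 4.0 / n - 25.0 / ((double) n * n));
    }

    /** True if the Gaussian hypothesis is not rejected at the chosen significance level. */
    static boolean looksGaussian(double[] projected) {
        return andersonDarling(projected) <= CRITICAL_VALUE;
    }
}
```

In the scheme described by Hamerly and Elkan, the points assigned to a center are projected onto the line connecting two candidate child centers. Calling NormalityTestSketch.looksGaussian on that projection would then decide whether the center is kept (true) or split in two (false).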

References

  1. G. Hamerly and C. Elkan. Learning the k in k-means. NIPS, 2003.
  • Method Details

    • fit

      public static CentroidClustering<double[],double[]> fit(double[][] data, int kmax, int maxIter)
      Clusters the data, with the number of clusters determined automatically by the G-Means algorithm.
      Parameters:
      data - the input data, in which each row is an observation.
      kmax - the maximum number of clusters.
      maxIter - the maximum number of iterations for k-means.
      Returns:
      the model.
    • fit

      public static CentroidClustering<double[],double[]> fit(double[][] data, Clustering.Options options)
      Clusters the data, with the number of clusters determined automatically by the G-Means algorithm.
      Parameters:
      data - the input data, in which each row is an observation.
      options - the hyperparameters.
      Returns:
      the model.