Package smile.clustering
Class GMeans
java.lang.Object
smile.clustering.GMeans
G-Means clustering algorithm, an extension of K-Means that automatically
determines the number of clusters by means of a normality test.
The G-means algorithm is based on a statistical test for the hypothesis
that a subset of data follows a Gaussian distribution. G-means runs
k-means with increasing k in a hierarchical fashion until the test accepts
the hypothesis that the data assigned to each k-means center are Gaussian.
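The Gaussian check described above can be sketched in plain Java. The paper listed under References applies an Anderson-Darling test to the points of a cluster after projecting them onto the direction between two candidate child centers; this page does not say whether Smile's implementation follows it in every detail. The sketch below is therefore only an illustration, not Smile's code. The class name GaussianSplitCheck, its method names, and the critical value 1.8692 (assumed here to correspond to a significance level of 0.0001) are assumptions introduced for this example.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative sketch of a G-means style Gaussian check; hypothetical, not Smile's code. */
final class GaussianSplitCheck {
    /** Assumed critical value for the corrected statistic (significance level 0.0001). */
    static final double CRITICAL_VALUE = 1.8692;

    /** Standard normal CDF using the Abramowitz-Stegun 7.1.26 approximation of erf. */
    static double normalCdf(double x) {
        double u = Math.abs(x) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * u);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-u * u);
        return x >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Projects each row onto direction v, e.g. the vector between two candidate child centers. */
    static double[] project(double[][] points, double[] v) {
        double norm2 = 0.0;
        for (double c : v) norm2 += c * c;
        double[] out = new double[points.length];
        for (int i = 0; i < points.length; i++) {
            double dot = 0.0;
            for (int j = 0; j < v.length; j++) dot += points[i][j] * v[j];
            out[i] = dot / norm2;
        }
        return out;
    }

    /** Anderson-Darling statistic, corrected for estimated mean and variance.
     *  Expects at least two values that are not all equal. */
    static double andersonDarling(double[] x) {
        int n = x.length;
        double mean = Arrays.stream(x).average().orElse(0.0);
        double ss = 0.0;
        for (double xi : x) ss += (xi - mean) * (xi - mean);
        double sd = Math.sqrt(ss / (n - 1));
        double[] z = new double[n];
        for (int i = 0; i < n; i++) {
            double p = normalCdf((x[i] - mean) / sd);
            z[i] = Math.min(Math.max(p, 1e-12), 1.0 - 1e-12); // clamp so the logs stay finite
        }
        Arrays.sort(z);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += (2.0 * i + 1.0) * (Math.log(z[i]) + Math.log(1.0 - z[n - 1 - i]));
        }
        double a2 = -n - sum / n;
        return a2 * (1.0 + 4.0 / n - 25.0 / ((double) n * n));
    }

    /** Returns true when the projected points are accepted as Gaussian, i.e. no split is needed. */
    static boolean looksGaussian(double[][] points, double[] v) {
        return andersonDarling(project(points, v)) <= CRITICAL_VALUE;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        double[][] oneBlob = new double[500][2];
        double[][] twoBlobs = new double[500][2];
        for (int i = 0; i < 500; i++) {
            oneBlob[i][0] = rnd.nextGaussian();
            oneBlob[i][1] = rnd.nextGaussian();
            twoBlobs[i][0] = rnd.nextGaussian() + (i % 2 == 0 ? -5.0 : 5.0);
            twoBlobs[i][1] = rnd.nextGaussian();
        }
        double[] v = {1.0, 0.0}; // stand-in for the vector between two child centers
        System.out.println("one blob:  " + andersonDarling(project(oneBlob, v)));
        System.out.println("two blobs: " + andersonDarling(project(twoBlobs, v)));
    }
}
```

Projecting onto a single direction turns the multivariate question into a one-dimensional test. In a full G-means loop, a cluster whose projected points fail the check would be replaced by its two child centers, and k-means would be rerun with the larger k.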
References
- G. Hamerly and C. Elkan. Learning the k in k-means. NIPS, 2003.
Method Summary
Modifier and Type, Method, Description
- static CentroidClustering<double[], double[]> fit(double[][] data, int kmax, int maxIter)
  Clusters the data, with the number of clusters determined automatically by the G-Means algorithm.
- static CentroidClustering<double[], double[]> fit(double[][] data, Clustering.Options options)
  Clusters the data, with the number of clusters determined automatically by the G-Means algorithm.
Method Details
fit
public static CentroidClustering<double[],double[]> fit(double[][] data, int kmax, int maxIter)
Clusters the data, with the number of clusters determined automatically by the G-Means algorithm.
- Parameters:
  data - the input data, in which each row is an observation.
  kmax - the maximum number of clusters.
  maxIter - the maximum number of iterations for k-means.
- Returns:
  the model.
fit
public static CentroidClustering<double[],double[]> fit(double[][] data, Clustering.Options options)
Clusters the data, with the number of clusters determined automatically by the G-Means algorithm.
- Parameters:
  data - the input data, in which each row is an observation.
  options - the hyperparameters.
- Returns:
  the model.