gmeans

fun gmeans(data: Array<DoubleArray>, k: Int = 100): GMeans

G-Means clustering algorithm, an extension of K-Means that tries to determine the number of clusters automatically using a normality test. The algorithm is based on a statistical test of the hypothesis that a subset of the data follows a Gaussian distribution. G-Means runs K-Means with increasing k in a hierarchical fashion until the test accepts the hypothesis that the data assigned to each K-Means center are Gaussian.
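The per-cluster split decision described above can be sketched as follows. This is a minimal, self-contained illustration and not this library's implementation. It assumes the procedure from the cited paper: project a cluster's points onto the line joining two candidate child centers, standardize the projections, and compare a corrected Anderson-Darling statistic against a critical value. The names `normalCdf`, `project`, `andersonDarling`, and `shouldSplit` are hypothetical. The default threshold of 1.8692 is an assumption, taken as the value the paper gives for a significance level of 0.0001.

```kotlin
import kotlin.math.abs
import kotlin.math.exp
import kotlin.math.ln
import kotlin.math.sqrt

// Standard normal CDF using the Abramowitz-Stegun 7.1.26 approximation of erf.
// Accurate enough for a sketch; a production test would use a more precise CDF.
fun normalCdf(z: Double): Double {
    val x = abs(z) / sqrt(2.0)
    val t = 1.0 / (1.0 + 0.3275911 * x)
    val poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
            t * (-1.453152027 + t * 1.061405429))))
    val erfc = poly * exp(-x * x)          // approximates 1 - erf(x) for x >= 0
    return if (z >= 0) 1.0 - 0.5 * erfc else 0.5 * erfc
}

// Project each point onto the direction v = c1 - c2 joining two candidate child centers.
fun project(points: Array<DoubleArray>, c1: DoubleArray, c2: DoubleArray): DoubleArray {
    val v = DoubleArray(c1.size) { c1[it] - c2[it] }
    val norm2 = v.sumOf { it * it }
    return DoubleArray(points.size) { i ->
        points[i].indices.sumOf { j -> points[i][j] * v[j] } / norm2
    }
}

// Anderson-Darling statistic for normality, with the small-sample correction
// used when mean and variance are estimated from the data.
fun andersonDarling(x: DoubleArray): Double {
    val n = x.size
    val mean = x.average()
    val sd = sqrt(x.sumOf { (it - mean) * (it - mean) } / (n - 1))
    val z = x.map { (it - mean) / sd }.sorted()
    var s = 0.0
    for (i in 0 until n) {
        // Clamp the CDF values to avoid ln(0) in the extreme tails.
        val fLo = normalCdf(z[i]).coerceIn(1e-12, 1.0 - 1e-12)
        val fHi = normalCdf(z[n - 1 - i]).coerceIn(1e-12, 1.0 - 1e-12)
        s += (2 * i + 1) * (ln(fLo) + ln(1.0 - fHi))
    }
    val a2 = -n - s / n
    return a2 * (1.0 + 4.0 / n - 25.0 / (n.toDouble() * n))
}

// Returns true when the Gaussian hypothesis is rejected, i.e. the cluster should be split.
// criticalValue is an assumed default (see the note above the code).
fun shouldSplit(
    points: Array<DoubleArray>,
    c1: DoubleArray,
    c2: DoubleArray,
    criticalValue: Double = 1.8692
): Boolean = andersonDarling(project(points, c1, c2)) > criticalValue
```

In a full G-Means loop, `c1` and `c2` would come from running K-Means with two centers on the points of a single cluster; clusters for which `shouldSplit` returns true keep both children, and the process repeats until no cluster splits or the maximum k is reached.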

References

  • G. Hamerly and C. Elkan. Learning the k in k-means. NIPS, 2003.

Parameters

data

the data set.

k

the maximum number of clusters.