Package smile.clustering
Hierarchical algorithms find successive clusters using previously established clusters. These algorithms are usually either agglomerative ("bottom-up") or divisive ("top-down"). Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters. Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters.
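The agglomerative ("bottom-up") procedure can be sketched as follows. This is a minimal illustration, not the implementation in this package: the class name AgglomerativeSketch and its methods are hypothetical, single linkage is assumed as the merge criterion, and merging stops once k clusters remain instead of building a full tree.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration class; not part of smile.clustering. */
final class AgglomerativeSketch {
    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Single linkage (assumed here): smallest distance between any two members. */
    static double linkage(double[][] data, List<Integer> c1, List<Integer> c2) {
        double best = Double.POSITIVE_INFINITY;
        for (int i : c1) {
            for (int j : c2) {
                best = Math.min(best, distance(data[i], data[j]));
            }
        }
        return best;
    }

    /** Merges the two closest clusters until k remain; returns one label per point. */
    static int[] cluster(double[][] data, int k) {
        if (k < 1 || k > data.length) {
            throw new IllegalArgumentException("k must be in [1, n]");
        }
        // Start with every element as its own cluster.
        List<List<Integer>> clusters = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            List<Integer> single = new ArrayList<>();
            single.add(i);
            clusters.add(single);
        }
        // Repeatedly merge the closest pair (naive search over all pairs).
        while (clusters.size() > k) {
            int bestA = 0, bestB = 1;
            double best = Double.POSITIVE_INFINITY;
            for (int a = 0; a < clusters.size(); a++) {
                for (int b = a + 1; b < clusters.size(); b++) {
                    double d = linkage(data, clusters.get(a), clusters.get(b));
                    if (d < best) {
                        best = d;
                        bestA = a;
                        bestB = b;
                    }
                }
            }
            // bestB > bestA, so removing bestB does not shift bestA.
            List<Integer> absorbed = clusters.remove(bestB);
            clusters.get(bestA).addAll(absorbed);
        }
        int[] labels = new int[data.length];
        for (int c = 0; c < clusters.size(); c++) {
            for (int i : clusters.get(c)) {
                labels[i] = c;
            }
        }
        return labels;
    }
}
```

Calling AgglomerativeSketch.cluster(data, 2) on two well-separated groups of points should give every point in a group the same label. The pairwise search is kept naive for readability and is not suitable for large data sets.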
Partitional algorithms typically determine all clusters at once, but can also be used as divisive algorithms in hierarchical clustering. Many partitional clustering algorithms require the number of clusters to be specified before the algorithm runs. When the proper value is not known beforehand, it must be estimated, a problem in its own right for which a number of techniques have been developed.
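As an illustration of a partitional method that needs the number of clusters up front, here is a minimal sketch of Lloyd's K-Means iteration. The class KMeansSketch and its parameters are hypothetical and not the API of this package. For determinism the sketch assumes the first k points serve as initial centroids, which is a simplification; practical implementations usually choose the seeds more carefully.

```java
/** Hypothetical illustration class; not part of smile.clustering. */
final class KMeansSketch {
    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Partitions data into k clusters; k must be supplied by the caller. */
    static int[] cluster(double[][] data, int k, int maxIter) {
        int n = data.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be in [1, n]");
        }
        int dim = data[0].length;

        // Assumption: seed the centroids with the first k points.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = data[c].clone();
        }

        int[] labels = new int[n];
        for (int iter = 0; iter < maxIter; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = 0;
                double bestDist = Double.POSITIVE_INFINITY;
                for (int c = 0; c < k; c++) {
                    double dist = squaredDistance(data[i], centroids[c]);
                    if (dist < bestDist) {
                        bestDist = dist;
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            // Stop once an assignment pass leaves every label unchanged.
            if (!changed && iter > 0) {
                break;
            }
            // Update step: move each centroid to the mean of its members.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[labels[i]]++;
                for (int j = 0; j < dim; j++) {
                    sums[labels[i]][j] += data[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // keep the old centroid for an empty cluster
                }
                for (int j = 0; j < dim; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return labels;
    }
}
```

Because k is an input, running the sketch with different values of k on the same data yields different partitions, which is why choosing k is treated as a separate problem.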
Density-based clustering algorithms are devised to discover arbitrary-shaped clusters. In this approach, a cluster is regarded as a region in which the density of data objects exceeds a threshold.
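The density threshold idea can be sketched in the style of DBSCAN. This is a simplified illustration under stated assumptions, not the DBSCAN class of this package: the class DensitySketch and the parameters eps (neighborhood radius) and minPts (minimum neighbor count, including the point itself) are names chosen here, and neighbors are found by a linear scan rather than a spatial index.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration class; not part of smile.clustering. */
final class DensitySketch {
    static final int NOISE = -1;
    private static final int UNVISITED = -2;

    /** Indices of all points within distance eps of point i (including i). */
    static List<Integer> neighbors(double[][] data, int i, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < data.length; j++) {
            double sum = 0.0;
            for (int t = 0; t < data[i].length; t++) {
                double d = data[i][t] - data[j][t];
                sum += d * d;
            }
            if (Math.sqrt(sum) <= eps) {
                result.add(j);
            }
        }
        return result;
    }

    /** Returns a cluster label per point, or NOISE for low-density points. */
    static int[] cluster(double[][] data, double eps, int minPts) {
        int[] labels = new int[data.length];
        Arrays.fill(labels, UNVISITED);
        int clusterId = 0;
        for (int i = 0; i < data.length; i++) {
            if (labels[i] != UNVISITED) {
                continue;
            }
            List<Integer> seeds = neighbors(data, i, eps);
            if (seeds.size() < minPts) {
                labels[i] = NOISE; // density below threshold; may become a border point later
                continue;
            }
            // Point i is dense enough: grow a new cluster from it.
            labels[i] = clusterId;
            Deque<Integer> queue = new ArrayDeque<>(seeds);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) {
                    labels[j] = clusterId; // border point reached from a dense point
                }
                if (labels[j] != UNVISITED) {
                    continue;
                }
                labels[j] = clusterId;
                List<Integer> reachable = neighbors(data, j, eps);
                if (reachable.size() >= minPts) {
                    queue.addAll(reachable); // j is dense too, so keep expanding
                }
            }
            clusterId++;
        }
        return labels;
    }
}
```

Unlike the partitional case, the number of clusters is not an input here; it follows from eps and minPts, and isolated points are reported as DensitySketch.NOISE rather than forced into a cluster.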
Subspace clustering methods look for clusters that can only be seen in a particular projection (subspace, manifold) of the data. These methods can thus ignore irrelevant attributes. The general problem is also known as correlation clustering, while the special case of axis-parallel subspaces is also known as two-way clustering, co-clustering, or biclustering in bioinformatics. In these methods not only the objects are clustered but also the features of the objects, i.e., if the data is represented in a data matrix, the rows and columns are clustered simultaneously. Unlike general subspace methods, however, they usually do not work with arbitrary feature combinations.

Class and Description
Balanced Box-Decomposition Tree.
CentroidClustering<T,U>: In centroid-based clustering, clusters are represented by a central vector, which may not necessarily be a member of the data set.
CLARANS<T>: Clustering Large Applications based upon RANdomized Search.
DBSCAN<T>: Density-Based Spatial Clustering of Applications with Noise.
DENsity CLUstering.
Deterministic annealing clustering.
G-Means clustering algorithm, an extended K-Means which tries to automatically determine the number of clusters by normality test.
Agglomerative Hierarchical Clustering.
K-Means clustering.
K-Modes clustering.
MEC<T>: Non-parametric Minimum Conditional Entropy Clustering.
Partition clustering.
The Sequential Information Bottleneck algorithm.
Spectral Clustering.
X-Means clustering algorithm, an extended K-Means which tries to automatically determine the number of clusters based on BIC scores.