Package smile.clustering
Class CentroidClustering<T,U>
java.lang.Object
smile.clustering.PartitionClustering
smile.clustering.CentroidClustering<T,U>
- Type Parameters:
T
- the type of centroids.
U
- the type of observations. Usually, T and U are the same, but in the case of SIB they differ.
- All Implemented Interfaces:
Serializable, Comparable<CentroidClustering<T,U>>
public abstract class CentroidClustering<T,U>
extends PartitionClustering
implements Comparable<CentroidClustering<T,U>>
In centroid-based clustering, clusters are represented by a central vector,
which may not necessarily be a member of the data set. When the number of
clusters is fixed to k, k-means clustering gives a formal definition as
an optimization problem: find the k cluster centers and assign the objects
to the nearest cluster center, such that the sum of squared distances from
each object to its cluster center is minimized.
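The objective described above can be sketched in a few lines of plain Java. This is an illustration only, not Smile code: the class `DistortionSketch` and its methods are hypothetical, and it assumes that distortion means the sum of squared Euclidean distances from each point to its nearest center.

```java
// Illustrative sketch only; DistortionSketch is a hypothetical class, not part of Smile.
final class DistortionSketch {
    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the centroid nearest to x (ties go to the lower index). */
    static int nearest(double[] x, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int j = 0; j < centroids.length; j++) {
            double dist = squaredDistance(x, centroids[j]);
            if (dist < bestDist) {
                bestDist = dist;
                best = j;
            }
        }
        return best;
    }

    /** Assumed objective: sum of squared distances to the nearest centroid. */
    static double distortion(double[][] data, double[][] centroids) {
        double total = 0.0;
        for (double[] x : data) {
            total += squaredDistance(x, centroids[nearest(x, centroids)]);
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 2}, {10, 0}, {10, 2}};
        double[][] centroids = {{0, 1}, {10, 1}};
        // Every point lies at distance 1 from its nearest center.
        System.out.println(distortion(data, centroids)); // prints 4.0
    }
}
```

A k-means-type algorithm searches for the centroids that make this quantity as small as possible for a fixed k.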
Variations of k-means include restricting the centroids to members of the data set (k-medoids), choosing medians (k-medians clustering), choosing the initial centers less randomly (k-means++) or allowing a fuzzy cluster assignment (fuzzy c-means), etc.
Most k-means-type algorithms require the number of clusters to be specified in advance, which is considered to be one of the biggest drawbacks of these algorithms. Furthermore, the algorithms prefer clusters of approximately similar size, as they will always assign an object to the nearest centroid. This often leads to incorrectly cut borders of clusters (which is not surprising since the algorithm optimizes cluster centers, not cluster borders).
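The border effect can be seen in a one-dimensional toy case. The sketch below is hypothetical (the class `BorderSketch` and the numbers are made up for illustration): suppose a wide cluster is centered at 0 and spans roughly -10 to 10, while a tight cluster sits at 12. Nearest-centroid assignment places the border at the midpoint, 6, so a point at 7 that lies within the wide cluster's extent is still assigned to the tight one.

```java
// Illustrative sketch only; BorderSketch is a hypothetical class, not part of Smile.
final class BorderSketch {
    /** Index of the centroid nearest to x on the real line. */
    static int nearest(double x, double[] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int j = 0; j < centroids.length; j++) {
            double dist = Math.abs(x - centroids[j]);
            if (dist < bestDist) {
                bestDist = dist;
                best = j;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed centers: index 0 is the wide cluster, index 1 the tight one.
        double[] centroids = {0.0, 12.0};
        System.out.println(nearest(7.0, centroids)); // prints 1: cut off from the wide cluster
        System.out.println(nearest(5.0, centroids)); // prints 0
    }
}
```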
-
Field Summary
Modifier and Type, Field, Description:
final T[] centroids
- The centroids of each cluster.
final double distortion
- The total distortion.
Fields inherited from class smile.clustering.PartitionClustering
k, OUTLIER, size, y
-
Constructor Summary
-
Method Summary
Methods inherited from class smile.clustering.PartitionClustering
run, seed
-
Field Details
-
distortion
public final double distortion
The total distortion.
-
centroids
The centroids of each cluster.
-
-
Constructor Details
-
CentroidClustering
Constructor.
- Parameters:
distortion
- the total distortion.
centroids
- the centroids of each cluster.
y
- the cluster labels.
-
-
Method Details
-
compareTo
- Specified by:
compareTo
in interface Comparable<CentroidClustering<T,U>>
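This page does not state how `compareTo` orders two clusterings. A plausible reading, given the `distortion` field, is that a lower distortion sorts first, which would make it easy to keep the best of several runs. The sketch below illustrates that idea under this assumption with a hypothetical stand-in class `Run`; it does not use or describe the Smile implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; BestRunSketch and Run are hypothetical, not Smile classes.
final class BestRunSketch {
    /** Stand-in for a clustering result that carries a total distortion. */
    static final class Run implements Comparable<Run> {
        final String name;
        final double distortion;

        Run(String name, double distortion) {
            this.name = name;
            this.distortion = distortion;
        }

        @Override
        public int compareTo(Run other) {
            // Assumed ordering: lower distortion sorts first.
            return Double.compare(distortion, other.distortion);
        }
    }

    /** Returns the run with the smallest distortion under the assumed ordering. */
    static Run best(List<Run> runs) {
        return Collections.min(runs);
    }

    public static void main(String[] args) {
        List<Run> runs = Arrays.asList(
                new Run("run-a", 12.5), new Run("run-b", 9.75), new Run("run-c", 10.0));
        System.out.println(best(runs).name); // prints run-b
    }
}
```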
-
distance
The distance function.
- Parameters:
a
- an observation.
b
- the other observation.
- Returns:
- the distance.
-
predict
Classifies a new observation.
- Parameters:
x
- a new observation.
- Returns:
- the cluster label.
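For a centroid-based model, classifying a new observation typically means returning the index of the nearest centroid under the model's distance function. The sketch below shows that idea with a pluggable distance. It is an assumption-laden illustration, not the Smile implementation: `NearestCentroidSketch` is hypothetical, and since the method signatures are not shown on this page, the (centroid, observation) argument order of the distance function is assumed.

```java
import java.util.function.ToDoubleBiFunction;

// Illustrative sketch only; NearestCentroidSketch is a hypothetical class, not part of Smile.
final class NearestCentroidSketch {
    /**
     * Returns the index of the centroid closest to x.
     * T is the centroid type and U the observation type, mirroring the
     * class type parameters; the argument order of distance is assumed.
     */
    static <T, U> int predict(U x, T[] centroids, ToDoubleBiFunction<T, U> distance) {
        int label = 0;
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centroids.length; i++) {
            double d = distance.applyAsDouble(centroids[i], x);
            if (d < best) {
                best = d;
                label = i;
            }
        }
        return label;
    }

    public static void main(String[] args) {
        Double[] centroids = {1.0, 5.0, 9.0};
        Double x = 6.2;
        // Absolute difference as a stand-in distance on the real line.
        int label = predict(x, centroids, (c, o) -> Math.abs(c - o));
        System.out.println(label); // prints 1
    }
}
```

Keeping the distance as a parameter reflects why T and U may differ: the centroid and the observation only need a distance defined between them, not a shared type.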
-
toString
- Overrides:
toString
in class PartitionClustering
-