Package smile.clustering
Class CLARANS<T>
java.lang.Object
smile.clustering.PartitionClustering
smile.clustering.CentroidClustering<T,T>
smile.clustering.CLARANS<T>
- Type Parameters:
T
- the type of input object.
- All Implemented Interfaces:
Serializable, Comparable<CentroidClustering<T,T>>
Clustering Large Applications based upon RANdomized Search. CLARANS is an
efficient medoid-based clustering algorithm. The k-medoids algorithm is an
adaptation of the k-means algorithm. Rather than calculating the mean of the
items in each cluster, it chooses a representative item, or medoid, for each
cluster at each iteration. In CLARANS, the process of finding k medoids from
n objects is viewed abstractly as searching through a certain graph. In the
graph, a node is a set of k objects selected as medoids. Two
nodes are neighbors if their sets differ by only one object. In each iteration,
CLARANS considers a set of randomly chosen neighbor nodes as candidates
for the new medoids. It moves to a neighbor node if that neighbor
is a better choice of medoids. If none of the examined neighbors is better,
the current node is taken as a local optimum. The entire process is repeated
multiple times to find a better local optimum.
CLARANS has two parameters: the maximum number of neighbors examined (maxNeighbor) and the number of local minima obtained (numLocal). The higher the value of maxNeighbor, the closer CLARANS is to PAM, and the longer each search for a local minimum takes. However, the quality of such a local minimum is higher, and fewer local minima need to be obtained.
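The graph search described above can be sketched in a few dozen lines. The following is a minimal, self-contained illustration and not the Smile implementation: the class name ClaransSketch, its method names, the seed argument, and the use of java.util.function.ToDoubleBiFunction for the distance are all assumptions made for this example. It requires k < n.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToDoubleBiFunction;

/** Illustrative CLARANS sketch (hypothetical class, not Smile's code). */
class ClaransSketch {
    /** Total distortion: sum of distances from each item to its nearest medoid. */
    static <T> double cost(T[] data, int[] medoids, ToDoubleBiFunction<T, T> d) {
        double sum = 0.0;
        for (T x : data) {
            double best = Double.POSITIVE_INFINITY;
            for (int m : medoids) {
                best = Math.min(best, d.applyAsDouble(x, data[m]));
            }
            sum += best;
        }
        return sum;
    }

    static boolean contains(int[] a, int v) {
        for (int x : a) {
            if (x == v) return true;
        }
        return false;
    }

    /** Returns the indices of the best k medoids found over numLocal restarts. */
    static <T> int[] search(T[] data, ToDoubleBiFunction<T, T> d, int k,
                            int maxNeighbor, int numLocal, long seed) {
        Random rng = new Random(seed);
        int n = data.length;
        int[] best = null;
        double bestCost = Double.POSITIVE_INFINITY;

        for (int run = 0; run < numLocal; run++) {
            // Start from a random node: k distinct item indices.
            int[] current = rng.ints(0, n).distinct().limit(k).toArray();
            double currentCost = cost(data, current, d);

            int examined = 0;
            while (examined < maxNeighbor) {
                // A neighbor differs from the current node in exactly one medoid.
                int candidate;
                do {
                    candidate = rng.nextInt(n);
                } while (contains(current, candidate));
                int[] neighbor = current.clone();
                neighbor[rng.nextInt(k)] = candidate;

                double neighborCost = cost(data, neighbor, d);
                if (neighborCost < currentCost) {
                    // Move to the better node and restart the neighbor count.
                    current = neighbor;
                    currentCost = neighborCost;
                    examined = 0;
                } else {
                    examined++;
                }
            }
            // current is now a local minimum with respect to the sampled neighbors.
            if (currentCost < bestCost) {
                best = current;
                bestCost = currentCost;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Double[] data = {1.0, 1.1, 0.9, 10.0, 10.2, 9.8};
        int[] medoids = search(data, (a, b) -> Math.abs(a - b), 2, 100, 3, 42L);
        System.out.println(Arrays.toString(medoids));
    }
}
```

The sketch recomputes the full cost for every neighbor, which keeps the logic easy to follow; a practical implementation would normally update the cost incrementally when a single medoid is swapped.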
References
- R. Ng and J. Han. CLARANS: A Method for Clustering Objects for Spatial Data Mining. IEEE Transactions on Knowledge and Data Engineering, 2002.
-
Field Summary
Fields inherited from class smile.clustering.CentroidClustering
centroids, distortion
Fields inherited from class smile.clustering.PartitionClustering
k, OUTLIER, size, y
-
Constructor Summary
-
Method Summary
Methods inherited from class smile.clustering.CentroidClustering
compareTo, predict, toString
Methods inherited from class smile.clustering.PartitionClustering
run, seed
-
Constructor Details
-
CLARANS
Constructor.
- Parameters:
distortion - the total distortion.
medoids - the medoids of each cluster.
y - the cluster labels.
distance - the distance measure, given as a lambda.
-
-
Method Details
-
distance
Description copied from class: CentroidClustering
The distance function.
- Specified by:
distance in class CentroidClustering<T,T>
- Parameters:
x - an observation.
y - the other observation.
- Returns:
the distance.
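Because medoids are actual observations, the only thing CLARANS needs from the data type T is a distance between two observations. The sketch below shows two such distance lambdas. The exact functional interface expected by this class is not shown on this page, so java.util.function.ToDoubleBiFunction is used here as an assumed stand-in, and the class name DistanceLambdas is hypothetical.

```java
import java.util.function.ToDoubleBiFunction;

/** Example distance measures written as lambdas (illustrative only). */
class DistanceLambdas {
    /** Euclidean distance between two vectors of equal length. */
    static final ToDoubleBiFunction<double[], double[]> EUCLIDEAN = (x, y) -> {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    };

    /** Hamming distance: number of positions where two equal-length strings differ. */
    static final ToDoubleBiFunction<String, String> HAMMING = (x, y) -> {
        int count = 0;
        for (int i = 0; i < x.length(); i++) {
            if (x.charAt(i) != y.charAt(i)) count++;
        }
        return count;
    };

    public static void main(String[] args) {
        System.out.println(EUCLIDEAN.applyAsDouble(new double[]{0, 0}, new double[]{3, 4}));
        System.out.println(HAMMING.applyAsDouble("karolin", "kathrin"));
    }
}
```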
-
fit
Clusters the data into k clusters. The maximum number of neighbors examined in each random search is set to 1.25% of k * (n - k), where n is the number of observations and k is the number of clusters.
- Type Parameters:
T - the data type.
- Parameters:
data - the observations.
distance - the distance measure, given as a lambda.
k - the number of clusters.
- Returns:
the model.
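The quantity k * (n - k) is the number of neighbors of any node in the search graph: each of the k medoids can be swapped for any of the n - k non-medoids. The default therefore samples 1.25% of a node's neighbors. A small sketch of the arithmetic follows; the helper name defaultMaxNeighbor and the rounding to the nearest integer are assumptions, since this page does not state how the value is rounded.

```java
/** Worked example of the default neighbor budget (hypothetical helper). */
class MaxNeighborDefault {
    /** 1.25% of the k * (n - k) neighbors of a node; rounding is an assumption. */
    static int defaultMaxNeighbor(int n, int k) {
        return (int) Math.round(0.0125 * k * (n - k));
    }

    public static void main(String[] args) {
        // n = 1000 observations, k = 10 clusters: 10 * 990 = 9900 neighbors per node.
        System.out.println(defaultMaxNeighbor(1000, 10));
        // n = 100 observations, k = 5 clusters: 5 * 95 = 475 neighbors per node.
        System.out.println(defaultMaxNeighbor(100, 5));
    }
}
```

For small data sets the default budget becomes very small (about 6 for n = 100 and k = 5), in which case passing maxNeighbor explicitly may be preferable.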
-
fit
Clusters the data into k clusters.
- Type Parameters:
T - the data type.
- Parameters:
data - the observations.
distance - the distance measure, given as a lambda.
k - the number of clusters.
maxNeighbor - the maximum number of neighbors examined during the random search for a local minimum.
- Returns:
the model.
-