Package smile.clustering
Class SpectralClustering
java.lang.Object
smile.clustering.PartitionClustering
smile.clustering.SpectralClustering
- All Implemented Interfaces:
Serializable
Spectral clustering. Given a set of data points, the similarity matrix may
be defined as a matrix S whose entry S_ij measures the similarity between
points i and j. Spectral clustering techniques use the spectrum (the
eigenvalues and eigenvectors) of the similarity matrix to reduce the
dimensionality of the data before clustering. The clustering is then
performed in the dimension-reduced space, in which clusters of non-convex
shape may become tight. There are some intriguing similarities between
spectral clustering methods and kernel PCA, which has been empirically
observed to perform clustering.
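As an illustration of the similarity matrix S described above, the following is a minimal sketch that builds S from raw data with a Gaussian kernel. The class name `GaussianSimilarity`, the exact kernel form exp(-d^2 / (2 sigma^2)), and the choice to leave the diagonal at 1 are assumptions made for this example; they are not taken from the library's source.

```java
// Hypothetical helper, not part of smile.clustering.
final class GaussianSimilarity {
    /** Squared Euclidean distance between two equal-length vectors. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Builds the symmetric similarity matrix S for the rows of data. */
    static double[][] similarity(double[][] data, double sigma) {
        int n = data.length;
        // Assumed kernel: S_ij = exp(-||x_i - x_j||^2 / (2 * sigma^2)).
        double gamma = -0.5 / (sigma * sigma);
        double[][] s = new double[n][n];
        for (int i = 0; i < n; i++) {
            s[i][i] = 1.0; // a point is maximally similar to itself
            for (int j = 0; j < i; j++) {
                double w = Math.exp(gamma * squaredDistance(data[i], data[j]));
                s[i][j] = w;
                s[j][i] = w; // fill both halves to keep S symmetric
            }
        }
        return s;
    }
}
```

A smaller sigma makes similarities decay faster with distance, so only close neighbours remain strongly connected.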
References
- A.Y. Ng, M.I. Jordan, and Y. Weiss. On Spectral Clustering: Analysis and an algorithm. NIPS, 2001.
- Marina Meila and Jianbo Shi. Learning segmentation by random walks. NIPS, 2000.
- Deepak Verma and Marina Meila. A Comparison of Spectral Clustering Algorithms. 2003.
-
Field Summary
Fields inherited from class smile.clustering.PartitionClustering
k, OUTLIER, size, y
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
- static SpectralClustering fit(double[][] data, int k, double sigma): Performs spectral clustering on the data.
- static SpectralClustering fit(double[][] data, int k, double sigma, int maxIter, double tol): Performs spectral clustering on the data.
- static SpectralClustering fit(double[][] data, int k, int l, double sigma): Spectral clustering with Nyström approximation.
- static SpectralClustering fit(double[][] data, int k, int l, double sigma, int maxIter, double tol): Spectral clustering with Nyström approximation.
- static SpectralClustering fit(W, k): Spectral graph clustering.
- static SpectralClustering fit(W, k, maxIter, tol): Spectral graph clustering.
Methods inherited from class smile.clustering.PartitionClustering
run, seed, toString
-
Field Details
-
distortion
public final double distortion
The distortion in feature space.
-
-
Constructor Details
-
SpectralClustering
public SpectralClustering(double distortion, int k, int[] y)
Constructor.
- Parameters:
distortion - the total distortion.
k - the number of clusters.
y - the cluster labels.
-
-
Method Details
-
fit
Spectral graph clustering.
- Parameters:
W - the adjacency matrix of the graph, which will be modified.
k - the number of clusters.
- Returns:
the model.
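To make the role of the adjacency matrix more concrete, here is a minimal sketch of the symmetric normalization D^{-1/2} W D^{-1/2} used in the Ng, Jordan, and Weiss formulation, where D is the diagonal matrix of node degrees. It works on a plain `double[][]` and overwrites its argument; both the array type and the in-place update are assumptions for illustration, and the class name `NormalizedAdjacency` is hypothetical. Whether this is the reason the library modifies W is not stated in the documentation.

```java
// Hypothetical helper, not part of smile.clustering.
final class NormalizedAdjacency {
    /** Overwrites w with D^{-1/2} W D^{-1/2}, where D holds the row sums of w. */
    static void normalizeInPlace(double[][] w) {
        int n = w.length;
        double[] scale = new double[n];
        for (int i = 0; i < n; i++) {
            double degree = 0.0;
            for (int j = 0; j < n; j++) {
                degree += w[i][j];
            }
            // Isolated nodes (degree 0) get scale 0 to avoid division by zero.
            scale[i] = degree > 0.0 ? 1.0 / Math.sqrt(degree) : 0.0;
        }
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                w[i][j] *= scale[i] * scale[j];
            }
        }
    }
}
```

Because the update is in place, a caller who still needs the original adjacency matrix would copy it before normalizing.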
-
fit
Spectral graph clustering.
- Parameters:
W - the adjacency matrix of the graph, which will be modified.
k - the number of clusters.
maxIter - the maximum number of iterations for k-means.
tol - the tolerance of the k-means convergence test.
- Returns:
the model.
-
fit
public static SpectralClustering fit(double[][] data, int k, double sigma)
Performs spectral clustering on the data.
- Parameters:
data - the input data, in which each row is an observation.
k - the number of clusters.
sigma - the smoothing (width) parameter of the Gaussian kernel. Results are somewhat sensitive to this value; to search for a good setting, one may pick the value that gives the tightest clusters (smallest distortion) in feature space.
- Returns:
the model.
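The advice on choosing sigma can be sketched as a simple search over candidate values that keeps the one with the smallest distortion. The helper below is hypothetical (`SigmaSearch` and `bestSigma` are names invented for this example) and takes the distortion as a callback, so it runs without the library on the classpath.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper, not part of smile.clustering.
final class SigmaSearch {
    /**
     * Returns the candidate sigma with the smallest distortion.
     * distortionOf maps a sigma value to the distortion of the fitted model.
     */
    static double bestSigma(double[] candidates, DoubleUnaryOperator distortionOf) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("no candidate sigma values");
        }
        double best = candidates[0];
        double bestDistortion = distortionOf.applyAsDouble(best);
        for (int i = 1; i < candidates.length; i++) {
            double d = distortionOf.applyAsDouble(candidates[i]);
            if (d < bestDistortion) {
                bestDistortion = d;
                best = candidates[i];
            }
        }
        return best;
    }
}
```

With the methods documented on this page, the callback could plausibly be `sigma -> SpectralClustering.fit(data, k, sigma).distortion`, since `fit` returns the model and `distortion` is a public field.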
-
fit
public static SpectralClustering fit(double[][] data, int k, double sigma, int maxIter, double tol)
Performs spectral clustering on the data.
- Parameters:
data - the input data, in which each row is an observation.
k - the number of clusters.
sigma - the smoothing (width) parameter of the Gaussian kernel. Results are somewhat sensitive to this value; to search for a good setting, one may pick the value that gives the tightest clusters (smallest distortion) in feature space.
maxIter - the maximum number of iterations for k-means.
tol - the tolerance of the k-means convergence test.
- Returns:
the model.
-
fit
public static SpectralClustering fit(double[][] data, int k, int l, double sigma)
Spectral clustering with Nyström approximation.
- Parameters:
data - the input data, in which each row is an observation.
k - the number of clusters.
l - the number of random samples for the Nyström approximation.
sigma - the smoothing (width) parameter of the Gaussian kernel. Results are somewhat sensitive to this value; to search for a good setting, one may pick the value that gives the tightest clusters (smallest distortion) in feature space.
- Returns:
the model.
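The sampling step behind the Nyström approximation can be sketched as follows: draw l distinct rows as landmarks, then compute only the n-by-l block of similarities between every point and each landmark instead of the full n-by-n matrix. This is an illustration under assumptions, not the library's implementation; the class `NystromSampling`, its method names, the explicit seed, and the Gaussian kernel form are all hypothetical choices for the example.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not part of smile.clustering.
final class NystromSampling {
    /** Draws l distinct indices from 0..n-1 using a partial Fisher-Yates shuffle. */
    static int[] sampleLandmarks(int n, int l, long seed) {
        if (l < 1 || l > n) {
            throw new IllegalArgumentException("l must be in [1, n]");
        }
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) {
            idx[i] = i;
        }
        Random rng = new Random(seed);
        for (int i = 0; i < l; i++) {
            int j = i + rng.nextInt(n - i); // pick from the not-yet-chosen tail
            int tmp = idx[i];
            idx[i] = idx[j];
            idx[j] = tmp;
        }
        return Arrays.copyOf(idx, l);
    }

    /** n-by-l block of Gaussian similarities between every row and each landmark row. */
    static double[][] kernelBlock(double[][] data, int[] landmarks, double sigma) {
        double gamma = -0.5 / (sigma * sigma); // assumed kernel: exp(-d^2 / (2 sigma^2))
        double[][] c = new double[data.length][landmarks.length];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < landmarks.length; j++) {
                double[] a = data[i];
                double[] b = data[landmarks[j]];
                double d2 = 0.0;
                for (int t = 0; t < a.length; t++) {
                    double d = a[t] - b[t];
                    d2 += d * d;
                }
                c[i][j] = Math.exp(gamma * d2);
            }
        }
        return c;
    }
}
```

The block costs O(n l) kernel evaluations rather than O(n^2), which is the main reason to choose l much smaller than n.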
-
fit
public static SpectralClustering fit(double[][] data, int k, int l, double sigma, int maxIter, double tol)
Spectral clustering with Nyström approximation.
- Parameters:
data - the input data, in which each row is an observation.
k - the number of clusters.
l - the number of random samples for the Nyström approximation.
sigma - the smoothing (width) parameter of the Gaussian kernel. Results are somewhat sensitive to this value; to search for a good setting, one may pick the value that gives the tightest clusters (smallest distortion) in feature space.
maxIter - the maximum number of iterations for k-means.
tol - the tolerance of the k-means convergence test.
- Returns:
the model.
-