specc

fun specc(W: Matrix, k: Int): SpectralClustering

Spectral clustering. Given a set of data points, the similarity matrix may be defined as a matrix S where S_ij represents a measure of the similarity between points i and j. Spectral clustering techniques use the spectrum (the eigenvalues and eigenvectors) of the similarity matrix to reduce the dimensionality of the data before clustering. The clustering is then performed in the dimension-reduced space, in which clusters of non-convex shape may become tight. There are some intriguing similarities between spectral clustering methods and kernel PCA, which has been empirically observed to perform clustering.

References

  • A.Y. Ng, M.I. Jordan, and Y. Weiss. On Spectral Clustering: Analysis and an algorithm. NIPS, 2001.

  • Marina Meila and Jianbo Shi. Learning segmentation by random walks. NIPS, 2000.

  • Deepak Verma and Marina Meila. A Comparison of Spectral Clustering Algorithms. 2003.

Parameters

W

the adjacency matrix of the graph.

k

the number of clusters.
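
As an illustration of what "using the spectrum" of an adjacency matrix can involve, the sketch below scales a symmetric matrix W to D^{-1/2} W D^{-1/2}, the normalization described by Ng, Jordan, and Weiss, where D is the diagonal matrix of row sums. The helper name `normalizedAffinity` and the use of a plain `Array<DoubleArray>` in place of `Matrix` are assumptions made for this example; whether `specc` applies exactly this normalization internally is not stated on this page.

```kotlin
import kotlin.math.sqrt

// Hypothetical helper, not part of the documented API: scales a symmetric
// adjacency matrix W to D^{-1/2} W D^{-1/2}, where D holds the node degrees.
fun normalizedAffinity(w: Array<DoubleArray>): Array<DoubleArray> {
    val n = w.size
    // Degree of each node: the sum of the edge weights in its row.
    val degree = DoubleArray(n) { i -> w[i].sum() }
    return Array(n) { i ->
        DoubleArray(n) { j ->
            val d = degree[i] * degree[j]
            // Isolated nodes (zero degree) get zero entries to avoid dividing by zero.
            if (d > 0.0) w[i][j] / sqrt(d) else 0.0
        }
    }
}
```

In the Ng, Jordan, and Weiss formulation, the eigenvectors belonging to the k largest eigenvalues of this matrix form the low-dimensional embedding that is then clustered; the eigendecomposition itself is omitted here.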


fun specc(data: Array<DoubleArray>, k: Int, sigma: Double): SpectralClustering

Spectral clustering.

Parameters

data

the dataset for clustering.

k

the number of clusters.

sigma

the smoothness/width parameter of the Gaussian kernel, which is a somewhat sensitive parameter. To search for the best setting, one may pick the value that gives the tightest clusters (smallest distortion, see distortion()) in feature space.
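
The search over sigma described above can be sketched as a simple sweep over candidate values. The helper `pickSigma` and its `distortionFor` callback are hypothetical names introduced for this example; the callback is assumed to run the clustering with a given sigma and return the resulting distortion.

```kotlin
// Hypothetical helper, not part of the documented API: returns the candidate
// sigma whose clustering reports the smallest distortion.
// `distortionFor` is an assumed callback that clusters the data with the
// given sigma and returns the distortion of the result.
fun pickSigma(candidates: List<Double>, distortionFor: (Double) -> Double): Double {
    require(candidates.isNotEmpty()) { "at least one candidate sigma is required" }
    // minByOrNull evaluates the callback once per candidate and keeps the minimizer.
    return candidates.minByOrNull(distortionFor)
        ?: error("unreachable: candidates is non-empty")
}
```

A caller would typically pass a lambda that invokes `specc(data, k, sigma)` and reads the distortion of the returned clustering. A coarse logarithmic grid of candidates is a common starting point, since the useful range of sigma depends on the scale of the data.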


fun specc(data: Array<DoubleArray>, k: Int, l: Int, sigma: Double): SpectralClustering

Spectral clustering with Nyström approximation.

Parameters

data

the dataset for clustering.

k

the number of clusters.

l

the number of random samples for Nyström approximation.

sigma

the smoothness/width parameter of the Gaussian kernel, which is a somewhat sensitive parameter. To search for the best setting, one may pick the value that gives the tightest clusters (smallest distortion, see distortion()) in feature space.
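
To make the roles of `l` and `sigma` concrete, the sketch below shows the sampling step of a Nyström approximation: it draws `l` distinct points at random and evaluates the Gaussian kernel between every point and those samples, giving an n-by-l block instead of the full n-by-n similarity matrix. The helper names `gaussian` and `nystromBlock` are hypothetical, and the kernel form exp(-||x - y||^2 / (2 sigma^2)) is an assumption about the parametrization; this page does not describe how the library performs the approximation internally.

```kotlin
import kotlin.math.exp
import kotlin.random.Random

// Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)); the parametrization is an assumption.
fun gaussian(x: DoubleArray, y: DoubleArray, sigma: Double): Double {
    var squaredDistance = 0.0
    for (i in x.indices) {
        val d = x[i] - y[i]
        squaredDistance += d * d
    }
    return exp(-squaredDistance / (2.0 * sigma * sigma))
}

// Hypothetical helper, not part of the documented API: samples l distinct
// landmark points and returns the n-by-l kernel block between all points
// and the landmarks.
fun nystromBlock(
    data: Array<DoubleArray>,
    l: Int,
    sigma: Double,
    rng: Random = Random.Default
): Array<DoubleArray> {
    require(l in 1..data.size) { "l must be between 1 and the number of points" }
    // Shuffle the row indices and keep the first l as landmarks.
    val landmarks = data.indices.shuffled(rng).take(l)
    return Array(data.size) { i ->
        DoubleArray(l) { j -> gaussian(data[i], data[landmarks[j]], sigma) }
    }
}
```

Only n times l kernel evaluations are needed here, which is the source of the savings when l is much smaller than n. A full Nyström method would go on to approximate the eigenvectors of the complete similarity matrix from this block and the l-by-l block among the landmarks.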