Package smile.manifold
Some prominent approaches are locally linear embedding (LLE), Hessian LLE, Laplacian eigenmaps, and LTSA. These techniques construct a low-dimensional data representation using a cost function that retains local properties of the data, and can be viewed as defining a graph-based kernel for Kernel PCA. More recently, techniques have been proposed that, instead of defining a fixed kernel, try to learn the kernel using semidefinite programming. The most prominent example of such a technique is maximum variance unfolding (MVU). The central idea of MVU is to exactly preserve all pairwise distances between nearest neighbors (in the inner product space), while maximizing the distances between points that are not nearest neighbors.
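All of these neighborhood-preserving methods share a common first step: connect each point to its k nearest neighbors and derive the embedding from the resulting graph. The plain-Java sketch below illustrates only that step; the class and method names are hypothetical, and Smile's manifold classes build this graph internally.

```java
import java.util.Arrays;

/** Illustrative k-nearest-neighbor graph construction, the shared first
 *  step of LLE, Laplacian eigenmaps, MVU, and related methods. */
public class KnnGraph {
    /** Returns, for each point, the indices of its k nearest neighbors
     *  under Euclidean distance. O(n^2) brute force for clarity. */
    public static int[][] of(double[][] data, int k) {
        int n = data.length;
        int[][] neighbors = new int[n][k];
        for (int i = 0; i < n; i++) {
            double[] dist = new double[n];
            Integer[] order = new Integer[n];
            for (int j = 0; j < n; j++) {
                double d = 0.0;
                for (int t = 0; t < data[i].length; t++) {
                    double diff = data[i][t] - data[j][t];
                    d += diff * diff;
                }
                dist[j] = d;
                order[j] = j;
            }
            Arrays.sort(order, (a, b) -> Double.compare(dist[a], dist[b]));
            for (int j = 0; j < k; j++) {
                neighbors[i][j] = order[j + 1]; // rank 0 is the point itself
            }
        }
        return neighbors;
    }
}
```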
An alternative approach to neighborhood preservation is through the minimization of a cost function that measures differences between distances in the input and output spaces. Important examples of such techniques include classical multidimensional scaling (which is identical to PCA), Isomap (which uses geodesic distances in the data space), diffusion maps (which uses diffusion distances in the data space), t-SNE (which minimizes the divergence between distributions over pairs of points), and curvilinear component analysis.
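As a concrete instance of such a cost function, the raw stress below sums the squared differences between pairwise Euclidean distances in the input and output spaces. The names are hypothetical helpers for illustration, not part of the Smile API:

```java
/** Raw stress: summed squared differences between input-space and
 *  embedding-space pairwise distances. Illustrative only. */
public class Stress {
    public static double of(double[][] input, double[][] embedding) {
        double sum = 0.0;
        for (int i = 0; i < input.length; i++) {
            for (int j = i + 1; j < input.length; j++) {
                double din = distance(input[i], input[j]);
                double dout = distance(embedding[i], embedding[j]);
                sum += (din - dout) * (din - dout);
            }
        }
        return sum;
    }

    private static double distance(double[] x, double[] y) {
        double s = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            s += diff * diff;
        }
        return Math.sqrt(s);
    }
}
```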
Multidimensional scaling is a set of related statistical techniques often used in information visualization for exploring similarities or dissimilarities in data. An MDS algorithm starts with a matrix of item-item dissimilarities (or similarities), then assigns a location to each item in N-dimensional space. For sufficiently small N, the resulting locations may be displayed in a graph or 3D visualization.
The major types of MDS algorithms include:
- Classical multidimensional scaling takes an input matrix giving dissimilarities between pairs of items and outputs a coordinate matrix whose configuration minimizes a loss function called strain (a minimal sketch follows this list).
- Metric multidimensional scaling is a superset of classical MDS that generalizes the optimization procedure to a variety of loss functions and input matrices of known distances with weights and so on. A useful loss function in this context is called stress, which is often minimized using a procedure called stress majorization.
- Non-metric multidimensional scaling, in contrast to metric MDS, finds both a non-parametric monotonic relationship between the dissimilarities in the item-item matrix and the Euclidean distances between items, and the location of each item in the low-dimensional space. The relationship is typically found using isotonic regression.
- Generalized multidimensional scaling is an extension of metric MDS in which the target space is an arbitrary smooth non-Euclidean space. When the dissimilarities are distances on a surface and the target space is another surface, GMDS finds the minimum-distortion embedding of one surface into another.
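For the classical case, the whole algorithm fits in a short sketch: square and doubly center the dissimilarity matrix, then read the coordinates off its top eigenvectors. The code below assumes Euclidean dissimilarities (so the centered matrix is positive semidefinite) and uses naive power iteration to stay self-contained; it is not Smile's MDS implementation.

```java
import java.util.Random;

/** Bare-bones classical MDS (principal coordinates analysis). Assumes
 *  Euclidean dissimilarities so the doubly centered matrix B is positive
 *  semidefinite; power iteration with deflation recovers its top eigenpairs. */
public class ClassicalMdsSketch {
    /** Embeds the items of an n-by-n dissimilarity matrix into d dimensions. */
    public static double[][] embed(double[][] dissimilarity, int d) {
        int n = dissimilarity.length;

        // Double centering: B = -1/2 * J * D.^2 * J with J = I - (1/n) * 1 * 1'.
        double[][] b = new double[n][n];
        double[] rowMean = new double[n];
        double grandMean = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sq = dissimilarity[i][j] * dissimilarity[i][j];
                b[i][j] = sq;
                rowMean[i] += sq / n;
                grandMean += sq / ((double) n * n);
            }
        }
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                b[i][j] = -0.5 * (b[i][j] - rowMean[i] - rowMean[j] + grandMean);
            }
        }

        // Top-d eigenpairs by power iteration with deflation. A fixed
        // iteration count keeps the sketch simple; real solvers check
        // convergence instead.
        double[][] coords = new double[n][d];
        Random rng = new Random(42);
        for (int k = 0; k < d; k++) {
            double[] v = new double[n];
            for (int i = 0; i < n; i++) v[i] = rng.nextGaussian();
            double lambda = 0.0;
            for (int iter = 0; iter < 500; iter++) {
                double[] w = multiply(b, v);
                lambda = norm(w);
                for (int i = 0; i < n; i++) v[i] = w[i] / lambda;
            }
            // Coordinate axis k is sqrt(lambda) times the unit eigenvector.
            double scale = Math.sqrt(Math.max(lambda, 0.0));
            for (int i = 0; i < n; i++) coords[i][k] = scale * v[i];
            // Deflate: B <- B - lambda * v * v'.
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    b[i][j] -= lambda * v[i] * v[j];
        }
        return coords;
    }

    private static double[] multiply(double[][] a, double[] x) {
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < x.length; j++)
                y[i] += a[i][j] * x[j];
        return y;
    }

    private static double norm(double[] x) {
        double s = 0.0;
        for (double xi : x) s += xi * xi;
        return Math.sqrt(s);
    }
}
```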
Class summary:
- IsoMap: Isometric feature mapping.
- IsotonicMDS: Kruskal's non-metric MDS.
- KPCA<T>: Kernel principal component analysis.
- LaplacianEigenmap: Laplacian Eigenmap.
- LLE: Locally Linear Embedding.
- MDS: Classical multidimensional scaling, also known as principal coordinates analysis.
- SammonMapping: An iterative technique for making interpoint distances in the low-dimensional projection as close as possible to the interpoint distances in the high-dimensional object.
- TSNE: The t-distributed stochastic neighbor embedding.
- UMAP: Uniform Manifold Approximation and Projection.
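A typical session might look like the following. The factory and constructor signatures shown (IsoMap.of, new TSNE, UMAP.of) follow the style of recent Smile releases but are assumptions here; check the Javadoc of each class for the exact API in the version in use.

```java
import smile.manifold.IsoMap;
import smile.manifold.TSNE;
import smile.manifold.UMAP;

public class ManifoldDemo {
    public static void main(String[] args) {
        // Toy data: 100 samples along a noisy curve in 3-D.
        double[][] data = new double[100][3];
        java.util.Random rng = new java.util.Random(42);
        for (int i = 0; i < 100; i++) {
            double t = i / 100.0;
            data[i][0] = Math.cos(6 * t);
            data[i][1] = Math.sin(6 * t);
            data[i][2] = t + 0.01 * rng.nextGaussian();
        }

        // Isomap with a 7-nearest-neighbor graph (assumed factory signature).
        IsoMap isomap = IsoMap.of(data, 7);
        double[][] isomapCoords = isomap.coordinates;

        // t-SNE into 2 dimensions (assumed constructor signature).
        TSNE tsne = new TSNE(data, 2);
        double[][] tsneCoords = tsne.coordinates;

        // UMAP with default hyperparameters (assumed factory signature).
        UMAP umap = UMAP.of(data);
        double[][] umapCoords = umap.coordinates;
    }
}
```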