# Package-level declarations

Manifold learning finds a low-dimensional basis for describing high-dimensional data.

Manifold learning is a popular approach to nonlinear dimensionality reduction. Algorithms for this task are based on the idea that the dimensionality of many data sets is only artificially high; though each data point consists of perhaps thousands of features, it may be described as a function of only a few underlying parameters. That is, the data points are actually samples from a low-dimensional manifold that is embedded in a high-dimensional space. Manifold learning algorithms attempt to uncover these parameters in order to find a low-dimensional representation of the data.

Some prominent approaches are locally linear embedding (LLE), Hessian LLE, Laplacian eigenmaps, and local tangent space alignment (LTSA). These techniques construct a low-dimensional data representation using a cost function that retains local properties of the data, and can be viewed as defining a graph-based kernel for kernel PCA. More recently, techniques have been proposed that, instead of defining a fixed kernel, try to learn the kernel using semidefinite programming. The most prominent example of such a technique is maximum variance unfolding (MVU). The central idea of MVU is to exactly preserve all pairwise distances between nearest neighbors (in the inner product space), while maximizing the distances between points that are not nearest neighbors.

An alternative approach to neighborhood preservation is to minimize a cost function that measures differences between distances in the input and output spaces. Important examples of such techniques include classical multidimensional scaling (which is equivalent to PCA when Euclidean distances are used), Isomap (which uses geodesic distances in the data space), diffusion maps (which use diffusion distances in the data space), t-SNE (which minimizes the divergence between distributions over pairs of points), and curvilinear component analysis.

## Functions

Isometric feature mapping. Isomap is a widely used low-dimensional embedding method that combines geodesic distances on a weighted graph with classical multidimensional scaling. Isomap is used for computing a quasi-isometric, low-dimensional embedding of a set of high-dimensional data points. Isomap is highly efficient and generally applicable to a broad range of data sources and dimensionalities.
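The geodesic-distance step can be sketched in plain Python. This is an illustration only, not this package's implementation: the helper names `knn_graph` and `geodesic_distances` are hypothetical, the graph is stored densely, and Floyd-Warshall is chosen for brevity rather than speed. A full Isomap would pass the resulting matrix to classical MDS.

```python
import math

def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph as a dense matrix of edge lengths."""
    n = len(points)
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        ranked = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in ranked[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph

def geodesic_distances(graph):
    """All-pairs shortest path lengths over the graph (Floyd-Warshall)."""
    n = len(graph)
    dist = [row[:] for row in graph]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist

# Nine points sampled along a half circle of radius 1.
points = [(math.cos(math.pi * i / 8), math.sin(math.pi * i / 8)) for i in range(9)]
geo = geodesic_distances(knn_graph(points, k=2))
```

For the two end points, `geo[0][8]` follows the arc through the neighbor graph, so it is longer than the straight-line distance of 2 and slightly shorter than the true arc length π.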

Kruskal's nonmetric MDS. In nonmetric MDS, only the rank order of entries in the proximity matrix (not the actual dissimilarities) is assumed to contain the significant information. Hence, the distances of the final configuration should as far as possible be in the same rank order as the original data. Note that a perfect ordinal re-scaling of the data into distances is usually not possible. The monotone relationship between dissimilarities and distances is typically found using isotonic regression.
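Isotonic regression itself is short enough to sketch. The function below is a minimal pool-adjacent-violators routine, assuming the input holds the configuration distances already sorted by increasing dissimilarity; the name `isotonic_regression` is hypothetical and not taken from this package.

```python
def isotonic_regression(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    # Each block stores [sum of values, count]; its fitted value is sum / count.
    blocks = []
    for value in y:
        blocks.append([value, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# Distances listed in order of increasing dissimilarity; 3.0 and 2.0 violate the order.
disparities = isotonic_regression([1.0, 3.0, 2.0, 4.0])
# disparities == [1.0, 2.5, 2.5, 4.0]
```

The two out-of-order distances are pooled into their mean, which is the smallest change that restores the rank order.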

Laplacian Eigenmap. Using the notion of the Laplacian of the nearest neighbor adjacency graph, Laplacian Eigenmap computes a low-dimensional representation of the dataset that optimally preserves local neighborhood information in a certain sense. The representation map generated by the algorithm may be viewed as a discrete approximation to a continuous map that naturally arises from the geometry of the manifold.
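As a small illustration of the graph Laplacian, the sketch below builds the unnormalized form L = D - W from a symmetric weight matrix and evaluates the quadratic form yᵀLy, which equals the sum over edges of w_ij (y_i - y_j)². Minimizing that sum is the standard sense in which neighborhoods are preserved. The function names are hypothetical, and the eigenvector computation that produces the actual embedding is omitted.

```python
def graph_laplacian(adjacency):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix W."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    return [[(degrees[i] if i == j else 0.0) - adjacency[i][j] for j in range(n)]
            for i in range(n)]

def quadratic_form(matrix, y):
    """Compute y^T M y for a one-dimensional embedding y."""
    n = len(y)
    return sum(y[i] * matrix[i][j] * y[j] for i in range(n) for j in range(n))

# Path graph 0 - 1 - 2 with unit edge weights.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
L = graph_laplacian(W)
cost = quadratic_form(L, [0.0, 1.0, 3.0])
# cost == 5.0, i.e. (0 - 1)^2 + (1 - 3)^2 summed over the two edges
```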

Locally Linear Embedding. LLE has several advantages over Isomap, including faster optimization when implemented to take advantage of sparse matrix algorithms, and better results on many problems. Like Isomap, LLE begins by finding a set of the nearest neighbors of each point. It then computes a set of weights for each point that best describe the point as a linear combination of its neighbors. Finally, it uses an eigenvector-based optimization technique to find the low-dimensional embedding of points, such that each point is still described with the same linear combination of its neighbors. LLE tends to handle non-uniform sample densities poorly because there is no fixed unit to prevent the weights from drifting as various regions differ in sample densities.
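The weight computation for a single point can be sketched as follows. This is a hedged illustration, not this package's code: the helper names `solve` and `lle_weights` are hypothetical, the neighbors are assumed to be given and not all identical to the point, and the regularization term `reg` is an assumption added to keep the local Gram matrix invertible.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def lle_weights(point, neighbors, reg=1e-3):
    """Weights summing to 1 that best reconstruct `point` from `neighbors`."""
    k = len(neighbors)
    diffs = [[p - q for p, q in zip(point, nb)] for nb in neighbors]
    # Local Gram matrix C[i][j] = (x - n_i) . (x - n_j).
    gram = [[sum(a * b for a, b in zip(diffs[i], diffs[j])) for j in range(k)]
            for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    for i in range(k):
        gram[i][i] += reg * trace  # assumed regularization for stability
    w = solve(gram, [1.0] * k)
    total = sum(w)
    return [wi / total for wi in w]

# A point halfway between its two neighbors gets equal weights.
weights = lle_weights((1.0, 0.0), [(0.0, 0.0), (2.0, 0.0)])
```

Solving C w = 1 and then rescaling so the weights sum to one is the usual closed-form route to the constrained least-squares problem described above.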

Classical multidimensional scaling, also known as principal coordinates analysis. Given a matrix A of dissimilarities (e.g. pairwise distances), MDS finds a set of points in a low-dimensional space whose pairwise distances approximate the dissimilarities in A well. We are not restricted to using a Euclidean distance metric. However, when Euclidean distances are used, MDS is equivalent to PCA.
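The link to PCA comes from double centering: for Euclidean distances, B = -1/2 J D² J (with J the centering matrix and D² the squared distances) is the Gram matrix of the mean-centered points. The sketch below shows only that step, under the assumption of a symmetric input matrix; the name `double_center` is hypothetical. The embedding itself would come from the leading eigenvectors of B scaled by the square roots of their eigenvalues, which is not shown here.

```python
def double_center(dist):
    """Return B = -1/2 * J * D^2 * J for a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]  # equals the column means by symmetry
    total_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]

# Pairwise distances between the points 0, 1 and 3 on a line.
D = [[0.0, 1.0, 3.0],
     [1.0, 0.0, 2.0],
     [3.0, 2.0, 0.0]]
B = double_center(D)
# The centered coordinates are -4/3, -1/3 and 5/3, so B[i][j] is their product.
```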

Sammon's mapping is an iterative technique for making interpoint distances in the low-dimensional projection as close as possible to the interpoint distances in the high-dimensional space. Two points close together in the high-dimensional space should appear close together in the projection, while two points far apart in the high-dimensional space should appear far apart in the projection. Sammon's mapping is a special case of metric least-squares multidimensional scaling.
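The quantity being minimized is Sammon's stress, which weights each squared distance error by the inverse of the original distance so that small distances count more. The sketch below only evaluates that stress for a given projection, assuming all input points are distinct; the function name `sammon_stress` is hypothetical, and the iterative update that reduces the stress is not shown.

```python
import math

def sammon_stress(high, low):
    """Sammon's stress between original points and their projection."""
    n = len(high)
    scale = 0.0  # sum of original distances, used for normalization
    error = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_high = math.dist(high[i], high[j])  # assumed non-zero
            d_low = math.dist(low[i], low[j])
            scale += d_high
            error += (d_high - d_low) ** 2 / d_high
    return error / scale

high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faithful = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # preserves every distance
collapsed = [(0.0,), (1.0,), (0.0,)]             # maps two points onto each other

stress_faithful = sammon_stress(high, faithful)
stress_collapsed = sammon_stress(high, collapsed)
```

A projection that preserves all distances has zero stress, while the collapsed one-dimensional projection is penalized.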

t-distributed stochastic neighbor embedding. t-SNE is a nonlinear dimensionality reduction technique that is particularly well suited for embedding high-dimensional data into a space of two or three dimensions, which can then be visualized in a scatter plot. Specifically, it models each high-dimensional object by a two- or three-dimensional point in such a way that similar objects are modeled by nearby points and dissimilar objects are modeled by distant points.
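In the standard t-SNE formulation, similarities between embedded points use a Student t kernel with one degree of freedom, and the embedding is scored by the Kullback-Leibler divergence from the high-dimensional affinities P to these low-dimensional affinities Q. The sketch below covers only those two pieces, with hypothetical function names; computing P from a perplexity setting and the gradient descent over the embedding are both omitted.

```python
import math

def student_t_affinities(embedding):
    """Joint probabilities q_ij from a Student t kernel over embedded points."""
    n = len(embedding)
    kernel = [[0.0 if i == j
               else 1.0 / (1.0 + math.dist(embedding[i], embedding[j]) ** 2)
               for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in kernel)  # normalize over all pairs
    return [[v / total for v in row] for row in kernel]

def kl_divergence(p, q):
    """KL(P || Q) summed over all pairs with p_ij > 0."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0.0)

# Two nearby points and one distant point in a one-dimensional layout.
Q = student_t_affinities([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
```

Nearby points receive a larger share of the probability mass than distant ones, which is why `Q[0][1]` exceeds `Q[0][2]` in this layout.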

Uniform Manifold Approximation and Projection.