Class UMAP

java.lang.Object
smile.manifold.UMAP

public class UMAP extends Object
Uniform Manifold Approximation and Projection. UMAP is a dimension reduction technique that can be used for visualization similarly to t-SNE, but also for general non-linear dimension reduction. The algorithm is founded on three assumptions about the data:
  • The data is uniformly distributed on a Riemannian manifold;
  • The Riemannian metric is locally constant (or can be approximated as such);
  • The manifold is locally connected.
From these assumptions it is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low dimensional projection of the data that has the closest possible equivalent fuzzy topological structure.
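As an illustration of the "fuzzy topological structure" mentioned above, here is a minimal sketch of the local membership weights described in the UMAP paper: each point's nearest-neighbor distance rho is subtracted from its neighbor distances, and a scale sigma is chosen so that the weights exp(-max(0, d - rho) / sigma) sum to log2(k). The class and method names (FuzzyMembership, membershipSum, solveSigma, weights) are hypothetical, and this is not necessarily how this class computes the weights internally.

```java
import java.util.Arrays;

/** Sketch of UMAP-style fuzzy membership weights for a single point (hypothetical helper). */
final class FuzzyMembership {

    /** Sum of exp(-max(0, d - rho) / sigma) over the neighbor distances. */
    static double membershipSum(double[] dist, double rho, double sigma) {
        double sum = 0.0;
        for (double d : dist) {
            sum += Math.exp(-Math.max(0.0, d - rho) / sigma);
        }
        return sum;
    }

    /** Bisection for the sigma at which the membership sum reaches log2(k). */
    static double solveSigma(double[] dist, double rho) {
        double target = Math.log(dist.length) / Math.log(2.0);
        double lo = 0.0;
        double hi = 1.0;
        // The sum grows with sigma, so enlarge the bracket until it covers the target.
        while (membershipSum(dist, rho, hi) < target) {
            hi *= 2.0;
        }
        for (int iter = 0; iter < 64; iter++) {
            double mid = 0.5 * (lo + hi);
            if (membershipSum(dist, rho, mid) < target) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

    /** Membership weights of one point's k nearest neighbors (distances exclude the point itself). */
    static double[] weights(double[] dist) {
        // Assumption: rho is the distance to the single nearest neighbor.
        double rho = Arrays.stream(dist).min().getAsDouble();
        double sigma = solveSigma(dist, rho);
        double[] w = new double[dist.length];
        for (int i = 0; i < dist.length; i++) {
            w[i] = Math.exp(-Math.max(0.0, dist[i] - rho) / sigma);
        }
        return w;
    }

    public static void main(String[] args) {
        double[] dist = {1.0, 2.0, 3.0, 4.0};
        System.out.println(Arrays.toString(weights(dist)));
    }
}
```

Subtracting rho gives the nearest neighbor a weight of exactly 1, which reflects the local connectivity assumption: every point is fully connected to at least one neighbor.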

References

  1. McInnes, L., Healy, J., Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv e-prints 1802.03426, 2018.
  2. How UMAP Works
  • Method Details

    • fit

      public static double[][] fit(double[][] data, UMAP.Options options)
      Runs the UMAP algorithm with Euclidean distance.
      Parameters:
      data - the input data.
      options - the hyperparameters.
      Returns:
      The embedding coordinates.
    • fit

      public static <T> double[][] fit(T[] data, Metric<T> distance, UMAP.Options options)
      Runs the UMAP algorithm with a user-supplied distance function.
      Type Parameters:
      T - the data type of points.
      Parameters:
      data - the input data.
      distance - the distance function.
      options - the hyperparameters.
      Returns:
      The embedding coordinates.
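The generic overload above makes it possible to embed data that are not numeric vectors, as long as a distance between points is defined. The following is a minimal, library-free sketch of such a distance on strings. The PointDistance interface is a hypothetical stand-in declared here so the example compiles on its own; it is not Smile's Metric<T>, and the shape of its single method d(x, y) is an assumption made for this example.

```java
/** Sketch of a custom distance on non-vector data (hypothetical names throughout). */
final class CustomMetricSketch {

    /** Hypothetical stand-in for a metric type; not Smile's Metric<T>. */
    interface PointDistance<T> {
        double d(T x, T y);
    }

    /** Hamming distance: the number of positions at which two equal-length strings differ. */
    static final PointDistance<String> HAMMING = (x, y) -> {
        if (x.length() != y.length()) {
            throw new IllegalArgumentException("Strings must have equal length");
        }
        int diff = 0;
        for (int i = 0; i < x.length(); i++) {
            if (x.charAt(i) != y.charAt(i)) {
                diff++;
            }
        }
        return diff;
    };

    /** Full pairwise distance matrix, handy for sanity-checking a distance before use. */
    static <T> double[][] pairwise(T[] data, PointDistance<T> distance) {
        int n = data.length;
        double[][] matrix = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double dij = distance.d(data[i], data[j]);
                matrix[i][j] = dij;
                matrix[j][i] = dij; // a metric is symmetric
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        String[] data = {"karolin", "kathrin", "kerstin"};
        double[][] m = pairwise(data, HAMMING);
        System.out.println(java.util.Arrays.deepToString(m));
    }
}
```

Whatever function is supplied should satisfy the metric axioms (non-negativity, symmetry, triangle inequality), since the parameter type is a metric rather than an arbitrary dissimilarity.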
    • fit

      public static <T> double[][] fit(T[] data, NearestNeighborGraph nng, UMAP.Options options)
      Runs the UMAP algorithm on a precomputed k-nearest neighbor graph.
      Type Parameters:
      T - the data type of points.
      Parameters:
      data - the input data.
      nng - the k-nearest neighbor graph.
      options - the hyperparameters.
      Returns:
      The embedding coordinates.
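To make the notion of a k-nearest neighbor graph concrete, the sketch below builds one by brute force with Euclidean distance. KnnGraphSketch and its fields are hypothetical and are not Smile's NearestNeighborGraph; the example only illustrates the information such a graph carries, namely the indices of each point's k closest neighbors and the distances to them.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Brute-force k-nearest neighbor graph (hypothetical stand-in, not Smile's NearestNeighborGraph). */
final class KnnGraphSketch {
    /** neighbors[i][r] is the index of the r-th closest point to point i. */
    final int[][] neighbors;
    /** distances[i][r] is the distance from point i to neighbors[i][r]. */
    final double[][] distances;

    private KnnGraphSketch(int[][] neighbors, double[][] distances) {
        this.neighbors = neighbors;
        this.distances = distances;
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** O(n^2 log n) construction: sort all other points by distance and keep the first k. */
    static KnnGraphSketch build(double[][] data, int k) {
        int n = data.length;
        if (k < 1 || k >= n) {
            throw new IllegalArgumentException("k must be in [1, n - 1]");
        }
        int[][] neighbors = new int[n][k];
        double[][] distances = new double[n][k];
        for (int i = 0; i < n; i++) {
            final double[] d = new double[n];
            Integer[] order = new Integer[n - 1];
            int pos = 0;
            for (int j = 0; j < n; j++) {
                if (j == i) continue; // a point is not its own neighbor here
                d[j] = euclidean(data[i], data[j]);
                order[pos++] = j;
            }
            Arrays.sort(order, Comparator.comparingDouble((Integer j) -> d[j]));
            for (int r = 0; r < k; r++) {
                neighbors[i][r] = order[r];
                distances[i][r] = d[order[r]];
            }
        }
        return new KnnGraphSketch(neighbors, distances);
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {1, 0}, {0, 3}, {10, 10}};
        KnnGraphSketch graph = build(data, 2);
        System.out.println(Arrays.deepToString(graph.neighbors));
    }
}
```

A precomputed graph is mainly useful when the neighbor search is expensive or is shared across several runs with different hyperparameters, because only the search depends on the distance function.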