Class ProbabilisticPCA

java.lang.Object
smile.feature.extraction.Projection
smile.feature.extraction.ProbabilisticPCA
All Implemented Interfaces:
Serializable, Function<Tuple,Tuple>, Transform

public class ProbabilisticPCA extends Projection
Probabilistic principal component analysis. Probabilistic PCA is a simplified factor analysis that employs a latent variable model with a linear relationship:
     y = W * x + μ + ε
 
where the latent variables x ∼ N(0, I), the error (or noise) ε ∼ N(0, Ψ), and μ is the location term (mean). Probabilistic PCA uses an isotropic noise model, i.e., the noise variances are constrained to be equal (Ψᵢ = σ² for every i). A closed-form estimate of the above parameters can be obtained by the maximum likelihood method.
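
For reference, the closed-form maximum likelihood solution given in the paper cited below can be written as follows. The symbols S (sample covariance), d (number of observed variables), k (number of latent dimensions), U_k, Λ_k, λ_j and R are introduced here for illustration and do not appear elsewhere on this page.

```latex
% S = U \Lambda U^{\top} is the eigendecomposition of the sample covariance,
% with eigenvalues \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d.
% U_k holds the k leading eigenvectors, \Lambda_k the k leading eigenvalues,
% and R is an arbitrary k x k orthogonal rotation.
\begin{aligned}
\mu_{\mathrm{ML}}      &= \frac{1}{n} \sum_{i=1}^{n} y_i \\
\sigma^2_{\mathrm{ML}} &= \frac{1}{d - k} \sum_{j=k+1}^{d} \lambda_j \\
W_{\mathrm{ML}}        &= U_k \left( \Lambda_k - \sigma^2_{\mathrm{ML}} I \right)^{1/2} R
\end{aligned}
```

In words, the noise variance is the average of the discarded eigenvalues, and each retained direction is scaled by the variance it explains beyond the noise level.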

References

  1. Michael E. Tipping and Christopher M. Bishop. Probabilistic Principal Component Analysis. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 61(3):611-622, 1999.
  • Constructor Details

    • ProbabilisticPCA

      public ProbabilisticPCA(double noise, double[] mu, Matrix loading, Matrix projection, String... columns)
      Constructor.
      Parameters:
      noise - the variance of noise.
      mu - the mean of samples.
      loading - the loading matrix.
      projection - the projection matrix. Note that this is not the matrix W in the latent model.
      columns - the columns to transform when applied on Tuple/DataFrame.
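
The note on the projection parameter can be illustrated with a small sketch. It assumes, following the cited paper, that the projection is the posterior-mean map (WᵀW + σ²I)⁻¹Wᵀ applied to the centered sample; this page does not state which matrix the class actually stores, so treat that as an assumption. The class and method names are hypothetical, and the case is restricted to one latent dimension so that the matrix inverse reduces to a scalar division.

```java
// Hypothetical helper, not part of the library: PPCA projection for k = 1.
class PpcaProjectionSketch {
    /** Builds the single row of P = (W^T W + sigma2 * I)^{-1} W^T when W has one column w. */
    public static double[] projectionRow(double[] w, double sigma2) {
        double m = sigma2;                      // M = w.w + sigma2 is a scalar when k = 1
        for (double wi : w) m += wi * wi;
        double[] p = new double[w.length];
        for (int i = 0; i < w.length; i++) p[i] = w[i] / m;
        return p;
    }

    /** Posterior mean of the latent variable given y: P * (y - mu). */
    public static double project(double[] p, double[] mu, double[] y) {
        double x = 0.0;
        for (int i = 0; i < y.length; i++) x += p[i] * (y[i] - mu[i]);
        return x;
    }

    public static void main(String[] args) {
        double[] w = {3.0, 4.0};                // loading vector (assumed values)
        double[] mu = {1.0, 1.0};               // sample mean (assumed values)
        double sigma2 = 5.0;                    // noise variance (assumed value)
        double[] p = projectionRow(w, sigma2);  // M = 9 + 16 + 5 = 30
        double x = project(p, mu, new double[] {4.0, 5.0});
        System.out.println(x);                  // approximately 25 / 30
    }
}
```

Unlike Wᵀ, this map shrinks the latent coordinate toward zero as the noise variance grows, which is one reason the two matrices should not be confused.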
  • Method Details

    • loadings

      public Matrix loadings()
      Returns the variable loading matrix, ordered from largest to smallest by corresponding eigenvalues.
      Returns:
      the variable loading matrix.
    • center

      public double[] center()
      Returns the center (mean) of the data.
      Returns:
      the center of the data.
    • variance

      public double variance()
      Returns the variance of noise.
      Returns:
      the variance of noise.
    • postprocess

      protected double[] postprocess(double[] x)
      Description copied from class: Projection
      Postprocess the output vector after projection.
      Overrides:
      postprocess in class Projection
      Parameters:
      x - the output vector of projection.
      Returns:
      the postprocessed vector.
    • fit

      public static ProbabilisticPCA fit(DataFrame data, int k, String... columns)
      Fits probabilistic principal component analysis.
      Parameters:
      data - training data of which each row is a sample.
      k - the number of principal components to learn.
      columns - the columns to fit PCA on. If empty, all columns will be used.
      Returns:
      the model.
    • fit

      public static ProbabilisticPCA fit(double[][] data, int k, String... columns)
      Fits probabilistic principal component analysis.
      Parameters:
      data - training data of which each row is a sample.
      k - the number of principal components to learn.
      columns - the columns to transform when applied on Tuple/DataFrame.
      Returns:
      the model.
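
To make the closed-form estimate concrete, here is a minimal, library-free sketch for two observed variables and k = 1, where the eigendecomposition of the 2 x 2 covariance has an explicit formula. It is not the implementation behind fit. The class name, the return layout, and the choice to divide the covariance by n are assumptions made for illustration.

```java
// Hypothetical illustration of the closed-form PPCA estimate for 2-D data, k = 1.
class PpcaClosedForm2D {
    /** Returns {sigma2, w0, w1, mu0, mu1}, where (w0, w1) is the single loading vector. */
    public static double[] fit(double[][] data) {
        int n = data.length;
        double mu0 = 0.0, mu1 = 0.0;
        for (double[] r : data) { mu0 += r[0]; mu1 += r[1]; }
        mu0 /= n; mu1 /= n;

        // Sample covariance [[a, b], [b, c]], normalized by n (an assumption).
        double a = 0.0, b = 0.0, c = 0.0;
        for (double[] r : data) {
            double d0 = r[0] - mu0, d1 = r[1] - mu1;
            a += d0 * d0; b += d0 * d1; c += d1 * d1;
        }
        a /= n; b /= n; c /= n;

        // Eigenvalues of a symmetric 2 x 2 matrix, l1 >= l2.
        double mean = (a + c) / 2.0, diff = (a - c) / 2.0;
        double root = Math.sqrt(diff * diff + b * b);
        double l1 = mean + root, l2 = mean - root;

        // Unit eigenvector for l1: (b, l1 - a) solves (a - l1) v0 + b v1 = 0.
        double v0, v1;
        if (b != 0.0) { v0 = b; v1 = l1 - a; }
        else if (a >= c) { v0 = 1.0; v1 = 0.0; }
        else { v0 = 0.0; v1 = 1.0; }
        double norm = Math.hypot(v0, v1);
        v0 /= norm; v1 /= norm;

        double sigma2 = l2;                     // average of the d - k = 1 discarded eigenvalue
        double scale = Math.sqrt(l1 - sigma2);  // variance explained beyond the noise level
        return new double[] {sigma2, scale * v0, scale * v1, mu0, mu1};
    }

    public static void main(String[] args) {
        double[][] data = {{1, 1}, {-1, -1}, {0.5, -0.5}, {-0.5, 0.5}};
        double[] model = fit(data);
        System.out.println("noise variance = " + model[0]);
        System.out.println("loading = (" + model[1] + ", " + model[2] + ")");
    }
}
```

For the four points in main, the covariance has eigenvalues 1.0 and 0.25, so the sketch yields a noise variance of 0.25 and a loading vector along the diagonal with squared length 0.75.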