Class ProbabilisticPCA
java.lang.Object
smile.feature.extraction.Projection
smile.feature.extraction.ProbabilisticPCA
- All Implemented Interfaces:
Serializable, Function<Tuple,Tuple>, Transform
Probabilistic principal component analysis. Probabilistic PCA is
a simplified factor analysis that employs a latent variable model
with linear relationship:
y = W * x + μ + ε
where latent variables x ∼ N(0, I), error (or noise)
ε ∼ N(0, Ψ), and μ is the location
term (mean). In probabilistic PCA, an isotropic noise model is used,
i.e., the noise variances are constrained to be equal
(Ψᵢ = σ² for every i).
A closed-form estimate of the above parameters can be obtained
by the maximum likelihood method.
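For reference, the closed-form maximum likelihood estimates reported by Tipping and Bishop (cited below) can be written as follows. The notation is an assumption added for this sketch: d is the dimension of y, k is the number of latent dimensions, S is the sample covariance matrix with eigenvalues λ_1 ≥ ... ≥ λ_d, U_k holds the k leading eigenvectors, Λ_k is the diagonal matrix of the k leading eigenvalues, and R is an arbitrary k × k orthogonal matrix. Whether this class computes them in exactly this form is not stated on this page.

```latex
\begin{aligned}
\mu_{\mathrm{ML}} &= \frac{1}{n} \sum_{i=1}^{n} y_i, \\
\sigma^{2}_{\mathrm{ML}} &= \frac{1}{d-k} \sum_{j=k+1}^{d} \lambda_j, \\
W_{\mathrm{ML}} &= U_k \left( \Lambda_k - \sigma^{2}_{\mathrm{ML}} I \right)^{1/2} R .
\end{aligned}
```

In words, the noise variance is the average of the eigenvalues that are not retained, and the loading matrix rescales the leading eigenvectors by the variance in excess of that noise level.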
References
- Michael E. Tipping and Christopher M. Bishop. Probabilistic Principal Component Analysis. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 61(3):611-622, 1999.
Field Summary
Fields inherited from class Projection
columns, projection, schema
Constructor Summary
Constructors
- ProbabilisticPCA(double noise, double[] mu, DenseMatrix loading, DenseMatrix projection, String... columns): Constructor.
Method Summary
Modifier and Type, Method, Description
- center(): Returns the center of data.
- static ProbabilisticPCA fit: Fits probabilistic principal component analysis.
- static ProbabilisticPCA fit: Fits probabilistic principal component analysis.
- loadings(): Returns the variable loading matrix, ordered from largest to smallest by corresponding eigenvalues.
- protected double[] postprocess(double[] x): Postprocess the output vector after projection.
- double variance(): Returns the variance of noise.
Methods inherited from class Projection
apply, apply, apply, apply, preprocess
-
Constructor Details
-
ProbabilisticPCA
public ProbabilisticPCA(double noise, double[] mu, DenseMatrix loading, DenseMatrix projection, String... columns)
Constructor.
- Parameters:
noise - the variance of noise.
mu - the mean of samples.
loading - the loading matrix.
projection - the projection matrix. Note that this is not the matrix W in the latent model.
columns - the columns to transform when applied on Tuple/DataFrame.
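To illustrate why a projection matrix can differ from W, recall the standard result for the latent model y = W * x + μ + ε with isotropic noise variance σ²: the posterior mean of the latent variables given an observation is a linear map of the centered observation. The matrix M below is notation introduced for this sketch. That the projection argument holds exactly this map is an assumption, not something this page states.

```latex
\mathbb{E}[x \mid y] = M^{-1} W^{\top} (y - \mu),
\qquad
M = W^{\top} W + \sigma^{2} I .
```

Under that assumption, the matrix applied to the centered data is M^{-1} W^T, a k × d matrix, rather than the d × k loading matrix W itself.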
-
-
Method Details
-
loadings
Returns the variable loading matrix, ordered from largest to smallest by corresponding eigenvalues.
- Returns:
- the variable loading matrix.
-
center
Returns the center of data.
-
variance
public double variance()
Returns the variance of noise.
- Returns:
- the variance of noise.
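As a rough illustration of what the noise variance represents, the sketch below computes the maximum likelihood estimate from Tipping and Bishop's paper: the mean of the eigenvalues of the sample covariance that are not retained. The class name PpcaNoiseSketch and the method noiseVariance are hypothetical names for this example and are not part of the library; the library's own computation may differ in detail.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the library: illustrates the
// maximum likelihood noise variance of probabilistic PCA.
public class PpcaNoiseSketch {

    /**
     * Returns the mean of the (d - k) smallest eigenvalues, where d is the
     * number of eigenvalues and k is the number of retained components.
     */
    public static double noiseVariance(double[] eigenvalues, int k) {
        int d = eigenvalues.length;
        if (k < 0 || k >= d) {
            throw new IllegalArgumentException("k must satisfy 0 <= k < d");
        }
        // Sort a copy in ascending order so the discarded
        // (smallest) eigenvalues come first.
        double[] sorted = eigenvalues.clone();
        Arrays.sort(sorted);
        double sum = 0.0;
        for (int i = 0; i < d - k; i++) {
            sum += sorted[i];
        }
        return sum / (d - k);
    }

    public static void main(String[] args) {
        // Made-up eigenvalues of a 4-dimensional sample covariance.
        double[] lambda = {4.0, 2.0, 1.0, 0.5};
        // Keeping k = 2 components discards 1.0 and 0.5.
        System.out.println(noiseVariance(lambda, 2)); // prints 0.75
    }
}
```

The eigenvalues in main are invented for illustration only; with k = 2 the two smallest values, 1.0 and 0.5, are averaged.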
-
postprocess
protected double[] postprocess(double[] x)
Description copied from class: Projection
Postprocess the output vector after projection.
- Overrides:
postprocess in class Projection
- Parameters:
x - the output vector of projection.
- Returns:
- the postprocessed vector.
-
fit
Fits probabilistic principal component analysis.
- Parameters:
data - the training data, in which each row is a sample.
k - the number of principal components to learn.
columns - the columns to fit PCA. If empty, all columns will be used.
- Returns:
- the model.
-
fit
Fits probabilistic principal component analysis.
- Parameters:
data - the training data, in which each row is a sample.
k - the number of principal components to learn.
columns - the columns to transform when applied on Tuple/DataFrame.
- Returns:
- the model.
-