Package smile.manifold
Class KPCA<T>
java.lang.Object
smile.manifold.KPCA<T>
- Type Parameters:
  T - the data type of model input objects.
- All Implemented Interfaces:
  Serializable, Function<T, double[]>
Kernel principal component analysis. Kernel PCA is an extension of
principal component analysis (PCA) using techniques of kernel methods.
Using a kernel, the originally linear operations of PCA are done in a
reproducing kernel Hilbert space with a non-linear mapping.
In practice, a large data set leads to a large kernel (Gram) matrix K, and storing K may become a problem. One way to deal with this is to perform clustering on the data set and populate the kernel with the means of those clusters. Since even this method may yield a relatively large K, it is common to compute only the top P eigenvalues and eigenvectors of K.
Kernel PCA with an isotropic kernel function is closely related to metric MDS. Carrying out metric MDS on the kernel matrix K produces an equivalent configuration of points as the distances sqrt(2(1 - K(xi, xj))) computed in feature space.
Kernel PCA also has close connections with Isomap, LLE, and Laplacian eigenmaps.
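The feature-space distance mentioned above can be checked directly for any kernel with K(x, x) = 1, such as the Gaussian kernel. A minimal sketch, independent of the Smile API (the class and method names are illustrative):

```java
// Illustrative sketch: for a kernel with K(x, x) = 1 (e.g. the Gaussian
// kernel), the distance between points in feature space reduces to
// sqrt(2 * (1 - K(xi, xj))), which is what metric MDS on K reproduces.
public class KernelDistance {
    // Gaussian (RBF) kernel: K(a, b) = exp(-||a - b||^2 / (2 * sigma^2)).
    public static double gaussian(double[] a, double[] b, double sigma) {
        double d2 = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            d2 += d * d;
        }
        return Math.exp(-d2 / (2 * sigma * sigma));
    }

    // Feature-space distance via the kernel trick:
    // ||phi(a) - phi(b)||^2 = K(a,a) + K(b,b) - 2 K(a,b) = 2 (1 - K(a,b)).
    public static double featureDistance(double[] a, double[] b, double sigma) {
        return Math.sqrt(2 * (1 - gaussian(a, b, sigma)));
    }
}
```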
References
- Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, 1998.
Constructor Summary
- KPCA(T[] data, MercerKernel<T> kernel, double[] mean, double mu, double[][] coordinates, double[] latent, Matrix projection)
  Constructor.
Method Summary
- double[] apply(T x)
  Project a data point to the feature space.
- double[][] apply(T[] x)
  Project a set of data to the feature space.
- double[][] coordinates()
  Returns the nonlinear principal component scores, i.e., the representation of learning data in the nonlinear principal component space.
- static <T> KPCA<T> fit(T[] data, MercerKernel<T> kernel, int k)
  Fits kernel principal component analysis.
- static <T> KPCA<T> fit(T[] data, MercerKernel<T> kernel, int k, double threshold)
  Fits kernel principal component analysis.
- Matrix projection()
  Returns the projection matrix.
- double[] variances()
  Returns the eigenvalues of kernel principal components, ordered from largest to smallest.
-
Constructor Details
-
KPCA
public KPCA(T[] data, MercerKernel<T> kernel, double[] mean, double mu, double[][] coordinates, double[] latent, Matrix projection)
Constructor.
- Parameters:
  data - training data.
  kernel - Mercer kernel.
  mean - the row/column average of the kernel matrix.
  mu - the average of the kernel matrix.
  coordinates - the coordinates of projected training data.
  latent - the eigenvalues of kernel principal components.
  projection - the projection matrix.
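The mean and mu parameters above come from double-centering the kernel matrix, the core preprocessing step of kernel PCA that makes the implicit feature vectors zero-mean. A minimal sketch of that step, independent of the Smile API (the class name is illustrative):

```java
// Illustrative sketch: double-centering of a Gram matrix K, i.e.
// Kc[i][j] = K[i][j] - mean[i] - mean[j] + mu, where "mean" holds the
// row/column averages of K and "mu" its overall average, matching the
// constructor parameters of KPCA.
public class KernelCentering {
    public static double[][] center(double[][] K) {
        int n = K.length;
        double[] mean = new double[n]; // row/column averages of K
        double mu = 0;                 // overall average of K
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) mean[i] += K[i][j];
            mean[i] /= n;
            mu += mean[i];
        }
        mu /= n;
        double[][] Kc = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                Kc[i][j] = K[i][j] - mean[i] - mean[j] + mu;
        return Kc;
    }
}
```

After centering, the eigenvectors of Kc (scaled by the square roots of their eigenvalues) yield the principal component scores.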
-
-
Method Details
-
fit
public static <T> KPCA<T> fit(T[] data, MercerKernel<T> kernel, int k)
Fits kernel principal component analysis.
- Type Parameters:
  T - the data type of samples.
- Parameters:
  data - training data.
  kernel - Mercer kernel.
  k - choose up to k principal components (with eigenvalues larger than 0.0001) used for projection.
- Returns:
  - the model.
-
fit
public static <T> KPCA<T> fit(T[] data, MercerKernel<T> kernel, int k, double threshold)
Fits kernel principal component analysis.
- Type Parameters:
  T - the data type of samples.
- Parameters:
  data - training data.
  kernel - Mercer kernel.
  k - choose top k principal components used for projection.
  threshold - only principal components with eigenvalues larger than the given threshold will be kept.
- Returns:
  - the model.
-
variances
public double[] variances()
Returns the eigenvalues of kernel principal components, ordered from largest to smallest.
- Returns:
  - the eigenvalues of kernel principal components, ordered from largest to smallest.
-
projection
public Matrix projection()
Returns the projection matrix. The dimension reduced data can be obtained by y = W * K(x, ·).
- Returns:
  - the projection matrix.
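The formula y = W * K(x, ·) can be sketched in plain Java. This is an illustrative sketch, not the Smile implementation: the new sample's kernel vector is centered using the training row/column averages (mean) and overall average (mu) before the projection matrix W is applied:

```java
// Illustrative sketch of projecting a new sample: given the projection
// matrix W (p x n), the kernel vector k with k[i] = K(x, data[i]), the
// training row/column averages "mean" and overall average "mu", center
// k and apply W to obtain the p-dimensional embedding y = W * kc.
public class KernelProjection {
    public static double[] project(double[][] W, double[] k, double[] mean, double mu) {
        int n = k.length;
        double kMean = 0; // average of the new sample's kernel vector
        for (double v : k) kMean += v;
        kMean /= n;
        double[] kc = new double[n];
        for (int i = 0; i < n; i++) kc[i] = k[i] - mean[i] - kMean + mu;
        double[] y = new double[W.length];
        for (int p = 0; p < W.length; p++)
            for (int i = 0; i < n; i++)
                y[p] += W[p][i] * kc[i];
        return y;
    }
}
```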
-
coordinates
public double[][] coordinates()
Returns the nonlinear principal component scores, i.e., the representation of learning data in the nonlinear principal component space. Rows correspond to observations, columns to components.
- Returns:
  - the nonlinear principal component scores.
-
apply
public double[] apply(T x)
Project a data point to the feature space.
- Parameters:
  x - a data point.
- Returns:
  - the projection in the feature space.
-
apply
public double[][] apply(T[] x)
Project a set of data to the feature space.
- Parameters:
  x - the data set.
- Returns:
  - the projection in the feature space.
-