smile.regression.GaussianProcessRegression<T>

Type Parameters: T - the data type of model input objects.

All Implemented Interfaces: Serializable, ToDoubleFunction<T>, Regression<T>

public class GaussianProcessRegression<T> extends Object implements Regression<T>

Gaussian Process for Regression. A Gaussian process is a stochastic process whose realizations consist of random values associated with every point in a range of times (or of space) such that each such random variable has a normal distribution. Moreover, every finite collection of those random variables has a multivariate normal distribution.

A Gaussian process can be used as a prior probability distribution over functions in Bayesian inference. Given any set of N points in the desired domain of your functions, take a multivariate Gaussian whose covariance matrix parameter is the Gram matrix of N points with some desired kernel, and sample from that Gaussian. Inference of continuous values with a Gaussian process prior is known as Gaussian process regression.
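The sampling recipe above can be sketched with the Java standard library alone. This is a minimal illustration, not the Smile implementation: the class `GpPriorSketch`, its method names, the scalar inputs, the Gaussian (RBF) kernel, and the small `jitter` added to the diagonal for numerical stability are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class for illustration; not part of the Smile API.
final class GpPriorSketch {
    /** Gaussian (RBF) kernel k(a, b) = exp(-(a - b)^2 / (2 * sigma^2)) on scalars. */
    static double kernel(double a, double b, double sigma) {
        double d = a - b;
        return Math.exp(-d * d / (2 * sigma * sigma));
    }

    /** Gram matrix K[i][j] = k(x[i], x[j]) of the N points in x. */
    static double[][] gram(double[] x, double sigma) {
        int n = x.length;
        double[][] K = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                K[i][j] = kernel(x[i], x[j], sigma);
        return K;
    }

    /** Lower-triangular Cholesky factor L with A = L * L^T. */
    static double[][] cholesky(double[][] A) {
        int n = A.length;
        double[][] L = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double s = A[i][j];
                for (int k = 0; k < j; k++) s -= L[i][k] * L[j][k];
                if (i == j) {
                    if (s <= 0) throw new ArithmeticException("matrix is not positive definite");
                    L[i][i] = Math.sqrt(s);
                } else {
                    L[i][j] = s / L[j][j];
                }
            }
        }
        return L;
    }

    /** Draws one sample f ~ N(0, K + jitter * I) at the points x, using f = L z with z ~ N(0, I). */
    static double[] samplePrior(double[] x, double sigma, double jitter, Random rng) {
        double[][] K = gram(x, sigma);
        for (int i = 0; i < x.length; i++) K[i][i] += jitter;
        double[][] L = cholesky(K);
        double[] z = new double[x.length];
        for (int i = 0; i < z.length; i++) z[i] = rng.nextGaussian();
        double[] f = new double[x.length];
        for (int i = 0; i < f.length; i++)
            for (int j = 0; j <= i; j++) f[i] += L[i][j] * z[j];
        return f;
    }

    public static void main(String[] args) {
        double[] x = {0.0, 0.5, 1.0, 1.5, 2.0};
        System.out.println(Arrays.toString(samplePrior(x, 1.0, 1e-8, new Random(42))));
    }
}
```

Each call to `samplePrior` yields the values of one random function at the chosen points; nearby points receive correlated values because their kernel value is close to 1.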

The fitting is performed in the reproducing kernel Hilbert space with the "kernel trick". The loss function is squared-error. This also arises as the kriging estimate of a Gaussian random field in spatial statistics.
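As a rough sketch of what squared-error fitting with the kernel trick amounts to, the code below solves (K + noise * I) w = y for the weights and predicts with a kernel expansion over the training points. It uses only the Java standard library; `GpMeanSketch`, its methods, the scalar inputs, and the Gaussian kernel are hypothetical choices for illustration and are not the Smile API.

```java
// Hypothetical helper class for illustration; not part of the Smile API.
final class GpMeanSketch {
    /** Gaussian (RBF) kernel on scalars; sigma is the length scale. */
    static double kernel(double a, double b, double sigma) {
        double d = a - b;
        return Math.exp(-d * d / (2 * sigma * sigma));
    }

    /** Solves A w = b for a symmetric positive definite A via Cholesky (A = L * L^T). */
    static double[] solveSpd(double[][] A, double[] b) {
        int n = b.length;
        double[][] L = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double s = A[i][j];
                for (int k = 0; k < j; k++) s -= L[i][k] * L[j][k];
                L[i][j] = (i == j) ? Math.sqrt(s) : s / L[j][j];
            }
        }
        double[] z = new double[n];            // forward substitution: L z = b
        for (int i = 0; i < n; i++) {
            double s = b[i];
            for (int k = 0; k < i; k++) s -= L[i][k] * z[k];
            z[i] = s / L[i][i];
        }
        double[] w = new double[n];            // back substitution: L^T w = z
        for (int i = n - 1; i >= 0; i--) {
            double s = z[i];
            for (int k = i + 1; k < n; k++) s -= L[k][i] * w[k];
            w[i] = s / L[i][i];
        }
        return w;
    }

    /** Returns the weights w solving (K + noise * I) w = y, where noise is a variance. */
    static double[] fit(double[] x, double[] y, double sigma, double noise) {
        int n = x.length;
        double[][] A = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) A[i][j] = kernel(x[i], x[j], sigma);
            A[i][i] += noise;
        }
        return solveSpd(A, y);
    }

    /** Predictive mean f(x0) = sum_i w[i] * k(x0, x[i]). */
    static double predict(double[] x, double[] w, double sigma, double x0) {
        double f = 0;
        for (int i = 0; i < x.length; i++) f += w[i] * kernel(x0, x[i], sigma);
        return f;
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3};
        double[] y = {0.0, 0.8, 0.9, 0.1};
        double[] w = fit(x, y, 1.0, 0.01);
        System.out.println(predict(x, w, 1.0, 1.5));
    }
}
```

The same linear system appears in kernel ridge regression, which is why the two methods give identical mean predictions when the ridge penalty equals the noise variance.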

A significant problem with Gaussian process prediction is that its computational cost typically scales as O(n³) in the number of training samples n. For large problems (e.g. n > 10,000) both storing the Gram matrix and solving the associated linear systems are prohibitive on modern workstations. A wide range of methods has been proposed to deal with this problem. A popular approach is the reduced-rank approximation of the Gram matrix, known as the Nystrom approximation. Subset of Regressors (SR) is another popular approach that uses an active set of m training samples selected from the training set of size n > m. Searching for the optimal subset of size m is assumed to be infeasible because the number of candidate subsets grows combinatorially. The samples in the active set could be selected randomly, but in general we might expect better performance if they are selected greedily with respect to some criterion. More recently, researchers have proposed relaxing the constraint that the inducing variables must be a subset of training/test cases, turning the discrete selection problem into one of continuous optimization.
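The Subset of Regressors idea can be sketched as follows, under the usual formulation in which the m weights solve (K_mn K_nm + noise * K_mm) w = K_mn y and the active set is drawn at random from the training inputs. The class `SubsetOfRegressorsSketch`, its methods, the scalar inputs, and the Gaussian kernel are assumptions made for this example; it does not call or reproduce the Smile implementation.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class for illustration; not part of the Smile API.
final class SubsetOfRegressorsSketch {
    static double kernel(double a, double b, double sigma) {
        double d = a - b;
        return Math.exp(-d * d / (2 * sigma * sigma));
    }

    /** Picks m distinct training inputs uniformly at random (partial Fisher-Yates shuffle). */
    static double[] randomActiveSet(double[] x, int m, Random rng) {
        double[] pool = x.clone();
        for (int i = 0; i < m; i++) {
            int j = i + rng.nextInt(pool.length - i);
            double tmp = pool[i]; pool[i] = pool[j]; pool[j] = tmp;
        }
        return Arrays.copyOf(pool, m);
    }

    /** Solves A w = b by Gaussian elimination with partial pivoting (A and b are overwritten). */
    static double[] solve(double[][] A, double[] b) {
        int n = b.length;
        for (int c = 0; c < n; c++) {
            int p = c;
            for (int r = c + 1; r < n; r++)
                if (Math.abs(A[r][c]) > Math.abs(A[p][c])) p = r;
            double[] rowTmp = A[c]; A[c] = A[p]; A[p] = rowTmp;
            double bTmp = b[c]; b[c] = b[p]; b[p] = bTmp;
            for (int r = c + 1; r < n; r++) {
                double f = A[r][c] / A[c][c];
                b[r] -= f * b[c];
                for (int k = c; k < n; k++) A[r][k] -= f * A[c][k];
            }
        }
        double[] w = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = b[i];
            for (int k = i + 1; k < n; k++) s -= A[i][k] * w[k];
            w[i] = s / A[i][i];
        }
        return w;
    }

    /** SR weights: solves (K_mn K_nm + noise * K_mm) w = K_mn y, an m-by-m system. */
    static double[] fit(double[] x, double[] y, double[] t, double sigma, double noise) {
        int n = x.length, m = t.length;
        double[][] Kmn = new double[m][n];
        for (int a = 0; a < m; a++)
            for (int i = 0; i < n; i++) Kmn[a][i] = kernel(t[a], x[i], sigma);
        double[][] A = new double[m][m];
        double[] rhs = new double[m];
        for (int a = 0; a < m; a++) {
            for (int b = 0; b < m; b++) {
                double s = noise * kernel(t[a], t[b], sigma);
                for (int i = 0; i < n; i++) s += Kmn[a][i] * Kmn[b][i];
                A[a][b] = s;
            }
            for (int i = 0; i < n; i++) rhs[a] += Kmn[a][i] * y[i];
        }
        return solve(A, rhs);
    }

    /** Predictive mean using only the m active points: f(x0) = sum_a w[a] * k(x0, t[a]). */
    static double predict(double[] t, double[] w, double sigma, double x0) {
        double f = 0;
        for (int a = 0; a < t.length; a++) f += w[a] * kernel(x0, t[a], sigma);
        return f;
    }

    public static void main(String[] args) {
        double[] x = {0, 0.5, 1, 1.5, 2, 2.5, 3};
        double[] y = {0.0, 0.5, 0.8, 1.0, 0.9, 0.6, 0.1};
        double[] t = randomActiveSet(x, 3, new Random(7));
        double[] w = fit(x, y, t, 1.0, 0.01);
        System.out.println(predict(t, w, 1.0, 1.25));
    }
}
```

Building the system costs O(m²n) and solving it costs O(m³), compared with O(n³) for the full model. When the active set equals the whole training set, the weights coincide with those of the exact model.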

Experimental evidence suggests that for large m the SR and Nystrom methods have similar performance, but for small m the Nystrom method can be quite poor. The Nystrom method can also produce anomalies such as a negative approximate predictive variance. For these reasons we do not recommend the Nystrom method over the SR method.
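The possibility of a negative variance can be seen from the predictive equations as they are commonly written in the literature (for example in Rasmussen and Williams). The notation below is an assumption of this note, not taken from the Smile source: K_mm is the Gram matrix of the m active points, K_mn = K_nm^T the cross-kernel matrix between active and training points, k_m(x_*) and k_n(x_*) the kernel vectors between a test point and the active or training points, and \sigma_n^2 the noise variance.

```latex
% Subset of Regressors
\bar{f}_{\mathrm{SR}}(x_*) = \mathbf{k}_m(x_*)^\top \left(K_{mn} K_{nm} + \sigma_n^2 K_{mm}\right)^{-1} K_{mn}\,\mathbf{y}
\qquad
\mathbb{V}[f_{\mathrm{SR}}(x_*)] = \sigma_n^2\, \mathbf{k}_m(x_*)^\top \left(K_{mn} K_{nm} + \sigma_n^2 K_{mm}\right)^{-1} \mathbf{k}_m(x_*)

% Nystrom, with \tilde{K} = K_{nm} K_{mm}^{-1} K_{mn}
\bar{f}_{\mathrm{Nys}}(x_*) = \mathbf{k}_n(x_*)^\top \left(\tilde{K} + \sigma_n^2 I\right)^{-1} \mathbf{y}
\qquad
\mathbb{V}[f_{\mathrm{Nys}}(x_*)] = k(x_*, x_*) - \mathbf{k}_n(x_*)^\top \left(\tilde{K} + \sigma_n^2 I\right)^{-1} \mathbf{k}_n(x_*)
```

The SR variance is a quadratic form in a positive definite matrix and therefore cannot be negative. The Nystrom variance mixes the exact kernel values k(x_*, x_*) and k_n(x_*) with the approximate matrix \tilde{K}, so the subtracted term can exceed k(x_*, x_*).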

References

Carl Edward Rasmussen and Chris Williams. Gaussian Processes for Machine Learning, 2006.
Joaquin Quinonero-Candela, Carl Edward Rasmussen, Christopher K. I. Williams. Approximation Methods for Gaussian Process Regression. 2007.
T. Poggio and F. Girosi. Networks for approximation and learning. Proc. IEEE 78(9):1484-1487, 1990.
Kai Zhang and James T. Kwok. Clustered Nystrom Method for Large Scale Manifold Learning and Dimension Reduction. IEEE Transactions on Neural Networks, 2010.

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

class

GaussianProcessRegression.JointPrediction

The joint prediction of multiple data points.

static final record

GaussianProcessRegression.Options

Gaussian process regression hyperparameters.

Nested classes/interfaces inherited from interface Regression
Regression.Trainer<T,M>
Field Summary

Fields

Modifier and Type

Field

Description

final MercerKernel<T>

kernel

The covariance/kernel function.

final double

L

The log marginal likelihood, which may not be available (NaN) when the model is fit with approximate methods.

final double

mean

The mean of the response variable.

final double

noise

The variance of noise.

final T[]

regressors

The regressors.

final double

sd

The standard deviation of the response variable.

final Vector

w

The linear weights.
Constructor Summary

Constructors

Constructor

Description

GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, Vector weight, double noise)

Constructor.

GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, Vector weight, double noise, double mean, double sd)

Constructor.

GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, Vector weight, double noise, double mean, double sd, Cholesky cholesky, double L)

Constructor.
Method Summary

Modifier and Type

Method

Description

static GaussianProcessRegression<double[]>

fit(double[][] x, double[] y, Properties params)

Fits a regular Gaussian process model.

static <T> GaussianProcessRegression<T>

fit(T[] x, double[] y, MercerKernel<T> kernel, GaussianProcessRegression.Options options)

Fits a regular Gaussian process model.

static <T> GaussianProcessRegression<T>

fit(T[] x, double[] y, T[] t, MercerKernel<T> kernel, GaussianProcessRegression.Options options)

Fits an approximate Gaussian process model by the method of subset of regressors.

static <T> GaussianProcessRegression<T>

nystrom(T[] x, double[] y, T[] t, MercerKernel<T> kernel, GaussianProcessRegression.Options options)

Fits an approximate Gaussian process model with the Nystrom approximation of the kernel matrix.

double

predict(T x)

Predicts the dependent variable of an instance.

double

predict(T x, double[] estimation)

Predicts the mean and standard deviation of an instance.

GaussianProcessRegression<T>.JointPrediction

query(T[] samples)

Evaluates the Gaussian Process at some query points.

String

toString()

Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface Regression
applyAsDouble, online, predict, predict, predict, update, update, update

Field Details
- kernel
  
  public final MercerKernel<T> kernel
  
  The covariance/kernel function.
- regressors
  
  public final T[] regressors
  
  The regressors.
- w
  
  public final Vector w
  
  The linear weights.
- mean
  
  public final double mean
  
  The mean of the response variable.
- sd
  
  public final double sd
  
  The standard deviation of the response variable.
- noise
  
  public final double noise
  
  The variance of noise.
- L
  
  public final double L
  
  The log marginal likelihood, which may not be available (NaN) when the model is fit with approximate methods.
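One plausible reading of the mean and sd fields is that the response is standardized before fitting and predictions are mapped back to the original scale afterwards. The sketch below illustrates that reading only; whether the library applies exactly this transformation, and whether it uses the sample standard deviation (dividing by n - 1) as assumed here, is not stated on this page. `ResponseScalingSketch` and its methods are hypothetical names.

```java
// Hypothetical helper class for illustration; not part of the Smile API.
final class ResponseScalingSketch {
    /** Arithmetic mean of the response values. */
    static double mean(double[] y) {
        double s = 0;
        for (double v : y) s += v;
        return s / y.length;
    }

    /** Sample standard deviation (divides by n - 1); an assumption of this sketch. */
    static double sd(double[] y) {
        double mu = mean(y);
        double s = 0;
        for (double v : y) s += (v - mu) * (v - mu);
        return Math.sqrt(s / (y.length - 1));
    }

    /** Maps each response to (y - mean) / sd before fitting. */
    static double[] standardize(double[] y, double mean, double sd) {
        double[] z = new double[y.length];
        for (int i = 0; i < y.length; i++) z[i] = (y[i] - mean) / sd;
        return z;
    }

    /** Maps a prediction made on the standardized scale back to the original scale. */
    static double restore(double z, double mean, double sd) {
        return z * sd + mean;
    }

    public static void main(String[] args) {
        double[] y = {1.0, 2.0, 3.0};
        double mu = mean(y), s = sd(y);
        System.out.println(java.util.Arrays.toString(standardize(y, mu, s)));
        System.out.println(restore(0.5, mu, s));
    }
}
```

Under this reading, a model fit without standardization corresponds to mean = 0 and sd = 1.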
Constructor Details
- GaussianProcessRegression
  
  public GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, Vector weight, double noise)
  
  Constructor.
  
  Parameters:
  
  kernel - Kernel function.
  
  regressors - The regressors.
  
  weight - The weights of regressors.
  
  noise - The variance of noise.
- GaussianProcessRegression
  
  public GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, Vector weight, double noise, double mean, double sd)
  
  Constructor.
  
  Parameters:
  
  kernel - Kernel function.
  
  regressors - The regressors.
  
  weight - The weights of regressors.
  
  noise - The variance of noise.
  
  mean - The mean of the response variable.
  
  sd - The standard deviation of the response variable.
- GaussianProcessRegression
  
  public GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, Vector weight, double noise, double mean, double sd, Cholesky cholesky, double L)
  
  Constructor.
  
  Parameters:
  
  kernel - Kernel function.
  
  regressors - The regressors.
  
  weight - The weights of regressors.
  
  noise - The variance of noise.
  
  mean - The mean of the response variable.
  
  sd - The standard deviation of the response variable.
  
  cholesky - The Cholesky decomposition of kernel matrix.
  
  L - The log marginal likelihood.
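For reference, the log marginal likelihood of an exact Gaussian process model with a zero-mean prior is usually written as below. The symbols are assumptions of this note: K is the n-by-n Gram matrix of the training inputs, \sigma_n^2 the noise variance, and L_c the Cholesky factor of K + \sigma_n^2 I. Whether the library evaluates it on the raw or the standardized response is not stated on this page.

```latex
\log p(\mathbf{y} \mid X)
  = -\tfrac{1}{2}\, \mathbf{y}^\top \left(K + \sigma_n^2 I\right)^{-1} \mathbf{y}
    - \tfrac{1}{2} \log \left| K + \sigma_n^2 I \right|
    - \tfrac{n}{2} \log 2\pi,
\qquad
\log \left| K + \sigma_n^2 I \right| = 2 \sum_{i=1}^{n} \log (L_c)_{ii}
```

The second identity explains why the constructor takes the Cholesky decomposition alongside L: the determinant term comes directly from the diagonal of the factor.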
Method Details
- predict
  
  public double predict(T x)
  
  Description copied from interface: Regression
  
  Predicts the dependent variable of an instance.
  
  Specified by:
  
  predict in interface Regression<T>
  
  Parameters:
  
  x - an instance.
  
  Returns:
  
  the predicted value of dependent variable.
- predict
  
  public double predict(T x, double[] estimation)
  
  Predicts the mean and standard deviation of an instance.
  
  Parameters:
  
  x - an instance.
  
  estimation - an output array of the estimated mean and standard deviation.
  
  Returns:
  
  the estimated mean value.
- query
  
  public GaussianProcessRegression<T>.JointPrediction query(T[] samples)
  
  Evaluates the Gaussian Process at some query points.
  
  Parameters:
  
  samples - query points.
  
  Returns:
  
  The mean, standard deviation and covariances of the Gaussian process at the query points.
- toString
  
  public String toString()
  
  Overrides:
  
  toString in class Object
- fit
  
  public static GaussianProcessRegression<double[]> fit(double[][] x, double[] y, Properties params)
  
  Fits a regular Gaussian process model.
  
  Parameters:
  
  x - the training dataset.
  
  y - the response variable.
  
  params - the hyperparameters.
  
  Returns:
  
  the model.
- fit
  
  public static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, MercerKernel<T> kernel, GaussianProcessRegression.Options options)
  
  Fits a regular Gaussian process model.
  
  Type Parameters:
  
  T - the data type of samples.
  
  Parameters:
  
  x - the training dataset.
  
  y - the response variable.
  
  kernel - the Mercer kernel.
  
  options - the hyperparameters.
  
  Returns:
  
  the model.
- fit
  
  public static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, T[] t, MercerKernel<T> kernel, GaussianProcessRegression.Options options)
  
  Fits an approximate Gaussian process model by the method of subset of regressors.
  
  Type Parameters:
  
  T - the data type of samples.
  
  Parameters:
  
  x - the training dataset.
  
  y - the response variable.
  
  t - the inducing inputs, i.e. pre-selected samples that act as the active set of regressors. In the simplest case, these can be chosen randomly from the training set or taken as the centers of k-means clustering.
  
  kernel - the Mercer kernel.
  
  options - the hyperparameters.
  
  Returns:
  
  the model.
- nystrom
  
  public static <T> GaussianProcessRegression<T> nystrom(T[] x, double[] y, T[] t, MercerKernel<T> kernel, GaussianProcessRegression.Options options)
  
  Fits an approximate Gaussian process model with the Nystrom approximation of the kernel matrix.
  
  Type Parameters:
  
  T - the data type of samples.
  
  Parameters:
  
  x - the training dataset.
  
  y - the response variable.
  
  t - the inducing inputs, which are pre-selected for the Nystrom approximation.
  
  kernel - the Mercer kernel.
  
  options - the hyperparameters.
  
  Returns:
  
  the model.
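The two-argument predict(T x, double[] estimation) described above returns the mean and also fills an output array. The sketch below mirrors that calling pattern with standard-library Java for an exact model. It assumes that index 0 holds the mean and index 1 the standard deviation, and that the variance is k(x0, x0) - k^T (K + noise * I)^{-1} k without added observation noise; neither detail is confirmed by this page. `GpVarianceSketch`, the scalar inputs, and the Gaussian kernel are hypothetical.

```java
// Hypothetical helper class for illustration; not part of the Smile API.
final class GpVarianceSketch {
    final double[] x;      // training inputs
    final double[] w;      // weights solving (K + noise * I) w = y
    final double[][] L;    // Cholesky factor of K + noise * I
    final double sigma;    // kernel length scale

    GpVarianceSketch(double[] x, double[] y, double sigma, double noise) {
        int n = x.length;
        this.x = x;
        this.sigma = sigma;
        L = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double s = kernel(x[i], x[j]) + (i == j ? noise : 0.0);
                for (int k = 0; k < j; k++) s -= L[i][k] * L[j][k];
                L[i][j] = (i == j) ? Math.sqrt(s) : s / L[j][j];
            }
        }
        w = solve(y);
    }

    double kernel(double a, double b) {
        double d = a - b;
        return Math.exp(-d * d / (2 * sigma * sigma));
    }

    /** Solves (L * L^T) v = b by forward and back substitution. */
    double[] solve(double[] b) {
        int n = b.length;
        double[] z = new double[n];
        for (int i = 0; i < n; i++) {
            double s = b[i];
            for (int k = 0; k < i; k++) s -= L[i][k] * z[k];
            z[i] = s / L[i][i];
        }
        double[] v = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = z[i];
            for (int k = i + 1; k < n; k++) s -= L[k][i] * v[k];
            v[i] = s / L[i][i];
        }
        return v;
    }

    /** Writes {mean, standard deviation} into estimation and returns the mean. */
    double predict(double x0, double[] estimation) {
        int n = x.length;
        double[] k = new double[n];
        double mu = 0;
        for (int i = 0; i < n; i++) {
            k[i] = kernel(x0, x[i]);
            mu += w[i] * k[i];
        }
        double[] v = solve(k);
        double var = kernel(x0, x0);
        for (int i = 0; i < n; i++) var -= k[i] * v[i];
        estimation[0] = mu;
        estimation[1] = Math.sqrt(Math.max(var, 0.0)); // clamp tiny negative round-off
        return mu;
    }

    public static void main(String[] args) {
        GpVarianceSketch gp = new GpVarianceSketch(
                new double[] {0, 1, 2}, new double[] {0, 1, 4}, 1.0, 0.01);
        double[] est = new double[2];
        gp.predict(1.5, est);
        System.out.println("mean = " + est[0] + ", sd = " + est[1]);
    }
}
```

The standard deviation shrinks toward zero near training points when the noise is small and returns to the prior value far away from the data, which is the behavior a caller would typically use to build confidence bands.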
