Class FLD

java.lang.Object
smile.classification.AbstractClassifier<double[]>
smile.classification.FLD
All Implemented Interfaces:
Serializable, ToDoubleFunction<double[]>, ToIntFunction<double[]>, Classifier<double[]>

public class FLD extends AbstractClassifier<double[]>
Fisher's linear discriminant. Fisher defined the separation between two distributions to be the ratio of the variance between the classes to the variance within the classes, which is, in some sense, a measure of the signal-to-noise ratio for the class labeling. FLD finds a linear combination of features which maximizes the separation after the projection. The resulting combination may be used for dimensionality reduction before later classification.

The terms Fisher's linear discriminant and LDA are often used interchangeably, although FLD actually describes a slightly different discriminant, which does not make some of the assumptions of LDA such as normally distributed classes or equal class covariances. When the assumptions of LDA are satisfied, FLD is equivalent to LDA.

FLD is also closely related to principal component analysis (PCA), which likewise looks for linear combinations of variables that best explain the data. As a supervised method, FLD explicitly attempts to model the difference between the classes of data. PCA, on the other hand, is an unsupervised method and does not take class labels into account.

One complication in applying FLD (and LDA) to real data occurs when the number of variables/features exceeds the number of samples. In this case, the covariance estimates do not have full rank and so cannot be inverted. This is known as the small sample size problem.
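
As a minimal usage sketch (the toy data below is made up for illustration only, and the example assumes the smile-core artifact is on the classpath), a model can be fitted with the static fit method and then used to classify new samples:

  import smile.classification.FLD;

  public class FLDExample {
      public static void main(String[] args) {
          // Two made-up, well-separated classes in two dimensions (illustration only).
          double[][] x = {
              {0.0, 0.1}, {0.2, 0.0}, {0.1, 0.2}, {0.3, 0.1}, {0.2, 0.3},   // class 0
              {5.0, 5.1}, {5.2, 5.0}, {5.1, 5.2}, {5.3, 5.1}, {5.2, 5.3}    // class 1
          };
          int[] y = {0, 0, 0, 0, 0, 1, 1, 1, 1, 1};

          // Fit Fisher's linear discriminant and classify a new sample.
          FLD model = FLD.fit(x, y);
          int label = model.predict(new double[] {4.9, 5.0});
          System.out.println("predicted class: " + label);
      }
  }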

References

  1. H. Li, K. Zhang, and T. Jiang. Robust and Accurate Cancer Classification with Gene Expression Profiling. CSB'05, pp. 310-321.
  • Constructor Details

    • FLD

      public FLD(double[] mean, double[][] mu, Matrix scaling)
      Constructor.
      Parameters:
      mean - the mean vector of all samples.
mu - the mean vectors of each class minus the overall mean.
      scaling - the projection matrix.
    • FLD

      public FLD(double[] mean, double[][] mu, Matrix scaling, IntSet labels)
      Constructor.
      Parameters:
      mean - the mean vector of all samples.
mu - the mean vectors of each class minus the overall mean.
      scaling - the projection matrix.
      labels - the class label encoder.
  • Method Details

    • fit

      public static FLD fit(double[][] x, int[] y)
      Fits Fisher's linear discriminant.
      Parameters:
      x - training samples.
      y - training labels.
      Returns:
the model.
    • fit

      public static FLD fit(double[][] x, int[] y, Properties params)
      Fits Fisher's linear discriminant.
      Parameters:
      x - training samples.
      y - training labels.
      params - the hyper-parameters.
      Returns:
the model.
    • fit

      public static FLD fit(double[][] x, int[] y, int L, double tol)
      Fits Fisher's linear discriminant.
      Parameters:
      x - training samples.
      y - training labels.
      L - the dimensionality of mapped space.
tol - a tolerance to decide if a covariance matrix is singular; it will reject variables whose variance is less than tol².
      Returns:
the model.
    • predict

      public int predict(double[] x)
      Description copied from interface: Classifier
      Predicts the class label of an instance.
      Parameters:
      x - the instance to be classified.
      Returns:
      the predicted class label.
    • project

      public double[] project(double[] x)
      Projects a sample to the feature space.
      Parameters:
      x - a sample
      Returns:
      the feature vector.
    • project

      public double[][] project(double[][] x)
      Projects samples to the feature space.
      Parameters:
      x - samples
      Returns:
      the feature vectors.
    • getProjection

      public Matrix getProjection()
      Returns the projection matrix W. The dimension reduced data can be obtained by y = W' * x.
      Returns:
      the projection matrix.
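
When FLD is used purely for dimensionality reduction, the project and getProjection methods above can be combined as in the following sketch (made-up data again, and assuming Matrix refers to smile.math.matrix.Matrix, as in the constructors above):

  import java.util.Arrays;
  import smile.classification.FLD;
  import smile.math.matrix.Matrix;

  public class FLDProjectionExample {
      public static void main(String[] args) {
          // The same style of made-up two-class data as in the earlier sketch.
          double[][] x = {
              {0.0, 0.1}, {0.2, 0.0}, {0.1, 0.2}, {0.3, 0.1}, {0.2, 0.3},
              {5.0, 5.1}, {5.2, 5.0}, {5.1, 5.2}, {5.3, 5.1}, {5.2, 5.3}
          };
          int[] y = {0, 0, 0, 0, 0, 1, 1, 1, 1, 1};
          FLD model = FLD.fit(x, y);

          // Project a single sample into the discriminant space; with k classes
          // the mapped space has at most k - 1 dimensions (here, 1).
          double[] z = model.project(new double[] {4.9, 5.0});
          System.out.println(Arrays.toString(z));

          // Project a batch of samples at once.
          double[][] zs = model.project(x);
          System.out.println(zs.length + " projected samples of dimension " + zs[0].length);

          // The projection matrix W, usable outside the model as y = W' * x.
          Matrix W = model.getProjection();
      }
  }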