Class KSTest

java.lang.Object
smile.stat.hypothesis.KSTest

public class KSTest extends Object
The Kolmogorov-Smirnov test (K-S test) is a form of minimum distance estimation used as a non-parametric test of equality of one-dimensional probability distributions. The K-S test compares a sample with a reference probability distribution (one-sample K-S test) or compares two samples (two-sample K-S test). The Kolmogorov-Smirnov statistic is the largest absolute difference between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples. The null distribution of this statistic is calculated under the null hypothesis that the samples are drawn from the same distribution (in the two-sample case) or that the sample is drawn from the reference distribution (in the one-sample case). In each case, the distributions considered under the null hypothesis are continuous but otherwise unrestricted.
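
As a rough illustration of the one-sample statistic described above, the distance can be computed directly from a sorted copy of the sample. This is a minimal sketch, not the library's implementation; the class name OneSampleKsSketch, the method name statistic, and the use of a DoubleUnaryOperator to represent the reference CDF are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper for illustration only; not part of smile.
final class OneSampleKsSketch {
    // Largest absolute gap between the empirical CDF of the sample
    // and the reference CDF, checked on both sides of each jump.
    static double statistic(double[] sample, DoubleUnaryOperator cdf) {
        double[] x = sample.clone();   // work on a copy, leave the input untouched
        Arrays.sort(x);
        int n = x.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(x[i]);
            double below = (double) i / n;        // ECDF just before x[i]
            double above = (double) (i + 1) / n;  // ECDF at x[i]
            d = Math.max(d, Math.max(Math.abs(f - below), Math.abs(above - f)));
        }
        return d;
    }
}
```

The empirical distribution function is a step function, so the gap is evaluated on both sides of every step; checking only one side can understate the distance.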

The two-sample KS test is one of the most useful and general non-parametric methods for comparing two samples, as it is sensitive to differences in both location and shape of the empirical cumulative distribution functions of the two samples.
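
The sensitivity to both location and shape can be seen with a small sketch of the two-sample statistic. The code below is an illustrative assumption, not the library's implementation; TwoSampleKsSketch and statistic are hypothetical names, and ties between the samples are handled by advancing both empirical distribution functions past the tied value.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of smile.
final class TwoSampleKsSketch {
    // Largest absolute gap between the two empirical CDFs.
    static double statistic(double[] x, double[] y) {
        double[] a = x.clone();   // copies, so the inputs keep their order
        double[] b = y.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        int i = 0, j = 0;
        double d = 0.0;
        while (i < a.length && j < b.length) {
            double v = Math.min(a[i], b[j]);
            // Step both ECDFs past every observation equal to v.
            while (i < a.length && a[i] <= v) i++;
            while (j < b.length && b[j] <= v) j++;
            double fa = (double) i / a.length;
            double fb = (double) j / b.length;
            d = Math.max(d, Math.abs(fa - fb));
        }
        return d;
    }
}
```

With this sketch, two samples that differ only by a large shift give a statistic of 1, while two samples with the same mean but different spread, such as {4, 5, 6, 7} and {1, 2, 9, 10}, still give a nonzero statistic (0.5 here).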

The Kolmogorov-Smirnov test can be modified to serve as a goodness-of-fit test. In the special case of testing for normality, the sample is standardized and compared with a standard normal distribution. This is equivalent to setting the mean and variance of the reference distribution equal to the sample estimates, and it is known that using the sample to define the null hypothesis in this way reduces the power of a test. Correcting for this bias leads to the Lilliefors test. However, even the Lilliefors modification is less powerful than the Shapiro-Wilk test or the Anderson-Darling test for testing normality.
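
The standardization step mentioned above can be sketched as follows. This is a minimal example under stated assumptions: StandardizeSketch and standardize are hypothetical names, and the sample standard deviation is computed with the n - 1 divisor, which is a choice made for the example and not something this page specifies.

```java
// Hypothetical helper for illustration only; not part of smile.
final class StandardizeSketch {
    // Returns (x[i] - mean) / sd for each value, leaving x unchanged.
    // Assumes at least two values that are not all identical.
    static double[] standardize(double[] x) {
        int n = x.length;
        double mean = 0.0;
        for (double v : x) mean += v;
        mean /= n;

        double ss = 0.0;
        for (double v : x) ss += (v - mean) * (v - mean);
        double sd = Math.sqrt(ss / (n - 1));   // sample standard deviation (assumed n - 1 divisor)

        double[] z = new double[n];
        for (int i = 0; i < n; i++) z[i] = (x[i] - mean) / sd;
        return z;
    }
}
```

The standardized values could then be passed to the one-sample test with a standard normal reference distribution. Because the mean and variance were estimated from the same sample, the resulting p-value carries the bias described above.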

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    final double
    d
    Kolmogorov-Smirnov statistic.
    final String
    method
    The type of test.
    final double
    pvalue
    P-value.
  • Constructor Summary

    Constructors
    Constructor
    Description
    KSTest(String method, double d, double pvalue)
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    static KSTest
    test(double[] x, double[] y)
    The two-sample KS test for the null hypothesis that the data sets are drawn from the same distribution.
    static KSTest
    test(double[] x, Distribution dist)
    The one-sample KS test for the null hypothesis that the data set x is drawn from the given distribution.
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Field Details

    • method

      public final String method
      The type of test.
    • d

      public final double d
      Kolmogorov-Smirnov statistic.
    • pvalue

      public final double pvalue
      P-value.
  • Constructor Details

    • KSTest

      public KSTest(String method, double d, double pvalue)
      Constructor.
      Parameters:
      method - the type of test.
      d - the Kolmogorov-Smirnov statistic.
      pvalue - the p-value.
  • Method Details

    • toString

      public String toString()
      Overrides:
      toString in class Object
    • test

      public static KSTest test(double[] x, Distribution dist)
      The one-sample KS test for the null hypothesis that the data set x is drawn from the given distribution. A small p-value indicates that the empirical cumulative distribution function of x differs significantly from that of the given distribution. The array x is sorted in place into ascending order, so pass a copy if the original order must be preserved.
      Parameters:
      x - the sample values.
      dist - the distribution.
      Returns:
      the test results.
    • test

      public static KSTest test(double[] x, double[] y)
      The two-sample KS test for the null hypothesis that the data sets are drawn from the same distribution. A small p-value indicates that the empirical cumulative distribution function of x differs significantly from that of y. The arrays x and y are both sorted in place into ascending order, so pass copies if the original order must be preserved.
      Parameters:
      x - the first sample.
      y - the second sample.
      Returns:
      the test results.
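
For intuition about how a p-value relates to the statistic d, the sketch below evaluates the classical asymptotic Kolmogorov distribution, Q(lambda) = 2 * sum over j >= 1 of (-1)^(j-1) * exp(-2 j^2 lambda^2). This is an assumption-laden illustration, not a description of what this class computes: the names KsPValueSketch, qks, oneSample, and twoSample are hypothetical, and an implementation may apply finite-sample corrections that are omitted here.

```java
// Hypothetical helper for illustration only; not part of smile.
final class KsPValueSketch {
    // Asymptotic survival function of the Kolmogorov distribution.
    static double qks(double lambda) {
        // For small lambda the series converges slowly and the true
        // value is 1 to many decimal places, so return 1 directly.
        if (lambda < 0.2) return 1.0;
        double sum = 0.0;
        double sign = 1.0;
        for (int j = 1; j <= 100; j++) {
            double term = Math.exp(-2.0 * j * j * lambda * lambda);
            sum += sign * term;
            if (term < 1e-12) break;   // remaining terms are negligible
            sign = -sign;
        }
        return Math.min(1.0, Math.max(0.0, 2.0 * sum));
    }

    // One-sample case: lambda = sqrt(n) * d (plain asymptotic form).
    static double oneSample(double d, int n) {
        return qks(Math.sqrt(n) * d);
    }

    // Two-sample case: effective size n1 * n2 / (n1 + n2).
    static double twoSample(double d, int n1, int n2) {
        double ne = (double) n1 * n2 / (n1 + n2);
        return qks(Math.sqrt(ne) * d);
    }
}
```

The asymptotic form is only a rough guide for small samples, which is one reason a library may use a corrected or exact formula and report different values.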