Class EmpiricalDistribution

java.lang.Object
smile.stat.distribution.DiscreteDistribution
smile.stat.distribution.EmpiricalDistribution
All Implemented Interfaces:
Serializable, Distribution

public class EmpiricalDistribution extends DiscreteDistribution
The empirical distribution function, or empirical cdf, is the cumulative distribution function that places probability mass 1/n on each of the n observations in a sample. As n grows, the empirical distribution converges to the true distribution. The empirical distribution is an important estimator in statistics; in particular, the bootstrap method relies heavily on it.
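The "mass 1/n per observation" idea above can be sketched in plain Java. This is only an illustration, not the library's implementation: the class `EmpiricalPmfSketch` and its method `pmf` are hypothetical names, and the sketch does not call any Smile API.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of smile.stat.distribution.
final class EmpiricalPmfSketch {
    /** Maps each distinct value to count / n, so every observation carries mass 1/n. */
    static Map<Integer, Double> pmf(int[] sample) {
        if (sample.length == 0) {
            throw new IllegalArgumentException("empty sample");
        }
        Map<Integer, Double> mass = new TreeMap<>();
        double w = 1.0 / sample.length;
        for (int v : sample) {
            mass.merge(v, w, Double::sum); // repeated values accumulate mass
        }
        return mass;
    }

    public static void main(String[] args) {
        // The value 3 occurs twice among four observations, so it gets mass 2/4.
        System.out.println(pmf(new int[] {3, 1, 3, 2})); // prints {1=0.25, 2=0.25, 3=0.5}
    }
}
```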
  • Field Details

    • p

      public final double[] p
      The probability of each value x.
  • Constructor Details

    • EmpiricalDistribution

      public EmpiricalDistribution(double[] prob)
      Constructor.
      Parameters:
      prob - the probabilities.
    • EmpiricalDistribution

      public EmpiricalDistribution(double[] prob, IntSet x)
      Constructor.
      Parameters:
      prob - the probabilities.
      x - the values of random variable.
  • Method Details

    • fit

      public static EmpiricalDistribution fit(int[] data)
      Estimates the distribution.
      Parameters:
      data - the training data.
      Returns:
      the distribution.
    • fit

      public static EmpiricalDistribution fit(int[] data, IntSet x)
      Estimates the distribution. Sometimes, the data may not contain all possible values. In this case, the user should provide the value set.
      Parameters:
      data - the training data.
      x - the value set.
      Returns:
      the distribution.
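To see why a value set matters, here is a minimal plain-Java sketch of estimating probabilities against a known set of possible values. It is an assumption-laden illustration rather than the library's code: a plain `int[] values` stands in for `IntSet`, and `FitWithValueSetSketch.fit` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical helper; a plain int[] stands in for the IntSet value set.
final class FitWithValueSetSketch {
    /** Returns relative frequencies aligned with the sorted value set; unseen values get 0. */
    static double[] fit(int[] data, int[] values) {
        if (data.length == 0) {
            throw new IllegalArgumentException("empty data");
        }
        int[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] prob = new double[sorted.length];
        for (int v : data) {
            int i = Arrays.binarySearch(sorted, v);
            if (i < 0) {
                throw new IllegalArgumentException("value not in set: " + v);
            }
            prob[i] += 1.0; // count occurrences first
        }
        for (int i = 0; i < prob.length; i++) {
            prob[i] /= data.length; // then normalize to frequencies
        }
        return prob;
    }

    public static void main(String[] args) {
        // The value 2 never occurs in the data but still gets an entry (probability 0).
        System.out.println(Arrays.toString(fit(new int[] {1, 1, 3}, new int[] {1, 2, 3})));
    }
}
```

Without the value set, the unobserved value 2 would be absent from the estimated distribution altogether.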
    • length

      public int length()
      Description copied from interface: Distribution
      Returns the number of parameters of the distribution. The "length" is in the sense of the minimum description length principle.
      Returns:
      The number of parameters.
    • mean

      public double mean()
      Description copied from interface: Distribution
      Returns the mean of the distribution.
      Returns:
      The mean.
    • variance

      public double variance()
      Description copied from interface: Distribution
      Returns the variance of the distribution.
      Returns:
      The variance.
    • sd

      public double sd()
      Description copied from interface: Distribution
      Returns the standard deviation of the distribution.
      Returns:
      The standard deviation.
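For a discrete distribution given as a probability array, the mean, variance, and standard deviation follow directly from the definitions. The sketch below assumes that `prob[i]` is the probability of the value `i` (values 0..n-1) and uses the population variance; both are assumptions of this example, not statements about the library. `MomentsSketch` is a hypothetical name.

```java
// Hypothetical helper; assumes prob[i] is the probability of the value i.
final class MomentsSketch {
    /** E[X] = sum over i of i * prob[i]. */
    static double mean(double[] prob) {
        double m = 0.0;
        for (int i = 0; i < prob.length; i++) {
            m += i * prob[i];
        }
        return m;
    }

    /** Var[X] = sum over i of (i - E[X])^2 * prob[i]. */
    static double variance(double[] prob) {
        double m = mean(prob);
        double v = 0.0;
        for (int i = 0; i < prob.length; i++) {
            double d = i - m;
            v += d * d * prob[i];
        }
        return v;
    }

    /** Standard deviation is the square root of the variance. */
    static double sd(double[] prob) {
        return Math.sqrt(variance(prob));
    }

    public static void main(String[] args) {
        double[] prob = {0.25, 0.5, 0.25};
        System.out.println(mean(prob) + " " + variance(prob) + " " + sd(prob));
    }
}
```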
    • entropy

      public double entropy()
      Description copied from interface: Distribution
      Returns the Shannon entropy of the distribution.
      Returns:
      Shannon entropy.
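Shannon entropy of a probability array can be sketched as follows. The logarithm base is not stated in this page, so the example assumes base 2 (entropy in bits); `EntropySketch` is a hypothetical name and the code does not call the library.

```java
// Hypothetical helper; the base-2 logarithm is an assumption of this sketch.
final class EntropySketch {
    /** H = -sum over i of prob[i] * log2(prob[i]), with 0 * log(0) taken as 0. */
    static double entropy(double[] prob) {
        double h = 0.0;
        for (double pi : prob) {
            if (pi > 0.0) { // skip zero-probability values to avoid log(0)
                h -= pi * (Math.log(pi) / Math.log(2.0));
            }
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(entropy(new double[] {0.5, 0.5}));
    }
}
```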
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • rand

      public double rand()
      Description copied from interface: Distribution
      Generates a random number following this distribution.
      Returns:
      a random number.
    • randi

      public int[] randi(int n)
      Description copied from class: DiscreteDistribution
      Generates a set of integer random numbers following this discrete distribution.
      Overrides:
      randi in class DiscreteDistribution
      Parameters:
      n - the number of random numbers to generate.
      Returns:
      an array of integer random numbers.
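One simple way to draw integers from a discrete distribution is inverse-transform sampling over the cumulative probabilities. The sketch below shows that technique only; it is not necessarily the algorithm the library uses. It assumes `prob[i]` is the probability of the value `i`, and `InverseTransformSamplerSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper; inverse-transform sampling, assuming prob[i] is P(X = i).
final class InverseTransformSamplerSketch {
    static int[] sample(double[] prob, int n, Random rng) {
        // Running sums: cum[i] = prob[0] + ... + prob[i].
        double[] cum = new double[prob.length];
        double total = 0.0;
        for (int i = 0; i < prob.length; i++) {
            total += prob[i];
            cum[i] = total;
        }
        int[] out = new int[n];
        for (int j = 0; j < n; j++) {
            // Scale by the total so small rounding errors in prob are tolerated.
            double u = rng.nextDouble() * total;
            int k = 0;
            while (k < cum.length - 1 && u >= cum[k]) {
                k++; // first index whose cumulative probability exceeds u
            }
            out[j] = k;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] draws = sample(new double[] {0.2, 0.5, 0.3}, 10, new Random(42L));
        System.out.println(Arrays.toString(draws));
    }
}
```

The linear scan costs O(n) per draw in the number of values; a binary search over `cum` would reduce that for large supports.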
    • p

      public double p(int k)
      Description copied from class: DiscreteDistribution
      The probability mass function.
      Specified by:
      p in class DiscreteDistribution
      Parameters:
      k - an integer value.
      Returns:
      the probability.
    • logp

      public double logp(int k)
      Description copied from class: DiscreteDistribution
      The probability mass function in log scale.
      Specified by:
      logp in class DiscreteDistribution
      Parameters:
      k - an integer value.
      Returns:
      the log probability.
    • cdf

      public double cdf(double k)
      Description copied from interface: Distribution
      Cumulative distribution function. That is, the probability to the left of k.
      Parameters:
      k - a real number.
      Returns:
      the probability.
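For a discrete distribution, the cdf at k is a running sum of the probability masses. The sketch below assumes `prob[i]` is the probability of the value `i` and treats "to the left of k" as inclusive, that is P(X <= k); both are assumptions of this example. `CdfSketch` is a hypothetical name.

```java
// Hypothetical helper; assumes prob[i] is P(X = i) and computes P(X <= k).
final class CdfSketch {
    static double cdf(double[] prob, double k) {
        if (Double.isNaN(k)) {
            throw new IllegalArgumentException("k is NaN");
        }
        if (k < 0) {
            return 0.0; // nothing lies to the left of the smallest value
        }
        // Largest integer value not exceeding k, capped at the last index.
        int last = (int) Math.min(Math.floor(k), prob.length - 1);
        double sum = 0.0;
        for (int i = 0; i <= last; i++) {
            sum += prob[i];
        }
        return Math.min(sum, 1.0); // guard against rounding slightly above 1
    }

    public static void main(String[] args) {
        double[] prob = {0.25, 0.5, 0.25};
        System.out.println(cdf(prob, 1.5));
    }
}
```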
    • quantile

      public double quantile(double p)
      Description copied from interface: Distribution
      The quantile function. Returns the value such that the probability to its left is p. It is the inverse of the cdf.
      Parameters:
      p - the probability.
      Returns:
      the quantile.
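Because a discrete cdf is a step function, its inverse needs a convention. A common one, assumed here and not confirmed for this class, is to return the smallest value k whose cumulative probability is at least p. The sketch again assumes `prob[i]` is the probability of the value `i`; `QuantileSketch` is a hypothetical name.

```java
// Hypothetical helper; returns the smallest k with P(X <= k) >= p.
final class QuantileSketch {
    static int quantile(double[] prob, double p) {
        if (!(p >= 0.0 && p <= 1.0)) { // also rejects NaN
            throw new IllegalArgumentException("invalid probability: " + p);
        }
        double cum = 0.0;
        for (int k = 0; k < prob.length; k++) {
            cum += prob[k];
            if (cum >= p) {
                return k;
            }
        }
        // Reached only if rounding left the total slightly below p.
        return prob.length - 1;
    }

    public static void main(String[] args) {
        double[] prob = {0.25, 0.5, 0.25};
        System.out.println(quantile(prob, 0.5));
    }
}
```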