# Interface Distribution

All Superinterfaces:
`Serializable`
All Known Subinterfaces:
`ExponentialFamily`
All Known Implementing Classes:
`BernoulliDistribution`, `BetaDistribution`, `BinomialDistribution`, `ChiSquareDistribution`, `DiscreteDistribution`, `DiscreteExponentialFamilyMixture`, `DiscreteMixture`, `EmpiricalDistribution`, `ExponentialDistribution`, `ExponentialFamilyMixture`, `FDistribution`, `GammaDistribution`, `GaussianDistribution`, `GaussianMixture`, `GeometricDistribution`, `HyperGeometricDistribution`, `KernelDensity`, `LogisticDistribution`, `LogNormalDistribution`, `Mixture`, `NegativeBinomialDistribution`, `PoissonDistribution`, `ShiftedGeometricDistribution`, `TDistribution`, `WeibullDistribution`

public interface Distribution extends Serializable
Probability distribution of a univariate random variable. A probability distribution identifies either the probability of each value of a random variable (when the variable is discrete), or the probability of the value falling within a particular interval (when the variable is continuous). When the random variable takes values in the set of real numbers, the probability distribution is completely described by the cumulative distribution function, whose value at each real x is the probability that the random variable is less than or equal to x.

Rejection sampling and inverse transform sampling are both implemented as general-purpose ways to generate random samples from the probability density function or the quantile function, respectively. A quantile function based on bisection search is also provided.

• ## Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| `double` | `cdf(double x)` | Cumulative distribution function. |
| `double` | `entropy()` | Returns the Shannon entropy of the distribution. |
| `default double` | `inverseTransformSampling()` | Uses inverse transform sampling (also known as the inverse probability integral transform, the inverse transformation method, or the Smirnov transform) to draw a sample from the given distribution. |
| `int` | `length()` | Returns the number of parameters of the distribution. |
| `default double` | `likelihood(double[] x)` | The likelihood of the sample set following this distribution. |
| `default double` | `logLikelihood(double[] x)` | The log likelihood of the sample set following this distribution. |
| `double` | `logp(double x)` | The density at x in log scale, which may prevent the underflow problem. |
| `double` | `mean()` | Returns the mean of the distribution. |
| `double` | `p(double x)` | The probability density function (for a continuous distribution) or the probability mass function (for a discrete distribution) at x. |
| `double` | `quantile(double p)` | The quantile: the value such that the probability to its left is p. |
| `default double` | `quantile(double p, double xmin, double xmax)` | Inverts the CDF by bisection, numerically solving "cdf(x) = p", for a continuous distribution. |
| `default double` | `quantile(double p, double xmin, double xmax, double eps)` | Inverts the CDF by bisection, numerically solving "cdf(x) = p", for a continuous distribution. |
| `double` | `rand()` | Generates a random number following this distribution. |
| `default double[]` | `rand(int n)` | Generates a set of random numbers following this distribution. |
| `default double` | `rejectionSampling(double pmax, double xmin, double xmax)` | Uses the rejection technique to draw a sample from the given distribution. |
| `default double` | `sd()` | Returns the standard deviation of the distribution. |
| `double` | `variance()` | Returns the variance of the distribution. |
• ## Method Details

• ### length

int length()
Returns the number of parameters of the distribution. The "length" is in the sense of the minimum description length principle.
Returns:
The number of parameters.
• ### mean

double mean()
Returns the mean of the distribution.
Returns:
The mean.
• ### variance

double variance()
Returns the variance of the distribution.
Returns:
The variance.
• ### sd

default double sd()
Returns the standard deviation of the distribution.
Returns:
The standard deviation.
• ### entropy

double entropy()
Returns the Shannon entropy of the distribution.
Returns:
Shannon entropy.
• ### rand

double rand()
Generates a random number following this distribution.
Returns:
a random number.
• ### rand

default double[] rand(int n)
Generates a set of random numbers following this distribution.
Parameters:
`n` - the number of random numbers to generate.
Returns:
an array of random numbers.
• ### p

double p(double x)
The probability density function (for a continuous distribution) or the probability mass function (for a discrete distribution) evaluated at x.
Parameters:
`x` - a real number.
Returns:
the density.
• ### logp

double logp(double x)
The density at x in log scale, which may prevent the underflow problem.
Parameters:
`x` - a real number.
Returns:
the log density.
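The underflow concern can be illustrated with a minimal standalone sketch. It does not use this interface; the class `LogDensitySketch` and its methods are hypothetical and hard-code the standard normal density purely for illustration.

```java
// Hypothetical standalone sketch; not part of the Distribution interface.
final class LogDensitySketch {
    private static final double LOG_SQRT_2PI = 0.5 * Math.log(2.0 * Math.PI);

    /** Density of the standard normal distribution at x. */
    static double p(double x) {
        return Math.exp(-0.5 * x * x) / Math.sqrt(2.0 * Math.PI);
    }

    /** Log density of the standard normal, computed without calling exp. */
    static double logp(double x) {
        return -0.5 * x * x - LOG_SQRT_2PI;
    }

    public static void main(String[] args) {
        double x = 40.0;
        // exp(-800) is smaller than the smallest positive double, so p(x) underflows to 0.0
        System.out.println(p(x));
        // taking the log afterwards therefore yields -Infinity
        System.out.println(Math.log(p(x)));
        // computing directly in log scale stays finite (roughly -800.9)
        System.out.println(logp(x));
    }
}
```

Far in the tails, taking the logarithm of an already underflowed density loses all information, whereas a log density computed directly remains usable.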
• ### cdf

double cdf(double x)
Cumulative distribution function, that is, the probability that the random variable is less than or equal to x.
Parameters:
`x` - a real number.
Returns:
the probability.
• ### quantile

double quantile(double p)
The quantile function: returns the value x such that the probability to the left of x is p. It is the inverse of the cdf.
Parameters:
`p` - the probability.
Returns:
the quantile.
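As an illustration of the inverse relationship between the quantile and the cdf, here is a small standalone sketch for the exponential distribution, whose cdf `1 - exp(-lambda * x)` can be inverted in closed form. The class `ExponentialQuantileSketch` and the `lambda` parameter are hypothetical names chosen for this example, not part of the interface.

```java
// Hypothetical standalone sketch; not part of the Distribution interface.
final class ExponentialQuantileSketch {
    /** CDF of the exponential distribution with rate lambda. */
    static double cdf(double x, double lambda) {
        return x <= 0.0 ? 0.0 : 1.0 - Math.exp(-lambda * x);
    }

    /** Closed-form inverse of the CDF above: solves cdf(x) = p for x. */
    static double quantile(double p, double lambda) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("Invalid p: " + p);
        }
        // log1p(-p) = log(1 - p), evaluated accurately for small p
        return -Math.log1p(-p) / lambda;
    }

    public static void main(String[] args) {
        double lambda = 2.0;
        double median = quantile(0.5, lambda); // equals ln(2) / lambda
        System.out.println("median = " + median);
        System.out.println("cdf(median) = " + cdf(median, lambda)); // round trip back to about 0.5
    }
}
```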
• ### likelihood

default double likelihood(double[] x)
The likelihood of the sample set following this distribution.
Parameters:
`x` - a set of samples.
Returns:
the likelihood.
• ### logLikelihood

default double logLikelihood(double[] x)
The log likelihood of the sample set following this distribution.
Parameters:
`x` - a set of samples.
Returns:
the log likelihood.
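The relationship between the two quantities can be sketched as follows, assuming the usual definition for independent samples: the likelihood is the product of the densities, so the log likelihood is the sum of the log densities. This is a standalone illustration with a hard-coded standard normal density; the class `LikelihoodSketch` is hypothetical and is not claimed to mirror the default implementations.

```java
// Hypothetical standalone sketch; assumes independent samples.
final class LikelihoodSketch {
    private static final double LOG_SQRT_2PI = 0.5 * Math.log(2.0 * Math.PI);

    /** Log density of the standard normal distribution. */
    static double logp(double x) {
        return -0.5 * x * x - LOG_SQRT_2PI;
    }

    /** Sum of log densities over the sample set. */
    static double logLikelihood(double[] x) {
        double sum = 0.0;
        for (double xi : x) {
            sum += logp(xi);
        }
        return sum;
    }

    /** Product of densities, obtained by exponentiating the log likelihood. */
    static double likelihood(double[] x) {
        return Math.exp(logLikelihood(x));
    }

    public static void main(String[] args) {
        double[] x = new double[1000];
        java.util.Arrays.fill(x, 1.0);
        // The product of 1000 densities is far below the double range and underflows to 0.0
        System.out.println(likelihood(x));
        // The log likelihood remains a finite negative number
        System.out.println(logLikelihood(x));
    }
}
```

For large sample sets the likelihood itself is often too small to represent, which is why the log likelihood is usually the more practical quantity to compare.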
• ### rejectionSampling

default double rejectionSampling(double pmax, double xmin, double xmax)
Uses the rejection technique to draw a sample from the given distribution. WARNING: this simulation technique can take a very long time. Rejection sampling is also commonly called the acceptance-rejection method or the "accept-reject algorithm". It generates samples from an arbitrary probability density function f(x) by using an instrumental distribution g(x), under the only restriction that `f(x) < M g(x)`, where `M > 1` is an appropriate bound on `f(x) / g(x)`.

Rejection sampling is usually used in cases where the form of `f(x)` makes sampling difficult. Instead of sampling directly from the distribution `f(x)`, we use an envelope distribution `M g(x)` where sampling is easier. These samples from `M g(x)` are probabilistically accepted or rejected.

This method relates to the general field of Monte Carlo techniques, including Markov chain Monte Carlo algorithms that also use a proxy distribution to achieve simulation from the target distribution `f(x)`. It forms the basis for algorithms such as the Metropolis algorithm.

Parameters:
`pmax` - the scale of the instrumental (uniform) distribution.
`xmin` - the lower bound of the random variable range.
`xmax` - the upper bound of the random variable range.
Returns:
a random number.
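A minimal sketch of rejection sampling with a uniform instrumental distribution is shown here. It assumes that `pmax` is an upper bound on the density over `[xmin, xmax]`; the class `RejectionSamplingSketch`, the `pdf` argument, and the explicit `Random` are hypothetical choices for this illustration and are not taken from the library.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Hypothetical standalone sketch; names are illustrative, not the library's API.
final class RejectionSamplingSketch {
    /**
     * Draws one sample from the density pdf on [xmin, xmax],
     * assuming pdf(x) <= pmax everywhere on that interval.
     */
    static double sample(DoubleUnaryOperator pdf, double pmax,
                         double xmin, double xmax, Random rng) {
        double x;
        double y;
        do {
            // Propose x uniformly on [xmin, xmax) and a height y uniformly on [0, pmax)
            x = xmin + rng.nextDouble() * (xmax - xmin);
            y = rng.nextDouble() * pmax;
        } while (y >= pdf.applyAsDouble(x)); // reject points on or above the density curve
        return x;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        DoubleUnaryOperator pdf = x -> 2.0 * x; // density 2x on [0, 1], whose mean is 2/3
        int n = 100_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sample(pdf, 2.0, 0.0, 1.0, rng);
        }
        System.out.println("sample mean = " + sum / n); // should be close to 2/3
    }
}
```

In this setup the acceptance probability is the area under the density divided by the area of the bounding box, `1 / (pmax * (xmax - xmin))` for a normalized density. A loose `pmax` or a wide range therefore means many rejected proposals, which is one way the warning above can materialize.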
• ### inverseTransformSampling

default double inverseTransformSampling()
Uses inverse transform sampling (also known as the inverse probability integral transform, the inverse transformation method, or the Smirnov transform) to draw a sample from the given distribution. This method generates random samples from any probability distribution given its cumulative distribution function (cdf). It is generally applicable as long as the distribution is continuous, and it can be computationally efficient when the cdf can be inverted analytically, but it may be too computationally expensive in practice for some probability distributions. The Box-Muller transform is an example of an algorithm that is less general but more computationally efficient. Even for simple distributions, inverse transform sampling can often be improved on by methods developed with substantial research effort, e.g. the ziggurat algorithm and rejection sampling.
Returns:
a random number.
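The idea can be sketched in a few lines: draw `u` uniformly from `[0, 1)` and return the quantile at `u`. The example below is a standalone illustration; the class `InverseTransformSketch`, the `quantile` argument, and the exponential quantile helper are hypothetical names introduced here, and the sketch is not claimed to match the default implementation.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Hypothetical standalone sketch; names are illustrative, not the library's API.
final class InverseTransformSketch {
    /** Draws one sample by pushing a uniform variate through the quantile function. */
    static double sample(DoubleUnaryOperator quantile, Random rng) {
        double u = rng.nextDouble(); // u is uniform on [0, 1)
        return quantile.applyAsDouble(u);
    }

    /** Closed-form quantile of the exponential distribution with rate lambda. */
    static DoubleUnaryOperator exponentialQuantile(double lambda) {
        return p -> -Math.log1p(-p) / lambda;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        DoubleUnaryOperator q = exponentialQuantile(2.0); // mean is 1 / lambda = 0.5
        int n = 100_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sample(q, rng);
        }
        System.out.println("sample mean = " + sum / n); // should be close to 0.5
    }
}
```

Each draw costs one uniform variate and one quantile evaluation, so the method is cheap when the quantile has a closed form and expensive when every quantile call requires a numeric root search.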
• ### quantile

default double quantile(double p, double xmin, double xmax, double eps)
Inverts the CDF by bisection, numerically solving "cdf(x) = p", for a continuous distribution.
Parameters:
`p` - the probability.
`xmin` - the lower bound of search range.
`xmax` - the upper bound of search range.
`eps` - the tolerance, a small positive number close to zero.
Returns:
the quantile.
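A bisection search of this kind can be sketched as below. This is a standalone illustration under stated assumptions: the cdf is non-decreasing, the root lies inside `[xmin, xmax]`, and the search stops once the bracketing interval is no wider than `eps`. That stopping rule, the class `BisectionQuantileSketch`, and the `cdf` argument are hypothetical choices for this example and may differ from the default implementation.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical standalone sketch; names and stopping rule are illustrative assumptions.
final class BisectionQuantileSketch {
    /** Solves cdf(x) = p on [xmin, xmax] by repeatedly halving the bracketing interval. */
    static double quantile(DoubleUnaryOperator cdf, double p,
                           double xmin, double xmax, double eps) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("Invalid p: " + p);
        }
        if (eps <= 0.0) {
            throw new IllegalArgumentException("Invalid epsilon: " + eps);
        }
        double lo = xmin;
        double hi = xmax;
        while (hi - lo > eps) {
            double mid = lo + (hi - lo) / 2.0;
            if (mid <= lo || mid >= hi) {
                break; // interval can no longer be split in floating point
            }
            if (cdf.applyAsDouble(mid) > p) {
                hi = mid; // root lies in the lower half
            } else {
                lo = mid; // root lies in the upper half
            }
        }
        return lo + (hi - lo) / 2.0;
    }

    public static void main(String[] args) {
        // Standard logistic cdf; its exact quantile at p is ln(p / (1 - p))
        DoubleUnaryOperator cdf = x -> 1.0 / (1.0 + Math.exp(-x));
        double q = quantile(cdf, 0.75, -10.0, 10.0, 1e-6);
        System.out.println("bisection = " + q + ", exact = " + Math.log(3.0));
    }
}
```

Each iteration halves the interval, so the number of cdf evaluations grows only logarithmically with `(xmax - xmin) / eps`.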
• ### quantile

default double quantile(double p, double xmin, double xmax)
Inverts the CDF by bisection, numerically solving "cdf(x) = p", for a continuous distribution. The default epsilon is 1E-6.
Parameters:
`p` - the probability.
`xmin` - the lower bound of search range.
`xmax` - the upper bound of search range.
Returns:
the quantile.