Package smile.stat.distribution
Class Mixture
java.lang.Object
smile.stat.distribution.Mixture
- All Implemented Interfaces:
Serializable, Distribution
- Direct Known Subclasses:
ExponentialFamilyMixture
A finite mixture model is a probabilistic model for density estimation
using a mixture distribution. A mixture model can be regarded as a type of
unsupervised learning or clustering.
The Expectation-maximization algorithm can be used to compute the parameters of a parametric mixture model distribution. The EM algorithm is a method for finding maximum likelihood estimates of parameters, where the model depends on unobserved latent variables. EM is an iterative method which alternates between performing an expectation (E) step, which computes the expectation of the log-likelihood evaluated using the current estimate for the latent variables, and a maximization (M) step, which computes parameters maximizing the expected log-likelihood found on the E step. These parameter estimates are then used to determine the distribution of the latent variables in the next E step.
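The E and M steps described above can be sketched in plain Java for a one-dimensional Gaussian mixture. This is a minimal illustration that does not use the Smile API: the class `EmSketch`, its method names, and the variance floor of 1e-6 are assumptions made for the example, not the library's implementation.

```java
/** Hypothetical, self-contained EM sketch for a 1-D Gaussian mixture. */
final class EmSketch {
    /** Gaussian density with mean mu and standard deviation sd. */
    static double gaussian(double x, double mu, double sd) {
        double z = (x - mu) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /**
     * Runs a fixed number of EM iterations, updating w (weights), mu (means)
     * and sd (standard deviations) in place. Returns the log-likelihood
     * evaluated in the E step of each iteration. Assumes every data point has
     * non-zero density under at least one component.
     */
    static double[] fit(double[] data, double[] w, double[] mu, double[] sd, int iterations) {
        int n = data.length;
        int k = w.length;
        double[][] resp = new double[n][k]; // posterior of component j for point i
        double[] trace = new double[iterations];
        for (int it = 0; it < iterations; it++) {
            // E step: posterior distribution of the latent component labels.
            double logLik = 0.0;
            for (int i = 0; i < n; i++) {
                double total = 0.0;
                for (int j = 0; j < k; j++) {
                    resp[i][j] = w[j] * gaussian(data[i], mu[j], sd[j]);
                    total += resp[i][j];
                }
                for (int j = 0; j < k; j++) resp[i][j] /= total;
                logLik += Math.log(total);
            }
            trace[it] = logLik;
            // M step: parameters maximizing the expected log-likelihood.
            for (int j = 0; j < k; j++) {
                double nj = 0.0, sum = 0.0, ss = 0.0;
                for (int i = 0; i < n; i++) {
                    nj += resp[i][j];
                    sum += resp[i][j] * data[i];
                }
                double m = sum / nj;
                for (int i = 0; i < n; i++) ss += resp[i][j] * (data[i] - m) * (data[i] - m);
                w[j] = nj / n;
                mu[j] = m;
                sd[j] = Math.sqrt(Math.max(ss / nj, 1e-6)); // floor avoids a zero variance
            }
        }
        return trace;
    }
}
```

The returned trace should be non-decreasing, which is a convenient sanity check when experimenting with initial values.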
Nested Class Summary
Modifier and Type | Class | Description
static final record | Mixture.Component | A component in the mixture distribution is defined by a distribution and its weight in the mixture.
Field Summary
Modifier and Type | Field | Description
final Mixture.Component[] | components | The components of the finite mixture model.
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
double | bic(double[] data) | Returns the BIC score.
double | cdf(double x) | Cumulative distribution function.
double | entropy() | Shannon's entropy.
int | length() | Returns the number of parameters of the distribution.
double | logp(double x) | The density at x in log scale, which may prevent the underflow problem.
int | map(double x) | Returns the index of the component with maximum a posteriori probability.
double | mean() | Returns the mean of the distribution.
double | p(double x) | The probability density function for a continuous distribution, or the probability mass function for a discrete distribution, at x.
double[] | posteriori(double x) | Returns the posteriori probabilities.
double | quantile(double p) | The quantile: the value such that the probability to its left is p.
double | rand() | Generates a random number following this distribution.
int | size() | Returns the number of components in the mixture.
String | toString() |
double | variance() | Returns the variance of the distribution.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface smile.stat.distribution.Distribution
inverseTransformSampling, likelihood, logLikelihood, quantile, quantile, rand, rejectionSampling, sd
-
Field Details
-
components
The components of finite mixture model.
-
-
Constructor Details
-
Mixture
Constructor.
- Parameters:
components - a list of distributions.
-
-
Method Details
-
posteriori
public double[] posteriori(double x)
Returns the posterior (a posteriori) probability of each component given x.
- Parameters:
x - a real value.
- Returns:
the posteriori probabilities.
-
map
public int map(double x)
Returns the index of the component with maximum a posteriori probability.
- Parameters:
x - a real value.
- Returns:
the index of the component with maximum a posteriori probability.
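As an illustration of what posteriori and map compute, the following sketch applies Bayes' rule to component weights and component densities already evaluated at x. The class `PosteriorSketch` and its array-based signatures are hypothetical and stand in for the `Mixture.Component` objects; they are not the library's API.

```java
/** Hypothetical sketch: posterior component probabilities and their argmax. */
final class PosteriorSketch {
    /**
     * Bayes' rule: post[i] = w[i] * p_i(x) / sum_j w[j] * p_j(x).
     * density[i] is assumed to hold p_i(x), the i-th component density at x.
     */
    static double[] posteriori(double[] weights, double[] density) {
        double[] post = new double[weights.length];
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            post[i] = weights[i] * density[i];
            total += post[i];
        }
        for (int i = 0; i < post.length; i++) post[i] /= total;
        return post;
    }

    /** Index of the component with the largest posterior probability. */
    static int map(double[] post) {
        int best = 0;
        for (int i = 1; i < post.length; i++) {
            if (post[i] > post[best]) best = i;
        }
        return best;
    }
}
```

With equal weights and component densities 0.1 and 0.3 at x, the posteriors are 0.25 and 0.75, so the second component (index 1) is the MAP choice.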
-
mean
public double mean()
Description copied from interface: Distribution
Returns the mean of the distribution.
- Specified by:
mean in interface Distribution
- Returns:
the mean.
-
variance
public double variance()
Description copied from interface: Distribution
Returns the variance of the distribution.
- Specified by:
variance in interface Distribution
- Returns:
the variance.
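For reference, the textbook moments of a mixture follow from the law of total expectation and the law of total variance. The sketch below implements those identities; the class `MixtureMomentsSketch` is hypothetical, and this page does not state which formula the library itself uses, so its results may differ.

```java
/** Hypothetical sketch of the textbook mean and variance of a mixture. */
final class MixtureMomentsSketch {
    /** mean = sum_i w[i] * mu[i]. */
    static double mean(double[] w, double[] mu) {
        double m = 0.0;
        for (int i = 0; i < w.length; i++) m += w[i] * mu[i];
        return m;
    }

    /** variance = sum_i w[i] * (var[i] + mu[i]^2) - mean^2. */
    static double variance(double[] w, double[] mu, double[] var) {
        double m = mean(w, mu);
        double secondMoment = 0.0;
        for (int i = 0; i < w.length; i++) {
            secondMoment += w[i] * (var[i] + mu[i] * mu[i]);
        }
        return secondMoment - m * m;
    }
}
```

For weights 0.3 and 0.7, component means 0 and 10, and unit component variances, these identities give a mean of 7 and a variance of 22; most of the variance comes from the separation between the component means.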
-
entropy
public double entropy()
Shannon's entropy. Not supported.
- Specified by:
entropy in interface Distribution
- Returns:
Shannon entropy.
-
p
public double p(double x)
Description copied from interface: Distribution
The probability density function for a continuous distribution, or the probability mass function for a discrete distribution, at x.
- Specified by:
p in interface Distribution
- Parameters:
x - a real number.
- Returns:
the density.
-
logp
public double logp(double x)
Description copied from interface: Distribution
The density at x in log scale, which may prevent the underflow problem.
- Specified by:
logp in interface Distribution
- Parameters:
x - a real number.
- Returns:
the log density.
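To show why working in log scale helps, here is one way a mixture log density could be evaluated without underflow, using the log-sum-exp trick on the per-component log densities. The class `LogDensitySketch` is hypothetical, and whether the library's logp uses this technique is not stated on this page.

```java
/** Hypothetical sketch: underflow-safe log density of a mixture. */
final class LogDensitySketch {
    /** Computes log(sum_i exp(a[i])) by factoring out the largest term. */
    static double logSumExp(double[] a) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : a) max = Math.max(max, v);
        if (Double.isInfinite(max)) return max; // every term is zero, or one is infinite
        double sum = 0.0;
        for (double v : a) sum += Math.exp(v - max);
        return max + Math.log(sum);
    }

    /**
     * log p(x) = log sum_i w[i] * p_i(x), given logDensity[i] = log p_i(x).
     */
    static double logp(double[] weights, double[] logDensity) {
        double[] terms = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            terms[i] = Math.log(weights[i]) + logDensity[i];
        }
        return logSumExp(terms);
    }
}
```

With component log densities around -1000, exponentiating directly yields 0 in double precision and a log of negative infinity, whereas the shifted sum stays finite.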
-
cdf
public double cdf(double x)
Description copied from interface: Distribution
Cumulative distribution function. That is the probability to the left of x.
- Specified by:
cdf in interface Distribution
- Parameters:
x - a real number.
- Returns:
the probability.
-
rand
public double rand()
Description copied from interface: Distribution
Generates a random number following this distribution.
- Specified by:
rand in interface Distribution
- Returns:
a random number.
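A common way to draw from a mixture is to pick a component with probability equal to its weight and then sample from that component. The sketch below does this for Gaussian components with `java.util.Random`; the class `MixtureSamplerSketch` and its parameters are hypothetical, and the library's own sampling strategy is not described on this page.

```java
import java.util.Random;

/** Hypothetical sketch: two-stage sampling from a Gaussian mixture. */
final class MixtureSamplerSketch {
    /**
     * Chooses component i with probability weights[i] (weights are assumed
     * to sum to 1), then draws from N(means[i], sds[i]^2).
     */
    static double sample(double[] weights, double[] means, double[] sds, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        int chosen = weights.length - 1; // fallback guards against round-off in the weights
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (u < cumulative) {
                chosen = i;
                break;
            }
        }
        return means[chosen] + sds[chosen] * rng.nextGaussian();
    }
}
```

Passing a seeded `Random` makes the draws reproducible, which is useful in tests.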
-
quantile
public double quantile(double p)
Description copied from interface: Distribution
The quantile: the value q such that the probability to the left of q is p. It is the inverse of the cdf.
- Specified by:
quantile in interface Distribution
- Parameters:
p - the probability.
- Returns:
the quantile.
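Because a mixture cdf is a weighted sum of component cdfs and rarely has a closed-form inverse, the quantile is typically found numerically. The sketch below inverts the cdf of a mixture of exponential distributions by bisection; the class `QuantileSketch`, the choice of exponential components, and the iteration count are assumptions for illustration, not the library's algorithm.

```java
/** Hypothetical sketch: mixture cdf and its numerical inverse by bisection. */
final class QuantileSketch {
    /** cdf(x) = sum_i w[i] * (1 - exp(-rate[i] * x)) for x > 0, else 0. */
    static double cdf(double x, double[] w, double[] rate) {
        if (x <= 0.0) return 0.0;
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * (1.0 - Math.exp(-rate[i] * x));
        }
        return sum;
    }

    /** Finds q with cdf(q) = p, assuming the weights sum to 1 and 0 <= p < 1. */
    static double quantile(double p, double[] w, double[] rate) {
        if (p < 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1)");
        }
        double lo = 0.0;
        double hi = 1.0;
        // Grow the bracket until it contains the quantile.
        while (cdf(hi, w, rate) < p && hi < 1e300) hi *= 2.0;
        // Halve the bracket; the cdf is monotone, so the root stays inside.
        for (int iter = 0; iter < 200; iter++) {
            double mid = 0.5 * (lo + hi);
            if (cdf(mid, w, rate) < p) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }
}
```

For a single exponential component with rate 2, the median obtained this way agrees with the closed form ln(2) / 2.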
-
length
public int length()
Description copied from interface: Distribution
Returns the number of parameters of the distribution. The "length" is in the sense of the minimum description length principle.
- Specified by:
length in interface Distribution
- Returns:
the number of parameters.
-
size
public int size()
Returns the number of components in the mixture.
- Returns:
the number of components in the mixture.
-
bic
public double bic(double[] data)
Returns the BIC score.
- Parameters:
data - the data to calculate likelihood.
- Returns:
the BIC score.
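The BIC combines the log-likelihood of the data with a penalty on the number of parameters. The sketch below uses one common convention, log-likelihood minus half the parameter count times ln(n), where larger is better; the other widespread convention multiplies this by -2. Which convention the library follows is not stated on this page, and the class `BicSketch` and the parameter count for a Gaussian mixture are assumptions for illustration.

```java
/** Hypothetical sketch of a BIC score in the "larger is better" convention. */
final class BicSketch {
    /** BIC = logLikelihood - 0.5 * numParameters * ln(n). */
    static double bic(double logLikelihood, int numParameters, int n) {
        return logLikelihood - 0.5 * numParameters * Math.log(n);
    }

    /**
     * Free parameters of a k-component 1-D Gaussian mixture under the usual
     * count: k means, k variances, and k - 1 independent weights.
     */
    static int gaussianMixtureLength(int k) {
        return 3 * k - 1;
    }
}
```

When comparing mixtures with different numbers of components on the same data, the penalty term grows with the parameter count, so an extra component must improve the log-likelihood enough to pay for its parameters.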
-
toString
-