Package smile.validation.metric
Class AdjustedMutualInformation
java.lang.Object
smile.validation.metric.AdjustedMutualInformation
- All Implemented Interfaces:
Serializable, ClusteringMetric
Adjusted Mutual Information (AMI) for comparing clusterings.
Like the Rand index, the baseline value of mutual information between two
random clusterings does not take on a constant value, and tends to be
larger when the two partitions have a larger number of clusters (with
a fixed number of observations). AMI adopts a hypergeometric model of
randomness to adjust for chance. The AMI takes a value of 1 when the
two partitions are identical and 0 when the MI between two partitions
equals the value expected due to chance alone.
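Written out, the adjustment described above takes the following form. This is a sketch based on the formulas in the method descriptions on this page and on the cited paper by Vinh et al. The symbols a_i and b_j (cluster sizes in y1 and y2), n_ij (a possible overlap count between two clusters), and N (the number of observations) are notation introduced here, and norm stands for whichever of max, min, sqrt, or sum is chosen.

```latex
\[
\mathrm{AMI}(y_1, y_2) =
  \frac{I(y_1, y_2) - E(MI)}{\mathrm{norm}\bigl(H(y_1), H(y_2)\bigr) - E(MI)}
\]
\[
E(MI) = \sum_{i}\sum_{j}\;
  \sum_{n_{ij}=\max(1,\,a_i+b_j-N)}^{\min(a_i,\,b_j)}
  \frac{n_{ij}}{N}\,\log\!\left(\frac{N\,n_{ij}}{a_i\,b_j}\right)
  \frac{a_i!\,b_j!\,(N-a_i)!\,(N-b_j)!}
       {N!\,n_{ij}!\,(a_i-n_{ij})!\,(b_j-n_{ij})!\,(N-a_i-b_j+n_{ij})!}
\]
```

The last factor is the hypergeometric probability of an overlap of n_ij between a cluster of size a_i and a cluster of size b_j when labels are assigned at random with the cluster sizes held fixed.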
WARNING: Computing the chance adjustment (the expected mutual information) is very slow.
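To give a sense of where the cost comes from, here is a minimal, self-contained sketch of the expected mutual information under the hypergeometric model, following the formula in Vinh et al. It is not Smile's implementation: the class name ExpectedMiSketch, the method expectedMutualInformation, and the use of natural logarithms are assumptions made for illustration. The triple loop visits every pair of clusters and every feasible overlap count, which suggests why the adjustment becomes expensive as the number of clusters and observations grows.

```java
/** Sketch only (hypothetical class): expected MI under the hypergeometric model. */
final class ExpectedMiSketch {

    /**
     * @param a cluster sizes of the first partition (must sum to n)
     * @param b cluster sizes of the second partition (must sum to n)
     * @param n number of observations
     * @return expected mutual information in nats (natural log is an assumption)
     */
    static double expectedMutualInformation(int[] a, int[] b, int n) {
        // Precompute log(k!) for k = 0..n so factorial ratios stay in range.
        double[] logFact = new double[n + 1];
        for (int k = 1; k <= n; k++) {
            logFact[k] = logFact[k - 1] + Math.log(k);
        }

        double emi = 0.0;
        for (int ai : a) {
            for (int bj : b) {
                // An overlap of 0 contributes nothing, so start from at least 1.
                int lo = Math.max(1, ai + bj - n);
                int hi = Math.min(ai, bj);
                for (int nij = lo; nij <= hi; nij++) {
                    // Contribution of this cell to MI if the overlap were nij.
                    double term = (double) nij / n
                            * Math.log((double) n * nij / ((double) ai * bj));
                    // Log of the hypergeometric probability of that overlap.
                    double logP = logFact[ai] + logFact[bj]
                            + logFact[n - ai] + logFact[n - bj]
                            - logFact[n] - logFact[nij]
                            - logFact[ai - nij] - logFact[bj - nij]
                            - logFact[n - ai - bj + nij];
                    emi += term * Math.exp(logP);
                }
            }
        }
        return emi;
    }

    public static void main(String[] args) {
        // Two partitions of 6 observations with cluster sizes {3, 3} and {2, 4}.
        double emi = expectedMutualInformation(new int[] {3, 3}, new int[] {2, 4}, 6);
        System.out.println("E(MI) = " + emi);
    }
}
```

A quick sanity check on the sketch: when one partition has a single cluster, every feasible overlap gives a log term of zero, so the expected mutual information is 0.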
References
- N. X. Vinh, J. Epps, J. Bailey. Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance. JMLR, 2010.
Nested Class Summary
- static enum: The normalization method.
Field Summary
- static final AdjustedMutualInformation MAX: Default instance with max normalization.
- static final AdjustedMutualInformation MIN: Default instance with min normalization.
- static final AdjustedMutualInformation SQRT: Default instance with sqrt normalization.
- static final AdjustedMutualInformation SUM: Default instance with sum normalization.
Constructor Summary
- AdjustedMutualInformation: Constructor.
Method Summary
- static double max(int[] y1, int[] y2): Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (max(H(y1), H(y2)) - E(MI)).
- static double min(int[] y1, int[] y2): Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (min(H(y1), H(y2)) - E(MI)).
- double score(int[] y1, int[] y2): Returns a score to measure the quality of clustering.
- static double sqrt(int[] y1, int[] y2): Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (sqrt(H(y1) * H(y2)) - E(MI)).
- static double sum(int[] y1, int[] y2): Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (0.5 * (H(y1) + H(y2)) - E(MI)).
- toString()
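The four static methods differ only in the upper bound used in the denominator. A minimal sketch of that final step, assuming the mutual information, the two entropies, and the expected mutual information have already been computed. The class AmiNormalizationSketch, the enum Norm, and the method adjust are hypothetical names used for illustration, not part of Smile.

```java
/** Sketch only (hypothetical class): the four normalizations listed above. */
final class AmiNormalizationSketch {

    /** Hypothetical enum naming the four upper bounds. */
    enum Norm { MAX, MIN, SQRT, SUM }

    /**
     * @param mi  mutual information I(y1, y2)
     * @param h1  entropy H(y1)
     * @param h2  entropy H(y2)
     * @param emi expected mutual information E(MI)
     * @return (mi - emi) / (bound - emi), where bound depends on norm
     */
    static double adjust(double mi, double h1, double h2, double emi, Norm norm) {
        double bound;
        switch (norm) {
            case MAX:
                bound = Math.max(h1, h2);
                break;
            case MIN:
                bound = Math.min(h1, h2);
                break;
            case SQRT:
                bound = Math.sqrt(h1 * h2);
                break;
            default: // SUM: arithmetic mean of the two entropies
                bound = 0.5 * (h1 + h2);
                break;
        }
        return (mi - emi) / (bound - emi);
    }

    public static void main(String[] args) {
        // Made-up inputs, only to show how the choice of bound changes the score.
        for (Norm norm : Norm.values()) {
            System.out.println(norm + ": " + adjust(0.5, 1.0, 0.8, 0.2, norm));
        }
    }
}
```

Because min(H(y1), H(y2)) <= sqrt(H(y1) * H(y2)) <= 0.5 * (H(y1) + H(y2)) <= max(H(y1), H(y2)), for the same inputs with I(y1, y2) above E(MI) and every bound above E(MI), min normalization yields the largest score and max normalization the smallest.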
-
Field Details
-
MAX
Default instance with max normalization.
-
MIN
Default instance with min normalization.
-
SUM
Default instance with sum normalization.
-
SQRT
Default instance with sqrt normalization.
-
-
Constructor Details
-
AdjustedMutualInformation
Constructor.
- Parameters:
method - the normalization method.
-
-
Method Details
-
score
public double score(int[] y1, int[] y2)
Description copied from interface: ClusteringMetric
Returns a score to measure the quality of clustering.
- Specified by:
score in interface ClusteringMetric
- Parameters:
y1 - the ground truth (or simply a set of cluster labels).
y2 - the alternative cluster labels.
- Returns:
the metric.
-
max
public static double max(int[] y1, int[] y2)
Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (max(H(y1), H(y2)) - E(MI)).
- Parameters:
y1 - the clustering labels.
y2 - the alternative cluster labels.
- Returns:
the metric.
-
sum
public static double sum(int[] y1, int[] y2)
Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (0.5 * (H(y1) + H(y2)) - E(MI)).
- Parameters:
y1 - the clustering labels.
y2 - the alternative cluster labels.
- Returns:
the metric.
-
sqrt
public static double sqrt(int[] y1, int[] y2)
Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (sqrt(H(y1) * H(y2)) - E(MI)).
- Parameters:
y1 - the clustering labels.
y2 - the alternative cluster labels.
- Returns:
the metric.
-
min
public static double min(int[] y1, int[] y2)
Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (min(H(y1), H(y2)) - E(MI)).
- Parameters:
y1 - the clustering labels.
y2 - the alternative cluster labels.
- Returns:
the metric.
-
toString
-