Class AdjustedMutualInformation
java.lang.Object
smile.validation.metric.AdjustedMutualInformation
- All Implemented Interfaces:
Serializable, ClusteringMetric
Adjusted Mutual Information (AMI) for comparing clusterings.
Like the Rand index, the baseline value of mutual information between two
random clusterings does not take on a constant value, and tends to be
larger when the two partitions have a larger number of clusters (with
a fixed number of observations). AMI adopts a hypergeometric model of
randomness to adjust for chance. The AMI takes a value of 1 when the
two partitions are identical and 0 when the MI between two partitions
equals the value expected due to chance alone.
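For reference, the adjustment described above can be written out explicitly. The notation here is an assumption borrowed from the Vinh et al. paper listed under References, not from this page: N is the number of observations, a_i and b_j are the cluster sizes of the two partitions U and V, and I and H denote mutual information and entropy. Under the hypergeometric model, the expected mutual information and the max-normalized AMI are:

```latex
\mathbb{E}[\mathrm{MI}(U,V)] =
  \sum_{i=1}^{R} \sum_{j=1}^{C}
  \sum_{n=\max(1,\,a_i+b_j-N)}^{\min(a_i,\,b_j)}
  \frac{n}{N}\,\log\!\left(\frac{N\,n}{a_i\,b_j}\right)
  \frac{a_i!\;b_j!\;(N-a_i)!\;(N-b_j)!}
       {N!\;n!\;(a_i-n)!\;(b_j-n)!\;(N-a_i-b_j+n)!}

\mathrm{AMI}_{\max}(U,V) =
  \frac{I(U,V) - \mathbb{E}[\mathrm{MI}(U,V)]}
       {\max\bigl(H(U),\,H(V)\bigr) - \mathbb{E}[\mathrm{MI}(U,V)]}
```

The other variants replace the max term in the denominator with the min, the arithmetic mean, or the geometric mean of H(U) and H(V).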
WARNING: Computing the chance adjustment, E(MI), is slow.
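To make the adjustment concrete, here is a minimal from-scratch sketch of the max-normalized AMI. It is not Smile's implementation: the class `AmiSketch` and its helpers `dense`, `count`, `entropy`, and `amiMax` are hypothetical names introduced for illustration, and natural logarithms are assumed throughout.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; this is not Smile's implementation. */
final class AmiSketch {
    /** Relabels arbitrary integer labels as 0..k-1 in order of first appearance. */
    static int[] dense(int[] y) {
        Map<Integer, Integer> ids = new HashMap<>();
        int[] out = new int[y.length];
        for (int i = 0; i < y.length; i++) {
            Integer id = ids.get(y[i]);
            if (id == null) {
                id = ids.size();
                ids.put(y[i], id);
            }
            out[i] = id;
        }
        return out;
    }

    /** Number of distinct labels in a dense labeling. */
    static int count(int[] dense) {
        int k = 0;
        for (int v : dense) k = Math.max(k, v + 1);
        return k;
    }

    /** Shannon entropy (natural log) of a partition, given its cluster sizes. */
    static double entropy(int[] sizes, int n) {
        double h = 0.0;
        for (int s : sizes) {
            if (s > 0) h -= ((double) s / n) * Math.log((double) s / n);
        }
        return h;
    }

    /** (I - E(MI)) / (max(H1, H2) - E(MI)); NaN if both partitions have one cluster. */
    static double amiMax(int[] y1, int[] y2) {
        if (y1.length != y2.length) throw new IllegalArgumentException("length mismatch");
        int n = y1.length;
        int[] u = dense(y1), v = dense(y2);
        int r = count(u), c = count(v);
        int[][] table = new int[r][c];   // contingency table
        int[] a = new int[r], b = new int[c]; // row and column sums
        for (int i = 0; i < n; i++) {
            table[u[i]][v[i]]++;
            a[u[i]]++;
            b[v[i]]++;
        }

        // Observed mutual information.
        double mi = 0.0;
        for (int i = 0; i < r; i++)
            for (int j = 0; j < c; j++)
                if (table[i][j] > 0) {
                    double nij = table[i][j];
                    mi += nij / n * Math.log(n * nij / ((double) a[i] * b[j]));
                }

        // Log-factorials, so the hypergeometric probabilities do not overflow.
        double[] lf = new double[n + 1];
        for (int k = 1; k <= n; k++) lf[k] = lf[k - 1] + Math.log(k);

        // Expected MI: sum over every cell and every feasible cell count.
        double emi = 0.0;
        for (int i = 0; i < r; i++)
            for (int j = 0; j < c; j++)
                for (int k = Math.max(1, a[i] + b[j] - n); k <= Math.min(a[i], b[j]); k++) {
                    double logP = lf[a[i]] + lf[b[j]] + lf[n - a[i]] + lf[n - b[j]]
                            - lf[n] - lf[k] - lf[a[i] - k] - lf[b[j] - k]
                            - lf[n - a[i] - b[j] + k];
                    emi += (double) k / n
                            * Math.log((double) n * k / ((double) a[i] * b[j]))
                            * Math.exp(logP);
                }

        double h1 = entropy(a, n), h2 = entropy(b, n);
        return (mi - emi) / (Math.max(h1, h2) - emi);
    }

    public static void main(String[] args) {
        int[] truth = {0, 0, 0, 1, 1, 1};
        int[] guess = {0, 0, 1, 1, 2, 2};
        System.out.println("AMI(truth, truth) = " + amiMax(truth, truth));
        System.out.println("AMI(truth, guess) = " + amiMax(truth, guess));
    }
}
```

The triple loop in the expected-MI step visits every cell of the contingency table and every feasible count for that cell, which is one plausible reason the adjustment is expensive for large inputs with many clusters.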
References
- N. X. Vinh, J. Epps, J. Bailey. Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance. JMLR, 2010.
Nested Class Summary
Nested Classes
- static enum: The normalization method.
Field Summary
Fields
- static final AdjustedMutualInformation MAX: Default instance with max normalization.
- static final AdjustedMutualInformation MIN: Default instance with min normalization.
- static final AdjustedMutualInformation SQRT: Default instance with sqrt normalization.
- static final AdjustedMutualInformation SUM: Default instance with sum normalization.
Constructor Summary
Constructors
- AdjustedMutualInformation: Constructor.
Method Summary
Methods
- static double max(int[] y1, int[] y2): Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (max(H(y1), H(y2)) - E(MI)).
- static double min(int[] y1, int[] y2): Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (min(H(y1), H(y2)) - E(MI)).
- double score(int[] y1, int[] y2): Returns a score to measure the quality of clustering.
- static double sqrt(int[] y1, int[] y2): Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (sqrt(H(y1) * H(y2)) - E(MI)).
- static double sum(int[] y1, int[] y2): Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (0.5 * (H(y1) + H(y2)) - E(MI)).
- toString()
Field Details
MAX
Default instance with max normalization.
MIN
Default instance with min normalization.
SUM
Default instance with sum normalization.
SQRT
Default instance with sqrt normalization.
Constructor Details
AdjustedMutualInformation
Constructor.
Parameters:
method - the normalization method.
Method Details
score
public double score(int[] y1, int[] y2)
Description copied from interface: ClusteringMetric
Returns a score to measure the quality of clustering.
Specified by:
score in interface ClusteringMetric
Parameters:
y1 - the ground truth (or simply a set of clustering labels).
y2 - the alternative cluster labels.
Returns:
the metric.
max
public static double max(int[] y1, int[] y2)
Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (max(H(y1), H(y2)) - E(MI)).
Parameters:
y1 - the clustering labels.
y2 - the alternative cluster labels.
Returns:
the metric.
sum
public static double sum(int[] y1, int[] y2)
Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (0.5 * (H(y1) + H(y2)) - E(MI)).
Parameters:
y1 - the clustering labels.
y2 - the alternative cluster labels.
Returns:
the metric.
sqrt
public static double sqrt(int[] y1, int[] y2)
Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (sqrt(H(y1) * H(y2)) - E(MI)).
Parameters:
y1 - the clustering labels.
y2 - the alternative cluster labels.
Returns:
the metric.
min
public static double min(int[] y1, int[] y2)
Calculates the adjusted mutual information of (I(y1, y2) - E(MI)) / (min(H(y1), H(y2)) - E(MI)).
Parameters:
y1 - the clustering labels.
y2 - the alternative cluster labels.
Returns:
the metric.
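The four static methods above differ only in the term that normalizes the denominator. The sketch below isolates that difference; it assumes the mutual information, expected mutual information, and entropies have already been computed. The class `AmiNormalization`, the enum `Kind`, and the method `adjust` are hypothetical names for illustration and are not part of this class's API.

```java
import java.util.function.DoubleBinaryOperator;

/** Hypothetical names; illustrates the four normalizers documented above. */
final class AmiNormalization {
    enum Kind {
        MAX((h1, h2) -> Math.max(h1, h2)),
        MIN((h1, h2) -> Math.min(h1, h2)),
        SUM((h1, h2) -> 0.5 * (h1 + h2)),
        SQRT((h1, h2) -> Math.sqrt(h1 * h2));

        private final DoubleBinaryOperator normalizer;

        Kind(DoubleBinaryOperator normalizer) {
            this.normalizer = normalizer;
        }

        /** Computes (I - E(MI)) / (normalizer(H1, H2) - E(MI)). */
        double adjust(double mi, double emi, double h1, double h2) {
            return (mi - emi) / (normalizer.applyAsDouble(h1, h2) - emi);
        }
    }

    public static void main(String[] args) {
        // Made-up illustrative values, not results from real data.
        double mi = 0.5, emi = 0.1, h1 = 0.9, h2 = 0.6;
        for (Kind kind : Kind.values()) {
            System.out.println(kind + " -> " + kind.adjust(mi, emi, h1, h2));
        }
    }
}
```

Because min(H1, H2) <= sqrt(H1 * H2) <= 0.5 * (H1 + H2) <= max(H1, H2) for non-negative entropies, the min variant gives the largest score and the max variant the smallest whenever the MI exceeds its expected value and the denominators are positive.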
toString