The ratio of between-groups to within-groups sum of squares is a univariate
feature ranking metric, which can be used as a feature selection criterion
for multi-class classification problems. For each variable j, this ratio is
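\[
\frac{\mathrm{BSS}(j)}{\mathrm{WSS}(j)} = \frac{\sum_i \sum_k I(y_i = k)\,(\bar{x}_{kj} - \bar{x}_{\cdot j})^2}{\sum_i \sum_k I(y_i = k)\,(x_{ij} - \bar{x}_{kj})^2}
\]
where $I(\cdot)$ is the indicator function, $y_i$ is the class label of sample i,
$\bar{x}_{\cdot j}$ denotes the average of variable j across all samples,
$\bar{x}_{kj}$ denotes the average of variable j across samples
belonging to class k, and $x_{ij}$
is the value of variable j of sample i.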
Features with larger sum-of-squares ratios vary more between classes than within them, so they separate the classes better and are ranked higher.
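
As an illustration of the ratio, here is a minimal sketch in plain Python. The function name `sum_squares_ratio`, the list-of-lists input layout, and the handling of a zero within-groups sum (infinite ratio if the class means differ, zero otherwise) are assumptions made for this example and do not come from the cited paper or from any particular library.

```python
from collections import defaultdict


def sum_squares_ratio(X, y):
    """Return BSS(j) / WSS(j) for every feature j.

    X is a list of samples (each a list of feature values) and y holds
    the class label of each sample. Hypothetical helper for illustration.
    """
    n_features = len(X[0])

    # Group the sample indices by class label.
    rows_by_class = defaultdict(list)
    for i, label in enumerate(y):
        rows_by_class[label].append(i)

    ratios = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        bss = 0.0
        wss = 0.0
        for rows in rows_by_class.values():
            class_mean = sum(column[i] for i in rows) / len(rows)
            # Every sample of class k adds (class_mean - overall_mean)^2 to BSS.
            bss += len(rows) * (class_mean - overall_mean) ** 2
            # Every sample adds its squared distance to its class mean to WSS.
            wss += sum((column[i] - class_mean) ** 2 for i in rows)
        if wss > 0:
            ratios.append(bss / wss)
        else:
            # Assumption: with no within-class spread, treat the ratio as
            # infinite when the class means differ and as zero otherwise.
            ratios.append(float("inf") if bss > 0 else 0.0)
    return ratios


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 1.0], [9.0, 4.0], [10.0, 2.0]]
y = [0, 0, 1, 1]
ratios = sum_squares_ratio(X, y)
print(ratios)  # [64.0, 0.0]
```

In the toy data, feature 0 has class means 1.5 and 9.5 around an overall mean of 5.5, giving BSS = 64 and WSS = 1, while feature 1 has identical class means and therefore a ratio of 0. Sorting features by this ratio in descending order yields the ranking.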
- S. Dudoit, J. Fridlyand and T. Speed. Comparison of discrimination methods for the classification of tumors using gene expression data. J Am Stat Assoc, 97:77-87, 2002.