Interface Standardizer
public interface Standardizer
Standardizes numeric features to zero mean and unit variance.
Standardization assumes that the data follow a Gaussian distribution
and is not robust when outliers are present.
A robust alternative is RobustStandardizer, which subtracts the median
and divides by the IQR.
The standard deviation is computed with the sample formula
(N−1 denominator). For a constant column (stdev = 0), the scale falls
back to 1.0 so that the output is simply x - mean (all zeros
for training data). A single-row data frame is treated the same way.
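The rule above can be illustrated with a minimal plain-Java sketch. It is not the library's implementation: the class name StandardizeSketch and its fit and transform helpers are hypothetical, and it works on a single double[] column rather than a data frame. It assumes only what is stated above: the N−1 denominator and the fallback to a scale of 1.0 for a constant or single-row column.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the documented API.
final class StandardizeSketch {
    /** Returns {mean, scale}; scale is the sample standard deviation, or 1.0 if that is zero or undefined. */
    static double[] fit(double[] x) {
        if (x.length == 0) {
            throw new IllegalArgumentException("empty column");
        }
        double mean = Arrays.stream(x).average().getAsDouble();
        double scale = 1.0; // fallback for a constant column or a single row
        if (x.length > 1) {
            double sumSq = 0.0;
            for (double v : x) {
                sumSq += (v - mean) * (v - mean);
            }
            double sd = Math.sqrt(sumSq / (x.length - 1)); // sample formula, N-1 denominator
            if (sd > 0.0) {
                scale = sd;
            }
        }
        return new double[] {mean, scale};
    }

    /** Applies (x - mean) / scale to every value. */
    static double[] transform(double[] x, double mean, double scale) {
        double[] z = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            z[i] = (x[i] - mean) / scale;
        }
        return z;
    }

    public static void main(String[] args) {
        double[] varying = {1.0, 2.0, 3.0};
        double[] p = fit(varying);
        System.out.println(Arrays.toString(transform(varying, p[0], p[1]))); // [-1.0, 0.0, 1.0]

        double[] constant = {5.0, 5.0, 5.0};
        double[] q = fit(constant);
        System.out.println(Arrays.toString(transform(constant, q[0], q[1]))); // [0.0, 0.0, 0.0]
    }
}
```

With the fallback in place, a constant column never triggers a division by zero; new values that differ from the training constant come out as their plain offset from the mean.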
Method Summary
Static Methods
Method Details
fit
Fits the data transformation.
Parameters:
data - the training data.
columns - the columns to transform. If empty, transform all the numeric columns.
Returns:
the transform.
Throws:
IllegalArgumentException - if the data frame is empty or a specified column is non-numeric.