Record Class InformationValue
- Record Components:
feature - The feature name.
iv - The information value.
woe - The weight of evidence.
breaks - The breakpoints of intervals for numerical variables.
- All Implemented Interfaces:
Comparable<InformationValue>
Information value (IV) is a good measure of the predictive power of a feature. It also helps flag suspicious features. However, unlike some other feature selection methods, the features selected using IV might not be the best feature set for building a non-linear model.
Information Value | Predictive power |
---|---|
<0.02 | Useless |
0.02 to 0.1 | Weak predictors |
0.1 to 0.3 | Medium predictors |
0.3 to 0.5 | Strong predictors |
>0.5 | Suspicious |
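The thresholds in the table can be encoded as a small lookup. The sketch below is illustrative only: `IvBands` and `describe` are hypothetical names, not part of this class, and sending the shared boundary values 0.1 and 0.3 to the higher band is an assumption, since the table does not say which band they belong to.

```java
/** Hypothetical helper that maps an information value to the bands in the table above. */
final class IvBands {
    private IvBands() {}

    /**
     * Returns the predictive-power label for an information value.
     * The shared boundaries 0.1 and 0.3 go to the higher band (an assumption).
     */
    static String describe(double iv) {
        if (Double.isNaN(iv) || iv < 0.0) {
            throw new IllegalArgumentException("IV must be a non-negative number: " + iv);
        }
        if (iv < 0.02) return "Useless";
        if (iv < 0.1) return "Weak predictor";
        if (iv < 0.3) return "Medium predictor";
        if (iv <= 0.5) return "Strong predictor";
        return "Suspicious";
    }
}
```

A caller could pass the result of `iv()` to `IvBands.describe` to label each feature when reporting.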
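Weight of evidence (WoE) is defined for each bin or category as WoE = ln(percentage of events / percentage of non-events), where the percentages are the shares of all events and of all non-events that fall in that bin. Note that the conditional log odds is exactly what a logistic regression model tries to predict.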
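As a worked illustration of the formula, the following sketch computes WoE per bin from event and non-event counts, then sums the standard IV expression, IV = Σ (percentage of events − percentage of non-events) × WoE. The names `WoeSketch`, `woe`, `iv`, and `shares` are hypothetical, and this is not the library's implementation. It assumes every bin contains at least one event and one non-event, so no smoothing for empty bins is shown.

```java
/** Hypothetical sketch of WoE and IV computed from per-bin counts. */
final class WoeSketch {
    private WoeSketch() {}

    /** WoE per bin: ln(share of all events in the bin / share of all non-events in the bin). */
    static double[] woe(int[] events, int[] nonEvents) {
        double[][] shares = shares(events, nonEvents);
        double[] woe = new double[events.length];
        for (int i = 0; i < woe.length; i++) {
            woe[i] = Math.log(shares[0][i] / shares[1][i]);
        }
        return woe;
    }

    /** IV: sum over bins of (event share - non-event share) * WoE. */
    static double iv(int[] events, int[] nonEvents) {
        double[][] shares = shares(events, nonEvents);
        double iv = 0.0;
        for (int i = 0; i < events.length; i++) {
            double woe = Math.log(shares[0][i] / shares[1][i]);
            iv += (shares[0][i] - shares[1][i]) * woe;
        }
        return iv;
    }

    /** Row 0 holds each bin's share of events, row 1 its share of non-events. */
    private static double[][] shares(int[] events, int[] nonEvents) {
        if (events.length != nonEvents.length) {
            throw new IllegalArgumentException("count arrays must have the same length");
        }
        double totalEvents = 0.0;
        double totalNonEvents = 0.0;
        for (int i = 0; i < events.length; i++) {
            // Assumption: empty bins are rejected rather than smoothed.
            if (events[i] <= 0 || nonEvents[i] <= 0) {
                throw new IllegalArgumentException("bin " + i + " has an empty class");
            }
            totalEvents += events[i];
            totalNonEvents += nonEvents[i];
        }
        double[][] shares = new double[2][events.length];
        for (int i = 0; i < events.length; i++) {
            shares[0][i] = events[i] / totalEvents;
            shares[1][i] = nonEvents[i] / totalNonEvents;
        }
        return shares;
    }
}
```

With two bins holding 10 and 30 events and 30 and 10 non-events, the WoE values are −ln 3 and ln 3. Each IV term is non-negative, because the difference in shares always has the same sign as the WoE.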
WoE values of a categorical variable can be used to convert a categorical feature to a numerical feature. If a continuous feature does not have a linear relationship with the log odds, the feature can be binned into groups and a new feature created by replacing each bin with its WoE value. Therefore, WoE is a good variable transformation method for logistic regression.
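A minimal sketch of the binning step described above follows. It assumes `breaks` holds strictly increasing interior breakpoints, so that `woe` has one more entry than `breaks`, and that a value equal to a breakpoint falls into the upper bin. Both conventions are assumptions made for illustration and may differ from how this class lays out its `breaks` and `woe` components. `WoeEncoder` is a hypothetical name.

```java
/** Hypothetical sketch: replace a numeric value with the WoE of the bin it falls into. */
final class WoeEncoder {
    private WoeEncoder() {}

    /**
     * Looks up the WoE for x.
     * Assumes breaks is strictly increasing and woe.length == breaks.length + 1.
     */
    static double encode(double x, double[] breaks, double[] woe) {
        if (woe.length != breaks.length + 1) {
            throw new IllegalArgumentException("expected one WoE value per bin");
        }
        if (Double.isNaN(x)) {
            throw new IllegalArgumentException("cannot bin NaN");
        }
        int pos = java.util.Arrays.binarySearch(breaks, x);
        // When x is not found, binarySearch returns -(insertion point) - 1.
        // A value equal to a breakpoint is assigned to the upper bin (assumption).
        int bin = pos >= 0 ? pos + 1 : -pos - 1;
        return woe[bin];
    }
}
```

With `breaks = {10, 20}`, values below 10 map to `woe[0]`, values in [10, 20) to `woe[1]`, and values of 20 or more to `woe[2]`.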
When the bins of a numerical feature are arranged in ascending order, a linear trend in the WoE values indicates that the feature has the right linear relationship with the target. However, if the feature's WoE is non-linear, we should either discard the feature or consider some other variable transformation to ensure linearity. Hence, WoE helps check the linear relationship between a feature and the dependent variable used in the model. Although WoE and IV are highly useful, they should only be used with logistic regression.
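One quick screen before inspecting the trend more closely is to test whether the WoE values change direction across the ordered bins. Note that this checks monotonicity, which is a weaker condition than linearity: a non-monotonic sequence cannot be linear, but a monotonic one still may not be. The class `WoeTrend` and its method are hypothetical and not part of this API.

```java
/** Hypothetical screen for the WoE trend across bins ordered by feature value. */
final class WoeTrend {
    private WoeTrend() {}

    /** Returns true if the WoE values never change direction (ties are allowed). */
    static boolean isMonotonic(double[] woe) {
        boolean increasing = true;
        boolean decreasing = true;
        for (int i = 1; i < woe.length; i++) {
            if (woe[i] < woe[i - 1]) increasing = false;
            if (woe[i] > woe[i - 1]) decreasing = false;
        }
        return increasing || decreasing;
    }
}
```

A feature whose `woe()` array fails this screen is a candidate for re-binning or a different transformation, as described above.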
WoE is preferable to one-hot encoding because it does not increase the complexity of the model.
-
Constructor Summary
Constructor | Description |
---|---|
InformationValue(String feature, double iv, double[] woe, double[] breaks) | Creates an instance of an InformationValue record class. |
-
Method Summary
Modifier and Type | Method | Description |
---|---|---|
double[] | breaks() | Returns the value of the breaks record component. |
int | compareTo(InformationValue other) | |
final boolean | equals(Object o) | Indicates whether some other object is "equal to" this one. |
String | feature() | Returns the value of the feature record component. |
static InformationValue[] | fit(DataFrame data, String clazz) | Calculates the information value. |
static InformationValue[] | fit(DataFrame data, String clazz, int nbins) | Calculates the information value. |
final int | hashCode() | Returns a hash code value for this object. |
double | iv() | Returns the value of the iv record component. |
String | toString() | Returns a string representation of this record class. |
static String | toString(InformationValue[] ivs) | Returns a string representation of the array of information values. |
static ColumnTransform | toTransform(InformationValue[] values) | Returns the data transformation that converts feature values to their weight of evidence. |
double[] | woe() | Returns the value of the woe record component. |
-
Constructor Details
-
Method Details
-
compareTo
- Specified by:
compareTo in interface Comparable<InformationValue>
-
toString
Returns a string representation of this record class. The representation contains the name of the class, followed by the name and value of each of the record components.
-
toString
Returns a string representation of the array of information values.
- Parameters:
ivs - the array of information values.
- Returns:
a string representation of the information values.
-
toTransform
Returns the data transformation that converts feature values to their weight of evidence.
- Parameters:
values - the information value objects of features.
- Returns:
the transform.
-
fit
Calculates the information value.
- Parameters:
data - the data frame of the explanatory and response variables.
clazz - the column name of binary class labels.
- Returns:
the information value.
-
fit
Calculates the information value.
- Parameters:
data - the data frame of the explanatory and response variables.
clazz - the column name of binary class labels.
nbins - the number of bins to discretize numeric variables in WoE calculation.
- Returns:
the information value.
-
hashCode
public final int hashCode()
Returns a hash code value for this object. The value is derived from the hash code of each of the record components.
-
equals
Indicates whether some other object is "equal to" this one. The objects are equal if the other object is of the same class and if all the record components are equal. Reference components are compared with Objects::equals(Object,Object); primitive components are compared with '=='.
-
feature
Returns the value of the feature record component.
- Returns:
the value of the feature record component
-
iv
public double iv()
Returns the value of the iv record component.
- Returns:
the value of the iv record component
-
woe
public double[] woe()
Returns the value of the woe record component.
- Returns:
the value of the woe record component
-
breaks
public double[] breaks()
Returns the value of the breaks record component.
- Returns:
the value of the breaks record component
-