Interface SHAP<T>

Type Parameters:
T - the data type of model input objects.
All Known Subinterfaces:
TreeSHAP
All Known Implementing Classes:
AdaBoost, CART, DecisionTree, GradientTreeBoost, GradientTreeBoost, RandomForest, RandomForest, RegressionTree

public interface SHAP<T>
SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explaining the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory.

SHAP leverages local methods designed to explain a prediction f(x) based on a single input x. The local methods are defined as any interpretable approximation of the original model. In particular, SHAP employs additive feature attribution methods.

SHAP values attribute to each feature the change in the expected model prediction when conditioning on that feature. They explain how to get from the base value E[f(z)] that would be predicted if we did not know any features to the current output f(x).
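
The path from the base value to the current output can be written as an additive decomposition. The notation below is an assumption made for illustration (the symbol \phi_i for the SHAP value of feature i, and p for the number of features); it restates the additive feature attribution property rather than quoting this interface's documentation.

```latex
f(x) = \phi_0 + \sum_{i=1}^{p} \phi_i(x), \qquad \phi_0 = E[f(z)]
```

Read this way, a positive \phi_i(x) pushes the prediction above the base value and a negative one pulls it below.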

In game theory, the Shapley value of a player is that player's marginal contribution averaged over all possible coalitions of the other players, or equivalently over all orders in which the players could join.
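
As a rough illustration of that averaging, the sketch below computes exact Shapley values for a tiny game by enumerating every coalition. It is not how this interface or its tree-based implementations compute SHAP values; the class `ShapleyDemo`, the method `shapley`, and the toy payoff function `toyGame` are all hypothetical names made up for this example. Coalitions are encoded as bitmasks, and each marginal contribution is weighted by |S|! (n - |S| - 1)! / n!.

```java
import java.util.Arrays;
import java.util.function.IntToDoubleFunction;

final class ShapleyDemo {
    /**
     * Exact Shapley values for an n-player game (hypothetical helper).
     * The value function maps a coalition, encoded as a bitmask, to its payoff.
     * Cost grows as n * 2^n, so this is only practical for small n.
     */
    static double[] shapley(int n, IntToDoubleFunction value) {
        double[] fact = new double[n + 1];
        fact[0] = 1.0;
        for (int i = 1; i <= n; i++) fact[i] = fact[i - 1] * i;

        double[] phi = new double[n];
        for (int i = 0; i < n; i++) {
            for (int s = 0; s < (1 << n); s++) {
                if ((s & (1 << i)) != 0) continue; // only coalitions without player i
                int size = Integer.bitCount(s);
                double weight = fact[size] * fact[n - size - 1] / fact[n];
                double marginal = value.applyAsDouble(s | (1 << i)) - value.applyAsDouble(s);
                phi[i] += weight * marginal;
            }
        }
        return phi;
    }

    /** Toy payoff: player 0 adds 10, player 1 adds 20, together they add 6 more; player 2 adds nothing. */
    static double toyGame(int coalition) {
        boolean has0 = (coalition & 1) != 0;
        boolean has1 = (coalition & 2) != 0;
        return (has0 ? 10.0 : 0.0) + (has1 ? 20.0 : 0.0) + (has0 && has1 ? 6.0 : 0.0);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shapley(3, ShapleyDemo::toyGame)));
    }
}
```

In this toy game the interaction bonus is split evenly between players 0 and 1, so the values should come out close to 13, 23 and 0, and they sum to the payoff of the full coalition.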

References

  1. Lundberg, Scott M., and Su-In Lee. A unified approach to interpreting model predictions. NIPS, 2017.
  2. Lundberg, Scott M., Gabriel G. Erion, and Su-In Lee. Consistent individualized feature attribution for tree ensembles.
  • Method Summary

    Modifier and Type
    Method
    Description
    default double[]
    shap(Stream<T> data)
    Returns the average of absolute SHAP values over a data set.
    double[]
    shap(T x)
    Returns the SHAP values.
  • Method Details

    • shap

      double[] shap(T x)
      Returns the SHAP values of an instance. For regression, the returned array has one element per feature. For classification, the array holds p x k values in a flat layout, where p is the number of features and k is the number of classes: the first k elements are the SHAP values of the first feature for each of the k classes, the next k elements belong to the second feature, and so on.
      Parameters:
      x - an instance.
      Returns:
      the SHAP values.
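
The flat classification layout described for this method can be awkward to index by hand. The sketch below shows one way to read it, assuming the feature-major order stated above (all k class values of a feature are contiguous). `ShapLayout`, `at`, and `toMatrix` are hypothetical helpers for illustration and are not part of this interface.

```java
final class ShapLayout {
    /** SHAP value of one feature for one class in a flat p x k array (hypothetical helper). */
    static double at(double[] shap, int k, int feature, int classIndex) {
        return shap[feature * k + classIndex];
    }

    /** Copies the flat array into a p x k matrix with one row per feature. */
    static double[][] toMatrix(double[] shap, int k) {
        if (k <= 0 || shap.length % k != 0) {
            throw new IllegalArgumentException("array length must be a multiple of k");
        }
        int p = shap.length / k;
        double[][] matrix = new double[p][k];
        for (int j = 0; j < p; j++) {
            System.arraycopy(shap, j * k, matrix[j], 0, k);
        }
        return matrix;
    }
}
```

For a regression model no reshaping is needed, since the array already has one element per feature.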
    • shap

      default double[] shap(Stream<T> data)
      Returns the average of absolute SHAP values over a data set.
      Parameters:
      data - the data set.
      Returns:
      the average of absolute SHAP values over a data set.
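
To make the averaging concrete, here is a minimal sketch of what "average of absolute SHAP values" means. It is an assumption about the computation, not the library's actual default implementation: `MeanAbsShap` and `meanAbs` are hypothetical names, and the `explainer` function stands in for a per-instance `shap(T x)` call. It also assumes every instance yields an array of the same length.

```java
import java.util.Iterator;
import java.util.function.Function;
import java.util.stream.Stream;

final class MeanAbsShap {
    /**
     * Averages |SHAP value| per position over all instances in the stream.
     * Returns an empty array when the stream has no elements.
     */
    static <T> double[] meanAbs(Stream<T> data, Function<T, double[]> explainer) {
        double[] sum = null;
        long n = 0;
        Iterator<T> it = data.iterator();
        while (it.hasNext()) {
            double[] values = explainer.apply(it.next());
            if (sum == null) sum = new double[values.length];
            for (int i = 0; i < values.length; i++) {
                sum[i] += Math.abs(values[i]);
            }
            n++;
        }
        if (sum == null) return new double[0];
        for (int i = 0; i < sum.length; i++) {
            sum[i] /= n;
        }
        return sum;
    }
}
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, which is why the result is commonly read as a global feature importance score.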