Class BinaryEncoder

java.lang.Object
smile.feature.extraction.BinaryEncoder
All Implemented Interfaces:
Function<Tuple,int[]>

public class BinaryEncoder extends Object implements Function<Tuple,int[]>
Encodes categorical features using a sparse one-hot scheme. The categorical attributes are converted to binary dummy variables in a compact representation in which only the indices of nonzero elements are stored in an integer array. The Maximum Entropy Classifier expects its input data in this format.
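The compact representation can be illustrated with a minimal, library-free sketch. It assumes each categorical value is already an integer level code in the range 0 to k-1, and that each column occupies a contiguous block of the full one-hot vector. The class SparseOneHotSketch and its methods are hypothetical names used for illustration only; they are not part of Smile and may not match how BinaryEncoder is implemented internally.

```java
import java.util.Arrays;

/** Hypothetical illustration of a sparse one-hot layout; not Smile code. */
public class SparseOneHotSketch {
    /** Returns the starting position of each column's block in the full one-hot vector. */
    static int[] offsets(int[] levels) {
        int[] base = new int[levels.length];
        for (int i = 1; i < levels.length; i++) {
            base[i] = base[i - 1] + levels[i - 1];
        }
        return base;
    }

    /**
     * Encodes one row of level codes as the indices of its nonzero dummy variables.
     * Assumption: codes[i] is in [0, levels[i]).
     */
    static int[] encode(int[] codes, int[] levels) {
        int[] base = offsets(levels);
        int[] features = new int[codes.length];
        for (int i = 0; i < codes.length; i++) {
            if (codes[i] < 0 || codes[i] >= levels[i]) {
                throw new IllegalArgumentException("Invalid level code in column " + i);
            }
            // Exactly one dummy variable per column is nonzero.
            features[i] = base[i] + codes[i];
        }
        return features;
    }

    public static void main(String[] args) {
        int[] levels = {3, 2, 4};   // number of levels in each categorical column
        int[] row = {2, 0, 1};      // level codes of one row
        System.out.println(Arrays.toString(encode(row, levels))); // prints [2, 3, 6]
    }
}
```

In this sketch the dense one-hot vector would have 9 elements, but only the three nonzero positions are stored, one per categorical column.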
  • Constructor Details

    • BinaryEncoder

      public BinaryEncoder(StructType schema, String... columns)
      Constructor.
      Parameters:
      schema - the data frame schema.
      columns - the column names of categorical variables. If empty, all categorical columns will be used.
  • Method Details

    • apply

      public int[] apply(Tuple x)
      Generates the compact representation of sparse binary features for a given object.
      Specified by:
      apply in interface Function<Tuple,int[]>
      Parameters:
      x - an object of interest.
      Returns:
      an integer array of nonzero binary features.
    • apply

      public int[][] apply(DataFrame data)
      Generates the compact representation of sparse binary features for a data frame.
      Parameters:
      data - a data frame.
      Returns:
      the binary feature vectors.
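To show how the returned vectors can be read, the following sketch expands rows of nonzero indices into dense 0/1 vectors. It assumes the total number of dummy variables (dimension) is known to the caller. DenseExpansionSketch and toDense are hypothetical helpers for illustration, not Smile methods.

```java
import java.util.Arrays;

/** Hypothetical helper that expands sparse index rows into dense vectors; not Smile code. */
public class DenseExpansionSketch {
    /** Sets position sparse[r][j] of row r to 1.0 and leaves all other positions at 0.0. */
    static double[][] toDense(int[][] sparse, int dimension) {
        double[][] dense = new double[sparse.length][dimension];
        for (int r = 0; r < sparse.length; r++) {
            for (int index : sparse[r]) {
                dense[r][index] = 1.0;
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        // Two rows in the compact form: each entry is the index of a nonzero feature.
        int[][] sparse = {{2, 3, 6}, {0, 4, 5}};
        double[][] dense = toDense(sparse, 9);
        for (double[] row : dense) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The compact form stores one integer per categorical column, while the dense form grows with the total number of levels across all columns.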