Class Transform

java.lang.Object
smile.plot.vega.Transform

public class Transform extends Object
View-level data transformations such as filter and new field calculation. When both view-level transforms and field transforms inside encoding are specified, the view-level transforms are executed first based on the order in the array. Then the inline transforms are executed in this order: bin, timeUnit, aggregate, sort, and stack.
  • Method Details

    • toString

      public String toString()
      Overrides:
      toString in class Object
    • toPrettyString

      public String toPrettyString()
      Returns the specification as a pretty-printed string.
      Returns:
      the pretty-printed specification.
    • aggregate

      public Transform aggregate(String op, String field, String as, String... groupby)
      Aggregate summarizes a table as one record for each group. To preserve the original table structure and instead add a new column with the aggregate values, use the join aggregate transform.
      Parameters:
      op - The aggregation operation to apply to the fields (e.g., "sum", "average", or "count").
      field - The data field for which to compute the aggregate function. This is required for all aggregation operations except "count".
      as - The output field name to use for the aggregated value.
      groupby - The data fields to group by. If not specified, a single group containing all data objects will be used.
      Returns:
      this object.
    • joinAggregate

      public Transform joinAggregate(String op, String field, String as, String... groupby)
      The join-aggregate transform extends the input data objects with aggregate values in a new field. Aggregation is performed and the results are then joined with the input data. This transform can be helpful for creating derived values that combine both raw data and aggregate calculations, such as percentages of group totals. This transform is a special case of the window transform where the frame is always [null, null]. Compared with the regular aggregate transform, join-aggregate preserves the original table structure and augments records with aggregate values rather than summarizing the data in one record for each group.
      Parameters:
      op - The aggregation operation to apply to the fields (e.g., "sum", "average", or "count").
      field - The data field for which to compute the aggregate function. This is required for all aggregation operations except "count".
      as - The output field name to use for the aggregated value.
      groupby - The data fields to group by. If not specified, a single group containing all data objects will be used.
      Returns:
      this object.
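
      To make the difference between aggregate and joinAggregate concrete, here is a minimal plain-Java sketch of the two behaviours for a "sum" operation with a single groupby field. It does not call this class or reproduce how the transforms are implemented; the class AggregateSketch, its methods, and the sample field names "group", "amount", and "total" are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of smile.plot.vega.
public class AggregateSketch {
    /** Sums `field` per distinct value of `groupby`, keeping first-seen group order. */
    public static Map<Object, Double> sumByGroup(List<Map<String, Object>> rows,
                                                 String field, String groupby) {
        Map<Object, Double> sums = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            double v = ((Number) row.get(field)).doubleValue();
            sums.merge(row.get(groupby), v, Double::sum);
        }
        return sums;
    }

    /** aggregate-like behaviour: one output record per group. */
    public static List<Map<String, Object>> aggregateSum(List<Map<String, Object>> rows,
                                                         String field, String as, String groupby) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map.Entry<Object, Double> e : sumByGroup(rows, field, groupby).entrySet()) {
            Map<String, Object> rec = new LinkedHashMap<>();
            rec.put(groupby, e.getKey());
            rec.put(as, e.getValue());
            out.add(rec);
        }
        return out;
    }

    /** joinAggregate-like behaviour: every input record is kept and gains the group sum. */
    public static List<Map<String, Object>> joinAggregateSum(List<Map<String, Object>> rows,
                                                             String field, String as, String groupby) {
        Map<Object, Double> sums = sumByGroup(rows, field, groupby);
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> rec = new LinkedHashMap<>(row); // copy the original fields
            rec.put(as, sums.get(row.get(groupby)));            // add the aggregate column
            out.add(rec);
        }
        return out;
    }

    /** Builds a sample record with the hypothetical fields "group" and "amount". */
    public static Map<String, Object> row(String group, double amount) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("group", group);
        r.put("amount", amount);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(row("a", 1), row("a", 3), row("b", 6));
        System.out.println(aggregateSum(rows, "amount", "total", "group"));     // two records
        System.out.println(joinAggregateSum(rows, "amount", "total", "group")); // three records
    }
}
```

      With three input records in two groups, the aggregate-like method yields two records, while the join-aggregate-like method yields three, each carrying its group total alongside the original fields. That is the shape needed for derived values such as percentages of group totals.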
    • bin

      public Transform bin(String field, String as)
      Adds a bin transformation.
      Parameters:
      field - The data field to bin.
      as - The output fields at which to write the start and end bin values.
      Returns:
      this object.
    • calculate

      public Transform calculate(String expr, String field)
      Adds a formula transform, which extends data objects with new fields (columns) according to an expression.
      Parameters:
      expr - an expression string. Use the variable datum to refer to the current data object.
      field - the field for storing the computed formula value.
      Returns:
      this object.
    • density

      public DensityTransform density(String field, String... groupby)
      Adds a density transformation.
      Parameters:
      field - The data field for which to perform density estimation.
      groupby - The data fields to group by. If not specified, a single group containing all data objects will be used.
      Returns:
      a density transform object.
    • extent

      public Transform extent(String field, String param)
      Adds an extent transform. The extent transform finds the extent of a field and stores the result in a parameter.
      Parameters:
      field - The field of which to get the extent.
      param - The output parameter produced by the extent transform.
      Returns:
      this object.
    • flatten

      public Transform flatten(String[] fields, String[] output)
      Adds a flatten transform. The flatten transform maps array-valued fields to a set of individual data objects, one per array entry. This transform generates a new data stream in which each data object consists of an extracted array value as well as all the original fields of the corresponding input data object.
      Parameters:
      fields - An array of one or more data fields containing arrays to flatten. If multiple fields are specified, their array values should have a parallel structure, ideally with the same length. If the lengths of parallel arrays do not match, the longest array will be used with null values added for missing entries.
      output - The output field names for the extracted array values.
      Returns:
      this object.
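
      The padding rule for parallel arrays of unequal length is easiest to see on a small example. The following plain-Java sketch mimics the described flatten semantics on in-memory records; it is not the transform's implementation, and the class FlattenSketch and the sample field names are hypothetical. It assumes every listed field is present and holds a List.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of smile.plot.vega.
public class FlattenSketch {
    /**
     * Emits one record per array position. Each output record copies the
     * original fields and writes the i-th element of fields[j] to output[j].
     * Shorter arrays are padded with null up to the longest array's length.
     */
    public static List<Map<String, Object>> flatten(List<Map<String, Object>> rows,
                                                    String[] fields, String[] output) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            int longest = 0;
            for (String f : fields) {
                longest = Math.max(longest, ((List<?>) row.get(f)).size());
            }
            for (int i = 0; i < longest; i++) {
                Map<String, Object> rec = new LinkedHashMap<>(row);
                for (int j = 0; j < fields.length; j++) {
                    List<?> values = (List<?>) row.get(fields[j]);
                    rec.put(output[j], i < values.size() ? values.get(i) : null);
                }
                out.add(rec);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("key", "alpha");
        row.put("foo", List.of(1, 2));
        row.put("bar", List.of("A"));
        // Two output records; the second has a null "bar" because that array is shorter.
        System.out.println(flatten(List.of(row),
                new String[] {"foo", "bar"}, new String[] {"foo", "bar"}));
    }
}
```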
    • fold

      public Transform fold(String[] fields, String[] output)
      Adds a fold transform. The fold transform collapses (or "folds") one or more data fields into two properties: a key property (containing the original data field name) and a value property (containing the data value).

      The fold transform is useful for mapping matrix or cross-tabulation data into a standardized format.

      This transform generates a new data stream in which each data object consists of the key and value properties as well as all the original fields of the corresponding input data object.

      Note: The fold transform only applies to a list of known fields (set using the fields parameter). If your data objects instead contain array-typed fields, you may wish to use the flatten transform instead.

      Parameters:
      fields - An array of data fields indicating the properties to fold.
      output - The output field names for the key and value properties produced by the fold transform.
      Returns:
      this object.
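
      As a rough illustration of the fold semantics described above, the plain-Java sketch below turns one wide record into one record per folded field. It is an assumption-laden stand-in, not the transform's implementation: the class FoldSketch and the sample fields "country", "gold", and "silver" are hypothetical, and output is assumed to hold exactly two names (key first, value second).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of smile.plot.vega.
public class FoldSketch {
    /**
     * For each input record and each folded field, emits a record that keeps
     * all original fields and adds output[0] (the field name) and
     * output[1] (the field's value).
     */
    public static List<Map<String, Object>> fold(List<Map<String, Object>> rows,
                                                 String[] fields, String[] output) {
        String keyName = output[0];
        String valueName = output[1];
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            for (String f : fields) {
                Map<String, Object> rec = new LinkedHashMap<>(row);
                rec.put(keyName, f);
                rec.put(valueName, row.get(f));
                out.add(rec);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("country", "USA");
        row.put("gold", 10);
        row.put("silver", 20);
        // One wide record becomes two long records, one per folded field.
        System.out.println(fold(List.of(row),
                new String[] {"gold", "silver"}, new String[] {"key", "value"}));
    }
}
```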
    • filter

      public Transform filter(String predicate)
      Adds a filter transform.
      Parameters:
      predicate - an expression string, where datum can be used to refer to the current data object. For example, "datum.b2 > 60" restricts the output data to items whose value in the field b2 is over 60.
      Returns:
      this object.
    • filter

      public Transform filter(Predicate predicate)
      Adds a filter transform.
      Parameters:
      predicate - a predicate object.
      Returns:
      this object.
    • impute

      public ImputeTransform impute(String field, String key)
      Adds an impute transform.
      Parameters:
      field - The data field for which the missing values should be imputed.
      key - A key field that uniquely identifies data objects within a group. Missing key values (those occurring in the data but not in the current group) will be imputed.
      Returns:
      an impute transform object.
    • loess

      public LoessTransform loess(String field, String on)
      Adds a loess transform.
      Parameters:
      field - The data field of the dependent variable to smooth.
      on - The data field of the independent variable to use as a predictor.
      Returns:
      a loess transform object.
    • lookup

      public Transform lookup(String key, String param)
      Adds a lookup transformation.
      Parameters:
      key - the key in the primary data source.
      param - Selection parameter name to look up.
      Returns:
      this object.
    • lookup

      public Transform lookup(String key, LookupData from)
      Adds a lookup transformation.
      Parameters:
      key - the key in the primary data source.
      from - the data source or selection for secondary data reference.
      Returns:
      this object.
    • lookupData

      public LookupData lookupData(String key)
      Creates a lookup data object.
      Parameters:
      key - the key in the data to look up.
      Returns:
      a lookup data object.
    • pivot

      public PivotTransform pivot(String field, String value)
      Adds a pivot transform.
      Parameters:
      field - The data field to pivot on. The unique values of this field become new field names in the output stream.
      value - The data field to populate pivoted fields. The aggregate values of this field become the values of the new pivoted fields.
      Returns:
      a pivot transform object.
    • quantile

      public QuantileTransform quantile(String field)
      Adds a quantile transform.
      Parameters:
      field - The data field for which to perform quantile estimation.
      Returns:
      a quantile transform object.
    • regression

      public RegressionTransform regression(String field, String on)
      Adds a regression transform.
      Parameters:
      field - The data field of the dependent variable to predict.
      on - The data field of the independent variable to use as a predictor.
      Returns:
      a regression transform object.
    • sample

      public Transform sample(int size)
      Adds a sample transform. The sample transform filters random rows from the data source to reduce its size. As input data objects are added and removed, the sampled values may change in a first-in, first-out manner. This transform uses reservoir sampling to maintain a representative sample of the stream.
      Parameters:
      size - The maximum number of data objects to include in the sample.
      Returns:
      this object.
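
      For readers unfamiliar with reservoir sampling, here is a minimal plain-Java sketch of the classic single-pass variant (often called Algorithm R). It shows the general technique only; how the sample transform applies it internally may differ. The class ReservoirSketch and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of smile.plot.vega.
public class ReservoirSketch {
    /**
     * Keeps at most `size` items from a stream of unknown length so that,
     * after n items have been seen, each item is in the sample with
     * probability size / n (for n >= size).
     */
    public static <T> List<T> sample(Iterable<T> stream, int size, Random rng) {
        List<T> reservoir = new ArrayList<>(size);
        int seen = 0;
        for (T item : stream) {
            seen++;
            if (reservoir.size() < size) {
                reservoir.add(item);           // fill the reservoir first
            } else {
                int j = rng.nextInt(seen);     // uniform in [0, seen)
                if (j < size) {
                    reservoir.set(j, item);    // replace a random slot
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            data.add(i);
        }
        // Prints 10 distinct values drawn from 0..99.
        System.out.println(sample(data, 10, new Random()));
    }
}
```

      When the stream holds fewer than size items, the sketch returns all of them, which matches the description of size as a maximum.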
    • stack

      public StackTransform stack(String field, String as, String... groupby)
      Adds a stack transform.
      Parameters:
      field - The field which is stacked.
      as - The output start field name. The end field name is this value followed by "_end".
      groupby - The data fields to group by.
      Returns:
      a stack transform object.
    • timeUnit

      public Transform timeUnit(String timeUnit, String field, String as)
      Adds a time unit transform.
      Parameters:
      timeUnit - The time unit to apply (e.g., "year", "month", or "yearmonth").
      field - The data field to apply time unit.
      as - The output field to write the timeUnit value.
      Returns:
      this object.
    • window

      public WindowTransform window(WindowTransformField... fields)
      Adds a window transform.
      Parameters:
      fields - The definition of the fields in the window.
      Returns:
      a window transform object.