Package smile.plot.vega
Class Transform
java.lang.Object
smile.plot.vega.Transform
View-level data transformations such as filter and new field calculation.
When both view-level transforms and field transforms inside encoding are
specified, the view-level transforms are executed first based on the order
in the array. Then the inline transforms are executed in this order: bin,
timeUnit, aggregate, sort, and stack.
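The ordering rule above can be sketched in plain Java. This is only an illustration of the stated order, not Smile's or Vega-Lite's internals; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of the execution order described above (hypothetical names). */
final class TransformOrderSketch {
    // Fixed order in which inline (encoding-level) transforms run.
    static final List<String> INLINE_ORDER =
            List.of("bin", "timeUnit", "aggregate", "sort", "stack");

    /**
     * Returns the overall execution order: view-level transforms in array order,
     * then whichever inline transforms are present, in the fixed inline order.
     */
    static List<String> executionOrder(List<String> viewLevel, List<String> inlinePresent) {
        List<String> order = new ArrayList<>(viewLevel);
        for (String name : INLINE_ORDER) {
            if (inlinePresent.contains(name)) {
                order.add(name);
            }
        }
        return order;
    }
}
```

For example, with view-level transforms filter and calculate and inline transforms stack and bin, the resulting order is filter, calculate, bin, stack.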
-
Method Summary
Method - Description
aggregate - Aggregate summarizes a table as one record for each group.
bin - Adds a bin transformation.
calculate - Adds a formula transform, which extends data objects with new fields (columns) according to an expression.
density - Adds a density transformation.
extent - Adds an extent transform.
filter - Adds a filter transform.
filter - Adds a filter transform.
flatten - Adds a flatten transform.
fold - Adds a fold transform.
impute - Adds an impute transform.
joinAggregate(String op, String field, String as, String... groupby) - The join-aggregate transform extends the input data objects with aggregate values in a new field.
loess - Adds a loess transform.
lookup - Adds a lookup transformation.
lookup(String key, LookupData from) - Adds a lookup transformation.
lookupData(String key) - Creates a lookup data object.
pivot - Adds a pivot transform.
quantile - Adds a quantile transform.
regression(String field, String on) - Adds a regression transform.
sample(int size) - Adds a sample transform.
stack - Adds a stack transform.
timeUnit - Adds a time unit transform.
toPrettyString - Returns the specification in pretty print.
toString()
window(WindowTransformField... fields) - Creates a data specification object.
-
Method Details
-
toString
-
toPrettyString
Returns the specification in pretty print.
- Returns:
  the specification in pretty print.
-
aggregate
Aggregate summarizes a table as one record for each group. To preserve the original table structure and instead add a new column with the aggregate values, use the join-aggregate transform.
- Parameters:
  op - The aggregation operation to apply to the field (e.g., "sum", "average", or "count").
  field - The data field for which to compute the aggregate function. This is required for all aggregation operations except "count".
  as - The output field name to use for the aggregated field.
  groupby - The data fields to group by. If not specified, a single group containing all data objects will be used.
- Returns:
  this object.
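As a rough illustration of what "one record for each group" means, the following plain-Java sketch computes a "sum" aggregate grouped by a single field. It does not call Smile; the class name, the method name, and the use of maps as data objects are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: the semantics of a grouped "sum" aggregate (hypothetical names). */
final class AggregateSketch {
    /** Sums `field` per distinct value of `groupby`, yielding one entry per group. */
    static Map<String, Double> sumByGroup(List<Map<String, Object>> rows,
                                          String field, String groupby) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            String group = String.valueOf(row.get(groupby));
            double value = ((Number) row.get(field)).doubleValue();
            totals.merge(group, value, Double::sum); // accumulate the per-group sum
        }
        return totals;
    }
}
```

Three input rows spread over two groups collapse to two output entries, so the original rows are no longer available after the transform.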
-
joinAggregate
The join-aggregate transform extends the input data objects with aggregate values in a new field. Aggregation is performed and the results are then joined with the input data. This transform can be helpful for creating derived values that combine both raw data and aggregate calculations, such as percentages of group totals. This transform is a special case of the window transform where the frame is always [null, null]. Compared with the regular aggregate transform, join-aggregate preserves the original table structure and augments records with aggregate values rather than summarizing the data in one record for each group.
- Parameters:
  op - The aggregation operation to apply to the field (e.g., "sum", "average", or "count").
  field - The data field for which to compute the aggregate function. This is required for all aggregation operations except "count".
  as - The output field name to use for the aggregated field.
  groupby - The data fields to group by. If not specified, a single group containing all data objects will be used.
- Returns:
  this object.
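The difference from a regular aggregate can be sketched in plain Java: every input row survives and gains one extra field holding its group's aggregate. This is a minimal sketch for a "sum" operation with a single group-by field; it does not call Smile, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: the semantics of a join-aggregate with op "sum" (hypothetical names). */
final class JoinAggregateSketch {
    /** Returns a copy of every row with field `as` set to the sum of `field` over the row's group. */
    static List<Map<String, Object>> joinSum(List<Map<String, Object>> rows,
                                             String field, String as, String groupby) {
        // First pass: aggregate per group.
        Map<Object, Double> totals = new HashMap<>();
        for (Map<String, Object> row : rows) {
            double value = ((Number) row.get(field)).doubleValue();
            totals.merge(row.get(groupby), value, Double::sum);
        }
        // Second pass: join the aggregate back onto each original row.
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> copy = new LinkedHashMap<>(row);
            copy.put(as, totals.get(row.get(groupby)));
            out.add(copy);
        }
        return out;
    }
}
```

With the group total available on each row, a percentage of the group total is simply the row's own value divided by the new field.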
-
bin
Adds a bin transformation.
- Parameters:
  field - The data field to bin.
  as - The output fields at which to write the start and end bin values.
- Returns:
  this object.
-
calculate
Adds a formula transform, which extends data objects with new fields (columns) according to an expression.
- Parameters:
  expr - An expression string. Use the variable datum to refer to the current data object.
  field - The field in which to store the computed formula value.
- Returns:
  this object.
-
density
Adds a density transformation.
- Parameters:
  field - The data field for which to perform density estimation.
  groupby - The data fields to group by. If not specified, a single group containing all data objects will be used.
- Returns:
  this object.
-
extent
Adds an extent transform. The extent transform finds the extent of a field and stores the result in a parameter.
- Parameters:
  field - The field of which to get the extent.
  param - The output parameter produced by the extent transform.
- Returns:
  this object.
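The "extent" of a field is its minimum and maximum. A minimal plain-Java sketch of that computation follows; it does not call Smile, the names are hypothetical, and skipping non-numeric values is an assumption of the sketch.

```java
import java.util.List;
import java.util.Map;

/** Illustrative only: computing the extent (min and max) of a field (hypothetical names). */
final class ExtentSketch {
    /** Returns {min, max} of a numeric field; rows without a numeric value are skipped. */
    static double[] extent(List<Map<String, Object>> rows, String field) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (Map<String, Object> row : rows) {
            Object value = row.get(field);
            if (value instanceof Number) {
                double v = ((Number) value).doubleValue();
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        return new double[] {min, max};
    }
}
```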
-
flatten
Adds a flatten transform. The flatten transform maps array-valued fields to a set of individual data objects, one per array entry. This transform generates a new data stream in which each data object consists of an extracted array value as well as all the original fields of the corresponding input data object.
- Parameters:
  fields - An array of one or more data fields containing arrays to flatten. If multiple fields are specified, their array values should have a parallel structure, ideally with the same length. If the lengths of parallel arrays do not match, the longest array will be used with null values added for missing entries.
  output - The output field names for the extracted array values.
- Returns:
  this object.
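The null-padding rule for parallel arrays of unequal length can be illustrated with a short plain-Java sketch. It does not call Smile; the names are hypothetical, and it writes each extracted value back under the original field name rather than under a separate output name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: flattening parallel array fields of one data object (hypothetical names). */
final class FlattenSketch {
    /** Emits one row per array index; shorter arrays are padded with null. */
    static List<Map<String, Object>> flatten(Map<String, Object> row, List<String> fields) {
        // The longest array determines how many output rows are produced.
        int longest = 0;
        for (String f : fields) {
            longest = Math.max(longest, ((List<?>) row.get(f)).size());
        }
        List<Map<String, Object>> out = new ArrayList<>();
        for (int i = 0; i < longest; i++) {
            Map<String, Object> copy = new LinkedHashMap<>(row); // keep all original fields
            for (String f : fields) {
                List<?> values = (List<?>) row.get(f);
                copy.put(f, i < values.size() ? values.get(i) : null);
            }
            out.add(copy);
        }
        return out;
    }
}
```

A data object whose field a holds three entries and whose field b holds one entry yields three rows, with b set to null in the second and third.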
-
fold
Adds a fold transform. The fold transform collapses (or "folds") one or more data fields into two properties: a key property (containing the original data field name) and a value property (containing the data value). The fold transform is useful for mapping matrix or cross-tabulation data into a standardized format.
This transform generates a new data stream in which each data object consists of the key and value properties as well as all the original fields of the corresponding input data object.
Note: The fold transform only applies to a list of known fields (set using the fields parameter). If your data objects instead contain array-typed fields, you may wish to use the flatten transform instead.
- Parameters:
  fields - An array of data fields indicating the properties to fold.
  output - The output field names for the key and value properties produced by the fold transform.
- Returns:
  this object.
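The key/value reshaping can be sketched in plain Java as follows. This is an illustration of the semantics only, not a call into Smile; the class, method, and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: folding several fields into key/value pairs (hypothetical names). */
final class FoldSketch {
    /** Emits one row per (input row, folded field), keeping all original fields. */
    static List<Map<String, Object>> fold(List<Map<String, Object>> rows, List<String> fields,
                                          String keyAs, String valueAs) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            for (String field : fields) {
                Map<String, Object> copy = new LinkedHashMap<>(row);
                copy.put(keyAs, field);            // the original field name
                copy.put(valueAs, row.get(field)); // the value stored in that field
                out.add(copy);
            }
        }
        return out;
    }
}
```

Folding the fields gold and silver of a single row produces two rows, one with key "gold" and one with key "silver", each still carrying the row's other fields.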
-
filter
Adds a filter transform.
- Parameters:
  predicate - An expression string, where datum can be used to refer to the current data object. For example, "datum.b2 > 60" keeps only the items whose value in the field b2 is over 60.
- Returns:
  this object.
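To make the role of datum concrete, the sketch below expresses the example predicate "datum.b2 > 60" as a plain-Java Predicate applied to each data object. It does not evaluate expression strings and does not call Smile; all names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Illustrative only: a filter predicate evaluated once per data object (hypothetical names). */
final class FilterSketch {
    // Plain-Java counterpart of the expression string "datum.b2 > 60".
    static final Predicate<Map<String, Object>> B2_OVER_60 =
            datum -> datum.get("b2") instanceof Number
                    && ((Number) datum.get("b2")).doubleValue() > 60;

    /** Keeps only the rows for which the predicate holds. */
    static List<Map<String, Object>> filter(List<Map<String, Object>> rows,
                                            Predicate<Map<String, Object>> predicate) {
        return rows.stream().filter(predicate).collect(Collectors.toList());
    }
}
```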
-
filter
Adds a filter transform.
- Parameters:
  predicate - A predicate object.
- Returns:
  this object.
-
impute
Adds an impute transform.
- Parameters:
  field - The data field for which the missing values should be imputed.
  key - A key field that uniquely identifies data objects within a group. Missing key values (those occurring in the data but not in the current group) will be imputed.
- Returns:
  an impute transform object.
-
loess
Adds a loess transform.
- Parameters:
  field - The data field of the dependent variable to smooth.
  on - The data field of the independent variable to use as a predictor.
- Returns:
  a loess transform object.
-
lookup
Adds a lookup transformation.
- Parameters:
  key - The key in the primary data source.
  param - Selection parameter name to look up.
- Returns:
  this object.
-
lookup
Adds a lookup transformation.
- Parameters:
  key - The key in the primary data source.
  from - The data source or selection for the secondary data reference.
- Returns:
  this object.
-
lookupData
Creates a lookup data object.
- Parameters:
  key - The key in the data to look up.
- Returns:
  a lookup data object.
-
pivot
Adds a pivot transform.
- Parameters:
  field - The data field to pivot on. The unique values of this field become new field names in the output stream.
  value - The data field to populate pivoted fields. The aggregate values of this field become the values of the new pivoted fields.
- Returns:
  a pivot transform object.
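How unique field values turn into new field names can be sketched in plain Java. The sketch assumes "sum" as the aggregate and a single group-by field to decide which rows share an output record; neither is stated above, and all names are hypothetical. It does not call Smile.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: pivoting with an assumed "sum" aggregate (hypothetical names). */
final class PivotSketch {
    /** Emits one row per group; each distinct value of `field` becomes a column summing `value`. */
    static List<Map<String, Object>> pivot(List<Map<String, Object>> rows,
                                           String field, String value, String groupby) {
        Map<Object, Map<String, Object>> byGroup = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            // Create the output record for this group on first sight.
            Map<String, Object> out = byGroup.computeIfAbsent(row.get(groupby), k -> {
                Map<String, Object> m = new LinkedHashMap<>();
                m.put(groupby, k);
                return m;
            });
            String column = String.valueOf(row.get(field)); // new field name
            double v = ((Number) row.get(value)).doubleValue();
            out.merge(column, v, (a, b) -> (Double) a + (Double) b); // assumed aggregate: sum
        }
        return new ArrayList<>(byGroup.values());
    }
}
```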
-
quantile
Adds a quantile transform.
- Parameters:
  field - The data field for which to perform quantile estimation.
- Returns:
  a quantile transform object.
-
regression
Adds a regression transform.
- Parameters:
  field - The data field of the dependent variable to predict.
  on - The data field of the independent variable to use as a predictor.
- Returns:
  a regression transform object.
-
sample
Adds a sample transform. The sample transform filters random rows from the data source to reduce its size. As input data objects are added and removed, the sampled values may change in first-in, first-out manner. This transform uses reservoir sampling to maintain a representative sample of the stream.
- Parameters:
  size - The maximum number of data objects to include in the sample.
- Returns:
  this object.
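For readers unfamiliar with reservoir sampling, here is a minimal plain-Java sketch of the classic variant (often called Algorithm R). It shows the general technique only and is not necessarily how the underlying transform is implemented; it does not model removal of data objects, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative only: classic reservoir sampling over a stream (hypothetical names). */
final class ReservoirSampleSketch {
    /** Returns at most `size` items, each stream item being retained with equal probability. */
    static <T> List<T> sample(Iterable<T> stream, int size, Random rng) {
        List<T> reservoir = new ArrayList<>(size);
        int seen = 0;
        for (T item : stream) {
            seen++;
            if (reservoir.size() < size) {
                reservoir.add(item); // fill the reservoir first
            } else {
                // Replace a random slot with probability size / seen.
                int j = rng.nextInt(seen);
                if (j < size) {
                    reservoir.set(j, item);
                }
            }
        }
        return reservoir;
    }
}
```

When the stream holds fewer items than the requested size, the sample is simply the whole stream, which matches the description of size as a maximum.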
-
stack
Adds a stack transform.
- Parameters:
  field - The field which is stacked.
  as - The output start field name. The end field name is this name with "_end" appended.
  groupby - The data fields to group by.
- Returns:
  a stack transform object.
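The start and end fields can be illustrated with a plain-Java sketch that keeps a running total per group. It assumes a zero baseline, stacking in input order, and a single group-by field; these are assumptions of the sketch, the names are hypothetical, and it does not call Smile.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: zero-baseline stacking in input order (hypothetical names). */
final class StackSketch {
    /** Adds fields `as` (start) and `as + "_end"` (end) to a copy of every row. */
    static List<Map<String, Object>> stack(List<Map<String, Object>> rows,
                                           String field, String as, String groupby) {
        Map<Object, Double> running = new HashMap<>(); // current stack height per group
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Object group = row.get(groupby);
            double start = running.getOrDefault(group, 0.0);
            double end = start + ((Number) row.get(field)).doubleValue();
            Map<String, Object> copy = new LinkedHashMap<>(row);
            copy.put(as, start);
            copy.put(as + "_end", end);
            running.put(group, end);
            out.add(copy);
        }
        return out;
    }
}
```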
-
timeUnit
Adds a time unit transform.
- Parameters:
  timeUnit - The timeUnit.
  field - The data field to apply time unit.
  as - The output field to write the timeUnit value.
- Returns:
  this object.
-
window
Creates a data specification object.
- Returns:
  a data specification object.
-