A mungebit which takes a fixed group of columns and produces a new group of columns (or a single new column) can be abstracted into a multi-column transformation. This function allows one to specify what happens to a fixed list of columns, and the mungebit will be the resulting multi-column transformation applied to an arbitrary combination of columns. An arity-1 multi-column transformation with a single output column equal to its original input column is simply a column_transformation.

multi_column_transformation(transformation, nonstandard = FALSE)

Arguments

transformation

function. The function's first argument will receive atomic vectors derived from some data.frame. Any other arguments will be received as the list(...) from calling the function produced by multi_column_transformation.

nonstandard

logical. If TRUE, nonstandard evaluation support will be provided for the derived function, so it will be possible to capture the calling expression for each column. By default FALSE. Note this will slow the transformation by 0.1ms on each column.

Value

a function which takes a data.frame and a vector of column names (or several other formats, see standard_column_format) and applies the transformation.

Note

The function produced by calling multi_column_transformation will not run independently. It must be used as a train or predict function for a mungebit.

See also

column_transformation, standard_column_format

Examples

divider <- multi_column_transformation(function(x, y) { x / y })
# Determines the ratio of Sepal.Length and Sepal.Width in the iris dataset.
iris2 <- mungebit$new(divider)$run(iris, c("Sepal.Length", "Sepal.Width"), "Sepal.Ratio")
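To make the idea concrete, here is a rough base-R sketch of what applying a multi-column transformation with a single output column amounts to. It is not the mungebits implementation: the helper apply_multi_column and its arguments are hypothetical names introduced only for illustration, and it ignores train/predict handling and nonstandard evaluation.

```r
# Hypothetical helper (not part of mungebits): apply `fn` to the columns named
# in `inputs`, passing each column as a positional argument, and store the
# resulting vector in the column named `output`.
apply_multi_column <- function(df, fn, inputs, output, ...) {
  # Extract the input columns as an unnamed list so they match fn's
  # arguments by position rather than by column name.
  args <- unname(as.list(df[inputs]))
  # Any extra arguments are forwarded to fn after the columns.
  df[[output]] <- do.call(fn, c(args, list(...)))
  df
}

# Ratio of Sepal.Length to Sepal.Width, stored as a new column.
iris2 <- apply_multi_column(
  iris,
  function(x, y) x / y,
  c("Sepal.Length", "Sepal.Width"),
  "Sepal.Ratio"
)
head(iris2$Sepal.Ratio)
```

The sketch passes columns positionally, so the order of the input column names determines which column becomes x and which becomes y.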