Announcing wrapr 1.6.2

wrapr 1.6.2 is now up on CRAN. We have some neat new features for R users to try (in addition to many earlier wrapr goodies).

The first is the %in_block% alternate notation for let().

The wrapr let()-block allows easy replacement of names in name-capturing interfaces (such as transform()), as we show below.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
library("wrapr")

column_mapping <- qc(
 AREA_COL = Sepal.Area,
 LENGTH_COL = Sepal.Length,
 WIDTH_COL = Sepal.Width
)

# let-block notation
let(
 alias = column_mapping,
 
 iris %.>%
 transform(.,
 AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL) %.>%
 subset(., 
 select = qc(Species, AREA_COL)) %.>%
 head(.)
)
1
2
3
4
5
6
7
1
2
3
4
5
6
7
## Species Sepal.Area
## 1 setosa 14.01936
## 2 setosa 11.54535
## 3 setosa 11.81239
## 4 setosa 11.19978
## 5 setosa 14.13717
## 6 setosa 16.54049

The qc() notation allowed us to specify a named-vector without quotes. qc(a = b) is equivalent to c("a" = "b").

With the %in_block% operator notation one writes the let()-block as an in-line operator supplying the mapping into a code block. The above example can now be re-written as the following.

1
2
3
4
5
6
7
8
9
# %in_block% notation
column_mapping %in_block% {
 iris %.>%
 transform(., 
 AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL) %.>%
 subset(., 
 select = qc(Species, AREA_COL)) %.>%
 head(.)
}
1
2
3
4
5
6
7
1
2
3
4
5
6
7
## Species Sepal.Area
## 1 setosa 14.01936
## 2 setosa 11.54535
## 3 setosa 11.81239
## 4 setosa 11.19978
## 5 setosa 14.13717
## 6 setosa 16.54049

This notation can be handy for defining functions.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
compute_area <- function(
 .data, 
 area_col, 
 length_col, 
 width_col) c( # End of function argument definiton
 AREA_COL = area_col,
 LENGTH_COL = length_col,
 WIDTH_COL = width_col
 ) %in_block% { # End of argument mapping block
 .data %.>%
 transform(., 
 AREA_COL = (pi/4)*LENGTH_COL*WIDTH_COL)
 } # End of function body block

iris %.>%
 'Sepal.Area', 'Sepal.Length', 'Sepal.Width') %.>%
 compute_area(., 
 'Petal.Area', 'Petal.Length', 'Petal.Width') %.>%
 subset(., 
 select = c("Species", "Sepal.Area", "Petal.Area")) %.>%
 head(.)
1
2
3
4
5
6
7
## Species Sepal.Area Petal.Area
## 1 setosa 14.01936 0.2199115
## 2 setosa 11.54535 0.2199115
## 3 setosa 11.81239 0.2042035
## 4 setosa 11.19978 0.2356194
## 5 setosa 14.13717 0.2199115
## 6 setosa 16.54049 0.5340708

We can think of the above function definition notation as having two blocks: the alias defining block (the portion before “%in_block%”) and the templated function body (the portion after “%in_block%”). Notice how easy it is to use this notation to convert a non-standard (or name/code-capturing interface) into a value-oriented interface. The point is value-oriented interfaces are much more re-usable and easier to program over (use in for-loops, applies, and functions).

The second new feature is the orderv() function, a value-oriented adapter for base::order(). orderv() uses a vector of column names to compute an ordering permutation for a data.frame. We can use it as we show below.

1
2
3
4
5
6
7
library("wrapr")

sort_columns <- qc(mpg, hp, gear)
ordering <- orderv(mtcars[ , sort_columns, drop = FALSE],
 decreasing = TRUE,
 method = "radix")
head(mtcars[ordering, , drop = FALSE])
1
2
3
4
5
6
7
1
2
3
4
5
6
7
## mpg cyl disp hp drat wt qsec vs am gear carb
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2

Of course we have also have all the steps wrapped in a convenient function: sortv().

1
2
3
4
5
6
mtcars %.>%
 sortv(., 
 sort_columns, 
 decreasing = TRUE,
 method = "radix") %.>%
 head(.)
1
2
3
4
5
6
7
1
2
3
4
5
6
7
## mpg cyl disp hp drat wt qsec vs am gear carb
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2

For details on “method = "radix"” please see our earlier tip here.

A third new feature is mk_formula(). mk_formula() is used to build simple formulas for modeling tasks (which may have a large number of variables) without any string processing or parsing.

Our usual advice for building simple formulas has been to use the paste()-based methods exhibited in “R Tip: How to Pass a formula to lm”. This remains good advice. However mk_formula() is a more concise and more hygienic alternative. An example is given below.

1
2
3
4
5
6
7
8
9
# specifications of how to model,
# coming from somewhere else
outcome <- "mpg"
variables <- c("cyl", "disp", "hp", "carb")

# our modeling effort, 
# fully parameterized!
f <- wrapr::mk_formula(outcome, variables)
print(f)
1
## mpg ~ cyl + disp + hp + carb
1
2
3

model <- lm(f, data = mtcars)
print(model)
1
2
3
4
5
6
7
## 
## Call:
## lm(formula = f, data = mtcars)
## 
## Coefficients:
## (Intercept) cyl disp hp carb 
## 34.021595 -1.048523 -0.026906 0.009349 -0.926863

The above notation is good for programming over modeling tasks.

Like this:

Like Loading…

Related