Correlation matrix of grouped variables in dplyr - r

I have a grouped data frame (using dplyr) with 50 numeric columns, which are split into groups using one of the columns. I want to calculate a matrix of correlation between all non grouping columns and one particular column.
An example with the mtcars dataset:
data(mtcars)
cor(mtcars[,2:11], mtcars[,2])
returns a list of correlations between miles per galleon and the other variables.
Let's say however, that I wish to calculate this same correlation for each group of cylinders, e.g.:
library(dplyr)
mtcars <-
mtcars %>%
group_by(cyl)
How would I do this? I am thinking something like
mtcars %>%
group_by(cyl) %>%
summarise_each(funs(cor(...))
But I do not know what to put in the ... as I don't know how to specify a column in the dplyr chain.
Related:
Linear model and dplyr - a better solution? has an answer which is very similar to #akrun's answer. Also, over on cross validated: https://stats.stackexchange.com/questions/4040/r-compute-correlation-by-group has other solutions using packages which are not dplyr.

We could use do.
library(dplyr)
mtcars %>%
group_by(cyl) %>%
do(data.frame(Cor=t(cor(.[,3:11], .[,3]))))
# A tibble: 3 x 10
# Groups: cyl [3]
# cyl Cor.disp Cor.hp Cor.drat Cor.wt Cor.qsec Cor.vs Cor.am Cor.gear Cor.carb
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 1.00 0.435 -0.500 0.857 0.328 -0.187 -0.734 -0.0679 0.490
#2 6 1.00 -0.514 -0.831 0.473 0.789 0.637 -0.637 -0.899 -0.942
#3 8 1 0.118 -0.0922 0.755 0.195 NA -0.169 -0.169 0.0615
NOTE: t part is contributed by #Alex
Or use group_modify
mtcars %>%
select(-mpg) %>%
group_by(cyl) %>%
group_modify(.f = ~ as.data.frame(t(cor(select(.x, everything()),
.x[['disp']]))))
# A tibble: 3 x 10
# Groups: cyl [3]
# cyl disp hp drat wt qsec vs am gear carb
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 1.00 0.435 -0.500 0.857 0.328 -0.187 -0.734 -0.0679 0.490
#2 6 1.00 -0.514 -0.831 0.473 0.789 0.637 -0.637 -0.899 -0.942
#3 8 1 0.118 -0.0922 0.755 0.195 NA -0.169 -0.169 0.0615
Or another option is summarise with across. Created a new column 'disp1' as 'disp' then grouped by 'cyl', get the cor of columns 'disp' to 'carb' with 'disp1'
mtcars %>%
mutate(disp1 = disp) %>%
group_by(cyl) %>%
summarise(across(disp:carb, ~ cor(., disp1)))
# A tibble: 3 x 10
# cyl disp hp drat wt qsec vs am gear carb
#* <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 1.00 0.435 -0.500 0.857 0.328 -0.187 -0.734 -0.0679 0.490
#2 6 1.00 -0.514 -0.831 0.473 0.789 0.637 -0.637 -0.899 -0.942
#3 8 1 0.118 -0.0922 0.755 0.195 NA -0.169 -0.169 0.0615
Or
library(data.table)
d1 <- copy(mtcars)
setnames(setDT(d1)[, as.list(cor(.SD, .SD[[1]])) , cyl,
.SDcols=3:11], names(d1)[2:11])[]

Related

Having problems updating do() function

I am trying to update my function using the new version of dplyr.
First, I had this function (old version):
slope.k <- function(data, Treatment, Replicate, Day, Ln.AFDMrem){
fitted_models <- data %>% group_by(Treatment, Replicate) %>%
do(model = lm(Ln.AFDMrem ~ Day, data = .))
broom::tidy(fitted_models,model) %>% print(n = Inf)
}
However, the do() function was superseded. Now, I am trying to update with this (new) version:
slope.k <- function(data, Treatment, Replicate, Day, Ln.AFDMrem){
mod_t <- data %>% nest_by(Treatment, Replicate) %>%
mutate(model = list(lm(Ln.AFDMrem ~ Day, data = data))) %>%
summarise(tidy_out = list(tidy(model)))
unnest(select(mod_t, Treatment, tidy_out)) %>% print(n = Inf)
}
However, it doesn't work properly, because I have the following warnings:
Warning messages:
1: `cols` is now required when using unnest().
Please use `cols = c(tidy_out)`
2: `...` is not empty.
We detected these problematic arguments:
* `needs_dots`
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?
Thanks in advance!!!
The issue would be the use of select with unnest. It can be reproduced by changing the select to c
libary(dplyr)
library(broom)
library(tidyr)
mtcars %>%
nest_by(carb, gear) %>%
mutate(model = list(lm(mpg ~ disp + drat, data = data))) %>%
summarise(tidy_out = list(tidy(model)), .groups = 'drop') %>%
unnest(c(tidy_out))
-output
# A tibble: 33 x 7
# carb gear term estimate std.error statistic p.value
# <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
# 1 1 3 (Intercept) -8.50 NaN NaN NaN
# 2 1 3 disp 0.0312 NaN NaN NaN
# 3 1 3 drat 7.10 NaN NaN NaN
# 4 1 4 (Intercept) -70.5 302. -0.234 0.854
# 5 1 4 disp -0.0445 0.587 -0.0757 0.952
# 6 1 4 drat 25.5 62.4 0.408 0.753
# 7 2 3 (Intercept) -3.72 8.57 -0.434 0.739
# 8 2 3 disp 0.0437 0.0123 3.54 0.175
# 9 2 3 drat 1.90 2.88 0.661 0.628
#10 2 4 (Intercept) -10.0 226. -0.0443 0.972
# … with 23 more rows
Also, after the mutate, step, we can directly use the unnest on the 'tidy_out' column
If we use as a function, assuming that unquoted arguments are passed as column names
slope.k <- function(data, Treatment, Replicate, Day, Ln.AFDMrem){
ln_col <- rlang::as_string(ensym(Ln.AFDMrem))
day_col <- rlang::as_string(ensym(Day))
data %>%
nest_by({{Treatment}}, {{Replicate}}) %>%
mutate(model = list(lm(reformulate(day_col, ln_col), data = data))) %>%
summarise(tidy_out = list(tidy(model)), .groups = 'drop') %>%
unnest(tidy_out)
}
slope.k(mtcars, carb, gear, disp, mpg)
# A tibble: 22 x 7
carb gear term estimate std.error statistic p.value
<dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 3 (Intercept) 22.0 5.35 4.12 0.152
2 1 3 disp -0.00841 0.0255 -0.329 0.797
3 1 4 (Intercept) 52.6 8.32 6.32 0.0242
4 1 4 disp -0.279 0.0975 -2.86 0.104
5 2 3 (Intercept) 1.25 3.49 0.357 0.755
6 2 3 disp 0.0460 0.0100 4.59 0.0443
7 2 4 (Intercept) 36.6 6.57 5.57 0.0308
8 2 4 disp -0.0978 0.0529 -1.85 0.206
9 2 5 (Intercept) 47.0 NaN NaN NaN
10 2 5 disp -0.175 NaN NaN NaN
# … with 12 more rows

Is it possible to pass multible variables to the same curly curly?

I am building a function that uses {{ }} (curly curly or double mustache)
I would like the user to be able to pass multiple variables into the same {{ }}, but I am not sure if this is possible using {{ }}. I can't find any examples showing how to do this.
Can you tell me if it possible, and if yes help me make the below minimal reprex work?
library(tidyverse)
group_mean <- function(.data, group){
.data %>%
group_by({{group}}) %>%
summarise_all(mean)
}
# Works
mtcars %>%
group_mean(group = cyl)
# Fails
mtcars %>%
group_mean(group = c(cyl, am))
Error: Column `c(cyl, am)` must be length 32 (the number of rows) or one, not 64
Edit 2022: Nowadays we'd tend to use the c() syntax of tidyselect for taking in multiple groups of variables.
library(dplyr)
my_mean <- function(data, group_vars, summary_vars) {
data |>
group_by(across({{ group_vars }})) |>
summarise(across({{ summary_vars }}, \(x) mean(x, na.rm = TRUE)))
}
mtcars |> my_mean(c(cyl, am), c(mpg, disp))
#> `summarise()` has grouped output by 'cyl'. You can override using the
#> `.groups` argument.
#> # A tibble: 6 × 4
#> # Groups: cyl [3]
#> cyl am mpg disp
#> <dbl> <dbl> <dbl> <dbl>
#> 1 4 0 22.9 136.
#> 2 4 1 28.1 93.6
#> 3 6 0 19.1 205.
#> 4 6 1 20.6 155
#> 5 8 0 15.0 358.
#> 6 8 1 15.4 326
See also the Bidge patterns section in https://rlang.r-lib.org/reference/topic-data-mask-programming.html
If your function takes several groups of multiple variables, you need external quoting with vars(). This function simply capture its inputs as a list of expressions:
vars(foo, bar)
#> [[1]]
#> <quosure>
#> expr: ^foo
#> env: global
#>
#> [[2]]
#> <quosure>
#> expr: ^bar
#> env: global
Take an argument that you splice with !!!:
group_mean <- function(.data, .vars, ...) {
.data <- doingsomethingelse(.data, ...)
.data %>%
group_by(!!!.vars) %>%
summarise_all(mean)
}
Use it like this:
data %>% group_mean(vars(foo, bar), baz, quux)
For multiple grouping variables, you don't need curly-curly, pass three dots instead.
group_mean <- function(.data, ...){
.data %>%
group_by(...) %>%
summarise_all(mean)
}
mtcars %>% group_mean(cyl)
# A tibble: 3 x 11
# cyl mpg disp hp drat wt qsec vs am gear carb
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 26.7 105. 82.6 4.07 2.29 19.1 0.909 0.727 4.09 1.55
#2 6 19.7 183. 122. 3.59 3.12 18.0 0.571 0.429 3.86 3.43
#3 8 15.1 353. 209. 3.23 4.00 16.8 0 0.143 3.29 3.5
mtcars %>% group_mean(cyl, am)
# cyl am mpg disp hp drat wt qsec vs gear carb
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 0 22.9 136. 84.7 3.77 2.94 21.0 1 3.67 1.67
#2 4 1 28.1 93.6 81.9 4.18 2.04 18.4 0.875 4.25 1.5
#3 6 0 19.1 205. 115. 3.42 3.39 19.2 1 3.5 2.5
#4 6 1 20.6 155 132. 3.81 2.76 16.3 0 4.33 4.67
#5 8 0 15.0 358. 194. 3.12 4.10 17.1 0 3 3.08
#6 8 1 15.4 326 300. 3.88 3.37 14.6 0 5 6

Use summarise and summarise_at in same dplyr chain

Suppose I want to summarise a data frame after grouping with differing functions. How can I do that?
mtcars %>% group_by(cyl) %>% summarise(size = n())
# A tibble: 3 x 2
cyl size
<dbl> <int>
1 4 11
2 6 7
3 8 14
But if I try:
mtcars %>% group_by(cyl) %>% summarise(size = n()) %>% summarise_at(vars(c(mpg, am:carb)), mean)
Error in is_string(y) : object 'carb' not found
How can I get first the size of each group with n() and then the mean of the other chosen features?
Here is one way using a dplyr::inner_join() on the two summarize operations by the grouping variable:
mtcars %>%
group_by(cyl) %>%
summarise(size = n()) %>%
inner_join(
mtcars %>%
group_by(cyl) %>%
summarise_at(vars(c(mpg, am:carb)), mean),
by='cyl' )
Output is:
# A tibble: 3 x 6
cyl size mpg am gear carb
<dbl> <int> <dbl> <dbl> <dbl> <dbl>
1 4 11 26.7 0.727 4.09 1.55
2 6 7 19.7 0.429 3.86 3.43
3 8 14 15.1 0.143 3.29 3.5
Since summarise removes the column which are not grouped or summarised, an alternative in this case would be to first add a new column with mutate (so that all other columns remain as it is) to count number of rows in each group and include that column in summarise_at calculation.
library(dplyr)
mtcars %>%
group_by(cyl) %>%
mutate(n = n()) %>%
summarise_at(vars(mpg, am:carb, n), mean)
# A tibble: 3 x 6
# cyl mpg am gear carb n
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 26.7 0.727 4.09 1.55 11
#2 6 19.7 0.429 3.86 3.43 7
#3 8 15.1 0.143 3.29 3.5 14
We can use data.table methods
library(data.table)
as.data.table(mtcars)[, n := .N, cyl][, lapply(.SD, mean), cyl,
.SDcols = c("mpg", "am", "gear", "carb", "n")]
#. yl mpg am gear carb n
#1: 6 19.74286 0.4285714 3.857143 3.428571 7
#2: 4 26.66364 0.7272727 4.090909 1.545455 11
#3: 8 15.10000 0.1428571 3.285714 3.500000 14
Or with tidyverse
library(tidyverse)
mtcars %>%
add_count(cyl) %>%
group_by(cyl) %>%
summarise_at(vars(mpg, am:carb, n), mean)
# A tibble: 3 x 6
# cyl mpg am gear carb n
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 26.7 0.727 4.09 1.55 11
#2 6 19.7 0.429 3.86 3.43 7
#3 8 15.1 0.143 3.29 3.5 14
Or using base R
nm1 <- c("mpg", "am", "gear", "carb", "cyl")
transform(aggregate(.~ cyl, mtcars[nm1], mean), n = as.vector(table(mtcars$cyl)))
# cyl mpg am gear carb n
#1 4 26.66364 0.7272727 4.090909 1.545455 11
#2 6 19.74286 0.4285714 3.857143 3.428571 7
#3 8 15.10000 0.1428571 3.285714 3.500000 14

pmap_ variants operating on data.frames as lists

I have a recollection that purrr::pmap_* can treat a data.frame as a list but the syntax eludes me.
Imagine we wanted to fit a separate lm object for each value of mtcars$vs and mtcars$am
library(tidyverse)
library(broom)
d1 <- mtcars %>%
group_by(
vs, am
) %>%
nest %>%
mutate(
coef = data %>%
map(
~lm(mpg ~ wt, data =.) %>%
tidy
)
)
If I wanted to extract the coefficient estimates as an un-nested data.frame, and append the values of am and vs, I might try
d1[, -3] %>%
pmap_dfr(
function(i, j, k)
k %>%
mutate(
vs = i,
am = j
)
)
But this results in an error. More explicitly declaring these variables as separate lists has the desired effect
list(
d1$vs,
d1$am,
d1$coef
) %>%
pmap_dfr(
function(i, j, k)
k %>%
mutate(
vs = i,
am = j
)
)
Is there a succinct way for pmap_* to treat a data.frame as a list?
We can use the standard option to extract the components (..1, ..2, etc)
d1[, -3] %>%
pmap_dfr(~ ..3 %>%
mutate(vs = ..1, am = ..2))
# A tibble: 8 x 7
# term estimate std.error statistic p.value vs am
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 (Intercept) 42.4 3.30 12.8 0.000213 0 1
#2 wt -7.91 1.14 -6.93 0.00227 0 1
#3 (Intercept) 44.1 6.96 6.34 0.00144 1 1
#4 wt -7.77 3.36 -2.31 0.0689 1 1
#5 (Intercept) 31.5 8.98 3.51 0.0171 1 0
#6 wt -3.38 2.80 -1.21 0.281 1 0
#7 (Intercept) 25.1 3.51 7.14 0.0000315 0 0
#8 wt -2.44 0.842 -2.90 0.0159 0 0
This is because the second list has no names attribute. If you unname d1 it works. The fact that you used the list function in the second example doesn't make a difference (except that it removed the names), because both objects are lists (data frames are lists).
d1[, -3] %>%
unname %>%
pmap_dfr(
function(i, j, k)
k %>%
mutate(
vs = i,
am = j
)
)
# # A tibble: 8 x 7
# term estimate std.error statistic p.value vs am
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 (Intercept) 42.4 3.30 12.8 0.000213 0 1
# 2 wt -7.91 1.14 -6.93 0.00227 0 1
# 3 (Intercept) 44.1 6.96 6.34 0.00144 1 1
# 4 wt -7.77 3.36 -2.31 0.0689 1 1
# 5 (Intercept) 31.5 8.98 3.51 0.0171 1 0
# 6 wt -3.38 2.80 -1.21 0.281 1 0
# 7 (Intercept) 25.1 3.51 7.14 0.0000315 0 0
# 8 wt -2.44 0.842 -2.90 0.0159 0 0
You can also name the arguments in your first code block's function to match (or use ..1 etc) for the same result
d1[, -3] %>%
pmap_dfr(
function(vs, am, coef)
coef %>%
mutate(
vs = vs,
am = am
)
)
# # A tibble: 8 x 7
# term estimate std.error statistic p.value vs am
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 (Intercept) 42.4 3.30 12.8 0.000213 0 1
# 2 wt -7.91 1.14 -6.93 0.00227 0 1
# 3 (Intercept) 44.1 6.96 6.34 0.00144 1 1
# 4 wt -7.77 3.36 -2.31 0.0689 1 1
# 5 (Intercept) 31.5 8.98 3.51 0.0171 1 0
# 6 wt -3.38 2.80 -1.21 0.281 1 0
# 7 (Intercept) 25.1 3.51 7.14 0.0000315 0 0
# 8 wt -2.44 0.842 -2.90 0.0159 0 0
You could also use wap from the experimental rap package
library(rap)
d1[, -3] %>%
wap( ~ coef %>%
mutate(
vs = vs,
am = am)) %>%
bind_rows
# # A tibble: 8 x 7
# term estimate std.error statistic p.value vs am
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 (Intercept) 42.4 3.30 12.8 0.000213 0 1
# 2 wt -7.91 1.14 -6.93 0.00227 0 1
# 3 (Intercept) 44.1 6.96 6.34 0.00144 1 1
# 4 wt -7.77 3.36 -2.31 0.0689 1 1
# 5 (Intercept) 31.5 8.98 3.51 0.0171 1 0
# 6 wt -3.38 2.80 -1.21 0.281 1 0
# 7 (Intercept) 25.1 3.51 7.14 0.0000315 0 0
# 8 wt -2.44 0.842 -2.90 0.0159 0 0

replacing `.x` with column name after running `map::purrr()` function

I run lm() for every column of a dataset with one of the column as the dependent variable, using purrr:map() function.
The results are almost perfect except for this - I want to replace .x in the results with the variable that i run lm() for.
The post R purrr map show column names in output is related but I want to avoid creating a function.
Below, are the codes using the mtcars dataset. I know, for example that .x for the first output refers to $mpg. I am not sure if setNames() will work.
library(tidyverse)
library(broom)
mod3 <- map(mtcars, ~ lm(mpg ~ .x, data = mtcars)) %>%
map(~tidy(.x))
#> Warning in summary.lm(x): essentially perfect fit: summary may be
#> unreliable
mod3
#> $mpg
#> # A tibble: 2 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) -5.02e-15 9.94e-16 -5.06e 0 0.0000198
#> 2 .x 1.00e+ 0 4.74e-17 2.11e16 0
#>
#> $cyl
#> # A tibble: 2 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 37.9 2.07 18.3 8.37e-18
#> 2 .x -2.88 0.322 -8.92 6.11e-10
#>
#> $disp
#> # A tibble: 2 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 29.6 1.23 24.1 3.58e-21
#> 2 .x -0.0412 0.00471 -8.75 9.38e-10
#>
#> $hp
#> # A tibble: 2 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 30.1 1.63 18.4 6.64e-18
#> 2 .x -0.0682 0.0101 -6.74 1.79e- 7
#>
#> $drat
#> # A tibble: 2 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) -7.52 5.48 -1.37 0.180
#> 2 .x 7.68 1.51 5.10 0.0000178
#>
#> $wt
#> # A tibble: 2 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 37.3 1.88 19.9 8.24e-19
#> 2 .x -5.34 0.559 -9.56 1.29e-10
#>
#> $qsec
#> # A tibble: 2 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) -5.11 10.0 -0.510 0.614
#> 2 .x 1.41 0.559 2.53 0.0171
#>
#> $vs
#> # A tibble: 2 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 16.6 1.08 15.4 8.85e-16
#> 2 .x 7.94 1.63 4.86 3.42e- 5
Here is one way to do it
library(tidyverse)
library(broom)
names(mtcars)[-1] %>%
set_names() %>%
map(~ lm(as.formula(paste0('mpg ~ ', .x)), data = mtcars)) %>%
map_dfr(., broom::tidy, .id = "variable")
#> # A tibble: 20 x 6
#> variable term estimate std.error statistic p.value
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 cyl (Intercept) 37.9 2.07 18.3 8.37e-18
#> 2 cyl cyl -2.88 0.322 -8.92 6.11e-10
#> 3 disp (Intercept) 29.6 1.23 24.1 3.58e-21
#> 4 disp disp -0.0412 0.00471 -8.75 9.38e-10
#> 5 hp (Intercept) 30.1 1.63 18.4 6.64e-18
#> 6 hp hp -0.0682 0.0101 -6.74 1.79e- 7
#> 7 drat (Intercept) -7.52 5.48 -1.37 1.80e- 1
#> 8 drat drat 7.68 1.51 5.10 1.78e- 5
#> 9 wt (Intercept) 37.3 1.88 19.9 8.24e-19
#> 10 wt wt -5.34 0.559 -9.56 1.29e-10
#> 11 qsec (Intercept) -5.11 10.0 -0.510 6.14e- 1
#> 12 qsec qsec 1.41 0.559 2.53 1.71e- 2
#> 13 vs (Intercept) 16.6 1.08 15.4 8.85e-16
#> 14 vs vs 7.94 1.63 4.86 3.42e- 5
#> 15 am (Intercept) 17.1 1.12 15.2 1.13e-15
#> 16 am am 7.24 1.76 4.11 2.85e- 4
#> 17 gear (Intercept) 5.62 4.92 1.14 2.62e- 1
#> 18 gear gear 3.92 1.31 3.00 5.40e- 3
#> 19 carb (Intercept) 25.9 1.84 14.1 9.22e-15
#> 20 carb carb -2.06 0.569 -3.62 1.08e- 3
Created on 2019-02-10 by the reprex package (v0.2.1.9000)
Hi you can use purrr::imap() like so:
mod3 <- map(mtcars, ~ lm(mpg ~ .x, data = mtcars)) %>%
map(tidy) %>%
imap( ~ {.x[2, 1] <- .y ; return(.x)} )
imap sends two things to the function/ formula : .x the item and .y which is either the name of the item (name in this case) or the index. I had to wrap everything in {} in this case to get the assignment to work

Resources