I am looking for help achieving the looping requirement below through simplified indexing or apply constructs in R. Doing it with 'for' loops seems computationally complex and inefficient, so I would appreciate any pointers to a more efficient approach.
The data table for reference is below.
The sequences I am trying to get are all positive and negative number index (row and column) combinations per row, as below:
For row 1: 1-4-5, 1-4-8, 1-4-11
The column 'Occurances' specifies the potential number of sequences per row.
Finally, I am trying to get a data frame similar to the one below (shown only for the first and second rows) with all occurrences, with each index in its own column.
Any help is highly appreciated. Thank you very much.
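For reference, the naive nested for loop being described would look roughly like this (a sketch only; it assumes the data is in a numeric matrix mydata, and the output column names are made up since the table itself is not reproduced here):
out <- list()
for (i in seq_len(nrow(mydata))) {
  neg <- which(mydata[i, ] < 0)  # column indexes of negative values in row i
  pos <- which(mydata[i, ] > 0)  # column indexes of positive values in row i
  for (n in neg) {
    for (p in pos) {
      out[[length(out) + 1]] <- data.frame(row = i, neg_col = n, pos_col = p)
    }
  }
}
do.call(rbind, out)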
There are a lot of ways you can do this. To do it efficiently you should probably use base R. The more rows and columns you have to check, the more careful you will need to be with how you code this.
Here are two examples of how you could run it, see which works best for you.
library(purrr)
library(dplyr)
library(tidyr)   # for crossing()
# create a table to test the code on: an n1 x n2 dataframe with a random sample of -1, 0, 1
n1 <- 10
n2 <- 10
to_test <- map(1:n1, ~ sample(c(-1, 0, 1), size = n2, replace = TRUE)) %>%
  `names<-`(seq_along(.)) %>%
  bind_cols()
# Split table into a list of rows
to_test_row_list <- split(to_test, 1:nrow(to_test))
# For each item in the list
sub_tables <- mapply(FUN = function(list_in, row_in){
  # create a dataframe with the row number in the first column
  crossing(row = row_in,
           # cross join the indexes of the columns which are greater than and
           # less than zero for the other two cols
           crossing(data.frame(gt = which(list_in > 0)),
                    data.frame(lt = which(list_in < 0))))},
  # Inputs for the mapply function FUN: the list of rows and the number for each row
  list_in = to_test_row_list,
  row_in = names(to_test_row_list),
  # Do not simplify dataframes into lists
  SIMPLIFY = FALSE)
# Turn list of tables into one long table
res1 <- bind_rows(sub_tables)
res1
# The same code in one pipe
res2 <- to_test %>%
  split(seq_along(.)) %>%
  map2(.x = .,
       .y = names(.),
       ~ crossing(data.frame(gt = which(.x > 0)),
                  data.frame(lt = which(.x < 0))) %>%
         mutate(row = .y) %>%
         select(row, everything())) %>%
  bind_rows()
res2
This works in base R; the task is accomplished almost entirely in the first line of code. The rest is just cleaning the output to make it exactly as asked for. Without proper example data (you can use dput(...) to share it), there will certainly be issues with using this code exactly as presented on your data.
# for each row, cross join the positive values with the negative values
new_data <- do.call(rbind, apply(mydata, 1, function(x) merge(x[x > 0], x[x < 0])))
# recover the originating row from the row names that rbind() created (e.g. "X1.1")
new_data$from <- sub("X(\\d).*", "\\1", row.names(new_data))
new_data <- new_data[, c(3, 1, 2)]
rownames(new_data) <- c()
sample data:
mydata <- data.frame("1"=c(0,0,0,-45,57,0,0,51,0,0,45,0),"3"=c(4,4,0,5,654,34,-6,65,-37,4,56,56))
mydata <- t(mydata)
output:
> new_data
from x y
1 1 57 -45
2 1 51 -45
3 1 45 -45
4 3 4 -6
5 3 4 -6
6 3 5 -6
7 3 654 -6
8 3 34 -6
9 3 65 -6
10 3 4 -6
11 3 56 -6
12 3 56 -6
13 3 4 -37
14 3 4 -37
15 3 5 -37
16 3 654 -37
17 3 34 -37
18 3 65 -37
19 3 4 -37
20 3 56 -37
21 3 56 -37
Here's a tidyverse approach if you want to keep things nested neatly:
library(tidyverse)
df <- tibble::tribble(
~`1`, ~`2`, ~`3`, ~`4`, ~`5`, ~`6`, ~`7`, ~`8`, ~`9`, ~`10`, ~`11`, ~`12`,
0, 0, 0L, -45.2, 57, 0, 0, 82.7, 0, 0, 58.7, 0,
48.8, 65, 0L, 35.5, 50.8, 42.2, -89.6, 52.8, -45.8, 26.4, 51.1, 85.7,
63.1, 83.3, 0L, 21.5, 60, 0, 0, 69, 0, -84.3, 61, 0
)
df %>%
  rownames_to_column(var = "row_idx") %>%
  pivot_longer(cols = -row_idx, names_to = "col_idx") %>%
  group_by(row_idx) %>%
  nest() %>%
  mutate(
    df_of_pairs = map(data, ~ expand.grid(which(.$value < 0), which(.$value > 0))),
    combos = map_int(df_of_pairs, nrow)
  )
#> # A tibble: 3 x 4
#> # Groups: row_idx [3]
#> row_idx data df_of_pairs combos
#> <chr> <list> <list> <int>
#> 1 1 <tibble [12 x 2]> <df[,2] [3 x 2]> 3
#> 2 2 <tibble [12 x 2]> <df[,2] [18 x 2]> 18
#> 3 3 <tibble [12 x 2]> <df[,2] [6 x 2]> 6
Created on 2020-05-13 by the reprex package (v0.3.0)
Then if you want to get the list of pairs, simply add %>% unnest(df_of_pairs) to the end of the pipeline:
df %>%
  rownames_to_column(var = "row_idx") %>%
  pivot_longer(cols = -row_idx, names_to = "col_idx") %>%
  group_by(row_idx) %>%
  nest() %>%
  mutate(
    df_of_pairs = map(data, ~ expand.grid(which(.$value < 0), which(.$value > 0))),
    combos = map_int(df_of_pairs, nrow)
  ) %>%
  unnest(df_of_pairs)
# A tibble: 27 x 5
# Groups: row_idx [3]
row_idx data Var1 Var2 combos
<chr> <list> <int> <int> <int>
1 1 <tibble [12 x 2]> 4 5 3
2 1 <tibble [12 x 2]> 4 8 3
3 1 <tibble [12 x 2]> 4 11 3
4 2 <tibble [12 x 2]> 7 1 18
5 2 <tibble [12 x 2]> 9 1 18
6 2 <tibble [12 x 2]> 7 2 18
7 2 <tibble [12 x 2]> 9 2 18
8 2 <tibble [12 x 2]> 7 4 18
9 2 <tibble [12 x 2]> 9 4 18
10 2 <tibble [12 x 2]> 7 5 18
# ... with 17 more rows
I'm trying to learn how to use nest(). I'm trying to nest by one of 3 time periods participants could be in, and I want to add two columns. The first column is the overall mean, which I have figured out. Then I want to nest by the time variable, create 3 datasets (which I have also figured out), and then compute the group mean. I read that you should create a function (here, section 6.3.1), but my function keeps failing. How would I do this?
Also, please use nest or nest_by in the solution. I know I could use group_by(), like someone else did here, but in my actual data I need these to be 3 separate datasets because of other computations I need to do.
#Here's my setup and sample data
library(dplyr)
library(purrr)
library(tidyr)
set.seed(1414)
test <- tibble(id = c(1:100),
               condition = c(rep(c("pre", "post"), 50)),
               time = c(case_when(condition == "pre" ~ 0,
                                  condition == "post" ~ sample(c(1, 2), size = c(100), replace = TRUE))),
               score = case_when(time == 0 ~ 1,
                                 time == 1 ~ 10,
                                 time == 2 ~ 100))
#Here's what I tried
#Nesting the data (works)
nested_test <- test %>%
  unite(col = "all_combos", c(condition, time)) %>%
  mutate(score2 = mean(score)) %>%
  nest_by(all_combos)
#Make mean function and map it (doesn't work)
my_mean <- function(data) {
  mean(score, na.rm = T)
}
nested_test %>%
  mutate(score3 = map(data, my_mean))
We may need to ungroup first, as there is a rowwise attribute, and then loop over the data with map, creating the column with mutate on the nested data:
library(dplyr)
library(purrr)
nested_test_new <- nested_test %>%
  ungroup %>%
  mutate(data = map(data, ~ .x %>%
                      mutate(score3 = mean(score, na.rm = TRUE))))
output:
nested_test_new
# A tibble: 3 × 2
all_combos data
<chr> <list>
1 post_1 <tibble [19 × 4]>
2 post_2 <tibble [31 × 4]>
3 pre_0 <tibble [50 × 4]>
> nested_test_new$data
[[1]]
# A tibble: 19 × 4
id score score2 score3
<int> <dbl> <dbl> <dbl>
1 2 10 33.4 10
2 4 10 33.4 10
3 14 10 33.4 10
4 16 10 33.4 10
5 18 10 33.4 10
6 28 10 33.4 10
7 30 10 33.4 10
8 32 10 33.4 10
9 38 10 33.4 10
10 44 10 33.4 10
11 48 10 33.4 10
12 60 10 33.4 10
13 64 10 33.4 10
14 78 10 33.4 10
15 80 10 33.4 10
16 86 10 33.4 10
17 92 10 33.4 10
18 96 10 33.4 10
19 100 10 33.4 10
[[2]]
# A tibble: 31 × 4
id score score2 score3
<int> <dbl> <dbl> <dbl>
1 6 100 33.4 100
2 8 100 33.4 100
3 10 100 33.4 100
4 12 100 33.4 100
...
Or another option is nest_mutate() from nplyr:
library(nplyr)
test %>%
  unite(col = "all_combos", c(condition, time)) %>%
  mutate(score2 = mean(score)) %>%
  nest(data = -all_combos) %>%
  nest_mutate(data, score3 = mean(score, na.rm = TRUE))
output:
# A tibble: 3 × 2
all_combos data
<chr> <list>
1 pre_0 <tibble [50 × 4]>
2 post_1 <tibble [19 × 4]>
3 post_2 <tibble [31 × 4]>
I have a dataset with daily counts per year spanning several decades, and I'd like to run a function on different subsets of that data based on an increasing timespan. For example, I'd like to run the function on the first decade of data (1995-2005), then on the first decade + 1 (1995-2006), first decade + 2 (1995-2007), and so on until the end of the time series. This is what I had in mind:
dat <- tibble(
year = rep(1995:2014, each = 30),
count = rpois(600, 5)
)
dat
# A tibble: 600 x 2
year count
<int> <int>
1 1995 8
2 1995 3
3 1995 9
4 1995 2
5 1995 8
6 1995 7
7 1995 3
8 1995 6
9 1995 1
10 1995 7
# … with 590 more rows
with the final product looking like this:
# A tibble: 3 x 2
time_span data
<chr> <list>
1 1995-2004 <tibble [300 × 1]>
2 1995-2005 <tibble [330 × 1]>
3 1995-2006 <tibble [360 × 1]>
...
I would then apply my function to the nested data frame:
dat_nested %>%
mutate(result = map(data, my_function))
I'm struggling to think of a way to create these subsets with dplyr...any suggestions? Thanks!
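For concreteness, here is a minimal sketch of one way such a dat_nested could be built from the dat above (the use of map_dfr over the end years is just an illustration, not taken from the answers):
library(dplyr)
library(purrr)
start <- min(dat$year)
dat_nested <- map_dfr((start + 9):max(dat$year), function(end) {
  tibble(
    time_span = paste(start, end, sep = "-"),
    # keep only the count column in each nested tibble, as in the example output
    data = list(dat %>% filter(year <= end) %>% select(count))
  )
})
# the function can then be mapped over the list-column as intended:
# dat_nested %>% mutate(result = map(data, my_function))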
Here's a way using map:
library(dplyr)
n <- min(dat$year)
purrr::map_df((n + 10):max(dat$year),
              ~ dat %>%
                filter(between(year, n, .x)) %>%
                summarise(year = paste(min(year), max(year), sep = '-'),
                          data = list(count)))
# If you want a dataframe in the list-column instead of a vector, use
# data = list(data.frame(count = count))))
# year data
# <chr> <list>
# 1 1995-2005 <int [330]>
# 2 1995-2006 <int [360]>
# 3 1995-2007 <int [390]>
# 4 1995-2008 <int [420]>
# 5 1995-2009 <int [450]>
# 6 1995-2010 <int [480]>
# 7 1995-2011 <int [510]>
# 8 1995-2012 <int [540]>
# 9 1995-2013 <int [570]>
#10 1995-2014 <int [600]>
The result can be calculated directly from the original data frame without an intermediate nested data frame, and we show that below. However, if you do want to create a nested data frame anyway, use the same code but with
my_function <- base::list
to nest the two columns, or with
my_function <- function(x) list(x["count"])
to nest just the count column. The solution only uses dplyr; it does not use tidyr or purrr.
library(dplyr)
my_function <- function(x) sum(x$count)  # test function
dat %>%
  group_by(year) %>%
  summarize(result = my_function(.[.$year <= first(year), ]), .groups = "drop") %>%
  mutate(year = paste(first(year), year, sep = "-")) %>%
  tail(-9)
giving:
# A tibble: 11 x 2
year result
<chr> <int>
1 1995-2004 1502
2 1995-2005 1647
3 1995-2006 1810
4 1995-2007 1957
5 1995-2008 2106
6 1995-2009 2258
7 1995-2010 2398
8 1995-2011 2547
9 1995-2012 2697
10 1995-2013 2855
11 1995-2014 3016
With my_function <- function(x) list(x["count"]) the output looks like this:
# A tibble: 11 x 2
year result
<chr> <list>
1 1995-2004 <tibble [300 x 1]>
2 1995-2005 <tibble [330 x 1]>
3 1995-2006 <tibble [360 x 1]>
4 1995-2007 <tibble [390 x 1]>
5 1995-2008 <tibble [420 x 1]>
6 1995-2009 <tibble [450 x 1]>
7 1995-2010 <tibble [480 x 1]>
8 1995-2011 <tibble [510 x 1]>
9 1995-2012 <tibble [540 x 1]>
10 1995-2013 <tibble [570 x 1]>
11 1995-2014 <tibble [600 x 1]>
Note
The test input dat in reproducible form is:
set.seed(123)
dat <- data.frame(year = rep(1995:2014, each = 30), count = rpois(600, 5))
Here is my attempt at creating nested data with time-series data on a rolling-window basis. (Note: the rlang usage of var = enquo(str_varname) with !!var may change in future versions.)
library(dplyr)
library(tidyr)
create_rolling_yr_data <- function(df, year = 'year', rolling = 9,
                                   var_list = c('count'), newvar = 'rolling') {
  year <- enquo(year)
  var_list <- enquos(var_list)
  df <- df %>% dplyr::select(!!year, !!!var_list)
  df_nest <- df %>% group_by(year) %>% nest()
  print(df_nest)
  list_data <- list()
  # rlang::as_name() recovers the column name from the quosure built above
  yrs <- unique(df[[rlang::as_name(year)]])
  yr_end <- max(yrs) - rolling
  for (i in seq_along(yrs)) {
    yr <- yrs[i]
    if (yr <= yr_end) {
      list_data[[i]] <- df %>% filter(year >= yr, year <= (yr + rolling))
    } else {
      list_data[[i]] <- list()
    }
  }
  df_nest[[newvar]] <- list_data
  return(df_nest %>% filter(year <= yr_end))
}
create_rolling_yr_data(dat, year = 'year', rolling = 9,
                       var_list = c('count'), newvar = 'rolling')
I have the following dataframe:
df <- data.frame(id = paste0('id', sample(c(1:4), 80000, replace = TRUE)),
                 date = as.Date(rbeta(80000, 0.7, 10) * 100, origin = "2016-01-01"),
                 variant = sample(c(0:1), 80000, replace = TRUE),
                 type = sample(paste0(LETTERS[1:3], LETTERS[1]), 80000, TRUE),
                 code = sample(letters[1:2], 80000, TRUE),
                 level = sample(LETTERS[1:8], 80000, TRUE),
                 number = sample(c(1:100), 80000, replace = TRUE))
Next, I split the dataframe several times and combine them (plus the original df) in a list:
dfs <- split(df,df$id)
df2 <- lapply(dfs, function(x) split(x,x$type))
df3 <- lapply(dfs, function(x) split(x,x$code))
df4 <- lapply(dfs, function(x) split(x,x$level))
df_all <- list(dfs,df2,df3,df4)
Thus, I first split the dataframe by id, after which the pieces are split further on several conditions: none, type, code and level, where "none" means that I don't split it any further.
My first question: is there a faster/cleaner way to achieve this?
Second question: how do I apply a function to each element of this list? It probably has something to do with lapply, but I can't figure out how, as the number of nested lists varies. To make it more clear, I would like to know how to apply my function (a minimal recursive sketch follows the function definition below) to:
df_all[[1]]$id1
df_all[[1]]$id2
df_all[[1]]$id3
df_all[[1]]$id4
df_all[[2]]$id1$AA
df_all[[2]]$id1$BA
df_all[[2]]$id1$CA
df_all[[2]]$id2$AA
etc.
My function is as follows:
func <- function(x){
  x <- x %>%
    group_by(variant) %>%
    summarise(H = sum(number)) %>%
    ungroup()
  x
}
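For reference, the sketch mentioned above: a small recursive helper (apply_nested is an illustrative name, not from the original code) that walks a nested list such as df_all and applies a function to every data frame it finds, whatever the depth:
# apply f to every data frame found anywhere inside a nested list
apply_nested <- function(x, f) {
  if (is.data.frame(x)) f(x) else lapply(x, apply_nested, f = f)
}
# e.g. results[[2]]$id1$AA would hold func(df_all[[2]]$id1$AA)
results <- lapply(df_all, apply_nested, f = func)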
If all you want to do is group by different combinations of variables and summarize, then splitting into groups is probably not a good idea; just modify the function so that you can pass in different combinations of group-by variables, like the following:
library(dplyr)
func2 <- function(x, ...){
  group_quo = quos(...)
  x %>%
    group_by(!!!group_quo) %>%
    summarize(H = sum(number))
}
Result:
> func2(df, id, variant)
# A tibble: 8 x 3
# Groups: id [?]
id variant H
<fct> <int> <int>
1 id1 0 500192
2 id1 1 508282
3 id2 0 505829
4 id2 1 511855
5 id3 0 502280
6 id3 1 510854
7 id4 0 502621
8 id4 1 510372
> func2(df, id, type, variant)
# A tibble: 24 x 4
# Groups: id, type [?]
id type variant H
<fct> <fct> <int> <int>
1 id1 AA 0 167757
2 id1 AA 1 169025
3 id1 BA 0 166225
4 id1 BA 1 168208
5 id1 CA 0 166210
6 id1 CA 1 171049
7 id2 AA 0 169277
8 id2 AA 1 172240
9 id2 BA 0 168596
10 id2 BA 1 169396
# ... with 14 more rows
etc.
If you're trying to apply something more complex or you want to keep the hierarchical structure of the lists, you can try to use nested data.frames:
library(dplyr)
library(tidyr)
library(purrr)
func <- function(x){
  x %>%
    group_by(variant) %>%
    summarize(H = sum(number))
}
df_nested = df %>%
  group_by(id) %>%
  nest() %>%
  mutate(df1 = data %>% map(func),
         df2 = data %>% map(~group_by(., type) %>% nest()),
         df3 = data %>% map(~group_by(., code) %>% nest()),
         df4 = data %>% map(~group_by(., level) %>% nest())) %>%
  mutate_at(vars(df2:df4),
            funs(map(., function(x) mutate(x, data = map(data, func)) %>% unnest)))
Result:
> df_nested
# A tibble: 4 x 6
id data df1 df2 df3 df4
<fct> <list> <list> <list> <list> <list>
1 id1 <tibble [19,963 x 6]> <tibble [2 x 2]> <tibble [6 x 3]> <tibble [4 x 3]> <tibble [16 x 3]>
2 id3 <tibble [19,946 x 6]> <tibble [2 x 2]> <tibble [6 x 3]> <tibble [4 x 3]> <tibble [16 x 3]>
3 id2 <tibble [20,114 x 6]> <tibble [2 x 2]> <tibble [6 x 3]> <tibble [4 x 3]> <tibble [16 x 3]>
4 id4 <tibble [19,977 x 6]> <tibble [2 x 2]> <tibble [6 x 3]> <tibble [4 x 3]> <tibble [16 x 3]>
> df_nested %>%
+ select(id, data) %>%
+ unnest()
# A tibble: 80,000 x 7
id date variant type code level number
<fct> <date> <int> <fct> <fct> <fct> <int>
1 id1 2016-01-05 1 AA b H 71
2 id1 2016-01-01 0 CA a G 85
3 id1 2016-01-03 0 CA a E 98
4 id1 2016-01-01 1 BA b E 78
5 id1 2016-01-01 1 BA b G 64
6 id1 2016-01-18 1 AA a E 69
7 id1 2016-01-04 1 BA b E 12
8 id1 2016-01-02 0 CA b B 32
9 id1 2016-01-01 1 CA a B 44
10 id1 2016-01-02 0 BA a F 89
# ... with 79,990 more rows
> df_nested %>%
+ select(id, df1) %>%
+ unnest()
# A tibble: 8 x 3
id variant H
<fct> <int> <int>
1 id1 0 500192
2 id1 1 508282
3 id3 0 502280
4 id3 1 510854
5 id2 0 505829
6 id2 1 511855
7 id4 0 502621
8 id4 1 510372
I'm trying to use map() from the purrr package to apply the filter() function to data stored in a nested data frame.
"Why wouldn't you filter first, and then nest?" you might ask.
That will work (and I'll show my desired outcome using such a process), but I'm looking for ways to do it with purrr.
I want to have just one data frame, with two list-columns, both being nested data frames - one full and one filtered.
I can achieve it now by performing nest() twice: once on all data, and second on filtered data:
library(tidyverse)
df <- tibble(
  a = sample(x = rep(c('x','y'), 5), size = 10),
  b = sample(c(1:10)),
  c = sample(c(91:100))
)
df_full_nested <- df %>%
  group_by(a) %>%
  nest(.key = 'full')
df_filter_nested <- df %>%
  filter(c >= 95) %>% ## this is the key step
  group_by(a) %>%
  nest(.key = 'filtered')
## Desired outcome - one data frame with 2 nested list-columns: one full and one filtered.
## How to achieve this without breaking it out into 2 separate data frames?
df_nested <- df_full_nested %>%
  left_join(df_filter_nested, by = 'a')
The objects look like this:
> df
# A tibble: 10 x 3
a b c
<chr> <int> <int>
1 y 8 93
2 x 9 94
3 y 10 99
4 x 5 97
5 y 2 100
6 y 3 95
7 x 7 96
8 y 6 92
9 x 4 91
10 x 1 98
> df_full_nested
# A tibble: 2 x 2
a full
<chr> <list>
1 y <tibble [5 x 2]>
2 x <tibble [5 x 2]>
> df_filter_nested
# A tibble: 2 x 2
a filtered
<chr> <list>
1 y <tibble [3 x 2]>
2 x <tibble [3 x 2]>
> df_nested
# A tibble: 2 x 3
a full filtered
<chr> <list> <list>
1 y <tibble [5 x 2]> <tibble [4 x 2]>
2 x <tibble [5 x 2]> <tibble [4 x 2]>
So, this works. But it is not clean. And in real life, I group by several columns, which means I also have to join on several columns... It gets hairy fast.
I'm wondering if there is a way to apply filter to the nested column. This way, I'd operate within the same object. Just cleaner and more understandable code.
I'm thinking it'd look like
df_full_nested %>% mutate(filtered = map(full, ...))
But I am not sure how to map filter() properly
Thanks!
You can use map(full, ~ filter(., c >= 95)), where . stands for the individual nested tibble, to which you can apply the filter directly:
df_nested_2 <- df_full_nested %>% mutate(filtered = map(full, ~ filter(., c >= 95)))
identical(df_nested, df_nested_2)
# [1] TRUE
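The same pattern extends when you group by several columns, since the filter happens inside each nested tibble and no join is needed. A minimal sketch, assuming a data frame my_df with grouping columns g1 and g2 and a numeric column c (all names here are hypothetical):
# group by several columns, nest, then filter inside each nested tibble
my_df %>%
  group_by(g1, g2) %>%
  nest(.key = 'full') %>%
  mutate(filtered = map(full, ~ filter(., c >= 95)))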
I would like to perform calculations on a nested data frame (stored as a list-column) and add the calculated variables back to each dataframe using purrr functions. I'll use this result to join to other data, and keeping it compact helps me organize and examine it better. I can do this in a couple of steps, but it seems like there may be a simpler solution that I haven't been able to find.
Load libraries. The example requires the following packages (available on CRAN):
library(dplyr)
library(purrr)
library(RcppRoll) # to calculate rolling mean
Example data with 3 subjects, and repeated measurements over time:
test <- data_frame(
id= rep(1:3, each=20),
time = rep(1:20, 3),
var1 = rnorm(60, mean=10, sd=3),
var2 = rnorm(60, mean=95, sd=5)
)
Store the data as nested dataframe:
t_nest <- test %>% nest(-id)
id data
<int> <list>
1 1 <tibble [20 x 3]>
2 2 <tibble [20 x 3]>
3 3 <tibble [20 x 3]>
Perform calculations. I will calculate multiple new variables based on the data, although a solution for just one could be expanded later. The result of each calculation will be a numeric vector, same length as the input (n=20):
t1 <- t_nest %>%
  mutate(var1_rollmean4 = map(data, ~ RcppRoll::roll_mean(.$var1, n = 4, align = "right", fill = NA)),
         var2_delta4 = map(data, ~ (.$var2 - lag(.$var2, 3)) * 0.095),
         var3 = map2(var1_rollmean4, var2_delta4, ~ .x - .y))
id data var1_rollmean4 var2_delta4 var3
<int> <list> <list> <list> <list>
1 1 <tibble [20 x 3]> <dbl [20]> <dbl [20]> <dbl [20]>
2 2 <tibble [20 x 3]> <dbl [20]> <dbl [20]> <dbl [20]>
3 3 <tibble [20 x 3]> <dbl [20]> <dbl [20]> <dbl [20]>
My solution is to unnest this data and then nest it again. There doesn't seem to be anything wrong with this, but it seems like a better solution may exist.
t1 %>% unnest %>%
nest(-id)
id data
<int> <list>
1 1 <tibble [20 x 6]>
2 2 <tibble [20 x 6]>
3 3 <tibble [20 x 6]>
This other solution (from SO 42028710) is close, but not quite, because it produces a list rather than nested dataframes:
map_df(t_nest$data, ~ mutate(.x, var1calc = .$var1*100))
I've found quite a bit of helpful information using the purrr Cheatsheet but can't quite find the answer.
You can wrap another mutate when mapping through the data column and add the columns in each nested tibble:
t11 <- t_nest %>%
  mutate(data = map(data,
                    ~ mutate(.x,
                             var1_rollmean4 = RcppRoll::roll_mean(var1, n = 4, align = "right", fill = NA),
                             var2_delta4 = (var2 - lag(var2, 3)) * 0.095,
                             var3 = var1_rollmean4 - var2_delta4
                    )
  ))
t11
# A tibble: 3 x 2
# id data
# <int> <list>
#1 1 <tibble [20 x 6]>
#2 2 <tibble [20 x 6]>
#3 3 <tibble [20 x 6]>
For comparison with the unnest-nest method (reordering the columns inside so the results match):
nest_unnest <- t1 %>%
  unnest %>%
  nest(-id) %>%
  mutate(data = map(data, ~ select(.x, time, var1, var2, var1_rollmean4, var2_delta4, var3)))
identical(nest_unnest, t11)
# [1] TRUE
It seems like, for what you're trying to do, nesting is not necessary:
library(tidyverse)
library(zoo)
test %>%
  group_by(id) %>%
  mutate(var1_rollmean4 = rollapplyr(var1, 4, mean, fill = NA),
         var2_delta4 = (var2 - lag(var2, 3)) * 0.095,
         var3 = (var1_rollmean4 - var2_delta4))
# A tibble: 60 x 7
# Groups: id [3]
# id time var1 var2 var1_rollmean4 var2_delta4 var3
# <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 1 9.865199 96.45723 NA NA NA
# 2 1 2 9.951429 92.78354 NA NA NA
# 3 1 3 12.831509 95.00553 NA NA NA
# 4 1 4 12.463664 95.37171 11.277950 -0.10312483 11.381075
# 5 1 5 11.781704 92.05240 11.757076 -0.06945881 11.826535
# 6 1 6 12.756932 92.15666 12.458452 -0.27064269 12.729095
# 7 1 7 12.346409 94.32411 12.337177 -0.09952197 12.436699
# 8 1 8 10.223695 100.89043 11.777185 0.83961377 10.937571
# 9 1 9 4.031945 87.38217 9.839745 -0.45357658 10.293322
# 10 1 10 11.859477 97.96973 9.615382 0.34633428 9.269047
# ... with 50 more rows
Edit: You could still nest the result afterwards with %>% nest(-id), as in the sketch below.
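A minimal sketch of that, using the same computation as above (nest(-id) is the older tidyr interface used throughout this post):
test %>%
  group_by(id) %>%
  mutate(var1_rollmean4 = rollapplyr(var1, 4, mean, fill = NA),
         var2_delta4 = (var2 - lag(var2, 3)) * 0.095,
         var3 = var1_rollmean4 - var2_delta4) %>%
  ungroup() %>%
  nest(-id)   # in tidyr >= 1.0.0 this is spelled nest(data = -id)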
If you still prefer to nest or are nesting for other reasons, it would go like
t1 <- t_nest %>%
mutate(data = map(data, ~.x %>% mutate(...)))
That is, you mutate on .x within the map statement. This will treat data as a data.frame and mutate will column-bind results to it.