I have two lists of lists in R with the same element names, as follows:
str(total_delta_final[[1]])
List of 4
$ sector1_T02 :'data.frame': 24 obs. of 3 variables:
..$ DeltaF_1: num [1:24] 0.737 0.737 0.693 0.738 0.738 ...
..$ DeltaF_2: num [1:24] 0.24 0.24 0.279 0.239 0.239 ...
..$ DeltaF_3: num [1:24] 0.0233 0.0233 0.0275 0.0232 0.0232 ...
$ sector2_T03 :'data.frame': 24 obs. of 3 variables:
..$ DeltaF_1: num [1:24] 0.582 0.582 0.568 0.69 0.69 ...
..$ DeltaF_2: num [1:24] 0.377 0.377 0.39 0.282 0.282 ...
..$ DeltaF_3: num [1:24] 0.0406 0.0406 0.0426 0.0278 0.0278 ...
$ sector3_T03 :'data.frame': 24 obs. of 3 variables:
..$ DeltaF_1: num [1:24] 0.607 0.607 0.495 0.409 0.375 ...
..$ DeltaF_2: num [1:24] 0.356 0.356 0.451 0.519 0.544 ...
..$ DeltaF_3: num [1:24] 0.0373 0.0373 0.0541 0.072 0.0809 ...
$ sector12_T02:'data.frame': 24 obs. of 3 variables:
..$ DeltaF_1: num [1:24] 0.743 0.743 0.758 0.689 0.705 ...
..$ DeltaF_2: num [1:24] 0.234 0.234 0.22 0.283 0.269 ...
..$ DeltaF_3: num [1:24] 0.0226 0.0226 0.0213 0.028 0.0263 ...
str(total_TI_final[[1]])
List of 4
$ sector1_T02 :'data.frame': 24 obs. of 3 variables:
..$ I_1: num [1:24] NA 0.0756 0.083 0.0799 0.0799 ...
..$ I_2: num [1:24] 0.122 NA 0.163 0.172 0.172 ...
..$ I_3: num [1:24] 0.212 0.211 NA 0.266 0.273 ...
$ sector2_T03 :'data.frame': 24 obs. of 3 variables:
..$ I_1: num [1:24] NA 0.0986 0.1013 0.1011 0.101 ...
..$ I_2: num [1:24] 0.15 NA 0.184 0.211 0.211 ...
..$ I_3: num [1:24] 0.249 0.249 NA 0.331 0.337 ...
$ sector3_T03 :'data.frame': 24 obs. of 3 variables:
..$ I_1: num [1:24] NA 0.119 0.115 0.113 0.105 ...
..$ I_2: num [1:24] 0.193 NA 0.2 0.193 0.177 ...
..$ I_3: num [1:24] 0.323 0.323 NA 0.277 0.256 ...
$ sector12_T02:'data.frame': 24 obs. of 3 variables:
..$ I_1: num [1:24] NA 0.0825 0.0681 0.0723 0.0706 ...
..$ I_2: num [1:24] 0.138 NA 0.146 0.145 0.144 ...
..$ I_3: num [1:24] 0.24 0.24 NA 0.22 0.226 ...
How can I merge these two lists of lists so that the final output alternates between them: total_TI_final[[1]][1], then total_delta_final[[1]][1], then total_TI_final[[1]][2], then total_delta_final[[1]][2], and so on?
We can use Map:
Map(c, total_delta_final, total_TI_final)
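Note that c appends rather than interleaves, so within each outer element all the delta data frames come first, followed by all the TI data frames. If the alternating order described in the question is needed, one possible sketch is below. The toy data and the helper names make_inner and interleave are made up for illustration and are not part of the original answer.

```r
# Toy stand-ins for the structures in the question (hypothetical data)
make_inner <- function(col_name) {
  sectors <- c("sector1_T02", "sector2_T03")
  setNames(
    lapply(sectors, function(s) setNames(data.frame(1:3), col_name)),
    sectors
  )
}
total_delta_final <- list(make_inner("DeltaF_1"))
total_TI_final    <- list(make_inner("I_1"))

# Map(c, ...) appends: delta 1, delta 2, TI 1, TI 2
appended <- Map(c, total_delta_final, total_TI_final)

# Hypothetical helper: alternate TI 1, delta 1, TI 2, delta 2, ...
interleave <- function(ti, delta) {
  # order() is stable, so ties keep the TI element ahead of the delta element
  idx <- order(c(seq_along(ti), seq_along(delta)))
  c(ti, delta)[idx]
}
interleaved <- Map(interleave, total_TI_final, total_delta_final)

names(interleaved[[1]])
```

Each sector name then appears twice in a row, first holding the I_* data frame and then the DeltaF_* data frame.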
I have a dataset and would like to take a lot of subsets based on various columns, values, and conditional operators. I think the most desirable output is a list containing all of these subsetted data frames as separate elements in the list. I attempted to do this by building a data frame that contains the subset conditions I would like to use, building a function, then using apply to feed that data frame to the function, but that didn't work. I'm sure there's probably a better method that uses an anonymous function or something like that, but I'm not sure how I would implement that. Below is an example code that should produce 8 subsets of data.
Original dataset, where x1 and x2 are scores on items that won't be used for subsetting, and RT and LS are the variables to subset on:
df <- data.frame(x1 = rnorm(100),
                 x2 = rnorm(100),
                 RT = abs(rnorm(100)),
                 LS = sample(1:10, 100, replace = TRUE))
A data frame containing the conditions for subsetting. E.g., the first subset should be all observations with values greater than or equal to 0.5 in the RT column, the second subset should be all observations with values greater than or equal to 1 in the RT column, etc. There should be 8 subsets: 4 on the RT variable and 4 on the LS variable.
subsetConditions <- data.frame(column = rep(c("RT", "LS"), each = 4),
operator = rep(c(">=", "<="), each = 4),
value = c(0.5, 1, 1.5, 2,
9, 8, 7, 6))
And this is the ugly function I wrote to attempt to do this:
subsetFun <- function(x){
subset(df, eval(parse(text = paste(x))))
}
subsets <- apply(subsetConditions, 1, subsetFun)
Thanks for any help!
Consider Map (wrapper to mapply) without any eval + parse. Since ==, <=, >=, and other operators can be used as functions with two arguments where 4 <= 5 can be written as `<=`(4,5) or "<="(4, 5), simply pass arguments elementwise and use get to reference the function by string:
sub_data <- function(col, op, val) {
df[get(op)(df[[col]], val),]
}
sub_dfs <- with(subsetConditions, Map(sub_data, column, operator, value))
Output
str(sub_dfs)
List of 8
$ RT:'data.frame': 62 obs. of 4 variables:
..$ x1: num [1:62] -1.12 -0.745 -1.377 0.848 1.63 ...
..$ x2: num [1:62] -0.257 -2.385 0.805 -0.313 0.662 ...
..$ RT: num [1:62] 0.693 1.662 0.731 2.145 0.543 ...
..$ LS: int [1:62] 5 5 1 2 9 1 5 9 3 10 ...
$ RT:'data.frame': 36 obs. of 4 variables:
..$ x1: num [1:36] -0.745 0.848 0.908 -0.761 0.74 ...
..$ x2: num [1:36] -2.3849 -0.3131 -2.4645 -0.0784 0.8512 ...
..$ RT: num [1:36] 1.66 2.15 1.74 1.65 1.13 ...
..$ LS: int [1:36] 5 2 1 5 9 10 2 7 1 3 ...
$ RT:'data.frame': 14 obs. of 4 variables:
..$ x1: num [1:14] -0.745 0.848 0.908 -0.761 -1.063 ...
..$ x2: num [1:14] -2.3849 -0.3131 -2.4645 -0.0784 -2.9886 ...
..$ RT: num [1:14] 1.66 2.15 1.74 1.65 2.63 ...
..$ LS: int [1:14] 5 2 1 5 5 6 9 4 8 4 ...
$ RT:'data.frame': 3 obs. of 4 variables:
..$ x1: num [1:3] 0.848 -1.063 0.197
..$ x2: num [1:3] -0.313 -2.989 0.709
..$ RT: num [1:3] 2.15 2.63 2.05
..$ LS: int [1:3] 2 5 6
$ LS:'data.frame': 92 obs. of 4 variables:
..$ x1: num [1:92] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:92] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:92] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:92] 5 5 1 2 1 9 1 5 9 3 ...
$ LS:'data.frame': 78 obs. of 4 variables:
..$ x1: num [1:78] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:78] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:78] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:78] 5 5 1 2 1 1 5 3 5 2 ...
$ LS:'data.frame': 75 obs. of 4 variables:
..$ x1: num [1:75] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:75] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:75] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:75] 5 5 1 2 1 1 5 3 5 2 ...
$ LS:'data.frame': 62 obs. of 4 variables:
..$ x1: num [1:62] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:62] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:62] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:62] 5 5 1 2 1 1 5 3 5 2 ...
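As a variation on the same idea, here is a minimal self-contained sketch that looks up the operator with match.fun instead of get and gives each subset a descriptive name. The seed, the stringsAsFactors = FALSE setting (only needed on R versions before 4.0) and the naming scheme are assumptions added for illustration.

```r
set.seed(1)  # arbitrary seed so the example is reproducible
df <- data.frame(x1 = rnorm(100),
                 x2 = rnorm(100),
                 RT = abs(rnorm(100)),
                 LS = sample(1:10, 100, replace = TRUE))

subsetConditions <- data.frame(column = rep(c("RT", "LS"), each = 4),
                               operator = rep(c(">=", "<="), each = 4),
                               value = c(0.5, 1, 1.5, 2,
                                         9, 8, 7, 6),
                               stringsAsFactors = FALSE)

sub_data <- function(col, op, val) {
  # match.fun() turns the operator string into the comparison function
  keep <- match.fun(op)(df[[col]], val)
  df[keep, , drop = FALSE]
}

sub_dfs <- with(subsetConditions, Map(sub_data, column, operator, value))

# Label each subset with its condition, e.g. "RT >= 0.5"
names(sub_dfs) <- with(subsetConditions, paste(column, operator, value))

sapply(sub_dfs, nrow)
```

match.fun only searches for functions, so a variable that happens to share a name with an operator cannot be picked up by mistake.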
You were actually pretty close with your function; it just needs one adjustment. apply passes each row as a character vector with 3 elements, so paste(x) returns 3 separate strings. Collapse them into a single string so that the expression can be parsed and evaluated properly.
subsetFun <- function(x){
subset(df, eval(parse(text = paste(x, collapse = ""))))
}
subsets <- apply(subsetConditions, 1, subsetFun)
Output
Then, it will return the 8 subsets.
str(subsets)
List of 8
$ :'data.frame': 67 obs. of 4 variables:
..$ x1: num [1:67] -1.208 0.606 -0.17 0.728 -0.424 ...
..$ x2: num [1:67] 0.4058 -0.3041 -0.3357 0.7904 -0.0264 ...
..$ RT: num [1:67] 1.972 0.883 0.598 0.633 1.517 ...
..$ LS: int [1:67] 8 9 2 10 8 5 3 4 7 2 ...
$ :'data.frame': 35 obs. of 4 variables:
..$ x1: num [1:35] -1.2083 -0.4241 -0.0906 0.9851 -0.8236 ...
..$ x2: num [1:35] 0.4058 -0.0264 1.0054 0.0653 1.4647 ...
..$ RT: num [1:35] 1.97 1.52 1.05 1.63 1.47 ...
..$ LS: int [1:35] 8 8 5 4 7 3 1 6 8 6 ...
$ :'data.frame': 16 obs. of 4 variables:
..$ x1: num [1:16] -1.208 -0.424 0.985 0.99 0.939 ...
..$ x2: num [1:16] 0.4058 -0.0264 0.0653 0.3486 -0.7562 ...
..$ RT: num [1:16] 1.97 1.52 1.63 1.85 1.8 ...
..$ LS: int [1:16] 8 8 4 6 10 2 6 6 3 9 ...
$ :'data.frame': 7 obs. of 4 variables:
..$ x1: num [1:7] 0.963 0.423 -0.444 0.279 0.417 ...
..$ x2: num [1:7] 0.6612 0.0354 0.0555 0.1253 -0.3056 ...
..$ RT: num [1:7] 2.71 2.15 2.05 2.01 2.07 ...
..$ LS: int [1:7] 2 6 9 9 7 7 4
$ :'data.frame': 91 obs. of 4 variables:
..$ x1: num [1:91] -0.952 -1.208 0.606 -0.17 -0.048 ...
..$ x2: num [1:91] -0.645 0.406 -0.304 -0.336 -0.897 ...
..$ RT: num [1:91] 0.471 1.972 0.883 0.598 0.224 ...
..$ LS: int [1:91] 6 8 9 2 1 8 4 5 3 4 ...
$ :'data.frame': 75 obs. of 4 variables:
..$ x1: num [1:75] -0.952 -1.208 -0.17 -0.048 -0.424 ...
..$ x2: num [1:75] -0.6448 0.4058 -0.3357 -0.8968 -0.0264 ...
..$ RT: num [1:75] 0.471 1.972 0.598 0.224 1.517 ...
..$ LS: int [1:75] 6 8 2 1 8 4 5 3 4 1 ...
$ :'data.frame': 65 obs. of 4 variables:
..$ x1: num [1:65] -0.9517 -0.1698 -0.048 0.2834 -0.0906 ...
..$ x2: num [1:65] -0.645 -0.336 -0.897 -2.072 1.005 ...
..$ RT: num [1:65] 0.471 0.598 0.224 0.486 1.053 ...
..$ LS: int [1:65] 6 2 1 4 5 3 4 1 7 4 ...
$ :'data.frame': 58 obs. of 4 variables:
..$ x1: num [1:58] -0.9517 -0.1698 -0.048 0.2834 -0.0906 ...
..$ x2: num [1:58] -0.645 -0.336 -0.897 -2.072 1.005 ...
..$ RT: num [1:58] 0.471 0.598 0.224 0.486 1.053 ...
..$ LS: int [1:58] 6 2 1 4 5 3 4 1 4 2 ...
I have several lists (ListA, ListB, ListC...) with the same internal structure as the example below. I would like to combine all of them, keeping their structure, and have one list with all lists (ListAll). How can I do this?
Example:
I have:
ListA
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.128
..$ sd : num 1.11
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0116 -0.0156 0.0336 -0.0502 -0.0427 ...
..$ sd : num [1:1000] 1.003 1.014 0.963 1.036 1.051 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 3.45 2.91 2.62 2.06 1.87 ...
..$ D: num [1:35] 5.42 2.89 3.34 1.68 1.43 ...
and several lists with the same structure.
I would like to get:
ListAll
$ ListA
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.128
..$ sd : num 1.11
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0116 -0.0156 0.0336 -0.0502 -0.0427 ...
..$ sd : num [1:1000] 1.003 1.014 0.963 1.036 1.051 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 3.45 2.91 2.62 2.06 1.87 ...
..$ D: num [1:35] 5.42 2.89 3.34 1.68 1.43 ...
$ ListB
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.132
..$ sd : num 1.01
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0114 -0.0123 0.0378 -0.0102 -0.0340 ...
..$ sd : num [1:1000] 1.013 1.011 0.876 1.012 1.023 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 4.41 1.61 1.42 1.96 2.07 ...
..$ D: num [1:35] 2.41 2.19 2.54 2.08 2.53 ...
and names(ListAll) would be:
ListA, ListB, ListC...
You can create a list of lists in base R.
ListAll <- list(ListA, ListB, ListC)
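The call above produces an unnamed list. To get names(ListAll) equal to the object names, as the question asks, the names can be supplied explicitly or picked up with mget. A minimal sketch with made-up stand-in data:

```r
# Small stand-ins for the lists in the question (hypothetical values)
ListA <- list(data = data.frame(mean = -0.128, sd = 1.11),
              values = data.frame(C = c(3.45, 2.91), D = c(5.42, 2.89)))
ListB <- list(data = data.frame(mean = -0.132, sd = 1.01),
              values = data.frame(C = c(4.41, 1.61), D = c(2.41, 2.19)))

# Option 1: name the elements explicitly
ListAll <- list(ListA = ListA, ListB = ListB)

# Option 2: look the objects up by name; mget() returns a named list
ListAll2 <- mget(c("ListA", "ListB"), envir = environment())

names(ListAll)
str(ListAll, max.level = 2)
```

With many lists that follow a naming pattern, the character vector passed to mget could be built with ls(pattern = "^List"), assuming no other objects match that pattern.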
I have a data.table called td.br.2, in which some columns are completely NAs. These columns are of type numeric. What I would like to do, is only for these columns to transform them to factors.
I have tried the following, but it does not work (I do not get an error, but it does not do the job either):
td.br.2[] <- td.br.2[,lapply(.SD, function(x) {ifelse(sum(is.na(x)==nrow(td.br.2)),as.factor(x),x)})]
library(data.table)
n <- 10  # number of rows
m <- 10  # number of columns
N <- n * m
m1 <- matrix(runif(N), nrow = n, ncol = m)
dt <- data.table(m1)
names(dt) <- letters[1:m]
dt <- cbind(dt, xxx = rep(NA, nrow(dt)))  # add an all-NA column
At this point
str(dt)
Classes ‘data.table’ and 'data.frame': 10 obs. of 11 variables:
$ a : num 0.661 0.864 0.152 0.342 0.989 ...
$ b : num 0.06036 0.67587 0.00847 0.37674 0.30417 ...
$ c : num 0.3938 0.6274 0.0514 0.882 0.1568 ...
$ d : num 0.777 0.233 0.619 0.117 0.132 ...
$ e : num 0.655 0.926 0.277 0.598 0.237 ...
$ f : num 0.649 0.197 0.547 0.585 0.685 ...
$ g : num 0.6877 0.3676 0.009 0.6975 0.0327 ...
$ h : num 0.519 0.705 0.457 0.465 0.966 ...
$ i : num 0.43777 0.00961 0.30224 0.58172 0.37621 ...
$ j : num 0.44 0.481 0.485 0.125 0.263 ...
$ xxx: logi NA NA NA NA NA NA ...
So by executing:
dt <- dt[, lapply(.SD, function(x) if (all(is.na(x))) as.factor(as.character(x)) else x)]
yields:
str(dt)
Classes ‘data.table’ and 'data.frame': 10 obs. of 11 variables:
$ a : num 0.0903 0.0448 0.5956 0.418 0.1316 ...
$ b : num 0.672 0.582 0.687 0.113 0.371 ...
$ c : num 0.404 0.16 0.848 0.863 0.737 ...
$ d : num 0.073 0.129 0.243 0.334 0.285 ...
$ e : num 0.485 0.186 0.539 0.486 0.784 ...
$ f : num 0.4685 0.4815 0.585 0.3596 0.0764 ...
$ g : num 0.958 0.194 0.549 0.71 0.737 ...
$ h : num 0.168 0.355 0.552 0.765 0.605 ...
$ i : num 0.665 0.88 0.23 0.575 0.413 ...
$ j : num 0.1113 0.8797 0.1244 0.0741 0.8724 ...
$ xxx: Factor w/ 0 levels: NA NA NA NA NA NA NA NA NA NA
I am not sure why you would want to do that, but here you are:
naColumns <- sapply(td.br.2, function(x) all(is.na(x)))
# set() modifies td.br.2 by reference, one all-NA column at a time
for (col in which(naColumns))
  set(td.br.2, j = col, value = as.factor(td.br.2[[col]]))
The factors will have no levels, but you can deal with that as necessary.
(The for loop is partly based on this.)
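For comparison, the same idea can be sketched for a plain data.frame in base R, without data.table. The example data below is made up, and this is an illustration rather than part of the original answer.

```r
# Hypothetical data: column b is entirely NA, column c only partly
df <- data.frame(a = c(0.1, 0.2, 0.3),
                 b = c(NA_real_, NA_real_, NA_real_),
                 c = c(1, NA, 3))

# Flag the columns in which every value is NA
all_na <- vapply(df, function(x) all(is.na(x)), logical(1))

# Convert only those columns to factor; the others keep their type
df[all_na] <- lapply(df[all_na], as.factor)

str(df)
```

As with the data.table version, the resulting factor has no levels.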
I have a list
str(overlaps)
List of 7
$ a5: chr [1:6] "calc_a1c" "predmdx_flag" "bmi" "systolic" ...
$ a2: chr [1:2] "age" "yr"
$ a4: chr(0)
$ a6: chr(0)
$ a1: chr [1:2] "trig_3cat" "glipizide_flag"
$ a3: chr [1:2] "email_flag" "statins_flag"
$ a7: chr [1:4] "trig_3cat.>=200" "antihtn_flag" "black_flag" "gender.M"
And I want to reorder the list so that its elements are in numerical order, i.e. a1, a2, a3, etc.
Some random data:
set.seed(42)
overlaps <- replicate(5, runif(3), simplify=FALSE)
names(overlaps) <- paste0("a", sample(5))
str(overlaps)
# List of 5
# $ a5: num [1:3] 0.915 0.937 0.286
# $ a4: num [1:3] 0.83 0.642 0.519
# $ a1: num [1:3] 0.737 0.135 0.657
# $ a3: num [1:3] 0.705 0.458 0.719
# $ a2: num [1:3] 0.935 0.255 0.462
The sort:
str(overlaps[ sort(names(overlaps)) ])
# List of 5
# $ a1: num [1:3] 0.737 0.135 0.657
# $ a2: num [1:3] 0.935 0.255 0.462
# $ a3: num [1:3] 0.705 0.458 0.719
# $ a4: num [1:3] 0.83 0.642 0.519
# $ a5: num [1:3] 0.915 0.937 0.286
This is an alternative to r2evans' answer, which is excellent. I used his example dataset.
str(overlaps[order(names(overlaps))])
#List of 5
# $ a1: num [1:3] 0.737 0.135 0.657
# $ a2: num [1:3] 0.935 0.255 0.462
# $ a3: num [1:3] 0.705 0.458 0.719
# $ a4: num [1:3] 0.83 0.642 0.519
# $ a5: num [1:3] 0.915 0.937 0.286
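Both approaches sort the names as strings, which matches numerical order only while every number has the same count of digits. A hedged sketch for names such as a10, assuming each name is a non-numeric prefix followed by an integer:

```r
# Hypothetical list whose names include a two-digit suffix
overlaps <- setNames(as.list(1:4), c("a10", "a2", "a1", "a3"))

# Strip everything that is not a digit and order by the remaining number
num <- as.integer(gsub("\\D", "", names(overlaps)))
sorted <- overlaps[order(num)]

names(sorted)
# [1] "a1"  "a2"  "a3"  "a10"
```

Ordering the names as plain strings would instead place a10 before a2.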
I have a dataset that I need to clean by removing the rows that contain values above 4 standard deviations. I need to delete every row where the value in one or more of columns 2:18 is above 4 standard deviations. What is the best way to do this?
'data.frame': 154940 obs. of 19 variables:
$ msec: int 0 170 340 500 670 840 1010 1180 1340 1510 ...
$ a412: num 0.0607 0.0584 0.0644 0.0607 0.0577 ...
$ a440: num 0.0697 0.0649 0.0706 0.0706 0.0649 ...
$ a488: num 0.0663 0.0633 0.0653 0.0673 0.0653 ...
$ a510: num 0.466 0.459 0.44 0.462 0.445 ...
$ a532: num 0.453 0.444 0.45 0.454 0.444 ...
$ a555: num 0.428 0.424 0.436 0.426 0.428 ...
$ a650: num 0.0839 0.0839 0.0839 0.0891 0.0865 ...
$ a676: num 0.0963 0.0954 0.0963 0.1 0.0991 ...
$ a715: num 0.0899 0.0912 0.0893 0.0887 0.0887 ...
$ c412: num 0.343 0.337 0.342 0.346 0.344 ...
$ c440: num 0.341 0.343 0.344 0.353 0.348 ...
$ c488: num 0.33 0.335 0.337 0.345 0.34 ...
$ c510: num 0.081 0.0802 0.0794 0.0794 0.081 ...
$ c532: num 0.0594 0.0606 0.0582 0.057 0.0594 ...
$ c555: num 0.067 0.0633 0.0615 0.0633 0.0689 ...
$ c650: num 0.562 0.56 0.565 0.571 0.556 ...
$ c676: num 0.549 0.552 0.551 0.55 0.537 ...
$ c715: num 0.487 0.481 0.481 0.489 0.473 ...
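One possible approach, sketched on a small made-up data frame. It assumes that "above 4 stdev" means more than 4 standard deviations away from the column mean in either direction; the object names dat, cols and clean are illustrative.

```r
# Hypothetical stand-in for the real data: one time column, two measurements
dat <- data.frame(msec = seq(0, by = 170, length.out = 200),
                  a412 = cos(1:200),
                  a440 = sin(1:200))
dat$a412[5] <- 50  # plant one obvious outlier

cols <- 2:3  # would be 2:18 for the data in the question

# scale() centres each column and divides by its sd, giving z-scores
z <- abs(scale(dat[cols]))

# Keep the rows in which no checked column exceeds 4 standard deviations
keep <- rowSums(z > 4, na.rm = TRUE) == 0
clean <- dat[keep, , drop = FALSE]

nrow(dat) - nrow(clean)
# [1] 1
```

To drop only values above the mean, z could be computed without abs. Extreme outliers inflate the standard deviation themselves, so the filter may need to be applied more than once.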