Sorting Elements of a List - r

I have a list
str(overlaps)
List of 7
$ a5: chr [1:6] "calc_a1c" "predmdx_flag" "bmi" "systolic" ...
$ a2: chr [1:2] "age" "yr"
$ a4: chr(0)
$ a6: chr(0)
$ a1: chr [1:2] "trig_3cat" "glipizide_flag"
$ a3: chr [1:2] "email_flag" "statins_flag"
$ a7: chr [1:4] "trig_3cat.>=200" "antihtn_flag" "black_flag" "gender.M"
I want to reorder the list so that its elements are in numerical order by name, i.e., a1, a2, a3, etc.

Some random data:
set.seed(42)
overlaps <- replicate(5, runif(3), simplify=FALSE)
names(overlaps) <- paste0("a", sample(5))
str(overlaps)
# List of 5
# $ a5: num [1:3] 0.915 0.937 0.286
# $ a4: num [1:3] 0.83 0.642 0.519
# $ a1: num [1:3] 0.737 0.135 0.657
# $ a3: num [1:3] 0.705 0.458 0.719
# $ a2: num [1:3] 0.935 0.255 0.462
The sort:
str(overlaps[ sort(names(overlaps)) ])
# List of 5
# $ a1: num [1:3] 0.737 0.135 0.657
# $ a2: num [1:3] 0.935 0.255 0.462
# $ a3: num [1:3] 0.705 0.458 0.719
# $ a4: num [1:3] 0.83 0.642 0.519
# $ a5: num [1:3] 0.915 0.937 0.286

This is an alternative to r2evans' answer, which is excellent. I used his example dataset.
str(overlaps[order(names(overlaps))])
# List of 5
# $ a1: num [1:3] 0.737 0.135 0.657
# $ a2: num [1:3] 0.935 0.255 0.462
# $ a3: num [1:3] 0.705 0.458 0.719
# $ a4: num [1:3] 0.83 0.642 0.519
# $ a5: num [1:3] 0.915 0.937 0.286
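One caveat on both approaches above: sort() and order() compare names as strings, so once the list has ten or more elements, "a10" lands before "a2". A minimal sketch of ordering by the numeric suffix instead, assuming every name is a prefix followed by digits (the three-element list here is hypothetical, not the asker's data):

```r
# Hypothetical list whose names run past a9
overlaps <- list(a10 = "x", a2 = "y", a1 = "z")

# Drop every non-digit character and order by the remaining number
num_suffix <- as.integer(gsub("\\D", "", names(overlaps)))
sorted <- overlaps[order(num_suffix)]

names(sorted)
# [1] "a1"  "a2"  "a10"
```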

Related

Take 20+ subsets of data?

I have a dataset and would like to take a lot of subsets based on various columns, values, and conditional operators. I think the most desirable output is a list containing all of these subsetted data frames as separate elements in the list. I attempted to do this by building a data frame that contains the subset conditions I would like to use, building a function, then using apply to feed that data frame to the function, but that didn't work. I'm sure there's probably a better method that uses an anonymous function or something like that, but I'm not sure how I would implement that. Below is an example code that should produce 8 subsets of data.
Original dataset, where x1 and x2 are scores on items that won't be used for subsetting, and RT and LS are the variables to subset on:
df <- data.frame(x1 = rnorm(100),
                 x2 = rnorm(100),
                 RT = abs(rnorm(100)),
                 LS = sample(1:10, 100, replace = TRUE))
Data frame containing the conditions for subsetting. E.g., the first subset should be any observations with values greater than or equal to 0.5 in the RT column, the second subset should be any observations with values greater than or equal to 1 in the RT column, etc. There should be 8 subsets: 4 on the RT variable and 4 on the LS variable.
subsetConditions <- data.frame(column = rep(c("RT", "LS"), each = 4),
operator = rep(c(">=", "<="), each = 4),
value = c(0.5, 1, 1.5, 2,
9, 8, 7, 6))
And this is the ugly function I wrote to attempt to do this:
subsetFun <- function(x){
subset(df, eval(parse(text = paste(x))))
}
subsets <- apply(subsetConditions, 1, subsetFun)
Thanks for any help!
Consider Map (wrapper to mapply) without any eval + parse. Since ==, <=, >=, and other operators can be used as functions with two arguments where 4 <= 5 can be written as `<=`(4,5) or "<="(4, 5), simply pass arguments elementwise and use get to reference the function by string:
sub_data <- function(col, op, val) {
df[get(op)(df[[col]], val),]
}
sub_dfs <- with(subsetConditions, Map(sub_data, column, operator, value))
Output
str(sub_dfs)
List of 8
$ RT:'data.frame': 62 obs. of 4 variables:
..$ x1: num [1:62] -1.12 -0.745 -1.377 0.848 1.63 ...
..$ x2: num [1:62] -0.257 -2.385 0.805 -0.313 0.662 ...
..$ RT: num [1:62] 0.693 1.662 0.731 2.145 0.543 ...
..$ LS: int [1:62] 5 5 1 2 9 1 5 9 3 10 ...
$ RT:'data.frame': 36 obs. of 4 variables:
..$ x1: num [1:36] -0.745 0.848 0.908 -0.761 0.74 ...
..$ x2: num [1:36] -2.3849 -0.3131 -2.4645 -0.0784 0.8512 ...
..$ RT: num [1:36] 1.66 2.15 1.74 1.65 1.13 ...
..$ LS: int [1:36] 5 2 1 5 9 10 2 7 1 3 ...
$ RT:'data.frame': 14 obs. of 4 variables:
..$ x1: num [1:14] -0.745 0.848 0.908 -0.761 -1.063 ...
..$ x2: num [1:14] -2.3849 -0.3131 -2.4645 -0.0784 -2.9886 ...
..$ RT: num [1:14] 1.66 2.15 1.74 1.65 2.63 ...
..$ LS: int [1:14] 5 2 1 5 5 6 9 4 8 4 ...
$ RT:'data.frame': 3 obs. of 4 variables:
..$ x1: num [1:3] 0.848 -1.063 0.197
..$ x2: num [1:3] -0.313 -2.989 0.709
..$ RT: num [1:3] 2.15 2.63 2.05
..$ LS: int [1:3] 2 5 6
$ LS:'data.frame': 92 obs. of 4 variables:
..$ x1: num [1:92] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:92] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:92] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:92] 5 5 1 2 1 9 1 5 9 3 ...
$ LS:'data.frame': 78 obs. of 4 variables:
..$ x1: num [1:78] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:78] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:78] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:78] 5 5 1 2 1 1 5 3 5 2 ...
$ LS:'data.frame': 75 obs. of 4 variables:
..$ x1: num [1:75] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:75] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:75] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:75] 5 5 1 2 1 1 5 3 5 2 ...
$ LS:'data.frame': 62 obs. of 4 variables:
..$ x1: num [1:62] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:62] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:62] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:62] 5 5 1 2 1 1 5 3 5 2 ...
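In the output above, the list names repeat (RT four times, LS four times), which makes individual subsets hard to pick out by name. A hedged variant of the same idea, using match.fun instead of get so that only functions are looked up, and labelling each subset with its condition. The seed and the label format are my own choices, and stringsAsFactors = FALSE is set explicitly because on R versions before 4.0 the column names would otherwise become factors:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(x1 = rnorm(100),
                 x2 = rnorm(100),
                 RT = abs(rnorm(100)),
                 LS = sample(1:10, 100, replace = TRUE))

subsetConditions <- data.frame(column   = rep(c("RT", "LS"), each = 4),
                               operator = rep(c(">=", "<="), each = 4),
                               value    = c(0.5, 1, 1.5, 2, 9, 8, 7, 6),
                               stringsAsFactors = FALSE)

sub_data <- function(col, op, val) {
  # match.fun() turns the operator string into the comparison function
  keep <- match.fun(op)(df[[col]], val)
  df[keep, , drop = FALSE]
}

sub_dfs <- with(subsetConditions, Map(sub_data, column, operator, value))

# Replace the duplicated names with one label per condition
names(sub_dfs) <- with(subsetConditions, paste(column, operator, value))
names(sub_dfs)[1]
# [1] "RT >= 0.5"
```

A single subset can then be retrieved with, for example, sub_dfs[["LS <= 9"]].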
You were actually pretty close with your function, but just needed to make an adjustment. So, with paste for each row, you need to collapse all 3 columns so that it is only 1 string rather than 3, then it can properly evaluate the expression.
subsetFun <- function(x){
subset(df, eval(parse(text = paste(x, collapse = ""))))
}
subsets <- apply(subsetConditions, 1, subsetFun)
Output
Then, it will return the 8 subsets.
str(subsets)
List of 8
$ :'data.frame': 67 obs. of 4 variables:
..$ x1: num [1:67] -1.208 0.606 -0.17 0.728 -0.424 ...
..$ x2: num [1:67] 0.4058 -0.3041 -0.3357 0.7904 -0.0264 ...
..$ RT: num [1:67] 1.972 0.883 0.598 0.633 1.517 ...
..$ LS: int [1:67] 8 9 2 10 8 5 3 4 7 2 ...
$ :'data.frame': 35 obs. of 4 variables:
..$ x1: num [1:35] -1.2083 -0.4241 -0.0906 0.9851 -0.8236 ...
..$ x2: num [1:35] 0.4058 -0.0264 1.0054 0.0653 1.4647 ...
..$ RT: num [1:35] 1.97 1.52 1.05 1.63 1.47 ...
..$ LS: int [1:35] 8 8 5 4 7 3 1 6 8 6 ...
$ :'data.frame': 16 obs. of 4 variables:
..$ x1: num [1:16] -1.208 -0.424 0.985 0.99 0.939 ...
..$ x2: num [1:16] 0.4058 -0.0264 0.0653 0.3486 -0.7562 ...
..$ RT: num [1:16] 1.97 1.52 1.63 1.85 1.8 ...
..$ LS: int [1:16] 8 8 4 6 10 2 6 6 3 9 ...
$ :'data.frame': 7 obs. of 4 variables:
..$ x1: num [1:7] 0.963 0.423 -0.444 0.279 0.417 ...
..$ x2: num [1:7] 0.6612 0.0354 0.0555 0.1253 -0.3056 ...
..$ RT: num [1:7] 2.71 2.15 2.05 2.01 2.07 ...
..$ LS: int [1:7] 2 6 9 9 7 7 4
$ :'data.frame': 91 obs. of 4 variables:
..$ x1: num [1:91] -0.952 -1.208 0.606 -0.17 -0.048 ...
..$ x2: num [1:91] -0.645 0.406 -0.304 -0.336 -0.897 ...
..$ RT: num [1:91] 0.471 1.972 0.883 0.598 0.224 ...
..$ LS: int [1:91] 6 8 9 2 1 8 4 5 3 4 ...
$ :'data.frame': 75 obs. of 4 variables:
..$ x1: num [1:75] -0.952 -1.208 -0.17 -0.048 -0.424 ...
..$ x2: num [1:75] -0.6448 0.4058 -0.3357 -0.8968 -0.0264 ...
..$ RT: num [1:75] 0.471 1.972 0.598 0.224 1.517 ...
..$ LS: int [1:75] 6 8 2 1 8 4 5 3 4 1 ...
$ :'data.frame': 65 obs. of 4 variables:
..$ x1: num [1:65] -0.9517 -0.1698 -0.048 0.2834 -0.0906 ...
..$ x2: num [1:65] -0.645 -0.336 -0.897 -2.072 1.005 ...
..$ RT: num [1:65] 0.471 0.598 0.224 0.486 1.053 ...
..$ LS: int [1:65] 6 2 1 4 5 3 4 1 7 4 ...
$ :'data.frame': 58 obs. of 4 variables:
..$ x1: num [1:58] -0.9517 -0.1698 -0.048 0.2834 -0.0906 ...
..$ x2: num [1:58] -0.645 -0.336 -0.897 -2.072 1.005 ...
..$ RT: num [1:58] 0.471 0.598 0.224 0.486 1.053 ...
..$ LS: int [1:58] 6 2 1 4 5 3 4 1 4 2 ...
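To see why collapse matters, it may help to look at what paste does to a single row in isolation. This is a small sketch with a hand-written row (apply passes each row as a character vector much like this one); df_small is a hypothetical two-row data frame used only for the demonstration:

```r
# One row of the conditions table, as apply() would hand it to the function
x <- c(column = "RT", operator = ">=", value = "0.5")

pieces <- paste(x)                 # still three separate strings
cond   <- paste(x, collapse = "")  # one string holding the full condition

length(pieces)
# [1] 3
cond
# [1] "RT>=0.5"

# The collapsed string parses to a single expression that can be
# evaluated against a data frame's columns
df_small <- data.frame(RT = c(0.2, 0.7))
keep <- eval(parse(text = cond), df_small)
keep
# [1] FALSE  TRUE
```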

group_modify does not behave as expected when the function is psych::alpha()

I'm trying to run psych::alpha on a grouped dataset. group_map works, but as expected the resulting list doesn't state the groups; it only indexes them ([[1]] etc.), which is not useful to me, so it is not a viable alternative.
The examples on the reference website imply that group_map and group_modify take the same arguments, but passing the same call through group_modify gives me this error:
Number of categories should be increased in order to count frequencies.
Error: The result of .f should be a data frame.
Backtrace:
1. `%>%`(...)
3. dplyr:::group_modify.grouped_df(., ~psych::alpha(.x, check.keys = TRUE))
5. dplyr:::group_map.data.frame(.data, fun, .keep = .keep)
6. dplyr:::map2(chunks, group_keys, .f, ...)
7. base::mapply(.f, .x, .y, MoreArgs = list(...), SIMPLIFY = FALSE)
This happens both with my dataset, where I run:
df%>% select(groupVar, vars1:var4)%>% group_by(groupVar)%>%
group_modify(~ psych::alpha(.x, check.keys = TRUE))
and with the example code adapted from the tidyverse website, with head() replaced by psych::alpha:
iris %>% group_by(Species) %>%
group_modify(~ psych::alpha(.x, check.keys = TRUE))
The issue is that group_modify expects a data.frame as output. According to ?group_modify
group_modify() is good for "data frame in, data frame out".
We could use group_map as the output of alpha is a list
group_map() returns a list of results from calling .f on each group.
library(dplyr)
out <- iris %>%
group_by(Species) %>%
group_map(~ psych::alpha(.x, check.keys = TRUE))
Check the structure of one element of the list returned by group_map:
str(out[[1]])
List of 14
$ total :'data.frame': 1 obs. of 9 variables:
..$ raw_alpha: num 0.663
..$ std.alpha: num 0.672
..$ G6(smc) : num 0.68
..$ average_r: num 0.338
..$ S/N : num 2.05
..$ ase : num 0.0535
..$ mean : num 2.54
..$ sd : num 0.196
..$ median_r : num 0.273
$ alpha.drop :'data.frame': 4 obs. of 8 variables:
..$ raw_alpha: num [1:4] 0.34 0.425 0.69 0.691
..$ std.alpha: num [1:4] 0.496 0.553 0.683 0.663
..$ G6(smc) : num [1:4] 0.404 0.454 0.672 0.664
..$ average_r: num [1:4] 0.247 0.292 0.418 0.396
..$ S/N : num [1:4] 0.986 1.239 2.153 1.965
..$ alpha se : num [1:4] 0.1243 0.1086 0.0426 0.0562
..$ var.r : num [1:4] 0.00608 0.00119 0.07961 0.09217
..$ med.r : num [1:4] 0.233 0.278 0.278 0.267
$ item.stats :'data.frame': 4 obs. of 7 variables:
..$ n : num [1:4] 50 50 50 50
..$ raw.r : num [1:4] 0.905 0.888 0.472 0.445
..$ std.r : num [1:4] 0.806 0.758 0.626 0.649
..$ r.cor : num [1:4] 0.796 0.729 0.393 0.424
..$ r.drop: num [1:4] 0.73 0.66 0.273 0.328
..$ mean : num [1:4] 5.006 3.428 1.462 0.246
..$ sd : num [1:4] 0.352 0.379 0.174 0.105
$ response.freq: NULL
$ keys : Named num [1:4] 1 1 1 1
..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
$ scores : num [1:50] 2.55 2.38 2.35 2.35 2.55 ...
$ nvar : int 4
$ boot.ci : NULL
$ boot : NULL
$ Unidim :List of 1
..$ Unidim: num 0.625
$ var.r : num 0.0418
$ Fit :List of 1
..$ Fit.off: num 0.926
$ call : language psych::alpha(x = .x, check.keys = TRUE)
$ title : NULL
- attr(*, "class")= chr [1:2] "psych" "alpha"
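If the main objection to a list result is that its elements are not labelled by group, one option is to split the data in base R, since split() names each piece after its group level. This is only a sketch of that idea, swapping group_map for split() plus lapply(), and using colMeans() as a stand-in for psych::alpha() so the example has no package dependency:

```r
# Split the four numeric columns of iris by Species; each piece is
# named after its factor level
pieces <- split(iris[, 1:4], iris$Species)

# colMeans() is a placeholder; psych::alpha(.., check.keys = TRUE)
# would be called here in the same way
out <- lapply(pieces, colMeans)

names(out)
# [1] "setosa"     "versicolor" "virginica"
```

Results for one group are then available by name, e.g. out[["setosa"]].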
Update
There is a summary method for alpha which can return a data.frame
out <- iris %>%
group_by(Species) %>%
group_modify(~ psych::alpha(.x, check.keys = TRUE) %>%
summary)
Output
out
# A tibble: 3 x 10
# Groups: Species [3]
Species raw_alpha std.alpha `G6(smc)` average_r `S/N` ase mean sd median_r
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 setosa 0.663 0.672 0.680 0.338 2.05 0.0535 2.54 0.196 0.273
2 versicolor 0.833 0.877 0.877 0.640 7.10 0.0308 3.57 0.323 0.612
3 virginica 0.774 0.785 0.818 0.477 3.65 0.0418 4.28 0.364 0.429

Combine lists with keeping same structure and names [duplicate]

I have several lists (ListA, ListB, ListC...) with the same internal structure as the example below. I would like to combine all of them, keeping their structure, and have one list with all lists (ListAll). How can I do this?
Example:
I have:
ListA
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.128
..$ sd : num 1.11
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0116 -0.0156 0.0336 -0.0502 -0.0427 ...
..$ sd : num [1:1000] 1.003 1.014 0.963 1.036 1.051 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 3.45 2.91 2.62 2.06 1.87 ...
..$ D: num [1:35] 5.42 2.89 3.34 1.68 1.43 ...
and several lists with the same structure.
I would like to get:
ListAll
$ ListA
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.128
..$ sd : num 1.11
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0116 -0.0156 0.0336 -0.0502 -0.0427 ...
..$ sd : num [1:1000] 1.003 1.014 0.963 1.036 1.051 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 3.45 2.91 2.62 2.06 1.87 ...
..$ D: num [1:35] 5.42 2.89 3.34 1.68 1.43 ...
$ ListB
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.132
..$ sd : num 1.01
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0114 -0.0123 0.0378 -0.0102 -0.0340 ...
..$ sd : num [1:1000] 1.013 1.011 0.876 1.012 1.023 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 4.41 1.61 1.42 1.96 2.07 ...
..$ D: num [1:35] 2.41 2.19 2.54 2.08 2.53 ...
and names(ListAll) would be:
ListA, ListB, ListC...
You can create a list of lists in base R.
ListAll <- list(ListA, ListB, ListC)
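Note that list(ListA, ListB, ListC) on its own produces an unnamed list, so names(ListAll) would be NULL. Two ways to get the names the question asks for are sketched below; the tiny ListA and ListB here are hypothetical stand-ins for the real objects, and envir = environment() is passed to mget() only to make the lookup location explicit:

```r
# Hypothetical stand-ins with the same shape as the question's lists
ListA <- list(data = data.frame(mean = -0.128, sd = 1.11))
ListB <- list(data = data.frame(mean = -0.132, sd = 1.01))

# Option 1: name each element explicitly
ListAll <- list(ListA = ListA, ListB = ListB)

# Option 2: look the objects up by name; mget() returns a list
# named after the objects it finds
ListAll2 <- mget(c("ListA", "ListB"), envir = environment())

names(ListAll)
# [1] "ListA" "ListB"
```

With many lists, the second form scales better because the vector of names can be built programmatically, e.g. with paste0("List", LETTERS[1:3]).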

For loops to Create datasets in R

I would like to create several datasets via a for loop.
Basically, I want to create 29 datasets, where the 1st contains the 44th and 45th columns of DF, the 2nd contains the 46th and 47th columns, and so on.
I tried the following, with no results.
data. <- data.frame(matrix( nrow=1442, ncol=2))
for (i in 1:29){
assign(paste("data",i, sep="_"), data.)
data_[i][,1] <- DF[,c(43+i)]
data_[i][,2] <- DF[,c(44+i)]
}
Can you help me please?
Like this?
data <- list()
DF <- data.frame(matrix(runif(10000),ncol=100))
for (i in 1:29){
data[[i]] <- data.frame(DF[,c(43:44+i)])
}
str(data, list.len = 3)
One solution using purrr
DF <- data.frame(matrix(runif(10000),ncol=100))
library(purrr)
res <- 0:28 %>%
# create the indices to subset
map( ~ c(44, 45) + .x) %>%
# subset the df for each indice group
map( ~ DF[, .x])
length(res)
#> [1] 29
str(head(res))
#> List of 6
#> $ :'data.frame': 100 obs. of 2 variables:
#> ..$ X44: num [1:100] 0.477 0.0593 0.2616 0.7349 0.1202 ...
#> ..$ X45: num [1:100] 0.43 0.105 0.557 0.341 0.111 ...
#> $ :'data.frame': 100 obs. of 2 variables:
#> ..$ X45: num [1:100] 0.43 0.105 0.557 0.341 0.111 ...
#> ..$ X46: num [1:100] 0.78 0.877 0.518 0.162 0.565 ...
#> $ :'data.frame': 100 obs. of 2 variables:
#> ..$ X46: num [1:100] 0.78 0.877 0.518 0.162 0.565 ...
#> ..$ X47: num [1:100] 0.931 0.985 0.59 0.656 0.713 ...
#> $ :'data.frame': 100 obs. of 2 variables:
#> ..$ X47: num [1:100] 0.931 0.985 0.59 0.656 0.713 ...
#> ..$ X48: num [1:100] 0.82 0.899 0.359 0.809 0.329 ...
#> $ :'data.frame': 100 obs. of 2 variables:
#> ..$ X48: num [1:100] 0.82 0.899 0.359 0.809 0.329 ...
#> ..$ X49: num [1:100] 0.7982 0.0966 0.2716 0.3364 0.7295 ...
#> $ :'data.frame': 100 obs. of 2 variables:
#> ..$ X49: num [1:100] 0.7982 0.0966 0.2716 0.3364 0.7295 ...
#> ..$ X50: num [1:100] 0.83057 0.64207 0.94392 0.00904 0.26966 ...
Created on 2018-11-04 by the reprex package (v0.2.1)
Give this a try.
n = 1000
k = 120
DF = matrix(runif(n*k), n, k)
for (i in 1:29){
tmp = DF[,c(43, 43) + c(2*i-1, 2*i)]
assign(paste0("data_", i), tmp)
}
ls()
all(data_1 == DF[,c(44, 45)])
all(data_2 == DF[,c(46, 47)])
Doing data_[i] will make R look for the object called data_, so you can't just subscript the object name like that.
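One detail worth checking against the question: the two list-based answers above advance one column per step (44-45, 45-46, ...), whereas the question asks for non-overlapping pairs (44-45, 46-47, ...). A minimal base R sketch that keeps the results in a named list and steps two columns at a time. DF is random demo data with 120 columns, an assumption made so that the last pair (columns 100 and 101) exists:

```r
set.seed(1)  # arbitrary seed for reproducibility
DF <- data.frame(matrix(runif(1200), ncol = 120))

# First column of each pair: 44, 46, ..., 100
starts <- seq(44, by = 2, length.out = 29)

# One two-column data frame per start position
data <- lapply(starts, function(s) DF[, c(s, s + 1)])
names(data) <- paste0("data_", seq_along(data))

names(data[["data_2"]])
# [1] "X46" "X47"
```

Keeping the 29 data frames in one list avoids assign() and makes it easy to loop over them later with lapply().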

Merge 2 list of lists in R [duplicate]

I have 2 lists of lists in R with the same element names, as follows:
str(total_delta_final[[1]])
List of 4
$ sector1_T02 :'data.frame': 24 obs. of 3 variables:
..$ DeltaF_1: num [1:24] 0.737 0.737 0.693 0.738 0.738 ...
..$ DeltaF_2: num [1:24] 0.24 0.24 0.279 0.239 0.239 ...
..$ DeltaF_3: num [1:24] 0.0233 0.0233 0.0275 0.0232 0.0232 ...
$ sector2_T03 :'data.frame': 24 obs. of 3 variables:
..$ DeltaF_1: num [1:24] 0.582 0.582 0.568 0.69 0.69 ...
..$ DeltaF_2: num [1:24] 0.377 0.377 0.39 0.282 0.282 ...
..$ DeltaF_3: num [1:24] 0.0406 0.0406 0.0426 0.0278 0.0278 ...
$ sector3_T03 :'data.frame': 24 obs. of 3 variables:
..$ DeltaF_1: num [1:24] 0.607 0.607 0.495 0.409 0.375 ...
..$ DeltaF_2: num [1:24] 0.356 0.356 0.451 0.519 0.544 ...
..$ DeltaF_3: num [1:24] 0.0373 0.0373 0.0541 0.072 0.0809 ...
$ sector12_T02:'data.frame': 24 obs. of 3 variables:
..$ DeltaF_1: num [1:24] 0.743 0.743 0.758 0.689 0.705 ...
..$ DeltaF_2: num [1:24] 0.234 0.234 0.22 0.283 0.269 ...
..$ DeltaF_3: num [1:24] 0.0226 0.0226 0.0213 0.028 0.0263 ...
str(total_TI_final[[1]])
List of 4
$ sector1_T02 :'data.frame': 24 obs. of 3 variables:
..$ I_1: num [1:24] NA 0.0756 0.083 0.0799 0.0799 ...
..$ I_2: num [1:24] 0.122 NA 0.163 0.172 0.172 ...
..$ I_3: num [1:24] 0.212 0.211 NA 0.266 0.273 ...
$ sector2_T03 :'data.frame': 24 obs. of 3 variables:
..$ I_1: num [1:24] NA 0.0986 0.1013 0.1011 0.101 ...
..$ I_2: num [1:24] 0.15 NA 0.184 0.211 0.211 ...
..$ I_3: num [1:24] 0.249 0.249 NA 0.331 0.337 ...
$ sector3_T03 :'data.frame': 24 obs. of 3 variables:
..$ I_1: num [1:24] NA 0.119 0.115 0.113 0.105 ...
..$ I_2: num [1:24] 0.193 NA 0.2 0.193 0.177 ...
..$ I_3: num [1:24] 0.323 0.323 NA 0.277 0.256 ...
$ sector12_T02:'data.frame': 24 obs. of 3 variables:
..$ I_1: num [1:24] NA 0.0825 0.0681 0.0723 0.0706 ...
..$ I_2: num [1:24] 0.138 NA 0.146 0.145 0.144 ...
..$ I_3: num [1:24] 0.24 0.24 NA 0.22 0.226 ...
How can I merge these 2 lists of lists so that the final output has total_TI_final[[1]][1] followed by total_delta_final[[1]][1], then total_TI_final[[1]][2] followed by total_delta_final[[1]][2], and so on?
We can use Map
Map(c, total_delta_final, total_TI_final)
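Map(c, ...) concatenates the two inner lists, so each outer element ends up with all the delta sectors followed by all the TI sectors rather than pairing them sector by sector. If pairing is what is wanted, a nested Map is one possible approach. The sketch below uses small hypothetical lists (delta, ti) in place of the real data and binds the columns of matching sectors, TI first; whether column-binding is the right way to combine a pair is an assumption on my part:

```r
# Hypothetical miniature versions of the two lists of lists
delta <- list(list(s1 = data.frame(DeltaF_1 = 1:2),
                   s2 = data.frame(DeltaF_1 = 3:4)))
ti    <- list(list(s1 = data.frame(I_1 = 5:6),
                   s2 = data.frame(I_1 = 7:8)))

# Plain concatenation: sector names appear twice per outer element
flat <- Map(c, delta, ti)
names(flat[[1]])
# [1] "s1" "s2" "s1" "s2"

# Nested Map: the outer Map walks the outer lists in parallel, the
# inner Map pairs sectors by position and binds their columns
paired <- Map(function(d_i, ti_i) Map(cbind, ti_i, d_i), delta, ti)
names(paired[[1]]$s1)
# [1] "I_1"      "DeltaF_1"
```

The inner Map pairs elements by position, so this relies on both inner lists listing the sectors in the same order.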
