Removing multiple named list components in within() - r

I am trying to remove a named component from a list, using within and rm. This works for a single component, but not for two or more. I am completely befuddled.
For example - this works
aa = list(a = 1:3, b = 2:5, cc = 1:5)
within(aa, {rm(a)})
the output from within will have just the non-removed components.
However, this does not:
aa = list(a = 1:3, b = 2:5, cc = 1:5)
within(aa, {rm(a); rm(b)})
Neither does this:
within(aa, {rm(a, b)})
The output from within still contains all the components, with the ones I am trying to remove set to NULL. Why?

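First, note the following behavior: assigning a single NULL to several components deletes them, while assigning a list of NULLs keeps them and sets each one to NULL.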
> aa = list(a = 1:3, b = 2:5, cc = 1:5)
>
> aa[c('a', 'b')] <- NULL
>
> aa
# $cc
# [1] 1 2 3 4 5
> aa = list(a = 1:3, b = 2:5, cc = 1:5)
>
> aa[c('a', 'b')] <- list(NULL, NULL)
>
> aa
# $a
# NULL
#
# $b
# NULL
#
# $cc
# [1] 1 2 3 4 5
Now let's look at the code for within.list (as printed by the R version used for this answer; other versions may differ):
within.list <- function (data, expr, ...)
{
parent <- parent.frame()
e <- evalq(environment(), data, parent)
eval(substitute(expr), e)
l <- as.list(e)
l <- l[!sapply(l, is.null)]
nD <- length(del <- setdiff(names(data), (nl <- names(l))))
data[nl] <- l
if (nD)
data[del] <- if (nD == 1) NULL else vector("list", nD)
data
}
Look in particular at the second-to-last line of the function. If more than one component was deleted, the function effectively calls aa[c('a', 'b')] <- list(NULL, NULL), because vector("list", 2) creates a two-item list in which each item is NULL, and that assignment keeps the components and sets them to NULL. We can write our own version of within that drops the else branch on that line, so deleted components are always assigned a single NULL:
mywithin <- function (data, expr, ...)
{
parent <- parent.frame()
e <- evalq(environment(), data, parent)
eval(substitute(expr), e)
l <- as.list(e)
l <- l[!sapply(l, is.null)]
nD <- length(del <- setdiff(names(data), (nl <- names(l))))
data[nl] <- l
if (nD) data[del] <- NULL
data
}
Now let's test it:
> aa = list(a = 1:3, b = 2:5, cc = 1:5)
>
> mywithin(aa, rm(a, b))
# $cc
# [1] 1 2 3 4 5
Now it works as expected!
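If patching within is not desirable, the deletion behavior shown at the top of this answer can also be used directly. The following is only a minimal sketch; drop_components is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper (not part of base R): drop named components from a list.
drop_components <- function(x, drop) {
  # Assigning a single NULL through `[<-` deletes the selected components.
  # intersect() guards against names that are not present in x.
  x[intersect(names(x), drop)] <- NULL
  x
}

aa <- list(a = 1:3, b = 2:5, cc = 1:5)
res <- drop_components(aa, c("a", "b"))
names(res)
# [1] "cc"
```

This sidesteps within entirely, at the cost of naming the components as strings.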

Related

Referring to Elements in an Array in a For Loop in R - beginner

Edit: Someone said the question is unclear, edited.
I have made a 3 dimensional array, and assigned values as follows:
D <- c('g', 't', NA, 'd')
nPeriods = 4
column.names = c('aaa', 'bbb')
row.names = c('jjj', 'hhh')
threeD.names = c(1:nPeriods)
E = array(c(D), dim=c(2, 2, nPeriods),
dimnames = list(row.names, column.names, threeD.names))
values <- c(g = 5,
t = 2,
d = 7)
G <- apply(E, 1:3, function(x) values[x])
Now I want to make a for loop, to do things like:
for (i in 2:nPeriods){
G[1,1,i]=G[1,1,i-1]*G[2,1,i-1]+G[2,2,i]
}
But I don't want to have to find the location of g, t and d each time I want to write something like this. I just want to be able to use g, t, and d if possible.
Question ends here.
Below is some helpful code that could possibly be adapted to find a solution?
I have this code which looks up and returns an index for each value:
result <- G
for (i in 2:dim(G)[3]) {
idx <- which(E[, , 1] == 'g', arr.ind = TRUE)
row <- idx[1, 'row']
col <- idx[1, 'col']
result[row, col, i] <- result[row, col, i-1] * 2
}
That works for a simpler problem, but my real array is quite large, so writing this out for each element would take a long time. Is there a way of automating this?
They also suggested this - which is great for simple sums, but I'm not sure how it could apply to the type of sum I have above:
funcs <- c(g = '*', t = '+', d = '-')
modifiers <- c(g = 2, t = 3, d = 4)
G <- apply(E, 1:3, function(x) values[x])
result <- G
for (i in 2:dim(G)[3]) {
for (j in names(values)) {
idx <- which(E[, , 1] == j, arr.ind = TRUE)
row <- idx[1, 'row']
col <- idx[1, 'col']
result[row, col, i] <- do.call(funcs[j], args = list(result[row, col, i-1], modifiers[j]))
}
}
Based on the clarification, maybe this works. Get the row/column indices of 'g', 't' and 'd' from E[, , 1], then loop over the periods starting from 2. In each iteration, update result by subsetting with a matrix index built with cbind from gidx, tidx or didx plus i or i-1, so the values are updated recursively:
result <- G
gidx <- which(E[, , 1] == 'g', arr.ind = TRUE)
tidx <- which(E[, , 1] == 't', arr.ind = TRUE)
didx <- which(E[, , 1] == 'd', arr.ind = TRUE)
for (i in 2:nPeriods) {
result[cbind(gidx, i)] <- result[cbind(gidx, i-1)] *
result[cbind(tidx, i-1)] + result[cbind(didx, i)]
}
-output
> result
, , 1
aaa bbb
jjj 5 NA
hhh 2 7
, , 2
aaa bbb
jjj 17 NA
hhh 2 7
, , 3
aaa bbb
jjj 41 NA
hhh 2 7
, , 4
aaa bbb
jjj 89 NA
hhh 2 7
-checking with OP's output
resultold <- G
for (i in 2:nPeriods){
resultold[1, 1, i] <- resultold[1,1,i-1]* resultold[2,1,i-1]+resultold[2,2,i]
}
identical(result, resultold)
[1] TRUE
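For a larger array with many labels, the per-label index variables could be collected in a named list rather than written out one by one. This is only a sketch under the same setup as the question; the names idx and at are hypothetical and were not part of the original answer.

```r
# Setup copied from the question.
D <- c('g', 't', NA, 'd')
nPeriods <- 4
E <- array(D, dim = c(2, 2, nPeriods),
           dimnames = list(c('jjj', 'hhh'), c('aaa', 'bbb'), 1:nPeriods))
values <- c(g = 5, t = 2, d = 7)
G <- apply(E, 1:3, function(x) values[x])

# One row/column index matrix per label, stored in a list named by label.
idx <- lapply(setNames(nm = names(values)),
              function(k) which(E[, , 1] == k, arr.ind = TRUE))

# Hypothetical accessor: the value stored for `label` in period `i`.
at <- function(arr, label, i) arr[cbind(idx[[label]], i)]

result <- G
for (i in 2:nPeriods) {
  result[cbind(idx[["g"]], i)] <-
    at(result, "g", i - 1) * at(result, "t", i - 1) + at(result, "d", i)
}
result[1, 1, ]
```

The recursion then reads in terms of "g", "t" and "d" rather than array positions, which is what the question asked for.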

Merge two lists into one component by component

I have two lists, plus an empty list:
A <- list(1:4,5:8,9:12)
B <- c("a","b")
C <- vector(mode = "list")
I would like to merge A and B into C as follows:
C[[1]][1] = A[[1]] C[[1]][2] = B
C[[2]][1] = A[[2]] C[[2]][2] = B
C[[3]][1] = A[[3]] C[[3]][2] = B
Thank you.
How about C <- lapply(A, function(x) list(x, B))?
for example:
A <- list(1:4,5:8,9:12)
B <- c("a","b")
C <- lapply(A, function(x) list(x, B))
# C <- lapply(A, list, B) # also works
all(
C[[1]][[1]] == A[[1]],
C[[2]][[1]] == A[[2]],
C[[3]][[1]] == A[[3]],
C[[1]][[2]] == B,
C[[2]][[2]] == B,
C[[3]][[2]] == B
)
Note that you'll need double brackets [[ since each element of C is also a list (C[[1]][[1]] rather than C[[1]][1]).
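The difference between the two kinds of bracket can be checked directly. A small sketch, reusing A, B and C from above; one and two are just illustrative variable names:

```r
A <- list(1:4, 5:8, 9:12)
B <- c("a", "b")
C <- lapply(A, function(x) list(x, B))

one <- C[[1]][1]    # single bracket: a list of length 1 wrapping the vector
two <- C[[1]][[1]]  # double bracket: the integer vector itself

class(one)
# [1] "list"
class(two)
# [1] "integer"
```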

For loop: paste index into string

This may strike you as odd, but this is exactly what I want to achieve: for each element of a list, I want to paste its index into a string that refers to a subset of that list.
For illustration:
l1 <- list(a = 1, b = 2)
l2 <- list(a = 3, b = 4)
l <- list(l1,l2)
X_l <- vector("list", length = length(l))
for (i in 1:length(l)) {
X_l[[i]] = "l[[ #insert index number as character# ]]$l_1*a"
}
In the end, I want something like this:
X_l_wanted <- list("l[[1]]$l_1*a","l[[2]]$l_1*a")
You can use sprintf/paste0 directly :
sprintf('l[[%d]]$l_1*a', seq_along(l))
#[1] "l[[1]]$l_1*a" "l[[2]]$l_1*a"
If you want final output as list :
as.list(sprintf('l[[%d]]$l_1*a', seq_along(l)))
#[[1]]
#[1] "l[[1]]$l_1*a"
#[[2]]
#[1] "l[[2]]$l_1*a"
Using paste0 :
paste0('l[[', seq_along(l), ']]$l_1*a')
Try paste0() inside your loop. That is the way to concatenate strings. Here is the solution with slight changes to your code:
#Data
l1 <- list(a = 1, b = 2)
l2 <- list(a = 3, b = 4)
l <- list(l1,l2)
#List
X_l <- vector("list", length = length(l))
#Loop
for (i in 1:length(l)) {
#Code
X_l[[i]] = paste0('l[[',i,']]$l_1*a')
}
Output:
X_l
[[1]]
[1] "l[[1]]$l_1*a"
[[2]]
[1] "l[[2]]$l_1*a"
Or you could do it with lapply()
library(glue)
X_l <- lapply(1:length(l), function(i) glue("l[[{i}]]$l_1*a"))
X_l
# [[1]]
# l[[1]]$l_1*a
# [[2]]
# l[[2]]$l_1*a
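One detail worth noting about the answers above: some iterate with 1:length(l) and others with seq_along(l). The two agree for a non-empty list but differ when the list is empty, as this short sketch illustrates (the empty list here is made up for the illustration):

```r
l <- list()  # an empty list, for illustration only

1:length(l)
# [1] 1 0
seq_along(l)
# integer(0)

# The vectorised sprintf call degrades gracefully to an empty result.
out <- sprintf('l[[%d]]$l_1*a', seq_along(l))
length(out)
# [1] 0
```

A for loop over 1:length(l) would therefore run twice on an empty list, while a loop over seq_along(l) would not run at all.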

Keep all names from list to data.frame

When converting a list into a data.frame, R names the variables automatically by concatenating the names of all the sublists. However, it appears to keep only the last name when a sublist has length 1. Is there a way to enforce a full path name for the variable name?
MWE:
> l <- list(a = list(b = 1), c = 2)
> l
$a
$a$b
[1] 1
$c
[1] 2
> data.frame(l)
b c
1 1 2
> ll <- list(a = list(b = 1, bb = 1), c = 2)
> data.frame(ll)
a.b a.bb c
1 1 1 2
Here I would like to have a.b as the name of the variable of data.frame(l) like it does for data.frame(ll).
A possible solution is to create a function that converts the list into a data frame with as.data.frame() and then sets the names to the desired values in a second step:
list_df <- function(list) {
df <- as.data.frame(list)
names(df) <- list_names(list)
return (df)
}
Obviously, defining list_names() is the hard part. One possibility is to recurse through the nested lists:
list_names <- function(list) {
recursor <- function(list, names) {
if (is.list(list)) {
new_names <- paste(names, names(list), sep = ".")
out <- unlist(mapply(list, new_names, FUN = recursor))
} else {
out <- names
}
return(out)
}
new_names <- unlist(mapply(list, names(list), FUN = recursor))
return(new_names)
}
This works for your two examples:
l <- list(a = list(b = 1), c = 2)
ll <- list(a = list(b = 1, bb = 1), c = 2)
list_df(l)
## a.b c
## 1 1 2
list_df(ll)
## a.b a.bb c
## 1 1 1 2
It also works for a list that is not nested, as well as for a list with deeper nesting:
ls <- list(a = 1, b = 3)
lc <- list(a = list(b = 1, bb = 1), c = 2, d = list(e = list(f = 1, ff = 2), ee = list(fff = 5)))
list_df(ls)
## a b
## 1 1 3
list_df(lc)
## a.b a.bb c d.e.f d.e.ff d.ee.fff
## 1 1 1 2 1 2 5
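For the special case where every leaf of the nested list is a single value of the same type, a shorter route may be to let unlist() build the path names. This is only a sketch under that assumption: longer leaves would get numeric suffixes appended to their names, and mixed types would be coerced to a common type.

```r
l <- list(a = list(b = 1), c = 2)

# unlist() flattens the nesting and joins the names along each path with ".".
flat <- unlist(l)

# Turn the named vector back into a one-row data frame.
df <- as.data.frame(as.list(flat))
names(df)
# [1] "a.b" "c"
```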

R - Looping through datasets and change column names

I'm trying to loop through a bunch of datasets and change their column names in R.
I have a bunch of datasets, say a,b,c,etc, and all of them have three columns, say X, Y, Z.
I would like to change their names to be a_X, a_Y, a_Z for dataset a, and b_X, b_Y, b_Z for dataset b, and so on.
Here's my code:
name.list = c("a","b","c")
for(i in name.list){
names(i) = c(paste(i,"_X",sep = ""),paste(i,"_Y",sep = ""),paste(i,"_Z",sep = ""));
}
However, the code above doesn't work, since i is a character string rather than the dataset itself.
I've considered the assign function, but it doesn't seem to fit either.
I would appreciate any ideas.
Something like this, passing each dataset together with its name:
list2env(Map(function(dat, nn){
colnames(dat) <- paste(nn,colnames(dat),sep='_')
dat
}, mget(name.list), name.list),.GlobalEnv)
for ( i in name.list) {
assign(i, setNames( get(i), paste(i, names(get(i)), sep="_")))
}
> a
a_X a_Y a_Z
1 1 3 A
2 2 4 B
> b
b_X b_Y b_Z
1 1 3 A
2 2 4 B
> c
c_X c_Y c_Z
1 1 3 A
2 2 4 B
Here's some free data:
a <- data.frame(X = 1, Y = 2, Z = 3)
b <- data.frame(X = 4, Y = 5, Z = 6)
c <- data.frame(X = 7, Y = 8, Z = 9)
And here's a method that uses mget and a custom function foo
name.list <- c("a", "b", "c")
foo <- function(x, i) setNames(x, paste(name.list[i], names(x), sep = "_"))
list2env(Map(foo, mget(name.list), seq_along(name.list)), .GlobalEnv)
a
# a_X a_Y a_Z
# 1 1 2 3
b
# b_X b_Y b_Z
# 1 4 5 6
c
# c_X c_Y c_Z
# 1 7 8 9
You could also avoid get or mget by putting a, b, and c into their own environment (or even a list). You also wouldn't need the name.list vector if you go this route, because it's the same as ls(e)
e <- new.env()
e$a <- a; e$b <- b; e$c <- c
bar <- function(x, y) setNames(x, paste(y, names(x), sep = "_"))
list2env(Map(bar, as.list(e), ls(e)), .GlobalEnv)
Another perk of doing it this way is that you still have the untouched data frames in the environment e. Nothing was overwritten (check a versus e$a).
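The list variant mentioned above could look roughly like the following. It is only a sketch; dfs, add_prefix and renamed are hypothetical names, and the data are made up for the illustration.

```r
# Keep the data frames in a named list instead of separate global variables.
dfs <- list(
  a = data.frame(X = 1, Y = 2, Z = 3),
  b = data.frame(X = 4, Y = 5, Z = 6)
)

# Prefix every column name of df with the list name nm.
add_prefix <- function(df, nm) setNames(df, paste(nm, names(df), sep = "_"))

# Map() pairs each data frame with its own name; dfs itself is left untouched.
renamed <- Map(add_prefix, dfs, names(dfs))
names(renamed$b)
# [1] "b_X" "b_Y" "b_Z"
```

With this layout there is no need for get, mget, assign or list2env at all.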
