Weird behavior of double-nested lists in R - r

This just took me two hours of debugging to identify:
> list1 = list() # empty list
> list1['first'] = list(a=list(a1='goat', a2='horse'), b=42) # double-nested
> print(list1$first$b)
NULL # Should be 42?
> print(list1) # let's check the actual contents of list1
$first
$first$a1 # how did the contents of the innermost a-list end up here?
[1] "goat"
$first$a2
[1] "horse"
In this case, the list assigned to 'first' becomes the inner list a, so b just disappears without warning. What is happening here, and where did the b value go?
I'm using R 3.0.2. How can I do something like this when R prevents me from doing the above?

As joran pointed out in a comment, the solution is to use double-brackets in the assignment:
list1[['first']] = list(a=list(a1='goat', a2='horse'), b=42)
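With single brackets, the assignment replaces a sub-list of length one, so only the first element of the right-hand side (the inner list a) is stored and b is dropped. Apparently newer R versions warn about this, but older ones do not.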
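The difference between the two assignments can be checked directly. This is a minimal sketch using the list from the question; the names value, with_single and with_double are only for illustration, and suppressWarnings() is there because newer R versions warn on the single-bracket form.

```r
value <- list(a = list(a1 = "goat", a2 = "horse"), b = 42)

# Single brackets replace a sub-list of length 1, so only the first
# element of `value` (the inner list `a`) is stored under "first".
with_single <- list()
suppressWarnings(with_single["first"] <- value)

# Double brackets store the whole right-hand side as one element.
with_double <- list()
with_double[["first"]] <- value

print(with_single$first$b)  # NULL
print(with_double$first$b)  # 42
```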

Related

Assign value to indices of nested lists stored as strings in R

I have a dataframe of nested list indices, which have been stored as strings. To give a simplified example:
df1 <- data.frame(x = c("lst$x$y$a", "lst$x$y$b"), stringsAsFactors = F)
These are then coordinates for the following list:
lst <- list(x=list(y=list(a="foo",b="bar",c="")))
I'd like to replace values or assign new values to these elements using the indices in df1.
One attempt was
do.call(`<-`, list(eval(parse(text = df1[1,1])), "somethingelse"))
but this doesn't seem to work. Instead it assigns "somethingelse" to a new variable named foo.
I'm not too happy with using eval(parse(text=)) (maintaining code will become a nightmare), but recognise I may have little choice.
Any tips welcome.
Let's consider 3 situations:
Case 1
do.call(`<-`, list("lst$x$y$a", "somethingelse"))
This will create a new variable named lst$x$y$a in your workspace, so the following two commands will call different objects. (The former is the object you store in lst, and the latter is the new variable. You need to call it with backticks because its name will confuse R.)
> lst$x$y$a # [1] "foo"
> `lst$x$y$a` # [1] "somethingelse"
Case 2
do.call(`<-`, list(parse(text = "lst$x$y$a"), "somethingelse"))
This is closer to what you want, but it fails with an error:
invalid (do_set) left-hand side to assignment
Let's check:
> parse(text = "lst$x$y$a") # expression(lst$x$y$a)
It belongs to the class expression, and the operator <- does not accept an object of this class on its left-hand side.
Case 3
This one will achieve what you want:
do.call(`<-`, list(parse(text = "lst$x$y$a")[[1]], "somethingelse"))
If you put [[1]] after an expression object, the call object inside it is extracted, and the operator <- accepts that call as its left-hand side.
> lst
# $x
# $x$y
# $x$y$a
# [1] "somethingelse"
#
# $x$y$b
# [1] "bar"
#
# $x$y$c
# [1] ""
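If the stored strings only ever use $ to separate names, eval(parse(text=)) can be avoided altogether by splitting each string into its keys and using recursive [[<- indexing. This is only a sketch under that assumption; set_by_path is a hypothetical helper, not something from the question, and it would not handle strings that contain [[ ]] or names with a $ in them.

```r
lst <- list(x = list(y = list(a = "foo", b = "bar", c = "")))
df1 <- data.frame(x = c("lst$x$y$a", "lst$x$y$b"), stringsAsFactors = FALSE)

# Hypothetical helper: split "lst$x$y$a" into c("x", "y", "a") and assign
# through the nested list with a single recursive [[<- call.
set_by_path <- function(target, path, value) {
  keys <- strsplit(path, "$", fixed = TRUE)[[1]][-1]  # drop the leading "lst"
  target[[keys]] <- value
  target
}

lst <- set_by_path(lst, df1$x[1], "somethingelse")
print(lst$x$y$a)
```

Because the helper returns the modified list instead of assigning by name, the result has to be stored back into lst explicitly, which keeps the change visible at the call site.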

How to remove all R variables except the required one? [duplicate]

I have a workspace with lots of objects and I would like to remove all but one. Ideally I would like to avoid having to type rm(obj.1, obj.2... obj.n). Is it possible to say "remove all objects except these"?
Here is a simple construct that will do it, by using setdiff:
rm(list=setdiff(ls(), "x"))
And a full example. Run this at your own risk - it will remove all variables except x:
x <- 1
y <- 2
z <- 3
ls()
[1] "x" "y" "z"
rm(list=setdiff(ls(), "x"))
ls()
[1] "x"
Using the keep function from the gdata package is quite convenient.
> ls()
[1] "a" "b" "c"
library(gdata)
> keep(a) #shows you which variables will be removed
[1] "b" "c"
> keep(a, sure = TRUE) # setting sure to TRUE removes variables b and c
> ls()
[1] "a"
I think another option is to open the workspace in RStudio and switch the view from list to grid at the top right of the Environment pane. Then tick the objects you want to clear and finally click on clear.
I just spent several hours hunting for the answer to a similar but slightly different question - I needed to be able to delete all objects in R (including functions) except a handful of vectors.
One way to do this:
rm(list=ls()[! ls() %in% c("a","c")])
Where the vectors that I want to keep are named 'a' and 'c'.
Hope this helps anyone searching for the same solution!
To keep all objects whose names match a pattern, you could use grepl, like so:
to.remove <- ls()
to.remove <- c(to.remove[!grepl("^obj", to.remove)], "to.remove")
rm(list=to.remove)
Replace v with the name of the object you want to keep
rm(list=(ls()[ls()!="v"]))
hat-tip: http://r.789695.n4.nabble.com/Removing-objects-and-clearing-memory-tp3445763p3445865.html
To keep a list of objects, one can use:
rm(list=setdiff(ls(), c("df1", "df2")))
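The same setdiff idea can be tried out safely in a throwaway environment before running it on a real workspace. A minimal sketch, where keep_only and scratch are names made up for illustration:

```r
# Hypothetical helper: remove everything in `envir` except the names in `keep`.
keep_only <- function(keep, envir = globalenv()) {
  doomed <- setdiff(ls(envir = envir), keep)
  rm(list = doomed, envir = envir)
  invisible(doomed)  # return the removed names so they can be inspected
}

# Try it in a scratch environment so the real workspace is untouched.
scratch <- new.env()
assign("df1", data.frame(x = 1), envir = scratch)
assign("df2", data.frame(x = 2), envir = scratch)
assign("tmp", 42, envir = scratch)

removed <- keep_only(c("df1", "df2"), envir = scratch)
print(ls(scratch))
print(removed)
```

Note that ls() skips names starting with a dot by default, so such objects survive unless all.names = TRUE is passed.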
This takes advantage of ls()'s pattern option, in the case you have a lot of objects with the same pattern that you don't want to keep:
> foo1 <- "junk"; foo2 <- "rubbish"; foo3 <- "trash"; x <- "gold"
> ls()
[1] "foo1" "foo2" "foo3" "x"
> # Let's check first what we want to remove
> ls(pattern = "foo")
[1] "foo1" "foo2" "foo3"
> rm(list = ls(pattern = "foo"))
> ls()
[1] "x"
require(gdata)
keep(object_1,...,object_n,sure=TRUE)
ls()
From within a function, rm all objects in .GlobalEnv except the function
initialize <- function(country.name) {
if (length(setdiff(ls(pos = .GlobalEnv), "initialize")) > 0) {
rm(list=setdiff(ls(pos = .GlobalEnv), "initialize"), pos = .GlobalEnv)
}
}
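Assuming you want to remove every object except df from the environment:
rm(list = setdiff(ls(), "df"))
Note that a regular expression such as ls(pattern="[^df]") is not a safe substitute: it matches any name containing a character other than d or f, so names like d, f or ff would also be kept.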
Looking at it the other way round: what if you want to remove a group of objects? Try this:
rm(list=ls()[grep("xxx",ls())])
I personally don't like having too many tables and variables on my screen, yet I can't avoid using them. So I give temporary objects names starting with "xxx", which lets me remove them once they are no longer needed.
# remove all objects but selected
rm(list = ls()[which("key_function" != ls())])
How about this?
# Removes all objects except the specified & the function itself.
rme <- function(except=NULL){
# ifelse() would keep only the first name; a plain if keeps a whole character vector
if (!is.character(except)) except <- deparse(substitute(except))
rm(list=setdiff(ls(envir=.GlobalEnv), c(except,"rme")), envir=.GlobalEnv)
}
The following will remove all objects from your workspace, without exception:
rm(list = ls())

Alternative to assign function in r

I am using the following code in a loop, I am just replicating the part which I am facing the problem in. The entire code is extremely long and I have removed parts which are running fine in between these lines. This is just to explain the problem:
for (j in 1:2)
{
assign(paste("numeric_data",j,sep="_"),unique_id)
for (i in 1:2)
{
assign(paste("numeric_data",j,sep="_"),
merge(eval(as.symbol(paste("numeric_data",j,sep="_"))),
eval(as.symbol(paste("sd_1",i,sep="_"))),all.x = TRUE))
}
}
The problem that I am facing is that instead of assign in the second step, I want to use eval + paste:
for (j in 1:2)
{
assign(paste("numeric_data",j,sep="_"),unique_id)
for (i in 1:2)
{
eval(as.symbol((paste("numeric_data",j,sep="_"))))<-
merge(eval(as.symbol(paste("numeric_data",j,sep="_"))),
eval(as.symbol(paste("sd_1",i,sep="_"))),all.x = TRUE)
}
}
However R does not accept eval while assigning new variables. I looked at the forum and everywhere assign is suggested to solve the problem. However, if I use assign the loop overwrites my previously generated "numeric_data" instead of adding to it, hence I get output for only one value of i instead of both.
Here is a very basic intro to one of the most fundamental data structures in R. I highly recommend reading more about them in standard documentation sources.
#A list is a (possibly named) set of objects
numeric_data <- list(A1 = 1, A2 = 2)
#I can refer to elements by name or by position, e.g. numeric_data[[1]]
> numeric_data[["A1"]]
[1] 1
#I can add elements to a list with a particular name
> numeric_data <- list()
> numeric_data[["A1"]] <- 1
> numeric_data[["A2"]] <- 2
> numeric_data
$A1
[1] 1
$A2
[1] 2
#I can refer to named elements by building the name with paste()
> numeric_data[[paste0("A",1)]]
[1] 1
#I can change all the names at once...
> numeric_data <- setNames(numeric_data,paste0("B",1:2))
> numeric_data
$B1
[1] 1
$B2
[1] 2
#...in multiple ways
> names(numeric_data) <- paste0("C",1:2)
> numeric_data
$C1
[1] 1
$C2
[1] 2
Basically, the lesson is that if you have objects with names with numeric suffixes: object_1, object_2, etc. they should almost always be elements in a single list with names that you can easily construct and refer to.
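Applied to the loop in the question, that advice could look roughly like the sketch below. The contents of unique_id and the sd_1_* tables are invented here, since the question does not show them; the only assumption is that they share a key column (called id in this sketch) that merge() can join on.

```r
# Made-up stand-ins for the question's data.
unique_id <- data.frame(id = 1:3)
sd_list <- list(
  sd_1_1 = data.frame(id = c(1L, 2L), v1 = c(10, 20)),
  sd_1_2 = data.frame(id = c(2L, 3L), v2 = c(200, 300))
)

numeric_data <- list()
for (j in 1:2) {
  merged <- unique_id
  # Each pass merges onto the running result, so nothing is overwritten.
  for (tbl in sd_list) {
    merged <- merge(merged, tbl, all.x = TRUE)
  }
  numeric_data[[paste("numeric_data", j, sep = "_")]] <- merged
}

print(names(numeric_data))
print(numeric_data[["numeric_data_1"]])
```

Keeping the running result in an ordinary variable (merged) and storing it in the list once per j removes the need for assign() and eval() entirely.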

R - Return an object name from a for loop

Using a basic function such as this:
myname<-function(z){
nm <-deparse(substitute(z))
print(nm)
}
I'd like the name of the item to be printed (or returned) when iterating through a list e.g.
for (csv in list(acsv, bcsv, ccsv)){
myname(csv)
}
should print:
acsv
bcsv
ccsv
(and not csv).
It should be noted that acsv, bcsv, and ccsv are all dataframes read in from csvs i.e.
acsv = read.csv("a.csv")
bcsv = read.csv("b.csv")
ccsv = read.csv("c.csv")
Edit:
I ended up using a bit of a compromise. The primary goal of this was not to simply print the frame name - that was the question, because it is a prerequisite for doing other things.
I needed to run the same functions on four identically formatted files. I then used this syntax:
for(i in 1:length(csvs)){
cat(names(csvs[i]), "\n")
print(nrow(csvs[[i]]))
print(nrow(csvs[[i]][1]))
}
Then the indexing of nested lists was utilized e.g.
print(nrow(csvs[[i]]))
which shows the row count for each of the dataframes.
print(nrow(csvs[[i]][1]))
This also returns the row count, because csvs[[i]][1] is a one-column dataframe; print(csvs[[i]][1]) would show the first column itself.
I include this because it was the motivator for the question. I needed to be able to label the data for each dataframe being examined.
The list you have constructed doesn't "remember" the expressions it was constructed of anymore. But you can use a custom constructor:
named.list <- function(...) {
l <- list(...)
exprs <- lapply(substitute(list(...))[-1], deparse)
names(l) <- exprs
l
}
And so:
> named.list(1+2,sin(5),sqrt(3))
$`1 + 2`
[1] 3
$`sin(5)`
[1] -0.9589243
$`sqrt(3)`
[1] 1.732051
Pass this list to names, as Thomas suggested:
> names(named.list(1+2,sin(5),sqrt(3)))
[1] "1 + 2" "sin(5)" "sqrt(3)"
To understand what's happening here, let's analyze the following:
> as.list(substitute(list(1+2,sqrt(5))))
[[1]]
list
[[2]]
1 + 2
[[3]]
sqrt(5)
The [-1] indexing leaves out the first element, and all remaining elements are passed to deparse, which works because of...
> lapply(as.list(substitute(list(1+2,sqrt(5))))[-1], class)
[[1]]
[1] "call"
[[2]]
[1] "call"
Note that you cannot "refactor" the call list(...) inside substitute() to use simply l. Do you see why?
I am also wondering if such a function is already available in one of the countless R packages around. I have found this post by William Dunlap effectively suggesting the same approach.
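For the data frames in the question, a constructor of this kind gives a list whose names can be used as labels inside the loop. This is a sketch with made-up data frames standing in for the CSV files; it uses vapply instead of lapply so that the names come back as a plain character vector, and it assumes each argument deparses to a single string (true for simple variable names).

```r
named.list <- function(...) {
  l <- list(...)
  # Capture the unevaluated arguments and turn each into a string.
  exprs <- as.list(substitute(list(...)))[-1]
  names(l) <- vapply(exprs, deparse, character(1))
  l
}

# Made-up stand-ins for read.csv("a.csv") etc.
acsv <- data.frame(x = 1:2)
bcsv <- data.frame(x = 1:3)
ccsv <- data.frame(x = 1:4)

csvs <- named.list(acsv, bcsv, ccsv)
for (nm in names(csvs)) {
  cat(nm, nrow(csvs[[nm]]), "\n")
}
```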
I don't know what your data look like, so here's something made up:
csvs <- list(acsv=data.frame(x=1), bcsv=data.frame(x=2), ccsv=data.frame(x=3))
for(i in 1:length(csvs))
cat(names(csvs[i]), "\n")

R automatic names modification

I stumbled upon this weird behavior in R:
> a = 5
> names(a) <- "bar"
> b = c(foo = a)
> names(b)
[1] "foo.bar"
Why do the names get concatenated/stacked?
I found this c(a=b) syntax in a script, but I couldn't find documentation about it. Is there any documentation for that?
Why do the names get concatenated/stacked?
Because it preserves all the name information that was present before the concatenation. If you don't like it, use unname.
I found this c(a=b) syntax in a script, but I couldn't find documentation about it. Is there any documentation for that?
Some of the examples on the ?c page demonstrate c(name = value) behaviour, but there isn't much more to it than that. You might also want to look at ?names.
It might also be instructive to see what happens if a is a vector; in this case if foo=a just redefined the name, all elements of the vector would end up with the same name. Instead, as in the following example, the four elements end up with unique names, which can be nice.
> a <- c(A=1, B=2)
> b <- c(A=3, B=4)
> c(a=a, b=b)
a.A a.B b.A b.B
1 2 3 4
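A short sketch of the naming behaviour and the unname workaround mentioned above; the variable names stacked, plain and numbered are only for illustration:

```r
a <- 5
names(a) <- "bar"

# The outer tag and the existing name are combined.
stacked <- c(foo = a)
print(names(stacked))   # "foo.bar"

# Dropping the inner name first leaves only the outer tag.
plain <- c(foo = unname(a))
print(names(plain))     # "foo"

# An unnamed vector of length > 1 gets numbered names instead.
numbered <- c(a = c(1, 2))
print(names(numbered))  # "a1" "a2"
```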

Resources