Assigning and removing objects in a loop: eval(parse(paste( - r

I am looking to assign objects in a loop. I've read that some form of eval(parse( is what I need to perform this, but I'm running into errors listing invalid text or no such file or directory. Below is sample code of generally what I'm attempting to do:
x <- array(seq(1,18,by=1),dim=c(3,2,3))
for (i in 1:length(x[1,1,])) {
eval(parse(paste(letters[i],"<-mean(x[,,",i,"])",sep="")
}
And when I'm finished using these objects, I would like to remove them (the actual objects are very large and cause memory problems later on...)
for (i in 1:length(x[1,1,])) eval(parse(paste("rm(",letters[i],")",sep="")))
Both eval(parse(paste( portions of this script return errors for invalid text or no such file or directory. Am I missing something in using eval(parse(? Is there a easier/better way to assign objects in a loop?

That's a pretty disgusting and frustrating way to go about it. Use assign to assign and rm's list argument to remove objects.
> for (i in 1:length(x[1,1,])) {
+ assign(letters[i],mean(x[,,i]))
+ }
> ls()
[1] "a" "b" "c" "i" "x"
> a
[1] 3.5
> b
[1] 9.5
> c
[1] 15.5
> for (i in 1:length(x[1,1,])) {
+ rm(list=letters[i])
+ }
> ls()
[1] "i" "x"
>
Whenever you feel the need to use parse, remember fortune(106):
If the answer is parse() you should
usually rethink the question.
-- Thomas Lumley, R-help (February 2005)

Although it seems there are better ways to handle this, if you really did want to use the "eval(parse(paste(" approach, what you're missing is the text flag.
parse assumes that its first argument is a path to a file which it will then parse. In your case, you don't want it to go reading a file to parse, you want to directly pass it some text to parse. So, your code, rewritten (in what has been called disgusting form above) would be
letters=c('a','b','c')
x <- array(seq(1,18,by=1),dim=c(3,2,3))
for (i in 1:length(x[1,1,])) {
eval(parse(text=paste(letters[i],"<-mean(x[,,",i,"])",sep="")))
}
In addition to not specifying "text=" you're missing a few parentheses on the right side to close your parse and eval statements.
It sounds like your problem has been solved, but for people who reach this page who really do want to use eval(parse(paste, I wanted to clarify.

Very bad idea; you should never use eval or parse in R, unless you perfectly know what you are doing.
Variables can be created using:
name<-"x"
assign(name,3) #Eqiv to x<-3
And removed by:
name<-"x"
rm(list=name)
But in your case, it can be done with simple named vector:
apply(x,3,mean)->v;names(v)<-letters[1:length(v)]
v
v["b"]
#Some operations on v
rm(v)

It is best to avoid using either eval(paste( or assign in this case. Doing either creates many global variables that just cause additional headaches later on.
The best approach is to use existing data structures to store your objects, lists are the most general for these types of cases.
Then you can use the [ls]apply functions to do things with the different elements, usually much quicker than looping through global variables. If you want to save all the objects created, you have just one list to save/load. When it comes time to delete them, you just delete 1 single object and everything is gone (no looping). You can name the elements of the list to refer to them by name later on, or by index.

Related

Referencing recently used objects in R

My question refers to redundant code and a problem that I've been having with a lot of my R-Code.
Consider the following:
list_names<-c("putnam","einstein","newton","kant","hume","locke","leibniz")
combined_df_putnam$fu_time<-combined_df_putnam$age*365.25
combined_df_einstein$fu_time<-combined_einstein$age*365.25
combined_df_newton$fu_time<-combined_newton$age*365.25
...
combined_leibniz$fu_time<-combined_leibniz$age*365.25
I am trying to slim-down my code to do something like this:
list_names<-c("putnam","einstein","newton","kant","hume","locke","leibniz")
paste0("combined_df_",list_names[0:7]) <- data.frame("age"=1)
paste0("combined_df_",list_names[0:7]) <- paste0("combined_df_",list_names[0:7])$age*365.25
When I try to do that, I get "target of assignment expands to non-language object".
Basically, I want to create a list that contains descriptors, use that list to create a list of dataframes/lists and use these shortcuts again to do calculations. Right now, I am copy-pasting these assignments and this has led to various mistakes because I failed to replace the "name" from the previous line in some cases.
Any ideas for a solution to my problem would be greatly appreciated!
The central problem is that you are trying to assign a value (or data.frame) to the result of a function.
In paste0("combined_df_",list_names[0:7]) <- data.frame("age"=1), the left-hand-side returns a character vector:
> paste0("combined_df_",list_names[0:7])
[1] "combined_df_putnam" "combined_df_einstein" "combined_df_newton"
[4] "combined_df_kant" "combined_df_hume" "combined_df_locke"
[7] "combined_df_leibniz"
R will not just interpret these strings as variables that should be created and be referenced to. For that, you should look at the function assign.
Similarily, in the code paste0("combined_df_",list_names[0:7])$age*365.25, the paste0 function does not refer to variables, but simply returns a character vector -- for which the $ operator is not accepted.
There are many ways to solve your problem, but I will recommend that you create a function that performs the necessary operations of each data frame. The function should then return the data frame. You can then re-use the function for all 7 philosophers/scientists.

Trouble interpolating column name for subset in R function

I have (fbodata) 'data.frame': 6181090 obs. of 41 variables:
I want to subset it and save the portion that pertains to a specific subset (like a zip). My approach seems to work when it is not in a function, but I ultimately want to use sapply.
nmakedir <- function(item, ccol) {
snipped a bunch of code that works
trim<- fbodata[ which(paste(ccol)==item),]
trim%>% drop_na(paste(ccol))
trim<- droplevels(trim
save(trim, file = paste(item, "rda", sep="."))
}
The line that doesn't work is one where I creating the subset with which. If i hardcode the line using fbodata$zip instead of paste(ccol) it works fine. Eventually, I plan to call it with something like:
sapply(unique(fbodata$zip),zip, FUN = nmakedir)
I appreciate any clues, I have been on this for a good long while.
A few things going on:
ccol is a string. paste(ccol) is the same string. You never need to call paste with only one argument. (You can use paste to coerce non-strings to strings, but in that case you should use as.character() to be clear.)
Keeping in mind that ccol is a string, what is fbodata$zip? It's a column! What is the equivalent using ccol and brackets? fbodata[[ccol]] or fbodata[, ccol]. You can use either of those interchangeably with fbodata$zip. So, this bad line
fbodata[ which(paste(ccol)==item),]
# should be this:
fbodata[which(fbodata[[ccol]] == item), ]
drop_na, like most dplyr functions, expects (quoting from the help) "bare variable names", not strings. Also from the help, "See Also: drop_na_ for a version that uses regular evaluation and is suitable for programming with". In this case, I don't think you need to do anything more than replace drop_na with drop_na_.
You are missing a right parenthesis on your droplevels command.
There might be more, but this is much as I can see without any sample data. Your sapply call looks funny to me because I thought zip is supposed to be a column name, but when you call sapply(unique(fbodata$zip),zip, FUN = nmakedir) it needs to be an object in your global environment. I would think sapply(unique(fbodata$zip), 'zip', FUN = nmakedir) makes more sense, but without a reproducile example there's no way to know.
It also seems like you're coding your own version of split. I would probably start this off with fbo_split = split(fbodata, fbodata$zip) and then use lapply to drop_na_, droplevels, and save, but maybe your snipped code makes that a less good idea.

How to remove selected R variables without having to type their names

While testing a simulation in R using randomly generated input data, I have found and fixed a few bugs and would now like to re-run the simulation with the same data, but with all intermediate variables removed to ensure it's a clean test.
Is there a way to remove several dozen manually selected variables from the workspace without having to:
a) clobber the entire workspace, e.g. rm(list=ls()), or b) type each variable name, e.g. remove(name1, name2, ...)?
Ideal solution would be to use ls() to inspect the definitions and then pick out the indices of the ones I want to remove, e.g.
ls() # inspect definitions
delme <- c(3,5,7:9,11,13) # names selected for removal
remove(ls()[delme]) # DESIRED SOLUTION -- doesn't quite work this way
(In hindsight, I should have used a fixed seed to generate the random input data, which allow clearing everything and then re-running the test...)
There is a much simpler and more direct solution:
vars.to.remove <- ls()
vars.to.remove <- temp[c(1,2,14:15)]
rm(list = vars.to.remove)
Or, better yet, if you are good about variable naming schemes, you can use the following pattern matching strategy:
E.g. I name all temporary variables with the starting string "Temp."
... so, you can have Temp.Names, Temp.Values, Temp.Whatever
The following produces the list of variables that match this pattern
ls(pattern = "^Temp\\.")
So, you can remove all unneeded variables using ONE line of code, as follows:
rm(list = ls(pattern = "^Temp\\."))
Hope this helps.
Assad, while I think the actual answer to the question is in the comments, let me suggest this pattern as a broader solution:
rm(list=
Filter(
Negate(is.na), # filter entries corresponding to objects that don't meet function criteria
sapply(
ls(pattern="^a"), # only objects that start with "a"
function(x) if(is.matrix(get(x))) x else NA # return names of matrix objects
) ) )
In this case, I'm removing all matrix object that start with "a". By modifying the pattern argument and the function used by sapply here, you can get pretty fine control over what you delete, without having to specify many names.
If you are concerned that this could delete something you don't want to delete, you can store the result of the Filter(... operation in a variable, review the contents, and then execute the rm(list=...) command.
Try
eval(parse(text=paste("rm(",paste(ls()[delme],sep=","),")")))
I had a similar requirement. I pulled all the elements I needed to a list:
varsToPurge = as.list(ls())
I then reassign the few values I wish to keep with new variable names which will not be in the variable varsToPurge. After that I looped through the elements
for (j in 1:length(varsToPurge)){
rm(list = as.character(varsToPurge[j]))
}
Do a little garbage collecting, and you maintain a clean environment as you go through your code.
gc()
You can also use a vector of row numbers you wish to keep instead and run through the vector in the loop but it won't be as dynamic if you add rough work you wish to remove.

Using lists in R

Sorry for possibly a complete noob question but I have just started programming with R today and I am stuck already.
I am reading some data from a file which is in the format.
3.482373 8.0093238198371388 47.393873
0.32 20.3131 31.313
What I want to do is split each line then deal with each of the individual numbers.
I have imported the stringr package and using
x = str_split(line, " ")
This produces a list which I would like to index but don't know how.
I have learnt that x[[1:2]] gets the second element but that is about it. Ideally I would like something like
x1 = x[1]
x2 = x[2]
x3 = x[3]
But can't find anyway of doing this.
Thanks in advance
By using unlist you will get a vector instead of a list of vectors, and you will then be able to index it directly :
R> unlist(str_split("foo bar baz", " "))
[1] "foo" "bar" "baz"
But maybe you should read your file directly from read.table or one of its variant ?
And if you are beginning with R, you really should read one of the introduction available if you want to understand subsetting, indexing, etc.
you can wrap your call to str_split with unlist to get the behavior you're looking for.
The usual way to get this in would be to import it into a dataframe (a special sort of list). If file name is "fil.dat"" and is in "C:/dir/"
dfrm <- read.table("C:/dir/fil.dat") # resist the temptation to use backslashes
dfrm[2,2] # would give you the second item on the second row.
By default the field separator in R is "white-space" and that seems to be what you have, so you do not need to supply a sep= argument and the read.table function will attempt to import as numeric. To be on the safe side, you might consider forcing that option with colClasses=rep("numeric", 3) because if it encounters a strange item (such as often produced by Excel dumps), you will get a factor variable and will probably not understand how to recover gracefully.

R data table issue

I'm having trouble working with a data table in R. This is probably something really simple but I can't find the solution anywhere.
Here is what I have:
Let's say t is the data table
colNames <- names(t)
for (col in colNames) {
print (t$col)
}
When I do this, it prints NULL. However, if I do it manually, it works fine -- say a column name is "sample". If I type t$"sample" into the R prompt, it works fine. What am I doing wrong here?
You need t[[col]]; t$col does an odd form of evaluation.
edit: incorporating #joran's explanation:
t$col tries to find an element literally named 'col' in list t, not what you happen to have stored as a value in a variable named col.
$ is convenient for interactive use, because it is shorter and one can skip quotation marks (i.e. t$foo vs. t[["foo"]]. It also does partial matching, which is very convenient but can under unusual circumstances be dangerous or confusing: i.e. if a list contains an element foolicious, then t$foo will retrieve it. For this reason it is not generally recommended for programming.
[[ can take either a literal string ("foo") or a string stored in a variable (col), and does not do partial matching. It is generally recommended for programming (although there's no harm in using it interactively).

Resources