Best practice for naming archived objects? - r

I've got a function that has a list output. Every time I run it, I want to export the results with save. After a couple of runs I want to read the files in and compare the results. I do this, because I don't know how many tasks there will be, and maybe I'll use different computers to calculate each task. So how should I name the archived objects, so later I can read them all in?
My best guess would be to dynamically name the variables before saving, and keep track of the object names, but I've read everywhere that this is a big no-no.
So how should I approach this problem?

You might want to use the saveRDS and readRDS functions instead of save and load. The RDS version functions will save and read single objects without the attached name. You would create your object and save it to a file (using paste0 or sprintf to create unique names), then when processing the results you can read in one object at a time, or read several into a list to work with them.

You can use scope to hide the retrieved name inside a function, so first you might save a list to a file:
mybiglist <- list(fred=1, john='dum di dum', mary=3)
save(mybiglist, file='mybiglist1.RData')
Then you can load it back in through a function and give it whatever name you like be it inside another list or just a plain object:
# Use the fact that load returns the name of the object loaded
# and that scope will hide this object
myspecialload <- function(RD.fnam) {
return(eval(parse(text=load(RD.fnam))))
}
# now lets reload that file but put it in another object
mynewbiglist <- myspecialload('mybiglist1.RData')
mynewbiglist
$fred
[1] 1
$john
[1] "dum di dum"
$mary
[1] 3
Note that this is not really a generic 'use it anywhere' type function, as for an RData file with multiple objects it appears to return the last object saved... so best stick with one list object per file for now!

One time I was given several RData files, and they all had only one variable called x. In order to read all of them in my workspace, I loaded sequentially each the variable to its environment, and I used get() to read its value.
tenv <- new.env()
load("file_1.RData", envir = tenv)
ls(tenv) # x
myvar1 <- get(ls(tenv), tenv)
rm(tenv)
....
This code can be repeated for each file.

Related

How to Read Data from .rda with read.table [duplicate]

I am trying to load an .rda file in r which was a saved dataframe. I do not remember the name of it though.
I have tried
a<-load("al.rda")
which then does not let me do anything with a. I get the error
Error:object 'a' not found
I have also tried to use the = sign.
How do I load this .rda file so I can use it?
I restared R with load("al.rda) and I know get the following error
Error: C stack usage is too close to the limit
Use 'attach' and then 'ls' with a name argument. Something like:
attach("al.rda")
ls("file:al.rda")
The data file is now on your search path in position 2, most likely. Do:
search()
ls(pos=2)
for enlightenment. Typing the name of any object saved in al.rda will now get it, unless you have something in search path position 1, but R will probably warn you with some message about a thing masking another thing if there is.
However I now suspect you've saved nothing in your RData file. Two reasons:
You say you don't get an error message
load says there's nothing loaded
I can duplicate this situation. If you do save(file="foo.RData") then you'll get an empty RData file - what you probably meant to do was save.image(file="foo.RData") which saves all your objects.
How big is this .rda file of yours? If its under 100 bytes (my empty RData files are 42 bytes long) then I suspect that's what's happened.
I had to reinstall R...somehow it was corrupt. The simple command which I expected of
load("al.rda")
finally worked.
I had a similar issue, and it was solved without reinstall R. for example doing
load("al.rda) works fine, however if you do
a <- load("al.rda") will not work.
The load function does return the list of variables that it loaded. I suspect you actually get an error when you load "al.rda". What exactly does R output when you load?
Example of how it should work:
d <- data.frame(a=11:13, b=letters[1:3])
save(d, file='foo.rda')
a <- load('foo.rda')
a # prints "d"
Just to be sure, check that the load function you actually call is the original one:
find("load") # should print "package:base"
EDIT Since you now get an error when you load the file, it is probably corrupt in some way. Try this and say what it prints:
file.info("a1.rda") # Prints the file size etc...
readBin("a1.rda", "raw", 50) # reads first 50 bytes from the file
Without having access to the file, it's hard to investigate more... Maybe you could share the file somehow (http://www.filedropper.com or similar)?
I usually use save to save only a single object, and I then use the following utility method to retrieve that object into a given variable name using load, but into a temporary namespace to avoid overwriting existing objects. Maybe it will be helpful for others as well:
load_first_object <- function(fname){
e <- new.env(parent = parent.frame())
load(fname, e)
return(e[[ls(e)[1]]])
}
The method can of course be extended to also return named objects and lists of objects, but this simple version is for me the most useful.

Can convert a string to an object but can't save() it -- why? [duplicate]

I am repeatedly applying a function to read and process a bunch of csv files. Each time it runs, the function creates a data frame (this.csv.data) and uses save() to write it to a .RData file with a unique name. Problem is, later when I read these .RData files using load(), the loaded variable names are not unique, because each one loads with the name this.csv.data....
I'd like to save them with unique tags so that they come out properly named when I load() them. I've created the following code to illustrate .
this.csv.data = list(data=c(1:9), unique_tag = "some_unique_tag")
assign(this.csv.data$unique_tag,this.csv.data$data)
# I want to save the data,
# with variable name of <unique_tag>,
# at a file named <unique_tag>.dat
saved_file_name <- paste(this.csv.data$unique_tag,"RData",sep=".")
save(get(this.csv.data$unique_tag), saved_file_name)
but the last line returns:
"Error in save(get(this_unique_tag), file = data_tag) :
object ‘get(this_unique_tag)’ not found"
even though the following returns the data just fine:
get(this.csv.data$unique_tag)
Just name the arguments you use. With your code the following works fine:
save(list = this.csv.data$unique_tag, file=saved_file_name)
My preference is to avoid the name in the RData file on load:
obj = local(get(load('myfile.RData')))
This way you can load various RData files and name the objects whatever you want, or store them in a list etc.
You really should use saveRDS/readRDS to serialize your objects.
save and load are for saving whole environments.
saveRDS(this.csv.data, saved_file_name)
# later
mydata <- readRDS(saved_file_name)
you can use
save.image("myfile.RData")
This worked for me:
env <- new.env()
env[[varname]] <- object_to_save
save(list=c(varname), envir=env, file='out.Rda')
You could probably do it without a new env (but I didn't try this):
.GlobalEnv[[varname]] <- object_to_save
save(list=c(varname), envir=.GlobalEnv, file='out.Rda')
You might even be able to remove the envir variable.

change object name in loop inR

I have a for loop which reads in a different RData file in each iteration and that works well with just paste.
However, once the RData file is loaded there is an object loaded called
topy in the first instance of the loop, in the second it is ropy, then eopy and so on.
What I now tried is
vals<-c("topy","ropy","eopy")
paste("vals[i]")->r
to assign these different objects to r which is used further in the script and is overwirtten in every step of the loop. But that does not work. Topy and ropy and the rest ar matrixes.
When I load the RData file and then just type manually topy the matrix will be shown but if I do paste and then type r it will only show "topy". I also tried assign - did not work..any idea?
I'm not sure whether I got your point, is the following what you need?
p <-parse(text=vals[i])
r <- eval(p)

Store a variable directly from loading in r

I have an RData "E.g.RData"
I loaded it into R console using the load function.
load("E.g.RData")
it has a variable e.g. in RData.
I am doing like this -
e <- load("E.g.RData")
then e gets the character vector as "e.g."
but I want the contents of e.g. into e.
Is there a way to do it in R?
Yeah, the problem is that E.g maintains its name during the saving of the object. You could try assigning the new name "e" to the E.g. object and then remove the E.g. object:
E.g <- runif(100)
save(E.g, file="E.g.Rdata")
load("E.g.Rdata")
assign("e", E.g)
rm(E.g)
This can be done using:
y <- get(load("path/E.g.RData"))
y will contain the contents of the e.g. variable.
Rather than using the load function with its defaults, which overwrites anything of the same name in the global workspace, you may prefer to use attach to attach the workspace, then copy just the object(s) of interest with the names you want, then detach the workspace.

How to save() with a particular variable name

I am repeatedly applying a function to read and process a bunch of csv files. Each time it runs, the function creates a data frame (this.csv.data) and uses save() to write it to a .RData file with a unique name. Problem is, later when I read these .RData files using load(), the loaded variable names are not unique, because each one loads with the name this.csv.data....
I'd like to save them with unique tags so that they come out properly named when I load() them. I've created the following code to illustrate .
this.csv.data = list(data=c(1:9), unique_tag = "some_unique_tag")
assign(this.csv.data$unique_tag,this.csv.data$data)
# I want to save the data,
# with variable name of <unique_tag>,
# at a file named <unique_tag>.dat
saved_file_name <- paste(this.csv.data$unique_tag,"RData",sep=".")
save(get(this.csv.data$unique_tag), saved_file_name)
but the last line returns:
"Error in save(get(this_unique_tag), file = data_tag) :
object ‘get(this_unique_tag)’ not found"
even though the following returns the data just fine:
get(this.csv.data$unique_tag)
Just name the arguments you use. With your code the following works fine:
save(list = this.csv.data$unique_tag, file=saved_file_name)
My preference is to avoid the name in the RData file on load:
obj = local(get(load('myfile.RData')))
This way you can load various RData files and name the objects whatever you want, or store them in a list etc.
You really should use saveRDS/readRDS to serialize your objects.
save and load are for saving whole environments.
saveRDS(this.csv.data, saved_file_name)
# later
mydata <- readRDS(saved_file_name)
you can use
save.image("myfile.RData")
This worked for me:
env <- new.env()
env[[varname]] <- object_to_save
save(list=c(varname), envir=env, file='out.Rda')
You could probably do it without a new env (but I didn't try this):
.GlobalEnv[[varname]] <- object_to_save
save(list=c(varname), envir=.GlobalEnv, file='out.Rda')
You might even be able to remove the envir variable.

Resources