Trying to save everything in R environment to disk - r

I need to save the items in my R environment to disk. I can't figure out why the following code doesn't work:
op <- function(){
for(i in 1:length(ls())){
file <- paste0(ls()[i],".Rds")
saveRDS(file,file)
}
}

There are actually a couple of things wrong here:
I suspect you want to save the objects in .GlobalEnv, not those in op's environment. However, the calls to ls() list the objects in op's own environment (which contains only i by the time ls() is called inside the loop). To list the objects in .GlobalEnv, call ls(.GlobalEnv).
Also, when you call saveRDS(file, file), you tell it to save the string stored in file to the path stored in file, so you are essentially only saving the path. Instead you need to get the object itself from .GlobalEnv.
So one correct way to do it would be:
op <- function(){
obj_names <- ls(.GlobalEnv)
for(i in seq_along(obj_names)){
file <- paste0(obj_names[i],".Rds")
saveRDS(get(obj_names[i], envir = .GlobalEnv), file)
}
}
Or a bit more idiomatic,
op <- function()
sapply(ls(.GlobalEnv), function(x) saveRDS(get(x, envir = .GlobalEnv), paste0(x, ".Rds")))
The save() function might also be useful if you don't mind saving all objects in one file. See ?save for more.
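The loop above can be generalized to any environment and output directory. The following is only a sketch: the name save_env_objects, its arguments, and the use of tempdir() are assumptions made for illustration and are not part of the original answer.

```r
# Hypothetical helper: write every object in `env` to its own .Rds file in `dir`.
save_env_objects <- function(env = .GlobalEnv, dir = tempdir()) {
  obj_names <- ls(env)
  # Nothing to do for an empty environment
  if (length(obj_names) == 0) return(invisible(character(0)))
  paths <- file.path(dir, paste0(obj_names, ".Rds"))
  for (i in seq_along(obj_names)) {
    saveRDS(get(obj_names[i], envir = env), paths[i])
  }
  # Return the written paths, named by object, without printing them
  invisible(setNames(paths, obj_names))
}

# Usage on a throwaway environment so the global workspace is untouched
demo_env <- new.env()
assign("a", 1, envir = demo_env)
assign("b", letters[1:3], envir = demo_env)
written <- save_env_objects(demo_env)
restored_b <- readRDS(written[["b"]])
```

Each file is read back with readRDS(), which returns the object so you can bind it to whatever name you like.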

The code you wrote only saves strings: each file contains its own file name, and those names come from the environment of your function rather than from your workspace (so you end up with a file such as "i.Rds").
If you want to save the contents of an environment to a file, try save() or save.image(), which do exactly what you are looking for.
For more information see ?save. Here is some code:
a <- 1
b <- 2
save(list=ls(), file="myfile.rda")
rm(list=ls())
load(file="myfile.rda")
ls()
yielding:
[1] "a" "b"

Related

Load csv file in R

Is there a way to load a csv file in R and define the variable automatically from the file name? So, if you have a csv file called 'hello', can I load it in R and create the data frame/variable without defining it myself?
So, rather than naming hello in the load call, as in hello=read("filepath/hello"), we would just call read("filepath/hello") together with a command that creates a variable with the same name as the file (hello in this example).
Depending on why you would like to do this I would offer you another solution:
I suppose your problem is that you have a big folder with a lot of csv files and you would like to load them all and give the variables the name of the csv file without typing everything manually.
Then you can run
> setwd("C:/Users/Testuser/testfiles")
> file_names <- list.files()
> file_names
[1] "rest" "test1.txt" "test2.csv" "test3.csv"
where the path is the folder in which all your csv files are stored.
If the folder also contains other files and you only want the csv files, filter the names with a regular expression (the dot is escaped and the pattern anchored so that only names ending in .csv match):
> file_names_csv <- file_names[grepl("\\.csv$", file_names)]
> file_names_csv
[1] "test2.csv" "test3.csv"
Now we load them with a for loop and assign them to a variable that is named as the corresponding csv file
for( name in file_names_csv){
assign(name, read.csv(file = name))
}
And we have
> test2.csv
test
1 1234
> test3.csv
test
1 2323
You can also strip the .csv extension to get cleaner variable names. Keep the original file names for read.csv() and use the stripped names only as the variable names in assign():
> var_names <- sub("\\.csv$", "", file_names_csv)
> var_names
[1] "test2" "test3"
So basically you have exactly what you asked for, without naming each variable by hand.
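If the goal is simply to have each file available under its own name, a named list avoids creating one variable per file. This is a sketch under assumptions: read_csv_dir and the csv_demo scratch directory are hypothetical names, and the two small files only mirror the sample output shown above.

```r
# Hypothetical helper: read every .csv in `dir` into a named list of data frames.
read_csv_dir <- function(dir) {
  paths <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  data <- lapply(paths, read.csv)
  # Name each element after its file, without directory or extension
  names(data) <- tools::file_path_sans_ext(basename(paths))
  data
}

# Usage with two small files written to a scratch directory
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(test = 1234), file.path(csv_dir, "test2.csv"), row.names = FALSE)
write.csv(data.frame(test = 2323), file.path(csv_dir, "test3.csv"), row.names = FALSE)
tables <- read_csv_dir(csv_dir)
tables$test2
```

If separate variables are really wanted, list2env(tables, envir = .GlobalEnv) creates one per list element.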
I would advise you not to do this in any real-world scenario, but if it helps in understanding the concepts: this is not a complete solution, only the important ingredients.
<<- is the superassignment operator. It assigns in the enclosing environment (falling back to the global environment if no existing binding is found), which in the following case is the global environment:
rm(hello) # just in case, ignore warning if there is any
dont <- function(){
hello <<- 42
}
print(hello) # fails here: hello does not exist yet
dont()
print(hello) # now prints 42
So you can define values in the enclosing environment within a function without a return value.
The name of that variable does not have to be fixed (as hello in the example above) but can depend on an argument to that function as in
dontdothis <- function(name){
eval(parse(text = paste0(name, " <<- 42")))
}
dontdothis("frederik")
print(frederik * 2)
You will need to add the file operations and some small detail but that is how one could do it. You may want to google for namespaces and environments and assignment operators in R to get a better understanding of the details in there.
Worthwhile short read to distinguish between global environment and enclosing environment: Why is using `<<-` frowned upon and how can I avoid it?
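For comparison, the same "variable name from an argument" idea can be written with assign() instead of eval(parse()), which keeps the target environment explicit. This is a minimal sketch; define_var and the scratch environment are hypothetical names introduced for illustration.

```r
# Hypothetical helper: bind `value` to a variable called `name` in `env`.
define_var <- function(name, value, env = .GlobalEnv) {
  assign(name, value, envir = env)
  invisible(value)
}

# Use a scratch environment so the global workspace is left alone
scratch <- new.env()
define_var("frederik", 42, env = scratch)
doubled <- get("frederik", envir = scratch) * 2
```

Passing the environment as an argument makes it obvious where the variable ends up, which is the main thing <<- hides.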

Can convert a string to an object but can't save() it -- why? [duplicate]

I am repeatedly applying a function to read and process a bunch of csv files. Each time it runs, the function creates a data frame (this.csv.data) and uses save() to write it to a .RData file with a unique name. Problem is, later when I read these .RData files using load(), the loaded variable names are not unique, because each one loads with the name this.csv.data....
I'd like to save them with unique tags so that they come out properly named when I load() them. I've created the following code to illustrate.
this.csv.data = list(data=c(1:9), unique_tag = "some_unique_tag")
assign(this.csv.data$unique_tag,this.csv.data$data)
# I want to save the data,
# with variable name of <unique_tag>,
# in a file named <unique_tag>.RData
saved_file_name <- paste(this.csv.data$unique_tag,"RData",sep=".")
save(get(this.csv.data$unique_tag), saved_file_name)
but the last line returns:
"Error in save(get(this_unique_tag), file = data_tag) :
object ‘get(this_unique_tag)’ not found"
even though the following returns the data just fine:
get(this.csv.data$unique_tag)
Just name the arguments you use. With your code the following works fine:
save(list = this.csv.data$unique_tag, file=saved_file_name)
My preference is to avoid the name in the RData file on load:
obj = local(get(load('myfile.RData')))
This way you can load various RData files and name the objects whatever you want, or store them in a list etc.
You really should use saveRDS()/readRDS() to serialize single objects.
save() and load() restore objects under their original names, which makes them better suited to saving several objects or a whole workspace.
saveRDS(this.csv.data, saved_file_name)
# later
mydata <- readRDS(saved_file_name)
You can use save.image() to save the entire workspace:
save.image("myfile.RData")
This worked for me:
env <- new.env()
env[[varname]] <- object_to_save
save(list=c(varname), envir=env, file='out.Rda')
You could probably do it without a new env (but I didn't try this):
.GlobalEnv[[varname]] <- object_to_save
save(list=c(varname), envir=.GlobalEnv, file='out.Rda')
You might even be able to drop the envir argument.
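Putting the pieces from these answers together, here is a hedged round-trip sketch: the object is saved under the name held in unique_tag and then loaded into a fresh environment, so nothing in the workspace gets overwritten. The names payload, save_env and load_env, and the use of tempdir(), are assumptions for illustration.

```r
unique_tag <- "some_unique_tag"
payload <- 1:9
file_path <- file.path(tempdir(), paste0(unique_tag, ".RData"))

# Save `payload` under the name stored in `unique_tag`
save_env <- new.env()
assign(unique_tag, payload, envir = save_env)
save(list = unique_tag, envir = save_env, file = file_path)

# Load into a fresh environment; load() returns the names it restored
load_env <- new.env()
loaded_names <- load(file_path, envir = load_env)
restored <- get(loaded_names[1], envir = load_env)
```

Because load() returns the restored names as a character vector, the caller does not need to know the tag in advance.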

Clarity on issue with nested functions and environments in R

I created a function to save data to a specific location which is loaded as part of a package when launching rstudio:
save_data <- function(fileName, file_name){
file_path <- paste0("~/", file_name)
save(list=deparse(substitute(fileName)), file=file_path)
}
but when calling this function from inside another function it returns an "Error saving the following file: ...".
To reproduce the issue I created a savetestfunction:
savetest <- function(fileName){
data1 <- fileName * 10
save_data(data1, file_name = "test.RData")
data1
}
and a small savetestscript:
source("savetestfunction.R")
x <- c(1:10)
data1 <- savetest(x)
I spent a long time assuming the issue stemmed from the environments and what was being seen from each (e.g. save_data function loaded on startup couldn't see the temporary environment created when calling the savetestfunction) but as a test I tried adding a print(fileName) to the save_data function and to my surprise it could in fact see what fileName was at this point.
The Fix: I updated the function so that it assigned fileName within the environment created by the save_data function and it now functions as intended...
save_data <- function(fileName, file_name){
file_path <- paste0("//placeholder//", file_name)
assign(deparse(substitute(fileName)), fileName)
save(list=deparse(substitute(fileName)), file=file_path)
}
The Confusion: This then led me to believe that fileName was perhaps still a promise when it was being saved. However, save() has an argument eval.promises which is TRUE by default, so it can't be this!
The Question: What was causing this issue? (Confusion related to environments? Promises?) What could have been done to avoid this?
Edit: Tried to use saveRDS but didn't have success with this. More interested in why the save doesn't work in the first place rather than an actual fix as the assign within the save_data function already lets it function.
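One plausible explanation, offered as an assumption rather than a confirmed diagnosis: save() looks up the names given in list inside its envir argument, which defaults to the frame that called save(), that is, save_data's own frame. The value of fileName is visible there (hence print() works), but no variable named data1 exists in that frame; it lives in savetest's frame. The assign() fix works because it creates data1 locally. A sketch of the alternative, pointing envir at the caller (save_data_from_caller, savetest_demo and the dir argument are hypothetical names):

```r
# Hypothetical variant of save_data(): look the object up in the caller's frame.
save_data_from_caller <- function(obj, file_name, dir = tempdir()) {
  obj_name <- deparse(substitute(obj))   # e.g. "data1"
  caller <- parent.frame()               # the frame that called this function
  file_path <- file.path(dir, file_name)
  save(list = obj_name, file = file_path, envir = caller)
  invisible(file_path)
}

savetest_demo <- function(x) {
  data1 <- x * 10
  save_data_from_caller(data1, file_name = "test.RData")
}

saved_path <- savetest_demo(1:10)

# Check what ended up in the file
check_env <- new.env()
load(saved_path, envir = check_env)
ls(check_env)
```

This only works when the argument is a plain variable name in the caller; an expression such as x * 10 has no binding to save.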

R: Save all data.frames in workspace to separate .RData files

I have several data.frames in an environment which I would like to save into separate .RData files. Is there a function which is able to save the whole workspace?
I usually just do this with the following function:
save(x, file = "xy.RData")
but is there a way I could save all the data.frames separately at once?
save() is not vectorized over output files: a single call writes all the named objects into one file, so a loop is probably the better choice here. First, get a vector of the names of all your data.frames.
dfs<-Filter(function(x) is.data.frame(get(x)) , ls())
Now write each to a file.
for(d in dfs) {
save(list=d, file=paste0(d, ".RData"))
}
Or if you just wanted them all in one file
save(list=dfs, file="alldfs.RData")
To save your workspace you just need to do:
save.image("willcontainworkspace.RData")
This creates a single file that contains the entire workspace which may or may not be what you want but your question wasn't completely clear to me.
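Similar to @MrFlick's approach, you can do something like this:
invisible({
sapply(ls(envir = .GlobalEnv), function(x) {
obj <- get(x, envir = .GlobalEnv)
# is.data.frame() also handles objects with several classes (e.g. tibbles)
if (is.data.frame(obj)) {
# save under the object's own name rather than as "obj"
save(list = x, envir = .GlobalEnv, file = paste0(x, ".RData"))
}
})
})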
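The loop approaches above can be wrapped in a small function that takes the environment and the output directory as arguments. This is a sketch only: save_data_frames, the demo environment and the use of tempdir() are assumptions for illustration.

```r
# Hypothetical helper: save each data frame in `env` to "<name>.RData" inside `dir`.
save_data_frames <- function(env = .GlobalEnv, dir = tempdir()) {
  objs <- mget(ls(env), envir = env)
  df_names <- names(Filter(is.data.frame, objs))
  for (nm in df_names) {
    # list = nm keeps the original object name inside the file
    save(list = nm, envir = env, file = file.path(dir, paste0(nm, ".RData")))
  }
  invisible(df_names)
}

# Usage on a throwaway environment holding one data frame and one vector
demo <- new.env()
demo$cars_small <- head(mtcars, 3)
demo$not_a_df <- 1:5
saved <- save_data_frames(demo)
```

Only cars_small is written; the integer vector is skipped because it is not a data frame.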

