How to save() with a particular variable name - r

I am repeatedly applying a function to read and process a bunch of csv files. Each time it runs, the function creates a data frame (this.csv.data) and uses save() to write it to a .RData file with a unique name. Problem is, later when I read these .RData files using load(), the loaded variable names are not unique, because each one loads with the name this.csv.data....
I'd like to save them with unique tags so that they come out properly named when I load() them. I've created the following code to illustrate.
this.csv.data = list(data=c(1:9), unique_tag = "some_unique_tag")
assign(this.csv.data$unique_tag,this.csv.data$data)
# I want to save the data,
# with variable name of <unique_tag>,
# in a file named <unique_tag>.RData
saved_file_name <- paste(this.csv.data$unique_tag,"RData",sep=".")
save(get(this.csv.data$unique_tag), saved_file_name)
but the last line returns:
"Error in save(get(this_unique_tag), file = data_tag) :
object ‘get(this_unique_tag)’ not found"
even though the following returns the data just fine:
get(this.csv.data$unique_tag)

Just name the arguments you use. With your code the following works fine:
save(list = this.csv.data$unique_tag, file=saved_file_name)
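The original call fails because save() does not evaluate its unnamed arguments: it takes them literally as object names, so it looks for an object called get(...). The list argument instead accepts the names as strings. A minimal round trip, assuming a temporary directory and the tag from the question:

```r
# Tag taken from the question; the data are illustrative
tag <- "some_unique_tag"
assign(tag, 1:9)                      # creates a variable named some_unique_tag

# Write to a temporary file so the sketch leaves nothing behind
f <- file.path(tempdir(), paste0(tag, ".RData"))
save(list = tag, file = f)            # 'list' takes object names as strings

rm(list = tag)                        # remove the variable
loaded <- load(f)                     # load() invisibly returns the restored names
loaded                                # "some_unique_tag"
exists(tag)                           # TRUE
```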

My preference is to avoid the name in the RData file on load:
obj = local(get(load('myfile.RData')))
This way you can load various RData files and name the objects whatever you want, or store them in a list etc.
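As a sketch of the "store them in a list" idea, assuming each file holds exactly one object (the file names and data here are made up for illustration):

```r
# Create two example files, each holding one object named 'x'
files <- file.path(tempdir(), c("a.RData", "b.RData"))
x <- 1:3
save(x, file = files[1])
x <- 4:6
save(x, file = files[2])

# local() gives each load() a throwaway environment, so the stored name
# never reaches the calling environment; get() fetches the value by name.
read_one <- function(f) local(get(load(f)))

objs <- lapply(files, read_one)
names(objs) <- c("a", "b")   # names chosen at load time
```

If a file contained several objects, load() would return several names and get() would fail, so this pattern is best kept to one object per file.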

You really should use saveRDS/readRDS to serialize single objects.
save and load store objects together with their names (up to a whole workspace) and restore them under those names.
saveRDS(this.csv.data, saved_file_name)
# later
mydata <- readRDS(saved_file_name)
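Applied to the question's workflow, this could look like the sketch below. The helper save_tagged() and the .rds extension are illustrative choices, not part of the original code:

```r
# Hypothetical helper: save one processed data frame under a tag-derived file name
save_tagged <- function(df, tag, dir = tempdir()) {
  path <- file.path(dir, paste0(tag, ".rds"))
  saveRDS(df, path)
  invisible(path)   # return the path so the caller can keep track of it
}

p <- save_tagged(data.frame(v = 1:9), "some_unique_tag")

# later: the object name is chosen at read time, it is not stored in the file
some_unique_tag <- readRDS(p)
```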

You can use save.image(), which saves every object in the current workspace to one file:
save.image("myfile.RData")

This worked for me:
env <- new.env()
env[[varname]] <- object_to_save
save(list=c(varname), envir=env, file='out.Rda')
You could probably do it without a new env (but I didn't try this):
.GlobalEnv[[varname]] <- object_to_save
save(list=c(varname), envir=.GlobalEnv, file='out.Rda')
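You might even be able to remove the envir argument: it defaults to the environment save() is called from, which at the top level is the global environment.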
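The new-environment approach can be wrapped in a small function. This is only a sketch; save_as() is a hypothetical name, and the check at the end loads into a separate environment to show which name was stored:

```r
# Hypothetical helper: save 'object' so that load() restores it as 'varname'
save_as <- function(object, varname, file) {
  env <- new.env()
  assign(varname, object, envir = env)
  save(list = varname, envir = env, file = file)
}

f <- file.path(tempdir(), "out.Rda")
save_as(1:9, "some_unique_tag", f)

# Load into a scratch environment to inspect the stored name
check <- new.env()
load(f, envir = check)
ls(check)                    # "some_unique_tag"
```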

Related

Save .RData in a different directory

I load my files (.RData) from a particular folder, and I created a subfolder to save some samples and subsets. I want to save these elements in the subfolder, and they don't share a name structure because I have multiple datasets (for example, the names cannot be sub1, sub2, etc.; I have to write try1, full_sample, sub_2021 and so on).
I tried the following :
subsets_samples <- file.path <-("/Volumes/WD_BLACK/Merge/SAMPLES_SUBSETS")
fname <- file.path(subsets_samples, ".RData")
save(mydata, file=fname)
But obviously there is a problem with the saving part. My goal is to have something like :
save(mydata, file = "newname")
with the .RData extension from fname added automatically.
I saw some answers with loops and so on, but I don't really understand the process, sorry.
Thanks !
The problem with file.path is that it places a separator (e.g., /) between each of its elements, so file.path(subsets_samples, ".RData") yields a file named just ".RData" inside the folder. You have to build the actual file name with paste0 in addition:
# If I understand you correctly, you want the iteration, like try1, full_sample, sub_2021 and so on in your file name. define them somewhere in your loop/script
iteration <- "full_sample"
fname <- file.path("/Volumes", "WD_BLACK", "Merge", "SAMPLES_SUBSETS", paste0(iteration, ".Rds"))
Additionally, I would suggest using saveRDS instead of save, since it is the appropriate function if you want to save just one object.
saveRDS(mydata, file = fname)
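If you want to keep save() and the .RData extension as in the question, the same file.path plus paste0 pattern applies. A minimal sketch, with tempdir() standing in for the real /Volumes/... folder and placeholder data:

```r
# Illustrative directory; tempdir() stands in for the real subfolder
subsets_samples <- file.path(tempdir(), "SAMPLES_SUBSETS")
dir.create(subsets_samples, showWarnings = FALSE)

# file.path() inserts a separator, so this gives ".../SAMPLES_SUBSETS/.RData"
file.path(subsets_samples, ".RData")

# paste0() builds the file name itself, file.path() adds the directory
fname <- file.path(subsets_samples, paste0("full_sample", ".RData"))

mydata <- data.frame(a = 1:3)   # placeholder data
save(mydata, file = fname)
file.exists(fname)              # TRUE
```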

Read many .sas7bdat files via a loop in R

I am a noob in R and I am having a lot of trouble with the following:
I have to read in over 200 datasets and I want to do this automatically. I wrote some code that works perfectly for .RData files, but when I try it on SAS files it always gets stuck...
library(haven)  # provides read_sas()
path <- "road"
# I make a list of all the different paths of all the files in my folder
File_pathnames <- list.files(path = path, pattern = "\\.sas7bdat$", full.names = TRUE)
# I create an empty list
list.data <- list()
# I try to run a loop to load all the SAS files:
for (i in seq_along(File_pathnames))
{
list.data[[i]] <- read_sas(File_pathnames[i])
}
Problem: it does not load the tables into my global environment (when I used the .RData files I used the load function and all the data appeared in the global environment). How can I solve this?
Many thanks!
Actually, your data ARE in the global environment, as elements of list.data (check list.data[[1]], list.data[[2]], ...).
The issue you have is linked to the fact that load restores an object into the environment under the name it had when it was saved. As an example:
x <- 10
save(x, file='tmp')
rm(x)
x          # Error: object 'x' not found
load('tmp')
x          # [1] 10
This saves x and reloads it under the same name, whereas read_sas only returns the data, which you have to assign to a variable yourself.
If you want each data set in its own variable, you have to define a name for each of them and assign the data. Your loop would look like:
for (i in seq_along(File_pathnames))
{
namei <- paste0("name",i)
data <- read_sas(File_pathnames[i])
assign(namei, data)
}
and your data would be stored in "name1", "name2", ...
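You should then assign each SAS file read from File_pathnames[i] to an object named FilenamesS[i] (FilenamesS is assumed to be a character vector of object names that you have defined, one per file). Try: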
for (i in 1: length(File_pathnames))
{
data <- read_sas(File_pathnames[i])
assign (FilenamesS[i], data)
}
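An alternative to assign() is to keep the data in the list and name its elements after the files. The sketch below uses two small CSV files and read.csv as stand-ins so that it runs without SAS files or the haven package; the file names are invented for illustration:

```r
# Two small CSV files stand in for the .sas7bdat files (illustrative data)
paths <- file.path(tempdir(), c("visits.csv", "labs.csv"))
write.csv(data.frame(id = 1:2), paths[1], row.names = FALSE)
write.csv(data.frame(id = 3:5), paths[2], row.names = FALSE)

# Object names derived from the file names, without directory or extension
obj_names <- tools::file_path_sans_ext(basename(paths))

list.data <- lapply(paths, read.csv)   # with haven: lapply(paths, read_sas)
names(list.data) <- obj_names

list.data$labs                         # access by name, no assign() needed
```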

How do I use lapply to load files into the global environment?

I have the following working code:
############################################
###Read in all the wac gzip files###########
###Add some calculated fields ###########
############################################
library(readr)
setwd("N:/Dropbox/_BonesFirst/65_GIS_Raw/LODES/")
directory<-("N:/Dropbox/_BonesFirst/65_GIS_Raw/LODES/")
to.readin <- as.list(list.files(pattern="2002.csv"))
LEHD2002<-lapply(to.readin, function(x) {
read.table(gzfile(x), header = TRUE, sep = ",", colClasses = "numeric", stringsAsFactors = FALSE)
})
But I would like to load the things from lapply into the global environment, for debugging reasons.
This provides a way to do so.
# Load data sets
lapply(filenames, load, .GlobalEnv)
But when I attempt to use it, I get the following error:
Error in FUN(X[[i]], ...) : bad restore file magic number (file may be
corrupted) -- no data loaded In addition: Warning message: file
‘az_wac_S000_JT00_2004.csv.gz’ has magic number 'w_geo' Use of save
versions prior to 2 is deprecated
Am I doing something wrong, or is 'load' deprecated or the like?
The gzfile(x) converts the .gz (zipped) file to a .csv so that shouldn't be an issue...
load loads files in a binary format (e.g., .rda files). You're loading in files in a textual format, .csv files. This is why you're using read.table. When you try to read textual format files using load, you will get that error.
The usage: lapply(filenames, load, .GlobalEnv), passes .GlobalEnv to load, not to lapply. This is just a different way of reading in a list of files that are in a different format than yours. load can put the objects in a different environment as a way to protect you from overwriting objects in your current environment with the same name as the objects you're loading. Binary objects created using save (which you can load in with load) carry their names with them. When you load them in, you do not assign them to a name. They are accessible in the environment you choose to load them into with their original name.
Both methods load the objects into .GlobalEnv, so your code works the way you want it to. You can tell that your objects have not somehow been read into a different environment by trying to access them after you run the code. If you can access them using the name you assigned them to (here, LEHD2002), they are in the global environment.
A quick and dirty way is to assign into the global environment with <<- rather than <-, that is, write
LEHD2002<<-lapply(to.readin, function(x)
instead of
LEHD2002<-lapply(to.readin, function(x)
attach() can also be used, but it is touchier, and attaching multiple files makes a mess (i.e., make sure you detach() any files you attach()).
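If the goal is to have each element of the list available as its own object in the global environment, base R's list2env() is another option. A minimal sketch, where the list contents and the object names are placeholders rather than the real LODES data:

```r
# Placeholder list standing in for the result of lapply(to.readin, ...)
LEHD <- list(data.frame(a = 1), data.frame(a = 2))
names(LEHD) <- c("az_2002", "ca_2002")      # hypothetical object names

# Copy every list element into the global environment under its name
list2env(LEHD, envir = globalenv())

exists("az_2002", envir = globalenv())      # TRUE
```

list2env() requires a named list, and it overwrites any existing objects with the same names.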

Best practice for naming archived objects?

I've got a function that has a list output. Every time I run it, I want to export the results with save. After a couple of runs I want to read the files in and compare the results. I do this, because I don't know how many tasks there will be, and maybe I'll use different computers to calculate each task. So how should I name the archived objects, so later I can read them all in?
My best guess would be to dynamically name the variables before saving, and keep track of the object names, but I've read everywhere that this is a big no-no.
So how should I approach this problem?
You might want to use the saveRDS and readRDS functions instead of save and load. The RDS functions save and read a single object without an attached name. You would create your object and save it to a file (using paste0 or sprintf to create unique file names); then, when processing the results, you can read in one object at a time, or read several into a list to work with them.
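A sketch of that workflow, assuming an illustrative output directory and a made-up result_NNN.rds naming scheme:

```r
out_dir <- file.path(tempdir(), "runs")      # illustrative output directory
dir.create(out_dir, showWarnings = FALSE)

# Save each run's result under a unique, sortable file name
for (run in 1:3) {
  result <- list(run = run, value = run^2)   # placeholder for the real output
  saveRDS(result, file.path(out_dir, sprintf("result_%03d.rds", run)))
}

# Later: read every result into one named list
files <- list.files(out_dir, pattern = "^result_[0-9]+\\.rds$", full.names = TRUE)
results <- lapply(files, readRDS)
names(results) <- tools::file_path_sans_ext(basename(files))

sapply(results, function(r) r$value)
```

Because the list is built from whatever files are present, it does not matter how many runs there were or which machine produced them, as long as the files end up in one directory.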
You can use scope to hide the retrieved name inside a function, so first you might save a list to a file:
mybiglist <- list(fred=1, john='dum di dum', mary=3)
save(mybiglist, file='mybiglist1.RData')
Then you can load it back in through a function and give it whatever name you like be it inside another list or just a plain object:
# Use the fact that load returns the name of the object loaded
# and that scope will hide this object
myspecialload <- function(RD.fnam) {
return(eval(parse(text=load(RD.fnam))))
}
# now let's reload that file but put it in another object
mynewbiglist <- myspecialload('mybiglist1.RData')
mynewbiglist
$fred
[1] 1
$john
[1] "dum di dum"
$mary
[1] 3
Note that this is not really a generic 'use it anywhere' type function, as for an RData file with multiple objects it appears to return the last object saved... so best stick with one list object per file for now!
One time I was given several RData files, and they all contained only one variable, called x. In order to read all of them into my workspace, I loaded each file in turn into its own environment and used get() to read the value.
tenv <- new.env()
load("file_1.RData", envir = tenv)
ls(tenv) # x
myvar1 <- get(ls(tenv), tenv)
rm(tenv)
....
This code can be repeated for each file.
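The same temporary-environment idea also covers files that contain more than one object, which the eval(parse(...)) loader above does not handle well. A minimal sketch, where load_all() is a hypothetical helper and the data are invented:

```r
# Example file with two objects (illustrative data)
f <- file.path(tempdir(), "two_objects.RData")
a <- 1:3
b <- "text"
save(a, b, file = f)

# Hypothetical loader: returns all saved objects as a named list
load_all <- function(path) {
  tenv <- new.env()
  nms <- load(path, envir = tenv)   # names of everything in the file
  mget(nms, envir = tenv)
}

objs <- load_all(f)
sort(names(objs))                   # "a" "b"
```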

Resources