I want to save data into an .RData file.
For instance, I'd like to save the contents of two csv files, plus some additional information, into 1.RData.
Here, I have two csv files:
1) file_1.csv contains object city[[1]]
2) file_2.csv contains object city[[2]]
and I additionally want to save other values, country and population, as follows.
So I guess I first need to build the object 'city' from the two csv files.
The structure of 1.RData might look like this:
> data = load("1.RData")
> data
[1] "city" "country" "population"
> city
[[1]]
NEW YORK 1.1
SAN FRANCISCO 3.1
[[2]]
TEXAS 1.3
SEATTLE 1.4
> class(city)
[1] "list"
> country
[1] "east" "west" "north"
> class(country)
[1] "character"
> population
[1] 10 11 13 14
> class(population)
[1] "integer"
file_1.csv and file_2.csv have a bunch of rows and columns.
How can I create this type of RData with csv files and values?
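A minimal sketch of one way to build such a file, assuming each csv can be read with read.csv(). The two small stand-in files, their column names (name, value), and the use of tempdir() are assumptions made only so the example runs on its own; substitute your real file paths.

```r
# Create two small stand-in CSV files in a temp directory (illustrative data only).
dir <- tempdir()
write.csv(data.frame(name = c("NEW YORK", "SAN FRANCISCO"), value = c(1.1, 3.1)),
          file.path(dir, "file_1.csv"), row.names = FALSE)
write.csv(data.frame(name = c("TEXAS", "SEATTLE"), value = c(1.3, 1.4)),
          file.path(dir, "file_2.csv"), row.names = FALSE)

# Read each CSV into one element of the list 'city'.
csv_files <- file.path(dir, c("file_1.csv", "file_2.csv"))
city <- lapply(csv_files, read.csv)

# The other values from the question.
country <- c("east", "west", "north")
population <- c(10L, 11L, 13L, 14L)  # the L suffix makes these integers

# Save all three objects into a single .RData file.
rdata_file <- file.path(dir, "1.RData")
save(city, country, population, file = rdata_file)

# load() restores the objects and returns their names.
loaded <- load(rdata_file)
```

Here city[[1]] and city[[2]] are data frames, one per csv file, which matches the list structure shown in the question.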
Alternatively, when you want to save individual R objects, I recommend using saveRDS.
You can save R objects using saveRDS, then load them into R with a new variable name using readRDS.
Example:
# Save the city object
saveRDS(city, "city.rds")
# ...
# Load the city object as city
city <- readRDS("city.rds")
# Or with a different name
city2 <- readRDS("city.rds")
But when you want to save many/all your objects in your workspace, use Manetheran's answer.
There are three ways to save objects from your R session:
Saving all objects in your R session:
The save.image() function will save all objects currently in your R session:
save.image(file="1.RData")
These objects can then be loaded back into a new R session using the load() function:
load(file="1.RData")
Saving some objects in your R session:
If you want to save some, but not all objects, you can use the save() function:
save(city, country, file="1.RData")
Again, these can be reloaded into another R session using the load() function:
load(file="1.RData")
Saving a single object:
If you want to save a single object you can use the saveRDS() function:
saveRDS(city, file="city.rds")
saveRDS(country, file="country.rds")
You can load these into your R session using the readRDS() function, but you will need to assign the result to the desired variable:
city <- readRDS("city.rds")
country <- readRDS("country.rds")
But this also means you can give these objects new variable names if needed (e.g. if those variables already exist in your new R session but contain different objects):
city_list <- readRDS("city.rds")
country_vector <- readRDS("country.rds")
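Something similar is possible with .RData files if you load them into a separate environment rather than the workspace. The sketch below is only an illustration: the temporary file and the name loaded_env are assumptions, not part of the answer above.

```r
# Save two objects to a temporary .RData file.
city <- list("a", "b")
country <- c("east", "west", "north")
rdata_file <- tempfile(fileext = ".RData")
save(city, country, file = rdata_file)

# Pretend a later session already has a different 'city'.
city <- "something else"

# Load into a fresh environment instead of the current workspace,
# so the existing 'city' is not overwritten.
loaded_env <- new.env()
load(rdata_file, envir = loaded_env)

# Pull the objects out under whatever names you like.
city_list <- loaded_env$city
country_vector <- get("country", envir = loaded_env)
```

The existing city is left untouched, and the loaded copies are available as city_list and country_vector.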
Just to add an additional option should you need it: you can include a variable in the file name, for example a date identifier:
date <- format(Sys.Date(), "%Y%m%d")  # today's date as yyyymmdd
save(city, file=paste0("c:\\myuser\\somelocation\\", date, "_RData.Data"))
This way you can always keep track of when it was run.
Related
I have a parameter called country_name that holds the name of the country I am interested in, and I sometimes change it when I run my code. I would like the object in my RDA file to reflect that name once it is saved and loaded back into the environment.
Currently what happens is this:
Name the country:
country_name <- "Ireland"
Create a simple data frame:
x <- 10
vars_2_keep <- data.frame(x)
vars_2_keep
It contains x=10
x
1 10
I save it, renaming the data frame with the country name so that when I do this with a different country I will have country specific information:
save(vars_2_keep, file=paste("my_data_", country_name[1], ".rda", sep = ""))
I delete everything, and load it back in:
rm(list=ls())
load(file='my_data_Ireland.rda')
Unfortunately, in the environment instead of my data frame being called "my_data_Ireland", it is still called vars_2_keep.
How can I change the name of this data frame to "my_data_" followed by country_name[1] (which in this example would be my_data_Ireland)?
Thank you
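One possible approach, sketched under the assumption that the name can be changed before saving: bind the data frame to the desired name with assign(), then pass that name to save() through its list argument. The names object_name and rda_file and the use of tempdir() are introduced here purely for illustration.

```r
country_name <- "Ireland"
vars_2_keep <- data.frame(x = 10)

# Build the desired object name and bind the data frame to it.
object_name <- paste0("my_data_", country_name[1])
assign(object_name, vars_2_keep)

# save() accepts object names as a character vector via 'list ='.
rda_file <- file.path(tempdir(), paste0(object_name, ".rda"))
save(list = object_name, file = rda_file)

# Remove both copies, then load: the object comes back under the new name.
rm(list = c("vars_2_keep", object_name))
loaded_names <- load(rda_file)
```

An .rda file stores each object together with the name it had when it was saved, which is why the rename has to happen before the call to save().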
I have about 30 separate dataframes loaded in my R session, each with a different name. I also have a character vector called mydfs which contains the names of all those dataframes. I am trying to loop over mydfs and save each dataframe it names as an rds file, but for some reason I'm only able to save the character string holding the dataframe's name (not the dataframe itself). Here is a simulated, reproducible example of what I have:
#Create vector of dataframes that exist in base r to create a reproducible example
mydfs<-c("cars","iris","iris3","mtcars")
#My code that creates files, but they don't contain my dataframe data for some reason
for (i in 1:length(mydfs)){
savefile<-paste0(paste0("D:/Data/", mydfs[i]), ".Rds")
saveRDS(mydfs[i], file=savefile)
print(paste("Dataframe Saved:", mydfs[i]))
}
This results in the following log output:
[1] "Dataframe Saved: cars"
[1] "Dataframe Saved: iris"
[1] "Dataframe Saved: iris3"
[1] "Dataframe Saved: mtcars"
Then I try to read back in any of the files I created:
#But when read back in only contain a single character string of the dataframe name
a<-readRDS("D:/Data/iris3.Rds")
str(a)
chr "iris3"
Note that when I read iris3.Rds back into a new R session using readRDS, I don't get a dataframe as I was expecting, but a single character vector containing the name of the dataframe and not the data.
I haven't been programming in R for a while, since my current client preferred SAS, so I think I am somehow getting macro variable looping in SAS confused with R and so that when I call saveRDS, I'm passing in a single character vector instead of the actual dataframe. How can I get the dataframe to be passed into saveRDS instead of the character?
Thanks for helping me untangle my SAS thinking with my somewhat rusty R thinking.
You're currently just saving the names of the dataframes. You can use the get function as follows:
mydfs<-c("cars","iris","iris3","mtcars")
for (i in 1:length(mydfs)){
savefile<-paste0(paste0("D:/Data/", mydfs[i]), ".Rds")
saveRDS(get(mydfs[i]), file=savefile)
print(paste("Dataframe Saved:", mydfs[i]))
}
# Read one back in to check (use the same ".Rds" extension the loop wrote)
readRDS('D:/Data/iris3.Rds')
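As a variation on the same idea, mget() looks up several names at once and returns a named list of the objects. This is a sketch rather than the answerer's method; out_dir (a temp directory standing in for D:/Data/) is an assumption so the example runs anywhere.

```r
mydfs <- c("cars", "iris", "iris3", "mtcars")
out_dir <- tempdir()  # stand-in for D:/Data/

# mget() returns a named list of the objects themselves;
# inherits = TRUE lets it find the data sets attached by base R.
objects <- mget(mydfs, inherits = TRUE)

# Save each object under its own name.
for (name in names(objects)) {
  saveRDS(objects[[name]], file = file.path(out_dir, paste0(name, ".Rds")))
}

# Reading a file back now returns the data, not the name.
a <- readRDS(file.path(out_dir, "iris3.Rds"))
```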
I am performing a set of analyses in R. The flow of the analysis is reading in a dataframe (i.e. input_dataframe), performing a set of calculations that then result in a new, smaller dataframe (called final_result). A set of exact calculations is performed on 23 different files, each of which contains a dataframe.
My question is as follows: for each file that is read in (i.e. the 23 files), I am trying to save a unique R object. How do I do so? When I save the resulting final_result dataframe to a file (using save()), I cannot then load all 23 objects into a new R session without the different R objects overwriting each other. Other suggestions (such as Create a variable name with "paste" in R?) did not work for me, since they rely on the fact that once the new variable name is assigned, you then call that new variable by its name, which I cannot do in this case.
To Summarize/Reword: Is there a way to save an object in R but change the name of the object for when it will be loaded later?
For example:
x=5
magicSave(x,file="saved_variable_1.r",to_save_as="result_1")
x=93
magicSave(x,file="saved_variable_2.r",to_save_as="result_2")
load(saved_variable_1)
load(saved_variable_2)
result_1
#returns 5
result_2
#returns 93
In R it's generally a good idea to actually store as a list everything that can be seen as a list. It will make everything more elegant afterwards.
First you put all your paths in a list or a vector :
paths <- c("C:/somewhere/file1.csv",
"C:/somewhere/file2.csv") # etc
Then you read them :
objects <- lapply(paths,read.csv) # objects is a list of tables
Then you apply your transformation on each element :
output <- lapply(objects,transformation_function)
And then you can save your output (I find saveRDS cleaner than save as you know what variables you'll be inviting in your workspace when loading) :
saveRDS(output,"C:/somewhere/output.RDS")
which you will load with
output <- readRDS("C:/somewhere/output.RDS")
OR if you prefer for some reason to save as different objects:
output_paths <- paste0("C:/somewhere/output",seq_along(output),".RDS")
Map(saveRDS,output,output_paths)
To load later with:
output <- lapply(output_paths, readRDS)
# Another option: write each value to its own csv file and read it back under a new name
x=5
write.csv(x,"one_thing.csv", row.names = FALSE)
x=93
write.csv(x,"two_thing.csv", row.names = FALSE)
result_1 <- read.csv("one_thing.csv")
result_2 <- read.csv("two_thing.csv")
result_1
# x
# 1 5
result_2
# x
# 1 93
I would like to load a data file in R using data(), with the data set's name stored in a variable. Doing this without the data set name stored in a variable is trivial:
> library(ChIPpeakAnno)
> data(TSS.human.NCBI36)
> # Use data:
> TSS.human.NCBI36 # Prints out contents of data set
When the data set name is stored in a variable, however, I'm not sure how to accomplish the same task.
> library(ChIPpeakAnno)
> assembly <- 'TSS.human.NCBI36'
> data(list=c(assembly)) # Hackish way of loading the data from a variable
> # Now I wish to access the data, but I don't know how.
data()'s return value is simply the name of the data set loaded. The data file I'm trying to load is located at ~/R/2.15/library/ChIPpeakAnno/data/TSS.human.NCBI36.rda -- I do not believe there is anything Bioconductor-specific to it.
Thanks!
If you're trying to figure out how to access data programmatically when you just have the object's name in a character vector, you can use get.
library(ChIPpeakAnno)
assembly <- 'TSS.human.NCBI36'
data(list=c(assembly))
# Now store the data into 'dat'
dat <- get(assembly)
# Now you can use 'dat' anywhere you would normally use TSS.human.NCBI36
head(start(dat))
#[1] 1873 4274 20229 24417 24417 42912
head(start(TSS.human.NCBI36))
#[1] 1873 4274 20229 24417 24417 42912
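If you would rather not put the data set into your workspace at all, data() also takes an envir argument, and get() can look in that environment. The sketch below uses mtcars from base R's datasets package as a stand-in for TSS.human.NCBI36 so that it runs without Bioconductor; dataset_name and data_env are illustrative names.

```r
dataset_name <- "mtcars"  # stand-in; ships with base R's 'datasets' package

# Load the data set into its own environment rather than the workspace.
data_env <- new.env()
data(list = dataset_name, package = "datasets", envir = data_env)

# Retrieve it by name from that environment.
dat <- get(dataset_name, envir = data_env)
```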