How to share data frames between scripts in R

I've got multiple R scripts: one that cleans my original data and produces a tidy data frame, and several others that perform functions on that data frame.
When I wrote them, the data frame produced by the first script was in my RStudio environment and the other scripts referenced the resulting data frame without trouble.
Now that I'm trying to run them from the console, the data frame produced by the first script isn't reference-able for the others.
What's the best way to share a data frame between scripts?

You could try using save.image() and load() to save your workspace to a file and then load it into your console session, since your console instance and RStudio most likely each have their own independent environment.
Done this way, you would have access to all the objects that the previous scripts created. However, if you are only interested in the generated data, you could save your data.frame using save() and restore it using load().
As mentioned by @Dirk Eddelbuettel, there are also plenty of good functions for saving single objects, such as saveRDS() and readRDS() (which serialize one object without tying it to a variable name, unlike save()), and write.csv() and read.csv().
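A minimal sketch of the saveRDS()/readRDS() route, assuming the cleaning script ends by writing the tidy data frame to a file and the other scripts start by reading it. The object names and the temporary file path are made up for illustration; in practice you would use a path both scripts agree on.

```r
# --- end of the (hypothetical) cleaning script ---
tidy <- data.frame(id = 1:3, value = c(10.5, 20.1, 30.7))
rds_path <- file.path(tempdir(), "tidy_data.rds")  # illustrative location only
saveRDS(tidy, rds_path)

# --- start of a (hypothetical) analysis script ---
tidy_again <- readRDS(rds_path)  # the result can be assigned to any name
print(tidy_again)
```

Because readRDS() returns the object instead of restoring it under a fixed name, each analysis script decides what to call the data frame.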

Related

How to use a file modified by a R chunk in a Python one

I am working in R Markdown, primarily in R chunks, which I used to modify data frames. Now that they are ready, a colleague gave me Python code to process some of the data. But when switching from an R chunk to a Python one, the environment changes and I do not know how to use the previous objects directly.
reticulate::repl_python()
biodata_file = women_personal_data
NameError: name 'women_personal_data' is not defined
Ideally, I would like not to have to save the files on my computer between R and Python, and then back at R again, to avoid accumulating files that are not completely clean yet (because I figured it could be a solution).
I tried this solution, but it does not seem to work with data frames.
Thanks !
biodata_file = r.women_personal_data
The r. prefix makes Python take the object from the R session: inside a Python chunk, the R variable is called
r.women_personal_data
Tip: to come back to R, a variable created in Python is accessed with the py$ prefix, e.g. py$biodata_file

Save or Retrieve the data in Rstudio

Sir, I am a student learning R. I have a question about how to store data in R, and how to retrieve data that has been erased.
Sir,
Using RStudio is not much different than using, say, Word or Notepad, but with some differences.
First the similarities:
If you do not save your R script or data, it might not be available after you restart RStudio, or if you overwrite/erase your data.
The advantage of using R and RStudio is that you can script how you load and manipulate your data, and hence recreate the data, provided you use a script and do not rely only on the (interactive) console.
As for the differences, RStudio can be set to save your current workspace. This is where all loaded data and variables reside. To change the settings, go to "Tools" --> "Global options", where you should see the workspace options.
However, if you erase your data by overwriting it with other values or by removing it with rm(), the data is lost. Your only recourse is to retrace how it was loaded/modified, using either your script or the "History" tab.
For saving data, see e.g. http://www.sthda.com/english/wiki/saving-data-into-r-data-format-rds-and-rdata. Note the difference between save and saveRDS: the former saves objects together with their variable names, whereas saveRDS saves a single object without its name, so the result of readRDS must be assigned to a variable.
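The difference between save and saveRDS can be sketched as follows. The data frame and the temporary file paths are invented for the example.

```r
scores <- data.frame(student = c("A", "B"), mark = c(71, 88))
original <- scores  # kept only so the restored copies can be compared

rdata_path <- tempfile(fileext = ".RData")
rds_path <- tempfile(fileext = ".rds")

save(scores, file = rdata_path)  # stores the object together with its name
saveRDS(scores, rds_path)        # stores the object only

rm(scores)  # simulate losing the data

# load() restores the object under its original name and
# returns the names of the objects it restored
loaded_names <- load(rdata_path)
print(loaded_names)

# readRDS() returns the object, so it must be assigned to a variable
renamed <- readRDS(rds_path)
print(identical(scores, renamed))
```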

How to output a list of dataframes, which is able to be used by another user

I have a list whose elements are several dataframes, which looks like this
It is hard for another user to obtain these data by re-running my original code, so I would like to export the list. As the graph shows, the dataframes in that list have different numbers of rows. I am wondering if there is any method to export the list as a file without losing any information, so that it can be used in RStudio. I have tried to save it as RData, but I don't know how to save the information.
Thanks a lot
To output objects in R, here are 4 common methods:
1. dput() writes a text representation of an R object
This is very convenient if you want to allow someone to get your object by copying and pasting text (for instance on this site), without having to email or upload and download a file. The downside, however, is that the output is long, and re-reading the object into R (simply by assigning the copied text to an object) can hang R for large objects. This works best for creating reproducible examples. For a list of data frames, this would not be a very good option.
2. write.table(), write.csv(), readr::write_csv(), xlsx::write.xlsx(), etc. write an object to a .csv, .xlsx, etc. file
While the file can then be used by other software (and re-imported into R with read.csv(), readr::read_csv(), readxl::read_excel(), etc.), the data can be transformed in the process (column types, for example), and some objects cannot be written to a single file without prior modification. So this is not ideal in your case either.
3. save.image() saves your entire workspace (objects + environment)
The workspace can then be recreated with load(). This can be useful, but here you are only interested in saving one object. In that case, it is preferable to use:
4. saveRDS() writes a single object to a file
The object can then be re-created with readRDS(). This is the best option for saving an R object to a file without any modification and then re-creating it.
In your situation, this is definitely the best solution.
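As a rough sketch of that last option, assuming a list shaped like the one in the question (the list, its element names, and the temporary file path below are invented stand-ins):

```r
# A made-up list of data frames with different numbers of rows
results <- list(
  small = data.frame(a = 1:2),
  large = data.frame(a = 1:5, b = letters[1:5])
)

path <- tempfile(fileext = ".rds")  # illustrative path; send this file to the other user
saveRDS(results, path)

# On the other user's side
restored <- readRDS(path)
print(sapply(restored, nrow))        # row counts of each element
print(identical(results, restored))  # checks that the round trip changed nothing
```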

Command to easily share the contents of an R dataframe

I have a dataframe loaded successfully in R.
I would like to give the data of df to someone else so they can use it quickly and easily, without needing to load the file into a df again.
Which command gives the whole data of df (not the str())?
You can save the object to an .RData file using save or save.image, depending on your needs. The first saves specific objects, while the latter dumps the whole workspace to a file. This method has the advantage of working on probably any R object.
Another option, as @user1945827 mentioned, is dput, which produces a text representation that can be parsed in another R session. This will not work for complex (like S4) objects.
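A small sketch of the dput route, assuming a simple data frame (the contents are invented). Here the text is captured into a string to mimic what the other person would paste; evaluating it rebuilds the object.

```r
df <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)

# Text representation that could be pasted into an email or a forum post
txt <- paste(capture.output(dput(df)), collapse = "\n")
cat(txt, "\n")

# In the other R session: evaluate the pasted text to rebuild the object
df_copy <- eval(parse(text = txt))
print(all.equal(df, df_copy))
```

For anything larger than a small example, writing to a file with save or saveRDS is likely to be more practical than pasting text.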

R load script objects to workspace

This is a rookie question that I cannot seem to figure out. Say you've built an R script that manipulates a few data frames. You run the script, it prints out the result. All is well. How is it possible to load objects created within the script to be used in the workspace? For example, say the script creates data frame df1. How can we access that in the workspace? Thanks.
Here is the script...simple function just reads a csv file and computes diff between columns 2 and 3...basically I would like to access spdat in workspace
mspreaddata<-function(filename){
# read csv file
rdat<-read.csv(filename,header=T,sep=",")
# compute spread value column 2-3
spdat$sp<-rdat[,2]-rdat[,3]
}
You should use the source function.
i.e. use source("script.R")
EDIT:
Check the documentation for further details. It'll run the script you call. The objects will then be in your workspace.
Alternatively you can save those objects using save and then load them using load.
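To illustrate source(), here is a self-contained sketch that writes a tiny throwaway script to a temporary file and then sources it. The script contents and the name df1 are placeholders for your own script.

```r
# Create a small stand-in script (in practice this file already exists)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "df1 <- data.frame(x = 1:3, y = c(2, 4, 6))",
  "df1$diff <- df1$y - df1$x"
), script_path)

# Run the script; objects it creates at top level end up in the workspace
source(script_path)
print(exists("df1"))
print(df1)
```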
So when you source that, is the function mspreaddata not available in your workspace? Note that sourcing the script only defines the function; it does not run it, so spdat is never created. Any object created inside the function exists only within that function and not in any environment outside it. You should add something like
newObject <- mspreaddata("filename.csv")
Then you can access newObject
EDIT:
It is also the case that spdat is not created in your function so the call to spdat$sp<-rdat[,2]-rdat[,3] is itself incorrect. Simply use return(rdat[,2]-rdat[,3]) instead.
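Putting both points together, one possible corrected version looks like this. It is a variation on the suggestion above: instead of returning only the difference, it adds the spread as a column and returns the whole data frame. The sample CSV and its column names are invented so the example runs on its own.

```r
mspreaddata <- function(filename) {
  # read csv file
  rdat <- read.csv(filename, header = TRUE, sep = ",")
  # compute spread value: column 2 minus column 3
  rdat$sp <- rdat[, 2] - rdat[, 3]
  # return the data frame so the caller can assign it
  rdat
}

# Made-up input file for illustration
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(date = c("d1", "d2"), bid = c(10, 12), ask = c(9, 10.5)),
  csv_path,
  row.names = FALSE
)

# The result only exists in the workspace once it is assigned
spdat <- mspreaddata(csv_path)
print(spdat)
```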

Resources