Batch R Script - setting the working directory and selecting output folder - r

I've been digging into several places for 2 simple needs but couldn't find a final answer.
I'm running an R script in batch mode. Not sure whether my solution is the best one but I'm using R CMD BATCH as per http://stat.ethz.ch/R-manual/R-patched/library/utils/html/BATCH.html included in a bat file.
First I'd like to have the directory where the R script is saved set up as the working directory rather than where the bat file is saved.
Secondly I'd like to divert all the output from the R script (csv files and charts) to a specific directory other than the working directory. I cannot find any options for such basic requirement.
The final idea is to be able to run the bat file across different computers no matter where the R script is saved.
Thanks

You don't give code so my answer will be just an advise or what I would do for such job.
Use Rscript.exe it is the way to go for batch script. R CMD is a sort of legacy tool.
You don't need to set or change the working directory. It is a source of problems
You can launch you bat file where you want and within it you go to R script location using cd for example you bat file can be like:
cd R_SCRIPT_PATH
Rscript youscript.R arg1 arg2
You can use one of the script argument as an output directory for your result files. For example inside your script you do somehing like this:
args <- commandArgs(trailingOnly = TRUE)
resultpath <- as.character(args[1])
.....
write.table(res1, file=paste(resultpath,'res1.csv',sep='/')

Related

How to run R projects / use their relative paths from the terminal without setwd() resp. cd

I'm kinda lost on that one:
I have set up an R project, let's call it "Test Project.Rproj". The beauty of R projects is the possibility to use relative paths (relative to the .Rproj file). My project consists of a "main.R" script, which is saved on the same level as the .Rproj file.
Additionally I have a directory called 'Output', where I want my plots and exported data to be saved. My "main.R" file looks like the following:
my_df <- data.frame(A = 1:10, B = 11:20)
my_df |>
writexl::write_xlsx(here::here("Output",
paste0("my_df_",
stringr::str_replace_all(as.character(Sys.time()), ":", ""),
".xlsx")))
My final goal is to automate the execution of the 'main.R' file using the Windows Task Scheduler. But in order to do so, I have to be able to run the script from the terminal. The problem here is the working directory. When opening an R project, all the paths are relative to .Rproj file. But in the terminal the current working directory is <C:\Users\my_name>. Of course I could manually set the working directory via cd "path\to\my\project. But I would like to avoid that.
My current call for the execution of the main.R file in the terminal is the following:
"C:\Program Files\R\R-4.1.0\bin\Rscript" -e "source('C:/Users/my_name/path/to/my/project/main.R')"
My two ideas for a solution are the following, but I am happy for other suggestions as well.
In order to replicate the usual use of a project: Is there a way to execute the .Rproj
file from the terminal? In order to create a similar environment as in RStudio, where all the relative paths are working, when executing scripts from the project afterwards?
There are two packages adressing the problem of relative paths: rprojroot and here, where the former is the basis for the latter. I am pretty sure that here does not provide the needed functionality. I tried adding here::i_am("main.R) to my main.R file, but the project root directory still is not found when executing in the terminal from a working directory outside the project.
For rprojroot to work, I think it is also necessary to have your current working directory somewhere within the project. But this package offers a lot of functionality, so I am not sure wheter I am overlooking something.
So I would be happy about any help. Maybe it is impossible and I have to change the working directory manually - then I would be glad to know that as well.
Some links I used in my research:
https://www.tidyverse.org/blog/2017/12/workflow-vs-script/
https://malco.io/2018/11/05/why-should-i-use-the-here-package-when-i-m-already-using-projects/
http://jenrichmond.rbind.io/post/how-to-use-the-here-package/
Thanks a lot!
Edit: My current implementation is an additional R script, where I manually set the working directory via setwd() and source the main.R file. However it is always suggested to avoid setwd, which is why this whole question exists.

How to "source" (not run line by line) an R script from the Windows command line

I am creating an R script myscript.R which manipulates an excel file by means of XLConnectpackage.
The point is it refers to several external files: the excel file itself and another R file (functions library), so I need to set a working directory in the location of the script (so that relative paths to external files work properly).
I am using the following in my script
new_wd <- dirname(sys.frame(1)$ofile)
setwd(new_wd)
When I source the script from my RStudio it gets the job done. The problem is that the script is to be used by non-programmers, non-Rstutio-users, so I create .bat file (which I want to turn into an .exe one)
"C:\Program Files\R\R-4.0.3\bin\Rscript.exe" "C:\my\path\to\myscript.R"
It executes the script line by line but sys.frame(1) only works when sourcing.
How could I solve it?
Thanx
I have found a solution and it works properly.
From CMD command line or from a .bat file one can add an argument -e to the command, so that you can use a R language.
absolute\path\to\Rscript.exe -e "source('"relative\path\to\myscript.R"')"
It worked for me.
Besides, as Compo commented, I think there's no need for a .exe file, since a .bat does the job.

How to specify where cronR saves output file

I have successfully run a simple cronR tutorial using the cronR add in within R Studio. Here is the following code that I have saved in a R script file:
library(glue)
current_time <- Sys.time()
print(current_time)
msg <- glue::glue("This is a test I am running at {current_time}.")
cat(msg, file = "test.txt")
The R script file is saved within a specific project directory. The log file when the job runs is also saved there. However, the output of the script file, test.txt, is saved in my home directory. I am on a Mac. In the tutorial, it was stated that this would happen, that cron would save any output in the home directory and that if I want to change the location that I have to "specify otherwise". However, the tutorial gives no instructions for how to do this and I am not sure if I am supposed to do this through the terminal in mac and if so how? Changing the file path in the script file (e.g. Documents/test.txt) changes nothing, as the test.txt file is still saved in the home drive. I suspect I have to make this change somewhere else but I am not sure where. Any help would be appreciated.
For anyone who runs into the same issue, I was able to solve my problem by using launchD. I followed this tutorial https://babichmorrowc.github.io/post/launchd-jobs/. It worked as anticipated and now my .txt file is saved to the correct place. There is also a tutorial there for cron jobs, though if you are on a mac, launchD is apparently the preferred method.

Same R history from different workspace

I am new to R and I just figured out why my history did not contain all my previous commands. R create a .Rhistory file in each working directory.
I often change working directory and I would like to have the history of all my past sessions in the same file. Is there a simple way to do that ?
Thanks.
(I am on Mac OS 10.6 and I use Rstudio)
An easy way would be to manually save your history like this:
savehistory(file = "~/.Rhistory")
and then load it when you open an R command session:
loadhistory(file = "~/.Rhistory")
Otherwise you can edit your 'Rprofile.site' and add savehistory() and loadhistory() to the functions .Last and .First respectively.
More info about Rprofile.site: here
At startup, R will source the Rprofile.site file. It will then look for a .Rprofile file to source in the current working directory. If it doesn't find it, it will look for one in the user's home directory. There are two special functions you can place in these files. .First( ) will be run at the start of the R session and .Last( ) will be run at the end of the session.

Set working directory to mapped network drive in BATCH mode

I'm having issues on windows with R failing when changing the working directory to a mapped network drive (e.g. \Share\Folder mapped to Z:) in batch mode. If I run the same script in an interactive console I don't have any issues. I am accomplishing this by running R.exe with the script specified inside of a windows batch (.bat) file. The .bat file contains the following.
"C:\RRO\R-3.2.1\bin\R.exe" CMD BATCH "C:/Scripts/Rscript.R"
The error is simply...
> setwd( 'Z:/' )
Error in setwd("Z:/") : cannot change working directory
I'd be open to a different approach entirely for scheduling these scripts via the windows task scheduler if that helps avoid the issue. The reason for mapping the drive is that I need to supply some credentials in order to access it, which is done automatically when it is mapped, but can test to see if that's not the case in R if anyone knows how.
I hope this can help with your question.
I duplicated the problem with no errors by using Rscript command instead of a CMD BATCH
my R code which I saved as a script (test1.R)
library(openxlsx)
setwd("P:/Records/Indexing Operations/Indexing Data Analysis/Daily Reports")
my.data = read.xlsx("FSI Daily Project Status Report - 18 Mar 2016.xlsx", sheet = 1)
setwd("C:/Users/golieth/Documents/")
png(filename = "test.png", width = 500, height = 350 )
plot(my.data$Total.Images, my.data$Completed.Images.A,
main = Sys.time())
dev.off()
Note I change the directory 2 times in this file. Once to access data on a mapped network drive and a 2nd to save the image to the computer. I put a timestamp of the current time as the main plot title so you can run the batch file repeatedly and verify it works
my batch file
cd C:\Program Files\R\R-3.2.3\bin\i386
Rscript C:\Users\golieth\Documents\test1.R
Note: On the batch file if your code relies on 32 bit you need to change the directory of your R program (cd) to the R 32bit program. Same with R64. Next the Rscript should reference where you have saved your .R file
Finally, and this might be stating the obvious but make sure you are connected to your VPN before running the batch file.
Imagine a batch file with
cd Z:\<Destination>
Z:
RScript "C:/Scripts/Rscript.R"
This will enable Windows to change to the directory with all credentials and then start R within that directory. So the working dir. is the location from where R is started. Doing so requires that "C:\RRO\R-3.2.1\bin\" is part of your PATH variable.
Good luck!
When writing a .bat file, remember that cd is not used to change drive letters. To change drive letters you simply enter the name of the drive letter, which should be done prior to issuing the final cd to the working directory.
Like this:
sample.bat
z:
cd z:\your\working\directory\
C:\RRO\R-3.2.1\bin\Rscript.exe C:/Scripts/Rscript.R
You can save the files locally in your code, and use file.copy in your code to copy the files over to your network drive. Also try replacing the path in file.copy the network drive letter by the full network address name eg. \\....\.....\

Resources