Rscript logfile with input, errors and output - r

Is there a way to fully log (stdin, stdout, stderr) an R session executed via Rscript in batch mode?
I know that I can use R CMD BATCH to log the full session of a batch job. I also know how to write errors and output into a logfile using Rscript, but since it is often said that R CMD BATCH is a relic from the old days and one should use Rscript, I was wondering if it can also log everything.
I suppose it is impossible but I am not sure - neither do I know why.
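For the stdout and stderr part of the question, a minimal shell sketch could look like the following. The helper name run_logged and the file names are hypothetical. It only captures what the command writes; to also get the input echoed, one option (an assumption on my side, not something confirmed in this thread) is to put options(echo = TRUE) at the top of the R script, since Rscript suppresses command echo by default.

```shell
#!/bin/sh
# Hypothetical helper: run any command, append its stdout and stderr
# to a single log file, and return the command's own exit status.
run_logged() {
    log=$1
    shift
    "$@" >> "$log" 2>&1
}

# Example use (assumes Rscript is on PATH and script.R exists):
#   run_logged session.log Rscript script.R
```

Because the redirect is the last command in the function, a failing R script still yields a non-zero exit status for the calling job.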

Related

Close and reopen + run an R script from itself on Mac

I would like to write an R script that can close and reopen + run itself.
This is needed for an API query that I am trying to make, which seems to require me to go through these steps once every hour to be able to make additional requests. I tried to use the source() function and simply run my script from itself every hour, but with this the API keeps rejecting additional requests; it seems that actually closing and reopening the program is necessary.
I also tried to use the system() command - as described here to actually open R and execute the script - but I was not able to figure out how to implement this in a Mac environment.
Would you have any suggestions on how to do this?
The usual way to run a script every x amount of time on a Unix system is a cronjob. I'm not familiar with macOS, but apparently it works just as on Linux.
Open the job list to edit it (you can of course use another editor instead of nano)
env EDITOR=nano crontab -e
Add a line/job to the file. This runs any command line command. You can use Rscript to run an R script.
0 * * * * Rscript "/path/to/your/script.R"
Save and exit. The script should now run every hour.
If you want to change the timing, check out crontab.guru. If the cronjob reports Rscript to be missing, check out this answer.
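Since cron usually runs jobs with a much shorter PATH than an interactive terminal, a missing Rscript can often be avoided by writing its absolute path into the job. The sketch below only prints a candidate crontab line; the helper name cron_line and the paths are placeholders, and the log redirect is an optional addition.

```shell
#!/bin/sh
# Hypothetical helper: print an hourly crontab entry that calls Rscript
# by absolute path and appends stdout and stderr to a log file.
cron_line() {
    rscript_bin=$1   # absolute path, e.g. the output of: command -v Rscript
    script=$2        # R script to run
    log=$3           # log file to append to
    printf '0 * * * * "%s" "%s" >> "%s" 2>&1\n' "$rscript_bin" "$script" "$log"
}

# Placeholder paths; replace them with your own before pasting the
# printed line into `crontab -e`.
cron_line /usr/local/bin/Rscript /path/to/your/script.R /path/to/your/script.log
```

Running `command -v Rscript` in a terminal shows which path to pass as the first argument.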

Different behavior when Rscript & rstan is run as a cron job

I try to run an R script at regular intervals to update a webpage. The script runs fine when called from the terminal like this:
/usr/local/bin/Rscript /Users/me/path/myscript.R
However, if I try running it as a cron job, I get an error. I add the job to crontab like this:
46 10 * * * /usr/local/bin/Rscript '/Users/me/path/myscript.R' >> '/Users/me/path/mylog.log' 2>&1
The script does run in R, but aborts due to an error. Specifically, I fit some models using rstan, and get an initialization error. (The error only applies to some models, while others still run fine.) The initialization values are valid by definition, but do not seem to be used properly. It is like rstan is doing math differently (and wrong) when it is run through cron.
The session info from R is identical whether I run the script in the terminal or as a cron job. My question is what else might still differ depending on how the script is run. Could rstan be using a different version of C++ when run as a cron job? Are there other paths I may need to set to get this to work correctly?
Update: The script also works if I run it using R CMD BATCH in terminal, but not if I use R CMD BATCH in a cron job. Using launchd triggers the same issue. I also tried using CmdStan through cmdstanr, and the same thing happens: it runs fine until added to a cron job.
Edit 2: The models I thought ran fine in cron were not actually fine. The results were wrong until I used the fix explained below.
It looks like I finally managed to fix this, and I'm posting my solution here for anyone who encounters the same problem.
I ran env in terminal to see my current user environment. I copy-pasted the full output to the top of my crontab file. (Simply adding the PATH variable was not sufficient. I suppose it was SHELL or perhaps both PATH and SHELL that did the trick, but I haven't explored this further.)
To edit my user's crontab, I ran crontab -e, then pressed i to edit the file, pasted everything from env at the top of the file, stopped editing by pressing ctrl + c, and quit by typing :wq and hitting enter.
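To narrow down which variables actually matter, one could compare the terminal environment with the one cron provides instead of pasting everything. The following is a rough sketch: the function name env_missing_from and the file names are hypothetical, and it assumes each variable fits on one line of env output.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of the first env dump that do not
# appear verbatim in the second one (variables that are missing or have
# a different value there).
env_missing_from() {
    grep -Fxv -f "$2" "$1"
}

# Example use:
#   env > terminal_env.txt                     # run in the terminal
#   * * * * * env > /tmp/cron_env.txt          # temporary crontab line
#   env_missing_from terminal_env.txt /tmp/cron_env.txt
```

The printed NAME=value lines are the candidates to copy to the top of the crontab file.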

R.exe and Rscript.exe

I'm new to programming and I'm confused on the difference between the two. I have googled this and I am still confused on the difference after reading the responses.
Part of the reason I am confused is that I am thinking in terms of running scripts from batch files. For instance, let's say I have a script in R and I create a batch file that runs the script using R.exe. When I put this in the command prompt and run the batch file, it just takes the script I made and runs it in the console of R, right?
I've seen that you can run batch files using Rscript.exe, which confuses me: if I take an R script I made and put it into the script part of R (above the console), how would this do anything, since the script must be put into the console for it to run? (Unless Rscript.exe runs whatever is in the script part of R?)
If anyone could please explain how this all works to me, I would greatly appreciate it.
Thanks!
First, some terminology: even though the concept of batch processing is generic, and it means unassisted execution, the term batch file is usually reserved for MS-Windows files processed by cmd.exe, MS-Windows traditional script files. The term used for files containing R commands is usually R scripts, or Rscripts.
That said, please consider the following simple R script, named HelloFriend.R:
my.name <- readline(prompt="Enter name: ")
print(paste("Hello, ", my.name, "!"))
When run directly in R console, as
> source('HelloFriend.R')
it will show the output
Enter name:
If the user types Some Name and hits Enter, the program will output
[1] "Hello, Some Name !"
If it's run in the command line as R --no-save --quiet < HelloFriend.R, it will generate the output
> my.name <- readline(prompt="Enter name: ")
Enter name:
> print(paste("Hello, ", my.name, "!"))
[1] "Hello, !"
>
And finally, if run with Rscript --vanilla HelloFriend.R, it will generate the output
Enter name:
[1] "Hello, !"
In other words, when run inside the R console, the user input will be expected. When run under R, but in the command line, the program will not give the user the opportunity to type anything, but the command echo will be shown.
And finally, under Rscript, the user input will also not be expected, but the command echo will not be shown.
Rscript is the preferred form of running R scripts, as its name suggests. The passing of R scripts in the command line to R via redirection also gives batch processing but will echo the commands executed. Therefore it can help debug code, but it's not the preferred way of executing production code.
The analogy with batch files is a good one. R.exe is for interacting with the language, entering one statement at a time, and evaluating the results before entering the next statement. Rscript.exe is for running an existing script (file) containing R commands. You generally invoke Rscript.exe with the name of the script.
Running Rscript.exe my_script.R from the command-line is sort of like running
source("my_script.R")
q("no")
from the R console.

R script to Task Scheduler

I have an R script that gets data from databases on another server and brings it into my database. I have it saved as "dataimport.R"
I followed a few answers from here and from other websites and created a batch file like this:
"C:\Program Files\R\R-3.4.0\bin\R.exe" CMD BATCH --vanilla --slave "C:\dataimport.R"
This is not working. The cmd window opens up, but the tables are not recreated and I don't get any error. I want to use the Task Scheduler to automate the process. Any ideas on how to fix this?
I kept at it and interestingly the answer to this was this:
"C:\Program Files\R\R-3.4.0\bin\R.exe" "C:\dataimport.R"
I don't know the reason for this, but as long as it works.
I've had loads of problems with this, but finally managed to get it to work. To be more comprehensive, here are some of the things I've tried (in case one of these works for others):
#echo off, R CMD BATCH C:\myfolder\script.R
R CMD BATCH C:\myfolder\script.R
Using the package taskscheduleR (somehow didn't work overnight)
Used the answer provided above ("C:\Program Files\R\R-3.4.0\bin\R.exe" "C:\dataimport.R")
and all kinds of variations and combinations of these (can't remember them all exactly).
What finally worked was:
Make an R script and save it (C:\myfolder\Test.R for example)
In Notepad, create a batch file (saved as C:\myfolder\Test.bat, the file referenced in the Actions step) containing: "C:\Program Files\R\R-3.5.2\bin\x64\R.exe" CMD BATCH "C:\myfolder\Test.R" (also tried Rscript.exe, didn't work for me).
In Windows Task Scheduler (v1.0), choose 'Create task'.
Fill in the time triggers.
In Actions, make an action with Start a program and, in the Program/Script line, provide the location of your .bat script: C:\myfolder\Test.bat
In the Start in (optional) line, enter C:\myfolder\
Note: both your .bat file and .R script are in the "C:\myfolder" folder.

Run R script using batch file

I'm attempting to schedule an R script as mentioned in the Scheduling R Script Stack Overflow answer, but I keep getting an error that says
ARGUMENT 'BATCH' __ignored__
ARGUMENT 'C:/Users/cburton/AutomaticJobPuller/jobsPuller.R' __ignored__
and then it has the startup prompt for the R programming environment.
The contents in my batch file are
C:\Users\cburton\Documents\R\R-3.2.0\bin\R.exe BATCH C:\Users\cburton\AutomaticJobPuller\jobsPuller.R --vanilla
This is my first batch file, so I'm sure it's something simple, but I can't find the answer to this.

Resources