Schedule R script using cron - r

I am trying to schedule my R script using cron, but it is not working. It seems R cannot find its packages when run from cron. Can anyone help me? Thanks.
The following is my bash script:
# source my profile
. /home/winie/.profile
# script.R will load packages
R CMD BATCH /home/script.R

Consider these tips
Use Rscript (or littler) rather than R CMD BATCH
Make sure the cron job is running as you
Make sure the script runs by itself
Test it a few times in verbose mode
My box is running the somewhat visible CRANberries via a cronjob calling an R script (which I execute via littler, but Rscript should work just as well). For this, the entry in /etc/crontab on my Ubuntu server is
# every few hours, run cranberries
16 */3 * * * edd cd /home/edd/cranberries && ./cranberries.r
so at sixteen minutes past every third hour, a shell command is run under my user id. It changes into the working directory and calls the R script (which has its executable mode set, etc.).
Looking at this, I could actually just run the script and have a setwd() command in it...
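A minimal sketch of the same pattern (change into the working directory, then run the script) as a small shell function that also appends the output to a log file. The function name run_r_job, the RSCRIPT override, and the log path are illustrative assumptions, not part of the setup described above.

```shell
#!/bin/sh
# run_r_job DIR SCRIPT LOG
# Change into DIR, run SCRIPT with Rscript, and append stdout and stderr to LOG.
# RSCRIPT may be set to an absolute path; otherwise Rscript is looked up on PATH.
run_r_job() {
    (
        # cron usually does not start in the project directory, so cd explicitly
        cd "$1" || exit 1
        "${RSCRIPT:-Rscript}" "$2" >> "$3" 2>&1
    )
}

# Example call (the log path is hypothetical):
# run_r_job /home/edd/cranberries cranberries.r /tmp/cranberries.log
```

The subshell keeps the cd from affecting the caller, and the log makes it possible to see why a job failed when nothing is attached to a terminal.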

Related

Close and reopen + run an R script from itself on Mac

I would like to write an R script that can close and reopen + run itself.
This is needed for an API query that I am trying to make and which seems to require me going through these steps once every hour to be able to make additional requests. I tried to use the source() function - and simply run my script from itself every hour- but with this the API keeps rejecting additional requests; it seems that actually closing and opening the program is necessary.
I also tried to use the system() command - as described here to actually open R and execute the script - but I was not able to figure out how to implement this in a Mac environment.
Would you have any suggestions on how to do this?
The usual way to run a script every x amount of time on a Unix system is a cronjob. I'm not familiar with macOS, but apparently it works just as on Linux.
Open the job list to edit it (you can of course use another editor instead of nano)
env EDITOR=nano crontab -e
Add a line/job to the file. This runs any command line command. You can use Rscript to run an R script.
0 * * * * Rscript "/path/to/your/script.R"
Save and exit. This script should now run every hour.
If you want to change the timing, check out crontab.guru. If the cronjob reports Rscript as missing, check out this answer.
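If Rscript is reported as missing, one likely cause is that cron runs with a much shorter PATH than an interactive shell. A rough sketch of a workaround is to write the absolute path of Rscript into the crontab line. The helper name make_cron_line below is made up for illustration; it only prints the line, which you would then paste in via crontab -e.

```shell
#!/bin/sh
# make_cron_line SCHEDULE SCRIPT [RSCRIPT_PATH]
# Print a crontab line that calls Rscript by absolute path.
# If RSCRIPT_PATH is omitted, the path is looked up in the current shell.
make_cron_line() {
    schedule=$1
    script=$2
    rscript=${3:-$(command -v Rscript)}
    printf '%s %s "%s"\n' "$schedule" "$rscript" "$script"
}

# Example (run in a terminal where Rscript works):
# make_cron_line '0 * * * *' /path/to/your/script.R
```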

Different behavior when Rscript & rstan is run as a cron job

I try to run an R script at regular intervals to update a webpage. The script runs fine when called from the terminal like this:
/usr/local/bin/Rscript /Users/me/path/myscript.R
However, if I try running it as a cron job, I get an error. I add the job to crontab like this:
46 10 * * * /usr/local/bin/Rscript '/Users/me/path/myscript.R' >> '/Users/me/path/mylog.log' 2>&1
The script does run in R, but aborts due to an error. Specifically, I fit some models using rstan, and get an initialization error. (The error only applies to some models, while others still run fine.) The initialization values are valid by definition, but do not seem to be used properly. It is like rstan is doing math differently (and wrong) when it is run through cron.
The session info from R is identical whether I run the script in the terminal or as a cron job. My question is what else might still differ depending on how the script is run. Could rstan be using a different version of C++ when run as a cron job? Are there other paths I may need to set to get this to work correctly?
Update: The script also works if I run it using R CMD BATCH in the terminal, but not if I use R CMD BATCH in a cron job. Using launchd triggers the same issue. I also tried using CmdStan through cmdstanr, and the same thing happens: it runs fine until added to a cron job.
Edit 2: The models I thought ran fine in cron, were not actually fine. The results were wrong, until I used the fix explained below.
It looks like I finally managed to fix this, and I'm posting my solution here for anyone who encounters the same problem.
I ran env in terminal to see my current user environment. I copy-pasted the full output to the top of my crontab file. (Simply adding the PATH variable was not sufficient. I suppose it was SHELL or perhaps both PATH and SHELL that did the trick, but I haven't explored this further.)
To edit my user's crontab, I ran crontab -e, then pressed i to edit the file, pasted everything from env at the top of the file, stopped editing by pressing ctrl + c, and quit by typing :wq and hitting enter.
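To narrow down which variables matter instead of pasting everything, one option is to dump the environment once from the terminal and once from a temporary cron job, then compare the two files. This is only a sketch: the function name vars_missing_in_cron and the file names are assumptions, and it compares variable names only, so variables whose values span several lines will produce stray entries.

```shell
#!/bin/sh
# Collect the two dumps first (file names are hypothetical):
#   in the terminal:      env > /tmp/shell_env.txt
#   temporary cron entry: * * * * * env > /tmp/cron_env.txt

# vars_missing_in_cron SHELL_ENV CRON_ENV
# Print the names of variables present in the terminal dump but absent from the cron dump.
vars_missing_in_cron() {
    shell_names=$(mktemp)
    cron_names=$(mktemp)
    # keep only the part before the first "=" on each line
    cut -d= -f1 "$1" | sort -u > "$shell_names"
    cut -d= -f1 "$2" | sort -u > "$cron_names"
    # comm -23 prints lines that appear only in the first file
    comm -23 "$shell_names" "$cron_names"
    rm -f "$shell_names" "$cron_names"
}

# Example:
# vars_missing_in_cron /tmp/shell_env.txt /tmp/cron_env.txt
```

Variables that exist in both dumps but with different values (PATH and SHELL are the usual suspects) can be checked with a plain diff of the two sorted files.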

Use crontab to automate R script

I am attempting to automate an R script using RStudio Server on an EC2 machine.
The R script is working without errors. I then navigated to the terminal on RStudio Server and attempted to run the R script using the command - Rscript "Rfilename" and it works.
At this point I created a shell script and placed the command above for running the R script in there. This shell command is also running fine - sh "shellfilename"
But when I try to schedule this shell command using crontab, it does not produce any result. I am using the following cron entry :
* * * * * /usr/bin/sh ./shellfilename.sh
I am using cron for the first time and need help debugging what is going wrong. My intuition is that there is a difference between the environment used when I run the command in the terminal and the one used when it runs from crontab. In case it is relevant: I am doing all of this on a user account created for myself on this machine, so it would differ from the admin account.
Can someone help resolve this issue? Thanks!
The issue arose due to relative paths used in the script for importing files and objects. Changing this to absolute path resolved the described issue.
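If rewriting every path is impractical, a common alternative is to have the wrapper script change into its own directory first, so that relative paths inside the R script resolve the same way as in the terminal. A minimal sketch, assuming the wrapper and the R script live in the same directory; the helper name abs_dir and the directory in the crontab comment are made up for illustration.

```shell
#!/bin/sh
# abs_dir FILE
# Print the absolute, symlink-resolved directory that contains FILE.
abs_dir() {
    (cd "$(dirname "$1")" && pwd -P)
}

# Intended use at the top of the wrapper script:
#   cd "$(abs_dir "$0")" || exit 1
#   Rscript Rfilename
# The crontab entry should then call the wrapper by absolute path as well, e.g.
#   * * * * * /usr/bin/sh /home/youruser/project/shellfilename.sh
```

Note that the ./shellfilename.sh in the original cron entry is itself a relative path, so it is resolved against whatever directory cron starts the job in.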

Automating R to run scripts

I'm basically looking for any way to automatically run R scripts just as if I were copying and pasting them into the console. I've tried the package 'taskscheduleR', however it just seems to output to a log file in the directory, which isn't the same as running the script inside the RStudio application.
For example, say I want to get the last closing prices of 5 stocks each night: the script would run in RStudio and leave the variables there, and all of the code would be in the script file.
Any thoughts?
I would suggest the built-in Task Scheduler application if you are using Windows.
Create a task that will run a batch file. This batch file has only one line, which executes the R script you want. Set it to run each night (or at whatever time you want).
I am not that well-versed in Linux and macOS, but here's what I know:
Linux has cron. Add a job to crontab with your preferred timing and execute your script 'path/to/bin/r /path/to/script.r'
MacOS has Automator + iCal (for scheduling). It also has crontab like Linux.

Problems scheduling an R Script and saving data to the wd

I've just set up an R script to run on my Windows machine - I'm trying to have it save a dataframe on the working directory (which I know from getwd()).
I can see from the Task Scheduler that the script must be running, as it saves the last run time. However, when I check the wd for new time stamps on the dataframes I'm trying to save, they haven't updated. (I save over them each time, or at least that's what I'd like to do; I manually saved them in there to start with.)
I'm using this on the scheduler:
"C:\Program Files\R\R-2.13.1\bin\R.exe" CMD BATCH --vanilla --slave "C:\my projects\my_script.R"
That appears to be working, but can anyone offer a reason as to why the script that I call doesn't seem to be saving my NEW DF to the wd? I'm using this command to save the DF:
write.table(m23,file="m23.csv",sep=",",row.names=F)
so DF m23 should get updated everyday in the wd when the scheduler calls the script at 6am?
Paul.
Are you sure you know what the current working directory is when the scheduler runs the script? My guess is it might not be what you think it is. I would look at the answers to this question: Rscript: Determine path of the executing script, especially the suggestion about using commandArgs to figure out where you are. Or you can explicitly set the working directory in your script with setwd().
