How to Run an Rscript as a crontab? - r

I'm using a fedora system and trying to automate an Rscript on my system. So I figured out the process of getting a crontab running. However it isn't working for my Rscript.
So to test whether I can run this Rscript normally from my root (root user and ~ directory) , I'm running:
/bin/Rscript /home/computername/path_to_r_script/script.R > /root/Preprocessingv3.Rout
But this is unable to find or import libraries and hits an error at the first library being imported as:
Error in library(RForcecom) : there is no package called 'RForcecom'
Execution halted
This is strange as historically I have been running this script many times as from command line as R CMD BATCH script.R > script.Rout by cding into that path_to_r_script folder as a non root user. Unless I'm missing something very fundamental of how Linux/fedora works, there should be no reason why a script that works from one location from one user but doesn't work from a different location from root user. So I'm guessing I'm not calling it the right way.
Could any one help me understand how we are supposed to automate Rscripts (and to an extent python scripts) as crontab? And what I have to do get it running on my system?

Related

(Dagster) Schedule my_hourly_schedule was started from a location that can no longer be found

I'm getting the following Warning message when trying to start the dagster-daemon:
Schedule my_hourly_schedule was started from a location Scheduler that can no longer be found in the workspace, or has metadata that has changed since the schedule was started. You can turn off this schedule in the Dagit UI from the Status tab.
I'm trying to automate some pipelines with dagster and created a new project using dagster new-project Scheduler where "Scheduler" is my project.
This command, as expected, created a diretory with some hello_world files. Inside of it I put the dagster.yaml file with configuration for a PostgreDB to which I want to right the logs. The whole thing looks like this:
However, whenever I run dagster-daemon run from the directory where the workspace.yaml file is located, I get the message above. I tried runnning running the daemon from other folders, but it then complains that it can't find any workspace.yaml files.
I guess, I'm running into a "beginner mistake", but could anyone help me with this?
I appreciate any counsel.
One thing to note is that the dagster.yaml file will not do anything unless you've set your DAGSTER_HOME environment variable to point at the directory that this file lives.
That being said, I think what's going on here is that you don't have the Scheduler package installed into the python environment that you're running your dagster-daemon in.
To fix this, you can run pip install -e . in the Scheduler directory, although the README.md inside that directory has more specific instructions for working with virtualenvs.

Error that says Rscript is not recognized as an internal or external command, operable program or batch file [duplicate]

shell_exec("Rscript C:\R\R-3.2.2\bin\code.R ");
This is the call to script.On calling the above script, the error occurs.
I am trying to call my R script from the above path but no output is being shown. While checking the error logs of PHP, it says 'Rscript' is not recognized as an internal or external command, operable program or batch file.' The script is working fine on the Rstudio but not running on the command line.
Add the Rscript path to your environment variables in Windows:
Go to Control Panel\System and Security\System and click Advanced System Settings, then environment variables, click on path in the lower box, edit, add "C:\R\R-3.2.2\bin"
Restart everything. Should be good to go. Then you should be able to do
exec('Rscript PATH/TO/my_code.R')
instead of typing the full path to Rscript. Won't need the path to your my_code.R script if your php file is in the same directory.
You need to set the proper path where your RScript.exe program is located.
exec ("\"C:\\R\\R-3.2.2\\bin\\Rscript.exe\"
C:\\My_work\\R_scripts\\my_code.R my_args";
#my_args only needed if you script take `args`as input to run
other way is you declare header in your r script (my_code.r)
#!/usr/bin/Rscript
and call it from command line
./my_code.r
If you are running it in Git Bash terminal, you could follow a revised version of the idea suggested by #user5249203: in the first line of your file my_code.R, type the following
#!/c/R/R-3.2.2/bin/Rscript.exe
I assumed that your path to Rscript.exe is the one listed above C:\R\R-3.2.2\bin. For anyone having a different path to Rscript.exe in Windows, just modify the path-to-Rscript accordingly. After this modification of your R code, you could run it in the Git Bash terminal using path-to-the-code/mycode.R. I have tested it on my pc.
I faced the same problem while using r the first time in VS Code, just after installing the language package (CRAN).
I restart the application and everything worked perfectly. I think restarting would work for you as well.

Use crontab to automate R script

I am attempting to automate a R script using Rstudio Server on ec2 machine.
The R script is working without errors. I then navigated to the terminal on RStudio Sever and attempted to run the R script using the command - Rscript "Rfilename" and it works.
At this point I created a shell script and placed the command above for running the R script in there. This shell command is also running fine - sh "shellfilename"
But when I try to schedule this shell command using crontab, it does not produce any result. I am using the following cron entry :
* * * * * /usr/bin/sh ./shellfilename.sh
I am using cron for the first time and need help debug what is going wrong. My intuition is that there is there is difference in the environments used by the command when I run it on terminal and when I use the same in crontab. In case it is relevant information - am doing all of this on a user account created for myself on this machine so would differ from admin account.
Can someone help resolve this issue? Thanks!
The issue arose due to relative paths used in the script for importing files and objects. Changing this to absolute path resolved the described issue.

SparkR: source() other R files in an R Script when running spark-submit

I'm new to Spark and newer to R, and am trying to figure out how to 'include' other R-scripts when running spark-submit.
Say I have the following R script which "sources" another R script:
main.R
source("sub/fun.R")
mult(4, 2)
The second R script looks like this, which exists in a sub-directory "sub":
sub/fun.R
mult <- function(x, y) {
x*y
}
I can invoke this with Rscript and successfully get this to work.
Rscript file.R
[1] 8
However, I want to run this with Spark, and use spark-submit. When I run spark-submit, I need to be able to set the current working directory on the Spark workers to the directory which contains the main.R script, so that the Spark/R worker process will be able to find the "sourced" file in the "sub" subdirectory. (Note: I plan to have a shared filesystem between the Spark workers, so that all workers will have access to the files).
How can I set the current working directory that SparkR executes in such that it can discover any included (sourced) scripts?
Or, is there a flag/sparkconfig to spark-submit to set the current working directory of the worker process that I can point at the directory containing the R Scripts?
Or, does R have an environment variable that I can set to add an entry to the "R-PATH" (forgive me if no such thing exists in R)?
Or, am I able to use the --files flag to spark-submit to include these additional R-files, and if so, how?
Or is there generally a better way to include R scripts when run with spark-submit?
In summary, I'm looking for a way to include files with spark-submit and R.
Thanks for reading. Any thoughts are much appreciated.

Unable to run R script through .bat files in Windows Server

I'm trying to run a R script through a .bat file. When I run myself the commands line by line it works. But when I try to run the .bat file, it doesn't works.
This is the .bat file
cd "C:\Program Files\R\R-3.1.2\bin"
R CMD BATCH "C:\Users\Administrator\Downloads\testa_vps.R"
This is the R script
setwd('C:\Users\Administrator\Documents')
file.create('mycsv.csv')
I'm not an expert with Windows and generally try to stick to Unix-like systems for things like this, but I have found that using programs non-interactively (e.g. via .bat files) is usually less error-prone when you add the appropriate directories to your (user) PATH variable, rather than cding into the directory and calling the executable from within the .bat file. For example, among other things, my user PATH variable contains C:\PROGRA~1\R\R-3.0\bin\; - the directory that contains both R.exe and Rscript.exe - (where PROGRA~1 is an alias for Program Files which you can use in an unquoted file path, since there are no spaces in the name).
After you do this, you can check that your PATH modification was successful by typing Rscript in a new Command Prompt - it should print out usage information for Rscript rather than the typical xxx is not recognized as an internal or external command... error message.
In the directory C:\Users\russe_000\Desktop\Tempfiles, I created test_r_script.r, which contains
library(methods)
setwd("C:\Users\russe_000\Desktop\Tempfiles")
file.create("mycsv.csv")
and test_r.bat, which contains
Rscript --vanilla --no-save "C:\Users\russe_000\Desktop\Tempfiles\test_r_script.r"
Clicking on the Windows Batch File test_r ran the process successfully and produced mycsv.csv in the correct folder.
Before running test_r.bat:
After running test_r.bat:
I've never worked with a Windows server, but I don't see why the process would be fundamentally different than on a personal computer; you just may need your sysadmin to modify the PATH variable if you don't have sufficient privileges to alter environment variables.
As already suggested by #nrussel in the comments you should use RScript.exe for this.
Create a file launcher.bat with the following content:
cd C:\Users\Administrator\Documents
Rscript testa_vps.R
In addition, add C:\Program Files\R\R-[your R version]\bin\x64; or C:\Program Files\R\R-[your R version]\bin\i386to the System PATH variable in the Environment Variables menu depending if you run R on a 64-bit or 32-bit system.
I just tested the approach above successfully on a Windows Server 2008 64-bit system and mycsv.csv got created as expected.
EDIT
One important point I forgot to mention is the following: You need to specify the path in your R file in the setwd() call using \\ instead of \.
setwd('C:\\Users\\Administrator\\Documents')
Here is a screenshot of the successful run on the Windows 2008 server:
Note: I added cmd /k to the .bat file so that the cmd window stays open after clicking on the file.

Resources