How to use Windows Task Scheduler to execute an R script

I have found some ways to do this, e.g. the Windows Task Scheduler and an R package called taskscheduleR.
So far I can only execute a simple R script, like
cat(aa,file="ttt.txt",sep="\n",append=TRUE)
but I can't execute the program I really want to run.
I want to know:
If this R script needs other *.r files, what should I type on the command line? Or should I just put them into the same folder?
The picture shows the current situation.

According to the error message, the script is trying to install a package, and you don't have a CRAN mirror selected in your .Rprofile. The installation then fails because a mirror would have to be selected interactively. Either fix this by selecting a mirror, or remove or comment out the line in your script that installs a package from CRAN, and it should work fine.
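One way to select a mirror non-interactively is to set the repos option at the top of the script, before any install.packages() call. The following is a minimal sketch; the mirror URL (https://cloud.r-project.org) and the helper name ensure_package are choices made for illustration and do not come from the question.

```r
# Assumption: https://cloud.r-project.org is an acceptable CRAN mirror for you.
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Hypothetical helper: install a package only if it is not already available,
# so a scheduled run does not reinstall it on every execution.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

ensure_package("stats")  # ships with R, so nothing is downloaded here
```

The same options() line could instead go into .Rprofile, which is where the answer suggests the mirror is missing.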

Related

Arrow R package fails to install on Databricks

About 6 weeks ago (early April 2022), I had tested a Databricks workflow to ensure that I could trigger jobs on databricks remotely from Airflow, which was successful.
As part of the process, the workflow activates a pre-built compute and then loads various R libraries from DBFS into it. One of the packages is 'arrow'. While all the other packages load without issue, this package fails to load and then causes my workflow to crash.
When I look into the workflow I get the following error: 'DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: Command to install library [RCranPkgId(arrow,None,None)] on [0303-130414-840hkwxf] orgId [5132544506122561] failed inside Databricks infrastructure' (see image below).
I have triggered the workflow directly inside Databricks and still get the same problem, so clearly it does not have an Airflow-related cause.
I tried to delete the arrow package from DBFS to see if I could run the test without it, but every time I delete it, it returns when I retry the workflow.
I then checked CRAN to see if arrow had been updated recently. It had been, on 2022-05-09, so I loaded an older version instead (having first deleted everything relating to it from DBFS). This didn't work either (see images attached).
# Databricks notebook source
.libPaths()
# COMMAND ----------
dir("/databricks/spark/R/lib")
# COMMAND ----------
## Add current working directory to library paths
.libPaths(c(getwd(), .libPaths()))
# COMMAND ----------
## The latest versions from CRAN
install.packages(c('arrow', 'tidyverse', 'aws.s3', 'sparklyr', 'cluster', 'sqldf', 'lubridate', 'ChannelAttribution'), repos = "http://cran.us.r-project.org")
# COMMAND ----------
dir("/tmp/Rserv/conn970")
# COMMAND ----------
## Copy from the temporary install directory to the system site library
system("cp -R /tmp/Rserv/conn970 /usr/lib/R/site-library")
# COMMAND ----------
dir("/usr/lib/R/site-library")
# COMMAND ----------
## Copy from driver to DBFS
system("cp -R /tmp/Rserv/conn970 /dbfs/r-libraries")
# COMMAND ----------
dir("/dbfs/r-libraries")
# COMMAND ----------
# Add packages to libPaths
.libPaths("/dbfs/r-libraries")
# COMMAND ----------
# Check that the dbfs libraries are in libPath
.libPaths()
I have also attached the script I'm using in R to load the packages to DBFS. I think it is fine, as every other package loads properly, but it may be of use in understanding what I am doing or why the error occurs.
What I'd like to know is:
Do you see anything that I might be doing incorrectly inside my attached libraries script?
Is there an issue with loading the arrow package and if so do you have a work around that can prevent the failure?
Why does dbfs continue to re-install arrow despite my removal of it from the directory?
Can I permanently remove the arrow package from dbfs without it returning everytime I trigger the workflow?
Many thanks in advance for any help you guys can offer.
I figured it out; the issue had two aspects to it. The first was plain to see once I looked at the JSON file for the job. In it I had specified that certain packages should be loaded when building the compute, and this configuration overrode my model's script because it ran first. Seeing this in the JSON file explained why the arrow library kept trying to load even though I had removed it from DBFS.
The second part of the solution was how to get arrow to load without failing, which continued to happen when I tried to reinstall it on DBFS or independently in my model. By default Databricks seems to try to load packages from CRAN (https://cran.r-project.org/). I don't understand why it failed; maybe it was an issue with Ubuntu from the last update?
The solution was to install it from a snapshot I got from MRAN at the following location: 'https://cran.microsoft.com/snapshot/2022-02-24/'. Thank you user2554330, your comment got me rethinking and set me in the right direction.
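Installing from a fixed snapshot can be sketched roughly as follows. The snapshot URL is the one quoted above; the helper name install_from_snapshot and its lib argument are assumptions for illustration, and the source does not confirm whether that URL is still reachable.

```r
# Snapshot URL quoted in the answer.
snapshot_repo <- "https://cran.microsoft.com/snapshot/2022-02-24/"

# Hypothetical helper: install packages from a fixed snapshot into a given library.
# lib defaults to the first entry of .libPaths(); on Databricks it could be a
# DBFS-backed path such as the "/dbfs/r-libraries" directory from the question.
install_from_snapshot <- function(pkgs, repo = snapshot_repo, lib = .libPaths()[1]) {
  install.packages(pkgs, repos = repo, lib = lib)
}

# Not run here because it needs network access:
# install_from_snapshot("arrow", lib = "/dbfs/r-libraries")
```

Pinning the repository to a dated snapshot means every run resolves the same package versions, which is what sidesteps a newer release that fails to install.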
I hope that this helps someone else if they are having similar issues.

r: errors creating package with devtools & roxygen2

I'm writing a package containing several functions to make running and evaluating models more streamlined.
I have a function that I'm going to make the first function within my package detailed with roxygen2 comments, which I can include into this write-up as an edit if necessary, but my issue is more with Package Creation.
I've created a separate .R file for the function and it lives in the R folder within my package folder. I've run R CMD build pkgname and R CMD INSTALL pkgname successfully.
At the document() stage I run it (either from the console or from my terminal using R -e 'library(devtools);document()', deleting the existing NAMESPACE file first) and I get the following error: Try removing ‘/Library/Frameworks/R.framework/Versions/3.5/Resources/library/00LOCK-pkgname’.
I've already seen a related issue posted here and haven't had success after deleting the 00LOCK-pkgname folder, for two reasons. First, when I run document(), even when it throws the above error, it doesn't stop running; it just keeps looping (whether I run it in R or in the Terminal). Second, no matter how many times I delete the folder, it keeps reappearing even though I've stopped running the function.
Any insight into why that error is being thrown and why the document() function continually runs in a loop?
Best answer I've found is in this blog post: Hilary Parker R-Package Blog Post
The steps I follow to document and install are as follows:
Within the project that contains my package, open a new R Script and run setwd('..')
Run devtools::document()
Run devtools::install()
This works for me when initially installing my package and also updating it.
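The steps above can be sketched as a small wrapper. This is only an illustration: document_and_install and pkg_dir are hypothetical names, and passing the package directory explicitly through the pkg argument is an alternative to changing the working directory first.

```r
# Hypothetical wrapper around the two devtools calls named in the steps above.
# pkg_dir is an assumed path to the package source directory.
document_and_install <- function(pkg_dir = ".") {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("devtools is not installed")
  }
  devtools::document(pkg = pkg_dir)  # regenerate NAMESPACE and man/*.Rd from roxygen2 comments
  devtools::install(pkg = pkg_dir)   # build and install the package
}

# Example call (not run): document_and_install("path/to/pkgname")
```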

What does a typical Rcpp edit-compile-test cycle look like?

I can only find information on how to install a ready-made R extension package, but it is nowhere mentioned which commands a developer of an extension package has to use during daily development. I am using Rcpp and I am on Windows.
If this were a typical C++ project, it would go like this:
edit
make # oops, typo
edit # fix typo
make # oops, forgot an #include
edit
make # good; updates header dependencies for subsequent 'make' automatically
./fooreader # test it
make install # only now I'm ready
Which commands do I need for daily development of an Rcpp package project?
I've created a skeleton project using these commands from the R command line:
library(Rcpp)
Rcpp.package.skeleton("FooReader", example_code=FALSE,
author="My Name", email="my.email@example.com")
This created 3 files:
DESCRIPTION
NAMESPACE
man/FooReader-package.Rd
Now I dropped source code into
src/readfoo.cpp
with these contents:
#include <Rcpp.h>
#error here
I know I can run this from the R command line:
Rcpp::sourceCpp("D:/Projects/FooReader/src/readfoo.cpp")
(this does run the compiler and indicates the #error).
But I want to develop a package ultimately.
There is no universal answer for everybody, I guess.
For some people, RStudio is everything, and with some reason. One can use the package creation facility to create an Rcpp package, then edit and just hit the buttons (or keyboard shortcuts) to compile and re-load and test.
I also work a lot in a shell, so I do a fair amount of editing in Emacs/ESS along with R CMD INSTALL (where, thanks to ccache, recompilation of unchanged code is immediate), with command-line use via r from the littler package. This allows me to write compact expressions that load the new package and evaluate something: r -lnewpackage -esomeFunc(somearg) to test newpackage::someFunc() with somearg.
You can also launch the build and test from Emacs. As I said, it all depends.
Both those answers are for packages, where I do real work. When I just test something in a single file, I do that in one Emacs buffer and sourceCpp() in an R session in another buffer of the same Emacs. Or sometimes I edit in Emacs and run sourceCpp() in RStudio.
There is no one answer. Find what works for you.
Also, the first part of your question describes the initial setup of a package. That is not part of the edit/compile/link/test cycle, as it is a one-off. And for that too we have different approaches, many of which have been discussed here.
Edit: The other main misunderstanding in your question is that once you have a package you generally do not use sourceCpp() anymore.
In order to test an R package, it has to be installed into a (temporary) library such that it can be attached to a running R process. So you will typically need:
R CMD build . to build package_version.tar.gz
R CMD check <package_version.tar.gz> to test your package, including tests placed into the tests folder
R CMD INSTALL <package_version.tar.gz> to install it into a library
After that you can attach the package and test it. Quite often I try to use a more TDD approach, which means I do not have to INSTALL the package. Running the unit tests (e.g. via R CMD check) is enough.
All that is independent of Rcpp. For a package using Rcpp you need to call Rcpp::compileAttributes() before these steps, e.g. with Rscript -e 'Rcpp::compileAttributes()'.
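For those who prefer to drive this cycle from an R session, a rough sketch follows. The helper names package_cycle and run_cycle and the tarball name are assumptions for illustration. The sketch also assumes that R is on the PATH and that the working directory is the package's top-level directory.

```r
# Hypothetical helper: return the R command-line arguments for each step.
package_cycle <- function(tarball) {
  list(
    build   = c("CMD", "build", "."),
    check   = c("CMD", "check", tarball),
    install = c("CMD", "INSTALL", tarball)
  )
}

# Hypothetical driver: run from the package's top-level directory.
run_cycle <- function(tarball, steps = c("build", "check", "install")) {
  if (requireNamespace("Rcpp", quietly = TRUE)) {
    Rcpp::compileAttributes(".")  # regenerate RcppExports before building
  }
  for (step in steps) {
    status <- system2("R", package_cycle(tarball)[[step]])
    if (status != 0) stop("step failed: ", step)
  }
  invisible(TRUE)
}

# Example call (not run); the tarball name is an assumption:
# run_cycle("FooReader_1.0.tar.gz", steps = c("build", "check"))
```

Leaving "install" out of steps mirrors the test-driven variant described above, where running the checks is enough.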
If you use RStudio for package development, it offers a lot of automation via the devtools package. I still find it useful to know what goes on under the hood, and RStudio is by no means required.

Running R script from command line can't find packages

I am troubleshooting an issue that I am running into on a new instance of RStudio on a virtual machine. If I run an R script expFit.R in RStudio, it has no problem finding the needed packages (in this case RODBC and minpack.lm) but if I try to run from the command line
c:\"Program Files"\R\R-3.4.1\bin\x64\Rscript.exe e:\expFit.R
I get the error 'minpack.lm' is not a valid installed package. If I move RODBC to be the first package to load, then it errors on that.
The reason that I am running from command line is that in the real application I am running it from a command line stored procedure in SQL. It doesn't work either way (stored procedure or straight command line). I made sure that the path to the packages is a Path variable.
This works on 2 of my other servers, the only difference being that RStudio is installed so that all users can use it on the server that is getting the error, while the other servers have it installed only in my account.
Any advice would be much appreciated!
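A possible first diagnostic, not taken from the question, is to print the library search path and the location of each package from both RStudio and Rscript, then compare the two outputs. The helper name report_packages is hypothetical.

```r
# Hypothetical helper: report the library search path and where (if anywhere)
# each package is found in the current R process.
report_packages <- function(pkgs) {
  found <- vapply(pkgs, function(p) {
    path <- find.package(p, quiet = TRUE)  # character(0) when not found
    if (length(path) == 0) NA_character_ else path[1]
  }, character(1))
  list(lib_paths = .libPaths(), found = found)
}

print(report_packages(c("RODBC", "minpack.lm")))
```

If a package shows NA under Rscript but a path under RStudio, the two processes are probably using different library paths.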

How to use PowerShell to schedule sourcing an R script?

My goal is to use PowerShell to schedule the sourcing of an R script.
My current workflow is to open RStudio and click the "Source" button in the upper right corner. Then I wait until it's finished and close RStudio. I change nothing in the R script.
In PowerShell I've been using its Register-ScheduledJob cmdlet to kick off C# programs on a daily schedule. And here's the problem: I can't find an example of effectively using PowerShell to source an R script.
I believe the PowerShell script should probably use the Invoke-Expression cmdlet. But I'm not 100% sure.
To no avail I've tried this:
Start-Process "C:\Program Files\R\R-3.2.4revised\bin\x64\Rterm.exe" -RedirectStandardInput "C:\MyScript.R"
Also, I'd like to avoid the solution that uses CMD BATCH as that's defeating the purpose of using PowerShell.
If just sourcing the R script is what you're looking for, then one way to do it is something like this:
& "C:\Program Files\R\R-3.1.1\bin\Rscript.exe" "C:/Program Files/R/R-3.1.1/tests/demos.R"
where "C:\Program Files\R\R-3.1.1\bin\Rscript.exe" is the path to Rscript in your local R installation and "C:/Program Files/R/R-3.1.1/tests/demos.R" is the path to the script you'd normally source() directly in RStudio.
One thing to keep in mind: depending on the location of the files your R script needs, you might need to adjust your script with an appropriate setwd().
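As a sketch of the setwd() adjustment, the script can derive its own directory from the --file= argument that Rscript passes, rather than hard-coding a path. The helper name script_dir is hypothetical, and the fallback to the current working directory is an assumption about what makes sense for interactive use.

```r
# Hypothetical helper: find the directory of the script being run by Rscript.
# Rscript passes the script path as a "--file=<path>" argument.
script_dir <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(getwd())  # e.g. when sourced interactively; keep the current directory
  }
  script_path <- sub("^--file=", "", file_arg[1])
  dirname(normalizePath(script_path, mustWork = FALSE))
}

# Make relative paths inside the script resolve next to the script itself.
setwd(script_dir())
```

With this at the top of the script, files the script reads or sources by relative path are looked up beside the script, regardless of which directory the scheduler starts in.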

Resources