I'm trying to run a daily taskscheduleR script that pulls data into R from an API. It works when I run it as a one time task but for some reason it won't work as a daily task. I keep getting the following error in the log file:
<HEAD><TITLE>Authorization Required</TITLE></HEAD>
<BODY BGCOLOR=white FGCOLOR=black>
<H1>Authorization Required</H1><HR>
<FONT FACE=Helvetica,Arial>
<B>Description: Authorization is required for access to this proxy</B>
</FONT>
<HR>
<!-- default Authorization Required response (401) -->
Here's the code:
library(httr)
library(jsonlite)
library(tidyverse)
library(taskscheduleR)
# Url to feed into GET function
url<-"https://urldefense.com/v3/__http://files.airnowtech.org/airnow/yesterday/daily_data_v2.dat__;!!J30X0ZrnC1oQtbA!Yh5wIss-mzbpMRXugALJoWEKLKcg1-7VmERQwcx2ESK0PZpM5NWNml5s9MVgwHr5LD1i5w$ "
# Sends request to AirNow API to get access to data
my_raw_result<-httr::GET(url)
# Retrieve contents of a request
my_content<-httr::content(my_raw_result,as="text")
# Parse content into a dataframe
my_content_from_delim <- my_content %>% textConnection %>% readLines %>% read.delim(text = ., sep = "|",header = FALSE)
head(my_content_from_delim)
I have been using the Rstudio add-in to create the task.
If you are trying to access this on a work computer, you may need to allow downloads from the url link. Open a browser, paste that url, click 'allow downloads', run the script.
I am not sure whether the solution I will offer will work for you, but it won't harm to try. If the problem related to the task scheduler, the following solution might work. However, if the problem of authorization issues, you may need to get some IT help from your workplace.
For the task scheduler issue, you can directly send your script to the windows task scheduler with a batch file and create a schedule for it.
To make it easy, you can use the following code. First, open a new folder and copy-paste your R script there. To run the following code, you should call you R script as My Script.r.
Then, in the same folder, create a batch file with the following codes. To create a batch file, you should copy the following code into a Notepad and save it as Run R Script.bat in the same folder.
cd %~dp0
"C:\PROGRA~1\R\R-40~1.0\bin\R.exe" -e "setwd(%~dp0)" CMD BATCH --vanilla --slave "%~dp0My Script.r" Log.txt
Here, cd %~dp0 will set the directory for the windows batch to the folder you run this batch. "C:\PROGRA~1\R\R-40~1.0\bin\R.exe" will specify your R.exe. You may need to change the path based on your system files.
-e "setwd(%~dp0)" will set the directory of R to the same folder in which the batch and script will be run.
"%~dp0My Script.r" Log.txt will define R script pathname and the log file for the batch.
Second, to create a daily schedule, we are going to create another batch file. To do so, copy and paste the following codes into a notepad and save as Daily Schedule.bat.
When you click the Daily Schedule.bat, it will create a daily task and run for the first time in one minute, and every day it will repeat itself at the same time when you first run this batch.
#echo off
for /F "tokens=1*" %%A in ('
powershell -NoP -C "(Get-Date).AddMinutes(1).ToString('MM/dd/yyyy HH:mm:ss')"
') do (
Set "MyDate=%%A"
set "MyTime=%%B"
)
::Execute path to bat path
cd %~dp0
::Create Task
SchTasks /Create /SC DAILY /TN "MY R TASK" /TR "%~dp0Run R Script.bat" /sd %MyDate% /st %MyTime%
This code will create a task called as "MY R TASK". To see whether it is scheduled, you can run the following codes on the windows prompt: taskschd.msc. This will open your task scheduler, and you can find your task there. If you want to modify or delete, you can use this task scheduler program; it has a nice GUI and easy to navigate.
For more details about the Task scheduler syntax, see the following link
If you have any questions, let me know.
Related
I am writing an R code on a Linux system using RStudio. At some point in the code, I need to use a system call to a command that will download a few thousand of files from the lines of a text file:
down.command <- paste0("parallel --gnu -a links.txt wget")
system(down.command)
However, this command takes a little while to run (a couple of hours), and the R prompt stays locked while the command runs. I would like to keep using R while the command runs on the background.
I tried to use nohup like this:
down.command <- paste0("nohup parallel --gnu -a links.txt wget > ~/down.log 2>&1")
system(down.command)
but the R prompt still gets "locked" waiting for the end of the command.
Is there any way to circumvent this? Is there a way to submit system commands from R and keep them running on the background?
Using ‘processx’, here’s how to create a new process that redirects both stdout and stderr to the same file:
args = c('--gnu', '-a', 'links.txt', 'wget')
p = processx::process$new('parallel', args, stdout = '~/down.log', stderr = '2>&1')
This launches the process and resumes the execution of the R script. You can then interact with the running process via the p name. Notably you can signal to it, you can query its status (e.g. is_alive()), and you can synchronously wait for its completion (optionally with a timeout after which to kill it):
p$wait()
result = p$get_exit_status()
Based on the comment by #KonradRudolph, I became aware of the processx R package that very smartly deals with system process submissions from within R.
All I had to do was:
library(processx)
down.command <- c("parallel","--gnu", "-a", "links.txt", "wget", ">", "~/down.log", "2>&1")
processx::process$new("nohup", down.comm, cleanup=FALSE)
As simple as that, and very effective.
I am trying to build up a dataframe with financial data from an API. R should pull a new record every minute from that API and append it to the existing dataframe.
U created a dataframe from that API with one record named "XRP_TimeSeries".
Then I wrote the script which should be executed every minute to append a new record to the dataframe:
XRP_TimeSeries <- rbind(XRP_TimeSeries,
fromJSON("https://api.coingecko.com/api/v3/coins/markets?vs_currency=eur&ids=ripple"))
By executing the code manually, it works. Executing it e.g. 10 times I have 10 records in the desired dataframe.
Then I set the TaskscheduleR Addin to run this script every minute.
The scheduler starts the script, a Windows Command Prompt pops up and closes again, but nothing else happens.
On the log-file I see an error:
object XRP_TimeSeries not found
Can someone help me get this thing running?
The system I use is to call R by using batch files and use Windows task scheduler to schedule to execute the script for me.
"C:\R-3.6.2\bin\R.exe" where you had located your R.exe file.
#echo off
"C:\R-3.6.2\bin\R.exe" CMD BATCH "C:\Users\YOUR_USER.NAME\FOLDER\YOUR_R_SCRIPT.R"
#this is an example
command /data:
trigger:
broadcast "&c&l&o%player%, You won the data!!!"
execute console command "XRP_TimeSeries <- rbind(XRP_TimeSeries,
fromJSON("https://api.coingecko.com/api/v3/coins/markets?vs_currency=eur&ids=ripple"))"
I am trying to understand how, for the above command, the shell arranges for output re-direction when the process associated with the shell itself is replaced by that of echo? Would very much appreciate any help.
Regards
Rupam
Output redirection of an external command is achieved by manipulating file descriptor, typically involving dup2, a system call that assigns a file descriptor to an existing open file. In this case, the standard output of the process that executes echo is instructed to point to the target output file.
The shell does a close equivalent of the following steps:
create a copy of the current process by invoking fork(); the subsequent steps happen in the child process. (This step is omitted when the command is invoked with exec, as in your example.)
open test for writing and remember the file descriptor
make file descriptor 1 equal to the descriptor, by calling dup2(fd, 1)
call execlp or similar to replace current process execution with execution of echo
The echo command simply writes to standard output, a fancy name for file descriptor number 1. It doesn't care if that ends up writing to a TTY, to a file, or to another program. The above steps make sure that file descriptor 1 really points to the open file test.
Okay - so here is what I'm trying to do.
I've got this password protected CSV file I'm trying to import into R.
I can import it fine using:
read.csv()
and when I run my code in RStudio everything works perfect.
However, when I try and run my .R file using a batch file (windows .bat) it doesn't work. I want to use the .BAT file so that I can set up a scheduled task to run my code every morning.
Here is my .BAT file:
"E:\R-3.0.2\bin\x64\R.exe" CMD BATCH "E:\Control Files\download_data.R" "E:\Control Files\DailyEmail.txt"
And here is my .R file:
url <- "http://username:password#www.url.csv"
data <- read.csv(url, skip=1)
** note, I've put my username/password and the exact location of the CSV in my code. I've used generic stuff here, as this is work related and posting usernames and passwords is probably frowned upon.
As I've said, this code works fine when I use it in RStudio. But fails when I use the .BAT file.
I get the following error message:
Error in download.file(url, "E:/data/data.csv") :
cannot open URL 'websiteurl'
In addition: Warning message:
In download.file(url, "E:/data/data.csv") :
unable to resolve 'username'
Execution halted
** above websiteurl is the http above (I can't post links)
So obviously, the .BAT is having trouble with the username/password? Any thoughts?
* EDIT *
I've gone so far as trying this on Linux. Thinking maybe windows was playing silly bugger.
Just from the terminal, I run Rscript -e "download_data.r" and get the EXACT same error message as I did in Windows. So I suspect this may be a problem with where I'm getting the data? Could the provider be blocking data from the command line, but not from with Rstudio?
I have had similar problems which had to do with file permissions. The .bat file somehow does not have the same privileges as you running the code directly from Rstudio. Try using rscript (http://stat.ethz.ch/R-manual/R-devel/library/utils/html/Rscript.html) within your .bat file like
Rscript "E:\Control Files\download_data.R"
What is the purpose of the argument "E:\Control Files\DailyEmail.txt"? Is the program suppose to use it in any way?
So, I've found a solution, which is likely not the most practical for most people, but works for me.
What I did was migrated my project over to a Linux system. Running daily scripts, is easier on Linux anyways.
The solution makes use of the "wget" function in linux.
You can either run the wget right in your shell script, or make use of the system() function in R to run the wget.
code looks like:
wget -O /home/user/.../file.csv --user=userid --password='password' http://www.url.com/file.csv
And you can do something like:
syscomand >- "wget -O /home/.../file.csv --user=userid --password='password' http://www.url.com/file.csv"
system (syscommand)
in R to download the CSV to a location on your hard drive, then grab the CSV using read.csv()
Doing it this way gave me some more insight into the potential root cause of the problem. While the system(syscommand) is running, I get the following output:
Connecting to www.website.com (www.website.com)|ip.ad.re.ss|:80... connected.
HTTP request sent, awaiting response... 401 Unauthorized
Reusing existing connection to www.weburl.com:80.
HTTP request sent, awaiting response... 200 OK
Not sure why it has to send the request twice? And why I'm getting a 401 Unauthorized the first try?
I have written an R script that pulls some data from a database, performs several operations on it and post the output to a new database.
I would like this script to run every day at a specific time but I can not find any way to do this effectively.
Can anyone recommend a resource I could look at to solve this issue? I am running this script on a Windows machine.
Actually under Windows you do not even have to create a batch file first to use the Scheduler.
Open the scheduler: START -> All Programs -> Accesories -> System Tools -> Scheduler
Create a new Task
under tab Action, create a new action
choose Start Program
browse to Rscript.exe which should be placed e.g. here:
"C:\Program Files\R\R-3.0.2\bin\x64\Rscript.exe"
input the name of your file in the parameters field
input the path where the script is to be found in the Start in field
go to the Triggers tab
create new trigger
choose that task should be done each day, month, ... repeated several times, or whatever you like
Supposing your R script is mytest.r, located in D:\mydocuments\, you can create a batch file including the following command:
C:\R\R-2.10.1\bin\Rcmd.exe BATCH D:\mydocuments\mytest.r
Then add it, as a new task, to windows task scheduler, setting there the triggering conditions.
You could also omit the batch file. Set C:\R\R-2.10.1\bin\Rcmd.exe in the program/script textbox in task scheduler, and give as Arguments the rest of the initial command: BATCH D:\mydocuments\mytest.r
Scheduling R Tasks via Windows Task Scheduler (Posted on February 11, 2015)
taskscheduleR: R package to schedule R scripts with the Windows task manager (Posted on March 17, 2016)
EDIT
I recently adopted the use of batch files again, because I wanted the cmd window to be minimized (I couldn't find another way).
Specifically, I fill the windows task scheduler Actions tab as follows:
Program/script:
cmd.exe
Add arguments (optional):
/c start /min D:\mydocuments\mytest.bat ^& exit
Contents of mytest.bat:
C:\R\R-3.5.2\bin\x64\Rscript.exe D:\mydocuments\mytest.r params
Now there is built in option in RStudio to do this, to run scheduler first install below packages
install.packages('data.table')
install.packages('knitr')
install.packages('miniUI')
install.packages('shiny')
install.packages("taskscheduleR", repos = "http://www.datatailor.be/rcube", type =
"source")
After installing go to
**TOOLS -> ADDINS ->BROWSE ADDINS ->taskscheduleR -> Select it and execute it.**
Setting up the task scheduler
Step 1) Open the task scheduler (Start > search Task Scheduler)
Step 2) Click "Action" > "Create Task"
Step 3) Select "Run only when the user is logged on", uncheck "Run with highest priveledges", name your task,
configure for "Windows Vista/Windows Server 2008"
Step 4) Under the "Triggers" tab, set when you would like the script to run
Step 5) Under the "Actions" tab, put the full location of the Rscript.exe file, i.e.
"C:\Program Files\R\R-3.6.2\bin\Rscript.exe" (include the quotes)
Put the name of your script with with -e and source() in arguments wrapping it like this:
-e "source('C:/location_of_my_script/test.R')"
Troubleshooting a Rscript scheduled in the Task Scheduler
When you run a script using the Task Scheduler, it is difficult to troubleshoot any issues because you don't get any error messages.
This can be resolved by using the sink() function in R which will allow you to output all error messages to a file that you specify. Here is how you can do this:
# Set up error log ------------------------------------------------------------
error_log <- file("C:/location_of_my_script/error_log.Rout", open="wt")
sink(error_log, type="message")
try({
# insert your code here
})
The other thing that you will have to change to make your Rscript work is to specify the full file path of any file paths in your script.
This will not work in task scheduler:
source("./functions/import_function.R")
You will need to specify the full file path of any scripts you are sourcing within your Rscript:
source("C:/location_of_my_script/functions/import_function.R")
Additionally, I would remove any special characters from any file paths that you are referencing in your R script. For example:
df <- fread("C:/location_of_my_data/file#2342.csv")
may not run. Instead, try:
df <- fread("C:/location_of_my_data/file_2342.csv")
Changing windows passwords
Beware: Changing windows passwords will pause your task scheduler script(s). You will need to log back into the task scheduler and enter your password to get them started again.
I set up my tasks via the SCHTASKS program. For running scripts on startup, you would write something along the lines of
SCHTASKS /Create /SC ONSTART /TN MyProgram /TR "R CMD BATCH --vanilla d:\path\to\script.R"
See this website for more details on SCHTASKS. More details at Microsoft's website.
You can use Windows Task Scheduler.
After following any combination of these steps and you receive the "Argument Batch Ignored" error after R.exe runs, try this, it worked for me.
In Windows Task Scheduler:
Replace BATCH "C:\Users\desktop\yourscript.R"in the arguments field
with
CMD BATCH --vanilla --slave "C:\Users\desktop\yourscript.R"