Stop Kettle/Spoon from crashing with a one-line R script

Suppose there is a simple R script with only one statement:
q()
Using the R Script plugin in Pentaho Kettle/Spoon, executing the above R script causes Spoon/Kettle to crash.
How can we stop Kettle/Spoon from crashing abnormally with the above statement in our R script?
Kettle should instead stop executing the script and execution control should return to Kettle.

Try using return(value) instead of q(), so that Kettle can handle the value returned from the R script and continue the normal Kettle row flow.
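Because return() is only valid inside a function, one way to apply this suggestion is to wrap the script body in a function and leave it early when needed. The sketch below is only an illustration: it assumes nothing about how the Kettle R Script plugin passes rows in or reads results out, and main, rows, and the value column are hypothetical names.

```r
# Hypothetical wrapper: the script body lives in a function so that
# return() can end it early without terminating the R process.
main <- function(rows) {
  if (nrow(rows) == 0) {
    return(rows)  # leave early; R (and the host application) keeps running
  }
  rows$doubled <- rows$value * 2
  rows
}

# Stand-in input; in a real transformation the rows would come from Kettle
result <- main(data.frame(value = c(1, 2, 3)))
print(result)
```

Unlike q(), an early return() only ends the function call, so control passes back to whatever evaluated the script.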

Related

How to release R's prompt when using 'system'?

I am writing R code on a Linux system using RStudio. At some point in the code, I need to make a system call to a command that downloads a few thousand files listed line by line in a text file:
down.command <- paste0("parallel --gnu -a links.txt wget")
system(down.command)
However, this command takes a while to run (a couple of hours), and the R prompt stays locked while it runs. I would like to keep using R while the command runs in the background.
I tried to use nohup like this:
down.command <- paste0("nohup parallel --gnu -a links.txt wget > ~/down.log 2>&1")
system(down.command)
but the R prompt still gets "locked" waiting for the end of the command.
Is there any way to circumvent this? Is there a way to submit system commands from R and keep them running in the background?
Using ‘processx’, here’s how to create a new process that redirects both stdout and stderr to the same file:
args = c('--gnu', '-a', 'links.txt', 'wget')
p = processx::process$new('parallel', args, stdout = '~/down.log', stderr = '2>&1')
This launches the process and resumes the execution of the R script. You can then interact with the running process via the p name. Notably you can signal to it, you can query its status (e.g. is_alive()), and you can synchronously wait for its completion (optionally with a timeout after which to kill it):
p$wait()
result = p$get_exit_status()
Based on the comment by @KonradRudolph, I became aware of the processx R package, which deals very smartly with system process submissions from within R.
All I had to do was:
library(processx)
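# Redirection operators such as ">" are shell syntax and are not interpreted by processx,
# so the log file is passed through the stdout/stderr arguments instead.
down.command <- c("parallel", "--gnu", "-a", "links.txt", "wget")
processx::process$new("nohup", down.command, stdout = path.expand("~/down.log"), stderr = "2>&1", cleanup = FALSE)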
As simple as that, and very effective.
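For comparison, base R's system() accepts a wait argument, and setting it to FALSE returns control to the prompt while the command keeps running. The sketch below is a minimal illustration, assuming a Unix-like shell; the sleep command is a stand-in for the real parallel/wget call from the question.

```r
# Stand-in for the long-running download; assumes a Unix-like shell with `sleep`.
cmd <- "sleep 2 > /dev/null 2>&1"

started <- Sys.time()
system(cmd, wait = FALSE)  # launch the command and return without waiting for it
elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))

# The call should return well before the 2 seconds the command itself takes
message("Returned after ", round(elapsed, 2), " seconds")
```

This approach gives no handle on the running process, which is the main reason to prefer processx when the status or exit code matters.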

Scheduling an R script in TaskscheduleR results in "object not found" error

I am trying to build up a dataframe with financial data from an API. R should pull a new record every minute from that API and append it to the existing dataframe.
I created a dataframe named "XRP_TimeSeries" from that API, containing one record.
Then I wrote the script which should be executed every minute to append a new record to the dataframe:
XRP_TimeSeries <- rbind(XRP_TimeSeries,
fromJSON("https://api.coingecko.com/api/v3/coins/markets?vs_currency=eur&ids=ripple"))
When I execute the code manually, it works. After executing it, e.g., 10 times, I have 10 records in the desired dataframe.
Then I set the TaskscheduleR Addin to run this script every minute.
The scheduler starts the script, a Windows Command Prompt pops up and closes again, but nothing else happens.
In the log file I see an error:
object XRP_TimeSeries not found
Can someone help me get this thing running?
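A likely explanation, offered as an assumption rather than a confirmed diagnosis: a scheduled script runs in a fresh R session, so objects created interactively, such as XRP_TimeSeries, do not exist there. A minimal sketch of one workaround is to keep the dataframe in a file between runs. The helper name append_record and the file path are hypothetical, and the rbind() step assumes each new record has the same columns as the stored data.

```r
# Hypothetical helper: append one record to a dataframe stored on disk.
append_record <- function(new_row, store) {
  if (file.exists(store)) {
    combined <- rbind(readRDS(store), new_row)  # load earlier records, then append
  } else {
    combined <- new_row                         # first run: nothing stored yet
  }
  saveRDS(combined, store)
  invisible(combined)
}

# Stand-in records; in the scheduled script new_row would come from
# jsonlite::fromJSON() applied to the API URL in the question.
store <- tempfile(fileext = ".rds")
append_record(data.frame(id = "ripple", price = 1.0), store)
series <- append_record(data.frame(id = "ripple", price = 1.1), store)
print(series)
```

In a scheduled job the store path would need to be a fixed, absolute location rather than a temporary file, because each run starts in a new session.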
The approach I use is to call R from a batch file and have Windows Task Scheduler run that batch file for me.
The batch file looks like this, where "C:\R-3.6.2\bin\R.exe" is the location of your R.exe file:
@echo off
"C:\R-3.6.2\bin\R.exe" CMD BATCH "C:\Users\YOUR_USER.NAME\FOLDER\YOUR_R_SCRIPT.R"
#this is an example
command /data:
trigger:
broadcast "&c&l&o%player%, You won the data!!!"
execute console command "XRP_TimeSeries <- rbind(XRP_TimeSeries,
fromJSON("https://api.coingecko.com/api/v3/coins/markets?vs_currency=eur&ids=ripple"))"

R: "Error calling capture_console_output: 87" when using terminalExecute()

I am trying to run an executable called swat_edit.exe in R. It works perfectly when I run it directly in the command prompt, and also when I run it directly in the Terminal tab in R. However, when I try to write a function in R to run the executable, I get an error (I get a number of different errors...).
I have tried to use different methods of running the file:
1: I used system("swat_edit"), which returns the following error:
Unhandled Exception: System.IO.IOException: The handle is invalid.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.Console.set_CursorVisible(Boolean value)
at SWEdit.Program.Run(String[] args)
at SWEdit.Program.Main(String[] args)
[1] 17234
2: I used shell("swat_edit"), which returns the exact same error as (1).
3: I used shell.exec("swat_edit"). This works, but it opens the executable in a new window, which then runs for a few seconds and closes (as intended). I need the program to run in the R terminal window so it can run many iterations in the background without disrupting other things. This is not a viable option.
4: I tried using terminalSend(ID,"swat_edit") (from the rstudioapi package). This works in that it sends the command to the terminal window in R. When I move there and hit enter it executes perfectly, running in the terminal window like I want it to. However, I need to run many iterations so this is not viable either. I tried using KeyboardSimulator to go to the Terminal tab and hitting enter (which worked), but this also does not let me use the PC for other purposes while running my code.
5: I tried using terminalExecute("swat_edit"), which returns the following error code:
Error calling capture_console_output: 87
[Process completed]
[Exit code: -532462766]
6: I tried making a python file that runs swat_edit.exe, and then running that file in R. The python file works when I run it by itself, from the command prompt, or from the terminal in R. It does not, however, work when I try to run it in the R terminal using terminalExecute (same error as in (5)).
NOTE: I have another executable called swat.exe (entirely different program) that works with all of the above-mentioned methods.
So in summary: swat_edit.exe runs perfectly in the command prompt and the R terminal, but does not work when I try to run it using R code (either system(), shell(), or terminalExecute()).
I can't figure out the difference between terminalExecute() and typing the string into terminal and hitting enter, but apparently there is something happening in between...
It will be tedious to reproduce this since it uses external programs, but if anyone has any idea about the error messages or how I can copy a string and run it in the terminal without any interference, that would be greatly appreciated.
EDIT: I found a method that solves my problem. I created a .bat file that runs swat_edit minimized. I was able to run this .bat file with the shell function (or any of the other commands I mentioned) in R. This doesn't answer why I was having the issues I described, and it doesn't let me run swat_edit in the R terminal, but it's good enough for me.
The .bat file was simply the following:
"START /MIN /WAIT C:\~\SWAT_Edit.exe"

Starting Rserve in debug mode and printing variables from Tableau to R

I can't start Rserve in debug mode.
I wrote these commands in R:
library(Rserve)
Rserve(debug=T, args="RS-enable-control", quote=T, port = 6311)
library(RSclient)
c=RSconnect(host = "localhost", port = 6311)
RSeval(c, "xx<-12")
RSeval(c, "2+6")
RSeval(c, "xx")
RSclose(c)
install.packages("fpc")
I placed the Rserve_d.exe in the same directory where the R.dll file is located. But when I launch it and I launch Tableau with the Rserve connection I can't see anything in the debug console, just these few lines.
Rserve 1.7-3 () (C)Copyright 2002-2013 Simon Urbanek
$Id$
Loading config file Rserv.cfg
Failed to find config file Rserv.cfg
Rserve: Ok, ready to answer queries.
-create_server(port = 6311, socket = <NULL>, mode = 0, flags = 0x4000)
INFO: adding server 000000000030AEE0 (total 1 servers)
I tried another solution, the command Rserve(TRUE) in R, but I can't see the transactions between R and Tableau in the RStudio console either.
I then wanted to print the value of the variable from the R-script function with print(.arg1), but nothing appears in the R console, although print works fine when I run it directly in the R console.
According to this article, Rserve should be run with the following command to enable debugging:
R CMD Rserve_d
An alternative is to use the write.csv command within the calculated field that calls an R script, as suggested by this FAQ document from Tableau.
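The write.csv idea can be sketched on the R side as a small logging helper. This is only an illustration: log_arg and the file location are hypothetical, and the vector passed in below is a stand-in for .arg1, which Tableau would supply when the calculated field runs.

```r
# Hypothetical helper: write whatever Tableau passed in to a CSV file for inspection.
log_arg <- function(x, path = file.path(tempdir(), "tableau_arg1.csv")) {
  write.csv(data.frame(value = x), path, row.names = FALSE)
  invisible(x)  # hand the input back so the calculation can continue
}

# Stand-in for .arg1, which Tableau supplies as a vector
logged <- log_arg(c(10.5, 20, 30.25))

# Inspect what was written
print(read.csv(file.path(tempdir(), "tableau_arg1.csv")))
```

Writing to a file sidesteps the question of which console, if any, receives print() output from the Rserve process.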
Starting Rserve_d.exe from command line works. Most likely you have multiple instances of Rserve running and Tableau is sending requests to one that is not Rserve_d running in the command line.
Did you try killing all Rserve processes and then starting Rserve_d from command line?
If you don't want to run from the command line, you can try starting Rserve in-process from RStudio by typing run.Rserve(), then using print() statements in your Tableau calculated fields for the things you want to print.
In the R bin directory, you have two executables: Rserve for normal execution and Rserve.dbg for debug execution. Use:
R CMD Rserve.dbg
My OS is CentOS 7 and I am using the R installation from Anaconda. If your Rserve debug executable has a different name, use that name instead.

How to stop entire script from running when certain condition met without error in R

i=14
l=8
if(i>l){q()}
print(i)
print(l)
The code above is a simplified version of my script. When I run it, it ends with "R session aborted. R encountered a fatal error".
Please advise how to avoid this error.
Calling q() inside an if block from a script in the editor pane of RStudio crashes my RStudio in a similar manner, with a fatal error dialog box. I suspect this is an RStudio bug and should be reported if it recurs with the latest RStudio.
Just putting q() in a script not in an if block quits RStudio as expected, without error messages.
The correct way to terminate a script without killing R in any way is to use stop("why").
if(1>0)stop("am stopping")
print("No")
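Since the question asks for stopping without an error, one possible variation on stop() is to signal a custom condition and catch it with tryCatch(), so the rest of the script is skipped but no error is reported. This is a sketch under that assumption; quiet_stop is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: signal a condition with its own class so it can be
# caught separately from ordinary errors.
quiet_stop <- function(msg) {
  cond <- simpleCondition(msg)
  class(cond) <- c("quiet_stop", "condition")
  stop(cond)
}

outcome <- tryCatch({
  i <- 14
  l <- 8
  if (i > l) quiet_stop("i is greater than l")
  print(i)  # skipped when the condition above is met
  print(l)
  "finished"
}, quiet_stop = function(cond) {
  message("Stopped early: ", conditionMessage(cond))
  "stopped"
})

print(outcome)
```

The R session stays alive either way, which is the difference from calling q().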

Resources