Avoid pauses due to readline() while testing - r

I am running tests in R using the test_dir() function from the testthat package. In some of the test scripts there are functions that call readline(), which - in interactive mode - causes the testing to pause and wait for user input. The functions calling readline() are not my own and I don't have any influence on them. The user input is irrelevant for the output of those functions.
Is there a way to avoid these pauses during testing?
Approaches that come to mind, but I wouldn't know how to implement them:
disable interactive mode while R is running
use another function from the testthat package that runs scripts in non-interactive mode
somehow divert stdin to something else than the terminal(??)
wrap functions calling readline() in another script that is called in non-interactive mode from my testing script and makes the results available
Testing only from the command line using Rscript is an option, but I'd rather stay in the RStudio workflow.
======
Example Code
with_pause <- function () {
readline()
2
}
without_pause <- function () {
2
}
expect_equal(with_pause(), without_pause())

I have a similar problem. I solved it with a global option setting.
original_test_mode <- getOption('my_package.test_mode')
options('my_package.test_mode' = TRUE)
# ... some tests ...
options('my_package.test_mode' = original_test_mode)
In my scripts I have a if statement
if(getOption('my_package.test_mode', FALSE)) {
# This happens in test mode
my_value <- 5
} else {
# normal processing
my_value <- readline('please write value: ')
}
Also not the nicest way but it works for me.
Maybe one more hint. It happened to that my test script failed. The problem here is, that the global option stays TRUE and in the next round and also for executing the script in the same session, it will never prompt you to write a value. I guess I should put some stuff in a tryCatch function or so. But if you have this problem in mind, just "sometimes" options('my_package.test_mode', NULL) helps :-)

Related

using rstudioapi in devtools tests

I'm making a package which contains a function that calls rstudioapi::jobRunScript(), and I would like to to be able to write tests for this function that can be run normally by devtools::test(). The package is only intended for use during interactive RStudio sessions.
Here's a minimal reprex:
After calling usethis::create_package() to initialize my package, and then usethis::use_r("rstudio") to create R/rstudio.R, I put:
foo_rstudio <- function(...) {
script.file <- tempfile()
write("print('hello')", file = script.file)
rstudioapi::jobRunScript(
path = script.file,
name = "foo",
importEnv = FALSE,
exportEnv = "R_GlobalEnv"
)
}
I then call use_test() to make an accompanying test file, in which I put:
test_that("foo works", {
foo_rstudio()
})
I then run devtools::test() and get:
I think I understand the basic problem here: devtools runs a separate R session for the tests, and that session doesn't have access to RStudio. I see here that rstudioapi can work inside child R sessions, but seemingly only those "normally launched by RStudio."
I'd really like to use devtools to test my function as I develop it. I suppose I could modify my function to accept an argument passed from the test code which will simply run the job in the R session itself or in some other kind of child R process, instead of an RStudio job, but then I'm not actually testing the normal intended functionality, and if there's an issue which is specific to the rstudioapi::jobRunScript() call and which could occur during normal use, then my tests wouldn't be able to pick it up.
Is there a way to initialize an RStudio process from within a devtools::test() session, or some other solution here?

How to call a parallelized script from command prompt?

I'm running into this issue and I for the life of me can't figure out how to solve it.
Quick summary before example:
I have several hundred data sets from which I want create reports on everyday. In order to do this efficiently, I parallelized the process with doParallel. From within RStudio, the process works fine, but when I try to make the process automatic via Task Scheduler on windows, I can't seem to get it to work.
The process within RStudio is:
I call a script that sources all of my other scripts, each individual script has a header section that performs the appropriate package import, so for instance it would look like:
get_files <- function(){
get_files.create_path() -> path
for(file in path){
if(!(file.info(paste0(path, file))[['isdir']])){
source(paste0(path, file))
}
}
}
get_files.create_path <- function(){
return(<path to directory>)
}
#self call
get_files()
This would be simply "Source on saved" and brings in everything I need into the .GlobalEnv.
From there, I could simply type: parallel_report() which calls a script that sources another script that houses the parallelization of the report generations. There was an issue awhile back with simply calling the parallelization directly (I wonder if this is related?) and so I had to make the doParallel script a non-function housing script and thus couldn't be brought in with the get_files script which would start the report generation every time I brought everything in. Thus, I had to include it in its own script and save it elsewhere to be called when necessary. The parallel_report() function would simply be:
parallel_report <- function(){
source(<path to script>)
}
Then the script that is sourced is the real parallelization script, and would look something like:
doParallel::registerDoParallel(cl = (parallel::detectCores() - 1))
foreach(name = report.list$names,
.packages = c('tidyverse', 'knitr', 'lubridate', 'stringr', 'rmarkdown'),
.export = c('generate_report'),
.errorhandling = 'remove') %dopar% {
tryCatch(expr = {
generate_report(name)
}, error = function(e){
error_handler(error = e, caller = paste0("generate report for ", name, " from parallel"), line = 28)
})
}
doParallel::stopImplicitCluster()
The generate_report function is simply an .Rmd and render() caller:
generate_report <- function(<arguments>){
#stuff
generate_report.render(<arguments>)
#stuff
}
generate_report.render <- function(<arguments>){
rmarkdown::render(
paste0(data.information#location, 'report_generator.Rmd'),
params = list(
name = name,
date = date,
thoughts = thoughts,
auto = auto),
output_file = paste0(str_to_upper(stock), '_report_', str_remove_all(date, '-'))
)
}
So to recap, in RStudio I would simply perform the following:
1 - Source save the script to bring everything
2 - type parallel_report
2.a - this calls directly the doParallization of generate_report
2.b - generate_report calls an .Rmd file that houses the required function calling and whatnot to produce the reports
And the process starts and successfully completes without a hitch.
In order to make the situation automatic via the Task Scheduler, I made a script that the Task Scheduler can call, named automatic_caller:
source(<path to the get_files script>) # this brings in all the scripts and data into the global, just
# as if it were being done manually
tryCatch(
expr = {
parallel_report()
}, error = function(e){
error_handler(error = e, caller = "parallel_report from automatic_callng", line = 39)
})
The error_handler function is just an in-house script used to log errors throughout.
So then on the Task Schedule's tasks I have the Rscript.exe called and then the automatic_caller after that. Everything within the automatic_caller function works except for the report generation.
The process completes almost automatically, and the only output I get is an error:
"pandoc version 1.12.3 or higher is required and was not found (see the help page ?rmarkdown::pandoc_available)."
But rmarkdown is within the .export call of the doParallel and it is in the scripts that use it explicitly, and in the actual generate_report it is called directly via rmarkdown::render().
So - I am at a complete loss.
Thoughts and suggestions would be completely appreciated.
So pandoc is apprently an executable that helps convert files from one extension to another. RStudio comes with its own pandoc executable so when running the scripts from RStudio, it knew where to point when pandoc is required.
From the command prompt, the system did not know to look inside of RStudio, so simply downloading pandoc as a standalone executable gives the system the proper pointer.
Downloded pandoc and everything works fine.

Stop R executing System or Shell commands

i'd like to disable commands that can execute other non R related stuff like System(), Shell() e.g.
for (year in 2010:2915){
system("calc")
}
from running within R.
any suggestions other than locking down the user executing?
thanks
edit: to add more context, we allow the users to create R scripts in our solution which are passed to the R Engine to execute, we then process those results.
Short of editing the R source code to remove the undesirable functions, which would be tedious and probably a bit dangerous, I would override these functions:
# override system()
env <- as.environment("package:base")
unlockBinding("system", env) # bindings in the base R are write-protected
assign(
"system",
function(...){stop("This is a forbidden command!")},
envir=env
)
lockBinding("system", env)
This would give the following when a user runs system():
> system()
Error in system() : this is a forbidden command
So that the changes take effect each time R is started, you could override as many functions as you want this way, adding them to .First() in your (write-protected) "Rprofile.site" file:
.First <- function(){
# code to override system() here
# code to override shell() here
# ...
}
Note that this will not prevent an ill-intentioned determined user from re-implementing the forbidden functionality though.

How to run a function asynchronous in with RGtk2

As the title implies I would like to be able to run a function asynchronous in a GUI created with RGtk2.
The function in itself is an R wrapper for a system command, thus the bulk of the processing time is used on a system() call, and the processing time can range from 10 min to an hour. I would like the GUI to still be responsive in that period.
As it is now the function is put in a gSignalConnect(GtkButton, 'clicked') and the rest of the GUI is thus unresponsive until the 'clicked' signal is terminated.
Does anyone have an idea regarding whether this is possible?
best
Thomas
There might be a more direct way, but I think you can do this with gTimeoutAdd:
library(RGtk2)
w <- gtkWindow()
g <- gtkVBox(); w$add(g)
b1 <- gtkButton("Start timer"); g$packStart(b1)
b2 <- gtkButton("click me"); g$packStart(b2)
gSignalConnect(b1, "clicked", function(...) {
id <- gTimeoutAdd(1, function(...) {
Sys.sleep(5) # replace me
message("Okay, I'm up")
FALSE # one shot
})
})
gSignalConnect(b2, "clicked", function(...) message('clicked me'))
What might work (although not tested, and I am not too familiar with RGtk so no guarantees) is to use the wait=FALSE option in the system call. The system call is then executed asynchronously. In your gtk GUI you would then have to periodically check if your system call has finished. I believe it is possible using RGtk to have a function using that is periodically called (from the documentation of RGtk this is probably gtkTimeoutAdd()).

call to sapply() works in interactive mode, not in batch mode

I need to execute some commands in batch mode (e.g., via Rscript). They work in interactive mode, but not in batch mode. Here is a minimal example: sapply(1:3, is, "numeric"). Why does this work in interactive mode but return an error in batch mode? Is there a way to make a command like this work in batch mode?
More specifically, I need to write scripts and to run them in batch mode. They need to call a function (which I didn't write and can't edit) that looks like this:
testfun <- function (...)
{
args <- list(...)
if (any(!sapply(args, is, "numeric")))
stop("All arguments must be numeric.")
else
writeLines("All arguments look OK.")
}
I need to pass a list to this function. A command like testfun(list(1, 2, 3)) works in interactive mode. But in batch mode, it produces an error: Error in match.fun(FUN) : object 'is' not found. I tried debugger() to get a handle on the problem, but it didn't give me any insight. I also looked through r-help, the R FAQ, R Inferno, but I couldn't find anything that spoke to this problem.
Rscript doesn't load the methods package by default because it takes a lot of time. From the Details section of ?Rscript:
‘--default-packages=list’ where ‘list’ is a comma-separated list
of package names or ‘NULL’. Sets the environment variable
‘R_DEFAULT_PACKAGES’ which determines the packages loaded on
startup. The default for ‘Rscript’ omits ‘methods’ as it
takes about 60% of the startup time.
You can make it load methods by using the --default-packages argument.
> Rscript -e 'sapply(1:3, is, "numeric")' --default-packages='methods'
[1] TRUE TRUE TRUE

Resources