R Script running successfully on local machine, not on EC2 instance - r

I have an R script (an R plumber API) that I have deployed to an EC2 instance and managing with pm2, and I am running into a struggling issue. I have pinpointed the exact location of the error, and am hoping to understand this error a bit better.
When I run the script on my local machine (RStudio on my Mac) it works okay. When I run the script using Rscript myrfile.R from the EC2 instance command line, it breaks.
I have pinpointed that the line of code that breaks the script the on EC2 instance, as well as its error, are:
my_df <- my_df %>%
dplyr::mutate(AwayScore = ifelse(dplyr::row_number() == 1, 0, AwayScore),
HomeScore = ifelse(dplyr::row_number() == 1, 0, HomeScore))
# with the following error
<Rcpp::eval_error in mutate_impl(.data, dots): Evaluation error: argument "x" is missing, with no default.>
I am 100% sure that dplyr is installed on the EC2 instance, since my script uses it throughout. I am also 100% sure that the my_df dataframe here has the columns AwayScore and homeScore, and also that my_df doesnt have any other issues.
I am left to assume that this error is specifically due to the dplyr::row_number() function, which the EC2 instance does not seem to be able to handle, although I am not positive on this.
Any thoughts / help / things I should try / etc. would be greatly appreciated on this, thanks!!

While I appreciate you have avoided the problem by not requiring the library, at some point you may find you want to run codes in a similar way where loading a library will be necessary.
I ran into a similar problem using R script. I found it could not find the libraries I had installed. It is possible to use R.exe instead of Rscript.exe, but this causes other headaches. I found that the environment when using Rscript doesn't contain the R_LIBS_USER path
If you append the following code to the top of your R script it should work
p <- "\directory path of local R packages"
.libPaths(c(p,.libPaths()))
putting the folder path to where your libraries are found on the computer. This is the path that would be returned by Sys.getenv("R_LIBS_USER") if running R in the GUI

It was easy enough for me to simply change my code to the following:
if(is.na(my_df$AwayScore[1])) { my_df$AwayScore[1] = 0 }
if(is.na(my_df$HomeScore[1])) { my_df$HomeScore[1] = 0 }
... so I will likely not waste too much more time trying to debug this.

Related

Why does R source() sometimes work and sometimes gives an error

I have the API credentials in one separate R script to keep it out of Git. I want to run this script in the beginning of the scripts that actually interact with the different servers. I have successfully used the same strategy for all my global functions.
setwd("G:/script")
source("API_credentials.R") # gives always an error
# > source("API_credentials.R")
# Error: '\s' is an unrecognized escape in character string starting ""g:\s"
source("ProVeg_functions.R") # runs fine
Problem:
Why does the first source() not work, while the second one does? The error message does not make any sense to me.
Solutions tried:
I have tried different escape chars \.
I have tried writing full path & file names.
I have tried putting the file name in as a variable, which gets its
content from a dir() search, to make sure that the file exists and
the name is correctly written.
Order of source() does not change situation.
Isolating the piece of code with error, and restarting R.
upgraded all my packages and R to version 4.0.2.
The API_credentials.r script works fine when run on its own. the Sys.setenv() works fine and I can read the API keys with Sys.getenv().
I am not sure if it is related to my problem, but if I do usethis::edit_r_environ() I can not see my API keys.
Setup
Windows 10, R-Studio 1.3.1093, R version 4.0.2 (2020-06-22)
I mistakenly assumed that the error message was related to the script calling the API_credentials.R, but it actually was an error message indicating an error in the API_credentials.R script. Fixed a typo and all is good.

R cmd check note: unable to verify current time

When running R CMD check I get the following note:
checking for future file timestamps ... NOTE
unable to verify current time
I have seen this discussed here, but I am not sure which files it is checking for timestamps, so I'm not sure which files I should look at. This happens locally on my windows and remotely on different systems (using github actions).
Take a look at https://svn.r-project.org/R/trunk/src/library/tools/R/check.R
The check command relies on an external web resource:
now <- tryCatch({
foo <- suppressWarnings(readLines("http://worldclockapi.com/api/json/utc/now",
warn = FALSE))
This resource http://worldclockapi.com/ is currently not available.
Hence the following happens (see same package source):
if (is.na(now)) {
any <- TRUE
noteLog(Log, "unable to verify current time")
See also references:
https://community.rstudio.com/t/r-devel-r-cmd-check-failing-because-of-time-unable-to-verify-current-time/25589
So, unfortunately this requires a fix in the check function by the R development team ... or the web-resource coming online again.
To add to qasta's answer, you can silence this check by setting the _R_CHECK_SYSTEM_CLOCK_ environment variable to zero e.g Sys.setenv('_R_CHECK_SYSTEM_CLOCK_' = 0)
To silence this in a persistent manner, you can set this environment variable on R startup. One way to do so is through the .Renviron file, in the following manner:
install.packages("usethis") (If not installed already)
usethis::edit_r_environ()
Add _R_CHECK_SYSTEM_CLOCK_=0 to the file
Save, close file, restart R

Publishing AzureML Webservice from R requires external zip utility

I want to deploy a basic trained R model as a webservice to AzureML. Similar to what is done here:
http://www.r-bloggers.com/deploying-a-car-price-model-using-r-and-azureml/
Since that post the publishWebService function in the R AzureML package was has changed it now requires me to have a workspace object as first parameter thus my R code looks as follows:
library(MASS)
library(AzureML)
PredictionModel = lm( medv ~ lstat , data = Boston )
PricePredFunktion = function(percent)
{return(predict(PredictionModel, data.frame(lstat =percent)))}
myWsID = "<my Workspace ID>"
myAuth = "<my Authorization code"
ws = workspace(myWsID, myAuth, api_endpoint = "https://studio.azureml.net/", .validate = TRUE)
# publish the R function to AzureML
PricePredService = publishWebService(
ws,
"PricePredFunktion",
"PricePredOnline",
list("lstat" = "float"),
list("mdev" = "float"),
myWsID,
myAuth
)
But every time I execute the code I get the following error:
Error in publishWebService(ws, "PricePredFunktion", "PricePredOnline", :
Requires external zip utility. Please install zip, ensure it's on your path and try again.
I tried installing programs that handle zip files (like 7zip) on my machine as well as calling the utils library in R which allows R to directly interact with zip files. But I couldn't get rid of the error.
I also found the R package code that is throwing the error, it is on line 154 on this page:
https://github.com/RevolutionAnalytics/AzureML/blob/master/R/internal.R
but it didn't help me in figuring out what to do.
Thanks in advance for any Help!
The Azure Machine Learning API requires the payload to be zipped, which is why the package insists on the zip utility being installed. (This is an unfortunate situation, and hopefully we can find a way in future to include a zip with the package.)
It is unlikely that you will ever encounter this situation on Linux, since most (all?) Linux distributions includes a zip utility.
Thus, on Windows, you have to do the following procedure once:
Install a zip utility (RTools has one and this works)
Ensure the zip is on your path
Restart R – this is important, otherwise R will not recognize the changed path
Upon completion, the litmus test is if R can see your zip. To do this, try:
Sys.which("zip")
You should get a result similar to this:
zip
"C:\\Rtools\\R-3.1\\bin\\zip.exe"
In other words, R should recognize the installation path.
On previous occasions when people told me this didn’t work, it was always because they thought they had a zip in the path, but it turned out they didn’t.
One last comment: installing 7zip may not work. The reason is that 7zip contains a utility called 7zip, but R will only look for a utility called zip.
I saw this link earlier but the additional clarification which made my code not work was
1. Address and Path of Rtools was not as straigt forward
2. You need to Reboot R
With regards to the address - always look where it was installed . I also used this code to set the path and ALWAYS ADD ZIP at the end
##Rtools.bin="C:\\Users\\User_2\\R-Portable\\Rtools\\bin"
Rtools.bin="C:\\Rtools\\bin\\zip"
sys.path = Sys.getenv("PATH")
if (Sys.which("zip") == "" ) {
system(paste("setx PATH \"", Rtools.bin, ";", sys.path, "\"", sep = ""))
}
Sys.which("zip")
you should get a return of
" C:\\RTools|\bin\zip"
From looking at Andrie's comment here: https://github.com/RevolutionAnalytics/AzureML/commit/9cf2c5c59f1f82b874dc7fdb1f9439b11ab60f40
Implies we can just download RTools and be done with it.
Download RTools from:
https://cran.r-project.org/bin/windows/Rtools/
During installation select the check box to modify the PATH
At first it didn't work. I then tried R32bit, and that seemed to work. Then R64 bit started working again. Honestly, not sure if I did something in the middle to make it work. Only takes a few minutes so worth a punt.
Try the following
-Download the Rtools file which usually contains the zip utility.
-Copy all the files in the "bin" folder of "Rtools"
-Paste them in "~/RStudio/bin/x64" folder

RStudio : Rook does not work?

I would like to build a simple webserver using Rook, however I am having strange errors when trying it in R-Studio:
The code
library(Rook)
s <- Rhttpd$new()
s$start()
print(s)
returns the rather useless error
"Error in listenPort > 0 :
comparison (6) is possible only for atomic and list types".
When trying the same code in a simple R-Console,everything works - so I would like to understand why that happens and how I can fix it.
RStudio is Version 0.99.484 and R is R 3.2.2
I've experienced same thing.
TLDR: This pull request solves the problem: https://github.com/jeffreyhorner/Rook/pull/31
RStudio is treated in different way and Rook port is same as tools:::httpdPort value. The problem is that in current Rook master tools:::httpdPort is assigned directly. It's a function that's why we need to evaluate it first.
If you want to have it solved right now, without waiting for merge into master: install devtools and load package from my fork #github.
install.packages("devtools")
library(devtools)
install_github("filipstachura/Rook")

Debugging 'testthat' tests in RStudio

Is it possible to invoke the debugger in RStudio when running testthat tests? I haven't been able to find a setup that allows this (various combinations of "use devtools package functions if available" in the settings, hitting the "Test Package" option in the "Build -> More" menu, running test() in the console, putting in browser() calls, etc.) but haven't found a way yet.
I also find myself getting lost a lot when testing, unsure whether the code being run has been installed in the system libraries (by doing 'Build & Reload'), or is being run in situ from the local R directory, or what - sometimes RStudio complains that a breakpoint can't be set until the package is rebuilt (so I suspect the former) or doesn't (so I suspect the latter). Not sure if this issue is closely related or not to my main question.
Without finding a way to drop into the debugger, I end up pasting test code into the console & working in a very ad-hoc fashion, and basically shooting my TDD habits in the foot. So any advice would be appreciated - if it's not possible to invoke the debugger, any suggested workarounds?
I'm running RStudio version 0.99.447 on OS X, in local mode, with R 3.2.1.
Edit - I'd also love to know more background about the options, e.g. "option X will never support debugging, because it's running in a forked process, try this other option Y instead."
Update - having had no responses here, I also asked at https://support.rstudio.com/hc/communities/public/questions/204779797-Debugging-testthat-tests-in-RStudio (where I also haven't had any responses).
The following works for me:
Insert a call to browser() somewhere within the testthat unit tests.
Run devtools::test() from the RStudio console (instead of using the "Test Package" menu item from the UI)
Then, when the test runner hits the browser() invocation, you should be able to use the environment browser and step through the code.
I haven't found a way to get testthat to stop at breakpoints, but inserting browser() invocations is a pretty close substitute.
To be absolutely sure that you're starting from a consistent state when it comes to loading packages, you can close RStudio, re-open it, run "Clean and rebuild", and then devtools::test()
Lastly, if you're working within an RStudio package, you may want to check out the following advice from RStudio support:
In order to debug effectively in your package, you’ll also want to ensure that your package is compiled with the --with-keep.source option. This option is the default for new packages in RStudio; if you need to set it manually, it can be found in Tools -> Project Options -> Build Tools.
If you want to use editor breakpoints from RStudio in order to debug unit tests, you can do it by working around testthat. As Hadley said in a Github issue, testthat usually runs in a separate session from RStudio, so it won't ever be able to use editor breakpoints. However, the tests, themselves, are function calls with two arguments, a description and code. You can read these and run them with a little code.
testf_trace <- function(filename, match) {
env <- new.env()
test_that <- function(desc, code) {
if (length(grep(match, desc)) > 0) {
eval(substitute(code), env)
}
}
env$test_that <- test_that
source(filename, env)
}
If you have a test that reads test_that("invariant is true", { f(3) == 6 }), then you can test this by running testf_trace(filename, "is true").
If you're unsure where a function is defined, and hence whether the breakpoint will work, you can find out by typing the name of the function at the console. The last line or two will tell you in which environment the function is defined, if it was defined in an environment other than .Global.

Resources