getForm: retry connection upon error - r

For a web scraping project I am making frequent requests on a particular site. Sometimes the connection times out with an error and I would like for it to retry instead of erroring out. I've written out the code below for it to keep trying, but I don't think it works because I still error out.
url = "www.google.com"
while(true){
withRestarts(tryCatch(
sourcecode <- getForm(urls[n]),
finally = Sys.sleep(2),
abort = function(){})
}
Error in function (type, msg, asError = TRUE) : couldn't connect to
host

Got it after experimenting:
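sourcecode <- NULL  # start empty so the loop runs at least once
while (length(sourcecode) == 0) {
  try({
    # If getForm() fails, try() swallows the error and sourcecode stays empty
    sourcecode <- getForm(urls[n])
    print(urls[n])
  })
  Sys.sleep(1)  # pause before the next attempt
}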
try() lets execution continue after an error occurs. Combined with the loop, it keeps retrying until the request succeeds.
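The loop above retries forever if the site never comes back. A minimal sketch of a bounded variant follows. The helper name retry, its max_attempts and wait parameters, and the flaky_fetch stand-in are hypothetical and only for illustration; in the real script the function passed in would wrap getForm(urls[n]).

```r
# Hypothetical helper: call fetch() until it succeeds or max_attempts is reached.
retry <- function(fetch, max_attempts = 5, wait = 1) {
  for (attempt in seq_len(max_attempts)) {
    # Turn an error into a return value so the loop can inspect it
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message(sprintf("Attempt %d failed: %s", attempt, conditionMessage(result)))
    if (attempt < max_attempts) Sys.sleep(wait)  # pause before retrying
  }
  stop(sprintf("Giving up after %d attempts", max_attempts))
}

# Stand-in for getForm(urls[n]) that fails on its first two calls
calls <- 0
flaky_fetch <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("couldn't connect to host")
  "<html>ok</html>"
}

page <- retry(flaky_fetch, wait = 0)
```

With a real request this would be called as retry(function() getForm(urls[n])). Raising an error after the last attempt keeps a permanently unreachable site from hanging the scraper.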

Related

Problem in writing error logs to a file in R

I'd like to do some error handling in my R program. I'm using the tryCatch function and I would like to write the error message to a file (in case there is any error). Here is the code I have
basicConfig(level='INFO')
addHandler(writeToFile, file="file_name.txt", level='INFO')
tryCatch(
{
logger <- getLogger()
...
},
error=function(cond) {
logger$error(cond)
})
but it looks like cond does not contain the error message, and the log file ends up empty. How can I write the error message to the log?
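One thing worth checking: cond is a condition object, not a string, and base R's conditionMessage() extracts its text. The sketch below uses only base R (cat() to a file) rather than the logging package from the question. The log_file path and the example error are made up for illustration.

```r
log_file <- tempfile(fileext = ".log")  # hypothetical log location

result <- tryCatch(
  {
    stop("input file is missing")  # stand-in for the real work failing
  },
  error = function(cond) {
    # conditionMessage() pulls the message text out of the condition object
    line <- sprintf("ERROR: %s", conditionMessage(cond))
    cat(line, "\n", sep = "", file = log_file, append = TRUE)
    NULL  # value returned by tryCatch() when an error occurred
  }
)

logged <- readLines(log_file)
```

If the same approach carries over to the logger in the question, the handler would pass conditionMessage(cond) to it instead of cond itself; that part is an assumption, not something tested here.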

How to use options(error = ) with a custom function and still make the script abort (in R)

Can anyone point me to the best way to use options(error = function(...){}) properly? I want to write errors to a log file and then terminate as usual. Currently I use
options(error = function(...) {
#... write to logfile ...
options(error = NULL)
stop(geterrmessage())
})
But resetting the option and calling stop() again looks like a hack to me. I also tried q("no", status = 1, runLast = FALSE) (as from the documentation of stop()), but this does not seem to be equivalent to a normal stop(). For example, in RStudio server it quits the whole session.
I need to use options() instead of tryCatch() because I want to catch all possible errors that occur in the script. I launch my script via a cron job, and I want to get an email/log entry as soon as the script fails.
A tryCatch block would probably be the best option for this type of situation.
tryCatch({
#... main code to run ...
}, warning = function(w) {
#... code to run if any warnings occur ...
warning(w) # Show the warning
}, error = function(e) {
#... write to log file ...
stop(e) # Stop script and show error message. Delete this line if you do not want to stop script
}, finally = {
#... code to run whether or not error occurs ...
})
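Note that in the block above a warning handler in tryCatch() abandons the main code as soon as the first warning is raised. A hedged sketch of a variant that logs warnings and lets the code continue, while still logging and re-raising errors, is shown below. The name run_logged, the log line format, and the example main function are all hypothetical.

```r
# Hypothetical wrapper: run main(), log warnings and errors to log_file.
run_logged <- function(main, log_file) {
  write_log <- function(level, cond) {
    cat(sprintf("%s: %s\n", level, conditionMessage(cond)),
        file = log_file, append = TRUE)
  }
  tryCatch(
    withCallingHandlers(
      main(),
      warning = function(w) {
        write_log("WARNING", w)
        invokeRestart("muffleWarning")  # resume main() after the warning
      }
    ),
    error = function(e) {
      write_log("ERROR", e)
      stop(e)  # re-raise so the script still fails
    }
  )
}

log_file <- tempfile(fileext = ".log")
steps <- character()
outcome <- try(
  run_logged(function() {
    warning("low disk")
    steps <<- c(steps, "after warning")  # still runs: the warning was muffled
    stop("boom")
  }, log_file),
  silent = TRUE
)
log_lines <- readLines(log_file)
```

In a cron-driven script the try() around the call would be dropped, so the re-raised error terminates the run with a non-zero status after the log entry is written.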

open.connection failing in geocoding function

I'm currently running a geocoding function (using the google_places function in the googleway package). The function will run for a while (I have almost 3k locations), then throw the following error:
Error in open.connection(con, "rb") :
schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.
Having consulted the system event log, I found the following information:
The machine-default permission settings do not grant Local Activation permission for the COM Server application with CLSID
{9BA05972-F6A8-11CF-A442-00A0C90A8F39}
and APPID
{9BA05972-F6A8-11CF-A442-00A0C90A8F39}
I'm not really sure what to do with this information. From my limited knowledge, it appears this is some sort of security/firewall issue. How should I go about giving R the permissions needed to run this function?
I am running Windows 10 with Windows Defender as antivirus/firewall. For reference, this is the function I am using for geocoding:
metro.locater <- function(lat, lon){
library(googleway)
#putting latitude and longitude into the same vector
latlon <- c(lat, lon)
#getting places result
res <- google_places(location = latlon,
place_type = "subway_station", radius = 50000,
rankby="distance",
key = "myKey")
#condition handling
if(res$status == 'OK'){
closest <- res$results[1:3, ]
return(closest)} else {
try(return(res$status))
}
}
I was able to fix the issue by wrapping the function in an adverb I'd used with another geocoding function; it retries the call up to 5 times when it fails. Given that this worked, the error was most likely transient rather than systemic.
The adverb I used:
safely <- function(fn, ..., max_attempts = 5) {
function(...) {
this_env <- environment()
for(i in seq_len(max_attempts)) {
ok <- tryCatch({
assign("result", fn(...), envir = this_env)
TRUE
},
error = function(e) {
FALSE
}
)
if(ok) {
return(this_env$result)
}
}
msg <- sprintf(
"%s failed after %d tries; returning NULL.",
deparse(match.call()),
max_attempts
)
warning(msg)
NULL
}
}
Taken from Repeating values in loop until error disappears.

Stopping an R script without getting "Error during wrapup" message

I wrote an R script which writes messages (progress report) to a text file. I modified the error option so that when an error occurs, the error message is also written to that file:
options(error = function() {
cat(geterrmessage(),file = normalizePath("logs/messages.txt"),append = TRUE)
stop()
})
It works, but I get this message in the console/terminal window when an error does occur:
Error during wrapup:
Execution halted
So I'm thinking there's a better way to interrupt the execution of the script... or is there?
I just found this inside R source code:
if (inError) {
/* fail-safe handler for recursive errors */
if(inError == 3) {
/* Can REprintf generate an error? If so we should guard for it */
REprintf(_("Error during wrapup: "));
/* this does NOT try to print the call since that could
cause a cascade of error calls */
Rvsnprintf(errbuf, sizeof(errbuf), format, ap);
REprintf("%s\n", errbuf);
}
stop() causes the error handler to be executed. If the stop() call occurs within the error handler itself, R displays the Error during wrapup: message and protects you from the infinite recursion that would otherwise occur.
Do not call stop() from inside the function you set with options(error = ...).
Use q(save="no", status=1, runLast=FALSE) instead; that should do exactly what the default error handler does for non-interactive use. See ?options for the meaning of the error option and ?stop for details about error handling.
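A minimal sketch of that advice follows. The factory make_error_handler and its quit argument are hypothetical; the quit argument exists only so the handler can be exercised here without ending the session, and its default is the q() call recommended above.

```r
# Hypothetical factory: build a handler that logs the last error, then quits.
make_error_handler <- function(log_file,
                               quit = function() {
                                 q(save = "no", status = 1, runLast = FALSE)
                               }) {
  function() {
    cat(geterrmessage(), file = log_file, append = TRUE)
    quit()  # no stop() here, so no "Error during wrapup"
  }
}

# In a real script: options(error = make_error_handler("logs/messages.txt"))

# Demonstration with a stub in place of q()
log_file <- tempfile(fileext = ".txt")
quit_called <- FALSE
handler <- make_error_handler(log_file, quit = function() quit_called <<- TRUE)
try(stop("disk full"), silent = TRUE)  # try() stores the message for geterrmessage()
handler()
log_text <- paste(readLines(log_file), collapse = " ")
```

As the earlier question notes, calling q() from an interactive session (for example RStudio Server) closes the whole session, so this handler is best installed only in non-interactive runs.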

sink() doesn't work in tryCatch block

I'm trying to close my logger instance in a finally block, as follows:
logger <- file("all.Rout", open="wt")
sink(logger, type="message")
tryCatch({
warning('test')
message("A")
log('a')
message("B")
}, error = function(e) {
}, finally = {
sink(type="message")
close(logger)
})
However, only message("A") is saved to the log and nothing else is. If I do the following, the problem is fixed:
logger <- file("all.Rout", open="wt")
sink(logger, type="message")
tryCatch({
warning('test')
message("A")
log('a')
message("B")
}, error = function(e) {
}, finally = {
})
sink(type="message")
close(logger)
However, I really need the closing to be in the finally block so that I can view the logs if an error was thrown.
How do I fix this?
The problem is that, by default, R does not print warnings as they happen. They are accumulated and printed only after the top-level call returns. By then the finally block has already removed the sink and closed the connection, so the warning never reaches the log. One workaround is to report every warning as it happens rather than waiting until the current call is done, by setting options(warn = 1):
logger <- file("log.txt", open="wt")
sink(logger, type="message")
tryCatch({
ow<-options(warn=1)
warning('test')
message("A")
log('a')
message("B")
},
error = function(e) {
}, finally = {
options(ow)
sink(type="message")
close(logger)
})
Here we change the options() at the beginning of the try block and then reset them in the finally.
The contents of the log file are then
Warning in doTryCatch(return(expr), name, parentenv, handler) : test
A
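Notice that with the other method (closing the sink after tryCatch()) the two lines appear in reverse order, even though the warning comes first. Again, R was waiting until the end of the current call to report the warning, and that was after tryCatch() had finished running. message("B") never appears in either version because log('a') raises an error before it is reached.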
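The same idea can be packaged so the sink, the connection, and the warn option are always restored, even if the code fails. This is only a sketch: the helper name with_message_log is hypothetical, and it additionally writes the error message to the log, which the original code does not do.

```r
# Hypothetical helper: evaluate expr with messages and warnings sent to path.
with_message_log <- function(path, expr) {
  con <- file(path, open = "wt")
  old <- options(warn = 1)        # report each warning as it happens
  sink(con, type = "message")
  on.exit({
    sink(type = "message")        # restore the message stream first
    close(con)
    options(old)
  }, add = TRUE)
  # expr is evaluated lazily, inside tryCatch(), while the sink is active
  tryCatch(expr, error = function(e) {
    message("Error: ", conditionMessage(e))
    invisible(NULL)
  })
}

log_path <- tempfile(fileext = ".log")
with_message_log(log_path, {
  warning("test")
  message("A")
  log("a")        # raises an error, so message("B") is never reached
  message("B")
})
log_lines <- readLines(log_path)
```

Using on.exit() instead of a finally block keeps the cleanup in one place and runs it however the function exits.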
