RCurl::getURL exceeds maximum number of clients

I am trying to recursively list the files hosted on the FTP server of the MODIS Global Evapotranspiration Project (MOD16).
## required package
library(RCurl)
## ftp server
ch_ftp <- "ftp://ftp.ntsg.umt.edu/pub/MODIS/NTSG_Products/MOD16/MOD16A2.105_MERRAGMAO/"
## list and reformat available subfolders
ch_fls <- getURL(ch_ftp, verbose = TRUE, dirlistonly = TRUE)
ls_fls <- strsplit(ch_fls, "\n")
ch_fls <- unlist(ls_fls)
## list files in current folder
for (i in ch_fls) {
  ch_hdf <- paste0(ch_ftp, i)
  getURL(ch_hdf, verbose = TRUE, dirlistonly = TRUE)
}
After some iterations, RCurl::getURL throws the following error message.
< 530 Sorry, the maximum number of clients (5) from your host are already connected.
* Access denied: 530
* Closing connection 16
Error in function (type, msg, asError = TRUE) : Access denied: 530
Apparently, RCurl::getURL opens a new connection to the FTP server on each iteration without closing it quickly enough. A few minutes later the server is accessible again, but after restarting the script the same error appears within the first few iterations. Is there a way to manually close the connections established by RCurl::getURL immediately after retrieving a list of files?

I was dealing with the same issue.
Using Sys.sleep(2) fixed it for me.
## list files in current folder
for (i in ch_fls) {
  ch_hdf <- paste0(ch_ftp, i)
  getURL(ch_hdf, verbose = TRUE, dirlistonly = TRUE)
  Sys.sleep(2)
}
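Another option (not from the original answer, just a sketch) is to reuse a single curl handle for every request. libcurl then keeps one FTP connection alive and reuses it rather than opening a fresh connection per getURL() call, so you stay well under the server's five-client limit; dropping the handle and forcing a garbage collection should release the connection once you are done.
library(RCurl)

ch_ftp <- "ftp://ftp.ntsg.umt.edu/pub/MODIS/NTSG_Products/MOD16/MOD16A2.105_MERRAGMAO/"

## one shared handle -> one (reused) connection to the server
curl_handle <- getCurlHandle()

## list and reformat available subfolders
ch_fls <- getURL(ch_ftp, curl = curl_handle, dirlistonly = TRUE)
ch_fls <- unlist(strsplit(ch_fls, "\n"))

## list files in each subfolder, reusing the same handle
ls_hdf <- lapply(ch_fls, function(i) {
  getURL(paste0(ch_ftp, i), curl = curl_handle, dirlistonly = TRUE)
})

## drop the handle (and with it the FTP connection) when finished
rm(curl_handle)
invisible(gc())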

Related

R, httr, and an unstoppable error message

In summary, when running httr::POST against a plumber API inside R.utils::withTimeout, the POST throws an error message that can't be suppressed. The error message is:
Error in .Call(R_curl_fetch_memory, enc2utf8(url), handle, nonblocking) : reached elapsed time limit
Things I've tried:
suppressMessages()/suppressWarnings()
tryCatch() with an empty error handler
Using sinks
I can't think of any other way of stopping this error. The code carries on running, but it results in users of the API getting spammed with error messages when it checks for an available port to call. Any ideas welcome.
Reproducible example (note this uses the RStudio job package and a separate plumber file, but I've tried without the job package and the issue persists, so it's not connected to that):
plumber.R:
#* quick ping
#* @post /ping
function() {
  list(msg = "you got here!")
}

#* slow 10s call
#* @post /slowfxn
function() {
  Sys.sleep(10)
  1
}
Code which calls the API and produces the errors:
#run the plumber file in a separate job:
r <- plumber::plumb("errortest/plumber.R")
job::job({r$run(swagger = FALSE, port = 1111)})
#this function will throw the error if no response comes within 3s:
call_with_timeout <- function() {
  R.utils::withTimeout(
    rawToChar(httr::POST("http://127.0.0.1:1111/ping")$content),
    timeout = 3, onTimeout = "silent")
}
#now call the fxn (again in a job) which takes 10 s to run:
job::job({httr::POST(url = "http://127.0.0.1:1111/slowfxn")})
#wait a second for that job to be initiated:
Sys.sleep(1)
#while that's running, try and call the quick fxn, with a 3 second timeout:
#try wrapping it in a tryCatch to suppress the error:
tryCatch(
  {
    call_with_timeout()
  },
  error = function(x) {},
  TimeoutException = function(x) {},
  warning = function(x) {}
)
suppressMessages(suppressWarnings(call_with_timeout()))
#not even a sink can stop it!
sink("delete.txt")
call_with_timeout()
sink(NULL)
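One workaround worth trying (not part of the original thread, just a sketch) is to let curl enforce the timeout itself via httr::timeout() instead of interrupting the C-level call with R.utils::withTimeout(). A curl-level timeout surfaces as an ordinary R error, which tryCatch() can swallow cleanly:
library(httr)

## hypothetical replacement for call_with_timeout():
## curl aborts the request after 3 seconds and raises a normal, catchable error
call_with_timeout <- function() {
  tryCatch(
    rawToChar(POST("http://127.0.0.1:1111/ping", timeout(3))$content),
    error = function(e) NULL  # swallow the timeout error and return NULL
  )
}

## returns the /ping response if the API answers in time, otherwise NULL
call_with_timeout()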

Download files from FTP folder using Loop

I am trying to download all the files inside an FTP folder.
temp <- tempfile()
destination <- "D:/test"
url <- "ftp://XX.XX.net/"
userpwd <- "USER:Password"
filenames <- getURL(url, userpwd = userpwd,ftp.use.epsv = FALSE,dirlistonly = TRUE)
filenames <- strsplit(filenames, "\r*\n")[[1]]
When I print filenames I get all the file names inside the FTP folder, so the output is correct up to this point:
[1] "2018-08-28-00.gz" "2018-08-28-01.gz"
[3] "2018-08-28-02.gz" "2018-08-28-03.gz"
[5] "2018-08-28-04.gz" "2018-08-28-05.gz"
[7] "2018-08-28-08.gz" "2018-08-28-09.gz"
[9] "2018-08-28-10.gz" "2018-08-28-11.gz"
[11] "2018-08-28-12.gz" "2018-08-28-13.gz"
[13] "2018-08-28-14.gz" "2018-08-28-15.gz"
[15] "2018-08-28-16.gz" "2018-08-28-17.gz"
[17] "2018-08-28-18.gz" "2018-08-28-23.gz"
for (i in filenames) {
  download.file(paste0(url, i), paste0(destination, i), mode = "w")
}
I got this error
trying URL 'ftp://XXX.net/2018-08-28-00.gz'
Error in download.file(paste0(url, i), paste0(destination, i), mode = "w") :
cannot open URL 'ftp://XXX.net/2018-08-28-00.gz'
In addition: Warning message:
In download.file(paste0(url, i), paste0(destination, i), mode = "w") :
InternetOpenUrl failed: 'The login request was denied'
I modified the code to
for (i in filenames) {
  #download.file(paste0(url,i), paste0(destination,i), mode="w")
  download.file(getURL(paste(url, filenames[i], sep = ""), userpwd = "USER:PASSWORD"),
                paste0(destination, i), mode = "w")
}
After that, I got this error
Error in function (type, msg, asError = TRUE) : RETR response: 550
Without a minimal, complete, and verifiable example it is a challenge to directly replicate your problem. Assuming the file names don't include the URL, you'll need to combine them to access the files.
download.file() requires the URL of the file to be read, the path of the output file, and a flag indicating whether you want a binary download or not.
For example, I have data from Alberto Barradas' Pokémon Stats kaggle.com data set stored on my GitHub site. To download some of the files to the test subdirectory of my R Working Directory, I can use the following code:
filenames <- c("gen01.csv","gen02.csv","gen03.csv")
fileLocation <- "https://raw.githubusercontent.com/lgreski/pokemonData/master/"
# use ./ for subdirectory of current directory, end with / to work with paste0()
destination <- "./test/"
# note that these are character files, so use mode="w"
for (i in filenames){
  download.file(paste0(fileLocation, i),
                paste0(destination, i),
                mode = "w")
}
The paste0() function concatenates text without spaces, which allows the code to generate a fully qualified path name for the URL of each source file, as well as for the subdirectory where the destination file will be stored.
To illustrate what's happening with paste0() in the for() loop, we can use message() to print to the R console.
> # illustrate what paste0() does
> for (i in filenames){
+ message(paste("Source is: ",paste0(fileLocation,i)))
+ message(paste("Destination is:",paste0(destination,i)))
+ }
Source is: https://raw.githubusercontent.com/lgreski/pokemonData/master/gen01.csv
Destination is: ./test/gen01.csv
Source is: https://raw.githubusercontent.com/lgreski/pokemonData/master/gen02.csv
Destination is: ./test/gen02.csv
Source is: https://raw.githubusercontent.com/lgreski/pokemonData/master/gen03.csv
Destination is: ./test/gen03.csv
>
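Not part of the answer above, but worth noting for the original FTP problem: the "login request was denied" message suggests download.file() never received the credentials that getURL() was given. A common workaround (a sketch, assuming the server accepts credentials embedded in the URL) is to put USER:Password into the FTP URL itself, end the destination path with a slash, and download the gzipped files in binary mode:
url         <- "ftp://USER:Password@XX.XX.net/"
destination <- "D:/test/"   # trailing slash so paste0() builds a valid path

for (i in filenames) {
  ## .gz files are binary, so use mode = "wb"
  download.file(paste0(url, i), paste0(destination, i), mode = "wb")
}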

Suppress but capture warnings in R

I am building a function to connect to a specific password-protected ODBC data source that will be used by many members of a team, possibly in multiple environments. In the event that the connection is rejected, I would like to display the warning messages but mask the password they contain. If I use suppressWarnings(), nothing gets captured as far as I can tell, and if I don't, the message is printed to standard output with the password in it. Here's the function so far:
connectToData <- function(uid, pswd, dsn='myDSN') {
  # Function to connect to myDSN data
  #
  # Args:
  #   uid: The user's ID for connecting to the database
  #   pswd: The user's password for connecting to the database.
  #   dsn: The DSN for the (already existing) ODBC connection to the 5G
  #        data. It must be set up on an individual Windows user's machine,
  #        and they could use any name for it. The default is 'myDSN'
  #
  # Returns:
  #   The 'RODBC' class object returned by the RODBC::odbcConnect() function.
  #
  # TODO: 1) See if you can specify the connection using odbcDriverConnect()
  #          so as to not rely on user's ODBC connections
  #       2) Capture warnings from odbcConnect() and print them while
  #          disguising password using gsub, as I've attempted to do below.
  library('RODBC')
  db.conn <- odbcConnect(dsn,
                         uid = uid,
                         pwd = pswd)
  if (class(db.conn) != 'RODBC') { # Error handling for connections that don't make it
    print(gsub(pswd, '******', warnings())) # This doesn't work like I want it to
    stop("ODBC connection could not be opened. See warnings()")
  } else {
    return(db.conn)
  }
}
When I run it with the right username/password, I get the right result, but when I run it with a bad password, I get this:
> db.conn <- connectToData(uid='myID', pswd='badpassword', dsn='myDSN')
[1] "RODBC::odbcDriverConnect(\"DSN=myDSN;UID=myID;PWD=******\")"
[2] "RODBC::odbcDriverConnect(\"DSN=myDSN;UID=myID;PWD=******\")"
Error in connectToData(uid = "myID", pswd = "badpassword", dsn = "myDSN") :
ODBC connection could not be opened. See warnings()
In addition: Warning messages:
1: In RODBC::odbcDriverConnect("DSN=myDSN;UID=myID;PWD=badpassword") :
[RODBC] ERROR: state 28000, code 1017, message [Oracle][ODBC][Ora]ORA-01017: invalid username/password; logon denied
2: In RODBC::odbcDriverConnect("DSN=myDSN;UID=myID;PWD=badpassword") :
ODBC connection failed
The print(gsub(...)) appears to work on the most recent warnings from before the function was invoked, and it also only prints the function call that produced the warning, not the text of the warning.
What I would like to do is capture everything after "In addition: Warning messages:" so that I can use gsub() on it, but avoid printing it before the gsub() gets a chance to work on it. I think I need to use withCallingHandlers() but I've looked through the documentation and examples and I cannot figure it out.
Some extra background: This is an Oracle database that locks users out after three attempts to connect so I want to use stop() in case someone writes code that calls this function multiple times. Different users in my group work in both Windows and Linux (sometimes going back and forth) so any solution needs to be flexible.
Catching error messages
I do not fully understand what you want to accomplish with ODBC, but in terms of masking the error message, you can use tryCatch as @joran suggested:
pswd = 'badpassword'
# Just as a reproducible example, a function which fails and outputs badpassword
failing <- function(){
  badpassword == 1
}
# This would be the error handling part
tryCatch(failing(),
         error = function(e) gsub(pswd, '******', e))
[1] "Error in failing(): object '******' not found\n"
e in this case is the error condition, and you could think of other ways to manipulate what is written to the screen so that it is not as easy to guess the password from what was replaced. Note, for example, that 'object' would have been replaced as well if the password happened to be 'object', and so would matching parts of longer words. At the very least, it would make sense to include word boundaries in the gsub command:
pswd = 'ling'
failing <- function(){
  ling == 1
}
tryCatch(failing(),
         error = function(e) gsub(paste0("\\b", pswd, "\\b"), '******', e))
[1] "Error in failing(): object '******' not found\n"
For other improvements you should look closely at the specific error messages.
Warnings
tryCatch can also manipulate warnings:
pswd = 'ling'
failing <- function(){
  warning("ling")
  ling == 1
}
tryCatch(failing(),
         warning = function(w) gsub(paste0("\\b", pswd, "\\b"), '******', w),
         error = function(e) gsub(paste0("\\b", pswd, "\\b"), '******', e))
[1] "simpleWarning in failing(): ******\n"
This will not show the error, however, because evaluation stops at the first caught warning, so the error is never reached.
withCallingHandlers
If you really want to catch all output from errors and warnings, you do indeed need withCallingHandlers, which works mostly in the same way, except that it does not terminate the rest of the evaluation.
pswd = 'ling'
failing <- function(pswd){
  warning(pswd)
  warning("asd")
  stop(pswd)
}
withCallingHandlers(failing(pswd),
                    warning = function(w) {
                      w <- gsub(paste0("\\b", pswd, "\\b"), '******', w)
                      warning(w)
                      invokeRestart("muffleWarning")  # drop the original, unmasked warning
                    },
                    error = function(e){
                      e <- gsub(paste0("\\b", pswd, "\\b"), '******', e)
                      stop(e)
                    })
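Applied to the original connectToData() function, the same pattern might look like the sketch below. It is an assumption-laden illustration, not the poster's code: it assumes RODBC is installed, the DSN exists, and that odbcConnect() reports failures as warnings, which is why the warning handler does the masking; fixed = TRUE avoids regex surprises if the password contains metacharacters.
library(RODBC)

connectToData <- function(uid, pswd, dsn = 'myDSN') {
  withCallingHandlers(
    {
      db.conn <- odbcConnect(dsn, uid = uid, pwd = pswd)
      if (!inherits(db.conn, 'RODBC')) {
        stop("ODBC connection could not be opened. See warnings()")
      }
      db.conn
    },
    warning = function(w) {
      ## re-issue the warning with the password masked ...
      warning(gsub(pswd, '******', conditionMessage(w), fixed = TRUE), call. = FALSE)
      ## ... and muffle the original, unmasked one
      invokeRestart("muffleWarning")
    }
  )
}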

R: change port to connect with SFTP server

I have a connection with an ftp server with the following code:
url <- "ftp://MyServer"
userpwd <- "MyUser:MyPass"
filenames <- getURL(url, userpwd = userpwd, ftp.use.epsv = FALSE, dirlistonly = TRUE, port = 22)
filen <- "MyFile.csv"
rawdata <- getURL(paste(url, filen, sep = ""), userpwd = userpwd, crlf = TRUE)
The file will be moved to an SFTP server, so I need to change the input. This new SFTP server is accessed via port 22 instead of the standard port 21. At the moment the connection fails with the following error
Error in function (type, msg, asError = TRUE) :
Failed to connect to MyServer port 21: Connection refused
It takes the wrong port, but how do I tell R to choose port 22?
You need to specify the SFTP protocol in the URL, so the line
url <- "ftp://MyServer"
should become
url <- "sftp://MyServer"
getURL will then use the SSH port (22).
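Putting it together, the modified code might look like this (a sketch; it assumes your RCurl/libcurl build was compiled with SFTP/SSH support, which not every binary is, and that the file sits directly under the server root):
library(RCurl)

url     <- "sftp://MyServer/"   # sftp:// implies port 22
userpwd <- "MyUser:MyPass"

filenames <- getURL(url, userpwd = userpwd, dirlistonly = TRUE)

filen   <- "MyFile.csv"
rawdata <- getURL(paste0(url, filen), userpwd = userpwd)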

system open RStudio close connection

I'm attempting to use R to open a .Rproj file used in RStudio. I have succeeded with the code below (stolen from Ananda here). However, the connection to open RStudio called from R is not closed after the file is opened. How can I sever this "connection" after the .Rproj file is opened? (PS this has not been tested on Linux or Mac yet).
## Create dummy .Rproj
x <- c("Version: 1.0", "", "RestoreWorkspace: Default", "SaveWorkspace: Default",
"AlwaysSaveHistory: Default", "", "EnableCodeIndexing: Yes",
"UseSpacesForTab: No", "NumSpacesForTab: 4", "Encoding: UTF-8",
"", "RnwWeave: knitr", "LaTeX: pdfLaTeX")
loc <- file.path(getwd(), "Bar.rproj")
cat(paste(x, collapse = "\n"), file = loc)
## wheresRstudio function to find RStudio location
wheresRstudio <- function() {
  myPaths <- c("rstudio", "~/.cabal/bin/rstudio",
               "~/Library/Haskell/bin/rstudio",
               "C:\\PROGRA~1\\RStudio\\bin\\rstudio.exe",
               "C:\\RStudio\\bin\\rstudio.exe")
  panloc <- Sys.which(myPaths)
  temp <- panloc[panloc != ""]
  if (identical(names(temp), character(0))) {
    ans <- readline("RStudio not installed in one of the typical locations.\nDo you know where RStudio is installed? (y/n) ")
    if (ans == "y") {
      temp <- readline("Enter the (unquoted) path to RStudio: ")
    } else {
      if (ans == "n") {
        stop("RStudio not installed or not found.")
      }
    }
  }
  temp
}
## function to open .Rproj files
open_project <- function(Rproj.loc) {
  action <- paste(wheresRstudio(), Rproj.loc)
  message("Preparing to open project!")
  system(action)
}
## Test it (it works but does not close)
open_project(loc)
It's not clear what you're trying to do exactly. What you've described doesn't really sound to me like a "connection" -- it's a system call.
I think what you're getting at is that after you run open_project(loc) in your above example, you don't get your R prompt back until you close the instance of RStudio that was opened by your function. If that is the case, you should add wait = FALSE to your system call.
You might also need to add an ignore.stderr = TRUE in there to get directly back to the prompt. I got some error about "QSslSocket: cannot resolve SSLv2_server_method" on my Ubuntu system, and after I hit "enter" it took me back to the prompt. ignore.stderr can bypass that (but might also mean that the user doesn't get meaningful errors in the case of serious errors).
In other words, I would change your open_project() function to the following and see if it does what you expect:
open_project <- function(Rproj.loc) {
  action <- paste(wheresRstudio(), Rproj.loc)
  message("Preparing to open project!")
  system(action, wait = FALSE, ignore.stderr = TRUE)
}
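As a side note (not part of the original answer): on Windows, .Rproj files are associated with RStudio, so the project can also be opened through that file association without locating the RStudio binary at all; shell.exec() returns immediately, so the R prompt is not blocked.
## Windows-only sketch: open the project via its file association
if (.Platform$OS.type == "windows") {
  shell.exec(normalizePath(loc))
}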
