download.file fails in RStudio

file<-tempfile(fileext=".csv")
download.file(url="ftp://pubftp.spp.org/Markets/DA/LMP_By_SETTLEMENT_LOC/2014/03/28/DA-LMP-SL-201403280100.csv",destfile=file,mode="wb")
This works in R proper (the standard R GUI; I'm not sure what to call it). In RStudio, however, it hangs for several minutes and then I get the following:
trying URL 'ftp://pubftp.spp.org/Markets/RTBM/LMP_By_SETTLEMENT_LOC/2014/03/25/11/RTBM-LMP-SL-201403251015.csv'
using Synchronous WinInet calls
Error in download.file(url = "ftp://pubftp.spp.org/Markets/RTBM/LMP_By_SETTLEMENT_LOC/2014/03/25/11/RTBM-LMP-SL-201403251015.csv", :
cannot open URL 'ftp://pubftp.spp.org/Markets/RTBM/LMP_By_SETTLEMENT_LOC/2014/03/25/11/RTBM-LMP-SL-201403251015.csv'
In addition: Warning message:
In download.file(url = "ftp://pubftp.spp.org/Markets/RTBM/LMP_By_SETTLEMENT_LOC/2014/03/25/11/RTBM-LMP-SL-201403251015.csv", :
InternetOpenUrl failed: ''
It is a small file, so it shouldn't time out, but I really don't know what the problem is.

I found two solutions.
1) Go to Tools > Global Options > Packages, and unselect "Use Internet Explorer library/proxy for HTTP".
2) This worked for another user, but not for me: setInternet2(use=FALSE)
(https://support.rstudio.com/hc/communities/public/questions/200656136-Issue-With-RStudio-and-GEOquery)
Note: when in RGUI I entered setInternet2(use=TRUE), then tried the download, it gave the "using Synchronous WinInet calls" messages and hung; but then Windows Firewall popped up, and when I allowed RGUI through it, the download began.

I have the same problem in RStudio when I source a file from a URL:
> source("http://www.statmethods.net/RiA/wmc.txt")
Error in file(filename, "r", encoding = encoding) :
cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
InternetOpenUrl failed: 'An error occurred in the secure channel support'
Then I tried:
> options(download.file.method="libcurl", url.method="libcurl")
> source("http://www.statmethods.net/RiA/wmc.txt")
> wmc
It worked
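The workaround above can be made slightly more defensive. This is only a minimal sketch, assuming an R build where capabilities("libcurl") is reported; use_libcurl_if_available is a hypothetical helper name, not something that ships with R.

```r
# Hypothetical helper: switch URL handling to libcurl only if this R build supports it.
use_libcurl_if_available <- function() {
  has_libcurl <- isTRUE(unname(capabilities("libcurl")))
  if (has_libcurl) {
    # options() returns the previous values, so they can be restored later.
    old <- options(download.file.method = "libcurl", url.method = "libcurl")
    return(invisible(old))
  }
  warning("this R build reports no libcurl support; options left unchanged")
  invisible(NULL)
}

old <- use_libcurl_if_available()
# source("http://www.statmethods.net/RiA/wmc.txt")  # the call from the post above
# if (!is.null(old)) options(old)                   # restore the previous settings
```

Keeping the old option values around makes it easy to undo the change if libcurl turns out not to help for a particular URL.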

I had a similar issue using R's download.file in a for loop in RStudio. It would download the first several links, and then I'd get "InternetOpenUrl failed: 'The operation timed out'" for all subsequent downloads. I tried the suggestion by sssheridan to unselect the Internet Explorer option in the RStudio global options, which did not work. I also tried setInternet2(use=T), but this is no longer available in R.
What worked for me was to remove the cache by including cacheOK = F as an argument in download.file. I think this is because I had previously hit the links that were timing out.
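A loop along those lines might look like the sketch below. It assumes each URL ends in a usable file name; download_all and its dest_dir argument are hypothetical names chosen for illustration, and only download.file() and its cacheOK argument come from the post.

```r
# Hypothetical helper: download each URL to its own file, asking for fresh
# copies, and keep going if one download fails.
download_all <- function(urls, dest_dir = tempdir()) {
  vapply(urls, function(u) {
    dest <- file.path(dest_dir, basename(u))
    status <- tryCatch(
      # cacheOK = FALSE tells download.file() not to accept a cached copy.
      download.file(u, destfile = dest, mode = "wb",
                    cacheOK = FALSE, quiet = TRUE),
      error = function(e) {
        message("failed: ", u, " (", conditionMessage(e), ")")
        NA_integer_
      }
    )
    # download.file() returns 0 on success.
    identical(as.integer(status), 0L)
  }, logical(1))
}

# Usage (not run here): ok <- download_all(my_urls); my_urls[!ok] lists the failures.
```

Wrapping each call in tryCatch() means a single timeout does not abort the whole loop, and the returned logical vector shows which links still need a retry.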

Go to Tools > Global Options > Packages and uncheck "Use secure download method for HTTP".

Related

RSelenium + RTools42 issue cannot open the connection: no such file or directory

I'm trying to use RSelenium and Firefox to interact with a website, but I keep hitting "No such file or directory" for the zip file, even though other temp files are being added to the same folder.
None of the other discussions of this issue have fixed it. I tried reinstalling RTools42 and adding it to the PATH environment variable.
Here's the code I'm using that works for two colleagues on the same network:
firefoxProfile <- makeFirefoxProfile(list(browser.helperApps.neverAsk.saveToDisk = "application/comma-seperated-values, text/csv, text/plain, application/zip, application/octet-stream"))
Here's the error message only I get:
Error in file(tmpfile, "rb") : cannot open the connection
In addition: Warning message:
In file(tmpfile, "rb") :
cannot open file 'C:\Users\user\AppData\Local\Temp\file\filename.zip': No such file or directory
I'm baffled.
I reinstalled a few times, then reinstalled again and installed Rcpp as mentioned in this video: https://www.youtube.com/watch?v=hBTObNFFkhs
It is still a bit mysterious to me why installing Rcpp fixed it, given that my colleagues didn't have to do this, but it did solve the problem for me.

Web Scraping with R: error related to reset of the connection with server

I have a problem obtaining data from a specific website. When I try to download the raw page with R 3.6.3 using the following example code:
website_raw <- readLines("https://tge.pl/gaz-rdn?dateShow=09-02-2022")
The result I got is:
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
InternetOpenUrl failed: 'the connection with the server was reset'
readLines() used to work fine on this website, but it started failing about a week ago. I've also tried download.file(): at first the result was the same (error, connection reset), but after setting options(download.file.method = "libcurl") the file starts to download and then suddenly stops with:
trying URL 'https://tge.pl/gaz-rdn?dateShow=09-02-2022'
Error in download.file("https://tge.pl/gaz-rdn?dateShow=09-02-2022", "test.html") :
cannot open URL 'https://tge.pl/gaz-rdn?dateShow=09-02-2022'
In addition: Warning message:
In download.file("https://tge.pl/gaz-rdn?dateShow=09-02-2022", "test.html") :
URL 'https://tge.pl/gaz-rdn?dateShow=09-02-2022': status was 'Failure when receiving data from the peer'
I've also tried disabling "Use Internet Explorer library/proxy for HTTP" in the RStudio Global Options, but it didn't help. Another approach I tested was read_html() from the rvest package, which gives the following error:
Error in open.connection(x, "rb") : Send failure: Connection was reset
Downloading data from other websites works fine though, with all considered methods.
Is there any way I can download data from this website with R?
Any kind of help or suggestion will be highly appreciated

R: source() cannot open the connection, status was 'Couldn't resolve host name'

I am trying to source a script from my github repo containing functions I use often.
So I have that line at the beginning of my script:
source("https://github.com/jogaudard/common/blob/master/fun-fluxes.R")
In RStudio it returns (same with R in the terminal)
Error in source("https://github.com/jogaudard/common/blob/master/fun-fluxes.R") :
https://github.com/jogaudard/common/blob/master/fun-fluxes.R:6:1: unexpected '<'
5:
6: <
^
In an online R editor I got
Error in file(filename, "r", encoding = encoding) :
cannot open the connection to 'https://github.com/jogaudard/common/blob/master/fun-fluxes.R'
Calls: source -> file
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
URL 'https://github.com/jogaudard/common/blob/master/fun-fluxes.R': status was 'Couldn't resolve host name'
Execution halted
I tried with other scripts and get the same error with anything that is online. source() works fine with local scripts (both in the same directory and elsewhere).
This started after I installed a package that interfered with curl, so I thought that might be the issue. But when I tried from another computer I got the same error.
Both computers have R version 3.6.3 on Ubuntu 18.04.5 LTS.
I am honestly lost and cannot find any similar issues anywhere.
You are sourcing a file with html markup and R considers that markup (correctly) to be syntax errors. Use source("https://raw.githubusercontent.com/jogaudard/common/master/fun-fluxes.R") to source the "raw" file.
Your issue with the online R services is a red herring. If you need someone to look into that, you should provide a URL for the service.
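The URL rewrite described in this answer can be sketched as a small function. This is an illustration only: github_raw_url is a hypothetical helper, and it assumes the common https://github.com/<user>/<repo>/blob/<branch>/<path> layout.

```r
# Hypothetical helper: rewrite a github.com "blob" page URL to the raw-file URL.
# URLs that do not match the expected pattern are returned unchanged.
github_raw_url <- function(url) {
  sub(
    "^https://github\\.com/([^/]+)/([^/]+)/blob/(.+)$",
    "https://raw.githubusercontent.com/\\1/\\2/\\3",
    url
  )
}

raw <- github_raw_url("https://github.com/jogaudard/common/blob/master/fun-fluxes.R")
raw
# [1] "https://raw.githubusercontent.com/jogaudard/common/master/fun-fluxes.R"

# source(raw)  # would then read R code rather than the HTML page
```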

Connecting to Spark with Sparklyr gives Permission Denied Error

After installing the sparklyr package I followed the instructions here ( http://spark.rstudio.com/ ) to connect to Spark, but I got the error below. Am I doing something wrong? Please help.
sc = spark_connect( master = 'local' )
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\USER\AppData\Local\Temp\RtmpYb3dq4\fileff47b3411ae_spark.log':
Permission denied
But I am able to find the file at the stated location, and on opening it I found it to be empty.
First of all, did you install sparklyr from github devtools::install_github("rstudio/sparklyr") or CRAN?
There were some issues some time ago with Windows installations.
The issue you have seems to be related to TEMP and TMP folder level permission on Windows or to file creation permission. Every time you start sc <- spark_connect(), it tries to create a folder and file to write the log files.
Make sure you have write access to these locations.
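One way to check this from within R is sketched below. It assumes the log file is created under the session temp directory, as the path in the error message suggests; can_write_to is a hypothetical helper, not part of sparklyr.

```r
# Hypothetical helper: report whether R can create and write a file in a directory.
can_write_to <- function(dir) {
  if (!dir.exists(dir)) return(FALSE)
  probe <- tempfile(pattern = "write_probe_", tmpdir = dir)
  ok <- tryCatch({
    writeLines("test", probe)
    TRUE
  }, error = function(e) FALSE, warning = function(w) FALSE)
  # Clean up the probe file if it was created.
  if (file.exists(probe)) unlink(probe)
  ok
}

Sys.getenv(c("TEMP", "TMP"))  # the Windows variables mentioned above
can_write_to(tempdir())       # the session temp directory R is actually using
```

If the check returns FALSE, pointing TEMP and TMP at a writable folder before starting R is one thing to try.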
I could observe the same error message with versions 2.4.3 and 2.4.4 in different cases:
1) When trying to connect to a non-"local" master, using spark_connect(master="spark://192.168.0.12:7077", ..), if the master is not started or not responding at the specified master URL.
2) When setting a specific incomplete configuration; in my case, trying to set dynamicAllocation to true without the other required dynamicAllocation settings:
conf <- spark_config()
conf$spark.dynamicAllocation.enabled <- "true"

Error in RStudio with read.table(url...)

I tried to run the code below in RStudio, and it always returns a connection failure error. It works in RGui. Any idea why this is and how to fix it? Is it a problem with my RStudio (I'm running Windows 8)?
survey <- read.table(url("ftp://ftp.ics.uci.edu/pub/machine-learning-databases/adult/adult.data"),
header=FALSE, sep=",", quote="", stringsAsFactors=FALSE)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
InternetOpenUrl failed: 'The FTP session was terminated'
Alternative code I have tried:
survey <- read.table(url('ftp://ftp.ics.uci.edu/pub/machine-learning-databases/adult/adult.data'),
header=FALSE, sep=",", quote="", stringsAsFactors=FALSE)
survey <- read.table('ftp://ftp.ics.uci.edu/pub/machine-learning-databases/adult/adult.data',
header=FALSE, sep=",", quote="", stringsAsFactors=FALSE)
Thanks!
RStudio and RGUI may be using different defaults for setInternet2(), so try running setInternet2(use=FALSE) at the beginning of your session.
Here is an explanation from RStudio (2013):
Also we call setInternet2(use=TRUE) on startup which takes proxy
settings from Internet Explorer to make proxies work in the majority
of cases. We also have an open bug to allow users to turn this off on
startup, but for now you'll have to call setInternet2(use=FALSE)
manually for each session. For more information on the command, call
the following from the console: ?setInternet2()
https://support.rstudio.com/hc/communities/public/questions/200657716-Rstudio-with-aproxy
For Windows, you may have the option to turn this off within RStudio, as someone with Windows mentioned the solution below (2014):
One solution was to go to Tools > Global Options > Packages, and
unselect "Use Internet Explorer library/proxy for HTTP".
https://support.rstudio.com/hc/communities/public/questions/201327633-Bug-report-RStudio-causes-download-file-to-fail-with-FTP-downloads
