I am facing an issue while running the rsDriver() function to open up the chrome browser.
Code:
library("RSelenium")
library("wdman")
mybrowser <- rsDriver(browser=c("chrome"), chromever="80.0.3987.16",port = 443L)
remDr <- mybrowser$client
remDr$navigate("https://google.co.in/")
Sys.sleep(2)
When I run this code on my machine while connected to my home network, it works as expected. But when I run it from my office network, the call rsDriver(browser=c("chrome"), chromever="80.0.3987.16",port = 443L) gives the error below, and I am stuck at this point.
checking Selenium Server versions:
BEGIN: PREDOWNLOAD
Error in open.connection(con, "rb") :
Timeout was reached: [www.googleapis.com] Operation timed out after 10000 milliseconds with 0 out of
0 bytes received
I tried connecting through the company's proxy with the code below, but still no luck. I also tried the port numbers 4444, 4445 and 4567, with the same error.
cprof <- list(chromeOptions = list(args = list("--proxy-server= gproxy.go.company.org:8080")))
mybrowser <- rsDriver(browser=c("chrome"), chromever="80.0.3987.16", port = 443L,extraCapabilities = cprof)
It would be very helpful if someone could help me understand the issue and suggest a solution. Am I missing something in the code? Any help would be highly appreciated.
Please let me know if any additional information is required.
To me this looks like a proxy issue. Are you able to retrieve an arbitrary website? E.g. using httr::GET("www.google.com"). If not, this would also point to a problem with the proxy.
Have you tried to configure it in .Renviron? Like so:
file.edit('~/.Renviron')
Add this line to the file and restart RStudio:
http_proxy=USER:PASSWORD@PROXY:PORT
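The same setting can also be tried for the current session only, without editing .Renviron. This is a minimal sketch: the host and port are hypothetical placeholders, and setting https_proxy as well is an assumption based on the failing request going to www.googleapis.com, which is normally reached over HTTPS.

```r
# Hypothetical proxy details; replace with your company's values
proxy_host <- "proxy.example.org"
proxy_port <- 8080L

# Build the proxy URL once and reuse it for both variables
proxy_url <- sprintf("http://%s:%d", proxy_host, proxy_port)

# Applies only to the running R session; nothing is written to disk
Sys.setenv(http_proxy = proxy_url, https_proxy = proxy_url)

# Confirm what R now sees
Sys.getenv(c("http_proxy", "https_proxy"))
```

If the download then succeeds, the same values can be made permanent in .Renviron.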
Another option: setting proxy with httr/curl:
library(httr)
set_config(use_proxy(url = "proxy.com",
                     port = 8080,
                     username = "foo",
                     password = "bar"))
I got this working by switching networks: I first connected to my local network and, once the browser had opened, switched to the company's network.
Related
I am trying to use RSelenium for web scraping. I am following the basics tutorial as explained on CRAN. The recommended approach is to install Docker (see the tutorial as well as this Stack Overflow answer). If I understand correctly, this is not an option for me, as I am on Windows 7, for which Docker seems not to be available (see the Docker forum).
Thus, I am trying option 2, using rsDriver. I run
RSelenium::rsDriver()
remDr <- remoteDriver(
remoteServerAddr = "localhost",
port = 4445L,
browserName = "firefox"
)
remDr$open()
and get the error
> remDr$open()
[1] "Connecting to remote server"
Error in checkError(res) :
Undefined error in httr call. httr output: Failed to connect to localhost port 4445: Connection refused
This question has been asked and answered before here, here, here and here, though these are about the same error when using Docker and their solutions did not work for me.
Is there any way to get this running with rsDriver? Is there any option for me as a Windows 7 user?
With RSelenium version 1.7.7 this is a workaround:
library(RSelenium)
remDr <- rsDriver(
port = 4445L,
browser = "firefox"
)
This command combines the server setup and the driver initialization.
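Note that rsDriver() returns a list holding both the server and the browser client, so the object to drive the browser with is its client element. A minimal sketch of a full session, assuming Firefox and a Java runtime are installed; the function name run_firefox_session and the example URL are illustrative, not part of RSelenium:

```r
library(RSelenium)

# Hypothetical helper: open Firefox, visit a page, return its title, clean up
run_firefox_session <- function(url, port = 4445L) {
  # rsDriver() starts a Selenium server and opens a browser in one call
  rD <- rsDriver(port = port, browser = "firefox")

  # Make sure the browser and the server shut down even if navigation fails
  on.exit({
    rD$client$close()
    rD$server$stop()
  }, add = TRUE)

  remDr <- rD$client
  remDr$navigate(url)
  remDr$getTitle()[[1]]
}

# Example call (starts a real browser, so it is left commented out):
# run_firefox_session("https://www.r-project.org/")
```

Stopping the server at the end frees the port, which avoids "port already in use" errors on the next run.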
My issue (on Mac) was solved by updating Java:
https://www.oracle.com/java/technologies/downloads/#jdk19-mac
Worked after this.
I am attempting to geocode addresses in R using the geocode() function in the ggmap package. I have done this on my personal computer relatively easily and want to do the same on my work computer. Generally, I am supposed to register with Google, load the ggmap package, read in my API key, and then I can use the geocode() function. But I get errors. See below:
# Library package
library(ggmap)
# Set key file
gmAPI <- "key_file.txt"
# Read-in Google API key
gmAPIKey <- readLines(gmAPI)[1]
# Register key
register_google(key = gmAPIKey)
# Geocode Waco, TX
geocode("waco, texas", output = "latlona")
Instead of generating geocoded output, I receive:
Source : https://maps.googleapis.com/maps/api/geocode/json?address=waco,+texas&key=xxx.txt
Error in curl::curl_fetch_memory(url, handle = handle) :
Failed to connect to maps.googleapis.com port 443: Timed out
or sometimes:
Source : https://maps.googleapis.com/maps/api/geocode/json?address=waco,+texas&key=xxx.txt
Error in curl::curl_fetch_memory(url, handle = handle) :
Failed to connect to url port ###: Connection refused
Note: I replaced the actual url/port posted in the error message with url port ### as I imagine this is specific to my computer.
I have a feeling this has to do with my work network. Similar questions have set some configuration using the httr package, but those solutions have not worked for me. It's possible I am entering the wrong information. Any help?
I had a similar issue at work, which I managed to solve by adding one line of code using a function from the httr package.
I did:
library(httr)
set_config(use_proxy(url="http://proxy.mycompanyname.com", port=****))
Just insert the proxy through which a computer in your company's network connects to the internet and the port that needs to be opened. Commonly used web proxy server ports are 3128, 8080, 6588 and 80.
Hope this helps!
After trying each of the solutions here, with none of them working, I figured out the problem through trial and error. Ultimately, my proxy and port were wrong. I had followed the instructions in the link to find my proxy via IE -> Tools -> Internet Options -> Connections tab -> LAN Settings. However, the proxy shown there was somewhat different from the one my computer was actually using. A more reliable method was to get it programmatically with the ie_get_proxy_for_url() function from the curl package. When I used the output of ie_get_proxy_for_url(), @Lennyy's solution worked (and is thus credited). See code:
library(curl)
library(ggmap)
library(httr)
# Get proxy and port
proxyPort <- ie_get_proxy_for_url()
# Split the string to feed the proxy and port arguments
proxyURL <- strsplit(proxyPort, ":")[[1]][1]
portUsed <- as.integer(strsplit(proxyPort, ":")[[1]][2])
# Set configuration
set_config(use_proxy(url=proxyURL, port = portUsed), override = TRUE)
# Geocode Waco, TX
geocode("waco, texas", output = "latlona")
# Output commented below:
# A tibble: 1 x 3
# lon lat address
# <dbl> <dbl> <chr>
# 1 -97.1 31.5 waco, tx, usa
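The string splitting above assumes the proxy always comes back as a plain "host:port" string. A slightly more defensive sketch in base R follows; parse_proxy is a hypothetical helper, and the NULL and scheme-prefixed cases are assumptions about what a proxy lookup might return, not documented behaviour. It does not handle lists of several proxies.

```r
# Hypothetical helper: split "host:port" (optionally with a scheme) into parts
parse_proxy <- function(proxy_string) {
  # No proxy reported: signal that to the caller with NULL
  if (is.null(proxy_string) || !nzchar(proxy_string)) {
    return(NULL)
  }
  # Drop an optional scheme such as "http://"
  stripped <- sub("^[a-zA-Z][a-zA-Z0-9+.-]*://", "", proxy_string)
  parts <- strsplit(stripped, ":", fixed = TRUE)[[1]]
  list(
    host = parts[1],
    port = if (length(parts) >= 2) as.integer(parts[2]) else NA_integer_
  )
}

# Illustrative input; on Windows this could be the result of
# curl::ie_get_proxy_for_url()
proxy <- parse_proxy("proxy.example.org:8080")
```

The host and port elements can then be passed to use_proxy() as in the code above, skipping the proxy configuration entirely when the helper returns NULL.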
I am trying to use RSelenium with Docker to crawl a website. However, I have some issues trying to get RSelenium/Docker to work.
Specifically, I installed Docker on my computer, which appears to be running fine (I see the image of the whale below when I open it).
In R, I then run the following code with no problems and see the expected output.
shell('docker run -d -p 4445:4444 selenium/standalone-chrome')
shell('docker ps')
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d7de815ce644 selenium/standalone-chrome "/opt/bin/entry_poin…" 13 minutes ago Up 13 minutes 0.0.0.0:4445->4444/tcp zen_mclean
But when I then run the following code, I always receive the following error message:
remDr <- RSelenium::remoteDriver(remoteServerAddr = "localhost",
port = 4444,
browserName = "chrome")
remDr$open()
[1] "Connecting to remote server"
Error in checkError(res) :
Undefined error in httr call. httr output: Failed to connect to localhost port 4444: Connection refused
I am not sure what is going on here (I'm new to scraping). Can anybody help me figure out what to do here?
If it helps, I am running Windows 10.
In Docker, you've bound your host's port 4445 to port 4444 of the Selenium container.
This means that if you run R on your host, you need to specify port = 4445.
Does that solve it?
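To illustrate, the PORTS column of docker ps reads host port first, then container port. A small sketch that pulls the host port out of that string and uses it for the connection; host_port_from_mapping is a hypothetical helper, and the mapping string is copied from the output in the question:

```r
# Hypothetical helper: extract host and container ports from a
# `docker ps` PORTS entry such as "0.0.0.0:4445->4444/tcp"
host_port_from_mapping <- function(mapping) {
  m <- regmatches(mapping, regexec(":([0-9]+)->([0-9]+)/tcp", mapping))[[1]]
  if (length(m) == 0) {
    stop("no port mapping found in: ", mapping)
  }
  c(host = as.integer(m[2]), container = as.integer(m[3]))
}

ports <- host_port_from_mapping("0.0.0.0:4445->4444/tcp")

# R runs on the host, so it must use the host side of the mapping (4445)
if (requireNamespace("RSelenium", quietly = TRUE)) {
  remDr <- RSelenium::remoteDriver(remoteServerAddr = "localhost",
                                   port = ports[["host"]],
                                   browserName = "chrome")
  # remDr$open()  # requires the container to be running
}
```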
I managed to solve the problem by uninstalling Docker Toolbox and VMBox, which I was using, and installing the latest version of Docker from their website instead.
I am trying to connect to shinyapps.io from RStudio using the setAccountInfo function in the rsconnect package:
rsconnect::setAccountInfo(name='MYACCOUNTNAME',
token='TOKEN',
secret='<SECRET>')
But I am getting the following error:
Error in function (type, msg, asError = TRUE) :
Failed to connect to api.shinyapps.io port 443: Timed out
I am on my office PC, and one of the more likely causes is the company firewall, so my questions are:
Is there a way to workaround this problem and connect anyway?
If not, what would be the instruction I would have to give the IT department to be capable of connecting?
The following options should help you see what's happening:
library(rsconnect)
options(rsconnect.http.trace = TRUE, rsconnect.error.trace = TRUE, rsconnect.http.verbose = TRUE)
rsconnect::setAccountInfo(name='MYACCOUNTNAME',
token='TOKEN',
secret='<SECRET>')
By running this you should see what IP addresses rsconnect is trying to use. Try adding this to a whitelist for your firewall.
If this doesn't work, it may be a proxy issue. The question "Issue setting up my shinyapps.io + AUTHORIZE ACCOUNT + time out port 443" should help you set up a proxy in RStudio.
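Before talking to IT, it can help to check whether a direct connection to the host and port from the error message is possible at all. This is a rough sketch using base R sockets; can_connect is a hypothetical helper, and a FALSE result only shows that a direct connection fails, which is expected on networks where all traffic has to go through a proxy.

```r
# Hypothetical helper: TRUE if a TCP connection to host:port can be
# opened within `timeout` seconds, FALSE otherwise
can_connect <- function(host, port, timeout = 5) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

# Example call (needs network access, so it is left commented out):
# can_connect("api.shinyapps.io", 443L)
```

If this returns FALSE at the office and TRUE at home, the request to IT would be either to allow outbound traffic to api.shinyapps.io on port 443 or to provide the proxy address to use.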
I'm using the RSelenium package to connect to firefox, but I wish to do it via a socks proxy.
In Python, this is achievable using the webdriver package and setting the preferences of the FirefoxProfile, e.g.
profile=webdriver.FirefoxProfile()
profile.set_preference('network.proxy.socks', x.x.x.x)
profile.set_preference('network.proxy.socks_port', ****)
browser=webdriver.Firefox(profile)
However, I can't find how to set the proxy to be a SOCKS proxy, or how to set the SOCKS port, in RSelenium. I've tried setting it using the RCurl options, as follows
options(RCurlOptions = list(proxy = "socks5h://x.x.x.x:****"))
but this gives me the following error message
Error in function (type, msg, asError = TRUE) :
Can't complete SOCKS5 connection to 0.0.0.0:0. (1)
Has anyone successfully connected to Firefox using a socks proxy using R code?
I am using version 1.3.5 of RSelenium and version 28.0 of Firefox.
Not tested but something like the following should work:
library(RSelenium)
# Firefox preferences for a manually configured SOCKS proxy
fprof <- makeFirefoxProfile(list(
  "network.proxy.socks" = "squid.home-server",
  "network.proxy.socks_port" = 3128L,
  "network.proxy.type" = 1L
))
remDr <- remoteDriver(extraCapabilities = fprof)
remDr$open()