My download code stopped working because "extraCapabilities" is no longer being passed properly.
This is what used to work:
require(RSelenium)
require(XML)
require(data.table)
source(file.path(find.package("RSelenium"), "examples/serverUtils/checkForServer.r"))
source(file.path(find.package("RSelenium"), "examples/serverUtils/startServer.r"))
checkForServer();
server<-startServer()
referencedirectory <- "d://temp"
fprof <- makeFirefoxProfile(list(browser.download.dir = referencedirectory, browser.download.folderList = 2L, browser.download.manager.showWhenStarting = FALSE,
browser.helperApps.neverAsk.saveToDisk="text/xml",browser.tabs.remote.autostart = FALSE,browser.tabs.remote.autostart.2 = FALSE,browser.tabs.remote.desktopbehavior = FALSE))
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox",extraCapabilities = fprof)
remDr$open()
Now it throws an error:
Selenium message:Profile has been set on both the capabilities and these options, but they're different. Unable to determine which one you want to use.
Error: Summary: UnknownError
Detail: An unknown server-side error occurred while processing the command.
class: java.lang.IllegalStateException
Further Details: run errorDetails method
I have tried an alternative:
rD <- rsDriver(port = 4444L, browser = "firefox", version = "latest", geckover = "0.15.0", iedrver = NULL, phantomver = "2.1.1",
verbose = TRUE, check = TRUE, extraCapabilities = fprof)
That produces the same error in addition to complaining (these complaints do not result in an error by themselves):
Selenium message:wrong number of arguments
If extraCapabilities are removed, the above code executes, but if you then try:
rD <- rsDriver(port = 4446L, browser = "firefox", version = "latest", geckover = "0.15.0", iedrver = NULL, phantomver = "2.1.1",
verbose = TRUE, check = TRUE)
remDr <- rD[["client"]]
fprof <- makeFirefoxProfile(list(browser.download.dir = "D:/temp"))
remDr <- remoteDriver(extraCapabilities = fprof)
remDr$open()
You get the same error after the last line. rsDriver opens a browser, but that browser does not have any of the desired properties. If you close the browser (without closing the server) before trying to assign remDr and open it, you will still get the same error.
I have tried version 13, 14, and 15 of the driver and the Server 3.1.0, with the same result.
I have found the line in Java that is throwing the error, but I cannot figure out how to pass a different Firefox profile than the one that gets automatically generated behind the scenes. I have tried various versions of "Profile"/"requiredProfile"/"FirefoxProfile", etc., but none of them is recognized as a valid input. I also see some discussion of how it may be done in Java, but not in R.
The code used to work for me until about 36 hours ago, and I have been trying to find a way out of it ever since. I am now at a complete loss.
UPDATE: the setup is very sensitive to the combination of versions. The brand new Selenium server version (3.3.1) works with Gecko 0.15.0 and Firefox 52. Some other combinations may work, but most do not.
Also, you need to be careful when setting the folder location string. In most contexts within R, the forward slash, /, is OS-neutral, so I use it most of the time on both UNIX and Windows. However, when setting browser.download.dir on Windows, one apparently has to use the (escaped) backslash, \\. Otherwise the directory assignment will appear to work, but in practice it has no effect.
Finally, the recommended approach with rsDriver works AND the approach with the defunct functions (checkForServer() and startServer()) also works again. Lesson to be learned: do not be unlucky like me in choosing the moment to update your Selenium code.
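A minimal sketch of that path conversion, assuming you keep forward-slash paths elsewhere in your script. The helper name to_windows_path is made up for illustration, and the commented-out call only shows where the result would go.

```r
# Hypothetical helper: turn an R-style path into the backslash form
# that browser.download.dir seems to need on Windows.
to_windows_path <- function(path) {
  # fixed = TRUE makes "\\" a single literal backslash, not a regex escape
  gsub("/", "\\", path, fixed = TRUE)
}

download_dir <- to_windows_path("D:/temp")
print(download_dir)

# Assumed usage with RSelenium (not run here):
# fprof <- makeFirefoxProfile(list(
#   browser.download.dir = download_dir,
#   browser.download.folderList = 2L
# ))
```

Printing the value shows the escaped form ("D:\\temp"); the string itself contains a single backslash.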
It appears to be an issue with geckodriver(0.15.0)/selenium(3.3.0). I used the following:
library(RSelenium)
referencedirectory <- "c://temp"
fprof <- makeFirefoxProfile(list(browser.download.dir = referencedirectory, browser.download.folderList = 2L, browser.download.manager.showWhenStarting = FALSE,
browser.helperApps.neverAsk.saveToDisk="text/xml",browser.tabs.remote.autostart = FALSE,browser.tabs.remote.autostart.2 = FALSE,browser.tabs.remote.desktopbehavior = FALSE))
rD <- rsDriver(port = 4444L, browser = "firefox", version = "3.1.0", geckover = "0.14.0", iedrver = NULL, phantomver = "2.1.1",
verbose = TRUE, check = TRUE, extraCapabilities = fprof)
which appeared to function correctly. As noted in the documentation, I would advise using a Docker image to run the Selenium Server if possible, which prevents issues with incompatible browser/driver versions.
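A rough sketch of what the Docker route could look like from the R side. It assumes a container was started beforehand with something like docker run -d -p 4445:4444 selenium/standalone-firefox (host port 4445 and the image name are assumptions, adjust to your setup); connect_to_container is just an illustrative name.

```r
# Hypothetical helper: connect to a Selenium server running in a container.
# Assumes the container's port 4444 is published on host port 4445.
connect_to_container <- function(host = "localhost", port = 4445L,
                                 browser = "firefox") {
  remDr <- RSelenium::remoteDriver(
    remoteServerAddr = host,
    port = port,
    browserName = browser
  )
  remDr$open()  # starts a browser session inside the container
  remDr
}

# Assumed usage (needs a running container, so not run here):
# remDr <- connect_to_container()
# remDr$navigate("https://www.r-project.org")
```

Because the browser and driver ship together in the image, their versions should already match.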
Update:
There is an updated version of Selenium Server which should now address this issue:
rD <- rsDriver(port = 4444L, browser = "firefox", version = "3.3.1", geckover = "0.15.0",
verbose = TRUE, check = TRUE, extraCapabilities = fprof)
It seems that you don't really need makeFirefoxProfile.
The code is indeed very simple:
remDr=rsDriver(browser=browserName,extraCapabilities=list(acceptInsecureCerts=TRUE,acceptUntrustedCerts=TRUE))
I intend to download and clean databases using RSelenium. I am able to open the link; however, I am having trouble downloading and opening the database. I believe the xpath is right, but when I try to open it I receive the following error:
Selenium message:no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="ESTBAN_AGENCIA"]"}
My code is the following:
dir <- getwd()
file_path <- paste0(dir,"\\DataBase") %>% str_replace_all("/", "\\\\\\")
eCaps <- list(
chromeOptions =
list(prefs = list('download.default_directory' = file_path))
)
system("taskkill /im java.exe /f", intern=FALSE, ignore.stdout=FALSE)
#Creating server
rD <- rsDriver(browser = "chrome",
chromever = "101.0.4951.15",
port = 4812L,
extraCapabilities = eCaps)
#Creating the driver to use R
remDr <- remoteDriver(
remoteServerAddr = "localhost",
browserName = "chrome",
port = 4812L)
#Open server
remDr$open()
#Navigating to the ESTBAN webpage
remDr$navigate("https://www.bcb.gov.br/acessoinformacao/legado?url=https:%2F%2Fwww4.bcb.gov.br%2Ffis%2Fcosif%2Festban.asp")
##Download
remDr$findElement(using ="xpath", '//*[@id="ESTBAN_AGENCIA"]/option[1]')
The element you are trying to access is inside an iframe, and you need to switch to that iframe first in order to access the element.
remDr$navigate("https://www.bcb.gov.br/acessoinformacao/legado?url=https:%2F%2Fwww4.bcb.gov.br%2Ffis%2Fcosif%2Festban.asp")
#Switch to Iframe
webElem <- remDr$findElement("css", "iframe#framelegado")
remDr$switchToFrame(webElem)
##Download
remDr$findElement(using ="xpath", '//*[@id="ESTBAN_AGENCIA"]/option[1]')
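If you need to get back to the main page afterwards, the switch can be wrapped so it is always undone. This is only a sketch: with_iframe is a made-up helper name, and it relies on switchToFrame(NULL) returning to the top-level document. The work is done inside the callback because element handles found inside the frame are not expected to stay usable once you leave it.

```r
# Hypothetical helper: run `action` inside an iframe, then switch back
# to the top-level document even if `action` fails.
with_iframe <- function(remDr, frame_css, action) {
  frame <- remDr$findElement(using = "css selector", value = frame_css)
  remDr$switchToFrame(frame)
  on.exit(remDr$switchToFrame(NULL), add = TRUE)  # back to default content
  action(remDr)
}

# Assumed usage with the page from the question (not run here):
# first_option <- with_iframe(remDr, "iframe#framelegado", function(d) {
#   el <- d$findElement(using = "xpath",
#                       value = '//*[@id="ESTBAN_AGENCIA"]/option[1]')
#   el$getElementText()[[1]]
# })
```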
First of all, I use a remote R interpreter.
When I unselect "Disable .Rprofile execution on console start" in the DataSpell settings and save, the IDE throws a weird error when I try to start an R console, as shown below:
CompositeException (3 nested):
------------------------------
[1]: Cannot cast org.jetbrains.plugins.notebooks.jupyter.ui.remote.JupyterRemoteTreeModelServiceVfsListener to org.jetbrains.plugins.notebooks.jupyter.remote.vfs.JupyterVFileEvent$Listener
[2]: Cannot cast org.jetbrains.plugins.notebooks.jupyter.remote.modules.JupyterRemoteEphemeralModuleManagerVfsListener to org.jetbrains.plugins.notebooks.jupyter.remote.vfs.JupyterVFileEvent$Listener
[3]: Cannot cast org.jetbrains.plugins.notebooks.jupyter.ui.remote.JupyterRemoteVfsListener to org.jetbrains.plugins.notebooks.jupyter.remote.vfs.JupyterVFileEvent$Listener
------------------------------
I tried to give an empty .Rprofile file. Nothing changed. It throws the same error. Anyway, here is my .Rprofile file:
options(java.parameters = "-Xmx4G")
options(download.file.method = "wget")
project_base <- getwd()
print(paste("getwd:", getwd()))
Sys.setenv(R_PACKRAT_CACHE_DIR = "~/.rcache")
#### -- Packrat Autoloader (version 0.7.0) -- ####
source("packrat/init.R")
#### -- End Packrat Autoloader -- ####
# These ensure that the project uses its private library
p <- .libPaths()[[1]]
Sys.setenv(R_LIBS_SITE = p)
Sys.setenv(R_LIBS_USER = p)
Sys.setenv(R_PACKRAT_DEFAULT_LIBPATHS = p)
packrat::set_opts(use.cache = TRUE)
print(paste("whoami:", system("whoami", intern = TRUE)))
print(paste("libpaths:", .libPaths()))
print(paste0("cache_path: ", packrat:::cacheLibDir()))
restore_packrat <- function(restart = FALSE) {
packrat::restore(
overwrite.dirty = TRUE, prompt = F, restart =
restart, dry.run = F
)
}
snapshot_packrat <- function() {
packrat::snapshot(
ignore.stale = TRUE, snapshot.sources = FALSE,
infer.dependencies = FALSE
)
}
I appreciate the help of anyone who faced this issue and solved it.
PS: I also issued a bug report to the developers. If you have the same problem, please upvote the issue and this question.
https://youtrack.jetbrains.com/issue/R-1393
I am trying to use RSelenium with Firefox through a local proxy (Tor) on a Linux machine.
I had no problem installing Tor following this tutorial, and the command line wget -qO - https://api.ipify.org; echo does give me a new IP.
Now I want to use Firefox with RSelenium going through Tor on localhost, port 9050:
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 127.0.0.1:9050 *:*
LISTEN 0 128 127.0.0.1:9051 *:*
I use a standalone Selenium server (selenium-server-standalone-2.53.0.jar), which works fine with regular RSelenium. Here is an example that gets the IP displayed on ipchicken:
library(RSelenium)
remDr <- remoteDriver(
remoteServerAddr = "localhost",
port = 4444L,
browserName = "firefox"
)
remDr$open()
remDr$navigate("https://ipchicken.com/")
ip <- remDr$findElements(using = "css", value ='b')
print(ip[[1]]$getElementText())
And I do get my IP. Now I want to see it happen with Tor, so I tried adding the proxy option when connecting the remoteDriver to Firefox:
eCaps <- list("moz:firefoxOptions" = list(
args = c('--proxy-server=localhost:9050'
)))
remDr <- remoteDriver(
remoteServerAddr = "localhost",
port = 4444L,
browserName = "firefox",
extraCapabilities = eCaps
)
I tried '--proxy-server=localhost:9050', '--proxy-server=http://localhost:9050', '--proxy-server=socks5://localhost:9050' and '--proxy-server=127.0.0.1:9050'. None of them produced an error, but each still gave me my initial IP, so it is not working. The standalone server log says it does execute with the options, for example:
22:59:10.288 INFO - Executing: [new session: Capabilities [{nativeEvents=true, browserName=firefox, javascriptEnabled=true, moz:firefoxOptions={args=--proxy-server= 127.0.0.1:9050}, version=, platform=ANY}]])
22:59:10.297 INFO - Creating a new session for Capabilities [{nativeEvents=true, browserName=firefox, javascriptEnabled=true, moz:firefoxOptions={args=--proxy-server= 127.0.0.1:9050}, version=, platform=ANY}]
22:59:30.323 INFO - Done: [new session: Capabilities [{nativeEvents=true, browserName=firefox, javascriptEnabled=true, moz:firefoxOptions={args=--proxy-server= 127.0.0.1:9050}, version=, platform=ANY}]]
What am I doing wrong?
Edit
After user1207289's answer, and after realizing that you could directly create a firefox profile in RSelenium, I tried:
eCaps <- makeFirefoxProfile(list(network.proxy.type = 1,
network.proxy.socks = "127.0.0.1",
network.proxy.socks_port = 9050,
network.proxy.socks_version = 5))
remDr <- remoteDriver(
remoteServerAddr = "localhost",
port = 4444L,
browserName = "firefox",
extraCapabilities = eCaps
)
I used integers for network.proxy.socks_port and network.proxy.type because of this question, but I also tried character values, without any success. I tried with and without network.proxy.socks_version = 5, and it did not work (I am still getting my normal IP). I also tried network.proxy.socks_port = 9150, but that did not work either.
I also tried
eCaps <- list("moz:firefoxOptions" = list(
args = c('network.proxy.socks=127.0.0.1:9050' ,'network.proxy.type=1' )
)
)
but that did not work either.
I could connect to Tor using WebDriver and Firefox with the code below. Just make sure Tor is installed and running. I used it on a Mac (Catalina). Check the port settings for your OS in case they are different.
It is in C#, but you can do much the same in any binding:
FirefoxOptions firefoxOptions = new FirefoxOptions();
firefoxOptions.SetPreference("network.proxy.type", 1);
firefoxOptions.SetPreference("network.proxy.socks", "127.0.0.1");
firefoxOptions.SetPreference("network.proxy.socks_port", 9150);
FirefoxDriverService service = FirefoxDriverService.CreateDefaultService();
IWebDriver driver = new FirefoxDriver(service, firefoxOptions);
When this opens a Firefox browser instance, just visit https://check.torproject.org/ in that same instance to check whether it is connected to Tor. The page will confirm the connection and also show your new IP.
After a lot of searching, I found a way: RSelenium has the getFirefoxProfile function, which lets you load an existing Firefox profile.
So I first configured the profile directly in Firefox following the same tutorial and copied it to my R folder. Using
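For R users, here is a rough sketch of the same three preferences expressed as a capability list. It assumes a geckodriver-based setup that honours the prefs key of moz:firefoxOptions; an old Selenium 2.x server will most likely ignore it. tor_proxy_caps is a hypothetical helper name, and 9050 is the port from the question (the Tor Browser bundle listens on 9150 instead).

```r
# Hypothetical helper: build extraCapabilities that point Firefox at a
# local SOCKS proxy through moz:firefoxOptions prefs.
tor_proxy_caps <- function(host = "127.0.0.1", port = 9050L) {
  list(
    "moz:firefoxOptions" = list(
      prefs = list(
        "network.proxy.type" = 1L,  # 1 = manual proxy configuration
        "network.proxy.socks" = host,
        "network.proxy.socks_port" = as.integer(port)
      )
    )
  )
}

eCaps <- tor_proxy_caps()
str(eCaps)

# Assumed usage with RSelenium (not run here):
# remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444L,
#                       browserName = "firefox", extraCapabilities = eCaps)
# remDr$open()
# remDr$navigate("https://check.torproject.org/")
```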
fprof <- getFirefoxProfile("myprofile.default")
remDr <- remoteDriver(
remoteServerAddr = "localhost",
port = 4444L,
browserName = "firefox",
extraCapabilities = fprof
)
worked as expected.
I'm trying to download an Excel workbook xls using R's download.file function (Windows 10, R version 3.4.4 (2018-03-15)).
When I download the file manually (using Internet Explorer or Chrome) then the file downloads and I can then open it in Excel without any problems.
When I use download.file in R, the file downloads, but it is smaller than the correct file: it is an HTML file with a note saying that my browser is not supported. I tried different modes with no luck.
My code:
download.file(
url = "https://www.atsenergo.ru/nreport?fid=696C3DB7A3F6019EE053AC103C8C8733",
destfile = "C:/MyExcel.xls",
mode = "wb",
method = "auto"
)
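One quick way to confirm what actually arrived is to look at the first bytes of the file. The sketch below assumes a legacy .xls workbook, which is an OLE2 compound file and starts with a fixed 8-byte signature; looks_like_xls is a made-up helper name.

```r
# OLE2 compound file signature found at the start of legacy .xls files
ole2_magic <- as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))

# Hypothetical helper: TRUE if the file starts with the OLE2 signature,
# FALSE for anything else (for example an HTML error page).
looks_like_xls <- function(path) {
  header <- readBin(path, what = "raw", n = length(ole2_magic))
  identical(header, ole2_magic)
}

# Assumed usage after download.file() (not run here):
# if (!looks_like_xls("C:/MyExcel.xls")) warning("Got something other than an .xls file")
```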
I solved this problem with the RSelenium library. The ATS site rejects any direct request to download the file (it returns an .html file with a "JavaScript required" message), so in this case only the Selenium approach works. My code is below (urlList is a data frame with the file download links):
rD <- rsDriver(port = 4444L,
browser = "chrome",
check = FALSE,
geckover = NULL,
iedrver = NULL,
phantomver = NULL)
remDr <- rD$client
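for (i in seq_len(nrow(urlList))) {
  tryCatch({
    row <- urlList[i, ]
    remDr$navigate(row$url)
    # Find the download link by its visible text and click it
    webElem <- remDr$findElement(using = "link text", value = row$FileName)
    webElem$clickElement()
  },
  # Log the failure; the loop then carries on with the next URL.
  # logerror() is presumably from the logging package; atsCode,
  # dateFileName and loggerName are not defined in this snippet.
  error = function(e)
    logerror(paste(
      substr(conditionMessage(e), 1, 50),
      atsCode,
      dateFileName,
      sep = "\t"
    ), logger = loggerName))
}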
remDr$close()
# stop the selenium server
rD[["server"]]$stop()
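One thing to watch with this approach is that clickElement() returns before the download has finished, so closing the browser straight away can cut it off. A minimal polling sketch, assuming Chrome (which writes in-progress downloads with a .crdownload suffix); wait_for_download and its arguments are illustrative names, not part of RSelenium.

```r
# Hypothetical helper: wait until a new, fully written file appears in `dir`.
# `before` is the directory listing taken before the click.
wait_for_download <- function(dir, before, timeout = 30, poll = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    new_files <- setdiff(list.files(dir), before)
    partial <- grepl("\\.crdownload$", new_files)  # Chrome's temp suffix
    if (length(new_files) > 0 && !any(partial)) return(new_files)
    if (Sys.time() >= deadline) return(character(0))  # gave up
    Sys.sleep(poll)
  }
}

# Assumed usage around the click (not run here):
# before <- list.files(download_dir)
# webElem$clickElement()
# finished <- wait_for_download(download_dir, before)
```

An empty result means nothing complete showed up before the timeout.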
I am trying to run an RSelenium instance to download some PDF files without having to click through the dialog boxes (and without the files opening in pdfjs).
But even though I set my configuration, the Firefox instance still loads the default profile.
RSelenium version: 1.73
Firefox version: 56.0 (32-bit)
Windows: 7 Ultimate
Create profile and start server:
library(RSelenium)
library(rvest)
library(XML)
library(stringi)
cprof <- makeFirefoxProfile(list(
pdfjs.disabled = TRUE,
plugin.scan.plid.all = FALSE,
plugin.scan.Acrobat = "99.0",
browser.helperApps.neverAsk.saveToDisk = 'application/pdf',
browser.download.dir = "C:\\temp")
)
remDr <- rsDriver(port = 4477L, browser = "firefox", check = FALSE, extraCapabilities = cprof)
remDr <- remDr[["client"]]
After Firefox launches I check the configs, the settings have remained in their default state: