How to clear history in RSelenium's internal web browser?

Web browsers let you clear the browsing history.
How can I clear the history of the Firefox browser driven by RSelenium, using R commands?

See the related question "Possible to disable firefox and chrome default caching?".
With RSelenium and Firefox you can disable caching by passing a profile with the following preferences:
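library(RSelenium)
# Build a Firefox profile that turns off the disk, memory and offline caches
fprof <- makeFirefoxProfile(
  list(
    "browser.cache.disk.enable" = FALSE,
    "browser.cache.memory.enable" = FALSE,
    "browser.cache.offline.enable" = FALSE,
    "network.http.use-cache" = FALSE
  )
)
# Pass the profile as an extra capability and start the session
remDr <- remoteDriver(extraCapabilities = fprof)
remDr$open()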
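The profile above disables caching rather than clearing history. For state that builds up during a session, cookies can at least be removed with the `deleteAllCookies()` method of the `remoteDriver` object. The following is a minimal sketch, not a history-clearing command; the helper name `clear_cookies` is hypothetical and it assumes `remDr` is an open RSelenium session.

```r
# Hypothetical helper: remove all cookies visible to the current page.
# Assumes `remDr` is an already opened RSelenium remoteDriver object.
clear_cookies <- function(remDr) {
  remDr$deleteAllCookies()  # remoteDriver method; affects cookies only
  invisible(remDr)          # return the driver so calls can be chained
}
```

Usage would be `clear_cookies(remDr)` between scraping steps.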

Related

How to use Google Chrome Beta with RSelenium

According to this answer https://stackoverflow.com/a/72793082/2554330, there are some bugs in the latest version of chromedriver that have been fixed in the version that works with Google Chrome Beta, so I'd like to try the beta.
This answer https://stackoverflow.com/a/65975577/2554330 shows how to run Google Chrome Beta from Javascript. I'd like to do the same from RSelenium, but I can't spot an equivalent of chrome_options.binary_location.
How do I specify the Chrome location when using RSelenium?
Try the following code:
cPath <- "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe"
ecap <- list(chromeOptions = list("binary" = cPath))
remDr <- remoteDriver(browserName = "chrome", extraCapabilities = ecap)
remDr$open()
Note that the startServer() function is now defunct.
This code was obtained from a comment by the author of RSelenium.
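Recent chromedriver releases document the vendor-prefixed capability name `goog:chromeOptions`, while the answer above uses the older `chromeOptions` key. The sketch below builds the capability list under either key so both can be tried. The helper `chrome_binary_caps` and the Beta install path are assumptions for illustration, and which key is accepted depends on the chromedriver version in use.

```r
# Hypothetical helper: build an extraCapabilities list that points
# chromedriver at a specific Chrome binary under the given key.
chrome_binary_caps <- function(path, key = "goog:chromeOptions") {
  stats::setNames(list(list(binary = path)), key)
}

# Assumed install location of Chrome Beta; adjust for your machine.
beta_path <- "C:/Program Files/Google/Chrome Beta/Application/chrome.exe"

ecap_new <- chrome_binary_caps(beta_path)                   # vendor-prefixed key
ecap_old <- chrome_binary_caps(beta_path, "chromeOptions")  # legacy key

# Only attempt a session interactively, with RSelenium installed and a
# Selenium server already listening on the default host and port.
if (interactive() && requireNamespace("RSelenium", quietly = TRUE)) {
  remDr <- RSelenium::remoteDriver(browserName = "chrome",
                                   extraCapabilities = ecap_new)
  remDr$open()
}
```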
Try this. The chromedriver is wherever you place it, and the beta browser is wherever it gets installed. It's been a long time since I used R with Selenium, so the slashes may be the wrong way round.
library(RSelenium)
# startServer() is defunct in current RSelenium releases; this call only
# works with old versions of the package
RSelenium::startServer(
  args = c("-Dwebdriver.chrome.driver=C:\\Users\\me\\Documents\\chromedriver.exe"),
  log = FALSE,
  invisible = FALSE
)
remDr <- remoteDriver(
  browserName = "chrome",
  extraCapabilities = list("chrome.binary" = "C:\\Program Files\\ChromeBeta\\chrome.exe")
)
remDr$open()
head(remDr$sessionInfo)

Is it possible to use firefox portable in RSelenium

I would like to use firefox portable with the RSelenium package.
I tried the following command, adapted from examples for other browsers, without success.
library(RSelenium)
ecap <- list(
  "moz:firefox.binary.path" = "mozilla-firefox-portable-98-0-1.exe",
  "moz:firefoxOptions" = list(args = list("--headless"))
)
rD <- rsDriver(
  browser = "firefox",
  extraCapabilities = ecap,
  port = 4580L
)
Thanks for any help!
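One thing that may be worth trying: geckodriver documents a `binary` entry inside `moz:firefoxOptions` for pointing at a custom Firefox executable. The sketch below builds such a capability list. The helper `firefox_caps` and the path are hypothetical, and it assumes the path must point at the real `firefox.exe` inside the unpacked portable folder rather than at the installer or launcher.

```r
# Hypothetical helper: build extraCapabilities for a custom Firefox binary.
firefox_caps <- function(binary, headless = TRUE) {
  opts <- list(binary = binary)                   # path to firefox.exe
  if (headless) opts$args <- list("--headless")   # run without a window
  list("moz:firefoxOptions" = opts)
}

# Assumed location of the unpacked portable Firefox; adjust as needed.
ff_path <- "C:/FirefoxPortable/App/Firefox64/firefox.exe"
ecap <- firefox_caps(ff_path)

# Only start a driver interactively and when RSelenium is installed.
if (interactive() && requireNamespace("RSelenium", quietly = TRUE)) {
  rD <- RSelenium::rsDriver(browser = "firefox",
                            extraCapabilities = ecap,
                            port = 4580L)
}
```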

RSelenium - Firefox SEC_ERROR_UNKNOWN_ISSUER

I'm trying to open this page in Firefox using RSelenium, and it throws the error SEC_ERROR_UNKNOWN_ISSUER.
library(RSelenium)
rD <- rsDriver(
  verbose = TRUE,
  port = 3490L,
  browser = "firefox",
  geckover = "latest",
  check = TRUE
)
remote_driver <- rD[["client"]]
remote_driver$maxWindowSize()
remote_driver$setTimeout(type = "implicit", milliseconds = 100000)
remote_driver$setTimeout(type = "page load", milliseconds = 100000)
remote_driver$navigate("https://www.farmaciasahumada.cl")
I added an exception for the server certificate error in Firefox, but I still can't get into the site.
I have seen that this problem can be bypassed in Python or Java, but I have not found a solution for R.
I have no problems with other browsers.
I hope someone can help me.
I can load the page in Firefox with the following:
library(RSelenium)
driver <- rsDriver(browser=c("firefox"))
remDr <- driver$client
remDr$navigate("https://www.farmaciasahumada.cl")
If that fails for you, try Chrome, and update your package and web browser.
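Another option that may help is the standard WebDriver capability `acceptInsecureCerts`, which asks the browser to accept untrusted certificates. The sketch below passes it through `extraCapabilities`; treat it as an assumption that it resolves this particular error, since that depends on the geckodriver and Firefox versions in use.

```r
# W3C WebDriver capability: accept untrusted or self-signed certificates.
insecure_caps <- list(acceptInsecureCerts = TRUE)

# Only start a driver interactively and when RSelenium is installed.
if (interactive() && requireNamespace("RSelenium", quietly = TRUE)) {
  rD <- RSelenium::rsDriver(browser = "firefox",
                            port = 3490L,
                            extraCapabilities = insecure_caps)
  remote_driver <- rD[["client"]]
  remote_driver$navigate("https://www.farmaciasahumada.cl")
}
```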

How can I stop RSelenium browser from closing automatically when left idle?

I use RSelenium to run a scraping loop that occasionally hits an error and stops.
The problem is that when this happens and I don't check on the RSelenium session for a while (roughly half an hour), the session closes automatically, which removes the session logs that I want to check.
How can I prevent the RSelenium session (and the Firefox browser opened from RSelenium) from closing when it is left idle for an extended period?
This is how I start the scraping: I open the Firefox browser as shown below, go to the URL that I want, and then start scraping.
library(RSelenium)
# Running with the browser open ------------------------------------------------
rD <- RSelenium::rsDriver(port = 4454L, browser = "firefox")
remDr <- rD$client
remDr$open()
P.S. Just to clarify, it's okay that the scraping stops once in a while -- that's how I can check for loopholes that I am missing. What I need is a way for me to stop the RSelenium session from closing when left idle. Thank you in advance for any help you can give!
I found a similar issue: https://github.com/ropensci/RSelenium/issues/241
chrome_prefs <- list(
  # chrome prefs
  "profile.default_content_settings.popups" = 0L,
  "download.prompt_for_download" = FALSE
)
chrome_args <- c(
  # chrome command arguments
  "--headless",
  "--window-size=1200,1800",
  "-sessionTimeout 57868143"
)
eCaps_notimeout <- list(
  chromeOptions = list(
    prefs = chrome_prefs,
    args = chrome_args
  )
)
# Pass the capabilities defined above (the original referenced an undefined name)
remDr <- remoteDriver(
  browserName = "chrome",
  extraCapabilities = eCaps_notimeout
)
For further reference, see "Is there a way to prevent selenium automatically terminating idle sessions?"

RSelenium - How to disable images in Firefox profile

How can image downloading be disabled when using Firefox in RSelenium? I want to see if doing so makes a scraping script faster.
I've read the RSelenium package manual, including the sections on getFirefoxProfile and makeFirefoxProfile.
I've found this link that shows how to do it with chromedriver.
I can disable images for a Firefox instance that I open manually in Windows 10, but RSelenium does not appear to use that same profile.
Previously you only needed to set the appropriate preference (in this case permissions.default.image). However, there is now an issue with Firefox resetting this value; see:
https://github.com/seleniumhq/selenium/issues/2171
A workaround is given in:
https://github.com/gempesaw/Selenium-Remote-Driver/issues/248
Implementing this in RSelenium:
library(RSelenium)
fprof <- makeFirefoxProfile(
  list(
    permissions.default.image = 2L,
    browser.migration.version = 9999L
  )
)
rD <- rsDriver(browser = "firefox", extraCapabilities = fprof)
remDr <- rD$client
remDr$navigate("http://www.google.com/ncr")
remDr$screenshot(display = TRUE)
# clean up
rm(rD)
gc()

Resources