Using RSelenium to open a site like this:
require(RSelenium)
RSelenium::startServer()
remDr <- remoteDriver(browserName = "chrome")
remDr$open()
remDr$navigate("http://www.adobe.com/") #the site is just an example
What command should I use to retrieve the value of window.s_adobe in R?
You could try something along the lines of
res <- remDr$executeScript('return window.screenX;')
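The same idea can be wrapped in a small helper that takes the variable name as a script argument. This is a minimal sketch: get_window_value is a hypothetical name, and it assumes the page actually defines the global and that its value can be serialised by the driver (objects containing functions or circular references may not come back intact).

```r
# Hypothetical helper: read a global JavaScript variable from the page.
# `remDr` is an open remoteDriver session; `name` is the variable name.
get_window_value <- function(remDr, name) {
  # Values in `args` are exposed to the script as arguments[0], arguments[1], ...
  remDr$executeScript("return window[arguments[0]];", args = list(name))
}

# With a live session (not run here):
# res <- get_window_value(remDr, "s_adobe")
```

Passing the name through args avoids pasting it into the JavaScript string by hand.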
According to this answer https://stackoverflow.com/a/72793082/2554330, there are some bugs in the latest version of chromedriver that have been fixed in the version that works with Google Chrome Beta, so I'd like to try the beta.
This answer https://stackoverflow.com/a/65975577/2554330 shows how to run Google Chrome Beta from Javascript. I'd like to do the same from RSelenium, but I can't spot an equivalent of chrome_options.binary_location.
How do I specify the Chrome location when using RSelenium?
Try the following code:
cPath <- "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe"
ecap <- list(chromeOptions = list("binary" = cPath))
remDr <- remoteDriver(browserName = "chrome", extraCapabilities = ecap)
remDr$open()
Note that the startServer() function is now defunct.
These were obtained from this comment by the author of RSelenium.
Try this. The chromedriver is wherever you placed it, and the beta browser is wherever it was installed. It's been a long time since I used R with Selenium, so the slashes may be the wrong way round.
require(RSelenium)
RSelenium::startServer(args = c("-Dwebdriver.chrome.driver=C:\\Users\\me\\Documents\\chromedriver.exe")
, log = FALSE, invisible = FALSE)
remDr <- remoteDriver(
browserName = "chrome",
extraCapabilities = list("chrome.binary" = "C:\\Program Files\\ChromeBeta\\chrome.exe")
)
remDr$open()
head(remDr$sessionInfo)
I'm trying to webscrape the job ads from this page: https://con.arbeitsagentur.de/prod/jobboerse/jobsuche-ui/?was=Soziologie%20(grundst%C3%A4ndig)%20(weiterf%C3%BChrend)&wo=&FCT.ANGEBOTSART=ARBEIT&FCT.BEHINDERUNG=AUS&page=1&size=50&aktualitaet=100
However, I'm unable to get the information from the individual job ads. I tried rvest, xml2 and V8, but I'm a beginner in web scraping and can't manage to solve this problem. It seems that the link doesn't contain the information about the individual job ads, so navigating with XPath doesn't work properly.
Does anyone have an idea how to solve this?
Thanks :)
I have been able to extract the job descriptions with the following code:
library(RSelenium)
shell('docker run -d -p 4445:4444 selenium/standalone-firefox')
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "firefox")
remDr$open()
remDr$navigate("https://www.arbeitsagentur.de/jobsuche/suche?angebotsart=1&was=Soziologie%20(grundst%C3%A4ndig)%20(weiterf%C3%BChrend)&id=10000-1189146489-S")
Sys.sleep(10)
list_Button <- remDr$findElements("class name", "ergebnisliste-item")
Sys.sleep(3)
list_Link_Job_Descriptions <- lapply(X = list_Button, FUN = function(x) x$getElementAttribute("href"))
nb_Links <- length(list_Link_Job_Descriptions)
list_Text_Job_Description <- list()
for(i in 1 : nb_Links)
{
print(i)
remDr$navigate(list_Link_Job_Descriptions[[i]][[1]])
Sys.sleep(1)
web_Obj2 <- remDr$findElement("id", "jobdetails-beschreibung")
list_Text_Job_Description[[i]] <- web_Obj2$getElementText()
}
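The fixed Sys.sleep() calls in the code above are guesses about how long the page takes to render. One alternative is to poll until the elements appear. The sketch below is not part of RSelenium: wait_for_elements is a hypothetical helper, and the timeout and interval defaults are arbitrary assumptions.

```r
# Hypothetical helper: poll until at least one matching element exists,
# or stop with an error once `timeout` seconds have passed.
wait_for_elements <- function(remDr, using, value, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    found <- remDr$findElements(using = using, value = value)
    if (length(found) > 0) return(found)
    if (Sys.time() >= deadline) stop("Timed out waiting for: ", value)
    Sys.sleep(interval)
  }
}

# With a live session (not run here):
# list_Button <- wait_for_elements(remDr, "class name", "ergebnisliste-item")
```

This returns as soon as the results list is present instead of always waiting the full ten seconds.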
I am new to RSelenium and have been trying to scrape a web page with the following code:
library(reshape)
library(plyr)
library(RSelenium)
#start RSelenium
checkForServer()
startServer()
remDr <- remoteDriver()
remDr$open()
remDr$navigate(paste0("http://www.metrocuadrado.com/web/apartamentos/venta/c:bogota"))
I want to select the area categories (Área m2:). I have no problem selecting most of them, for example:
remDr$findElement(using = "xpath", paste0("//select[@name = 'arearango']/option[@value = 'Hasta 60']"))$clickElement()
But with the last category:
checkForServer()
startServer()
remDr <- remoteDriver()
remDr$open()
remDr$navigate(paste0("http://www.metrocuadrado.com/web/apartamentos/venta/c:bogota"))
remDr$findElement(using = "xpath", paste0("//select[@name = 'arearango']/option[@value = '1001 o más']"))$clickElement()
I get an error:
Error: Summary: NoSuchElement
Detail: An element could not be located on the page using the given search parameters.
class: org.openqa.selenium.NoSuchElementException
I suppose the problem has to do with the accent, but I have not been able to solve it. How can I select this element?
I solved it. It seems to be an encoding problem: R was not reading the á in "más" the way the page encodes it. After I replaced the á in my script with the character as R actually read it, the selector worked.
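One way to sidestep encoding problems like this is to write the accented character as a Unicode escape, so the script file itself stays plain ASCII. This is a minimal sketch, assuming the option's value attribute really is "1001 o más" as in the question; it has not been tested against the live site.

```r
# "\u00e1" is the Unicode escape for the letter a with an acute accent.
area_label <- "1001 o m\u00e1s"

# Build the XPath from the escaped label.
xpath <- sprintf("//select[@name = 'arearango']/option[@value = '%s']", area_label)

# With a live session (not run here):
# remDr$findElement(using = "xpath", value = xpath)$clickElement()
```

Because the escape is resolved by R rather than by the editor or the file encoding, the string no longer depends on how the source file was saved.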
I am trying to scrape the annual maximum flow data from this National River Flow Archive (UK) website:
http://nrfa.ceh.ac.uk/data/station/info/69032
using RSelenium.
I can't find a way to negotiate the drop down menu. At present I can semi-automate the process using:
library(RSelenium)
library(XML)  # provides htmlParse() and readHTMLTable()
checkForServer()
startServer()
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox", platform = "LINUX")
remDr$open()
i <- "69032"
remDr$navigate(paste0("http://nrfa.ceh.ac.uk/data/station/peakflow/", i))
# read the raw html and parse
doc<-htmlParse(remDr$getPageSource()[[1]])
peak.flows <- as.numeric(readHTMLTable(doc)$tablesorter[, "Flow (m3/s)"])
This is a bit of a hack and involves me having to click a few buttons on the page rather than getting RSelenium to do it. Any suggestions as to how RSelenium can select the "Peak flow data" tab and then the "Maximum Annual (AMAX) data" option from the drop-down menu?
library(RSelenium)
checkForServer()
startServer()
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox", platform = "LINUX")
remDr$open()
i <- "69032"
remDr$navigate(paste0("http://nrfa.ceh.ac.uk/data/station/peakflow/", i))
remDr$findElement(using="css selector",'.selected a')$clickElement()
Sys.sleep(5)
remDr$findElement(using = "css selector", "#selectDataType")$clickElement()
remDr$findElement(using = "css selector", "#selectDataType")$sendKeysToElement(list(key="down_arrow", key="enter"))
Sys.sleep(2)
To find the CSS selector of the element you are interested in, install the SelectorGadget plugin in Chrome. Highlight the element you want RSelenium to click, then copy its CSS selector.
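Sending arrow keys depends on the order of the options in the menu. An alternative is to click the option whose visible text matches. The sketch below is an untested assumption: click_option_by_text is a hypothetical helper, and it assumes the drop-down is a plain select element whose option elements can be clicked directly.

```r
# Hypothetical helper: click the <option> inside a <select> whose visible
# text equals `label` (surrounding whitespace is ignored).
click_option_by_text <- function(remDr, select_css, label) {
  options <- remDr$findElements(using = "css selector",
                                value = paste(select_css, "option"))
  for (opt in options) {
    text <- opt$getElementText()[[1]]
    if (identical(trimws(text), label)) {
      opt$clickElement()
      return(invisible(TRUE))
    }
  }
  stop("No option with text: ", label)
}

# With a live session (not run here):
# click_option_by_text(remDr, "#selectDataType", "Maximum Annual (AMAX) data")
```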
I'm not 100% sure this is possible, but I found good solutions in Ruby and in Python, so I was wondering if something similar might work in R.
Basically, given a URL, I want to render that URL, take a screenshot of the rendering as a .png, and save the screenshot to a specified folder. I'd like to do all of this on a headless Linux server.
Is my best solution here going to be running system calls to a tool like CutyCapt, or does there exist an R-based toolset that will help me solve this problem?
You can take screenshots using Selenium:
library(RSelenium)
rD <- rsDriver(browser = "phantomjs")
remDr <- rD[['client']]
remDr$navigate("http://www.r-project.org")
remDr$screenshot(file = tf <- tempfile(fileext = ".png"))
shell.exec(tf) # on windows
remDr$close()
rD$server$stop()
In earlier versions, you were able to do:
library(RSelenium)
startServer()
remDr <- remoteDriver$new()
remDr$open()
remDr$navigate("http://www.r-project.org")
remDr$screenshot(file = tf <- tempfile(fileext = ".png"))
shell.exec(tf) # on windows
I haven't tested it, but this open source project seems to do exactly that: https://github.com/wch/webshot
It is as easy as:
library(webshot)
webshot("https://www.r-project.org/", "r.png")
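To save into a specified folder, as the question asks, the call can be wrapped in a small function. This is a sketch under assumptions: screenshot_file and save_screenshot are hypothetical helpers, the file-naming scheme is arbitrary, and the webshot call itself needs the webshot package and PhantomJS to be installed.

```r
# Hypothetical helper: derive a .png file name from the URL inside out_dir.
screenshot_file <- function(url, out_dir) {
  stem <- gsub("[^A-Za-z0-9]+", "_", sub("^https?://", "", url))
  stem <- gsub("^_+|_+$", "", stem)  # trim leading/trailing underscores
  file.path(out_dir, paste0(stem, ".png"))
}

# Hypothetical wrapper: create the folder if needed, then take the screenshot.
save_screenshot <- function(url, out_dir) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  target <- screenshot_file(url, out_dir)
  webshot::webshot(url, file = target)
  invisible(target)
}

# Not run here:
# save_screenshot("https://www.r-project.org/", "shots")
```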