Retrieve data from a web page table using RSelenium - r

I am trying to scrape the annual maximum flow data from this National River Flow Archive (UK) website:
http://nrfa.ceh.ac.uk/data/station/info/69032
using RSelenium.
I can't find a way to negotiate the drop down menu. At present I can semi-automate the process using:
library(RSelenium)
checkForServer()
startServer()
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox", platform = "LINUX")
remDr$open()
i <- "69032"
remDr$navigate(paste0("http://nrfa.ceh.ac.uk/data/station/peakflow/", i))
# read the raw HTML and parse it (htmlParse and readHTMLTable come from the XML package)
library(XML)
doc <- htmlParse(remDr$getPageSource()[[1]])
peak.flows <- as.numeric(readHTMLTable(doc)$tablesorter[, "Flow (m3/s)"])
This is a bit of a hack and involves me having to click a few buttons on the page rather than getting RSelenium to do it. Any suggestions as to how RSelenium can select the "Peak flow data" tab and then the "Maximum Annual (AMAX) data" option from the drop-down menu?

library(RSelenium)
checkForServer()
startServer()
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox", platform = "LINUX")
remDr$open()
i <- "69032"
remDr$navigate(paste0("http://nrfa.ceh.ac.uk/data/station/peakflow/", i))
remDr$findElement(using="css selector",'.selected a')$clickElement()
Sys.sleep(5)
remDr$findElement(using = "css selector", "#selectDataType")$clickElement()
remDr$findElement(using = "css selector", "#selectDataType")$sendKeysToElement(list(key="down_arrow", key="enter"))
Sys.sleep(2)
To find the CSS selector of the element you are interested in, install the SelectorGadget extension in Chrome. Highlight the element you want RSelenium to click, then copy its CSS selector.
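The fixed Sys.sleep() pauses in the answer above assume the page is ready after a set number of seconds. A minimal sketch of an alternative is to poll until the element can be found. Note that wait_until is a hypothetical helper written for this illustration, not an RSelenium function, and the timeout and interval values are arbitrary assumptions.

```r
# Hypothetical helper (not part of RSelenium): call `check` repeatedly until it
# returns a non-NULL value without raising an error, or stop after `timeout` seconds.
wait_until <- function(check, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # An error (for example NoSuchElement) is treated as "not ready yet"
    result <- tryCatch(check(), error = function(e) NULL)
    if (!is.null(result)) return(result)
    if (Sys.time() >= deadline) stop("timed out waiting for condition")
    Sys.sleep(interval)
  }
}

# Usage sketch with the session from the answer above (requires a running server):
# dropdown <- wait_until(function() {
#   remDr$findElement(using = "css selector", "#selectDataType")
# })
# dropdown$clickElement()
```

This relies on findElement raising an error when nothing matches, which is how the NoSuchElement failures elsewhere on this page surface in R.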

Related

R Selenium TripAdvisor detailed member info

I'm trying to get detailed information about the reviewer who wrote a certain review.
The problem is that the reviewer's information appears in a pop-up when you hover over a certain section. I can trigger that with Selenium.
url<-"https://www.tripadvisor.com/Hotel_Review-g644300-d668891-Reviews-Hotel_Creina-Kranj_Upper_Carniola_Region.html#REVIEWS"
driver<- rsDriver()
remDr <- driver[["client"]]
remDr$open()
remDr$navigate(url)
details <- remDr$findElement(using = "xpath", "//div[@class='username mo']")
remDr$mouseMoveToLocation(webElement=details)
How can I get the member ID?
library(rvest)
url<-"https://www.tripadvisor.co.kr/ShowUserReviews-g294197-d306114-r457560253-Grand_Hilton_Seoul-Seoul.html#CHECK_RATES_CONT"
h<-read_html(url)
id<-html_attr(html_node(h,".expand_inline"),"class")
# id is "expand_inline scrname mbrName_9520BF5DXXXXX"
id2<-gsub("expand_inline scrname mbrName_","",id)
# id2 is "9520BF5DXXXXX"
indurl<-paste0("https://www.tripadvisor.co.kr/MemberProfile-a_uid.",id2)
indinfo<-read_html(indurl)
name<-html_text(html_node(indinfo,".nameText"));name
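The gsub() call above depends on the class names appearing in exactly that order. A slightly more tolerant sketch pulls out whatever follows mbrName_ instead. The function name extract_member_id is hypothetical, and the assumption that the ID consists only of letters and digits is based solely on the single example value shown above.

```r
# Hypothetical helper: extract the member ID from a single class attribute string.
# Assumption: the ID is the run of letters and digits following "mbrName_".
extract_member_id <- function(class_attr) {
  m <- regmatches(
    class_attr,
    regexpr("(?<=mbrName_)[A-Za-z0-9]+", class_attr, perl = TRUE)
  )
  # regmatches() returns character(0) when there is no match
  if (length(m) == 0) NA_character_ else m
}

extract_member_id("expand_inline scrname mbrName_9520BF5DXXXXX")
extract_member_id("scrname mbrName_9520BF5DXXXXX expand_inline")
```

Both calls should yield the same ID, so a reordering of the class names would not break the extraction.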

Webscrape w/ Rselenium and Rvest from dropdown box where id changes

I am looking to scrape some NBA data from the website numberfire at: https://www.numberfire.com/nba/daily-fantasy/daily-basketball-projections
I am trying to use a drop-down box to switch the displayed data from Fanduel to Draftkings. The first problem I encountered is that the web page does not change when that pull-down menu is changed. I installed Selenium and am running it successfully to get around this. The next problem is that the id for this pull-down menu (and the id for every pull-down menu on this site) changes with each refresh. This causes a "NoSuchElement" error in R, because it cannot lock on to the proper menu box when it loads the page.
Is there a way to fix this with RSelenium or another package?
Here is my code in R:
require(RSelenium)
remDr <- remoteDriver(remoteServerAddr = "192.168.99.100", port = 4445, browserName = "chrome")
remDr$open()
remDr$navigate("https://www.numberfire.com/nba/daily-fantasy/daily-basketball-projections")
iframe <- remDr$findElement(using='id', value="select2-dy8e-container")
remDr$switchToFrame(iframe)
option <- remDr$findElement(using = 'xpath', "//*/option[@value = 'DraftKings']")
option$clickElement()
option
Update: after a lot of searching on non-static ids, I came up with this, and it worked:
remDr <- remoteDriver(remoteServerAddr = "192.168.99.100", port = 4445, browserName = "chrome")
remDr$open()
remDr$navigate("https://www.numberfire.com/nba/daily-fantasy/daily-basketball-projections")
webElem <- remDr$findElement('xpath', '//*[(@class = "dropdown-custom dfs-option select2-hidden-accessible")]/option[@value = "4"]')
webElem$clickElement()
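The working XPath above compares the whole class attribute, so it would stop matching if the site added or reordered a class. A hedged sketch of a looser match tests for a single class token instead. The helper name option_xpath is hypothetical, and picking dfs-option as the stable token is an assumption based only on the class string in the update.

```r
# Hypothetical helper: build an XPath that selects an <option> by value inside
# any element carrying a given class token, independent of the element's id.
option_xpath <- function(class_token, option_value) {
  sprintf(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]/option[@value='%s']",
    class_token, option_value
  )
}

xp <- option_xpath("dfs-option", "4")

# Usage sketch (requires the remDr session from above):
# remDr$findElement(using = "xpath", xp)$clickElement()
```

Padding the class attribute with spaces before calling contains() keeps a token such as dfs-option from also matching a longer class name that merely contains it.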

Rselenium problems finding an element

I am new to RSelenium. I have been trying to scrape a web page with the following code:
library(reshape)
library(plyr)
library(RSelenium)
#start RSelenium
checkForServer()
startServer()
remDr <- remoteDriver()
remDr$open()
remDr$navigate(paste0("http://www.metrocuadrado.com/web/apartamentos/venta/c:bogota"))
I want to select the area categories (Área m2:). I have no problem selecting most of them, for example:
remDr$findElement(using = "xpath", paste0("//select[@name = 'arearango']/option[@value = 'Hasta 60']"))$clickElement()
But with the last category:
checkForServer()
startServer()
remDr <- remoteDriver()
remDr$open()
remDr$navigate(paste0("http://www.metrocuadrado.com/web/apartamentos/venta/c:bogota"))
remDr$findElement(using = "xpath", paste0("//select[@name = 'arearango']/option[@value = '1001 o más']"))$clickElement()
I get an error:
Error: Summary: NoSuchElement
Detail: An element could not be located on the page using the given search parameters.
class: org.openqa.selenium.NoSuchElementException
I suppose the problem has to do with the accent, but I have not been able to solve it. How can I select this element?
I solved it. It seems that R reads the word "más" with a different encoding of the letter á; I replaced the á in the XPath string with the form R reads, and it works.
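One way to sidestep this kind of encoding mismatch is to write the accented letter as a Unicode escape, so the encoding of the script file no longer matters. This is a minimal sketch of that idea, not necessarily what the poster did; the variable names label and xpath are introduced here for illustration.

```r
# Write the accented letter as a Unicode escape so the source file's encoding
# cannot alter it: \u00e1 is the letter "á".
label <- "1001 o m\u00e1s"
xpath <- sprintf("//select[@name = 'arearango']/option[@value = '%s']", label)

# Quick checks that the string holds the intended character
utf8ToInt("\u00e1")                      # code point of the accented letter
nchar(label, type = "chars")             # number of characters
nchar(enc2utf8(label), type = "bytes")   # number of bytes once encoded as UTF-8

# Usage sketch (requires the remDr session from the question):
# remDr$findElement(using = "xpath", xpath)$clickElement()
```

If the character count and the UTF-8 byte count differ by exactly one, the string contains a single two-byte character, which is what a correctly stored á looks like.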

Taking window value using RSelenium

Using RSelenium to open a site like this:
require(RSelenium)
RSelenium::startServer()
remDr <- remoteDriver(browserName = "chrome")
remDr$open()
remDr$navigate("http://www.adobe.com/") #the site is just an example
What command should I use to take in R the results of window.s_adobe?
You could try something along the lines of:
res <- remDr$executeScript('return window.screenX;')
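The same pattern can be wrapped so the property name is passed in, which covers the window.s_adobe case from the question. This is only a sketch: get_window_value is a hypothetical wrapper, remDr is assumed to be an open remoteDriver, and whether the returned value converts cleanly to an R object depends on what the page stores in that property.

```r
# Hypothetical wrapper: read a property of the browser's `window` object.
# `remDr` is assumed to be an open RSelenium remoteDriver; `name` must be a
# plain JavaScript identifier so nothing else is injected into the script.
get_window_value <- function(remDr, name) {
  stopifnot(grepl("^[A-Za-z_$][A-Za-z0-9_$]*$", name))
  remDr$executeScript(sprintf("return window.%s;", name))
}

# Usage sketch (requires a running session):
# s_adobe <- get_window_value(remDr, "s_adobe")
```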

dropdown boxes in RSelenium

How can one interact with dropdown boxes in RSelenium? In particular, I can select the dropdown box using findElement, but how does one select an option within it?
Here is code that selects an option from a drop-down list using XPath.
In this example the drop-down is inside an iframe, so I have to switch into that iframe first.
It will probably be much easier in your situation.
If you are new to RSelenium, check out the quick start tutorial; to learn more about the functions, refer to the PDF documentation.
require(RSelenium)
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox")
remDr$open()
remDr$navigate("http://www.w3schools.com/tags/tryit.asp?filename=tryhtml_select")
iframe <- remDr$findElement(using='id', value="iframeResult")
remDr$switchToFrame(iframe)
# change audi to whatever your option value is
option <- remDr$findElement(using = 'xpath', "//*/option[@value = 'audi']")
option$clickElement()

Resources