R Selenium TripAdvisor detailed member info - web-scraping

I'm trying to get detailed information about the reviewer who wrote a certain review.
The problem is that the reviewer's information appears in a pop-up only when you hover over a certain section. I can trigger that with Selenium:
library(RSelenium)
url <- "https://www.tripadvisor.com/Hotel_Review-g644300-d668891-Reviews-Hotel_Creina-Kranj_Upper_Carniola_Region.html#REVIEWS"
driver <- rsDriver()
remDr <- driver[["client"]]
remDr$open()
remDr$navigate(url)
# hover over the reviewer's username to trigger the pop-up
details <- remDr$findElement(using = "xpath", "//div[@class='username mo']")
remDr$mouseMoveToLocation(webElement = details)
How can I get the member ID?

library(rvest)
url <- "https://www.tripadvisor.co.kr/ShowUserReviews-g294197-d306114-r457560253-Grand_Hilton_Seoul-Seoul.html#CHECK_RATES_CONT"
h <- read_html(url)
id <- html_attr(html_node(h, ".expand_inline"), "class")
# id is "expand_inline scrname mbrName_9520BF5DXXXXX"
id2 <- gsub("expand_inline scrname mbrName_", "", id)
# id2 is "9520BF5DXXXXX"
indurl <- paste0("https://www.tripadvisor.co.kr/MemberProfile-a_uid.", id2)
indinfo <- read_html(indurl)
name <- html_text(html_node(indinfo, ".nameText")); name
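The prefix-stripping step above depends on the classes appearing in exactly that order. A minimal base R sketch that looks only for the mbrName_ token might be a bit more tolerant. It assumes the class attribute always contains one whitespace-separated token of the form mbrName_<id>, and the helper name extract_member_id is hypothetical, not part of rvest:

```r
# Hypothetical helper: pull the member id out of a class attribute string.
# Assumes one whitespace-separated token of the form "mbrName_<id>".
extract_member_id <- function(class_attr) {
  tokens <- strsplit(class_attr, "[[:space:]]+")[[1]]
  hit <- grep("^mbrName_", tokens, value = TRUE)
  if (length(hit) == 0) return(NA_character_)  # no member token found
  sub("^mbrName_", "", hit[1])
}

# Same sample class string as in the answer above
id2 <- extract_member_id("expand_inline scrname mbrName_9520BF5DXXXXX")
indurl <- paste0("https://www.tripadvisor.co.kr/MemberProfile-a_uid.", id2)
```

Returning NA when no token is found makes it easier to notice a layout change on the page than a silently wrong URL would.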

Related

RSelenium: click button?

I am trying to scrape a page, getting the move list of a game of chess, which is located in the menu on the right, under the "moves" tab.
library(RSelenium)
url <- "https://play.xiangqi.com/game/oX00ly"
rD <- RSelenium::rsDriver(browser = "firefox", check = FALSE)
remDr <- rD$client
remDr$navigate(url = url)
When I manually click the Moves tab in the browser, I can get the desired text via
webElem <- remDr$findElement("css selector", ".Wrapper__MovesTabWrapper-sc-13rqht3-2")
webElem$getElementText()[[1]]
which (correctly) returns
[1] "1\np3+1\nP3+1\n2\ne3+5\nH2+3\n3\nh8+7\nH8+7\n4\nh2+3\nR1+1\n5\nc8=9\nH3+2\n6\nc2+1\nE7+5\n7\nh3+4\nA6+5\n8\nh4+3\nR9=6\n9\nr1=3\nR6+6\n10\nc2+2\nH2+3\n11\nr9=8\nC2=3\n12\nr8+3\nR1=4\n13\nc2-1\nR6=8\n14\nr8+4\nH3+1\n15\ne7+9\nC3+5\n16\ne9-7\nR4+3\n17\nc2=1\nR8=9\n18\nh3-4\nR4=6\n19\nc1=2\nR9-1\n20\nr3=2\nC8+7\n21\ne5-3\nR9=8\n22\nh4-3\nR8+2\n23\nh3-2\nR8+2\n24\ne7+5\nH7+8\n25\nr8-5\nC3+1\n26\nr8+2\nH8+7\n27\np9+1\nH7+5\n28\na6+5\nH5+7\n29\nk5=6\nR6=4\n30\na5+6\nR4+3"
Problem
When I try to click the button through RSelenium, using
webElem <- remDr$findElement("css selector", "#moves-tab")
webElem <-webElem$clickElement() # or webElem$click()
nothing seems to happen, and I'm at a loss as to how to troubleshoot this further.
Question
How can I switch to the Moves tab by simulating a click (active event listener)?
Bonus points: is this possible using the rvest package?
Sometimes being too trigger-happy is the problem. Adding a short pause after the click solved it:
webElem <- webElem$clickElement()
Sys.sleep(2)
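A fixed two-second pause works, but it can be too short on a slow connection and wasteful on a fast one. One alternative is to poll until a condition holds. The sketch below is base R only, and the helper name wait_until is hypothetical (it is not an RSelenium function). The commented usage assumes the remDr session and the selectors from the question:

```r
# Hypothetical helper: call `check` repeatedly until it returns TRUE
# or `timeout` seconds have passed. Errors inside `check` count as "not yet".
wait_until <- function(check, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    ok <- tryCatch(isTRUE(check()), error = function(e) FALSE)
    if (ok) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Intended use with the session from the question (not run here):
# webElem <- remDr$findElement("css selector", "#moves-tab")
# webElem$clickElement()
# wait_until(function() {
#   length(remDr$findElements("css selector",
#                             ".Wrapper__MovesTabWrapper-sc-13rqht3-2")) > 0
# })
```

The helper returns FALSE on timeout instead of raising an error, so the caller decides whether a missing element is fatal.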

Why can’t RSelenium press this button?

I’m trying to automate browsing on a site with RSelenium in order to retrieve the latest planned release dates. My problem is that an age check pops up when I visit the URL. The age-check page consists of two buttons, which I haven’t managed to click through RSelenium. The code I have so far is below. What is the solution to this problem?
# Variable and URL
s4 <- "https://www.systembolaget.se"
#Start Server
rd <- rsDriver()
remDr <- rd[["client"]]
#Load Page
remDr$navigate(s4)
webE <- remDr$findElements("class name", "action")
webE$isElementEnabled()
webE$clickElement()
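You need to target the selector more accurately. Note also that findElements() returns a list of elements, so calling $clickElement() on its result fails; use findElement() to get a single element: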
# Variable and URL
s4 <- "https://www.systembolaget.se"
#Start Server
rd <- rsDriver()
remDr <- rd[["client"]]
#Load Page
remDr$navigate(s4)
webE <- remDr$findElement("css", "#modal-agecheck .action.primary")
webE$clickElement()

Webscrape w/ Rselenium and Rvest from dropdown box where id changes

I am looking to scrape some NBA data from the website numberFire at: https://www.numberfire.com/nba/daily-fantasy/daily-basketball-projections
I am trying to use a drop-down box to switch the displayed data from FanDuel to DraftKings. The first problem is that the page URL does not change when the drop-down selection changes. I installed Selenium, which runs successfully, to get around this. The next problem is that the id of this drop-down menu (and of every drop-down menu on the site) changes with each refresh. This causes a "NoSuchElement" error in R, because the menu cannot be located by its id when the page loads.
Is there a way to fix this with RSelenium or another package?
Here is my code in R:
require(RSelenium)
remDr <- remoteDriver(remoteServerAddr = "192.168.99.100", port = 4445, browserName = "chrome")
remDr$open()
remDr$navigate("https://www.numberfire.com/nba/daily-fantasy/daily-basketball-projections")
iframe <- remDr$findElement(using='id', value="select2-dy8e-container")
remDr$switchToFrame(iframe)
option <- remDr$findElement(using = 'xpath', "//*/option[@value = 'DraftKings']")
option$clickElement()
option
Update: after a lot of searching on non-static ids, I came up with this, and it worked:
remDr <- remoteDriver(remoteServerAddr = "192.168.99.100", port = 4445, browserName = "chrome")
remDr$open()
remDr$navigate("https://www.numberfire.com/nba/daily-fantasy/daily-basketball-projections")
webElem <- remDr$findElement('xpath', '//*[(@class = "dropdown-custom dfs-option select2-hidden-accessible")]/option[@value = "4"]')
webElem$clickElement()
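The locator above avoids the generated id by keying on the class attribute. An exact comparison of the whole class string can still break if the site adds or reorders a class, so a slightly looser XPath that tests for a single class token may hold up better. This is only a sketch: the helper name option_xpath is hypothetical, and the class token dfs-option and option value "4" are taken from the update above:

```r
# Hypothetical helper: build an XPath that matches an <option> whose parent
# carries a given class token, regardless of other classes or generated ids.
option_xpath <- function(class_token, option_value) {
  sprintf(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]/option[@value = '%s']",
    class_token, option_value
  )
}

xp <- option_xpath("dfs-option", "4")
# Intended use with the session from the question (not run here):
# remDr$findElement("xpath", xp)$clickElement()
```

Padding the class list with spaces before calling contains() keeps the test from matching longer class names that merely include the token as a substring.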

Rselenium problems finding an element

I am new to RSelenium. I have been trying to scrape a web page with the following code:
library(reshape)
library(plyr)
library(RSelenium)
#start RSelenium
checkForServer()
startServer()
remDr <- remoteDriver()
remDr$open()
remDr$navigate(paste0("http://www.metrocuadrado.com/web/apartamentos/venta/c:bogota"))
I want to select the area categories (Área m2:). I don't have any problems selecting most of them, for example:
remDr$findElement(using = "xpath", paste0("//select[@name = 'arearango']/option[@value = 'Hasta 60']"))$clickElement()
But with the last category:
checkForServer()
startServer()
remDr <- remoteDriver()
remDr$open()
remDr$navigate(paste0("http://www.metrocuadrado.com/web/apartamentos/venta/c:bogota"))
remDr$findElement(using = "xpath", paste0("//select[@name = 'arearango']/option[@value = '1001 o más']"))$clickElement()
I get an error:
Error: Summary: NoSuchElement
Detail: An element could not be located on the page using the given search parameters.
class: org.openqa.selenium.NoSuchElementException
I suppose the problem has to do with the accent, but I have not been able to solve it. How can I select this element?
I managed to solve it: it seems that R reads the word "más" as "mÃ¡s". I replaced the letter á with Ã¡ and it works.
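One way to sidestep this kind of encoding mismatch is to write the accented character as a Unicode escape, so the string no longer depends on how the script file is encoded. The sketch below is base R only and assumes the page serves the option value as UTF-8 "1001 o más". The second half only illustrates how such garbling can arise when UTF-8 bytes are mislabelled as Latin-1:

```r
# Write the accented character as a Unicode escape so the string does not
# depend on the encoding of the script file.
area_value <- "1001 o m\u00e1s"
xpath <- sprintf("//select[@name = 'arearango']/option[@value = '%s']", area_value)

# Illustration of the garbling: take the UTF-8 bytes of "mas" with an accent
# and mislabel them as Latin-1 before converting back to UTF-8.
garbled <- rawToChar(charToRaw(enc2utf8("m\u00e1s")))
Encoding(garbled) <- "latin1"
garbled_utf8 <- enc2utf8(garbled)  # now four characters instead of three

# Intended use with the session from the question (not run here):
# remDr$findElement(using = "xpath", xpath)$clickElement()
```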

Retrieve data from a web page table using RSelenium

I am trying to scrape the annual maximum flow data from this National River Flow Archive (UK) website:
http://nrfa.ceh.ac.uk/data/station/info/69032
using RSelenium.
I can't find a way to negotiate the drop down menu. At present I can semi-automate the process using:
library(RSelenium)
library(XML)  # provides htmlParse() and readHTMLTable()
checkForServer()
startServer()
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox", platform = "LINUX")
remDr$open()
i <- "69032"
remDr$navigate(paste0("http://nrfa.ceh.ac.uk/data/station/peakflow/", i))
# read the raw html and parse
doc<-htmlParse(remDr$getPageSource()[[1]])
peak.flows <- as.numeric(readHTMLTable(doc)$tablesorter[, "Flow (m3/s)"])
This is a bit of a hack and involves me having to click a few buttons on the page rather than getting RSelenium to do it. Any suggestions as to how RSelenium can select the "Peak flow data" tab and then the "Maximum Annual (AMAX) data" option from the drop-down menu?
library(RSelenium)
checkForServer()
startServer()
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "firefox", platform = "LINUX")
remDr$open()
i <- "69032"
remDr$navigate(paste0("http://nrfa.ceh.ac.uk/data/station/peakflow/", i))
remDr$findElement(using="css selector",'.selected a')$clickElement()
Sys.sleep(5)
remDr$findElement(using = "css selector", "#selectDataType")$clickElement()
remDr$findElement(using = "css selector", "#selectDataType")$sendKeysToElement(list(key="down_arrow", key="enter"))
Sys.sleep(2)
To find the CSS selector of the element of interest, install the SelectorGadget extension in Chrome. Highlight the element you want RSelenium to click, then copy its CSS selector.
