I'm trying to scrape reviews from this Google Play site using R (mainly "RSelenium" and "rvest"):
https://play.google.com/store/apps/details?id=hr.mireo.arthur&hl=en&fbclid=IwAR3c-PkUXOea8KrKLp9Q3JUjCidGmgO2jYX_Qb7O8VuWlHXPIS5nDOfKRKI&showAllReviews=true
I've managed to load the page using RSelenium and write a loop that scrolls down the page and clicks all the "Show more" buttons. Here is the code I've used:
#Load packages
library(rvest)
library(dplyr)
library(wdman)
library(RSelenium)
#Open website using RSelenium
url <- 'https://play.google.com/store/apps/details?id=hr.mireo.arthur&hl=en&fbclid=IwAR3c-PkUXOea8KrKLp9Q3JUjCidGmgO2jYX_Qb7O8VuWlHXPIS5nDOfKRKI&showAllReviews=true'
rD <- rsDriver(port = 4567L, browser=c("chrome"), chromever="80.0.3987.106")
remDr <- rD[["client"]]
remDr$open()
remDr$navigate(url)
#Load whole page by scrolling and showing more
xp_show_more <- "//*[@id='fcxH9b']/div[4]/c-wiz/div/div[2]/div/div[1]/div/div/div[1]/div[2]/div[2]/div"
replicate(5,
{
replicate(5,
{
# scroll down
webElem <- remDr$findElement("css", "body")
webElem$sendKeysToElement(list(key = "end"))
# wait
Sys.sleep(1)
})
# find button
morereviews <- remDr$findElement(using = 'xpath', xp_show_more)
# click button
tryCatch(morereviews$clickElement(),error=function(e){print(e)}) # trycatch to prevent any error from stopping the loop
# wait
Sys.sleep(3)
})
This loads all 573 comments, but several comments have "Full review" buttons that must be clicked to see the rest of the text. I'm trying to write a script that clicks all the "Full review" buttons (I believe there are just over 30 of them), but I can't manage to do so. My current script only clicks the first "Full review" button:
#Click on "Full Review"
xp_full_review <- '//*[@id="fcxH9b"]/div[4]/c-wiz/div/div[2]/div/div[1]/div/div/div[1]/div[2]/div/div[2]/div/div[2]/div[2]/span[1]/div/button'
replicate(35,
{
fullreviews <- remDr$findElement(using = 'xpath', xp_full_review)
fullreviews$clickElement()
Sys.sleep(0.5)
})
Can someone spot a mistake and find a way to click on all "Full review" buttons?
Thank you
Related
Would appreciate some help as I am stuck here.
I am trying to write an automated script to download data from the Microsoft Power BI site of the WHO, which can be found here.
But when I try to retrieve the data, the right click function doesn't seem to work - or more likely: I am doing something wrong.
I created a container on Docker that I am accessing with Selenium in R. The script below generates a click on the first page (on the "Download data" button in the lower left-hand corner of the screen). After a long load time the next screen appears. The goal is to RIGHT-CLICK on the download button of "Vaccine uptake by target group" x "Data".
Here are two screenshots showing where I need the first left-click and the second right-click.
I have tried multiple approaches, including first selecting the iframe, switching the frame view and then selecting an xpath pointing to the clickable area. That seemed to work.
But when I give the command to right-click, nothing happens, as verified in the VNC rendition. The context menu doesn't appear.
Does anyone know what went wrong?
Here's the code I entered:
library(RSelenium)
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "firefox")
remDr$open()
remDr$navigate("https://app.powerbi.com/view?r=eyJrIjoiMWNjNzZkNjctZTNiNy00YmMzLTkxZjQtNmJiZDM2MTYxNzEwIiwidCI6ImY2MTBjMGI3LWJkMjQtNGIzOS04MTBiLTNkYzI4MGFmYjU5MCIsImMiOjh9")
Sys.sleep(30) # WHO is taking its time
#This is first button to bring us to the next page
webElem <- remDr$findElement(using = "xpath", value = "/html/body/div[1]/report-embed/div/div/div[1]/div/div/div/exploration-container/div/docking-container/div/div/div/div/exploration-host/div/div/exploration/div/explore-canvas/div/div[2]/div/div[2]/div[2]/visual-container-repeat/visual-container[22]/transform/div/div[3]/div/visual-modern")
webElem$highlightElement()
webElem$clickElement()
Sys.sleep(30) # again need some time to fully load
# This selects the iframe
webElem <- remDr$findElement(using = "xpath", value = "/html/body/div[1]/report-embed/div/div/div[1]/div/div/div/exploration-container/div/docking-container/div/div/div/div/exploration-host/div/div/exploration/div/explore-canvas/div/div[2]/div/div[2]/div[2]/visual-container-repeat/visual-container[19]/transform/div/div[3]/div/visual-modern/div/iframe")
remDr$switchToFrame(webElem)
# This selects the area for the second click
webElem <- remDr$findElement(using = "xpath", value="/html/body/div/div/a[1]")
remDr$mouseMoveToLocation(webElement = webElem)
# And then the right-click but none of these seem to work:
remDr$click(buttonId = 2)
remDr$click('right')
Thanks for any advice.
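One possible workaround, offered as a sketch rather than a confirmed fix: instead of relying on remDr$click(buttonId = 2), dispatch a contextmenu event on the element through executeScript. This assumes the page reacts to a synthetic contextmenu event, which has not been verified for this report. right_click_js is a hypothetical helper name, and remDr is assumed to be the open remoteDriver from the code above, already switched into the iframe.

```r
# Hypothetical helper: fire a synthetic right-click (contextmenu) event
# on an element that has already been located with findElement().
# Assumption: the page listens for "contextmenu" on this element.
right_click_js <- function(remDr, elem) {
  script <- paste(
    "var ev = new MouseEvent('contextmenu',",
    "  {bubbles: true, cancelable: true, view: window, button: 2});",
    "arguments[0].dispatchEvent(ev);",
    sep = "\n"
  )
  # The element passed in `args` is available as arguments[0] in the script.
  remDr$executeScript(script, args = list(elem))
  # Return the script text invisibly so it can be inspected if needed.
  invisible(script)
}

# Usage with the element found above:
# webElem <- remDr$findElement(using = "xpath", value = "/html/body/div/div/a[1]")
# right_click_js(remDr, webElem)
```

If the menu still does not appear, the page may be listening for a different event (for example mousedown with button 2), so this is only a starting point for troubleshooting.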
I am trying to scrape a page, getting the move list of a game of chess, which is located in the menu on the right, under the "moves" tab.
library(RSelenium)
url <- "https://play.xiangqi.com/game/oX00ly"
rD <- RSelenium::rsDriver(browser = "firefox", check = F)
remDr <- rD$client
remDr$navigate(url = url)
When I manually click the Moves tab in the browser, I can get the desired text via
webElem <- remDr$findElement("css selector", ".Wrapper__MovesTabWrapper-sc-13rqht3-2")
webElem$getElementText()[[1]]
which (correctly) returns
[1] "1\np3+1\nP3+1\n2\ne3+5\nH2+3\n3\nh8+7\nH8+7\n4\nh2+3\nR1+1\n5\nc8=9\nH3+2\n6\nc2+1\nE7+5\n7\nh3+4\nA6+5\n8\nh4+3\nR9=6\n9\nr1=3\nR6+6\n10\nc2+2\nH2+3\n11\nr9=8\nC2=3\n12\nr8+3\nR1=4\n13\nc2-1\nR6=8\n14\nr8+4\nH3+1\n15\ne7+9\nC3+5\n16\ne9-7\nR4+3\n17\nc2=1\nR8=9\n18\nh3-4\nR4=6\n19\nc1=2\nR9-1\n20\nr3=2\nC8+7\n21\ne5-3\nR9=8\n22\nh4-3\nR8+2\n23\nh3-2\nR8+2\n24\ne7+5\nH7+8\n25\nr8-5\nC3+1\n26\nr8+2\nH8+7\n27\np9+1\nH7+5\n28\na6+5\nH5+7\n29\nk5=6\nR6=4\n30\na5+6\nR4+3"
Problem
When trying to click the button through RSelenium, by using
webElem <- remDr$findElement("css selector", "#moves-tab")
webElem <-webElem$clickElement() # or webElem$click()
Nothing seems to happen, and I'm at a loss on how to proceed troubleshooting.
Question
How can I switch to the Moves tab by simulating a click (active event listener)?
Bonus pts: is this possible using the rvest package?
Sometimes being too trigger happy is a problem.
Adding
webElem <- webElem$clickElement()
Sys.sleep(2)
solved the problem.
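A fixed Sys.sleep(2) works, but the right delay depends on how fast the page loads. As a rough sketch, the wait can be replaced by polling until the element exists. wait_for_elements is a hypothetical helper, not part of RSelenium; it relies on findElements() returning an empty list when nothing matches, and remDr is assumed to be the open client from the question.

```r
# Hypothetical helper: poll until at least one element matches, or give up.
# `remDr` is assumed to be an open RSelenium remoteDriver.
wait_for_elements <- function(remDr, using, value,
                              timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # findElements() returns an empty list while nothing matches yet
    found <- remDr$findElements(using = using, value = value)
    if (length(found) > 0) return(found)
    if (Sys.time() >= deadline) stop("Timed out waiting for: ", value)
    Sys.sleep(interval)
  }
}

# Usage: wait for the tab to exist, then click it.
# tab <- wait_for_elements(remDr, "css selector", "#moves-tab")[[1]]
# tab$clickElement()
```

Note that this only waits for the element to be present in the DOM. If the click handler is attached later, a short extra pause may still be needed.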
I have this website (https://www.sofascore.com/pt/torneio/futebol/brazil/brasileiro-serie-a/325) from which I want to get some stats from games by round. There are 38 rounds, but the list only shows the first 11. To get the rest of the rounds I have to scroll an inner scroll bar, but I don't know how to do it.
I use the package RSelenium in R.
Here's the code (so far)...
After this, I don't know what to do...
require(RSelenium)
click <- function(xpath){
webElem <- remDr$findElement(using = "xpath", value = xpath)
webElem$clickElement()
}
driver <- rsDriver(port = 5799L, browser = c('chrome'), chromever = "88.0.4324.96")
url = 'https://www.sofascore.com/pt/torneio/futebol/brazil/brasileiro-serie-a/325'
remDr <- driver[['client']]
remDr$navigate(url) #link
Sys.sleep(1)
# games by round
click('//*[@id="__next"]/main/div/div[2]/div[1]/div[1]/div[6]/div/div[1]/a[2]')
# round options
click('//*[@id="__next"]/main/div/div[2]/div[1]/div[1]/div[6]/div/div[2]/div/div/div/div[1]/div/div[1]/div[2]/div')
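One way to move an inner scroll bar, sketched under assumptions: locate the scrollable container and set its scrollTop through executeScript. scroll_inner_to_bottom is a hypothetical helper, and the selector in the usage line is a placeholder, since the real selector of the round list is not known here.

```r
# Hypothetical helper: scroll an inner scrollable container to its bottom.
# `remDr` is assumed to be an open RSelenium remoteDriver.
scroll_inner_to_bottom <- function(remDr, using, value) {
  container <- remDr$findElement(using = using, value = value)
  # The element passed in `args` is available as arguments[0] in the script.
  remDr$executeScript(
    "arguments[0].scrollTop = arguments[0].scrollHeight;",
    args = list(container)
  )
  invisible(container)
}

# Usage, with a placeholder selector that must be replaced by the real
# selector of the scrollable round list:
# scroll_inner_to_bottom(remDr, "css selector", "<selector-of-scrollable-list>")
```

If the list loads more entries lazily, the call may need to be repeated with a short Sys.sleep() in between until all 38 rounds are present.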
I'm trying to interact with all elements of the list in R.
Specifically, I'm trying to click on every element in a list returned by RSelenium, using the clickElement() function.
I'm scraping data from this webpage:
https://play.google.com/store/apps/details?id=hr.mireo.arthur&hl=en&fbclid=IwAR3c-PkUXOea8KrKLp9Q3JUjCidGmgO2jYX_Qb7O8VuWlHXPIS5nDOfKRKI&showAllReviews=true
Here is my code so far:
url <- 'https://play.google.com/store/apps/details?id=hr.mireo.arthur&hl=en&fbclid=IwAR3c-PkUXOea8KrKLp9Q3JUjCidGmgO2jYX_Qb7O8VuWlHXPIS5nDOfKRKI&showAllReviews=true'
#Open webpage using RSelenium
rD <- rsDriver(port = 4445L, browser=c("chrome"), chromever="80.0.3987.106")
remDr <- rD[["client"]]
remDr$navigate(url)
#-----------------------------------------Load whole page by scrolling and showing more
xp_show_more <- "//*[@id='fcxH9b']/div[4]/c-wiz/div/div[2]/div/div[1]/div/div/div[1]/div[2]/div[2]/div"
replicate(5,
{
replicate(5,
{
# scroll down
webElem <- remDr$findElement("css", "body")
webElem$sendKeysToElement(list(key = "end"))
# wait
Sys.sleep(1)
})
# find button
morereviews <- remDr$findElement(using = 'xpath', xp_show_more)
# click button
tryCatch(morereviews$clickElement(),error=function(e){print(e)}) # trycatch to prevent any error from stopping the loop
# wait
Sys.sleep(3)
})
This code loads the whole page and shows all comments, but some comments are long and have "Full review" buttons that must be clicked to show the whole length of the comment. I have managed to find all of those buttons (there are 36 of them) using the "findElements" function with the following code:
fullreviews <- remDr$findElements(using = 'css selector', value = ".OzU4dc")
This code returns a list of 36 elements, but when I try to click on them using this code:
fullreviews$clickElement()
I get this error:
Error: attempt to apply non-function
I can click on all 36 elements using these 36 lines of code:
fullreviews[[1]]$clickElement()
fullreviews[[2]]$clickElement()
fullreviews[[3]]$clickElement()
...and so on until 36.
But I would like to do this with a single function, in case I have to scrape a bigger webpage with thousands of elements to click.
I have tried this code, but it doesn't work:
fullreviews[[1:36]]$clickElement()
I guess some sort of lapply call is needed, but I can't manage to create one that works.
Is there a way to do this in a single loop or function?
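Since findElements() returns an ordinary R list, clickElement() has to be called on each item rather than on the list itself. A minimal sketch of that idea follows. click_all is a hypothetical helper name, and the tryCatch is an added assumption so that one failed click (for example on an element that is no longer attached) does not stop the rest.

```r
# Hypothetical helper: click every element in a list returned by
# remDr$findElements(). Returns a logical vector marking which clicks
# succeeded.
click_all <- function(elems, pause = 0.5) {
  clicked <- vapply(elems, function(el) {
    ok <- tryCatch({
      el$clickElement()
      TRUE
    }, error = function(e) FALSE)
    # Short pause so the page can react before the next click
    Sys.sleep(pause)
    ok
  }, logical(1))
  invisible(clicked)
}

# Usage with the selector from the question:
# fullreviews <- remDr$findElements(using = "css selector", value = ".OzU4dc")
# click_all(fullreviews)
```

A plain lapply(fullreviews, function(el) el$clickElement()) or a for loop does the same thing without the error handling.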
I want to download a file from a website using RSelenium, with Firefox browser.
I do everything correctly (navigate, select the correct element and write what I want);
now I click the "download" button, then a Firefox popup opens and asks me if I want to download the file or "open with..." something else.
Unfortunately I cannot write an example due to privacy constraints.
My question is: how can I switch to the popup window / alert and click "OK" when needed?
I tried the following methods with no success:
remDrv$acceptAlert() -> tells me: NoAlertOpenError
remDrv$executeScript("driver.switchTo().alert().accept()")
I also tried the method
remDrv$getWindowHandles()
but even when the popup is open, the command returns only one window (the original one, not the popup), so I'm not able to use:
remDrv$switchToWindow()
to switch to the popup window.
Any ideas?
Thanks
What you are seeing is not a popup; it is a download dialog. The download dialog is native in all browsers and cannot be controlled with JavaScript. You can configure Firefox to download certain file types automatically. You haven't given us a lot of information.
It can be done by setting an appropriate profile. Here is an example that downloads some financial data. We set four options in a bespoke profile. We have to jump through some hoops selecting options before we get a file to download:
require(RSelenium)
fprof <- makeFirefoxProfile(list(browser.download.dir = "C:\\temp"
, browser.download.folderList = 2L
, browser.download.manager.showWhenStarting = FALSE
, browser.helperApps.neverAsk.saveToDisk = "application/zip"))
RSelenium::startServer()
remDr <- remoteDriver(extraCapabilities = fprof)
remDr$open(silent = TRUE)
remDr$navigate("https://www.chicagofed.org/applications/bhc_data/bhcdata_index.cfm")
# click year 2012
webElem <- remDr$findElement("name", "SelectedYear")
webElems <- webElem$findChildElements("css selector", "option")
webElems[[which(sapply(webElems, function(x){x$getElementText()}) == "2012" )]]$clickElement()
# click required quarter
webElem <- remDr$findElement("name", "SelectedQuarter")
Sys.sleep(1)
webElems <- webElem$findChildElements("css selector", "option")
webElems[[which(sapply(webElems, function(x){x$getElementText()}) == "4th Quarter" )]]$clickElement()
# click button
webElem <- remDr$findElement("id", "downloadDataFile")
webElem$clickElement()
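Because the download now happens silently, the script has no direct signal that the file has arrived. A rough sketch of one way to wait for it is to poll the download directory. wait_for_download is a hypothetical helper using only base R, and it assumes Firefox keeps a temporary .part file in the directory while a download is still in progress.

```r
# Hypothetical helper: wait until a finished download shows up in `dir`.
# Assumption: unfinished Firefox downloads leave a ".part" file behind.
wait_for_download <- function(dir, pattern = "\\.zip$",
                              timeout = 60, interval = 1) {
  deadline <- Sys.time() + timeout
  repeat {
    done <- list.files(dir, pattern = pattern, full.names = TRUE)
    partial <- list.files(dir, pattern = "\\.part$")
    # Finished when a matching file exists and nothing is still in progress
    if (length(done) > 0 && length(partial) == 0) return(done)
    if (Sys.time() >= deadline) stop("No finished download in ", dir)
    Sys.sleep(interval)
  }
}

# Usage with the directory set in the profile above:
# files <- wait_for_download("C:\\temp")
```

The directory should be empty (or the pattern specific enough) before the download starts, otherwise an older file will be returned immediately.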